C# Regex - Remove Heading Tags
C# regular expressions can be used to match and replace text patterns in a string.
Remove heading tags
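The following regular expression removes heading elements (the opening tag, the text content and the closing tag) from an HTML string. It accepts levels 1 to 9, although HTML defines only h1 to h6, and it matches only headings whose content contains no nested tags.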
<[hH][1-9][^>]*>[^<]*</[hH][1-9]\s*>
Code snippet
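using System.Text.RegularExpressions;
var html = "Your HTML string...";
// Matches an opening heading tag, its text content (no nested tags) and the closing tag.
var regex = new Regex(@"<[hH][1-9][^>]*>[^<]*</[hH][1-9]\s*>", RegexOptions.Compiled | RegexOptions.Multiline);
var replacedHtml = regex.Replace(html, "");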
Example
Assuming the following is the input string:
Headings: <h3>Heading h3</h3> <h4>LINQ to SQL - Select N Random Records</h4>
After replacement, the output looks like the following:
Headings:
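Because the pattern uses [^<]* for the heading content, a heading that contains nested markup (for example a link) is left untouched. The following is a minimal sketch of one possible variant, not part of the original snippet: the class and method names (HeadingStripper, RemoveSimple, RemoveNested) are hypothetical, and the second pattern is an assumption that uses a backreference so the closing tag level must equal the opening one.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HeadingStripper
{
    // The article's pattern: heading content may not contain '<'.
    private static readonly Regex SimpleHeading =
        new Regex(@"<[hH][1-9][^>]*>[^<]*</[hH][1-9]\s*>", RegexOptions.Compiled);

    // Hypothetical variant (an assumption, not the article's pattern):
    // \1 requires the closing level to match the opening level, and the
    // lazy .*? with Singleline allows nested tags and line breaks.
    private static readonly Regex NestedHeading =
        new Regex(@"<h([1-6])\b[^>]*>.*?</h\1\s*>",
                  RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static string RemoveSimple(string html) => SimpleHeading.Replace(html, "");

    public static string RemoveNested(string html) => NestedHeading.Replace(html, "");
}

public static class HeadingStripperDemo
{
    public static void Main()
    {
        var html = "Intro <h2><a href=\"#\">Linked</a> title</h2> Outro";

        // The simple pattern finds no match, so the input comes back unchanged.
        Console.WriteLine(HeadingStripper.RemoveSimple(html));

        // The variant removes the whole heading element, nested link included.
        Console.WriteLine(HeadingStripper.RemoveNested(html));
    }
}
```

Neither pattern is a full HTML parser. For untrusted or irregular markup, a dedicated HTML parsing library is likely the safer choice.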
Remove tags only
To keep all the text content but remove every HTML tag, use the following regular expression:
<[^>]*>
Example
For the above input HTML, the output looks like the following:
Headings: Heading h3 LINQ to SQL - Select N Random Records
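The tag-stripping pattern can be applied the same way as the heading pattern. This is a small sketch using the example input from above; the names TagStripper and StripTags are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagStripper
{
    // Matches any run that starts with '<' and ends at the next '>'.
    private static readonly Regex AnyTag = new Regex(@"<[^>]*>", RegexOptions.Compiled);

    // Removes the tags themselves and keeps the text between them.
    public static string StripTags(string html) => AnyTag.Replace(html, "");
}

public static class TagStripperDemo
{
    public static void Main()
    {
        var html = "Headings: <h3>Heading h3</h3> <h4>LINQ to SQL - Select N Random Records</h4>";

        // prints "Headings: Heading h3 LINQ to SQL - Select N Random Records"
        Console.WriteLine(TagStripper.StripTags(html));
    }
}
```

Character entities such as &amp; are left encoded by this approach. If decoded text is needed, System.Net.WebUtility.HtmlDecode can be applied to the result.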