C# regular expressions can be used to match and replace text patterns in a string.
Remove heading tags
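The following regular expression removes heading elements, both the tags and the text between them, from an HTML string. It matches h1 through h9, although HTML defines only h1 to h6, and it only matches headings whose content contains no nested tags.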
<[hH][1-9][^>]*>[^<]*</[hH][1-9]\s*>
Code snippet
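using System.Text.RegularExpressions;

var html = "Your HTML string...";
// Matches a heading element whose content contains no nested tags.
var regex = new Regex(@"<[hH][1-9][^>]*>[^<]*</[hH][1-9]\s*>", RegexOptions.Compiled | RegexOptions.Multiline);
var replacedHtml = regex.Replace(html, "");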
Example
Assuming the following is the input string:
Headings:
<h3>Heading h3</h3>
<h4>LINQ to SQL - Select N Random Records</h4>
After replacement, the heading elements are gone (the line breaks between them remain), so the output looks like the following:
Headings:
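The pattern above skips any heading that contains nested markup, such as a link. As a minimal sketch of one possible variant (the class name HeadingRemover and the sample input are hypothetical, not part of the original snippet), a lazy match with a backreference limits the pattern to h1 through h6 and pairs each opening tag with a closing tag of the same level:

```csharp
using System;
using System.Text.RegularExpressions;

// Sample input (hypothetical): the second heading contains a nested link.
var html = "Headings:\n<h3>Heading h3</h3>\n<h2><a href=\"#\">Linked heading</a></h2>";
Console.WriteLine(HeadingRemover.RemoveHeadings(html));

static class HeadingRemover
{
    // ([1-6]) captures the heading level; \1 requires the same level in the closing tag.
    // .*? is lazy, so each match stops at the first matching closing tag.
    // Singleline lets '.' span line breaks; IgnoreCase covers <H3> as well as <h3>.
    private static readonly Regex HeadingPattern = new Regex(
        @"<h([1-6])\b[^>]*>.*?</h\1\s*>",
        RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static string RemoveHeadings(string input) => HeadingPattern.Replace(input, "");
}
```

Regular expressions remain a rough tool for HTML. For arbitrary or malformed markup, an HTML parser is usually more reliable.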
Remove tags only
To keep the text content and remove only the HTML tags, use the following regular expression:
<[^>]*>
Example
For the above input HTML, the output looks like the following:
Headings:
Heading h3
LINQ to SQL - Select N Random Records
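Applying the tag-only pattern in C# follows the same shape as the heading snippet. The sketch below is one way to do it, assuming the sample input shown earlier; the class name TagStripper is hypothetical and used only for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

// Same sample input as in the heading example.
var html = "Headings:\n<h3>Heading h3</h3>\n<h4>LINQ to SQL - Select N Random Records</h4>";
Console.WriteLine(TagStripper.StripTags(html));

static class TagStripper
{
    // Matches anything from '<' up to the next '>' and leaves the text between tags untouched.
    private static readonly Regex TagPattern = new Regex(@"<[^>]*>", RegexOptions.Compiled);

    public static string StripTags(string input) => TagPattern.Replace(input, "");
}
```

This pattern treats every '<' ... '>' pair as a tag, so it can misfire on a '>' inside an attribute value, and it leaves the text inside script or style elements in the output.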