C# Regex - Remove Heading Tags


C# regular expressions can be used to match and replace text patterns in a string.

Remove heading tags

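The following regular expression removes heading elements, including their text content, from an HTML string. HTML defines only h1 to h6, but the pattern accepts any digit from 1 to 9. Because the content part is [^<]*, a heading that contains nested tags is not matched.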

<[hH][1-9][^>]*>[^<]*</[hH][1-9]\s*>

Code snippet

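using System.Text.RegularExpressions;

var html = "Your HTML string...";
var regex = new Regex(@"<[hH][1-9][^>]*>[^<]*</[hH][1-9]\s*>", RegexOptions.Compiled | RegexOptions.Multiline);
// Replace every matched heading (tags and text) with an empty string.
var replacedHtml = regex.Replace(html, "");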
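
To make the pattern's behaviour concrete, here is a minimal self-contained sketch written with C# top-level statements. The sample string and the variable names are made up for illustration and are not part of the original snippet. It shows a plain heading with attributes being removed, while a heading that contains a nested tag is left untouched because [^<]* cannot cross the inner tag.

```csharp
using System;
using System.Text.RegularExpressions;

// Pattern from the article. Multiline is left out here because the pattern
// contains no ^ or $ anchors, so that option would have no effect.
var headingRegex = new Regex(@"<[hH][1-9][^>]*>[^<]*</[hH][1-9]\s*>", RegexOptions.Compiled);

// Hypothetical sample: one plain heading and one heading with a nested <em> tag.
var sample = "<h2 class=\"title\">Plain</h2><p>Body</p><h3>With <em>markup</em></h3>";

// Only the plain <h2> heading is matched and removed.
var cleaned = headingRegex.Replace(sample, "");

Console.WriteLine(cleaned);
// prints: <p>Body</p><h3>With <em>markup</em></h3>
```

If headings with nested markup need to be removed as well, an HTML parser is usually a safer choice than extending the regular expression.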

Example

Assuming the following is the input string:

Headings:
<h3>Heading h3</h3>    
<h4>LINQ to SQL - Select N Random Records</h4>

After replacement, the output looks like the following:

Headings:
    

Remove tags only

To keep the text content and remove only the HTML tags, replace every match of the following regular expression with an empty string:

<[^>]*>

Example 

For the above input HTML, the output looks like the following:

Headings:
Heading h3    
LINQ to SQL - Select N Random Records
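
The tags-only replacement can be sketched in C# as follows. This is a minimal example under the assumption that the input is the sample HTML shown earlier; the variable names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

// Matches any single tag: "<", then any characters except ">", then ">".
var tagRegex = new Regex(@"<[^>]*>", RegexOptions.Compiled);

// The sample input from the example, with lines joined by "\n".
var input = "Headings:\n<h3>Heading h3</h3>    \n<h4>LINQ to SQL - Select N Random Records</h4>";

// Tags are removed; the text between them is kept.
var textOnly = tagRegex.Replace(input, "");

Console.WriteLine(textOnly);
```

Note that this pattern treats everything between < and > as a tag, so a literal > inside an attribute value would end the match early.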
