Clean out All Style Attributes using HTML Agility Pack .NET

HTML Agility Pack (HAP) is one of the most commonly used .NET packages for parsing HTML. It builds a document object model in memory, which can be used to manipulate nodes (both elements and attributes).

The package can be added to your project from NuGet with the following CLI command:

```
dotnet add package HtmlAgilityPack --version 1.11.43
```

This code snippet provides one example of using HAP to remove all `style` attributes. The script can be run as a [C# script](https://kontext.tech/article/511/introduction-to-c-interactive). For more details, refer to [Html Agility Pack](https://html-agility-pack.net/).

Kontext, 8/9/2022

Code snippet

    using HtmlAgilityPack;
    using System.Linq;
    
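    // Sample HTML with inline style attributes (illustrative markup)
    var html = @"
    <div style='margin: 10px;'>
      <p style='color: red;'>This is a test paragraph.</p>
    </div>
    <div>
      <p style='font-weight: bold;'>This is another test paragraph.</p>
    </div>
    ";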
    
    // Load the document from a string
    var doc = new HtmlDocument();
    doc.LoadHtml(html);

    // Select every element that has a style attribute.
    // SelectNodes returns null (not an empty collection) when nothing matches.
    var styleNodes = doc.DocumentNode.SelectNodes("//*[@style]");
    if (styleNodes != null)
    {
        foreach (var styleNode in styleNodes)
        {
            // Remove the style attribute from the matched element
            styleNode.Attributes["style"].Remove();
        }
    }

    // You can also use the doc.Save API directly to avoid certain issues.
    var cleanedHtml = doc.DocumentNode.InnerHtml;
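
The final comment in the snippet mentions the `doc.Save` API as an alternative to reading `InnerHtml`. The following is a minimal sketch of that approach, assuming the `Save(TextWriter)` overload of `HtmlDocument` and a small sample input that is purely hypothetical. It writes the cleaned document into a `StringWriter` instead of a file.

```csharp
using System;
using System.IO;
using HtmlAgilityPack;

// Hypothetical sample input, used only for illustration.
var html = "<p style='color: red;'>Hello</p><p>World</p>";

var doc = new HtmlDocument();
doc.LoadHtml(html);

// Remove every style attribute, guarding against the null result
// that SelectNodes returns when nothing matches.
var styleNodes = doc.DocumentNode.SelectNodes("//*[@style]");
if (styleNodes != null)
{
    foreach (var styleNode in styleNodes)
    {
        styleNode.Attributes["style"].Remove();
    }
}

// Serialize the whole document through Save(TextWriter)
// and capture the result as a string.
string cleanedHtml;
using (var writer = new StringWriter())
{
    doc.Save(writer);
    cleanedHtml = writer.ToString();
}

Console.WriteLine(cleanedHtml);
```

Passing a file path or a stream to `Save` instead of a `StringWriter` should work the same way when the result needs to go to disk.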
.net c#
