HTML Parsing using HTML Agility Pack .NET
Code description
HTML Agility Pack (HAP) is one of the most commonly used .NET packages for parsing HTML. It builds a document object model in memory, which can be used to read and manipulate the document's elements and attributes.
The package can be added to your project from NuGet with the following .NET CLI command:
dotnet add package HtmlAgilityPack --version 1.11.43
This code snippet shows one example of using HAP to remove all script elements from a document. It can be run as a C# script.
For more details, refer to Html Agility Pack.
Code snippet
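using System;
using System.Linq;
using HtmlAgilityPack;

// Sample HTML; the script element here is illustrative
var html = @"<p>This is a test paragraph.</p>
<script>alert('test');</script>";

// Load the document from a string
var doc = new HtmlDocument();
doc.LoadHtml(html);

// Remove all script elements. ToList() takes a snapshot first, so nodes
// can be removed safely while iterating.
var scriptElements = doc.DocumentNode.Descendants("script").ToList();
foreach (var el in scriptElements)
{
    el.Remove();
}

// Print the cleaned HTML
Console.WriteLine(doc.DocumentNode.OuterHtml);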
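The description mentions that attributes can be manipulated as well as elements. As a minimal sketch of that idea, the following removes inline event-handler attributes (such as onclick) from every element. The helper name RemoveEventAttributes and the sample markup are assumptions made for illustration; they are not part of HAP or of the original snippet.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper (not part of HAP): strips attributes whose names
// start with "on" (onclick, onload, ...) from every element.
string RemoveEventAttributes(string html)
{
    var doc = new HtmlDocument();
    doc.LoadHtml(html);

    var elements = doc.DocumentNode.Descendants()
        .Where(n => n.NodeType == HtmlNodeType.Element);

    foreach (var node in elements)
    {
        // Snapshot the matching attributes so the collection is not
        // modified while it is being enumerated.
        var handlers = node.Attributes
            .Where(a => a.Name.StartsWith("on", StringComparison.OrdinalIgnoreCase))
            .ToList();

        foreach (var attr in handlers)
        {
            attr.Remove();
        }
    }

    return doc.DocumentNode.OuterHtml;
}

// Illustrative input: one event handler and one ordinary attribute
var cleaned = RemoveEventAttributes("<p onclick=\"alert('x')\" class=\"note\">Hello</p>");
Console.WriteLine(cleaned);
```

Matching on the "on" prefix is a deliberately crude filter chosen to keep the sketch short; a real sanitizer would more likely work from an allow-list of permitted attributes.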