Handy regular expression for HTML tag cleaning

By Peter Bromberg

Here's some C# code for either replacing a specific tag or removing a series of potentially malicious tags from an HTML string:

using System.Text.RegularExpressions;

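// Replace the BODY tag with a placeholder div (for later replacement with ad code, for example).
// Note: the pattern matches both <body ...> and </body>, so the placeholder is inserted at both positions.
// The \b keeps the pattern from matching longer tag names that merely start with "body".
strContent = Regex.Replace(strContent, @"</?(?i:body)\b(.|\n)*?>", @"<div id=""advert""></div>");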


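// Malicious markup: remove any and all script / body / embed / object / frameset / frame / iframe / meta / link / style tags.
// Only the tags themselves are removed; any text between an opening and a closing tag (script source, for example) stays in the string.
// The \b keeps "meta" from matching <metadata>, and likewise for the other names.
strContent = Regex.Replace(strContent, @"</?(?i:script|body|embed|object|frameset|frame|iframe|meta|link|style)\b(.|\n)*?>", "");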
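
The two calls above can also be wrapped in a small reusable helper. What follows is only a minimal sketch under a few assumptions: the class name HtmlTagCleaner and the method names ReplaceOpeningBodyTag and StripDangerousTags are hypothetical and not part of the original code, the body replacement here touches only the opening tag, and [^>]* with RegexOptions.IgnoreCase stands in for (.|\n)*? with the inline (?i:) group. Like the calls above, it removes tags only and leaves the text between them in place.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class; all names here are illustrative assumptions.
public static class HtmlTagCleaner
{
    // Same tag list as the pattern above. \b ends the tag name, so "meta"
    // does not match <metadata>. [^>]* consumes attributes (including
    // newlines) up to the first '>'.
    private static readonly Regex DangerousTags = new Regex(
        @"</?(?:script|body|embed|object|frameset|frame|iframe|meta|link|style)\b[^>]*>",
        RegexOptions.IgnoreCase);

    // Matches only the opening <body ...> tag, not </body>.
    private static readonly Regex OpeningBodyTag = new Regex(
        @"<body\b[^>]*>",
        RegexOptions.IgnoreCase);

    // Replaces the opening body tag with the given markup. A MatchEvaluator
    // is used so that a '$' in the placeholder is inserted literally rather
    // than being read as a substitution pattern.
    public static string ReplaceOpeningBodyTag(string html, string placeholder)
    {
        return OpeningBodyTag.Replace(html, match => placeholder);
    }

    // Removes every opening and closing tag in the list. Text between the
    // tags (script source, for example) is left in the string.
    public static string StripDangerousTags(string html)
    {
        return DangerousTags.Replace(html, "");
    }
}
```

For example, HtmlTagCleaner.StripDangerousTags("<BODY class=\"x\"><p>Hi</p><script src=\"a.js\"></script></BODY>") returns "<p>Hi</p>", while HtmlTagCleaner.StripDangerousTags("<script>alert(1)</script>") returns "alert(1)" because only the tags are stripped.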