Our most loved regular expressions

We have decided to share the regular expressions we use most often when dealing with HTML documents.

Regex to find text URLs in the HTML document:

"(?<url>((?<protocol>http(s)?)\:\/\/)?.+?[^\n\]\s])"
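As a rough illustration of how a pattern like this might be applied, here is a minimal Python sketch. It uses a stricter variant that is an assumption, not the pattern above: the protocol is required, because with the protocol optional almost any pair of characters would match. The function name `find_text_urls` and the sample text are hypothetical, and Python spells named groups as `(?P<name>...)`.

```python
import re

# Assumed stricter variant: the protocol is required, and the URL runs
# until whitespace or a closing square bracket.
TEXT_URL = re.compile(r"(?P<url>(?P<protocol>https?)://[^\s\]]+)")


def find_text_urls(text):
    """Return (url, protocol) pairs for each plain-text URL in `text`."""
    return [(m.group("url"), m.group("protocol")) for m in TEXT_URL.finditer(text)]


sample = "Docs: https://example.com/docs and [http://example.org/a]"
for url, protocol in find_text_urls(sample):
    print(protocol, url)
```

Because only whitespace and `]` end a match, a URL followed directly by markup or punctuation will carry those characters along, so the results may need trimming.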

Regex to find HTML links in the HTML document:

"\<(a|area|url).*? href\=((""(?<url>((?<protocol>http(s)?)\:\/\/)?[^""]*)"")|('(?<url>((?<protocol>http(s)?)\:\/\/)?[^']*)'))"
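The doubled quotes in the pattern above look like C#-style verbatim string escaping, where `""` stands for a single double quote, and the pattern reuses the group name `url` in both alternatives. Python's `re` module rejects repeated group names, so the sketch below is an assumed adaptation rather than a direct port: it captures the opening quote and closes on the same character with a backreference, and it keeps the match inside a single tag. The names `LINK` and `find_links` are hypothetical.

```python
import re

# Assumed adaptation for Python: one `url` group, with the closing quote
# matched by a backreference to whichever quote opened the value.
LINK = re.compile(
    r"""<(?:a|area|url)\b[^>]*?\shref=(?P<quote>["'])"""
    r"""(?P<url>(?:(?P<protocol>https?)://)?.*?)(?P=quote)""",
    re.IGNORECASE | re.DOTALL,
)


def find_links(html):
    """Return (url, protocol) pairs; protocol is None for relative links."""
    return [(m.group("url"), m.group("protocol")) for m in LINK.finditer(html)]


html = '<a href="https://example.com/a">A</a> <area shape="rect" href=\'/map\'>'
print(find_links(html))
```

The `protocol` group is only filled when the link starts with `http://` or `https://`, which makes it easy to separate absolute links from relative ones.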

Regex to find CSS links in the HTML document:

"(""(?<url>((?<protocol>http(s)?)\:\/\/)?[^""]*\.(css))"")|('(?<url>((?<protocol>http(s)?)\:\/\/)?[^']*\.(css))')|(url\('(?<url>((?<protocol>http(s)?)\:\/\/)?[^']*\.(css))'\))|(url\(""(?<url>((?<protocol>http(s)?)\:\/\/)?[^""]*\.(css))""\))"
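A possible Python version of the CSS pattern is sketched below. It is a simplification made under stated assumptions: a single alternative with a backreference replaces the four quoted and `url(...)` alternatives (a quoted value inside `url(...)` is still found, since the quotes are what the pattern keys on), and the value may contain neither kind of quote. The names `CSS_REF` and `find_css_urls` are hypothetical.

```python
import re

# Assumed simplification: any quoted value ending in .css, closed by the
# same quote character that opened it.
CSS_REF = re.compile(
    r"""(?P<quote>["'])(?P<url>(?:(?P<protocol>https?)://)?[^"']*\.css)(?P=quote)""",
    re.IGNORECASE,
)


def find_css_urls(html):
    """Return the URL of every quoted reference that ends in .css."""
    return [m.group("url") for m in CSS_REF.finditer(html)]


html = """<link rel="stylesheet" href="https://example.com/site.css">
<style>@import url('theme/dark.css');</style>"""
print(find_css_urls(html))
```

Stylesheet URLs that carry a query string after `.css`, or unquoted `url(...)` values, are not matched by this sketch.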

Regex to find background images in the HTML document:

"\<.*? background\=((""(?<url>((?<protocol>http(s)?)\:\/\/)?[^""]*)"")|('(?<url>((?<protocol>http(s)?)\:\/\/)?[^']*)'))"

Regex to find images in the HTML document:

"(\<img.*? src\=((""(?<url>((?<protocol>http(s)?)\:\/\/)?[^""]*)"")|('(?<url>((?<protocol>http(s)?)\:\/\/)?[^']*)')))"
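The background and image patterns share one shape: a tag, an attribute, and a quoted value. The Python sketch below builds both from a single hypothetical helper, `attr_url_pattern`. It is an assumed adaptation, not a direct port: the `tag` argument is inserted as a regex fragment, the match is kept inside one tag, and a backreference closes the value on the quote that opened it.

```python
import re


def attr_url_pattern(tag, attr):
    """Build a pattern that captures the quoted value of `attr` on `tag`.

    `tag` is treated as a regex fragment (assumption), `attr` as literal text.
    """
    return re.compile(
        rf"""<{tag}\b[^>]*?\s{re.escape(attr)}=(?P<quote>["'])"""
        r"""(?P<url>(?:(?P<protocol>https?)://)?.*?)(?P=quote)""",
        re.IGNORECASE | re.DOTALL,
    )


IMG_SRC = attr_url_pattern("img", "src")
BACKGROUND = attr_url_pattern(r"\w+", "background")  # any tag name

html = '<body background="bg.png"><img alt="logo" src="https://example.com/logo.png"></body>'
print([m.group("url") for m in IMG_SRC.finditer(html)])
print([m.group("url") for m in BACKGROUND.finditer(html)])
```

Requiring whitespace before the attribute name keeps attributes such as `data-src` from being picked up as `src`.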
