See also:
Capturing HTML tags. Overview
External resources:
CSS selectors reference – W3 Schools
CSS selectors reference – Mozilla.org
CSS selectors are patterns normally used to pick the HTML elements that a style applies to, but they work just as well for HTML parsing.
Because many HTML tags carry a class name or an id, you can use these attributes to identify the specific tag you want to capture from the email BodyHTML field. Let’s see a very simple example:
<html>
<head> </head>
<body>
  <p>
    Hello Matt,<br />
    <br />
    Your website <span>MullerConstruction.com</span> has generated a new lead:
  </p>
  <table class="lead_table">
    <tbody>
      <tr> <td>Name:</td> <td>John Doe</td> </tr>
      <tr> <td>Phone:</td> <td>+1234567890</td> </tr>
    </tbody>
  </table>
  <div>
    This lead was generated by:
    <div> The contact form </div>
  </div>
</body>
</html>
This email looks like this:
Using a dot followed by the class name, .lead_table, we capture that table from the HTML code:
CSS selectors can be far more specific than a class name alone. For instance, if we use:
.lead_table td:nth-child(2)
This matches the second cell of each table row, and the first match is the first value of the table:
John Doe
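To make the matching rule concrete, here is a minimal sketch in Python that emulates the two selectors above by hand, using only the standard library. It is an illustration, not how the parser itself works: the helper names `select_by_class` and `nth_child_descendants` are hypothetical, and `xml.etree.ElementTree` only accepts the sample because it happens to be well-formed XML, which real email HTML often is not.

```python
import xml.etree.ElementTree as ET

# The sample email from above (well-formed, so an XML parser can read it).
EMAIL_HTML = """<html> <head> </head> <body>
<p> Hello Matt,<br /> <br /> Your website <span>MullerConstruction.com</span>
has generated a new lead: </p>
<table class="lead_table"> <tbody>
<tr> <td>Name:</td> <td>John Doe</td> </tr>
<tr> <td>Phone:</td> <td>+1234567890</td> </tr>
</tbody> </table>
<div> This lead was generated by: <div> The contact form </div> </div>
</body> </html>"""


def select_by_class(root, class_name):
    """Emulate `.class_name`: every element whose class list contains it."""
    return [el for el in root.iter()
            if class_name in el.get("class", "").split()]


def nth_child_descendants(ancestor, tag, n):
    """Emulate `tag:nth-child(n)` for elements below `ancestor`.

    An element matches when it is the n-th child (1-based) of its parent
    and its tag name is `tag`.
    """
    matches = []
    for parent in ancestor.iter():
        children = list(parent)
        if len(children) >= n and children[n - 1].tag == tag:
            matches.append(children[n - 1])
    return matches


root = ET.fromstring(EMAIL_HTML)

# `.lead_table`
tables = select_by_class(root, "lead_table")

# `.lead_table td:nth-child(2)`
cells = [td for table in tables
         for td in nth_child_descendants(table, "td", 2)]
values = [td.text.strip() for td in cells]

print(values)     # every match, in document order
print(values[0])  # the first match
```

The selector matches one cell per row, so a tool that returns a single value would typically return the first match in document order.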
The following table summarizes the most common types of CSS selectors and their meaning:
Selector | Example | Example description |
---|---|---|
.class | .header | Selects all elements with class="header" |
.classA.classB | .nameA.nameB | Selects all elements with both nameA and nameB in the class attribute |
.classA .classB | .nameA .nameB | Selects all elements of class nameB that are descendants of an element of class nameA |
#id | #customername | Selects the element with id="customername" |
element | div | Selects all <div> elements |
element.class | div.pricing | Selects all <div> elements of the class pricing |
element,element | div, p | Selects all <div> elements and all <p> elements |
element element | div p | Selects all <p> elements inside <div> elements |
element>element | div > p | Selects all <p> elements where the parent is a <div> element |
element+element | div + p | Selects every <p> element that immediately follows a <div> element with the same parent |
element1~element2 | p ~ ul | Selects every <ul> element that is preceded by a <p> element with the same parent |
[attribute] | [target] | Selects all elements with a target attribute |
[attribute=value] | [target=_blank] | Selects all elements with target="_blank" |
[attribute~=value] | [title~=flower] | Selects all elements with a title attribute containing the word “flower” |
[attribute\|=value] | [lang\|=en] | Selects all elements with a lang attribute value equal to “en” or starting with “en-” |
[attribute^=value] | a[href^="https"] | Selects every <a> element whose href attribute value begins with “https” |
[attribute$=value] | a[href$=".pdf"] | Selects every <a> element whose href attribute value ends with “.pdf” |
[attribute*=value] | a[href*="w3schools"] | Selects every <a> element whose href attribute value contains the substring “w3schools” |
:not(selector) | :not(p) | Selects every element that is not a <p> element |
:nth-child(n) | p:nth-child(2) | Selects every <p> element that is the second child of its parent |
:nth-last-child(n) | p:nth-last-child(2) | Selects every <p> element that is the second child of its parent, counting from the last child |
:nth-last-of-type(n) | p:nth-last-of-type(2) | Selects every <p> element that is the second <p> element of its parent, counting from the last child |
:nth-of-type(n) | p:nth-of-type(2) | Selects every <p> element that is the second <p> element of its parent |
:only-of-type | p:only-of-type | Selects every <p> element that is the only <p> element of its parent |
:only-child | p:only-child | Selects every <p> element that is the only child of its parent |
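
The attribute operators in the table are easy to confuse, so the sketch below restates each one as a plain Python string test. The function name `attr_matches` is hypothetical and the code is only an approximation of the rules in the CSS Selectors specification (for example, it ignores case-sensitivity flags); it is meant as a reading aid, not as a selector engine.

```python
def attr_matches(value, op, expected=""):
    """Return True if an attribute value satisfies `[attr<op>expected]`.

    `value` is the attribute's value, or None when the attribute is absent.
    `op` is one of "", "=", "~=", "|=", "^=", "$=", "*=".
    """
    if value is None:        # attribute missing: no attribute selector matches
        return False
    if op == "":             # [attr]: presence is enough
        return True
    if op == "=":            # exact match
        return value == expected
    if op == "~=":           # one of the whitespace-separated words
        return expected != "" and expected in value.split()
    if op == "|=":           # exact match, or prefix followed by a hyphen
        return value == expected or value.startswith(expected + "-")
    if expected == "":       # ^=, $= and *= never match an empty string
        return False
    if op == "^=":           # starts with
        return value.startswith(expected)
    if op == "$=":           # ends with
        return value.endswith(expected)
    if op == "*=":           # contains the substring
        return expected in value
    raise ValueError(f"unsupported operator: {op!r}")


# [title~=flower] needs the whole word; [title*=flower] only a substring.
print(attr_matches("sunflower", "~=", "flower"))  # False
print(attr_matches("sunflower", "*=", "flower"))  # True
# [lang|=en] accepts "en" and "en-US" but not "english".
print(attr_matches("en-US", "|=", "en"))          # True
print(attr_matches("english", "|=", "en"))        # False
```

The contrast between `~=` and `*=` matters most in practice: the first needs a complete whitespace-separated word, while the second accepts any substring.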