Capturing an HTML tag. Overview - Email Parser software

Capturing an HTML tag

Every incoming email in Email Parser has a pre-defined set of fields that hold the email subject, the date it was sent, the text of the email, etc. Among these fields is one called “BodyHTML”, which holds the same email text as the “Body” field but in HTML format.

Capturing text from HTML is not as easy as doing it from plain text. As HTML can get quite complicated, Email Parser offers three different ways of capturing text from it:

XPATH expressions: These are path-like expressions (/div/tr/td…) that locate the HTML tag you want to capture by listing its parent tags. They work like the path of a folder on your computer (C:\Users\John\….)
CSS selectors: HTML tags are usually labeled with a class name or an id. CSS selectors use these labels to identify a specific tag, for instance “#header” (the tag whose id is “header”) or “.bold_text” (tags whose class is “bold_text”).
By the HTML tag properties: In this method you select the HTML tag by indicating its content or its properties, such as its style or attributes.
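
To make the difference between the three approaches concrete, here is a minimal sketch in Python. It is not how Email Parser works internally; it only illustrates the ideas using Python's standard library. The sample HTML, the variable names and the values in it are invented for the example, and the HTML is assumed to be well-formed (real emails are often messier). Because the standard library has no CSS selector engine, the two CSS selectors are expressed as the equivalent attribute tests, and the class test only matches a tag whose class attribute is exactly “bold_text”.

```python
import xml.etree.ElementTree as ET

# Hypothetical BodyHTML content, assumed to be well-formed for this sketch.
body_html = """
<html>
  <body>
    <div id="header">Order confirmation</div>
    <table>
      <tr><td class="bold_text">Order number</td><td>12345</td></tr>
    </table>
    <p style="color:red">Payment pending</p>
  </body>
</html>
"""

root = ET.fromstring(body_html)  # root is the <html> tag

# 1. Path-like expression: walk down the parent tags to the second cell.
by_path = root.find("./body/table/tr/td[2]").text

# 2. What the CSS selectors "#header" and ".bold_text" would pick out,
#    written here as attribute tests on id and class.
by_id = root.find(".//*[@id='header']").text
by_class = root.find(".//*[@class='bold_text']").text

# 3. By tag properties: a <p> tag whose style attribute has a given value.
by_style = root.find(".//p[@style='color:red']").text

print(by_path)
print(by_id)
print(by_class)
print(by_style)
```

Note that the path in step 1 breaks as soon as the sender changes the layout of the email, while the id and class lookups in step 2 keep working as long as the labels stay the same.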

XPATH expressions and CSS selectors are not specific to Email Parser. As with Regular Expressions, you can find a lot of information about them online. Email Parser simply uses these well-known technologies to capture information from the emails.