How to parse a table from an email and export to Google Spreadsheets
See also:
Capturing an HTML tag
Exporting email contents to Google Sheets
In this example we will explain how to capture the data from a table. There are two ways to do this: Parsing the plain text version of the email and parsing the HTML version of the email. We will show how to do the first one (HTML) as usually provides more accurate results.
The email we receive looks as following:
From: b.lastrange@example.com
To: johnjohnes@example.com
Subject: Supplier defects from Feb 1st to Feb 18th 2021
Hi John,
Here you can find the list of the defects found in the received orders from our suppliers. It is from the last 3 months, only.
ER48423 |
Protoveo Electronic Devices |
4/1/2021 |
Arrived a week later than expected |
TH84280 |
Wikimba packagings |
5/1/2021 |
|
AU32842 |
Aurend |
7/1/2021 |
Some minor scratches in the surface. Call QA? |
LT31325 |
Electrova |
10/1/2021 |
Reference 4587459 is missing |
Regards,
Bob
We need to save the table from the email to a Google Spreadsheet and have this result:
As you can see in the left panel view at the top, we have set up a
Office 365 email account, a filter, a
parser . For saving the captured data we have an
Google Spreadsheet action:
The most important part for this example is how to get the data contained in the HTML table:
We have used XPath here. A very common method to identify tags in an HTML document. The XPath expression is:
html[1]/body[1]/div[1]/div[2]/table[1]/tbody[1]/tr
This means: Find the first “html” tag, inside this tag find the first “body” tag. In the “body” tag get the first “div” and inside that div take the second one. Finally, get the first table tag and get any “tr” tag inside the “table” tag. Note that the “tr” tag in HTML means “table row”
With the “additional options” we tell
Email Parser to convert this “tr” tags to plain text.
Once we have the row, we get the columns from it:
The rest of the columns are retrieved in a similar way. And once we have the cells of the table we save it to a
Google Sheet this way:
When we receive an email the following output is produced: