See also:
Capturing an HTML tag from an email
Exporting to Google Sheets
This example shows how to capture information from the email that Amazon sends when an order is dispatched. It uses two text capturing methods, Starts with… continues until… and Capture HTML tag, and saves the data to a Google Sheet. It also uses multi-step parsing and some regular expressions for email filtering.
This may sound complicated because it combines several Email Parser techniques, but each step is easy to understand once we look at it in detail.
An incoming email from Amazon looks like this:
We want to capture from this email the following information:
Let’s go from top to bottom through each item, starting with the email filter:
The dispatch notifications are sent from the address auto-shipping@amazon.co.uk (for the United Kingdom). We have set the filter to take emails from an address containing “amazon.co.uk” whose subject matches the regular expression:
Your Amazon.co.uk order (.*?) has been dispatched
This regular expression is very simple: the element .*? matches any text, so any email whose subject has that format is processed by the items that follow.
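To make the filter logic concrete, here is a minimal Python sketch of the same two checks. It is not how Email Parser implements its filter; the function name `passes_filter` and the sample subject are assumptions made up for illustration, and the pattern is searched anywhere in the subject, which may differ from the product's matching rules.

```python
import re

# Subject pattern copied from the filter above. The dots in "Amazon.co.uk" are
# left unescaped as in the original, so each one matches any single character.
SUBJECT_PATTERN = re.compile(r"Your Amazon.co.uk order (.*?) has been dispatched")


def passes_filter(sender: str, subject: str) -> bool:
    """Return True when the sender contains 'amazon.co.uk' and the subject matches."""
    sender_ok = "amazon.co.uk" in sender.lower()
    subject_ok = SUBJECT_PATTERN.search(subject) is not None
    return sender_ok and subject_ok


# Hypothetical sample values, invented for this sketch.
sample_sender = "auto-shipping@amazon.co.uk"
sample_subject = "Your Amazon.co.uk order #000-0000000-0000000 has been dispatched"

print(passes_filter(sample_sender, sample_subject))
```

The lazy group (.*?) stops at the first point where the rest of the pattern can match, so it also captures whatever sits between "order" and "has been dispatched".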
The order number appears in the top right of the email body. We will take it from the plain text version of the email, which looks like this:
All we need is to take the text that starts after the word “Order” and continues until the end of line:
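The idea behind Starts with… continues until… can be approximated in a few lines of Python. This is only a sketch of the technique under stated assumptions: the helper `capture_between` and the sample body are hypothetical, and the real plain text of an Amazon email will differ.

```python
from typing import Optional


def capture_between(text: str, start: str, end: str = "\n") -> Optional[str]:
    """Return the text after the first `start` marker, up to the next `end` marker.

    If `end` is not found, everything up to the end of the text is returned.
    Returns None when the `start` marker is missing.
    """
    start_pos = text.find(start)
    if start_pos == -1:
        return None
    begin = start_pos + len(start)
    end_pos = text.find(end, begin)
    if end_pos == -1:
        end_pos = len(text)
    # strip() also removes a trailing "\r" left over from Windows line endings.
    return text[begin:end_pos].strip()


# Hypothetical plain text body, invented for this sketch.
sample_body = "Dispatch Confirmation\nOrder #000-0000000-0000000\nHello,\n"

order_number = capture_between(sample_body, "Order")
print(order_number)
```

Using the first occurrence of the start marker keeps the helper simple, but it means the marker has to be chosen so that it does not appear earlier in the text.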
The total amount of the order is stored in a field called total in Email Parser. We also use the Starts with… continues until… technique to get this information. Let’s see what it looks like:
With this field we use a more complex technique: Capture HTML tag. Looking at the HTML version of the email body, we found that Amazon puts this information in a <strong> tag:
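As a rough illustration of capturing the content of an HTML tag, the following Python sketch collects the text of every <strong> element using the standard library parser. The class name `TagTextCollector` and the sample HTML are assumptions for this example and do not reflect Email Parser's internals or the real Amazon markup.

```python
from html.parser import HTMLParser


class TagTextCollector(HTMLParser):
    """Collect the text content of every occurrence of one tag."""

    def __init__(self, tag: str):
        super().__init__()
        self.tag = tag
        self.depth = 0        # nesting depth inside the target tag
        self.results = []     # one entry per captured element
        self._buffer = []     # text pieces of the element being read

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            if self.depth == 0:
                self._buffer = []
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.results.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self.depth > 0:
            self._buffer.append(data)


# Hypothetical HTML fragment, invented for this sketch.
sample_html = "<p>Some label: <strong>Hypothetical value</strong></p>"

collector = TagTextCollector("strong")
collector.feed(sample_html)
print(collector.results)
```

Because an email can contain several <strong> elements, the sketch returns a list; picking the right entry would depend on the actual layout of the message.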
We use two steps to capture the destination address of the Amazon order. In the first step, we capture the area of the email text that contains the physical address along with other information. Then, in the second step, we separate the address from the rest of the information, which we do not need. The result of the first step is called destination_address_block and is captured like this:
As you can see, Amazon has called this area of text critialInfo, which is very descriptive. We have found that it sometimes also includes the delivery date or simply the text ‘Your delivery info’. Either way, we use this block as the input for the field destination_address_actual_address, which is the field we use to fill our Google Sheet:
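The two-step approach can be sketched in Python as follows. Several things here are assumptions rather than facts from the email: that critialInfo is an id attribute, the shape of the sample HTML, and the rule used in the second step (dropping lines that start with a known non-address marker). The names `ElementTextById` and `extract_address` are hypothetical.

```python
from html.parser import HTMLParser


class ElementTextById(HTMLParser):
    """Step 1: collect the text pieces inside the element with a given id."""

    # Tags without a closing tag; they must not change the nesting depth.
    VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}

    def __init__(self, element_id: str):
        super().__init__()
        self.element_id = element_id
        self.depth = 0     # 0 means we are outside the target element
        self.lines = []

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1
        elif dict(attrs).get("id") == self.element_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in self.VOID_TAGS:
            return
        if self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0 and data.strip():
            self.lines.append(data.strip())


def extract_address(block_lines, noise_markers=("Your delivery info",)):
    """Step 2 (assumed rule): drop lines that start with a known non-address marker."""
    return [
        line for line in block_lines
        if not any(line.startswith(marker) for marker in noise_markers)
    ]


# Hypothetical HTML fragment, invented for this sketch.
sample_html = (
    '<table><tr><td id="critialInfo">'
    "<p>Your delivery info</p>"
    "<p>Jane Doe<br>1 Example Street<br>London</p>"
    "</td></tr></table>"
)

parser = ElementTextById("critialInfo")
parser.feed(sample_html)
destination_address_block = parser.lines
destination_address_actual_address = ", ".join(extract_address(destination_address_block))
print(destination_address_actual_address)
```

Splitting the work this way keeps each step simple: the first only has to locate the right region, and the second only has to clean text that is already small and predictable.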
The final step in this example is to save the fields we have retrieved from the email to the Google Sheet (except destination_address_block, which is a field we used to store intermediate text). The action looks like this:
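The selection of fields for the export can be pictured with a short Python sketch. It writes CSV text to memory as a stand-in for the spreadsheet, since the actual export is done by Email Parser's own action. The field name `order_number` and all sample values are assumptions; `total`, `destination_address_block` and `destination_address_actual_address` are the field names used in this example.

```python
import csv
import io

# Captured fields; all values are hypothetical and "order_number" is an assumed name.
captured_fields = {
    "order_number": "#000-0000000-0000000",
    "total": "0.00",
    "destination_address_block": "Your delivery info | Jane Doe | 1 Example Street",
    "destination_address_actual_address": "Jane Doe, 1 Example Street",
}

# Fields that only hold intermediate text and should not be exported.
INTERMEDIATE_FIELDS = {"destination_address_block"}


def build_row(fields: dict) -> dict:
    """Return the fields to export, leaving out the intermediate ones."""
    return {
        name: value
        for name, value in fields.items()
        if name not in INTERMEDIATE_FIELDS
    }


row = build_row(captured_fields)

# Write a header line and one data row to an in-memory CSV (stand-in for the sheet).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(row))
writer.writeheader()
writer.writerow(row)
print(buffer.getvalue())
```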