Parsing text from attached files

Unlike the rest of the email, the content of attached files is not available by default. To read it, you need to add a field of the type “Attachment reader”.

You can give the field any name you want. In this case we have used “attached_file_content”, but any other name works, such as “report_text” or “attached invoice pdf content”.

The rest of the parameters are usually left at their default values. If you expect to receive more than one attachment in the same email, however, it is useful to set a file filter so that Email Parser does not open and read every attached file. A common file filter is simply the filename as you expect it, for instance “monthly report.pdf”, or a wildcard pattern such as “*.docx” to parse any attached Word document.

Once the text of the attached file is stored in a field, the next step is to capture data from it. This is done using Multiple-step parsing: you create a new field and set the “Attachment reader” field as its input.
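The two ideas above, a file filter and a second parsing step that takes the attachment text as input, can be sketched in a few lines of Python. This is only an illustration of the concepts, not how Email Parser is implemented: the function names, the sample “Total:” label, and the case-insensitive matching are all assumptions made for the example.

```python
import re
from fnmatch import fnmatchcase


def select_attachments(filenames, file_filter):
    """Return the filenames that match a wildcard filter.

    Matching is made case-insensitive here by lowercasing both sides;
    that behaviour is an assumption of this sketch.
    """
    pattern = file_filter.lower()
    return [name for name in filenames if fnmatchcase(name.lower(), pattern)]


def capture_total(attached_file_content):
    """Second step: capture a value from text produced by the first step.

    The "Total:" label is a hypothetical example of data to capture.
    """
    match = re.search(r"Total:\s*([\d.,]+)", attached_file_content)
    return match.group(1) if match else None


attachments = ["monthly report.pdf", "logo.png", "notes.docx"]

# A wildcard filter selects every Word document.
print(select_attachments(attachments, "*.docx"))

# An exact filename selects only that file.
print(select_attachments(attachments, "monthly report.pdf"))

# Text that an attachment reader step might have produced (made-up sample).
sample_text = "Invoice 1042\nTotal: 1,250.00\n"
print(capture_total(sample_text))
```

The point of the second function is that it never touches the email or the file itself. It only sees the text stored by the previous step, which is the idea behind chaining fields.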