Removing line breaks (or other unwanted characters) from captured text

June 12, 2014

You can master regular expressions or C# scripting and build a parser that captures exactly what you want from the email, but sometimes unwanted characters such as line breaks, tabs or weird characters (for instance ıġħť p̀ł) still end up in the captured text. To solve this, we apply one more parsing step to the captured text, called “text filtering and replacing”:

[Screenshot: filter_captured_text]

Notice the regular expression used:

[^\d\t\s()]

It matches any character except a digit (\d), a tab (\t), whitespace (\s) and the left and right parentheses. Since \s already covers tabs, the \t is redundant but harmless. This means that anything not commonly used in a phone number will match and will be removed. For example, if we have captured the following phone number in the field “mobile_phone”:

758956-786s.aw

The field “mobile_phone_filtered” will be:

758956786
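
To try the same filter outside the parser, here is a minimal Python sketch. It assumes the character class is written with its escape sequences (\d, \t, \s) and that "filtering" means replacing every match with an empty string. The function names filter_phone and collapse_line_breaks are made up for this illustration and are not part of the product; the second one is only a guess at how the same idea could be applied to line breaks and tabs.

```python
import re

# Anything that is NOT a digit, whitespace (tabs included) or a parenthesis.
NOT_PHONE_CHARS = re.compile(r"[^\d\t\s()]")


def filter_phone(captured: str) -> str:
    """Remove every character matched by the pattern (replace it with nothing)."""
    return NOT_PHONE_CHARS.sub("", captured)


def collapse_line_breaks(captured: str) -> str:
    """Hypothetical helper: turn runs of line breaks and tabs into a single space."""
    return re.sub(r"[\r\n\t]+", " ", captured).strip()


print(filter_phone("758956-786s.aw"))            # 758956786
print(collapse_line_breaks("John\r\nSmith\t"))   # John Smith
```

Note that spaces and parentheses survive the phone filter, so a value like "(555) 123-4567" would keep its parentheses and space and only lose the dash.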
