For my internship over the past several months I’ve been working in the Marketing Science department and part of my job has been getting data into MS Access and generating reports. This involves getting lists of prospects from various data sources. This was usually a pretty simple feat involving some basic SQL queries. However, sometimes I was handed data such as addresses that didn’t match any standard format used by IT. In the worst case the data was provided in a pdf which meant I could only export it to a non-delimited text file. I found that I really needed a couple generic regular expression functions to parse out fields to import into MS Access. I found some .xla examples online but I really wanted an easier to use, more extensive, and portable library. I also wanted to include a few basic patterns for so it wasn’t necessary to re-invent the wheel every time.
So, I created a simple Excel Add-In Regular Expressions.xla that adds several custom functions to implement the standard VBScript regular expressions.
Installation & Use
Step 1: After downloading Regular Expressions.xla you can enable it by navigating to Tools >> Add-Ins.
Step 2: In the dialog that appears click Browse and select Regular Expressions.xla in the directory you downloaded it to. After enabling the Add-In you can disable it at any time by reopening this dialog and un-checking the box.
Step 3: To Insert a function select a cell and click the button to bring up the dialog. All of the functions and pre-defined regular expression patterns will be listed under the Regular Expressions category.
- xREPLACE: Replace all portions of the search text matching the pattern with the replacement text.
- xMATCHES: Find and return the number of matches to a pattern in the search text.
- xMATCH: Find and return an instance of a match to the pattern in the search text. MatchIndex may be used in the case of multiple matches.
- xMATCHALL: Find and return a comma-separated list of all matches to the pattern in the search text.
- xGROUP: Find and return a group from within a matched pattern.
- xSTARTSWITH: Returns true or false if the search text starts with the pattern.
- xENDSWITH: Returns true or false if the search text ends with the pattern.
- pattern (text): the regular expression to use to match. Regular Expression Syntax
- searchText (text): the text to search for a match.
- replacementText (text): the string to replace the matched portion of the string with.
- ignoreCase (true/false): Set to false for a case-sensitive match. The default value is true.
- matchIndex (integer): When multiple matches are found, which one should be returned (1+)
- group (integer): Which subgroup (section within parenthesis) of the regular expression should be returned.
So far I have added basic regular expressions for matching phone numbers, 5 & 9 digit zip codes, emails, and urls. If you have any other useful, generic patterns you can send them to me at: email@example.com and I may add them. I used a standard GNU license for this so you are free to re-distribute it and I’d appreciate you letting me know of any improvements so I can add them myself.