For my internship over the past several months I’ve been working in the Marketing Science department and part of my job has been getting data into MS Access and generating reports. This involves getting lists of prospects from various data sources. This was usually a pretty simple feat involving some basic SQL queries. However, sometimes I was handed data such as addresses that didn’t match any standard format used by IT. In the worst case the data was provided in a pdf which meant I could only export it to a non-delimited text file. I found that I really needed a couple generic regular expression functions to parse out fields to import into MS Access. I found some .xla examples online but I really wanted an easier to use, more extensive, and portable library. I also wanted to include a few basic patterns for so it wasn’t necessary to re-invent the wheel every time.
So, I created a simple Excel Add-In Regular Expressions.xla that adds several custom functions to implement the standard VBScript regular expressions.
Installation & Use
Step 1: After downloading Regular Expressions.xla you can enable it by navigating to Tools >> Add-Ins.
Step 2: In the dialog that appears click Browse and select Regular Expressions.xla in the directory you downloaded it to. After enabling the Add-In you can disable it at any time by reopening this dialog and un-checking the box.
Step 3: To Insert a function select a cell and click the button to bring up the dialog. All of the functions and pre-defined regular expression patterns will be listed under the Regular Expressions category.
Function Descriptions
- xREPLACE: Replace all portions of the search text matching the pattern with the replacement text.
- xMATCHES: Find and return the number of matches to a pattern in the search text.
- xMATCH: Find and return an instance of a match to the pattern in the search text. MatchIndex may be used in the case of multiple matches.
- xMATCHALL: Find and return a comma-separated list of all matches to the pattern in the search text.
- xGROUP: Find and return a group from within a matched pattern.
- xSTARTSWITH: Returns true or false if the search text starts with the pattern.
- xENDSWITH: Returns true or false if the search text ends with the pattern.
Parameter Descriptions
- pattern (text): the regular expression to use to match. Regular Expression Syntax
- searchText (text): the text to search for a match.
- replacementText (text): the string to replace the matched portion of the string with.
- ignoreCase (true/false): Set to false for a case-sensitive match. The default value is true.
- matchIndex (integer): When multiple matches are found, which one should be returned (1+)
- group (integer): Which subgroup (section within parenthesis) of the regular expression should be returned.
So far I have added basic regular expressions for matching phone numbers, 5 & 9 digit zip codes, emails, and urls. If you have any other useful, generic patterns you can send them to me at: contact@malcolmp.com and I may add them. I used a standard GNU license for this so you are free to re-distribute it and I’d appreciate you letting me know of any improvements so I can add them myself.
Download: Regular Expressions.xla
Thanks for this. I only have a little problem – every time I quit Excel I am asked if I would like to save ‘Regular Expressions.xla’. What do I do wrong?
Oh, I see why it does that now. You did nothing wrong. I uploaded a new version. Just re-download it and it shouldn’t prompt you anymore. Thanks!
This saved a lot of time for me. Thanks for posting it!
Waoo Thanks a Lot, what a great solution. It Works smoothly.
This functions should be included by default in Excel >:\
Thanks again!!
Thansk a lot. It works perfectly.
You beaut! You saved me a chunk of time there writing a module for this
Really great work here, and a massive timesaver. Can’t beat regular expressions, and I’ve always been surprised that there’s no built in regex functions in excel
Hi,
How do I use =xxEMAIL() function?
In cell G2 I have an e-mail address and in cell J2 I want to check if G2 in e-mail format.
==xxEMAIL(G2) gives me an error #Value!
Can you help?
Kınd Regards,
Ya, I’ve been meaning to refactor this and expand it a bit.
xxEMAIL(), xxPHONE(), xxURL() are just patterns that can be passed to Match, Replace
Here’s what you use in J2 to match an email:
xMATCHES(xxEMAIL(),G2,TRUE)
Just to save others some time looking for the answer:
To use a ‘saved’ match in the replacement string (one that was matched in the regular expression and enclosed with parantheses), use the $ symbol.
For e.g. our string is “catfood is good”
Our find expression: ^cat(food)
Replacement string: $1
The resulting output in the cell will be: food
The part of the matched string that was enclosed in parantheses (in this case “food”) was saved in the variable $1. This also works for several saved matches, e.g. $2 etc..)
This is a great tool, thanks. I just ran into one limitation, namely that it is not able to distinct between upper and lower case characters.
Let’s say for instance that I have a string that ends with two capital letters and I try:
xMATCH(“[A-Z]{2}$”, xyz) it will return yz, though it only should return something if YZ are upper case.
Any help would be appreciated!
The match function defaults to ignoreCase = true. If you pass false as the third parameter it should work for you.
Awesome, did not realize that. Thanks Malcolm!
Works great, thanks! Any chance you could add tool tips when typing manually? Would be good to get reminded of the parameters without having to go to the wizard.
Great add-in, thanks! Works flawlessly in Excel 2013 so far.
I second Richard Benson’s comment: it would be great to see a tooltip reminding the expected parameters.
I’ve only just found this, but already it’s saved me a shed load of time. Excellent!
Hi Malcolm,
many thanks for your great Add-In. Anyway I ran into an issue where the function returns an value error: =xMATCHES(“page\=/D\d+”;B88:B91)
I’m working with Excel 2007. Do you have idea?
Kind regards
Mike
In xGroup function can you have more than one group number? In normal regular expressions you can use \2\1 etc to backreference what is in the ().
Hi,
i have two columns, 1st containing regex and 2nd contaning pattern to be matched by regex…
is it possible to search for every single cell of 2nd column all the cell of 1st column to find some match?
ex.
1st col:
23*
24*
25*
26*
2nd col:
23000
27000
as output i would like to have something like 3rd column:
23*
not matched
thanks in advance
Malcolm, this add-in comes in very handy, but this is the very first time I work with REGEX. I want to use you addin to extract some data from one cell, just like it is done in the following site: http://www.slipstick.com/developer/regex-parse-message-text/
For example, there is explained “how to Get two (or more) values from a message”: if in my Excel workbook I have a whole message (text document) copied in one single cell, and I want o extract the Order ID. Can it be done with any the functions in your add-in? Thank you for your time!
Hi there
I’m trying to use the add-in for custom validation of data, but when I try to add the following:
=xMATCHES(“^\d\d(\.\d\d)*$”;B2;”true”)=”1″
as a custom validation function I get an error saying “this type of reference cannot be used in a data validation formula”. Am I out of luck or doing something wrong?
I have been trying for days to finish a project that involved RegEx an http response header output. I had a hard time finding VBA code that would work until I came across this excellent tool. Does the job wonderfully and I wanted to thank the Author! Thanks again.
For the life of me I can’t understand why Excel doesn’t come with regex capabilities out of the box. You, sir, are a gentleman and a scholar!
AWESOME! THANKS MAN!
Thank you so much for your add-in. I love Regex and can do so many things with them but I was constantly juggling between files. Your add-in allowed me to save so much time and is exactly what I’ve been looking for. Thank you so much for offering it for free.
Sarah, then it works for you, can you post some example about how to use it? thanks. David
It is a great idea, that should come with Excel out of the box…I would suggest providing some examples of how to use it. I am trying to follow you post, but I don’t get any result when I try to use some of the regular expression function you provide. For example, when I use: =xMATCH(“[^ABC]“,AU1:AU30000), I get #VALUE!. I am using MS Office 2010.
Hi, there seems to be an issue with this automatically loading on excel launch. I’ve tried adding it to the add-ins folder and unblocking it but I still have to manually add it everytime I launch a new instance of excel, is there something I’m missing on how to get around this?
Hi
I want to search for multiple patterns in a single cell. How do I to that using Regex?
Example:
2374834, this is test, 783457, 77777-343-343434, joe@domain.com, 03/03/16, iwerioweroieurowieuroiruroiuo
I want to capture digits, emails, and date
2374834, 783457,77777-343-343434,joe@domain.com,03/03/16
What a wonderful tool. Thank you. Saved me a lot of time.
Just for Info or debate
The mailto URL type is not recognized with the xxURL match, but it will pass the xxEMAIL expression !
This may not be obvious to many, but it can make sense, if the purpose is to extract ‘something’ looking like an email address once can click on.
And I think the next RegEx AddIn evolution of the searchstring parameter, would be to offer a Range instead of just a Cell value.
Otherwise a nice tool, but would be better with more examples from the users – put somewhere…
Pingback: All UM Classes in One Spreadsheet – Nathan Estell
Hi
I Want To Search and put in every cell column (325100139817),(LA KEDE),(450/R1),(44460),(MAY17),(01119600-01129100),(0BNS2588120652CD7E48CCBE359FC2BB).
How do I to that using Regex?
For example
R#32258# PLNPAYD.325100139817 SUKSES. SN/Token: LA KEDE/450/R1/Rp.44460/MAY17/1BLN/01119600-01129100/0BNS2588120652CD7E48CCBE359FC2BB/ADM2500. Saldo 14.777.009 – 42.160 = 14.734.849 @19:14.
Hi Malcolm,
Thank you for putting this together. It’s been very helpful over the years.
Is it possible for the functions to work on the formulas referenced instead of the values? Or is that a limitation of the RexExp library?
Minor issue: When you run a second instance of Excel, it complains about there already being a Regular Expressions.xls file. Easily ignored, but thought you’d want to know.
Thanks again,
Andrew
Ahh, never mind about the first part. I’ve just discovered the FORMULATEXT function.
Thank you! Excellent work.
Thank you so much for sharing this amazing Excel regexp tool!! Works like charm and very easy to configure the expressions..
Thanks!!
Do you have any examples of how to use this tool in Excel? Sorry to be a bit dense, but it is not obvious to me how to use this with Excel functions. I have to clean up a spreadsheet with a lot of hard carriage returns and line feeds and insert periods at the end of sentences.
Thank you.
Rich Dev
I think this would be an invaluable tool if I could get it to work. Can you post an example where you actually use the addin as the only examples posted here are ones where the users said it didn’t work.
Hi Malcolm,
Thank you. Great Job. Should be integrated in Excel!
I’m using XL 13 and I cannot use it with a range (xMatches). I got a value error. It works with a single cell.
Can u help ?
Thx