Author Topic: Issues with regular expressions  (Read 2200 times)
Ingo Mierswa
Hero Member
Posts: 1226

« on: May 23, 2008, 11:36:07 PM »

Original message from SourceForge forum at

The following regular expression works in RapidMiner when applied to any text to extract words that begin with an uppercase letter: '[A-Z][a-z]+'.
However, as soon as I add a space to the pattern, it stops working. For example, '[A-Z][a-z][ ][A-Z][a-z]+' is not recognized as a valid regular expression.
The same expressions work well in other regex text editors.
Any ideas on why RapidMiner is not recognizing the space definition?

Edit by topic starter:

I found the answer shortly after posting this; spaces seem to be defined by \s as in:
The expression above works. However, it finds only the first occurrence of the match. Any ideas on how to get all occurrences?

Answer by Ingo Mierswa:

The regular expressions should be the same as those supported by Java, as explained here:
I am not entirely sure, but capturing groups might help here:
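
Since the expressions are said to follow Java's syntax, the behaviour can be checked outside RapidMiner. The following is only a minimal sketch in plain Java (java.util.regex), not RapidMiner's own code: the class name FindAllMatches, the helper findAll, and the sample text are made up for illustration. It shows that calling Matcher.find() repeatedly returns every non-overlapping match rather than only the first one. Whether a given RapidMiner operator does the same internally is not confirmed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; not part of RapidMiner.
public class FindAllMatches {

    // Collects every non-overlapping match of the regex in the text.
    public static List<String> findAll(String regex, String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        // Each call to find() continues after the end of the previous match.
        while (m.find()) {
            matches.add(m.group()); // group() with no argument is the whole match
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up sample text.
        String text = "Anna Lee met Bob Stone in Berlin";
        // "\\s" in Java source is the regex \s (any whitespace character).
        String regex = "[A-Z][a-z]+\\s[A-Z][a-z]+";
        System.out.println(findAll(regex, text)); // prints [Anna Lee, Bob Stone]
    }
}
```

In plain Java, a literal space in a character class ('[ ]') is accepted as well and gives the same matches for this text. If the two words are needed separately, the pattern can be written with capturing groups, for example '([A-Z][a-z]+)\s([A-Z][a-z]+)', and the parts read with m.group(1) and m.group(2) inside the same loop.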
