Open source software for big data analytics.
No programming required.

HomeContact UsSearchSitemapPrivacy PolicyImprint
  • Deutsch
  • English
Rapid-I. Report the Future. Home Download
Rapid-I Blog
Home Home
Search Search
RSS Feed RSS Feed

 

 

Blog Tags
Login Form





Lost Password?
No account yet? Register
Tag >> Regular Expressions
Regular Expressions 22 Oct 2010
Regular Expression Madness by Ingo Mierswa Comment (1)

I just stumbled upon this great blog post about some uncommon uses of regular expressions. RapidMiner also makes a lot use of those beasts, especially for the definition of filters so I thought this post might be interesting to you.

Both examples are taken from the book The Unix Programming Environment by Kernighan and Pike (1984).

The first problem is to produce a list of all English words that contain all five vowels exactly once and in alphabetical order.

The book creates a regular expression

^[^aeiou]*a[^aeiou]*e[^aeiou]*i[^aeiou]*o[^aeiou]*u[^aeiou]*$

then uses it to filter a dictionary file. This produced 16 words ranging from abstemious to majestious.

The second problem is to produce a list of all English words of at least six letters with letters appearing in increasing alphabetical order.

The book creates a regular expression

^a?b?c?d?e?f?g?h?i?j?k?l?m?n?o?p?q?r?s?t?u?v?w?x?y?z?$

then uses it to filter a dictionary file as before, except there is an additional filter stage.

This produced 17 words including common words such as almost and ghosty. Some of the more interesting results were bijoux, chintz, and egilops. Kernighan and Pike explain that egilops is a disease that attacks wheat.

For an explanation of those expressions please refer to the original blog post . And have fun while you are creating similar expressions for your next example filter ;-)

 

 

  • Share/Bookmark
  • Stay tuned with our RSS feed!
  • Watch videos on our YouTube channel!
  • Rapid Insight / Inside Rapid-I (Blog)
  • Visit Rapid-I on Facebook and become our fan!
  • Follow Rapid-I on Twitter!
  • Read the Rapid-I Newsletter