Parallel Processing with RapidMiner

From Rapid-I-Wiki


RapidMiner Enterprise Edition is already capable of using multicore machines and provides several parallelized algorithms. In our tests on a quad-core machine, we easily reached a speed-up by a factor of 3 for processes containing cross validation or parameter optimization. The image below shows the shorter runtimes (pink) compared to the traditional non-parallelized versions (blue).

[Image: Improvement multicore.jpg]

This is a nice speed-up, and on a machine with 8 or 16 GB of memory we are able to cope with very large data sets containing hundreds of millions of transactions (keeping the data in the database, of course).
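To put the factor of 3 on four cores into perspective, the following minimal Python sketch computes the parallel efficiency and, assuming Amdahl's law is a reasonable model for these processes, the fraction of the work that must have run in parallel. The function names are hypothetical and not part of RapidMiner; the 8-core figure at the end is a model extrapolation, not a measured result.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Share of the ideal linear speed-up that was actually achieved."""
    return speedup / cores


def amdahl_parallel_fraction(speedup: float, cores: int) -> float:
    """Parallelizable fraction p implied by Amdahl's law.

    Amdahl's law: S = 1 / ((1 - p) + p / N), solved here for p.
    Assumption: the measured speed-up is limited only by serial work.
    """
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)


def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speed-up predicted by Amdahl's law for a given core count."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


if __name__ == "__main__":
    measured_speedup = 3.0  # figure reported in the text above
    cores = 4               # quad-core test machine

    efficiency = parallel_efficiency(measured_speedup, cores)
    p = amdahl_parallel_fraction(measured_speedup, cores)

    print(f"efficiency: {efficiency:.2f}")
    print(f"implied parallel fraction: {p:.3f}")
    # Hypothetical extrapolation to 8 cores under the same model.
    print(f"predicted speed-up on 8 cores: {amdahl_speedup(p, 8):.2f}")
```

Under these assumptions, a speed-up of 3 on 4 cores corresponds to 75% efficiency and a parallel fraction of 8/9, so roughly one ninth of the runtime remains serial.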

True grid computing, however, is currently not supported. Starting with the Enterprise Edition as a base, you can probably implement this yourself; we did a quick test of this several months ago. As a side note: we decided against supporting grid computing (at least for the moment) in view of the current development of multicore machines.
