But what is the real advantage over WEKA, which is also open source and completely free?
So it is not allowed to have two free solutions?
I do not want to start a flame war over which of the tools is better for which purpose - that seems pointless to me, since both are open source and freely available, and everybody can test both and check which one better suits their needs.
But since you have asked, I will give you at least a short idea of why I prefer RapidMiner over Weka. I used to employ Clementine for all my data mining problems, and I also used Weka quite a lot (almost always for the same reason: I was missing some functionality in Clementine and Weka and had to implement it myself). About one year ago I came across RapidMiner, and this has actually changed my everyday working life - probably more than any other application I have been introduced to. Today, our company no longer spends money on Clementine (sorry guys) and all of our analysts have fully switched to RapidMiner. Here are some of the main reasons (they apply only to Weka; for Clementine things are a bit different):
1. power and flexibility:
Weka's Experimenter is easy to use, but let's face it: it is not flexible enough to meet real-world process requirements. IMHO it is not even flexible enough for scientific work (I know both worlds quite well, scientific data mining as well as real business). The same is basically true for the Weka Knowledge Flow: nice and in general quite similar to RapidMiner, but not nearly as powerful when it comes to the more complex processes that we need. RapidMiner provides many more analysis steps (operators) than Weka and many more possibilities to combine them. I am often amazed how the small modules of RapidMiner can be combined in such a way that you can solve analysis problems which cannot be solved by any other solution. Two thumbs up for the RapidMiner developers for coming up with such a clear and modular concept for data analysis processes.
2. memory usage and speed:
The first versions of RapidMiner I used were actually not faster than Weka - but they used much less memory. In the meantime (I use a pre-release of RapidMiner 4.3) the algorithms have also been optimized for speed. Our database contains 1.6 billion transactions, and our data mining processes work quite well on that amount of data. With Weka we always had to use rather small samples and were never able to work directly on the database. By the way: we recently upgraded to the Enterprise Edition and got a great performance gain. On our analysis server with 8 cores we got a nice runtime boost - the parallel version of the decision tree learner really rocks and delivers the results in about 1/8 of the time of the non-parallelized version. That's pretty cool.
3. visualization:
Things look much better in RapidMiner than in Weka. I do not refer to the look and feel here but to the really great visualization tools within RapidMiner. Try it yourself, you will love them.
4. preprocessing and data transformation:
It is really amazing how many methods for preprocessing and data extraction / transformation are available directly within RapidMiner. There are many more methods for these really important aspects of data analysis than in Weka (and also more than in any other tool I am aware of). This integrates all phases of analysis into one process / tool, and my work has become really smooth.
5. support:
This is not really an argument for the software itself, but anyway. As I said before, I used to work with Weka a lot and I have also developed some algorithms. I found several bugs in Weka and sent them to the Weka maintainers. The reaction was almost always the same: none. The developers of RapidMiner are much faster (did you notice how often and how quickly they implement feature requests coming from the community? You cannot imagine how well they work for their customers...). This is also something I never got from SPSS / Clementine, and I really like this about RM / Rapid-I as well.
Since some might ask why we prefer RapidMiner over Clementine, the answer is quite simple: it offers many more data mining and analysis possibilities at no (or only a small) cost.
The answer grew longer than I had intended. But since this was also an important question for me one year ago, I hope it helps some of the readers. But again: please do not be offended if you prefer Weka for one reason or another. That's of course fine, and this is only my own story of why I changed tools. And let's not start a big discussion about the pros and cons - there are of course also some drawbacks to RapidMiner, as every user might know (personally, I found the beginning quite hard since there were so many different possibilities). I just wanted to let others know why getting used to RapidMiner might be a good idea, even if this means slightly more work until you get used to its many different options.