|Untagged||16 Sep 2010|
|RCOMM 2010 - Day 2 by Ingo Mierswa||
The second day of RCOMM 2010 was just as exciting as the first. We started with a great talk by Thomas Ott of http://www.neuralmarkettrends.com about Forecasting Historical Volatility for Option Trading. Thomas is an awesome speaker, and it was great listening to him and his experiences with modeling the markets:
By the way, Thomas has written a nice wrap-up of his RCOMM 2010 experience so far at http://www.neuralmarkettrends.com/2010/09/14/rcomm-2010-having-a-blast/. Here is another picture of him answering a question from the audience:
The second talk in the first session was given by Marin Matijas about a fascinating application domain for RapidMiner, namely load forecasting in the energy sector. It was great to see what Marin has already achieved by predicting the necessary amount of energy much better than what was achieved before.
The next three talks dealt with aspects of the RapidMiner architecture and how data analysis with RapidMiner can be improved in terms of memory efficiency and/or runtime. Alexander Arimond showed a solution for distributed data mining based on the Map & Reduce paradigm (for example Hadoop), which achieved a tremendous speed-up of up to a factor of 6 on eight machines.
Marco Stolpe showed how a hierarchical variant of frequent item sets, namely hierarchical heavy hitters, can be implemented in RapidMiner. This should become the starting point for the discussion about how stream mining can be integrated into RapidMiner in general. We will come back to this during the next weeks, and I am looking forward to finding a solution in collaboration with Marco.
The last architecture talk was given by Olaf Laber from our partner Ingres. He showed how scalable high-speed data mining can be achieved by combining RapidMiner with Ingres VectorWise:
Imagine learning a decision tree on 10 million records in just a couple of seconds while using less than 1 gigabyte of memory. We experienced a speed-up of up to a factor of 40 for Naive Bayes. Welcome to Ingres VectorWise + RapidAnalytics!
In the workshop session, our head of research & development, Simon Fischer, gave a live demo of RapidAnalytics and showed how easily data and processes can be shared or integrated by means of web services. We got a lot of positive feedback on RapidAnalytics, and we will soon release the Community Edition to the general public (please contact us if you want to become a pilot customer):
Sebastian Land then showed the new R extension for RapidMiner. We got a lot of positive feedback on the extension as well, along with suggestions for improvement that we will certainly take into account for the second version.
The last session dealt with information and relation extraction. Timur Fayruzov started with a great talk about the extraction of protein interactions. The results were quite impressive and also included a nice web interface with RapidMiner running in the background as the engine.
The last talk was given by Felix Jungermann. He showed his Information Extraction Plugin, which allows for the generic extraction of information from documents, for example Named Entity Recognition. The extension comes with an awesome graphical user interface as well as many new algorithms, and I am really looking forward to its release.
The first RapidMiner Community Meeting and Conference was a complete success. The quality of the talks was far above average and I met so many lovely people. We had a lot of great discussions, and new plans and projects were born as well. Thanks to everyone who participated. I am looking forward to meeting all of you again next year.