Rapid-I Blog
Tag >> Fun
Fun, Data Mining 23 Mar 2011
Predictive Analytics and Cricket by Ingo Mierswa Comment (1)

I am not really deep into cricket myself. However, I found this interesting blog entry which discusses some reasons for successful cricket games, discovered by data mining. It is not hard to tell that the author favors the Indian team :-)

The first thing to do is some basic statistics: How often did the Indian cricket team win in the past against certain other teams? For example, the Indian team won against England in 66% of all matches between the two teams during the last 5 years. Against Australia, however, India won only 40% of those matches.

So the important point is: what were the circumstances under which India had won those 40%?  And here is where RapidMiner was used: the matches were described by attributes like "partnership", "pace bowlers", or "slow bowlers". The resulting decision tree looks like the following:

Decision Tree for Cricket

The model was built on all existing cases between India and Australia from the last 5 years. It is easy to tell that partnerships play the most significant role. In particular, 

  • India need to have 2 significant partnerships worth at least 77 runs
  • If not, the bowlers, specifically pace bowlers, have to step into the breach and take more than 7 wickets
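
The two rules above can be written down as a tiny function. This is only a sketch of the rules as quoted in this post, not the actual decision tree from the original report: the function name, the input format, and the exact reading of the thresholds (at least 77 runs, more than 7 wickets) are assumptions.

```python
def predict_india_win(partnership_runs, pace_wickets):
    """Apply the two rules quoted above (hypothetical helper, not the real model).

    partnership_runs: runs scored by each batting partnership in the match
    pace_wickets: number of wickets taken by the pace bowlers
    """
    # Rule 1: at least two partnerships worth 77 runs or more
    significant = [runs for runs in partnership_runs if runs >= 77]
    if len(significant) >= 2:
        return True
    # Rule 2: otherwise the pace bowlers have to take more than 7 wickets
    return pace_wickets > 7


print(predict_india_win([80, 95, 12], pace_wickets=3))
print(predict_india_win([80, 20, 12], pace_wickets=8))
```

A real tree learned by RapidMiner would also carry class distributions at its leaves, which is what you would check when validating the findings.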

Without any knowledge about cricket, I have hardly any idea what this actually means. I suppose that those two strong partnerships with 77 runs or more are two pairs of good batting partners playing well with each other. If you don't have those, it seems that fast bowlers taking down the wooden "goals" at least 7 times helps a lot.

This is what data mining is actually about: finding insights in data without the need for prior knowledge (of course you have to validate the findings!). That validation is missing from the blog post but may be part of the full report, which can be downloaded from the web site. Still, it is a fun read and a nice data mining application!

Fun 16 Feb 2011
Go, Watson, Go: Win at Jeopardy with Basic Statistics by Ingo Mierswa Comment (2)
Well done, IBM. The new supercomputer named Watson was created and trained during the last 4 years by 25 IBM engineers in order to play (and win!) at Jeopardy. I have just watched a short video about the event and the result really looks impressive:




Watson played quite well against two of the best Jeopardy players in the world. I especially liked seeing the confidences at the bottom of the screen, which allowed me to check the quality of their model. And they did a good job: in most cases where the confidence was high, Watson was right.

Another nice thing was the reactions of the other contestants: several times they seem to know the answer (the question) as well, but they are simply too slow.

And this was only day 1; on the second day of this three-day contest Watson performed even better. But after having dug a bit deeper I found out that the techniques used were pretty simple. At first, I thought that Watson understood the question by hearing it, but he gets the clues directly as text. This is of course a big advantage since you don't lose any time with "understanding" what has been said or written. Talking about time, Watson has another big advantage: he does not lose any time pressing the buzzer.

The basic techniques are pretty simple as well: Watson stores about 200 million pages in a large search index - among them the complete Wikipedia - and searches for the given answer in those pages (ok, we probably all know how this works). From the top k results Watson extracts the most important person / concept / object etc. and creates an appropriate question. Few details have leaked about that, but from that little I got the impression that it's merely a topic detection or a named entity recognition and that the confidence is based more or less on the average of the topic / NER confidences. Mix those simple ideas with the power of 2800 traditional computers and you get an impressive result...
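
To make that impression concrete, here is a toy sketch of the pipeline as described above: search, take the top k hits, pick an entity, average the scores into a confidence. It is certainly not how Watson is implemented. The three-page "index", the word-overlap score, the use of the page title as the extracted entity, and all names are assumptions made for illustration.

```python
# Hypothetical stand-in for the search index: page title -> page text
PAGES = {
    "Isaac Newton": "english physicist who formulated the laws of motion and gravity",
    "Albert Einstein": "physicist who developed the theory of relativity",
    "Marie Curie": "physicist and chemist who pioneered research on radioactivity",
}


def score(clue, text):
    """Fraction of the clue's words that also occur in the page text."""
    clue_words = clue.lower().split()
    text_words = set(text.lower().split())
    return sum(word in text_words for word in clue_words) / len(clue_words)


def answer(clue, k=2):
    """Return a Jeopardy-style question and a confidence for the given clue."""
    # Rank all pages by overlap with the clue and keep the top k
    ranked = sorted(PAGES, key=lambda title: score(clue, PAGES[title]), reverse=True)[:k]
    # Assumption: the title of the best page plays the role of the extracted entity
    best = ranked[0]
    # Assumption: confidence is the plain average of the top-k scores
    confidence = sum(score(clue, PAGES[title]) for title in ranked) / len(ranked)
    return f"Who is {best}?", confidence


question, confidence = answer("physicist who developed the theory of relativity")
print(question, round(confidence, 2))
```

Averaging over the top k hits means the confidence drops when the runner-up pages match the clue poorly, which is a crude way of capturing how much the evidence agrees.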

The simple ideas most often are the most robust ones and the scientific and engineering efforts are impressive. Thanks, IBM, for those efforts and also for the positive effect this show probably has on the public acceptance of data mining and business analytics.
Fun 29 Dec 2010
Christmas Tree 2.0 by Ingo Mierswa Comment (0)

Maybe next year:

 

 

The video shows some "Christmas trees" built from ferrofluids in dynamic magnetic fields:

http://en.wikipedia.org/wiki/Ferrofluid

Fascinating stuff.

Math, Fun 3 Dec 2010
Fun Math Trick: Squaring Numbers Close to 100 by Ingo Mierswa Comment (1)

I just stumbled upon this nice little trick which helps you to square numbers which are close to 100.

Let's say you want to calculate 105*105. Take the difference between 105 and 100, which is 5, and add it to 105 to get 110: these are the first three digits. Then append 5 squared, written as the two digits 25, and you get 11025, which is the result. This works for all numbers up to 150 but is more useful if the number is close to 100. By the way, it also works for numbers smaller than 100, but in that case you have to subtract the difference to 100 from your number.
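
Why it works: write the number as n = 100 + d. Then n*n = 10000 + 200*d + d*d = 100*(n + d) + d*d. So "n plus the difference" gives the leading digits and d squared gives the last two. Simply gluing the digits together only works while d squared has at most two digits (a difference of 9 or less); for larger differences the extra digit carries over into the leading part. A small sketch to check this, where the function name is made up for this example:

```python
def square_near_100(n):
    """Square n using the trick: n*n = 100 * (n + d) + d*d with d = n - 100."""
    d = n - 100                    # difference to 100 (negative below 100)
    return (n + d) * 100 + d * d   # leading part shifted by two digits, plus d squared


for n in (105, 96, 112):
    print(n, square_near_100(n), n * n)
```

For 96 the difference is -4, so the leading part is 96 - 4 = 92 and the result is 9216.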

The following video gives you some more examples and also shows what happens for larger distances:

 

 

Neat!

 

Fun 28 Oct 2010
Juggling in a Cone by Ingo Mierswa Comment (0)

This video is just for fun, for those readers who are also interested in the mathematical and statistical background of data mining. It shows Greg Kennedy standing in an 8-foot high inverted cone. He juggles 3, 5 and 7 balls on the inside surface and makes great use of the principles of geometry and physics:

 

 

Greg is well known worldwide not only for traditional juggling but also for creating entirely new forms of manipulation. Visit www.innovativejuggler.com for more info.
