Rapid-I Blog
Fun 16 Feb 2011
Go, Watson, Go: Win at Jeopardy with Basic Statistics by Ingo Mierswa
Well done, IBM. The new supercomputer named Watson was built and trained over the last four years by 25 IBM engineers in order to play (and win!) at Jeopardy. I have just watched a short video of the event, and the result looks really impressive:




Watson played quite well against two of the best Jeopardy players in the world. I especially liked seeing the confidences at the bottom of the screen, since this allowed me to check the quality of their model. And they did a good job: in most of the cases where Watson was highly confident, it was also right.

Another nice thing was the reaction of the other contestants: several times they seemed to know the answer (the question) as well, but they were simply too slow.

And this was only day 1; on the second day of this three-day contest Watson performed even better. But after having dug a bit deeper, I found out that the techniques used are pretty simple. At first, I thought that Watson understood the clues by listening, but it actually receives them directly as text. This is of course a big advantage, since it does not lose any time "understanding" what has been said or written. Talking about time, Watson has another big advantage: it does not lose any time pressing the buzzer.

The basic techniques are pretty simple as well: Watson stores about 200 million pages in a large search index, among them the complete Wikipedia, and searches those pages for the given answer (OK, we probably all know how this works). From the top k results, Watson extracts the most important person, concept, object, etc. and creates an appropriate question. Few details have leaked about this step, but from the little I have seen I got the impression that it is merely topic detection or named entity recognition (NER), and that the confidence is based more or less on the average of the topic / NER confidences. Mix those simple ideas with the power of 2800 traditional computers and you get an impressive result...
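
To make the idea concrete, here is a minimal sketch in Python of how such a pipeline could look. It is not IBM's implementation, only an illustration of the three steps guessed at above: index the pages, fetch the top k hits for a clue, and average the evidence for the most frequent entity. The tiny page collection, the stopword list, the function names, the word-overlap score, and the fixed "Who is ...?" phrasing are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for the page collection: (entity the page is about, page text).
PAGES = [
    ("Isaac Newton", "English physicist who formulated the laws of motion and universal gravitation"),
    ("Isaac Newton", "Newton published the Principia, stating the laws of motion"),
    ("Albert Einstein", "German-born physicist who developed the theory of relativity"),
    ("Marie Curie", "Physicist and chemist who pioneered research on radioactivity"),
]

# Assumed stopword list, just large enough for this toy example.
STOPWORDS = {"the", "of", "and", "who", "on", "a", "this", "his", "in"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}


def build_index(pages):
    """Build an inverted index: token -> set of page ids containing it."""
    index = defaultdict(set)
    for page_id, (_, text) in enumerate(pages):
        for token in tokenize(text):
            index[token].add(page_id)
    return index


def top_k(clue, index, k):
    """Return up to k (page_id, score) pairs; score = fraction of clue tokens matched."""
    clue_tokens = tokenize(clue)
    hits = defaultdict(int)
    for token in clue_tokens:
        for page_id in index.get(token, ()):
            hits[page_id] += 1
    # Sort by hit count (descending), then by page id to keep ties deterministic.
    ranked = sorted(hits.items(), key=lambda item: (-item[1], item[0]))[:k]
    return [(page_id, count / len(clue_tokens)) for page_id, count in ranked]


def answer(clue, pages, index, k=3):
    """Pick the entity with the most evidence in the top k hits.

    The confidence is that entity's summed score divided by k, i.e. an
    average over the top k hits in which disagreeing hits count as zero.
    """
    results = top_k(clue, index, k)
    if not results:
        return None, 0.0
    evidence = defaultdict(float)
    for page_id, score in results:
        evidence[pages[page_id][0]] += score
    entity = max(evidence, key=evidence.get)
    return f"Who is {entity}?", evidence[entity] / k


index = build_index(PAGES)
question, confidence = answer("This English physicist formulated the laws of motion", PAGES, index)
print(question, round(confidence, 2))
```

A real system would need far better ranking, entity extraction, and question phrasing, but the structure shows why a confidence value falls out of the retrieval step almost for free.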

Simple ideas are most often the most robust ones, and the scientific and engineering efforts are impressive. Thanks, IBM, for those efforts and also for the positive effect this show will probably have on the public acceptance of data mining and business analytics.
Comments (2)

Mark van de Ven said:

more on workings
In this video, they explain a little more about how it works: http://www.youtube.com/watch?v=d_yXV22O6n4

very cool indeed!
 
February 16, 2011

Ingo Mierswa said:

Hi Mark,

thanks for the link!

Cheers,
Ingo
 
February 16, 2011
