Author Topic: K-means on CSV file  (Read 776 times)
Swen
« on: January 13, 2014, 10:33:57 AM »

Hello everyone.

I have a CSV file containing blog posts, including the author name, date posted, etc.

Now I want to apply K-means clustering to the content of the blog posts. I am trying to use the RapidMiner text tool to apply TF-IDF vectorisation. However, I can't figure out how to apply TF-IDF to every blog post in the CSV file. Any suggestions?

Cheers! 
Marius
Administrator
« Reply #1 on: January 13, 2014, 04:36:33 PM »

Hi,

You need TF-IDF only if you have the actual contents of the blog posts, i.e. the text. In this case you can find some useful video tutorials on text mining here: http://vancouverdata.blogspot.de/2010/11/text-analytics-with-rapidminer-loading.html

I would first focus on the text and add the other attributes, like author and date, later on. If you need help, feel free to come back to this forum.
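
To make the "text first" idea concrete outside of RapidMiner, here is a minimal sketch in plain Python (standard library only). It is not the RapidMiner process itself, just an illustration of the same steps: read the CSV, turn each row's text into a TF-IDF vector, then run K-means on those vectors. The column name "content", the sample rows, and the helper names tfidf_rows and kmeans are assumptions made up for this example, and the K-means here uses a simple "first k rows" initialisation rather than anything tuned.

```python
import csv
import io
import math
import re
from collections import Counter

# Hypothetical sample data; the real file and its column names will differ.
SAMPLE_CSV = """author,date,content
Ann,2014-01-02,"Football match results and football league scores"
Bob,2014-01-03,"Baking bread with sourdough and flour"
Cid,2014-01-05,"League scores after the football match"
Dee,2014-01-07,"Sourdough bread recipe using rye flour"
"""

def tfidf_rows(texts):
    """Return (vocabulary, one L2-normalised TF-IDF vector per text)."""
    docs = [Counter(re.findall(r"[a-z]+", t.lower())) for t in texts]
    vocab = sorted(set().union(*docs))
    n = len(docs)
    # Document frequency: in how many texts does each term occur?
    df = Counter(term for d in docs for term in d)
    vectors = []
    for d in docs:
        total = sum(d.values()) or 1
        vec = [(d[t] / total) * math.log(n / df[t]) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors

def kmeans(vectors, k, iterations=20):
    """Plain K-means with squared Euclidean distance; returns a label per row."""
    centroids = [list(v) for v in vectors[:k]]  # assumption: first k rows as seeds
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every row.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Only the text column is used; author and date are ignored for now.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
vocab, vectors = tfidf_rows([row["content"] for row in rows])
labels = kmeans(vectors, k=2)
for row, label in zip(rows, labels):
    print(label, row["author"], row["content"])
```

The point is that TF-IDF is computed over the whole text column at once, giving one vector per row, and the clustering only ever sees those vectors. Attributes like author and date can be joined back in afterwards via the row order.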

Best regards,
Marius