Author Topic: [SOLVED] Getting TF-IDF from unpivoted data  (Read 423 times)
louism
« on: February 18, 2014, 04:47:18 PM »

Hi, I am trying to do text mining. I don't have the original documents; the words are already stored in a database, one word per row. For example:

Doc A:  How are you?
Doc B: I am fine

What I have is a MySQL table like:

A How
A are
A you
B I
B am
B fine

I am a total newbie and rely heavily on text mining tutorials, so it would probably be easier for me to go back to the document form. I could then plug that into what I see in most text mining tutorials and generate my TF-IDF word vectors after cleaning up my data.
« Last Edit: February 18, 2014, 07:05:15 PM by louism »
louism
« Reply #1 on: February 18, 2014, 07:04:32 PM »

Solved this by using the GROUP_CONCAT aggregate function in MySQL to rebuild a table with one row per document, where a text field holds all of that document's words appended one after the other.
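
In case it helps anyone else, here is a rough sketch of the idea, not the exact query I ran. It uses Python with an in-memory SQLite database standing in for MySQL, and the table and column names (doc_words, doc, word) are made up for the example. SQLite spells the function group_concat(word, ' '); the MySQL form is shown in a comment. The tf_idf helper is also just an illustration, using one common weighting (term count divided by document length, times ln(N / document frequency)); the tutorials may use a different variant.

```python
import math
import sqlite3
from collections import Counter

# Hypothetical table and column names; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc_words (doc TEXT, word TEXT)")
rows = [("A", "How"), ("A", "are"), ("A", "you"),
        ("B", "I"), ("B", "am"), ("B", "fine")]
conn.executemany("INSERT INTO doc_words VALUES (?, ?)", rows)

# Rebuild one row per document, with its words joined by spaces.
# MySQL form (assumed equivalent):
#   SELECT doc, GROUP_CONCAT(word SEPARATOR ' ') FROM doc_words GROUP BY doc;
docs = dict(conn.execute(
    "SELECT doc, group_concat(word, ' ') FROM doc_words GROUP BY doc"))


def tf_idf(docs):
    """Return {doc: {word: weight}} with tf = count / doc length, idf = ln(N / df)."""
    tokens = {d: text.lower().split() for d, text in docs.items()}
    # Document frequency: in how many documents each word appears.
    df = Counter(w for words in tokens.values() for w in set(words))
    n_docs = len(tokens)
    weights = {}
    for d, words in tokens.items():
        counts = Counter(words)
        weights[d] = {w: (c / len(words)) * math.log(n_docs / df[w])
                      for w, c in counts.items()}
    return weights


weights = tf_idf(docs)
for doc_id in sorted(weights):
    print(doc_id, weights[doc_id])
```

Word order inside the rebuilt text is not guaranteed by the grouping, which does not matter for TF-IDF since it only counts words. In MySQL itself, GROUP_CONCAT output is truncated at group_concat_max_len (1024 by default), so that setting may need raising for long documents.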