Topic: IDF Calculation for Test Set
smjsmj1
« on: June 27, 2014, 12:32:00 PM »

Can anyone explain how the IDF value is calculated for test sets?
Is it based on the IDF of the training set?
I see that the test set takes only the word list used by the training set, and the IDF is calculated solely from the test set. So, if the test set contains only one document, the IDF can become 0, correct?
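
To make the concern concrete, here is a minimal Python sketch. It assumes the classic formula idf = log(N / df), where N is the number of documents and df is the number of documents containing the term. The exact formula the tool uses is not stated in this thread, and the function name and sample document are made up for illustration.

```python
import math

def idf(term, documents):
    """Classic IDF, log(N / df). Assumed formula, not taken from any specific tool."""
    n_docs = len(documents)
    # df = number of documents that contain the term at least once
    df = sum(1 for doc in documents if term in doc)
    if df == 0:
        return None  # term absent from the corpus; undefined under this formula
    return math.log(n_docs / df)

# A "test set" consisting of a single document (hypothetical data).
single_doc_test_set = [{"cheap", "flights", "today"}]

# N = 1 and df = 1, so idf = log(1 / 1) = 0 for every term in that document.
idf_single = idf("cheap", single_doc_test_set)
```

Under that assumption, every term present in a one-document corpus gets an IDF of 0, so all TF-IDF weights computed from that corpus alone would vanish.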
fras
Global Moderator
« Reply #1 on: June 27, 2014, 08:45:52 PM »

If you are using TF-IDF, you must store the model _and_ the word list after training.
To test or score unseen data, you have to preprocess it with exactly the same
"Process Documents" operator that you used for training, including the word list.
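
The general idea of reusing the training word list can be sketched in plain Python. This is only an illustration, not the operator's actual implementation: it assumes that the stored word list carries the document frequencies from training, that IDF is log(N / df), and that terms are split on whitespace. The names fit_idf and transform and the sample documents are hypothetical.

```python
import math
from collections import Counter

def fit_idf(train_docs):
    """Learn the word list and its IDF weights from the training documents only."""
    n_docs = len(train_docs)
    df = Counter()
    for doc in train_docs:
        df.update(set(doc.split()))  # count each term once per document
    return {term: math.log(n_docs / count) for term, count in df.items()}

def transform(doc, idf_by_term):
    """Vectorize one unseen document with the stored word list and IDF weights."""
    # Terms that are not in the training word list are dropped.
    tf = Counter(t for t in doc.split() if t in idf_by_term)
    # One entry per training term, so the vector layout matches the model.
    return {term: tf[term] * idf_by_term[term] for term in idf_by_term}

train_docs = ["cheap flights today", "cheap hotel deals", "weather report today"]
idf_by_term = fit_idf(train_docs)  # stored alongside the model after training

# Scoring a single unseen document: the weights come from training,
# so they do not collapse to 0 just because the test set has one document.
test_vector = transform("cheap flights tomorrow", idf_by_term)
```

In this sketch the single test document still gets non-zero weights for "cheap" and "flights", while "tomorrow" is ignored because it never appeared in training.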
smjsmj1
« Reply #2 on: July 03, 2014, 06:19:30 AM »

Thank you for the reply.