Author Topic: Question Mark on p-Value  (Read 939 times)
Gabo
Newbie
*
Posts: 3


« on: January 16, 2014, 06:05:35 PM »

Hello Guys,

I'm running a linear regression on a data set that has about 500k rows and 66 attributes. RapidMiner is running on Windows, with 8 GB of memory for itself alone and a 2.4 GHz Xeon processor. These are my problems:

First: the process takes about 40 minutes to finish, which seems like a long time compared with other tools I've used.

Second, and more important: for the p-values, standard errors, and some other metrics I get a "?" (question mark). I don't know what that means, and I'm starting to think that something is wrong with RapidMiner. I'm including a picture with the results.


Thank you very much!!!
Marius
Administrator
Hero Member
*****
Posts: 1793



« Reply #1 on: January 17, 2014, 10:15:11 AM »

Hi,

RapidMiner's Linear Regression not only performs the actual regression, but also eliminates collinear features, performs a feature selection, etc. This can take quite some time and often improves the model quality, but you can try switching it off to see how the runtime is affected. Out of curiosity, which other tools are you using, and how long do they need for your dataset?
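
To give a rough idea of what "eliminating collinear features" can involve, here is a minimal Python/NumPy sketch. It is not RapidMiner's actual implementation; the function name drop_collinear, the tolerance, and the toy data are assumptions made purely for illustration. It keeps a column only if adding it raises the rank of the columns kept so far.

```python
import numpy as np

def drop_collinear(X, tol=1e-8):
    """Hypothetical helper: return indices of columns that are not
    linear combinations of the columns kept before them."""
    kept = []
    for j in range(X.shape[1]):
        candidate = X[:, kept + [j]]
        # The column is kept only if it increases the rank by one.
        if np.linalg.matrix_rank(candidate, tol=tol) == len(kept) + 1:
            kept.append(j)
    return kept

# Toy data (made up): the third column is an exact combination of the first two.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([a, b, 2 * a - 3 * b])

kept = drop_collinear(X)
print(kept)
```

Every rank check here is a full decomposition of the candidate matrix, so even this naive version gets noticeably slower as the number of rows and attributes grows. That is one plausible reason why such preprocessing adds to the runtime.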

The issue regarding the missing values has been forwarded to our development team.

Best regards,
Marius