Author Topic: meaning of sample ratio in ArffExampleSource  (Read 1007 times)
lotusinsnow
Newbie
*
Posts: 2


« on: June 09, 2009, 06:44:23 PM »

Dear all,

I have a very large dataset, so clustering takes a long time and does not finish successfully. When I set sample_ratio=0.1 in ArffExampleSource, it ran successfully! Could you please tell me what kind of sampling mechanism RapidMiner uses, so I can get an idea of what the data looks like after sampling with sample_ratio?

Many thanks,
Jing
Logged
lotusinsnow
Newbie
*
Posts: 2


« Reply #1 on: June 10, 2009, 12:23:19 AM »

I looked at the code, and the sample is chosen randomly according to the ratio.
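
For anyone wondering what that looks like in practice, here is a minimal sketch of ratio-based random sampling while reading rows one at a time. It is only an illustration, not the actual RapidMiner code: the function name `sample_rows` and the per-row independent draw are assumptions, and the real loader may work differently in detail.

```python
import random


def sample_rows(rows, sample_ratio, seed=None):
    """Yield each row independently with probability sample_ratio.

    Hypothetical helper for illustration; not RapidMiner's implementation.
    Rows are processed one at a time, so the full dataset never has to be
    held in memory.
    """
    rng = random.Random(seed)
    for row in rows:
        # rng.random() is uniform on [0, 1), so a row is kept with
        # probability sample_ratio.
        if rng.random() < sample_ratio:
            yield row


if __name__ == "__main__":
    kept = list(sample_rows(range(10000), sample_ratio=0.1, seed=42))
    # The sample size is only approximately 10% of the input.
    print(len(kept))
```

With this kind of sampling the size of the result is only approximately sample_ratio times the number of rows, and the class proportions of the original data are preserved only on average.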

Jing
Logged
Sebastian Land
Administrator
Hero Member
*****
Posts: 2426


« Reply #2 on: June 15, 2009, 09:34:53 AM »

Hi Jing,
You are correct. For more sophisticated sampling algorithms, see the preprocessing/data/sampling group. There we provide operators such as Kennard-Stone sampling and stratifiedSampling. Of course, your data has to fit entirely into memory in order to sample it with these operators...
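
To illustrate the idea behind stratified sampling and why it needs the data in memory, here is a rough sketch. It is not the RapidMiner operator: the function `stratified_sample`, its `label_of` parameter, and the rule of keeping at least one row per class are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_sample(rows, label_of, sample_ratio, seed=None):
    """Draw roughly sample_ratio of the rows from every class.

    Hypothetical helper for illustration. All rows are grouped by label
    first, which is why the whole dataset has to fit into memory.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[label_of(row)].append(row)

    sample = []
    for members in groups.values():
        # Keep at least one row per class so small classes do not vanish.
        k = max(1, round(len(members) * sample_ratio))
        sample.extend(rng.sample(members, k))
    return sample


if __name__ == "__main__":
    data = [("a", i) for i in range(90)] + [("b", i) for i in range(10)]
    picked = stratified_sample(data, label_of=lambda r: r[0], sample_ratio=0.1, seed=1)
    print(len(picked))
```

Unlike plain random sampling by ratio, this keeps the class proportions of the sample close to those of the full dataset.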

Greetings,
  Sebastian
Logged