Topic: Additional Descriptive Statistics Wanted as Options
mnorth
« on: October 25, 2010, 09:50:23 PM »

Hi all,

I think it would be very handy to have Median and Mode as additional default options within the Replace Missing Values (RMV) operator. I can do this now using the Value option, but that requires me to calculate the Median or Mode separately and then plug it in as the value. I would also like to see the Median and Mode displayed in the Statistics column of the Meta Data View of the Results Workspace. If these descriptives were available there, it would be easier to plug them into the Value option of the RMV operator in the Design Workspace.

Matt
Simon Fischer
« Reply #1 on: October 26, 2010, 09:29:32 AM »

Hi,

For nominal attributes, the "average" option already means the mode. The median is currently missing, however. As a workaround, compute the median with an Aggregate operator and extract the value as a macro.
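To illustrate the idea outside RapidMiner, here is a minimal sketch in plain Python, not RapidMiner's own implementation. It fills missing nominal values with the mode and missing numeric values with a separately computed median. The helper names `observed` and `fill_missing` and the sample data are hypothetical and chosen only for illustration.

```python
from statistics import median, mode


def observed(values):
    """Return only the non-missing entries (missing is represented as None)."""
    return [v for v in values if v is not None]


def fill_missing(values, replacement):
    """Return a copy of values with every None replaced by replacement."""
    return [replacement if v is None else v for v in values]


# Hypothetical sample columns with one missing value each.
colors = ["red", "blue", None, "red", "green"]   # nominal attribute
ages = [23, None, 31, 27, 95]                    # numeric attribute

# Nominal: the "average" is the most frequent value (the mode).
colors_filled = fill_missing(colors, mode(observed(colors)))

# Numeric: compute the median first, then plug it in as the replacement value.
ages_filled = fill_missing(ages, median(observed(ages)))
```

The two-step shape (compute the statistic, then substitute it) mirrors the workaround of calculating the value separately and passing it to the replacement step.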

Best,
Simon
Simon Fischer, Rapid-I
RapidMiner Development on Twitter: @simon_fis
mnorth
« Reply #2 on: October 26, 2010, 05:50:55 PM »

Hi Simon,

I had not thought to use an Aggregate operator, so that suggestion is helpful for getting the median.

I think it makes sense that average is mode for nominal attributes, but for numeric attributes (especially ones which might not be very normal in their distribution) the arithmetic mean, median and mode are likely to be different. Having all three of these measures of central tendency as options would, I think, be very useful when replacing missing values.
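As a small worked example of that point, the sketch below uses Python's standard `statistics` module on a made-up, right-skewed sample. The `incomes` data and the `summary` name are assumptions for illustration only.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed sample: one large value pulls the mean upward.
incomes = [20, 20, 25, 30, 35, 40, 250]

# The three measures of central tendency for the same data.
summary = {
    "mean": mean(incomes),      # sum of values divided by their count
    "median": median(incomes),  # middle value of the sorted sample
    "mode": mode(incomes),      # most frequent value
}
```

For this sample the three measures come out as 60, 30 and 20 respectively, so the choice of replacement value would change what gets imputed.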

Thanks,

Matt