com.rapidminer.operator.performance
Class RankStatistics

java.lang.Object
  extended by com.rapidminer.operator.performance.RankStatistics

public class RankStatistics
extends java.lang.Object

Provides methods to compute ranks for a single attribute and rank correlations for two attributes. When computing rank correlations, examples containing missing values for either attribute are skipped. When computing ranks, missing values receive the rank NaN. Each method has an overload that accepts an imprecision tolerance (a "fuzz" factor) used when comparing values.

Author:
Paul Rubin

Constructor Summary
RankStatistics()
           
 
Method Summary
static double[] rank(ExampleSet eSet, Attribute att, Attribute mappingAtt)
          Calculates ranks for an attribute.
static double[] rank(ExampleSet eSet, Attribute att, Attribute mappingAtt, double fuzz)
          Calculates ranks for an attribute, allowing values within +/- fuzz of each other to be considered tied.
static double rho(ExampleSet eSet, Attribute a, Attribute b)
          Calculates the Spearman rank correlation between two attributes.
static double rho(ExampleSet eSet, Attribute a, Attribute b, double f)
          Calculates the Spearman rank correlation between two attributes, applying a fuzz factor (allowance for imprecision) when ranking.
static double tau_b(ExampleSet eSet, Attribute a, Attribute b)
          Computes Kendall's tau-b rank correlation statistic, ignoring examples containing missing values.
static double tau_b(ExampleSet eSet, Attribute a, Attribute b, double fuzz)
          Computes Kendall's tau-b rank correlation statistic, ignoring examples containing missing values, with approximate comparisons.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

RankStatistics

public RankStatistics()

Method Detail

rho

public static double rho(ExampleSet eSet,
                         Attribute a,
                         Attribute b,
                         double f)
                  throws OperatorException
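Calculates the Spearman rank correlation between two attributes, applying a fuzz factor (allowance for imprecision) when ranking. Examples containing missing values for either attribute are skipped.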

Parameters:
eSet - the example set
a - the first attribute to correlate
b - the second attribute to correlate
f - a fuzz factor (allowance for imprecision) when ranking
Returns:
the rank correlation
Throws:
OperatorException

rho

public static double rho(ExampleSet eSet,
                         Attribute a,
                         Attribute b)
                  throws OperatorException
Calculates the Spearman rank correlation between two attributes. Examples containing missing values for either attribute are skipped.

Parameters:
eSet - the example set
a - the first attribute to correlate
b - the second attribute to correlate
Returns:
the rank correlation
Throws:
OperatorException
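
To illustrate what a Spearman rank correlation computes, here is a minimal sketch on plain double arrays. It is not the RapidMiner implementation: the class SpearmanSketch and its methods are hypothetical, missing values are represented as NaN, tied values are assumed to share the average of their ranks, and rho is taken as the Pearson correlation of the two rank vectors.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of com.rapidminer.operator.performance.
final class SpearmanSketch {

    /** Spearman correlation of x and y, skipping pairs in which either value is NaN. */
    static double rho(double[] x, double[] y) {
        double[] xs = new double[x.length];
        double[] ys = new double[x.length];
        int n = 0;
        for (int i = 0; i < x.length; i++) {
            // Keep a pair only if both values are present.
            if (!Double.isNaN(x[i]) && !Double.isNaN(y[i])) {
                xs[n] = x[i];
                ys[n] = y[i];
                n++;
            }
        }
        double[] rx = averageRanks(Arrays.copyOf(xs, n));
        double[] ry = averageRanks(Arrays.copyOf(ys, n));
        return pearson(rx, ry);
    }

    /** Ranks with 1 for the smallest value; ties share the average rank (an assumption). */
    static double[] averageRanks(double[] v) {
        double[] ranks = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            int smaller = 0;
            int equal = 0; // counts v[i] itself as well
            for (int j = 0; j < v.length; j++) {
                if (v[j] < v[i]) smaller++;
                else if (v[j] == v[i]) equal++;
            }
            ranks[i] = smaller + (equal + 1) / 2.0;
        }
        return ranks;
    }

    /** Pearson correlation; returns NaN when either input has zero variance. */
    static double pearson(double[] a, double[] b) {
        int n = a.length;
        double meanA = 0.0, meanB = 0.0;
        for (int i = 0; i < n; i++) {
            meanA += a[i];
            meanB += b[i];
        }
        meanA /= n;
        meanB /= n;
        double sab = 0.0, saa = 0.0, sbb = 0.0;
        for (int i = 0; i < n; i++) {
            double da = a[i] - meanA;
            double db = b[i] - meanB;
            sab += da * db;
            saa += da * da;
            sbb += db * db;
        }
        return sab / Math.sqrt(saa * sbb);
    }
}
```

With this sketch, a strictly increasing relationship such as x = {1, 2, 3, 4, 5} and y = {2, 4, 6, 8, 10} gives a correlation of 1, and reversing y gives -1.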

rank

public static double[] rank(ExampleSet eSet,
                            Attribute att,
                            Attribute mappingAtt,
                            double fuzz)
Calculates ranks for an attribute. Ranks are returned as double precision values, with 1 as the rank of the smallest value. Values within +/- fuzz of each other may be considered tied. Tied values receive identical ranks. Missing values receive rank NaN. Note that the effect of the fuzz factor depends on the order of the observations in the example set. For instance, if the first three values encountered are x, x+fuzz and x+2*fuzz, the first two will be considered tied but the third will not be tied with them, since x+2*fuzz is not within +/- fuzz of x.

Parameters:
eSet - the example set
att - the attribute to rank
mappingAtt - the attribute which might be used for remapping the values
fuzz - values within +/- fuzz may be considered tied
Returns:
a double precision array of ranks
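
The order dependence described above can be reproduced with a small sketch. It assumes one plausible mechanism, which the source does not confirm: each value is compared against the first value of every tie group seen so far, and joins the first group whose first value lies within +/- fuzz. The class FuzzyTieSketch and the method tieGroups are hypothetical names, and missing values are represented as NaN.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; the actual tie detection in RankStatistics may differ.
final class FuzzyTieSketch {

    /**
     * Assigns each value a tie-group index in order of first appearance.
     * A value joins the first group whose representative (the group's first
     * value) is within +/- fuzz; otherwise it starts a new group.
     * Missing values (NaN) get group -1.
     */
    static int[] tieGroups(double[] values, double fuzz) {
        List<Double> representatives = new ArrayList<>();
        int[] group = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            if (Double.isNaN(values[i])) {
                group[i] = -1;
                continue;
            }
            int found = -1;
            for (int g = 0; g < representatives.size(); g++) {
                if (Math.abs(values[i] - representatives.get(g)) <= fuzz) {
                    found = g;
                    break;
                }
            }
            if (found < 0) {
                // No existing group is close enough: this value becomes a new representative.
                representatives.add(values[i]);
                found = representatives.size() - 1;
            }
            group[i] = found;
        }
        return group;
    }
}
```

Under these assumptions, the values 1.0, 1.5, 2.0 with fuzz 0.5 fall into two groups (the first two tied, the third alone), whereas the same values presented as 1.5, 1.0, 2.0 all land in a single group, because 1.5 becomes the representative and both other values are within 0.5 of it.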

rank

public static double[] rank(ExampleSet eSet,
                            Attribute att,
                            Attribute mappingAtt)
Calculates ranks for an attribute. Ranks are returned as double precision values, with 1 as the rank of the smallest value. Tied values receive identical ranks. Missing values receive rank NaN.

Parameters:
eSet - the example set
att - the attribute to rank
mappingAtt - the attribute which might be used for remapping the values
Returns:
a double precision array of ranks
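
As an illustration of the ranking rules stated above (1 for the smallest value, identical ranks for ties, NaN for missing values), here is a minimal sketch on a plain double array. The class RankSketch is hypothetical and ignores the example set and mappingAtt. The source does not say which shared rank tied values receive; the sketch assumes the average of the positions they occupy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not the RapidMiner implementation.
final class RankSketch {

    /** Ranks values with 1 as the smallest; NaN inputs keep rank NaN. */
    static double[] rank(double[] values) {
        double[] ranks = new double[values.length];
        Arrays.fill(ranks, Double.NaN);

        // Collect the indices of non-missing values and sort them by value.
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < values.length; i++) {
            if (!Double.isNaN(values[i])) {
                order.add(i);
            }
        }
        order.sort(Comparator.comparingDouble(i -> values[i]));

        // Walk over runs of equal values and give each run one shared rank.
        int start = 0;
        while (start < order.size()) {
            int end = start;
            while (end + 1 < order.size()
                    && values[order.get(end + 1)] == values[order.get(start)]) {
                end++;
            }
            // Average of the 1-based positions start+1 .. end+1 (assumed tie rule).
            double shared = (start + end) / 2.0 + 1.0;
            for (int k = start; k <= end; k++) {
                ranks[order.get(k)] = shared;
            }
            start = end + 1;
        }
        return ranks;
    }
}
```

For the input {10, 20, 20, NaN, 5}, this sketch yields the ranks {2.0, 3.5, 3.5, NaN, 1.0}.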

tau_b

public static double tau_b(ExampleSet eSet,
                           Attribute a,
                           Attribute b)
                    throws OperatorException
Computes Kendall's tau-b rank correlation statistic, ignoring examples containing missing values.

Parameters:
eSet - the example set
a - the first attribute to correlate
b - the second attribute to correlate
Returns:
Kendall's tau-b rank correlation
Throws:
OperatorException

tau_b

public static double tau_b(ExampleSet eSet,
                           Attribute a,
                           Attribute b,
                           double fuzz)
                    throws OperatorException
Computes Kendall's tau-b rank correlation statistic, ignoring examples containing missing values, with approximate comparisons.

Parameters:
eSet - the example set
a - the first attribute to correlate
b - the second attribute to correlate
fuzz - values within +/- fuzz may be considered tied
Returns:
Kendall's tau-b rank correlation
Throws:
OperatorException
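
For readers unfamiliar with the statistic, here is a minimal sketch of Kendall's tau-b on plain double arrays, using the standard definition: (C - D) / sqrt((C + D + Tx)(C + D + Ty)), where C and D count concordant and discordant pairs and Tx and Ty count pairs tied only in the first or only in the second attribute. The class TauBSketch is hypothetical, missing values are represented as NaN, and the pairwise use of fuzz (two values are tied when they differ by at most fuzz) is an assumption rather than the documented behavior.

```java
// Hypothetical illustration only; not the RapidMiner implementation.
final class TauBSketch {

    /** Returns -1, 0 or 1; differences of at most fuzz in magnitude count as ties. */
    static int sign(double diff, double fuzz) {
        if (Math.abs(diff) <= fuzz) {
            return 0;
        }
        return diff > 0 ? 1 : -1;
    }

    /** Kendall's tau-b over all pairs of examples with no missing value. */
    static double tauB(double[] x, double[] y, double fuzz) {
        long concordant = 0, discordant = 0, tiedX = 0, tiedY = 0;
        for (int i = 0; i < x.length; i++) {
            if (Double.isNaN(x[i]) || Double.isNaN(y[i])) {
                continue; // skip examples with a missing value
            }
            for (int j = i + 1; j < x.length; j++) {
                if (Double.isNaN(x[j]) || Double.isNaN(y[j])) {
                    continue;
                }
                int sx = sign(x[i] - x[j], fuzz);
                int sy = sign(y[i] - y[j], fuzz);
                if (sx == 0 && sy == 0) {
                    continue; // tied in both attributes: contributes to no term
                } else if (sx == 0) {
                    tiedX++;
                } else if (sy == 0) {
                    tiedY++;
                } else if (sx == sy) {
                    concordant++;
                } else {
                    discordant++;
                }
            }
        }
        double denominator = Math.sqrt(
                (double) (concordant + discordant + tiedX)
                        * (concordant + discordant + tiedY));
        // The result is NaN when the denominator is zero (for example, a constant attribute).
        return (concordant - discordant) / denominator;
    }
}
```

With fuzz set to 0 the comparison is exact. In this sketch, x = {1, 2, 3} and y = {1, 3, 2} give two concordant pairs and one discordant pair, so tau-b is 1/3.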


Copyright © 2001-2009 by Rapid-I