com.rapidminer.operator.clustering.clusterer.soft
Class EMClusterer

java.lang.Object
  extended by com.rapidminer.operator.Operator
      extended by com.rapidminer.operator.clustering.clusterer.AbstractClusterer
          extended by com.rapidminer.operator.clustering.clusterer.soft.EMClusterer
All Implemented Interfaces:
ConfigurationListener, PreviewListener, ParameterHandler, LoggingHandler

public class EMClusterer
extends AbstractClusterer

This operator implements the expectation-maximization (EM) algorithm for soft clustering.

Author:
Regina Fritsch

Field Summary
static int AVERAGE_PARAMETERS
          The parameter name for "Init distributions average parameters"
static java.lang.String[] INIT_DISTRIBUTION
          The parameter name for "List of the different init distributions"
static int K_MEANS
          The parameter name for "Init distributions hard clustering"
static java.lang.String PARAMETER_ADD_CLUSTER_ATTRIBUTE
          The parameter name for "Indicates if a cluster id is generated as new special attribute."
static java.lang.String PARAMETER_CORRELATED
          The parameter name for "Indicates if the example set has correlated attributes"
static java.lang.String PARAMETER_INITIALIZATION_DISTRIBUTION
          The parameter name for "Indicates the initialization distribution"
static java.lang.String PARAMETER_K
          The parameter name for "the maximal number of clusters"
static java.lang.String PARAMETER_LOCAL_RANDOM_SEED
          The parameter name for "Use the given random seed instead of global random numbers (-1: use global)."
static java.lang.String PARAMETER_MAX_OPTIMIZATION_STEPS
          The parameter name for "the maximal number of iterations performed for one run of the k method"
static java.lang.String PARAMETER_MAX_RUNS
          The parameter name for "the maximal number of runs of the k method with random initialization that are performed"
static java.lang.String PARAMETER_QUALITY
          The parameter name for "the quality that must be fulfilled before the soft clustering stops"
static java.lang.String PARAMETER_SHOW_PROBABILITIES
          The parameter name for "Indicates if the probabilities will be shown in example table"
static int RANDOMLY_ASSIGNED
          The parameter name for "Init distributions randomly assigned"
 
Constructor Summary
EMClusterer(OperatorDescription description)
           
 
Method Summary
protected  int bestIndex(int exampleIndex, int k, double[][] exampleInClusterProbability)
           
protected  double computeLogLikelyhood(int k, double[][] exampleInClusterProbability, FlatFuzzyClusterModel resultModel)
           
 ClusterModel createClusterModel(ExampleSet exampleSet)
           
protected  void expectationCorrelated(ExampleSet exampleSet, int k, double[][] exampleInClusterProbability, FlatFuzzyClusterModel oldResult)
           
protected  void expectationNonCorrelated(ExampleSet exampleSet, int k, double[][] exampleInClusterProbability, FlatFuzzyClusterModel oldResult)
           
 ClusterModel generateClusterModel(ExampleSet exampleSet)
          Generates a cluster model from an example set.
 InputDescription getInputDescription(java.lang.Class cls)
          Indicates that the consumption of example sets can be user defined.
 java.util.List<ParameterType> getParameterTypes()
          Returns a list of ParameterTypes describing the parameters of this operator.
protected  void maximization(ExampleSet exampleSet, int k, double[][] exampleInClusterProbability, FlatFuzzyClusterModel result)
           
 
Methods inherited from class com.rapidminer.operator.clustering.clusterer.AbstractClusterer
apply, getInputClasses, getOutputClasses
 
Methods inherited from class com.rapidminer.operator.Operator
addError, addValue, addWarning, apply, checkDeprecations, checkForStop, checkIO, checkProperties, clearErrorList, cloneOperator, createExperimentTree, createExperimentTree, createFromXML, createMarkedExperimentTree, createMarkedProcessTree, createProcessTree, createProcessTree, getAddOnlyAdditionalOutput, getApplyCount, getDeliveredOutputClasses, getDeprecationInfo, getDesiredInputClasses, getEncoding, getErrorList, getExperiment, getInnerOperatorsXML, getInput, getInput, getInput, getIOContainerForInApplyLoopBreakpoint, getIODescription, getLog, getName, getOperatorClassName, getOperatorDescription, getParameter, getParameterAsBoolean, getParameterAsColor, getParameterAsDouble, getParameterAsFile, getParameterAsFile, getParameterAsInputStream, getParameterAsInt, getParameterAsMatrix, getParameterAsString, getParameterList, getParameters, getParameterType, getParent, getProcess, getStartTime, getStatus, getUserDescription, getValue, getValues, getXML, hasBreakpoint, hasBreakpoint, hasInput, inApplyLoop, isDebugMode, isEnabled, isExpanded, isParallel, isParameterSet, log, logError, logNote, logWarning, performAdditionalChecks, processFinished, processStarts, register, registerOperator, remove, rename, resume, setApplyCount, setBreakpoint, setEnabled, setExpanded, setInput, setListParameter, setOperatorParameters, setParameter, setParameters, setParent, setUserDescription, toString, unregisterOperator, writeXML
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

PARAMETER_ADD_CLUSTER_ATTRIBUTE

public static final java.lang.String PARAMETER_ADD_CLUSTER_ATTRIBUTE
The parameter name for "Indicates if a cluster id is generated as new special attribute."

See Also:
Constant Field Values

PARAMETER_K

public static final java.lang.String PARAMETER_K
The parameter name for "the maximal number of clusters"

See Also:
Constant Field Values

PARAMETER_MAX_RUNS

public static final java.lang.String PARAMETER_MAX_RUNS
The parameter name for "the maximal number of runs of the k method with random initialization that are performed"

See Also:
Constant Field Values

PARAMETER_MAX_OPTIMIZATION_STEPS

public static final java.lang.String PARAMETER_MAX_OPTIMIZATION_STEPS
The parameter name for "the maximal number of iterations performed for one run of the k method"

See Also:
Constant Field Values

PARAMETER_QUALITY

public static final java.lang.String PARAMETER_QUALITY
The parameter name for "the quality that must be fulfilled before the soft clustering stops"

See Also:
Constant Field Values

PARAMETER_LOCAL_RANDOM_SEED

public static final java.lang.String PARAMETER_LOCAL_RANDOM_SEED
The parameter name for "Use the given random seed instead of global random numbers (-1: use global)."

See Also:
Constant Field Values

PARAMETER_SHOW_PROBABILITIES

public static final java.lang.String PARAMETER_SHOW_PROBABILITIES
The parameter name for "Indicates if the probabilities will be shown in example table"

See Also:
Constant Field Values

PARAMETER_INITIALIZATION_DISTRIBUTION

public static final java.lang.String PARAMETER_INITIALIZATION_DISTRIBUTION
The parameter name for "Indicates the initialization distribution"

See Also:
Constant Field Values

INIT_DISTRIBUTION

public static final java.lang.String[] INIT_DISTRIBUTION
The parameter name for "List of the different init distributions"


RANDOMLY_ASSIGNED

public static final int RANDOMLY_ASSIGNED
The parameter name for "Init distributions randomly assigned"

See Also:
Constant Field Values

K_MEANS

public static final int K_MEANS
The parameter name for "Init distributions hard clustering"

See Also:
Constant Field Values

AVERAGE_PARAMETERS

public static final int AVERAGE_PARAMETERS
The parameter name for "Init distributions average parameters"

See Also:
Constant Field Values

PARAMETER_CORRELATED

public static final java.lang.String PARAMETER_CORRELATED
The parameter name for "Indicates if the example set has correlated attributes"

See Also:
Constant Field Values

Constructor Detail

EMClusterer

public EMClusterer(OperatorDescription description)

Method Detail

createClusterModel

public ClusterModel createClusterModel(ExampleSet exampleSet)
                                throws OperatorException
Throws:
OperatorException

bestIndex

protected int bestIndex(int exampleIndex,
                        int k,
                        double[][] exampleInClusterProbability)
                 throws java.lang.Exception
Throws:
java.lang.Exception
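
The page does not describe what bestIndex computes. Judging only by its name and signature, it plausibly returns the cluster with the highest membership probability for one example. The following is a minimal sketch under that assumption, not the operator's actual code. The class name BestIndexSketch and the array layout probabilities[exampleIndex][clusterIndex] are hypothetical.

```java
final class BestIndexSketch {

    /**
     * Returns the index of the most probable cluster for one example.
     * Assumes the layout probabilities[exampleIndex][clusterIndex]
     * (an assumption; the original page does not state the layout).
     */
    public static int bestIndex(int exampleIndex, int k, double[][] probabilities) {
        int best = 0;
        for (int c = 1; c < k; c++) {
            if (probabilities[exampleIndex][c] > probabilities[exampleIndex][best]) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] p = { { 0.2, 0.7, 0.1 }, { 0.6, 0.3, 0.1 } };
        System.out.println(bestIndex(0, 3, p)); // prints 1
        System.out.println(bestIndex(1, 3, p)); // prints 0
    }
}
```

Ties resolve to the lowest cluster index in this sketch; the real operator may behave differently.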

expectationNonCorrelated

protected void expectationNonCorrelated(ExampleSet exampleSet,
                                        int k,
                                        double[][] exampleInClusterProbability,
                                        FlatFuzzyClusterModel oldResult)
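
For orientation, the expectation step of EM for a Gaussian mixture with uncorrelated attributes can be sketched as follows. This is a textbook version that assumes one variance per attribute and cluster (a diagonal covariance matrix). It uses plain arrays instead of ExampleSet and FlatFuzzyClusterModel, and every name in it is hypothetical rather than taken from the operator.

```java
final class DiagonalEStepSketch {

    /** Gaussian density with independent attributes (diagonal covariance). */
    public static double density(double[] x, double[] mean, double[] variance) {
        double logDensity = 0.0;
        for (int d = 0; d < x.length; d++) {
            double diff = x[d] - mean[d];
            logDensity += -0.5 * Math.log(2.0 * Math.PI * variance[d])
                    - diff * diff / (2.0 * variance[d]);
        }
        return Math.exp(logDensity);
    }

    /** Fills responsibilities[i][c] with the estimate of P(cluster c | example i). */
    public static void expectation(double[][] data, double[] weights, double[][] means,
                                   double[][] variances, double[][] responsibilities) {
        int k = weights.length;
        for (int i = 0; i < data.length; i++) {
            double sum = 0.0;
            for (int c = 0; c < k; c++) {
                responsibilities[i][c] = weights[c] * density(data[i], means[c], variances[c]);
                sum += responsibilities[i][c];
            }
            for (int c = 0; c < k; c++) {
                // Fall back to a uniform assignment if all densities underflow to zero.
                responsibilities[i][c] = sum > 0.0 ? responsibilities[i][c] / sum : 1.0 / k;
            }
        }
    }

    public static void main(String[] args) {
        double[][] data = { { 0.0 }, { 10.0 } };
        double[][] resp = new double[2][2];
        expectation(data, new double[] { 0.5, 0.5 },
                new double[][] { { 0.0 }, { 10.0 } },
                new double[][] { { 1.0 }, { 1.0 } }, resp);
        System.out.println(java.util.Arrays.deepToString(resp));
    }
}
```

Each row of the responsibility matrix sums to one, which is what makes the resulting clustering "soft".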

expectationCorrelated

protected void expectationCorrelated(ExampleSet exampleSet,
                                     int k,
                                     double[][] exampleInClusterProbability,
                                     FlatFuzzyClusterModel oldResult)
                              throws java.lang.Exception
Throws:
java.lang.Exception
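
When attributes are correlated, the per-cluster density needs a full covariance matrix rather than one variance per attribute. The sketch below evaluates such a density for exactly two attributes, where the determinant and inverse have a closed form. It illustrates the general technique only; the class name and the restriction to two dimensions are assumptions, and the page does not say how the operator itself handles the matrix algebra.

```java
final class CorrelatedDensitySketch {

    /**
     * Density of a two-dimensional Gaussian with full covariance matrix cov.
     * Uses the closed-form 2x2 determinant and inverse.
     */
    public static double density(double[] x, double[] mean, double[][] cov) {
        double det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0];
        if (det <= 0.0) {
            throw new IllegalArgumentException("covariance matrix must have a positive determinant");
        }
        double dx0 = x[0] - mean[0];
        double dx1 = x[1] - mean[1];
        // Squared Mahalanobis distance: (x - mean)^T * cov^-1 * (x - mean).
        double mahalanobis = (cov[1][1] * dx0 * dx0
                - (cov[0][1] + cov[1][0]) * dx0 * dx1
                + cov[0][0] * dx1 * dx1) / det;
        return Math.exp(-0.5 * mahalanobis) / (2.0 * Math.PI * Math.sqrt(det));
    }

    public static void main(String[] args) {
        double[][] cov = { { 2.0, 1.0 }, { 1.0, 2.0 } };
        System.out.println(density(new double[] { 1.0, 1.0 }, new double[] { 0.0, 0.0 }, cov));
    }
}
```

Substituting this density into the expectation step in place of a product of independent one-dimensional densities yields the correlated variant.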

maximization

protected void maximization(ExampleSet exampleSet,
                            int k,
                            double[][] exampleInClusterProbability,
                            FlatFuzzyClusterModel result)
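
The maximization step of EM re-estimates the cluster parameters from the current membership probabilities. The sketch below shows the standard update for cluster weights, means, and per-attribute variances. It works on plain arrays, and the class name, parameter names, and the small variance floor are assumptions made for illustration, not details of the operator.

```java
final class MaximizationSketch {

    /** Re-estimates weights, means and per-attribute variances from responsibilities. */
    public static void maximization(double[][] data, double[][] responsibilities,
                                    double[] weights, double[][] means, double[][] variances) {
        int n = data.length;
        int dims = data[0].length;
        for (int c = 0; c < weights.length; c++) {
            double total = 0.0;
            for (int i = 0; i < n; i++) {
                total += responsibilities[i][c];
            }
            weights[c] = total / n;
            if (total == 0.0) {
                continue; // empty cluster: keep its previous mean and variance
            }
            for (int d = 0; d < dims; d++) {
                double mean = 0.0;
                for (int i = 0; i < n; i++) {
                    mean += responsibilities[i][c] * data[i][d];
                }
                mean /= total;
                double variance = 0.0;
                for (int i = 0; i < n; i++) {
                    double diff = data[i][d] - mean;
                    variance += responsibilities[i][c] * diff * diff;
                }
                means[c][d] = mean;
                // Floor the variance so a later density evaluation cannot divide by zero.
                variances[c][d] = Math.max(variance / total, 1e-9);
            }
        }
    }

    public static void main(String[] args) {
        double[][] data = { { 0.0 }, { 2.0 }, { 10.0 }, { 12.0 } };
        double[][] resp = { { 1, 0 }, { 1, 0 }, { 0, 1 }, { 0, 1 } };
        double[] weights = new double[2];
        double[][] means = new double[2][1];
        double[][] variances = new double[2][1];
        maximization(data, resp, weights, means, variances);
        System.out.println(means[0][0] + " " + means[1][0]); // prints 1.0 11.0
    }
}
```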

computeLogLikelyhood

protected double computeLogLikelyhood(int k,
                                      double[][] exampleInClusterProbability,
                                      FlatFuzzyClusterModel resultModel)
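
EM implementations usually track the log-likelihood of the data under the current mixture to judge progress. A minimal sketch of that quantity for a one-dimensional Gaussian mixture follows. It is kept one-dimensional for brevity, the names are hypothetical, and the page does not document how computeLogLikelyhood derives its value from the probability matrix and the model.

```java
final class LogLikelihoodSketch {

    /** One-dimensional Gaussian density. */
    public static double density(double x, double mean, double variance) {
        double diff = x - mean;
        return Math.exp(-diff * diff / (2.0 * variance)) / Math.sqrt(2.0 * Math.PI * variance);
    }

    /** Sum over all examples of log( sum over clusters of weight * density ). */
    public static double logLikelihood(double[] data, double[] weights,
                                       double[] means, double[] variances) {
        double total = 0.0;
        for (double x : data) {
            double mixture = 0.0;
            for (int c = 0; c < weights.length; c++) {
                mixture += weights[c] * density(x, means[c], variances[c]);
            }
            total += Math.log(mixture);
        }
        return total;
    }

    public static void main(String[] args) {
        double[] data = { 0.0, 0.5, 9.5, 10.0 };
        double value = logLikelihood(data, new double[] { 0.5, 0.5 },
                new double[] { 0.0, 10.0 }, new double[] { 1.0, 1.0 });
        System.out.println(value);
    }
}
```

A common stopping rule compares this value between two consecutive iterations and stops once the change is small; whether PARAMETER_QUALITY is used in exactly this way is not stated on this page.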

generateClusterModel

public ClusterModel generateClusterModel(ExampleSet exampleSet)
                                  throws OperatorException
Description copied from class: AbstractClusterer
Generates a cluster model from an example set. Called by AbstractClusterer.apply().

Specified by:
generateClusterModel in class AbstractClusterer
Throws:
OperatorException

getInputDescription

public InputDescription getInputDescription(java.lang.Class cls)
Indicates that the consumption of example sets can be user defined.

Overrides:
getInputDescription in class Operator

getParameterTypes

public java.util.List<ParameterType> getParameterTypes()
Description copied from class: Operator
Returns a list of ParameterTypes describing the parameters of this operator. The default implementation returns an empty list if no input objects can be retained and special parameters for those input objects which can be prevented from being consumed.

Specified by:
getParameterTypes in interface ParameterHandler
Overrides:
getParameterTypes in class Operator


Copyright © 2001-2009 by Rapid-I