com.rapidminer.operator.preprocessing.filter
Class MissingValueImputation

java.lang.Object
  extended by com.rapidminer.operator.Operator
      extended by com.rapidminer.operator.OperatorChain
          extended by com.rapidminer.operator.preprocessing.filter.MissingValueImputation
All Implemented Interfaces:
ConfigurationListener, PreviewListener, ParameterHandler, LoggingHandler

public class MissingValueImputation
extends OperatorChain

The operator MissingValueImputation imputes missing values by learning a model for each attribute (except the label) and applying those models to the data set. The learner to be applied must be given as an inner operator. To restrict the imputation to a subset of the example set (e.g. to numerical attributes only), an arbitrary filter can be used as the first inner operator. If such a filter is used, the learner must be the second inner operator.

Depending on the ability of the inner operator to handle missing values, this operator might not be able to impute all missing values. In that case a warning is issued. It can therefore be useful to combine this operator with a subsequent MissingValueReplenishment.

ATTENTION: This operator is currently under development and does not work properly in all cases. We do not recommend using this operator in production systems.
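The idea of model-based imputation described above can be illustrated without the RapidMiner API. The following is a minimal sketch under stated assumptions: the data set is a plain double[][] with NaN marking a missing value, and the names ImputationSketch, Model, Learner and NEAREST_NEIGHBOUR are hypothetical stand-ins for the example set and the inner learner. It is not the operator's actual implementation.

```java
import java.util.Arrays;

public class ImputationSketch {

    /** Hypothetical stand-in for a learned model: predicts one attribute from a row. */
    public interface Model {
        double predict(double[] row);
    }

    /** Hypothetical stand-in for the inner learner operator. */
    public interface Learner {
        Model learn(double[][] rows, int targetColumn);
    }

    /**
     * Toy learner: 1-nearest-neighbour over all columns other than the target.
     * Assumption: the label is used as a predictor here; the source does not say
     * whether the real operator does this.
     */
    public static final Learner NEAREST_NEIGHBOUR = (rows, target) -> query -> {
        double best = Double.NaN;
        double bestDist = Double.POSITIVE_INFINITY;
        for (double[] row : rows) {
            double dist = 0;
            for (int c = 0; c < row.length; c++) {
                // Skip the target and any column missing on either side.
                if (c == target || Double.isNaN(row[c]) || Double.isNaN(query[c])) continue;
                dist += (row[c] - query[c]) * (row[c] - query[c]);
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = row[target];
            }
        }
        return best;
    };

    /** Learns one model per non-label column and fills that column's missing values. */
    public static double[][] impute(double[][] data, int labelColumn, Learner learner) {
        double[][] result = new double[data.length][];
        for (int r = 0; r < data.length; r++) result[r] = data[r].clone();
        int columns = data.length == 0 ? 0 : data[0].length;
        for (int c = 0; c < columns; c++) {
            if (c == labelColumn) continue; // the label is never imputed
            final int target = c;
            // Train only on rows where the target value is known.
            double[][] training = Arrays.stream(data)
                    .filter(row -> !Double.isNaN(row[target]))
                    .toArray(double[][]::new);
            if (training.length == 0) continue; // nothing to learn from; value stays missing
            Model model = learner.learn(training, target);
            for (int r = 0; r < data.length; r++) {
                if (Double.isNaN(data[r][target])) {
                    result[r][target] = model.predict(data[r]);
                }
            }
        }
        return result;
    }
}
```

In this sketch every model is learned on the original data, so values imputed for one attribute are not reused when learning the next. A value may remain NaN if the learner cannot produce a prediction, which mirrors the caveat about incomplete imputation.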

Author:
Tobias Malbrecht

Field Summary
static java.lang.String PARAMETER_FILTER_LEARNING_SET
          The parameter name for "Apply filter to learning set in addition to determination which missing values should be substituted."
static java.lang.String PARAMETER_ITERATE
          The parameter name for "Impute missing values immediately after having learned the corresponding concept and iterate."
static java.lang.String PARAMETER_LEARN_ON_COMPLETE_CASES
          The parameter name for "Learn concepts to impute missing values only on the basis of complete cases (should be used in case learning approach can not handle missing values)."
static java.lang.String PARAMETER_LOCAL_RANDOM_SEED
          The parameter name for "Use the given random seed instead of global random numbers (-1: use global)."
static java.lang.String PARAMETER_ORDER
          The parameter name for "Order of attributes in which missing values are estimated."
static java.lang.String PARAMETER_SORT
          The parameter name for "Sort direction which is used in order strategy."
 
Constructor Summary
MissingValueImputation(OperatorDescription description)
           
 
Method Summary
 IOObject[] apply()
          Applies all inner operators.
 InnerOperatorCondition getInnerOperatorCondition()
          Must return a condition of the IO behaviour of all desired inner operators.
 java.lang.Class<?>[] getInputClasses()
          Returns the classes that are needed as input.
 int getMaxNumberOfInnerOperators()
          Returns the maximum number of inner operators.
 int getMinNumberOfInnerOperators()
          Returns the minimum number of inner operators.
 Attribute[] getOrderedAttributes(ExampleSet exampleSet, int order, boolean ascending)
           
 java.lang.Class<?>[] getOutputClasses()
          Returns the classes that are guaranteed to be returned by apply() as additional output.
 java.util.List<ParameterType> getParameterTypes()
          Returns a list of ParameterTypes describing the parameters of this operator.
 
Methods inherited from class com.rapidminer.operator.OperatorChain
addAddListener, addOperator, addOperator, checkDeprecations, checkIO, checkNumberOfInnerOperators, checkProperties, clearErrorList, cloneOperator, createExperimentTree, createProcessTree, getAllInnerOperators, getIndexOfOperator, getInnerOperatorForName, getInnerOperatorsXML, getNumberOfAllOperators, getNumberOfOperators, getOperator, getOperatorFromAll, getOperators, performAdditionalChecks, processFinished, processStarts, registerOperator, removeAddListener, removeOperator, shouldAddNonConsumedInput, shouldReturnInnerOutput, unregisterOperator
 
Methods inherited from class com.rapidminer.operator.Operator
addError, addValue, addWarning, apply, checkForStop, createExperimentTree, createFromXML, createMarkedExperimentTree, createMarkedProcessTree, createProcessTree, getAddOnlyAdditionalOutput, getApplyCount, getDeliveredOutputClasses, getDeprecationInfo, getDesiredInputClasses, getEncoding, getErrorList, getExperiment, getInput, getInput, getInput, getInputDescription, getIOContainerForInApplyLoopBreakpoint, getIODescription, getLog, getName, getOperatorClassName, getOperatorDescription, getParameter, getParameterAsBoolean, getParameterAsColor, getParameterAsDouble, getParameterAsFile, getParameterAsFile, getParameterAsInputStream, getParameterAsInt, getParameterAsMatrix, getParameterAsString, getParameterList, getParameters, getParameterType, getParent, getProcess, getStartTime, getStatus, getUserDescription, getValue, getValues, getXML, hasBreakpoint, hasBreakpoint, hasInput, inApplyLoop, isDebugMode, isEnabled, isExpanded, isParallel, isParameterSet, log, logError, logNote, logWarning, register, remove, rename, resume, setApplyCount, setBreakpoint, setEnabled, setExpanded, setInput, setListParameter, setOperatorParameters, setParameter, setParameters, setParent, setUserDescription, toString, writeXML
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

PARAMETER_ORDER

public static final java.lang.String PARAMETER_ORDER
The parameter name for "Order of attributes in which missing values are estimated."

See Also:
Constant Field Values

PARAMETER_SORT

public static final java.lang.String PARAMETER_SORT
The parameter name for "Sort direction which is used in order strategy."

See Also:
Constant Field Values

PARAMETER_ITERATE

public static final java.lang.String PARAMETER_ITERATE
The parameter name for "Impute missing values immediately after having learned the corresponding concept and iterate."

See Also:
Constant Field Values

PARAMETER_FILTER_LEARNING_SET

public static final java.lang.String PARAMETER_FILTER_LEARNING_SET
The parameter name for "Apply filter to learning set in addition to determination which missing values should be substituted."

See Also:
Constant Field Values

PARAMETER_LEARN_ON_COMPLETE_CASES

public static final java.lang.String PARAMETER_LEARN_ON_COMPLETE_CASES
The parameter name for "Learn concepts to impute missing values only on the basis of complete cases (should be used in case learning approach can not handle missing values)."

See Also:
Constant Field Values
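
Learning on complete cases means discarding every example that has at least one missing value before the learner sees the data. A minimal sketch, assuming rows are plain double[] arrays with NaN marking a missing value (the class and method names are hypothetical and not part of RapidMiner):

```java
import java.util.Arrays;

public class CompleteCasesSketch {

    /** Returns only those rows that contain no missing (NaN) value. */
    public static double[][] completeCases(double[][] rows) {
        return Arrays.stream(rows)
                .filter(row -> Arrays.stream(row).noneMatch(v -> Double.isNaN(v)))
                .toArray(double[][]::new);
    }
}
```

The trade-off is that the learner receives fewer examples, which matters when many rows have a missing value in at least one attribute.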

PARAMETER_LOCAL_RANDOM_SEED

public static final java.lang.String PARAMETER_LOCAL_RANDOM_SEED
The parameter name for "Use the given random seed instead of global random numbers (-1: use global)."

See Also:
Constant Field Values
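
The convention "-1: use global" can be pictured as follows. This is a sketch only: it uses java.util.Random, and the shared GLOBAL generator with its seed is a hypothetical stand-in for whatever global random number source the process uses.

```java
import java.util.Random;

public class LocalSeedSketch {

    // Hypothetical process-wide generator; the seed value is arbitrary.
    private static final Random GLOBAL = new Random(2001L);

    /** Returns the shared generator for -1, otherwise a fresh generator with the given seed. */
    public static Random randomFor(int localSeed) {
        return localSeed == -1 ? GLOBAL : new Random(localSeed);
    }
}
```

With a local seed, repeated runs draw the same sequence regardless of how many random numbers other parts of the process have consumed.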
Constructor Detail

MissingValueImputation

public MissingValueImputation(OperatorDescription description)
Method Detail

getMinNumberOfInnerOperators

public int getMinNumberOfInnerOperators()
Returns the minimum number of inner operators.

Specified by:
getMinNumberOfInnerOperators in class OperatorChain

getMaxNumberOfInnerOperators

public int getMaxNumberOfInnerOperators()
Returns the maximum number of inner operators.

Specified by:
getMaxNumberOfInnerOperators in class OperatorChain

getInnerOperatorCondition

public InnerOperatorCondition getInnerOperatorCondition()
Description copied from class: OperatorChain
Must return a condition of the IO behaviour of all desired inner operators. If there are no "special" conditions and the chain works similarly to a simple operator chain, this method should at least return a SimpleChainInnerOperatorCondition. Multiple conditions should be combined with the help of the class CombinedInnerOperatorCondition.

Specified by:
getInnerOperatorCondition in class OperatorChain

getOrderedAttributes

public Attribute[] getOrderedAttributes(ExampleSet exampleSet,
                                        int order,
                                        boolean ascending)
                                 throws OperatorException
Throws:
OperatorException
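
This page does not document which ordering strategies the order argument selects. Purely as an illustration of what ordering attributes with a sort direction can look like, the sketch below ranks column indices by their number of missing values. The criterion, the class name and the method name are assumptions and are not taken from the RapidMiner source.

```java
import java.util.Arrays;
import java.util.Comparator;

public class AttributeOrderSketch {

    /** Returns column indices sorted by how many NaN values each column contains. */
    public static Integer[] orderByMissingCount(double[][] data, boolean ascending) {
        int columns = data.length == 0 ? 0 : data[0].length;
        int[] missing = new int[columns];
        for (double[] row : data) {
            for (int c = 0; c < columns; c++) {
                if (Double.isNaN(row[c])) missing[c]++;
            }
        }
        Integer[] order = new Integer[columns];
        for (int c = 0; c < columns; c++) order[c] = c;
        Comparator<Integer> byMissing = Comparator.comparingInt(c -> missing[c]);
        // Arrays.sort on objects is stable, so ties keep their original column order.
        Arrays.sort(order, ascending ? byMissing : byMissing.reversed());
        return order;
    }
}
```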

apply

public IOObject[] apply()
                 throws OperatorException
Description copied from class: OperatorChain
Applies all inner operators. The input to this operator becomes the input of the first inner operator. The latter's output is passed to the second inner operator, and so on. Note to subclassers: if subclasses (for example wrappers) want to make use of this method, remember to call exactly this method (super.apply()) and do not erroneously call super.apply(IOContainer), which will result in an infinite loop.

Overrides:
apply in class OperatorChain
Returns:
the last inner operator's output or the input itself if the chain is empty.
Throws:
OperatorException
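
The chaining behaviour in the description copied from OperatorChain (each inner operator receives the previous one's output, and an empty chain returns the input) can be sketched with plain functions. The names below are hypothetical, java.util.function.UnaryOperator stands in for an inner operator, and the sketch says nothing about what this class's own override of apply() does.

```java
import java.util.List;
import java.util.function.UnaryOperator;

public class ChainSketch {

    /** Feeds the input through each inner step in order and returns the last output. */
    public static <T> T applyChain(T input, List<UnaryOperator<T>> innerOperators) {
        T current = input;
        for (UnaryOperator<T> operator : innerOperators) {
            current = operator.apply(current);
        }
        return current; // equals the input itself if the chain is empty
    }
}
```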

getOutputClasses

public java.lang.Class<?>[] getOutputClasses()
Description copied from class: Operator

Returns the classes that are guaranteed to be returned by apply() as additional output. Please note that input objects which should not be consumed must also be defined by this method (e.g. an example set which is changed but not consumed in the case of a preprocessing operator must be defined in both methods, Operator.getInputClasses() and Operator.getOutputClasses()). The default behavior for input consumption is defined by Operator.getInputDescription(Class) and can be changed by overriding this method. Objects which are not consumed (defined by changing the implementation in Operator.getInputDescription(Class)) must not be defined as additional output in this method.

May deliver null or an empty array (no additional output is produced or guaranteed). Must return the class array of delivered output objects otherwise.

Specified by:
getOutputClasses in class Operator

getInputClasses

public java.lang.Class<?>[] getInputClasses()
Description copied from class: Operator
Returns the classes that are needed as input. May be null or an empty array (no desired input). By default, all delivered input objects are consumed and must also be delivered as output in both Operator.getOutputClasses() and Operator.apply() if this is necessary. This default behavior can be changed by overriding Operator.getInputDescription(Class). Subclasses which implement this method should not make use of parameters, since this method is invoked by getParameterTypes(). Parameters are therefore not fully available at this point, which might lead to exceptions. Please use InputDescriptions instead.

Specified by:
getInputClasses in class Operator

getParameterTypes

public java.util.List<ParameterType> getParameterTypes()
Description copied from class: Operator
Returns a list of ParameterTypes describing the parameters of this operator. The default implementation returns an empty list if no input objects can be retained; otherwise it returns special parameters for those input objects which can be prevented from being consumed.

Specified by:
getParameterTypes in interface ParameterHandler
Overrides:
getParameterTypes in class Operator


Copyright © 2001-2009 by Rapid-I