com.rapidminer.operator.features.selection
Class FeatureSelectionOperator

java.lang.Object
  extended by com.rapidminer.operator.Operator
      extended by com.rapidminer.operator.OperatorChain
          extended by com.rapidminer.operator.features.FeatureOperator
              extended by com.rapidminer.operator.features.selection.FeatureSelectionOperator
All Implemented Interfaces:
ConfigurationListener, PreviewListener, ParameterHandler, LoggingHandler

public class FeatureSelectionOperator
extends FeatureOperator

This operator realizes the two deterministic greedy feature selection algorithms forward selection and backward elimination. However, we added some enhancements to the standard algorithms which are described below:

Forward Selection

  1. Create an initial population with n individuals where n is the input example set's number of attributes. Each individual will use exactly one of the features.
  2. Evaluate the attribute sets and select only the best k.
  3. For each of the k attribute sets do: If there are j unused attributes, make j copies of the attribute set and add exactly one of the previously unused attributes to the attribute set.
  4. As long as the performance improved in the last p iterations go to 2

Backward Elimination

  1. Start with an attribute set which uses all features.
  2. Evaluate all attribute sets and select the best k.
  3. For each of the k attribute sets do: If there are j attributes used, make j copies of the attribute set and remove exactly one of the previously used attributes from the attribute set.
  4. As long as the performance improved in the last p iterations go to 2

The parameter k can be specified by the parameter keep_best, the parameter p can be specified by the parameter generations_without_improval. These parameters have default values 1 which means that the standard selection algorithms are used. Using other values increase the runtime but might help to avoid local extrema in the search for the global optimum.

Another unusual parameter is maximum_number_of_generations. This parameter bounds the number of iterations to this maximum of feature selections / deselections. In combination with generations_without_improval this allows several different selection schemes (which are described for forward selection, backward elimination works analogous):

Author:
Simon Fischer, Ingo Mierswa ingomierswa Exp $

Field Summary
static int BACKWARD_ELIMINATION
           
static int FORWARD_SELECTION
           
static java.lang.String PARAMETER_GENERATIONS_WITHOUT_IMPROVAL
          The parameter name for "Stop after n generations without improval of the performance (-1: stops if the maximum_number_of_generations is reached).
static java.lang.String PARAMETER_KEEP_BEST
          The parameter name for "Keep the best n individuals in each generation.
static java.lang.String PARAMETER_MAXIMUM_NUMBER_OF_GENERATIONS
          The parameter name for "Delivers the maximum amount of generations (-1: might use or deselect all features).
static java.lang.String PARAMETER_SELECTION_DIRECTION
          The parameter name for "Forward selection or backward elimination.
 
Fields inherited from class com.rapidminer.operator.features.FeatureOperator
PARAMETER_CONSTRAINT_DRAW_RANGE, PARAMETER_DRAW_DOMINATED_POINTS, PARAMETER_LOCAL_RANDOM_SEED, PARAMETER_MAXIMAL_FITNESS, PARAMETER_NORMALIZE_WEIGHTS, PARAMETER_PLOT_GENERATIONS, PARAMETER_POPULATION_CRITERIA_DATA_FILE, PARAMETER_SHOW_POPULATION_PLOTTER, PARAMETER_SHOW_STOP_DIALOG, PARAMETER_USER_RESULT_INDIVIDUAL_SELECTION
 
Constructor Summary
FeatureSelectionOperator(OperatorDescription description)
           
 
Method Summary
 IOObject[] apply()
          Applies the feature operator: collects the pre- and postevaluation operators create an initial population evaluate the initial population loop as long as solution is not good enough apply all pre evaluation operators evaluate the population update the population's best individual apply all post evaluation operators return all generation's best individual
 Population createInitialPopulation(ExampleSet es)
          May es have n features.
 java.util.List<ParameterType> getParameterTypes()
          Returns a list of ParameterTypes describing the parameters of this operator.
 java.util.List<PopulationOperator> getPostEvaluationPopulationOperators(ExampleSet input)
          empty list
 java.util.List<PopulationOperator> getPreEvaluationPopulationOperators(ExampleSet input)
          The operators performs two steps: forward selection/backward elimination kick out all but the keep_best individuals remove redundant individuals
 boolean solutionGoodEnough(Population pop)
          Returns true if the best individual is not better than the last generation's best individual.
 
Methods inherited from class com.rapidminer.operator.features.FeatureOperator
createCleanClone, evaluate, getCheckForMaximum, getInnerOperatorCondition, getInputClasses, getMaxNumberOfInnerOperators, getMinNumberOfInnerOperators, getOutputClasses, getPopulation, getPopulationEvaluator, getRandom, setCheckForMaximum
 
Methods inherited from class com.rapidminer.operator.OperatorChain
addAddListener, addOperator, addOperator, checkDeprecations, checkIO, checkNumberOfInnerOperators, checkProperties, clearErrorList, cloneOperator, createExperimentTree, createProcessTree, getAllInnerOperators, getIndexOfOperator, getInnerOperatorForName, getInnerOperatorsXML, getNumberOfAllOperators, getNumberOfOperators, getOperator, getOperatorFromAll, getOperators, performAdditionalChecks, processFinished, processStarts, registerOperator, removeAddListener, removeOperator, shouldAddNonConsumedInput, shouldReturnInnerOutput, unregisterOperator
 
Methods inherited from class com.rapidminer.operator.Operator
addError, addValue, addWarning, apply, checkForStop, createExperimentTree, createFromXML, createMarkedExperimentTree, createMarkedProcessTree, createProcessTree, getAddOnlyAdditionalOutput, getApplyCount, getDeliveredOutputClasses, getDeprecationInfo, getDesiredInputClasses, getEncoding, getErrorList, getExperiment, getInput, getInput, getInput, getInputDescription, getIOContainerForInApplyLoopBreakpoint, getIODescription, getLog, getName, getOperatorClassName, getOperatorDescription, getParameter, getParameterAsBoolean, getParameterAsColor, getParameterAsDouble, getParameterAsFile, getParameterAsFile, getParameterAsInputStream, getParameterAsInt, getParameterAsMatrix, getParameterAsString, getParameterList, getParameters, getParameterType, getParent, getProcess, getStartTime, getStatus, getUserDescription, getValue, getValues, getXML, hasBreakpoint, hasBreakpoint, hasInput, inApplyLoop, isDebugMode, isEnabled, isExpanded, isParallel, isParameterSet, log, logError, logNote, logWarning, register, remove, rename, resume, setApplyCount, setBreakpoint, setEnabled, setExpanded, setInput, setListParameter, setOperatorParameters, setParameter, setParameters, setParent, setUserDescription, toString, writeXML
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

PARAMETER_SELECTION_DIRECTION

public static final java.lang.String PARAMETER_SELECTION_DIRECTION
The parameter name for "Forward selection or backward elimination."

See Also:
Constant Field Values

PARAMETER_KEEP_BEST

public static final java.lang.String PARAMETER_KEEP_BEST
The parameter name for "Keep the best n individuals in each generation."

See Also:
Constant Field Values

PARAMETER_GENERATIONS_WITHOUT_IMPROVAL

public static final java.lang.String PARAMETER_GENERATIONS_WITHOUT_IMPROVAL
The parameter name for "Stop after n generations without improval of the performance (-1: stops if the maximum_number_of_generations is reached)."

See Also:
Constant Field Values

PARAMETER_MAXIMUM_NUMBER_OF_GENERATIONS

public static final java.lang.String PARAMETER_MAXIMUM_NUMBER_OF_GENERATIONS
The parameter name for "Delivers the maximum amount of generations (-1: might use or deselect all features)."

See Also:
Constant Field Values

FORWARD_SELECTION

public static final int FORWARD_SELECTION
See Also:
Constant Field Values

BACKWARD_ELIMINATION

public static final int BACKWARD_ELIMINATION
See Also:
Constant Field Values
Constructor Detail

FeatureSelectionOperator

public FeatureSelectionOperator(OperatorDescription description)
Method Detail

apply

public IOObject[] apply()
                 throws OperatorException
Description copied from class: FeatureOperator
Applies the feature operator:
  1. collects the pre- and postevaluation operators
  2. create an initial population
  3. evaluate the initial population
  4. loop as long as solution is not good enough
    1. apply all pre evaluation operators
    2. evaluate the population
    3. update the population's best individual
    4. apply all post evaluation operators
  5. return all generation's best individual

Overrides:
apply in class FeatureOperator
Returns:
the last inner operator's output or the input itself if the chain is empty.
Throws:
OperatorException

createInitialPopulation

public Population createInitialPopulation(ExampleSet es)
                                   throws UndefinedParameterError
May es have n features. The initial population contains (depending on wether forward selection or backward elimination is used) either

Specified by:
createInitialPopulation in class FeatureOperator
Throws:
UndefinedParameterError

getPreEvaluationPopulationOperators

public java.util.List<PopulationOperator> getPreEvaluationPopulationOperators(ExampleSet input)
                                                                       throws OperatorException
The operators performs two steps:
  1. forward selection/backward elimination
  2. kick out all but the keep_best individuals
  3. remove redundant individuals

Specified by:
getPreEvaluationPopulationOperators in class FeatureOperator
Throws:
OperatorException

getPostEvaluationPopulationOperators

public java.util.List<PopulationOperator> getPostEvaluationPopulationOperators(ExampleSet input)
                                                                        throws OperatorException
empty list

Specified by:
getPostEvaluationPopulationOperators in class FeatureOperator
Throws:
OperatorException

solutionGoodEnough

public boolean solutionGoodEnough(Population pop)
                           throws OperatorException
Returns true if the best individual is not better than the last generation's best individual.

Specified by:
solutionGoodEnough in class FeatureOperator
Throws:
OperatorException

getParameterTypes

public java.util.List<ParameterType> getParameterTypes()
Description copied from class: Operator
Returns a list of ParameterTypes describing the parameters of this operator. The default implementation returns an empty list if no input objects can be retained and special parameters for those input objects which can be prevented from being consumed.

Specified by:
getParameterTypes in interface ParameterHandler
Overrides:
getParameterTypes in class FeatureOperator


Copyright © 2001-2009 by Rapid-I