com.rapidminer.operator.validation
Class SplitValidationOperator

java.lang.Object
  extended by com.rapidminer.tools.AbstractObservable<Operator>
      extended by com.rapidminer.operator.Operator
          extended by com.rapidminer.operator.OperatorChain
              extended by com.rapidminer.operator.validation.ValidationChain
                  extended by com.rapidminer.operator.validation.SplitValidationOperator
All Implemented Interfaces:
ConfigurationListener, PreviewListener, ResourceConsumer, ParameterHandler, LoggingHandler, Observable<Operator>

public class SplitValidationOperator
extends ValidationChain

A SplitValidationOperator splits the example set at a fixed point into a training set and a test set and evaluates the model (linear sampling). For non-linear sampling methods, i.e. when the data is shuffled, the specified amounts of data are used as training and test sets. The sum of both must be smaller than the input example set size.

At least one of the two sizes must be specified: if only the training set size is given, the rest is used for testing; if only the test set size is given, the rest is used for training. If both are specified, the remaining examples are not used at all.
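
The size rules above can be illustrated with a minimal sketch. It is not the operator's actual implementation: the class SplitSketch, the UNSPECIFIED marker, and both method names are hypothetical and exist only for this example. The sketch rejects sums that exceed the set size; how the operator itself treats the boundary case is not verified here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper, not part of RapidMiner: resolves split sizes and example order. */
final class SplitSketch {

    /** Marker for "size not specified" (an assumption of this sketch). */
    static final int UNSPECIFIED = -1;

    /** Returns {trainingSize, testSize} for a set of total examples. */
    static int[] resolveSizes(int total, int trainingSize, int testSize) {
        if (trainingSize == UNSPECIFIED && testSize == UNSPECIFIED) {
            throw new IllegalArgumentException("Specify the training size, the test size, or both.");
        }
        if (trainingSize == UNSPECIFIED) {
            trainingSize = total - testSize;   // rest is used for training
        } else if (testSize == UNSPECIFIED) {
            testSize = total - trainingSize;   // rest is used for testing
        }
        if (trainingSize < 0 || testSize < 0 || trainingSize + testSize > total) {
            throw new IllegalArgumentException("Sizes do not fit into " + total + " examples.");
        }
        return new int[] {trainingSize, testSize};
    }

    /** Returns example indices in original order (linear sampling) or shuffled. */
    static List<Integer> order(int total, boolean shuffle, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            indices.add(i);
        }
        if (shuffle) {
            Collections.shuffle(indices, new Random(seed));
        }
        return indices;
    }

    public static void main(String[] args) {
        // Only the training size is given, so the remaining 30 examples form the test set.
        int[] sizes = resolveSizes(100, 70, UNSPECIFIED);
        List<Integer> indices = order(100, false, 42L);
        List<Integer> training = indices.subList(0, sizes[0]);
        List<Integer> test = indices.subList(sizes[0], sizes[0] + sizes[1]);
        System.out.println(training.size() + " training, " + test.size() + " test examples");
    }
}
```

With linear sampling the training set is simply the first block of examples and the test set the block that follows; with shuffling, the same cut points are applied to a permuted index list. When both sizes are given and their sum is below the total, the trailing indices stay unused.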

The first inner operator must accept an ExampleSet. The second must accept an ExampleSet together with the output of the first (in most cases a Model) and must produce a PerformanceVector.
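
The contract between the two inner operators can be pictured with a small, self-contained sketch. None of the types below are RapidMiner classes: Learner, Evaluator, and validate are hypothetical stand-ins, a list of double arrays stands in for an ExampleSet, and a single double stands in for a PerformanceVector.

```java
import java.util.List;

/** Hypothetical stand-ins, not RapidMiner types: illustrate the two inner operators. */
final class InnerOperatorSketch {

    /** First inner operator: consumes the training set and produces a model. */
    interface Learner<M> {
        M learn(List<double[]> trainingSet);
    }

    /** Second inner operator: consumes the test set and the model, produces a performance value. */
    interface Evaluator<M> {
        double evaluate(List<double[]> testSet, M model);
    }

    /** Runs the learner on the training set, then the evaluator on the test set. */
    static <M> double validate(List<double[]> training, List<double[]> test,
                               Learner<M> learner, Evaluator<M> evaluator) {
        M model = learner.learn(training);
        return evaluator.evaluate(test, model);
    }

    public static void main(String[] args) {
        // Each example is {attribute, label}; the "model" is just the mean training label.
        List<double[]> training = List.of(new double[] {1, 2.0}, new double[] {2, 4.0});
        List<double[]> test = List.of(new double[] {3, 3.0}, new double[] {4, 5.0});

        Learner<Double> meanLearner = set ->
            set.stream().mapToDouble(e -> e[1]).average().orElse(0.0);
        Evaluator<Double> absoluteError = (set, mean) ->
            set.stream().mapToDouble(e -> Math.abs(e[1] - mean)).average().orElse(0.0);

        System.out.println("Mean absolute error: "
            + validate(training, test, meanLearner, absoluteError));
    }
}
```

The point of the sketch is the data flow: the model never sees the test examples during learning, and the evaluator receives exactly the learner's output plus the held-out examples.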

This validation operator provides several values which can be logged by means of a ProcessLogOperator. All performance estimation operators of RapidMiner provide access to the average values calculated during the estimation. Because the operator cannot guarantee the names of the delivered criteria, the ProcessLogOperator accesses the values via generic value names.

Author:
Simon Fischer, Ingo Mierswa

Field Summary
static java.lang.String PARAMETER_SAMPLING_TYPE
           
static java.lang.String PARAMETER_SPLIT
           
static java.lang.String PARAMETER_SPLIT_RATIO
           
static java.lang.String PARAMETER_TEST_SET_SIZE
           
static java.lang.String PARAMETER_TRAINING_SET_SIZE
           
static int SPLIT_ABSOLUTE
           
static java.lang.String[] SPLIT_MODES
           
static int SPLIT_RELATIVE
           
 
Fields inherited from class com.rapidminer.operator.validation.ValidationChain
PARAMETER_CREATE_COMPLETE_MODEL
 
Constructor Summary
SplitValidationOperator(OperatorDescription description)
           
 
Method Summary
 void estimatePerformance(ExampleSet inputSet)
          This is the main method of the validation chain and must be implemented to estimate the performance of the inner operators on the given example set.
 java.util.List<ParameterType> getParameterTypes()
          Returns a list of ParameterTypes describing the parameters of this operator.
protected  MDInteger getTestSetSize(MDInteger originalSize)
           
protected  MDInteger getTrainingSetSize(MDInteger originalSize)
           
 
Methods inherited from class com.rapidminer.operator.validation.ValidationChain
doWork, evaluate, executeEvaluator, executeLearner, learn, shouldAutoConnect
 
Methods inherited from class com.rapidminer.operator.OperatorChain
addOperator, addOperator, addSubprocess, areSubprocessesExtendable, assumePreconditionsSatisfied, checkDeprecations, checkIO, checkNumberOfInnerOperators, checkProperties, clear, cloneOperator, collectErrors, createProcessTree, createSubprocess, freeMemory, getAllInnerOperators, getAllInnerOperatorsAndMe, getImmediateChildren, getIndexOfOperator, getInnerOperatorCondition, getMaxNumberOfInnerOperators, getMinNumberOfInnerOperators, getNumberOfAllOperators, getNumberOfOperators, getNumberOfSubprocesses, getOperator, getOperatorFromAll, getOperators, getSubprocess, getSubprocesses, isEnabled, lookupOperator, notifyRenaming, performAdditionalChecks, processFinished, processStarts, propagateDirtyness, registerOperator, removeOperator, removeSubprocess, shouldAddNonConsumedInput, shouldReturnInnerOutput, unregisterOperator, updateExecutionOrder, walk
 
Methods inherited from class com.rapidminer.operator.Operator
acceptsInput, addError, addError, addValue, addWarning, apply, apply, checkAll, checkAllExcludingMetaData, checkForStop, clearErrorList, createExperimentTree, createExperimentTree, createFromXML, createFromXML, createFromXML, createMarkedExperimentTree, createMarkedProcessTree, createProcessTree, disconnectPorts, execute, fireUpdate, getAddOnlyAdditionalOutput, getApplyCount, getCompatibilityLevel, getDeliveredOutputClasses, getDeprecationInfo, getDesiredInputClasses, getDOMRepresentation, getEncoding, getErrorList, getExecutionUnit, getExperiment, getIncompatibleVersionChanges, getInput, getInput, getInput, getInputClasses, getInputDescription, getInputPorts, getIODescription, getLog, getLogger, getName, getNumberOfBreakpoints, getOperatorClassName, getOperatorDescription, getOutputClasses, getOutputPorts, getParameter, getParameterAsBoolean, getParameterAsChar, getParameterAsColor, getParameterAsDouble, getParameterAsFile, getParameterAsFile, getParameterAsInputStream, getParameterAsInt, getParameterAsMatrix, getParameterAsRepositoryLocation, getParameterAsString, getParameterHandler, getParameterList, getParameters, getParameterTupel, getParameterType, getParent, getPortOwner, getProcess, getResourceConsumptionEstimator, getRoot, getStartTime, getTransformer, getUserDescription, getValue, getValues, getXML, getXML, getXML, hasBreakpoint, hasBreakpoint, hasInput, inApplyLoop, isDebugMode, isDirty, isExpanded, isParallel, isParameterSet, isRunning, log, log, logError, logNote, logWarning, makeDirty, makeDirtyOnUpdate, preAutoWire, producesOutput, register, remove, removeAndKeepConnections, rename, resume, setBreakpoint, setCompatibilityLevel, setEnabled, setEnclosingProcess, setExpanded, setInput, setListParameter, setPairParameter, setParameter, setParameters, setUserDescription, shouldAutoConnect, shouldStopStandaloneExecution, toString, transformMetaData, writeXML, writeXML
 
Methods inherited from class com.rapidminer.tools.AbstractObservable
addObserver, addObserverAsFirst, fireUpdate, removeObserver
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

PARAMETER_SPLIT

public static final java.lang.String PARAMETER_SPLIT
See Also:
Constant Field Values

SPLIT_MODES

public static final java.lang.String[] SPLIT_MODES

SPLIT_ABSOLUTE

public static final int SPLIT_ABSOLUTE
See Also:
Constant Field Values

SPLIT_RELATIVE

public static final int SPLIT_RELATIVE
See Also:
Constant Field Values

PARAMETER_SPLIT_RATIO

public static final java.lang.String PARAMETER_SPLIT_RATIO
See Also:
Constant Field Values

PARAMETER_TRAINING_SET_SIZE

public static final java.lang.String PARAMETER_TRAINING_SET_SIZE
See Also:
Constant Field Values

PARAMETER_TEST_SET_SIZE

public static final java.lang.String PARAMETER_TEST_SET_SIZE
See Also:
Constant Field Values

PARAMETER_SAMPLING_TYPE

public static final java.lang.String PARAMETER_SAMPLING_TYPE
See Also:
Constant Field Values

Constructor Detail

SplitValidationOperator

public SplitValidationOperator(OperatorDescription description)

Method Detail

estimatePerformance

public void estimatePerformance(ExampleSet inputSet)
                         throws OperatorException
Description copied from class: ValidationChain
This is the main method of the validation chain and must be implemented to estimate the performance of the inner operators on the given example set. The implementation can make use of the helper methods provided in this class.

Specified by:
estimatePerformance in class ValidationChain
Throws:
OperatorException

getTrainingSetSize

protected MDInteger getTrainingSetSize(MDInteger originalSize)
                                throws UndefinedParameterError
Specified by:
getTrainingSetSize in class ValidationChain
Throws:
UndefinedParameterError

getTestSetSize

protected MDInteger getTestSetSize(MDInteger originalSize)
                            throws UndefinedParameterError
Specified by:
getTestSetSize in class ValidationChain
Throws:
UndefinedParameterError

getParameterTypes

public java.util.List<ParameterType> getParameterTypes()
Description copied from class: Operator
Returns a list of ParameterTypes describing the parameters of this operator. The default implementation returns an empty list if no input objects can be retained, and special parameters for those input objects which can be prevented from being consumed. Note: every call creates new ParameterType instances. To access the already existing parameter types, use getParameters().getParameterTypes().

Specified by:
getParameterTypes in interface ParameterHandler
Overrides:
getParameterTypes in class ValidationChain


Copyright © 2001-2009 by Rapid-I