com.rapidminer.operator.preprocessing.join
Class ExampleSetJoin

java.lang.Object
  extended by com.rapidminer.tools.AbstractObservable<Operator>
      extended by com.rapidminer.operator.Operator
          extended by com.rapidminer.operator.preprocessing.join.AbstractExampleSetJoin
              extended by com.rapidminer.operator.preprocessing.join.ExampleSetJoin
All Implemented Interfaces:
ConfigurationListener, PreviewListener, ResourceConsumer, ParameterHandler, LoggingHandler, Observable<Operator>

public class ExampleSetJoin
extends AbstractExampleSetJoin

Builds the join of two example sets using their id attributes: both example sets must have an id attribute, and equal ids identify the same example. If examples are missing, an exception is thrown. The resulting example set contains the same number of examples, and its attributes are either the union set or the union list of both attribute sets, depending on the parameter setting (duplicate, or "double", attributes are removed or renamed, respectively). When duplicate attributes are removed, their values must be identical for the corresponding examples of both example sets; otherwise an exception is thrown.

Note that this check for duplicate attributes applies only to regular attributes. Special attributes of the second input example set that do not exist in the first example set are simply added; those that already exist are skipped.
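
The id-based join and the two ways of handling duplicate attributes described above can be illustrated with a minimal sketch. It does not use the RapidMiner API: plain maps stand in for example sets, the class name IdJoinSketch and the rename suffix "_right" are hypothetical, and the operator's actual renaming scheme and exception types may differ. It assumes Java 9 or later for Map.of.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Illustrative only: plain maps stand in for example sets; none of this is RapidMiner API. */
final class IdJoinSketch {

    /** Hypothetical suffix for renamed duplicates; the operator's real naming scheme may differ. */
    static final String RENAME_SUFFIX = "_right";

    /**
     * Joins two tables keyed by id. Each table maps id -> (attribute name -> value).
     * Duplicate attributes are either removed (after checking that both sides agree)
     * or kept under a renamed key.
     */
    static Map<String, Map<String, Object>> join(
            Map<String, Map<String, Object>> left,
            Map<String, Map<String, Object>> right,
            boolean removeDuplicateAttributes) {
        // Every example must be present on both sides.
        if (!left.keySet().equals(right.keySet())) {
            throw new IllegalArgumentException("Both sets must contain the same ids");
        }
        Map<String, Map<String, Object>> result = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, Object>> leftEntry : left.entrySet()) {
            String id = leftEntry.getKey();
            Map<String, Object> leftRow = leftEntry.getValue();
            Map<String, Object> row = new LinkedHashMap<>(leftRow);
            for (Map.Entry<String, Object> attribute : right.get(id).entrySet()) {
                String name = attribute.getKey();
                Object value = attribute.getValue();
                if (!leftRow.containsKey(name)) {
                    row.put(name, value); // attribute only exists on the right side
                } else if (removeDuplicateAttributes) {
                    // Union set: keep the left copy, but both copies must agree.
                    if (!Objects.equals(leftRow.get(name), value)) {
                        throw new IllegalArgumentException(
                                "Conflicting values for attribute '" + name + "' at id " + id);
                    }
                } else {
                    // Union list: keep both copies, renaming the right one.
                    row.put(name + RENAME_SUFFIX, value);
                }
            }
            result.put(id, row);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> left =
                Map.of("id1", Map.<String, Object>of("age", 30, "city", "Berlin"));
        Map<String, Map<String, Object>> right =
                Map.of("id1", Map.<String, Object>of("city", "Berlin", "income", 100));
        System.out.println(join(left, right, true));  // duplicate "city" removed
        System.out.println(join(left, right, false)); // duplicate "city" renamed
    }
}
```

The sketch treats all attributes alike; the distinction between regular and special attributes noted above is not modelled.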

Author:
Ingo Mierswa, Tobias Malbrecht

Nested Class Summary
 
Nested classes/interfaces inherited from class com.rapidminer.operator.preprocessing.join.AbstractExampleSetJoin
AbstractExampleSetJoin.AttributeSource
 
Field Summary
static int JOIN_TYPE_INNER
           
static int JOIN_TYPE_LEFT
           
static int JOIN_TYPE_OUTER
           
static int JOIN_TYPE_RIGHT
           
static java.lang.String[] JOIN_TYPES
           
static java.lang.String PARAMETER_JOIN_TYPE
           
 
Fields inherited from class com.rapidminer.operator.preprocessing.join.AbstractExampleSetJoin
PARAMETER_REMOVE_DOUBLE_ATTRIBUTES
 
Constructor Summary
ExampleSetJoin(OperatorDescription description)
           
 
Method Summary
 java.util.List<ParameterType> getParameterTypes()
          Returns a list of ParameterTypes describing the parameters of this operator.
 ResourceConsumptionEstimator getResourceConsumptionEstimator()
          Subclasses can override this method if they are able to estimate the consumed resources (CPU time and memory), based on their input.
protected  boolean isIdNeeded()
           
protected  MemoryExampleTable joinData(ExampleSet leftExampleSet, ExampleSet rightExampleSet, java.util.List<AbstractExampleSetJoin.AttributeSource> originalAttributeSources, java.util.List<Attribute> unionAttributeList)
           
 
Methods inherited from class com.rapidminer.operator.preprocessing.join.AbstractExampleSetJoin
containsAttribute, doWork
 
Methods inherited from class com.rapidminer.operator.Operator
acceptsInput, addError, addError, addValue, addWarning, apply, apply, assumePreconditionsSatisfied, checkAll, checkAllExcludingMetaData, checkDeprecations, checkForStop, checkIO, checkProperties, clear, clearErrorList, cloneOperator, collectErrors, createExperimentTree, createExperimentTree, createFromXML, createFromXML, createFromXML, createMarkedExperimentTree, createMarkedProcessTree, createProcessTree, createProcessTree, disconnectPorts, execute, fireUpdate, freeMemory, getAddOnlyAdditionalOutput, getApplyCount, getCompatibilityLevel, getDeliveredOutputClasses, getDeprecationInfo, getDesiredInputClasses, getDOMRepresentation, getEncoding, getErrorList, getExecutionUnit, getExperiment, getIncompatibleVersionChanges, getInput, getInput, getInput, getInputClasses, getInputDescription, getInputPorts, getIODescription, getLog, getLogger, getName, getNumberOfBreakpoints, getOperatorClassName, getOperatorDescription, getOutputClasses, getOutputPorts, getParameter, getParameterAsBoolean, getParameterAsChar, getParameterAsColor, getParameterAsDouble, getParameterAsFile, getParameterAsFile, getParameterAsInputStream, getParameterAsInt, getParameterAsMatrix, getParameterAsRepositoryLocation, getParameterAsString, getParameterHandler, getParameterList, getParameters, getParameterTupel, getParameterType, getParent, getPortOwner, getProcess, getRoot, getStartTime, getTransformer, getUserDescription, getValue, getValues, getXML, getXML, getXML, hasBreakpoint, hasBreakpoint, hasInput, inApplyLoop, isDebugMode, isDirty, isEnabled, isExpanded, isParallel, isParameterSet, isRunning, log, log, logError, logNote, logWarning, lookupOperator, makeDirty, makeDirtyOnUpdate, notifyRenaming, performAdditionalChecks, preAutoWire, processFinished, processStarts, producesOutput, propagateDirtyness, register, registerOperator, remove, removeAndKeepConnections, rename, resume, setBreakpoint, setCompatibilityLevel, setEnabled, setEnclosingProcess, setExpanded, setInput, setListParameter, setPairParameter, setParameter, setParameters, setUserDescription, shouldAutoConnect, shouldAutoConnect, shouldStopStandaloneExecution, toString, transformMetaData, unregisterOperator, updateExecutionOrder, walk, writeXML, writeXML
 
Methods inherited from class com.rapidminer.tools.AbstractObservable
addObserver, addObserverAsFirst, fireUpdate, removeObserver
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

PARAMETER_JOIN_TYPE

public static final java.lang.String PARAMETER_JOIN_TYPE
See Also:
Constant Field Values

JOIN_TYPES

public static final java.lang.String[] JOIN_TYPES

JOIN_TYPE_INNER

public static final int JOIN_TYPE_INNER
See Also:
Constant Field Values

JOIN_TYPE_LEFT

public static final int JOIN_TYPE_LEFT
See Also:
Constant Field Values

JOIN_TYPE_RIGHT

public static final int JOIN_TYPE_RIGHT
See Also:
Constant Field Values

JOIN_TYPE_OUTER

public static final int JOIN_TYPE_OUTER
See Also:
Constant Field Values
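
This page lists the four join type constants without describing them. The sketch below assumes they carry the conventional relational meaning (inner, left, right and full outer join) and shows which ids would appear in the result for each. The class JoinTypeSketch and its enum are hypothetical stand-ins; the integer values of the JOIN_TYPE_* constants are not shown on this page and are not reproduced here.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Illustrative only: shows which ids survive under each conventional join type. */
final class JoinTypeSketch {

    /** Hypothetical stand-in for the JOIN_TYPE_* int constants. */
    enum JoinType { INNER, LEFT, RIGHT, OUTER }

    static Set<String> resultIds(Set<String> leftIds, Set<String> rightIds, JoinType type) {
        Set<String> ids = new LinkedHashSet<>();
        switch (type) {
            case INNER: // ids present on both sides
                ids.addAll(leftIds);
                ids.retainAll(rightIds);
                break;
            case LEFT:  // every id of the left set
                ids.addAll(leftIds);
                break;
            case RIGHT: // every id of the right set
                ids.addAll(rightIds);
                break;
            case OUTER: // ids present on either side
                ids.addAll(leftIds);
                ids.addAll(rightIds);
                break;
        }
        return ids;
    }
}
```

For example, with left ids {a, b} and right ids {b, c}, the sketch yields {b} for INNER and {a, b, c} for OUTER.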

Constructor Detail

ExampleSetJoin

public ExampleSetJoin(OperatorDescription description)

Method Detail

joinData

protected MemoryExampleTable joinData(ExampleSet leftExampleSet,
                                      ExampleSet rightExampleSet,
                                      java.util.List<AbstractExampleSetJoin.AttributeSource> originalAttributeSources,
                                      java.util.List<Attribute> unionAttributeList)
                               throws OperatorException
Specified by:
joinData in class AbstractExampleSetJoin
Throws:
OperatorException

isIdNeeded

protected boolean isIdNeeded()
Specified by:
isIdNeeded in class AbstractExampleSetJoin

getParameterTypes

public java.util.List<ParameterType> getParameterTypes()
Description copied from class: Operator
Returns a list of ParameterTypes describing the parameters of this operator. The default implementation returns an empty list if no input objects can be retained; otherwise it returns special parameters for those input objects that can be prevented from being consumed. Note that this method creates new ParameterType objects. To access the already existing parameter types, use getParameters().getParameterTypes().

Specified by:
getParameterTypes in interface ParameterHandler
Overrides:
getParameterTypes in class AbstractExampleSetJoin

getResourceConsumptionEstimator

public ResourceConsumptionEstimator getResourceConsumptionEstimator()
Description copied from class: Operator
Subclasses can override this method if they are able to estimate the consumed resources (CPU time and memory), based on their input. The default implementation returns null.

Specified by:
getResourceConsumptionEstimator in interface ResourceConsumer
Overrides:
getResourceConsumptionEstimator in class Operator


Copyright © 2001-2009 by Rapid-I