java.lang.Object
com.rapidminer.tools.AbstractObservable<Operator>
com.rapidminer.operator.Operator
com.rapidminer.operator.io.AbstractReader<ExampleSet>
com.rapidminer.operator.io.AbstractExampleSource
com.rapidminer.operator.io.ExampleSource
public class ExampleSource
This operator reads an example set from one or more files. The default parameter values should work for most file formats (including the format produced by the ExampleSetWriter, CSV, ...). Please refer to the section First steps/File formats for details on the attribute description file, which is set by the parameter attributes and is used to specify attribute types. You can use the wizard of this operator or the Attribute Editor tool to create these .aml meta data files for your data sets.
This operator supports reading data from multiple source files. Each attribute (including special attributes like labels, weights, ...) may be read from a different file. Please note that only the minimum number of lines over all files will be read, i.e. if one of the data source files has fewer lines than the others, only that number of examples will be read.
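
The minimum-line rule above can be illustrated with a small sketch. It assumes each source file has already been loaded into a list of lines; the class and method names are hypothetical and not part of the RapidMiner API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of RapidMiner: illustrates the minimum-line rule only.
final class MultiSourceSketch {

    // Returns how many examples can be built when every source must contribute one line.
    static int readableExamples(List<List<String>> sources) {
        if (sources.isEmpty()) {
            return 0;
        }
        int min = Integer.MAX_VALUE;
        for (List<String> lines : sources) {
            min = Math.min(min, lines.size());
        }
        return min;
    }

    public static void main(String[] args) {
        List<String> attributes = Arrays.asList("1.0,2.0", "3.0,4.0", "5.0,6.0");
        List<String> labels = Arrays.asList("yes", "no");
        // The label file has only two lines, so only two examples are readable.
        System.out.println(readableExamples(Arrays.asList(attributes, labels))); // prints 2
    }
}
```
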
The split points can be defined with regular expressions (please refer to the annex of the RapidMiner tutorial for an overview). The default split parameter ",\s*|;\s*|\s+" should work for most file formats. This regular expression describes the following column separators: a comma followed by any amount of whitespace, a semicolon followed by any amount of whitespace, or one or more whitespace characters.
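
As a rough illustration of how the default split expression behaves, the following sketch applies it with the standard java.util.regex.Pattern class. The class name and sample line are made up for this example, and the code does not show how the operator itself performs the split.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class: applies the default separator expression to one line.
final class ColumnSplitSketch {

    // The default split expression quoted in the description above.
    static final Pattern DEFAULT_SEPARATORS = Pattern.compile(",\\s*|;\\s*|\\s+");

    static String[] splitLine(String line) {
        // A limit of -1 keeps trailing empty columns instead of dropping them.
        return DEFAULT_SEPARATORS.split(line, -1);
    }

    public static void main(String[] args) {
        String line = "5.1, 3.5;1.4 0.2\tIris-setosa";
        // prints [5.1, 3.5, 1.4, 0.2, Iris-setosa]
        System.out.println(Arrays.toString(splitLine(line)));
    }
}
```
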
Values can also be quoted with ". You can escape quotes with a backslash, i.e. \". Please note that you can change these characters by adjusting the corresponding settings.
Additionally, you can specify comment characters, which can be used at arbitrary locations in the data lines. Any content after the comment character will be ignored. Unknown attribute values can be marked with empty strings (if this is possible for your column separators) or with a question mark (recommended).
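
The quoting rule can be sketched as a small character scanner. This is only an illustration under simplifying assumptions: it splits on a single separator character instead of a regular expression, and the class and method names are hypothetical, not the operator's actual parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of quote-aware splitting; not RapidMiner's implementation.
final class QuotedSplitSketch {

    static List<String> split(String line, char separator, char quote, char escape) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == escape && i + 1 < line.length() && line.charAt(i + 1) == quote) {
                current.append(quote); // an escaped quote becomes a literal quote
                i++;                   // skip the quote that was just consumed
            } else if (c == quote) {
                inQuotes = !inQuotes;  // quote characters delimit and are not kept
            } else if (c == separator && !inQuotes) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString()); // the last field has no trailing separator
        return fields;
    }

    public static void main(String[] args) {
        // prints [1, a, b, say "hi"] (three fields: 1 | a, b | say "hi")
        System.out.println(split("1,\"a, b\",\"say \\\"hi\\\"\"", ',', '"', '\\'));
    }
}
```
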
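
A minimal sketch of the comment and unknown-value conventions described above, assuming the comment characters are given as a plain string of single characters. The helper names are hypothetical, and the sketch ignores any interaction with quoting.

```java
// Hypothetical helpers, not part of RapidMiner: comment stripping and missing-value check.
final class CommentAndMissingSketch {

    // Drops everything from the first comment character to the end of the line.
    static String stripComment(String line, String commentChars) {
        for (int i = 0; i < line.length(); i++) {
            if (commentChars.indexOf(line.charAt(i)) >= 0) {
                return line.substring(0, i);
            }
        }
        return line;
    }

    // Treats an empty string or a question mark as an unknown attribute value.
    static boolean isMissing(String value) {
        String trimmed = value.trim();
        return trimmed.isEmpty() || trimmed.equals("?");
    }

    public static void main(String[] args) {
        System.out.println(stripComment("1,2,? # measured twice", "#")); // prints 1,2,? followed by a space
        System.out.println(isMissing("?"));                              // prints true
    }
}
```
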
| Nested Class Summary |
|---|

| Nested classes/interfaces inherited from class com.rapidminer.operator.io.AbstractReader |
|---|
| AbstractReader.ReaderDescription |
| Field Summary | |
|---|---|
| static java.lang.String | PARAMETER_ATTRIBUTES: The parameter name for "Filename for the XML attribute description file." |
| static java.lang.String | PARAMETER_COLUMN_SEPARATORS: The parameter name for "Column separators for data files (regular expression)" |
| static java.lang.String | PARAMETER_COMMENT_CHARS: The parameter name for "Lines beginning with these characters are ignored." |
| static java.lang.String | PARAMETER_DATAMANAGEMENT: The parameter name for "Determines how the data is represented internally." |
| static java.lang.String | PARAMETER_DECIMAL_POINT_CHARACTER: The parameter name for "Character that is used as decimal point." |
| static java.lang.String | PARAMETER_PERMUTATE: The parameter name for "Indicates if the loaded data should be permuted." |
| static java.lang.String | PARAMETER_QUOTE_CHARACTER: Specifies the quoting character to use. |
| static java.lang.String | PARAMETER_QUOTING_ESCAPE_CHARACTER: Specifies the character used to escape quotes. |
| static java.lang.String | PARAMETER_SAMPLE_RATIO: The parameter name for "The fraction of the data set which should be read (1 = all; only used if sample_size = -1)" |
| static java.lang.String | PARAMETER_SAMPLE_SIZE: The parameter name for "The exact number of samples which should be read (-1 = use sample ratio; if not -1, sample_ratio will not have any effect)" |
| static java.lang.String | PARAMETER_SKIP_ERROR_LINES: Indicates if lines leading to errors should be skipped. |
| static java.lang.String | PARAMETER_TRIM_LINES: Indicates if the lines should be trimmed during reading. |
| static java.lang.String | PARAMETER_USE_COMMENT_CHARACTERS: The parameter name for "Indicates if a comment character should be used" |
| static java.lang.String | PARAMETER_USE_QUOTES: The parameter name for "Indicates if quotes should be regarded (slower!)." |
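
The precedence between sample_size and sample_ratio stated in the field descriptions can be sketched as follows. This only illustrates which of the two settings takes effect; how the operator actually selects lines (for example, randomly per line) is not described here, and the class, method, and rounding choice are assumptions made for this example.

```java
// Hypothetical sketch of the sample_size / sample_ratio precedence; not RapidMiner code.
final class SamplingSketch {

    // Returns the number of examples one would expect to read from 'available' lines.
    static long expectedExamples(long available, int sampleSize, double sampleRatio) {
        if (sampleSize != -1) {
            // An explicit sample size wins; sample_ratio has no effect in this case.
            return Math.min(available, sampleSize);
        }
        // sample_size = -1: fall back to the ratio (1 = all). Rounding is an assumption.
        return Math.round(available * sampleRatio);
    }

    public static void main(String[] args) {
        System.out.println(expectedExamples(1000, -1, 0.25));  // prints 250
        System.out.println(expectedExamples(1000, 100, 0.25)); // prints 100
    }
}
```
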
| Constructor Summary | |
|---|---|
| ExampleSource(OperatorDescription description) | |
| Method Summary | |
|---|---|
| ExampleSet | createExampleSet(): Creates (or reads) the ExampleSet that will be returned by Operator.apply(). |
| MetaData | getGeneratedMetaData() |
| java.util.List<ParameterType> | getParameterTypes(): Returns a list of ParameterTypes describing the parameters of this operator. |
| protected boolean | isMetaDataCacheable() |
| protected boolean | supportsEncoding() |

| Methods inherited from class com.rapidminer.operator.io.AbstractExampleSource |
|---|
| read |

| Methods inherited from class com.rapidminer.operator.io.AbstractReader |
|---|
| addAnnotations, canMakeReaderFor, createReader, doWork, getFileParameterForOperator, registerOperator, registerReaderDescription |

| Methods inherited from class com.rapidminer.tools.AbstractObservable |
|---|
| addObserver, addObserverAsFirst, fireUpdate, removeObserver |

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |

| Field Detail |
|---|
public static final java.lang.String PARAMETER_ATTRIBUTES
public static final java.lang.String PARAMETER_SAMPLE_RATIO
public static final java.lang.String PARAMETER_SAMPLE_SIZE
public static final java.lang.String PARAMETER_PERMUTATE
public static final java.lang.String PARAMETER_COLUMN_SEPARATORS
public static final java.lang.String PARAMETER_USE_COMMENT_CHARACTERS
public static final java.lang.String PARAMETER_COMMENT_CHARS
public static final java.lang.String PARAMETER_DECIMAL_POINT_CHARACTER
public static final java.lang.String PARAMETER_USE_QUOTES
public static final java.lang.String PARAMETER_QUOTE_CHARACTER
public static final java.lang.String PARAMETER_QUOTING_ESCAPE_CHARACTER
public static final java.lang.String PARAMETER_TRIM_LINES
public static final java.lang.String PARAMETER_SKIP_ERROR_LINES
public static final java.lang.String PARAMETER_DATAMANAGEMENT
| Constructor Detail |
|---|
public ExampleSource(OperatorDescription description)
| Method Detail |
|---|
public MetaData getGeneratedMetaData()
throws OperatorException
getGeneratedMetaData in class AbstractExampleSource
Throws: OperatorException
protected boolean isMetaDataCacheable()
isMetaDataCacheable in class AbstractReader<ExampleSet>
public ExampleSet createExampleSet()
throws OperatorException
Description copied from class: AbstractExampleSource
Creates (or reads) the ExampleSet that will be returned by Operator.apply().
createExampleSet in class AbstractExampleSource
Throws: OperatorException
protected boolean supportsEncoding()
supportsEncoding in class AbstractReader<ExampleSet>
public java.util.List<ParameterType> getParameterTypes()
Description copied from class: Operator
Returns a list of ParameterTypes describing the parameters of this operator.
getParameterTypes in interface ParameterHandler
getParameterTypes in class AbstractReader<ExampleSet>