java.lang.Object
  com.rapidminer.operator.Operator
    com.rapidminer.operator.io.AbstractReader<ExampleSet>
      com.rapidminer.operator.io.AbstractExampleSource
        com.rapidminer.operator.io.SimpleExampleSource
public class SimpleExampleSource extends AbstractExampleSource
This operator reads an example set from a file. The default parameter values
should work for most file formats (including the format
produced by the ExampleSetWriter, CSV, ...). In many cases this operator
is more appropriate for CSV-based file formats than the CSVExampleSource operator
itself, since it gives better control over settings such as the column separators.
In contrast to the usual ExampleSource operator, this operator can read the attribute names from the first line of the data file. There is one restriction, however: the data can only be read from a single file, not from multiple files. If you need a fully flexible operator for data loading, use the more powerful ExampleSource operator, which also provides more parameters for tuning, for example, the quoting mechanism and other specialized settings.
The column split points can be defined with regular expressions (please refer to the annex of the RapidMiner tutorial). The default split parameter ",\s*|;\s*|\s+" should work for most file formats. This regular expression describes the following column separators: a comma followed by optional whitespace, a semicolon followed by optional whitespace, or one or more whitespace characters.
Quoting is also possible with ". Escaping a quote is done with \". Additionally, you can specify comment characters, which may appear anywhere in a data line and cause the rest of that line to be skipped. Unknown attribute values can be marked with empty strings or a question mark.
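The splitting rules described above can be tried out in plain Java without RapidMiner. The following is a minimal sketch, not the operator's actual implementation: the class `ColumnSplitSketch` and its helpers `stripComment` and `splitLine` are hypothetical names, and the sketch deliberately ignores quoting and escaping. It applies the default split expression, cuts a line at the first comment character, and maps empty strings and question marks to `null` to mark unknown values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class ColumnSplitSketch {
    // Default split expression as quoted in the class description.
    static final Pattern DEFAULT_SEPARATORS = Pattern.compile(",\\s*|;\\s*|\\s+");

    /** Cuts the line at the first comment character (hypothetical helper; ignores quoting). */
    static String stripComment(String line, String commentChars) {
        int cut = line.length();
        for (char c : commentChars.toCharArray()) {
            int pos = line.indexOf(c);
            if (pos >= 0 && pos < cut) {
                cut = pos;
            }
        }
        return line.substring(0, cut);
    }

    /** Splits one data line; empty strings and "?" become null to mark unknown values. */
    static List<String> splitLine(String line, String commentChars) {
        String data = stripComment(line, commentChars).trim();
        List<String> values = new ArrayList<>();
        // Limit -1 keeps trailing empty columns instead of silently dropping them.
        for (String token : DEFAULT_SEPARATORS.split(data, -1)) {
            values.add(token.isEmpty() || token.equals("?") ? null : token);
        }
        return values;
    }

    public static void main(String[] args) {
        // prints [5.1, 3.5, null, setosa]
        System.out.println(splitLine("5.1, 3.5;?  setosa # a comment", "#"));
    }
}
```

Because whitespace alone counts as a separator under the default expression, values that contain spaces would need quoting or a narrower split expression.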
| Field Summary | |
|---|---|
| static java.lang.String | PARAMETER_COLUMN_SEPARATORS |
| static java.lang.String | PARAMETER_COMMENT_CHARS: The parameter name for "Lines beginning with these characters are ignored." |
| static java.lang.String | PARAMETER_DATAMANAGEMENT: The parameter name for "Determines how the data is represented internally." |
| static java.lang.String | PARAMETER_DECIMAL_POINT_CHARACTER: The parameter name for "Character that is used as decimal point." |
| static java.lang.String | PARAMETER_FILENAME |
| static java.lang.String | PARAMETER_ID_COLUMN: The parameter name for "Column number of the id attribute (only used if id_name is empty; 0 = none; negative values are counted from the last column)" |
| static java.lang.String | PARAMETER_ID_NAME: The parameter name for "Name of the id attribute (if empty, the column defined by id_column will be used)" |
| static java.lang.String | PARAMETER_LABEL_COLUMN: The parameter name for "Column number of the label attribute (only used if label_name is empty; 0 = none; negative values are counted from the last column)" |
| static java.lang.String | PARAMETER_LABEL_NAME: The parameter name for "Name of the label attribute (if empty, the column defined by label_column will be used)" |
| static java.lang.String | PARAMETER_READ_ATTRIBUTE_NAMES |
| static java.lang.String | PARAMETER_SAMPLE_RATIO: The parameter name for "The fraction of the data set which should be read (1 = all; only used if sample_size = -1)" |
| static java.lang.String | PARAMETER_SAMPLE_SIZE: The parameter name for "The exact number of samples which should be read (-1 = use sample ratio; if not -1, sample_ratio will not have any effect)" |
| static java.lang.String | PARAMETER_SKIP_ERROR_LINES |
| static java.lang.String | PARAMETER_TRIM_LINES |
| static java.lang.String | PARAMETER_USE_COMMENT_CHARACTERS: The parameter name for "Indicates if a comment character should be used" |
| static java.lang.String | PARAMETER_USE_QUOTES |
| static java.lang.String | PARAMETER_WEIGHT_COLUMN: The parameter name for "Column number of the weight attribute (only used if weight_name is empty; 0 = none; negative values are counted from the last column)" |
| static java.lang.String | PARAMETER_WEIGHT_NAME: The parameter name for "Name of the weight attribute (if empty, the column defined by weight_column will be used)" |
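The id, label and weight column parameters share one convention: 0 means none, and negative values are counted from the last column. The sketch below shows one way to turn such a value into a zero-based index. It is illustrative only. The class `ColumnIndexSketch` and the method `resolveColumn` are hypothetical names, and the sketch assumes that positive values are 1-based and that -1 refers to the last column, which the descriptions suggest but do not state outright.

```java
public class ColumnIndexSketch {
    /**
     * Converts a column parameter value into a zero-based index, or -1 for "none".
     * Assumption: positive values are 1-based and -1 denotes the last column.
     */
    static int resolveColumn(int parameterValue, int numberOfColumns) {
        if (parameterValue == 0) {
            return -1; // 0 = none
        }
        int index = parameterValue > 0
                ? parameterValue - 1                 // 1 is the first column
                : numberOfColumns + parameterValue;  // -1 is the last column
        if (index < 0 || index >= numberOfColumns) {
            throw new IllegalArgumentException(
                    "Column " + parameterValue + " is out of range for " + numberOfColumns + " columns");
        }
        return index;
    }

    public static void main(String[] args) {
        System.out.println(resolveColumn(-1, 5)); // prints 4
        System.out.println(resolveColumn(1, 5));  // prints 0
        System.out.println(resolveColumn(0, 5));  // prints -1
    }
}
```

Per the descriptions, a column number is only consulted when the corresponding name parameter (for example label_name) is empty.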
| Constructor Summary | |
|---|---|
| SimpleExampleSource(OperatorDescription description) | |
| Method Summary | |
|---|---|
| ExampleSet | createExampleSet(): Creates (or reads) the ExampleSet that will be returned by AbstractReader.apply(). |
| java.util.List<ParameterType> | getParameterTypes(): Returns a list of ParameterTypes describing the parameters of this operator. |
| Methods inherited from class com.rapidminer.operator.io.AbstractExampleSource |
|---|
| read |
| Methods inherited from class com.rapidminer.operator.io.AbstractReader |
|---|
| apply, getInputClasses, getOutputClasses |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Field Detail |
|---|
public static final java.lang.String PARAMETER_LABEL_NAME
public static final java.lang.String PARAMETER_LABEL_COLUMN
public static final java.lang.String PARAMETER_ID_NAME
public static final java.lang.String PARAMETER_ID_COLUMN
public static final java.lang.String PARAMETER_WEIGHT_NAME
public static final java.lang.String PARAMETER_WEIGHT_COLUMN
public static final java.lang.String PARAMETER_SAMPLE_RATIO
public static final java.lang.String PARAMETER_SAMPLE_SIZE
public static final java.lang.String PARAMETER_DATAMANAGEMENT
public static final java.lang.String PARAMETER_USE_COMMENT_CHARACTERS
public static final java.lang.String PARAMETER_COMMENT_CHARS
public static final java.lang.String PARAMETER_DECIMAL_POINT_CHARACTER
public static final java.lang.String PARAMETER_FILENAME
public static final java.lang.String PARAMETER_READ_ATTRIBUTE_NAMES
public static final java.lang.String PARAMETER_USE_QUOTES
public static final java.lang.String PARAMETER_TRIM_LINES
public static final java.lang.String PARAMETER_SKIP_ERROR_LINES
public static final java.lang.String PARAMETER_COLUMN_SEPARATORS
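The descriptions of the two sampling parameters imply a precedence: sample_size wins unless it is -1, in which case sample_ratio applies. A minimal sketch of that precedence follows. The class `SampleCountSketch` and the method `rowsToRead` are hypothetical, and the sketch assumes the ratio translates into a fixed row count; the real operator may instead sample rows at random, in which case the result is only the expected count.

```java
public class SampleCountSketch {
    /**
     * Returns how many of totalRows would be read under the documented precedence.
     * Assumption: the ratio is applied as a simple fraction of the total row count.
     */
    static long rowsToRead(long totalRows, int sampleSize, double sampleRatio) {
        if (sampleSize != -1) {
            // An explicit sample size overrides the ratio; it cannot exceed the available rows.
            return Math.min(sampleSize, totalRows);
        }
        return Math.round(totalRows * sampleRatio);
    }

    public static void main(String[] args) {
        System.out.println(rowsToRead(1000, -1, 1.0));  // prints 1000
        System.out.println(rowsToRead(1000, 100, 0.5)); // prints 100
        System.out.println(rowsToRead(1000, -1, 0.25)); // prints 250
    }
}
```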
| Constructor Detail |
|---|
public SimpleExampleSource(OperatorDescription description)
| Method Detail |
|---|
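public ExampleSet createExampleSet() throws OperatorException
Description copied from class: AbstractExampleSource
Creates (or reads) the ExampleSet that will be returned by AbstractReader.apply().
Specified by: createExampleSet in class AbstractExampleSource
Throws: OperatorException

public java.util.List<ParameterType> getParameterTypes()
Description copied from class: Operator
Returns a list of ParameterTypes describing the parameters of this operator.
Specified by: getParameterTypes in interface ParameterHandler
Overrides: getParameterTypes in class Operator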