java.lang.Object
com.rapidminer.operator.Operator
com.rapidminer.operator.io.AbstractReader<ExampleSet>
com.rapidminer.operator.io.AbstractExampleSource
com.rapidminer.operator.io.ResultSetExampleSource
com.rapidminer.operator.io.DatabaseExampleSource
public class DatabaseExampleSource extends ResultSetExampleSource
This operator reads an ExampleSet from an SQL
database. The SQL query can be passed to RapidMiner via a parameter or, in case of
long SQL statements, in a separate file. Please note that column names are
often case sensitive. Databases may behave differently here.
The most convenient way of defining the necessary parameters is the configuration wizard. The most important parameters (database URL and user name) will be automatically determined by this wizard and it is also possible to define the special attributes like labels or ids.
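Please note that this operator supports two basic working modes:
1. reading the data from the database into main memory (memory mode)
2. keeping the data in the database and working directly on the database (database working mode)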
The latter possibility is turned on by the parameter "work_on_database". Please note that this working mode is still regarded as experimental and errors might occur. To ensure proper data changes, the database working mode is only allowed on a single table, which must be defined with the parameter "table_name". IMPORTANT: If you encounter problems during data updates (e.g. messages that the result set is not updatable), you probably have to define a primary key for your table.
If you are not working directly on the database, the data will be read with an arbitrary SQL query statement (SELECT ... FROM ... WHERE ...) defined by "query" or "query_file". The memory mode is the recommended way of using this operator. This is especially important for subsequent operators like learning schemes, which would often load (most of) the data into main memory during the learning process anyway. In these cases, working directly on the database is not recommended.
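The choice between "query", "query_file" and "table_name" can be pictured with a small stand-alone sketch. It is not the operator's actual implementation: the class `QueryResolutionSketch`, the method `resolveQuery`, the precedence of "query" over "query_file", and the generated `SELECT * FROM ...` statement are all assumptions made for illustration. Only the parameter names and the rule that "table_name" is used when work_on_database is true or no other query is specified come from this page.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of RapidMiner.
final class QueryResolutionSketch {

    /** Picks the SQL statement to run from the four parameter values. */
    static String resolveQuery(boolean workOnDatabase, String query,
                               String queryFile, String tableName) {
        if (workOnDatabase) {
            // Database working mode is restricted to a single table.
            if (isBlank(tableName)) {
                throw new IllegalArgumentException("work_on_database requires table_name");
            }
            return "SELECT * FROM " + tableName; // assumed statement form
        }
        if (!isBlank(query)) {
            return query.trim(); // assumption: an inline query wins over a file
        }
        if (!isBlank(queryFile)) {
            try {
                byte[] bytes = Files.readAllBytes(Paths.get(queryFile));
                return new String(bytes, StandardCharsets.UTF_8).trim();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        if (!isBlank(tableName)) {
            // Fallback when no other query is specified.
            return "SELECT * FROM " + tableName;
        }
        throw new IllegalArgumentException("set one of query, query_file or table_name");
    }

    static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(resolveQuery(false, "SELECT id, label FROM iris", null, null));
        System.out.println(resolveQuery(false, null, null, "iris"));
        System.out.println(resolveQuery(true, "ignored in this sketch", null, "iris"));
    }
}
```

In this sketch an inline query is ignored when the database working mode is requested, which mirrors the single-table restriction described above.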
Because the ResultSetMetaData interface does not provide
information about the possible values of nominal attributes, the internal
indices the nominal values are mapped to depend on the order in which
they appear in the table. This causes problems only when processes are
split up into a training process and an application or testing process.
For learning schemes which are capable of handling nominal attributes, this
is not a problem. If a learning scheme like an SVM is used with nominal data,
RapidMiner pretends that nominal attributes are numerical and uses the indices of the
nominal values as their numerical values. An SVM may perform well if there are
only two possible values. If a test set is read in another process, the
nominal values may be assigned different indices, and hence the trained SVM
is useless. This is not a problem for label attributes, since the classes can
be specified using the classes parameter, and hence all
learning schemes intended for use with nominal data are safe to use.
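The index problem described above can be reproduced without a database. The following is a minimal sketch, not RapidMiner code: the class `NominalIndexDemo` and both methods are hypothetical, and it assumes that indices are handed out in order of first appearance. The second method imitates a whitespace separated list of class values in the style of the classes parameter.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, not part of RapidMiner.
final class NominalIndexDemo {

    /** Assigns indices in order of first appearance in the column. */
    static Map<String, Integer> indexByAppearance(List<String> column) {
        Map<String, Integer> mapping = new LinkedHashMap<>();
        for (String value : column) {
            // The next free index is the number of values seen so far.
            mapping.putIfAbsent(value, mapping.size());
        }
        return mapping;
    }

    /** Assigns indices from a whitespace separated list of declared values. */
    static Map<String, Integer> indexByDeclaration(String classes) {
        return indexByAppearance(Arrays.asList(classes.trim().split("\\s+")));
    }

    public static void main(String[] args) {
        List<String> training = Arrays.asList("red", "blue", "red");
        List<String> test = Arrays.asList("blue", "red");

        // Same values, different row order, different indices.
        System.out.println(indexByAppearance(training)); // {red=0, blue=1}
        System.out.println(indexByAppearance(test));     // {blue=0, red=1}

        // A declared order is independent of the row order.
        System.out.println(indexByDeclaration("red blue")); // {red=0, blue=1}
    }
}
```

A model that treats these indices as numbers would see "red" as 0 during training and as 1 during testing, which is the failure mode the paragraph above warns about. Declaring the values up front removes the dependence on row order.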
| Field Summary | |
|---|---|
| static java.lang.String | PARAMETER_CLASSES: The parameter name for "Whitespace separated list of possible class values of the label attribute. |
| static java.lang.String | PARAMETER_DATABASE_SYSTEM: The parameter name for "Indicates the used database system" |
| static java.lang.String | PARAMETER_DATABASE_URL: The parameter name for "The complete URL connection string for the database, e.g. |
| static java.lang.String | PARAMETER_PASSWORD: The parameter name for "Password for the database. |
| static java.lang.String | PARAMETER_QUERY: The parameter name for "SQL query. |
| static java.lang.String | PARAMETER_QUERY_FILE: The parameter name for "File containing the query. |
| static java.lang.String | PARAMETER_TABLE_NAME: The parameter name for "Use this table if work_on_database is true or no other query is specified. |
| static java.lang.String | PARAMETER_USERNAME: The parameter name for "Database username. |
| static java.lang.String | PARAMETER_WORK_ON_DATABASE: The parameter name for "If set to true, the data read from the database is NOT copied to main memory. |
| Fields inherited from class com.rapidminer.operator.io.ResultSetExampleSource |
|---|
PARAMETER_DATAMANAGEMENT, PARAMETER_ID_ATTRIBUTE, PARAMETER_LABEL_ATTRIBUTE, PARAMETER_WEIGHT_ATTRIBUTE |
| Constructor Summary | |
|---|---|
| DatabaseExampleSource(OperatorDescription description) | |
| Method Summary | |
|---|---|
| ExampleSet | createExampleSet(): Creates (or reads) the ExampleSet that will be returned by AbstractReader.apply(). |
| protected DatabaseHandler | getConnectedDatabaseHandler() |
| java.util.List<ParameterType> | getParameterTypes(): Returns a list of ParameterTypes describing the parameters of this operator. |
| java.sql.ResultSet | getResultSet(): This method reads the file whose name is given, extracts the database access information and the query from it and executes the query. |
| void | processFinished(): Called at the end of the process. |
| void | setNominalValues(java.util.List attributeList, java.sql.ResultSet resultSet, Attribute label): Since the ResultSet does not provide information about possible values of nominal attributes, subclasses must set these by implementing this method. |
| void | tearDown(): This method is invoked at the end of the data query process. |
| Methods inherited from class com.rapidminer.operator.io.ResultSetExampleSource |
|---|
createExampleSet |
| Methods inherited from class com.rapidminer.operator.io.AbstractExampleSource |
|---|
read |
| Methods inherited from class com.rapidminer.operator.io.AbstractReader |
|---|
apply, getInputClasses, getOutputClasses |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Field Detail |
|---|
public static final java.lang.String PARAMETER_WORK_ON_DATABASE
public static final java.lang.String PARAMETER_DATABASE_SYSTEM
public static final java.lang.String PARAMETER_DATABASE_URL
public static final java.lang.String PARAMETER_USERNAME
public static final java.lang.String PARAMETER_PASSWORD
public static final java.lang.String PARAMETER_QUERY
public static final java.lang.String PARAMETER_QUERY_FILE
public static final java.lang.String PARAMETER_TABLE_NAME
public static final java.lang.String PARAMETER_CLASSES
| Constructor Detail |
|---|
public DatabaseExampleSource(OperatorDescription description)
| Method Detail |
|---|
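public ExampleSet createExampleSet()
                           throws OperatorException
Description copied from class: AbstractExampleSource
Creates (or reads) the ExampleSet that will be returned by AbstractReader.apply().
createExampleSet in class ResultSetExampleSource
Throws: OperatorException

public void tearDown()
Description copied from class: ResultSetExampleSource
This method is invoked at the end of the data query process.
tearDown in class ResultSetExampleSource

public void setNominalValues(java.util.List attributeList,
                             java.sql.ResultSet resultSet,
                             Attribute label)
                      throws UndefinedParameterError
Description copied from class: ResultSetExampleSource
Since the ResultSet does not provide information about possible
values of nominal attributes, subclasses must set these by implementing
this method.
setNominalValues in class ResultSetExampleSource
Parameters: attributeList - List of Attribute
Throws: UndefinedParameterError

protected DatabaseHandler getConnectedDatabaseHandler()
                                               throws OperatorException,
                                                      java.sql.SQLException
Throws: OperatorException, java.sql.SQLException

public java.sql.ResultSet getResultSet()
                                throws OperatorException
This method reads the file whose name is given, extracts the database access information and the query from it and executes the query.
getResultSet in class ResultSetExampleSource
Throws: OperatorException

public void processFinished()
Description copied from class: Operator
Called at the end of the process.
processFinished in class Operator

public java.util.List<ParameterType> getParameterTypes()
Description copied from class: Operator
Returns a list of ParameterTypes describing the parameters of this operator.
getParameterTypes in interface ParameterHandler
getParameterTypes in class ResultSetExampleSource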