Pages: [1]
  Print  
Author Topic: DatabaseExampleSource returns only 10 rows  (Read 2788 times)
ajay
Guest
« on: June 24, 2008, 08:05:14 AM »

Hello,
I have a problem with the DatabaseExampleSource in combination with Oracle and rapidminer version 4.1.

My query that is executed should return over 100.000 data rows. But all i get within rapidminer are the first 10 rows. This matches (maybe) the default fetch size of the ODBC driver? So it seems that the first 10 rows will be fetched and afterwards the operator terminates...

Sometimes I get the following error message:
Jun 24, 2008 8:49:46 AM: [Error] While reading examples from result set: operation not allowed: Unsupported syntax for refreshRow()

Btw: when i switch to version 4.0 everything works fine...

Any suggestions what I'm doing wrong?

greetz
ajay
Logged
Tobias Malbrecht
Global Moderator
Sr. Member
*****
Posts: 293



WWW
« Reply #1 on: June 24, 2008, 04:07:41 PM »

Hi,

I have not yet an idea, what could be the problem here. Does the error show up, when you only try to read the data via the DatabaseExamplesSource operator? Or does your process involve other operators? What settings do you use in the example source operator? Maybe you can post your process XML?

Regards,
Tobias
Logged

Tobias Malbrecht
Director of Product Marketing
RapidMiner
ajay
Guest
« Reply #2 on: June 24, 2008, 09:21:34 PM »

ok, I tried it with and without further operators, but this does not make any difference.
still it isn't a real error - it just does not retrieve all data rows Smiley

here are some more details:

rapidminer 4.0

Process XML:
Code:
<operator name="Root" class="Process">
    <operator name="DatabaseExampleSource" class="DatabaseExampleSource">
        <parameter key="database_system" value="Oracle"/>
        <parameter key="database_url" value="jdbc:oracle:thin:@*******:1521:***"/>
        <parameter key="password" value="***"/>
        <parameter key="query" value="SELECT JH_JOBNUMBER, DEVFROMMEAN FROM (  SELECT JH_JOBNUMBER, (CD_VALUE-AVG(CD_VALUE) OVER(PARTITION BY JH_JOBNUMBER, FEATURE_SIZE, ORIENTATION)) AS DEVFROMMEAN   FROM (   SELECT JH_JOBNUMBER, CD_FEATURENAME,    CASE     WHEN CD_FEATURENAME LIKE &#39;%280%&#39; THEN &#39;280&#39;    WHEN CD_FEATURENAME LIKE &#39;%400%&#39; THEN &#39;400&#39;   END AS FEATURE_SIZE,   CASE     WHEN CD_FEATURENAME LIKE &#39;%X%&#39; THEN &#39;X&#39;    WHEN CD_FEATURENAME LIKE &#39;%Y%&#39; THEN &#39;Y&#39;   END AS ORIENTATION,   CD_VALUE   FROM DIM_GENERAL_JOBHEADER header, DIM_CDFEATURE feature, FACT_CDRAWANDLER fact   WHERE header.DIM_JH_SK = fact.DIM_JH_SK   AND feature.DIM_CDFEATURE_SK = fact.DIM_CDFEATURE_SK   AND CD_FEATURENAME LIKE &#39;%_GCD_%&#39;   ORDER BY JH_JOBNUMBER, FEATURE_SIZE, ORIENTATION, FLOOR((fact.CD_MEASUREMENTPOINTCOORDINATEY+76200)/10500) DESC, (fact.CD_MEASUREMENTPOINTCOORDINATEX+76200)  ) ) -- WHERE DEVFROMMEAN &gt; -30 AND DEVFROMMEAN &lt; 30"/>
        <parameter key="username" value="***"/>
    </operator>
</operator>

output at the console:
Code:
P Jun 24, 2008 9:54:32 PM: Initialising process setup
P Jun 24, 2008 9:54:32 PM: [NOTE] No filename given for result file, using stdout for logging results!
P Jun 24, 2008 9:54:32 PM: Checking properties...
P Jun 24, 2008 9:54:32 PM: Properties are ok.
P Jun 24, 2008 9:54:32 PM: Checking process setup...
P Jun 24, 2008 9:54:32 PM: Inner operators are ok.
P Jun 24, 2008 9:54:32 PM: Checking i/o classes...
P Jun 24, 2008 9:54:32 PM: i/o classes are ok. Process output: ExampleSet.
P Jun 24, 2008 9:54:32 PM: Process ok.
P Jun 24, 2008 9:54:32 PM: Process initialised
P Jun 24, 2008 9:54:32 PM: [NOTE] Process starts
P Jun 24, 2008 9:54:32 PM: Process:
   Root[0] (Process)
   +- DatabaseExampleSource[0] (DatabaseExampleSource)
P Jun 24, 2008 9:54:50 PM: [NOTE] Process finished after 18 seconds
P Jun 24, 2008 9:54:50 PM: Process:
   Root[1] (Process)
   +- DatabaseExampleSource[1] (DatabaseExampleSource)
P Jun 24, 2008 9:54:50 PM: Produced output:
IOContainer (1 objects):
SimpleExampleSet:
134424 examples,
2 regular attributes,
no special attributes
(created by DatabaseExampleSource)
P Jun 24, 2008 9:54:50 PM: [NOTE] Process finished successfully
G Jun 24, 2008 9:54:50 PM: [Error] Plotter: the given data contains missing values. Probably most plotters will not be able to produce proper visualizations. Please replace missing values beforehand if possible.
G Jun 24, 2008 9:54:50 PM: [Warning] Cannot plot all data points, using only a sample of 1000 rows.
G Jun 24, 2008 9:54:51 PM: [NOTE] Cannot use plotter 'Scatter 3D': Data table is not allowed to contain missing values. Please replace them beforehand.
G Jun 24, 2008 9:54:52 PM: [NOTE] Cannot use plotter 'Scatter 3D Color': Data table is not allowed to contain missing values. Please replace them beforehand.
G Jun 24, 2008 9:54:52 PM: [NOTE] Cannot use plotter 'SOM': Data table is not allowed to contain missing values. Please replace them beforehand.
G Jun 24, 2008 9:54:52 PM: [NOTE] Cannot use plotter 'Density': Data table is not allowed to contain missing values. Please replace them beforehand.
G Jun 24, 2008 9:54:52 PM: [NOTE] Cannot use plotter 'Sticks 3D': Data table is not allowed to contain missing values. Please replace them beforehand.
G Jun 24, 2008 9:54:52 PM: [NOTE] Cannot use plotter 'Box 3D': Data table is not allowed to contain missing values. Please replace them beforehand.
G Jun 24, 2008 9:54:52 PM: [NOTE] Cannot use plotter 'Surface 3D': Data table must have between 0 and 50 rows, was 1000.

-----------------------------------

rapidminer 4.1

Process XML:
Code:
<operator name="Root" class="Process" expanded="yes">
    <operator name="DatabaseExampleSource" class="DatabaseExampleSource">
        <parameter key="database_system" value="Oracle"/>
        <parameter key="database_url" value="jdbc:oracle:thin:@*******:1521:***"/>
        <parameter key="password" value="***"/>
        <parameter key="query" value="SELECT JH_JOBNUMBER, DEVFROMMEAN FROM (  SELECT JH_JOBNUMBER, (CD_VALUE-AVG(CD_VALUE) OVER(PARTITION BY JH_JOBNUMBER, FEATURE_SIZE, ORIENTATION)) AS DEVFROMMEAN   FROM (   SELECT JH_JOBNUMBER, CD_FEATURENAME,    CASE     WHEN CD_FEATURENAME LIKE &#39;%280%&#39; THEN &#39;280&#39;    WHEN CD_FEATURENAME LIKE &#39;%400%&#39; THEN &#39;400&#39;   END AS FEATURE_SIZE,   CASE     WHEN CD_FEATURENAME LIKE &#39;%X%&#39; THEN &#39;X&#39;    WHEN CD_FEATURENAME LIKE &#39;%Y%&#39; THEN &#39;Y&#39;   END AS ORIENTATION,   CD_VALUE   FROM DIM_GENERAL_JOBHEADER header, DIM_CDFEATURE feature, FACT_CDRAWANDLER fact   WHERE header.DIM_JH_SK = fact.DIM_JH_SK   AND feature.DIM_CDFEATURE_SK = fact.DIM_CDFEATURE_SK   AND CD_FEATURENAME LIKE &#39;%_GCD_%&#39;   ORDER BY JH_JOBNUMBER, FEATURE_SIZE, ORIENTATION, FLOOR((fact.CD_MEASUREMENTPOINTCOORDINATEY+76200)/10500) DESC, (fact.CD_MEASUREMENTPOINTCOORDINATEX+76200)  ) ) -- WHERE DEVFROMMEAN &gt; -30 AND DEVFROMMEAN &lt; 30"/>
        <parameter key="username" value="***"/>
    </operator>
</operator>

output at the console:
Code:
P Jun 24, 2008 10:15:23 PM: Initialising process setup
P Jun 24, 2008 10:15:23 PM: [NOTE] No filename given for result file, using stdout for logging results!
P Jun 24, 2008 10:15:23 PM: Checking properties...
P Jun 24, 2008 10:15:23 PM: Properties are ok.
P Jun 24, 2008 10:15:23 PM: Checking process setup...
P Jun 24, 2008 10:15:23 PM: Inner operators are ok.
P Jun 24, 2008 10:15:23 PM: Checking i/o classes...
P Jun 24, 2008 10:15:23 PM: i/o classes are ok. Process output: ExampleSet.
P Jun 24, 2008 10:15:23 PM: Process ok.
P Jun 24, 2008 10:15:23 PM: Process initialised
P Jun 24, 2008 10:15:23 PM: [NOTE] Process starts
P Jun 24, 2008 10:15:23 PM: Process:
   Root[0] (Process)
   +- DatabaseExampleSource[0] (DatabaseExampleSource)
G Jun 24, 2008 10:15:28 PM: [Error] While reading examples from result set: operation not allowed: Unsupported syntax for refreshRow()
P Jun 24, 2008 10:15:28 PM: [NOTE] Process finished after 5 seconds
P Jun 24, 2008 10:15:28 PM: Process:
   Root[1] (Process)
   +- DatabaseExampleSource[1] (DatabaseExampleSource)
P Jun 24, 2008 10:15:28 PM: Produced output:
IOContainer (1 objects):
SimpleExampleSet:
10 examples,
2 regular attributes,
no special attributes
(created by DatabaseExampleSource)
P Jun 24, 2008 10:15:28 PM: [NOTE] Process finished successfully

when i execute the process for a second time, the error message disappear..
Code:
P Jun 24, 2008 10:15:59 PM: Initialising process setup
P Jun 24, 2008 10:15:59 PM: [NOTE] No filename given for result file, using stdout for logging results!
P Jun 24, 2008 10:15:59 PM: Checking properties...
P Jun 24, 2008 10:15:59 PM: Properties are ok.
P Jun 24, 2008 10:15:59 PM: Checking process setup...
P Jun 24, 2008 10:15:59 PM: Inner operators are ok.
P Jun 24, 2008 10:15:59 PM: Checking i/o classes...
P Jun 24, 2008 10:15:59 PM: i/o classes are ok. Process output: ExampleSet.
P Jun 24, 2008 10:15:59 PM: Process ok.
P Jun 24, 2008 10:15:59 PM: Process initialised
P Jun 24, 2008 10:15:59 PM: [NOTE] Process starts
P Jun 24, 2008 10:15:59 PM: Process:
   Root[1] (Process)
   +- DatabaseExampleSource[1] (DatabaseExampleSource)
P Jun 24, 2008 10:16:06 PM: [NOTE] Process finished after 6 seconds
P Jun 24, 2008 10:16:06 PM: Process:
   Root[1] (Process)
   +- DatabaseExampleSource[1] (DatabaseExampleSource)
P Jun 24, 2008 10:16:06 PM: Produced output:
IOContainer (1 objects):
SimpleExampleSet:
10 examples,
2 regular attributes,
no special attributes
(created by DatabaseExampleSource)
P Jun 24, 2008 10:16:06 PM: [NOTE] Process finished successfully

I use the latest java jdk 1.5 with the ojdbc driver for java 1.4/1.5 (ojdbc14.jar).
Hope this helps a little ?!

-ajay-
Logged
Ingo Mierswa
Administrator
Hero Member
*****
Posts: 1226



WWW
« Reply #3 on: June 27, 2008, 09:47:27 PM »

Hello,

hmm, it could be that there is a problem with the Oracle JDBC driver and order / grouping statements:

http://www.orafaq.com/forum/?t=msg&th=34929/0/

and

http://forum.java.sun.com/thread.jspa?threadID=272795&messageID=1282624

Maybe these links and the suggested workaround could help? Although I would not yet have an idea why it worked in RM 4.0 then. I try to check what differs between both version with respect to database access.

Cheers,
Ingo
Logged

Did you try our new Marketplace? Upload or download new Extensions, add comments, and organize your operators. Have a look at  http://marketplace.rapid-i.com
Pages: [1]
  Print  
 
Jump to: