Pages: [1]
  Print  
Author Topic: Hanging on SQL Queries  (Read 214 times)
johnyma22
Newbie
*
Posts: 18


« on: February 04, 2012, 07:16:30 PM »

Every now and again (1 in 3 times) Rapidminer will hang on an SQL query.  Usually my process will take 5 minutes.

By hang I mean the timer just keeps going, for hours, it never resolves.

The data in the database is static, im not updating/writing new rows to it.

is this a common issue?  Should I think about doing a large extract from the database to my local machine and working with that?

I sometimes get this: Feb 4, 2012 6:12:29 PM WARNING: Caught exception in concurrent execution of Read Database (Read Database): com.rapidminer.operator.UserError: Database error occurred: Operation not allowed after ResultSet closed
Logged
haddock
Hero Member
*****
Posts: 759



WWW
« Reply #1 on: February 04, 2012, 07:25:36 PM »

Hi,
Please post the XML.
Logged

Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?

T.S.Eliot ~ Choruses from the Rock 1934
johnyma22
Newbie
*
Posts: 18


« Reply #2 on: February 04, 2012, 07:37:59 PM »

Code:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.2.000" expanded="true" name="Process">
    <parameter key="parallelize_main_process" value="true"/>
    <process expanded="true" height="386" width="835">
      <operator activated="true" class="read_database" compatibility="5.2.000" expanded="true" height="60" name="Read Database (2)" width="90" x="45" y="30">
        <parameter key="connection" value="slave2"/>
        <parameter key="query" value="SELECT label, data, school_list_holiday_sources.id FROM school_list_holiday_data INNER JOIN school_list_holiday_sources ON school_list_holiday_data.id=school_list_holiday_sources.id WHERE school_list_holiday_sources.label = &quot;true&quot; OR isGood = -2 AND school_list_holiday_sources.label = &quot;false&quot; LIMIT 0,200"/>
        <enumeration key="parameters"/>
      </operator>
      <operator activated="true" class="nominal_to_text" compatibility="5.2.000" expanded="true" height="76" name="Nominal to Text" width="90" x="179" y="30">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="data"/>
      </operator>
      <operator activated="true" class="text:process_document_from_data" compatibility="5.1.004" expanded="true" height="76" name="Process Documents from Data" width="90" x="313" y="30">
        <parameter key="keep_text" value="true"/>
        <parameter key="prune_method" value="absolute"/>
        <parameter key="prune_below_absolute" value="2"/>
        <parameter key="prune_above_absolute" value="999"/>
        <list key="specify_weights"/>
        <process expanded="true" height="480" width="815">
          <operator activated="true" class="web:extract_html_text_content" compatibility="5.1.004" expanded="true" height="60" name="Extract Content" width="90" x="45" y="120"/>
          <operator activated="true" class="text:transform_cases" compatibility="5.1.004" expanded="true" height="60" name="Transform Cases" width="90" x="179" y="120"/>
          <operator activated="true" class="text:tokenize" compatibility="5.1.004" expanded="true" height="60" name="Tokenize" width="90" x="313" y="120"/>
          <operator activated="true" class="text:filter_stopwords_english" compatibility="5.1.004" expanded="true" height="60" name="Filter Stopwords (English)" width="90" x="447" y="120"/>
          <operator activated="true" class="text:stem_snowball" compatibility="5.1.004" expanded="true" height="60" name="Stem (Snowball)" width="90" x="581" y="120"/>
          <operator activated="true" class="text:filter_by_length" compatibility="5.1.004" expanded="true" height="60" name="Filter Tokens (by Length)" width="90" x="715" y="120">
            <parameter key="min_chars" value="2"/>
          </operator>
          <connect from_port="document" to_op="Extract Content" to_port="document"/>
          <connect from_op="Extract Content" from_port="document" to_op="Transform Cases" to_port="document"/>
          <connect from_op="Transform Cases" from_port="document" to_op="Tokenize" to_port="document"/>
          <connect from_op="Tokenize" from_port="document" to_op="Filter Stopwords (English)" to_port="document"/>
          <connect from_op="Filter Stopwords (English)" from_port="document" to_op="Stem (Snowball)" to_port="document"/>
          <connect from_op="Stem (Snowball)" from_port="document" to_op="Filter Tokens (by Length)" to_port="document"/>
          <connect from_op="Filter Tokens (by Length)" from_port="document" to_port="document 1"/>
          <portSpacing port="source_document" spacing="90"/>
          <portSpacing port="sink_document 1" spacing="90"/>
          <portSpacing port="sink_document 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="read_database" compatibility="5.2.000" expanded="true" height="60" name="Read Database" width="90" x="45" y="210">
        <parameter key="connection" value="slave2"/>
        <parameter key="query" value="SELECT data, label, isGood, school_list_holiday_sources.id FROM school_list_holiday_data INNER JOIN school_list_holiday_sources ON school_list_holiday_data.id=school_list_holiday_sources.id WHERE label != &quot;true&quot; AND isGood = 0 LIMIT 0,2000"/>
        <enumeration key="parameters"/>
      </operator>
      <operator activated="true" class="nominal_to_text" compatibility="5.2.000" expanded="true" height="76" name="Nominal to Text (2)" width="90" x="179" y="210">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="data"/>
      </operator>
      <operator activated="true" class="text:process_document_from_data" compatibility="5.1.004" expanded="true" height="76" name="Process Documents from Data (2)" width="90" x="313" y="210">
        <parameter key="keep_text" value="true"/>
        <parameter key="prune_method" value="absolute"/>
        <parameter key="prune_below_absolute" value="2"/>
        <parameter key="prune_above_absolute" value="999"/>
        <list key="specify_weights"/>
        <process expanded="true" height="340" width="803">
          <operator activated="true" class="web:extract_html_text_content" compatibility="5.1.004" expanded="true" height="60" name="Extract Content (2)" width="90" x="45" y="30"/>
          <operator activated="true" class="text:transform_cases" compatibility="5.1.004" expanded="true" height="60" name="Transform Cases (2)" width="90" x="180" y="30"/>
          <operator activated="true" class="text:tokenize" compatibility="5.1.004" expanded="true" height="60" name="Tokenize (2)" width="90" x="315" y="30"/>
          <operator activated="true" class="text:filter_stopwords_english" compatibility="5.1.004" expanded="true" height="60" name="Filter Stopwords (2)" width="90" x="450" y="30"/>
          <operator activated="true" class="text:stem_snowball" compatibility="5.1.004" expanded="true" height="60" name="Stem (2)" width="90" x="585" y="30"/>
          <operator activated="true" class="text:filter_by_length" compatibility="5.1.004" expanded="true" height="60" name="Filter Tokens (2)" width="90" x="703" y="30">
            <parameter key="min_chars" value="2"/>
          </operator>
          <connect from_port="document" to_op="Extract Content (2)" to_port="document"/>
          <connect from_op="Extract Content (2)" from_port="document" to_op="Transform Cases (2)" to_port="document"/>
          <connect from_op="Transform Cases (2)" from_port="document" to_op="Tokenize (2)" to_port="document"/>
          <connect from_op="Tokenize (2)" from_port="document" to_op="Filter Stopwords (2)" to_port="document"/>
          <connect from_op="Filter Stopwords (2)" from_port="document" to_op="Stem (2)" to_port="document"/>
          <connect from_op="Stem (2)" from_port="document" to_op="Filter Tokens (2)" to_port="document"/>
          <connect from_op="Filter Tokens (2)" from_port="document" to_port="document 1"/>
          <portSpacing port="source_document" spacing="0"/>
          <portSpacing port="sink_document 1" spacing="0"/>
          <portSpacing port="sink_document 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="set_role" compatibility="5.2.000" expanded="true" height="76" name="Set Role (2)" width="90" x="447" y="210">
        <parameter key="name" value="label"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="5.2.000" expanded="true" height="76" name="Set Role" width="90" x="447" y="30">
        <parameter key="name" value="label"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="naive_bayes" compatibility="5.2.000" expanded="true" height="76" name="Naive Bayes" width="90" x="581" y="30">
        <parameter key="laplace_correction" value="false"/>
      </operator>
      <operator activated="true" class="apply_model" compatibility="5.2.000" expanded="true" height="76" name="Apply Model (2)" width="90" x="715" y="120">
        <list key="application_parameters"/>
      </operator>
      <connect from_op="Read Database (2)" from_port="output" to_op="Nominal to Text" to_port="example set input"/>
      <connect from_op="Nominal to Text" from_port="example set output" to_op="Process Documents from Data" to_port="example set"/>
      <connect from_op="Process Documents from Data" from_port="example set" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Process Documents from Data" from_port="word list" to_op="Process Documents from Data (2)" to_port="word list"/>
      <connect from_op="Read Database" from_port="output" to_op="Nominal to Text (2)" to_port="example set input"/>
      <connect from_op="Nominal to Text (2)" from_port="example set output" to_op="Process Documents from Data (2)" to_port="example set"/>
      <connect from_op="Process Documents from Data (2)" from_port="example set" to_op="Set Role (2)" to_port="example set input"/>
      <connect from_op="Set Role (2)" from_port="example set output" to_op="Apply Model (2)" to_port="unlabelled data"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Naive Bayes" to_port="training set"/>
      <connect from_op="Naive Bayes" from_port="model" to_op="Apply Model (2)" to_port="model"/>
      <connect from_op="Apply Model (2)" from_port="labelled data" to_port="result 1"/>
      <connect from_op="Apply Model (2)" from_port="model" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="90"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
Logged
haddock
Hero Member
*****
Posts: 759



WWW
« Reply #3 on: February 04, 2012, 10:33:14 PM »

Hi there,

You're getting an SQL error - "Operation not allowed after ResultSet closed" - the question is why? I Googled that error and followed this link 

http://stackoverflow.com/questions/5840866/getting-java-sql-sqlexception-operation-not-allowed-after-resultset-closed

The bit that caught my eye was..

Quote
A ResultSet object is automatically closed when the Statement object that generated it is closed, re-executed, or used to retrieve the next result from a sequence of multiple results.

because there is a fair amount of common ground between the two SQL statements. I could imagine scenarios where the second request kicks off at the wrong time, especially if the main process is parallelised http://rapid-i.com/rapidforum/index.php/topic,2162.msg8510.html#msg8510.

As an aside, if we look at the SQL..

Code:
SELECT label, data, school_list_holiday_sources.id FROM school_list_holiday_data INNER JOIN school_list_holiday_sources ON school_list_holiday_data.id=school_list_holiday_sources.id WHERE school_list_holiday_sources.label = &quot;true&quot; OR isGood = -2 AND school_list_holiday_sources.label = &quot;false&quot; LIMIT 0,200"/>

SELECT data, label, isGood, school_list_holiday_sources.id FROM school_list_holiday_data INNER JOIN school_list_holiday_sources ON school_list_holiday_data.id=school_list_holiday_sources.id WHERE label != &quot;true&quot; AND isGood = 0 LIMIT 0,2000"/>

In each you are joining the same two tables, in the same way , on school_list_holiday_data.id=school_list_holiday_sources.id ; that could therefore be done in one statement, because you can filter out your training and test sets explicitly in RM. In fact this is a pretty good example of how it is easier to debug SQL using el rapido. Actually I'd do the join in RM as well  Cool

Logged

Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?

T.S.Eliot ~ Choruses from the Rock 1934
johnyma22
Newbie
*
Posts: 18


« Reply #4 on: February 04, 2012, 10:35:25 PM »

Ahhh, okay..  Well I will try to figure out how to filter out my training data.  Any pointers?

Logged
haddock
Hero Member
*****
Posts: 759



WWW
« Reply #5 on: February 04, 2012, 10:40:38 PM »

Don't want to sound grumpy, but you really need to work through help->tutorial.
Logged

Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?

T.S.Eliot ~ Choruses from the Rock 1934
johnyma22
Newbie
*
Posts: 18


« Reply #6 on: February 04, 2012, 10:42:14 PM »

Agreed.  I hadn't spotted this before, just the Video tutorials.  I have the book to read next week when I'm on holiday.  Will go through these tomorrow.  Good shout!
Logged
haddock
Hero Member
*****
Posts: 759



WWW
« Reply #7 on: February 04, 2012, 10:46:15 PM »

Cool.
Logged

Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?

T.S.Eliot ~ Choruses from the Rock 1934
Pages: [1]
  Print  
 
Jump to: