Author Topic: W-Apriori doesn't work. need help  (Read 3255 times)
edfred
Newbie
*
Posts: 8


« on: January 05, 2009, 12:55:49 PM »

Hi all,

I want to use the W-Apriori operator to generate some association rules, but it's not working.
I am using RapidMiner version 4.3.
This is my operator chain:

root
|  |
|  |-Textinput
|     |
|     |-StringTokenizer
|     |-GermanStopwordFilter
|     |-ToLowerCaseConverter
|     |-TokenLengthFilter
|
|-ExampleSetWriter
|
|-W-Apriori

If I press the start button, I get an exception like this:

Error: 905 External Error
Error in: W-Apriori (W-Apriori) W-Apriori caused an error: weka.core.UnsupportedAttributeTypeException: weka.associations.Apriori: Cannot handle numeric attributes! An external program or library has reported an error. Please see the documentation of this program or library for further information.

How can I get binary attributes? I think I have to convert them somehow.

Can you give me an example operator chain where it works?

Best regards
edfred
earmijo
Full Member
***
Posts: 143


« Reply #1 on: January 05, 2009, 06:42:49 PM »

If you are using "Binary Occurrences" as your vector creation choice, you will have a matrix of 0/1 values. You still have to transform it into a matrix of true/false values, which is the input form accepted by associators like W-Apriori. You can do this with the Numerical2Binominal converter (Preprocessing/Attributes/Filter/Converter/...).

Code:
<operator name="Root" class="Process" expanded="yes">
    <operator name="TextInput" class="TextInput" expanded="yes">
        <parameter key="attributes" value=""/>
        <parameter key="create_text_visualizer" value="true"/>
        <parameter key="default_content_encoding" value="ISO-8859-1"/>
        <list key="namespaces">
        </list>
        <parameter key="on_the_fly_pruning" value="3"/>
        <parameter key="prune_below" value="2"/>
        <list key="texts">
          <parameter key="graphics" value="../data/newsgroup/graphics"/>
          <parameter key="hardware" value="../data/newsgroup/hardware"/>
        </list>
        <parameter key="vector_creation" value="BinaryOccurrences"/>
        <operator name="StringTokenizer" class="StringTokenizer">
        </operator>
        <operator name="EnglishStopwordFilter" class="EnglishStopwordFilter">
        </operator>
        <operator name="TokenLengthFilter" class="TokenLengthFilter">
            <parameter key="min_chars" value="3"/>
        </operator>
    </operator>
    <operator name="Numerical2Binominal" class="Numerical2Binominal">
    </operator>
    <operator name="W-Apriori" class="W-Apriori">
    </operator>
</operator>
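
As a rough illustration of what this conversion step does, here is a conceptual sketch in Python. It is not RapidMiner or Weka code: the function name `to_binominal` and the sample matrix are made up for this example, and the rule that zero becomes false and any other value becomes true is an assumption about the converter's default behaviour.

```python
# Conceptual sketch only: mimics turning 0/1 occurrence values into
# true/false nominal values. This is NOT RapidMiner or Weka code.

def to_binominal(rows):
    """Map each numeric cell to "false" if it is zero, otherwise to "true".

    Assumption: zero means "term absent", any other value means "term present".
    """
    return [["true" if value != 0 else "false" for value in row] for row in rows]


# Hypothetical binary-occurrence vectors: three documents, three terms.
occurrences = [
    [1, 0, 1],
    [0, 0, 1],
    [1, 1, 0],
]

binominal = to_binominal(occurrences)
for row in binominal:
    print(row)
```

After this step every attribute has only the two nominal values true and false, rather than the numeric values that the error message complains about.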
edfred
Newbie
*
Posts: 8


« Reply #2 on: January 06, 2009, 08:26:00 PM »

Thank you, that was very helpful. It works now, but the German words aren't displayed correctly, for example the letters "ä", "ö", "ü" and "ß". Where can I set the encoding to UTF-8?
Sebastian Land
Administrator
Hero Member
*****
Posts: 2426


« Reply #3 on: January 07, 2009, 01:38:58 PM »

Hi,
This can be set in the TextInput operator. The parameter is called "default_encoding" or something like that.

Greetings,
  Sebastian
edfred
Newbie
*
Posts: 8


« Reply #4 on: January 08, 2009, 11:59:41 AM »

Hi,

I tried this:
Code:
<operator name="Root" class="Process" expanded="yes">
    <operator name="TextInput" class="TextInput" expanded="yes">
        <parameter key="attributes" value=""/>
        <parameter key="create_text_visualizer" value="true"/>
        <parameter key="default_content_encoding" value="UTF-8"/>
        <list key="namespaces">
        </list>
        <parameter key="on_the_fly_pruning" value="3"/>
        <parameter key="prune_below" value="2"/>
        <list key="texts">
          <parameter key="test" value="../rm_workspace/apriori/test"/>
        </list>
        <parameter key="vector_creation" value="BinaryOccurrences"/>
        <operator name="ToLowerCaseConverter" class="ToLowerCaseConverter">
        </operator>
        <operator name="StringTokenizer" class="StringTokenizer">
        </operator>
        <operator name="GermanStopwordFilter" class="GermanStopwordFilter">
        </operator>
        <operator name="TokenLengthFilter" class="TokenLengthFilter">
            <parameter key="min_chars" value="3"/>
        </operator>
    </operator>
    <operator name="Numerical2Binominal" class="Numerical2Binominal">
    </operator>
    <operator name="W-Apriori" class="W-Apriori">
    </operator>
</operator>

But this does not work: RapidMiner freezes after 5 minutes. I also tried starting it with more memory:
java -Xms128M -Xmx1024M -jar rapidminer.jar
But RapidMiner still freezes, and I have to close the whole program.
If I use the default encoding (leaving the field empty), it works, but the German letters are not displayed correctly.
Do you know why?
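
One thing that may help narrow this down is checking whether the text files are really stored as UTF-8. The Python sketch below is only a diagnostic idea, not part of RapidMiner. The helper `decodes_as` and the sample file are hypothetical, and a file that fails the check is not proven to be the cause of the freeze.

```python
import os
import tempfile


def decodes_as(path, encoding):
    """Return True if the whole file decodes without error in the given encoding."""
    with open(path, "rb") as handle:
        raw = handle.read()
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample file: German text ("Groesse und Laenge" with umlauts and
# sharp s) written as ISO-8859-1, standing in for one of the real documents.
sample_dir = tempfile.mkdtemp()
sample_path = os.path.join(sample_dir, "sample.txt")
with open(sample_path, "wb") as handle:
    handle.write("Gr\u00f6\u00dfe und L\u00e4nge".encode("iso-8859-1"))

# The ISO-8859-1 bytes for these letters are not valid UTF-8 sequences.
is_utf8 = decodes_as(sample_path, "utf-8")
print(is_utf8)
```

For the sample file this prints False, which means the file is not valid UTF-8. The reverse check is not informative, because ISO-8859-1 accepts any byte sequence.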

« Last Edit: January 08, 2009, 12:29:21 PM by edfred »
Sebastian Land
Administrator
Hero Member
*****
Posts: 2426


« Reply #5 on: January 08, 2009, 03:47:56 PM »

Hi,
unfortunately I don't have any clue why this should happen, and I can't test it without the data.
Did you wait a few minutes before closing RapidMiner? Some parts of the TextMiningPlugin somehow manage to block the GUI thread, but the GUI thread recovers once the calculation has finished.

Greetings,
  Sebastian
edfred
Newbie
*
Posts: 8


« Reply #6 on: January 08, 2009, 09:55:18 PM »

I waited a long time, but the program stayed blocked and I had to abort it.