Author Topic: Cannot map index of nominal attribute to nominal value  (Read 2026 times)
Paul
Guest
« on: December 16, 2008, 10:40:42 AM »

Hi,

after recently updating my RapidMiner (branch Zaniah), the AttributeSubsetPreprocessing
operator somehow fails.

This is my model:
Code:
<operator name="Root" class="Process" expanded="yes">
    <operator name="CSVExampleSource" class="CSVExampleSource">
        <parameter key="filename" value="examples.csv"/>
        <parameter key="label_name" value="result"/>
    </operator>
    <operator name="AttributeSubsetPreprocessing" class="AttributeSubsetPreprocessing" expanded="yes">
        <parameter key="attribute_name_regex" value="result"/>
        <parameter key="condition_class" value="attribute_name_filter"/>
        <parameter key="process_special_attributes" value="true"/>
        <operator name="UserBasedDiscretization" class="UserBasedDiscretization">
            <list key="classes">
              <parameter key="no" value="1000000.0"/>
              <parameter key="yes" value="99.0"/>
            </list>
        </operator>
    </operator>
    <operator name="RandomForest" class="RandomForest">
    </operator>
</operator>

The dataset consists of numerical and nominal features, while the labels
are percentage values. In order to use them for classification, I preprocess
my data by replacing all labels (named result) >= 99.0 with "yes" and setting
all other labels to "no". This worked fine with RapidMiner 4.2. With the
recent version I get the following error message:

Quote
AttributeTypeException
Process failed Message:
Cannot map index of nominal attribute to nominal value: index -1 is out of bounds!

Any ideas what is wrong?

Regards,
Paul
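
For readers following along outside RapidMiner, the labelling rule Paul describes in prose (result >= 99.0 becomes "yes", everything else "no") can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the post, not of how UserBasedDiscretization is implemented; the function name `label_from_percent` and the sample values are made up for the example.

```python
def label_from_percent(value, threshold=99.0):
    """Map a numeric percentage label to a nominal class.

    Hypothetical helper: follows the rule stated in the post,
    not RapidMiner's own implementation.
    """
    return "yes" if value >= threshold else "no"


# Made-up sample labels, within the 0..160 range mentioned in the thread
labels = [label_from_percent(v) for v in [12.5, 99.0, 160.0]]
print(labels)  # ['no', 'yes', 'yes']
```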
Sebastian Land
Administrator
« Reply #1 on: December 16, 2008, 10:54:31 AM »

Hi Paul,
did you try setting the upper bound of the label "no" to "Infinity"? This might help if values above 1000000 occur.
But I must admit that the operator info states that an additional class will be introduced in that case, yet this doesn't happen for some reason. I will check that, but I probably won't get it done before next year.

Greetings,
  Sebastian
Paul
Guest
« Reply #2 on: December 16, 2008, 11:32:52 AM »

Hi Sebastian,

yes, I've just tried it, but it didn't help. Also, in my case the
values are never larger than 160.0, so the value range should
not be exceeded.

It would be nice if you could check it. Until then, I will switch back
to RM 4.2.

Regards,
Paul
Paul
Guest
« Reply #3 on: December 16, 2008, 02:43:09 PM »

Hi Sebastian,

I've found the problem: in one example, the label was missing. So there
is no problem with RapidMiner. Sorry.

Can such problems be avoided in the future, i.e. is there a way
to check the dataset for invalid examples with missing labels?
Or must this be done in advance by the user?

Regards,
Paul
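
One way to do such a check in advance, outside RapidMiner, is a small script that scans the CSV file for rows whose label is empty or not a number. The sketch below is only a suggestion: it assumes a comma-separated file with a header row and a label column called result, as in the process above, and the function name `rows_with_missing_label` and the sample data are invented for the example.

```python
import csv
import io


def rows_with_missing_label(csv_file, label_column="result"):
    """Return 1-based data row numbers whose label is empty or not numeric.

    Assumes a comma-separated file with a header row (an assumption,
    not something the thread states).
    """
    bad_rows = []
    reader = csv.DictReader(csv_file)
    for row_number, row in enumerate(reader, start=1):
        raw = (row.get(label_column) or "").strip()
        try:
            value = float(raw)
        except ValueError:
            # Empty cell or a placeholder such as "?"
            bad_rows.append(row_number)
            continue
        if value != value:  # NaN compares unequal to itself
            bad_rows.append(row_number)
    return bad_rows


# Made-up sample standing in for examples.csv
sample = io.StringIO("feature,result\n1.0,99.5\n2.0,\n3.0,42.0\n4.0,?\n")
print(rows_with_missing_label(sample))  # [2, 4]
```

For a real file, the same function can be called with `open("examples.csv", newline="")` in place of the in-memory sample.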