Topic: AttributeConstruction + average(X1)
Shubha
« on: March 20, 2009, 02:02:20 PM »

Hi,

I have two attributes in my ExampleSet, 'X1' and 'average(X1)'. The attribute 'average(X1)' was created by RM. Now I want to use 'AttributeConstruction' to build a new attribute from 'average(X1)', for example (X1-average(X1))^2. But I get the error: 'Unrecognized Symbol "average" Syntax error (implicit multiplication not enabled)'.

How do I make this work?

Thanks, Shubha
haddock
« Reply #1 on: March 20, 2009, 03:30:53 PM »

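RM doesn't like 'average(X)' as an attribute name in an AttributeConstruction expression, so rename it to something without brackets. See below: the first branch of the OperatorSelector uses the bracketed name, and the second branch (the one selected) renames the attribute first and then constructs the new one.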


Code:
<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="number_examples" value="200"/>
        <parameter key="target_function" value="random"/>
    </operator>
    <operator name="ChangeAttributeName" class="ChangeAttributeName">
        <parameter key="new_name" value="X1"/>
        <parameter key="old_name" value="att1"/>
    </operator>
    <operator name="ChangeAttributeName (2)" class="ChangeAttributeName">
        <parameter key="new_name" value="average(X)"/>
        <parameter key="old_name" value="att2"/>
    </operator>
    <operator name="OperatorSelector" class="OperatorSelector" expanded="yes">
        <parameter key="select_which" value="2"/>
        <operator name="OperatorChain" class="OperatorChain" expanded="yes">
            <operator name="AttributeConstruction" class="AttributeConstruction">
                <list key="function_descriptions">
                  <parameter key="Mmm" value="X1+average(X)"/>
                </list>
            </operator>
        </operator>
        <operator name="OperatorChain (2)" class="OperatorChain" expanded="yes">
            <operator name="ChangeAttributeName (3)" class="ChangeAttributeName">
                <parameter key="new_name" value="average"/>
                <parameter key="old_name" value="average(X)"/>
            </operator>
            <operator name="AttributeConstruction (2)" class="AttributeConstruction">
                <list key="function_descriptions">
                  <parameter key="Mmm" value="X1+average"/>
                </list>
            </operator>
        </operator>
    </operator>
</operator>

Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?

T.S.Eliot ~ Choruses from the Rock 1934
Shubha
« Reply #2 on: March 20, 2009, 04:51:14 PM »

Oh no... I have too many attributes like this: average(X1), average(X2), and so on.
Shubha
« Reply #3 on: March 20, 2009, 04:55:43 PM »

I was wondering: if that is not a valid name that RM can operate on, why does the 'Aggregation' operator give its output attribute a name with brackets, such as average(X1)?

Now the renaming has to be done for all the attributes, and the attribute names can be anything.

Thanks, Shubha
haddock
« Reply #4 on: March 21, 2009, 03:27:02 PM »

Hi,

I agree: if RM creates an attribute, it should be usable. No doubt that will get changed, but in the meantime use a regex to remove the brackets from the attribute names, like this:

Code:
<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="number_examples" value="200"/>
        <parameter key="number_of_attributes" value="4"/>
        <parameter key="target_function" value="random"/>
    </operator>
    <operator name="ChangeAttributeName" class="ChangeAttributeName">
        <parameter key="new_name" value="X1"/>
        <parameter key="old_name" value="att1"/>
    </operator>
    <operator name="ChangeAttributeName (2)" class="ChangeAttributeName">
        <parameter key="new_name" value="X2"/>
        <parameter key="old_name" value="att2"/>
    </operator>
    <operator name="AttributeSubsetPreprocessing" class="AttributeSubsetPreprocessing" expanded="yes">
        <parameter key="attribute_name_regex" value="at.*"/>
        <parameter key="condition_class" value="attribute_name_filter"/>
        <parameter key="deliver_inner_results" value="true"/>
        <operator name="BinDiscretization" class="BinDiscretization">
            <parameter key="range_name_type" value="short"/>
        </operator>
    </operator>
    <operator name="Aggregation" class="Aggregation">
        <list key="aggregation_attributes">
          <parameter key="X1" value="average"/>
          <parameter key="X2" value="average"/>
        </list>
        <parameter key="group_by_attributes" value="at.*"/>
    </operator>
    <operator name="ChangeAttributeNamesReplace" class="ChangeAttributeNamesReplace">
        <parameter key="apply_on_special" value="false"/>
        <parameter key="attributes" value="av.*"/>
        <parameter key="replace_what" value="\(|\)"/>
    </operator>
    <operator name="AttributeConstruction" class="AttributeConstruction">
        <list key="function_descriptions">
          <parameter key="New_Att" value="averageX1+averageX2"/>
        </list>
    </operator>
</operator>
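
Applied to the original question, the same renaming step should make the squared deviation expressible. This is only a minimal sketch, assuming the ExampleSet already holds both 'X1' and 'average(X1)'. The operator names and the new attribute name 'sq_dev_X1' are made up for illustration, and the '^' operator is taken from the original question rather than verified here. It is a fragment, not a full process:

```xml
<!-- Fragment only: place it after the operator that delivers the ExampleSet. -->
<operator name="RenameAndConstruct" class="OperatorChain" expanded="yes">
    <!-- Strip "(" and ")" from every attribute whose name starts with "av",
         so that average(X1) becomes averageX1. -->
    <operator name="StripBrackets" class="ChangeAttributeNamesReplace">
        <parameter key="attributes" value="av.*"/>
        <parameter key="replace_what" value="\(|\)"/>
    </operator>
    <!-- The bracket-free name can now be referenced in the expression.
         "sq_dev_X1" is a hypothetical name for the new attribute. -->
    <operator name="SquaredDeviation" class="AttributeConstruction">
        <list key="function_descriptions">
          <parameter key="sq_dev_X1" value="(X1-averageX1)^2"/>
        </list>
    </operator>
</operator>
```

Because the 'attributes' filter is a regex, one ChangeAttributeNamesReplace step covers average(X1), average(X2) and any further aggregates, so nothing has to be renamed by hand.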
Ingo Mierswa
« Reply #5 on: March 23, 2009, 12:41:27 PM »

Hi,

Quote

I agree, if RM makes an attribute, it should be usable. No doubt that will get changed, but in the meantime use a regex to remove the brackets in the attribute name, like this..

That's true. Unfortunately, the parentheses go back to the first version of RapidMiner in 2001, and we cannot simply change the output names without breaking compatibility. So we would have to write a parser for the processes etc., and that is not easily done.

Until then, however, there are two new helper operators to overcome those naming issues:

- ChangeAttributeNamesReplace: replaces characters in matching attribute names, e.g. all non-word characters by an underscore
- ChangeAttributeNames2Generic: replaces the matching attribute names by generic names
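
As a rough sketch of the first operator, the fragment below replaces every non-word character in every attribute name with an underscore, so 'average(X1)' should come out as 'average_X1_'. The 'replace_by' parameter name is an assumption (haddock's process above sets only 'replace_what'), and the operator name is made up for illustration:

```xml
<!-- Fragment only: place it after the operator that delivers the ExampleSet. -->
<operator name="NonWordToUnderscore" class="ChangeAttributeNamesReplace">
    <!-- Apply to all regular attributes. -->
    <parameter key="attributes" value=".*"/>
    <!-- \W matches any character that is not a letter, digit or underscore. -->
    <parameter key="replace_what" value="\W"/>
    <!-- Assumed parameter name for the replacement string. -->
    <parameter key="replace_by" value="_"/>
</operator>
```

An expression such as X1+average_X1_ could then be used in AttributeConstruction.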

Cheers,
Ingo