Topic: MultipleLabelIterator not allowing new iterations to overwrite values
keith
« on: August 30, 2008, 08:45:36 PM »

I think I've developed a simple example which shows that MultipleLabelIterator is somehow not overwriting values defined in previous iterations with new data.  Please correct me if I've got this wrong:

What this example is trying to do is:

1) Create a simple example set with two labels (label1, label2)
2) Use MultipleLabelIterator to do the following on each label

3) Run Linear Regression
4) Apply model
5) Compute a new attribute based on value of the predictions of the model.

The problem is that, in step 5, the name of the prediction attribute changes with each iteration ("prediction(label_1)", then "prediction(label_2)").  So what we ideally want to do is specify "prediction(label_%{a})".  However, that doesn't work inside a FeatureGeneration function, so the workaround suggested in another thread was to rename the attribute to a static name.  So the example renames it to "pred_val" and then uses a two-step FeatureGeneration to generate a final value called "pred_val_sq".  Since I want to retain this for each run, I then rename it to "pred_val_sq_%{a}".

This works on the first iteration.  However, on the 2nd iteration, the values of "pred_val_sq" still retain the values from the previous iteration, even though that attribute was renamed.

I realize this explanation is somewhat convoluted.  Hopefully it will be clearer once the operator chain is run, and you see that "pred_val_sq_1" and "pred_val_sq_2" have identical values.

Code:
<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="target_function" value="polynomial"/>
    </operator>
    <operator name="Change role of label to label1" class="ChangeAttributeRole">
        <parameter key="name" value="label"/>
        <parameter key="target_role" value="label1"/>
    </operator>
    <operator name="Rename label to label_1" class="ChangeAttributeName">
        <parameter key="new_name" value="label_1"/>
        <parameter key="old_name" value="label"/>
    </operator>
    <operator name="Create attrib: label_2" class="FeatureGeneration">
        <list key="functions">
            <parameter key="label_2" value="+(att1,+(att2,+(att3,+(att4,att5))))"/>
        </list>
        <parameter key="keep_all" value="true"/>
    </operator>
    <operator name="Change role of label_2 to label2" class="ChangeAttributeRole">
        <parameter key="name" value="label_2"/>
        <parameter key="target_role" value="label2"/>
    </operator>
    <operator name="MultipleLabelIterator" class="MultipleLabelIterator" expanded="yes">
        <operator name="LinearRegression" class="LinearRegression">
            <parameter key="keep_example_set" value="true"/>
        </operator>
        <operator name="ModelApplier" class="ModelApplier">
            <list key="application_parameters">
            </list>
        </operator>
        <operator name="Remove iteration# from prediction name" class="ChangeAttributeName">
            <parameter key="new_name" value="pred_val"/>
            <parameter key="old_name" value="prediction(label_%{a})"/>
        </operator>
        <operator name="Create value derived from pred_val" class="FeatureGeneration" breakpoints="before,after">
            <list key="functions">
              <parameter key="pred_val_step1" value="+(pred_val,const[1]())"/>
              <parameter key="pred_val_sq" value="*(pred_val_step1,pred_val_step1)"/>
            </list>
            <parameter key="keep_all" value="true"/>
        </operator>
        <operator name="Change name to add iteration# back to pred_val" class="ChangeAttributeName">
            <parameter key="new_name" value="pred_val_%{a}"/>
            <parameter key="old_name" value="pred_val"/>
        </operator>
        <operator name="Rename derived value to add iteration#" class="ChangeAttributeName">
            <parameter key="new_name" value="pred_val_sq_%{a}"/>
            <parameter key="old_name" value="pred_val_sq"/>
        </operator>
    </operator>
</operator>
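
For comparison, the outcome the process above is meant to produce can be sketched in plain Python. This is only an illustration, not RapidMiner code: the dictionary of columns stands in for the example set, the helper run_iteration is hypothetical, and the prediction values are invented. Both prediction columns are also present up front here, whereas in the process each one is created by ModelApplier inside the loop.

```python
# Illustrative only: plain Python standing in for the RapidMiner operators.
# The prediction values below are invented for the example.

def run_iteration(columns, a):
    """Mimic one pass of the loop body for label index `a`."""
    # Rename prediction(label_a) to the static name pred_val.
    columns["pred_val"] = columns.pop(f"prediction(label_{a})")
    # Two-step feature generation: (pred_val + 1), then its square.
    columns["pred_val_step1"] = [p + 1 for p in columns["pred_val"]]
    columns["pred_val_sq"] = [s * s for s in columns["pred_val_step1"]]
    # Rename the results so they survive the next iteration.
    columns[f"pred_val_{a}"] = columns.pop("pred_val")
    columns[f"pred_val_sq_{a}"] = columns.pop("pred_val_sq")
    return columns

columns = {
    "prediction(label_1)": [1.0, 2.0, 3.0],
    "prediction(label_2)": [10.0, 20.0, 30.0],
}
for a in (1, 2):
    run_iteration(columns, a)

print(columns["pred_val_sq_1"])  # [4.0, 9.0, 16.0]
print(columns["pred_val_sq_2"])  # [121.0, 441.0, 961.0]
```

The two derived columns differ because each is recomputed from that iteration's predictions, which is the behaviour the process is expected to show.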

Thanks for any assistance.

Keith


Tobias Malbrecht
« Reply #1 on: August 31, 2008, 09:18:48 PM »

Hi,

Although I must admit I have not yet fully checked your problem, just a remark: we are currently working on the feature generation functionality. One reason is that it simply does not work properly in all cases, especially when it is combined with transforming example sets by adding or removing attributes. It may be that this is the problem here as well.

Cheers,
Tobias

Tobias Malbrecht
Director of Product Marketing
RapidMiner
Ingo Mierswa
« Reply #2 on: September 23, 2008, 02:54:26 PM »

Hi Keith,

I just wanted to let you know that this behaviour is indeed a result of the combination of the loop and the feature generation: the values are not re-created in the second and subsequent iterations due to an optimization in the feature generation operator. As Tobias said, we are currently re-implementing the whole feature generation approach, and this behaviour will change in the next major upgrade.
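
The thread does not say how that optimization works internally, so the following is only a toy model under a stated assumption: that results are cached per function expression and reused whenever the same expression text appears again. The class CachingGenerator and all values are hypothetical and are not part of RapidMiner. The sketch merely shows how such a shortcut would reproduce the symptom of identical "pred_val_sq_1" and "pred_val_sq_2" columns.

```python
# Toy model only: NOT RapidMiner's actual implementation.
# Assumption: generated values are cached per expression string.

class CachingGenerator:
    def __init__(self):
        self._cache = {}  # expression text -> previously computed values

    def generate(self, expression, compute):
        # The assumed "optimization": skip the work if the expression was seen before.
        if expression not in self._cache:
            self._cache[expression] = compute()
        return self._cache[expression]

gen = CachingGenerator()
predictions = {1: [1.0, 2.0, 3.0], 2: [10.0, 20.0, 30.0]}  # invented values
results = {}
for a in (1, 2):
    pred_val = predictions[a]
    expr = "*(pred_val_step1,pred_val_step1)"  # same text in every iteration
    results[f"pred_val_sq_{a}"] = gen.generate(
        expr, lambda: [(p + 1) ** 2 for p in pred_val]
    )

# True: the second iteration got the first iteration's values back.
print(results["pred_val_sq_1"] == results["pred_val_sq_2"])
```

Under this assumption, renaming the output attribute between iterations would not help, because the stale values are tied to the expression rather than to the attribute name.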

Cheers,
Ingo