Hi Steffen,

thank you for your answer.

Let me explain in more detail the variable importance measure proposed by Breiman for his

Random Forests.

First of all, he defines out-of-bag (OOB) data as the cases that are not part of a tree's

bootstrap sample. Each of the N decision trees is grown on its own bootstrap sample drawn from

the original data. At each node, a randomly chosen subset of attributes is considered to find the best split.

Now the variable importance measure comes into play. For each of the trees grown in the forest,

the OOB data is put down the tree and the number of votes cast for the correct class is counted. Next,

the values of variable m in the OOB cases are randomly permuted and again these cases are

put down the tree. Finally, the number of votes for the correct class in the variable-m-permuted

OOB data is subtracted from the number of votes for the correct class in the first untouched OOB

data. Thus, the larger the difference, the more important variable m is. The average of

the differences over all trees in the forest is defined as the importance of variable m.
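
To make the procedure concrete, here is a minimal Python sketch of how I understand it. This is not RapidMiner code and not Breiman's implementation: the function names (grow_stump, grow_forest, variable_importance) are hypothetical, and each "tree" is only a single-split stump so that the example stays short. The point is just the OOB bookkeeping and the permutation step.

```python
import random

def grow_stump(X, y, features):
    """Hypothetical stand-in for a real tree: the best single threshold
    split (by training accuracy) among the given candidate features."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            for left, right in ((0, 1), (1, 0)):
                correct = sum((left if row[f] <= t else right) == label
                              for row, label in zip(X, y))
                if best is None or correct > best[0]:
                    best = (correct, f, t, left, right)
    _, f, t, left, right = best
    return lambda row: left if row[f] <= t else right

def grow_forest(X, y, n_trees, n_try, rng):
    """Grow each tree on its own bootstrap sample and remember its OOB cases."""
    n, n_features = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]       # sample with replacement
        oob = sorted(set(range(n)) - set(boot))           # cases not in the sample
        features = rng.sample(range(n_features), n_try)   # random attribute subset
        tree = grow_stump([X[i] for i in boot], [y[i] for i in boot], features)
        forest.append((tree, oob))
    return forest

def variable_importance(forest, X, y, m, rng):
    """Average over trees of (correct OOB votes) - (correct votes after
    randomly permuting variable m within that tree's OOB cases)."""
    diffs = []
    for tree, oob in forest:
        if not oob:
            continue  # no OOB cases for this tree, nothing to measure
        correct = sum(int(tree(X[i]) == y[i]) for i in oob)
        permuted = [X[i][m] for i in oob]
        rng.shuffle(permuted)
        correct_permuted = 0
        for i, value in zip(oob, permuted):
            row = list(X[i])   # copy so the original data stays untouched
            row[m] = value
            correct_permuted += int(tree(row) == y[i])
        diffs.append(correct - correct_permuted)
    return sum(diffs) / len(diffs) if diffs else 0.0

# Toy data (made up for illustration): only attribute 0 determines the class.
rng = random.Random(0)
X = [[rng.random() for _ in range(3)] for _ in range(60)]
y = [1 if row[0] > 0.5 else 0 for row in X]

forest = grow_forest(X, y, n_trees=50, n_try=2, rng=rng)
importances = [variable_importance(forest, X, y, m, rng) for m in range(3)]
print(importances)
```

On this toy data attribute 0 should come out with a clearly larger importance than the two noise attributes. Note that the measure needs each tree together with its own OOB cases, which is why I think it has to live inside the forest learner.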

If you want to calculate how important a variable is outside these learning algorithms, I suggest the operators "GiniIndexWeighting", "InfoGainWeighting" / "InfoGainRatioWeighting". I personally prefer "InfoGainRatio".

I don't think that the approaches you've suggested can be applied

to realize the variable importance measure that Breiman suggests as

most appropriate. As far as I can see, the Weighting

operators are independent of the learner used. But achieving

what I described above requires integrating a weighting operator

into the RandomForest operator, i.e. while growing the forest, the

variable importance estimation must take place. Or am I wrong and

you see a way to get Breiman's approach working in RapidMiner?

Regards,

Paul