Topic: Unsure-aware Decision Tree
alfileres

« on: August 08, 2008, 03:35:17 PM »

Hi All,

I'm trying to develop a stock recommendation system using RapidMiner. I have labeled the output with two possible classes: buy / sell, and I'm using a decision tree as the learner. I have observed that some of the leaves contain both buy and sell examples. I would be happier if the decision tree said 'I do not know the output for this data' instead of returning the majority class of that leaf.

Proposed feature:
- A parameter that specifies the required purity of a leaf node (e.g. purity=0.9 could mean that a leaf only says 'sell' if at least 90% of the instances that end up in this node are 'sell').
- This definition of purity does not depend on how many classes the problem has, so in principle it is quite generic.
- It would also be interesting to extend this approach to other operators, if possible.
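
To make the idea more concrete, here is a rough Python sketch of the leaf behaviour I have in mind. It is not RapidMiner code: the function name `label_leaf` and the `purity` argument are made up for illustration, and a return value of `None` stands for 'I do not know'.

```python
from typing import Mapping, Optional


def label_leaf(class_counts: Mapping[str, int], purity: float) -> Optional[str]:
    """Return the majority class of a leaf, or None ('I do not know')
    when the majority fraction is below the required purity."""
    total = sum(class_counts.values())
    if total == 0:
        return None  # empty leaf: nothing to base a recommendation on
    majority_class = max(class_counts, key=class_counts.get)
    majority_fraction = class_counts[majority_class] / total
    return majority_class if majority_fraction >= purity else None


# A leaf with 9 'sell' and 1 'buy' meets purity=0.9; a 6/4 split does not.
pure_leaf = label_leaf({"buy": 1, "sell": 9}, purity=0.9)
mixed_leaf = label_leaf({"buy": 4, "sell": 6}, purity=0.9)
```

Since only the fraction of the majority class is checked, the same rule would work unchanged for problems with more than two classes.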

Let me know what you think about this feature request and possible work-arounds.

Thank you,

alfileres
Tobias Malbrecht
« Reply #1 on: August 08, 2008, 10:29:44 PM »

Hi,

Quote:
Let me know what you think about this feature request and possible work-arounds.

Building this kind of uncertainty into the model itself is not really possible. At prediction time, however, such uncertainty shows up as a low confidence. The operator UncertainPredictionsTransformation marks predictions whose confidence is below a specified threshold as uncertain by setting the corresponding prediction to missing.

The following process should explain this:

Code:
<operator name="Root" class="Process" expanded="yes">
    <!-- Generate synthetic example data for a classification task -->
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="target_function" value="polynomial classification"/>
    </operator>
    <!-- Learn the tree and keep the example set so the model can be applied to it -->
    <operator name="DecisionTree" class="DecisionTree">
        <parameter key="keep_example_set" value="true"/>
    </operator>
    <!-- Apply the model: adds predictions and confidences to the examples -->
    <operator name="ModelApplier" class="ModelApplier">
        <list key="application_parameters">
        </list>
    </operator>
    <!-- Set predictions with a confidence below 0.9 to missing -->
    <operator name="UncertainPredictionsTransformation" class="UncertainPredictionsTransformation">
        <parameter key="min_confidence" value="0.9"/>
    </operator>
</operator>

Maybe this helps for now. As for uncertainty in the decision tree itself, I am not sure that an uncertainty indicator in the model would be of great help, since in a descriptive setting the user can easily interpret the model. A purity parameter might help to control how closely the tree fits the data, but at the moment we are really busy and have very little time. Maybe we will discuss the issue in the long term. For now you will have to use the existing parameters to control whether the tree generalizes or fits the data exactly.

Regards,
Tobias

Tobias Malbrecht
Director of Product Marketing
RapidMiner
alfileres

« Reply #2 on: August 09, 2008, 11:48:11 AM »

Hi Tobias,

Thank you for your help. The UncertainPredictionsTransformation operator seems to be a good work-around. However, I'm not sure of its usefulness, as I don't know how prediction confidences are computed. Do you know where I can find this information?

Thank you!
Tobias Malbrecht
« Reply #3 on: August 09, 2008, 12:36:30 PM »

Hi,

For a decision tree, the confidences are equal to the fractions of the classes in the leaves. For example, if a leaf consists of 5 examples, one labeled buy and 4 labeled sell, then for an unseen test example whose attribute values lead to that leaf the confidences are 0.2 (for class buy) and 0.8 (for class sell). In other words, the example is predicted to be of class sell with a confidence of 0.8.
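
As a minimal sketch of that calculation in Python (this is not RapidMiner's actual implementation, and the function name `leaf_confidences` is made up for illustration):

```python
from typing import Dict, Mapping


def leaf_confidences(class_counts: Mapping[str, int]) -> Dict[str, float]:
    """Confidence per class = fraction of the leaf's training examples
    that carry this class label."""
    total = sum(class_counts.values())
    if total == 0:
        raise ValueError("leaf contains no examples")
    return {label: count / total for label, count in class_counts.items()}


# The leaf from the example: 1 example labeled buy, 4 labeled sell.
confidences = leaf_confidences({"buy": 1, "sell": 4})
predicted = max(confidences, key=confidences.get)
print(confidences)  # {'buy': 0.2, 'sell': 0.8}
print(predicted)    # sell
```

With min_confidence set to 0.9, a prediction from this leaf would be set to missing, because its highest confidence is only 0.8.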

Hope that helps,
Tobias
« Last Edit: August 09, 2008, 12:38:05 PM by Tobias Malbrecht »

alfileres

« Reply #4 on: August 09, 2008, 12:51:24 PM »

Certainly it does.

Thank you very much for your quick & informative answers.