Author Topic: What happens when X-validation creates partition with no positive examples?  (Read 853 times)
DrGary
Newbie
*
Posts: 8


« on: June 05, 2009, 04:49:53 AM »


If a dataset is skewed, the positive and negative example sets are not balanced in size. Skew is common when trying to learn a detector of rare events, for example.

Suppose that the data set has only 1 positive example. Then cross validation can produce only one training subset that has a positive example; the others will have no positive examples. What will happen? What do RapidMiner Models do when trained without a positive example?

I'm asking because I'm seeing a Java Exception in model training that I've traced back to an XVal partition with no positive examples.

Is there a way to detect the situation and skip training in this case?

Thanks,
Gary
Sebastian Land
Administrator
Hero Member
*****
Posts: 2426


« Reply #1 on: June 05, 2009, 07:46:13 AM »

Hi Gary,
the models probably can't do anything about it: without examples of both classes you can't learn to separate them. Only a few algorithms exist for the "one-class" case; the one-class SVM is one of them. But I don't know what the LibSVM implementation will do if there is really only one class.
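To make the failure condition concrete, here is a minimal sketch in plain Python (not RapidMiner) of a guard that checks whether a training partition contains both classes before a learner is called. The names `has_both_classes`, `train_or_skip` and `majority_fit` are hypothetical and only serve the illustration; the "learner" is a stand-in that returns the majority label.

```python
def has_both_classes(labels):
    """True if at least two distinct class labels are present."""
    return len(set(labels)) >= 2

def train_or_skip(train_labels, fit):
    """Call fit only when both classes are present; otherwise return None."""
    if not has_both_classes(train_labels):
        return None  # nothing to separate, so skip this partition
    return fit(train_labels)

def majority_fit(labels):
    """Stand-in learner (assumption): predicts the most frequent label."""
    return max(set(labels), key=labels.count)

only_negatives = ["negative", "negative", "negative"]
mixed = ["negative", "negative", "positive"]

skipped = train_or_skip(only_negatives, majority_fit)
trained = train_or_skip(mixed, majority_fit)
print(skipped, trained)
```

Skipped partitions would have to be left out of the averaged performance, which makes the estimate rest on fewer folds.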
As far as I see, you have only two options:
- Try the bootstrapping operator to multiply your positive examples, so that the learner has positive examples in each XValidation fold.
- Alternatively, you could extract the positive examples and add them to each training set. The following process does that, but keep in mind that you undermine the goal of performance estimation, since part of your training data will then be in the test set, too...

Code:
<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="target_function" value="sum classification"/>
    </operator>
    <operator name="IOMultiplier" class="IOMultiplier">
        <parameter key="io_object" value="ExampleSet"/>
    </operator>
    <operator name="ExampleFilter" class="ExampleFilter">
        <parameter key="condition_class" value="attribute_value_filter"/>
        <parameter key="parameter_string" value="label = positive"/>
    </operator>
    <operator name="IOStorer" class="IOStorer">
        <parameter key="name" value="positiveSet"/>
        <parameter key="io_object" value="ExampleSet"/>
    </operator>
    <operator name="XValidation" class="XValidation" expanded="yes">
        <operator name="OperatorChain (2)" class="OperatorChain" expanded="yes">
            <operator name="IORetriever" class="IORetriever">
                <parameter key="name" value="positiveSet"/>
                <parameter key="io_object" value="ExampleSet"/>
                <parameter key="remove_from_store" value="false"/>
            </operator>
            <operator name="ExampleSetMerge" class="ExampleSetMerge">
            </operator>
            <operator name="DecisionTree" class="DecisionTree">
            </operator>
        </operator>
        <operator name="OperatorChain" class="OperatorChain" expanded="yes">
            <operator name="ModelApplier" class="ModelApplier">
                <list key="application_parameters">
                </list>
            </operator>
            <operator name="Performance" class="Performance">
            </operator>
        </operator>
    </operator>
</operator>
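
The leak mentioned above can be seen in a small sketch in plain Python. It is not RapidMiner code and only mimics the idea of the process: split the data into k folds, add all positive examples to every training set, and record which of them also sit in the test fold. The function names and the toy data (nine negatives, one positive) are assumptions made for the illustration.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def merged_folds(labels, k, positive="positive", seed=0):
    """For each fold, report whether the plain training set lacks positives
    and which positives end up in both training and test after merging."""
    positives = {i for i, y in enumerate(labels) if y == positive}
    report = []
    for test in k_fold_indices(len(labels), k, seed):
        test_set = set(test)
        train = set(range(len(labels))) - test_set
        report.append({
            # True when the unmodified training set has no positive example
            "train_lacks_positive": not (train & positives),
            # After adding all positives to the training set, the overlap
            # with the test fold is exactly the positives in that fold
            "leaked": sorted(positives & test_set),
        })
    return report

labels = ["negative"] * 9 + ["positive"]
report = merged_folds(labels, k=5)
for fold, row in enumerate(report):
    print(fold, row)
```

With a single positive example, exactly one fold has a training set without it, and that is the same fold in which the merged positive is also tested.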

Greetings,
  Sebastian