Topic: GA-driven attribute selection according to positive predictive value
Jr. Member

Posts: 74

 « on: March 26, 2010, 02:01:08 PM »

In my case I am interested only in POSITIVE PREDICTIVE VALUE.

The problem is that when I am selecting attributes, the GA selects only the case with a single correctly classified example, and thus PPV = 100 %. This is of course of very little reliability.

Could anyone tell me which performance evaluator would fit my needs?
Thank you in advance for any help.
Sebastian Land
Hero Member

Posts: 2425

 « Reply #1 on: March 29, 2010, 08:31:12 AM »

Hi,
sorry, but I'm a bit confused. What exactly are you going to do? Which operator do you use?

Greetings,
Sebastian
Jr. Member

Posts: 74

 « Reply #2 on: April 20, 2010, 01:02:40 PM »

I am sorry. I will try to be clearer now.

I have a binominal classification problem and I want to maximize the positive predictive value (PPV). Let's say I got these confusion matrices:

Code:
1068 328
57 77
accuracy: 74.84%
PPV: 57.46 %
This is quite good, as the PPV is 57.46 %.

Let's have a look at this example:
Code:
1135 394
0 1
accuracy: 74.25%
PPV: 100.0 %

Here the PPV is 100 % (i.e. the perfect solution from the PPV point of view), although the first result should be considered the better one.

Unfortunately, a single positively classified sample is of little significance. There is a high probability that the results will be very bad when the model is deployed on validation data.
The results on a validation example set are: 1) PPV: 55.9 %, 2) PPV: 0.0 % (two misclassified samples).

And here is my question: is there any way to compare these two results objectively?

Ingo Mierswa
Hero Member

Posts: 1220

 « Reply #3 on: April 20, 2010, 02:38:34 PM »

Hello,

hmm, there is always a risk in concentrating on precision alone. Besides taking other measures into account, be it through a combination like the F-measure, through weighting, or through multi-objective optimization schemes (all of which are possible within RapidMiner), I am afraid there is no general solution for an objective comparison.
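
For illustration only (this is not part of the reply above and not RapidMiner's implementation): a minimal Python sketch of how a combined measure such as the F-measure would rank the two confusion matrices from Reply #2. It assumes that the matrices list predictions in rows (first row predicted negative, second row predicted positive) and true classes in columns, which is the layout that reproduces the quoted accuracy and PPV values. The function name `scores` and the `beta` parameter are hypothetical names chosen for this sketch.

```python
def scores(tn, fn, fp, tp, beta=1.0):
    """Accuracy, PPV (precision), recall and F-beta for one binary confusion matrix.

    beta < 1 weights precision more heavily, beta > 1 weights recall more heavily.
    """
    total = tn + fn + fp + tp
    predicted_pos = tp + fp
    actual_pos = tp + fn
    accuracy = (tp + tn) / total if total else 0.0
    ppv = tp / predicted_pos if predicted_pos else 0.0
    recall = tp / actual_pos if actual_pos else 0.0
    denom = beta ** 2 * ppv + recall
    f = (1 + beta ** 2) * ppv * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "ppv": ppv, "recall": recall, "f": f}


# Assumed layout: row 1 = predicted negative (TN, FN), row 2 = predicted positive (FP, TP).
first = scores(tn=1068, fn=328, fp=57, tp=77)
second = scores(tn=1135, fn=394, fp=0, tp=1)

for name, s in (("first", first), ("second", second)):
    print(f"{name}: accuracy={s['accuracy']:.2%} PPV={s['ppv']:.2%} "
          f"recall={s['recall']:.2%} F1={s['f']:.4f}")
```

Under these assumptions the first matrix gets an F1 of about 0.29 and the second about 0.005, because the second one finds only 1 of 395 positive examples. Passing a `beta` below 1 keeps the emphasis on PPV while still penalizing near-zero recall; which value is appropriate remains a judgment call, in line with the point that there is no general objective comparison.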

Cheers,
Ingo