Author Topic: GA driven attribute selection according to Positive predictive value  (Read 536 times)
radone
Jr. Member
Posts: 73


« on: March 26, 2010, 02:01:08 PM »

In my case I am interested only in the positive predictive value (PPV).

The problem is that, when selecting attributes, the GA picks an attribute subset for which only a single example is classified as positive, and correctly so, giving PPV = 100 %. Such a result is of course not very reliable.

Could anyone suggest which performance evaluator would fit my needs?
Thank you in advance for any help.
Sebastian Land
Administrator
Hero Member
Posts: 2421


« Reply #1 on: March 29, 2010, 08:31:12 AM »

Hi,
sorry, but I'm a bit confused. What exactly are you trying to do? Which operator are you using?

Greetings,
  Sebastian
radone
Jr. Member
Posts: 73


« Reply #2 on: April 20, 2010, 01:02:40 PM »

I am sorry. I will try to be clearer now.

I have a binominal classification problem, and what I am interested in is maximizing the positive predictive value (PPV). Let's say I got this confusion matrix:

Code:
1068 328
57 77
accuracy: 74.84%
PPV: 57.46 %
This is quite good, as the PPV is 57.46 %.

Now let's have a look at this example:
Code:
1135 394
0 1
accuracy: 74.25%
PPV: 100.0 %

Here the PPV is 100 %, i.e. the perfect solution from the PPV point of view, although I would consider the first result to be the better one.
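
For reference, the reported figures can be reproduced with a minimal Python sketch. It is not part of the original post, and it assumes that the rows of each matrix are the predicted classes (negative, then positive) and the columns the true classes (negative, then positive); that layout is inferred from the reported accuracy and PPV values, and the function name `metrics` is purely illustrative.

```python
def metrics(tn, fn, fp, tp):
    """Return (accuracy, PPV) for one binary confusion matrix.

    Assumed layout (inferred from the reported numbers):
        row 1 = predicted negative: tn, fn
        row 2 = predicted positive: fp, tp
    """
    total = tn + fn + fp + tp
    accuracy = (tn + tp) / total
    predicted_positive = tp + fp
    # PPV is undefined when nothing is predicted positive; report 0.0 then.
    ppv = tp / predicted_positive if predicted_positive else 0.0
    return accuracy, ppv


first = metrics(tn=1068, fn=328, fp=57, tp=77)
second = metrics(tn=1135, fn=394, fp=0, tp=1)

for name, (accuracy, ppv) in (("first", first), ("second", second)):
    print(f"{name}: accuracy = {accuracy:.2%}, PPV = {ppv:.2%}")
```

Under that layout the second PPV rests on a single positive prediction, while the first rests on 134.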

Unfortunately, a single positively classified sample carries very little significance. There is a high probability that the results will be very bad when the model is deployed on validation data.
The results on a validation example set are 1) PPV: 55.9 % and 2) PPV: 0.0 % (two misclassified samples).
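
One possible way to make "little significance" concrete, offered here only as a hedged sketch and not as something proposed in the thread, is to score each model by a lower confidence bound on its PPV instead of the raw PPV. The example below uses the Wilson score interval at roughly 95 % confidence (z = 1.96); the choice of interval, the confidence level, and the function name `wilson_lower_bound` are all assumptions.

```python
import math


def wilson_lower_bound(tp, fp, z=1.96):
    """Lower end of the Wilson score interval for PPV = tp / (tp + fp).

    z = 1.96 corresponds to roughly 95 % confidence (an assumed choice).
    """
    n = tp + fp
    if n == 0:
        return 0.0  # no positive predictions: nothing supports any PPV
    p = tp / n
    denom = 1 + z**2 / n
    centre = p + z**2 / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (centre - margin) / denom


# Positive predictions of the two training-set matrices in this post.
first_lb = wilson_lower_bound(tp=77, fp=57)
second_lb = wilson_lower_bound(tp=1, fp=0)

print(f"first:  PPV lower bound = {first_lb:.3f}")
print(f"second: PPV lower bound = {second_lb:.3f}")
```

With these counts the bound for the first model comes out higher than the bound for the second, because a PPV of 100 % based on one prediction is penalized heavily. This is one heuristic among several, not a general answer to the question.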

And here is my question: is there any way to compare these two results objectively?

Thanks in advance.
Ingo Mierswa
Administrator
Hero Member
Posts: 1210



« Reply #3 on: April 20, 2010, 02:38:34 PM »

Hello,

hmm, there is always a risk in concentrating on precision alone. Besides taking other measures into account, be it through a combination like the F-measure, through weighting, or through multi-objective optimization schemes (all of which are possible within RapidMiner), I am afraid there is no general solution for an objective comparison.
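
As an illustration of the F-measure option, here is a minimal Python sketch (not RapidMiner code) applied to the two confusion matrices from the previous post. It assumes rows are predicted classes and columns are true classes, as inferred from the reported PPV values; `f_measure` and its `beta` parameter are illustrative names only.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-beta score: weighted harmonic mean of precision (PPV) and recall.

    beta < 1 puts more weight on precision, beta > 1 more on recall.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta**2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Counts taken from the two matrices above (assumed layout).
f1_first = f_measure(tp=77, fp=57, fn=328)
f1_second = f_measure(tp=1, fp=0, fn=394)

print(f"first:  F1 = {f1_first:.4f}")
print(f"second: F1 = {f1_second:.4f}")
```

Because recall enters the score, the model with a single positive prediction no longer looks perfect. A beta below 1 would keep the emphasis on precision while still penalizing near-zero recall.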

Cheers,
Ingo