Topic: Find Threshold threshold NaN
HIshaq


« on: July 23, 2013, 12:51:20 AM »

Hello Folks,

I am trying to use the "Find Threshold" operator to find a threshold for some dummy data I made up about high school dropouts. I import the data using the wizard and assign the "label", "prediction" and "confidence" roles by selecting them from the drop-down menus; the roles are applied through the "Set Role" operators. The rule I use is: if the "label" says "no" and the confidence is at least 0.5, I set my "prediction" to "no". In other words, if a person has not dropped out so far, with a confidence of >= 0.5, it is predicted that they will not drop out in the coming year. Here is the XML code:

Code:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.008">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.3.008" expanded="true" name="Root">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="5.3.008" expanded="true" height="60" name="Retrieve test8" width="90" x="45" y="120">
        <parameter key="repository_entry" value="//Local Repository/test8"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="5.3.008" expanded="true" height="76" name="Set Role" width="90" x="179" y="120">
        <parameter key="attribute_name" value="Dropped out"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="5.3.008" expanded="true" height="76" name="Set Role (2)" width="90" x="296" y="120">
        <parameter key="attribute_name" value="Prediction"/>
        <parameter key="target_role" value="prediction"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="5.3.008" expanded="true" height="76" name="Set Role (3)" width="90" x="438" y="120">
        <parameter key="attribute_name" value="Confidence"/>
        <parameter key="target_role" value="confidence"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="find_threshold" compatibility="5.3.008" expanded="true" height="76" name="Find Threshold" width="90" x="581" y="120">
        <parameter key="show_roc_plot" value="true"/>
      </operator>
      <connect from_op="Retrieve test8" from_port="output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Set Role (2)" to_port="example set input"/>
      <connect from_op="Set Role (2)" from_port="example set output" to_op="Set Role (3)" to_port="example set input"/>
      <connect from_op="Set Role (3)" from_port="example set output" to_op="Find Threshold" to_port="example set"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
    </process>
  </operator>
</process>



What I was expecting, as in the tutorial, was that the "Find Threshold" operator would give me a new, better threshold based on the data. Instead, the threshold value I get is NaN. The ROC curve is a vertical line along the y-axis followed by a horizontal line along the top. I guess that means FP/N (the false positive rate) is zero? Why is that, and how do I fix it?

Following is the relevant part of the data I am working with:

Dropped out   Confidence   Confidence(negative)   Prediction
n             0.75         0.25                   n
n             0.82         0.18                   n
n             0.43         0.57                   y
y             0.1          0.9                    y
n             0.7          0.3                    n
n             0.85         0.15                   n
n             0.6          0.4                    n
n             0.89         0.11                   n
n             0.46         0.54                   y
n             0.39         0.61                   y
n             0.7          0.3                    n
n             0.4          0.6                    y
n             0.9          0.1                    n
n             0.81         0.19                   n
y             0.69         0.31                   y
n             0.55         0.45                   n
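
As a rough cross-check outside RapidMiner, the ROC points for the rows above can be computed by hand. The following is only a sketch, not what the Find Threshold operator does internally. It assumes that the "Confidence" column is the confidence for class "n" and that "n" is treated as the positive class. The function name roc_points is made up for this illustration.

```python
# Rows copied from the table above as (label, confidence).
# Assumption: "Confidence" is the confidence for class "n" (did not drop out),
# and "n" is treated as the positive class.
rows = [
    ("n", 0.75), ("n", 0.82), ("n", 0.43), ("y", 0.10),
    ("n", 0.70), ("n", 0.85), ("n", 0.60), ("n", 0.89),
    ("n", 0.46), ("n", 0.39), ("n", 0.70), ("n", 0.40),
    ("n", 0.90), ("n", 0.81), ("y", 0.69), ("n", 0.55),
]


def roc_points(rows, positive="n"):
    """Return (FPR, TPR) pairs, one per distinct confidence threshold."""
    pos = sum(1 for label, _ in rows if label == positive)
    neg = len(rows) - pos
    points = [(0.0, 0.0)]  # threshold above every confidence value
    # Lower the threshold step by step; a row counts as predicted positive
    # when its confidence is at or above the threshold.
    for t in sorted({conf for _, conf in rows}, reverse=True):
        tp = sum(1 for label, conf in rows if conf >= t and label == positive)
        fp = sum(1 for label, conf in rows if conf >= t and label != positive)
        points.append((fp / neg, tp / pos))
    return points


for fpr, tpr in roc_points(rows):
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Under these assumptions the curve leaves FPR = 0 once the threshold drops to 0.69, because one "y" row has that confidence. A curve that runs straight up the y-axis and then along the top would therefore not be expected from these rows alone.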

I hope I have followed all the required steps in making this post. And I thank you in advance for your help.

Kind regards,
HIshaq
Marius
« Reply #1 on: July 23, 2013, 09:39:07 AM »

HIshaq, there does indeed seem to be a problem with the Find Threshold operator. I have created an internal ticket so that the development team can have a look at it.

Best regards,
Marius