Topic: [SOLVED] Convert binominal to numeric?
wessel
« on: January 24, 2012, 08:09:30 PM »

Dear All,

I have several binominal attributes on which I want to run linear regression.
To do that, I must convert these attributes from the values "true" and "false" to real-valued attributes with the values 1 and 0.
How can I do this?

I tried the Generate Attributes operator, but it did not work.
I used the following settings:
attribute name: myNewAtt    
functional expression: if(myAtt == true, 1, 0)

This expression looks correct to me, but it always returns 0.
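
One possible explanation, offered as an assumption rather than something confirmed in this thread: myAtt holds the nominal values "true" and "false", so a comparison with the unquoted constant true may never match. Comparing against the quoted string is worth trying. The fragment below is an untested sketch that reuses the names myAtt and myNewAtt from above; the function_descriptions list key is assumed to be how RapidMiner 5 stores Generate Attributes expressions, so check it against a process exported from your own installation.

```xml
<!-- Assumption: the nominal value has to be quoted for the comparison to match. -->
<operator activated="true" class="generate_attributes" compatibility="5.1.017" expanded="true" name="Generate Attributes">
  <list key="function_descriptions">
    <parameter key="myNewAtt" value="if(myAtt == &quot;true&quot;, 1, 0)"/>
  </list>
</operator>
```

In the operator's parameter dialog the expression would simply read if(myAtt == "true", 1, 0); the &quot; entities are only the XML escaping of the double quotes.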

Best regards,

Wessel
« Last Edit: January 24, 2012, 08:16:05 PM by wessel » Logged
wessel
« Reply #1 on: January 24, 2012, 08:15:53 PM »

The following process does work. It uses three operators:
1. Replace (replaces all "true" values with 1)
2. Replace (replaces all "false" values with 0)
3. Parse Numbers (converts the resulting nominal values to numbers)

Code:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.017">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.1.017" expanded="true" name="Process">
    <process expanded="true" height="642" width="778">
      <operator activated="true" class="replace" compatibility="5.1.017" expanded="true" height="76" name="Replace" width="90" x="59" y="140">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attributes" value="|cluster_2|cluster_1|cluster_0"/>
        <parameter key="replace_what" value="true"/>
        <parameter key="replace_by" value="1"/>
      </operator>
      <operator activated="true" class="replace" compatibility="5.1.017" expanded="true" height="76" name="Replace (2)" width="90" x="187" y="85">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attributes" value="|cluster_2|cluster_1|cluster_0"/>
        <parameter key="replace_what" value="false"/>
        <parameter key="replace_by" value="0"/>
      </operator>
      <operator activated="true" class="parse_numbers" compatibility="5.1.017" expanded="true" height="76" name="Parse Numbers" width="90" x="315" y="30">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attributes" value="|cluster_2|cluster_1|cluster_0"/>
      </operator>
      <connect from_op="Replace" from_port="example set output" to_op="Replace (2)" to_port="example set input"/>
      <connect from_op="Replace (2)" from_port="example set output" to_op="Parse Numbers" to_port="example set input"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
    </process>
  </operator>
</process>
« Last Edit: January 24, 2012, 08:18:41 PM by wessel » Logged
earmijo
« Reply #2 on: January 26, 2012, 03:13:39 AM »

Hi Wessel:

Two additional solutions:

1) Use Weka's Linear Regression operator. It codes the binominal attributes for you automatically, which is very convenient.

2) Use the "Nominal to Numerical" operator and select dummy coding. You then have to define a "comparison group" for each binominal attribute; that value is coded as 0. Going by your message, the comparison group would be "false".

Regards,

\E

Here's an example that uses the Golf dataset:

Code:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.017">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.1.017" expanded="true" name="Process">
    <process expanded="true" height="637" width="950">
      <operator activated="true" class="retrieve" compatibility="5.1.017" expanded="true" height="60" name="Retrieve" width="90" x="45" y="75">
        <parameter key="repository_entry" value="//Samples/data/Golf"/>
      </operator>
      <operator activated="true" class="nominal_to_numerical" compatibility="5.1.017" expanded="true" height="94" name="Nominal to Numerical" width="90" x="182" y="72">
        <parameter key="coding_type" value="dummy coding"/>
        <parameter key="use_comparison_groups" value="true"/>
        <list key="comparison_groups">
          <parameter key="Wind" value="false"/>
          <parameter key="Outlook" value="sunny"/>
        </list>
      </operator>
      <connect from_op="Retrieve" from_port="output" to_op="Nominal to Numerical" to_port="example set input"/>
      <connect from_op="Nominal to Numerical" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
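
Applied to the attributes from the original question, the same operator might be configured as in the fragment below. This is an untested sketch: the attribute names cluster_0, cluster_1 and cluster_2 and the "false" comparison group are taken from the posts above, and it assumes that Nominal to Numerical accepts the same subset filter parameters as the Replace and Parse Numbers operators in the earlier process.

```xml
<!-- Sketch only: dummy coding for the three binominal cluster attributes,
     with "false" as the comparison group that gets coded as 0. -->
<operator activated="true" class="nominal_to_numerical" compatibility="5.1.017" expanded="true" name="Nominal to Numerical">
  <parameter key="attribute_filter_type" value="subset"/>
  <parameter key="attributes" value="|cluster_2|cluster_1|cluster_0"/>
  <parameter key="coding_type" value="dummy coding"/>
  <parameter key="use_comparison_groups" value="true"/>
  <list key="comparison_groups">
    <parameter key="cluster_0" value="false"/>
    <parameter key="cluster_1" value="false"/>
    <parameter key="cluster_2" value="false"/>
  </list>
</operator>
```

Restricting the operator to a subset leaves any other nominal attributes in the example set untouched.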
« Last Edit: January 26, 2012, 03:28:08 AM by earmijo » Logged