 1 
 on: Today at 09:44:54 AM 
Started by sylar_19 - Last post by haddock
Hola sylar_19,

I spend most of my time on association rules (see my website) and feel the need to jump in, so here goes.

Association rules are a form of unsupervised learning, which means that there is no supervisor to tell the machine what to look for. You use unsupervised learning to explore data. The machine first looks for things that happen together (the sets), and then makes rules from those patterns. In your case, the results are as follows:

Code:
Sets
[Temperatura = Frio, Umidade]
[Vento, Temperatura = Frio, Umidade]
[Probabilidade = Chuvoso, Temperatura = Frio, Umidade]

Association Rules
[Temperatura = Frio] --> [Umidade] (confidence: 1.000)
[Vento, Temperatura = Frio] --> [Umidade] (confidence: 1.000)
[Probabilidade = Chuvoso, Temperatura = Frio] --> [Umidade] (confidence: 1.000)
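
To make the "confidence" figure concrete, here is a minimal Python sketch of how support and confidence are usually computed. The transactions are toy data invented for illustration (they are not the weather data set from this thread), and the function names `support` and `confidence` are my own, not RapidMiner's.

```python
# Toy transactions, invented for illustration only.
# Each transaction is the set of items that were "true" for one example.
transactions = [
    {"Temperatura = Frio", "Umidade"},
    {"Temperatura = Frio", "Umidade", "Vento"},
    {"Temperatura = Frio", "Umidade", "Probabilidade = Chuvoso"},
    {"Umidade", "Vento"},
    {"Vento"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(premise, conclusion, transactions):
    """support(premise and conclusion together) / support(premise)."""
    premise_support = support(premise, transactions)
    if premise_support == 0:
        raise ValueError("the premise never occurs in the data")
    both = set(premise) | set(conclusion)
    return support(both, transactions) / premise_support


# Every toy transaction with "Temperatura = Frio" also contains "Umidade",
# so this rule holds in all cases where its premise applies.
print(confidence({"Temperatura = Frio"}, {"Umidade"}, transactions))

# "Vento" appears three times, but only two of those also contain "Umidade".
print(confidence({"Vento"}, {"Umidade"}, transactions))
```

A confidence of 1.000, as in the rules above, therefore means that the conclusion was present in every example where the premise was present.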

By contrast, with decision trees you know what you are looking for, in this case the Sim/Nao (yes/no) label shown at the leaves of the tree. This is called supervised learning. You use supervised learning when you want to predict something.

Code:
Tree
Probabilidade = Chuvoso
| Vento = Nao: Sim {Nao=0, Sim=3}
| Vento = Sim: Nao {Nao=2, Sim=0}
Probabilidade = Ensolarado
| Umidade = Alta: Nao {Nao=3, Sim=0}
| Umidade = Normal: Sim {Nao=0, Sim=2}
Probabilidade = Nublado: Sim {Nao=0, Sim=4}
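
Read as a predictor, the tree above is just a set of nested lookups. The following Python sketch hand-codes it; the `TREE` structure and the `predict` function are my own illustration, not something RapidMiner exports.

```python
# The tree above, hand-coded. An inner node is (attribute_to_test, branches);
# a leaf is simply the predicted label.
TREE = {
    "Chuvoso": ("Vento", {"Nao": "Sim", "Sim": "Nao"}),
    "Ensolarado": ("Umidade", {"Alta": "Nao", "Normal": "Sim"}),
    "Nublado": "Sim",
}


def predict(example):
    """Follow the branches for one example (a dict of attribute values)."""
    node = TREE[example["Probabilidade"]]
    if isinstance(node, str):
        return node  # leaf reached directly, e.g. Nublado
    attribute, branches = node
    return branches[example[attribute]]


print(predict({"Probabilidade": "Chuvoso", "Vento": "Nao", "Umidade": "Alta"}))
print(predict({"Probabilidade": "Ensolarado", "Vento": "Sim", "Umidade": "Alta"}))
print(predict({"Probabilidade": "Nublado", "Vento": "Sim", "Umidade": "Normal"}))
```

The counts in braces, such as {Nao=0, Sim=3}, show how many training examples of each label ended up in that leaf.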

In my own case of text mining, if I already have the keywords and type them into Google, that is supervised, whereas if I have the documents and want to find the keywords, that is unsupervised. It is the fundamental difference between "go find this in there" and "what is in there". Hope that helps,

H

 2 
 on: Today at 09:25:30 AM 
Started by Evgenii - Last post by NamSor
We'll go :)

I'm not from Russia, but I speak Russian. We've built the RapidMiner Onomastics Extension, fully calibrated on names from the Russian Federation (in both Latin and Cyrillic script).
https://www.youtube.com/watch?v=wScgijiqA2c

Best,
Elian

 3 
 on: Today at 08:22:06 AM 
Started by sylar_19 - Last post by SvenVanPoucke
Hi,
A good first step is to take a look at the following book:
https://rapidminer.com/wp-content/uploads/2013/10/DataMiningForTheMasses.pdf
If it does not give you the answer you wanted, please come back.
Cheers
Sven

 4 
 on: Today at 01:16:52 AM 
Started by sylar_19 - Last post by sylar_19
I used the nominal weather data set. Could someone help me interpret these results, or point me to a tutorial on how to do it?

Association Rules
[Temperatura = Frio] --> [Umidade] (confidence: 1.000)
[Vento, Temperatura = Frio] --> [Umidade] (confidence: 1.000)
[Probabilidade = Chuvoso, Temperatura = Frio] --> [Umidade] (confidence: 1.000)

Tree
Probabilidade = Chuvoso
| Vento = Nao: Sim {Nao=0, Sim=3}
| Vento = Sim: Nao {Nao=2, Sim=0}
Probabilidade = Ensolarado
| Umidade = Alta: Nao {Nao=3, Sim=0}
| Umidade = Normal: Sim {Nao=0, Sim=2}
Probabilidade = Nublado: Sim {Nao=0, Sim=4}

 5 
 on: Today at 12:02:50 AM 
Started by ToniLilly - Last post by ToniLilly
Appreciate it. An abundance of facts.

 6 
 on: May 22, 2015, 06:13:59 PM 
Started by SvenVanPoucke - Last post by SvenVanPoucke
Hi,
I would like to know if anyone has experience with presenting the results of an NB classification (binominal classification).
I have the PerformanceVector (see below), but the plot view is not very illustrative. Does anyone know a nicer way to present the performance vector?
Cheers
Sven

PerformanceVector:
accuracy: 89.04%
ConfusionMatrix:
True:   Y   N
Y:   8695   10873
N:   2189   97407
classification_error: 10.96%
kappa: 0.514
precision: 97.80% (positive class: N)
recall: 89.96% (positive class: N)
lift: 107.63% (positive class: N)
fallout: 20.11% (positive class: N)
f_measure: 93.72% (positive class: N)
false_positive: 2189.000 (positive class: N)
false_negative: 10873.000 (positive class: N)
true_positive: 97407.000 (positive class: N)
true_negative: 8695.000 (positive class: N)
sensitivity: 89.96% (positive class: N)
specificity: 79.89% (positive class: N)
youden: 0.698 (positive class: N)
positive_predictive_value: 97.80% (positive class: N)
negative_predictive_value: 44.43% (positive class: N)
psep: 0.422 (positive class: N)
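
All of the figures above follow from the four cells of the confusion matrix, so a compact table or chart only needs those four numbers. The Python sketch below recomputes the main measures using the standard textbook formulas. It assumes that rows are predictions, columns are true classes, and N is the positive class, as the output states; the variable names are my own, and I have not checked the formulas against RapidMiner's source.

```python
# Cells of the confusion matrix above, with N as the positive class.
# Assumption: rows are predictions, columns are the true classes.
tp = 97407  # predicted N, truly N
fp = 2189   # predicted N, truly Y
fn = 10873  # predicted Y, truly N
tn = 8695   # predicted Y, truly Y
total = tp + fp + fn + tn

accuracy = (tp + tn) / total
precision = tp / (tp + fp)        # positive predictive value
recall = tp / (tp + fn)           # sensitivity
specificity = tn / (tn + fp)
fallout = fp / (fp + tn)          # 1 - specificity
npv = tn / (tn + fn)              # negative predictive value
f_measure = 2 * precision * recall / (precision + recall)
youden = recall + specificity - 1
psep = precision + npv - 1
lift = precision / ((tp + fn) / total)  # precision relative to the base rate

# Cohen's kappa: observed agreement corrected for chance agreement.
chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total**2
kappa = (accuracy - chance) / (1 - chance)

summary = {
    "accuracy": accuracy,
    "precision": precision,
    "recall": recall,
    "specificity": specificity,
    "fallout": fallout,
    "npv": npv,
    "f_measure": f_measure,
    "lift": lift,
}
for name, value in summary.items():
    print(f"{name:12s} {value:8.2%}")
print(f"{'kappa':12s} {kappa:8.3f}")
print(f"{'youden':12s} {youden:8.3f}")
print(f"{'psep':12s} {psep:8.3f}")
```

Printed this way, the whole performance vector fits in about a dozen lines next to a single copy of the confusion matrix.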

 7 
 on: May 22, 2015, 03:16:57 PM 
Started by Timo - Last post by Martin Schmitz
wow, nice one!

I got a new building block! Thanks!

 8 
 on: May 22, 2015, 02:26:38 PM 
Started by Timo - Last post by JEdward
Following on from Martin's note, here's a very quick example of a couple of RegEx ways to extract the dates and format them.
It uses Cut Document and Select Subprocess, so you can add more date formats as you write further regular expressions. In this example it only selects the first date it finds in the document (in a press release, that is likely to be at the top).

Code:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="6.4.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="6.4.000" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="subprocess" compatibility="6.4.000" expanded="true" height="76" name="Example Documents" width="90" x="45" y="120">
        <process expanded="true">
          <operator activated="true" class="text:create_document" compatibility="6.4.001" expanded="true" height="60" name="Create Document (2)" width="90" x="45" y="390">
            <parameter key="text" value="This is a press release from 12/05/2010&#10;sfsfsd&#10;sdfsdfsd&#10;fsdgsdgsd g sdg sdfg dfgg"/>
          </operator>
          <operator activated="true" class="text:create_document" compatibility="6.4.001" expanded="true" height="60" name="Create Document" width="90" x="45" y="255">
            <parameter key="text" value="This is a press release from Monday 12th May 2010&#10;sfsfsd&#10;sdfsdfsd&#10;fsdgsdgsd g sdg sdfg dfgg"/>
          </operator>
          <operator activated="true" class="text:documents_to_data" compatibility="6.4.001" expanded="true" height="94" name="Documents to Data" width="90" x="179" y="300">
            <parameter key="text_attribute" value="press_release"/>
          </operator>
          <connect from_op="Create Document (2)" from_port="output" to_op="Documents to Data" to_port="documents 2"/>
          <connect from_op="Create Document" from_port="output" to_op="Documents to Data" to_port="documents 1"/>
          <connect from_op="Documents to Data" from_port="example set" to_port="out 1"/>
          <portSpacing port="source_in 1" spacing="0"/>
          <portSpacing port="sink_out 1" spacing="0"/>
          <portSpacing port="sink_out 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="generate_id" compatibility="6.4.000" expanded="true" height="76" name="Generate ID" width="90" x="179" y="120"/>
      <operator activated="true" class="multiply" compatibility="6.4.000" expanded="true" height="94" name="Multiply" width="90" x="313" y="210"/>
      <operator activated="true" class="text:process_document_from_data" compatibility="6.4.001" expanded="true" height="76" name="Process Documents from Data" width="90" x="447" y="75">
        <parameter key="create_word_vector" value="false"/>
        <parameter key="keep_text" value="true"/>
        <parameter key="select_attributes_and_weights" value="true"/>
        <list key="specify_weights">
          <parameter key="press_release" value="1.0"/>
        </list>
        <process expanded="true">
          <operator activated="true" class="text:cut_document" compatibility="6.4.001" expanded="true" height="60" name="Cut Document" width="90" x="112" y="30">
            <parameter key="query_type" value="Regular Expression"/>
            <list key="string_machting_queries"/>
            <list key="regular_expression_queries">
              <parameter key="1" value="((([0-9]|[0-9])|([0-3][0-9]))(/)(([0-9]|[0-9])|([0-9][0-9]))(/)([1-2][0-9][0-9][0-9]))"/>
              <parameter key="2" value="((([0-9]|[0-9])|([0-9][0-9]))(...)(January|February|March|April|May|June|July|August|September|October|November|December)(.)([1-2][0-9][0-9][0-9]))"/>
            </list>
            <list key="regular_region_queries"/>
            <list key="xpath_queries"/>
            <list key="namespaces"/>
            <list key="index_queries"/>
            <list key="jsonpath_queries"/>
            <process expanded="true">
              <operator activated="true" class="text:documents_to_data" compatibility="6.4.001" expanded="true" height="76" name="Documents to Data (3)" width="90" x="112" y="30">
                <parameter key="text_attribute" value="dateformat"/>
              </operator>
              <operator activated="true" class="extract_macro" compatibility="6.4.000" expanded="true" height="60" name="Extract Macro" width="90" x="179" y="210">
                <parameter key="macro" value="query_key"/>
                <parameter key="macro_type" value="data_value"/>
                <parameter key="attribute_name" value="query_key"/>
                <parameter key="example_index" value="1"/>
                <list key="additional_macros"/>
                <description align="center" color="transparent" colored="false" width="126">Extract the date format type for the subprocess selection</description>
              </operator>
              <operator activated="true" class="text_to_nominal" compatibility="6.4.000" expanded="true" height="76" name="Text to Nominal" width="90" x="313" y="210">
                <parameter key="attribute_filter_type" value="single"/>
                <parameter key="attribute" value="dateformat"/>
              </operator>
              <operator activated="false" class="handle_exception" compatibility="6.4.000" expanded="true" height="60" name="Handle Exception" width="90" x="514" y="390">
                <process expanded="true">
                  <portSpacing port="source_in 1" spacing="0"/>
                  <portSpacing port="sink_out 1" spacing="0"/>
                </process>
                <process expanded="true">
                  <portSpacing port="source_in 1" spacing="0"/>
                  <portSpacing port="sink_out 1" spacing="0"/>
                </process>
                <description align="center" color="transparent" colored="false" width="126">You should really use 'Handle Exception' around the 'Select Subprocess' as there are bound to be some extracted dates that don't parse. Left disabled for illustration.</description>
              </operator>
              <operator activated="true" class="select_subprocess" compatibility="6.4.000" expanded="true" height="76" name="Select Subprocess" width="90" x="514" y="210">
                <parameter key="select_which" value="%{query_key}"/>
                <process expanded="true">
                  <operator activated="true" class="nominal_to_date" compatibility="6.4.000" expanded="true" height="76" name="Nominal to Date" width="90" x="112" y="30">
                    <parameter key="attribute_name" value="dateformat"/>
                    <parameter key="date_format" value="dd/MM/yyyy"/>
                  </operator>
                  <connect from_port="input 1" to_op="Nominal to Date" to_port="example set input"/>
                  <connect from_op="Nominal to Date" from_port="example set output" to_port="output 1"/>
                  <portSpacing port="source_input 1" spacing="0"/>
                  <portSpacing port="source_input 2" spacing="0"/>
                  <portSpacing port="sink_output 1" spacing="0"/>
                  <portSpacing port="sink_output 2" spacing="0"/>
                </process>
                <process expanded="true">
                  <operator activated="true" class="nominal_to_date" compatibility="6.4.000" expanded="true" height="76" name="Nominal to Date (2)" width="90" x="179" y="30">
                    <parameter key="attribute_name" value="dateformat"/>
                    <parameter key="date_format" value="dd'th' MMMMM yyyy"/>
                  </operator>
                  <connect from_port="input 1" to_op="Nominal to Date (2)" to_port="example set input"/>
                  <connect from_op="Nominal to Date (2)" from_port="example set output" to_port="output 1"/>
                  <portSpacing port="source_input 1" spacing="0"/>
                  <portSpacing port="source_input 2" spacing="0"/>
                  <portSpacing port="sink_output 1" spacing="0"/>
                  <portSpacing port="sink_output 2" spacing="0"/>
                </process>
                <description align="center" color="transparent" colored="false" width="126">Create a subprocess to parse each date format.</description>
              </operator>
              <operator activated="true" class="text:data_to_documents" compatibility="6.4.001" expanded="true" height="60" name="Data to Documents" width="90" x="447" y="75">
                <list key="specify_weights"/>
              </operator>
              <operator activated="true" class="text:combine_documents" compatibility="6.4.001" expanded="true" height="76" name="Combine Documents" width="90" x="648" y="75"/>
              <connect from_port="segment" to_op="Documents to Data (3)" to_port="documents 1"/>
              <connect from_op="Documents to Data (3)" from_port="example set" to_op="Extract Macro" to_port="example set"/>
              <connect from_op="Extract Macro" from_port="example set" to_op="Text to Nominal" to_port="example set input"/>
              <connect from_op="Text to Nominal" from_port="example set output" to_op="Select Subprocess" to_port="input 1"/>
              <connect from_op="Select Subprocess" from_port="output 1" to_op="Data to Documents" to_port="example set"/>
              <connect from_op="Data to Documents" from_port="documents" to_op="Combine Documents" to_port="documents 1"/>
              <connect from_op="Combine Documents" from_port="document" to_port="document 1"/>
              <portSpacing port="source_segment" spacing="0"/>
              <portSpacing port="sink_document 1" spacing="0"/>
              <portSpacing port="sink_document 2" spacing="0"/>
            </process>
            <description align="center" color="transparent" colored="false" width="126">The following formats are supported: &lt;br/&gt;1 : dd/MM/yyyy&lt;br/&gt;2 : dd'th' MMMMM yyyy</description>
          </operator>
          <operator activated="true" class="text:combine_documents" compatibility="6.4.001" expanded="true" height="76" name="Combine Documents (2)" width="90" x="313" y="30"/>
          <connect from_port="document" to_op="Cut Document" to_port="document"/>
          <connect from_op="Cut Document" from_port="documents" to_op="Combine Documents (2)" to_port="documents 1"/>
          <connect from_op="Combine Documents (2)" from_port="document" to_port="document 1"/>
          <portSpacing port="source_document" spacing="0"/>
          <portSpacing port="sink_document 1" spacing="0"/>
          <portSpacing port="sink_document 2" spacing="0"/>
        </process>
        <description align="center" color="transparent" colored="false" width="126">Magic happens here. :)</description>
      </operator>
      <operator activated="true" class="select_attributes" compatibility="6.4.000" expanded="true" height="76" name="Select Attributes" width="90" x="648" y="120">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="text"/>
        <parameter key="invert_selection" value="true"/>
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <operator activated="true" class="join" compatibility="6.4.000" expanded="true" height="76" name="Join" width="90" x="648" y="255">
        <list key="key_attributes"/>
      </operator>
      <connect from_op="Example Documents" from_port="out 1" to_op="Generate ID" to_port="example set input"/>
      <connect from_op="Generate ID" from_port="example set output" to_op="Multiply" to_port="input"/>
      <connect from_op="Multiply" from_port="output 1" to_op="Process Documents from Data" to_port="example set"/>
      <connect from_op="Multiply" from_port="output 2" to_op="Join" to_port="right"/>
      <connect from_op="Process Documents from Data" from_port="example set" to_op="Select Attributes" to_port="example set input"/>
      <connect from_op="Select Attributes" from_port="example set output" to_op="Join" to_port="left"/>
      <connect from_op="Join" from_port="join" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
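
For anyone who wants to try the idea outside RapidMiner, here is a rough Python sketch of the same approach: one regular expression per date format, one parser per expression, and only the first date found in the text is returned. It is not a translation of the process above. The patterns are tightened versions of my own, the ordinal suffixes st/nd/rd are an addition (the process only parses 'th'), and all names are hypothetical.

```python
import re
from datetime import date

MONTHS = {
    name: number
    for number, name in enumerate(
        ["January", "February", "March", "April", "May", "June", "July",
         "August", "September", "October", "November", "December"],
        start=1,
    )
}

# Format 1: dd/MM/yyyy, e.g. 12/05/2010
NUMERIC = re.compile(r"\b(\d{1,2})/(\d{1,2})/([12]\d{3})\b")
# Format 2: day with optional ordinal suffix, month name, year, e.g. 12th May 2010
LONG = re.compile(
    r"\b(\d{1,2})(?:st|nd|rd|th)?\s+(" + "|".join(MONTHS) + r")\s+([12]\d{3})\b"
)


def _from_numeric(m):
    return date(int(m.group(3)), int(m.group(2)), int(m.group(1)))


def _from_long(m):
    return date(int(m.group(3)), MONTHS[m.group(2)], int(m.group(1)))


# One (pattern, parser) pair per supported format; add more pairs as needed.
PARSERS = [(NUMERIC, _from_numeric), (LONG, _from_long)]


def first_date(text):
    """Return the date that appears earliest in `text`, or None."""
    candidates = []
    for pattern, build in PARSERS:
        m = pattern.search(text)
        if m:
            candidates.append((m.start(), m, build))
    if not candidates:
        return None
    _, m, build = min(candidates, key=lambda c: c[0])
    try:
        return build(m)
    except ValueError:
        # Plays the role of 'Handle Exception': text that looks like a
        # date but is not a valid one (e.g. 31/02/2010) is skipped.
        return None


print(first_date("This is a press release from 12/05/2010\nsfsfsd"))
print(first_date("This is a press release from Monday 12th May 2010\nsfsfsd"))
```

The list of (pattern, parser) pairs plays the same role as Select Subprocess in the process above: each new date format gets its own expression and its own parsing step.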

 9 
 on: May 22, 2015, 01:51:29 PM 
Started by PaulV - Last post by Martin Schmitz
I don't know what's recommended, but I personally have never used Read.

 10 
 on: May 22, 2015, 01:47:47 PM 
Started by mattwl - Last post by Martin Schmitz
Hi All,

For the record: we fixed the issue via mail. The problem was related to the "include special attributes" option of the Select Attributes operator, which accidentally filtered out everything.

Cheers,
Martin
