Pages: [1] 2 3 ... 10
 1 
 on: July 28, 2014, 08:17:05 PM 
Started by Viper1988 - Last post by Viper1988
Hi,

I have a database table with the following columns: text, customer id and customer name.
I would like to build a word list from the text, so that I can see, for example, that customer 1 has written the word "RapidMiner" five times, while customer 2 has written "RapidMiner" four times and "Mining" three times.

Does anybody have an idea? Sorry for my bad English.

Thank you very much!
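To make the desired result concrete, here is a minimal sketch outside RapidMiner, in Python. The rows, the tokenisation rule (lowercase, split on non-alphanumeric characters) and the function name are assumptions for illustration only; the real table would come from the database.

```python
import re
from collections import Counter, defaultdict

# Hypothetical rows standing in for the database table:
# (customer id, customer name, text).
rows = [
    (1, "Customer 1", "RapidMiner is great. I use RapidMiner daily."),
    (2, "Customer 2", "Mining with RapidMiner: mining, mining, and more RapidMiner."),
]


def word_counts_per_customer(rows):
    """Return {customer_id: Counter({word: occurrences})}."""
    counts = defaultdict(Counter)
    for customer_id, _name, text in rows:
        # Lowercase the text and keep runs of letters and digits as tokens.
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        counts[customer_id].update(tokens)
    return counts


counts = word_counts_per_customer(rows)
for customer_id, counter in sorted(counts.items()):
    print(customer_id, counter.most_common(3))
```

Several rows for the same customer are accumulated into one counter, which matches the "customer 1 wrote word X n times" view asked for above.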

 2 
 on: July 28, 2014, 02:35:19 PM 
Started by wessel - Last post by wessel
Dear all,

I would like to see an AUC for problems with more than two classes (N > 2), not only for binomial problems.

This is already possible with R.
http://stats.stackexchange.com/questions/2151/how-to-plot-roc-curves-in-multiclass-classification
http://homepage.tudelft.nl/a9p19/papers/prasa_06_vuc.pdf
http://www.mathworks.nl/matlabcentral/fileexchange/30424-colauc/content/colAUC.m

Best regards,

Wessel
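For readers unfamiliar with the idea, one common generalisation is to compute a one-vs-rest AUC per class and average the results. The sketch below shows that in plain Python; it is only one of several possible definitions and not necessarily the approach taken in the links above. The function names and the example confidences are made up for illustration.

```python
def binary_auc(scores, is_positive):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(labels, confidences):
    """Macro-average of the per-class one-vs-rest AUCs.

    confidences: one dict per example, mapping class name -> confidence.
    """
    classes = sorted(set(labels))
    aucs = []
    for c in classes:
        scores = [conf[c] for conf in confidences]
        is_pos = [y == c for y in labels]
        aucs.append(binary_auc(scores, is_pos))
    return sum(aucs) / len(aucs)


# Hypothetical three-class example.
labels = ["a", "a", "b", "b", "c", "c"]
confidences = [
    {"a": 0.8, "b": 0.1, "c": 0.1},
    {"a": 0.6, "b": 0.3, "c": 0.1},
    {"a": 0.2, "b": 0.7, "c": 0.1},
    {"a": 0.1, "b": 0.5, "c": 0.4},
    {"a": 0.1, "b": 0.2, "c": 0.7},
    {"a": 0.3, "b": 0.1, "c": 0.6},
]
macro_auc = one_vs_rest_auc(labels, confidences)
```

The pairwise loop is quadratic in the number of examples, which is fine for a sketch but would be replaced by a rank-based computation for large data sets.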






 3 
 on: July 28, 2014, 09:38:43 AM 
Started by GHERMAN Alina - Last post by GHERMAN Alina
?
Is there a component that I am not aware of and that can be used for this?

Thank you!

 4 
 on: July 25, 2014, 02:40:15 PM 
Started by Daniela - Last post by Marius
Hi Daniela,

well, I wouldn't call it a mistake, but yes, it seems that that was the problem :-)

Finally, you can also discover the formula with only 30 data points in RapidMiner using the Linear Regression operator. You need to disable all of its integrated feature selection methods, though; otherwise the heuristics remove seemingly collinear features. Set the feature selection method to "none" and disable "eliminate colinear features" in the Linear Regression operator.
As you can see, with 300 data points the heuristics have enough input to keep all relevant attributes.

Best regards,
Marius

Code:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="6.0.008">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="6.0.008" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="subprocess" compatibility="6.0.008" expanded="true" height="76" name="Generate Data (2)" width="90" x="45" y="30">
        <process expanded="true">
          <operator activated="true" class="generate_data" compatibility="6.0.008" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30">
            <parameter key="number_examples" value="30"/>
            <parameter key="number_of_attributes" value="1"/>
          </operator>
          <operator activated="true" class="select_attributes" compatibility="6.0.008" expanded="true" height="76" name="Select Attributes" width="90" x="179" y="30">
            <parameter key="invert_selection" value="true"/>
            <parameter key="include_special_attributes" value="true"/>
          </operator>
          <operator activated="true" class="generate_id" compatibility="6.0.008" expanded="true" height="76" name="Generate ID" width="90" x="313" y="30"/>
          <operator activated="true" class="generate_attributes" compatibility="6.0.008" expanded="true" height="76" name="Generate Attributes" width="90" x="447" y="30">
            <list key="function_descriptions">
              <parameter key="x" value="id"/>
              <parameter key="y" value="0.15 * x*x - 7.34 *x + 106.38"/>
            </list>
          </operator>
          <operator activated="true" class="materialize_data" compatibility="6.0.008" expanded="true" height="76" name="Materialize Data (2)" width="90" x="581" y="30"/>
          <connect from_op="Generate Data" from_port="output" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_op="Generate ID" to_port="example set input"/>
          <connect from_op="Generate ID" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
          <connect from_op="Generate Attributes" from_port="example set output" to_op="Materialize Data (2)" to_port="example set input"/>
          <connect from_op="Materialize Data (2)" from_port="example set output" to_port="out 1"/>
          <portSpacing port="source_in 1" spacing="0"/>
          <portSpacing port="sink_out 1" spacing="0"/>
          <portSpacing port="sink_out 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="set_role" compatibility="6.0.008" expanded="true" height="76" name="Set Role (2)" width="90" x="179" y="30">
        <parameter key="attribute_name" value="y"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="generate_function_set" compatibility="6.0.008" expanded="true" height="76" name="Generate Function Set" width="90" x="313" y="30">
        <parameter key="use_mult" value="true"/>
      </operator>
      <operator activated="true" class="rename_by_constructions" compatibility="6.0.008" expanded="true" height="76" name="Rename by Constructions" width="90" x="447" y="30"/>
      <operator activated="true" class="split_data" compatibility="6.0.008" expanded="true" height="94" name="Split Data" width="90" x="45" y="210">
        <enumeration key="partitions">
          <parameter key="ratio" value="0.7"/>
          <parameter key="ratio" value="0.3"/>
        </enumeration>
        <parameter key="sampling_type" value="stratified sampling"/>
      </operator>
      <operator activated="true" class="linear_regression" compatibility="6.0.008" expanded="true" height="94" name="Linear Regression" width="90" x="179" y="165">
        <parameter key="feature_selection" value="none"/>
        <parameter key="eliminate_colinear_features" value="false"/>
      </operator>
      <operator activated="true" class="apply_model" compatibility="6.0.008" expanded="true" height="76" name="Apply Model" width="90" x="313" y="210">
        <list key="application_parameters"/>
      </operator>
      <connect from_op="Generate Data (2)" from_port="out 1" to_op="Set Role (2)" to_port="example set input"/>
      <connect from_op="Set Role (2)" from_port="example set output" to_op="Generate Function Set" to_port="example set input"/>
      <connect from_op="Generate Function Set" from_port="example set output" to_op="Rename by Constructions" to_port="example set input"/>
      <connect from_op="Rename by Constructions" from_port="example set output" to_op="Split Data" to_port="example set"/>
      <connect from_op="Split Data" from_port="partition 1" to_op="Linear Regression" to_port="training set"/>
      <connect from_op="Split Data" from_port="partition 2" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Linear Regression" from_port="model" to_op="Apply Model" to_port="model"/>
      <connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
      <connect from_op="Apply Model" from_port="model" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="180"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
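
The claim that 30 points are enough can be checked outside RapidMiner as well. The following is a minimal sketch in plain Python, assuming an ordinary least-squares fit on the two features x and x*x without any feature selection; it is not RapidMiner's implementation, and the function name is made up for illustration. The data follow the same formula as in the process above.

```python
def fit_quadratic(xs, ys):
    """Ordinary least squares for y = a*x*x + b*x + c via the normal equations."""
    rows = [[x * x, x, 1.0] for x in xs]
    # Build the 3x3 system (X^T X) w = X^T y as an augmented matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(3)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, 4):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution.
    w = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(aug[i][j] * w[j] for j in range(i + 1, 3))
        w[i] = (aug[i][3] - tail) / aug[i][i]
    return w


# 30 noise-free points, x = 1..30, as in the process above.
xs = [float(i) for i in range(1, 31)]
ys = [0.15 * x * x - 7.34 * x + 106.38 for x in xs]
a, b, c = fit_quadratic(xs, ys)
print(a, b, c)
```

Because the data are noise-free and both x and x*x are kept, the fit recovers the three coefficients up to floating-point error. This is the situation the process reaches once the feature selection heuristics are switched off.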

 5 
 on: July 25, 2014, 02:14:25 PM 
Started by jaakko1 - Last post by jaakko1
Hello,

is it possible to include tables/reports from the RapidMiner process in the "Send Mail" node? If yes, how should I go about that?

Thanks in advance!

-J

 6 
 on: July 25, 2014, 01:52:09 PM 
Started by Daniela - Last post by Daniela
Hello Marius,

thank you so much for your help.

(R does calculate the formula pretty well with only the 30 data points. But you're right, that's not 300: what I did for RapidMiner was to create a dataset with 300 data points following the calculated formula, so basically what you did with the subprocess "Generate Data".)

If I understand correctly, the mistake was using the wrong operator? (I should have guessed that from the option "max iterations"...)


Best regards,
Daniela

 7 
 on: July 25, 2014, 12:08:20 PM 
Started by CharlieFirpo - Last post by CharlieFirpo
Dear All,

Is there a way to print the currently running operator to the console? The RapidMiner GUI shows this information at the bottom of the main window. But what if I want to run my process from the command line? How can I tell which operator is currently running?

Thank You!

 8 
 on: July 25, 2014, 09:36:40 AM 
Started by Daniela - Last post by Marius
One additional remark: the Polynomial Regression operator uses a numerical approach. The algorithm of the Linear Regression may be better suited in many cases. Of course, it is then necessary to calculate some interactions and quadratic terms manually; you can use the Generate Function Set operator for that. Please have a look at the process below for an example. There, the model recovers the relation almost perfectly.

Best regards,
Marius

Code:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="6.0.008">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="6.0.008" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="subprocess" compatibility="6.0.008" expanded="true" height="76" name="Generate Data (2)" width="90" x="45" y="30">
        <process expanded="true">
          <operator activated="true" class="generate_data" compatibility="6.0.008" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30">
            <parameter key="number_examples" value="300"/>
            <parameter key="number_of_attributes" value="1"/>
          </operator>
          <operator activated="true" class="select_attributes" compatibility="6.0.008" expanded="true" height="76" name="Select Attributes" width="90" x="179" y="30">
            <parameter key="invert_selection" value="true"/>
            <parameter key="include_special_attributes" value="true"/>
          </operator>
          <operator activated="true" class="generate_id" compatibility="6.0.008" expanded="true" height="76" name="Generate ID" width="90" x="313" y="30"/>
          <operator activated="true" class="generate_attributes" compatibility="6.0.008" expanded="true" height="76" name="Generate Attributes" width="90" x="447" y="30">
            <list key="function_descriptions">
              <parameter key="x" value="id/10"/>
              <parameter key="y" value="0.15 * x*x - 7.34 *x + 106.38"/>
            </list>
          </operator>
          <operator activated="true" class="materialize_data" compatibility="6.0.008" expanded="true" height="76" name="Materialize Data (2)" width="90" x="581" y="30"/>
          <connect from_op="Generate Data" from_port="output" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_op="Generate ID" to_port="example set input"/>
          <connect from_op="Generate ID" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
          <connect from_op="Generate Attributes" from_port="example set output" to_op="Materialize Data (2)" to_port="example set input"/>
          <connect from_op="Materialize Data (2)" from_port="example set output" to_port="out 1"/>
          <portSpacing port="source_in 1" spacing="0"/>
          <portSpacing port="sink_out 1" spacing="0"/>
          <portSpacing port="sink_out 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="set_role" compatibility="6.0.008" expanded="true" height="76" name="Set Role (2)" width="90" x="179" y="30">
        <parameter key="attribute_name" value="y"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="generate_function_set" compatibility="6.0.008" expanded="true" height="76" name="Generate Function Set" width="90" x="313" y="30">
        <parameter key="use_mult" value="true"/>
      </operator>
      <operator activated="true" class="rename_by_constructions" compatibility="6.0.008" expanded="true" height="76" name="Rename by Constructions" width="90" x="447" y="30"/>
      <operator activated="true" class="split_data" compatibility="6.0.008" expanded="true" height="94" name="Split Data" width="90" x="45" y="210">
        <enumeration key="partitions">
          <parameter key="ratio" value="0.7"/>
          <parameter key="ratio" value="0.3"/>
        </enumeration>
        <parameter key="sampling_type" value="stratified sampling"/>
      </operator>
      <operator activated="true" class="linear_regression" compatibility="6.0.008" expanded="true" height="94" name="Linear Regression" width="90" x="179" y="165"/>
      <operator activated="true" class="apply_model" compatibility="6.0.008" expanded="true" height="76" name="Apply Model" width="90" x="313" y="210">
        <list key="application_parameters"/>
      </operator>
      <connect from_op="Generate Data (2)" from_port="out 1" to_op="Set Role (2)" to_port="example set input"/>
      <connect from_op="Set Role (2)" from_port="example set output" to_op="Generate Function Set" to_port="example set input"/>
      <connect from_op="Generate Function Set" from_port="example set output" to_op="Rename by Constructions" to_port="example set input"/>
      <connect from_op="Rename by Constructions" from_port="example set output" to_op="Split Data" to_port="example set"/>
      <connect from_op="Split Data" from_port="partition 1" to_op="Linear Regression" to_port="training set"/>
      <connect from_op="Split Data" from_port="partition 2" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Linear Regression" from_port="model" to_op="Apply Model" to_port="model"/>
      <connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
      <connect from_op="Apply Model" from_port="model" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="180"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
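
The "interactions and quadratic terms" step can be pictured as follows. This is a small Python sketch of the idea only, assuming that all pairwise products of the regular attributes are added as new attributes. The attribute z, the function name and the naming scheme ("x * x") are illustrative and not necessarily what Generate Function Set and Rename by Constructions produce.

```python
from itertools import combinations_with_replacement


def add_product_features(example, attributes):
    """Return a copy of `example` extended with all pairwise products of `attributes`."""
    extended = dict(example)
    for left, right in combinations_with_replacement(attributes, 2):
        # Name each new attribute after the expression that constructs it.
        extended[f"{left} * {right}"] = example[left] * example[right]
    return extended


# Hypothetical example with two regular attributes.
example = {"x": 3.0, "z": 2.0}
extended = add_product_features(example, ["x", "z"])
print(extended)
```

With a single regular attribute x, as in the process above, the only product generated this way is x * x, which is exactly the quadratic term the linear model needs.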

 9 
 on: July 25, 2014, 07:27:41 AM 
Started by austincapobianco - Last post by Marco Boeck
Hi,

please follow the guidelines mentioned here: http://rapid-i.com/rapidforum/index.php/topic,4654.0.html
In particular, post the XML of your process so far (or a simple example that demonstrates the problem).
It's a bit hard to help when all the information we get is "xyz is not working for me" :-)

Regards,
Marco

 10 
 on: July 25, 2014, 01:39:39 AM 
Started by austincapobianco - Last post by austincapobianco
I want to be able to follow links that do NOT have a particular string in them. Is there any way to accomplish this? I can't find anything about this on the internet anywhere.
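
If the link rule accepts a regular expression, one general technique for "does NOT contain a string" is a negative lookahead. The sketch below demonstrates the pattern in Python; whether the crawling operator in question accepts such a pattern is an assumption that would need to be checked. The URLs, the forbidden string and the function name are made up for illustration.

```python
import re


def links_without(urls, forbidden):
    """Keep only the URLs that do not contain `forbidden` anywhere."""
    # ^(?!.*forbidden).*$ matches a whole string only if `forbidden`
    # cannot be found at any position after the start.
    pattern = re.compile(r"^(?!.*" + re.escape(forbidden) + r").*$")
    return [u for u in urls if pattern.match(u)]


# Hypothetical links found on a page.
urls = [
    "http://example.com/articles/1",
    "http://example.com/login?next=/articles/1",
    "http://example.com/articles/2",
]
kept = links_without(urls, "login")
print(kept)
```

The same lookahead syntax is supported by Java's regular expressions, so the pattern itself should carry over to Java-based tools that match URLs against a regex.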
