Author Topic: Time series forecast (with Rapid Miner)  (Read 5453 times)
DaiWizard
Newbie
*
Posts: 4


« on: June 12, 2013, 11:31:56 AM »

Hi!

I've set up a model exactly as described by Thomas Ott of 'neuralmarkettrends' in videos 8-10 - and it's working well so far.

But what I still need is the probability of the predicted label for each prediction (horizon = 1). The model only gives the average values, in the form of
prediction_trend_accuracy: 0.807 +/- 0.067 (mikro: 0.807).


Thanks for your help!

 
wessel
Hero Member
*****
Posts: 558


« Reply #1 on: June 12, 2013, 12:23:03 PM »

Hello.

I'm now using Google to find the video you describe.
Next time please use a direct link to the video that is of interest.
Video link:
https://www.youtube.com/watch?v=UmGIGEJMmN8

Can you upload your process?

As far as I understand the process is as follows:
- Order your data by date
- Split your data into two parts
- Use data before date X for training, use data after date X for testing.
- Features for training are created using windowing
- SVM is used as learner
* This process does not deal with horizons very well. neuralmarkettrends1 is aware of this, but did not want to complicate his video.
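
The windowing and fixed-split steps listed above can be sketched outside RapidMiner roughly as follows. This is only an illustration in Python, not the Windowing operator itself: the function names are made up, and it assumes that horizon = 1 means "the value directly after the window".

```python
def make_windows(series, window_size=3, horizon=1):
    """Turn a 1-D series into (features, label) rows.

    Each row uses `window_size` consecutive values as features and the value
    `horizon` steps after the end of the window as the label.
    (Assumption: horizon=1 is the value directly after the window.)
    """
    rows = []
    last_start = len(series) - window_size - horizon
    for start in range(last_start + 1):
        window = series[start:start + window_size]
        label = series[start + window_size + horizon - 1]
        rows.append((list(window), label))
    return rows


def split_by_position(rows, split_index):
    """Fixed split: rows before split_index are for training, the rest for testing."""
    return rows[:split_index], rows[split_index:]


# The series must already be ordered by date.
series = [1, 2, 3, 4, 5, 6, 7, 8]
rows = make_windows(series, window_size=3, horizon=1)
train, test = split_by_position(rows, split_index=3)
```

With a larger horizon the label moves further away from the window, so fewer rows can be built from the same series.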

Now to answer your question:
My suggestion would be to rescale the absolute error so that it falls into the range 0 to 1, and to use this as a measure of probability.
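
A minimal Python sketch of one way to read this suggestion (an assumption, not something taken from the RapidMiner process; the function names are hypothetical). It min-max rescales the absolute errors of past predictions and turns them into a score where 1 means "smallest error seen". Note that this is a heuristic score, not a calibrated probability, and it needs the true labels, so it can only be computed for predictions whose outcome is already known.

```python
def rescale_errors(preds, labels):
    """Min-max rescale |pred - label| into the range 0..1."""
    abs_err = [abs(p - y) for p, y in zip(preds, labels)]
    lo, hi = min(abs_err), max(abs_err)
    span = hi - lo
    if span == 0:
        # All errors are identical, so there is nothing to rank.
        return [0.0 for _ in abs_err]
    return [(e - lo) / span for e in abs_err]


def pseudo_confidence(preds, labels):
    """1 = smallest observed error, 0 = largest observed error."""
    return [1.0 - s for s in rescale_errors(preds, labels)]


scores = pseudo_confidence([1.0, 2.0, 4.0], [1.0, 3.0, 8.0])
```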

This is the best answer I can give right now.
You need to provide better information to get a better answer.

Best regards,

Wessel
« Last Edit: June 12, 2013, 12:35:00 PM by wessel »
DaiWizard
Newbie
*
Posts: 4


« Reply #2 on: June 12, 2013, 01:36:22 PM »

Thank you wessel for your answer!

You are right, the question was a bit too imprecise. However, you got it right: that's the way I'm doing it.

Unfortunately, I don't know exactly what to do with this part of your answer: "Now to answer your question:
My suggestion would be to rescale absolute error to fall into range 0 to 1, and use this as a measure of probability".

Where do I get the absolute error from?

Thank you in advance!
wessel
Hero Member
*****
Posts: 558


« Reply #3 on: June 12, 2013, 05:28:58 PM »

Using this process you can define any performance measure you want.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.008">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.3.008" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="generate_data" compatibility="5.3.008" expanded="true" height="60" name="Gen TS" width="90" x="45" y="30">
        <parameter key="target_function" value="driller oscillation timeseries"/>
      </operator>
      <operator activated="true" class="generate_attributes" compatibility="5.3.008" expanded="true" height="76" name="Create Sum" width="90" x="180" y="30">
        <list key="function_descriptions">
          <parameter key="sum" value="str(11*att1+22*att2+33*att3+44*att4+att5)"/>
        </list>
      </operator>
      <operator activated="true" class="guess_types" compatibility="5.3.008" expanded="true" height="76" name="Guess Types" width="90" x="315" y="30"/>
      <operator activated="true" class="select_attributes" compatibility="5.3.008" expanded="true" height="76" name="Select Sum" width="90" x="450" y="30">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="sum"/>
      </operator>
      <operator activated="true" class="normalize" compatibility="5.3.008" expanded="true" height="94" name="Normalize" width="90" x="585" y="30">
        <parameter key="method" value="range transformation"/>
      </operator>
      <operator activated="true" class="series:windowing" compatibility="5.3.000" expanded="true" height="76" name="Win 3 2" width="90" x="720" y="30">
        <parameter key="window_size" value="3"/>
        <parameter key="create_label" value="true"/>
        <parameter key="label_attribute" value="sum"/>
        <parameter key="horizon" value="2"/>
      </operator>
      <operator activated="true" class="series:predict_series" compatibility="5.3.000" expanded="true" height="60" name="Predict: 22 5 22" width="90" x="45" y="120">
        <parameter key="window_width" value="15"/>
        <parameter key="horizon" value="2"/>
        <parameter key="max_training_set_size" value="15"/>
        <process expanded="true">
          <operator activated="true" class="relevance_vector_machine" compatibility="5.3.008" expanded="true" height="76" name="Relevance Vector Machine" width="90" x="45" y="30"/>
          <connect from_port="window example set" to_op="Relevance Vector Machine" to_port="training set"/>
          <connect from_op="Relevance Vector Machine" from_port="model" to_port="prediction model"/>
          <portSpacing port="source_window example set" spacing="0"/>
          <portSpacing port="sink_prediction model" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="rename" compatibility="5.3.008" expanded="true" height="76" name="Rename" width="90" x="180" y="120">
        <parameter key="old_name" value="prediction(label)"/>
        <parameter key="new_name" value="pred"/>
        <list key="rename_additional_attributes"/>
      </operator>
      <operator activated="true" class="generate_attributes" compatibility="5.3.008" expanded="true" height="76" name="Generate Attributes" width="90" x="315" y="120">
        <list key="function_descriptions">
          <parameter key="pred_times_label" value="pred*label"/>
          <parameter key="pred_times_label_greater_0" value="if(pred*label&gt;=0, 1, 0)"/>
          <parameter key="abs_pred_minus_label" value="abs(pred-label)"/>
        </list>
      </operator>
      <operator activated="true" class="extract_performance" compatibility="5.3.008" expanded="true" height="76" name="Performance" width="90" x="469" y="119">
        <parameter key="performance_type" value="statistics"/>
        <parameter key="attribute_name" value="abs_pred_minus_label"/>
      </operator>
      <connect from_op="Gen TS" from_port="output" to_op="Create Sum" to_port="example set input"/>
      <connect from_op="Create Sum" from_port="example set output" to_op="Guess Types" to_port="example set input"/>
      <connect from_op="Guess Types" from_port="example set output" to_op="Select Sum" to_port="example set input"/>
      <connect from_op="Select Sum" from_port="example set output" to_op="Normalize" to_port="example set input"/>
      <connect from_op="Normalize" from_port="example set output" to_op="Win 3 2" to_port="example set input"/>
      <connect from_op="Win 3 2" from_port="example set output" to_op="Predict: 22 5 22" to_port="example set"/>
      <connect from_op="Predict: 22 5 22" from_port="example set" to_op="Rename" to_port="example set input"/>
      <connect from_op="Rename" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
      <connect from_op="Generate Attributes" from_port="example set output" to_op="Performance" to_port="example set"/>
      <connect from_op="Performance" from_port="performance" to_port="result 1"/>
      <connect from_op="Performance" from_port="example set" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
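
For readers who do not have RapidMiner at hand, the three attributes created by the "Generate Attributes" operator in the process above can be reproduced in a few lines of Python. This is only a sketch: the attribute names are copied from the XML, while the function names and the summary keys are made up for illustration.

```python
def derived_attributes(pred, label):
    """Per-example attributes, mirroring the expressions in the XML."""
    product = pred * label
    return {
        "pred_times_label": product,
        # 1 when prediction and label have the same sign (or one of them is 0)
        "pred_times_label_greater_0": 1 if product >= 0 else 0,
        "abs_pred_minus_label": abs(pred - label),
    }


def summarize(preds, labels):
    """Average the per-example attributes over all examples."""
    rows = [derived_attributes(p, y) for p, y in zip(preds, labels)]
    n = len(rows)
    return {
        "sign_agreement_rate": sum(r["pred_times_label_greater_0"] for r in rows) / n,
        "mean_abs_error": sum(r["abs_pred_minus_label"] for r in rows) / n,
    }


summary = summarize([1.0, -2.0, 3.0], [2.0, 1.0, 3.0])
```

The sign check is only informative when the values can be negative (for example returns or differences). As far as I can tell, a series that has been range-normalized to 0..1 will almost always give a non-negative product.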
« Last Edit: June 12, 2013, 05:31:05 PM by wessel »
wessel
Hero Member
*****
Posts: 558


« Reply #4 on: June 12, 2013, 06:05:40 PM »

You should get a result looking like this:
(I have problems uploading images and will add the image later. For now, just go into the results data set and plot "pred" and "label", and maybe "abs_pred_minus_label".)

Try to figure out why absolute error is different from average(abs_pred_minus_label).
Also note that I'm not using a fixed split. Instead I'm using a sliding window validation, because this is the proper way to validate time series models.
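
The difference between a fixed split and a sliding window validation can be sketched in Python as follows. This is a simplified illustration, not the Sliding Window Validation operator: the function names are hypothetical, the exact horizon convention is an assumption, and the "model" is just the mean of the training labels so that the example stays self-contained.

```python
def sliding_window_splits(n_rows, training_width, test_width=1, horizon=1, step=1):
    """Yield (train_indices, test_indices) pairs that move forward in time."""
    start = 0
    while True:
        train_end = start + training_width        # exclusive
        test_start = train_end + horizon - 1      # assumption: horizon=1 is the next row
        test_end = test_start + test_width        # exclusive
        if test_end > n_rows:
            break
        yield list(range(start, train_end)), list(range(test_start, test_end))
        start += step


def abs_error_per_window(labels, training_width, horizon=1):
    """Train on each window, predict one later row, and record the absolute error."""
    errors = []
    for train_idx, test_idx in sliding_window_splits(len(labels), training_width, 1, horizon):
        # Placeholder model: predict the mean of the training labels.
        prediction = sum(labels[i] for i in train_idx) / len(train_idx)
        errors.append(abs(prediction - labels[test_idx[0]]))
    return errors


splits = list(sliding_window_splits(6, training_width=3))
errors = abs_error_per_window([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], training_width=3)
```

Every test row lies after the rows it was trained on, and the model is retrained for each window, which is what makes this kind of validation suitable for time series.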




This XML shows how you can use the Regression Performance Operator.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.008">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.3.008" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="subprocess" compatibility="5.3.008" expanded="true" height="76" name="Generate Data (6)" width="90" x="45" y="30">
        <process expanded="true">
          <operator activated="true" class="generate_data" compatibility="5.3.008" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30">
            <parameter key="target_function" value="driller oscillation timeseries"/>
            <parameter key="number_examples" value="200"/>
          </operator>
          <operator activated="true" class="generate_attributes" compatibility="5.3.008" expanded="true" height="76" name="Generate Sum" width="90" x="180" y="30">
            <list key="function_descriptions">
              <parameter key="sum" value="str(11*att1+22*att2+33*att3+44*att4+att5)"/>
            </list>
          </operator>
          <operator activated="true" class="select_attributes" compatibility="5.3.008" expanded="true" height="76" name="Select Sum" width="90" x="319" y="29">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="sum"/>
          </operator>
          <operator activated="true" class="parse_numbers" compatibility="5.3.008" expanded="true" height="76" name="Parse Numbers (2)" width="90" x="441" y="26">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="sum"/>
          </operator>
          <operator activated="true" class="normalize" compatibility="5.3.008" expanded="true" height="94" name="Normalize" width="90" x="561" y="27">
            <parameter key="method" value="range transformation"/>
          </operator>
          <operator activated="true" class="rename" compatibility="5.3.008" expanded="true" height="76" name="Rename Label" width="90" x="699" y="28">
            <parameter key="old_name" value="sum"/>
            <parameter key="new_name" value="label"/>
            <list key="rename_additional_attributes"/>
          </operator>
          <connect from_op="Generate Data" from_port="output" to_op="Generate Sum" to_port="example set input"/>
          <connect from_op="Generate Sum" from_port="example set output" to_op="Select Sum" to_port="example set input"/>
          <connect from_op="Select Sum" from_port="example set output" to_op="Parse Numbers (2)" to_port="example set input"/>
          <connect from_op="Parse Numbers (2)" from_port="example set output" to_op="Normalize" to_port="example set input"/>
          <connect from_op="Normalize" from_port="example set output" to_op="Rename Label" to_port="example set input"/>
          <connect from_op="Rename Label" from_port="example set output" to_port="out 1"/>
          <portSpacing port="source_in 1" spacing="0"/>
          <portSpacing port="sink_out 1" spacing="0"/>
          <portSpacing port="sink_out 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="series:windowing" compatibility="5.3.000" expanded="true" height="76" name="Win 3 2" width="90" x="187" y="32">
        <parameter key="window_size" value="3"/>
        <parameter key="create_label" value="true"/>
        <parameter key="label_attribute" value="label"/>
        <parameter key="horizon" value="2"/>
      </operator>
      <operator activated="true" class="multiply" compatibility="5.3.008" expanded="true" height="94" name="Multiply" width="90" x="309" y="34"/>
      <operator activated="true" class="series:sliding_window_validation" compatibility="5.3.000" expanded="true" height="112" name="Validation" width="90" x="515" y="30">
        <parameter key="training_window_width" value="15"/>
        <parameter key="test_window_width" value="1"/>
        <parameter key="horizon" value="2"/>
        <parameter key="average_performances_only" value="false"/>
        <process expanded="true">
          <operator activated="true" class="relevance_vector_machine" compatibility="5.3.008" expanded="true" height="76" name="Relevance VM (2)" width="90" x="152" y="50"/>
          <connect from_port="training" to_op="Relevance VM (2)" to_port="training set"/>
          <connect from_op="Relevance VM (2)" from_port="model" to_port="model"/>
          <portSpacing port="source_training" spacing="0"/>
          <portSpacing port="sink_model" spacing="0"/>
          <portSpacing port="sink_through 1" spacing="0"/>
        </process>
        <process expanded="true">
          <operator activated="true" class="apply_model" compatibility="5.3.008" expanded="true" height="76" name="Apply Model" width="90" x="91" y="12">
            <list key="application_parameters"/>
          </operator>
          <operator activated="true" class="performance_regression" compatibility="5.3.008" expanded="true" height="76" name="Performance" width="90" x="282" y="61">
            <parameter key="root_mean_squared_error" value="false"/>
            <parameter key="absolute_error" value="true"/>
          </operator>
          <connect from_port="model" to_op="Apply Model" to_port="model"/>
          <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
          <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
          <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
          <portSpacing port="source_model" spacing="0"/>
          <portSpacing port="source_test set" spacing="0"/>
          <portSpacing port="source_through 1" spacing="0"/>
          <portSpacing port="sink_averagable 1" spacing="0"/>
          <portSpacing port="sink_averagable 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="series:predict_series" compatibility="5.3.000" expanded="true" height="60" name="Predict: 22 5 22" width="90" x="78" y="331">
        <parameter key="window_width" value="15"/>
        <parameter key="horizon" value="2"/>
        <parameter key="max_training_set_size" value="15"/>
        <process expanded="true">
          <operator activated="true" class="relevance_vector_machine" compatibility="5.3.008" expanded="true" height="76" name="Relevance VM" width="90" x="412" y="29"/>
          <connect from_port="window example set" to_op="Relevance VM" to_port="training set"/>
          <connect from_op="Relevance VM" from_port="model" to_port="prediction model"/>
          <portSpacing port="source_window example set" spacing="0"/>
          <portSpacing port="sink_prediction model" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="rename" compatibility="5.3.008" expanded="true" height="76" name="Rename" width="90" x="263" y="330">
        <parameter key="old_name" value="prediction(label)"/>
        <parameter key="new_name" value="pred"/>
        <list key="rename_additional_attributes"/>
      </operator>
      <operator activated="true" class="generate_attributes" compatibility="5.3.008" expanded="true" height="76" name="Generate Attributes" width="90" x="439" y="335">
        <list key="function_descriptions">
          <parameter key="abs_pred_minus_label" value="abs(pred-label)"/>
        </list>
      </operator>
      <operator activated="true" class="extract_performance" compatibility="5.3.008" expanded="true" height="76" name="Performance (2)" width="90" x="657" y="349">
        <parameter key="performance_type" value="statistics"/>
        <parameter key="attribute_name" value="abs_pred_minus_label"/>
      </operator>
      <connect from_op="Generate Data (6)" from_port="out 1" to_op="Win 3 2" to_port="example set input"/>
      <connect from_op="Win 3 2" from_port="example set output" to_op="Multiply" to_port="input"/>
      <connect from_op="Multiply" from_port="output 1" to_op="Validation" to_port="training"/>
      <connect from_op="Multiply" from_port="output 2" to_op="Predict: 22 5 22" to_port="example set"/>
      <connect from_op="Validation" from_port="averagable 1" to_port="result 1"/>
      <connect from_op="Predict: 22 5 22" from_port="example set" to_op="Rename" to_port="example set input"/>
      <connect from_op="Rename" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
      <connect from_op="Generate Attributes" from_port="example set output" to_op="Performance (2)" to_port="example set"/>
      <connect from_op="Performance (2)" from_port="performance" to_port="result 2"/>
      <connect from_op="Performance (2)" from_port="example set" to_port="result 3"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
      <portSpacing port="sink_result 4" spacing="0"/>
    </process>
  </operator>
</process>
« Last Edit: June 12, 2013, 06:08:57 PM by wessel »
DaiWizard
Newbie
*
Posts: 4


« Reply #5 on: June 12, 2013, 11:10:06 PM »

Dear Wessel!

Thank you so much for your answer. Because I'm a beginner, I don't know how to import your XML as a new operator into my process from videos 8 to 10, and I'm not sure where in the chain to place this operator.



Best regards, Dai Wizard!



wessel
Hero Member
*****
Posts: 558


« Reply #6 on: June 12, 2013, 11:42:52 PM »

Click view.

Create new perspective.

In show view, tick XML, untick all others.

In XML tab:
Paste XML code

Click the green V (check mark) symbol.

Return to your standard view.
DaiWizard
Newbie
*
Posts: 4


« Reply #7 on: June 15, 2013, 10:08:28 PM »

Hi!

Thank you, wessel, for your tips, but I'm afraid it looks too complicated for me; I don't think I can fully understand it. Therefore I've created a PDF file that you can view using this link: http://www.professor-heusenstamm.com/model.pdf

Bild 1 shows my original process, Bild 2 is the content of the validation operator.
Bild 3 shows the general performance output.

Bild 4 is my latest progress :-) I've inserted the "Log" operator and defined in it the values for performance and prediction accuracy.

Bild 5 shows the result of the latter.

My question is: did I insert the Log operator at the correct position in the process (Bild 4), so that it delivers the performance of the predicted n+1 value, which is the content of "Read Excel (2)", or do I have to rearrange or add something?

As usual, I'm looking forward to anybody's comments.

wessel
Hero Member
*****
Posts: 558


« Reply #8 on: June 16, 2013, 06:51:02 PM »

My process (I call this a process, not a model) looks like this:

http://i.snag.gy/STABy.jpg

I used this button to create a new perspective (I named this perspective XML):
http://i.snag.gy/A53kc.jpg

So now my screen looks like:
http://i.snag.gy/6QXgV.jpg

This makes it easy to share processes.