Author Topic: Predict Values  (Read 184 times)
chelanz
« on: March 16, 2014, 02:26:42 PM »

This is a repost. The post below isn't mine, but we have the same dilemma and no one replied to the original. I hope this post gets answered. Thank you! All efforts will be much appreciated! Cheers, Chel

Hi,

I am doing an academic project on stock prediction. While trying to figure out how SVM works, I came across RapidMiner. I have been using it for the last two hours and cannot figure out how to predict values for future dates (horizon > 1). I increased the horizon, but it then shows me one future value for every value in the input data (if the horizon is 5, it shows one value per input, which is supposed to be the predicted value for the 5th day after that input). Is there any way to display the future values in proper sequence, e.g. day 1 - predicted value 1, day 2 - predicted value 2, etc.?
Also, is there any way to improve the prediction accuracy?
Also, can I somehow incorporate just this prediction module in the Java code for my GUI, or should I call RapidMiner explicitly from my Java program? (I only want to use the SVM prediction module, not all the features of RapidMiner.)
It would be great if you could help me out.
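
To make the horizon behaviour concrete, here is a minimal Python sketch of how I understand windowing with a horizon: each window of past values is paired with the single value that lies `horizon` steps after the window's last value. The function name `make_windows` is hypothetical and this is not RapidMiner's actual implementation, only an illustration of the alignment. The numbers are the first ten values of the third column of train.csv.

```python
from typing import List, Tuple


def make_windows(series: List[float], window_size: int,
                 horizon: int) -> List[Tuple[List[float], float]]:
    """Pair each window with the value `horizon` steps after its last element.

    Hypothetical helper for illustration; not RapidMiner's own code.
    """
    pairs = []
    # Last start index for which the label still falls inside the series.
    last_start = len(series) - window_size - horizon
    for start in range(last_start + 1):
        window = series[start:start + window_size]
        label = series[start + window_size - 1 + horizon]
        pairs.append((window, label))
    return pairs


# First ten values of the third column of train.csv.
prices = [564.08, 562.48, 562.76, 562.3, 562.17,
          562.08, 561.658, 561.52, 560.548, 560.19]

pairs = make_windows(prices, window_size=1, horizon=5)
for window, label in pairs:
    print(window, "->", label)
```

With `window_size=1` and `horizon=5`, every input row gets exactly one label, namely the value five rows later. That matches what I see: one prediction per input rather than a sequence of five.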

Here is the XML of my test process:


<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <context>
    <input>
      <location/>
    </input>
    <output>
      <location/>
      <location/>
      <location/>
      <location/>
    </output>
    <macros/>
  </context>
  <operator activated="true" class="process" expanded="true" name="Process">
    <process expanded="true" height="423" width="763">
      <operator activated="true" class="read_csv" expanded="true" height="60" name="Read CSV" width="90" x="45" y="30">
        <parameter key="file_name" value="C:\Users\Rj\Downloads\train.csv"/>
      </operator>
      <operator activated="true" class="set_role" expanded="true" height="76" name="Set Role" width="90" x="179" y="30">
        <parameter key="name" value="1"/>
        <parameter key="target_role" value="id"/>
      </operator>
      <operator activated="true" class="series:windowing" expanded="true" height="76" name="Windowing" width="90" x="313" y="30">
        <parameter key="horizon" value="5"/>
        <parameter key="window_size" value="1"/>
        <parameter key="create_label" value="true"/>
        <parameter key="label_attribute" value="564.08"/>
      </operator>
      <operator activated="true" class="series:sliding_window_validation" expanded="true" height="112" name="Validation" width="90" x="447" y="30">
        <parameter key="training_window_width" value="5"/>
        <parameter key="training_window_step_size" value="1"/>
        <parameter key="test_window_width" value="5"/>
        <process expanded="true">
          <operator activated="true" class="nominal_to_numerical" expanded="true" height="94" name="Nominal to Numerical" width="90" x="45" y="255"/>
          <operator activated="true" class="support_vector_machine" expanded="true" height="112" name="SVM" width="90" x="112" y="75">
            <parameter key="kernel_degree" value="5.0"/>
            <parameter key="C" value="1.0"/>
          </operator>
          <connect from_port="training" to_op="Nominal to Numerical" to_port="example set input"/>
          <connect from_op="Nominal to Numerical" from_port="example set output" to_op="SVM" to_port="training set"/>
          <connect from_op="SVM" from_port="model" to_port="model"/>
          <portSpacing port="source_training" spacing="0"/>
          <portSpacing port="sink_model" spacing="0"/>
          <portSpacing port="sink_through 1" spacing="0"/>
        </process>
        <process expanded="true">
          <operator activated="true" class="apply_model" expanded="true" height="76" name="Apply Model" width="90" x="66" y="30">
            <list key="application_parameters"/>
          </operator>
          <operator activated="true" class="series:forecasting_performance" expanded="true" height="76" name="Performance" width="90" x="195" y="25">
            <parameter key="horizon" value="1"/>
          </operator>
          <connect from_port="model" to_op="Apply Model" to_port="model"/>
          <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
          <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
          <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
          <portSpacing port="source_model" spacing="0"/>
          <portSpacing port="source_test set" spacing="0"/>
          <portSpacing port="source_through 1" spacing="0"/>
          <portSpacing port="sink_averagable 1" spacing="0"/>
          <portSpacing port="sink_averagable 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="read_csv" expanded="true" height="60" name="Read CSV (2)" width="90" x="45" y="255">
        <parameter key="file_name" value="C:\Users\Rj\Downloads\test.csv"/>
      </operator>
      <operator activated="true" class="set_role" expanded="true" height="76" name="Set Role (2)" width="90" x="179" y="255">
        <parameter key="name" value="1"/>
        <parameter key="target_role" value="id"/>
      </operator>
      <operator activated="true" class="series:windowing" expanded="true" height="76" name="Windowing (2)" width="90" x="313" y="255">
        <parameter key="window_size" value="1"/>
        <parameter key="label_attribute" value="562.21"/>
      </operator>
      <operator activated="true" class="apply_model" expanded="true" height="76" name="Apply Model (2)" width="90" x="492" y="261">
        <list key="application_parameters"/>
      </operator>
      <connect from_op="Read CSV" from_port="output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Windowing" to_port="example set input"/>
      <connect from_op="Windowing" from_port="example set output" to_op="Validation" to_port="training"/>
      <connect from_op="Validation" from_port="model" to_op="Apply Model (2)" to_port="model"/>
      <connect from_op="Validation" from_port="training" to_port="result 1"/>
      <connect from_op="Validation" from_port="averagable 1" to_port="result 2"/>
      <connect from_op="Read CSV (2)" from_port="output" to_op="Set Role (2)" to_port="example set input"/>
      <connect from_op="Set Role (2)" from_port="example set output" to_op="Windowing (2)" to_port="example set input"/>
      <connect from_op="Windowing (2)" from_port="example set output" to_op="Apply Model (2)" to_port="unlabelled data"/>
      <connect from_op="Apply Model (2)" from_port="labelled data" to_port="result 3"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
      <portSpacing port="sink_result 4" spacing="0"/>
    </process>
  </operator>
</process>


The CSV files I used contain the following data:
train.csv
1   GOOG   564.08   564.78   561.01   565.18
2   GOOG   562.48   564.78   561.01   565.18
3   GOOG   562.76   559.46   558.71   564.66
4   GOOG   562.3   559.46   558.71   564.66
5   GOOG   562.17   559.46   558.71   564.66
6   GOOG   562.08   559.46   558.71   564.66
7   GOOG   561.658   559.46   558.71   564.66
8   GOOG   561.52   559.46   558.71   564.66
9   GOOG   560.548   559.46   558.71   564.66
10   GOOG   560.19   559.46   556.5   564.66
11   GOOG   562.77   563.75   562.4   564.22
12   GOOG   564.95   563.75   562.21   565.85
13   GOOG   566.87   563.75   562.21   568
14   GOOG   571.01   563.75   562.21   571.22
15   GOOG   571.89   563.75   562.21   571.909
16   GOOG   570.8115   563.75   562.21   572
17   GOOG   567.34   563.75   562.21   572
18   GOOG   569.2   563.75   562.21   572
19   GOOG   570.73   563.75   562.21   572
20   GOOG   570.13   563.75   562.21   572
21   GOOG   572.16   563.75   562.21   572.2

test.csv
1   GOOG   575.22   563.75   562.21   575.25
2   GOOG   575.16   563.75   562.21   578.5

I want to predict future values (for the next 10 days) using the input from test.csv. Is there any way to predict all 10 values (with as high an accuracy as possible) and display them in sequence?
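
The kind of output I am after could be sketched like this in Python, assuming a "direct" strategy: one model is trained per horizon (1 to 10), and each model predicts its own day from the latest known value. The names `fit_line` and `direct_forecast` are hypothetical, and the least-squares line is only a stand-in for the SVM so the sketch runs without any library. I am assuming the third column is the series to forecast.

```python
from typing import Callable, List


def fit_line(xs: List[float], ys: List[float]) -> Callable[[float], float]:
    """Least-squares line y = slope * x + intercept (stand-in for the SVM)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # All inputs identical: fall back to predicting the mean label.
        return lambda x: mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def direct_forecast(series: List[float], last_value: float,
                    max_horizon: int) -> List[float]:
    """Train one model per horizon h and predict h steps past last_value."""
    forecasts = []
    for h in range(1, max_horizon + 1):
        xs = series[:-h]   # window of size 1: the current value
        ys = series[h:]    # label: the value h steps later
        model = fit_line(xs, ys)
        forecasts.append(model(last_value))
    return forecasts


# Third column of train.csv.
train = [564.08, 562.48, 562.76, 562.3, 562.17, 562.08, 561.658,
         561.52, 560.548, 560.19, 562.77, 564.95, 566.87, 571.01,
         571.89, 570.8115, 567.34, 569.2, 570.73, 570.13, 572.16]

# Latest known value: third column of the last row of test.csv.
forecasts = direct_forecast(train, last_value=575.16, max_horizon=10)
for day, value in enumerate(forecasts, start=1):
    print(f"day {day} - predicted value {value:.2f}")
```

With only 21 training rows, the models for the larger horizons are fitted on very few pairs, so this shows the output layout I want rather than a way to get accurate predictions.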