Author Topic: "Read CSV" to example set  (Read 1981 times)
Monaco
Newbie
*
Posts: 5


« on: March 28, 2011, 06:38:47 PM »

Hi,

I am just starting to experiment with RapidMiner and am having trouble with the "Read CSV" operator.
I can send the data to the "res" port (and see the ExampleSet), but when other operators require an example set as input, no data is available. Is this a limitation of Read CSV, or is there a way to make the data available as an example set?
Regards.
Logged
haddock
Hero Member
*****
Posts: 853



« Reply #1 on: March 28, 2011, 06:47:14 PM »

Hi, and welcome!

Start RapidMiner and go to Help -> Tutorial. That will load runnable examples, so you get some idea of what RM can and cannot do. Believe me, it saves time in the long run!

Logged

Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?

T.S.Eliot ~ Choruses from the Rock 1934
colo
Full Member
***
Posts: 245


« Reply #2 on: March 29, 2011, 10:29:27 AM »

Hi Monaco,

If your operator delivers an example set to the result port of the process, it will do the same for other operators. Did you check the connection from the output port of "Read CSV" to the input port of the following operator? Perhaps you could post your process (the code from the XML tab) here so that we can spot possible mistakes in the process design.

Regards
Matthias
Logged
Monaco
Newbie
*
Posts: 5


« Reply #3 on: March 31, 2011, 05:27:35 PM »

Hi Colo,

Many thanks for your quick reply.
Here is the code (nothing fancy). It doesn't work with Read CSV but works well with Read Excel or Retrieve.
When you modify a file that has been stored as a data table in the repository, do you know how to update this data table automatically?

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.006">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.1.006" expanded="true" name="Process">
    <process expanded="true" height="426" width="673">
      <operator activated="true" class="read_csv" compatibility="5.1.006" expanded="true" height="60" name="Read CSV" width="90" x="45" y="120">
        <parameter key="csv_file" value="D:\Data.csv"/>
        <parameter key="date_format" value="yyyyMMdd"/>
        <list key="annotations">
          <parameter key="0" value="Name"/>
        </list>
        <parameter key="locale" value="French"/>
        <list key="data_set_meta_data_information">
          <parameter key="0" value="Date.true.date.id"/>
          <parameter key="1" value="Data.true.integer.attribute"/>
        </list>
      </operator>
      <operator activated="true" class="series:windowing" compatibility="5.1.002" expanded="true" height="76" name="Windowing" width="90" x="179" y="30">
        <parameter key="horizon" value="1"/>
        <parameter key="window_size" value="1"/>
        <parameter key="create_label" value="true"/>
        <parameter key="label_attribute" value="Data"/>
      </operator>
      <connect from_op="Read CSV" from_port="output" to_op="Windowing" to_port="example set input"/>
      <connect from_op="Windowing" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
« Last Edit: April 04, 2011, 07:05:18 PM by Monaco » Logged
Monaco
Newbie
*
Posts: 5


« Reply #4 on: March 31, 2011, 05:40:06 PM »

Hi Haddock,

Thank you for your insight. I studied this tutorial last week, and the resource is indeed amazingly powerful and educational. But I haven't found an answer to my current problem. I've posted the code, but I don't think it will help. You can try for yourself with a very simple CSV file: when you hover the mouse cursor over the operator output, it indicates "number of examples=-1".
Regards
Logged
Ingo Mierswa
Administrator
Hero Member
*****
Posts: 1226



« Reply #5 on: March 31, 2011, 05:58:22 PM »

Ahem, just a quick question: did you actually execute the process (i.e. press the "Play" icon in the toolbar)? Does it work then?

Cheers,
Ingo
Logged

Did you try our new Marketplace? Upload or download new Extensions, add comments, and organize your operators. Have a look at  http://marketplace.rapid-i.com
Monaco
Newbie
*
Posts: 5


« Reply #6 on: March 31, 2011, 06:19:42 PM »

Hi Ingo,

When I execute the process, it works fine for displaying the data (even though the number of examples is shown as -1). But when I add a Windowing operator, which requires more examples than the horizon (set to 1), it fails.
Cheers
Logged
Ingo Mierswa
Administrator
Hero Member
*****
Posts: 1226



« Reply #7 on: March 31, 2011, 06:31:22 PM »

Ok, then try the following:

1. Load the data with "Read CSV", add an operator "Store" and save the data set directly again in your repository.
2. Drag the freshly saved data from your repository (it will be transformed into a new operator named "Retrieve" which will load the data for you from the repository)
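
The two steps above could be sketched as process XML along the following lines. This is only a rough sketch for step 1 (Read CSV followed by Store), not a tested process: the "Read CSV" parameters are copied from the process posted earlier in this thread, the repository path //LocalRepository/data/Data is a made-up placeholder, and the "store" class name, its "repository_entry" parameter and its "input"/"through" ports are written from memory of RapidMiner 5, so please check them against the XML tab of your own installation.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.006">
  <operator activated="true" class="process" compatibility="5.1.006" expanded="true" name="Process">
    <process expanded="true">
      <!-- Read the CSV file; parameters copied from the process posted in this thread -->
      <operator activated="true" class="read_csv" compatibility="5.1.006" expanded="true" name="Read CSV">
        <parameter key="csv_file" value="D:\Data.csv"/>
        <parameter key="date_format" value="yyyyMMdd"/>
        <list key="annotations">
          <parameter key="0" value="Name"/>
        </list>
        <parameter key="locale" value="French"/>
        <list key="data_set_meta_data_information">
          <parameter key="0" value="Date.true.date.id"/>
          <parameter key="1" value="Data.true.integer.attribute"/>
        </list>
      </operator>
      <!-- Save the data (and its meta data) in the repository.
           The repository path below is a hypothetical placeholder. -->
      <operator activated="true" class="store" compatibility="5.1.006" expanded="true" name="Store">
        <parameter key="repository_entry" value="//LocalRepository/data/Data"/>
      </operator>
      <connect from_op="Read CSV" from_port="output" to_op="Store" to_port="input"/>
      <connect from_op="Store" from_port="through" to_port="result 1"/>
    </process>
  </operator>
</process>
```

For step 2, the analysis process would then start with a "Retrieve" operator pointing at the same repository entry instead of "Read CSV". Re-running this small import process whenever the CSV file changes should refresh the stored data.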

Try again with this data set loaded with "Retrieve". Expected behaviour: everything works as expected. Reason for your confusion: search the forum for "Repository" and "meta data". Best solution for you: book a training at Rapid-I; it will definitely help :D
This would probably also be the best option if you do not know what I mean by "Repository" ;D

Cheers,
Ingo

P.S. (for the more experienced readers here): I never expected that this feature of RapidMiner called "meta data propagation", which is definitely unique and innovative, would cause so much uncertainty for some users. I am open to any suggestions on how we could make the difference between "meta data" and "actual data" clearer, and on how to explain why it is sometimes impossible to provide meta data (as for CSV files).
Logged

colo
Full Member
***
Posts: 245


« Reply #8 on: April 04, 2011, 09:41:57 AM »

Hi Monaco,

Just to be sure: you didn't use the "Window Document" operator after "Read CSV", did you? Which operators did you try?
I had hoped you would post your process including this second operator, so that we can spot possible problems ;)

Regards
Matthias
Logged
Monaco
Newbie
*
Posts: 5


« Reply #9 on: April 04, 2011, 07:28:34 PM »

Hey Ingo,

Just read your post at http://rapid-i.com/rapidforum/index.php/topic,2902.msg11559.html#msg11559
Frequent updates of my CSV files are the reason I don't use the repository (unless there is a way to update it easily and automatically).
I don't understand why the same data can be output when it is in XLS format but not when it is in CSV format. Fortunately I have found alternative ways to deal with this issue properly, but I would have preferred (it's not crucial) to output directly from Read CSV.
Many thanks for your support.

Best regards.
Logged
dragoljub
Full Member
***
Posts: 246


« Reply #10 on: April 04, 2011, 09:40:08 PM »

Read CSV should pass the data correctly, assuming you have set all the attribute types and special attributes correctly. Most of the time Read CSV just produces the raw data; you still need to set things like labels and other special attributes. Your values may also not have been read in as reals or integers but imported as some wrong data type such as polynominal. This can cause all kinds of problems. It might be easier to run an import process right before you run your analysis, to make sure your data is clean.

-Gagi
Logged
SKOM
Newbie
*
Posts: 1


« Reply #11 on: May 29, 2013, 04:10:50 PM »

I've just run into a similar problem: the "Read CSV" output shows number of examples = -1, and one of the subsequent nodes does not work. Since apparently it's a feature and not a bug :P, shouldn't the operator description include something like "recommended use with Store and Retrieve modes"?

Best,
PK
Logged
Marius
Administrator
Hero Member
*****
Posts: 1793



« Reply #12 on: June 11, 2013, 11:08:18 AM »

Good idea. We should probably do a better job of promoting the complete repository-based approach to our users and explain why it is often easier to use than file-based approaches.

Best regards,
Marius
Logged

Please add [SOLVED] to the topic title when your problem has been solved! (do so by editing the first post in the thread and modifying the title)