Author Topic: How to use Read RSS Feed  (Read 1694 times)
mendicott
Newbie
*
Posts: 2


« on: July 04, 2012, 10:43:29 PM »

> http://www.meta-guide.com/home/knowledgebase/best-rapidminer-videos

I've made a webpage of 87 "Best RapidMiner Videos", linked above. After a few *hours* of googling I could find no quickstart, tutorial or examples for the "Read RSS Feed" operator, and none for using RapidMiner with Twitter either. I have searched this forum for "rss" and "twitter" without finding anything helpful.

I want to process web feeds, primarily Twitter feeds. The main thing I want to do is filter out tweets that are similar but not identical (near-duplicates), presumably via classification. I would also like to try using RapidMiner for link metrics.

It is also not clear to me how to access RapidMiner results programmatically, as through an API. RapidMiner would seem much more accessible if it were available in the cloud as a web service. Ultimately, I hope to use RapidMiner in conjunction with Yahoo! Pipes, though I could find no quickstart, tutorial or examples for that combination either.
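
One way to read "filter tweets on non-identical similarity" is near-duplicate removal. The sketch below is only an illustration in plain Python, not a RapidMiner process: it assumes Jaccard similarity over word sets and a threshold of 0.7, neither of which comes from this thread, and the sample tweets and function names are made up.

```python
import re

def word_set(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    if not a and not b:
        return 1.0  # two empty texts are treated as identical
    return len(a & b) / len(a | b)

def drop_near_duplicates(texts, threshold=0.7):
    """Keep a text only if it is below the threshold against every kept text."""
    kept, kept_sets = [], []
    for text in texts:
        words = word_set(text)
        if all(jaccard(words, other) < threshold for other in kept_sets):
            kept.append(text)
            kept_sets.append(words)
    return kept

# Hypothetical sample tweets, for illustration only.
tweets = [
    "RapidMiner 5.3 released today",
    "RT RapidMiner 5.3 released today",
    "Yahoo Pipes tutorial for web feeds",
]

for tweet in drop_near_duplicates(tweets):
    print(tweet)
```

The threshold is the knob to tune: lower values drop more aggressively, higher values only drop tweets that are almost word-for-word copies.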
haddock
Hero Member
*****
Posts: 853



« Reply #1 on: July 05, 2012, 08:19:55 AM »

I Googled "rapidminer Read RSS Feed" and got http://www.myexperiment.org/workflows/1465.html as the first link. It says, "This workflow connects RapidMiner to Twitter and downloads the timeline". Are we using the same Google?

Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?

T.S.Eliot ~ Choruses from the Rock 1934
mendicott
Newbie
*
Posts: 2


« Reply #2 on: July 05, 2012, 02:40:45 PM »

I also found that workflow, but could not get it to work. There seems to be some kind of inconsistency or incompatibility between the "Read RSS Feed" and "Process Documents from Data" operators. Can you reproduce it in a working version? Can you post the steps you took to get it working?
tobyb
Newbie
*
Posts: 11


« Reply #3 on: April 15, 2013, 09:57:38 PM »

I am having the same issue.  Has this been resolved?

Thanks,
Toby
Rene
Newbie
*
Posts: 25


« Reply #4 on: April 17, 2013, 12:03:18 AM »

This is the example that haddock cited. I just added my Twitter feed URL and my user agent to "Read RSS Feed" and it works fine (RapidMiner 5.3.000):
Code:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.3.000" expanded="true" name="Process">
    <process expanded="true" height="394" width="547">
      <operator activated="true" class="web:read_rss" compatibility="5.3.000" expanded="true" height="60" name="Read RSS Feed" width="90" x="45" y="75">
        <parameter key="url" value="https://api.twitter.com/1/statuses/user_timeline.rss?screen_name=lukeanker"/>
        <parameter key="user_agent" value="Mozilla/5.0 (Windows NT 6.1; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0"/>
      </operator>
      <operator activated="true" class="text:process_document_from_data" compatibility="5.3.000" expanded="true" height="76" name="Process Documents from Data" width="90" x="179" y="75">
        <parameter key="prune_method" value="percentual"/>
        <parameter key="prunde_below_percent" value="5.0"/>
        <parameter key="prune_above_percent" value="80.0"/>
        <parameter key="prune_below_rank" value="5.0"/>
        <parameter key="prune_above_rank" value="5.0"/>
        <list key="specify_weights"/>
        <process expanded="true" height="374" width="547">
          <operator activated="true" class="text:tokenize" compatibility="5.3.000" expanded="true" height="60" name="Tokenize" width="90" x="45" y="30"/>
          <operator activated="true" class="text:transform_cases" compatibility="5.3.000" expanded="true" height="60" name="Transform Cases" width="90" x="179" y="30"/>
          <operator activated="true" class="text:filter_by_length" compatibility="5.3.000" expanded="true" height="60" name="Filter Tokens (by Length)" width="90" x="313" y="30">
            <parameter key="min_chars" value="3"/>
          </operator>
          <operator activated="true" class="text:filter_stopwords_english" compatibility="5.3.000" expanded="true" height="60" name="Filter Stopwords (English)" width="90" x="447" y="30"/>
          <connect from_port="document" to_op="Tokenize" to_port="document"/>
          <connect from_op="Tokenize" from_port="document" to_op="Transform Cases" to_port="document"/>
          <connect from_op="Transform Cases" from_port="document" to_op="Filter Tokens (by Length)" to_port="document"/>
          <connect from_op="Filter Tokens (by Length)" from_port="document" to_op="Filter Stopwords (English)" to_port="document"/>
          <connect from_op="Filter Stopwords (English)" from_port="document" to_port="document 1"/>
          <portSpacing port="source_document" spacing="0"/>
          <portSpacing port="sink_document 1" spacing="0"/>
          <portSpacing port="sink_document 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="text:wordlist_to_data" compatibility="5.3.000" expanded="true" height="76" name="WordList to Data" width="90" x="313" y="75"/>
      <connect from_op="Read RSS Feed" from_port="output" to_op="Process Documents from Data" to_port="example set"/>
      <connect from_op="Process Documents from Data" from_port="word list" to_op="WordList to Data" to_port="word list"/>
      <connect from_op="WordList to Data" from_port="word list" to_port="result 1"/>
      <connect from_op="WordList to Data" from_port="example set" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
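
For readers who want to see what the nested operators do to each feed item, here is a rough Python approximation of the same steps: tokenize, lowercase, drop tokens shorter than 3 characters, drop English stopwords, then build a word list. It is a sketch under assumptions, not RapidMiner's implementation: the sample feed, the tiny stopword set, the use of item titles as the text, and the split-on-non-letters tokenizer are all stand-ins chosen for illustration.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample feed standing in for the output of "Read RSS Feed".
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example timeline</title>
    <item><title>RapidMiner reads the RSS feed</title></item>
    <item><title>The feed is processed in RapidMiner</title></item>
  </channel>
</rss>"""

# Tiny illustrative stopword set; not RapidMiner's built-in English list.
STOPWORDS = {"the", "is", "in", "a", "an", "and", "of", "to"}

def item_titles(rss_text):
    """Return the <title> text of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [item.findtext("title", default="") for item in root.iter("item")]

def tokens(text, min_chars=3):
    """Apply the four inner steps in the same order as the process above."""
    words = [w for w in re.split(r"[^A-Za-z]+", text) if w]  # Tokenize
    words = [w.lower() for w in words]                        # Transform Cases
    words = [w for w in words if len(w) >= min_chars]         # Filter Tokens (by Length)
    return [w for w in words if w not in STOPWORDS]           # Filter Stopwords (English)

def word_list(documents):
    """Count total occurrences and document occurrences for each word."""
    total, in_docs = Counter(), Counter()
    for doc in documents:
        toks = tokens(doc)
        total.update(toks)
        in_docs.update(set(toks))
    return total, in_docs

total, in_docs = word_list(item_titles(SAMPLE_RSS))
for word in sorted(total):
    print(word, total[word], in_docs[word])
```

Each printed row is a word followed by its total count and the number of items it appears in, which is roughly the kind of table a word list converted to data is meant to give.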
