 1 
 on: Today at 06:21:28 PM 
Started by fokko - Last post by fokko
Hello,
I have a problem with my text preprocessing. Maybe someone can help me :)

My text looks like this:

T-Mobile US Inc. and two regional carriers, General Communication Inc. in Alaska and CT Cube LP in Texas. The order is subject to review by President Barack Obama.
Commodities
Oil futures rose 67 cents to $93.98 a barrel as U.S. crude supplies dropped, while gold for August delivery climbed $8 to $1,405 an ounce.
Europe
European markets finished sharply lower today with shares in London leading the region. The FTSE 100 was down 2.12% while France's CAC 40 was off 1.87% and Germany's DAX fell lower by 1.20%.
[1]: http://www.proactiveinvestors.com/companies/overview/2245/Salesforce.com [2]: http://www.proactiveinvestors.comcompanies/overview/2245/salesforcecom--2245.html [3]: http://www.proactiveinvestors.com/companies/overview/2397/Goldman+Sachs [4]: http://www.proactiveinvestors.comcompanies/overview/3787/general-motors-company--3787.html [5]: http://www.proactiveinvestors.com/companies/overview/1189/Dell [6]: http://www.proactiveinvestors.comcompanies/overview/1189/dell-1189.html [7]: http://www.proactiveinvestors.com/companies/overview/1189/Dell [8]: http://www.proactiveinvestors.com/companies/overview/2306/Apple [9]: http://www.proactiveinvestors.comcompanies/overview/2306/apple-2306.html [10]: http://www.proactiveinvestors.com/companies/overview/4450/Samsung+Electronics [11]: http://www.proactiveinvestors.com/companies/overview/2306/Apple [12]:



I want to remove the URLs from the text. How can I do this? I don't think Filter Tokens works for this. Is Remove Document Parts the solution?

I think the solution should be a rule like this: if a word starts with "http" or "www.", delete that word from the text (but only the URL, nothing else).



Kind regards
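
The rule described above (drop any word that starts with "http" or "www.") can be sketched as a regular expression. This is only an illustration in Python, not a RapidMiner process; the names strip_urls, URL_PATTERN and REF_MARKER_PATTERN are made up for this example, and removing the leftover "[1]:" markers is an extra assumption about what the cleaned text should look like.

```python
import re

# A "word" that starts with http://, https:// or www. and runs up to the
# next whitespace character.
URL_PATTERN = re.compile(r"\b(?:https?://|www\.)\S+", re.IGNORECASE)

# The "[1]:"-style reference markers that precede each URL in the sample
# text (assumption: these should be removed together with the URLs).
REF_MARKER_PATTERN = re.compile(r"\[\d+\]:")


def strip_urls(text: str) -> str:
    """Remove URLs and their reference markers, keep everything else."""
    without_urls = URL_PATTERN.sub("", text)
    without_markers = REF_MARKER_PATTERN.sub("", without_urls)
    # Collapse the runs of spaces left behind by the removals.
    return re.sub(r"[ \t]+", " ", without_markers).strip()


sample = (
    "Germany's DAX fell lower by 1.20%. "
    "[1]: http://www.proactiveinvestors.com/companies/overview/2245/Salesforce.com "
    "[2]: http://www.proactiveinvestors.comcompanies/overview/2245/salesforcecom--2245.html"
)
print(strip_urls(sample))
```

The same pattern could probably be reused in any operator that supports regex replacement (Replace Tokens might be a candidate, if your Text Processing extension has it), but that part is not tested here.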

 2 
 on: Today at 05:53:57 PM 
Started by NamSor - Last post by NamSor
Hi Marco, I've been looking everywhere to try and customize that;-) thanks for the info. I'll try and release a new version next month which will use the Batch API (much faster) to determine the gender of names so I'll also include this important -if not vital- piece of cosmetics.

 3 
 on: Today at 04:00:20 PM 
Started by hammadalam89 - Last post by hammadalam89
I want to know how I can implement Levenshtein minimum edit distance in RapidMiner. I have been unable to find any operator for this. Is it possible to use it in RapidMiner? Or is there another way to compute it?
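
For reference, the Levenshtein distance itself is a short dynamic-programming routine. The sketch below is plain Python and says nothing about which RapidMiner operator (if any) provides it; the function name levenshtein is just chosen for this example. The logic might be ported to a scripting operator if one is available in your installation.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] = distance between the first i-1 chars of a and first j of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

Only two rows of the table are kept at a time, so memory use grows with the shorter string rather than with the product of both lengths.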

 4 
 on: Today at 02:37:33 PM 
Started by mbenson - Last post by mbenson
Update.

RapidMiner runs from the .jar but not from the .exe.

What's up with that? Weird.

Oh - Java 1.7.0_67

 5 
 on: Today at 02:29:30 PM 
Started by Viper1988 - Last post by Viper1988
Hi,

I want to get a wordlist out of about one million rows of text.
Error codes are like A1, A2, B1, B2 ...
But I don't get them in the results. In the Process Documents operator I only tokenize, filter stopwords (dictionary) and filter tokens by length (min. 2, max. 50).
Does anybody have an idea why I don't get the error codes in the results?

Best regards
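
One possible explanation, assuming the tokenizer splits on every non-letter character: a code like "A1" is cut at the digit, leaving the single letter "A", which the length filter (min. 2) then discards. The Python sketch below only imitates that behaviour; the function names are invented for this example and it is not RapidMiner code.

```python
import re


def tokenize_non_letters(text):
    """Split on any run of non-letter characters (digits act as separators)."""
    return [tok for tok in re.split(r"[^A-Za-z]+", text) if tok]


def tokenize_codes_aware(text):
    """Alternative: treat letters and digits as part of the same token."""
    return re.findall(r"[A-Za-z0-9]+", text)


def filter_by_length(tokens, min_chars=2, max_chars=50):
    """Keep tokens whose length lies within [min_chars, max_chars]."""
    return [tok for tok in tokens if min_chars <= len(tok) <= max_chars]


row = "Pump failed with error A1 and B2"

# Splitting on non-letters turns "A1" into "A", which is then too short.
print(filter_by_length(tokenize_non_letters(row)))

# Keeping digits inside tokens lets "A1" and "B2" pass the length filter.
print(filter_by_length(tokenize_codes_aware(row)))
```

If that assumption holds for your process, switching the tokenizer to a mode that keeps digits (for example a regular expression such as the one above) should bring the codes back into the wordlist.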

 6 
 on: Today at 02:23:37 PM 
Started by mbenson - Last post by mbenson
Hi, very basic problem here.

I am learning RapidMiner. I downloaded 5.3 and tried various exercises that worked wonderfully. RapidMiner is fun!

I'm running the Windows version on Win 7. The laptop I am using belongs to my employer and may have some weird security policies in place, but looking at the logs, I see no events related to trying to start RapidMiner. I could, however, be looking in the wrong place, as there are layers of security going on.

Anyway, RapidMiner used to run fine. I am hoping to teach some students some very basic data mining and analytics concepts with it, and I can no longer work with it myself.

If I try to run the executable, either from the shortcut or the file, the cursor turns to "busy" and then nothing happens. This is true when I try running it as administrator as well. There are no error messages.

Does anyone have any ideas?

Thank you very much!

 7 
 on: Today at 09:20:14 AM 
Started by NamSor - Last post by Marco Boeck
Hi,

this is cool!
However, may I suggest filling in the ABOUT.NFO file in the META-INF directory of your extension .jar? The text in there is used as the short description in the RapidMiner Studio marketplace.
Otherwise it looks a bit weird, because it says "This is an empty project to be customized".
An extension icon would also not go amiss and would make the extension more appealing :)

Regards,
Marco

 8 
 on: Today at 07:47:01 AM 
Started by NamSor - Last post by NamSor
Hi!

This is a three-minute tutorial on how to use the RapidMiner Onomastics extension to determine the gender of personal names (for gender studies, marketing analysis, etc.):
http://namesorts.com/2014/09/10/video-tutorial-how-to-extract-the-gender-of-personal-names-using-rapidminer/

The extension documentation is on GitHub:
https://github.com/namsor/rapidminer-onomastics-extension/blob/master/doc/201407_NamSor_RapidMiner_Extension_v003.pdf?raw=true

For an example application, read this analysis of the US FAA 'Airman Directory':
http://gendergapgrader.com/studies/airline-pilots/

Best,
Elian
contact@namsor.com

 9 
 on: September 16, 2014, 03:27:06 PM 
Started by z.meftahi - Last post by homburg
Hi z.meftahi,

right now the maximum capacity of the Excel import is limited to fewer than 200,000 rows. To work around this problem, you can export your data from MS Excel to CSV format and use the CSV input operator or wizard.

Cheers,
Helge

 10 
 on: September 16, 2014, 03:19:43 PM 
Started by drudel173 - Last post by homburg
Hi drudel173,

the "Optimize Weights" operators perform internal validations using different weight vectors, depending on the algorithm they use for optimization. The attributes are scaled / selected / deselected according to the current vector, and the modified example set is then piped to the inner learner. The process is repeated several times (again depending on the selected method and parameters), and at the end you get your example set scaled using the weights that worked best during the internal validation. You can also read those weights from the corresponding output port of the "Optimize Weights" operator. In the help view you can scroll down to find a link to some sample processes showing how these operators are meant to be used. You can also store those weights and use them to scale other example sets with "Scale by Weights".

Cheers,
Helge
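
The loop described above (try a weight vector, scale the attributes, validate the inner learner, keep the best vector) can be sketched roughly as follows. This is a toy illustration and not how the RapidMiner operators are implemented: the random search, the 1-nearest-neighbour inner learner, the leave-one-out validation and all function names are assumptions made for this example.

```python
import random


def scale(rows, weights):
    """Multiply every attribute value by the weight of its attribute."""
    return [[x * w for x, w in zip(row, weights)] for row in rows]


def loo_accuracy(rows, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour learner
    (stands in for the internal validation of the inner learner)."""
    correct = 0
    for i, row in enumerate(rows):
        best_dist, best_label = None, None
        for j, other in enumerate(rows):
            if i == j:
                continue
            dist = sum((a - b) ** 2 for a, b in zip(row, other))
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, labels[j]
        correct += best_label == labels[i]
    return correct / len(rows)


def optimize_weights(rows, labels, iterations=200, seed=0):
    """Random search over weight vectors in [0, 1); returns the vector
    with the best validation score together with that score."""
    rng = random.Random(seed)
    n_attrs = len(rows[0])
    best_weights = [1.0] * n_attrs  # start from the unweighted data
    best_score = loo_accuracy(scale(rows, best_weights), labels)
    for _ in range(iterations):
        candidate = [rng.random() for _ in range(n_attrs)]
        score = loo_accuracy(scale(rows, candidate), labels)
        if score > best_score:
            best_weights, best_score = candidate, score
    return best_weights, best_score


# Hypothetical data: attribute 0 separates the classes, attribute 1 is noise.
rows = [[0.0, 9.0], [0.1, 1.0], [0.2, 5.0], [1.0, 8.0], [1.1, 2.0], [1.2, 6.0]]
labels = ["a", "a", "a", "b", "b", "b"]
weights, score = optimize_weights(rows, labels)
print(weights, score)
```

The returned weights play the role of the weight output port: they can be stored and applied to another example set with scale(), which is the counterpart of "Scale by Weights" in this sketch.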
