Topic: [SOLVED] Problem with tokenize (Read 202 times)
jose
Newbie
Posts: 14
[SOLVED] Problem with tokenize
«
on:
February 07, 2012, 02:37:34 PM »
Hello!
My question is this: as I understand it, the Tokenize operator splits sentences into words. Is there a way to split sentences into units of two words, instead of one word at a time as the Tokenize operator normally does?
«
Last Edit: February 10, 2012, 10:58:53 AM by Marius
»
text_miner
Newbie
Posts: 10
Re: Problem with tokenize
«
Reply #1 on:
February 07, 2012, 03:41:56 PM »
Hi Jose,
Are you asking if you can have terms of more than one word/token? If so, the answer is yes. After you tokenize, use the Generate n-Grams (Terms) operator. This will generate phrases of n sequential tokens. Note: you will still have the single terms in your term-by-document matrix too. For example, generating 2-grams you would have "heart", "attack", and "heart attack" in the matrix.
jose
Newbie
Posts: 14
Re: Problem with tokenize
«
Reply #2 on:
February 07, 2012, 06:03:25 PM »
OK, perfect, thanks.