Topic: KMeans and Nominal Measures (Read 89 times)
annbra
Newbie
Posts: 3
KMeans and Nominal Measures (on: May 08, 2013, 07:33:04 AM)
Hi,
I am new to the forum and I have a question about kMeans.
I have read in the forum that polynominal data cannot be used with the kMeans algorithm. But in RapidMiner 5.2.008 it is possible to choose nominal measures (e.g. nominal distance) and to determine clusters with centroids from polynominal data. There is only a warning that kMeans cannot handle polynominal data, yet it evidently works, because I get "good" results.
How does RapidMiner calculate the clusters? Is it possible to look at the individual steps of the calculation?
Greetings,
Anne
Marius
Global Moderator
Hero Member
Posts: 1283
Re: KMeans and Nominal Measures (Reply #1 on: May 08, 2013, 01:03:27 PM)
Hi Anne,
It is not possible to see the intermediate steps of the algorithm. But yes, k-Means can handle nominal attributes with the nominal or mixed measures. NominalDistance, for example, is 0 if two strings match exactly and 1 otherwise. The warning (which is obviously wrong) has just been fixed and will disappear in the next release.
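A minimal sketch of such a 0/1 measure in Python, for illustration only. The function name nominal_distance is hypothetical, and summing the mismatches over several attributes is an assumption here, not confirmed RapidMiner behaviour.

```python
def nominal_distance(a, b):
    """Count the attributes on which two examples disagree.

    For a single attribute this is 0 on an exact match and 1 otherwise.
    Summing over several attributes is an assumption of this sketch.
    """
    if len(a) != len(b):
        raise ValueError("examples must have the same number of attributes")
    return sum(0 if x == y else 1 for x, y in zip(a, b))


print(nominal_distance(["red"], ["red"]))                     # 0
print(nominal_distance(["red"], ["blue"]))                    # 1
print(nominal_distance(["red", "small"], ["blue", "small"]))  # 1
```

Note that the comparison is an exact string match, so "Red" and "red" would count as different values.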
Best regards,
Marius
Please add [SOLVED] to the topic title when your problem has been solved! (do so by editing the first post in the thread and modifying the title)
annbra
Newbie
Posts: 3
Re: KMeans and Nominal Measures (Reply #2 on: May 09, 2013, 09:25:15 AM)
Hi Marius,
Thank you very much for your answer.
The calculation of the distance is now clear, but how does the algorithm determine the centroid? The centroid of a cluster is a linear combination of its examples, and for this you need numeric data. Does the algorithm convert the polynominal data to natural numbers and determine the centroid by "normal" summation?
Greetings,
Anne
Marius
Global Moderator
Hero Member
Posts: 1283
Re: KMeans and Nominal Measures (Reply #3 on: May 09, 2013, 04:15:55 PM)
Actually it seems so. It may be better to use Nominal to Numerical with coding_type=dummy_coding before actually applying k-Means. That way you'll be on the safe side.
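To illustrate why dummy coding is the safer choice, here is a rough Python sketch that compares a centroid computed on dummy-coded values with one computed on arbitrary integer codes. The helper names dummy_code and centroid and the sample data are made up for this example. The sketch uses plain 0/1 indicator columns and does not reproduce the exact output of the Nominal to Numerical operator.

```python
def dummy_code(values):
    """Map each nominal value to a 0/1 vector with one position per category."""
    categories = sorted(set(values))
    coded = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return categories, coded


def centroid(rows):
    """Component-wise mean of equally long numeric vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


# Hypothetical cluster containing one nominal attribute.
colors = ["red", "blue", "red", "green"]

categories, coded = dummy_code(colors)
print(categories)       # ['blue', 'green', 'red']
print(centroid(coded))  # [0.25, 0.25, 0.5]

# Arbitrary integer codes: blue=0, green=1, red=2.
codes = {c: i for i, c in enumerate(categories)}
int_centroid = sum(codes[v] for v in colors) / len(colors)
print(int_centroid)     # 1.25
```

With dummy coding each centroid component is the share of examples having that value, which can be interpreted. The integer-coded centroid of 1.25 falls between "green" and "red" only because of the order in which the codes were assigned.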
annbra
Newbie
Posts: 3
Re: KMeans and Nominal Measures (Reply #4 on: May 16, 2013, 10:40:38 AM)
Thank you very much.