Topic: Decision Tree Parser
Stephan
Guest
« on: September 29, 2008, 12:16:48 PM »

Hi,

As far as I can see, RapidMiner 4.2 does not provide a module to dump the tree
representation (available in the "Text View") of decision trees and
random forests into equivalent programming language constructs. I am especially
interested in a tree parser for C/C++.

I think this would be a very useful feature (incidentally, I also noticed a
related post a couple of days ago) when you generate decision
trees that you afterwards want to use in your own application. This is common
practice in some domains, and I think people would be very happy about it,
in particular because R does not support this yet either.
Currently, you have to do this translation by hand, which is quite cumbersome and error-prone.

I would highly appreciate this extension. :-)

Thank you.

Cheers,
Stephan
haddock
Hero Member
Posts: 853
« Reply #1 on: September 29, 2008, 12:59:24 PM »

Hi,

Could you not make use of the Tree2RuleConverter (section 5.4.65 in the manual)?


Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?

T.S.Eliot ~ Choruses from the Rock 1934
Stephan
Guest
« Reply #2 on: September 29, 2008, 02:52:55 PM »

No idea whether Tree2RuleConverter can be used. ;-)
The documentation is rather short on this point.

Have you ever used it, or do you have an idea how Tree2RuleConverter
could be applied to my problem in particular (maybe you have an
XML example)?
Sebastian Land
Administrator
Hero Member
Posts: 2426
« Reply #3 on: September 30, 2008, 10:23:20 AM »

Hi Stephan,
as far as I know, this feature does not exist yet. And I'm not quite sure whether it will be added in the future, since RapidMiner is built to be used as a library for exactly such things. You could easily translate the case to be decided into an example, pass it to the TreeModel, and get the decision back. If you are an experienced C++ programmer, this should be done before breakfast.
On the other hand, with that much experience you should also be able to write a program that recursively translates RapidMiner's TreeModel into a C++ if/then program.
The easiest solution, although it involves a little manual work, would be to rewrite the toString method in TreeModel, or rather in TreeNode. This method is already recursive and needs only small changes.

If you want it to work entirely without manual interaction, you could rewrite
Code:
public final void write(OutputStream out) throws IOException
within TreeModel so that it outputs an appropriate representation.
Then you could use IOObjectWriter to write the model to a file. If you use XML encoding without compression, this would give you the desired result.
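
The recursive translation into if/then code described above could look roughly like the following C++ sketch. It does not touch RapidMiner's Java classes: `Node`, `leaf` and `emit` are hypothetical names for a stand-in tree structure, and the sketch assumes that each branch condition is already available as C++ source text and that labels contain no quote characters.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory tree; NOT RapidMiner's TreeModel/TreeNode.
struct Node {
    std::string label;  // class label, only used when the node is a leaf
    // Each branch pairs a C++ condition (as text) with the subtree it leads to.
    std::vector<std::pair<std::string, std::unique_ptr<Node>>> branches;
};

// Convenience helper (hypothetical) to create a leaf node.
std::unique_ptr<Node> leaf(const std::string& label) {
    auto node = std::make_unique<Node>();
    node->label = label;
    return node;
}

// Recursively emit nested "if / else if" statements for the given node.
std::string emit(const Node& node, int depth = 1) {
    const std::string pad(depth * 4, ' ');
    if (node.branches.empty()) {
        return pad + "return \"" + node.label + "\";\n";
    }
    std::string out;
    bool first = true;
    for (const auto& branch : node.branches) {
        out += pad + (first ? "if (" : "else if (") + branch.first + ") {\n";
        out += emit(*branch.second, depth + 1);
        out += pad + "}\n";
        first = false;
    }
    return out;
}
```

The emitted text is only the body of a function. Wrapping it in a signature and adding a fallback return after the last branch is left to the caller.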

Greetings,
  Sebastian
Stephan
Guest
« Reply #4 on: October 01, 2008, 05:24:22 PM »

Hi,

I don't know exactly how I could use the Java RapidMiner library in my C++ project.
My idea was to invoke RapidMiner (for example via "system") from my application
with "java -jar rapidminer.jar myfile.XML". Given the XML file, RapidMiner
should learn the random forest classifier and dump the model to a file
so that I can parse it afterwards in my C++ application.
Is this a good idea, or do you know a better solution for using the Java library
in my C++ program?

Doing it the way I described above, I could use the IOObjectWriter or
something similar to generate a file that the model is written to. However,
with IOObjectWriter I can only choose XML or binary as the output format,
both of which are hard to parse in my application. Is there a way to dump the random forest
tree models into a file in exactly the form seen in the Text View mode, like
this (taken from sample 01_DecisionTree.xml):

Outlook = sunny
 |   Humidity <= 77.500: yes {no=0, yes=2}
 |   Humidity > 77.500: no {no=3, yes=0}
Outlook = overcast: yes {no=0, yes=4}
Outlook = rain
 |   Wind = false: yes {no=0, yes=3}
 |   Wind = true: no {no=2, yes=0}

I don't want to extend the RapidMiner functions since my Java skills are not that good,
so a pure C++ solution would suit me best.
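
For reference, a hand translation of the sample tree above into C++ might look like the sketch below. The function name `classify` and the parameter types are assumptions made for illustration; the thresholds and labels are taken from the Text View output.

```cpp
#include <string>

// Hand translation of the Text View tree from 01_DecisionTree.xml.
// Function name and parameter types are illustrative assumptions.
std::string classify(const std::string& outlook, double humidity, bool wind) {
    if (outlook == "sunny") {
        return humidity <= 77.5 ? "yes" : "no";
    }
    if (outlook == "overcast") {
        return "yes";
    }
    if (outlook == "rain") {
        return wind ? "no" : "yes";
    }
    return "";  // value of Outlook that the tree does not cover
}
```

This is the kind of code an automatic translator would have to produce for each tree.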
Ingo Mierswa
Administrator
Hero Member
Posts: 1226
« Reply #5 on: October 09, 2008, 06:59:47 PM »

Hi,

Quote
Is this a good idea or do you know a better solution for using the Java library in my C++ program?

Maybe the information at this link helps:

http://www.javaworld.com/javaworld/javatips/jw-javatip17.html


Quote
Is there a way to dump the random forest tree models into a file exactly in the form as seen in the TextView mode like (taken from sample 01_DecisionTree.xml):

Yes. Please use the operator "ResultWriter" which will output the textual result into the specified file (or into the global result file).
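
On the C++ side, reading such a textual dump back in could be sketched as follows. This is only a sketch under assumptions: the input is assumed to contain nothing but lines in the Text View format quoted above (anything else that ResultWriter may add to the file would have to be skipped first), attribute names and values are assumed to contain no spaces, and `Rule`, `parseDump`, `matches` and `predict` are hypothetical names.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// One line of the dump: a test on an attribute, plus a label if it is a leaf.
struct Rule {
    int depth = 0;
    std::string attribute, op, value, label;  // label is empty for inner nodes
};

// Parse lines such as " |   Humidity <= 77.500: yes {no=0, yes=2}".
std::vector<Rule> parseDump(std::istream& in) {
    std::vector<Rule> rules;
    std::string line;
    while (std::getline(in, line)) {
        Rule r;
        std::size_t pos = 0;
        // Every '|' in the indentation prefix marks one nesting level.
        while (pos < line.size() && (line[pos] == ' ' || line[pos] == '|')) {
            if (line[pos] == '|') ++r.depth;
            ++pos;
        }
        if (pos == line.size()) continue;  // skip blank lines
        std::string test = line.substr(pos);
        const std::size_t colon = test.find(": ");
        if (colon != std::string::npos) {
            // The first word after ": " is the label; the counts are ignored.
            std::istringstream labelStream(test.substr(colon + 2));
            labelStream >> r.label;
            test = test.substr(0, colon);
        }
        std::istringstream testStream(test);
        testStream >> r.attribute >> r.op >> r.value;
        rules.push_back(r);
    }
    return rules;
}

// Does the example satisfy the test "attribute op value"?
bool matches(const Rule& r, const std::map<std::string, std::string>& example) {
    const auto it = example.find(r.attribute);
    if (it == example.end()) return false;
    if (r.op == "=") return it->second == r.value;
    const double actual = std::stod(it->second);
    const double threshold = std::stod(r.value);
    if (r.op == "<=") return actual <= threshold;
    if (r.op == ">") return actual > threshold;
    return false;
}

// Walk the rules top-down; returns "" if no branch matches.
std::string predict(const std::vector<Rule>& rules,
                    const std::map<std::string, std::string>& example) {
    int depth = 0;
    for (const Rule& r : rules) {
        if (r.depth < depth) break;     // left the subtree we descended into
        if (r.depth > depth) continue;  // inside a branch that was not taken
        if (!matches(r, example)) continue;
        if (!r.label.empty()) return r.label;
        ++depth;  // descend into the children listed below this line
    }
    return "";
}
```

An example is passed as a map from attribute name to value, for instance Outlook = rain and Wind = true for the sample tree. The sketch handles a single tree only; for a random forest the individual trees would have to be separated and their votes combined.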

Cheers,
Ingo
Stephan
Guest
« Reply #6 on: October 13, 2008, 10:43:28 PM »

Hi,

it seems to me that the operator "ResultWriter" can only be used for single decision trees
but not for random forests, right?

Regards,
Stephan
Tobias Malbrecht
Global Moderator
Sr. Member
Posts: 293
« Reply #7 on: October 15, 2008, 01:47:58 PM »

Hi Stephan,

Quote
it seems to me that the operator "ResultWriter" can be only used for single decision trees
but not for random forests, right?

You are right. The SimpleVoteModel, which holds the decision tree models, did not support text output. I have changed that in the newest developer version (the Zaniah branch). The change will of course be part of the next release.

Regards,
Tobias

Tobias Malbrecht
Director of Product Marketing
RapidMiner