Topic: Parameter classification in Material Science

« on: September 01, 2014, 03:30:44 PM »


My task is object classification in the field of materials science. RapidMiner is new to me, so I would like to draw on the broad knowledge of the community. I would be grateful for any response to my problem.

My problem (preliminary work excluded):
I have built an Excel list in the following way.

-   First row: Object ID (running number), approx. 4500 objects
-   Following rows: Object parameters (e.g. Area, Perimeter, etc.), approx. 26 parameters
-   Last row: class membership (label)

Originally I had a separate Excel list for each label/class (12 in total). I then merged the data from all of them into one Excel list.
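
For illustration only, the merge of the per-class lists into one master list can be sketched in plain Python. This is not how the Excel file was actually built; the class names, the parameter values, and the helper `merge_class_tables` are hypothetical and only show the intended layout (running ID, parameters, label).

```python
# Hypothetical per-class tables: each entry holds the parameters of one object.
# Class names and numbers are invented for illustration.
per_class = {
    "class_A": [{"Area": 12.5, "Perimeter": 14.1}, {"Area": 11.8, "Perimeter": 13.6}],
    "class_B": [{"Area": 30.2, "Perimeter": 22.4}],
}


def merge_class_tables(tables):
    """Combine per-class tables into one list of records with a running ID and a label."""
    master = []
    next_id = 1
    for label in sorted(tables):           # fixed order keeps the IDs reproducible
        for params in tables[label]:
            record = {"ID": next_id}       # running object number
            record.update(params)          # object parameters (attributes)
            record["Label"] = label        # class membership goes last
            master.append(record)
            next_id += 1
    return master


master = merge_class_tables(per_class)
for row in master:
    print(row)
```

Each record then carries exactly one ID, the parameter values, and one label, which matches the roles assigned during the RapidMiner import.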

My goals:
-   To find a classification method (e.g. SVM) and train a model that can then be applied to unknown/unclassified objects to predict their class.
-   To find out which of the parameters are relevant for the model (Optimize Selection).

So far, I have imported the master Excel file with all objects into RapidMiner (version 5.3) as follows:
-   Object ID (running number): Integer; ID
-   Object parameters: Real; Attribute
-   Class (total of 12): Text; Label

Based on my own research, I started as follows (the process XML can be found further down):
-   Main Process
   o   Retrieve Data (Excel file from the repository)
   o   Optimize Selection
-   Evaluation Process
   o   Validation
-   Training
   o   SVM (Linear)
-   Testing
   o   Apply Model
   o   Performance
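
The nesting listed above (a feature selection wrapping a cross-validation, which in turn wraps training, applying the model, and measuring performance) can be sketched outside RapidMiner in plain Python. This is only a rough sketch under loud assumptions: a nearest-centroid classifier stands in for SVM (Linear), a greedy forward selection stands in for the evolutionary search, the data is a made-up two-class toy set, and all function names are hypothetical.

```python
import random


def train_centroids(rows, features):
    """Training: one mean vector per class (stand-in for SVM (Linear))."""
    sums, counts = {}, {}
    for x, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, f in enumerate(features):
            acc[i] += x[f]
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def apply_model(model, x, features):
    """Apply Model: predict the class whose centroid is closest."""
    def dist(centroid):
        return sum((x[f] - centroid[i]) ** 2 for i, f in enumerate(features))
    return min(sorted(model), key=lambda label: dist(model[label]))


def cross_validate(rows, features, k=3):
    """Validation + Performance: mean accuracy over k folds."""
    scores = []
    for fold in range(k):
        test = rows[fold::k]
        train = [r for i, r in enumerate(rows) if i % k != fold]
        model = train_centroids(train, features)
        hits = sum(apply_model(model, x, features) == label for x, label in test)
        scores.append(hits / len(test))
    return sum(scores) / k


def forward_selection(rows, all_features, k=3):
    """Optimize Selection (greedy stand-in): add features while CV accuracy improves."""
    selected, best = [], 0.0
    while True:
        candidates = [f for f in all_features if f not in selected]
        scored = [(cross_validate(rows, selected + [f], k), f) for f in candidates]
        if not scored:
            break
        score, feature = max(scored, key=lambda t: t[0])
        if score <= best:
            break
        selected.append(feature)
        best = score
    return selected, best


# Toy data: "Area" separates the two classes, "Noise" carries no information.
random.seed(0)
rows = []
for i in range(30):
    label = "A" if i % 2 == 0 else "B"
    area = random.uniform(9, 11) if label == "A" else random.uniform(19, 21)
    rows.append(({"Area": area, "Noise": random.random()}, label))

selected, accuracy = forward_selection(rows, ["Area", "Noise"])
print(selected, accuracy)
```

The point of the sketch is the order of nesting: the selection loop only ever sees the cross-validated performance, never the performance on the data the model was trained on.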

Is my approach correct? How would you structure the process in order to solve the problem?
If more information is needed, I will provide it.

Thanks for any help and responses.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.015">
  <operator activated="true" class="process" compatibility="5.3.015" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="5.3.015" expanded="true" height="60" name="Retrieve MasterExcel" width="90" x="45" y="75">
        <parameter key="repository_entry" value="../MasterExcel/MasterExcel"/>
      </operator>
      <operator activated="true" class="optimize_selection_evolutionary" compatibility="5.3.015" expanded="true" height="94" name="Optimize Selection (Evolutionary)" width="90" x="246" y="75">
        <process expanded="true">
          <operator activated="true" class="x_validation" compatibility="5.3.015" expanded="true" height="112" name="Validation" width="90" x="45" y="30">
            <process expanded="true">
              <operator activated="true" class="support_vector_machine_linear" compatibility="5.3.015" expanded="true" height="76" name="SVM (Linear)" width="90" x="45" y="30"/>
              <connect from_port="training" to_op="SVM (Linear)" to_port="training set"/>
              <connect from_op="SVM (Linear)" from_port="model" to_port="model"/>
              <portSpacing port="source_training" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
            </process>
            <process expanded="true">
              <operator activated="true" class="apply_model" compatibility="5.3.015" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
                <list key="application_parameters"/>
              </operator>
              <operator activated="true" class="performance" compatibility="5.3.015" expanded="true" height="76" name="Performance" width="90" x="180" y="30"/>
              <connect from_port="model" to_op="Apply Model" to_port="model"/>
              <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
              <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
              <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_averagable 1" spacing="0"/>
              <portSpacing port="sink_averagable 2" spacing="0"/>
            </process>
          </operator>
          <connect from_port="example set" to_op="Validation" to_port="training"/>
          <connect from_op="Validation" from_port="averagable 1" to_port="performance"/>
          <portSpacing port="source_example set" spacing="0"/>
          <portSpacing port="source_through 1" spacing="0"/>
          <portSpacing port="sink_performance" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Retrieve MasterExcel" from_port="output" to_op="Optimize Selection (Evolutionary)" to_port="example set in"/>
      <connect from_op="Optimize Selection (Evolutionary)" from_port="weights" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
