I'm new here and I already need help.
I would like to predict the class of a new data set.
Here's the problem:
Imagine you have a car, and everybody has their own way of driving. The biggest problem of all is the clutch: with a bad way of driving you can destroy your clutch after only a few such occasions. An example of bad driving is holding the car on a slope with the clutch. The thermal load on the clutch over a certain time is simply harmful.
With the collected data I would like to predict the damage to the clutch!
Here's my proposed solution:
I collected data about the driving behaviour of many people. I define 2 classes: good behaviour (no impact on the clutch) and bad behaviour (big impact on the clutch). I collect data about the gear position, speed, engine speed, whether the clutch is open or closed, etc., i.e. things which have an impact on the clutch. So my label is "Damage of the Clutch" with yes or no.
So I use RapidMiner to train my model. I import the Excel files of Driver A with good and bad driving. I do the same with the data of Driver B, Driver C, etc. I train my model with X-Validation. Inside this X-Validation I use k-NN or SVM (linear) for training.
Question 1: Which algorithm is best to use? k-NN? Bayes? SVM?
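For readers who don't use RapidMiner, here is a rough Python sketch of what I understand the training step to be doing: k-fold cross-validation around a k-NN classifier. It is not my actual process. The two attributes (engine speed and seconds of clutch slip), the toy values and the function names `knn_predict` and `cross_validate` are all made up for illustration.

```python
import math
import random
from collections import Counter


def knn_predict(train_rows, train_labels, query, k=3):
    """Majority vote among the k training rows closest to `query`."""
    nearest = sorted(
        range(len(train_rows)),
        key=lambda i: math.dist(train_rows[i], query),
    )[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


def cross_validate(rows, labels, folds=5, k=3, seed=0):
    """Return the mean accuracy over `folds` held-out folds."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    accuracies = []
    for f in range(folds):
        test_idx = order[f::folds]          # every folds-th shuffled row
        held_out = set(test_idx)
        train_idx = [i for i in order if i not in held_out]
        train_rows = [rows[i] for i in train_idx]
        train_labels = [labels[i] for i in train_idx]
        hits = sum(
            knn_predict(train_rows, train_labels, rows[i], k) == labels[i]
            for i in test_idx
        )
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / folds


# Made-up toy data: (engine speed in 1000 rpm, seconds of clutch slip).
rng = random.Random(42)
good = [(rng.uniform(1.0, 2.0), rng.uniform(0.0, 1.0)) for _ in range(20)]
bad = [(rng.uniform(3.0, 4.0), rng.uniform(4.0, 6.0)) for _ in range(20)]
rows = good + bad
labels = ["no"] * 20 + ["yes"] * 20   # label: damage of the clutch

mean_accuracy = cross_validate(rows, labels)
print(f"mean accuracy over 5 folds: {mean_accuracy:.2f}")
```

Running the same `cross_validate` loop once per candidate learner and comparing the mean accuracies is, as far as I understand it, one way to answer Question 1 for a given data set.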
In the Testing part I'm using Apply Model and Performance.
Question 2: Which Performance operator is best to use in this case? The regular Performance operator or Performance (Classification)?
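To show what kind of numbers I am hoping to read from the performance step, here is a small Python sketch that counts a confusion matrix for a yes/no label and derives accuracy, precision and recall from it. The label lists are invented and `binary_performance` is a hypothetical helper, not a RapidMiner operator.

```python
def binary_performance(actual, predicted, positive="yes"):
    """Confusion-matrix counts plus accuracy, precision and recall."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when nothing is predicted/actual "yes".
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Invented example: true label vs. model prediction for 8 recordings.
actual = ["yes", "yes", "yes", "no", "no", "no", "no", "no"]
predicted = ["yes", "no", "yes", "no", "no", "yes", "no", "no"]

report = binary_performance(actual, predicted)
print(report)
```

Recall on the "yes" class seems the most relevant figure for my case, because a missed "yes" is a damaged clutch that the model did not see coming.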
In the next step I import the new, unlabeled data and use another Apply Model to apply the model to it. I hope to predict how likely it is that this way of driving damages the clutch.
If I calculate it, the prediction is 0%. So is this the right way to predict the classification?
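To make clear what I mean by a percentage here, this is a minimal Python sketch under the assumption that the model is k-NN: the confidence for "yes" on a new row is the fraction of its k nearest training rows labeled "yes". The training rows, the new rows and the name `knn_confidence` are invented for illustration.

```python
import math


def knn_confidence(train_rows, train_labels, query, k=3, positive="yes"):
    """Fraction of the k nearest training rows that carry the positive label."""
    nearest = sorted(
        range(len(train_rows)),
        key=lambda i: math.dist(train_rows[i], query),
    )[:k]
    return sum(train_labels[i] == positive for i in nearest) / k


# Invented labeled rows: (engine speed in 1000 rpm, seconds of clutch slip).
train_rows = [(1.2, 0.3), (1.5, 0.8), (1.8, 0.5),
              (3.2, 4.5), (3.6, 5.2), (3.9, 4.8)]
train_labels = ["no", "no", "no", "yes", "yes", "yes"]

# Invented new rows without a label.
new_rows = [(1.4, 0.6), (3.5, 5.0), (1.6, 0.4)]

confidences = [knn_confidence(train_rows, train_labels, r) for r in new_rows]
share_predicted_yes = sum(c >= 0.5 for c in confidences) / len(new_rows)

print("confidence(yes) per row:", confidences)
print(f"share of rows predicted 'yes': {share_predicted_yes:.2f}")
```

In this sketch a result of 0% can simply mean that every new row lies closest to the "no" examples. I cannot tell whether that is what happens in my process, or whether the new data is being read or scaled differently from the training data.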
I hope you can understand my problem, and sorry for my bad English.