For others: Here is the link to the mentioned sf-thread: click
Regarding your problem:
I understand Platt Scaling this way:
input: the ExampleSet the original model was created with, and the created model
output: the "corrected" model.
Applying the corrected model to your test set then yields calibrated probability estimates in the confidence column, i.e. confidence(c) = p(y=c | x=current Example).
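To make the idea a bit more concrete, here is a minimal sketch of what Platt Scaling does for the binary case, in plain Python. It is not the operator's actual implementation: the function names (fit_platt, apply_platt), the toy scores and the plain gradient descent are my own assumptions for illustration, and Platt's original method additionally smooths the target labels, which I left out. The idea is to fit a sigmoid p(y=1|s) = 1 / (1 + exp(a*s + b)) on the raw scores s of the original model.

```python
import math

def fit_platt(scores, labels, lr=0.1, epochs=5000):
    """Fit a, b of p(y=1|s) = 1 / (1 + exp(a*s + b)) by gradient descent on log-loss.

    Hypothetical helper for illustration; labels must be 0 or 1.
    """
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(a * s + b))
            # Gradient of the log-loss with respect to a and b.
            grad_a += (y - p) * s
            grad_b += (y - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def apply_platt(score, a, b):
    """Map a raw classifier score to a probability estimate."""
    return 1.0 / (1.0 + math.exp(a * score + b))

# Made-up raw scores of the original model and the true labels (not separable).
scores = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

a, b = fit_platt(scores, labels)
for s in (-2.0, 0.0, 2.0):
    print(s, round(apply_platt(s, a, b), 3))
```

The fitted pair (a, b) plays the role of the "corrected" model here: it is learned once and then applied to the scores of any new example.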
Hope this was helpful.
PS: This topic is really interesting. I recently stumbled upon a paper on converting the scores of multiclass classifiers into probabilities, but unfortunately I have not had the time to study it yet. Here is the link (PDF): Transforming Classifier Scores into Accurate Multiclass Probability Estimates