Pages: [1]
Author Topic: Problem with too many parameters to put as columns into example set  (Read 340 times)
Posts: 1

« on: July 25, 2013, 06:07:50 PM »

My task: I have customers, each with a unique ID and a set of binominal parameters, and I would like to predict the value of certain target variables (only one so far, but possibly several).
In my test case I used the following input dataset (see the meta data). Each customer is represented by one row and the parameters are in the columns, simply the usual way.
meta data:
Role      Name         Type
id        Customer_Id  integer
label     Target       binominal
regular   Para1        binominal
regular   Para2        binominal
regular   Para3        binominal
regular   Para4        binominal

data:
Customer_Id   Target   Para1   Para2   Para3   Para4
1             M        1       0       1       0
2             V        1       0       0       1
3             M        0       1       1       1

=> With Naive Bayes I get great prediction results in the test case with its limited number of dimensions.

Problem with the actual dataset:
I have several hundred thousand parameters and the number is growing quickly. The number of active parameters per customer is very small, so the table would be extremely large and sparse. My idea was therefore to use the following dataset format as input:
meta data:
Role      Name         Type
id        Customer_Id  integer
label     Target       binominal
regular   ActivePara   polynominal

data:
Customer_Id   Target   ActivePara
1             M        Para1
1             M        Para3
2             V        Para1
2             V        Para4
3             M        Para2
3             M        Para3
3             M        Para4
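
To make clear that this long format carries the same information as the wide table, here is a minimal Python sketch (outside RapidMiner, purely for illustration). It collapses the rows into one record per customer holding only the active parameters, and can rebuild a wide 0/1 row on demand. The helper names `group_by_customer` and `to_dense_row` are made up for this example and are not part of any library.

```python
from collections import defaultdict

# Long-format rows from the example above: (Customer_Id, Target, ActivePara)
long_rows = [
    (1, "M", "Para1"), (1, "M", "Para3"),
    (2, "V", "Para1"), (2, "V", "Para4"),
    (3, "M", "Para2"), (3, "M", "Para3"), (3, "M", "Para4"),
]


def group_by_customer(rows):
    """Collapse long rows into {customer_id: (target, set of active parameters)}."""
    targets = {}
    active = defaultdict(set)
    for customer_id, target, para in rows:
        # A customer must carry the same target value on all of its rows.
        if targets.setdefault(customer_id, target) != target:
            raise ValueError(f"conflicting targets for customer {customer_id}")
        active[customer_id].add(para)
    return {cid: (targets[cid], active[cid]) for cid in targets}


def to_dense_row(active_paras, columns):
    """Rebuild the wide 0/1 representation for a chosen list of columns."""
    return [1 if col in active_paras else 0 for col in columns]


customers = group_by_customer(long_rows)
columns = ["Para1", "Para2", "Para3", "Para4"]
for cid, (target, paras) in sorted(customers.items()):
    print(cid, target, to_dense_row(paras, columns))
```

The per-customer sets stay small however many parameters exist in total, which is the property I am after; only the dense rows grow with the number of columns.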

BUT now I do not get consistent predictions per customer. What I get is something like this:
Customer_Id   Target   ActivePara   Prediction of Target
1             M        Para1        V
1             M        Para3        M
2             V        Para1        V
2             V        Para4        V
3             M        Para2        M
3             M        Para3        M
3             M        Para4        V

But I want/need the target prediction per Customer_Id to be consistent.

How do I need to set up the input data / the model to get this result?
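
To make the inconsistency concrete: each row is scored as if it were an independent example, so a customer with several rows can receive several different predictions. The minimal Python sketch below (not RapidMiner, and only one possible workaround that I am assuming for illustration) takes a majority vote over the per-row predictions from the table above. The function name `vote_per_customer` is hypothetical.

```python
from collections import Counter, defaultdict

# (Customer_Id, per-row prediction) taken from the table above
row_predictions = [
    (1, "V"), (1, "M"),
    (2, "V"), (2, "V"),
    (3, "M"), (3, "M"), (3, "V"),
]


def vote_per_customer(predictions):
    """Return {customer_id: majority label}, or None when the vote is tied."""
    votes = defaultdict(Counter)
    for customer_id, label in predictions:
        votes[customer_id][label] += 1

    result = {}
    for customer_id, counter in votes.items():
        ranked = counter.most_common()
        # A tie between the two most frequent labels cannot be resolved by voting.
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            result[customer_id] = None
        else:
            result[customer_id] = ranked[0][0]
    return result


consistent = vote_per_customer(row_predictions)
print(consistent)
```

Customer 1 ends up tied (one V, one M), which shows that voting after the fact is not a real solution and that the model should probably see one example per customer instead.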

Thanks a lot in advance for any hints and help!!!
