cyph00
Newbie

Posts: 2
« on: December 21, 2011, 09:53:40 PM »
Hi. I am looking for some guidance with the following scenario. I have a labeled sample set of transactions, each labeled true/false for fraud. I also have the set of rules that hit (executed) for each transaction. There are two types of rules: rules that are fraud indicators, and rules that are indicators of valid transactions. At the moment I use a very simple weighting. If a fraud indicator is met, I add 1 point. If a valid indicator is met, I subtract 1 point (-1). If the rule condition is not met, 0 points are assigned. I also generate a cumulative score, and that is where I need help: I want a better weight for each rule and a set of thresholds on the cumulative score.
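To make the current scheme concrete, here is a minimal Python sketch of how the cumulative score is built up. This is only an illustration and not part of my RapidMiner setup; the function name `cumulative_score` and its two arguments are made up for this post.

```python
def cumulative_score(fraud_hits, valid_hits):
    """Current naive scoring: +1 per fraud rule hit, -1 per valid rule hit.

    fraud_hits / valid_hits are lists of booleans (hypothetical inputs),
    True when the rule condition was met for the transaction.
    A rule that did not fire contributes 0 points.
    """
    score = 0
    for hit in fraud_hits:
        if hit:
            score += 1  # fraud indicator met
    for hit in valid_hits:
        if hit:
            score -= 1  # valid-transaction indicator met
    return score


# Two fraud rules fired, no valid rules fired.
print(cumulative_score([True, True, False], [False, False]))
```

Every rule carries the same weight here, which is exactly the part I would like RapidMiner to improve on.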
My question: how do I feed the sample set below to RapidMiner and have it do the following?
1. Assign a weight (score) to each rule.
2. Establish cumulative score thresholds for "valid", "undetermined", and "fraud".
Below is a sample of the data in comma-delimited form. Do I need to reformat the data to make it easier to work with in RapidMiner? If so, into what format (boolean, for example)? Also, what steps/components would I use to produce the desired results?
Thanks in advance for any help and guidance. Also, if there are any tutorials or posts describing this, please let me know. I looked but did not find anything applicable.
TRAN_ID,IS_FRAUD,rule1,rule2,rule3,rule4,rule6,rule5,rule7,rule8,rule9,rule10,pos_rule0,pos_rule1,pos_rule2,pos_rule3,pos_rule7,pos_rule4,pos_rule5,pos_rule6,ModelScore
A00023141,FALSE,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1,0,-1,-2
A00023142,FALSE,0,0,0,0,0,0,0,0,0,0,0,-1,0,0,0,-1,0,-1,-3
A00023143,FALSE,0,0,0,0,1,1,0,0,0,0,0,0,0,-1,-1,-1,-1,-1,-4
A00023144,FALSE,0,1,0,0,1,1,0,0,0,0,0,0,0,-1,-1,-1,-1,-1,-3
A00023145,FALSE,0,0,0,0,0,0,0,0,0,0,0,-1,0,0,0,0,0,0,-1
A00023146,FALSE,0,1,0,0,1,1,0,0,0,0,0,0,0,0,-1,-1,0,-1,0
A00023147,FALSE,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1,-1,-1,0,-3
A00023148,FALSE,0,0,0,0,0,0,0,0,0,0,-1,0,-1,-1,0,-1,-1,-1,-7
A00023149,FALSE,0,1,0,0,0,0,0,0,0,0,0,0,-1,-1,-1,-1,-1,0,-5
A00023150,FALSE,0,0,0,0,0,0,0,0,0,0,-1,0,-1,-1,-1,-1,-1,0,-7
A00023151,FALSE,0,0,0,0,0,0,1,0,0,0,0,-1,0,0,0,0,0,-1,-1
A00023152,FALSE,0,1,0,0,0,0,0,0,0,0,0,0,-1,-1,0,-1,-1,-1,-5
A00023153,FALSE,0,0,0,0,0,0,0,0,0,0,0,0,-1,-1,0,-1,-1,0,-5
A00023154,FALSE,0,1,0,0,0,0,0,0,0,0,0,0,-1,-1,0,-1,-1,-1,-5
A00023155,FALSE,0,1,0,0,1,0,0,0,0,0,0,0,0,-1,-1,-1,-1,-1,-4
A00023156,TRUE,1,1,0,0,1,1,0,1,0,1,0,0,0,0,0,0,0,0,6
A00023157,TRUE,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2
A00023158,TRUE,0,0,1,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,3
A00023159,TRUE,1,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,3
A00023160,TRUE,1,1,0,0,0,1,0,1,0,1,0,0,0,0,0,0,0,0,5
A00023161,TRUE,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,2
A00023162,TRUE,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1,0,0,-1
A00023163,TRUE,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1,0,0,-1