Author Topic: Discretize all the attributes (together)  (Read 2374 times)
fjcuberos
« on: July 06, 2008, 11:46:46 AM »

I'm working on the treatment of multivariate time series using AttributeSubsetPreprocessing.
The idea is to process the ExampleSet several times (once per dimension), each time selecting only the attributes of one dimension.
Suppose the attribute list for a 2D trajectory is:
 x1
 x2
...
 x200
 y1
 y2
...
 y200

I need to discretize the values of x1..x200, but taking the values of all of those attributes into account together, so that the discretization model holds 200 identical range maps.
This could be accomplished with a new parameter in the discretization operators.

I include a sample, based on BinDiscretization, that I developed for my own use.

DiscretizationModelSeries is an empty subclass of DiscretizationModel, needed because the DiscretizationModel constructor is private.

Code:
public Model createPreprocessingModel(ExampleSet exampleSet) throws OperatorException {
    if (getParameterAsBoolean(PARAMETER_ALL_ATTRIBUTES)) {
        DiscretizationModelSeries model = new DiscretizationModelSeries(exampleSet);

        exampleSet.recalculateAllAttributeStatistics();
        int numberOfBins = getParameterAsInt(PARAMETER_NUMBER_OF_BINS);
        HashMap<Attribute, double[]> ranges = new HashMap<Attribute, double[]>();

        // Find the global minimum and maximum over all numerical attributes
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (Attribute attribute : exampleSet.getAttributes()) {
            if (attribute.isNumerical()) { // skip nominal and date attributes
                double mi = exampleSet.getStatistics(attribute, Statistics.MINIMUM);
                double ma = exampleSet.getStatistics(attribute, Statistics.MAXIMUM);
                if (mi < min) min = mi;
                if (ma > max) max = ma;
            }
        }

        // Compute the upper limits of the equal-width bins; the last bin is open-ended
        double[] binRange = new double[numberOfBins];
        for (int b = 0; b < numberOfBins - 1; b++) {
            binRange[b] = min + (((double) (b + 1) / (double) numberOfBins) * (max - min));
        }
        binRange[numberOfBins - 1] = Double.POSITIVE_INFINITY;

        // Assign the same limits to every numerical attribute
        for (Attribute attribute : exampleSet.getAttributes()) {
            if (attribute.isNumerical()) { // same filter as in the min/max loop above
                ranges.put(attribute, binRange);
            }
        }

        model.setRanges(ranges, "range", getParameterAsBoolean(PARAMETER_USE_LONG_RANGE_NAMES));
        return model;
    } else {
        return super.createPreprocessingModel(exampleSet);
    }
}


public List<ParameterType> getParameterTypes() {
    List<ParameterType> types = super.getParameterTypes();

    ParameterType type = new ParameterTypeBoolean(PARAMETER_ALL_ATTRIBUTES, "Indicates if ALL the attributes are discretized together.", false);
    type.setExpert(false);
    types.add(type);
    return types;
}
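
For anyone who wants to check the shared-limits arithmetic without a RapidMiner installation, here is a minimal standalone sketch of the same computation on plain arrays. The class SharedBins and the methods sharedLimits and binOf are hypothetical names used only for illustration; they are not part of RapidMiner. The rule that a value falls into the first bin whose upper limit is greater than or equal to it is an assumption, not something taken from DiscretizationModel.

```java
import java.util.Arrays;

// Standalone illustration only: SharedBins, sharedLimits and binOf are
// hypothetical names, not RapidMiner classes or methods.
class SharedBins {

    // Upper limits of equal-width bins over the global min/max of all series.
    static double[] sharedLimits(double[][] series, int numberOfBins) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double[] column : series) {
            for (double v : column) {
                if (v < min) min = v;
                if (v > max) max = v;
            }
        }
        double[] limits = new double[numberOfBins];
        for (int b = 0; b < numberOfBins - 1; b++) {
            limits[b] = min + ((double) (b + 1) / numberOfBins) * (max - min);
        }
        limits[numberOfBins - 1] = Double.POSITIVE_INFINITY; // last bin is open-ended
        return limits;
    }

    // Assumed rule: a value belongs to the first bin whose upper limit is >= value.
    static int binOf(double value, double[] limits) {
        for (int b = 0; b < limits.length; b++) {
            if (value <= limits[b]) return b;
        }
        return limits.length - 1;
    }

    public static void main(String[] args) {
        // Two "attributes" of one dimension; both share the limits computed from 0..8.
        double[][] x = { {0.0, 2.0}, {4.0, 8.0} };
        double[] limits = sharedLimits(x, 4);
        System.out.println(Arrays.toString(limits)); // prints [2.0, 4.0, 6.0, Infinity]
        System.out.println(binOf(8.0, limits));      // prints 3
    }
}
```

Because the limits come from the global minimum and maximum, the same value is mapped to the same bin regardless of which attribute it appears in, which is the point of discretizing the attributes together.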


Thanks and congratulations for making RM better every release.

F.J. Cuberos
Ingo Mierswa
Administrator
« Reply #1 on: July 07, 2008, 09:22:39 AM »

Hi,

thanks for sending this in. I will add this to our to-do list, but we will first make the next release, which will probably come at the end of this week or the beginning of the next one.

Cheers,
Ingo