|
||||||||||
| PREV CLASS NEXT CLASS | FRAMES NO FRAMES | |||||||||
| SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD | |||||||||
java.lang.Objectcom.rapidminer.operator.preprocessing.outlier.SearchSpace
public class SearchSpace
SearchSpace is a class for building a room full of SearchObjects (see class definition) and provides various methods to place those objects into the SearchSpace (by associating those Objects to the list of objects in the SearchSpace) as well as to do some Outlier Tests on those Objects.
| Constructor Summary | |
|---|---|
SearchSpace()
This constructor creates a SearchSpace with (integer) 2 dimensions as a default and initializes all fields in the instance of that Class with zero values where appropriate. |
|
SearchSpace(int dim)
This constructor creates a SearchSpace with (integer) dim dimensions and initializes all fields in the instance of that Class with zero values where appropriate. |
|
SearchSpace(int dim,
int minptslb,
int minptsub)
This constructor creates a SearchSpace with (integer) dim dimensions and initializes all fields in the instance of that Class with zero values where appropriate. |
|
| Method Summary | |
|---|---|
void |
addObject(SearchObject objectToAdd)
This adds a SearchObject to the SearchSpace. |
void |
allRadiusSearch(double d,
double p,
int kindOfDistance)
This method invokes the class method radiusODSearch on all objects in the SearchSpace (associated to this Searchroom via the listOfObjects vektor). radiusODSearch does a brute force distance Outlier test based on the parameters d and p for DB(p,d)-Outliers acc. to Knorr and Ng's approach to unify statistical Outlier tests. |
void |
computeDKN(int dk,
int n)
This function computes the D^k_n Outliers according to Ramaswamy, Rastogi and Shim which computes the top-n D^k-Outliers, the outliers (= objects) with the maximum distance to the k-th nearest neighbors. |
void |
computeLOF(int kMin,
int kMax)
Some deeper magic to compute all the LOFs for the objects in the searchroom up to MinPtsUB = kMax! |
void |
findAllKdContainers(int kindOfDistance)
Finds and fills all K distance containers for all objects in the Search Room by invoking the process of finding all k distance containers for one Search Object. |
void |
findKdistanceContainers(SearchObject so,
int kindOfDistance)
This method processes a sequential search over the SearchSpace for a SearchObject so (named p here to be in line with the literature). |
double[] |
getAverageDistanceMeasures(int kindOfDistance)
Returns the average distances measures for the objects in the SearchSpace, calculating: mean distance standard deviation variance The calculation is time consuming and should only be invoked if the data set is parsed for the first time (to get a feeling on it for statistical choices of parameters p and d for e.g. |
double[] |
getAverageLOFMeasures()
Returns the average LOF measures for the objects in the SearchSpace, calculating: mean LOF standard deviation variance |
int |
getDimensions()
Returns the number of dimensions of the SearchSpace. |
double |
getMaximumOutlierFactor()
This method returns the maximum Outlier Factor of all SearchObjects in the SearchSpace. |
int |
getNumberOfObjects()
Returns the (integer) number of objects in the Searchroom (associated with it via addObject(SearchObject) to the room) as an integer value as we overall do
not expect the searchroom to hold more than 2 billion objects. |
SearchObject |
getObject(int index)
This method returns a SearchObject with the i-th index in the listOfObjects; the result has to be casted to SearchObject (Vector Class speciality, as it returns only a JAVA Object Class object). |
java.util.Enumeration |
getObjects()
This method returns an Enumeration of all SearchObjects from a SearchSpace. |
boolean |
getSearchObjectOutlierStatus(int i)
This method returns the outlierstatus of the Searchobject (element at index i) in the SearchSpace from the Searchroom's listOfObjects. |
java.util.Vector<SearchObject> |
getSearchObjects()
Delivers the list of objects. |
void |
radiusODSearch(double d,
double p,
SearchObject rObject,
int kindOfDistance)
BruteForce Radius Search to determine the outlier status of an object rObject of the type SearchObject this method takes d and p as parameters acc. to distance based DB(p,D)-Outlier (Knorr, Ng) and identifies an object as being an outlier, if more than a proportion p of the objects is more than distance D from rObject away. |
void |
resetOutlierStatus()
This method resets the Outlier Status for all Objects in the Search room to have a clean start or to have a new identification of outliers with a separate method. |
void |
setDimensions(int dim)
Sets the number of dimensions for the SearchSpace to dim. |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public SearchSpace(int dim)
public SearchSpace()
public SearchSpace(int dim,
int minptslb,
int minptsub)
| Method Detail |
|---|
public int getNumberOfObjects()
addObject(SearchObject) to the room) as an integer value as we overall do
not expect the searchroom to hold more than 2 billion objects.
public void setDimensions(int dim)
Sets the number of dimensions for the SearchSpace to dim.
Attention: This is a value that the SearchSpace keeps for the purpose of consistency checks for all SearchObjects (as each SearchObject has its own number of dimensions and not all the dimensions of the SearchObjects need to be the same - to give implementation freedom).
dim - public int getDimensions()
public java.util.Vector<SearchObject> getSearchObjects()
public boolean getSearchObjectOutlierStatus(int i)
i -
public void addObject(SearchObject objectToAdd)
It prints a warning to STDOUT in case the dimensions of the SearchObject and SearchSpace are incompatible, but as the SearchSpace can perform some operations over SearchObjects with different dimensions, this is not a showstopper.
The method also automatically updates the min/max/range information the SearchSpace knows for itself.
objectToAdd - public SearchObject getObject(int index)
index - public java.util.Enumeration getObjects()
public void resetOutlierStatus()
public void radiusODSearch(double d,
double p,
SearchObject rObject,
int kindOfDistance)
public void allRadiusSearch(double d,
double p,
int kindOfDistance)
public double[] getAverageDistanceMeasures(int kindOfDistance)
mean distance
standard deviation
variance The calculation is time consuming and should only be invoked if the data set is parsed for the first time (to get a feeling on it for statistical choices of parameters p and d for e.g. DB(p,d)-Outliers). It parses the objects matrix upper half to build an array of distances between objects (without doubling and without the distances of objects to themselves) which should be (n^2-n)/2 distances of value.
public double[] getAverageLOFMeasures()
mean LOF
standard deviation
variance
public double getMaximumOutlierFactor()
public void findKdistanceContainers(SearchObject so,
int kindOfDistance)
This method processes a sequential search over the SearchSpace for a SearchObject so (named p here to be in line with the literature).
As a result of the search a structure of k-distance-Containers is build and listed within the SearchObject. Each container for a distance of an object or a number of objects o in relation to p is filled with all the objects within that distance. The containers are sorted in a linked list in the SearchObject by increasing distance. Just imagine it like p being a submarine sending a ping and listing all echos in radiuses (=distance) with the echos stored in a band (=container) if they are on the same radius.
so - public void findAllKdContainers(int kindOfDistance)
public void computeLOF(int kMin,
int kMax)
Some deeper magic to compute all the LOFs for the objects in the searchroom up to MinPtsUB = kMax! The LOF output is only done up from kMin!
This one is heavily documented in the source, so if you are interested on how it is done, have a look at the source for the method.
kMin - kMax -
public void computeDKN(int dk,
int n)
dk - n -
|
|
|||||||||
| PREV CLASS NEXT CLASS | FRAMES NO FRAMES | |||||||||
| SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD | |||||||||