Optimize Selection (Evolutionary)
From Rapid-I-Wiki
A genetic algorithm for feature selection.
Description
A genetic algorithm for feature selection. Mutation switches features on and off; crossover interchanges the used features between individuals. Selection is done by roulette wheel. Genetic algorithms are general-purpose optimization and search algorithms that are suitable when little or no problem knowledge is available.
The algorithm proceeds as follows:
1. Generate an initial population consisting of population_size individuals. Each attribute is switched on with probability p_initialize.
2. For all individuals in the population:
   - Perform mutation, i.e. set used attributes to unused with probability p_mutation and vice versa.
   - Choose two individuals from the population and perform crossover with probability p_crossover. The type of crossover can be selected by crossover_type.
3. Perform selection: map all individuals to sections on a roulette wheel whose size is proportional to the individual's fitness and draw population_size individuals at random according to their probability.
4. As long as the fitness improves, go to 2.
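The loop described above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the operator's actual implementation: the function name evolve_selection, the toy_fitness function, and the choice of a uniform crossover on a single random pair per generation are all hypothetical, and fitness values are assumed to be non-negative so that they can serve directly as roulette wheel weights.

```python
import random


def evolve_selection(n_attributes, fitness, population_size=5, p_initialize=0.5,
                     p_mutation=-1.0, p_crossover=0.5, max_generations=30,
                     seed=1992):
    """Hypothetical sketch of the described loop; returns (best_mask, best_fitness)."""
    rng = random.Random(seed)
    if p_mutation < 0:
        # Convention from the parameter list: -1 means 1 / numberOfAtt.
        p_mutation = 1.0 / n_attributes

    # Step 1: each attribute is switched on with probability p_initialize.
    population = [[rng.random() < p_initialize for _ in range(n_attributes)]
                  for _ in range(population_size)]
    best, best_fitness = None, float("-inf")

    for _ in range(max_generations):
        # Step 2a: mutation flips each attribute with probability p_mutation.
        population = [[(not bit) if rng.random() < p_mutation else bit
                       for bit in individual] for individual in population]

        # Step 2b: with probability p_crossover, pick two individuals and
        # swap attributes between them (uniform crossover, an assumption).
        if population_size >= 2 and rng.random() < p_crossover:
            i, j = rng.sample(range(population_size), 2)
            a, b = population[i], population[j]
            for k in range(n_attributes):
                if rng.random() < 0.5:
                    a[k], b[k] = b[k], a[k]

        scores = [fitness(individual) for individual in population]
        generation_best = max(scores)

        # Step 4: continue only as long as the best fitness improves.
        if generation_best <= best_fitness:
            break
        best_fitness = generation_best
        best = list(population[scores.index(generation_best)])

        # Step 3: roulette wheel selection, sections proportional to fitness.
        # If every score is zero, fall back to a uniform draw.
        weights = scores if sum(scores) > 0 else None
        population = [list(individual) for individual in
                      rng.choices(population, weights=weights, k=population_size)]

    return best, best_fitness


def toy_fitness(mask):
    """Made-up fitness: rewards attributes 0 and 2, penalizes all others."""
    relevant = {0, 2}
    hits = sum(1 for i, on in enumerate(mask) if on and i in relevant)
    extras = sum(1 for i, on in enumerate(mask) if on and i not in relevant)
    return hits / (1.0 + extras)


best_mask, best_score = evolve_selection(6, toy_fitness)
print(best_mask, best_score)
```

In the operator itself, the fitness would presumably be the performance delivered by the inner process for the selected attribute subset; the toy function only stands in for that evaluation.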
Input
- example set in: expects: ExampleSetMetaData: #examples: = 0; #attributes: 0
- attribute weights in: optional: AttributeWeights
- through 1:
Output
- example set out:
- weights:
- performance:
Parameters
- use exact number of attributes:
Determines if only combinations containing this number of attributes should be tested.
Range: boolean; default: false
- restrict maximum:
If checked, the maximum number of attributes can be restricted. Otherwise, combinations of any number of attributes are generated and tested.
Range: boolean; default: false
- min number of attributes:
Determines the minimum number of features used for the combinations.
Range: integer; 1-+∞; default: 1
- max number of attributes:
Determines the maximum number of features used for the combinations.
Range: integer; 1-+∞; default: 1
- exact number of attributes:
Determines the exact number of features used for the combinations.
Range: integer; 1-+∞; default: 1
- initialize with input weights:
Indicates if this operator should look for attribute weights in the given input and use the input weights of all known attributes as the starting point for the optimization.
Range: boolean; default: false
- population size:
Number of individuals per generation.
Range: integer; 1-+∞; default: 5
- maximum number of generations:
Number of generations after which to terminate the algorithm.
Range: integer; 1-+∞; default: 30
- use early stopping:
Enables early stopping. If unchecked, the maximum number of generations is always performed.
Range: boolean; default: false
- generations without improval:
Stop criterion: stop after n generations without improvement of the performance.
Range: integer; 1-+∞; default: 2
- normalize weights:
Indicates if the final weights should be normalized.
Range: boolean; default: true
- use local random seed:
Indicates if a local random seed should be used.
Range: boolean; default: false
- local random seed:
Specifies the local random seed.
Range: integer; 1-+∞; default: 1992
- show stop dialog:
Determines if a dialog with a button should be displayed which stops the run: the best individual is returned.
Range: boolean; default: false
- user result individual selection:
Determines if the user wants to select the final result individual from the last population.
Range: boolean; default: false
- show population plotter:
Determines if the current population should be displayed in performance space.
Range: boolean; default: false
- plot generations:
Update the population plotter in these generations.
Range: integer; 1-+∞; default: 10
- constraint draw range:
Determines if the draw range of the population plotter should be constrained between 0 and 1.
Range: boolean; default: false
- draw dominated points:
Determines if only points which are not Pareto dominated should be painted.
Range: boolean; default: true
- population criteria data file:
The path to the file in which the criteria data of the final population should be saved.
Range: filename
- maximal fitness:
The optimization will stop if the fitness reaches the defined maximum.
Range: real; 0.0-+∞
- selection scheme:
The selection scheme of this EA.
Range: uniform, cut, roulette wheel, stochastic universal sampling, Boltzmann, rank, tournament, non dominated sorting; default: tournament
- tournament size:
The fraction of the current population which should be used as tournament members.
Range: real; 0.0-1.0
- start temperature:
The scaling temperature.
Range: real; 0.0-+∞
- dynamic selection pressure:
If set to true, the selection pressure is increased to its maximum over the course of the complete optimization run.
Range: boolean; default: true
- keep best individual:
If set to true, the best individual of each generation is guaranteed to be selected for the next generation (elitist selection).
Range: boolean; default: false
- save intermediate weights:
Determines if the intermediate best results should be saved.
Range: boolean; default: false
- intermediate weights generations:
The intermediate best results are saved every k generations, where k is the value specified here.
Range: integer; 1-+∞; default: 10
- intermediate weights file:
The file into which the intermediate weights will be saved.
Range: filename
- p initialize:
Initial probability for an attribute to be switched on.
Range: real; 0.0-1.0
- p mutation:
Probability for an attribute to be changed (-1: 1 / numberOfAtt).
Range: real; -1.0-1.0
- p crossover:
Probability for an individual to be selected for crossover.
Range: real; 0.0-1.0
- crossover type:
Type of the crossover.
Range: one_point, uniform, shuffle; default: uniform
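To make a few of the parameters above more concrete, the following Python sketch shows one plausible reading of p mutation (with -1 resolved to 1 / numberOfAtt), of the one_point and uniform crossover types, and of tournament size as a fraction of the population. All function names are hypothetical and the logic follows common textbook definitions rather than the operator's source code. The shuffle crossover type is left out because this page does not describe how it works.

```python
import random


def resolve_p_mutation(p_mutation, n_attributes):
    """A negative value (the documented -1) is read as 1 / numberOfAtt."""
    return 1.0 / n_attributes if p_mutation < 0 else p_mutation


def crossover(a, b, crossover_type, rng):
    """Return two children built from the attribute masks a and b."""
    n = len(a)
    if crossover_type == "one_point":
        if n < 2:
            return list(a), list(b)
        # Swap everything behind a randomly chosen cut position.
        cut = rng.randrange(1, n)
        return a[:cut] + b[cut:], b[:cut] + a[cut:]
    if crossover_type == "uniform":
        # Swap each position independently with probability 0.5 (assumed).
        child_a, child_b = list(a), list(b)
        for k in range(n):
            if rng.random() < 0.5:
                child_a[k], child_b[k] = child_b[k], child_a[k]
        return child_a, child_b
    raise ValueError("crossover type not covered by this sketch: " + crossover_type)


def tournament_select(population, scores, tournament_size, rng):
    """Draw a new population; tournament_size is a fraction of the population."""
    members_per_round = max(1, round(tournament_size * len(population)))
    selected = []
    for _ in range(len(population)):
        members = rng.sample(range(len(population)), members_per_round)
        winner = max(members, key=lambda i: scores[i])
        selected.append(list(population[winner]))
    return selected


rng = random.Random(1992)
print(resolve_p_mutation(-1.0, 4))
print(crossover([True] * 6, [False] * 6, "one_point", rng))
```

A larger tournament size makes it more likely that the fittest individual takes part in each round, which is one way to think about selection pressure.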