Optimize Selection (Evolutionary)

From Rapid-I-Wiki



A genetic algorithm for feature selection.


Description

A genetic algorithm for feature selection (mutation = switch features on and off, crossover = interchange used features). Selection is done by roulette wheel. Genetic algorithms are general-purpose optimization and search algorithms that are suitable when little or no problem knowledge is available.

A genetic algorithm works as follows:
    1. Generate an initial population consisting of population_size individuals. Each attribute is switched on with probability p_initialize.
    2. For all individuals in the population:
        • Perform mutation, i.e. set used attributes to unused with probability p_mutation and vice versa.
        • Choose two individuals from the population and perform crossover with probability p_crossover. The type of crossover can be selected by crossover_type.
    3. Perform selection: map all individuals to sections on a roulette wheel whose size is proportional to the individual's fitness and draw population_size individuals at random according to their probability.
    4. As long as the fitness improves, go to 2.
If the example set contains value series attributes with block numbers, the whole block is switched on and off together.
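
The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the operator's actual implementation: the function names, the `patience` argument, the default values chosen for `p_initialize` and `p_crossover`, and the way offspring are pooled with their parents before selection are all assumptions made for the example. Fitness values are assumed to be non-negative so that they can be used directly as roulette wheel weights.

```python
import random


def roulette_wheel(pool, fitnesses, k, rng):
    """Draw k individuals; each one's chance is proportional to its fitness."""
    if sum(fitnesses) <= 0:  # degenerate wheel: fall back to uniform draws
        return [list(rng.choice(pool)) for _ in range(k)]
    return [list(ind) for ind in rng.choices(pool, weights=fitnesses, k=k)]


def uniform_crossover(a, b, rng):
    """Swap each attribute flag between the two parents with probability 0.5."""
    child_a, child_b = list(a), list(b)
    for i in range(len(a)):
        if rng.random() < 0.5:
            child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b


def evolve_selection(n_attributes, fitness, population_size=5,
                     p_initialize=0.5, p_mutation=-1.0, p_crossover=0.5,
                     max_generations=30, patience=2, seed=1992):
    """Return (best attribute mask, its fitness). Needs population_size >= 2."""
    rng = random.Random(seed)
    if p_mutation < 0:  # -1 stands for 1 / numberOfAtt
        p_mutation = 1.0 / n_attributes

    # Step 1: random initial population of on/off attribute masks.
    population = [[rng.random() < p_initialize for _ in range(n_attributes)]
                  for _ in range(population_size)]
    best = list(max(population, key=fitness))
    best_fitness = fitness(best)
    stale = 0

    for _ in range(max_generations):
        pool = [list(ind) for ind in population]
        # Step 2: mutation and crossover for all individuals.
        for ind in population:
            mutant = [(not flag) if rng.random() < p_mutation else flag
                      for flag in ind]
            pool.append(mutant)
            if rng.random() < p_crossover:
                a, b = rng.sample(population, 2)
                pool.extend(uniform_crossover(a, b, rng))

        # Step 3: roulette wheel selection of the next generation.
        fitnesses = [fitness(ind) for ind in pool]
        population = roulette_wheel(pool, fitnesses, population_size, rng)

        # Step 4: continue only as long as the fitness improves.
        generation_best = max(population, key=fitness)
        generation_fitness = fitness(generation_best)
        if generation_fitness > best_fitness:
            best, best_fitness, stale = list(generation_best), generation_fitness, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best, best_fitness
```

Here `fitness` is any function that maps a list of booleans (one flag per attribute) to a non-negative score, for example the cross-validated performance of a learner trained on the selected attributes.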



Input

  • example set in: expects: ExampleSetMetaData: #examples: = 0; #attributes: 0
  • attribute weights in: optional: AttributeWeights
  • through 1:


Output

  • example set out:
  • weights:
  • performance:


Parameters

  • use exact number of attributes:
    Determines if only combinations containing exactly the specified number of attributes should be tested.
    Range: boolean; default: false
  • restrict maximum:
    If checked, the maximal number of attributes can be restricted. Otherwise, combinations of every possible number of attributes are generated and tested.
    Range: boolean; default: false
  • min number of attributes:
    Determines the minimum number of features used for the combinations.
    Range: integer; 1-+∞; default: 1
  • max number of attributes:
    Determines the maximum number of features used for the combinations.
    Range: integer; 1-+∞; default: 1
  • exact number of attributes:
    Determines the exact number of features used for the combinations.
    Range: integer; 1-+∞; default: 1
  • initialize with input weights:
    Indicates if this operator should look for attribute weights in the given input and use the input weights of all known attributes as starting point for the optimization.
    Range: boolean; default: false
  • population size:
    Number of individuals per generation.
    Range: integer; 1-+∞; default: 5
  • maximum number of generations:
    Number of generations after which to terminate the algorithm.
    Range: integer; 1-+∞; default: 30
  • use early stopping:
    Enables early stopping. If unchecked, the maximum number of generations is always performed.
    Range: boolean; default: false
  • generations without improval:
    Stop criterion: stop after n generations without improvement of the performance.
    Range: integer; 1-+∞; default: 2
  • normalize weights:
    Indicates if the final weights should be normalized.
    Range: boolean; default: true
  • use local random seed:
    Indicates if a local random seed should be used.
    Range: boolean; default: false
  • local random seed:
    Specifies the local random seed.
    Range: integer; 1-+∞; default: 1992
  • show stop dialog:
    Determines if a dialog with a stop button should be displayed. Pressing the button stops the run and returns the best individual found so far.
    Range: boolean; default: false
  • user result individual selection:
    Determines if the user wants to select the final result individual from the last population.
    Range: boolean; default: false
  • show population plotter:
    Determines if the current population should be displayed in performance space.
    Range: boolean; default: false
  • plot generations:
    Update the population plotter in these generations.
    Range: integer; 1-+∞; default: 10
  • constraint draw range:
    Determines if the draw range of the population plotter should be constrained between 0 and 1.
    Range: boolean; default: false
  • draw dominated points:
    Determines if only points which are not Pareto dominated should be painted.
    Range: boolean; default: true
  • population criteria data file:
    The path to the file in which the criteria data of the final population should be saved.
    Range: filename
  • maximal fitness:
    The optimization will stop if the fitness reaches the defined maximum.
    Range: real; 0.0-+∞
  • selection scheme:
    The selection scheme of this EA.
    Range: uniform, cut, roulette wheel, stochastic universal sampling, Boltzmann, rank, tournament, non dominated sorting; default: tournament
  • tournament size:
    The fraction of the current population which should be used as tournament members.
    Range: real; 0.0-1.0
  • start temperature:
    The scaling temperature.
    Range: real; 0.0-+∞
  • dynamic selection pressure:
    If set to true, the selection pressure is increased to its maximum over the course of the complete optimization run.
    Range: boolean; default: true
  • keep best individual:
    If set to true, the best individual of each generation is guaranteed to be selected for the next generation (elitist selection).
    Range: boolean; default: false
  • save intermediate weights:
    Determines if the intermediate best results should be saved.
    Range: boolean; default: false
  • intermediate weights generations:
    Determines how often the intermediate best results are saved: every k generations for the specified value of k.
    Range: integer; 1-+∞; default: 10
  • intermediate weights file:
    The file into which the intermediate weights will be saved.
    Range: filename
  • p initialize:
    Initial probability for an attribute to be switched on.
    Range: real; 0.0-1.0
  • p mutation:
    Probability for an attribute to be changed. A value of -1 means 1 / numberOfAtt.
    Range: real; -1.0-1.0
  • p crossover:
    Probability for an individual to be selected for crossover.
    Range: real; 0.0-1.0
  • crossover type:
    Type of the crossover.
    Range: one_point, uniform, shuffle; default: uniform
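
To make the crossover type and p mutation parameters more concrete, the following Python sketch shows the textbook form of the three crossover variants and how the special value -1 for p mutation can be resolved. Whether the operator implements one_point, uniform, and shuffle crossover exactly this way is an assumption; the function names and the `CROSSOVER_TYPES` table are hypothetical and exist only for this example. Parents are lists of on/off attribute flags with at least two entries.

```python
import random


def one_point(a, b, rng):
    """Cut both parents at one random position and exchange the tails."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def uniform(a, b, rng):
    """Exchange each position independently with probability 0.5."""
    child_a, child_b = list(a), list(b)
    for i in range(len(a)):
        if rng.random() < 0.5:
            child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b


def shuffle(a, b, rng):
    """Shuffle positions, apply one-point crossover, then undo the shuffle."""
    order = list(range(len(a)))
    rng.shuffle(order)
    shuffled_a = [a[i] for i in order]
    shuffled_b = [b[i] for i in order]
    crossed_a, crossed_b = one_point(shuffled_a, shuffled_b, rng)
    child_a, child_b = [None] * len(a), [None] * len(a)
    for position, original_index in enumerate(order):
        child_a[original_index] = crossed_a[position]
        child_b[original_index] = crossed_b[position]
    return child_a, child_b


# Hypothetical lookup table keyed by the values of the crossover type parameter.
CROSSOVER_TYPES = {"one_point": one_point, "uniform": uniform, "shuffle": shuffle}


def effective_p_mutation(p_mutation, number_of_attributes):
    """Resolve the special value -1 to 1 / numberOfAtt."""
    if p_mutation == -1:
        return 1.0 / number_of_attributes
    return p_mutation
```

In all three variants each position of the two children holds exactly the two flags the parents had at that position, so crossover only redistributes attribute choices between individuals; new on/off values enter the population through mutation alone.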

