Operator Overview

RapidMiner (formerly YALE) and its plugins provide more than 400 operators for all aspects of data mining. Meta operators automatically optimize experiment designs, so users no longer need to tune individual steps or parameters by hand. A large number of visualization techniques, together with the ability to place breakpoints after each operator, give insight into how well your design works, even online while experiments are running. This page discusses the main groups of operators and gives example operators for each group.

 

RapidMiner provides operators for:

Input and output: flexible operators for data input and output in different file formats, including
  • Known data mining and learning scheme formats (ARFF, C4.5, CSV ...)
  • Sparse file formats (known from SVMlight, mySVM ...)
  • Excel files
  • SPSS files
  • Data sets from databases (Oracle, MySQL, PostgreSQL, Microsoft SQL Server, Sybase ...)
  • dBase
  • Text files (Word Vector plugin) and Audio files (Value Series Plugin)
  • and more
Machine learning algorithms: more than 100 learning schemes for regression, classification, and clustering tasks, including
  • Support Vector Machines (SVM, LibSVM, SMO, mySVM ...)
  • Decision Tree and Rule Learners (ID3, C4.5, PART, PRISM, RIPPER ...)
  • Lazy Learners (Nearest Neighbors, K*, LBR ...)
  • Bayesian Learners (Naive Bayes, Bayes Net, AODE ...)
  • Logistic Learners (Logistic Regression, SimpleLogistic ...)
  • Gaussian Processes
  • Meta Learning (AdaBoost, Bagging, Stacking, BayesianBoosting ...)
  • Association Rule Mining (Apriori, Tertius ...)
  • Clustering (Clustering Plugin: k-Means, k-Medoids, DBSCAN, SVClustering ...)
  • and more
Weka operators: all learning schemes and attribute evaluators of the Weka learning environment are also available and can be used like any other RapidMiner operator
Data preprocessing: operators that often have to be applied before the learning process, including
  • Discretization (Binning, Frequency ...)
  • Example and feature filtering (Conditioned, ValueTypeFilter ...)
  • Normalization (Interval, Standardization, z-Transformation ...)
  • Sampling (Simple, Stratified, ModelBased ...)
  • Dimensionality Reduction (PCA, Kernel-PCA, GHA, ICA ...)
  • Missing and infinite value replenishment
  • Removal of useless features
  • and more
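To make the normalization entries above concrete, here is a minimal Python sketch of a z-transformation and an interval (min-max) rescaling. It is an illustration only, not RapidMiner's implementation: the function names are hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def z_transform(values):
    """Rescale values to zero mean and unit standard deviation.

    Assumption: the population standard deviation is used; a tool may
    use the sample standard deviation instead.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant attribute carries no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def interval_normalize(values, lo=0.0, hi=1.0):
    """Linearly map values onto the interval [lo, hi]."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # A constant attribute is mapped to the lower bound.
        return [lo for _ in values]
    scale = (hi - lo) / (vmax - vmin)
    return [lo + (v - vmin) * scale for v in values]
```

For example, `interval_normalize([2, 4, 6])` maps the smallest value to 0, the largest to 1, and the middle value halfway between them.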
Feature operators:
  • Feature Selection (Forward Selection, Backward Elimination, Genetic Algorithms, WeightGuided ...)
  • Feature Weighting and Relevance (ChiSquared, Correlation, InfoGain, RelieF ...)
  • Feature Construction (GGA, YAGGA ...)
  • Feature Extraction from time series (Value Series Plugin)
  • and more
Performance evaluation: several validation and evaluation schemes to estimate the performance of learning or preprocessing on your data set, e.g.
  • Cross-validation (stratified, shuffled, non-shuffled ...)
  • Training and test set splitting (random, fixed ...)
  • Leave-one-out
  • Significance tests (ANOVA, paired t-Tests ...)
  • A large number of performance criteria for classification and regression (absolute, relative, accuracy, precision, recall, kappa, AUC ...)
  • and more
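As a rough illustration of what stratified cross-validation does, the following Python sketch assigns example indices to k folds so that each fold keeps approximately the class proportions of the full data set. It is not taken from RapidMiner; the function names, the round-robin assignment, and the `seed` parameter are assumptions made for this example.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Split example indices into k folds with similar class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    next_fold = 0  # running position so that fold sizes stay balanced
    for indices in by_class.values():
        rng.shuffle(indices)  # shuffle within each class
        for idx in indices:
            folds[next_fold % k].append(idx)
            next_fold += 1
    return folds


def train_test_splits(labels, k, seed=0):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    folds = stratified_folds(labels, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

Each fold serves once as the test set while the remaining folds form the training set. Setting k to the number of examples gives leave-one-out, although stratification is then no longer meaningful.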
Meta operators: several optimization operators for experiment design, e.g.
  • Parameter Optimization (Grid, Quadratic, Evolutionary ...)
  • Learning Curves
  • Experiment loops and iterations
  • and more
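The idea behind grid parameter optimization can be sketched in a few lines of Python: every combination of candidate parameter values is evaluated and the best-scoring one is kept. This is a generic sketch, not RapidMiner's operator; `grid_search`, `evaluate`, and the parameter names in the usage note are hypothetical.

```python
from itertools import product


def grid_search(evaluate, grid):
    """Try every parameter combination in grid and return the best one.

    evaluate: callable taking the parameters as keyword arguments and
              returning a score, where higher is better (for example a
              cross-validated accuracy).
    grid:     dict mapping each parameter name to its candidate values.
    """
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for combo in product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = evaluate(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

A call such as `grid_search(score_fn, {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1.0]})` would evaluate nine combinations. The cost grows multiplicatively with each added parameter, which is one reason evolutionary or other search strategies are offered as alternatives.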
Visualization: logging and presentation of results, including the visualization of
  • Online 1D, 2D and 3D plots of your data and experiment results (ExperimentLog, DataView, MetaDataView ...)
  • Built-in color, histogram, and distribution plots
  • Quartile / box plots
  • Learned Models (Tree View, ClusterModel graphs ...)
  • High-dimensional data (Andrews Curves, GridViz, Parallel, RadViz, Survey, SOM ...)
  • SVM functions (HyperplaneProjection, AttributeFunction ...)
  • ROC plots
  • Lift Charts
  • and more

 