"I have encountered various learning environments,
but none so broad, powerful, and easy-to-use as RapidMiner / YALE.
Many of us who are not skilled in programming are thankful."
On Tuesday, May 29th, 2007, Rapid-I will release the next version of the
open-source data mining software YALE, YALE 4.0beta, and change
its name to RapidMiner 4.0beta.
Because of legal issues, Rapid-I decided to change the name of YALE.
The name RapidMiner was chosen because it fits both the company name Rapid-I
and the naming scheme of the product line that Rapid-I plans
for the future.
Only the name changes.
Everything else stays the same.
RapidMiner / YALE will remain open-source software under GNU GPL
and available to end-users free of charge.
About RapidMiner / YALE
RapidMiner / YALE is a rapid prototyping system for knowledge discovery and data mining.
While Rapid-I ensures the maintenance and further development of RapidMiner
and the support of its users, RapidMiner will remain open-source (GPL).
So, for end users of RapidMiner, nothing changes: using RapidMiner is free
of charge.
For developers of closed-source software who would like to integrate RapidMiner
into their products, there is now the option of acquiring a developer
license (OEM) for RapidMiner and using it as a powerful library to enhance
their products with learnability, adaptability, and innovative analytical
features.
In addition to this dual licensing, Rapid-I continues the maintenance
and further development of RapidMiner as well as the support of its rapidly growing
user base.
For and beyond RapidMiner, Rapid-I also offers services like consulting,
professional support, customization, and integration.
With more than 400 data mining operators, RapidMiner / YALE is the technologically leading
and most comprehensive open-source data mining software worldwide. It is widely used
by a large number of organizations across a wide range of industries.
Today, thousands of applications of RapidMiner / YALE in more than 30 countries give their users
a competitive edge.
Changes from YALE 3.4 to RapidMiner 4.0beta
Preview
General Improvements:
Improved overall speed: Most YALE runs now use less than 60% of the runtime needed before.
Extensive API changes now better support embedding YALE in your own applications.
All YALE file formats are now based on XML.
Improved printing.
Several bugfixes.
New Operators:
More than 80 new operators in total, including:
Several operators for outlier detection
FPGrowth (fast and memory efficient association rule mining)
A large number of new learning and meta-learning schemes
CostBasedThresholdLearner (also allows examples to be classified as unknown)
(Weighted) Bootstrapping and BootstrappingValidation
Learning missing values (instead of simply replacing them)
Many new preprocessing operators such as merge, Cartesian product, group by, aggregation,
and sorting
Writing data sets to databases is now possible
Generic attribute subset preprocessing
Generic visualization of models via dimensionality reduction
New ANOVA matrix
The clustering plugin is now part of the YALE core and
hence no longer needs to be installed separately.
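To illustrate the idea behind the bootstrapping operators listed above, here is a minimal Python sketch of one common form of bootstrapping validation: draw a training sample with replacement, then test on the examples that were never drawn. This is not the code of the YALE operators. The function names, the `fit` and `predict` callbacks, and the use of accuracy as the score are all assumptions made for this example, and weighted sampling is left out.

```python
import random


def bootstrap_split(n_examples, rng):
    """Draw n indices with replacement; indices never drawn form the test set."""
    train = [rng.randrange(n_examples) for _ in range(n_examples)]
    drawn = set(train)
    test = [i for i in range(n_examples) if i not in drawn]
    return train, test


def bootstrap_validation(examples, labels, fit, predict, rounds=10, seed=0):
    """Average the accuracy on the held-out examples over several rounds.

    fit(train_examples, train_labels) returns a model and
    predict(model, example) returns a label (both are hypothetical callbacks).
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(rounds):
        train, test = bootstrap_split(len(examples), rng)
        if not test:
            # Every example was drawn, so there is nothing to test on.
            continue
        model = fit([examples[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(model, examples[i]) == labels[i] for i in test)
        scores.append(hits / len(test))
    if not scores:
        raise ValueError("no round produced a non-empty test set")
    return sum(scores) / len(scores)
```

Any learner can be plugged in through the two callbacks. A trivial majority-class learner, for example, would be `fit = lambda xs, ys: max(set(ys), key=ys.count)` together with `predict = lambda model, x: model`.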
New Look and Feel:
Drag & Drop for operator trees
Completely revised look and feel as well as icons
New file chooser providing favorites
All tables (viewers) can be sorted by all columns by
clicking on the corresponding table headers
All textual results now support text selection allowing
for copy and paste into other applications
Log scale added as a new option for the standard scatter plotter
Several chart plots added (new bars 2D and 3D, pie charts 2D and 3D, bubble plotter)
Graphical User Interface (GUI) is now able to immediately stop a running experiment
Graph view for Bayesian net models added
Textual and graphical view modes added for models that support both,
e.g. decision trees and Bayesian nets
Result history viewer showing textual descriptions of all experiment results
in the session so far; also allows ANOVA calculations across different results
New Functions:
Improved example filter (now also supporting inversion and concatenations)
New performance criteria: Spearman's rho and Kendall's tau
New data representation types based on short or even boolean values,
further reducing the amount of memory needed
New HSQLDB JDBC database drivers
ExampleSetWriter now also supports zipped data files
Source definition added for all IO objects,
i.e. results can now show which operator created them, if necessary
Improved automatic value type guessing in ExampleSource configuration wizard
Weighted performance measures added: weighted means of the per-class recalls and precisions
Model writing and loading now also works for zipped files (.gz)
Improved attribute statistics handling and display
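
For readers unfamiliar with the two rank correlation criteria named above, the following Python sketch shows how Spearman's rho and Kendall's tau can be computed for a list of true labels and a list of predictions. It is written from the textbook definitions and is not taken from the RapidMiner sources. The function names are made up for this example, and the Kendall variant shown is tau-a, which counts tied pairs as neither concordant nor discordant; the variant used by the software may differ.

```python
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(labels, predictions):
    """Spearman's rho: the Pearson correlation of the rank-transformed values."""
    return pearson(average_ranks(labels), average_ranks(predictions))


def kendall_tau(labels, predictions):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(labels)), 2):
        sign = (labels[i] - labels[j]) * (predictions[i] - predictions[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    pairs = len(labels) * (len(labels) - 1) / 2
    return (concordant - discordant) / pairs
```

With labels `[1.0, 2.0, 3.0, 4.0, 5.0]` and predictions `[1.2, 1.9, 3.5, 3.1, 5.4]`, only one pair of examples is ranked in the wrong order, so rho comes out at 0.9 and tau at 0.8.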