New release: Rapid-I has released RapidMiner 4.2, a new version of the leading open-source data mining solution
RapidMiner 4.2: New Small Enterprise Edition
As before, the new release is available in several editions. Users can choose between the free Community Edition of RapidMiner and one of the Enterprise Editions, which are more suitable for professional data analysts:
Enterprise Edition =
Community Edition +
More Features +
Services +
Guarantees
The good news for smaller companies or single users is that we now also provide the new Small Enterprise Edition. Check out the details at the Enterprise Edition feature page!
RapidMiner 4.2: Focus on Large-Scale Data Mining
This release combines minor bug fixes with major new features. Database access has been improved and now provides constant access times for databases of arbitrary size. The focus on large-scale data mining is also reflected in the re-implementation of several learning schemes, which speeds them up by a factor of up to 13. Several unnecessary data scans were removed, reducing runtimes further.
General Improvements:
Aggregation now supports multiple grouping and multiple aggregation attributes with different aggregation functions each
Date type now better supported including several powerful conversion operators for almost arbitrary date and time formats
Histogram plotters now support jittering and log scales
Improved database wizard now supports larger data sets, which caused memory problems in older versions during table and attribute name retrieval
Statistics for larger data sets are now only calculated on request
Reduced the iteration time through partitioned / split data sets
All plotters can now handle missing values
Many plotters now support the plotting of absolute values and / or sorting according to the plotted column
One-Class SVM for LibSVMLearner now properly supported
New operators GroupModel and UngroupModel now replace the automatic building of ContainerModels (merging preprocessing with prediction models) and hence give the user more control over the model building / grouping process
AttributeSubsetPreprocessing now supports the inversion of the specified regular expression
AttributeSubsetPreprocessing can now be applied to subsets defined in the same way as for the new AttributeFilter operator. Hence, the subset preprocessing can, for example, be performed on nominal or numerical attributes only
The database example set writer now supports new overwriting / appending modes
and a lot more
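To illustrate what multiple grouping attributes combined with a different aggregation function per attribute means, here is a minimal sketch in plain Python. It does not use RapidMiner's API: the function name aggregate, the attribute names and the sample rows are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def aggregate(rows, group_by, aggregations):
    """Group rows (dicts) by several attributes and apply one
    aggregation function per aggregated attribute.

    Hypothetical helper for illustration only, not a RapidMiner call.
    """
    groups = defaultdict(list)
    for row in rows:
        # The group key is the combination of all grouping attribute values.
        key = tuple(row[name] for name in group_by)
        groups[key].append(row)

    result = []
    for key, members in groups.items():
        out = dict(zip(group_by, key))
        for attribute, (label, func) in aggregations.items():
            out[f"{label}({attribute})"] = func([m[attribute] for m in members])
        result.append(out)
    return result


# Made-up sample data.
rows = [
    {"region": "north", "year": 2007, "sales": 10.0, "units": 3},
    {"region": "north", "year": 2007, "sales": 20.0, "units": 5},
    {"region": "south", "year": 2007, "sales": 7.0, "units": 2},
    {"region": "north", "year": 2008, "sales": 4.0, "units": 1},
]

# Two grouping attributes, two aggregated attributes, a different function each.
summary = aggregate(
    rows,
    group_by=["region", "year"],
    aggregations={"sales": ("average", mean), "units": ("sum", sum)},
)
for line in summary:
    print(line)
```

Each output row carries the grouping values plus one column per aggregated attribute, so the four input rows collapse into three groups.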
New Operators:
More than 10 new operators were added since version 4.1:
Nominal2Date
Date2Nominal
KernelPCA
EqualLabelWeighting
StataExampleSource
FeatureSubsetIteration
RelativeRegression
AttributeValueSubstring
CachedDatabaseExampleSource
NameBasedWeighting
BatchProcessing
GroupModel
UngroupModel
Bugfixes:
This release fixes several bugs, most notably two errors in the new parameter wizard GUI for string and integer parameters. The CSV and SimpleExampleSource operators now also correctly split lines that end with empty strings (i.e. missing values). Several other smaller issues were also fixed.
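The trailing-empty-field case can be sketched as follows. This is an illustration in Python using the standard csv module, not the operators' actual implementation; the function name read_examples and the sample data are assumptions.

```python
import csv
import io


def read_examples(text, delimiter=","):
    """Parse delimited text and map empty fields to None (missing value).

    Hypothetical helper for illustration only.
    """
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter=delimiter):
        # Empty strings, including those at the end of a line,
        # are kept as fields and marked as missing.
        rows.append([value if value != "" else None for value in fields])
    return rows


# Made-up data: the last two lines end with one and two empty fields.
data = "id,age,city\n1,34,\n2,,\n"
examples = read_examples(data)
for example in examples:
    print(example)
```

The point is that every row keeps the same number of columns as the header, even when the last values on a line are missing.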