| RapidMiner, Preview, Date | 21 Nov 2010 |
| Preview: New Date Functions for Attribute Generation by Ingo Mierswa |
We recently improved the creation of new attributes considerably. The operator "Generate Attributes" is the basis for those calculations. Here, the analyst can define a list of expressions which are evaluated to calculate new attributes from the values of already existing attributes. This has made "Generate Attributes" one of the most important operators for data preprocessing.
We have already discussed two additions which will be published with the next major release, RapidMiner 5.1:
- A set of extended operations for nominal / text processing,
- A new user interface resembling a calculator supporting the analyst in creating the expressions.
Today we are happy to announce a third addition for the "Generate Attributes" operator, namely the ability to deal with dates and perform calendar operations.
Let's have a look at the supported functions:
In the example above, the difference between the current time, obtained with date_now(), and the date stored in the column named "Datum" is calculated with the function date_diff() and then parsed. The result can then be processed further, for example with the operator "Date to Numerical", which would extract the number of days of this difference.
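For readers who want to see the same idea outside RapidMiner, here is a minimal sketch in plain Python using only the standard datetime module. It is an analogy, not the RapidMiner expression syntax: the helper name days_since, the sample values, and the assumed date format of the "Datum" column are all hypothetical and chosen only for illustration.

```python
from datetime import datetime


def days_since(date_text, now=None, fmt="%d.%m.%Y"):
    """Return the whole days between a stored date string and 'now'.

    fmt is an assumed format for the hypothetical "Datum" values.
    """
    if now is None:
        now = datetime.now()                      # rough analogue of date_now()
    stored = datetime.strptime(date_text, fmt)    # turn the stored text into a date
    delta = now - stored                          # rough analogue of date_diff()
    return delta.days                             # like extracting the number of days


# Hypothetical values of a "Datum" column
datum_column = ["21.11.2010", "01.01.2010"]

# A fixed reference time keeps the example reproducible
reference = datetime(2010, 12, 1)
print([days_since(d, now=reference) for d in datum_column])  # prints [10, 334]
```

Passing the reference time explicitly is a deliberate choice: it makes the calculation repeatable, whereas calling the current time inside the function gives a different result on every run.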
The date functions supported by the operator "Generate Attributes" are:
- date_parse(): Parses a date given as string or as number of milliseconds
- date_parse_loc(): Same as date_parse() but using a specified locale
- date_parse_custom(): Same as date_parse() but using a specified format
- date_before(): Compares two dates and returns true if the first date is before the second
- date_after(): Compares two dates and returns true if the first date is after the second
- date_str(): Transforms a date to a string representation
- date_str_loc(): Same as date_str() but using a specified locale
- date_str_custom(): Same as date_str() but using a specified format
- date_now(): Creates the current date and time
- date_diff(): Calculates the difference between two dates
- date_add(): Adds a specified amount of time to the given date
- date_set(): Sets a specific part of the given date
- date_get(): Delivers a specific part of the given date
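To give a feeling for what these kinds of operations do, the following sketch shows rough counterparts in Python's standard datetime module. It does not use or reproduce the RapidMiner function signatures, which are not spelled out in this post; the helper names parse_custom and parse_millis and all sample values are assumptions made for illustration only.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def parse_custom(text, fmt):
    # Parse a date from a string using an explicit format
    return datetime.strptime(text, fmt)


def parse_millis(ms):
    # Build a date from a number of milliseconds since 1970-01-01 (no time zone applied)
    return EPOCH + timedelta(milliseconds=ms)


posted = parse_custom("21.11.2010 08:30", "%d.%m.%Y %H:%M")
midnight = parse_millis(1290297600000)

is_before = midnight < posted                    # comparison: first date before second
is_after = midnight > posted                     # comparison: first date after second
as_text = posted.strftime("%Y-%m-%d %H:%M")      # date to string with a custom format
next_week = posted + timedelta(days=7)           # add an amount of time to a date
moved = posted.replace(year=2011)                # set one part of a date
month = posted.month                             # get one part of a date

print(is_before, is_after, as_text, next_week, moved, month)
```

Locale-aware parsing and formatting are left out of this sketch on purpose, since they depend on the locales installed on the machine running the code.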
Together with the already existing operators "Date to Numerical", "Date to Nominal", "Numerical to Date", "Nominal to Date", and "Adjust Date", these new date functions for attribute generation form a powerful base for all types of date transformations. Have fun and stay tuned - the next version, RapidMiner 5.1, will be released soon!


