RapidMiner, Preview, Date | 21 Nov 2010
Preview: New Date Functions for Attribute Generation by Ingo Mierswa

Recently we improved the creation of new attributes a lot. The operator "Generate Attributes" is the basis for those calculations. Here, the analyst can define a list of expressions that are evaluated to calculate new attributes from the values of already existing attributes. This makes "Generate Attributes" one of the most important operators for data preprocessing.

We have already discussed two additions which will be published with the next major release of RapidMiner, version 5.1.

Today we are happy to announce a third addition for the "Generate Attributes" operator, namely the ability to deal with dates and perform calendar operations.

Let's have a look at the supported functions:

 

 

In the example above, the difference between the current time, obtained with date_now(), and the date stored in the column named "Datum" is calculated with the function date_diff() and parsed. The result can then be processed further, for example with the operator "Date to Numerical", which would extract the number of days of this difference.
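To make the idea of a "difference in days" concrete, here is a minimal sketch in Python. It is not RapidMiner expression syntax and does not show how date_diff() is implemented; the helper name days_since and the sample values standing in for the "Datum" column are assumptions made for illustration only.

```python
from datetime import datetime


def days_since(stored: datetime, now: datetime) -> int:
    """Whole days elapsed between a stored date and a reference 'now'."""
    return (now - stored).days


# Hypothetical values standing in for the "Datum" column.
datum_column = [datetime(2010, 11, 1), datetime(2010, 10, 21)]

# A fixed reference date is used instead of datetime.now()
# so that the result is reproducible.
now = datetime(2010, 11, 21)

differences = [days_since(d, now) for d in datum_column]
print(differences)  # [20, 31]
```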

The date functions supported by the operator "Generate Attributes" are:

  • date_parse(): Parses a date given as string or as number of milliseconds
  • date_parse_loc(): Same as date_parse() but using a specified locale
  • date_parse_custom(): Same as date_parse() but using a specified format
  • date_before(): Compares two dates and returns true if the first date is before the second
  • date_after(): Compares two dates and returns true if the first date is after the second
  • date_str(): Transforms a date to a string representation
  • date_str_loc(): Same as date_str() but using a specified locale
  • date_str_custom(): Same as date_str() but using a specified format
  • date_now(): Creates the current date and time
  • date_diff(): Calculates the difference between two dates
  • date_add(): Adds a specified amount of time to the given date
  • date_set(): Sets a specific part of the given date
  • date_get(): Delivers a specific part of the given date
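For readers who think in code, the kinds of calendar operations in this list can be sketched with Python's standard datetime module. This is only a rough analogy: the argument lists of the RapidMiner functions are not described in this post, so nothing below should be read as their actual syntax, and all sample values are made up.

```python
from datetime import datetime, timedelta, timezone

# Parsing with an explicit format (similar in spirit to date_parse_custom()).
d = datetime.strptime("2010-11-21 08:30", "%Y-%m-%d %H:%M")

# Parsing from a number of milliseconds (similar in spirit to date_parse()).
from_millis = datetime.fromtimestamp(86_400_000 / 1000, tz=timezone.utc)

# Adding an amount of time (similar in spirit to date_add()).
later = d + timedelta(days=10)

# Comparing two dates (similar in spirit to date_before() / date_after()).
is_before = d < later
is_after = d > later

# Setting and reading a specific part (similar in spirit to date_set() / date_get()).
at_midnight = d.replace(hour=0, minute=0)
month = d.month

# Formatting with an explicit pattern (similar in spirit to date_str_custom()).
text = d.strftime("%d.%m.%Y")

print(later, is_before, is_after, at_midnight, month, text)
```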

Together with the already existing operators "Date to Numerical", "Date to Nominal", "Numerical to Date", "Nominal to Date", and "Adjust Date", these new date functions for attribute generation form a powerful basis for all types of date transformations. Have fun and stay tuned: the next version, RapidMiner 5.1, will be released soon!

RapidMiner, Preview | 21 Oct 2010
New GUI for Generate Attributes: Calculator Style by Ingo Mierswa

I just wanted to show you the new graphical user interface for the expression creator of the quite important operator "Generate Attributes". Thomas Ott of Neural Market Trends recently made a video about this great operator, which in general can be used to define new attributes (columns, dimensions...) based on a calculation on already existing ones. For example, you could create a new attribute named "area" by calculating the product of the two attributes "width" and "length" with the formula "width * length". Pretty easy, huh?
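As a rough illustration of what generating an attribute means, the following Python sketch adds an "area" column to a small table. The function generate_attribute and the sample rows are hypothetical; RapidMiner itself needs no code for this, only the expression "width * length".

```python
def generate_attribute(rows, name, expression):
    """Return copies of the rows with one new attribute computed from existing ones."""
    return [{**row, name: expression(row)} for row in rows]


# Made-up example data with the two existing attributes.
rows = [
    {"width": 2.0, "length": 3.0},
    {"width": 4.0, "length": 1.5},
]

# "area" is calculated as width * length for every example.
result = generate_attribute(rows, "area", lambda r: r["width"] * r["length"])
print(result)
```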

Recently, we introduced a whole set of new functions which can be used to work on text data. Now it is even possible to extract substrings from nominal values, as in "cut(att1, 2, 5)", which creates a new attribute with the substring of length 5 starting at position 2 of the values of attribute "att1". Together with the numerous numerical functions and special functions like if-then-else conditions and others, the total number of supported functions has grown a lot. For exactly that reason we decided to develop a new user interface for the expression generation, which now follows a nice calculator style:
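The substring behaviour described for "cut(att1, 2, 5)" can be sketched in Python as follows. Whether positions are counted from 0 or from 1 is not stated here, so the zero-based indexing below is an assumption, and the sample value is made up.

```python
def cut(value: str, start: int, length: int) -> str:
    """Return the substring of the given length starting at position start.

    Assumption: positions are counted from 0.
    """
    return value[start:start + length]


# Length 5, starting at position 2.
print(cut("RapidMiner", 2, 5))  # pidMi
```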

 

 

As you can easily see, the user interface is inspired by a calculator. At the top, we have the actual expression which is created with the help of the other elements. Of course, you can type in any part of the expression into the field yourself at any time.

In the lower left part, you will find all available functions. You can change the currently displayed set by selecting a different function type in the combo box. In the lower right part, you will find a list of all known attribute names. This list is only filled if the meta data is available, but you can of course simply type in an attribute name yourself if it is not listed.

The usage of the new user interface is pretty simple: just click on a function in order to add it at the current caret position in the expression field. The caret is then placed in the parentheses so you can directly edit the function arguments. If you add an attribute by double clicking on one of them in the attribute list on the right, it is also added at the current position and the new caret position will be directly after the added attribute. You can also select some text in the expression field before you add a new function: in this case the selected part will automatically become the argument of the new function.
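The editing behaviour described here can be modelled in a few lines of Python. This is only a sketch of the described behaviour, not the actual RapidMiner GUI code: the function names are hypothetical, "sqrt" is just an illustrative function name, and the caret position after wrapping a selection (directly behind the wrapped argument) is an assumption, since only the case without a selection is described.

```python
def insert_function(expression, name, sel_start, sel_end):
    """Insert name(...) at the caret, wrapping any selected text as its argument.

    Returns the new expression and the new caret position. Without a selection
    (sel_start == sel_end) the caret ends up between the parentheses.
    """
    selected = expression[sel_start:sel_end]
    call = f"{name}({selected})"
    new_expression = expression[:sel_start] + call + expression[sel_end:]
    # Assumption: after wrapping a selection, the caret sits behind the argument.
    caret = sel_start + len(name) + 1 + len(selected)
    return new_expression, caret


def insert_attribute(expression, attribute, caret):
    """Insert an attribute name at the caret; the caret moves directly behind it."""
    new_expression = expression[:caret] + attribute + expression[caret:]
    return new_expression, caret + len(attribute)


# Click on a function, then double click on an attribute.
expr, caret = insert_function("", "sqrt", 0, 0)
expr, caret = insert_attribute(expr, "width", caret)
print(expr, caret)  # sqrt(width) 10

# Select the whole expression first, then click on a function.
wrapped, caret2 = insert_function("width * length", "sqrt", 0, 14)
print(wrapped, caret2)  # sqrt(width * length) 19
```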

By the way, this is how to start the new user interface: simply click on the small calculator icon on the right of the expression field in the expression definition list of the Generate Attributes operator:

 

 

Another cool thing: each change, whether typed manually or made with the interface elements, will trigger a validation check. If the check was successful, this is indicated by the small green check mark on the right. If not, a red cross appears and a tool tip explains why the expression cannot be validated. This again is a great help for analysts who do not want to wait until a long-running process crashes because of an error in the function expression.

This new calculator and the new text functions will be delivered with the next release of RapidMiner coming soon!
