Pages: [1]
  Print  
Author Topic: [SOLVED] Pivoting several lines into one single field  (Read 1235 times)
MacPhotoBiker
Jr. Member
**
Posts: 56


WWW
« on: May 01, 2013, 04:37:03 AM »

Hi guys,
this is my first post in this forum, I hope itīs on the right spot.

Iīm using currently RapidMiner mainly for ETL processes, and itīs truly impressive what RM is capable of. Iīm now looking for a particluar feature which I could not find, and Iīd appreciate somebody could put me on the right track. Let me make a very simple example.

Letīs say there are currently two columns, Column 1 shows several cheque numbers, column two shows which invoices were paid with each cheque. A cheque can pay one or several invoices (i. e. a typical payment allocation table).

CHEQUE     INVOICE_PAID
C1               Inv1
C1               Inv2
C2               Inv3
C3               Inv4
C3               Inv5
C3               Inv6

What I try to achieve is a table with only two columns, where the first column is grouped by Cheque number, and the second column contains ALL invoices paid by a particular cheque:


CHEQUE     INVOICES_PAID
C1              Inv1, Inv2
C2              Inv3
C3              Inv4, Inv5, Inv6

Itīs very important for me that all paid invoices are contained in one single field per record because I want to use it on a report (a statement, to be precise). Unfortunately, I wasnīt able to figure it out myself.
Any help would be greatly appreciated.
« Last Edit: May 03, 2013, 04:17:40 PM by MacPhotoBiker » Logged

MacPhotoBiker
Just do your Core Business. We do the rest.
MacPhotoBiker
Jr. Member
**
Posts: 56


WWW
« Reply #1 on: May 02, 2013, 05:55:49 PM »

Anyone?
Logged

MacPhotoBiker
Just do your Core Business. We do the rest.
awchisholm
Sr. Member
****
Posts: 398


WWW
« Reply #2 on: May 02, 2013, 06:47:09 PM »

Hello

Something like this would work...

Loop values for the cheque attribute

Inside the loop, filter examples where cheque = the current value

Create a macro containing null

Start another loop for all filtered examples

Inside this loop update the macro with the value of the invoice attribute for the current example

Regards,

Andrew
Logged

MacPhotoBiker
Jr. Member
**
Posts: 56


WWW
« Reply #3 on: May 03, 2013, 04:16:24 PM »

Hi Andrew,

thank you very much for your answer.

Frankly, something really hilarious just happened! Based on your answer I wanted to comment that I believe this functionality should be built into the "aggregate" functionality. I wanted to use the correct terminology, so I looked up the operator in RapidMiner, and - shame on me - I actually figured that it is already there, itīs the aggregation function "concatenation".

Thatīs precisely what I was looking for, and SO simple!

Again Andrew, I do thank you for your answer, after all you pointed me in the right direction, and you also gave me a glimpse of a more flexible approach. (Not that I could do that (yet), but down the road... who knows? Wink )
« Last Edit: May 08, 2013, 03:14:36 PM by MacPhotoBiker » Logged

MacPhotoBiker
Just do your Core Business. We do the rest.
awchisholm
Sr. Member
****
Posts: 398


WWW
« Reply #4 on: May 04, 2013, 09:01:36 AM »

Hello MacPhotoBiker

How cool - I've learnt something new too.

regards

Andrew
Logged

rowan.g
Newbie
*
Posts: 47


« Reply #5 on: May 08, 2013, 07:34:25 AM »

Thanks for that solution MacPhotoBiker.
Did you get the " |" between your values? For some reason I've got those characters and I can't get rid of them. Any ideas?

Cheers,
Logged
MacPhotoBiker
Jr. Member
**
Posts: 56


WWW
« Reply #6 on: May 08, 2013, 03:04:59 PM »

Hi rowan.g,

"|" is  simply a field delimiter, and RM decided to choose one that usually does not occur in any other context, which is a very good thing in order to keep the different values (Inv1, Inv2 etc. in my example) clearly separated.

It's very easy to change those, simply do the following:

- After the aggregate function, add a "Generate Attribute" operator
- "Attribute name" is the field you just generated, e.g. "MyListInOneField"
- "function expression" is "replace(MyListInOneField","|",",")

This replaces all occurrences of "|" with "," in the field "MyListInOneField".

In my example, I want one field that reads "Inv. # 4, 5, 6", here's how I do that:

- My initial "ListOfInvoices" is "4|5|6" -> replace(ListOfInvoices,"|",", ") -> "4, 5, 6"

- To add "Inv. # " I use the "concat" function: concat("Inv. # ", replace(ListOfInvoices,"|",", ")) -> "Inv. # 4, 5, 6"

That did the trick for me, I hope it helps.
Logged

MacPhotoBiker
Just do your Core Business. We do the rest.
rowan.g
Newbie
*
Posts: 47


« Reply #7 on: May 09, 2013, 12:17:40 AM »

Thanks MacPhotoBiker. Worked perfectly.
Logged
MacPhotoBiker
Jr. Member
**
Posts: 56


WWW
« Reply #8 on: May 09, 2013, 06:12:44 PM »

Glad to hear that it worked.
Logged

MacPhotoBiker
Just do your Core Business. We do the rest.
Pages: [1]
  Print  
 
Jump to: