Hi again,
I have had a quick look and it could work if the list of words (and therefore the columns/attributes) stayed the same... but the list of words is already large, and having to set up the attributes in the De-Pivot task would take a very long time each time the job is run.
You do not need to set them all up manually; you can use regular expressions instead. The following example might help:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.017">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.1.017" expanded="true" name="Process">
<process expanded="true" height="224" width="681">
<operator activated="true" class="retrieve" compatibility="5.1.017" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
<parameter key="repository_entry" value="//Samples/data/Market-Data"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="5.1.017" expanded="true" height="76" name="Generate Attributes" width="90" x="179" y="30">
<list key="function_descriptions">
<parameter key="AMOUNT" value="1"/>
</list>
</operator>
<operator activated="true" class="pivot" compatibility="5.1.017" expanded="true" height="76" name="Pivot" width="90" x="313" y="30">
<parameter key="group_attribute" value="TID"/>
<parameter key="index_attribute" value="ITEM"/>
<parameter key="skip_constant_attributes" value="false"/>
</operator>
<operator activated="true" class="de_pivot" compatibility="5.1.017" expanded="true" height="76" name="De-Pivot" width="90" x="447" y="30">
<list key="attribute_name">
<parameter key="AMOUNT" value="AMOUNT.*"/>
</list>
<parameter key="index_attribute" value="ITEM"/>
</operator>
<connect from_op="Retrieve" from_port="output" to_op="Generate Attributes" to_port="example set input"/>
<connect from_op="Generate Attributes" from_port="example set output" to_op="Pivot" to_port="example set input"/>
<connect from_op="Pivot" from_port="example set output" to_op="De-Pivot" to_port="example set input"/>
<connect from_op="De-Pivot" from_port="example set output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
Of course, you could also use ".*" to match all attributes, but you should probably filter out the attribute used for identifying the groups (TID in the example above). This should do the trick.
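To see why ".*" is riskier than "AMOUNT.*", here is a small Python sketch of which column names each pattern would pick up. It is only an illustration, not how the De-Pivot operator is implemented: the column names are made up to resemble the output of the Pivot step, and it assumes the pattern has to match the whole attribute name. The last pattern, with a negative lookahead, is just one possible way to exclude the group attribute and is my own assumption.

```python
import re

# Hypothetical column names, resembling what the Pivot step might produce:
# the group attribute plus one AMOUNT_<item> column per item.
columns = ["TID", "AMOUNT_Product 1", "AMOUNT_Product 2", "AMOUNT_Product 3"]


def matching(pattern, names):
    """Return the names that the pattern matches in full."""
    return [name for name in names if re.fullmatch(pattern, name)]


# "AMOUNT.*" only selects the pivoted columns.
amount_only = matching(r"AMOUNT.*", columns)

# ".*" selects everything, including the group attribute TID.
everything = matching(r".*", columns)

# One possible way to keep ".*" but skip TID (an assumption, not from the process above).
without_group = matching(r"(?!TID$).*", columns)

print(amount_only)
print(everything)
print(without_group)
```

With these names, the first and third patterns select the same three AMOUNT columns, while plain ".*" also catches TID.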
I have had a quick look at the Cut Document operator, and it would appear to do what I want, except it does not allow any other metadata to be passed through, so I cannot tell which document the words relate to.
That could also be a possible approach. Maybe you could multiply the data beforehand, use Cut Document in one path, and join both data sets afterwards?
Cheers,
Ingo