 1 
 on: May 25, 2016, 07:57:25 PM 
Started by online360 - Last post by online360
Hi Martin!

I managed to make this process work, but unfortunately it always gets stuck somewhere between loop iteration 150 and 300.

Do you have an idea how to make this simpler, or how to make it consume less memory?

Code:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="7.1.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.1.001" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="7.1.001" expanded="true" height="68" name="Retrieve t123_product_words" width="90" x="112" y="85">
        <parameter key="repository_entry" value="//Local Repository/data/t123_product_words"/>
      </operator>
      <operator activated="true" breakpoints="after" class="loop_values" compatibility="7.1.001" expanded="true" height="82" name="Loop Values" width="90" x="380" y="85">
        <parameter key="attribute" value="word"/>
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="7.1.001" expanded="true" height="68" name="Retrieve synonyms_all_lowercase_splitted_trimmed" width="90" x="313" y="85">
            <parameter key="repository_entry" value="../data/synonyms_all_lowercase_splitted_trimmed"/>
          </operator>
          <operator activated="true" class="generate_attributes" compatibility="7.1.001" expanded="true" height="82" name="Generate Attributes (2)" width="90" x="648" y="136">
            <list key="function_descriptions">
              <parameter key="searched_word" value="trim(%{loop_value})"/>
            </list>
          </operator>
          <operator activated="true" class="generate_attributes" compatibility="7.1.001" expanded="true" height="82" name="Generate Attributes" width="90" x="782" y="136">
            <list key="function_descriptions">
              <parameter key="contains_attribute" value="if([word_1]==[searched_word]||[word_2]==[searched_word]||[word_3]==[searched_word]||[word_4]==[searched_word]||[word_5]==[searched_word]||[word_6]==[searched_word]||[word_7]==[searched_word]||[word_8]==[searched_word]||[word_9]==[searched_word]||[word_10]==[searched_word]||[word_11]==[searched_word]||[word_12]==[searched_word]||[word_13]==[searched_word]||[word_14]==[searched_word]||[word_15]==[searched_word]||[word_16]==[searched_word]||[word_17]==[searched_word]||[word_18]==[searched_word]||[word_19]==[searched_word]||[word_20]==[searched_word]||[word_21]==[searched_word]||[word_22]==[searched_word]||[word_23]==[searched_word]||[word_24]==[searched_word]||[word_25]==[searched_word]||[word_26]==[searched_word]||[word_27]==[searched_word]||[word_28]==[searched_word]||[word_29]==[searched_word]||[word_30]==[searched_word]||[word_31]==[searched_word]||[word_32]==[searched_word]||[word_33]==[searched_word]||[word_34]==[searched_word]||[word_35]==[searched_word]||[word_36]==[searched_word]||[word_37]==[searched_word]||[word_38]==[searched_word]||[word_39]==[searched_word]||[word_40]==[searched_word]||[word_41]==[searched_word]||[word_42]==[searched_word]||[word_43]==[searched_word]||[word_44]==[searched_word]||[word_45]==[searched_word]||[word_46]==[searched_word]||[word_47]==[searched_word]||[word_48]==[searched_word]||[word_49]==[searched_word]||[word_50]==[searched_word]||[word_51]==[searched_word]||[word_52]==[searched_word]||[word_53]==[searched_word]||[word_54]==[searched_word]||[word_55]==[searched_word]||[word_56]==[searched_word]||[word_57]==[searched_word]||[word_58]==[searched_word]||[word_59]==[searched_word]||[word_60]==[searched_word]||[word_61]==[searched_word]||[word_62]==[searched_word]||[word_63]==[searched_word]||[word_64]==[searched_word]||[word_65]==[searched_word]||[word_66]==[searched_word]||[word_67]==[searched_word]||[word_68]==[searched_word]||[word_69]==[searched_word]||[word_70]==[searched_word]||[word_71]==[searched_word]||[word_72]==[searched_word]||[word_73]==[searched_word]||[word_74]==[searched_word]||[word_75]==[searched_word]||[word_76]==[searched_word]||[word_77]==[searched_word]||[word_78]==[searched_word]||[word_79]==[searched_word]||[word_80]==[searched_word]||[word_81]==[searched_word]||[word_82]==[searched_word]||[word_83]==[searched_word]||[word_84]==[searched_word]||[word_85]==[searched_word]||[word_86]==[searched_word]||[word_87]==[searched_word]||[word_88]==[searched_word]||[word_89]==[searched_word]||[word_90]==[searched_word],&quot;YES&quot;,&quot;NO&quot;)"/>
            </list>
          </operator>
          <connect from_op="Retrieve synonyms_all_lowercase_splitted_trimmed" from_port="output" to_op="Generate Attributes (2)" to_port="example set input"/>
          <connect from_op="Generate Attributes (2)" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
          <connect from_op="Generate Attributes" from_port="example set output" to_port="out 1"/>
          <portSpacing port="source_example set" spacing="0"/>
          <portSpacing port="sink_out 1" spacing="0"/>
          <portSpacing port="sink_out 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="append" compatibility="7.1.001" expanded="true" height="82" name="Append" width="90" x="514" y="85"/>
      <operator activated="true" class="filter_examples" compatibility="7.1.001" expanded="true" height="103" name="Filter Examples" width="90" x="782" y="85">
        <list key="filters_list">
          <parameter key="filters_entry_key" value="contains_attribute.does_not_equal.YES"/>
        </list>
      </operator>
      <operator activated="true" class="store" compatibility="7.1.001" expanded="true" height="68" name="Store" width="90" x="983" y="85">
        <parameter key="repository_entry" value="//Local Repository/data/t123_synonyms_processed"/>
      </operator>
      <connect from_op="Retrieve t123_product_words" from_port="output" to_op="Loop Values" to_port="example set"/>
      <connect from_op="Loop Values" from_port="out 1" to_op="Append" to_port="example set 1"/>
      <connect from_op="Append" from_port="merged set" to_op="Filter Examples" to_port="example set input"/>
      <connect from_op="Filter Examples" from_port="example set output" to_op="Store" to_port="input"/>
      <connect from_op="Store" from_port="through" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>

Thanks,
Steven

 2 
 on: May 25, 2016, 10:11:07 AM 
Started by sshilderman - Last post by sshilderman
I didn't know that Read Database supports it.
It is definitely what I needed.

Thanks a lot!

 3 
 on: May 25, 2016, 09:58:09 AM 
Started by online360 - Last post by online360
Hi Martin!

I added a "split" operator into the loop so it can test against each attribute using an exact match comparison.

How can I say "equal to either attribute1 or attribute2 or attribute3, ..."?
The process tells me that "||" is only allowed for boolean or numerical attributes.

Thanks,
Steven

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="7.1.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.1.000" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="7.1.000" expanded="true" height="68" name="Retrieve t123_product_words" width="90" x="112" y="85">
        <parameter key="repository_entry" value="//Local Repository/data/t123_product_words"/>
      </operator>
      <operator activated="true" class="sample_stratified" compatibility="7.1.000" expanded="true" height="82" name="Sample (Stratified)" width="90" x="246" y="85"/>
      <operator activated="true" breakpoints="after" class="loop_values" compatibility="7.1.000" expanded="true" height="82" name="Loop Values" width="90" x="380" y="85">
        <parameter key="attribute" value="word"/>
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="7.1.000" expanded="true" height="68" name="Retrieve synonyms_all" width="90" x="179" y="85">
            <parameter key="repository_entry" value="//Local Repository/data/synonyms_all"/>
          </operator>
          <operator activated="true" class="split" compatibility="7.1.000" expanded="true" height="82" name="Split" width="90" x="313" y="85">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="att1"/>
          </operator>
          <operator activated="true" class="generate_attributes" compatibility="7.1.000" expanded="true" height="82" name="Generate Attributes" width="90" x="514" y="136">
            <list key="function_descriptions">
              <parameter key="contains_attribute" value="if(equals([att1_1]||[att1_2]||[att1_3]||[att1_4]||[att1_5]||[att1_6]||[att1_7]||[att1_8]||[att1_9]||[att1_10]||[att1_11]||[att1_12]||[att1_13]||[att1_14]||[att1_15]||[att1_16]||[att1_17]||[att1_18]||[att1_19],%{loop_value}),&quot;YESMATCH&quot;,&quot;NOMATCH&quot;)"/>
            </list>
          </operator>
          <connect from_op="Retrieve synonyms_all" from_port="output" to_op="Split" to_port="example set input"/>
          <connect from_op="Split" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
          <connect from_op="Generate Attributes" from_port="example set output" to_port="out 1"/>
          <portSpacing port="source_example set" spacing="0"/>
          <portSpacing port="sink_out 1" spacing="0"/>
          <portSpacing port="sink_out 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="append" compatibility="7.1.000" expanded="true" height="82" name="Append" width="90" x="514" y="85"/>
      <operator activated="true" class="remove_duplicates" compatibility="7.1.000" expanded="true" height="82" name="Remove Duplicates" width="90" x="648" y="85">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="contains_attribute"/>
      </operator>
      <operator activated="true" class="filter_examples" compatibility="7.1.000" expanded="true" height="103" name="Filter Examples" width="90" x="782" y="85">
        <list key="filters_list">
          <parameter key="filters_entry_key" value="contains_attribute.does_not_equal.NOMATCH"/>
        </list>
      </operator>
      <connect from_op="Retrieve t123_product_words" from_port="output" to_op="Sample (Stratified)" to_port="example set input"/>
      <connect from_op="Sample (Stratified)" from_port="example set output" to_op="Loop Values" to_port="example set"/>
      <connect from_op="Loop Values" from_port="out 1" to_op="Append" to_port="example set 1"/>
      <connect from_op="Append" from_port="merged set" to_op="Remove Duplicates" to_port="example set input"/>
      <connect from_op="Remove Duplicates" from_port="example set output" to_op="Filter Examples" to_port="example set input"/>
      <connect from_op="Filter Examples" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
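
The "||" error in the process above probably comes from OR-ing the nominal attributes together before comparing them. Each attribute likely needs its own comparison, with the boolean results then joined by "||", which is the shape the later process in post 1 uses ([word_1]==[searched_word]||...). The following is only a minimal Python sketch of that idea, not a RapidMiner API: the helper names build_or_expression and row_matches are hypothetical, and the column names follow the att1_1, att1_2, ... scheme produced by Split.

```python
def build_or_expression(prefix: str, count: int, target: str) -> str:
    """Return one equality test per split column, joined with "||".

    `target` is the right-hand side of every comparison, for example
    "[searched_word]" (an attribute holding the current loop value).
    """
    tests = [f"[{prefix}_{i}]=={target}" for i in range(1, count + 1)]
    return "||".join(tests)


def row_matches(row: dict, prefix: str, count: int, word: str) -> bool:
    """Same logic evaluated directly: does any split column equal `word`?"""
    return any(row.get(f"{prefix}_{i}") == word for i in range(1, count + 1))


# Hypothetical example with three split columns instead of nineteen.
print(build_or_expression("att1", 3, "[searched_word]"))
# [att1_1]==[searched_word]||[att1_2]==[searched_word]||[att1_3]==[searched_word]

sample_row = {"att1_1": "bike", "att1_2": "cable", "att1_3": None}
print(row_matches(sample_row, "att1", 3, "cable"))  # True
print(row_matches(sample_row, "att1", 3, "cab"))    # False
```

Generating the expression string this way avoids typing the long chain by hand; the result would still have to be pasted into the Generate Attributes function field.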

 4 
 on: May 25, 2016, 09:51:22 AM 
Started by Sabrine - Last post by Sabrine
By the way,

you might also simply read it in as nominal and use Nominal To Date afterwards.

~Martin
Hi,

Yeah, changing the source data is definitely not the way you want to go.

This works for me:

1. Change the Data Format to "yyyy.MM.dd hh:mm:ss"
2. Change the column type to "date_time"



The test data I used can be found here: https://www.dropbox.com/s/ivrn8o70iie9f71/test.csv?dl=0

Regards,
Marco
Thank you, Marco, this works! Sorry that I didn't notice that a custom data format can be typed in directly there (I thought I could only select from the list below). By the way, I just tried doing the same thing with the "Nominal to Date" operator (the same idea as Martin's) and it works too; the only inconvenience is that no attribute subset selection is possible.
Many thanks,
sabrine

 5 
 on: May 25, 2016, 09:43:06 AM 
Started by Sabrine - Last post by Martin Schmitz
By the way,

you might also simply read it in as nominal and use Nominal To Date afterwards.

~Martin

 6 
 on: May 25, 2016, 09:42:22 AM 
Started by online360 - Last post by Martin Schmitz
Hi,

Sure. I think contains actually takes regexes, even though that is not explicitly documented.

~Martin
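
Whether contains really accepts regular expressions is only a guess in this post, so the following is a minimal Python sketch of the pattern itself rather than of any RapidMiner function. It matches a word only when it stands alone, so "cable" is not found inside "energy-cable". The set of allowed neighbours (whitespace, comma and a few punctuation marks) and the helper names are assumptions; the same character-class syntax should carry over to Java-style regexes.

```python
import re

# Assumed set of characters that may sit directly next to the word.
DELIMITERS = r"[\s,.;:!?]"


def whole_word_pattern(word: str) -> re.Pattern:
    """Build a regex that matches `word` only as a stand-alone token."""
    escaped = re.escape(word)  # treat the loop value literally, not as a regex
    # Either the string starts/ends here, or an allowed delimiter is adjacent.
    return re.compile(rf"(?:^|{DELIMITERS}){escaped}(?:$|{DELIMITERS})")


def is_whole_word(word: str, text: str) -> bool:
    """True if `word` occurs in `text` with no other letter attached to it."""
    return whole_word_pattern(word).search(text) is not None


print(is_whole_word("cable", "energy-cable"))     # False
print(is_whole_word("cable", "power cable, 3m"))  # True
print(is_whole_word("cable", "cables"))           # False
```

The hyphen is deliberately left out of the delimiter set so that compound terms such as "energy-cable" do not count as a match; adding it to DELIMITERS would change that.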

 7 
 on: May 25, 2016, 09:38:18 AM 
Started by online360 - Last post by online360
Hi!

Thanks!
You mean like the following process?

At the moment, for example, "cable" would also be found if the synonym is named "energy-cable" or similar (please see the function in "Generate Attributes").
Is there a way to find only those values that have no other letter directly before or after the loop_value (only a space, comma or punctuation mark would be allowed; I guess using a regex)?

Thanks!

Code:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="7.1.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.1.000" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="7.1.000" expanded="true" height="68" name="Retrieve t123_product_words" width="90" x="112" y="85">
        <parameter key="repository_entry" value="//Local Repository/data/t123_product_words"/>
      </operator>
      <operator activated="true" class="sample_stratified" compatibility="7.1.000" expanded="true" height="82" name="Sample (Stratified)" width="90" x="246" y="85"/>
      <operator activated="true" breakpoints="after" class="loop_values" compatibility="7.1.000" expanded="true" height="82" name="Loop Values" width="90" x="380" y="85">
        <parameter key="attribute" value="word"/>
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="7.1.000" expanded="true" height="68" name="Retrieve synonyms_all" width="90" x="179" y="85">
            <parameter key="repository_entry" value="//Local Repository/data/synonyms_all"/>
          </operator>
          <operator activated="true" class="generate_attributes" compatibility="7.1.000" expanded="true" height="82" name="Generate Attributes" width="90" x="514" y="136">
            <list key="function_descriptions">
              <parameter key="contains_attribute" value="if(contains(att1,%{loop_value}),att1,&quot;NOMATCH&quot;)"/>
            </list>
          </operator>
          <connect from_op="Retrieve synonyms_all" from_port="output" to_op="Generate Attributes" to_port="example set input"/>
          <connect from_op="Generate Attributes" from_port="example set output" to_port="out 1"/>
          <portSpacing port="source_example set" spacing="0"/>
          <portSpacing port="sink_out 1" spacing="0"/>
          <portSpacing port="sink_out 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="append" compatibility="7.1.000" expanded="true" height="82" name="Append" width="90" x="514" y="85"/>
      <operator activated="true" class="remove_duplicates" compatibility="7.1.000" expanded="true" height="82" name="Remove Duplicates" width="90" x="648" y="85">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="contains_attribute"/>
      </operator>
      <operator activated="true" class="filter_examples" compatibility="7.1.000" expanded="true" height="103" name="Filter Examples" width="90" x="782" y="85">
        <list key="filters_list">
          <parameter key="filters_entry_key" value="contains_attribute.does_not_equal.NOMATCH"/>
        </list>
      </operator>
      <connect from_op="Retrieve t123_product_words" from_port="output" to_op="Sample (Stratified)" to_port="example set input"/>
      <connect from_op="Sample (Stratified)" from_port="example set output" to_op="Loop Values" to_port="example set"/>
      <connect from_op="Loop Values" from_port="out 1" to_op="Append" to_port="example set 1"/>
      <connect from_op="Append" from_port="merged set" to_op="Remove Duplicates" to_port="example set input"/>
      <connect from_op="Remove Duplicates" from_port="example set output" to_op="Filter Examples" to_port="example set input"/>
      <connect from_op="Filter Examples" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>

 8 
 on: May 25, 2016, 09:31:01 AM 
Started by Sabrine - Last post by Marco Boeck
Hi,

Yeah, changing the source data is definitely not the way you want to go.

This works for me:

1. Change the Data Format to "yyyy.MM.dd hh:mm:ss"
2. Change the column type to "date_time"
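
For readers who want to check such a pattern outside of Studio, here is a minimal Python sketch that parses the same layout (year.month.day, then hour:minute:second). The sample value is made up, since the contents of test.csv are not shown here. The Data Format field appears to follow Java's SimpleDateFormat letters, where "HH" is the 24-hour clock and "hh" the 12-hour clock, so "HH" may be the safer choice if the data contains afternoon times; Python's equivalent of "HH" is %H.

```python
from datetime import datetime

# Python counterpart of the Java-style pattern "yyyy.MM.dd HH:mm:ss".
TIMESTAMP_FORMAT = "%Y.%m.%d %H:%M:%S"


def parse_timestamp(text: str) -> datetime:
    """Parse a value such as '2016.05.25 09:31:01' into a datetime."""
    return datetime.strptime(text, TIMESTAMP_FORMAT)


# Hypothetical sample value in the layout discussed above.
print(parse_timestamp("2016.05.25 09:31:01"))  # 2016-05-25 09:31:01
```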



The test data I used can be found here: https://www.dropbox.com/s/ivrn8o70iie9f71/test.csv?dl=0

Regards,
Marco

 9 
 on: May 25, 2016, 09:11:02 AM 
Started by Sabrine - Last post by Sabrine
Hi,

what version of Studio are you using? 7.0 and later have the Date format in the top left corner during the "Format your columns" step while adding data. You can freely change the date format there to whatever you need.



Regards,
Marco
Hi Marco,
Thank you for your answer. I am using version 7.0 and have already tried all available date formats in RM, but none worked for this type (JJJJ.mm.dd hh:mm:ss). What I have done for now is set a custom date format in Excel for the original data that is compatible with one of the formats in RM (in this case JJJJ-MM-DD HH:MM:SS, which RM recognized). However, it is not much fun to do this for 20 columns, especially if I find out while importing that I have overlooked a date column and then have to go back to the original Excel file and change the format there. Is there a way to fix this directly in RM?
Regards,
Sabrine

 10 
 on: May 25, 2016, 09:09:04 AM 
Started by online360 - Last post by Martin Schmitz
Hi,

Sounds like you could use Generate Attributes to create a new attribute such as "Contains Bike" and then join on it?

~Martin
