Topic: How to remove rows containing a particular string/word from an Excel file?

« on: April 21, 2014, 11:00:08 AM »

Hi, I want to delete the rows that contain a specific word from an Excel file and get the output without those rows. I am using RapidMiner version 5.0.13 and have only started using it recently. Can anyone suggest how to go about it and which operators to choose?
I have read about the "Filter Examples" operator. Given an Excel file in .xls format, what is the best way to get output without the rows containing a particular word? Please reply.
Marius Helf

« Reply #1 on: April 22, 2014, 10:27:36 AM »

You have already imported the data via the Read Excel operator, right? Then just add a Filter Examples operator. In RapidMiner 5 you can then filter on one column: select attribute_value_filter as condition_class. Then the parameter_string

column1 != .*badWord.*

will keep all rows where column1 does not contain the string "badWord".
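
For readers who want to check the pattern outside RapidMiner, here is a minimal Python sketch of the same idea. It is not RapidMiner code: the helper name `keep_rows_without` and the sample rows are made up for illustration, and it assumes the filter compares the regular expression against the whole cell value (which is what the leading and trailing `.*` suggest).

```python
import re


def keep_rows_without(rows, column, pattern):
    """Keep rows whose value in `column` does NOT fully match `pattern`.

    Hypothetical helper: mimics "column != regex" under the assumption
    that the regex must match the entire cell value.
    """
    regex = re.compile(pattern)
    return [row for row in rows if regex.fullmatch(str(row[column])) is None]


# Made-up sample data standing in for the imported Excel sheet.
rows = [
    {"column1": "this has badWord inside"},
    {"column1": "a clean row"},
    {"column1": "badWords everywhere"},
]

kept = keep_rows_without(rows, "column1", r".*badWord.*")
print(kept)  # only the "a clean row" entry remains
```

Note that "badWords everywhere" is dropped as well, because this pattern matches the string anywhere in the cell, not just as a separate word.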

To match only whole words, your filter should look like this:

column1 != ^(.+\s)*badWord(\s.*)*$

The cryptic syntax used here is that of regular expressions :) Google that term to get more information.
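
To see what the whole-word pattern accepts and rejects, it can be tried in Python, whose regular expression syntax agrees with RapidMiner's for the constructs used here. This is only a sketch: `contains_whole_word` and the sample strings are hypothetical, and "word" here means "delimited by whitespace or by the start/end of the cell".

```python
import re

# The whole-word pattern from the post above.
WHOLE_WORD = re.compile(r"^(.+\s)*badWord(\s.*)*$")


def contains_whole_word(text):
    """True if the entire text matches the whole-word pattern."""
    return WHOLE_WORD.fullmatch(text) is not None


# Made-up cell values to probe the pattern.
samples = [
    "badWord",         # the word alone
    "a badWord here",  # surrounded by whitespace
    "badWords",        # part of a longer word
    "notbadWord",      # glued to a prefix
    "badWord, then",   # followed by punctuation, not whitespace
]
results = {s: contains_whole_word(s) for s in samples}

for text, matched in results.items():
    print(f"{text!r}: {'filtered out' if matched else 'kept'}")
```

Only the first two samples match, so only those rows would be removed. A cell like "badWord, then" is kept, because the pattern requires whitespace (not punctuation) next to the word.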

Best regards,
