Skip to main content
Announcements
Live today at 11 AM ET. Get your questions about Qlik Connect answered, or just listen in. SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

how to remove duplicate words in a string

hi,

i have strings in this format:

manufacturer+modell,manufacturer+modell....

like this

Citroen C8 , Fiat Ulysse , Lancia Phedra , Peugeot 3008 ,  Peugeot 607 , Peugeot 807

Porsche 911 , Porsche Boxster , Porsche Cayman

BMW 3 , BMW Z1 , BMW Z3, BMW Z4 , Rover 75 , Rover 200 , Saab 9-5

i want to remove the duplicate manufacturer like this:

Citroen C8 , Fiat Ulysse , Lancia Phedra , Peugeot 3008 ,  607 , 807

Porsche 911 , Boxster , Cayman

BMW 3 , Z1 , Z3, Z4 , Rover 75 , 200 , Saab 9-5

my Problem ist ...every line has other words to delete.....sometimes more than one...

any idea?

1 Solution

Accepted Solutions
maxgro
MVP
MVP

my idea and the result

1.png

z:

load row, rowno() as line

inline [

row

Citroen C8 , Fiat Ulysse , Lancia Phedra , Peugeot 3008 ,  Peugeot 607 , Peugeot 807

Porsche 911 , Porsche Boxster , Porsche Cayman

BMW 3 , BMW Z1 , BMW Z3, BMW Z4 , Rover 75 , Rover 200 , Saab 9-5

] (delimiter is '|');

left join (z)

load row, trim(SubField(row, ',')) as split

Resident z;

left join (z)

load row, split, subfield(split, ' ') as car, subfield(split, ' ', 1) as brand

Resident z;

z1:

load

line, brand as brand1, concat(car, ',') as car1, brand & ' ' & concat(car, ',') as brandandcar

Resident z

where brand <> car

group by brand, line;

DROP Table z;

z2:

NoConcatenate

load line, Concat(brandandcar, ',') resident z1 group by line;

View solution in original post

5 Replies
Gysbert_Wassenaar

Why do you want to do it anyway? What are you going to use that data for? Isn't another data model much more useful?

What does each line mean? Is there a reason there are separate lines or could you concatenate all the lines into one line without losing any information?


talk is cheap, supply exceeds demand
Not applicable
Author

it's compatibility list of our product's, each line is an other item,`

Need to trim the lines,  this lines are just examples ... the lines can be more than 100 chars

i can split em up to manufacturer and Modell, but  how to concat them again in the in the wantet Format again?

maxgro
MVP
MVP

my idea and the result

1.png

z:

load row, rowno() as line

inline [

row

Citroen C8 , Fiat Ulysse , Lancia Phedra , Peugeot 3008 ,  Peugeot 607 , Peugeot 807

Porsche 911 , Porsche Boxster , Porsche Cayman

BMW 3 , BMW Z1 , BMW Z3, BMW Z4 , Rover 75 , Rover 200 , Saab 9-5

] (delimiter is '|');

left join (z)

load row, trim(SubField(row, ',')) as split

Resident z;

left join (z)

load row, split, subfield(split, ' ') as car, subfield(split, ' ', 1) as brand

Resident z;

z1:

load

line, brand as brand1, concat(car, ',') as car1, brand & ' ' & concat(car, ',') as brandandcar

Resident z

where brand <> car

group by brand, line;

DROP Table z;

z2:

NoConcatenate

load line, Concat(brandandcar, ',') resident z1 group by line;

Gysbert_Wassenaar

each line is an other item

What kind of entity is item? Does it represent anything or is it just a string of characters?


talk is cheap, supply exceeds demand
Not applicable
Author

Looks good.....will try it in a quiet Moment....busy with other stuff.....will answer then..