Skip to main content
Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for 
Search instead for 
Did you mean: 
sanjujeeboy
Creator
Creator

how to remove duplicate records( a bit tricky)

Hi All,

 

in the below table i want to remove the records highlighted in RED. as it is the sum of records above it.

How to remove this record? excel attached for Ref:

Product.productGroupClicks
  
AND(PRODUCT_TYPE_LEVEL_1 = "cuidado personal",CUSTOM_LABEL_3 = "afeitadoras aquatouch",PRODUCT_ID = "s5050/64")865
AND(PRODUCT_TYPE_LEVEL_1 = "cuidado personal",CUSTOM_LABEL_3 = "afeitadoras aquatouch",PRODUCT_ID = "s5400/06")233
AND(PRODUCT_TYPE_LEVEL_1 = "cuidado personal",CUSTOM_LABEL_3 = "afeitadoras aquatouch",PRODUCT_ID = "s5420/06")615
AND(PRODUCT_TYPE_LEVEL_1 = "cuidado personal",CUSTOM_LABEL_3 = "afeitadoras aquatouch")1713
AND(CUSTOM_LABEL_2 = "bodygroom",CUSTOM_LABEL_3 = "dauerhafte haarentfernung")488
AND(CUSTOM_LABEL_2 = "bodygroom",CUSTOM_LABEL_3 = "rasierer")269
AND(CUSTOM_LABEL_2 = "bodygroom",CUSTOM_LABEL_3 = "schneiden und rasieren")649
AND(CUSTOM_LABEL_2 = "bodygroom")1406
1 Solution

Accepted Solutions
cwolf
Creator III
Creator III

For your data you have to use at least also the date field. Possibly other fields as well, for example:

t:
load 
*,
Product.Date & '|' &  [Product.Paid Search Ad Group] & '|' & left(Product.productGroup,len(Product.productGroup)-1) as ppg
from [.\Search1.qvd](qvd);

Final:
load
*
;//where idx<>1;
load
*,
Index(Previous(ppg),ppg) as idx
Resident t Order by ppg desc;

//drop fields ppg,idx;
drop table t;

View solution in original post

6 Replies
cwolf
Creator III
Creator III

For example:

t:
load Product.productGroup, 
     Clicks,
     left(Product.productGroup,len(Product.productGroup)-1) as ppg //delete closing bracket
from [.\duplicate.xlsx](ooxml, embedded labels, table is Sheet1);

Final:
load
*
where idx<>1;
load
*,
Index(Previous(ppg),ppg) as idx
Resident t Order by ppg desc;

drop fields ppg,idx;
drop table t;
sanjujeeboy
Creator
Creator
Author

Hi 

The actual file structure is attached. can you help with the solution. i tried with above solution but somehow it didnt work on the actual file structure.

@cwolf  @sunny_talwar 

cwolf
Creator III
Creator III

It works. Look at the attachment. In the table all lines with idx=1 will removed. I think that is what you want.

sanjujeeboy
Creator
Creator
Author

Hi ,

the above solution is the exact output which i want.

Im  using the same code as yours but somehow  when i select idx 0 there are very few click numbers.

attached is reference file and qvd

Kindly look into this

@cwolf 

cwolf
Creator III
Creator III

For your data you have to use at least also the date field. Possibly other fields as well, for example:

t:
load 
*,
Product.Date & '|' &  [Product.Paid Search Ad Group] & '|' & left(Product.productGroup,len(Product.productGroup)-1) as ppg
from [.\Search1.qvd](qvd);

Final:
load
*
;//where idx<>1;
load
*,
Index(Previous(ppg),ppg) as idx
Resident t Order by ppg desc;

//drop fields ppg,idx;
drop table t;
sanjujeeboy
Creator
Creator
Author

Thanks @cwolf ...This one worked...

Really appreciate your help!!!