Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

remove duplicates while taking sum.

Hi All,

Hope you had a great weekend.

Here I am with a new set of problem.

I have attached an excel with this email with the data set.

The problem is

I have duplicate item Id. So how do I remove the duplicates?. My final aim is to total the Design - Create date and count the unique Item Id and then find the average.

But here because of the duplicate item ID, I also have extra amount in my total. How do i remove the duplicates and there influence in the total.

please help,

Thanks a lot,

Regards,

Bikash

1 Solution

Accepted Solutions
Clever_Anjos
Employee
Employee

That´s great news.
Please mark the topic as 'answered', so
Best regards

View solution in original post

16 Replies
Clever_Anjos
Employee
Employee

Do you need the dup data?
Why don´t you just LOAD DISTINCT from your excel file?

Not applicable
Author

Hi Anjos,

Thanks for the solution.

But I am reading them from QVDs.

The problem actually started this way,

I had several Modified Date for the same "stage" of the Item in the data model. Stage,here are like pending stage, design stage etc.

But my user is actually using the last. modified date. So I loaded the max(Modified Date) with the Item Id, which suppossedly should have given me single records for each item, but now it seems that even max(modified date) is repeating.

Please suggest,

Thanks,

Bikash

Clever_Anjos
Employee
Employee

LOAD
[ITEM ID],
MAX([Design -Create Date]) as MaxDesignDate
FROM [your file]
GROUP BY [ITEM ID];

Not applicable
Author

Could you please explain,

What your code will do? I understand that it will pick the item and the maximum of the difference.

Because I don't have a field called Design-Create Date in the model.

I have made that field in this excel just to explain my problem.

I am sorry, if that lead you astray.

Bikash

Clever_Anjos
Employee
Employee

This code will calculate the Maximum DesingCreate Date (This field is present in your excel file) for each ITEM ID.
LOAD
[ITEM ID],
MAX(Modified Date) as MaxModifiedDate

FROM [your file]
GROUP BY [ITEM ID];

Not applicable
Author

But now I see that this Max(Modified Date) is also repeating.

Clever_Anjos
Employee
Employee

Are you using GROUP BY modifier?
It seems weird to me.
Could you post a qvw with sample data?

Not applicable
Author

No, I am using group by ITEM ID.

Clever_Anjos
Employee
Employee

Are you using ou are you NOT using GROUP BY?
Could you post a qvw with sample data?