Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I am new to Qlikview scripting and need some help with de-duping. When loading data, I do NOT want to eliminate duplicates, but I’d like to identify them with a flag so I can filter them out later. From what I can tell, I might want to use the PEEK function, but I’m not quite sure how to incorporate that into the script. Here’s what I have so far:
(connecting to Salesforce)
SQL SELECT
AccountId,
Act_Total_Lines__c,
CloseDate,
DART_ID__c,
Id,
Total_SIMs__c,
FROM Opportunity
WHERE CreatedDate >= 2014-01-01T00:00:00.000Z;
LOAD *,
AccountId & '-' & CloseDate & '-' & Total_SIMs__c as DupeId,
if(Peek('DupeId')=DupeId,0,1) as Flag1,
Num
Resident Opportunity
order by DupeId Asc;
The error I’m getting is: “File not found” during the “LOAD” statement. Can you recommend how to get this to work? Am I way off base? Thanks!
Right, I missed that you are using DupeId also in the order by clause.
Opportunity:
LOAD *,
AccountId & '-' & CloseDate & '-' & Total_SIMs__c as DupeId;
SQL SELECT
AccountId,
Act_Total_Lines__c,
CloseDate,
DART_ID__c,
Id,
Total_SIMs__c,
FROM Opportunity
WHERE CreatedDate >= 2014-01-01T00:00:00.000Z;
LOAD *,
if(Peek('DupeId')=DupeId,0,1) as Flag1
Resident Opportunity
order by DupeId Asc;
You can also try to use exists() function to flag your duplicates.
file not found or table not found?
try this and check the table exists
SQL SELECT
AccountId,
Act_Total_Lines__c,
CloseDate,
DART_ID__c,
Id,
Total_SIMs__c,
FROM Opportunity
WHERE CreatedDate >= 2014-01-01T00:00:00.000Z;
exit script;
Actually, I misread the error. It says, "Field not found", then shows the LOAD statment. Is it not finding the DupeID field? Perhaps that has to be done before the SELECT statement?
You probably get a 'field not found' because field DupeId is not part of your resident table.
Try
LOAD *,
if(Peek('DupeId')=DupeId,0,1) as Flag1;
LOAD *,
AccountId & '-' & CloseDate & '-' & Total_SIMs__c as DupeId
Resident Opportunity
order by DupeId Asc;
assuming you labeled your first table Opportunity. I removed Num from your resident load, because it doesn't seem part of your resident load either.
I'm still getting the "Field not Found" error using this script:
Opportunity:
SQL SELECT
AccountId,
Act_Total_Lines__c,
CloseDate,
DART_ID__c,
Id,
Total_SIMs__c,
FROM Opportunity
WHERE CreatedDate >= 2014-01-01T00:00:00.000Z;
LOAD *,
if(Peek('DupeId')=DupeId,0,1) as Flag1;
LOAD *,
AccountId & '-' & CloseDate & '-' & Total_SIMs__c as DupeId
Resident Opportunity
order by DupeId Asc;
Right, I missed that you are using DupeId also in the order by clause.
Opportunity:
LOAD *,
AccountId & '-' & CloseDate & '-' & Total_SIMs__c as DupeId;
SQL SELECT
AccountId,
Act_Total_Lines__c,
CloseDate,
DART_ID__c,
Id,
Total_SIMs__c,
FROM Opportunity
WHERE CreatedDate >= 2014-01-01T00:00:00.000Z;
LOAD *,
if(Peek('DupeId')=DupeId,0,1) as Flag1
Resident Opportunity
order by DupeId Asc;
You can also try to use exists() function to flag your duplicates.
Thanks for your help, Massimo and Swuehl...it appears to be working.
You can do this in a single pass using autonumber.
Opportunity:
LOAD
*,
autonumber(Recno(), AccountId & '-' & CloseDate & '-' & Total_SIMs__c)>1 as Flag1
;
SQL SELECT
AccountId,
Act_Total_Lines__c,
CloseDate,
DART_ID__c,
Id,
Total_SIMs__c,
FROM Opportunity
WHERE CreatedDate >= 2014-01-01T00:00:00.000Z;
This technique can be slow if you have a lot of rows, but I include it for completeness. I like it because it doesn't rely on sort order.
-Rob