Skip to main content
Announcements
Join us at Qlik Connect for 3 magical days of learning, networking,and inspiration! REGISTER TODAY and save!
cancel
Showing results for 
Search instead for 
Did you mean: 
Santanunandy
Contributor II
Contributor II

First Occurance

Hello Experts
I have a data set like the below and would like to find out the first occurrence of the circuit id
if there are instances of the same circuit id found once again in the next invoice , I would like to treat this as old.
The no of circuits that I have is around 1 billion , how do I find the first occurrence and tag that as New and other occurrences as old without having a sync key.

CircuitIDInvoice noResults
1ABC-001New
2ABC-001New
3ABC-001New
4ABC-002New
3ABC-002Old
2ABC-002Old
1ABC-002Old
6ABC-002New

 

 

Labels (1)
2 Solutions

Accepted Solutions
JuanGerardo
Partner - Specialist
Partner - Specialist

Hi @Santanunandy, I think the best way is using Exists() function with the invoice field, like this:

Load
	CircuitID,
    Invoice,
    If(Exists(CircuitID), 'Old', 'New')	AS Result
Inline [
CircuitID	Invoice
1	ABC-001
2	ABC-001
3	ABC-001
4	ABC-002
3	ABC-002
2	ABC-002
1	ABC-002
6	ABC-002
] (delimiter is '\t');

JG

View solution in original post

JuanGerardo
Partner - Specialist
Partner - Specialist

When you use Exists(), the circuit_id field already exists in table NewTableOne, so the values are not new...

You can use an auxiliary field for this:

...
FinalTable:
load 
   circuit_id,
   file_name, 
   circuit_id	AS ExistCircuitID,
   If(Exists(ExistCircuitID, circuit_id), 'Old', 'New')	AS Results
resident NewTableOne;

Drop Field ExistCircuitID;
...

JG

View solution in original post

7 Replies
JuanGerardo
Partner - Specialist
Partner - Specialist

Hi @Santanunandy, I think the best way is using Exists() function with the invoice field, like this:

Load
	CircuitID,
    Invoice,
    If(Exists(CircuitID), 'Old', 'New')	AS Result
Inline [
CircuitID	Invoice
1	ABC-001
2	ABC-001
3	ABC-001
4	ABC-002
3	ABC-002
2	ABC-002
1	ABC-002
6	ABC-002
] (delimiter is '\t');

JG

Santanunandy
Contributor II
Contributor II
Author

Hello ,
When I run this script it says circuitid not found. Can you pls help
Santanunandy
Contributor II
Contributor II
Author

Hello ,

With the inline table the data is getting loaded but while I am trying to pick up the data from the .qvd files I can see only old values are picked up and not New .Can you pls help.

NewTable:

LOAD
circuit_id,
file_name
FROM [lib://QVD/CHARGE.qvd]
(qvd);

NoConcatenate
NewTableOne:
load
distinct circuit_id,
file_name
resident NewTable
order by file_name asc;

drop table NewTable;

NoConcatenate

FinalTable:

load circuit_id,file_name, if(Exists(circuit_id),'Old','New') as Results
resident NewTableOne;

drop table NewTableOne;

exit Script;

JuanGerardo
Partner - Specialist
Partner - Specialist

When you use Exists(), the circuit_id field already exists in table NewTableOne, so the values are not new...

You can use an auxiliary field for this:

...
FinalTable:
load 
   circuit_id,
   file_name, 
   circuit_id	AS ExistCircuitID,
   If(Exists(ExistCircuitID, circuit_id), 'Old', 'New')	AS Results
resident NewTableOne;

Drop Field ExistCircuitID;
...

JG

Santanunandy
Contributor II
Contributor II
Author

Thank you JD for replying but in the NewTableOne table I had 91 M rows but when the FinalTable was created I had only 79 M but my exception that the final table would also be having the same rows of rows as the Newtableone  with results being  with Old and New.

There are ~12 M rows missing after the final table have been created , Can you help me understand why is this the case. 

JuanGerardo
Partner - Specialist
Partner - Specialist

Probably this is because you are using a distinct clause, so you have about 12 M duplicated rows. 
JG

Santanunandy
Contributor II
Contributor II
Author

Thanks JD and Spot on .Thanks for all your help