Skip to main content
Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for 
Search instead for 
Did you mean: 
ioannaiogr
Creator II
Creator II

need for short load script explanation

Hello all,

I read this script with the load without from clause which then concatenate with the next table that does resident load If I get it correctly. I need an explanation on how this works to avoid me just assuming the process. Thank you in advance.

Sales:
LOAD

SaleID as "Sales.SalesID",
"Sale_date" as "Sales.Sale_date"

Resident MainTable;

inner Keep

Last12monthsSales:
LOAD
    "Sales.SaleID"
    Where counter =1;
LOAD
    "Sales.SaleID", Count(DISTINCT "Sales.Sale_date") as counter
 
Resident Sales
Where "Sales.Sale_Date">=AddMonths(Today(), -12)
    Group By "Sales.SaleID"
    ;
Labels (2)
9 Replies
marcus_sommer

It's a filter on the table Sales for sales from the last 12 month which have exact one occurring. The approach looked a bit unusual but may solve a certain business-logic - whereby an inner join instead of an inner keep seems to be more sensible.

ioannaiogr
Creator II
Creator II
Author

Thank you both, this is the first time I've come across this.

Ok, so  it could also be done like 

 

Last12monthsSales:
 
LOAD
    "Sales.SaleID", Count(DISTINCT "Sales.Sale_date") as counter
 
Resident Sales
Where "Sales.Sale_Date">=AddMonths(Today(), -12) a
    Group By "Sales.SaleID"
having counter=1;
 
but the original approach is better because it is faster?
marcus_sommer

Qlik doesn't support a having-clause in an aggregation-load. But in this case the condition could be performed within the loads like:

t: load id, date, -1 as x resident X; inner join(t) load id, count(date)=1 as x resident t group by id;

More performant could be to skip the entire approach or most of it respectively applying any logic with more added value. For example it's quite unusual that there are existing sales-records without sales else it's very common that they don't exists. In such case a load distinct of id's from the last 12 months would return all which occur - zero wouldn't possible but of course id's with more as one date/record.

Like above hinted the intention behind the measure would be crucial. In regard to count certain activities in certain periods and/or the time-span between them or similar stuff to detect new/lost customers and/or scoring them in any way it would be not sufficient. 

ioannaiogr
Creator II
Creator II
Author

Marcus, 

I have a few questions on this script snippet. Let's break it down

t: load id, date, -1 as x resident X;

// Question : this would produce a table with columns id, date and a column with all values -1. How does x serve us in this case?

inner join(t)

load id, count(date)=1 as x resident t group by id;

//question2: this would produce a table with columns id, x with all values 1 and what about the rest of ids with another count? Is this similar to load id,count(date) as x resident t  where count(date)=1 group by id;

 

question 3: Inner joining  the tables on  both id and x wouldn't cause an error?

 

marcus_sommer

If count(date)=1 is TRUE it would return -1 and by FALSE = 0. The inner join then filters the table to all matching field-values of both key-fields and should not return an error.

ioannaiogr
Creator II
Creator II
Author

@marcus_sommer  I created a sample script to better understand your suggestion. Indeed the only returned id is 1 as it should be.  As for my question about the error, there would be an error (synthetic key I meant, so not much of an error) if the tables weren't joined, right? Thank you so much!

t:
LOAD * INLINE [
id, date, x
1, 20/11/2023, -1
2, 19/11/2023, -1
2, 20/11/2023, -1
];

inner join(t)
load id, count(date((date),'DD/MM/YYYY'))=1 as x
resident t
group by id;

Aasir
Creator III
Creator III

1. Main part.

Sales:
LOAD
SaleID as "Sales.SalesID",
"Sale_date" as "Sales.Sale_date"
Resident MainTable;
This loads data from the "MainTable" into a table named "Sales." It renames the fields "SaleID" and "Sale_date" to "Sales.SalesID" and "Sales.Sale_date," respectively.

2. Inner Keep
inner Keep
Last12monthsSales:
LOAD
"Sales.SaleID"
Where counter = 1;
hen loads data into a table named "Last12monthsSales." It only includes rows where the field "counter" is equal to 1. The "counter" field is created in the next part.

3. Counter load
LOAD
"Sales.SaleID", Count(DISTINCT "Sales.Sale_date") as counter
Resident Sales
Where "Sales.Sale_Date">=AddMonths(Today(), -12)
Group By "Sales.SaleID";

This part creates a table named "Counter" by counting the distinct "Sales.Sale_date" for each "Sales.SaleID" in the "Sales" table. It only includes rows where the "Sales.Sale_Date" is in the last 12 months.

The "counter" field represents the count of distinct sale dates for each sale ID in the last 12 months.

The "Last12monthsSales" table created in step 2 includes only those sale IDs where the "counter" is equal to 1. This effectively filters for sale IDs that have sales in all of the last 12 months.




marcus_sommer

Yes, id + date create a synthetic key if both tables remain - the source and this extra resident ones. I didn't considered at this point because it depends on all the other tables and their intended purpose how to address it. Renaming the fields and/or combining them and/or merging the tables or maybe skipping/replacing this part completely.