Hello all,
I have two large tables that are only going to get bigger over time. I need one column from table B to be associated with table A, but no matter how I try to associate them, the data manager fails to load and gets stuck on the "Loading Tables..." screen. I've tried multiple joins, creating the association in the data manager after the load, and ApplyMap. The only way I could even get it to load, to verify the association was working, was to limit the data pulled from TableA using "WHERE "timestamp" > '2018-08-29';" — and even then it takes a few minutes on the "Loading Tables..." portion of the data manager and about a minute and a half to load. Ideally I would like to pull data as far back as the first of the year.
What would be perfect is a way to load only "new" data, i.e. data that doesn't already exist from the previous load. I ran across a post or two suggesting WHERE NOT EXISTS, but I wasn't able to make it work in my script without errors.
What can I do to the script below to load only new data, or at least decrease load time, while still associating "Kind_V" with "UnqID"?
LIB CONNECT TO 'My-DBO';
[TableA]:
Mapping Load
    Order & Code as UnqID,
    "Kind_V";
SELECT
    Order,
    Code,
    "Kind_V"
FROM MY.dbo."V_Mat"
WHERE "timestamp" > '2018-08-29';

[TableB]:
Load
    Order & Code as UnqID
From ['lib://MyFolder\*.xlsx']
(ooxml, embedded labels, table is [MatInfo]);

[TableC]:
Load
    *,
    ApplyMap('TableA', UnqID, Null()) as Mat_Kind
Resident TableB;
Drop Table TableB;
Qlik is an in-memory analysis tool, which means all analysis is done in system RAM.
A plain Load * or SELECT * always fetches data fastest from the source, so with that in mind I would suggest not doing transformations in the preceding load of TableA. Instead, create a two-tier architecture by doing the extracts separately:
Store TableA into a QVD.
Store TableB into a QVD.
Use a third file to read both QVDs and do the transformations.
/** File (QVW) 1 **/
LIB CONNECT TO 'My-DBO';

[TableA]:
Load *;
SELECT
    Order,
    Code,
    "Kind_V"
FROM MY.dbo."V_Mat";

A:
LOAD
    Order & Code as UnqID,
    "Kind_V"
Resident TableA;
DROP Table TableA;

STORE A into TableA.qvd (qvd);
DROP Table A;
/** File (QVW) 2 **/
[TableB]:
Load
    Order,
    Code
From ['lib://MyFolder\*.xlsx']
(ooxml, embedded labels, table is [MatInfo]);

STORE TableB into TableB.qvd (qvd);
DROP Table TableB;
/*** File (QVW) 3 ***/
TableA_Map:
Mapping LOAD
    UnqID,
    "Kind_V"
FROM TableA.qvd (qvd);

TableB: /** Optimized load straight from the QVD **/
NoConcatenate
Load
    Order,
    Code
FROM TableB.qvd (qvd);

/*** If your TableB is small you can add the ApplyMap into the QVD load itself; otherwise use the below ***/
Final:
NoConcatenate
LOAD
    Order,
    Code,
    Order & Code as UnqID,
    ApplyMap('TableA_Map', Order & Code, Null()) as Mat_Kind
Resident TableB;
DROP Table TableB;
As for the tables getting larger over time: that is handled by incremental loading, and it's easier against a QVD than against the source. Refer to this overview of Qlik incremental loading:
https://www.analyticsvidhya.com/blog/2014/09/qlikview-incremental-load/
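A minimal insert-only incremental pattern, reusing the field and connection names from the scripts above (the QVD path is illustrative and assumes the QVD already exists from an initial full load), could look roughly like this:

```qlik
// Read the high-water mark from the existing QVD.
MaxDateTab:
LOAD Max("timestamp") as MaxTS
FROM [lib://MyQVD/TableA.qvd] (qvd);

LET vMaxTS = Peek('MaxTS', 0, 'MaxDateTab');
DROP Table MaxDateTab;

LIB CONNECT TO 'My-DBO';

// Pull only rows newer than the stored high-water mark.
Incremental:
SQL SELECT Order, Code, "Kind_V", "timestamp"
FROM MY.dbo."V_Mat"
WHERE "timestamp" > '$(vMaxTS)';

// Append the history and write the combined table back out.
Concatenate (Incremental)
LOAD * FROM [lib://MyQVD/TableA.qvd] (qvd);

STORE Incremental into [lib://MyQVD/TableA.qvd] (qvd);
DROP Table Incremental;
```

This only handles inserts; handling updates or deletes needs a key field and an Exists() filter, as the linked article describes.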
Thanks for the detailed explanation.
I did implement your suggested code and there was an improvement in load time, but as you mentioned, to handle the data increasing over time I will need some kind of incremental load. In my earlier example, one source was a set of xlsx files from a report that ran automatically once a day; I'm no longer using that connection source. I thought it could have helped speed up the load versus pulling from the server, but with an incremental load that is no longer the case.
My latest plan is to have a "historical load" QVW that loads all my desired fields from "My-DBO" WHERE "timestamp" > x AND "timestamp" < y, then STOREs them to "MY-QVD".
I will change the x and y variables month by month to build up the history of the file until I have everything from 2018-01-01 to present.
THEN
I will have another "incremental load" QVW that does the incremental load and serves as the "working" file, but I'm having an issue with this script getting stuck on the "Loading Tables..." screen. This time the data manager actually loads; it's after I click "Synchronize scripted tables" that it gets stuck. I've stripped the script down to what should be a minimal load, so I believe it's something I'm doing wrong rather than loading too much data. Below are my current scripts for the historical and incremental loads.
//** HISTORICAL LOAD **//
LIB CONNECT TO 'My-DBO';
[Historical_Load]:
SQL SELECT
    Order,
    "timestamp"
FROM "My-DBO".dbo."V_Mat"
WHERE "timestamp" > '2018-10-02' AND "timestamp" < '2018-10-03';

STORE Historical_Load into 'lib://MyQVD/WorkingFile.qvd' (qvd);

//** NOTE: the WHERE clause "timestamp" > '2018-10-02' AND "timestamp" < '2018-10-03' was added to limit the data pulled while verifying the increment works **//
//** INCREMENTAL LOAD **//
[CurrentMaxDate]:
Load Max(Date("timestamp", 'YYYY-MM-DD')) as MaxDate
FROM 'lib://MyQVD/WorkingFile.qvd' (qvd);

Let MaxDate = Peek('MaxDate', 0, 'CurrentMaxDate');
DROP Table CurrentMaxDate;

LIB CONNECT TO 'My-DBO';

[Incremental_Load]:
SQL SELECT
    Order,
    "timestamp"
FROM "My-DBO".dbo."V_Mat"
WHERE "timestamp" > '$(MaxDate)';

/*Concatenate (Incremental_Load)
LOAD *
FROM 'lib://MyQVD/WorkingFile.qvd' (qvd);

STORE Incremental_Load into 'lib://MyQVD/WorkingFile.qvd' (qvd);*/

//** NOTE: The section above is commented out to test the script output **//
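For reference: because MaxDate is formatted as a date with no time part, rows with a later timestamp on the boundary day can be pulled again on the next run. A common guard is to filter the historical QVD rows with Exists() on a key field when concatenating. This is a sketch only, assuming Order uniquely identifies a row (verify that against your data before relying on it):

```qlik
// After the SQL SELECT above has loaded the new rows into Incremental_Load:
Concatenate (Incremental_Load)
LOAD *
FROM 'lib://MyQVD/WorkingFile.qvd' (qvd)
WHERE Not Exists(Order);   // skip historical rows whose key was just reloaded

STORE Incremental_Load into 'lib://MyQVD/WorkingFile.qvd' (qvd);
DROP Table Incremental_Load;
```

Using >= '$(MaxDate)' in the SQL WHERE clause together with this Exists() filter avoids both duplicates and gaps at the day boundary.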
1. Double-check that your date formats are the same in all places, i.e. MaxDate, Max(MaxDate), "timestamp", and $(MaxDate).
2. As a suggestion, store your daily files in WorkingFile_20181005.qvd (YYYYMMDD) format.
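Point 2 can be scripted rather than done by hand; a small sketch (the path and table name are illustrative, taken from the scripts above):

```qlik
// Build today's stamp in YYYYMMDD form and store the day's extract under it.
LET vStamp = Date(Today(), 'YYYYMMDD');
STORE Incremental_Load into 'lib://MyQVD/WorkingFile_$(vStamp).qvd' (qvd);

// All daily files can later be read back in one pass with a wildcard:
// LOAD * FROM 'lib://MyQVD/WorkingFile_*.qvd' (qvd);
```

Keeping one QVD per day also makes it easy to rebuild the history if a single day's extract goes wrong.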