Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I'm trying to change some of our scripts to improve load performance and want to remove any data associated with the current day. I've changed to do the following:
Let vCutoff=floor(now());
CUTOFF:
Load
$(vCutoff) as KPI_Date
AutoGenerate 1
;
TEST:
LOAD KPI_ID,
KPI_Date,
Amount
FROM
[KPI_2016-05*.qvd]
(qvd)
WHERE NOT EXISTS (KPI_Date);
DROP TABLE CUTOFF;
The load is now optimised but it only reads one record from each qvd it's reading in, instead of the entire set of data. When I change to EXISTS it reads in everything from the input qvds related to today, which is what I would expect. Why is it not doing the same for the NOT EXISTS?
Thanks
Mark
The load is now optimised but it only reads one record from each qvd it's reading in, instead of the entire set of data. When I change to EXISTS it reads in everything from the input qvds related to today, which is what I would expect. Why is it not doing the same for the NOT EXISTS?
Thanks
Mark
That's because Exists(FIELD) is comparing the input record value of FIELD with the available symbols of FIELD in the symbol table, which contain all FIELD values loaded so far, including the current LOAD statement.
You could probably create the dates you want to load (all possible dates in the QVD except today()) and then just use a WHERE EXISTS(FIELD)
and still keep the optimized LOAD, or go with Sunny's suggestion and do an un-optimized LOAD.
Is KPI_Date also as floor(Date) stored? Have you the field KPI_Date already loaded before in any other load-statement?
- Marcus
Thanks for the reply Marcus.
No, this is the entire script just to test the method will work.
Try converting both fields to numeric format or both to date format and compare.
How about this:
CUTOFF:
LOAD Today() as KPI_Date_Check
AutoGenerate 1;
TEST:
LOAD KPI_ID,
KPI_Date,
Amount
FROM
[KPI_2016-05*.qvd]
(qvd)
WHERE NOT EXISTS (KPI_Date_Check, Floor(KPI_Date));
DROP TABLE CUTOFF;
In general exists() and not exists() do work - if not then is anything with the included field(s) wrong. Therefore check the values and formats from this field and the variable.
- Marcus
Thanks Sunny,
Unfortunately that doesn't give an optimized load as I believe the column names have to be identical for the optimization to work
Thanks Marcus, I don't think this is the case as I've even tried with an integer format of the date (e.g. 20160530) so there can be no confusion with formats and I still get the same error. Also, the EXISTS works ok and picks all records that match from the correct date. Its just the NOT EXISTS that picks one record from each input date/file (the files are split by date)
That's true, I forgot
Thanks Kiru, please see the answer to Marcus above. The date format matches when performing an EXISTS, but when I change to NOT EXISTS it reads in one record from every file we read in, instead of every record