Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

NOT EXISTS Optimized Load Issue

I'm trying to change some of our scripts to improve load performance and want to remove any data associated with the current day. I've changed to do the following:

Let vCutoff=floor(now());

CUTOFF:

Load

$(vCutoff) as KPI_Date

AutoGenerate 1

;

TEST:

LOAD KPI_ID,

     KPI_Date,

     Amount

FROM

[KPI_2016-05*.qvd]

(qvd)

WHERE NOT EXISTS (KPI_Date);

DROP TABLE CUTOFF;

The load is now optimised but it only reads one record from each qvd it's reading in, instead of the entire set of data. When I change to EXISTS it reads in everything from the input qvds related to today, which is what I would expect. Why is it not doing the same for the NOT EXISTS?

Thanks

Mark

1 Solution

Accepted Solutions
swuehl
MVP
MVP

The load is now optimised but it only reads one record from each qvd it's reading in, instead of the entire set of data. When I change to EXISTS it reads in everything from the input qvds related to today, which is what I would expect. Why is it not doing the same for the NOT EXISTS?

Thanks

Mark

That's because Exists(FIELD) is comparing the input record value of FIELD with the available symbols of FIELD in the symbol table, which contain all FIELD values loaded so far, including the current LOAD statement.

You could probably create the dates you want to load (all possible dates in the QVD except today()) and then just use a WHERE EXISTS(FIELD)

and still keep the optimized LOAD, or go with Sunny's suggestion and do an un-optimized LOAD.

View solution in original post

16 Replies
marcus_sommer

Is KPI_Date also as floor(Date) stored? Have you the field KPI_Date already loaded before in any other load-statement?

- Marcus

Anonymous
Not applicable
Author

Thanks for the reply Marcus.

No, this is the entire script just to test the method will work.

Not applicable
Author

Try converting both fields to numeric format or both to date format and compare.

sunny_talwar

How about this:

CUTOFF:

LOAD Today() as KPI_Date_Check

AutoGenerate 1;

TEST:

LOAD KPI_ID,

    KPI_Date,

    Amount

FROM

[KPI_2016-05*.qvd]

(qvd)

WHERE NOT EXISTS (KPI_Date_Check, Floor(KPI_Date));

DROP TABLE CUTOFF;

marcus_sommer

In general exists() and not exists() do work - if not then is anything with the included field(s) wrong. Therefore check the values and formats from this field and the variable.

- Marcus

Anonymous
Not applicable
Author

Thanks Sunny,

Unfortunately that doesn't give an optimized load as I believe the column names have to be identical for the optimization to work

Anonymous
Not applicable
Author

Thanks Marcus, I don't think this is the case as I've even tried with an integer format of the date (e.g. 20160530) so there can be no confusion with formats and I still get the same error. Also, the EXISTS works ok and picks all records that match from the correct date. Its just the NOT EXISTS that picks one record from each input date/file (the files are split by date)

sunny_talwar

That's true, I forgot

Anonymous
Not applicable
Author

Thanks Kiru, please see the answer to Marcus above. The date format matches when performing an EXISTS, but when I change to NOT EXISTS it reads in one record from every file we read in, instead of every record