Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I am doing an incremental load but I think I accidentally loaded my most recent data more than once even though there is a variable that is supposed to prevent it. How can I delete data from the QVD?
Something like:
PP_USAGE:
LOAD TRX_DATE,
VERSION_PW_CAT_ID,
DESCRIPTION,
FACILITY,
DOMAIN,
USAGE,
EXTRACT_DATE
FROM
[..\QVD\01_Extract_QVD\PP_USAGE.qvd]
(qvd);
drop table PP_USAGE where TRX_DATE > '07/01/2017'
Here is my load script:
vMaxDate looks right so I'm not sure how it duplicated the data.
PP_USAGE:
LOAD TRX_DATE,
VERSION_PW_CAT_ID,
DESCRIPTION,
FACILITY,
DOMAIN,
USAGE,
EXTRACT_DATE
FROM
[..\QVD\01_Extract_QVD\PP_USAGE.qvd]
(qvd);
MAX_DATE:
LOAD
max(date(TRX_DATE, 'MM/DD/YYYY')) as Maxdate
resident PP_USAGE;
//5.Store the Maximum date in a variable.
Let vMaxdate = date(peek('Maxdate'), 'MM/DD/YYYY');
TRACE vMaxdate = $(vMaxdate); //pbk
//7.Pull the rows that are new from `1DHO_EKG_PLAN_USAGE_CAREB.csv`.
PP_USAGE_NEW:
LOAD date(TRX_DATE, 'MM/DD/YYYY') as TRX_DATE,
VERSION_PW_CAT_ID,
DESCRIPTION,
FACILITY,
DOMAIN,
USAGE ,
EXTRACT_DATE
FROM
[..\External_Data\1DHO_EKG_PLAN_USAGE_CAREB.csv]
(txt, codepage is 1252, embedded labels, delimiter is ',', msq)
//where floor(date(TRX_DATE),'MM/DD/YYYY') > $(vMaxdate); ///Come back to this
;
//4. Find max date of QVD file
MAX_DATE:
LOAD
max(date(TRX_DATE, 'MM/DD/YYYY')) as Maxdate
resident PP_USAGE;
//5.Store the Maximum date in a variable.
Let vMaxdate = date(peek('Maxdate'), 'MM/DD/YYYY');
////6. Drop the stored table
drop table MAX_DATE;
//7.Pull the rows that are new from `1DHO_EKG_PLAN_USAGE_ECISA.csv`.
CONCATENATE (PP_USAGE)
//PP_USAGE_A:
LOAD date(TRX_DATE, 'mm-dd-yyyy') as TRX_DATE,
VERSION_PW_CAT_ID,
DESCRIPTION,
FACILITY,
DOMAIN,
USAGE ,
EXTRACT_DATE
FROM
[..\External_Data\1DHO_EKG_PLAN_USAGE_ECISA.csv]
(txt, codepage is 1252, embedded labels, delimiter is ',', msq)
//where TRX_DATE > $(vMaxdate);
;
//9. store date into existing qvd
store PP_USAGE into [..\QVD\01_Extract_QVD\PP_USAGE.qvd]
(qvd);
tmp:
Load TRX_DATE,
// VERSION_PW_CAT_ID,
// DESCRIPTION,
FACILITY,
// DOMAIN,
USAGE ,
EXTRACT_DATE,
// autonumber(DESCRIPTION & '|' & DOMAIN) as %DESC_DOMAIN_Key, //join to Regional Variability,
autonumber(VERSION_PW_CAT_ID & '|' & DOMAIN) as %PP_DOMAIN_Key //Join to REVIEW
Resident PP_USAGE;
drop Table PP_USAGE;
RENAME Table tmp to PP_USAGE;
Is it intentionally that the WHERE clauses in your LOAD statements that load the new data are commented out?
If you need to remove data from your QVD, you can apply a WHERE clause to filter your data, then store the data into the QVD.
Like Stefan stated you could only remove records from a qvd by reloading the qvd with an appropriate where-clause and storing them again.
What possibly didn't work is that you read the max. date a second time without dropping the similar table from before. Beside this your approach looked more complex as necessary and something like this should be enough:
table:
load TRX_DATE, ... from QVD;
concatenate(table)
load TRX_DATE, ... from CSV where not exists(TRX_DATE);
- Marcus
I am pulling data with the same TRX_DATE from 2 different CSV files to append to the QVD so I don't think I can use where not exists.
In theory, I will always be appending net new data and therefore there shouldn't be anything needing this but I think I was mistaken.