Hi all.
So I've figured out a way of reading my large table for the first time. I read it in parts, one per year, so I ended up with Qvd_2011, Qvd_2012, ... Qvd_2021, and the final table is a concatenation of them.
My question is: how do I do the hard-delete incremental load without it AGAIN taking so long?
My script for the incremental load is:
TempYear:
NoConcatenate
LOAD Distinct year(date(floor(datetime),'DD/MM/YYYY')) as Year;
SQL SELECT datetime
FROM public.tab1;
LET NumRows=NoOfRows('TempYear');
trace $(NumRows);
For n=1 To $(NumRows)
let vYear = Peek('Year',$(n)-1,'TempYear');
trace $(vYear);
TempSize:
NoConcatenate
LOAD
tab1_pk,
size,
year(date(floor(datetime),'DD/MM/YYYY')) as YearOfInterest;
SQL select
tab1.`pk` "tab1_pk",tab1.`datetime`,
tab2.`fk` "tab2_fk", tab2.`size` "size"
from
public.tab1 tab1
INNER JOIN
public.tab2 tab2
on tab2.`fk`=tab1.`pk`
where year(tab1.`datetime`)='$(vYear)' and tab1.`pk`>'$(MaxInitialID)'; //based on another table
Size:
LOAD
tab1_pk,
sum(size) as size
Resident TempSize
Group by tab1_pk;
DROP TABLE TempSize;
// let vFileName='Size-' & '$(vYear)';
// store Size into '$(vPathQVD)/$(vFileName).qvd' (qvd);
next n;
drop table TempYear;
Concatenate
Load * from [$(vPathQVD)Initial.qvd] (qvd)
WHERE NOT EXISTS(tab1_pk);
Inner Join
Load tab1_pk as tab1_pk;
SQL SELECT tab1.`pk` "tab1_pk" from
public.tab1 tab1
INNER JOIN
public.tab2 tab2
on tab1.`pk`=tab2.`fk`;
How do I handle the last part (the final Inner Join that reads all the keys back from the database), so that it doesn't take AGES to load? In fact it ran for at least 7 days and was still going... How do I do the year split in there as well?
Many many thanks in advance!
If I understand the problem correctly, you first load the primary keys of the deleted items and then load the QVDs with a WHERE NOT EXISTS clause, so the deleted items are not loaded from the QVDs. Then you store the QVDs again to 'delete' the items from them.
Since you have a QVD per year, you'll probably have to use a loop to iterate over the QVDs, and once all the QVDs are processed, load them again into one table:
DeletedItems:
LOAD MyPrimaryKey as DeletedKey; // rename the key so the QVD loads below don't add to the same field
SQL Select MyPrimaryKey FROM MyTable Where Status = 'deleted';

For vYear = 2010 To 2021
TmpQVD:
LOAD * FROM [LIB://QVDPath/qvd_$(vYear).qvd] (qvd) WHERE Not Exists(DeletedKey, MyPrimaryKey);
STORE TmpQVD INTO [LIB://QVDPath/qvd_$(vYear).qvd] (qvd);
DROP TABLE TmpQVD;
Next

DROP TABLE DeletedItems; // remove the key list so it doesn't linger in the data model

DataWithoutDeletedItems:
LOAD * FROM [LIB://QVDPath/qvd_*.qvd] (qvd);
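Note that I'd use the two-argument form Exists(DeletedKey, MyPrimaryKey), with the key renamed in DeletedItems, rather than the one-argument Exists(MyPrimaryKey): with the one-argument form every key read from a QVD is itself added to the field being tested, so a key occurring on more than one row would wrongly be dropped after its first occurrence. MyTable, MyPrimaryKey, Status and the QVD path are of course placeholders for your own names.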
@Gysbert_Wassenaar I don't have such a field (Status = 'deleted'), though.
There is a routine for hard-delete incremental loading ( https://help.qlik.com/en-US/sense/February2021/Subsystems/Hub/Content/Sense_Hub/LoadData/use-QVD-fil... ) which I followed. There needs to be a final Inner Join (as you said, to "delete" from the table), which reads from the database again... that is my problem and I don't know what to do... 😞
It doesn't matter that you don't have a field called Status. What matters is that there is a way to identify the records that should be deleted. I don't know your data, so I can't tell you how to do that. The point is that you should load a set of unique primary keys identifying the items that should be deleted. Once you have that, you can load the data from the QVDs you created earlier with a WHERE NOT EXISTS clause, so the deleted items don't get loaded from the QVDs. Then store that data again in QVDs, so that only the items that are not deleted remain.
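For example, you could turn the logic around: instead of a list of deleted keys (which you don't have), load the keys that still exist in the source (your final Inner Join already does exactly this) and keep only those rows from the QVDs. Pulling just the key column is a much narrower query than the full join you load per year, and the loop then touches only the QVDs, never the database. A minimal sketch, assuming your yearly QVDs are named Qvd_2011 ... Qvd_2021 under $(vPathQVD) and contain tab1_pk (CurrentKey and Result are placeholder names):

// Fetch only the surviving keys from the source: a narrow, keys-only query.
CurrentKeys:
LOAD tab1_pk as CurrentKey;
SQL SELECT tab1.`pk` "tab1_pk"
FROM public.tab1 tab1
INNER JOIN public.tab2 tab2 ON tab2.`fk` = tab1.`pk`;

// Rewrite each yearly QVD, keeping only rows whose key still exists in the source.
For vYear = 2011 To 2021
TmpQVD:
LOAD * FROM [$(vPathQVD)Qvd_$(vYear).qvd] (qvd) WHERE Exists(CurrentKey, tab1_pk);
STORE TmpQVD INTO [$(vPathQVD)Qvd_$(vYear).qvd] (qvd);
DROP TABLE TmpQVD;
Next

DROP TABLE CurrentKeys;

// Then load the cleaned QVDs back into one table as before.
Result:
LOAD * FROM [$(vPathQVD)Qvd_*.qvd] (qvd);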