Discussion Board for collaboration related to QlikView App Development.
Hi,
I have a problem with reloading huge data set in qlikview. Suppose I have the data for month Oct 2017 to March 2018. There is around 20 million rows of data per month
Now I have applied automatic reload to this file. For shorter periods this will work. But for longer periods it will not be feasible as the data set is huge and everytime it will load the data from Oct till date.
I want to ensure that data extraction from Oct till date is a one time activity. The query should only load data from today and then after few days, when the data becomes huge to handle, it should again start reload the data for last ten days and start again .
Please suggest if there is any way to achieve this.
You mean you want to do incremental Load?
Read it, got the basic understanding and now understand whats required. Would still require your inputs here
I already have a Qvd file which contains data from 1st to 10th March. SQL query has been used to fetch data for this file. The sample data format in the table is below:
Month Day Bank Card_type Count
March 9 HDFC VISA 10
March 10 HSBC MASTERCARD10
March 10 CITI VISA 10
Query: Select month(t.addedon) as Month,day(t.addedon) as Day,t.bank,t.card_type,count(*)
from transaction t where t.slate in ('VV',DER'','GET')
and t.addedon>='2018-03-01'
and t.addedon<'2018-03-11'
group by month(t.addedon) ,day(t.addedon) ,t.bank,t.card_type;
Now i want to keep adding data from 11th March to this file. The new data will also be fetched using SQL from my side with same querywhere I would be using automated date field to keep fetching data for previous day. the query is below:
Query: Select month(t.addedon) as Month,day(t.addedon) as Day,t.bank,t.card_type,count(*)
from transaction t where t.slate in ('VV',DER'','GET')
and t.addedon>= DATE_FORMAT(DATE_SUB(NOW(),INTERVAL 1 DAY),'%Y-%m-%d')
and t.addedon< DATE_FORMAT(NOW(),'%Y-%m-%d')
group by month(t.addedon) ,day(t.addedon) ,t.bank,t.card_type
Say I would run this on 12th morning, then this will add 11th data to the original file.
I want to store the original+new data is same file(which was the original one) and the next time in same way new data is added to same file. Can you help me understand how can i write the code/query for the same.
Majority of examples on various websites show examples with excel files with last update time logic which is not clear. Can you help???
Let me try to give you a basic idea of what you need to do
--> Perform one time load of the data and store it in QVD1 (Make sure you have loaded the Last Modified Date & the primary Key)
Your SQL Query Here (Make sure you have a PrimaryKey and LastModifiedDate)
--> After one time load is done, comment the lines so that it does not run again.
--> Load QVD1 and Store the Max_of_LastModifiedDate in a variable (lets call it vMaxDate)
LOAD MAX("LastModifiedDate" ) as maxdate FROM [lib://ConnectionName/QVD1.qvd] (qvd);
LET vMaxDate = Timestamp(Peek('maxdate'));
--> Now write a SQL query and in the where clause add LastModifiedDate > vMaxDate
Incremental:
Select month(t.addedon) as Month,day(t.addedon) as Day,t.bank,t.card_type,count(*)
from transaction t where t.slate in ('VV',DER'','GET')
and t.addedon>='2018-03-01'
and t.addedon<'2018-03-11'
group by month(t.addedon) ,day(t.addedon) ,t.bank,t.card_type
AND LastModifiedDate > '$(vMaxDate)'
--> CONCATENATE your output of SQL query with QVD1 and add a where clause here which says
CONCATENATE
LOAD *
FROM [lib://ConnectionName/QVD1.qvd]
(qvd)
WHERE NOT EXISTS("PrimaryKey");
--> Finally store your qvd with the main qvd name ie. QVD1
STORE [Incremental] into [lib://ConnectionName/QVD1] (qvd);
Now QVD1 when it loads the next time will load the data after the newDate
Hope this is clear.
Does this mean that I need to include a field named lastModifiedDate in my SQL query.
Is my query fine or should I Include something here.
If you are performing an insert and an update in your incremental load, you need modified date and primary key.
If you are performing only an insert, it is different.
What are you doing? Insert and Update or only insert?
Only insert
Getting a lot of errors
here is my script:
LOAD MAX(lastModifiedDate )as maxdate FROM
LET vMaxDate = Timestamp(Peek('maxdate'));
Incremental:
select date(t.addedon),count(*)as count from payu.transaction t
where t.merchantid='13290'
and t.mode in ('CC','DC','NB')
and t.addedon>'2018-03-11'
and t.addedon<'2018-03-12'
group by date(t.addedon)
AND LastModifiedDate > '$(vMaxDate)' ;
CONCATENATE
LOAD *
FROM
(qvd)
WHERE NOT EXISTS("PrimaryKey");
Concatenate
Load *
from
(qvd);
STORE [Incremental] into
Is this script to be written in new qvd file or same old qvd file??
Also,
If I am sure that this whole process will run daily and is under control. Do I need lastmodified date anymore.
Can I exclude this if i am sure that every run will be timely and will load data for next day.
ODBC error
Connector reply error: SQL##f - SqlState: S0022, ErrorCode: 1054, ErrorMsg: [MySQL][ODBC 5.1 Driver][mysqld-5.5.5-10.1.9-MariaDB-log]Unknown column 'LastModifiedDate' in 'group statement'