Skip to main content
Announcements
Save $600 on Qlik Connect registration! Sign up by Dec. 6 to get an extra $100 off with code CYBERSAVE: REGISTER
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Incremental load in qlikview from hive

Hi

I want to do incremental data load process in qlikview and store the data into qvd and  source of data is hive.

I am following the below approach for delta load,

  1. Get the latest modified date from QVD and save it in a variable ex: Modifieddate.
  2. Get records from Hive whose modified date is greater than Modifieddate, i am using where condition
  3. Get new records from Hive whose primary key not exist in QVD.
  4. Loading the fetched records from hive to qvd.

           4.1. If the primary key already exist, update the record

           4.2. If the primary key doesn’t exist, insert the record

the issue i am facing here is the hive query is not able to fetch records from according to the where condition,I have checked the same query from hive CLI it is working fine and I have checked static query from Qlikview to hive it is also working fine. The issue is occurring when I am using variable substitution in query.

Thanks

Hari

15 Replies
fosuzuki
Partner - Specialist III
Partner - Specialist III

My guess is that you have to adjust the date format from the variable into an acceptable SQL date format.

Try using the Script Debugger to see exactly what is the SQL statement that QV is sending to Hive. When you are in the Script Editor, click the "Debug" button, and you can go step by step in the script reload.

Brice-SACCUCCI
Employee
Employee

I'm just taking a guess here... Have you tried something like

WHERE data >= '$(myVariable)' ?

Anonymous
Not applicable
Author

You can use the Debug option, or you could trace out the SQL on reload, whichever way is easiest really, with the traced SQL you should spot any anomalies.

The problem we experience is getting the max ID/timestamp from the QVD when we are incrementally updating it as we have to pull in all 30million records, as the performance is pretty bad.

To solve this we shard QVDs with records greater than 2 million rows into QVDs with 1 million rows in each one, so for a 30 million record QVD file we now have 30 x 1 million records in each one, this makes it quicker to get the max ID from 1 million records rather than 30 million.

Each table that we pull in has its own QVS file which is used within a loop that works out what sort of QVD we are building.

Not applicable
Author

Hi

I am using the WHERE datefiled >= '$(myVariable)' approach for filtering data from hive tables.

But the problem  i am getting is the date field in the hive table is in string format ,so  when i am debugging the script the variable is showing null,as it is not able to compare the records from qvd with hive.

     Could anyone please help on date conversion from sting to date and timestamp to date,because  i need to compare records based on date and not timestamp.

Thanks

Hari

Anonymous
Not applicable
Author

Whats the date string, can you give example

fosuzuki
Partner - Specialist III
Partner - Specialist III

when i am debugging the script the variable is showing null

Can you post the code where you define the content of myVariable?

Not applicable
Author

LOAD date(max(eventdate),'YYYY-MM-DD') as notemaxdate

FROM QvdFile (qvd);

LET vNoteMaxDate = peek('notemaxdate');

SQL SELECT * FROM HIVE.qvdb.eventstable

  WHERE  eventdate >= '$(vNoteMaxDate)';

vNoteMaxDate is a variable which contains maximum eventdate from qvd.

eventdate is the date field in hive table whose data type is string and format is DD/MM/YYYY

while debugging i am getting like   WHERE  eventdate >= '';

Thanks

Hari

Anonymous
Not applicable
Author

Hi Goli


Looking at your code, the syntax of your peek function is incorrect,


LET vNoteMaxDate = Peek('notemaxdate', -1, 'xx--table-name-xx');


Hope that works

fosuzuki
Partner - Specialist III
Partner - Specialist III

What is the result of:

LOAD date(max(eventdate),'YYYY-MM-DD') as notemaxdate

FROM QvdFile (qvd);

Is it correctly generating the table with the max date?