Skip to main content
Announcements
Introducing a new Enhanced File Management feature in Qlik Cloud! GET THE DETAILS!
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Incremental load in qlikview from hive

Hi

I want to do incremental data load process in qlikview and store the data into qvd and  source of data is hive.

I am following the below approach for delta load,

  1. Get the latest modified date from QVD and save it in a variable ex: Modifieddate.
  2. Get records from Hive whose modified date is greater than Modifieddate, i am using where condition
  3. Get new records from Hive whose primary key not exist in QVD.
  4. Loading the fetched records from hive to qvd.

           4.1. If the primary key already exist, update the record

           4.2. If the primary key doesn’t exist, insert the record

the issue i am facing here is the hive query is not able to fetch records from according to the where condition,I have checked the same query from hive CLI it is working fine and I have checked static query from Qlikview to hive it is also working fine. The issue is occurring when I am using variable substitution in query.

Thanks

Hari

15 Replies
Not applicable
Author

Hi

I debugged it ,

LOAD date(max(eventdate),'YYYY-MM-DD') as notemaxdate

FROM QvdFile (qvd);

LET vNoteMaxDate = peek('notemaxdate');

the value of vNoteMaxDate is showing as null

and eventdate values in qvd are something like 2010-07-10-05.22.27.12142, 2011-05-14-06.636490

Anonymous
Not applicable
Author

What did you do to resolve this issue ?

Could you please post you findings.

Many thanks

DaveT

fosuzuki
Partner - Specialist III
Partner - Specialist III

If the eventdate is a string in the database, I guess you have to convert it to a date before using the max() function. So, try this:

LOAD Max(Date#(eventdate, 'YYYY-MM-DD-h.mm.ss.fffff')) AS notemaxdate

Not applicable
Author

Hi,

1.I resolved the issue ,i am taking the timestamp in in hive in 2010-05-15-05.12.48.628010 format.

I used below condition to get maximum record from QVD.

IF $(vQvdExists) THEN

  maxdateTab:

  // Get the max date from this QVD for use in incremental SELECT

  LOAD date(max(moddate),'YYYY-MM-DD') as maxdate

  FROM $(vQvdFile) (qvd);

LET vIncrementalExpression = peek('maxdate');

ENDIF

2.I used below script in compare date in hive with date from QVD and fetch new records.

$(vTableName):

Directory;

IF $(vQvdExists) THEN

SQL SELECT id,moddate

FROM HIVE.qvdb.test

WHERE date(moddate,'YYYY-MM-DD') > '$(vIncrementalExpression)';

else

SQL SELECT id,moddate

FROM HIVE.qvdb.test;

ENDIF

To concatenate the newly fetched records from hive to qvd, i am using the below script.

Directory;

// If incremental reload was , load previous data and concatenate to data just read.

  IF $(vQvdExists) THEN

  // Use CONCATENATE in case we've added any new fields.

  CONCATENATE ($(vTableName)) LOAD * FROM $(vQvdFile) (qvd);

  WHERE NOT exists($(vPK)) // Load only QVD rows that were not already loaded in the data load.

END IF

With the above script i am able to load only the records with new primary key or the records with updated timestamp.

Thanks

Hari

Not applicable
Author

Hi,

1.I resolved the issue ,i am taking the timestamp in in hive in 2010-05-15-05.12.48.628010 format.

I used below condition to get maximum record from QVD.

IF $(vQvdExists) THEN

  maxdateTab:

  // Get the max date from this QVD for use in incremental SELECT

  LOAD date(max(moddate),'YYYY-MM-DD') as maxdate

  FROM $(vQvdFile) (qvd);

LET vIncrementalExpression = peek('maxdate');

ENDIF

2.I used below script in compare date in hive with date from QVD and fetch new records.

$(vTableName):

Directory;

IF $(vQvdExists) THEN

SQL SELECT id,moddate

FROM HIVE.qvdb.test

WHERE date(moddate,'YYYY-MM-DD') > '$(vIncrementalExpression)';

else

SQL SELECT id,moddate

FROM HIVE.qvdb.test;

ENDIF

To concatenate the newly fetched records from hive to qvd, i am using the below script.

Directory;

// If incremental reload was , load previous data and concatenate to data just read.

  IF $(vQvdExists) THEN

  // Use CONCATENATE in case we've added any new fields.

  CONCATENATE ($(vTableName)) LOAD * FROM $(vQvdFile) (qvd);

  WHERE NOT exists($(vPK)) // Load only QVD rows that were not already loaded in the data load.

END IF

With the above script i am able to load only the records with new primary key or the records with updated timestamp.

Thanks

Hari

Not applicable
Author

Hi,

1.I resolved the issue ,i am taking the timestamp in in hive in 2010-05-15-05.12.48.628010 format.

I used below condition to get maximum record from QVD.

IF $(vQvdExists) THEN

  maxdateTab:

  // Get the max date from this QVD for use in incremental SELECT

  LOAD date(max(moddate),'YYYY-MM-DD') as maxdate

  FROM $(vQvdFile) (qvd);

LET vIncrementalExpression = peek('maxdate');

ENDIF

2.I used below script in compare date in hive with date from QVD and fetch new records.

$(vTableName):

Directory;

IF $(vQvdExists) THEN

SQL SELECT id,moddate

FROM HIVE.qvdb.test

WHERE date(moddate,'YYYY-MM-DD') > '$(vIncrementalExpression)';

else

SQL SELECT id,moddate

FROM HIVE.qvdb.test;

ENDIF

To concatenate the newly fetched records from hive to qvd, i am using the below script.

Directory;

// If incremental reload was , load previous data and concatenate to data just read.

  IF $(vQvdExists) THEN

  // Use CONCATENATE in case we've added any new fields.

  CONCATENATE ($(vTableName)) LOAD * FROM $(vQvdFile) (qvd);

  WHERE NOT exists($(vPK)) // Load only QVD rows that were not already loaded in the data load.

END IF

With the above script i am able to load only the records with new primary key or the records with updated timestamp.

Thanks

Hari