Chirag_
Partner - Contributor III

Hadoop Target

Hi Team,

I need to understand some points regarding the Hadoop target.

1. One of my tables doesn't have a primary key, so when an update happens it shows up as an insert in my _ct table. Is this because of the missing PK? How should I tackle this issue?

2. We have set the target format to Parquet. In HDFS I can see two folders, delta and base. Can I get more detail on these folders?

3. We have configured both HDFS and Hive in the endpoint connection, but we observed that the count differs from the Hive table count for CDC, that is, for the _ct table. Does it take time to write data to the Hive tables from the HDFS files?

Could you please clarify the points above?

 

Regards,

Chirag

2 Replies
john_wang
Support

Hello @Chirag_ ,

Welcome to the Qlik Community forum, and thanks for reaching out!

Regarding the questions:

>> 1. One of my tables doesn't have a primary key, so when an update happens it shows up as an insert in my _ct table. Is this because of the missing PK? How should I tackle this issue?

      I'm not sure what the source endpoint type is, but in general an UPDATE in the source DB is still captured as an UPDATE rather than as an INSERT. Please provide more information to help us understand the behavior, including the source DB type, the source table's creation DDL, a sample UPDATE statement, etc.
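      One way to gather part of that evidence is to tally the operation codes actually recorded in the _ct table. The snippet below is only a minimal sketch, not a Qlik-provided tool. It assumes the change table exposes an operation column named header__change_oper holding single-letter codes such as I and U (please verify the column name against your own table), and the sample rows are hypothetical.

```python
from collections import Counter

# Assumed column name; verify against the columns of your own _ct table.
OPER_COLUMN = "header__change_oper"


def tally_operations(rows, oper_column=OPER_COLUMN):
    """Count how many change records carry each operation code."""
    return Counter(row[oper_column] for row in rows)


# Hypothetical rows, e.g. fetched from the _ct table with a Hive client.
sample_rows = [
    {"header__change_oper": "I", "id": 1},
    {"header__change_oper": "U", "id": 1},
    {"header__change_oper": "I", "id": 2},
]

counts = tally_operations(sample_rows)
print(dict(counts))
```

      If the tally shows only insert codes after a known UPDATE on the source, including that result with the DDL and the UPDATE statement should make the behavior easier to diagnose.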

>> 2. We have set the target format to Parquet. In HDFS I can see two folders, delta and base. Can I get more detail on these folders?

      You can get the table information in the Hadoop GUI console or, if you are running the Hive client or beeline, you can get detailed table and folder information with the command:

   describe extended <tableName>
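      If the aim is to see which HDFS directory backs a given table, the Location field of the describe output is the relevant part. As a rough sketch, assuming the output of Hive's DESCRIBE FORMATTED (a variant of the command above, used here because it returns one field per row) has been fetched as rows of (col_name, data_type, comment) values, a small helper could pick the location out. The function name, the sample rows, and the path are hypothetical.

```python
def find_table_location(describe_rows):
    """Return the value of the 'Location' row, or None if it is absent."""
    for row in describe_rows:
        # Labels in the first column are padded and end with a colon.
        label = (row[0] or "").strip().rstrip(":")
        if label == "Location":
            return (row[1] or "").strip()
    return None


# Hypothetical rows shaped like DESCRIBE FORMATTED output.
sample_rows = [
    ("# Detailed Table Information", None, None),
    ("Database:           ", "default             ", None),
    ("Location:           ", "hdfs://namenode/example/path/orders__ct", None),
]

print(find_table_location(sample_rows))
```

      Running it for both the base table and its _ct table would show which folder each one reads from.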

>> 3. We have configured both HDFS and Hive in the endpoint connection, but we observed that the count differs from the Hive table count for CDC, that is, for the _ct table. Does it take time to write data to the Hive tables from the HDFS files?

      You are right. There are some tuning parameters that control how data changes are processed. A sample:

[Image: john_wang_0-1707741849426.png]

Hope this helps.

John.

SushilKumar
Support

Hello team,

 

If our response has been helpful, please consider clicking "Accept as Solution". This will assist other users in easily finding the answer.

 

Regards,

Sushil Kumar