Abrie_M
Contributor III

File source endpoint - codepage

Good day

I'm using a File source endpoint (a normal text file on Windows) and need to replicate it to Hadoop in text format.

When I query the column in Hive, it displays the hex value instead of the text. Other replications to the same target endpoint (Hadoop), e.g. from DB2, work correctly.

"Marais" displays as "4D6172616973"

Any suggestions? I've tried different code pages, but the input file is a normal Windows text file, so the code page should be 65001 (UTF-8).

 

Thanks

Abrie


Accepted Solutions
john_wang
Support

Hello @Abrie_M ,

In the JSON file, you defined the NAME column as BYTES:

john_wang_0-1660961781836.png

This tells Replicate to read the column as-is, as raw bytes, which is why the value shows up in hex (by default the code page is 65001). Please change the data type from BYTES(50) to STRING(50) in the Type field of the source table definition.
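
To see why a bytes-typed column ends up looking like "4D6172616973", here is a small Python illustration. It is only a sketch of the general effect and not Replicate's implementation; the function names are hypothetical, and UTF-8 is assumed because code page 65001 is UTF-8.

```python
def text_as_hex(text: str, encoding: str = "utf-8") -> str:
    """Render a text value the way a raw-bytes column is typically displayed:
    encode it to bytes, then print each byte as two hex digits."""
    return text.encode(encoding).hex().upper()


def hex_as_text(hex_value: str, encoding: str = "utf-8") -> str:
    """Reverse the step above: turn the hex digits back into bytes and decode them."""
    return bytes.fromhex(hex_value).decode(encoding)


# The value from the question: each letter becomes one byte, shown as two hex digits.
print(text_as_hex("Marais"))        # 4D6172616973
print(hex_as_text("4D6172616973"))  # Marais
```

So the data itself is intact; it is only being carried and displayed as bytes rather than as a string.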

Let me know if it works for you.

Regards,

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!


5 Replies
john_wang
Support

@Abrie_M ,

Can you check what the text values are in the HDFS files, and what the data type of the column is in Hive?
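
One rough way to do the first check on a copy of the target file is sketched below in Python. This is a heuristic under assumptions, not a Replicate or Hadoop tool: the function names and the sample rows are made up for illustration, the file is assumed to be comma-delimited UTF-8, and short numeric values that happen to be valid hex (such as "41") can be flagged by mistake.

```python
import csv
import io
import string

HEX_DIGITS = set(string.hexdigits)


def looks_like_hex_text(value: str) -> bool:
    """True if value is an even-length hex string whose bytes decode to printable UTF-8."""
    if not value or len(value) % 2 != 0 or not set(value) <= HEX_DIGITS:
        return False
    try:
        decoded = bytes.fromhex(value).decode("utf-8")
    except UnicodeDecodeError:
        return False
    return decoded.isprintable()


def hex_columns(csv_text: str) -> list[int]:
    """Return the zero-based indexes of columns where every row looks hex-encoded."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    width = len(rows[0])
    return [
        i for i in range(width)
        if all(len(row) > i and looks_like_hex_text(row[i]) for row in rows)
    ]


# Hypothetical sample rows: an id column and a name column written as hex.
sample = "1,4D6172616973\n2,536D697468\n"
print(hex_columns(sample))  # [1]
```

If a column is flagged here, the hex is already present in the file on HDFS, so the conversion happened before Hive read the data.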

It would be best to open a support ticket and provide a sample file along with the task's Diagnostics Package, so that we can reproduce the issue in our labs and understand it correctly.

Regards,

John.

Abrie_M
Contributor III
Author

Hi John

Thanks for the reply. The CSV file on HDFS contains the same hex values.

Abrie_M_0-1660899090864.png

 

I will log a support ticket.

 

Regards

Abrie

Abrie_M
Contributor III
Author

Thanks John! I've changed the type to STRING as you suggested, and the data in our lake is correct now.

Regards

Abrie

 

john_wang
Support

Hello @Abrie_M ,

Glad to hear that, and thanks for the feedback.

Best Regards,

John.
