Metikosh
Contributor

tBufferOutput/Input components are changing the value of the byte array

Hi,

 

As described in the subject, I am having an issue with the tBufferOutput/Input components. I have a table in a MariaDB database with a BINARY column that holds a hash value calculated with SHA256. For this example I used a single row so that the difference is easy to spot.

 

Here is a screenshot of my job:

 

0695b00000Lxqn7AAB.png

 

Following are screenshots of the schemas of the job components:

0695b00000Lxqo5AAB.png

 

 

0695b00000LxqqaAAB.png

 

 

And finally here is the result (testDataHash1 is the source table and testDataHash2 is the destination table - both have the same definition - only one column BINARY with length 64).

 

0695b00000LxqpDAAR.png

 

 

This is a simplified scenario to demonstrate the issue I am facing. I actually have complex parent-child jobs where I need to propagate data from the child to the parent job, including the DataHash column, which I later use as a lookup in a tMap component. The lookup fails because the tBuffer components change the values.

 

There is no workaround in my job logic (in terms of not using the tBufferOutput/Input components), so I would appreciate any advice on how to tackle this issue!

 

Thanks!


Accepted Solutions
gjeremy1617088143

For example, you can create a private static List<byte[]> in a routine.

In the child job, use a tJavaFlex: in the begin part, instantiate the list; in the main part, add the byte[] of the current row to the ArrayList; in the end part, set the private variable to the list you filled.

In the parent job, use another tJavaFlex with a for-each loop inside to send the list to a flow.

 

Here, for example, I use a list of String and a globalVar to put and get the list:

 

0695b00000LyTavAAF.png
0695b00000LyTaqAAF.png
0695b00000LyTalAAF.png
0695b00000LyTaWAAV.png

You just have to adapt it to a List<byte[]> and use the getter and setter of a private variable in a routine instead of the globalVar.
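The approach described above can be sketched in plain Java. This is only an illustration under stated assumptions: `HashBuffer`, `setHashes`, `getHashes`, `runChild` and `runParent` are hypothetical names, the two static methods stand in for the tJavaFlex code of the child and parent jobs, and it assumes both jobs run in the same JVM (a static field is not shared across separate processes).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in for the two jobs; in Talend this logic would sit in tJavaFlex parts.
class ParentChildDemo {
    // Child job side: collect the byte[] of every row, then publish the list once.
    static void runChild(byte[][] rows) {
        List<byte[]> collected = new ArrayList<>();   // tJavaFlex "begin" part
        for (byte[] dataHash : rows) {                // tJavaFlex "main" part, once per row
            collected.add(dataHash.clone());          // copy so later changes to the row cannot leak in
        }
        HashBuffer.setHashes(collected);              // tJavaFlex "end" part
    }

    // Parent job side: read the list back and emit one entry per row.
    static List<byte[]> runParent() {
        List<byte[]> out = new ArrayList<>();
        for (byte[] dataHash : HashBuffer.getHashes()) {
            out.add(dataHash);                        // in Talend: assign to the output row column
        }
        return out;
    }

    public static void main(String[] args) {
        byte[][] rows = {
            {(byte) 0xE3, (byte) 0xB0, 0x00},
            {(byte) 0xFF, 0x7F, (byte) 0x80}
        };
        runChild(rows);
        List<byte[]> received = runParent();
        for (int i = 0; i < rows.length; i++) {
            // Compare contents, not references: byte[] has no value-based equals().
            System.out.println("row " + i + " unchanged: " + Arrays.equals(rows[i], received.get(i)));
        }
    }
}

// Hypothetical routine: holds the list so both jobs can reach it within one JVM.
class HashBuffer {
    private static List<byte[]> hashes = new ArrayList<>();

    static synchronized void setHashes(List<byte[]> list) { hashes = list; }

    static synchronized List<byte[]> getHashes() { return hashes; }
}
```

Because the values stay byte[] from end to end, no character encoding is involved at any point. The whole list lives in memory, so with millions of rows the heap size of the job needs to allow for it.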

 


11 Replies
gjeremy1617088143

Hi, it seems you have an encoding problem.

Check the encoding used in the advanced settings of the tBuffer components.


Metikosh
Contributor
Author

Hi gjeremy1617088143,

 

Thank you for your answer but I'm afraid that is not the case.

 

I forgot to mention it in my question, but I have already tried executing the job with different encodings selected in the advanced settings of the tBufferInput component (UTF-8, ISO-8859-15, ASCII, Cp1252, UTF16, UTF32 and some other encodings that I thought would make a difference). All I got was multiple different values for the DataHash column, and none of them is the same as the original.

 

If you have some other suggestions please share them with me.

 

Thanks.

gjeremy1617088143

You can add additional parameters in your tDB components to force the encoding.

Metikosh
Contributor
Author

I don't think that the encoding would affect a binary column - but I tried this as you suggested and it does not fix the problem.

 

I also tested this without the tBuffer components (just passing the row from tDBInput to tDBOutput) with different encodings in the additional JDBC parameters, and it did not affect the value at all (I used latin1 and UTF8). So I think that the encoding between source and destination is not the problem; the problem is with the tBuffer components (and again not with their encoding setting, because changing it did nothing to fix the issue).

 

Update: I also tried forcing the encoding to latin1 in the source and mapping it to a destination with UTF8 (without the tBuffer components), and the value of DataHash stayed the same.
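For context, a character encoding can only alter a binary value if the bytes are converted to a String and back somewhere along the way. The sketch below shows that general Java behaviour; whether the tBuffer components do such a conversion internally is an assumption and is not confirmed anywhere in this thread. `RoundTripDemo` and `viaString` are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RoundTripDemo {
    // Decode raw bytes as text, then encode that text back to bytes.
    static byte[] viaString(byte[] raw, Charset charset) {
        return new String(raw, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        // Arbitrary non-text bytes, standing in for part of a hash value.
        byte[] raw = {(byte) 0xBA, 0x78, 0x16, (byte) 0xBF};

        byte[] utf8 = viaString(raw, StandardCharsets.UTF_8);
        byte[] latin1 = viaString(raw, StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 round trip intact:      " + Arrays.equals(raw, utf8));
        System.out.println("ISO-8859-1 round trip intact: " + Arrays.equals(raw, latin1));
    }
}
```

The UTF-8 round trip changes the array because 0xBA and 0xBF are not valid UTF-8 sequences on their own and are replaced during decoding, while ISO-8859-1 maps every byte to exactly one character and back. A direct tDBInput to tDBOutput flow keeps the column as byte[] throughout, which would be consistent with the value staying the same there.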

gjeremy1617088143

Have you tried other components, such as tHashInput and tHashOutput, to replace the tBuffer components?

Metikosh
Contributor
Author

The tHashInput and tHashOutput components work perfectly well with hashed values, and I use them in my jobs. They are not useful in this situation, though, because I need to propagate data from a child job to a parent job (what is shown in the screenshots in my question is just a simple example to demonstrate the issue, not my actual job). As far as I know, I can't achieve that with the tHash components, which is why I am using the tBuffer components in the first place.

gjeremy1617088143

Hi, you could use this:

https://help.talend.com/r/i6eFKBuNsRD2KzBCYnXHhw/4jPcdaVw7eaDvMyLDjdYfQ

Instead of a String, declare a byte[]; you can then get or set the byte[] anywhere in your jobs.

Metikosh
Contributor
Author

This would probably work if I wanted to pass a single value. With multiple rows, however, the variable would be overwritten for every row, and in the parent job I would get only the last value out of millions of rows. Maybe my example job confused you because it moves a single row (a single value), but I am not actually working with a single value: I will have millions of rows for the DataHash column.
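The overwrite problem can be sketched in plain Java. The holder class below is hypothetical (`SingleHolder`, `set`, `get` and `lastAfterAllRows` are illustrative names, not Talend APIs); it only shows that one shared byte[] variable keeps the value of the last row it was given.

```java
import java.util.Arrays;

class SingleValueDemo {
    // Simulates a flow that stores each row's value in the same shared variable.
    static byte[] lastAfterAllRows(byte[][] rows) {
        for (byte[] row : rows) {
            SingleHolder.set(row);   // every row replaces the previous value
        }
        return SingleHolder.get();   // what a parent job would read afterwards
    }

    public static void main(String[] args) {
        byte[][] rows = { {0x01}, {0x02}, {0x03} };
        byte[] seenByParent = lastAfterAllRows(rows);
        System.out.println("parent sees last row only: "
                + Arrays.equals(seenByParent, rows[rows.length - 1]));
    }
}

// Hypothetical routine holding a single value instead of a list.
class SingleHolder {
    private static byte[] value;

    static synchronized void set(byte[] v) { value = v; }

    static synchronized byte[] get() { return value; }
}
```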

 

Thanks a lot for your time, I really appreciate your input!

gjeremy1617088143

So you can use the same approach, a private variable with a getter and setter, but with a List<byte[]> instead of a byte[].