Anonymous
Not applicable

1 record is comprised of 2 rows - require 1 row in output

Hi, I am new to Talend. I am using it to parse financial files into a 'DB insert'-ready state. The files were most likely exported from PDF to Excel by my source.

 

Anyhow, the records I get have this format:

Group A Header

record1 someinfoA

record1_id someinfoB

record2 someinfoC

record2_id someinfoD

Group B Header etc etc 

 

The output I want is:

record1 record1_id someinfoA "Group A Header" someinfoB

record2 record2_id someinfoC "Group A Header" someinfoD

 

So I want to merge the data in each record pair and also add the group header to the record. There is nothing to join the pair on except that, in the extract, a record always consists of a line 1 followed by a line 2!

 

Any ideas would be really appreciated!

 


Accepted Solutions
Anonymous
Not applicable
Author

My solution was to read the header and save it in a context variable. I read line 1 of 2 of a record and save its values in context parameters. Then I read line 2 of 2 and stamp the context parameters on the end of the record. I did this in a tJavaRow. Then I used a tMap to clean things up.
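For anyone following along, here is a minimal sketch of that idea in plain Java, outside Talend. It is not the poster's actual tJavaRow code: the class name `PairMerger`, the two-column row layout, and the rule that a header row has a null second column are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not Talend-generated code.
final class PairMerger {

    // Each input row is {name, info}. Assumption: a group header row has info == null.
    static List<String[]> merge(List<String[]> rows) {
        List<String[]> out = new ArrayList<>();
        String header = null;       // last group header seen (plays the role of the context variable)
        String[] firstHalf = null;  // line 1 of the pending record, if any
        for (String[] row : rows) {
            if (row[1] == null) {            // group header row
                header = row[0];
                firstHalf = null;            // drop any unpaired line at a group boundary
            } else if (firstHalf == null) {  // line 1 of a record: remember it
                firstHalf = row;
            } else {                         // line 2: emit the merged record
                out.add(new String[] {firstHalf[0], row[0], firstHalf[1], header, row[1]});
                firstHalf = null;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"Group A Header", null},
            new String[] {"record1", "someinfoA"},
            new String[] {"record1_id", "someinfoB"},
            new String[] {"record2", "someinfoC"},
            new String[] {"record2_id", "someinfoD"});
        for (String[] merged : merge(rows)) {
            System.out.println(String.join(" ", merged));
        }
    }
}
```

In a real job the tJavaRow still passes one output row per input row, so the header rows and the "line 1" rows would have to be filtered out afterwards, which is presumably what the tMap clean-up step was for.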


14 Replies
akumar2301
Specialist II

Can you please clarify the relation between the input format and the output format?

It is not very clear in your message.

Anonymous
Not applicable
Author

Hi,

 

Since you do not have a common input ID for both line items, I would try adding a sequence number to the input flow. Then you can treat rows 1 and 2 as the same record, rows 3 and 4 as the next record, and so on. Based on this numeric grouping, you can use the tDenormalize component to join rows 1 and 2 into the same output record. Could you please try it?
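To illustrate the numbering idea, here is a rough sketch in plain Java. It only imitates what a sequence number followed by a denormalize step would do; it does not use any Talend API, and the names `SequencePairing`, `groupId`, and `denormalize` are made up for this example. It assumes the group header rows have already been filtered out, so only content rows are numbered.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not a Talend component.
final class SequencePairing {

    // Map a 1-based sequence number to a pair id: rows 1 and 2 -> 0, rows 3 and 4 -> 1, ...
    static int groupId(int seq) {
        return (seq - 1) / 2; // integer division
    }

    // Join all rows that share a pair id into one delimited string, in input order.
    static List<String> denormalize(List<String> contentRows, String delimiter) {
        Map<Integer, List<String>> groups = new LinkedHashMap<>();
        int seq = 1;
        for (String row : contentRows) {
            groups.computeIfAbsent(groupId(seq), k -> new ArrayList<>()).add(row);
            seq++;
        }
        List<String> out = new ArrayList<>();
        for (List<String> group : groups.values()) {
            out.add(String.join(delimiter, group));
        }
        return out;
    }
}
```

Note the `seq - 1` before dividing: with a counter that starts at 1, dividing the raw number by 2 would put rows 1 and 2 into different groups.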

 

Warm Regards,
Nikhil Thampi

Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved

Anonymous
Not applicable
Author

Thank you for your input. I was going to use tJavaRow to add a row counter to just the content rows rather than the group header rows. 

I am utilising context variables to keep note of the group header and insert into each record until another header is encountered. 

0683p000009M7sP.png

if (row20.Quantity == null) { // group header row
    context.Temp_CCP = row20.Description; // remember the group header
} else {
    row21.Record_Counter = context.Counter; // number the content row
    row21.CCP = context.Temp_CCP; // stamp the current group header
    context.Counter++;
}

(*CCP is group header)

0683p000009M7sU.png

Any tips for setting up the tDenormalize component, which I have not used before, when I go to join up the records?


Params.png
Anonymous
Not applicable
Author

Hi,

 

You can store the previous record in a context variable and join based on a condition. If the tJavaRow is working, you do not have to use the denormalize method. For that method, you would generate a sequence number and then divide it by 2 to identify whether a record is part of the same group or a different group.

 

    Then denormalize it based on this group id. But I would say, your current method is easier than my option. So no worries 🙂

 

Warm Regards,
Nikhil Thampi


Anonymous
Not applicable
Author

Thanks Nikhil. I changed to using context params to capture part 1 of the record. My job was working OK up to the tReplicate, with 333 records flowing all the way through. I then worked on my tJavaRow component and I am happy with the logic in it, but now only one record flows through and everything stops there once I activate my tJavaRow component in the flow. What might be going on? Do I need a wait somewhere, possibly? 0683p000009M7ad.png 0683p000009M7og.png


New Layout.png
Anonymous
Not applicable
Author

I have a couple of ideas, but I would need an actual example of the data you are working with. Your comment about how the records are linked (or not) needs a bit more clarification, and I think that might come from an actual example. No need to include real data (in fact, please do not), just some pseudo data with realistic values.

 

Anonymous
Not applicable
Author

Thank you. I am confused about how my job runs when I activate my tJavaRow component as opposed to when I turn it off. I don't understand why I don't get 333 records flowing through up to the point of the tJavaRow. Can you help me understand this? The two runs are shown in this screenshot. 0683p000009M7xZ.png

Here is an example of my data with only test data

0683p000009M7rr.png

Anonymous
Not applicable
Author

The first part of your problem is linking your record A with record B. I think you need another dataset to create a linking record for both of these. So, for example, you could have a table holding something like the following:

 

RecordA    RecordB    Key
APPLE      APPL_      1
VODAFONE   VF_        2
FACEBOOK   FB_        3

 

You would then use these to match against RecordA and the beginning of RecordB and then add the numeric key to the record. Send these records to two different tHashOutput components. Then you can join the data back together using the key you have created.
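As a rough illustration of that matching step, here is a sketch in plain Java. The two maps stand in for the two tHashOutput components; nothing here is Talend API. The class `LinkKeyLookup`, the `Link` type, and the sample row names used in the usage note are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not a Talend component.
final class LinkKeyLookup {

    // One row of the linking table: exact RecordA name, RecordB prefix, numeric key.
    static final class Link {
        final String recordA;
        final String recordBPrefix;
        final int key;

        Link(String recordA, String recordBPrefix, int key) {
            this.recordA = recordA;
            this.recordBPrefix = recordBPrefix;
            this.key = key;
        }
    }

    // Each input row is {name, info}. Rows matching no link are silently skipped.
    static List<String> join(List<String[]> rows, List<Link> links) {
        Map<Integer, String[]> partA = new LinkedHashMap<>(); // stands in for the first hash store
        Map<Integer, String[]> partB = new HashMap<>();       // stands in for the second hash store
        for (String[] row : rows) {
            for (Link link : links) {
                if (row[0].equals(link.recordA)) {
                    partA.put(link.key, row);
                    break;
                }
                if (row[0].startsWith(link.recordBPrefix)) {
                    partB.put(link.key, row);
                    break;
                }
            }
        }
        // Inner join on the key: only pairs with both halves present are emitted.
        List<String> out = new ArrayList<>();
        for (Map.Entry<Integer, String[]> entry : partA.entrySet()) {
            String[] b = partB.get(entry.getKey());
            if (b != null) {
                String[] a = entry.getValue();
                out.add(a[0] + " " + b[0] + " " + a[1] + " " + b[1]);
            }
        }
        return out;
    }
}
```

With the table above and made-up rows such as `{"APPLE", "infoA"}` and `{"APPL_01", "infoB"}`, both rows receive key 1 and are joined into a single output line. This sketch keeps only one A row and one B row per key, so it would need extending if the same name can appear more than once in a file.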

 

Regarding your job, I think it will need to be rewritten to accommodate that flow.

Anonymous
Not applicable
Author

Thank you, I will try that.

 

It is my first time using tJavaRow. I am still stumped as to why a job that is working up to that component changes behaviour in the initial components when I add on the tJavaRow (as per the screenshots above). Any tricks to working with tJavaRow?