Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us in NYC Sept 4th for Qlik's AI Reality Tour! Register Now
cancel
Showing results for 
Search instead for 
Did you mean: 
BartG
Contributor
Contributor

Split input file into separate output files based on SQL lookup table

Hi,

 

I am new to Talend, so I'm still figuring out all the components.

 

  1. I need to read a machine log file that has a general header with a dozen of lines, and then abody of data in CSV format.
  2. This input file contains a serial number in the header, that I have to pass to a SQL Server stored procedure or query.
  3. The stored procedure or query returns a result set that contains a small lookup table so to speak.
  4. Based on this lookup table, the CSV data in the body of the input file needs to be split into several differerent parts.
  5. These separate parts need to be written to new separate output files on disk.

 

So put otherwise: split an input file into separate output files based on a SQL lookup table.

 

How could this flow look like? What components would be the obvious choices to tie togehter?

 

Many thanks for your advice!

Labels (2)
2 Replies
Anonymous
Not applicable

OK, this is quite a high level description of this problem. As such, I can only really give you a high level suggestion.

 

  1. To read the log file, you will need to use a tFileInputDelimited component (https://help.talend.com/r/en-US/8.0/delimited/tfileinputdelimited). You can either read just the first row (the header) and get your value for the SP or you can read the whole lot at this point. It might make it easier to just read the first row.
  2. You can use a tMSSqlSP component (https://help.talend.com/r/en-US/8.0/mssql/tmssqlsp) to use your value to return you data.
  3. Then, using your file again (this time reading everything but the first row), connect it to a tMap. Use your file data as your "Main" row. Then connect your "lookup" data as your "lookup row". Carry out any logic required in that component.

 

At this point, it gets a little trickier. Do you know the number of output files or will this be dynamic? Is there a max number/type by name or can this be totally dynamic? If there is a max, then you would create an output from your tMap for each of these types and just route your rows according to the lookup data. If you have a dynamic output, then this will be the tricky bit. Certainly not impossible, just will require a bit of code to achieve this. I have written an answer to a similar question here https://community.talend.com/s/feed/0D73p000004kPVOCA2#M62228

 

Have a play and see where you get to with this.

BartG
Contributor
Contributor
Author

Hi Richard,

 

Sorry for the late reply, lots of things going on at the same time here.

 

The header of the input file consists of multiple rows, so I guess I'll let tFileInputDelimited just read the whole file at one.

 

Then using tMSSqlSP and tMap I assume I end up with an enriched dataset (input file + lookup data).

 

I saw your solution that you linked to, from a few years ago. I do have a dynamic number of output files, so I think your approach with some Java code would work here.

 

I'm going to give it a try and see where I end up. I'll let you know.

 

Thanks!