Conkers
Contributor III

Extract specific data from text file

I have a text file as follows:

----------------------------------------

ID: 00070

Date: 2022-06-17T09:34:50

Item is now available

Export : 0bf08b33 (2022-06-17T09:35:07) is here -> File D:\PATH\TO\FILE\LOCATION\

----------------------------------------

I'm attempting to export just the ID and the D:\PATH\TO\FILE\LOCATION\ elements into a table/columns.

0695b00000UzUyhAAF.png

Using this workflow, I can extract the relevant rows (beginning 'ID....' & 'Export....').

How would I extract the specific data required from these rows?

Thanks in advance

Conkers


Accepted Solutions
Anonymous
Not applicable

I think the best would be to use the following component: https://help.talend.com/r/en-US/8.0/ms-delimited/ms-delimited-scenario


4 Replies
Anonymous
Not applicable

An easy first change would be to use the tFileInputDelimited component. Set the "Field separator" to ": " (remember the trailing space) and set up two columns. The first, "RowType", will hold your row identifier (ID, Date, Item is now available and Export). The second, "Value", will hold your data. You may want a 3rd and a 4th column to hold the minutes and seconds from the date row, but I guess these are not necessary.
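To make the effect of that ": " separator concrete, here is a minimal plain-Java sketch (not Talend-generated code) that splits each sample line on its first ": " into the two columns. The class and method names are made up for illustration, and giving a line without the separator a null Value is an assumption about how you might want to treat it.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration, not Talend-generated code.
class SplitSketch {
    // Splits a line on the first ": " into {RowType, Value}.
    // Assumption: a line without the separator gets a null Value.
    static String[] splitLine(String line) {
        String[] parts = line.split(": ", 2);
        return new String[] { parts[0], parts.length > 1 ? parts[1] : null };
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "ID: 00070",
            "Date: 2022-06-17T09:34:50",
            "Item is now available",
            "Export : 0bf08b33 (2022-06-17T09:35:07) is here -> File D:\\PATH\\TO\\FILE\\LOCATION\\");
        for (String line : lines) {
            String[] fields = splitLine(line);
            // Print the two columns separated by a pipe.
            System.out.println(fields[0] + "|" + fields[1]);
        }
    }
}
```

In this sketch the colons inside the timestamps and after the drive letter are left alone, because none of them is followed by a space.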

 

Once you have this, you can link it to a tMap and add a filter on your output table which holds something like this:

 

"ID".equals(row1.RowType) || "Export ".equals(row1.RowType)

 

This on its own will output this....

 

ID|00070

Export |0bf08b33 (2022-06-17T09:35:07) is here -> File D:\PATH\TO\FILE\LOCATION\
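If it helps to see the filter logic outside of the tMap, here is a small sketch of it as a plain Java method. It is written with the string constant on the left so that a null RowType is simply rejected; the class and method names are hypothetical, and the list of test values is only there to illustrate the matching.

```java
// Hypothetical stand-in for the tMap output filter, not Talend-generated code.
class FilterSketch {
    // Constant-first equals() returns false instead of throwing when rowType is null.
    static boolean keep(String rowType) {
        return "ID".equals(rowType) || "Export ".equals(rowType);
    }

    public static void main(String[] args) {
        String[] rowTypes = { "ID", "Date", "Item is now available", "Export ", "Export", null };
        for (String rowType : rowTypes) {
            // Brackets make a trailing space visible in the printed value.
            System.out.println("[" + rowType + "] kept: " + keep(rowType));
        }
    }
}
```

The match is exact, so "Export" without the trailing space is not kept. The space is there because the source line reads "Export : " and the ": " separator only consumes the colon and the space after it.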

 

The pipes above ("|") simply separate the columns. So you have two rows with two columns. Your first row holds your ID value already sorted, the second row has your Export value which will need some further processing.

 

To do that, further edit the tMap's output table's "Value" column. Replace the row1.Value expression with this....

 

row1.Value!=null && row1.Value.indexOf("File ")>-1 ? row1.Value.substring(row1.Value.indexOf("File ")+5) : row1.Value

 

If you run the job again, you will get this.....

 

ID|00070

Export |D:\PATH\TO\FILE\LOCATION\
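For anyone who wants to try the expression outside of Talend first, here is a rough sketch that wraps the same logic in a plain Java method and runs it on the two sample values. The class and method names are hypothetical; only the expression itself comes from the tMap.

```java
// Hypothetical wrapper around the tMap expression, not Talend-generated code.
class PathSketch {
    // Returns the text after the first "File " marker, or the value
    // unchanged (including null) when the marker is absent.
    static String extractPath(String value) {
        return value != null && value.indexOf("File ") > -1
                ? value.substring(value.indexOf("File ") + 5)
                : value;
    }

    public static void main(String[] args) {
        // The ID value has no marker, so it passes through untouched.
        System.out.println(extractPath("00070"));
        // The Export value is cut down to the path that follows "File ".
        System.out.println(extractPath(
            "0bf08b33 (2022-06-17T09:35:07) is here -> File D:\\PATH\\TO\\FILE\\LOCATION\\"));
    }
}
```

The +5 is the length of "File " (four letters plus the space), so the substring starts at the drive letter. The search is case-sensitive, which is why the upper-case FILE inside the path is not picked up.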

 

I've built a demo job to show you what it looks like. It uses all of the settings I have described.

 

0695b00000UzWhsAAF.png 

 

Anonymous
Not applicable

I think the best would be to use the following component: https://help.talend.com/r/en-US/8.0/ms-delimited/ms-delimited-scenario

Anonymous
Not applicable

Good call @Balazs Gunics. I completely forgot to consider the tFileInputMSDelimited. Thanks for stepping in 👍

Conkers
Contributor III
Author

Thank you both, I will give tFileInputMSDelimited a go! (Sorry for not acknowledging sooner, I couldn't log in to respond.)