Hello,
I'm facing some issues with one of my jobs in Talend Open Studio. I'm working with an Excel file as input, with the following format:
Keys\ Parameters | X1 | X2 | X3 | X4 |
A | some value | some value | some value | some value |
B | some value | some value | some value | some value |
C | some value | some value | some value | some value |
D | some value | some value | some value | some value |
The tough part is that there can be a couple hundred parameters, and any of them may be missing from a given file, so I need to check for each parameter whether it is present. A single tMap cannot handle all of my parameters, so I use several tMaps and have split the job into multiple subjobs. In addition, for each parameter that is present, I need to add another column holding the parameter code, which doubles the number of expressions in my already massive tMaps.
Afterwards, I need to reshape the data into one row per Key and Parameter, and one row per Key and Parameter Code. I first split each input row into one row for the parameters and one row for the codes, e.g.:
rowNumber | Key | X1 | X2 |
1 | row.Key | row.X1 | row.X2 |
2 | row.Key | row.X1Code | row.X2Code |
I then split these rows again to get my final format:
rowNumber | Key | Data |
row.rowNumber | row.Key | row.X1 |
row.rowNumber | row.Key | row.X2 |
Splitting the data this way produces several thousand rows and eventually crashes the job. It also seems that the last 20 or so expressions in my tMaps do not work as intended: they produce values that differ from those of the other columns, even though they use the same expressions. Is there a limit to the number of columns you can use in a tMap?
My approach is probably not the most efficient one, so I'm open to any advice on how to restructure it to avoid the crashes and the strange job behaviour.
I was thinking of a workaround that would use multiple jobs instead of a single one, but that may not be an option.
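To make the target shape concrete, here is a minimal plain-Java sketch of the reshaping described above. It is not Talend-generated code and does not use any Talend API. The class name `WideToLong`, the method `unpivot`, and the use of a `Map` to hold one input row are all hypothetical; I am only assuming that the code for a parameter `X1` sits in a column named `X1Code`, as in the example rows above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only (not a Talend component).
class WideToLong {

    /**
     * Turns one wide row (Key plus optional X1, X1Code, X2, X2Code, ...)
     * into long rows of {rowNumber, key, data}.
     * rowNumber is "1" for a parameter value and "2" for a parameter code.
     * Parameters that are absent from the row are skipped.
     */
    static List<String[]> unpivot(Map<String, String> wideRow, List<String> parameters) {
        List<String[]> out = new ArrayList<>();
        String key = wideRow.get("Key");

        // First pass: one output row per parameter that is present.
        for (String p : parameters) {
            if (wideRow.containsKey(p)) {
                out.add(new String[] {"1", key, wideRow.get(p)});
            }
        }
        // Second pass: one output row per code of a present parameter.
        // The data field is null if the matching "...Code" column is missing.
        for (String p : parameters) {
            if (wideRow.containsKey(p)) {
                out.add(new String[] {"2", key, wideRow.get(p + "Code")});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Example row for key A where only X1 and X3 are present.
        Map<String, String> row = new LinkedHashMap<>();
        row.put("Key", "A");
        row.put("X1", "10");
        row.put("X1Code", "c1");
        row.put("X3", "30");
        row.put("X3Code", "c3");

        for (String[] r : unpivot(row, Arrays.asList("X1", "X2", "X3", "X4"))) {
            System.out.println(String.join(" | ", r));
        }
    }
}
```

The point of the sketch is that the presence check and the split are driven by a list of parameter names in a loop, rather than by one hand-written expression per column, which is what my tMaps currently contain.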
Kind regards,
Pierre
Hi,
Any ideas on this?
Thanks,
Pierre