Hello,
I have one input file containing rows of words and another containing rows of lines. The idea is to check whether any of the words appears in a line. If one does, the line is rejected; if not, the line is written to an output file.
In another programming language, I would read the words file first and put all the words into a list. Then I would compare each line against that list to check whether a word is found in that line.
How can I do this in Talend? I guess tMap is the answer...
Thanks in advance for your help.
Thomas
Check tFilterRow component!
@dipanjan93 wrote:
Check tFilterRow component!
Hello,
Thanks, mate, for your answer. But can tFilterRow accept 2 inputs?
Nope, it only works with a single input. If possible, could you please elaborate on your current scenario? I might be able to share details with you after that.
I have set up a sample job; I hope it helps.
file1
id,name,comment
1,Jack,tobedeleted
2,John,marked for delete
3,Ron,keep
4,Sam,review
5,Steve,to be purged
lookupfile
status
delete
purge
tobedeleted
I have attached the flow. In the first tMap, you create a new expression on the right-hand side; see the attached screen prints.
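Since the screen prints are not reproduced here, the following is only a rough sketch in plain Java of what such a filter could do with the sample data above. It assumes a case-sensitive substring match of each lookup value against the comment column; the class and method names (CommentKeywordFilter, containsAny, keepClean, runSample) are made up for illustration and are not Talend APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CommentKeywordFilter {

    // True if the text contains at least one keyword as a substring (case-sensitive).
    static boolean containsAny(String text, List<String> keywords) {
        for (String keyword : keywords) {
            if (text.contains(keyword)) {
                return true;
            }
        }
        return false;
    }

    // Keep only the rows whose third column (comment) matches no keyword.
    static List<String> keepClean(List<String> rows, List<String> keywords) {
        List<String> kept = new ArrayList<>();
        for (String row : rows) {
            String[] cols = row.split(",", 3); // id, name, comment
            if (cols.length < 3) {
                continue; // assumption: malformed rows are simply skipped
            }
            if (!containsAny(cols[2], keywords)) {
                kept.add(row);
            }
        }
        return kept;
    }

    // Runs the filter on the sample data from this post (header rows left out).
    static List<String> runSample() {
        List<String> rows = Arrays.asList(
                "1,Jack,tobedeleted",
                "2,John,marked for delete",
                "3,Ron,keep",
                "4,Sam,review",
                "5,Steve,to be purged");
        List<String> keywords = Arrays.asList("delete", "purge", "tobedeleted");
        return keepClean(rows, keywords);
    }

    public static void main(String[] args) {
        for (String row : runSample()) {
            System.out.println(row);
        }
    }
}
```

With this sample, only the rows for Ron and Sam are kept, because the comments of the other three rows contain "delete" or "purge".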
hope it helps.
Cheers
@dipanjan93 wrote:
Nope, it only works with a single input. If possible, could you please elaborate on your current scenario? I might be able to share details with you after that.
I have 2 input files: one is a person file and the other contains the expenses for each person.
PersonInput
Id;Name;Department
145534;Andrea;IT
342832;Stephan;Operation
552121;Lionel;Finance
799299;Mael;IT
100001;Syergei;Administration
ExpenseInput
No;Name;City;Detail
101;Mael;Dallas;100
102;Melissa;New York;250
103;Pierre;Chicago;700
104;Lionel;Santa Fe;50
105;Andrea;Miami;550
106;Lionel;Washington;150
107;Stephan;Kansas;800
108;Valerie;Detroit;10
So my code would be (my own pseudocode):
PersonList = PersonInput[1..5][Name]
Loop ExpenseInput
    If ExpenseInput[Name] Not In PersonList:
        WriteToOutputFile
    Else:
        Reject
End Loop
So the OutputFile would be like below:
102;Melissa;New York;250
103;Pierre;Chicago;700
108;Valerie;Detroit;10
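For comparison, here is a minimal plain-Java sketch of the pseudocode above, outside of Talend. It assumes the person file is small enough to hold its names in a HashSet, while the large expense file is read one row at a time and never loaded fully into memory. The class and method names (ExpenseAntiJoin, loadNames, writeUnmatched, runSample) are hypothetical, and the sample data is fed from strings rather than real files.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.HashSet;
import java.util.Set;

class ExpenseAntiJoin {

    // Read the small person file and collect the Name column (index 1) into a set.
    static Set<String> loadNames(BufferedReader persons) throws IOException {
        Set<String> names = new HashSet<>();
        persons.readLine(); // skip the header row
        String line;
        while ((line = persons.readLine()) != null) {
            String[] cols = line.split(";");
            if (cols.length > 1) {
                names.add(cols[1]);
            }
        }
        return names;
    }

    // Stream the expense rows; write a row only if its Name is not in the set.
    // Returns the number of rows written.
    static int writeUnmatched(BufferedReader expenses, Set<String> names, Writer out)
            throws IOException {
        int written = 0;
        expenses.readLine(); // skip the header row
        String line;
        while ((line = expenses.readLine()) != null) {
            String[] cols = line.split(";");
            if (cols.length > 1 && !names.contains(cols[1])) {
                out.write(line);
                out.write("\n");
                written++;
            }
        }
        return written;
    }

    // Runs the sketch on the sample data from this post.
    static String runSample() {
        String persons = String.join("\n",
                "Id;Name;Department",
                "145534;Andrea;IT",
                "342832;Stephan;Operation",
                "552121;Lionel;Finance",
                "799299;Mael;IT",
                "100001;Syergei;Administration");
        String expenses = String.join("\n",
                "No;Name;City;Detail",
                "101;Mael;Dallas;100",
                "102;Melissa;New York;250",
                "103;Pierre;Chicago;700",
                "104;Lionel;Santa Fe;50",
                "105;Andrea;Miami;550",
                "106;Lionel;Washington;150",
                "107;Stephan;Kansas;800",
                "108;Valerie;Detroit;10");
        try {
            Set<String> names = loadNames(new BufferedReader(new StringReader(persons)));
            StringWriter out = new StringWriter();
            writeUnmatched(new BufferedReader(new StringReader(expenses)), names, out);
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(runSample());
    }
}
```

Only the set of names stays in memory, so the memory footprint depends on the size of PersonInput and not on the size of ExpenseInput.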
I have tried a tMap join, but it always runs out of memory because ExpenseInput is very large. That is why I need a loop.
Thanks in advance for your help.
Let me know if you need more information on the attachment I posted previously. For memory issues there are several options; please search the Talend community for the recommended ones. For example, you can use the local drive for tMap processing, and you can also increase the memory size for the job.
You can also use the tfuzzymap component.
Cheers
Ashish
Hi,
I believe you are looking for the output below.
I have set up an inner join between the two flows and selected the records that do not match the join. Please refer to the tMap details below.
Since you mentioned that there will be a large data volume to process, it is a good idea to also provide a temp data directory path in tMap, as shown below.
I hope I have answered your query 🙂 Please spare a minute to mark the topic as resolved; kudos are also welcome 🙂
Warm Regards,
Nikhil Thampi
Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂
I wish I could give you 100 kudos
Thanks a lot mate !