Anonymous
Not applicable

Split file by column

Good morning/evening,

I am new to Talend. I have one Excel file with the columns EmpID, FirstName, LastName, and DeptName.

I want the output to be 3 files, with EmpID (the primary key) common to all of them.
Thanks
8 Replies
Anonymous
Not applicable
Author

first file (EmpID,FirstName)
second file (EmpID,LastName)
third file (EmpID,DeptName)
Thanks
Anonymous
Not applicable
Author

Hi,
You can use tMap to achieve your goal: one input flow, with one output per target file.
Please refer to the tMap component reference in the Talend Help Center.
Best regards
Sabrina
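For readers who want to see the mapping spelled out, here is a rough plain-Java sketch of what such a split does: every output keeps the key column plus exactly one other column. This is not Talend-generated code. The class `ColumnSplitter`, the method `splitByColumn`, the `;` delimiter, and the sample DeptName values are all assumptions made for illustration, and the sketch assumes every row has the same number of fields as the header.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper (not a Talend API): mimics one input flow that is
// mapped to one output per non-key column.
class ColumnSplitter {

    // Splits delimited rows (first row is the header) into one row list per
    // non-key column. Each output row holds the key field plus that column.
    static Map<String, List<String>> splitByColumn(List<String> rows, String delimiter, int keyIndex) {
        String splitRegex = Pattern.quote(delimiter);
        String[] header = rows.get(0).split(splitRegex, -1);
        Map<String, List<String>> outputs = new LinkedHashMap<>();
        for (int col = 0; col < header.length; col++) {
            if (col == keyIndex) {
                continue; // the key is repeated in every output, never split out alone
            }
            List<String> out = new ArrayList<>();
            for (String row : rows) {
                String[] fields = row.split(splitRegex, -1);
                out.add(fields[keyIndex] + delimiter + fields[col]);
            }
            outputs.put(header[col], out);
        }
        return outputs;
    }

    public static void main(String[] args) {
        // Sample data; the DeptName values are invented for this example.
        List<String> rows = List.of(
                "EmpID;FirstName;LastName;DeptName",
                "1;AB;P;Sales",
                "2;CD;Q;HR");
        splitByColumn(rows, ";", 0)
                .forEach((column, out) -> System.out.println(column + " -> " + out));
    }
}
```

Each list in the returned map corresponds to one output file; in a standalone program it could be written out with `java.nio.file.Files.write`.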
Anonymous
Not applicable
Author

Thank you very much for your quick reply.
Could you also suggest a way to make sure that, when I run the job again, only new rows are appended to each file, with no duplicates (in my example there are 3 files)? I know there is an Append checkbox option, but how can I check for duplicates and stop them from being appended to the file?
Thanks
Anonymous
Not applicable
Author

Hi,
There is a component, tUniqRow, which compares entries and filters duplicate entries out of the input flow.
Best regards
Sabrina
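As a rough illustration of the "append only what is not already there" idea, here is a plain-Java sketch. It is not how tUniqRow is implemented; it only shows the same effect on a delimited file. The names `DedupAppender`, `selectNewRows`, and `appendNewRows` are hypothetical, rows are compared as whole lines, and the target file is assumed to be UTF-8 and to end with a line break.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not a Talend API): appends rows to a file while
// skipping rows that are already present.
class DedupAppender {

    // Returns the incoming rows that are neither in 'existing' nor repeated
    // earlier in 'incoming', preserving their original order.
    static List<String> selectNewRows(Collection<String> existing, List<String> incoming) {
        Set<String> seen = new HashSet<>(existing);
        List<String> fresh = new ArrayList<>();
        for (String row : incoming) {
            if (seen.add(row)) { // add() returns false for a duplicate
                fresh.add(row);
            }
        }
        return fresh;
    }

    // Reads the target file (if it exists) and appends only the new rows.
    static List<String> appendNewRows(Path target, List<String> incoming) throws IOException {
        List<String> existing = Files.exists(target)
                ? Files.readAllLines(target, StandardCharsets.UTF_8)
                : List.of();
        List<String> fresh = selectNewRows(existing, incoming);
        Files.write(target, fresh, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        return fresh;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("firstname", ".csv");
        appendNewRows(file, List.of("1;AB", "2;CD"));
        // Second run: "2;CD" is already in the file, so only "3;EF" is appended.
        System.out.println(appendNewRows(file, List.of("2;CD", "3;EF")));
        Files.delete(file);
    }
}
```

Comparing whole lines is the simplest choice; to treat rows as duplicates based on EmpID alone, the set would hold only the key field instead of the full row.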
Anonymous
Not applicable
Author

Thank you very much. I have implemented this using both tMap and tUniqRow. However, I have one scenario where I am working with a key-value-timestamp layout: if a database field changes, only the file for that field should be updated, not all the files (in my example there are 2 files).
empfile
EmpID  FirstName  LastName  TimeStamp
1      AB         P         01/27/2016 2.29AM
2      CD         Q         01/27/2016 2.29AM
3      EF         R         01/27/2016 2.29AM
4      GH         S         01/27/2016 2.29AM
5      IJ         T         01/27/2016 2.29AM
Split using tMap:
file1 (empid, firstname, timestamp)
file2 (empid, lastname, timestamp)
If there is a change in lastname, the new record should be appended only to file2, not to file1.
Likewise, if there is a change in firstname, the new record should be appended only to file1, not to file2.
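The rule described above (append to a field's file only when that field's value changed) can be sketched in plain Java roughly as follows. This is only an illustration under assumptions, not a Talend job: `ChangeDetector` and `changedRows` are hypothetical names, values are assumed to be non-null, the `;` delimiter is assumed, and the last known value per EmpID is assumed to be available in a map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a Talend API): decides which rows to append to
// the file that tracks a single field.
class ChangeDetector {

    // Compares the current value of one field per key with the last known
    // value. Returns one "key;value;timestamp" row for every key that is new
    // or whose value changed, and records the new value in 'lastKnown'.
    static List<String> changedRows(Map<String, String> lastKnown,
                                    Map<String, String> current,
                                    String timestamp) {
        List<String> toAppend = new ArrayList<>();
        for (Map.Entry<String, String> entry : current.entrySet()) {
            String previous = lastKnown.get(entry.getKey());
            if (!entry.getValue().equals(previous)) {
                toAppend.add(entry.getKey() + ";" + entry.getValue() + ";" + timestamp);
                lastKnown.put(entry.getKey(), entry.getValue());
            }
        }
        return toAppend;
    }

    public static void main(String[] args) {
        // State after the first run, one map per field file.
        Map<String, String> lastFirstNames = new LinkedHashMap<>(Map.of("5", "IJ"));
        Map<String, String> lastLastNames = new LinkedHashMap<>(Map.of("5", "T"));

        // Second run: only the first name of EmpID 5 changed in the database.
        Map<String, String> firstNamesNow = Map.of("5", "Hit");
        Map<String, String> lastNamesNow = Map.of("5", "T");

        String ts = "01/27/2016 2.45AM";
        System.out.println("file1 gets: " + changedRows(lastFirstNames, firstNamesNow, ts));
        System.out.println("file2 gets: " + changedRows(lastLastNames, lastNamesNow, ts));
    }
}
```

Because each field is compared separately, a change in firstname produces a row only for file1 and leaves file2 untouched, and the reverse holds for lastname.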
Anonymous
Not applicable
Author

Hi,
What's your target DB? In the t<DB>Output component, there is an "Update or insert" option that updates the record matching the given key. If the record does not exist, a new record is inserted.
Best regards
Sabrina
Anonymous
Not applicable
Author

Hi xdshi,
I am using a MySQL DB and the output is an Excel file.
Job1
MySQL DB (query) > tMap > output (delimited), 2 files created
Job2
input (delimited) file 1 (firstName) > tSortRow > tUniqRow > output
input (delimited) file 2 (lastName) > tSortRow > tUniqRow > output
firstname.csv (empid, firstname, timestamp)
secondname.csv (empid, lastname, timestamp)
Now, if there is any change in the database (an update or insert operation), only the file for the changed field should be appended to, not all the files.
  
FirstName.xls
EmpID  FirstName  Timestamp
1      AB         01/27/2016 2.29AM
2      CD         01/27/2016 2.29AM
3      EF         01/27/2016 2.29AM
4      GH         01/27/2016 2.29AM
5      IJ         01/27/2016 2.29AM

LastName.xls
EmpID  LastName  Timestamp
1      P         01/27/2016 2.29AM
2      Q         01/27/2016 2.29AM
3      R         01/27/2016 2.29AM
4      S         01/27/2016 2.29AM
5      T         01/27/2016 2.29AM

The format is key (primary key), value, and timestamp. Any change is appended to the file with a new timestamp.
Example: suppose the FirstName "Hit" was misspelled as "IJ". If we correct it, the FirstName file, which is treated as immutable (append-only), gets a new row with the corrected value, and everything else remains the same.

FirstName.csv
1      AB         01/27/2016 2.29AM
2      CD         01/27/2016 2.29AM
3      EF         01/27/2016 2.29AM
4      GH         01/27/2016 2.29AM
5      IJ         01/27/2016 2.29AM
5      Hit        01/27/2016 2.45AM

LastName.csv
1      P         01/27/2016 2.29AM
2      Q         01/27/2016 2.29AM
3      R         01/27/2016 2.29AM
4      S         01/27/2016 2.29AM
5      T         01/27/2016 2.29AM
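With files kept in this append-only form, the current value for a key is the last row recorded for it. A minimal plain-Java sketch of that lookup follows; `LatestValues` and `latestByKey` are hypothetical names, the `;` delimiter is an assumption, and the sketch relies on rows being in the order they were appended rather than on parsing the timestamp.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper (not a Talend API): resolves the current value per key
// from an append-only key;value;timestamp file.
class LatestValues {

    // Walks the rows in file order; a later row for the same key overwrites
    // the earlier one, so the map ends up holding the most recent value.
    static Map<String, String> latestByKey(List<String> rows, String delimiter) {
        String splitRegex = Pattern.quote(delimiter);
        Map<String, String> latest = new LinkedHashMap<>();
        for (String row : rows) {
            String[] fields = row.split(splitRegex, -1);
            latest.put(fields[0], fields[1]); // field 0 = key, field 1 = value
        }
        return latest;
    }

    public static void main(String[] args) {
        List<String> firstNameRows = List.of(
                "4;GH;01/27/2016 2.29AM",
                "5;IJ;01/27/2016 2.29AM",
                "5;Hit;01/27/2016 2.45AM");
        System.out.println(latestByKey(firstNameRows, ";"));
    }
}
```

The resulting map is also a convenient starting point for deciding, on the next run, whether a value read from the database differs from what the file already holds.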
Anonymous
Not applicable
Author

Good morning,
FirstName.xls should be FirstName.csv (misspelled).
LastName.xls should be LastName.csv (misspelled).
To avoid any confusion: I am using delimited input/output in both jobs (Job1 and Job2).
Thanks