Fasta file in Talend

Dear Talend, I am having a problem reading a FASTA file in Talend.
I am still new to Talend Open Studio for Big Data.


The FASTA file is as follows:

>FAM138A ENST00000417324 1:35138-35736(-)
atgctgctgactatagagacaaagtctcactatgttgctcaggctggtcttgaactcctggcctcaagcgatcctcccac
ctcagcctcccaaagtgttgggattatagacatgagccactgcacctggccgaccttgggcaagttcttaaacccttcaa
agcctcatttttctccaatcacaaaagggaaagatggtaatattttccccaccaaattcttgtcggatgccctcacagaa
ttgagattatgtacgtaa
>ENSG00000197490 ENST00000359752 1:37397-54936(+)
atgttgctcaccttatgggcagggtctcactatgttgctgaggctggtctcaaactcctgacctcaagcaatctgtctgc
ttcagcctcccaagtagctgagaatacagggacaagccattgcacctga

 

I have tried several input components, such as tFileInputDelimited and tFileInputMSDelimited, but I don't know a standard way to read a FASTA file in Talend.
I have also tried some processing components, such as tMap, tJavaRow and tJavaFlex, but I could not get the output I want.

 

My objective is to extract each piece of information from the FASTA file and store it in a CSV file.

Can someone help me? I have been stuck on this for more than 2 weeks.


The output should be as follows:

FAM138A; ENST00000417324 1:35138-35736(-); atgctgctgactatagagacaaagtctcactatgttgctcaggctggtcttgaactcctggcctcaagcgatcctcccacctcagcctcccaaagtgttgggattatagacatgagccactgcacctggccgaccttgggcaagttcttaaacccttcaaagcctcatttttctccaatcacaaaagggaaagatggtaatattttccccaccaaattcttgtcggatgccctcacagaattgagattatgtacgtaa

 

FAM138A;ENST00000417324;1:35138-35736(-);atgctgctgactatagagacaaagtctcactatgttgctcaggctggtcttgaactcctggcctcaagcgatcctcccacctcagcctcccaaagtgttgggattatagacatgagccactgcacctggccgaccttgggcaagttcttaaacccttcaaagcctcatttttctccaatcacaaaagggaaagatggtaatattttccccaccaaattcttgtcggatgccctcacagaattgagattatgtacgtaa

 

 

 

 

1 Reply

Hello,

How did you set the row separator and field separator in the input component?

Based on your requirement, you can create an input schema, set the row separator and field separator to match your FASTA file, and then use a tMap component to pick the desired output columns.
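
To illustrate the idea, here is a minimal plain-Java sketch of the parsing logic, which could perhaps be adapted into a tJavaRow or tJavaFlex. It assumes that ">" only ever appears at the start of a header line, that each header has whitespace-separated parts, and that the sequence lines of a record should be joined into one string. The class name FastaToCsv and the method toCsvLines are hypothetical names chosen for this example and are not part of Talend.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not a Talend component.
class FastaToCsv {

    // Converts FASTA text into one semicolon-separated line per record:
    // the header parts first, then the joined sequence.
    static List<String> toCsvLines(String fasta) {
        List<String> out = new ArrayList<>();
        // Assumption: ">" acts as the record (row) separator.
        for (String record : fasta.split(">")) {
            if (record.trim().isEmpty()) {
                continue; // skip the empty chunk before the first ">"
            }
            // The first line is the header; the remaining lines are the sequence.
            String[] lines = record.split("\\r?\\n");
            String[] header = lines[0].trim().split("\\s+");
            StringBuilder sequence = new StringBuilder();
            for (int i = 1; i < lines.length; i++) {
                sequence.append(lines[i].trim());
            }
            out.add(String.join(";", header) + ";" + sequence);
        }
        return out;
    }

    public static void main(String[] args) {
        // Shortened sample records, for illustration only.
        String fasta = ">FAM138A ENST00000417324 1:35138-35736(-)\n"
                + "atgctgctgac\n"
                + "ttgagattatgtacgtaa\n"
                + ">ENSG00000197490 ENST00000359752 1:37397-54936(+)\n"
                + "atgttgctcacc\n";
        for (String line : toCsvLines(fasta)) {
            System.out.println(line);
        }
    }
}
```

Each printed line has the header parts and the joined sequence separated by ";", which matches the second output format in the question. If the header should stay partly unsplit, as in the first format, the header handling would need to change accordingly.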

Best regards

Sabrina