Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
 McDon
		
			McDon
		
		
		
		
		
		
		
		
	
			
		
		
			
					
		I am new to Talend Studio and still learning various aspects of it. Your patience is appreciated.
Summary
I am trying to retrieve REST API data, parse it out and write it to Snowflake. By writing to this community I hope someone can help me refine my approach and further my Talend comprehension.
Goal
Receive a single text string response from the REST API that contains the rows and columns and then structure the data to write it to a Snowflake table.
Details
I am calling Smartsheet API via tREST component.
The request is set to Accept=text/csv.
What I receive back, as expected, is a single row with two fields: Body and Error_Code
I now want to load that data into a Snowflake structured table.
This is a low volume of data so my goal is simply to transform the data in Talend Studio code versus using ELT components. Plus this will force me to explore new areas of Talend.
Requirements
- Single string must be parsed into separate rows
- All data is quoted per CSV standards
- EOL indicator is embedded in the data within the quoted field
- Return structured row/column data per the pre-defined schema
How I got it to work:
- tREST -main-> tFileOuputDelimited -onComponentOk-> tFileInputDelimited -main-> tDBOutput (snowflake)
I would like to avoid writing the file to disk and reading it again just to get the data transformation from those widgets.
What else did I try:
- tSplitRow - I couldn't get it to work because the EOL character is within the data and tSplitRow must consider the CSV standard of quoted data - it is not a blind parse on character scenario
- tExtractDelimitedFilds - couldn't use this until at least the rows were parsed out.
Why didn't I...
- accept JSON response and use those Talend components - since this is Smartsheet, there are two loop nodes in the JSON - first the column headers then, the data for each column. Given this dynamic nature, I did not know how to approach it. Instead, I opted for the text/csv response as that was fixed. In addition, I get an excessive amount of data that I do not need, like formulas for example.
Thank you in advance for any guidance this community can provide.
Some screenshots of my Talend setup
 
					
				
		
Hi
Link tRest to a tLogRow to print the response on the console, we will see what does the response string look like, and let us know what data you expect to extract from the response.
Regards
Shong
 McDon
		
			McDon
		
		
		
		
		
		
		
		
	
			
		
		
			
					
		Here is the sample data.
There are only about 100 rows. I trimmed it down to the representative rows.
Also made the column names and data generic. This should give you the idea.
Ideally, I would like to tExtractDelimitedFields.
However, it seems, tExtractDelimitedFields only looks for the delimiter and does not consider CSV or other standards such as quoted values and delimiters within those quoted values.
tExtractRegexFields is an option but would require a fairly complex expression to account for all CSV standards. Somewhat negating the benefit of Talend Studio components.
tLogRow output:
Starting job Copy_of_Smart_sheet_to_Snowflake_b at 21:59 04/12/2021.
[statistics] connecting to socket on port 3683
[statistics] connected
"IND","MD","CC","SN","SST","CN","AM","IND2","VH","SHU","SRO","WN","WNN","SWS","SWO","SWW","SWT","HA","HH","FB","HV","SIND","SED","LM","HT","DR","LMH","ALPU","SHU","NOAE","NOEE","NOUOE","NOCE","NOAUE","NOLE","DSE","YHDE","SHE","PHE","DUTE","CSK","CMK","CSMSK","CDR","STR","EDD","AE",DSAT","DSUT","LU","SHA","MD","ND","ILM","FOTM","EOLM","FOLM","EI","LEI","ILE","IO","CSIN","IE","VS","EM","BEM","IVCE","IVSE","VSE","IDS","OE","EEM","WN","WNX","CSSN","PRY"
"","10/01/20","BBB","NOT APPLICABLE","All","BBB","Santa Claus","- Inactive","","","","","","","","","","Yes","Yes","","Yes","","01/08/21","09/01/20","Up","","No","-","0.0%","437,678","","2,137","598,000","","300","","","","","","more data","still more data","another data value","1","1","End Date: 01/08/21
Duration: 99 calendar days","Bugs bunny","0.00","0","56","0","10/01/20","","false","12/01/21","11/30/21","11/01/21","2320201001","2320201001","true","true","0023","true","true","","","","","","","","","","","BMC - SOW-none - All",""
<...100 more rows of similar format...>
"","11/01/21","EEE","SSS000","X,Y,Z,A","EEE-FN","J. Smith","- Inactive","31.00","","","- line 1 of data entry text from user
- additional line of user entered data
- additional line of user entered data
- additional line of user entered data
- additional line of user entered data
- additional line of user entered data
- additional line of user entered data
- additional line of user entered data
- additional line of user entered data
- additional line of user entered data
- additional line of user entered data
text data continues","","","","","","No","","No","","10/08/19","05/01/21","Sideways","","No","-","0.0%","","","","","189","","5","5","","658","0.12","PPP-01","PPPP99999","P","3","5","End Date: 10/08/19
Duration: 99 calendar days","Santa Claus, Easter Bunny","0.00","0","1025","0","11/01/21","","true","12/01/21","11/30/21","11/01/21","2620211101","2620211101","true","false","0026","true","true","","","","","","","","","- text entered by ueser
- additional line of user entered data
- additional line of user entered data
- additional line of user entered data
- additional line of user entered data
- additional line of user entered data
- additional line of user entered data
- additional line of user entered data
- additional line of user entered data
- additional line of user entered data
- additional line of user entered data","EPS - SOW-01 - RMA, DMA, Ana",""
|
[statistics] disconnected
Job Copy_of_Smart_sheet_to_Snowflake_b ended at 21:59 04/12/2021. [Exit code = 0]
 
					
				
		
@Don McCarty, I understand your question, but tExtractDelimitedFields does not support CSV option now, this is a new feature, you can vote and watch the jira issue.
Regards
Shong
 
					
				
		
I don't have other options, as you did, output the data to to a temporary file and read the data back with CSV option.
 McDon
		
			McDon
		
		
		
		
		
		
		
		
	
			
		
		
			
					
		