 radhikari
		
Hello,
I need to copy data from S3 to EMR in AWS. Can someone please tell me which component I can use to write data to EMR? I am using Talend Studio as part of the Data Management Platform.

Hi,
If we understand your requirement correctly, you can use the tS3Get component to retrieve a file from Amazon S3.
The workflow should be: tS3Connection --> tS3Get (retrieve files from S3 to local) --> tFileUnarchive (unzip your file) --> EMR cluster (Amazon EMR). Let us know if that works for you.
Best regards
Sabrina
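
For readers who want to see the shape of that workflow outside Talend Studio, here is a minimal Python sketch of the same three steps: fetch the archive, unzip it, and hand the files to the cluster. It is an illustration under stated assumptions, not the Talend implementation. The names `stage_for_emr`, `unzip`, and `hdfs_put_command` are hypothetical, the S3 download is passed in as a callable so the sketch runs without AWS credentials, and the final step assumes the data should land in HDFS on the EMR cluster via `hdfs dfs -put`, which the reply above does not specify.

```python
import zipfile
from pathlib import Path
from typing import Callable, List


def unzip(archive: Path, dest_dir: Path) -> List[Path]:
    """Extract a zip archive (the tFileUnarchive step) and list the extracted files."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest_dir)
    return sorted(p for p in dest_dir.rglob("*") if p.is_file())


def hdfs_put_command(local_files: List[Path], hdfs_dir: str) -> List[str]:
    """Build, but do not run, an `hdfs dfs -put` command (assumed final step)."""
    return ["hdfs", "dfs", "-put", *[str(p) for p in local_files], hdfs_dir]


def stage_for_emr(
    fetch: Callable[[str, Path], None],  # hypothetical stand-in for the tS3Get step
    key: str,
    work_dir: Path,
    hdfs_dir: str,
) -> List[str]:
    """Download one archive, unzip it locally, and return the upload command."""
    work_dir.mkdir(parents=True, exist_ok=True)
    archive = work_dir / Path(key).name
    fetch(key, archive)  # S3 -> local disk
    files = unzip(archive, work_dir / "extracted")  # unzip locally
    return hdfs_put_command(files, hdfs_dir)  # local -> HDFS on the cluster


# Assumed usage with boto3 (bucket name is a placeholder, not from this thread):
#   import boto3
#   s3 = boto3.client("s3")
#   cmd = stage_for_emr(
#       lambda key, dest: s3.download_file("my-bucket", key, str(dest)),
#       "incoming/data.zip", Path("/tmp/stage"), "/user/hadoop/input")
#   subprocess.run(cmd, check=True)  # run on a node that has the hdfs client
```

Passing the download step in as a function keeps the local staging logic separate from the AWS call, so it can be tried with a plain file copy before pointing it at a real bucket.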
 radhikari
		
Thanks for the response.
We don't really need to download the files locally. Is it possible to push data from S3 to EMR directly?
Also, in your proposed solution, which component handles the final (EMR cluster) step? The only EMR components I see are "tAmazonEMRResize", "tAmazonEMRListInstances", and "tAmazonEMRManage".

Hi,
So far, Talend does not support transferring data directly from S3 to EMR. You have to download the files locally and then push the data to EMR.
You can select the Amazon EMR distribution in the Hadoop components.
Please take a look at my screenshot.
Best regards
 radhikari
		
Thanks for the response. However, it looks like those components require a subscription to one of the Talend big data solutions, and our subscription is for Talend Data Management Platform 6.2.1. Does this mean we won't be able to connect to Amazon EMR using the components we have access to?

Hi,
The tHDFSOutput component is available in Talend Open Studio for Big Data.
So far, there is no specific output component for AWS EMR.
Best regards
Sabrina
