Anonymous
Not applicable

spark configuration to use cluster created by tAmazonEMRManage

Hi Experts, 
I am new to Talend and am using Big Data Platform 6.1.1. I can create and launch a cluster with tAmazonEMRManage, but I want to use it as a pre-task to a Big Data Batch job that runs on Amazon EMR Spark, reads from S3 and PostgreSQL (RDS), and writes back to S3 and PostgreSQL. I am facing the following challenges:
1> I am unable to pass the resource manager to the Spark configuration of the second job dynamically.
2> I am unable to use two tS3Configuration components in the same Big Data Batch job to read from multiple S3 buckets.
3> I am unable to find a PostgreSQL connector in the Big Data Batch job.
Could you please advise?
Thanks, 
ajmani 
4 Replies
Anonymous
Not applicable
Author

Hi,
Have you tried using the tS3XXX components in a standard job and calling the Spark job as a subjob through tRunJob?
For RDS, you can use the following Spark components:
tMysql component for RDS (Aurora/MySQL)
tOracle component for RDS (Oracle)
tJDBC component for RDS (MariaDB/PostgreSQL/SQL Server)
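For a PostgreSQL RDS instance, a JDBC-based component needs the standard PostgreSQL JDBC URL and driver class. Below is a minimal sketch in Java of how those values are typically put together. The helper class, the host name, and the database name are hypothetical and used only for illustration; they are not part of any Talend API.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; not a Talend class.
final class RdsJdbcSettings {
    // Driver class shipped with the PostgreSQL JDBC driver.
    static final String POSTGRES_DRIVER = "org.postgresql.Driver";

    // Builds a URL of the form jdbc:postgresql://host:port/database.
    static String postgresUrl(String host, int port, String database) {
        return "jdbc:postgresql://" + host + ":" + port + "/" + database;
    }

    // Collects the credentials as connection properties ("user", "password").
    static Properties credentials(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        return props;
    }

    public static void main(String[] args) {
        // The endpoint and database below are made-up placeholders.
        String url = postgresUrl("mydb.example.us-east-1.rds.amazonaws.com", 5432, "sales");
        System.out.println(url);
    }
}
```

The resulting URL and driver class are the values you would enter in the component's connection settings, with the PostgreSQL driver jar added to the job.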
Let us know if this works for your case.
Best regards
Sabrina
Anonymous
Not applicable
Author

Hi ajmani,
Is there any update on your issue?
Best regards
Sabrina
Anonymous
Not applicable
Author

Hi Sabrina,
Thanks for looking into this. You suggested using tJDBC, a Spark component, to connect to PostgreSQL on RDS, but the tJDBC components are not available in a Big Data Batch job.
Thanks, 
ajmani
Anonymous
Not applicable
Author

Hi,
The generic JDBC component (tJDBC) for Spark will be available in 6.2.
Here is the related Jira issue: https://jira.talendforge.org/browse/PMBD-384
Best regards
Sabrina