Anonymous
Not applicable

Deployment options for big data streaming job

Hi,

 

I have managed to build my first big data streaming job, which consumes a Kinesis stream. I have installed a JobServer on an AWS EMR cluster, and I am able to successfully deploy and run the job on that JobServer.

 

My only concern is that we would need an EMR cluster running 24/7 just for this one job. Are there any other ways of deploying / "productionizing" a big data streaming job without running a whole cluster just for that?

1 Reply
Anonymous
Not applicable
Author

Hello,

Keeping the cluster running around the clock is the whole purpose of stream processing on top of a big data cluster.

If you do not need that much computing power, here are 4 options to consider:

- Set the Spark configuration to run locally. This only requires an EC2 instance where the JobServer is deployed.
- Use Talend ESB / Camel.
- Use the new 7.0 support for the Cloudera Altus distribution (which acts as Hadoop as a service).
- Use the new serverless distribution we shared on Talend Marketplace, based on the Qubole SaaS offering (also Hadoop as a service).
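
To illustrate the first option in plain Spark terms: running locally means the Spark master is set to local[N] (or local[*] for all cores) instead of pointing at a cluster manager. The sketch below only builds the equivalent spark-submit command line. The jar name and main class are hypothetical placeholders, and in Talend Studio you would most likely pick local mode in the job's Spark configuration rather than type this by hand.

```python
import shlex


def local_submit_command(jar_path, main_class, cores="*"):
    """Build a spark-submit command that runs a job in Spark local mode.

    jar_path and main_class are placeholders for illustration only;
    they are not names produced by Talend.
    """
    # local[*] uses every core on the machine, local[2] uses two threads.
    master = f"local[{cores}]"
    return [
        "spark-submit",
        "--master", master,      # no cluster manager, everything runs on this host
        "--class", main_class,
        jar_path,
    ]


cmd = local_submit_command("streaming_job.jar", "com.example.StreamingJob")
# shlex.join quotes the arguments so the line can be pasted into a shell.
print(shlex.join(cmd))
```

With this setup the driver and executors share one JVM on a single EC2 instance, so throughput is limited by that machine's cores and memory.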

Let us know if this is what you are looking for.

Best regards

Sabrina