thomas_ciampagl
Contributor

Need advice on CI/CD for Talend 7.1

Hi all.  This is my first Talend project, and I think there are a few fundamental things I don't fully understand that are making it difficult for me to design my deployment approach.  So I am requesting some advice on how I should proceed.

 

I am using Talend 7.1 Big Data (but I'm not using Spark or anything like that, I just have standard data integration jobs at this point).  The coding is going well and my jobs work in Studio.

 

I have the cloud version that uses the management console (as opposed to the TAC).  Sorry if that's not accurate, but this is one of the things I don't fully understand yet.

 

I have done tutorials and have published jobs through Studio to the TMC and run them.

 

First question:  Where are these jobs actually running?  I have Studio running on an Azure VM, but the URL for TMC is outside of that domain.  I have nothing hardware related defined in the TMC, but yet these sample jobs run.  Where is that happening?

 

Ok, so now I have jobs and they run in studio and in TMC.  Now I am moving onto creating a deployment strategy.  We have the following in our development environment:

 

Talend Studio running on an Azure VM

The TMC which I connect through the "tmc.us.cloud.talend.com" url

My company has a GIT repository that I have the source code in

We have Jenkins, Maven and Nexus on prem

I can use Docker, too, but not sure if it is useful in my scenario or not.

 

So, I am working on creating a Jenkins pipeline using the Talend documentation I've read.  But what I am missing is how this gets deployed to the management console.  Basically, for my development environment, please help me understand the high-level steps I need to follow so my jobs are deployed somewhere I can run them and scale hardware as needed.

 

Further, my clients will use different environments.  Some will be on prem, some will be in the cloud.  Is there a generic CI/CD solution I can implement, or will it depend on the client's environment?

 

So to recap:

When using the TMC, where do the jobs actually run (and as a bonus, how does scaling work here)?

How can I use Talend CI/CD process to deploy built jobs to the TMC for scheduling and running?

What parts of the process can be generic vs. unique based on the client's environment setup?

 

Again, I understand some of this may be obvious and basic, but somehow I'm missing it, so please help me fill in the gaps.

 

Thanks!

Tom

 

1 Solution

Accepted Solutions
Anonymous
Not applicable

Hello @thomas.ciampagl,

 

Thanks for your post. There are a lot of questions in it! I'll try to answer them.

 

First question:  Where are these jobs actually running?  I have Studio running on an Azure VM, but the URL for TMC is outside of that domain.  I have nothing hardware related defined in the TMC, but yet these sample jobs run.  Where is that happening?


It looks like you are referring to what we call "Cloud Engines". With Talend Cloud, you can run your jobs on either Remote Engines or Cloud Engines. Remote Engines are installed by you in your own environment (this could be your laptop, or your VPC in AWS, for example). Cloud Engines are hosted by Talend and allow you to run jobs without installing any software; Talend manages the infrastructure for you. This is probably where your jobs have been running.


Talend Studio running on an Azure VM

The TMC which I connect through the "tmc.us.cloud.talend.com" url

My company has a GIT repository that I have the source code in

We have Jenkins, Maven and Nexus on prem

I can use Docker, too, but not sure if it is useful in my scenario or not.

So, I am working on creating a Jenkins pipeline using the Talend documentation I've read.  But what I am missing is how this gets deployed to the management console.  Basically, for my development environment, please help me understand the high-level steps I need to follow so my jobs are deployed somewhere I can run them and scale hardware as needed.


I will try to shed some light on that. Please look at the below diagram that might give you an overview:

[Diagram: 0683p000009M77y.png]

 

 

As you can see, you can set up Jenkins pipelines to deploy "on-prem" to an artifact repository such as Nexus. You can also deploy to TMC, which you have done, or to a Docker registry. It looks like deploying to TMC is what you are after.

To achieve this, you will have to use Jenkins along with Maven. Our Maven plugin lets you deploy to any of these targets depending on the profile you specify (-Pnexus, -Pcloud-publisher and -Pdocker, respectively). You can find Jenkins scripts in our documentation.
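To make the profile idea concrete, here is a minimal sketch of how a pipeline step could pick the Maven profile for a deployment target. The three profile names are the ones mentioned above; the target keys, the `build_maven_command` helper and the use of the standard `deploy` phase are assumptions for illustration, and a real Talend build will likely need further options (credentials, plugin settings) that are described in the documentation and not shown here.

```python
# Sketch only: maps a deployment target to the Maven profile named in this thread.
# The target keys ("nexus", "tmc", "docker") and this helper are hypothetical.
PROFILE_BY_TARGET = {
    "nexus": "nexus",            # on-prem artifact repository
    "tmc": "cloud-publisher",    # Talend Management Console
    "docker": "docker",          # Docker registry
}


def build_maven_command(target, pom_path="pom.xml"):
    """Return the Maven argument list for the given deployment target."""
    if target not in PROFILE_BY_TARGET:
        raise ValueError(f"unknown deployment target: {target!r}")
    # -f selects the POM file, -P activates the profile.
    return ["mvn", "-f", pom_path, "deploy", f"-P{PROFILE_BY_TARGET[target]}"]


if __name__ == "__main__":
    print(" ".join(build_maven_command("tmc")))
```

A Jenkins stage could run the resulting command line in a shell step, so switching targets only means changing the target name.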

 


So to recap:

When using the TMC, where do the jobs actually run (and as a bonus, how does scaling work here)?

How can I use Talend CI/CD process to deploy built jobs to the TMC for scheduling and running?

What parts of the process can be generic vs. unique based on the client's environment setup?


When using TMC, the jobs run on Remote Engines or Cloud Engines. You can scale by using clusters of Remote Engines. With Cloud Engines, Talend manages the resources for you.

The CI/CD process can be implemented with any CI/CD tool that supports Maven, such as Jenkins.

If I understood correctly, your process can be fairly generic because it is built on Maven; only the options, parameters and configuration will differ.
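One way to picture "generic process, different configuration" is to keep a single command template and store everything client-specific as data. The sketch below assumes a hypothetical `ClientEnvironment` record; the client names, the property key `example.repo.url` and the URL are placeholders and not real Talend settings. Only the profile names come from this thread.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ClientEnvironment:
    """Hypothetical per-client settings; only the profile names come from the thread."""
    name: str
    profile: str
    properties: dict = field(default_factory=dict)


def maven_args(env):
    """Build the same Maven invocation for every client, varying only the options."""
    args = ["mvn", "deploy", f"-P{env.profile}"]
    # -Dkey=value passes a property to Maven; the keys used below are placeholders.
    for key in sorted(env.properties):
        args.append(f"-D{key}={env.properties[key]}")
    return args


# Example environments (all values are illustrative assumptions).
ON_PREM = ClientEnvironment(
    name="client-a",
    profile="nexus",
    properties={"example.repo.url": "https://nexus.example.invalid/releases"},
)
CLOUD = ClientEnvironment(name="client-b", profile="cloud-publisher")

if __name__ == "__main__":
    for env in (ON_PREM, CLOUD):
        print(env.name, " ".join(maven_args(env)))
```

The pipeline logic stays identical across clients; onboarding a new client means adding one more record.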

 

I hope I answered most of your questions,

 

Sorry for the late response,

 

Thibaut

 


2 Replies
sarora1
Creator

Let me know if you had success with this. I have the exact same situation. I am new to Talend and have a job that runs fine in Studio. I would like to create a CI/CD platform. I have code in Git, Jenkins and Nexus running locally, and I am using TMC. I don't want to use Docker.
