Machine learning components are not available in Studio
Product: Big Data
Component: Components
Problem Description
To use Talend machine learning components, you need one of these licenses:
Talend Big Data Platform (with just a Talend Big Data license, machine learning components are not available)
Talend Real-time Big Data Platform
Talend Data Fabric
A Talend Real-time Big Data Platform license is active in Talend Studio and Talend Administration Center (TAC) 6.3.1. However, after you create a Big Data batch Job (Spark) in a remote project, the machine learning components are not available in Studio.
Problem Root Cause
A Talend Real-time Big Data Platform license (or a Talend Data Fabric license) makes the machine learning components available in a Spark Big Data batch or streaming Job. With a Real-time Big Data Platform license, you can create two types of projects and users in TAC:
Data Integration/ESB
Data Quality
A Data Quality project type enables Big Data features such as machine learning, but a Data Integration/ESB project type does not. In this case, the components are missing because the remote project defined in TAC is of the Data Integration/ESB type, and the Job was created in that project.
Solution or Workaround
The solution consists of ensuring that:
The license used in Talend Studio and TAC is a Talend Big Data Platform, Talend Real-time Big Data Platform, or Talend Data Fabric license.
The Talend Job that uses machine learning components is a Spark Job (Big Data batch or streaming), because the machine learning components rely on the Spark MLlib libraries.
If the Talend Real-time Big Data Platform license is active and the Talend Job belongs to a remote project, the project is of the Data Quality type, and Talend Studio connects to TAC as a Data Quality type user.
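The three conditions above can be read as a simple checklist. The following Python sketch is purely illustrative and is not part of any Talend API: the `JobSetup` class, its fields, and the `ml_component_blockers` function are hypothetical names introduced here only to make the decision logic explicit.

```python
from dataclasses import dataclass

# Licenses that the article lists as including machine learning components.
ML_LICENSES = {
    "Talend Big Data Platform",
    "Talend Real-time Big Data Platform",
    "Talend Data Fabric",
}


@dataclass
class JobSetup:
    """Hypothetical description of a Studio/TAC setup (not a Talend API)."""
    license_name: str
    is_spark_job: bool         # Big Data batch or streaming Job on Spark
    is_remote_project: bool
    project_type: str = ""     # "Data Quality" or "Data Integration/ESB"
    user_type: str = ""        # type of the user that Studio connects as


def ml_component_blockers(setup: JobSetup) -> list:
    """Return the reasons machine learning components would be unavailable."""
    blockers = []
    if setup.license_name not in ML_LICENSES:
        blockers.append("license does not include machine learning components")
    if not setup.is_spark_job:
        blockers.append("Job is not a Spark batch or streaming Job")
    # The third condition is stated for the Real-time Big Data Platform
    # license with a remote project, so it is only checked in that case.
    if (setup.license_name == "Talend Real-time Big Data Platform"
            and setup.is_remote_project):
        if setup.project_type != "Data Quality":
            blockers.append("remote project is not of the Data Quality type")
        if setup.user_type != "Data Quality":
            blockers.append("Studio user is not a Data Quality type user")
    return blockers
```

For the situation described in this article (Real-time Big Data Platform license, Spark batch Job, remote Data Integration/ESB project), the function reports the two project-related blockers; switching both the project and the user to the Data Quality type returns an empty list.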