WSyahirah21
Creator

Is it possible to import Python modules in Talend?

I'm running a Spark job that needs to use a library from a Python module I created. A snippet of the Spark script:

 

  import logging

  from loaders import ingestion

  logging.basicConfig(level=logging.INFO)
  log = logging.getLogger('jdbc-loader')

 

However, when I run the Spark job, it fails with the following error:

File "/data/Talend-7.3.1/jobserver/agent/TalendJobServersFiles/repository/<path>", line 17, in <module>
  from loaders import ingestion
ImportError: No module named loaders

 

Is there any way to pass the Python module into a script in Talend Studio?
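An ImportError like this usually means the directory containing `loaders` is not on the interpreter's module search path. As a rough diagnostic sketch (assuming Python 3; the helper name `describe_import` is made up for illustration), the script could report where it is looking:

```python
import importlib.util
import sys

def describe_import(module_name):
    """Return a short report on whether module_name can be found on sys.path."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return f"{module_name}: not found; searched {len(sys.path)} sys.path entries"
    return f"{module_name}: found at {spec.origin}"

# Report on the module from the question, then list every search location.
print(describe_import("loaders"))
for entry in sys.path:
    print("  ", entry)
```

If the directory that holds the `loaders` package is missing from the printed list, the import cannot succeed until the package is placed in one of the listed directories or its directory is added to the path.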


Accepted Solutions
WSyahirah21
Creator
Author

Hi xdshi,

Issue solved. To use the Python module in Talend, I did the following:

  1. Added modulefile.zip.tgz to the job's resources and unarchived it with tFileUnarchive.
  2. Set the Extraction Directory parameter to java.nio.file.Paths.get(context.pylibs).getParent().toString(), so that the archive is extracted into the parent directory of the path held in context.pylibs.
  3. Imported the Python module in the PySpark script as usual with an import statement.
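The steps above presumably work because the package ends up in a directory that Python already searches. The same idea can be sketched in plain Python, outside Talend. Everything here is illustrative: the archive name `modulefile.tgz`, the stand-in `loaders` package contents, and the helper names are assumptions, not part of the original job.

```python
import importlib
import os
import sys
import tarfile
import tempfile

def build_demo_archive(workdir):
    """Create a .tgz holding a tiny stand-in 'loaders' package (hypothetical)."""
    pkg_dir = os.path.join(workdir, "src", "loaders")
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w"):
        pass
    with open(os.path.join(pkg_dir, "ingestion.py"), "w") as f:
        f.write("def run():\n    return 'ingested'\n")
    archive_path = os.path.join(workdir, "modulefile.tgz")
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(pkg_dir, arcname="loaders")
    return archive_path

def extract_and_import(archive_path, module_name):
    """Extract the archive into its own parent directory, then import from there."""
    target_dir = os.path.dirname(os.path.abspath(archive_path))
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(target_dir)
    # Make sure the extraction directory is searched when importing.
    if target_dir not in sys.path:
        sys.path.insert(0, target_dir)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)

workdir = tempfile.mkdtemp()
archive = build_demo_archive(workdir)
ingestion = extract_and_import(archive, "loaders.ingestion")
print(ingestion.run())
```

In the Talend job, tFileUnarchive plays the role of `extract_and_import`'s extraction step, and the Extraction Directory expression corresponds to `target_dir`.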

 

Thanks for your response!


5 Replies
Anonymous
Not applicable

Hello,

Are you able to run your Python script successfully from a CMD window? Or is a Python module missing for your script?

So far, there are two ways to import and use an external JAR in Talend:

- Use tLibraryLoad to load the JAR in the job.

- Create a custom routine, import the JAR in the routine, and then call the routine in the job.

These may not be applicable to a Python module.

Best regards

Sabrina

 

 

WSyahirah21
Creator
Author

Hi Sabrina, it has been a while, but is it possible to use the Python module if I add the resource directory to the PYTHONPATH environment variable? I would like to try it out.

 

However, to try this, I need to put the Python module into the Talend resources directory on the server.

 

My question is: where can I find the directory of the following resource from the terminal?

[Image: 0695b00000QCEArAAP.png]

So far, I can only see the resources directory for each job execution, e.g.:

 

/data/Talend-7.3.1/jobserver/agent/TalendJobServersFiles/repository/INGESTION_20220402_130416_WoC1C/Ingestion_Dev/ingestion/ingestion_dev_0_13/resources/spark_jdbc_0.1.py
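For the PYTHONPATH idea, here is a small standalone sketch of how the variable affects imports in a freshly started Python process. It does not involve Talend or Spark; the stand-in `loaders` package, its `NAME` attribute, and the helper `run_with_pythonpath` are assumptions made for illustration.

```python
import os
import subprocess
import sys
import tempfile

def run_with_pythonpath(module_dir, statement):
    """Run a Python statement in a child process with module_dir on PYTHONPATH."""
    env = dict(os.environ)
    existing = env.get("PYTHONPATH")
    # Prepend module_dir, keeping any PYTHONPATH that is already set.
    env["PYTHONPATH"] = module_dir + os.pathsep + existing if existing else module_dir
    result = subprocess.run(
        [sys.executable, "-c", statement],
        env=env, capture_output=True, text=True,
    )
    return result.returncode, result.stdout.strip()

# Build a throwaway 'loaders' package (hypothetical contents) to import.
module_dir = tempfile.mkdtemp()
pkg_dir = os.path.join(module_dir, "loaders")
os.makedirs(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "ingestion.py"), "w") as f:
    f.write("NAME = 'jdbc-loader'\n")

code, output = run_with_pythonpath(
    module_dir, "from loaders import ingestion; print(ingestion.NAME)"
)
print(code, output)
```

The variable has to be set in the environment of the process that launches Python, so whether this helps in a Talend job depends on how the job server starts the script.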

Anonymous
Not applicable

Hello,

I'm not sure whether the Python module can be used by adding the resource to PYTHONPATH. I need to check and investigate this with our manager, and I will get back to you as soon as possible.

Best regards

Sabrina

Anonymous
Not applicable

Hello,

Thanks for letting us know that you resolved this issue yourself and for sharing the solution with us.

Best regards

Sabrina