I'm trying to run a Spark job that uses a module imported from a Python script. The job works when I run it from a terminal, because I launch it from the same directory where the Python script (the library) is stored.
However, when I run the Spark job in TOS, it returns this error:
from filename import full_ingestion
ImportError: No module named filename
I assumed that Talend can't find the library, so I used tLibraryLoad to import the new library into TOS and set it up as below:
In tLibraryLoad_1:
Basic settings: filename.py (this is the library script I wrote in Python)
Advanced settings: import full_ingestion (full_ingestion is a function in filename.py)
In the tSystem component:
-> In the Spark job script, I put: import full_ingestion
However, I received this error.
Is this the correct way to import a library written as a Python script into a Spark job in TOS?
Hi, I think tLibraryLoad is for Java libraries. If you want to execute a Python script, you will have to use the tSystem component and use a command to run the script from it.
Do you know how I can use a Python module and import it into a script? I'm currently using the tSystem component to run the Python script, but the script needs to import the Python module first.
Example:
from python_filename import anylibrary
You have to reproduce the command you use in the case you described: 'This job works when I run it in terminal as I run the spark job in the same directory where the python script (for library) is stored'.
I also found this workaround, maybe it can help you:
Maybe you don't have to install it in that specific folder, just in the plugin folder.
#1 The run in the terminal succeeds because I put the Python module script in the same directory. However, I think Talend Studio uses a different directory than the terminal. The directory in TOS is:
/data/Talend-7.3.1/jobserver/agent/TalendJobServersFiles/repository/filepath_20220124_173021_mM8rA/jobname/subfilepath/jobname_0_1/resources/. So I'm not sure how I should run the Python script for the module, or how I should pass the module to the Spark job script.
#2 That solution doesn't apply to my case: there, the library can simply be installed with pip install unidecode, whereas mine is a Python script I wrote myself.
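One possible way around the directory mismatch described in #1 is to stop relying on the current working directory and add the module's directory to sys.path explicitly before the import. The following is only a minimal sketch: the helper name import_from_dir is hypothetical, and it assumes filename.py can be read from a directory you know at run time (for example, the directory of the Spark job script itself). If the function also has to run on Spark executors, shipping the file with spark-submit --py-files may be needed as well; that part is not shown here.

```python
import importlib
import os
import sys


def import_from_dir(module_name, module_dir):
    """Import module_name after putting module_dir on sys.path.

    Hypothetical helper: module_dir is whatever directory actually
    contains the .py file (e.g. filename.py from the question).
    """
    module_dir = os.path.abspath(module_dir)
    if module_dir not in sys.path:
        # Prepend so this directory is searched before other locations.
        sys.path.insert(0, module_dir)
    # Make sure files created after interpreter start-up are noticed.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# Possible use inside the Spark job script, assuming filename.py sits
# next to that script:
#   here = os.path.dirname(os.path.abspath(__file__))
#   full_ingestion = import_from_dir("filename", here).full_ingestion
```

With this approach the tSystem command can be launched from any working directory, since the lookup no longer depends on where the job server starts the process.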
Hi,
I have a CSV file:
name; FirstName ; numero;adresse
1 a;b;12;xx
2 c;b;13;yy
3 x;y;47;zz
4 e;r;45;tt
I want to identify the duplicate rows by FirstName (here rows 1 and 2, where FirstName = b),
and I want to update the CSV file to add a new column "status", with the value NotUnique for those rows, like this:
name; FirstName ; numero;adresse;status
1 a;b;12;xx;NotUnique
2 c;b;13;yy;NotUnique
3 x;y;47;zz;unique
4 e;r;45;tt;unique
Can someone help me?
Thank you very much in advance.
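Outside of Talend components, the logic being asked for can be sketched in a few lines of Python: count how often each FirstName occurs, then append a status value per row. This is only an illustration under stated assumptions: the function name add_status_column is hypothetical, the leading 1 to 4 in the sample are taken to be row labels and not part of the data, and every row is assumed to use ";" as the only separator.

```python
import csv
import io
from collections import Counter


def add_status_column(csv_text, key="FirstName", delimiter=";"):
    """Return csv_text with a 'status' column marking duplicate key values."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    # Strip stray spaces such as the ones around ' FirstName ' in the header.
    rows = [[cell.strip() for cell in row] for row in reader if row]
    header, data = rows[0], rows[1:]
    key_index = header.index(key)

    # First pass: how many rows share each key value.
    counts = Counter(row[key_index] for row in data)

    # Second pass: write every row back with its status appended.
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerow(header + ["status"])
    for row in data:
        status = "NotUnique" if counts[row[key_index]] > 1 else "unique"
        writer.writerow(row + [status])
    return out.getvalue()


sample = (
    "name; FirstName ; numero;adresse\n"
    "a;b;12;xx\n"
    "c;b;13;yy\n"
    "x;y;47;zz\n"
    "e;r;45;tt\n"
)
print(add_status_column(sample))
```

Two passes are used because a row can only be labelled once all rows with the same FirstName have been seen.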
Dear friends,
I am trying to run a Python script in TAC,
and I am getting this error. Please help me out; it is very critical for me.
Traceback (most recent call last):
File "C:\Users\Talenderuser1\Desktop\Talend\python\LuLu_Client_machine.py", line 1, in <module>
from bs4 import BeautifulSoup
ModuleNotFoundError: No module named 'bs4'
Your response is very valuable to me.
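A ModuleNotFoundError for bs4 usually means the interpreter that ran the script does not have the beautifulsoup4 package (which provides bs4) installed. When a scheduler such as TAC launches the script, that interpreter may differ from the one used in an interactive terminal; this is an assumption about the setup above, not something the traceback confirms. A small diagnostic sketch, with hypothetical helper names, that can be placed at the top of the script to check both points:

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the running interpreter can locate a top-level module."""
    return importlib.util.find_spec(name) is not None


def describe_environment(name):
    """Summarise which interpreter is running and whether it sees the module."""
    return "interpreter={} module={} available={}".format(
        sys.executable, name, module_available(name)
    )


print(describe_environment("bs4"))
```

If the output reports available=False, installing the package with that same interpreter (python -m pip install beautifulsoup4, using the interpreter path printed above) is the usual fix.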