gt0731
Contributor III

Greenplum - gpload process error

We have a small Greenplum cluster and are trying to run a merge operation using the tGreenplumGPLoad component, but the job fails with the error shown further down.
Environment details:

OS details:
Talend server - Windows Server 2012
Greenplum cluster - CentOS 7
Hadoop cluster - CentOS 7

I get the following error:
Exception in thread "Thread-1" java.lang.RuntimeException: Cannot run program "gpload": CreateProcess error=2, The system cannot find the file specified

Screenshot of the error:
0683p000009MDUP.png

Job flow settings on the tGreenplumGPLoad component:
0683p000009MDHj.png
0683p000009MDUU.png
The gpfdist program is running on the Greenplum master host:
$ ps -A | grep gpfdist
20071 pts/0    00:00:00 gpfdist
$


Do I need to copy the file from the local Windows machine where the Talend job runs to the remote Linux server that hosts the Greenplum master? Any suggestions on my current data flow would be a great help.
Current data flow:
                                       tgreenplumconnection
                                      |
Read from SQL server -->hdfs -->tmap-->tgreenplumGPload -->tgreenplumCommit
Q1: How do I get the source HDFS data into the directory served by gpfdist, so that the gpload merge operation can read it? We cannot use gphdfs because the goal is a gpload merge. Please suggest an alternative if there is one.
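One way to approach Q1, sketched here under assumptions rather than as a confirmed fix: copy the HDFS file to a local path on the machine that runs gpload, then point the control file at that local copy. The sketch below only builds the two commands as argument lists. It assumes a Hadoop client (`hdfs dfs -get`) is installed on that machine, and the HDFS source path is hypothetical. Inside a Talend job, an HDFS get step such as tHDFSGet could play the same role as the first command.

```python
import subprocess


def build_stage_and_load_commands(hdfs_path, local_path, control_file):
    """Return the staging and load commands, in the order they should run."""
    return [
        # Copy the file out of HDFS onto the local disk that gpfdist will serve.
        ["hdfs", "dfs", "-get", hdfs_path, local_path],
        # Run gpload against a control file whose FILE entry is local_path.
        ["gpload", "-f", control_file],
    ]


def run_all(commands):
    """Run each command in turn and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    commands = build_stage_and_load_commands(
        "/data/revenue/part-00000",  # hypothetical HDFS path
        "/home/gpadmin/demo/gp_RevenueReport_stg0.txt",  # local file seen in the gpload log
        "gpload.yml",
    )
    # Only print here; call run_all(commands) on a host that has both tools.
    for cmd in commands:
        print(" ".join(cmd))
```

Keeping the commands as lists avoids shell quoting problems if a path contains spaces.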

Checked: the following runs successfully on the Greenplum server.
$ gpload -f gpload.yml
2017-02-25 20:20:48|INFO|gpload session started 2017-02-25 20:20:48
2017-02-25 20:20:48|INFO|started gpfdist -p 8081 -P 8082 -f "/home/gpadmin/demo/gp_RevenueReport_stg0.txt" -t 30
2017-02-25 20:20:48|INFO|running time: 0.20 seconds
2017-02-25 20:20:48|INFO|rows Inserted          = 0
2017-02-25 20:20:48|INFO|rows Updated           = 3
2017-02-25 20:20:48|INFO|data formatting errors = 0
2017-02-25 20:20:48|INFO|gpload succeeded
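The gpload.yml used for this run is not shown in the thread. As a rough sketch only, the helper below renders a control file for a merge load in the usual gpload layout. The gpfdist port (8081), the data file path and the `|` delimiter are taken from the logs in this thread; the database, user, host names, target table and column names are hypothetical placeholders. LOCAL_HOSTNAME should be the machine where gpload runs, since gpload starts gpfdist there and the Greenplum segments read from it.

```python
def render_gpload_control(database, user, host, table, data_file,
                          match_columns, update_columns, local_hostname,
                          gpfdist_port=8081, master_port=5432, delimiter="|"):
    """Render a gpload control file (YAML text) for a merge load."""
    lines = [
        "VERSION: 1.0.0.1",
        f"DATABASE: {database}",
        f"USER: {user}",
        f"HOST: {host}",
        f"PORT: {master_port}",
        "GPLOAD:",
        "  INPUT:",
        "    - SOURCE:",
        "        LOCAL_HOSTNAME:",
        f"          - {local_hostname}",
        f"        PORT: {gpfdist_port}",
        "        FILE:",
        f"          - {data_file}",
        "    - FORMAT: text",
        f"    - DELIMITER: '{delimiter}'",
        "  OUTPUT:",
        f"    - TABLE: {table}",
        "    - MODE: merge",
        "    - MATCH_COLUMNS:",
    ]
    # Rows whose match columns equal an existing row are updated, others inserted.
    lines += [f"        - {col}" for col in match_columns]
    lines.append("    - UPDATE_COLUMNS:")
    lines += [f"        - {col}" for col in update_columns]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_gpload_control(
        database="demo",                    # hypothetical
        user="gpadmin",
        host="mdw",                         # hypothetical master host name
        table="public.revenue_report",      # hypothetical target table
        data_file="/home/gpadmin/demo/gp_RevenueReport_stg0.txt",
        match_columns=["report_id"],        # hypothetical key column
        update_columns=["amount"],          # hypothetical updated column
        local_hostname="etl-host",          # hypothetical, the host running gpload
    ))
```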

Main cause:
The Greenplum database server (Linux) is remote from the Talend ETL server (Windows), and the error occurs when I run the job from the Windows server. I am also unable to configure the tGreenplumGPLoad component correctly.




Any help would be much appreciated. Thanks in advance.
11 Replies
Anonymous
Not applicable

Hi,
Could you please indicate which build version you are seeing this issue on? Are you able to load the same file into GPDB by running the gpload utility from the command line?
Best regards
Sabrina
gt0731
Contributor III
Author

Hello xdshi,
Yes, I am able to load the file from the Greenplum database server, and I am also able to point to the external table from the ETL host.
However, I am not able to write data to the target table: the insert fails from tGreenplumGPLoad.
Here are the details I posted, with screenshots:
https://www.talendforge.org/forum/viewtopic.php?id=56114

Env:
Greenplum loader tools for Windows 4.3 - gpload version 4.3.8.1 build 1
Python - 2.5.4, 64-bit
Talend - Windows Server 2012 R2

I am looking for a way to use tGreenplumGPLoad, because it is not inserting any records. The job does not even report an error: it completes with exit code 0.



Also, when I added a breakpoint on the tGreenplumGPLoad component, I got this:

Starting job gpload_test at 07:45 03/03/2017.


connecting to socket on port 4007
connected
Exception in thread "Thread-1" java.lang.RuntimeException: Cannot run program "gpload": CreateProcess error=2, The system cannot find the file specified
at bigdata.gpload_test_0_1.gpload_test$2.run(gpload_test.java:848)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
at java.lang.ProcessImpl.create(Native Method)
at java.lang.ProcessImpl.<init>(ProcessImpl.java:386)
at java.lang.ProcessImpl.start(ProcessImpl.java:137)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
at java.lang.Runtime.exec(Runtime.java:620)
at java.lang.Runtime.exec(Runtime.java:528)
at bigdata.gpload_test_0_1.gpload_test$2.run(gpload_test.java:836)
disconnected
Job gpload_test ended at 07:45 03/03/2017.
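A possible reading of this trace, offered as a guess rather than a confirmed diagnosis: the job asks Windows to start a program literally named "gpload". As far as I know, Java's Runtime.exec on Windows goes through CreateProcess, which appends .exe when the name has no extension but does not try .py, .bat or .cmd the way the command prompt does. The small diagnostic below lists what named gpload is actually visible on PATH, and which of those hits such a launch could pick up. The extension list and the "only .exe is directly launchable" rule are simplifying assumptions.

```python
import os


def find_candidates(name, path_value, extensions=("", ".exe", ".bat", ".cmd", ".py")):
    """List existing files called name+ext in each directory of a PATH string."""
    found = []
    for directory in path_value.split(os.pathsep):
        if not directory:
            continue
        for ext in extensions:
            candidate = os.path.join(directory, name + ext)
            if os.path.isfile(candidate):
                found.append(candidate)
    return found


def directly_launchable(candidates):
    """Keep only the hits a bare 'gpload' launch could resolve (assumed: .exe only)."""
    return [c for c in candidates if c.lower().endswith(".exe")]


if __name__ == "__main__":
    hits = find_candidates("gpload", os.environ.get("PATH", ""))
    print("found on PATH:", hits)
    print("launchable as plain 'gpload':", directly_launchable(hits))
```

If the first list is empty, the loader's bin directory is probably missing from the PATH that the Talend job sees. If it contains only gpload.py or gpload.bat, the name used by the job may need to resolve to one of those explicitly.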


Please, I need your direction on this. Thanks.
gt0731
Contributor III
Author

I noticed the following on the ETL server: I am able to run this command from the Windows command prompt:
c:\>gpload.py -f gpload.yml
The output shows:
c:\>gpload.py -f gpload.yml
2017-03-04 06:03:07|INFO|gpload session started 2017-03-04 06:03:07
2017-03-04 06:03:07|INFO|started gpfdist -p 8081 -P 8082 -f "C:/gp_RevenueReport_stg0.txt" -t 30
WARNING:  nonstandard use of \\ in a string literal
LINE 1: ...tg0.txt') format'text' (delimiter '|' null '' escape '\\' )
                                                                ^
HINT:  Use the escape string syntax for backslashes, e.g., E'\\'.
2017-03-04 06:03:07|INFO|running time: 0.23 seconds
NOTICE:  table "temp_staging_gpload_25e5cb21_00ca_11e7_b077_bc764e20d911" does not exist, skipping
2017-03-04 06:03:08|INFO|rows Inserted          = 20
2017-03-04 06:03:08|INFO|rows Updated           = 200
2017-03-04 06:03:08|INFO|data formatting errors = 0
2017-03-04 06:03:08|INFO|gpload succeeded


However, when I try to merge the data through tGreenplumGPLoad, it fails to merge the data into the target table.
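Since the Talend job can finish with exit code 0 while loading nothing, it may help to compare row counts rather than exit codes. The sketch below parses the summary lines of a gpload log in the format shown in the output above. The function name and the returned dictionary keys are my own choices, not part of gpload.

```python
import re

# Matches summary lines such as "...|INFO|rows Inserted          = 20".
SUMMARY_RE = re.compile(
    r"\|INFO\|(rows Inserted|rows Updated|data formatting errors)\s*=\s*(\d+)"
)


def parse_gpload_summary(log_text):
    """Extract the row counts and the success marker from gpload log text."""
    summary = {label: int(value) for label, value in SUMMARY_RE.findall(log_text)}
    summary["succeeded"] = "|INFO|gpload succeeded" in log_text
    return summary


SAMPLE_LOG = """\
2017-03-04 06:03:08|INFO|rows Inserted          = 20
2017-03-04 06:03:08|INFO|rows Updated           = 200
2017-03-04 06:03:08|INFO|data formatting errors = 0
2017-03-04 06:03:08|INFO|gpload succeeded
"""

if __name__ == "__main__":
    summary = parse_gpload_summary(SAMPLE_LOG)
    loaded = summary.get("rows Inserted", 0) + summary.get("rows Updated", 0)
    print(summary)
    print("rows touched:", loaded)
```

Running the same check on the log produced by the Talend job, if one is written, would show whether gpload ran at all or ran and touched zero rows.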
Anonymous
Not applicable

Hi,

I am facing an error in the Greenplum gpload process.

Can you please help with this? Thanks in advance.



Please find the attached screenshot: 0683p000009M5fB.png

Anonymous
Not applicable

Hi all,

I am facing an issue with GPLoad in Talend.

Can you please help with this?

Thanks in advance.

0683p000009M5QW.png

Anonymous
Not applicable

Hello,

Could you please specify the version of Greenplum you are using?

Please refer to the online installation guide in the Talend Help Center: Supported systems, databases and business applications by Talend components. The Greenplum entry reads:

Greenplum | 4.2.1.0 | Windows (client only) + Linux

Best regards

Sabrina

Anonymous
Not applicable

Hi Sabrina,

Thanks for the quick reply.

I am using the following Greenplum version:

 

"(Greenplum Database 4.3.29.0 build 1) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Aug 22 2018 23:17:57"

 

Thanks

Ramesh

 

Anonymous
Not applicable

Hello,

Greenplum Database 4.3.29.0 is not listed in the supported table.

In the documentation we provide a list of databases that are supported, in the sense that we provide an SLA and technical support for them. This doesn't mean that other (non-listed) databases will not work, only that we won't necessarily be equipped to help you with any issue you may face with them.

Best regards

Sabrina

Anonymous
Not applicable

Hi Sabrina,

I am able to load data into the current database version.

The problem here is the tGreenplumGPLoad component.

Do we need to set any parameters in that component?

Thanks,

Ramesh

