Dawn2
Contributor II

Write multiple queries in tHiveInput

Hi,
Does anyone know whether it is possible to write multiple queries in the tHiveInput component? For example, I have a series of queries: create a table, run the main query, then drop the table.
 
If I put a semicolon between the two queries, I get one of two error messages: "missing EOF", or "cannot recognize input near ''<1st query tail>''  ';'  '<2nd query head>' in expression specification".
 
Thanks!
Dawn
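
The errors quoted above are consistent with the whole text, semicolon included, reaching the Hive parser as a single statement. A minimal sketch, in Python purely for illustration, of splitting such a script into statements that can then be submitted one at a time. The helper name `split_statements` is hypothetical, the column names and WHERE clause are made up (the thread does not show the real queries), and the splitter deliberately ignores comments and escaped quotes:

```python
def split_statements(script: str) -> list[str]:
    """Split a SQL script on semicolons that sit outside quoted strings."""
    statements = []
    current = []
    quote = None  # the quote character we are currently inside, if any
    for ch in script:
        if quote:
            if ch == quote:
                quote = None  # closing quote found
            current.append(ch)
        elif ch in ("'", '"'):
            quote = ch  # opening quote
            current.append(ch)
        elif ch == ";":
            stmt = "".join(current).strip()
            if stmt:
                statements.append(stmt)
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


# Hypothetical script shaped like the three queries in the question.
script = """
CREATE TABLE temp AS SELECT id FROM B WHERE note = 'a;b';
SELECT * FROM A JOIN temp ON A.id = temp.id;
DROP TABLE temp;
"""
for stmt in split_statements(script):
    print(stmt)
```

Each returned string contains no statement separator, so it can be handed to a component or driver call that expects exactly one statement.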

4 Replies
SachinD
Creator

Start with a tFixedFlowInput component in which each row holds one Hive statement.

Connect it to a tHiveRow component and set the SQL text area to row1.sqlStmt. The String column in tFixedFlowInput should be named "sqlStmt".

 

Then use tHiveInput for the final select query.
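
The flow described in this reply can be sketched outside Talend as follows. This is only an analogy under stated assumptions: Python's built-in sqlite3 stands in for Hive so the sketch runs anywhere, and the tables `A`, `B`, `temp_ids` and their columns are made up for illustration.

```python
import sqlite3

# Stand-in for the Hive connection; sqlite3 is used only so the sketch runs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE B (id INTEGER)")
conn.executemany("INSERT INTO A VALUES (?, ?)", [(1, "x"), (2, "y")])
conn.execute("INSERT INTO B VALUES (1)")

# Analogue of tFixedFlowInput: one row per statement, in a column named sqlStmt.
rows = [
    {"sqlStmt": "CREATE TABLE temp_ids AS SELECT id FROM B"},
]

# Analogue of tHiveRow: execute the single statement carried by each row.
for row in rows:
    conn.execute(row["sqlStmt"])

# Analogue of tHiveInput: the final SELECT, whose rows flow on to the output.
result = conn.execute(
    "SELECT A.id, A.name FROM A JOIN temp_ids ON A.id = temp_ids.id"
).fetchall()
print(result)  # [(1, 'x')]
```

The point of the pattern is that every statement is sent on its own, so no semicolon ever reaches the parser.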

Dawn2
Contributor II
Author

Dear SachinD,

Thank you for your reply!

 

Do you mean the workflow will be tFixedFlowInput --> tHiveRow --> tHiveInput? Suppose my queries are:

1. create table temp as ...

2. select * from A join temp ....

3. drop table temp

 

Can you give more details about it?

 

Thanks,

Dawn

SachinD
Creator

 

tPreJob --> tHiveConnection --> tHDFSConnection

 

tHiveRow (1. create table temp as ...) --> tHiveLoad (load data from the HDFS file into the temp table, if your file is in an HDFS location) --> OnSubjobOk --> tHiveInput (2. select * from A join temp ....)

 

tPostJob --> tHiveRow (3. drop table temp)

 

We can drop the temp table in tPostJob, or we can create it with a CREATE TEMPORARY TABLE statement, in which case it is dropped automatically when the Hive session ends.

 

Thanks,

Sachin
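
The create, query, drop layout above can be sketched as follows. Again this is only an analogy: sqlite3 stands in for Hive, the tables and columns are made up, and the `try`/`finally` models the assumption that the tPostJob branch runs even if the main subjob fails.

```python
import sqlite3


def make_demo_connection():
    # sqlite3 stands in for Hive; tables A and B and their columns are made up.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE A (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE B (id INTEGER)")
    conn.executemany("INSERT INTO A VALUES (?, ?)", [(1, "x"), (2, "y")])
    conn.execute("INSERT INTO B VALUES (1)")
    return conn


def table_names(conn):
    # Names of the permanent tables currently in the database.
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return sorted(name for (name,) in rows)


def run_job(conn):
    # Step 1 (tHiveRow): create the helper table.
    conn.execute("CREATE TABLE temp_ids AS SELECT id FROM B")
    try:
        # Step 2 (tHiveInput): main query against the helper table.
        return conn.execute(
            "SELECT A.id, A.name FROM A JOIN temp_ids ON A.id = temp_ids.id"
        ).fetchall()
    finally:
        # Step 3 (tPostJob --> tHiveRow): always clean up the helper table.
        conn.execute("DROP TABLE temp_ids")


conn = make_demo_connection()
report_rows = run_job(conn)
print(report_rows)        # [(1, 'x')]
print(table_names(conn))  # ['A', 'B']
```

With CREATE TEMPORARY TABLE the explicit drop step becomes optional, but the table then only exists for the session that created it, so the create and select steps would have to share one connection.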

 

Dawn2
Contributor II
Author

Dear Sachin,

Sorry for the late reply! I have been busy with my projects for the last couple of days.

 

I think your approach focuses on loading external files into the database with Talend components and dropping them after the query. My main concern is how to create, keep, and drop temp tables in the database through Talend, so that I can reuse them several times in the main query, which should improve performance.

 

For now I use Bash shell scripts to load external txt/csv files from the local machine into the Hive database and to create temp tables there. I then run the main query against those tables, output the txt/csv/xlsx report through Talend, and finally drop the temp tables (whether they came from external sources or from Hive tables) with Bash shell scripts. I am sure this is not the best way to handle it...

 

Dawn