Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi there,
[note: i am using Talend open studio for BigData v7.1]
I have this use case where I need to grab an SQL script file and execute it against MySQL at run time and then export that query result set to an Excel sheet.
I was able to do this by putting in the SQL statement in tFileInput etc.. and also exporting the results of that query to an excel sheet.
BUT i have multiple SQL script files, how would i grab all the SQL script files from a folder on disk and execute them sequentially to MySQL then export the results of each query to an Excel sheet in one Excel workbook.
So i am lost in terms of how to Loop/Iterate thru these files.
Hope someone can help me out with this.
Hi,
This flow should ideally satisfy your need under the assumption that the schema is same for all the different queries. If the schema for each query is different, you will not be able to use below method as Talend Studio is running on schema based logic.
First of all, please have the queries ready in the delimited file along with a query id. If you do not have a query id in your input file, you can even generate a numeric sequence and convert the value to String using tConvertType (from Integer to String).
Now, your tFlowtoIterate will have two columns query and query_id.
In the tMySQLInput, add the query variable (use Control+Space to get the list of variables under tFlowtoIterate and select the right variable).
In the output Excel, provide the query id variable to Sheet attribute.
This will make sure that the output from each query will be going to different sheets.
Hope I have answered your query. Please spare a second to mark the topic as resolved 🙂
Warm Regards,
Nikhil Thampi
Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂
Yes, the queries are all different SQL statements against the very same table.
Here are my SQL queries in two separate files:
QUERY #1: file #1
"SELECT * FROM mybanktable;"
QUERY #2: file #2
"SELECT job, SUM(balance)
FROM mybanktable
GROUP BY job;"
But I don't understand why you are using a tFileInputDelimited.
And if we do use it, what is the structure of this Delimited file, is it:
'query id, SQL Statement'
My SQL query contains comma's which can be problematic i think.
I have used other component before to directly execute SQL file.
I cannot make changes to the SQL file as you suggest.
Hope there is a better way than you suggested.
Paul
Hi Paul,
Your schema is different for each query. So you cannot use the above approach since the assumption for the above method is that schema remains same.
In your case, you will have to handle the processing through separate flows.
Warm Regards,
Nikhil Thampi
Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂
Hi nikhilthampi,
I don't understand how you could misunderstand!
I have only one table, it has a fixed schema.
And i am executing 2 different queries against the same table.
SO i don't understand how you can say the schema is different.
Please take a look again, its quite straight forward.
Hi @Paluee
The queries you had given in your earlier post is as shown below.
QUERY #1: file #1
"SELECT * FROM mybanktable;"
QUERY #2: file #2
"SELECT job, SUM(balance)
FROM mybanktable
GROUP BY job;"
You have not specified the column list for the table "mybanktable" and you are using Select *. So I am not able to identify the list of columns used for this table. Now, the structure of the second table is job, sum(balance) which means you have two columns.
So I am confused when you are saying both table structures are one and same. Could you please share more details to these queries?
Warm Regards,
Nikhil Thampi
Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂
Hi again,
Like i mentioned, the table is fixed, there are like maybe 15 columns.
I was able to do one query to select all the data and put it into a sheet in Excel.
Then i went back to change the query and changed to append a new sheet in excel and did a aggregate group by type query that brought in the name and the sum of sales for example for each salesman.
I am able to do this one at a time, stopping and making changes then running again, BUT with ETL tools the point is to be able to automate this. This use case in my opinion is a common use case, one just needs to be able to iterate thru the sql files in a folder on disk, load this in memory somewhere, and then iterate thru these SQL file/statements and send to the database and have the 2 results returned one by one and place each one of these results in a separate sheet in an Excel workbook.
I put the 2 queries in the database metadata under : DB Connections __ Queries where i named one of the queries
1) myQuery - this is the first query where i do a Select All on the table
2) myResult - this is second query that does a aggregation with groupby, essentially a pivot table of results on sales per salesman.
If there is some way to iterate thru the metadata query section, that would be another way.
I have some images attached of my project, see attached images.
Hi there,
I already replied to this, but then something went wrong here, not sure what happened, but my reply may appear twice, because i currently don't see my original reply, so i will repeat it again.
---------------------------------
The table in MySQL is a fixed table, has maybe around 15 columns. The amount of columns is not important, just that there are a fixed amount of columns.
I also place these 2 queries in the database connection Query section, and in the tree diagram under query the two queries are named:
1) myQuery - This is the Select All query
2) myResults - this is the second query with aggregation and groupby operations returning the some of sales of each salesman over a the period that the data provides.
In my Talend flow i could execute each one, separately, and appending the results to different sheets in an Excel workbook.
But each time i had to stop and change to other query etc...
I have to manually intervene.
This type of thing should be easily automated via loops or repetition of some kind.
I have attached images of my talend project screen shots, it fairly simple, not complicated at all.
See attached files.
Hi,
Thanks for the details and screen shots.
You are trying to populate your saved queries dynamically to populate the result set from each query to same output location. If you have to do the dynamic switch of queries during runtime, you will have to pass them as arguments (may be from a file or DB) to your existing tDBInput component.
If you observe my first post in this chain, I had already replied the method to do it where I am reading the query dynamically from a file (you can change this to any source you want) and then passing the query information to subsequent components. Did you get a chance to try to that method?
Warm Regards,
Nikhil Thampi
Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂
OK, so i see on the first post, you have your input file component that allows you to reference queries it seems.
I have not tried that yet.
Also the tFlowtoIterate, when one has more than one query does that automatically iterate thru all the sql script files?
Also just want to mention that i noticed this component: tSQLScriptParser
---------------------------------------------------------------------------------------------
https://github.com/jlolling/talendcomp_tSQLScriptParser
-----------------------------------------------------------------------------------------------
I am wondering if this component can do this task more easily.
Hope that the developer of this component, by the name of Jan Lolling: jlolling
Solution Architect. Talend addicted. , can provide his input on this.
Any help on this appreciated.