
Recommended ETL Tools?
hey,
I've been working as an analyst for a while now, often doing consulting projects using Tableau & Llamasoft (supply chain optimization software).
As consultants, we are constantly working with new data sources (SQL Server, MySQL, Excel, Access, CSV, Progress OpenEdge, etc., and these can and will continue to change). We haven't really invested in a dedicated ETL tool because we have the feeling that if we set up a process for one project, we won't really be able to reuse it on the next.
After talking with some people, we are realizing that this opinion isn't necessarily true.
We've done some testing with Alteryx and we also know of Llamasoft's offering (Data Guru), but unfortunately we are pretty tight on budget, so I wanted to ask you guys: what tools do you use? Do you think Alteryx is worth it in this situation? Are there free alternatives that you think would help us? (Pentaho?)
Hope you guys can help. Thanks for reading.

For small projects, the free version of Talend is the best. It's more than a simple ETL tool that connects to databases: it can also handle FTP, manage files, and so many other things…
For big projects or for daily runs, choose the pro version.
Good luck.

- Talend Open Studio or Pentaho
- Apache NiFi
- Python
- R
- Trifacta
- SAS
- PowerCenter
However, Talend is my preferred tool: it has an Eclipse-based framework plus Java, and a wide range of readily available components. If a component isn't there, Java coding comes to the rescue to develop your own components or custom routines... all for free and properly documented. Last but not least, if you want to change stuff, download its source and recompile...
If your work is more data science stuff, I'm not a huge fan of Talend; I tend to use Python and R and the available modules and packages... but the neat thing is... you can combine them with Talend and have a hybrid flow. Also, web crawling/scraping and statistics... not in Talend.
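To give a rough idea of what the Python side of such a flow can look like, here is a minimal extract/transform/load sketch using only the standard library. The column names (`sku`, `qty`), the `shipments` table, and the sample data are made up for illustration, and a real project would more likely read from files or database connections than from an inline string.

```python
import csv
import io
import sqlite3


def extract(csv_text):
    """Extract: read CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Transform: trim whitespace, normalize SKUs, drop rows without a quantity."""
    cleaned = []
    for row in rows:
        qty = row["qty"].strip()
        if not qty:
            continue  # skip incomplete rows instead of guessing a value
        cleaned.append((row["sku"].strip().upper(), int(qty)))
    return cleaned


def load(conn, records):
    """Load: write the cleaned records into a (hypothetical) shipments table."""
    conn.execute("CREATE TABLE IF NOT EXISTS shipments (sku TEXT, qty INTEGER)")
    conn.executemany("INSERT INTO shipments VALUES (?, ?)", records)
    conn.commit()


def run_pipeline(csv_text, conn):
    """Run all three steps and return total quantity per SKU."""
    load(conn, transform(extract(csv_text)))
    query = "SELECT sku, SUM(qty) FROM shipments GROUP BY sku ORDER BY sku"
    return conn.execute(query).fetchall()


# Made-up sample data with the kind of mess client files tend to have.
SAMPLE = "sku,qty\n a1 ,3\nb2,\na1,4\nb2, 5 \n"
```

Calling `run_pipeline(SAMPLE, sqlite3.connect(":memory:"))` runs the whole thing against an in-memory database. Because each step is a plain function, swapping the CSV source for another client's SQL Server or Excel export only means replacing `extract`.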
Regarding streams and big data... recently (over the last year) I have been moving towards Apache NiFi for real-time integration with Kafka, message queues, etc., building data pipelines directly into streams and having my schema registry, like Avro, outside this ecosystem.
One thing to keep in mind... separation of concerns... don't go for just one tool...
My setup: Talend (Java) with Python and R.
No vendor/license lock.
I have been using Talend for almost 10 years and, besides analytics, from a technical point of view it has always done the job.
However, if you need more data wrangling... Alteryx or Trifacta... both are worth the money.
