Anonymous
Not applicable

Recommended ETL Tools?

Hey,

I've been working as an analyst for a while now, often doing consulting projects using Tableau & Llamasoft (supply chain optimization software).

As consultants, we are constantly working with new data sources (SQL Server, MySQL, Excel, Access, CSV, Progress OpenEdge, etc.), and these can and will continue to change. We haven't really invested in a dedicated ETL tool because we have the feeling that if we set up a process, we won't really be able to reuse it on the next project.

After talking with some people, we are realizing that this assumption isn't necessarily true.

We've done some testing with Alteryx and we also know of Llamasoft's offering (Data Guru), but unfortunately we are pretty tight on budget. So I wanted to ask you guys: what tools do you use? Do you think Alteryx is worth it in this situation? Are there free alternatives that you think would help us (Pentaho, for example)?

Hope you guys can help. Thanks for reading.

2 Replies
fdenis
Master

I have been working with ETL for 10 years.
For small projects, the free version of Talend is the best. It is more than a simple ETL tool that connects to all kinds of databases: it can also handle FTP, manage files, and do many other things.
For big projects or for daily runs, choose the pro version.
Good luck.
Jesperrekuh
Specialist

That's a difficult one. I'm a consultant too, and I use whatever suits my needs and fits my client(s):
- Talend Open Studio or Pentaho
- Apache NiFi
- Python
- R
- Trifacta
- SAS
- Powercenter

However, Talend is my preferred tool, because of its Eclipse-based framework, Java, and the wide range of available components. If a component isn't there, Java coding comes to the rescue: you can develop your own components or custom routines, all for free and properly documented. Last but not least, if you want to change something, you can download the source and recompile it.

If your work is more data science oriented, I'm not a huge fan of Talend; I tend to use Python and R and their available modules and packages. The neat thing is that you can combine them with Talend and have a hybrid flow. Web crawling/scraping and statistics are also things I would not do in Talend.
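As an illustration of the hybrid flow described above, here is a minimal Python sketch. It assumes the ETL job exports a CSV file, hands it to a Python step for the statistics, and loads the resulting CSV back in. The function names `summarize` and `run` and the column names in the usage note are hypothetical; nothing here is Talend-specific, and it uses only the Python standard library.

```python
import csv
import statistics


def summarize(rows, group_key, value_key):
    """Group rows by group_key and compute count, mean and median of value_key."""
    groups = {}
    for row in rows:
        # Values arrive as strings from the CSV reader, so convert them first.
        groups.setdefault(row[group_key], []).append(float(row[value_key]))
    return [
        {
            group_key: name,
            "count": len(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
        }
        for name, values in sorted(groups.items())
    ]


def run(in_file, out_file, group_key, value_key):
    """Read CSV from in_file and write the per-group summary as CSV to out_file."""
    summary = summarize(csv.DictReader(in_file), group_key, value_key)
    writer = csv.DictWriter(
        out_file, fieldnames=[group_key, "count", "mean", "median"]
    )
    writer.writeheader()
    writer.writerows(summary)
    return summary
```

An ETL job could, for example, write an extract with `region` and `amount` columns, call `run` on open file handles for the extract and the result file (opened with `newline=""`, as the `csv` module recommends), and then pick the result file up in its next step. Keeping the statistics in a separate script also fits the separation of concerns mentioned further down.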

Regarding streams and big data: over the last year I have been moving towards Apache NiFi for real-time integration with Kafka, message queues, etc., building data pipelines directly into streams and keeping my schema registry (Avro, for example) outside this ecosystem.

One thing to keep in mind: separation of concerns. Don't go for just one tool.
My setup: Talend (Java) with Python and R.
No vendor or license lock-in.
I have been using Talend for almost 10 years; analytics aside, from a technical point of view it has always done the job.

However, if you need more data wrangling, Alteryx or Trifacta are both worth the money.