Anonymous
Not applicable

Datasets & hashfiles in Talend

Hi,

We are in the process of migrating our DataStage 8.7 jobs to Talend.

Is there any component in Talend similar to the 'datasets' and 'hashfiles' available in DataStage?

Regards,

Amirtharaj.R

3 Replies
Anonymous
Not applicable
Author

You really need to give us a little more info on what "datasets" and "hashfiles" are and what they do with regard to Datastage. Chances are Talend will be able to handle the functionality they provide out of the box. If not, the major advantage of Talend is that you can write your own functionality (or include that of others) using Java.

Anonymous
Not applicable
Author

Datasets are an internal file format in DataStage. They can be used as intermediate files for lookups and other operations, and to manage data within a job. Because dataset files are stored in a binary format, reading from and writing to them is comparatively fast.

For example, when you need to do a lookup against a large table, you can write the data to a dataset file once and reuse it in other jobs, rather than selecting the data from the DB again.

Anonymous
Not applicable
Author

You can use the tHashOutput and tHashInput components for this sort of thing in Talend. They hold the data in memory, so how much you can cache depends on the memory you have: if you want to store gigabytes of data in memory, your machine will need that much available. In return, it is very quick. There are other ways to improve performance by removing the latency of DB lookups, but the tHash components are the first that come to mind. You also have to consider that with Talend you have every Java API available to you, so finding alternatives is easy if necessary.
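
To make the idea concrete, here is a rough sketch in plain Java of the general pattern an in-memory hash lookup follows: read the reference data once, keep it in a map keyed on the lookup column, then enrich the main rows from memory instead of querying the DB per row. This is not how the tHash components are implemented internally. The class name `LookupCacheSketch`, the methods `buildLookup` and `enrich`, and the sample customer/order rows are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical names and sample data, not Talend code.
public class LookupCacheSketch {

    // Load the reference rows once into a hash map keyed on the lookup column.
    // Each row is {id, name}; in a real job these would come from the large table.
    static Map<Integer, String> buildLookup(List<String[]> referenceRows) {
        Map<Integer, String> lookup = new HashMap<>();
        for (String[] row : referenceRows) {
            lookup.put(Integer.parseInt(row[0]), row[1]);
        }
        return lookup;
    }

    // Enrich each main-flow row {orderId, customerId} from memory.
    // Rows with no match get "UNKNOWN" rather than being dropped.
    static List<String> enrich(List<int[]> orders, Map<Integer, String> lookup) {
        List<String> out = new ArrayList<>();
        for (int[] order : orders) {
            String name = lookup.getOrDefault(order[1], "UNKNOWN");
            out.add(order[0] + "," + name);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> reference = new ArrayList<>();
        reference.add(new String[] {"1", "Alice"});
        reference.add(new String[] {"2", "Bob"});

        List<int[]> orders = new ArrayList<>();
        orders.add(new int[] {100, 1});
        orders.add(new int[] {101, 3}); // customer 3 is not in the reference data

        Map<Integer, String> lookup = buildLookup(reference);
        for (String line : enrich(orders, lookup)) {
            System.out.println(line);
        }
    }
}
```

The trade-off is the one mentioned above: the whole reference set lives in memory, so the map is only as big as your heap allows.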