Solved: Re: Delete data in Elasticsearch before loading - Qlik Community

bradsheridan · ‎2018-11-14

Morning Community! Here's another one for you...

We are loading ESS (AWS Elasticsearch Service) via a Big Data Batch job in Talend Studio 7.0.1. What is the best practice for deleting all existing documents in an index before re-loading it?

For example, if we run the job and load 2000 'records' (documents in ESS terminology), the next time the job runs we want to remove them all first.

Other components, such as the Redshift Output one, have a drop-down to select update/delete/insert/truncate/etc., but this functionality doesn't exist on the ESS Output component.

Any insight is greatly appreciated.

thanks

bradsheridan · ‎2018-11-14

Hi again... Besides posting my question here in the Community, I also reached out to Talend directly. Here is their response:

"We don’t have elastic search delete components in spark jobs, but if AWS has a rest API that we can leverage – you can first delete the index or records using that rest api from Talend DI job ( for e.g., tRestClient can be used).

Else if AWS has any java SDK, you can use that SDK directly in spark jobs to delete index."

So it sounds like my question is answered. Thanks!
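
For anyone landing here later, here is a rough sketch of the REST approach Talend describes, written in Python only to show the shape of the calls. It relies on two standard Elasticsearch REST endpoints: `POST /<index>/_delete_by_query` with a `match_all` query (removes every document but keeps the index and its mappings) and `DELETE /<index>` (drops the index entirely). The function names, the endpoint URL, and the index name are all made up for illustration, and the sketch assumes the domain accepts unsigned requests. A domain with an IAM-based access policy would also need SigV4-signed requests, which this does not do.

```python
import json
import urllib.request


def build_delete_all_request(endpoint, index):
    """Build POST <endpoint>/<index>/_delete_by_query matching every document."""
    url = f"{endpoint.rstrip('/')}/{index}/_delete_by_query"
    body = json.dumps({"query": {"match_all": {}}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def build_drop_index_request(endpoint, index):
    """Build DELETE <endpoint>/<index>, which removes the index itself."""
    url = f"{endpoint.rstrip('/')}/{index}"
    return urllib.request.Request(url, method="DELETE")


def send(request, timeout=60):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Running `send(build_delete_all_request("https://my-domain.example.com", "my-index"))` before the load would clear the index (both values are placeholders). In a Talend DI job the same request could be issued from tRestClient as a pre-step, as suggested in the response above. Dropping the index is usually quicker for large volumes, but the index and its mappings then have to be recreated on the next load.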
