bradsheridan
Contributor III

Delete data in Elasticsearch before loading

Morning Community!  Here's another one for you...

 

We are loading ESS (AWS Elasticsearch Service) via a Big Data Batch job in Talend Studio 7.0.1. What is the best practice for deleting all existing documents in an index before reloading it?

For example, if we run the job and load 2000 'records' (documents in ESS terminology), the next time the job runs we want to remove them all first.

 

Other components, such as the Redshift Output one, have a drop-down to select update/delete/insert/truncate, etc., but the ESS Output component has no such option.

 

Any insight is greatly appreciated.

 

thanks

 


Accepted Solutions
bradsheridan
Contributor III
Author

Hi again... I not only posted my question here in the Community but also reached out to Talend directly. Here is their response:

 

"We don’t have elastic search delete components in spark jobs, but  if AWS has a rest API that we can leverage – you can first delete the index or records using that rest api from Talend DI job ( for e.g., tRestClient can be used).

Else if AWS has any java SDK, you can use that SDK directly in spark jobs to delete index."

 

So it sounds like my question is answered. Thanks!
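
For anyone landing here later, this is a rough sketch of what the REST approach Talend describes might look like. It is not something from Talend's reply. It assumes the domain exposes the standard Elasticsearch REST API, that the version supports `_delete_by_query`, and that the endpoint accepts unsigned requests (many AWS domains require signed requests or an IP-based access policy, which this sketch does not handle). The endpoint URL and index name are placeholders, and the function names are made up for illustration.

```python
import json
import urllib.request


def build_delete_all_request(endpoint, index):
    """Build a POST <index>/_delete_by_query request that matches every document."""
    url = f"{endpoint.rstrip('/')}/{index}/_delete_by_query"
    body = json.dumps({"query": {"match_all": {}}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def build_delete_index_request(endpoint, index):
    """Build a DELETE <index> request that drops the whole index."""
    url = f"{endpoint.rstrip('/')}/{index}"
    return urllib.request.Request(url, method="DELETE")


def send(request, timeout=30):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder values (assumptions): replace with your own domain and index.
    endpoint = "https://example-domain.es.amazonaws.com"
    index = "my-index"
    request = build_delete_all_request(endpoint, index)
    print(request.get_method(), request.full_url)
    # Uncomment to actually delete the documents:
    # print(send(request))
```

The same two calls could be issued from a tRestClient in a DI job that runs before the batch load. Note that deleting the index itself also removes its mappings and settings, so `_delete_by_query` is probably the safer choice if the index definition needs to survive between runs.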
