<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: tREST and/or tRESTclient - how to loop to retrieve data? in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/tREST-and-or-tRESTclient-how-to-loop-to-retrieve-data/m-p/2350208#M117085</link>
    <description>&lt;P&gt;I think I have solved it now.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In place of a literal "string" I use a global variable, like so:&lt;EM&gt;&lt;STRONG&gt;&amp;nbsp;((String) globalMap.get("variable_name"))&lt;/STRONG&gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;I can use this in the &lt;STRONG&gt;HTTP Body&lt;/STRONG&gt; and in the &lt;STRONG&gt;Relative Path&lt;/STRONG&gt; for tREST and tRESTclient.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 27 Jul 2017 02:10:37 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2017-07-27T02:10:37Z</dc:date>
    <item>
      <title>tREST and/or tRESTclient - how to loop to retrieve data?</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tREST-and-or-tRESTclient-how-to-loop-to-retrieve-data/m-p/2350203#M117080</link>
      <description>&lt;P&gt;Hello everyone&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I would like a sanity check as I am very new to Talend DI.&lt;/P&gt; 
&lt;P&gt;I am using version 6.4.1 of Talend DI, the 'free' edition.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I need to retrieve a large amount of data (via tREST, as JSON or XML) from &lt;STRONG&gt;Elasticsearch&lt;/STRONG&gt;, about 1 GB of JSON per hour.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I am planning to use tRESTclient and tREST in Open DI.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;If I were to write a standalone Java program (just for the sake of example), I would first post a request to Elasticsearch to obtain a &lt;STRONG&gt;scroll_id&lt;/STRONG&gt;, which is conceptually like a SQL open-cursor statement in a database. In this initial scroll set-up request I would set the maximum payload size to be returned (for example, 500 'rows').&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Next&amp;nbsp;I would use the returned scroll_id value in a &lt;STRONG&gt;loop&lt;/STRONG&gt; to make repeated calls to Elasticsearch to get the &lt;STRONG&gt;next batch of data&lt;/STRONG&gt;, returned&amp;nbsp;in a JSON/XML document.&lt;/P&gt;
&lt;P&gt;I would need to repeat this call &lt;STRONG&gt;until an end-of-data condition is reached&lt;/STRONG&gt;, and inside the loop I would store each returned JSON/XML payload in a database or a file.&lt;/P&gt;
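&lt;P&gt;The loop described above can be sketched in plain Java roughly as follows. This is only an illustration under stated assumptions, not a real Elasticsearch client: the names ScrollLoopSketch, ScrollSource and drain are hypothetical, the HTTP call is replaced by an in-memory fake, and an empty batch is assumed to signal end-of-data.&lt;/P&gt;

```java
public class ScrollLoopSketch {

    // Hypothetical stand-in for one scroll call: returns the next batch
    // of rows for the given scroll id, or an empty array when no data is left.
    interface ScrollSource {
        String[] nextBatch(String scrollId);
    }

    // Calls the source repeatedly with the same scroll id until an empty
    // batch comes back, appending every row to the sink. Returns the row count.
    static int drain(ScrollSource source, String scrollId, StringBuilder sink) {
        int total = 0;
        while (true) {
            String[] batch = source.nextBatch(scrollId);
            if (batch.length == 0) {
                break; // assumed end-of-data condition
            }
            for (String row : batch) {
                // Stand-in for "store the payload in a database or a file".
                sink.append(row).append('\n');
            }
            total += batch.length;
        }
        return total;
    }

    public static void main(String[] args) {
        // Fake source: serves two batches, then an empty one.
        final String[][] pages = { {"a", "b"}, {"c"}, {} };
        final int[] next = {0};
        ScrollSource fake = id -> pages[next[0]++];

        StringBuilder sink = new StringBuilder();
        System.out.println(drain(fake, "hypothetical-scroll-id", sink)); // prints 3
    }
}
```

&lt;P&gt;In a real job the fake source would be replaced by the actual REST call, and the sink by the database or file output.&lt;/P&gt;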
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;In Talend I intend to use tRESTclient to set up the scroll and return the scroll_id.&lt;/P&gt;
&lt;P&gt;Then, in a loop, I plan to use tREST to pass the scroll_id and return the next batch of payload in JSON/XML form.&lt;/P&gt;
&lt;P&gt;Also in the same loop I would map the data and store it in a database/file.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Is this going to work?&lt;/P&gt; 
&lt;P&gt;Is this a good use case for Talend DI, or am I better off just writing a Java program without using Talend?&lt;/P&gt;
&lt;P&gt;Will this perform well with a large amount of data in Talend DI?&lt;/P&gt;
&lt;P&gt;I will be retrieving about 1 GB of JSON/XML data every hour via the above tREST loop calls.&lt;/P&gt; 
&lt;P&gt;If there is a better solution using the free version of Talend DI, please advise.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Many thanks in advance&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 26 Jul 2017 01:37:08 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tREST-and-or-tRESTclient-how-to-loop-to-retrieve-data/m-p/2350203#M117080</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-07-26T01:37:08Z</dc:date>
    </item>
    <item>
      <title>Re: tREST and/or tRESTclient - how to loop to retrieve data?</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tREST-and-or-tRESTclient-how-to-loop-to-retrieve-data/m-p/2350204#M117081</link>
      <description>&lt;P&gt;The tREST component will be calling a service. &amp;nbsp;How long will it take to download 1 GB of JSON? &amp;nbsp;What will be the payload per request? &amp;nbsp;Remember that a web service is just like a web page: there is a timeout setting. &amp;nbsp;If your payload is too big and a single request takes too long to download, the service may time out.&lt;/P&gt;
&lt;P&gt;Also, would you be running one instance of this logic, or multiple instances on multiple servers to parallelise?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Talend Big Data (paid version) has Elasticsearch components, which may simplify your task. &amp;nbsp;You can try the Big Data Sandbox, get a trial license, and test it out.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 26 Jul 2017 01:43:04 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tREST-and-or-tRESTclient-how-to-loop-to-retrieve-data/m-p/2350204#M117081</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-07-26T01:43:04Z</dc:date>
    </item>
    <item>
      <title>Re: tREST and/or tRESTclient - how to loop to retrieve data?</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tREST-and-or-tRESTclient-how-to-loop-to-retrieve-data/m-p/2350205#M117082</link>
      <description>Many thanks for the reply!
&lt;BR /&gt;I probably did not supply enough information, my apologies.
&lt;BR /&gt;
&lt;BR /&gt;1 GB of data per hour is the total, not the size of a single tREST call.
&lt;BR /&gt;After the tRESTclient call gets the scroll_id, that value is passed in the tREST HTTP Body for the loop of calls.
&lt;BR /&gt;So the tREST call will loop perhaps 50-100 times, retrieving the next chunk of data on each call: say ~20 MB per call * 50-100 calls = 1-2 GB in total within each hour.
&lt;BR /&gt;Currently a Groovy program does this job, running as a single instance, not in parallel.
&lt;BR /&gt;We are replacing the Groovy program (and adding some more functionality) with either Talend DI or pure Java.
&lt;BR /&gt;Based on the Groovy program's performance with Elasticsearch, I figure that I probably will not need to run multiple/parallel tasks in Java or Talend.
&lt;BR /&gt;I hope this provides enough information for you to give me further guidance. 
&lt;BR /&gt;Many thanks again! 
&lt;BR /&gt;</description>
      <pubDate>Wed, 26 Jul 2017 01:58:44 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tREST-and-or-tRESTclient-how-to-loop-to-retrieve-data/m-p/2350205#M117082</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-07-26T01:58:44Z</dc:date>
    </item>
    <item>
      <title>Re: tREST and/or tRESTclient - how to loop to retrieve data?</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tREST-and-or-tRESTclient-how-to-loop-to-retrieve-data/m-p/2350206#M117083</link>
      <description>&lt;P&gt;If you have Groovy doing the same logic, I am sure you can reproduce it in Java. &amp;nbsp;However, there is no Elasticsearch component in the open source version, so you may end up writing some code. &amp;nbsp;You will also need to test whether the Java performs the same as the Groovy. &amp;nbsp;That is the relevant comparison, since Talend just generates Java code.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 26 Jul 2017 02:28:23 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tREST-and-or-tRESTclient-how-to-loop-to-retrieve-data/m-p/2350206#M117083</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-07-26T02:28:23Z</dc:date>
    </item>
    <item>
      <title>Re: tREST and/or tRESTclient - how to loop to retrieve data?</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tREST-and-or-tRESTclient-how-to-loop-to-retrieve-data/m-p/2350207#M117084</link>
      <description>thanks, 
&lt;BR /&gt; 
&lt;BR /&gt;(1) Can I use tREST in a loop in a Talend DI job and pass the value for its HTTP Body at run time (rather than statically, as in the Basic Settings UI)? The scroll_id value will need to be placed into the HTTP Body at run time.
&lt;BR /&gt;Which Talend document will tell me how?
&lt;BR /&gt;
&lt;BR /&gt;(2) Can I set the tRESTclient "Relative Path" value (as seen in the Basic Settings UI) at run time? Once again, this must be done dynamically, i.e. using some sort of 'variable' to set the value of Relative Path. Which Talend document will tell me how?
&lt;BR /&gt;
&lt;BR /&gt;I am willing to write some Java helper code to be called inside Talend DI to do this, assuming it will save effort overall compared to writing everything in Java myself.
&lt;BR /&gt; 
&lt;BR /&gt;thanks! 
&lt;BR /&gt;</description>
      <pubDate>Wed, 26 Jul 2017 02:39:08 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tREST-and-or-tRESTclient-how-to-loop-to-retrieve-data/m-p/2350207#M117084</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-07-26T02:39:08Z</dc:date>
    </item>
    <item>
      <title>Re: tREST and/or tRESTclient - how to loop to retrieve data?</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tREST-and-or-tRESTclient-how-to-loop-to-retrieve-data/m-p/2350208#M117085</link>
      <description>&lt;P&gt;I think I have solved it now.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In place of a literal "string" I use a global variable, like so:&lt;EM&gt;&lt;STRONG&gt;&amp;nbsp;((String) globalMap.get("variable_name"))&lt;/STRONG&gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;I can use this in the &lt;STRONG&gt;HTTP Body&lt;/STRONG&gt; and in the &lt;STRONG&gt;Relative Path&lt;/STRONG&gt; for tREST and tRESTclient.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 27 Jul 2017 02:10:37 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tREST-and-or-tRESTclient-how-to-loop-to-retrieve-data/m-p/2350208#M117085</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-07-27T02:10:37Z</dc:date>
    </item>
  </channel>
</rss>

