sbhadra
Contributor II

Iterating multiple pages

Hi Team,

 

I have an existing flow in which I iterate over a list of items returned by an API call. However, because the data volume is large, the response is split across multiple pages, which my job currently cannot handle. Please suggest how to proceed in such a scenario. [Image: screenshot of the job design]

 

In this job, tRestClient1 fetches the list of items using a search query, and tExtractJSONFields_1 extracts each single item so that the iteration can continue. However, the flow stops partway and I do not get the complete result: only 100 records come back. The JSON response contains a page link that points to the next page. I am able to read its value, but I am not sure how to repeat the entire process until all the pages are exhausted.

9 Replies
Anonymous
Not applicable

Hi
Is there an API that returns the total number of pages for the specified item? If not, how do you know that all pages have been handled?

Regards
Shong
sbhadra
Contributor II
Author

Hi Shong,

 

Yes, the API URL contains the page number, and if there are no more pages available, the value of next_page is null.

Anonymous
Not applicable

OK, so you can parse the page link from the response; if it is null, exit the loop. The job design looks like:
tRestClient1 --> tExtractJSONFields --> tFlowToIterate --> tJava --OnComponentOk--> tLoop --iterate--> tRestClient2 --> tExtractJSONFields2 --> tJavaRow

tRestClient1 brings in the list of items, and the job then iterates over each item.

On tJava: define a context variable for the URL, and build the URL to be used later.
On tLoop: select the While loop type (the one that has a Condition field), define a boolean context variable in the job (let's call it condition) with a default of true, and set the Condition field to context.condition.

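On tJavaRow:
// ... assignments for the other columns go here ...
if (input_row.page_link == null) {
    // No further page: make the loop condition false so tLoop stops.
    context.condition = false;
} else {
    // Another page exists: keep looping and point the next call at it.
    context.condition = true;
    context.URL = "balabala"; // placeholder: build the next page URL here
}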
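
To show the same stop condition outside of Talend, here is a minimal standalone Java sketch. It is not the job itself: the class PaginationSketch, the Page holder, the fetchPage function, and the ids 1112 and 1113 are hypothetical names and values chosen for illustration, and the in-memory map only stands in for the real search API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class PaginationSketch {

    /** One page of search results: item ids plus the next-page link (null on the last page). */
    static final class Page {
        final List<Integer> ids;
        final String nextPage;

        Page(List<Integer> ids, String nextPage) {
            this.ids = ids;
            this.nextPage = nextPage;
        }
    }

    /** Follows next_page links until one is null or maxPages pages have been read. */
    static List<Integer> fetchAll(String firstUrl, Function<String, Page> fetchPage, int maxPages) {
        List<Integer> allIds = new ArrayList<>();
        String url = firstUrl;
        int pagesRead = 0;
        while (url != null && pagesRead < maxPages) {
            Page page = fetchPage.apply(url); // stands in for the REST call
            allIds.addAll(page.ids);
            url = page.nextPage;              // null ends the loop before another call is made
            pagesRead++;
        }
        return allIds;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the API, so the sketch runs without a network.
        Map<String, Page> fakeApi = new HashMap<>();
        fakeApi.put("https://api.xyz.com/search.json",
                new Page(Arrays.asList(1111, 1112), "https://api.xyz.com/search.json?page=2"));
        fakeApi.put("https://api.xyz.com/search.json?page=2",
                new Page(Arrays.asList(1113), null));

        // prints [1111, 1112, 1113]
        System.out.println(fetchAll("https://api.xyz.com/search.json", fakeApi::get, 10));
    }
}
```

The point of the sketch is that the null check happens before the next request is made, so a null next_page never reaches the HTTP client.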

Hope it gives you hints.

Regards
Shong


sbhadra
Contributor II
Author

Hi Shong,

 

There is an issue with the solution you provided. The tRestClient1 component returns the list of items and also the URL of the next page. I am able to store the next page URL, but introducing tLoop into the current flow does not help, because for each item I receive I also have to fetch some other details using its id.

 

Let's say I have a search URL:

https://api.xyz.com/search.json

It returns a list of item ids. I use tRestClient_1 to call the search URL.

Next, with those ids I make two subsequent calls to retrieve more information, using tRestClient_3 and tRestClient_4, and finally write all the captured information to Excel. Once the job completes all the iterations for the initial search result, it needs to call the next page's search URL (https://api.xyz.com/search.json?page=2), which I received from parsing the first search result.

Hence, I need to call tRestClient_1 once again and repeat the whole process. Please suggest if there is any way to get this done.

Anonymous
Not applicable

How do you get the page number, such as page=2? Can you please show me an example of the first search result?
sbhadra
Contributor II
Author

Hi Shong,

The JSON result looks like this:

{
  "results": [
    {
      "url": "https://api.xyz.com/tickets/1111.json",
      "id": 1111,
      "external_id": null
    }
  ],
  "next_page": "https://api.xyz.com/api/search.json?page=2"
}

 

The next page URL comes within the JSON body, and I am capturing it. There are two stop criteria: stop the iteration if the page count reaches 10, or stop if the next_page value is null. The first one works fine (10 iterations). The second one appears to work, but when next_page becomes null the job throws the exception below:

 

Exception in component tRESTClient_1

javax.ws.rs.ProcessingException: java.lang.IllegalArgumentException: URI is not absolute
at org.apache.cxf.jaxrs.client.AbstractClient.checkClientException(AbstractClient.java:604)

 

This is my job design:

[Images: two screenshots of the job design]

Now the job is failing. Please suggest. I followed one of the examples, but it is still not working:

 

https://community.talend.com/t5/Design-and-Development/Iterative-Data-extraction-Pagination-and-Poll...

sbhadra
Contributor II
Author

Hi Team,

 

The loop condition I mentioned in my post is ((Boolean)globalMap.get("V_LOOP") && i<10). Of its two parts, only the second one, i<10, works; the first one does not.

 

The job throws this exception:

Exception in component tRESTClient_1

javax.ws.rs.ProcessingException: java.lang.IllegalArgumentException: URI is not absolute
at org.apache.cxf.jaxrs.client.AbstractClient.checkClientException(AbstractClient.java:604)

 

This may be because the next_page URL is null. However, I have already put a condition in place, so it should evaluate to false and stop the loop; instead, the loop continues. Please advise, as I am unable to proceed further on this.

Anonymous
Not applicable

For testing and debugging, print the value of the global variable and the URL to the console to check whether the values are correct.
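
As a minimal sketch of that check, the standalone Java below prints the loop flag and the URL, and adds a guard that refuses to call the service when the URL is missing or not absolute. The class and method names (NextPageGuard, isCallable, shouldContinue) are hypothetical, and treating an empty string or the literal text "null" as "no next page" is an assumption about how the extracted value might arrive, not something confirmed by the thread.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class NextPageGuard {

    /** True only if nextPage is a non-empty, absolute URI that an HTTP client could call. */
    static boolean isCallable(String nextPage) {
        if (nextPage == null) {
            return false;
        }
        String trimmed = nextPage.trim();
        // Assumption: a JSON null may be extracted as an empty string or as the text "null".
        if (trimmed.isEmpty() || trimmed.equalsIgnoreCase("null")) {
            return false;
        }
        try {
            return new URI(trimmed).isAbsolute();
        } catch (URISyntaxException e) {
            return false;
        }
    }

    /** Null-safe version of ((Boolean) flag && i < maxPages): a missing flag counts as false. */
    static boolean shouldContinue(Object loopFlag, int i, int maxPages) {
        return Boolean.TRUE.equals(loopFlag) && i < maxPages;
    }

    public static void main(String[] args) {
        String[] candidates = {"https://api.xyz.com/api/search.json?page=2", null, "", "null"};
        for (String url : candidates) {
            // Print both values before deciding, as suggested above.
            System.out.println("next_page=" + url + " callable=" + isCallable(url));
        }
        System.out.println("continue=" + shouldContinue(Boolean.TRUE, 3, 10));
        System.out.println("continue=" + shouldContinue(null, 3, 10));
    }
}
```

In the job, the flag stored under V_LOOP would be set from a check like isCallable before the next iteration starts, so the REST component is never handed a null or relative URL.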
sbhadra
Contributor II
Author

Hi Shong,

 

Yes, I was able to figure out the issue. However, there is a problem with the design I have prepared: it takes a long time to process the records. For example, 750 records take around 35 minutes, because after the search call returns the item details there are two subsequent API calls per item, and I capture all of the results in an Excel sheet.

This is my current flow. Can you please suggest whether there is any scope for performance improvement? I am attaching my current job design. [Image: screenshot of the current job design]
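
For scale, 35 minutes for 750 records is about 2.8 seconds per record (2100 s / 750). If the two follow-up calls run one after another for every item, which is an assumption here, one common option is to issue the per-item calls concurrently and write the results once at the end. The Java below is only a sketch of that idea: ParallelDetailsSketch, fetchDetails, and the fetchOne function are hypothetical names, and the lambda in main stands in for the real detail calls. Within Talend itself, the parallel execution setting on the Iterate link may be worth checking for the same purpose.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class ParallelDetailsSketch {

    /** Runs fetchOne for every id on a fixed-size pool and returns the results in input order. */
    static <T> List<T> fetchDetails(List<Integer> ids, Function<Integer, T> fetchOne, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<T>> futures = new ArrayList<>();
            for (Integer id : ids) {
                Callable<T> task = () -> fetchOne.apply(id); // stands in for the detail API calls
                futures.add(pool.submit(task));
            }
            List<T> results = new ArrayList<>();
            for (Future<T> future : futures) {
                results.add(future.get()); // blocks until that item's call has finished
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Detail call failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(1111, 1112, 1113);
        // Hypothetical detail lookup; a real job would call the two detail endpoints here.
        List<String> details = fetchDetails(ids, id -> "ticket-" + id, 4);
        System.out.println(details);
    }
}
```

The pool size should stay within whatever rate limit the API enforces, and collecting the results before writing avoids several threads writing to the same Excel file at once.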