Gourav_King_of_DataLand
Contributor II

Iterative Data extraction (Pagination and Polling) from REST API using tRestClient

Hi All,

Need your expert help.

 

The requirement is to pull all chat data from a REST API (a one-time full data dump) and then pull chats on a daily basis. The output is spread across 180K pages, with each page giving the URL of the next and previous pages (except the first page, which has only 'next_url', and the last page, which has only 'prev_url').

 

So far I have been able to use the API/URL to extract information from the first page:

 

tRestClient ->tJavaRow->tJsonExtract->tOracleOut

 

How do I modify the job to

1) Pull all data for one time data dump, 180k pages

2) Pull data on daily basis for current day or extract data until the timestamp is current day.

 

Example output from API

 

Page 1 gives

{
    "chats": [
        all chat-related attributes that need to be imported
    ],
    "count": 179451,
    "next_url": "next_url_here"
}

Page 2 gives

{
    "chats": [
        all chat-related attributes that need to be imported
    ],
    "count": 179451,
    "prev_url": "previous_url_here",
    "next_url": "next_url_here"
}

Page 3 gives ... and so on, each page pointing to the next.
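To make the requirement concrete, the "follow next_url until there is none" logic could be sketched in plain Java roughly as below. This is only an illustration under assumptions: the Page class, the fetchAll method and the in-memory fake API are hypothetical stand-ins for the real HTTP call and JSON parsing, and nothing here is Talend-specific. In a job, the same idea would presumably be a loop around tRestClient that keeps the current URL in a variable and stops when no next_url comes back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class PaginationSketch {

    /** One page of results: its chats plus the link to the next page (null on the last page). */
    public static final class Page {
        public final List<String> chats;
        public final String nextUrl;

        public Page(List<String> chats, String nextUrl) {
            this.chats = chats;
            this.nextUrl = nextUrl;
        }
    }

    /**
     * Follows next_url links starting at firstUrl and collects every chat.
     * fetchPage is a hypothetical stand-in for "call the API and parse the JSON".
     */
    public static List<String> fetchAll(String firstUrl, Function<String, Page> fetchPage) {
        List<String> all = new ArrayList<>();
        String url = firstUrl;
        // Stop as soon as a page does not provide a next_url (the last page).
        while (url != null && !url.isEmpty()) {
            Page page = fetchPage.apply(url);
            all.addAll(page.chats);
            url = page.nextUrl;
        }
        return all;
    }

    public static void main(String[] args) {
        // Fake three-page API keyed by URL, for demonstration only.
        Map<String, Page> fakeApi = new HashMap<>();
        fakeApi.put("page1", new Page(Arrays.asList("chat-a", "chat-b"), "page2"));
        fakeApi.put("page2", new Page(Arrays.asList("chat-c"), "page3"));
        fakeApi.put("page3", new Page(Arrays.asList("chat-d"), null));

        List<String> chats = fetchAll("page1", fakeApi::get);
        System.out.println(chats.size() + " chats fetched");
    }
}
```

For the daily pull, the same loop could break early once a page's chats fall outside the current day, assuming the API returns them in a known time order.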

 

 

79 Replies
Parikhharshal
Creator III

@rhall: Yes, you are right, but what I am trying to say is that when I do a println, even within the loop, it prints all 3 URLs like this:

 

0683p000009M0tE.png

 

Ideally the Java code you have written should only return the URL for rel=next and nothing else, but currently it still brings back the values of all 3.

 

Using your code it is actually deleting the text ">: rel="last" ". Hope this makes sense.

Anonymous
Not applicable

Add this line of code underneath the while loop's opening line:

while(it.hasNext()){
      System.out.println("TEST");

Run the job with that and show me the precise results. Everything.

Parikhharshal
Creator III

@rhall: Here is the screenshot of the result.

 

0683p000009M0tY.png

Anonymous
Not applicable

This is why it is so difficult to debug remotely 🙂 

The problem here is that while the list of URLs is returned in a List, it is actually a String list of values in a single List element. This is frustrating, but it means you need a combination of my first and second solutions. The second solution can be used to retrieve the String (it will be in the first element, or element 0). You then need to use the first code I gave you, where the split method was used. What this will do is extract the String from the List and then split the String up into its individual URLs. This is a good example for you to work through. All the code is there, but you need to rework it to get it to work.

Parikhharshal
Creator III

@rhall: Finally got it working, after a lot of hassle.

 

// The Link header arrives as one comma-separated String in element 0 of the list
java.util.List<String> strList = ((java.util.Map<String, java.util.List<String>>) globalMap.get("tRESTClient_2_HEADERS")).get("Link");
java.util.List<String> new_list = java.util.Arrays.asList(strList.get(0).split(","));
int foundIndex = -1;
String next_url = "";

// Find the entry tagged rel="next"
for (int i = 0; i < new_list.size(); i++) {
    if (new_list.get(i).indexOf("rel=\"next\"") != -1) {
        System.out.println("Item Found...");
        foundIndex = i;
        break;
    }
}

if (foundIndex != -1) {
    // Entry looks like <url>; rel="next": keep the part before ';' and strip the angle brackets
    String[] found_item = new_list.get(foundIndex).split(";");
    String link = found_item[0].trim(); // guard against a space after the comma
    next_url = link.substring(1, link.length() - 1);
    System.out.println("Next URL is: " + next_url);
    globalMap.put("next_url", next_url);
} else {
    System.out.println("No Next URL found...");
}
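For anyone landing here later, the same extraction could probably be written more compactly with a regular expression. This is just a sketch, assuming the header value is a comma-separated run of `<url>; rel="name"` entries with rel as the first parameter, which is what the output in this thread looks like. The class and method names are made up for illustration, and the example.com URLs are placeholders.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkHeaderSketch {

    // Matches one entry of the form <url>; rel="name"
    private static final Pattern LINK =
            Pattern.compile("<([^>]*)>\\s*;\\s*rel=\"([^\"]*)\"");

    /** Returns the URL whose rel equals the requested value, or null if there is none. */
    public static String findRel(String header, String rel) {
        if (header == null) {
            return null;
        }
        Matcher m = LINK.matcher(header);
        while (m.find()) {
            if (rel.equals(m.group(2))) {
                return m.group(1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String header = "<https://example.com/api?page=1>; rel=\"current\","
                + "<https://example.com/api?page=2>; rel=\"next\","
                + "<https://example.com/api?page=9>; rel=\"last\"";
        System.out.println("Next URL is: " + findRel(header, "next"));
    }
}
```

Because the match is anchored on the angle brackets rather than on a split by comma, a URL that itself contains a comma would not be cut in half. In the job, the header String would still come from element 0 of the "Link" list as above.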

 

Parikhharshal
Creator III

@rhall: Is there any way I can get course_id as a column and output it into the db during this whole process?

 

0683p000009M0sI.png

 

This is how my job is built: I am reading the Course Id from the db and passing it into the URL.

Anonymous
Not applicable

Put a tMap after the tExtractJSONFields_1 and then use the value stored in the globalMap by your IterateCanvasID component.
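As a rough illustration of what the tMap output expression ends up doing, here is a standalone sketch with an ordinary Map standing in for globalMap. The key "row1.course_id" is purely an assumption; the real key depends on how the IterateCanvasID component is configured, so check the key it actually writes. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class GlobalMapSketch {

    /** Reads the stored course id under the given key; null if nothing was stored. */
    public static String courseIdFrom(Map<String, Object> globalMap, String key) {
        Object value = globalMap.get(key);
        return value == null ? null : String.valueOf(value);
    }

    public static void main(String[] args) {
        // Stand-in for Talend's globalMap; the key name is an assumption.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("row1.course_id", 12345);

        // In a tMap this would be the expression for the new course_id output column.
        System.out.println("course_id = " + courseIdFrom(globalMap, "row1.course_id"));
    }
}
```

Inside the tMap itself this reduces to a single expression that casts the result of globalMap.get(...) to the column's type.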

Parikhharshal
Creator III

@rhall: This works perfectly. Thanks.

 

 

 

DBLONDEL1643728674
Contributor III

Could you please tell me what you implemented in the tJavaFlex "Modify Json"?

thanks

Anonymous
Not applicable

I can't recall what I did here to be honest. But there are plenty of ways in which you might "modify JSON" in the example above. The very simplest method (but not the most advisable in many situations) would be to use String manipulation, going to a potential rebuild using Java. But as I said, I can't recall what was happening in this example.