Anonymous
Not applicable

Extract more than 10k records from tHttpRequest component

I am using Talend Big Data 6.4 and have a scenario that needs your expertise.
Here is the scenario:
I am using the tHttpRequest component (GET method) to extract data hosted on a Kinvey server. Due to a restriction at the source, if a table has more than 10k records, only the first 10k are returned and the remaining records are not sent in that request.
I need your help to extract all of the records with some workaround.

While investigating, I learned that pagination can be used to solve this problem, but I don't know how to configure pagination in Talend or which other components to use for it.

It would help if you could share some ideas on how to get this working, along with a screenshot of the components used in the job.

Any other way to accomplish this is also welcome (I heard that another option is the tLoop component). Kindly share a screenshot of the components used, along with any Java code written for this purpose.
19 Replies
Anonymous
Not applicable
Author

I am not sure which component you want me to make this change in. Do you want me to create a tJava component? If possible, can you send me the modified job? (Shong's job is available in this thread.) I am not an expert in Java, so receiving the job the way you sent it in your previous post would make it easier to understand, and it would help me and anybody with a similar issue in the future to extend this feature.

Kindly assist.


@rhall wrote:

OK, you can keep most of your job structure (I would use @shong's example for this). What you have to remember is that the last part of your URL will change with every call. The code you will need to change is below.....

 

//Set the limit value
int limit = 1000;
//Set the skip value....(1000 x the current iteration of the loop) - 1000
int skip = (1000 * ((Integer)globalMap.get("tLoop_1_CURRENT_ITERATION")).intValue()) -1000;

//Set the query value
String query = "?query={}&limit=" +limit+"&skip="+skip;

//Assign the query value to the query globalMap variable
globalMap.put("query", query);

You will then need to append ((String)globalMap.get("query")) to your URL.


 

Anonymous
Not applicable
Author

I'm afraid I cannot do this for you since I do not have the time to reconfigure my system to do this. However, I have given you the bulk of the code you will need for this. You are right that this will need to be done in a tJava component. 

 

The best way for you to get better at this is to struggle through it. You have all of the information you need; you just need to work out how to implement it. Most of the work is done, so you only need to think about variable names, etc.

Anonymous
Not applicable
Author

Does your code require finding the record count before entering the loop (for the iterations)?

Anonymous
Not applicable
Author

The code uses the number of iterations of the loop to calculate the record numbers....

//Set the limit value
int limit = 1000;
//Set the skip value....(1000 x the current iteration of the loop) - 1000
int skip = (1000 * ((Integer)globalMap.get("tLoop_1_CURRENT_ITERATION")).intValue()) -1000;

//Set the query value
String query = "?query={}&limit=" +limit+"&skip="+skip;

//Assign the query value to the query globalMap variable
globalMap.put("query", query);

If we assume that the first iteration is iteration 1, then the query string will be....

"?query={}&limit=1000&skip=0"

The second iteration it will be....

"?query={}&limit=1000&skip=1000"

 

The third iteration it will be....

"?query={}&limit=1000&skip=2000"
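
On the question of needing a record count up front: the snippet itself does not use one, but the loop still needs some stopping rule. One common option, sketched below purely as an assumption about how this could be handled, is to keep requesting pages until one comes back with fewer than `limit` records. The class and method names (`SkipLimitPagingSketch`, `fetchPage`, `fetchAll`) are made up for illustration, and the "server" is just an in-memory list, so this is plain Java rather than Talend-generated code.

```java
import java.util.ArrayList;
import java.util.List;

class SkipLimitPagingSketch {

    // Simulated server (assumption): returns at most `limit` records starting at offset `skip`.
    static List<Integer> fetchPage(List<Integer> source, int skip, int limit) {
        int from = Math.min(skip, source.size());
        int to = Math.min(skip + limit, source.size());
        return new ArrayList<>(source.subList(from, to));
    }

    // Requests page after page until a short (or empty) page signals the end of the data.
    static List<Integer> fetchAll(List<Integer> source, int limit) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        List<Integer> all = new ArrayList<>();
        int iteration = 1; // mirrors a loop counter that starts at 1
        while (true) {
            // Same arithmetic as the snippet above: (limit x iteration) - limit
            int skip = limit * (iteration - 1);
            List<Integer> page = fetchPage(source, skip, limit);
            all.addAll(page);
            if (page.size() < limit) {
                break; // last page reached, no total count was needed
            }
            iteration++;
        }
        return all;
    }

    public static void main(String[] args) {
        List<Integer> source = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            source.add(i);
        }
        List<Integer> result = fetchAll(source, 1000);
        System.out.println("records fetched: " + result.size());
    }
}
```

If the total is known in advance, a fixed number of iterations (the total divided by the limit, rounded up) would work just as well; the short-page check only avoids the extra count request.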

Anonymous
Not applicable
Author

Hi rhall_2_0,

I am attaching screenshots of the job I have built, but I was not able to connect a tLoop to a tJavaRow directly. I am also not sure what the schema on the tJava should be, as the HTTP component can only have responsecode as an output. Should I define the schema as query, with query as the input to the tHttpRequest component and responsecode as the output?

Can you check whether I have built all the parts correctly, and tell me how to connect the tLoop component to the tJavaRow component?

Let me know if I missed anything. Kindly assist.

[Attached screenshots: 0683p000009Lv0a.jpg, 0683p000009Lv96.jpg]

Anonymous
Not applicable
Author

Use a tJava component instead of a tJavaRow. tJavaRow components cannot be connected using "iterate" links, they need "Main" rows. Since the tLoop only provides an "Iterate" link, this needs to be considered. Link the tJava to the tHttpRequest using an "Iterate" link. 

It should look something like this.....

 

tLoop --iterate--> tJava --iterate--> tHttpRequest
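
For anyone who finds it easier to read as code, here is a rough plain-Java model of that chain. It is only a sketch: the `HashMap` stands in for Talend's `globalMap`, the method names `tJavaStep` and `tHttpRequestStep` and the base URL are made up for illustration, and no HTTP call is actually made. It assumes the loop counter starts at 1.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IterateChainSketch {

    // Stand-in for Talend's globalMap, which only exists inside a generated job.
    static final Map<String, Object> globalMap = new HashMap<>();

    // Role of tJava: runs once per iteration and publishes the query string.
    static void tJavaStep() {
        int limit = 1000;
        int iteration = ((Integer) globalMap.get("tLoop_1_CURRENT_ITERATION")).intValue();
        int skip = (limit * iteration) - limit;
        globalMap.put("query", "?query={}&limit=" + limit + "&skip=" + skip);
    }

    // Role of tHttpRequest: appends the published query to the base URL.
    // The URL is only built and returned here; nothing is sent over the network.
    static String tHttpRequestStep(String baseUrl) {
        return baseUrl + ((String) globalMap.get("query"));
    }

    // Role of tLoop: sets the iteration counter, then triggers the two steps in order.
    static List<String> run(String baseUrl, int iterations) {
        List<String> urls = new ArrayList<>();
        for (int i = 1; i <= iterations; i++) {
            globalMap.put("tLoop_1_CURRENT_ITERATION", i);
            tJavaStep();
            urls.add(tHttpRequestStep(baseUrl));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Hypothetical base URL, not a real endpoint.
        for (String url : run("https://example.invalid/collection", 3)) {
            System.out.println(url);
        }
    }
}
```

The point of the ordering is that the query is written to the map before the request step reads it, which is why the tJava sits between the tLoop and the tHttpRequest.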

Anonymous
Not applicable
Author

Thanks for the hint.
However, this loads only records 10k to 20k into the target (the source has 20k records) and skips the first 10k records.
May I know what I did wrong?
Anonymous
Not applicable
Author

Problem fixed. It was a problem with my settings. Kindly ignore my previous posts.

Huge thanks to rhall_2_0 and shong for helping me out and helping me learn more about Talend and its capabilities.

Please continue your service to the community.

Anonymous
Not applicable
Author

Glad you got it sorted!

I felt a little bad about appearing to try to make you struggle, but I was a little busy and do think struggling a little massively increases the amount you learn 🙂

Anonymous
Not applicable
Author

Hi rhall_2_0,

Never feel bad. Given my timeline, I thought it could not be done on time, but with your help I was able to finish it in time. Thanks anyway; it did make me learn more about the implementation.

 

I am also getting a WARNING message right now. Although it does not affect the outcome of the job, I am interested to know how it can be fixed. Let me know if you can think of any reason why this WARNING appears on every execution. Below is the URL of the topic I posted:

https://community.talend.com/t5/Design-and-Development/Unable-to-find-mime-types-file-in-classpath-w...