Anonymous
Not applicable

Iterate all X rows and not for each row

hi,

 

I use Talend to extract data and insert it into OTSDB. I need to split my file into chunks, because a classic row-by-row iteration takes too much time (40 rows/s, and I have 90 million rows).

Do you know how to send the rows in batches, for example 50 rows at a time, instead of sending each row individually?

 

Best regards,

 

terreprime.

17 Replies
Anonymous
Not applicable
Author

Can you show the JSON that was created using a System.out.println call?
Are you sure that the REST service method should be PUT? I would have opted for POST if I hadn't been given a direction here.

Anonymous
Not applicable
Author

Hi,

 

This is the JSON:

0683p000009Lu2s.png

 

Actually, I changed the method from PUT to POST and it works!

But the processing is really slow: 20 rows/s.

 

0683p000009Lu6Z.png

How can I speed up the processing of the file?

 

Terreprime

Anonymous
Not applicable
Author

A web service call is always going to be slow by comparison to loading to an on-premise database. What you need to do is figure out if you can batch up your calls. If you can, you then need to work out how many rows can be sent at a time. There is absolutely no way you will get it going faster than this unless you can send more than one row per web service call or send parallel service calls. I suspect you can send more than one row in each web service call.

Anonymous
Not applicable
Author

Yes, I know, that is the original subject of this topic. I would like to send the data in groups of 50 rows, for example. But I don't know how to proceed.

Anonymous
Not applicable
Author

OK, I have put together an example you will need to extrapolate from. It is quite simple. The layout for your job will be.....

 

tFileInputDelimited ---> tJavaFlex ---> tFilterRow ---> tFlowToIterate ---> tRest

 

1) You read the file as normal with the tFileInputDelimited. 

2) The magic happens in the tJavaFlex. The code below shows what I did with my example. You will need to extrapolate from this to put in your JSON build (and combine) code....

Start Code

//Used to count the rows
int count = 0;
//Used to concatenate your Strings
String myConcatenatedVal = "";

Main Code

//Increment the counter for each incoming row
count++;

//Concatenate your values (adjust this to concatenate your computed JSON Strings)
myConcatenatedVal = myConcatenatedVal+row1.newColumn;

//A modulus operation to fire on every 50th row. It sets the output "newColumn" to the concatenated value, then resets the myConcatenatedVal and count variables.
if(count%50==0){
	row2.newColumn = myConcatenatedVal;
	myConcatenatedVal = "";
	count=0;
}else{
//The output "newColumn" column is set to null when not the 50th row
	row2.newColumn = null;
}

This code will build up your records and only output a value every 50th record. It will output a null value for every other row. To handle this null value (to filter it out), we use the tFilterRow. Use the Advanced Mode and then set the code to ....

input_row.newColumn!=null

Your tRest will now only run once for every 50 records. 
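
To make the flow easier to follow outside of Talend, here is a minimal plain-Java sketch that simulates the tJavaFlex and tFilterRow steps described above. The class name `ModulusBatchDemo`, the method `run`, and the `sentToRest` list are hypothetical names used only for illustration; in the real job the loop is Talend's row flow and the list stands in for the tRest calls.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of the tJavaFlex + tFilterRow steps; not Talend API.
class ModulusBatchDemo {
    static List<String> run(List<String> inputRows, int batchSize) {
        // "Start Code": runs once before the first row
        int count = 0;
        String myConcatenatedVal = "";

        List<String> sentToRest = new ArrayList<>();

        // "Main Code": runs once per incoming row
        for (String newColumn : inputRows) {
            count++;
            myConcatenatedVal = myConcatenatedVal + newColumn;

            String out;
            if (count % batchSize == 0) {
                // Every batchSize-th row: emit the accumulated value and reset
                out = myConcatenatedVal;
                myConcatenatedVal = "";
                count = 0;
            } else {
                // All other rows: emit null
                out = null;
            }

            // tFilterRow condition: input_row.newColumn != null
            if (out != null) {
                sentToRest.add(out);
            }
        }
        return sentToRest;
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 1; i <= 100; i++) {
            rows.add("r" + i + ";");
        }
        // 100 input rows with a batch size of 50 give 2 outgoing values
        System.out.println(run(rows, 50).size());
    }
}
```

Note that, as in the tJavaFlex code, rows left over after the last full batch stay in `myConcatenatedVal` and are never emitted.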

 

I hope that helps

 

Anonymous
Not applicable
Author

Your code is really great!

 

I adapted your code for OTSDB:

 

Start code:

//Used to count the rows
int count = 0;
//Used to concatenate your Strings
String myConcatenatedVal = "[";   

Main code:

 

count++;
String myJSON = "{\"metric\":\""+context.metric+"\",\"value\":" + row1.myJSON+",\"timestamp\":"+context.timestamp+",\"tags\":{"+ "\"" + context.tags1 + "\"" +":"+"\"" + context.tags2+ "\"" + "}}";
myConcatenatedVal += myJSON;
context.timestamp++;

if(count%50==0){
	row2.myJSON=myConcatenatedVal + "]";
	myConcatenatedVal = "[";
	count = 0;
}else{
	myConcatenatedVal += ",";
	row2.myJSON = null;
}
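
The same JSON-array batching can be sketched in plain Java as follows. This is only an illustration under stated assumptions: `JsonBatchBuilder`, `dataPoint` and `toJsonArrays` are hypothetical names, the data point layout simply mirrors the string built above, and the values are assumed not to contain characters that need JSON escaping.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that mirrors the JSON strings built in the tJavaFlex code.
class JsonBatchBuilder {
    // Builds one data point; assumes the inputs need no JSON escaping.
    static String dataPoint(String metric, String value, long timestamp,
                            String tagKey, String tagValue) {
        return "{\"metric\":\"" + metric + "\",\"value\":" + value
                + ",\"timestamp\":" + timestamp
                + ",\"tags\":{\"" + tagKey + "\":\"" + tagValue + "\"}}";
    }

    // Groups data points into JSON arrays of at most batchSize elements.
    static List<String> toJsonArrays(List<String> points, int batchSize) {
        List<String> arrays = new ArrayList<>();
        for (int start = 0; start < points.size(); start += batchSize) {
            int end = Math.min(start + batchSize, points.size());
            // String.join places commas only between elements
            arrays.add("[" + String.join(",", points.subList(start, end)) + "]");
        }
        return arrays;
    }

    public static void main(String[] args) {
        List<String> points = new ArrayList<>();
        long timestamp = 1000L; // made-up starting timestamp
        for (int i = 0; i < 3; i++) {
            points.add(dataPoint("demo.metric", String.valueOf(i), timestamp++, "host", "a"));
        }
        for (String array : toJsonArrays(points, 2)) {
            System.out.println(array);
        }
    }
}
```

Joining the elements of each sublist avoids tracking the trailing comma by hand, and the last array may simply hold fewer than `batchSize` points.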

Result:

 

0683p000009Lu2Y.png

 

I am now between 850 and 1000 rows/s, which is much better!

 

Thank you very much !

 

Best regards

 

Terreprime

Anonymous
Not applicable
Author

No problem. You may be able to tweak this to get even better performance by adjusting the number of rows you are merging.

Anonymous
Not applicable
Author

Hi 

 

If I have 153 records in my input and execute this job, I get only 3 rows in the output. The remaining three records are ignored.

Can you please help me get those 3 records too?

 

For example, my output is:

1,2,3...50

51,52,53 ... 100

101,102,103...150

 

Records 151, 152 and 153 are ignored.

Can you please help me get these records too, as a new line, with this code?
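
One possible way to keep the trailing records, sketched here in plain Java rather than as a tested Talend job: emit a batch not only on every 50th row but also on the very last row. This assumes the total number of rows is known before the loop starts (in a Talend job it would have to come from somewhere, for example a prior row-count step). `FlushLastBatchDemo`, `run`, `totalRows` and `seen` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: batch rows and also flush the final partial batch.
class FlushLastBatchDemo {
    static List<String> run(List<String> inputRows, int batchSize) {
        int totalRows = inputRows.size(); // assumption: total is known up front
        int seen = 0;                     // rows processed so far
        int count = 0;                    // rows in the current batch
        StringBuilder batch = new StringBuilder();
        List<String> output = new ArrayList<>();

        for (String value : inputRows) {
            seen++;
            count++;
            if (count > 1) {
                batch.append(",");
            }
            batch.append(value);

            // Emit on a full batch OR on the last row of the input
            if (count == batchSize || seen == totalRows) {
                output.add(batch.toString());
                batch.setLength(0);
                count = 0;
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 1; i <= 153; i++) {
            rows.add(String.valueOf(i));
        }
        List<String> output = run(rows, 50);
        System.out.println(output.size()); // prints 4
        System.out.println(output.get(3)); // prints 151,152,153
    }
}
```

In the tJavaFlex version, the equivalent change would be to extend the `if(count%50==0)` condition with a check for the last row, using whatever total row count is available in the job.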