Mayot
Contributor III

How to optimize tSortRow with 10M rows?

Hello,

Do you have any tips for optimizing the processing of 10 million rows with a tSortRow before inserting them into the database?

Performance is good at the beginning (~6,600 rows/s), but it degrades as the number of processed rows increases. At around 600,000 rows, the job fails with the error OutOfMemoryError: GC overhead limit exceeded. (I could increase the JVM memory for the job, but I don't think that is the optimal fix.)

Thanks.

1 Solution

Accepted Solutions
Jesperrekuh
Specialist

In your tSQLinput query, add: order by <column>, so that the database does the sorting instead of tSortRow.

Alternatively, this sounds like a one-time load into this table, perhaps from multiple source files? If so, first split the data into smaller fragments, writing the output based on some logic, such as one file per week of the year or per range of the value you want to sort by. Then process these smaller files and sort each one before writing it to the database.
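To illustrate the fragment idea outside of Talend, here is a minimal Python sketch (not a Talend job). The function name `partitioned_sort`, the CSV layout, and the month-based bucket are all assumptions made up for the example. It relies on the bucket labels sorting in the same order as the rows they contain, so only one fragment has to fit in memory at a time.

```python
import csv
import os
import tempfile


def partitioned_sort(rows, bucket_of, sort_key, workdir):
    """Yield rows in sorted order while holding only one bucket in memory.

    rows: iterable of lists of strings (one list per record).
    bucket_of: maps a row to a bucket label; labels must sort in the same
               order as the rows they contain (e.g. "2019-01" < "2019-02").
    sort_key: maps a row to the value to sort by inside a bucket.
    workdir: directory where the fragment files are written.
    """
    files = {}
    writers = {}
    try:
        # Pass 1: stream every row into the fragment file of its bucket.
        for row in rows:
            label = bucket_of(row)
            if label not in writers:
                path = os.path.join(workdir, f"bucket_{label}.csv")
                files[label] = open(path, "w", newline="")
                writers[label] = csv.writer(files[label])
            writers[label].writerow(row)
    finally:
        for f in files.values():
            f.close()

    # Pass 2: load, sort, and emit one fragment at a time, in label order.
    for label in sorted(files):
        with open(files[label].name, newline="") as f:
            bucket = list(csv.reader(f))
        bucket.sort(key=sort_key)
        yield from bucket


# Hypothetical sample data: [date, payload], bucketed by year-month.
rows = [
    ["2019-02-03", "c"],
    ["2019-01-15", "a"],
    ["2019-02-01", "b"],
    ["2019-01-02", "d"],
]
with tempfile.TemporaryDirectory() as tmp:
    result = list(
        partitioned_sort(rows, lambda r: r[0][:7], lambda r: r[0], tmp)
    )
```

The trade-off is an extra pass over the data on disk in exchange for a memory footprint bounded by the largest fragment rather than by the full 10 million rows.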

Alternatively, write to a temporary table, then run a SQL statement: insert into finaltable select ... from tmptable order by your columns.
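The temporary-table approach can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumed names: the columns `id` and `label` are made up, and the table names are taken from the statement above. The exact SQL may differ for your database, and most databases only guarantee row order when the reading query itself has an ORDER BY.

```python
import sqlite3

# In-memory database used purely for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tmptable (id INTEGER, label TEXT)")
conn.execute("CREATE TABLE finaltable (id INTEGER, label TEXT)")

# Bulk-load the unsorted rows into the staging table.
conn.executemany(
    "INSERT INTO tmptable VALUES (?, ?)",
    [(3, "c"), (1, "a"), (2, "b")],
)

# Let the database do the sort while copying into the final table.
conn.execute(
    "INSERT INTO finaltable SELECT id, label FROM tmptable ORDER BY id"
)
conn.execute("DROP TABLE tmptable")
conn.commit()

# Read back with an explicit ORDER BY, since table order is not a guarantee.
loaded = conn.execute(
    "SELECT id, label FROM finaltable ORDER BY id"
).fetchall()
```

This keeps the sort inside the database engine, which can spill to disk, instead of inside the JVM heap of the job.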


2 Replies

Mayot
Contributor III
Author

Thanks, I will try this.