Hi,
I have a very large table (about 3.5 TB) that I want to join with another table, inserting the output into a new table.
To reduce execution time, I want to split the rows into blocks and process them as multiple parallel processes.
For this:
I have prepared a child job that does the ELT processing: tELTInput -> tELTMap (filtering rows between a start key and an end key) -> tELTOutput.
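Conceptually, the statement the child job pushes down to the database for one key range looks roughly like the sketch below (table, column, and key names are placeholders, not my actual schema):

```java
// Hypothetical sketch of the range-bounded INSERT ... SELECT the ELT child job
// would push down for one block; all table and column names are placeholders.
public class RangeInsertSql {
    public static String build(long startKey, long endKey) {
        return "INSERT INTO target_table (id, col_a, col_b) "
             + "SELECT s.id, s.col_a, l.col_b "
             + "FROM source_table s "
             + "JOIN lookup_table l ON l.id = s.id "
             + "WHERE s.id BETWEEN " + startKey + " AND " + endKey;
    }

    public static void main(String[] args) {
        // Example: the statement for the first block of 10 million keys.
        System.out.println(build(1L, 10_000_000L));
    }
}
```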
I am now preparing a standard parent job template that runs the child job dynamically. The template should work whether the table has 750 million rows or 3.5 billion rows.
To keep it generic, I want to use 4 tRunJob components (a static number). For each process I have created a file with a start key and an end key; I calculate these keys from a records-to-process count.
For example, if I want each tRunJob to process 10 million rows, I take the key value at every 10-millionth row and save the resulting start/end keys in the file.
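A minimal sketch of that key-range preparation, assuming the boundary key values (one key per 10 million rows, plus the maximum key + 1) have already been fetched from the table; the file name and delimiter are placeholders:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Turn a sorted list of boundary keys into inclusive, non-overlapping
// start/end ranges and write them to the file the parent job reads.
public class KeyRangeFileWriter {
    public static void main(String[] args) throws IOException {
        // Hard-coded here for the sketch; in the real job these boundaries
        // would come from a query over the source table's key column.
        List<Long> boundaries = List.of(1L, 10_000_001L, 20_000_001L, 30_000_001L);

        List<String> ranges = new ArrayList<>();
        for (int i = 0; i < boundaries.size() - 1; i++) {
            long startKey = boundaries.get(i);
            long endKey = boundaries.get(i + 1) - 1; // inclusive upper bound
            ranges.add(startKey + ";" + endKey);
        }
        Files.write(Path.of("key_ranges.txt"), ranges);
    }
}
```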
Now, how do I prepare a dynamic number of tRunJob components so that the 4 processes run in parallel?
Thanks.
Hello,
Could you please share a screenshot of your job?
To run several subjobs at the same time, you can either use the tParallelize component, or place four tRunJob components and configure your job to run multithreaded.
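Conceptually (this is not Talend's generated code, just a plain Java sketch), running four tRunJob components multithreaded amounts to something like this, with each worker receiving one start/end key range:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Four workers, each given one key range, running concurrently on a
// fixed-size pool - the same idea as four parallel tRunJob executions.
public class ParallelRangeRunner {
    record KeyRange(long start, long end) {}

    public static void main(String[] args) throws InterruptedException {
        List<KeyRange> ranges = List.of(
            new KeyRange(1L, 9_999_999L),
            new KeyRange(10_000_000L, 19_999_999L),
            new KeyRange(20_000_000L, 29_999_999L),
            new KeyRange(30_000_000L, 39_999_999L));

        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (KeyRange r : ranges) {
            pool.submit(() -> {
                // In Talend this would be one tRunJob call, passing the
                // startKey/endKey as context parameters to the child ELT job.
                System.out.printf("processing keys %d..%d%n", r.start(), r.end());
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }
}
```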
Best regards
Sabrina