Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Have a data of about ~85,177,324 records in a qlik memory table. I need to assign a rownumber to each record. How can i do it quickly. The above data set is concatenated into a table from various qvds and now I need to have a rownumber to each record entry. Rowno() is taking lot of time to do this and exract takes about one hour to get loaded.
Is there any optimised way to do this..
Data as of now..
Name | Address | pincode |
abc | aaa | 123 |
xyz | sss | 3432 |
pqr | fff | 4324 |
. | . | . |
. | . | . |
. | . | . |
Output
Rowno | Name | Address | pincode |
1 | abc | aaa | 123 |
2 | xyz | sss | 3432 |
3 | pqr | fff | 4324 |
. | . | . | . |
. | . | . | . |
85,177,324 | . | . | . |
If i am using rowno() while reading data from each qvd then it is making load unoptimised. Is there any optimised way to do this considering large data set
Here is my test code for 10M rows.
data:
LOAD
'A' & div(RecNo(),2) as A
,'A' & div(RecNo(),5) as B
,'A' & div(RecNo(),10) as C
,'A' & div(RecNo(),100) as D
,'A' & div(RecNo(),500) as E
,'A' & div(RecNo(),1000) as F
AutoGenerate 1e7
;
Store data into data.qvd (qvd);
Drop Table data;
data:
Load * From data.qvd (qvd);
data2:
NoConcatenate
LOAD *, RecNo() as RecId Resident data;
Drop Table data;
Rename table data2 to data;
On a 4 core 32GB laptop here's the timing.
1st Load: 9 seconds
2nd Load: 2 seconds
3rd Load (recno): 5 seconds
I'm curious to hear results for this same script on your system.
-Rob
Load from the QVD, but add a preceding load to it or a resident load to add in the rowno.
eg..
Load Rowno(), *;
Load Name, Address..... ;
Thanks Lisa.. this is working but this is taking time to load data. I need something which can be quickly achieved. Not sure if I am asking for more here. Is this achievable
@rwunderlich @swuehl @sunny_talwar if you could advise something here.
If you have an unique key within the data you may use it within an autonumber script-statement (if the origin key-information are further needed you would need to duplicate the field in beforehand). If not you could apply a logic like Lisa suggested whereby I could imagine that a resident-load may be more performant as a preceeding-approach. In general you couldn't get the rowno-id without any costs ...
A partly alternatively might be to include recno() and a source-information within the qvd's to be able to track each record from any source.
Beside this are you sure that you really need this kind of information within your application? Because such a measure could increase the needed RAM and storage space enormously and therefore it needs very good reasons to implement such information. Personally I have nowhere these information within a final application - during the development they are very helpful to validate the data but after that they have usually more disadvantages than benefits.
- Marcus
I think simplest is best here. Stick with the optimized load and add the RecId in a simple resident load like this:
data2:
NoConcatenate
LOAD *, RecNo() as RecId Resident data;
Drop Table data;
Rename table data2 to data;
In my testing I could do about 2M rows per second. If you had similar results, it would only add about 45 seconds to your load.
BTW, it's very important to use RecNo() rather than RowNo() here. RecNo() will be 3-4 times faster than RowNo() in this case.
-Rob
http://masterssummit.com
http://qlikviewcookbook.com
http://www.easyqlik.com
Tried this but this is loading only thousands of records per second which is causing this to be very very slow. Is this because there are many columns in source table ?
Why is this happening with us. What could be the issue here. Code is almost similar to your one. It took 12 min to load data to a table using recno()
Can you post the portion of your script where you are assigning the RecNo()?
-Rob
This is how it looks..
Data:
LOAD $(GenColList) // This has 99 columns
FROM $(vFile_Path)$(vFile_Name) (qvd);
Data2:
NoConcatenate
LOAD *,RecNo() AS Sequence Resident Data;
Drop Table Data;
Rename table Data2 to Data;
STORE Data INTO D:\Qlikview\Output\Data.QVD (QVD);
DROP TABLE Data;
Call RecordLoadTime('Data');
It looks now that you are loading from a single qvd - if so you could include the recno/rowno already within the creation of this file. In regard to the load-performance take a look on the available resources of the RAM because if there happens any swapping to the virtual RAM the performance will decrease significantly.
- Marcus