Skip to main content
Announcements
NEW: Seamless Public Data Sharing with Qlik's New Anonymous Access Capability: TELL ME MORE!
cancel
Showing results for 
Search instead for 
Did you mean: 
marcel_olmo
Partner Ambassador
Partner Ambassador

Anybody knows a faster/optimized way to do this? (Autogenerate stuff)

Hey guys, I have to do a demo with millions of records of a "selling table".

My code works perfectly. But it does autogenerates, and they are very slow if you have to generate a big bunch of data.

I have to do this script for 10 million records, and if I trust my computer, it takes 3 days of generating data. 😞

Here is what I have, if anybody knows a way to do it faster, I'm gonna thank you a lot :

set nClientes = '1000000';

set nProductos = '5000000';

set nCabeceras = '10000000';

set nLineas = '10';

FOR k = 0 TO ($(nCabeceras) -1)

FOR i = 0 to ($(nLineas)-1)

LineasVentas:

load

mod(2000 + ( $(k) ),$(nCabeceras)) + 2000 as [Nº documento],
10000 + ($(i)) as [Nº línea],
(mod(2000 + ( $(k) ),$(nCabeceras)) + 2000) &'-'&( 10000 + (($(i))) ) as [Nº documento-Nº línea],
mod(10000 + ( $(k) ),$(nClientes)) + 10000 as [Venta a-Nº cliente],
date( today() - mod($(k),500)) as [Fecha registro],
mod(1000+ (rowno()-1),$(nProductos))+ 1000 as [Nº producto]

autogenerate 1;

NEXT i

NEXT k

store LineasVentas into LineasVentas.qvd;

1 Solution

Accepted Solutions
Not applicable

Hi Marcel,

You can create a table temporary to generate the values of variables "i", "k". After use this table to generate your data.


set nClientes = '1000000';
set nProductos = '5000000';
tabTemp:
LOAD (100 - RowNo()) AS k
AUTOGENERATE 10000000;
Lineas:
JOIN(tabTemp)
LOAD (10 - RowNo()) AS i
AUTOGENERATE 10;
LineasVentas:
LOAD
mod(2000 + (k),$(nCabeceras)) + 2000 as [Nº documento],
10000 + (i) as [Nº línea],
(mod(2000 + (k),$(nCabeceras)) + 2000) &'-'&( 10000 + (($(i))) ) as [Nº documento-Nº línea],
mod(10000 + (k),$(nClientes)) + 10000 as [Venta a-Nº cliente],
date( today() - mod(k,500)) as [Fecha registro],
mod(1000+ (rowno()-1),$(nProductos))+ 1000 as [Nº producto]
RESIDENT tabTemp;
DROP TABLE tabTemp;
store LineasVentas into LineasVentas.qvd;


View solution in original post

11 Replies
maneshkhottcpl
Partner - Creator III
Partner - Creator III

What is the use of this script?

or u can use incremental load.

suniljain
Master
Master

is this code for increamental load or anything else ?. not getting your code .

marcel_olmo
Partner Ambassador
Partner Ambassador
Author

Thank you Manesh for your quick reply.

What is the use of this script? I don't understand you. Are you asking me why I'm doing it like this way? Because I don't know a better way to do it, of course, haha.

It works fine for few thousands of records, but this time I have to do it for millions of records, and it spends a lot of time to do it, and I'm looking for better / faster ways to do that.

nCabeceras is the "selling header", and I have 10 million, and nLineas is "selling lines", and with 10 is enough.

Could you be more specific about using incremental load in this case?

Thank you very much!!

sathishkumar_go
Partner - Specialist
Partner - Specialist

Hi Marcel,

It depends up on data Maintaining. What I suggest, Create 1 Qvd for Historical data and Make 1 qvd for last 1 or 2 year data.

In final App Call all Qvds either concatenate or use *.

Regards

Sathish

marcel_olmo
Partner Ambassador
Partner Ambassador
Author

Thank you Sathish for your suggestion.

Unfortunately, I don't have historical data. The point is to get a big bunch of "invented" data to test between all the versions of QV9 and QV10.

I know that when I get the qvd file of "selling lines", everything is gonna get more easy. But by now, I have the problem of the time, I don't want to wait 3 or 4 days of "creating autogenerate data".

Many thanks anyway.

marcel_olmo
Partner Ambassador
Partner Ambassador
Author

Hello Sunil, thanks for your reply. It's not for an incremental load, is for testing "invented selling data" to compare between all versions of QV.

(is a request from mycompany).

sathishkumar_go
Partner - Specialist
Partner - Specialist

HI Marel,

May I know y u r using nested for loop in script?

Because this nested for loop obviously take time (Because it ll read line by line also)

In 1st loop will read almost 1 crore records.

try to avoid the nested loops or explain y u need to use this logic it will help to find some other logic

-Sathish

marcel_olmo
Partner Ambassador
Partner Ambassador
Author

Yes, I know I'm using nested for loop and that's why this is slow. But by now I don't know a better way to do it, and until now it was enough for my requirements, but not for this case. 😞

This is a picture of my desired result :

As you can see, there is :

1-Id Selling Header.

2- 10 associated selling lines for each selling header

3- The same Id Costumer for each Id Selling Header

4-The same Register date for each Id Selling Header

5- An incremental number of Id products for each line of each selling header.

Hope it helps about what I want to get.

Many thanks in advance!!!

Miguel_Angel_Baeyens

Hello Marcel,

I understand that you want to create some "dummy" sales information. If my assuption is correct, try something based on the following

Set nClientes = '1000000';Set nProductos = '5000000';Set nCabeceras = '1000000';Set nLineas = '10'; LineasVentas:LOAD Ceil(Rand() * $(nCabeceras)) + 2000 as [Nº documento], Ceil(Rand() * $(nLineas)) + 10000 as [Nº línea], Ceil(Rand() * $(nCabeceras)) + 2000 & '/' & Ceil(Rand() * $(nLineas)) + 10000 as [Nº documento-Nº línea], Ceil(Rand() * $(nClientes)) as [Venta a-Nº cliente], Date(Today() - Ceil(Rand() * 500)) as [Fecha registro], Ceil(Rand() * $(nProductos)) + 1000 as [Nº producto]AUTOGENERATE $(nCabeceras); store LineasVentas into LineasVentas.qvd;


The performance problem in your script is not in the autogenerated table rather than the amount of loops it has to do. In my modest computer, those 10M have taken few minutes to load. Playing with the rand(), ceil() and floor() and chr() functions, you can get some coherent information.

Hope this helps