Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello Guys!
I have really huge amount of data (one qvd file is about 250 million of rows, and I need to load and process several files). The problem is that these qvd files contains duplicates of rows.
For example I have two rows for the same Item. One row contains Sum of sales and 0 as a Price, but the other one contains 0 in Sales and positive Price.
Item | Price | SumOfSales |
---|---|---|
10001 | 0 | 100 |
10001 | 1 | 0 |
I want to reduce data by joining parts of the same table:
t1:
LOAD
Item,
Sum
From QVD Where Sum>0;
left join (t1)
LOAD
Item,
Price
From QVD Where Price>0;
But if i use "where" in a clause it breaks optimized load and all my data loads terrifying slow.
Can somebody advice me how to keep optimized load and keep only rows with positive values?
Hi.
I don't see a way to keep optimized....Maybe change the script (script that you store QVD) to save in one record is the best way, otherwise
where
or
group by resident
but time for huge data will be big.....
Best,
Alessandro Furtado
Inside Qlik it's not possible.
When use "where" the optimized option is not used.
You must generate with QVD without duplicated lines.
Why don't you use where exists function with will remain optimized load and can achieve your logic.
Use an temporary inline table and do where exists..
Good try....
load from qvd in this script is always optimized
f: load SumOfSales, Price from t1.qvd (qvd);
s: load SumOfSales Resident f where SumOfSales>0;
p: load Price Resident f where Price>0;
DROP Table f;
t:
load Item, SumOfSales
from t1.qvd (qvd)
where exists(SumOfSales);
Left join (t)
load Item, Price
from t1.qvd (qvd)
where exists(Price);
DROP Table s,p;