Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

How to remove rows after load

Hello,

I want to remove rows from a LOADed table. One way to do it is to use WHERE when loading the table but this will stop the optimized load causing serious performance problem. Because the number of rows I want to exclude is small (one thousand of 25millions), I had the "brilliant" idea to load the whole table and delete rows after, but I cannot find any statement to delete loaded data.

Is there anything like sql delete statement to delete rows with criteria?

Thanks

8 Replies
Not applicable
Author

Just do a second load on the RESIDENT table using a WHERE clause to filter out the unwanted data then DROP the initial table. To prevent Qlikview joinging the second table to the initial you will want to use NOCONCATENATE LOAD.

Regards,

Gordon

Not applicable
Author

Hi.. try using "Reduce Data" . Select the rows from a list box you wish to keep (use Select Excluded) and simply select "Reduce Data" from the file menu option. You may save the file with a new name before choosing this function.

Thanks & Regards,

Vineet Agarwal

Not applicable
Author

..... but you would need to do this after every load...

Not applicable
Author

In by company we use QlikView Publisher. In the distributions task, it is possible to do a Reduction. That seems to do the job for us. But it may be depending on how the scripting is done.

Do I need to say, that the reduction is done automatic, on every load, and dynamic when data changes....

Regards,

Jerrik

Miguel_Angel_Baeyens

Hello Vlisidis,

Here you are my suggestion: I would load first the whole table and store it in a QVD file and drop the table, then a second load from the QVD file using WHERE clause now. Depending on your hardware configuration and if storage is not a problem, will perform better than loading from RESIDENT. In my experience, this option is preferable than loading from resident tables when moving hundreds of thousands of records.

johnw
Champion III
Champion III

You may be able to do your data reductions with where exists and/or inner joins. You can use ONE where exists statement without breaking the optimized load. And you can do inner joins to the table after the fact. This is often faster than an unoptimized load or doing a load resident.

So let's say I wanted all fruit records with color RED and type APPLE. Let's say that 30% of my fruit are red, but only 5% are apples. So I want to apply the apple filter first, in the where exists. Then I'll apply the color red filter after the fact with an inner join.

[Red Apples]:
LOAD 'APPLE' as Type
AUTOGENERATE 1
;
INNER JOIN ([Red Apples])
LOAD
Type
,Color
,other fields
FROM Fruit.qvd (QVD)
WHERE exists(Type)
;
INNER JOIN ([Red Apples])
LOAD 'RED' as Color
AUTOGENERATE 1
;

Not applicable
Author

just to understand: is the where exists() needed if you're already doing an inner join?

johnw
Champion III
Champion III


dragonauta wrote:
just to understand: is the where exists() needed if you're already doing an inner join?<div></div>


It is not needed, in the sense that the results will be exactly the same either way. However, if you don't use the exists(), it still loads all of the rows, and then drops the ones that don't match in the inner join. With the exists(), it only loads matching rows.

Testing on a sample data set, it appears that both methods take the same amount of time. However, using exists() saves a significant amount of RAM during the load. That may or may not matter in your environment, but it seems like it would be a good habit to use an exists() in a case like this.