What is the cleanest way of doing a binary load, then automatically making selections and reducing the data?
I have an application that needs the complicated data model from another application, but only a small fraction of the data based on some simple selections. I solved the problem by reducing my data with a sequence of inner joins, like so:
// Force the three "selections" by inner joining single-value tables:
INNER JOIN ([Products]) LOAD 'TFS' as "Product Group" AUTOGENERATE 1;
INNER JOIN ([Date Types]) LOAD 'Int Yields Plus Open TFS Batch' as "Date Type" AUTOGENERATE 1;
INNER JOIN ([Show Weights In]) LOAD dual('Tons',2000) as "Show Weights In" AUTOGENERATE 1;
// Then cascade the reduction table by table along the linking fields:
INNER JOIN ([Yields]) LOAD DISTINCT "Product" RESIDENT [Products];
INNER JOIN ([Defect]) LOAD DISTINCT "Loss Reason Code" RESIDENT [Yields];
INNER JOIN ([Locations]) LOAD DISTINCT "Location" RESIDENT [Yields];
INNER JOIN ([Dates by Type]) LOAD DISTINCT "Recno" RESIDENT [Yields];
INNER JOIN ([Dates by Type]) LOAD DISTINCT "Date Type Number" RESIDENT [Date Types];
INNER JOIN ([Calendar]) LOAD DISTINCT "Date" RESIDENT [Dates by Type];
This works just fine, but it seems like it shouldn't need to be this complicated, given the various ways QlikView has of reducing data based on selections.
Written as a macro, I believe the equivalent would be this simpler and probably faster code:
' The equivalent of the three single-value inner joins above:
ActiveDocument.Fields("Product Group").Select "TFS"
ActiveDocument.Fields("Date Type").Select "Int Yields Plus Open TFS Batch"
ActiveDocument.Fields("Show Weights In").Select "Tons"
But I can't run a macro OnPostReload in a server environment. Supposedly I can execute a macro function during script execution, but that didn't work; I assume that's because none of the logic in the macro above makes sense until the final data model has been built, which is only after script execution finishes.
I can do a reduction using Publisher, but that only reduces the distributed file, and explaining why that isn't good enough takes a little more detail. The full set of data is huge, reloads daily, and takes 5-10 minutes to binary load. My final user application requires additional QVDs that are refreshed hourly, so it must reload hourly, and I can't afford a 5-10 minute binary load every hour. Instead, I have a middle QVW that reloads daily after the huge data model and does the reduction, so the user application only binary loads a small file hourly. As far as I can tell, it can't binary load from the Publisher-distributed file.
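For reference, the middle QVW's script is essentially just the binary load followed by the reduction (a sketch; the path is made up):
// Middle QVW, reloaded once daily. Binary must be the first statement in the script:
Binary [..\BigApp\FullDataModel.qvw];
// ...followed by the inner-join reduction shown above. The user
// application then binary loads this small QVW hourly.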
I could write all the original data model's tables out as QVDs and then do a bunch of WHERE Exists() loads, but that seems at least as complicated as the inner joins, and it keeps an extra copy of some very large tables on disk.
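For illustration, the Exists() version of just the first step would look something like this (a sketch; the QVD names are made up, and every table in the model would need the same treatment):
// Seed the field with the values to keep, so Exists() has something to match:
Selection:
LOAD 'TFS' as "Product Group" AUTOGENERATE 1;
// Reload each table from its QVD, keeping only matching rows.
// A single-field Exists() keeps the QVD load optimized:
[Products]:
LOAD * FROM [Products.qvd] (qvd)
WHERE Exists("Product Group");
// ...and so on down the chain: [Yields] WHERE Exists("Product"),
// [Defect] WHERE Exists("Loss Reason Code"), etc.
DROP TABLE [Selection];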
Is there a cleaner way to reduce my data? Or are inner joins as good as it gets?