Hi,
I'm looking for some data handling strategy feedback.
Scenario:
I have regional data (same format) sitting on 5 separate servers. Each region has 1 month of log data, made available in Hive. The data volume is huge, so I need to:
(1) transform/aggregate it during load and
(2) store the aggregated content for up to a year.
Current Status:
I have a working incremental load script (using a qvd file) for one region (one ODBC connection).
Challenge:
Because loads from each region can fail independently, I would like to keep the regional data in separate qvd files, so that each can be corrected/updated on the subsequent incremental load execution. This means that for EACH connection/region I have to track start/end dates for both the qvd file and the current Hive load.
...I'm assuming I would have to edit [date] variable names so they're different for each connection e.g. vHiveLoadStartDateRegionA, vHiveLoadEndDateRegionA, vHiveLoadStartDateRegionB, vHiveLoadEndDateRegionB, etc. (I understand QV does not have a method of restricting variable scope).
Question:
What's the best way to handle this?
Should I have 5 copies of the same connection script but each with different connection, file, and variable names?
Should I apply some sort of loop, where the connection, file, and variable names are auto-generated on each iteration?
Regardless of the strategy, what's the best way to merge the regional data for QV visualization, once incremental loads are done?
Thanks,
J.
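If it helps later readers: here is one way the loop option could look in QlikView script. This is only a sketch — the region letters, DSN names, table name, and variable naming scheme are assumptions based on the post above, not a tested implementation:

FOR EACH vRegion IN 'A', 'B', 'C', 'D', 'E'

    ODBC CONNECT TO [Region$(vRegion)];      // one DSN per region

    LET vQvdFile = 'Region$(vRegion).qvd';   // per-region QVD

    // Nested dollar expansion reads the per-region date variable
    // (e.g. vHiveLoadStartDateRegionA) into a common working variable:
    LET vLoadStart = '$(vHiveLoadStartDateRegion$(vRegion))';

    // ... incremental load from Hive bounded by $(vLoadStart),
    //     merge with the existing QVD content, then:
    // STORE LogData INTO [$(vQvdFile)] (qvd);
    // DROP TABLE LogData;

NEXT vRegion

One caveat: all five ODBC loads in such a loop run sequentially within a single reload; if the regions must load in parallel, separate documents/tasks are needed, as discussed below.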
One unique connection per region:
ODBC CONNECT TO RegionA;
ODBC CONNECT TO RegionB;
ODBC CONNECT TO RegionC;
etc.
Each region is a different ODBC data source (different: host name / IP).
How about this:
To overcome the sequential nature of ODBC CONNECT statements:
on your desktop, open QlikView twice
make 2 different .qvw docs, one with an ODBC connection to the RegionA server, the other to RegionB
you can reload the 2 .qvw docs in parallel because they run in different qv.exe processes
I would first try it this way; I would also add a final .qvw to merge the data from the regional .qvw files and prepare the data for the UI. Then test in Publisher (task dependency, etc.).
Will it work? I think yes.
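On the merge question from the original post: since all regions share the same format, the final .qvw can rely on QlikView's automatic concatenation of loads with identical field lists. A sketch, assuming the regional QVDs are named RegionA.qvd through RegionE.qvd:

LogData:
LOAD * FROM [Region*.qvd] (qvd);   // wildcard picks up all regional QVDs

If each incremental load also stores a Region field into its QVD, the merged table keeps track of which server every row came from.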
Example:
I renamed your doc to JW_Sample-QV-Script B.qvw.
I made an include file, ConnB.txt, containing just one line (replace it with a connection to a regional server), in the same folder:
TRACE I connect to Region B;
When I reload the .qvw with just these lines (no hard-coded reference to Region A, B, C...):
LET region=right(subfield(DocumentName(), '.'), 1); // region from .qvw name
LET vQvdFile = 'Region$(region).QVD';
TRACE $(region);
TRACE $(vQvdFile);
$(Must_Include=Conn$(region).txt);
I get the B region, the B qvd, and the B connection. The only difference is in the name of the .qvw.
5 .qvw docs, 5 include files, 5 regions.
If I understand Bill's suggestion, it's one step better: he suggests (I hope I've understood; Bill, correct me if I'm wrong) using one .qvw and passing the parameter with the Publisher (instead of 5 different .qvw docs).
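A hedged sketch of that one-.qvw variant, assuming the Publisher reload task can supply the region variable (with a fallback for desktop testing):

// If Publisher did not set the variable, default to one region for desktop tests:
IF '$(region)' = '' THEN
    LET region = 'A';
END IF

LET vQvdFile = 'Region$(region).QVD';
TRACE $(region);
$(Must_Include=Conn$(region).txt);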
Thanks Bill!
I haven't implemented it yet, but I now understand what to do.
Also a big thank you to maxgro for introducing me to the insanely useful TRACE command, providing an example of parameter handling, and rephrasing Bill's strategy (yes, it did help!).