Usage of LOAD....SQL SELECT Statement - Qlik Community

Report Inappropriate Content · ‎2009-05-01

Hi,

Could you please tell me the advantage of LOAD ... SQL SELECT and when we need to use this?

Is that we can use the QlikView functions in the LOAD and Database functions in the SQL SELECT part as
LOAD would manipulate in QlikView and SQL SELECT would manipulate in database.? Is this correct.

Example:

LOAD floor(date(CREATION_TS)) as TS,
....

SQL SELECT MAX(CREATION_TS)
FROM sample.

The order of the LOAD and SQL SELECT can differ?correct?

Report Inappropriate Content · ‎2009-05-01

SQL SELECT is necessary for retrieving data from an external SQL data source. Using functions like MAX into the SQL statements will use for instance the ODBC driver instead of QlikView for doing the calculation. I think it is always better and perhaps faster to use these functions in QlikView but it could be different for each connection and data source type.

Report Inappropriate Content · ‎2009-05-01

Hi Mark,

You have not answered the question clearly

Is that we can use the QlikView functions in the LOAD and Database functions in the SQL SELECT part as
LOAD would manipulate in QlikView and SQL SELECT would manipulate in database.? Is this correct.

Report Inappropriate Content · ‎2009-05-01

In which case do you use LOAD without SQL SELECT?

Using database functions in SQL SELECT will not manipulate in database, they will only use the connection driver for perfroming these calculations.

Report Inappropriate Content · ‎2009-05-01

How to interpret LOAD and SQL statements:

LOAD
.....;
SQL SELECT
....;

The SQL statement is executed FIRST, passing a database data retrieval command to the database via your ODBC driver. When the data has been fetched from the database, it can be further processed by QlikView. This is what happens in the LOAD statement.

There is an extreme overlap in what data transformations a database can do for you and what QlikView can do for you. For example:

SQL SELECT left(myField,3) as FieldPrefix FROM myTable;

would ask the database to pick the 4 first characters from 'myField', call it FieldPrefix and THEN send it back to QlikView.

If you do not want to incurr additional load on your database source system, you could have instead left QlikView to do the same by the statements:

LOAD left(myField,3) as FieldPrefix;
SQL SELECT myField FROM myTable;

The database would FIRST retrieve myField, after which the data is passed UP through the preceding LOAD statement in QlikView.

The SQL statement is always below any QlikView LOAD statements. You cannot switch the order around. The way to read the scope of a full pass of data is from BOTTOM to the TOP. My hypothesis of why this is would be because QlikView needs a template of what to do once data is received PRIOR to receiving the data, thus prior to the SQL statement being issued it needs to understand the intended QlikView transformations to be performed.

Hope that made sense.

Oleg_Troyansky · ‎2009-05-01

yes, your understanding is correct - LOAD is used for QlikVIew functions and proper case renaming (some databases can only work with upper case), while SELECT is executed on the database and is using the database syntax.

For large data volumnes, it's preferable to make the initial selections, groupping and any complex calculations on the database side, to avoid bringing over and then discarding a lot of data.

Oleg

Oleg Troyansky
Call me for expert support, performance tuning, advanced training and more!

btrompetter · ‎2009-05-06

Hello Oleg,

do you loose the flexibility of QlikView if you make the grouping and calculation in the database for large data volumnes?

What are large datavolumnes for you that these option make sense?

Bastian

johnw · ‎2009-05-06

You don't lose any flexibility. If the database engine can't do a particular operation, you can still do it in QlikView. But you would generally want to do operations where they can be handled most efficiently. I don't generally worry about how large the data set is - I just try to make a habit of doing operations in the most efficient spot, even when it doesn't matter. I figure it's a good habit to be in for when it really matters. How large the database has to be for it to matter will depend on the specifics in your shop. In our shop, I'd say it only starts getting critical when we're retrieving millions of rows of data, but could probably make a noticeable difference with hundreds of thousands of rows.

In practice, by far the most common thing I do on the database side is filtering the records being retrieved. Other manipulation is quite rare for me.

Oleg_Troyansky · ‎2009-05-06

Bastian,

as usual, the answer is "it depends"

You should always load data and the lowest level that your analysis requires. If you keep this rule, you'll never lose any flexibility.

As far as "what's large data volumes" - it's all relative to your environment. For a small company with a 32 bit server, - a few millions of rows is "large". For a large company with 64-way beast with 128 GB of RAM, a few hundred of million rows is "large".

To generalize, I'd say, every time you cross the "few millions of rows" mark, you should begin worrying about performance and thinking about the most efficient way of loading and presenting data.

cheers,

Oleg

Oleg Troyansky
Call me for expert support, performance tuning, advanced training and more!

Report Inappropriate Content · ‎2010-02-11

How would you select all tables? I cant find any examples where anything other than reading from only one table is used. I need to read all tables in a database and count the number of rows in each. Example of how this works for one table is below:

Users1:

LOAD

*;

SQL

SELECT COUNT(*) from pub."_user";

Anyone done this before?

Thanks