Left join result

Olivia · 2012-04-09

Hello,

I have two tables: Temp with 753878 rows and Relations with 4120 rows.

In SQL I have a left join script:

SELECT * FROM Temp it LEFT JOIN
Relations r ON SUBSTRING(Position, 5, 2) = r.VG AND SUBSTRING(Position, 7, 2) = r.RelationNr AND r.Branch = it.Branch

In SQL, this left join returns 753881 rows.

I wrote a similar script in QlikView, with Temp and Relations stored in two QVD files. To match on all three fields, I created a common key.

So I have this:

Load *,
Branch&VG&RelationNr as RelationKey
FROM
D:\QVDs\Temp.qvd (qvd);

left join
LOAD
CostCenter,
BusinessArea,
RelationGroup,
RelationName,
Branch&VG&RelationNr as RelationKey
FROM
D:\QVDs\Relations2011_.qvd (qvd);

In QlikView, this left join returns 753887 rows.

Can someone tell me why there is a difference between these two similar joins in SQL and QlikView?

Thanks a lot!

Regards, Olivia

vijay_iitkgp · ‎2012-04-09

Hi,

Is your key numeric? I can see only one possibility: after concatenating the key values to create RelationKey, there must be duplicates.

Please try Branch&'_'&VG&'_'&RelationNr as RelationKey

Or, instead of creating a key, you can do your left join directly on the three fields:

Load
Field1,
Field2,
Branch,
VG,
RelationNr
From Temp;

Left Join
Load
Field4,
Field5,
Branch,
VG,
RelationNr
From Relation;

Hope this will help.

Olivia · 2012-04-09

My first question is why, in SQL, a left join between Temp (753878 rows) and Relations (4120 rows) returns 753881 rows and not 753878, the row count of the first table. Could it be because Relations contains duplicates?

Jason_Michaelides · ‎2012-04-09

Exactly. The LEFT JOIN means you won't get any records from Relations that don't join to a row in Temp. However, if two different rows from Relations join to the same row in Temp, that row will be duplicated - once for each of the two different Relations values.
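To make the row duplication concrete, here is a minimal Python sketch of left-join semantics. The `left_join` helper and the sample rows are made up for illustration and are not taken from the Temp or Relations tables; the point is only that a duplicated key on the right side makes the result longer than the left table.

```python
def left_join(left_rows, right_rows, key):
    """Left join: keep every left row, repeated once per matching right row."""
    # Index the right-hand rows by key so duplicates are grouped together.
    index = {}
    for row in right_rows:
        index.setdefault(row[key], []).append(row)
    # Columns that exist only on the right side (filled with None when unmatched).
    right_cols = {col for row in right_rows for col in row} - {key}

    result = []
    for row in left_rows:
        matches = index.get(row[key])
        if not matches:
            result.append({**row, **{col: None for col in right_cols}})
        else:
            for match in matches:
                result.append({**row, **match})
    return result


# Hypothetical sample data.
temp = [
    {"RelationKey": "A", "Amount": 10},
    {"RelationKey": "B", "Amount": 20},
    {"RelationKey": "C", "Amount": 30},
]
relations = [
    {"RelationKey": "A", "RelationName": "first"},
    {"RelationKey": "A", "RelationName": "second"},  # duplicate key
    {"RelationKey": "Z", "RelationName": "unused"},  # no match in temp, so dropped
]

joined = left_join(temp, relations, "RelationKey")
print(len(temp), len(joined))  # 3 4
```

The row with key "A" appears twice in the output, once per matching Relations row, so three input rows become four.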

vijay_iitkgp · ‎2012-04-09

No, because if Relations contained duplicates, the result would be the same in SQL as well. But it is possible that RelationKey is duplicated.

E.g.:

If Branch = 12, VG = 315 and RelationNr = 26

and

if Branch = 123, VG = 152 and RelationNr = 6

In both cases the key is 1231526.

Hope this will help.

Just try the left join without creating a key.
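The collision in this example can be checked with a short Python sketch. The function names `plain_key` and `delimited_key` are hypothetical; they mimic `Branch&VG&RelationNr` and `Branch&'_'&VG&'_'&RelationNr` respectively, assuming the fields are concatenated as plain text.

```python
def plain_key(branch, vg, relation_nr):
    """Concatenate the three fields with no separator."""
    return f"{branch}{vg}{relation_nr}"


def delimited_key(branch, vg, relation_nr, sep="_"):
    """Concatenate the three fields with a separator between them."""
    return sep.join(str(part) for part in (branch, vg, relation_nr))


a = (12, 315, 26)
b = (123, 152, 6)

print(plain_key(*a), plain_key(*b))          # 1231526 1231526
print(delimited_key(*a), delimited_key(*b))  # 12_315_26 123_152_6
```

Without a separator, two different field combinations collapse into one key; with a separator that never occurs in the field values, they stay distinct.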

Olivia · 2012-04-09

I have already tried it without creating the key, and the result is the same.

I created a key with the separator "_" and saw that I have exactly 9 duplicates in 5 rows. I will try to solve the duplicates problem and see what happens.

Thank you all!

Have a nice day! Olivia
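One way to hunt for such duplicates outside QlikView is to count how often each (Branch, VG, RelationNr) combination occurs in the lookup table. The sketch below is only an illustration: the helper names and the sample rows are invented, and it assumes the tables have been exported as lists of dictionaries.

```python
from collections import Counter

KEY_FIELDS = ("Branch", "VG", "RelationNr")


def key_of(row):
    """Build the join key as a tuple, which cannot collide the way a joined string can."""
    return tuple(row[field] for field in KEY_FIELDS)


def duplicate_keys(rows):
    """Return the keys that occur more than once, with their counts."""
    counts = Counter(key_of(row) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


def extra_join_rows(left_rows, right_rows):
    """Number of extra rows a left join adds because of duplicated right-side keys."""
    counts = Counter(key_of(row) for row in right_rows)
    return sum(counts[key_of(row)] - 1 for row in left_rows if counts[key_of(row)] > 1)


# Hypothetical sample data.
relations = [
    {"Branch": 1, "VG": 10, "RelationNr": 1},
    {"Branch": 1, "VG": 10, "RelationNr": 1},
    {"Branch": 1, "VG": 10, "RelationNr": 1},
    {"Branch": 2, "VG": 20, "RelationNr": 5},
]
temp = [
    {"Branch": 1, "VG": 10, "RelationNr": 1},
    {"Branch": 1, "VG": 10, "RelationNr": 1},
    {"Branch": 2, "VG": 20, "RelationNr": 5},
    {"Branch": 3, "VG": 30, "RelationNr": 7},
]

print(duplicate_keys(relations))         # {(1, 10, 1): 3}
print(extra_join_rows(temp, relations))  # 4
```

Each left row that hits a key occurring n times on the right contributes n - 1 extra rows, which is how a handful of duplicated keys can explain a small surplus in the joined row count.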

Jason_Michaelides · ‎2012-04-09

Why are you trying to replicate this functionality in QV anyway? Unless you have a good reason to, let SQL do what it is good at, which is basic joins to present a limited subset of data. Otherwise, you will slow your load script down unnecessarily. Take the following situation:

Table TempIT in your SQL database has 50,000 rows.

Table Relations in your db has 100,000 rows.

Only 25,000 rows in Relations are linked to TempIT.

LOAD *;
SQL SELECT * FROM TempIT LEFT JOIN Relations ON....;

This will take x seconds to run and will present 50,000 rows (plus any duplicate joins) to QlikView, which is all it needs. SQL deals with basic joins very efficiently, so database load shouldn't be a worrisome factor.

However, if you load the entire TempIT table into a QVD, then load the entire Relations table into a second QVD, then use QlikView to join them, it first has to load both tables then perform the joins. This will most likely take longer than letting SQL join the data and QV needs to deal with 150,000 rows of data whereas it only really needs to see 50,000. Plus you are asking SQL to return 150,000 rows of data instead of only 50,000.

So, unless you NEED all the Relations data inside a QVD file (maybe for other uses - even then I would question whether you should use it here) then let SQL do what it is good at (basic stuff) and do the clever bits in QV.

Jason
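The difference in transferred rows can be sketched with Python's built-in `sqlite3` module as a stand-in for the SQL database. Everything here is a toy assumption: the table sizes are scaled down to 5 and 10 rows, and the `id`, `rel_id` and `name` columns are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TempIT (id INTEGER, rel_id INTEGER)")
conn.execute("CREATE TABLE Relations (rel_id INTEGER, name TEXT)")

# 5 rows in TempIT, 10 rows in Relations; only 5 Relations rows are linked.
conn.executemany("INSERT INTO TempIT VALUES (?, ?)", [(i, i) for i in range(5)])
conn.executemany("INSERT INTO Relations VALUES (?, ?)", [(i, f"rel{i}") for i in range(10)])

# Strategy 1: let the database do the join and fetch only the result.
joined = conn.execute(
    "SELECT t.id, r.name FROM TempIT t "
    "LEFT JOIN Relations r ON r.rel_id = t.rel_id"
).fetchall()

# Strategy 2: fetch both tables in full and join them on the client side later.
temp_rows = conn.execute("SELECT * FROM TempIT").fetchall()
relation_rows = conn.execute("SELECT * FROM Relations").fetchall()

rows_joined_in_db = len(joined)
rows_pulled_separately = len(temp_rows) + len(relation_rows)
print(rows_joined_in_db, rows_pulled_separately)  # 5 15

conn.close()
```

The join in the database hands over 5 rows, while pulling both tables hands over 15, which mirrors the 50,000 versus 150,000 comparison above at a smaller scale.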

Olivia · 2012-04-10

Hello,

Why am I trying to replicate this in QV?

Mostly because the data in SQL are imported manually, and this is what we want to eliminate first.

We also have several other systems that deliver data daily or monthly as csv, txt or xls files, and we want to combine them, which is something we cannot do in SQL at the moment.

If someone has a better idea or knows a better method, please tell me.

Olivia

Jason_Michaelides · ‎2012-04-10

If you're dealing with non-DB files then a structure of QVD files may well make sense. Your example above speaks of 2 SQL tables with a known join however, so I was referring to that situation with my comment above.

Hope this helps,

Jason
