Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
So, I'm trying to load data from two different databases, and can't figure it out.
We'll say DB1 and DB2, which both have the field Serial Number, but DB1 has the field Customer Name and DB2 has the field Expiration Date.
What I want is to select all three columns, but only those with a serial number in DB1. I have to do it in the database query, as DB1 has ~300k rows, while DB2 has 10 million.
I can't figure out how to only get the serial numbers and expiration dates from DB2 where the serial number is in DB1.
Help?
Hi,
Did you try inner join yet? See sample load script below
[DB1]:
LOAD
[Serial Number] AS [_SerialNo],
[Field1];
Select [Serial Number],[Field1] From DB1.Table1;
[DB2]:
Inner Join ([DB1])
LOAD
[Serial Number] AS [_SerialNo],
[Customer Name],
[Expiration Date];
Select [Serial Number],[Customer Name],[Expiration Date] From DB2.Table2;
Regards,
Sokkorn
That's a sticky problem. I understand why you don't want to pull down all 10M rows from DB2 and filter them with a WHERE EXISTS([Serial Number]) or INNER JOIN.
Here are a few possibilites to think about.
1. After selecting the Serial Number values from DB1, use CONCAT to format a SELECT for DB2 that in uses a WHERE IN() clause for the values you want to select. The practicality of this would be dependent on how many unique Serial Number values you have and the limits of the DB2 driver. If you hae a small amount of values, it may work.
2. After selecting the Serial Number values from DB1, SQL INSERT them into a temp table in DB2 and reference the temp table in a subquery to qualify rows. QV script does allow for SQL INSERT. To make this practical you would probably have to use a bulk insert strategy. You didn't mention what kind of database DB2 is so I don't have a specific example of how to do that.
3. 10M rows isn't so big if you do incremental loads. A third possibilty would be to maintain an incremental reloaded QVD of the DB2 table and then load from that QVD with a WHERE EXISTS() clause.
-Rob
You said you "have to do it in the database query"; I'm guessing to avoid pulling 10m records, so put the join logic in your SQL query.
Sudo Code:
[QlikViewTableName]
LOAD *;
SQL
SELECT
DB1T1.SerialNumber,
DB1T1.CustomerNumber,
DB2T2.ExpirationDate
FROM DB1.Table1 AS DB1T1
INNER JOIN DB2.Table2 AS DB2T2
ON
DB1T1.SerialNumber = DB2T2.SerialNumber;
Mitch
Let's clarify. When you said "two different databases" did you mean two different servers (my assumption) or two databases in the same database server -- i.e. sharing the same connection string.
-Rob
I did mean two different servers.
I'm trying all of these solutions now, I'll get back to you
So, your answer worked... but it didn't quite solve my problem. Here's what I worked out with your help:
CONNECT TO [...];
LOAD
serial AS SerialNo;
Select serial From table1;
CONNECT TO [...];
Inner Join (eshistory)
LOAD
serialnumber AS SerialNo;
Select Distinct serialnumber From table2;
The resultant serial numbers are correct... but it still loads all of table2 (distinct cuts it down to ~2million, but for only 200k actual used values). Is there a way to make a Where clause in the second Select statement, something like: "Select Distinct serialnumber From table2 Where serialnumber=SerialNo"?
Ya mine won't work if they are seperate servers; only if they are in the same db, but different schemas.
To be honest, I've just started using qlikview and sql, so I'm not exactly sure how to do... well, all of that. I get the theory, or at least most of it, but I'm not sure how use CONCAT to format SELECT. Could you give an example for #1? #2 and #3 I think won't work for me
So, I've been trying #1:
(already loaded serial numbers from DB1)
Set vSerials=Concat([Serial Numbers], ',');
CONNECT TO [DB2];
LOAD
serialnumber AS [Serial Numbers];
SELECT serialnumber,
expirationdate
From table2
WHERE serialnumber IN(vSerials);
And it tells me vSerials doesn't exist as a column.