Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I am currently building an application that involves very large data sets along with quite a complex lookup. I am not quite sure how to go about this and would be grateful for any advice/help.
Table A is the fixed table that I want to extract data from.
Table B is the table I want to map data onto; this table will be very large.
Table A
cov | type | us | range1 | range2 | 1 | 2 | 3 | 4 | date1 | date2 |
C | G | O | 0.1 | 18 | 6000 | 6100 | 6200 | 6300 | 01/01/2012 | 31/03/2012 |
C | G | O | 0.1 | 18 | 1000 | 1100 | 1200 | 1300 | 01/04/2012 | 31/12/9999 |
C | V | U | 18.1 | 40 | 2000 | 2100 | 2200 | 2300 | 01/01/2012 | 31/12/9999 |
C | V | U | 40.1 | 50 | 3000 | 3100 | 3200 | 3300 | 01/01/2012 | 31/12/9999 |
T | G | U | 0.1 | 18 | 700 | 710 | 720 | 730 | 01/01/2012 | 31/12/9999 |
T | V | O | 18.1 | 40 | 800 | 810 | 820 | 830 | 01/01/2012 | 31/12/9999 |
T | V | O | 40.1 | 50 | 900 | 910 | 920 | 930 | 01/01/2012 | 31/12/9999 |
O | G | U | 0.1 | 18 | 100 | 110 | 120 | 130 | 01/01/2012 | 31/12/9999 |
O | G | O | 18.1 | 40 | 200 | 210 | 220 | 230 | 01/01/2012 | 31/12/9999 |
O | V | U | 40.1 | 50 | 300 | 310 | 320 | 330 | 01/01/2012 | 31/12/9999 |
Table B
Key | Cov | Type | Us | Size | Area | Date |
1 | C | G | O | 16 | 1 | 01/02/2012 |
2 | C | G | O | 16 | 1 | 01/05/2012 |
3 | O | G | O | 20 | 2 | 01/01/2012 |
4 | C | V | U | 25 | 3 | 01/07/2012 |
5 | C | V | U | 42 | 1 | 01/05/2012 |
In essence I am trying to lookup based on:
Cov = cov
Type = type
Us = us
Size being between range1 and range2
Area being equal to column 1, 2, 3, or 4
Date being between date1 and date 2
The data that needs to be returned is contained within columns 1, 2, 3 and 4 to give th following result:
Key | Cov | Type | Us | Size | Area | Date | Lookup |
1 | C | G | O | 16 | 1 | 01/02/2012 | 6000 |
2 | C | G | O | 16 | 1 | 01/05/2012 | 1000 |
3 | O | G | O | 20 | 2 | 01/01/2012 | 210 |
4 | C | V | U | 25 | 3 | 01/07/2012 | 2200 |
5 | C | V | U | 42 | 1 | 01/05/2012 | 3000 |
Once again any help would be appreciated on this as i'm not really sure of the best way to start.
Many thanks
Ted,
I think you can do this using intervalmatch. You need to to the matching more than once, since you are coping with date and size intervals.
It could then work like this (I needed to recreate some area values because I failed loading your column names '1', '2',... correctly from the WEB):
LOOKUP:
LOAD cov,
type,
us,
range1,
range2,
100+floor(RAND()*10) as A1,
200+floor(RAND()*10) as A2,
300+floor(RAND()*10) as A3,
400+floor(RAND()*10) as A4,
date1,
date2
FROM
[http://community.qlik.com/thread/55783?tstart=0]
(html, codepage is 1252, embedded labels, table is @1);
TOMATCH:
LOAD Key,
Cov,
Type,
Us,
Size,
Area,
Date
FROM
[http://community.qlik.com/thread/55783?tstart=0]
(html, codepage is 1252, embedded labels, table is @2);
inner join (TOMATCH) IntervalMatch (Size, Cov, Type, Us) LOAD range1, range2, cov as Cov, type as Type, us as Us Resident LOOKUP;
inner join (TOMATCH) Intervalmatch (Date, Cov, Type, Us) LOAD date1, date2, cov as Cov, type as Type, us as Us Resident LOOKUP;
left join (TOMATCH) LOAD cov as Cov, type as Type, us as Us, range1, range2, date1, date2, A1, A2, A3, A4 Resident LOOKUP;
drop table LOOKUP;
RESULT:
NoConcatenate LOAD *, pick(Area, A1,A2,A3,A4) as Result Resident TOMATCH;
drop table TOMATCH;
//drop fields date1, date2, range1, range2, A1,A2,A3,A4;
Hope this helps,
Stefan
Ted,
I think you can do this using intervalmatch. You need to to the matching more than once, since you are coping with date and size intervals.
It could then work like this (I needed to recreate some area values because I failed loading your column names '1', '2',... correctly from the WEB):
LOOKUP:
LOAD cov,
type,
us,
range1,
range2,
100+floor(RAND()*10) as A1,
200+floor(RAND()*10) as A2,
300+floor(RAND()*10) as A3,
400+floor(RAND()*10) as A4,
date1,
date2
FROM
[http://community.qlik.com/thread/55783?tstart=0]
(html, codepage is 1252, embedded labels, table is @1);
TOMATCH:
LOAD Key,
Cov,
Type,
Us,
Size,
Area,
Date
FROM
[http://community.qlik.com/thread/55783?tstart=0]
(html, codepage is 1252, embedded labels, table is @2);
inner join (TOMATCH) IntervalMatch (Size, Cov, Type, Us) LOAD range1, range2, cov as Cov, type as Type, us as Us Resident LOOKUP;
inner join (TOMATCH) Intervalmatch (Date, Cov, Type, Us) LOAD date1, date2, cov as Cov, type as Type, us as Us Resident LOOKUP;
left join (TOMATCH) LOAD cov as Cov, type as Type, us as Us, range1, range2, date1, date2, A1, A2, A3, A4 Resident LOOKUP;
drop table LOOKUP;
RESULT:
NoConcatenate LOAD *, pick(Area, A1,A2,A3,A4) as Result Resident TOMATCH;
drop table TOMATCH;
//drop fields date1, date2, range1, range2, A1,A2,A3,A4;
Hope this helps,
Stefan
Many thanks for this, it seems to do just the job.
I will scale it up to the large database and see what happens.
Hi
This will also produce the result table, but Stefan's might be a litle nicer:
Table_A:
LOAD cov,
type,
us,
range1,
range2,
1 as Column,
[1] as Lookup,
date1,
date2
FROM [230217.xlsx] (ooxml, embedded labels, table is A);
LOAD cov,
type,
us,
range1,
range2,
2 as Column,
[2] as Lookup,
date1,
date2
FROM [230217.xlsx] (ooxml, embedded labels, table is A);
LOAD cov,
type,
us,
range1,
range2,
3 as Column,
[3] as Lookup,
date1,
date2
FROM [230217.xlsx] (ooxml, embedded labels, table is A);
LOAD cov,
type,
us,
range1,
range2,
4 as Column,
[4] as Lookup,
date1,
date2
FROM [230217.xlsx] (ooxml, embedded labels, table is A);
// Load Table B
Table_B:
LOAD Key,
Cov,
Type,
Us,
Size,
Area,
Date
FROM [230217.xlsx] (ooxml, embedded labels, table is B);
// Join Tabel A data on, and afterwards keep the rows that fulfill the criterias
Left join (Table_B)
Load cov as Cov,
type as Type,
us as Us,
Column as Area,
Lookup,
range1, range2, date1, date2
resident Table_A;
NoConcatenate
Table_B_Final:
Load *
resident Table_B
where
Size >= range1 and
Size <= range2 and
Date >= date1 and
Date <= date2;
drop fields range1, range2, date1, date2 from Table_B_Final;
drop tables Table_A, Table_B;
How to attach?
/Nicolai