Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello,
This is a follow-up question from: http://community.qlik.com/forums/t/33999.aspx
I realized after all that was said and done, I realized I needed another column in there. A here's a brief look at what it looked like originally:
Taxonomy Category DoS to DoOS DoOS to Do1NA Do1NA to Do1O ....etc
Cardiology 2 1 3 ....etc
Ophthalmology 1 3 6 ...etc
Orthopedics 6 8 9 ....etc
What it should've been when it was transformed:
Variables Days Taxonomy Category
DoS to DoOS 2 Cardiology
DoS to DoOS 1 Ophthalmology
DoS to DoOS 6 Orthopedics
DoOS to Do1NA 1 Cardiology
DoOS to Do1NA 3 Ophthalmology
DoOS to Do1NA 8 Orthopedics
Do1NA to Do1O 3 Cardiology
Do1NA to Do1O 6 Ophthalmology
Do1NA to Do1O 9 Orthopedics
Here's the code i've been tinkering with in an attempt to just add the column onto the first with it's ID key:
DaysTable:
CROSSTABLE (TimeFrame, Days)
LOAD recno() as ID,
[DoS to DoOS],
[DoOS to Do1NA],
[Do1NA to Do1O],
[Do1O to Do1PA],
[DoS to D1RAR],
[Do1PA to D1RAR],
[DoS to Do1CI]
Resident Lag;
TaxTable:
CROSSTABLE (TaxCat,Count)
Load recno() as ID,
[Taxonomy Category]
Resident Lag;
Try:
ADD
Load * Resident Lag;
Outer Join Load * Resident TaxTable;
When I load the data, here's what the script executioner says:
Lag << Lag 308,352 lines fetched
DaysTable << Lag 2,006,597 lines fetched
Lag << Lag 308,352 lines fetched
TaxTable << Lag 308,352 lines fetched
Lag << Lag 616,704 lines fetched *Wrong
Try << TaxTable 308,352 lines fetched *Wrong
$Syn 1 = Taxonomy+Taxonomy Category
The Lag table should continue to have 308,352 entries, and the try table should have the same number of lines as the DaysTable (2,006,597).
Ideally, what I probably should have done, was have the [Taxonomy code] Loaded with the DaysTable and avoid the merge altogether.
Any idea what's going wrong and how can I fix it?
Thanks!
The ID being generated is the recno() of the OUTPUT table in each case, not the record number of the Lag table itself. So in the DaysTable, you have 2,006,597 IDs, while in the TaxTable you only have 308,352 IDs. Therefore, when you outer join them, you're joining to only the first 308,352 records in the DaysTable, and they're the wrong records - just whatever happens to come first. (Edit: I think I got that a little wrong. The outer join would have worked fine, as it would have joined by Taxonomy Category. But then the resulting data model would associate the wrong Taxonomy categories with the data in the DaysTable because of the way the IDs don't line up. Same bad result, but slightly different process to get there.)
The solution is to include the Taxonomy Category in the original load, as you said, and avoid the merge altogether. That just takes an additional parameter in the crosstable() to tell it that you have TWO fields (instead of the default of one) that come before the data it should switch from columns to rows. Like this:
DaysTable:
CROSSTABLE (TimeFrame, Days, 2)
LOAD recno() as ID,
[Taxonomy Category],
[DoS to DoOS],
[DoOS to Do1NA],
[Do1NA to Do1O],
[Do1O to Do1PA],
[DoS to D1RAR],
[Do1PA to D1RAR],
[DoS to Do1CI]
Resident Lag;
The ID being generated is the recno() of the OUTPUT table in each case, not the record number of the Lag table itself. So in the DaysTable, you have 2,006,597 IDs, while in the TaxTable you only have 308,352 IDs. Therefore, when you outer join them, you're joining to only the first 308,352 records in the DaysTable, and they're the wrong records - just whatever happens to come first. (Edit: I think I got that a little wrong. The outer join would have worked fine, as it would have joined by Taxonomy Category. But then the resulting data model would associate the wrong Taxonomy categories with the data in the DaysTable because of the way the IDs don't line up. Same bad result, but slightly different process to get there.)
The solution is to include the Taxonomy Category in the original load, as you said, and avoid the merge altogether. That just takes an additional parameter in the crosstable() to tell it that you have TWO fields (instead of the default of one) that come before the data it should switch from columns to rows. Like this:
DaysTable:
CROSSTABLE (TimeFrame, Days, 2)
LOAD recno() as ID,
[Taxonomy Category],
[DoS to DoOS],
[DoOS to Do1NA],
[Do1NA to Do1O],
[Do1O to Do1PA],
[DoS to D1RAR],
[Do1PA to D1RAR],
[DoS to Do1CI]
Resident Lag;
Thanks for you help John.