Skip to main content
Announcements
Qlik Introduces a New Era of Visualization! READ ALL ABOUT IT
cancel
Showing results for 
Search instead for 
Did you mean: 
rmainhart
Contributor II
Contributor II

Avoiding circular reference

I am searching for an elegant solution to a circular reference problem.

Table [SDE]: (~500K records)

UIN | Subject Code | ... (about 70 other fields)

Table [SDBQ]: (~20M records)

UIN | Reporting Category | ... (about 12 other fields)

Table SDBQ has multiple rows per UIN, but not every UIN in Table SDE is also in Table SDBQ. 

I need to introduce a field, [Reporting Category Descriptor], which depends on both [Subject Code] (from the SDE table) and [Reporting Category] (from the SDBQ table):

Table [Crosswalk]:

Subject Code | Reporting Category | Reporting Category Descriptor

This, of course, creates a circular reference.

I'm thinking a clever use of ApplyMap and creating a key with [Subject Code] and [Reporting Category] is part of a solution, but how to create a key from values from two tables?  Or is my best solution to join the SDE and SDBQ tables to begin with? (Given the scale of the tables, I'm hoping to not have to do that -- I tried, and it more than doubles the load script time.)

Labels (2)
1 Solution

Accepted Solutions
rmainhart
Contributor II
Contributor II
Author

It's great when solutions present themselves.  Solution below added only 4 seconds to the load time, compared to joining the two main data tables, which added nearly 90 seconds. 

Here is what I did, in case it is helpful to others:

SET vPipe = '|';

// load the crosswalk table as a mapping table, concatenating the two key fields into a single key

[Mapping_ReportingCategoryDescriptor]:
Mapping Load
[Subject Code] & '$(vPipe)' & [Reporting Category] as Key_SubjectCodeReportingCategory,
// [Subject Code],
// [Reporting Category],
[Reporting Category Descriptor]
FROM [$(vPath)/ReportingCategories.xlsx]
(ooxml, embedded labels, table is Sheet1)

// load the first main data table

[SDE]:

Load *
From '[$(vPath)/SDE.qvd]' (qvd);

// here's the clever(?) bit: a mapping table from a resident table

[Mapping_SubjectCode]:
Mapping Load
UIN,
[Subject Code]
Resident SDE;

// load the second main data table

[Temp-SDBQ]:

Load *
From '[$(vPath)/SDBQ.qvd]' (qvd);  // optimized load

// apply the mapping tables and add the new field to the second main data table

[SDBQ]:

Load *,
ApplyMap('Mapping_SubjectCode', UIN) as [SDBQ_Subject Code], // for debugging purposes
ApplyMap('Mapping_ReportingCategoryDescriptor', ApplyMap('Mapping_SubjectCode', UIN) & '$(vPipe)' & Num([Reporting Category])) as [Reporting Category Descriptor] // note the nested ApplyMap() functions
Resident [Temp-SDBQ]
;

Drop Table [Temp-SDBQ];

View solution in original post

1 Reply
rmainhart
Contributor II
Contributor II
Author

It's great when solutions present themselves.  Solution below added only 4 seconds to the load time, compared to joining the two main data tables, which added nearly 90 seconds. 

Here is what I did, in case it is helpful to others:

SET vPipe = '|';

// load the crosswalk table as a mapping table, concatenating the two key fields into a single key

[Mapping_ReportingCategoryDescriptor]:
Mapping Load
[Subject Code] & '$(vPipe)' & [Reporting Category] as Key_SubjectCodeReportingCategory,
// [Subject Code],
// [Reporting Category],
[Reporting Category Descriptor]
FROM [$(vPath)/ReportingCategories.xlsx]
(ooxml, embedded labels, table is Sheet1)

// load the first main data table

[SDE]:

Load *
From '[$(vPath)/SDE.qvd]' (qvd);

// here's the clever(?) bit: a mapping table from a resident table

[Mapping_SubjectCode]:
Mapping Load
UIN,
[Subject Code]
Resident SDE;

// load the second main data table

[Temp-SDBQ]:

Load *
From '[$(vPath)/SDBQ.qvd]' (qvd);  // optimized load

// apply the mapping tables and add the new field to the second main data table

[SDBQ]:

Load *,
ApplyMap('Mapping_SubjectCode', UIN) as [SDBQ_Subject Code], // for debugging purposes
ApplyMap('Mapping_ReportingCategoryDescriptor', ApplyMap('Mapping_SubjectCode', UIN) & '$(vPipe)' & Num([Reporting Category])) as [Reporting Category Descriptor] // note the nested ApplyMap() functions
Resident [Temp-SDBQ]
;

Drop Table [Temp-SDBQ];