Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

How to do an Inner Keep

Hi,

I'm pretty new to Qlik Sense and would like some help with the data load editor, specifically on how to perform an Inner Keep.  Unfortunately, hours of searching the web for specific examples of doing an inner join only revealed the same simple two table "joins" where the Keep statement always followed the loading of the first table.  My situation is more complicated as I would like to only keep records that are in the tables labeled [ERLINKIDS], [Employer-Level Data], and in either [ONE], [TWO], or [THR].  Note:  [ONE], [TWO], and [THR] contain subsets of a 1.5M record file that was too big to load in an Excel or .csv file.  (BTW, I don't know why Qlik added the following code at the end:  RENAME TABLE [ONE] TO [ONE-TWO];.  First, I don't know it's purpose.  Second, why wasn't it named something like [ONE-THR]?  If anyone can answer these questions, I'd appreciate it.)

Here is my code:

[ERLINKIDS]:

LOAD [ACTIVE],

  [MCP],

  [MCPNAME],

  [LINKNAME] AS [ERLINKIDS.LINKNAME],

  [SOURCE],

  [ERLINKID],

  [CITY],

  [ST],

  [DIV],

  [SIC4],

  [KEEP],

  [LSTATE],

  [TYPE],

  [LEAID],

  [K12EMPL],

  [LEAID2],

  [K12EMPL2],

  [DUNS],

  [BUSNAME],

  [SECNAME],

  [EEsTotal],

  [EesHere],

  [SIC4DB],

  [GOVTIDNUM],

  [GovtType],

  [GovtDescription],

  [County],

  [DataFuncCD],

  [FTEes],

  [PTEes],

  [FTEs],

  [DataYr],

  [LINKNAME]&[ERLINKID] AS [ERLINKIDS.LINKNAME-ERLINKID-Employer-Level Data.LINKNAME-Employer-Level Data.ERLINKID]

FROM [lib://Qlik Stuff/ERLINKIDS.csv]

(txt, codepage is 1252, embedded labels, delimiter is ',', msq)

where [ACTIVE] = 1 and ([DIV]='ESD' or [DIV]='GSD') and ([NICHENL]='AUTO' OR [NICHENL]='EDUC' OR [NICHENL]='MUNICO'

OR [NICHENL]='HEALTH' OR [NICHENL]='HLTOTH' OR [NICHENL]='HLTSNF' OR [NICHENL]='UNICOL' OR [NICHENL]='OTHER');

[EERanges]:

Load * Inline [EERanges, MinLimit, MaxLimit

N/A, ., 0

1-25, 1, 25

26-50, 26, 50

51-100, 51, 100

101-250, 101, 250

251-500, 251, 500,

501-1000, 501, 1000

1001-2500, 1001, 2500

2501-5000, 2501, 5000

5001-10000, 5001, 10000

Over 10000, 10001, 1000000];

[Employer-Level Data]:

LOAD [LINKNAME] AS [Employer-Level Data.LINKNAME],

  [ERLINKID] AS [Employer-Level Data.ERLINKID],

  [ZIP] AS [Zip],

  [PUBLIC],

  [NICHENL],

  [SCHDIST],

  [NONSCH],

  [TOTALEES],

    [STATUS] AS [Employer Status],

  [LINKNAME]&[ERLINKID] AS [ERLINKIDS.LINKNAME-ERLINKID-Employer-Level Data.LINKNAME-Employer-Level Data.ERLINKID]

FROM [lib://Qlik Stuff/Employer-Level Data.csv]

(txt, codepage is 1252, embedded labels, delimiter is ',', msq)

where ([NICHENL]='AUTO' OR [NICHENL]='EDUC' OR [NICHENL]='MUNICO'

OR [NICHENL]='HEALTH' OR [NICHENL]='HLTOTH' OR [NICHENL]='HLTSNF' OR [NICHENL]='UNICOL' OR [NICHENL]='OTHER')

AND [STATUS] = 'A' AND [TOTALEES] > 0;

Left Join([Employer-Level Data]) IntervalMatch (TOTALEES) LOAD MinLimit, MaxLimit Resident EERanges;

Left Join([Employer-Level Data])

Load MinLimit, MaxLimit, EERanges Resident EERanges;

DROP table EERanges;

[CUSTMCH]:

LOAD [MCH_NAME] as [MCH Name],

  [MCH]

FROM [lib://Qlik Stuff/CUSTMCH.csv]

(txt, codepage is 1252, embedded labels, delimiter is ',', msq);

[ONE]:

LOAD [MCP],

  [MCH],

  [ACCT],

  [PRODCODE],

  [DTYPE],

  [KIND],

  [SERIES],

  [POLS],

  [NBILL]

FROM [lib://Qlik Stuff/PRODPART_ACCTS1.xlsx]

(ooxml, embedded labels, table is ONE);

[TWO]:

LOAD [MCP],

    [MCH],

  [ACCT],

  [PRODCODE],

  [DTYPE],

  [KIND],

  [SERIES],

  [POLS],

  [NBILL]

FROM [lib://Qlik Stuff/PRODPART_ACCTS2.xlsx]

(ooxml, embedded labels, table is TWO);

[THR]:

LOAD [MCP],

  [MCH],

  [ACCT],

  [PRODCODE],

  [DTYPE],

  [KIND],

  [SERIES],

  [POLS],

  [NBILL]

FROM [lib://Qlik Stuff/PRODPART_ACCTS3.xlsx]

(ooxml, embedded labels, table is THR);

[zipcode]:

LOAD [ZipNumber],

  [ZipText] AS [Zip],

  [Latitude],

  [Longitude],

    GeoMakePoint(Latitude,Longitude) as ZipLocation

FROM [lib://Qlik Stuff/Zip Codes.xlsx]

(ooxml, embedded labels, table is zipcode);

[ST to STATE]:

Load [State],

    [ST]

FROM [lib://Qlik Stuff/ST to STATE.xlsx]

(ooxml, embedded labels, table is Sheet1);

   

[cb_2014_us_state_500k/cb_2014_us_state_500k]:

LOAD [US State Boundaries.Name] as [State],

  [US State Boundaries.Point],

  [US State Boundaries.Area]

FROM [lib://Qlik Stuff/US State Boundaries.kml]

(kml, Table is [cb_2014_us_state_500k/cb_2014_us_state_500k]);

RENAME TABLE [ONE] TO [ONE-TWO];

Thanks,

Bryan

5 Replies
Lech_Miszkiewicz
Partner Ambassador/MVP
Partner Ambassador/MVP

Hi Bryan,

Just by looking at this code it looks like some sort of autogenerated code from "bubbles" in QlikSense. If you are saying you are newbe to QlikSense i doubth you would be able to create this logic ith joins and IntervalMatch.

I know bubbles are a "briliant" feature but they just make a mess when you need to design something more complex than simple relationship.

Would you be able to attach those files (maybe reduce data) so we could work of it and just in few words explain how you would like to link those tables.

This code above is not a usual way developers write code. It is autogenerated with some algorithm.

We would be able to help more when having some data so then Left / Right / Inner Keep or Join examples could be presented to you.

regards

Lech

cheers Lech, When applicable please mark the correct/appropriate replies as "solution" (you can mark up to 3 "solutions". Please LIKE threads if the provided solution is helpful to the problem.
jonathandienst
Partner - Champion III
Partner - Champion III

I think you need to load using the following strategy:

Load ONE

CONCATENATE Load TWO

CONCATENATE Load THREE

INNER KEEP Load ERLINKIDS

Load EERanges

Load Employer-Level Data

Interval match

Load other tables

Logic will get you from a to b. Imagination will take you everywhere. - A Einstein
Not applicable
Author

Hi, Jonathan,

I tried your suggestion but I’m afraid it didn’t produce the desired results.  (The code is below just in case I didn't do it as you meant me to do it.)  Please let me know if you have another suggestion.


Thanks,

Bryan


[ONE]:

LOAD [MCP],

  [MCH],

  [ACCT],

  ...,

  [NBILL]

FROM [lib://Qlik Stuff/PRODPART_ACCTS1.xlsx]

(ooxml, embedded labels, table is ONE);

[TWO]:

concatenate LOAD [MCP],

    [MCH],

  [ACCT],

  ...,

  [NBILL]

FROM [lib://Qlik Stuff/PRODPART_ACCTS2.xlsx]

(ooxml, embedded labels, table is TWO);

[THR]:

concatenate LOAD [MCP],

  [MCH],

  [ACCT],

  ...,

  [NBILL]

FROM [lib://Qlik Stuff/PRODPART_ACCTS3.xlsx]

(ooxml, embedded labels, table is THR);

[ERLINKIDS]:

inner keep LOAD [ACTIVE],

  [MCP],

  [LINKNAME] AS [ERLINKIDS.LINKNAME],

  [SOURCE],

  [ERLINKID],

  ...,

  [LINKNAME]&[ERLINKID] AS [ERLINKIDS.LINKNAME-ERLINKID-Employer-Level Data.LINKNAME-Employer-Level Data.ERLINKID]

FROM [lib://Qlik Stuff/ERLINKIDS.csv]

(txt, codepage is 1252, embedded labels, delimiter is ',', msq)

where [ACTIVE] = 1 and ([DIV]='ESD' or [DIV]='GSD') and ([NICHENL]='AUTO' OR [NICHENL]='EDUC' OR [NICHENL]='MUNICO'

OR [NICHENL]='HEALTH' OR [NICHENL]='HLTOTH' OR [NICHENL]='HLTSNF' OR [NICHENL]='UNICOL' OR [NICHENL]='OTHER');

[EERanges]:

Load * Inline [EERanges, MinLimit, MaxLimit

N/A, ., 0

1-25, 1, 25

26-50, 26, 50

51-100, 51, 100

101-250, 101, 250

251-500, 251, 500,

501-1000, 501, 1000

1001-2500, 1001, 2500

2501-5000, 2501, 5000

5001-10000, 5001, 10000

Over 10000, 10001, 1000000];

[Employer-Level Data]:

LOAD [LINKNAME] AS [Employer-Level Data.LINKNAME],

  [ERLINKID] AS [Employer-Level Data.ERLINKID],

  [TOTALEES],

    [STATUS] AS [Employer Status],

  [LINKNAME]&[ERLINKID] AS [ERLINKIDS.LINKNAME-ERLINKID-Employer-Level Data.LINKNAME-Employer-Level Data.ERLINKID]

FROM [lib://Qlik Stuff/Employer-Level Data.csv]

(txt, codepage is 1252, embedded labels, delimiter is ',', msq)

where ([NICHENL]='AUTO' OR [NICHENL]='EDUC' OR [NICHENL]='MUNICO'

OR [NICHENL]='HEALTH' OR [NICHENL]='HLTOTH' OR [NICHENL]='HLTSNF' OR [NICHENL]='UNICOL' OR [NICHENL]='OTHER')

AND [STATUS] = 'A' AND [TOTALEES] > 0;

Left Join([Employer-Level Data]) IntervalMatch (TOTALEES) LOAD MinLimit, MaxLimit Resident EERanges;

Left Join([Employer-Level Data])

Load MinLimit, MaxLimit, EERanges Resident EERanges;

DROP table EERanges;

[CUSTMCH]:

LOAD [MCH_NAME] as [MCH Name],

  [MCH]

FROM [lib://Qlik Stuff/CUSTMCH.csv]

(txt, codepage is 1252, embedded labels, delimiter is ',', msq);

[zipcode]:

LOAD [ZipNumber],

  [ZipText] AS [Zip],

  [Latitude],

  [Longitude],

    GeoMakePoint(Latitude,Longitude) as ZipLocation

FROM [lib://Qlik Stuff/Zip Codes.xlsx]

(ooxml, embedded labels, table is zipcode);

[ST to STATE]:

Load [State],

    [ST]

FROM [lib://Qlik Stuff/ST to STATE.xlsx]

(ooxml, embedded labels, table is Sheet1);

   

[cb_2014_us_state_500k/cb_2014_us_state_500k]:

LOAD [US State Boundaries.Name] as [State],

  [US State Boundaries.Point],

  [US State Boundaries.Area]

FROM [lib://Qlik Stuff/US State Boundaries.kml]

(kml, Table is [cb_2014_us_state_500k/cb_2014_us_state_500k]);

RENAME TABLE [ONE] TO [ONE-TWO];

Not applicable
Author

Hi, Lech,

How would I upload subsets of the data files for my app?

In the meantime, assuming we can do this without the sample data, I want to:

  • Link files ONE-THR to the ERLINKIDS file by the MCP field.
  • Link the CUSTMCH file to the ERLINKIDS file by the MCH field.
  • Link the ERLINKIDS file to the Employer-Level Data file by the LINKNAME & ERLINKID fields (these are concatenated to make the link happen)
  • "Link" the EERanges inline file to the Employer-Level Data file by the value in the TOTALEES field in (Employer-Level Data).
  • Link the cb_2014_us_state_500k/cb_2014_us_state_500k file to the ST to STATE file by the State field.
  • Link the ST to STATE file to the Employer-Level Data file by the ST field.
  • Link the Zipcode file to the Employer-Level Data file by the Zip field.

Thanks,

Bryan

P.S. I don't know why Qlik added this code:  RENAME TABLE [ONE] TO [ONE-TWO];.  What is its purpose, and why wasn't it named something like [ONE-THR]?

Lech_Miszkiewicz
Partner Ambassador/MVP
Partner Ambassador/MVP

Hi

I just thought you could reduce your XLSX files to few rows (1000-1000 rows aprox.). Ultimetly you could prefix your Load statements in script with "First 1000" or first "10000" which would run reduced load on your data.

When designing data model for Qlik application i always try to understand how users will interact with filters and objects and what business questions they will try to find an answer for.

Therefore i have suggested to attach your files to this topic so we can play little bit with it.

Another thing is to answer following questions:

  • Each table - what data it contains? Dimensions or Facts, what are the metrics, why we actually linking those tables together?
  • Johnathan sugested to use concatenate as he noticed the same column names in all 3 files you are taking data from. I would suggest the same solution. For some reasons you are saying it will not work for  you, so maybe we will have to get to know those reasons and maybe rename some fields as they may represent completly different things?
  • Please provide us little bit of background what are you trying to achieve and why.

Few answers to your questions:

  • Link files ONE-THR to the ERLINKIDS file by the MCP field. (
    • why? what are those 4 tables representing? what is a MCP field. Remember that in files One-Thr have the same column names. In QlikView or Qlik Sense each field has to have unique name otherwise it becomes the same field in whole datamodel. At the same time ONE-THR looks like you Fact table - is this right
  • Answer for your other question is simple - basicly you create link / keyfields between tables by giving them the same name. Such field becomes than one field in your model. then depending which table should be core you load it first

Table1:

Load

KeyFieldXX,

Other fields...

From

...


and then you can load other tables using function "Left keep"

Left Keep (Table1)

Load

KeyFieldXX,

Other fields...

From

...



To get a full picture what are you trying to achieve we need a bit more of your input. Not only technical but also business POW and conceptual POW.


thanks


Lech

cheers Lech, When applicable please mark the correct/appropriate replies as "solution" (you can mark up to 3 "solutions". Please LIKE threads if the provided solution is helpful to the problem.