Skip to main content
Woohoo! Qlik Community has won “Best in Class Community” in the 2024 Khoros Kudos awards!
Announcements
Nov. 20th, Qlik Insider - Lakehouses: Driving the Future of Data & AI - PICK A SESSION
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

to join or not to join, applymap, temptables. Optimisation

I have been reading a lot of different ways to help with performance of a reload.

I have come to conclusion that I will need to either applymap or join with the fact load rather than a left join outside of the resident load of the fact table.

My memory reached 40GB from n app that's less than 1GB when I reload from reading forums I have gathered that it's from all the left joins.

I have attached my script, could anyoen recommend what they would do with the script?

There's temp tables, joins, left joins and apply maps....

Thanks,

Lauren

11 Replies
Gysbert_Wassenaar

1. Don't join tables unless you need the fields from both tables to calculate new fields

     Booking Agent

     Transaction Agent

     DimTransactionType

     DimJourneyType

     DimFareType

     DimDates

2. Try storing Sales into a temporary qvd and do the final calculations with a load from this qvd instead of using a resident load.

3. If(SaleType=0,Company, ProductCompany) as CompanyOverall

    ProductCompany doesn't seem to exist.


talk is cheap, supply exceeds demand
Not applicable
Author

Hi Gysbert Wassenaar,

Thanks for the reply!

What would you suggest instead of joins for number 1?

For number 2. Please could you send an example of the script for this part or just a example structure of it? Would this all be in the script in the qvw?

Thanks again.

Gysbert_Wassenaar

1. Do nothing. Simply let Qlikview associate the tables using the fields they have in common.

2. Example:

Sales:

LOAD ...stuff ... FROM ...somewhere... ;

STORE Sales INTO Sales.qvd ;

DROP TABLE Sales;

SalesTransactions:

LOAD X, sum(Y) FROM Sales.qvd (qvd) GROUP BY X;


talk is cheap, supply exceeds demand
Not applicable
Author

Hi,

Thanks for coming back to me.

For point one, just do this for each dim

LOAD   [FlightKey] AS _FlightKey

      ,[ServiceNumber]

      ,[Company]

      ,[BrandCode]

   ,ServiceName

      ,[ServiceCategory]

FROM [..\QVD\DimServices.qvd] (qvd);

rather than

LEFT JOIN (Sales)

LOAD   [FlightKey] AS _FlightKey

      ,[ServiceNumber]

      ,[Company]

      ,[BrandCode]

   ,ServiceName

      ,[ServiceCategory]

FROM [..\QVD\DimServices.qvd] (qvd);

I am bit confused as I believe I am already doing point 2.

As I am taking load from qvd:

FROM

[..\QVD\FactSalesTransactions.qvd] (qvd)

Gysbert_Wassenaar

At the end of your script you use a resident load:

SalesTransactions:
LOAD *,
    TopDepartureName &'-'&TopArrivalName AS Flow,
    If (TopDepartureName < TopArrivalName,
      TopDepartureName & '-' & TopArrivalName,
      TopArrivalName & '-' & TopDepartureName) AS FlowGrouped,
    If (DepartureName < ArrivalName,
  DepartureName & '-' & ArrivalName,
  ArrivalName & '-' & DepartureName) AS JourneyGrouped,
    DepartureName & '-' & ArrivalName AS Journey,
    If (LegFare = 0, 1, 0) AS ZeroFlag,
    If(SaleType=0,Company, ProductCompany) as CompanyOverall
Resident Sales;

DROP TABLE Sales;

You could store Sales into a qvd and drop that table first. Then instead of a resident load you could load data from the qvd. Perhaps that will perform better. It'll use less ram memory in any case.


talk is cheap, supply exceeds demand
Not applicable
Author

Do you mean store the Fact with joined dimensions in a qvd?

and then call the qvd later in the script?

Anonymous
Not applicable
Author

Instead of 'left join', have you tried to use a 'left keep' ?  This reduces data in the same way a 'join' statement does, but QV keeps all tables.  It is recommendable not to join. 

Not applicable
Author

At the end of the script I do some post load calculations on SalesTransactions. SalesTransactions would be the table that has the fact and dimension joined via left joins then at the end I use this to create fields from the dimensions:

How would I do this without joining please? Is this possible?

SalesTransactions:

LOAD *,

     TopDepartureName &'-'&TopArrivalName AS Flow,

     If (TopDepartureName < TopArrivalName,

      TopDepartureName & '-' & TopArrivalName,

      TopArrivalName & '-' & TopDepartureName) AS FlowGrouped,

     If (DepartureName < ArrivalName,

  DepartureName & '-' & ArrivalName,

  ArrivalName & '-' & DepartureName) AS JourneyGrouped,

     DepartureName & '-' & ArrivalName AS Journey,

     If (LegFare = 0, 1, 0) AS ZeroFlag,

     If(SaleType=0,Company, ProductCompany) as CompanyOverall

Resident Sales;

DROP TABLE Sales

Not applicable
Author

At the end of the script I do some post load calculations on SalesTransactions. SalesTransactions would be the table that has the fact and dimension joined via left joins then at the end I use this to create fields from the dimensions:

How would I do this without joining please? Is this possible?

SalesTransactions:

LOAD *,

     TopDepartureName &'-'&TopArrivalName AS Flow,

     If (TopDepartureName < TopArrivalName,

      TopDepartureName & '-' & TopArrivalName,

      TopArrivalName & '-' & TopDepartureName) AS FlowGrouped,

     If (DepartureName < ArrivalName,

  DepartureName & '-' & ArrivalName,

  ArrivalName & '-' & DepartureName) AS JourneyGrouped,

     DepartureName & '-' & ArrivalName AS Journey,

     If (LegFare = 0, 1, 0) AS ZeroFlag,

     If(SaleType=0,Company, ProductCompany) as CompanyOverall

Resident Sales;

DROP TABLE Sales