Skip to main content
hic
Former Employee
Former Employee

There are two Swedish car brands, Volvo and SAAB. Or, at least, there used to be... SAAB was made in Trollhättan and Volvo was – and still is – made in Gothenburg.

Two fictive friends – Albert and Herbert – live in Trollhättan and Gothenburg, respectively. Albert drives a Volvo and Herbert drives a SAAB.

If the above information is stored in a tabular form, you get the following three tables:

Tables.png

Logically, these tables form a circular reference: The first two tables are linked through City; the next two through Person; the last and the first through Car.

Further, the data forms an anomaly: Volvo implies Gothenburg; Gothenburg implies Herbert; and Herbert implies SAAB. Hence, Volvo implies SAAB – which doesn’t make sense. This means that you have ambiguous results from the logical inference - different results depending on whether you evaluate clockwise or counterclockwise.

If you load these tables into QlikView, the circular reference will be identified and you will get the following data model:

Circular reference.png

To avoid ambiguous results, QlikView marks one of the tables as “loosely coupled”, which means that the logical inference cannot propagate through this table. In the document properties you can decide which table to use as the loosely coupled table. You will get different results from the logical inference depending on which you choose.

So what did I do wrong? Why did I get a circular reference?

It is not always obvious why they occur, but when I encounter circular references I always look for fields that are used in several different roles at the same time. One obvious example is if you have a table listing external organizations and this table is used in several roles: as Customers, as Suppliers and as Shippers. If you load the table only once and link to all three foreign keys, you will most likely get a circular reference. You need to break the circular reference and the solution is of course to load the table several times, once for each role.

In the above data model you have a similar case. You can think of Car as “Car produced in the city” or “Car that our friend drives”. And you can think of City as “City where car is produced” or “City where our friend lives”. Again, you should break the circular reference by loading a table twice. One possible solution is the following:

Tables2.png

In real life circular references are not as obvious as this one. I once encountered a data model with many tables where I at first could not figure out what to do, but after some analyzing, the problem boiled down to the interaction between three fields: Customers, Machines and Devices. A customer had bought one or several machines; a device could be connected to some of the machine types – but not to all; and a customer had bought some devices. Hence, the device field could have two roles: Devices that the customer actually had bought; and devices that would fit the machine that the customer had bought, i.e. devices that the customer potentially could buy. Two roles. The solution was to load the device table twice using different names.

Bottom line: Avoid circular references. But you probably already knew that…

HIC

Further reading on Qlik data modelling:

To Join or not to Join

Synthetic Keys

Fan traps and Chasm traps

20 Comments
Karim_Khan
Creator III
Creator III

Very Nice Post

0 Likes
3,315 Views
Not applicable

Nice post. It helped me a lot!

Best regards!

Gabrielle

0 Likes
3,315 Views
beck_bakytbek
Master
Master

Thanks a lot for your sharing

3,268 Views
wizardo
Creator III
Creator III

Hi,

i know its a bit late to the party but hope ill get an answer still,

my question is this:

is circular references  "content" based or  "schema" based?

what do i mean by that. let me use the tables in HIC post as example.

what if the cars in the "persons/cars" table ware not what they ware, lets say the ware

person        car

alebrt          ford

herbert        fiat

in this case when this table connects back to the "cars/cities" table it wont create a problem.

we will still see the same "Table Diagram" where all the tables are connected (A to B, B to C, C to A)

in this scenario, will qlikview complain about "loops found" or will qlikview just draw solid lines and wont say there is a problem.

hope i made myself clear.

im asking this because i have many examples of data-models where there seems to be many "circular references" but i do not get the "loops" message

Thanks

Daniel

0 Likes
3,268 Views
anderseriksson
Partner - Specialist
Partner - Specialist

While not knowing for certain I'm pretty sure it is schema based.

Do you have an example of those data-models where there seems to be circular references but no loops?
Probably something in your data-model that you have overlooked.

0 Likes
3,268 Views
hic
Former Employee
Former Employee

It's schema based.

The content can never change the fact that you have a logical inconsistency: Whatever content you have, you will always have two ways to make the logical evaluation: clockwise or counterclockwise. If you, as you suggest, use Ford and Fiat in one of the tables, a click on 'Herbert' would imply 'Gothenburg' if you evaluate counterclockwise, and an empty result set (no city possible) if you evaluate clockwise. This is still a logical contradiction.

HIC

3,268 Views
Anonymous
Not applicable

Very useful article hic

Thanks a lot

0 Likes
3,268 Views
Anonymous
Not applicable

The renaming of the fields isn't a solution?

0 Likes
3,268 Views
BartBrouwer
Partner - Contributor II
Partner - Contributor II

Why do you load the City-Car table twice? Why don't just rename the fields? I made this data model and both tables City-Car 1 and City-Car 2 have exactly the same fields and field values right now. And I also see on Qlik Help the solution of renaming fields coming back. Loading a table twice is not mentioned there.

So I am curious why you choose this approach @hic 

Kind regards,

Bart Brouwer

46 Views
marcus_sommer

Renaming could prevent the occurrence of circular loops and synthetic keys but it doesn't mandatory mean that the resulting data-model is in generally working nor that's suitable in regards of efforts and performance.

Officially recommended is to use a star-scheme with a single fact-table and n surrounding dimension-tables. It's the most simple kind of a data-model and avoids automatically such troubles. I couldn't speak for HIC but I think the example was just to demonstrate the challenge in an easy manner without adding more content and complexity as needed for it. 

33 Views