Skip to main content
Announcements
NEW: Seamless Public Data Sharing with Qlik's New Anonymous Access Capability: TELL ME MORE!
cancel
Showing results for 
Search instead for 
Did you mean: 
PurplePills555
Contributor III
Contributor III

Linking Table IDs That Circular Reference To Find Master

Hi All 

I have in my data base form IDs. These forms can be copied from original and a new ID can be generated. Sometimes these can result in a loop where one was copied from one but another from another

Ie

Form ID.   Copy ID

1.                 2

2.                1

3.                2

 

As can be seen here form 2 was copied from 1 and 1 then must have been reupdated and copied from 2. Form 3 was copied from 2. 

My master form is 1 as it has the lowest ID. Our original always have lowest ID

How do I identify the original?

Thanks

 

1 Solution

Accepted Solutions
vunguyenq89
Creator III
Creator III

Hi,

The undirected graph can be quickly constructed from the directed tree by making every arc bi-directional. So if A ==> B is an arc in the graph, you will add arc B ==> A.

So the table in your latest comment becomes the following:

FormID, CopyID
20, 22
21, 20
20, 21
22, 20

 in which the last row (22, 20) has been added to the table as the opposite of the first row (20, 22). Both (20,21) and (21,20) are already in the table, so we don't have to add the opposite for these arcs.

You can then feed this table into the FindOriginal() sub. The whole script is as follows:

 

// Function finding original form
// Returns value in variable vOriginalFormID
Sub FindOriginal(vSearchFormID)
	vPos = 0;
	vCurrentID = vSearchFormID;
	
	Original:
	LOAD * INLINE [
	OriginalFormID_Temp
	$(vSearchFormID)];
	
	vOriginalLength = NoOfRows('Original');
	
	Do While vPos < vOriginalLength
		vCurrentID = Peek('OriginalFormID_Temp',vPos);
		
		Copies_Temp:
		LOAD DISTINCT CopyID as CopyID_Temp Resident Copies_Undirected Where FormID = $(vCurrentID);
		
		Concatenate (Original)
		LOAD CopyID_Temp as OriginalFormID_Temp Resident Copies_Temp WHERE NOT Exists(OriginalFormID_Temp,CopyID_Temp);
			
		DROP TABLE Copies_Temp;
		
		vPos = vPos + 1;
		vOriginalLength = NoOfRows('Original');
	Loop;
	
	MinOriginal:
	LOAD
	     MIN(OriginalFormID_Temp) AS MinOriginalFormID
	RESIDENT Original;

	vOriginalFormID = Peek('MinOriginalFormID');
	DROP TABLES Original, MinOriginal;
End Sub;

//Sample data
Copies:
LOAD * Inline[
FormID, CopyID
9, 10
9, 8
8, 10
8, 7
7, 1
7, 8
10, 2
9, 10
4, 5
6, 5
3, 4
4, 3
];

// Convert table into undirected graph
Copies_Undirected_Temp:
NoConcatenate LOAD Distinct * Resident Copies;

Concatenate(Copies_Undirected_Temp) 
LOAD CopyID AS FormID,
     FormID AS CopyID
Resident Copies_Undirected_Temp;

Copies_Undirected:
NoConcatenate LOAD Distinct * Resident Copies_Undirected_Temp;

DROP TABLE Copies_Undirected_Temp;

Copies_Distinct:
LOAD DISTINCT FormID RESIDENT Copies_Undirected;

Copies_Original:
LOAD * INLINE [FormID, OriginalFormID];

// Loop through each FormID and apply sub FindOriginal
For i=0 to NoOfRows('Copies_Distinct') 
	vFormID = Peek('FormID',i,'Copies_Distinct');
	Call FindOriginal(vFormID);
	
	Concatenate(Copies_Original)
	LOAD * INLINE [
		FormID, OriginalFormID
		$(vFormID), $(vOriginalFormID)
	];
Next;

DROP TABLE Copies_Distinct, Copies_Undirected;

 

Hope this helps,

BR,

Vu Nguyen

View solution in original post

9 Replies
vunguyenq89
Creator III
Creator III

Hi,

Interesting problem! Let's consider the following data sample:

data.png

To trace the original form for FormID = 9, you can think of this table as the following tree structure:

tree.png

in which 9 → 10 and 9 → 8 means "form 9 is copied from form 10 and form 8" (corresponding to the first 2 lines in the sample). So if form A is copied from B, then A is the parent node and B is the child node.

This tree is built via a repetitive process. At each step, if a child node is already in the tree, you will not add its child nodes (red nodes in the pictures) to the tree again.

Because the final original always have lowest ID, it's clear that the original should be the node with smallest ID in the tree. For FormID=9, it is form 1 according to the tree.

For each value in FormID field, you will need to generate a tree like this.

Complete script for the above sample data is as follows:

 

// Function finding original form
// Returns value in variable vOriginalFormID
Sub FindOriginal(vSearchFormID)
	vPos = 0;
	vCurrentID = vSearchFormID;
	
	Original:
	LOAD * INLINE [
	OriginalFormID_Temp
	$(vSearchFormID)];
	
	vOriginalLength = NoOfRows('Original');
	
	Do While vPos < vOriginalLength
		vCurrentID = Peek('OriginalFormID_Temp',vPos);
		
		Copies_Temp:
		LOAD DISTINCT CopyID as CopyID_Temp Resident Copies Where FormID = $(vCurrentID);
		
		Concatenate (Original)
		LOAD CopyID_Temp as OriginalFormID_Temp Resident Copies_Temp WHERE NOT Exists(OriginalFormID_Temp,CopyID_Temp);
			
		DROP TABLE Copies_Temp;
		
		vPos = vPos + 1;
		vOriginalLength = NoOfRows('Original');
	Loop;
	
	MinOriginal:
	LOAD
	     MIN(OriginalFormID_Temp) AS MinOriginalFormID
	RESIDENT Original;

	vOriginalFormID = Peek('MinOriginalFormID');
	DROP TABLES Original, MinOriginal;
End Sub;

// Sample data
Copies:
LOAD * Inline[
FormID, CopyID
9, 10
9, 8
8, 10
8, 7
7, 1
7, 8
10, 2
];

Copies_Distinct:
LOAD DISTINCT FormID RESIDENT Copies;

Copies_Original:
LOAD * INLINE [FormID, OriginalFormID];

// Loop through each FormID and apply sub FindOriginal
For i=0 to NoOfRows('Copies_Distinct') 
	vFormID = Peek('FormID',i,'Copies_Distinct');
	Call FindOriginal(vFormID);
	
	Concatenate(Copies_Original)
	LOAD * INLINE [
		FormID, OriginalFormID
		$(vFormID), $(vOriginalFormID)
	];
Next;

DROP TABLE Copies_Distinct;

 

Outcome of the script

result.png

Hope it helps!

BR,

Vu Nguyen

PurplePills555
Contributor III
Contributor III
Author

Hi Vu,

Thanks for the solution. We have however found an exception.

Have a look at this data.

Copies:

LOAD *

Inline

[

FormID, CopyID

20,        22

21,        22

22,        21

];

 

 

clipboard_image_1.png

 

clipboard_image_0.png

 

Form 20 should be the original for all of these as it is the lowest occurring one. It is linked to the others in a different way.

vunguyenq89
Creator III
Creator III

Hi,

According to the initial description in your first comment, your new table

FormID, CopyID
20,        22
21,        22
22,        21

should be translated as follows:

  • Form20 is copied from Form22
  • Form21 is copied from Form22
  • Form22 is copied from Form21

(for example, in your post, you mentioned that (3,2) means "Form 3 was copied from 2")

With this logic in mind, there is actually no information in the current table saying any form is copied from Form20

So either your problem description has changed, which no longer fits into the suggested script logic;

or data in your table is incomplete. I.E. if Form20 is indeed the original of all, why isn't there at least one row in the table capturing the event that something has been captured from Form20?

Or in other words, why Form20 doesn't appear in field CopyID at all?

BR,

Vu Nguyen

vunguyenq89
Creator III
Creator III

Hi, 

For some reason Qlik community continuously marks my comment in this post as spam and removed it. Have no idea why 🤔

Please see my answer below

Community.png

PurplePills555
Contributor III
Contributor III
Author

Hi Vu,

Yes this is an awkward problem.

Looks like the data is sometimes incomplete. The lowest ID is always the original but sometimes this copy id is replaced by a copy id from a different higher number.

In this case. What I think has happened is the following:

 

Form 20 was beginning and copied to 21

Form 21 was copied to 22

22 was copied back to 21

22 was also then copied back to 20.

So the copied field is replaced from the original when you make an additional copy.

 

We just need a way to link these numbers together and pick the lowest one.

So 21 links to 22 and 20 links to 22. Therefore lowest number is 20 which is original.

Sorry it's tricky to explain. You script works but just not for this situation.

 

vunguyenq89
Creator III
Creator III

Hi,

If it's just the matter of "picking out the lowest ID in a group of linked forms", then you can think of the table as an undirected graph instead of a tree. In the graph, FormIDs are nodes. For both events "A is copied from B" and "B is copied from A", you draw one unique undirected arc A--B

Your table then becomes a graph like this

undirectedgraph1.png

The problem is then reduced to "find the node with smallest ID in the graph", which is 20 in this case.

However, if you apply this logic, then the data sample in my first comment becomes this graph

undirectedgraph2.png

which gives a result that Form1 is the original of all 6 forms because they're all connected and 1 is the lowest ID. So original of Form10 is not Form2 like in my last comment, but Form1.

If this is the result you're looking for, we can tweak the script to build this undirected graph instead of tree.

Otherwise, for a complete solution, you will need to think of a way to generate the hidden event "Form 20 was beginning and copied to 21" as a row in your latest data table (FormID=21, CopyID=20). I think this cannot be done at Qlik script level and should be implemented in the system that generates the data source.

With an incomplete input data, it is usually very tricky to generate the expected insight 🙂

Best Regards,

Vu Nguyen

PurplePills555
Contributor III
Contributor III
Author

HI Vu,

Yes it looks like the forms could be linked in any way and not necessarily a tree fashion.

You second example of the undirected graph is correct!

Can we change the script so that it works that way.

Thanks for all your help! My challenge is now to understand how your script works. My Qlik is okay but not at that level.

 

 

 
 
 
 
vunguyenq89
Creator III
Creator III

Hi,

The undirected graph can be quickly constructed from the directed tree by making every arc bi-directional. So if A ==> B is an arc in the graph, you will add arc B ==> A.

So the table in your latest comment becomes the following:

FormID, CopyID
20, 22
21, 20
20, 21
22, 20

 in which the last row (22, 20) has been added to the table as the opposite of the first row (20, 22). Both (20,21) and (21,20) are already in the table, so we don't have to add the opposite for these arcs.

You can then feed this table into the FindOriginal() sub. The whole script is as follows:

 

// Function finding original form
// Returns value in variable vOriginalFormID
Sub FindOriginal(vSearchFormID)
	vPos = 0;
	vCurrentID = vSearchFormID;
	
	Original:
	LOAD * INLINE [
	OriginalFormID_Temp
	$(vSearchFormID)];
	
	vOriginalLength = NoOfRows('Original');
	
	Do While vPos < vOriginalLength
		vCurrentID = Peek('OriginalFormID_Temp',vPos);
		
		Copies_Temp:
		LOAD DISTINCT CopyID as CopyID_Temp Resident Copies_Undirected Where FormID = $(vCurrentID);
		
		Concatenate (Original)
		LOAD CopyID_Temp as OriginalFormID_Temp Resident Copies_Temp WHERE NOT Exists(OriginalFormID_Temp,CopyID_Temp);
			
		DROP TABLE Copies_Temp;
		
		vPos = vPos + 1;
		vOriginalLength = NoOfRows('Original');
	Loop;
	
	MinOriginal:
	LOAD
	     MIN(OriginalFormID_Temp) AS MinOriginalFormID
	RESIDENT Original;

	vOriginalFormID = Peek('MinOriginalFormID');
	DROP TABLES Original, MinOriginal;
End Sub;

//Sample data
Copies:
LOAD * Inline[
FormID, CopyID
9, 10
9, 8
8, 10
8, 7
7, 1
7, 8
10, 2
9, 10
4, 5
6, 5
3, 4
4, 3
];

// Convert table into undirected graph
Copies_Undirected_Temp:
NoConcatenate LOAD Distinct * Resident Copies;

Concatenate(Copies_Undirected_Temp) 
LOAD CopyID AS FormID,
     FormID AS CopyID
Resident Copies_Undirected_Temp;

Copies_Undirected:
NoConcatenate LOAD Distinct * Resident Copies_Undirected_Temp;

DROP TABLE Copies_Undirected_Temp;

Copies_Distinct:
LOAD DISTINCT FormID RESIDENT Copies_Undirected;

Copies_Original:
LOAD * INLINE [FormID, OriginalFormID];

// Loop through each FormID and apply sub FindOriginal
For i=0 to NoOfRows('Copies_Distinct') 
	vFormID = Peek('FormID',i,'Copies_Distinct');
	Call FindOriginal(vFormID);
	
	Concatenate(Copies_Original)
	LOAD * INLINE [
		FormID, OriginalFormID
		$(vFormID), $(vOriginalFormID)
	];
Next;

DROP TABLE Copies_Distinct, Copies_Undirected;

 

Hope this helps,

BR,

Vu Nguyen

PurplePills555
Contributor III
Contributor III
Author

Came to say thanks!

Works like a charm