Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I am reading about hierarchies from Hierarchies. ,
In the PDF attached to that article, it discussed several ways of storing hierarchical data, of one which is path enumeration. Consider the data in the following table from the PDF, which is an n-level unbalanced tree:
NodeID | Path | Name |
---|---|---|
1 | 1 | The World |
20 | 1/20 | Americas |
79 | 1/20/79 | United States |
85 | 1/20/79/85 | California |
90 | 1/20/79/85/90 | Napa Valley |
178 | 1/178 | Europe |
281 | 1/178/281 | France |
283 | 1/178/281/283 | Bordeaux |
288 | 1/178/281/283/288 | Medoc |
294 | 1/178/281/283/288/294 | Haut-Medoc |
354 | 1/178/354 | Germany |
368 | 1/178/354/368 | Rheingau |
It's clear that this table isn't usable directly for analysis. To quote the PDF again:
"Also this table completely defines the hierarchy, and also here the table needs to be transformed to be usable in QlikView."
But the PDF doesn't say how to transform a path enumerated table.
There are two functions available in load scripts: Hierarchy and HierarchyBelongsTo. Both, however, have a required parent id column, which is not explicit in the path enumerated list?
How does one transform a n-level unbalanced tree expressed in a path enumerated table into an expanded nodes table with a column for each path step?
Related question, the Hierarchy function takes an option PathName, but there isn't an example using this parameter. What is a good example of such a use-case?
Inline data for the aforementioned table:
[regions]:
LOAD * inline
[
NodeID, Path, Name
1, 1, The World
20, 1/20, Americas
79, 1/20/79, United States
85, 1/20/79/85, California
90, 1/20/79/85/90, Napa Valley
178, 1/178, Europe
281, 1/178/281, France
283, 1/178/281/283, Bordeaux
288, 1/178/281/283/288, Medoc
294, 1/178/281/283/288/294, Haut-Medoc
354, 1/178/354, Germany
368, 1/178/354/368, Rheingau
];
For this data set, you can just parse out the parent id from the expression:
[regions]:
Hierarchy(NodeID,Parent,Name)
LOAD
NodeID,
Path,
Name,
SubField(Path,'/', SubStringCount(Path,'/')) as Parent
inline
[
NodeID, Path, Name
1, 1, The World
20, 1/20, Americas
79, 1/20/79, United States
85, 1/20/79/85, California
90, 1/20/79/85/90, Napa Valley
178, 1/178, Europe
281, 1/178/281, France
283, 1/178/281/283, Bordeaux
288, 1/178/281/283/288, Medoc
294, 1/178/281/283/288/294, Haut-Medoc
354, 1/178/354, Germany
368, 1/178/354/368, Rheingau
];
This is a somewhat synthetic example, however, where the materialized path is constructed of unique ids at every stage.
For an alternative, more general case, consider a listing of a filesystem, where you have two columns, the path to the file and the size of the file. Here there is no nodeid precalculated and the parent. Say we want to display total size by directory, or show a treemap of size by directory.
Is there a solution for this general case where there are no unique ids outside of the materialized path? That is to say, other then the alternative of pre-processing the list outside of Qlik into an adjacent nodes table.