
Excel-based Data Lineage for Compose for Data Warehouse

DWH4Fun
Contributor III


Consumers of our warehouse are asking for an Excel-based data lineage. They would like to see full end-to-end lineage from source (landing) to warehouse to mart tables, including any expressions, look-ups, and descriptions.

Justification for this includes:

  • Excel is the application almost all business users leverage when documenting data lineage, mappings, and data dictionaries
  • In some cases the consumer is external to our company. It is not feasible to give them access to Compose or to send them the HTML version of the Compose online documents
  • Users feel the Compose online documents are OK, but not great. Issues they face include:
    • Availability across the entire organization: there is no single web location all users can access
    • They do not solve use cases where information must be shared with external clients
    • Data lineage is only at the entity level, not at the attribute level as supported in the Compose application
    • Not all internal users will be given read-only access to the Compose application
    • Navigation and search are not user friendly: search does not span the entire project, so finding information requires hunting and pecking via drill-down
    • Similarly, there is no way to trend across all fields, mappings, expressions, etc., which is much easier to do in Excel

We have had mixed results trying to maintain this in Excel on our own. It is very manual, time-consuming, and prone to errors and omissions. To be current and accurate, Compose needs to generate it.

Thx!

5 Comments
JorgeUriarte
Employee

Tim G wrote a query against the project metadata SQLite db which produces exactly this. There is no reason for this not to be a simple button that exports it into CSV format, and to be part of the Project Documentation. It's a query, and it's already done. Every customer of Compose would benefit from this simple addition.



-- Mapping CTE: one row per mapped column, with the landing/source table,
-- any expression or lookup result, and the lookup condition.
WITH Mapping
as (

select map.ID as MappingID,
json_extract(db.json, '$.database_name') as LandingDB,
json_extract(db.json, '$.original_sourcedb_name') as SourceDB,
json_extract(db.json, '$.server_name' ) as SourceServer,
CASE WHEN json_extract(map.json, '$.mapping_type') = 'QUERY' THEN 'QUERY based mapping' ELSE json_extract(map.json, '$.schema') || '.' || json_extract(map.json,'$.table_name') END as LandingTable,
json_extract(map.json, '$.mapping_query') as QuerySource,
map.name as MappingName,json_extract(colmap.value, '$.staging_col_name_int') STAGING_COL_NAME_INT,
json_extract(colmap.value, '$.mapping_type') as ColMapType,
json_extract(colmap.value, '$.source_col_name') as SourceColumnName,
IFNULL( json_extract(colmap.value, '$.lookup.lookup_result_expression.expression_statement'), json_extract(colmap.value, '$.expression.expression_statement') ) as Expression,
json_extract(map.json, '$.entity_id') as ETY_ID,
IFNULL(json_extract(dbLKU.json, '$.database_name')||'.','') || json_extract(colmap.value, '$.lookup.lookup_table_schema') || '.' || json_extract(colmap.value,'$.lookup.lookup_table_name') as LKUSourceTable,
json_extract(colmap.value, '$.lookup.lookup_expression.expression_statement') as LookupCondition

FROM objects map , json_each(map.json, '$.mapping_fields') colmap
JOIN objects db
on db.id = json_extract(map.json,'$.database_id')
LEFT JOIN objects dbLKU
on dbLKU.ID = json_extract(colmap.value, '$.lookup.lookup_table_database_id')
),
-- TABLEINFO CTE: every entity and column in the model, with the physical
-- Data Vault table name derived from the project's naming conventions.
TABLEINFO
as (SELECT
ETY_ID, COL_ID, ETY_NAME as LogicalEntity,
CASE WHEN json_extract(COLUMNS.json, '$.history_type') = 'S' THEN
tblPrefix || ETY_NAME || sat_suffix || json_extract(COLUMNS.json, '$.block')
ELSE
tblPrefix || ETY_NAME || hub_suffix
END as PhysicalVaultTable,
CASE WHEN DA.type = 'Entity' then 'Relationship' else 'Column' end as AttType,
CASE WHEN
IFNULL(json_extract(COLUMNS.json, '$.prefix'),'') || ifnull(COL_NAME, json_extract(DA.json, '$.name') ) ='' THEN IFNULL(json_extract(COLUMNS.json, '$.prefix'),'') || COL_NAME
ELSE
IFNULL(json_extract(COLUMNS.json, '$.prefix'),'') || ifnull(COL_NAME, json_extract(DA.json, '$.name') )
END as ColumnName
,json_extract(COLUMNS.json,'$.ordinal') /100 as ColumnOrdinal
,
json_extract(DA.json,'$.type') as DataType,
'COLID_'|| COL_ID || '_' || CASE WHEN DA.Type = 'Entity' THEN ETY_PK.id ||'_' || json_extract(ETY_PK.json, '$.pbe') ELSE COL_DOMAIN_ID END as RelatedETYID

FROM (SELECT id as ETY_ID, name as ETY_NAME
FROM objects ETY
WHERE ETY.type ='Entity'
) ENTITY
JOIN
(
select json,
json_extract(COL.json, '$.poe') COL_PARENT,
json_extract(COL.json, '$.pbe') COL_DOMAIN_ID,
id as COL_ID,
json_extract(COL.json, '$.is_primary') as IS_PK,
json_extract(COL.json, '$.ordinal') as COL_ORDINAL ,
json_extract(COL.json, '$.attribute_name') as COL_NAME

FROM objects COL
WHERE COL.type ='TableColumn' ) COLUMNS
on COLUMNS.COL_PARENT = ETY_ID
LEFT JOIN objects DA
on DA.id = COL_DOMAIN_ID
LEFT JOIN objects ETY_PK
on DA.type = 'Entity'
AND json_extract(ETY_PK.json, '$.poe') = DA.ID
AND json_extract(ETY_PK.json, '$.is_primary') = 1
JOIN (select
json_extract(json, '$.naming_conv.dwh_prefix') as tblPrefix,
json_extract(json, '$.naming_conv.hub_suffix') as hub_suffix,
json_extract(json, '$.naming_conv.sat_suffix') as sat_suffix
from objects where name = 'settings')
on 1=1
)
-- FactDims CTE: data mart facts/dimensions and their source expressions,
-- keyed back to the vault column id embedded in the expression alias.
, FactDims
as (
select fd.json,
json_extract(fd.json, '$.type') as DataMartObject
, CASE json_extract(fd.json, '$.history_type') WHEN 'N' THEN 'Type 1' WHEN 'S' THEN 'Type 2' END as DimType
, json_extract(fd.json, '$.root_entity_id') as Root_ETY_ID
, json_extract(fd.json, '$.display_name') as DMObjectDisplayName
, replace( json_extract(fd.json, '$.name'), 'DataMart!!','') as DMPhysicalTableName
, json_extract(colmap.value, '$.attribute_name') as DMColumnName
, json_extract(colmap.value, '$.data_type') as DMColumnDataType
, CASE json_extract(colmap.value, '$.aggregate_function') WHEN 'GROUPBY' THEN json_extract(colmap.value, '$.expression.expression_statement') ELSE json_extract(colmap.value, '$.aggregate_function') ||'( ' || json_extract(colmap.value, '$.expression.expression_statement') || ' )' END as DMSourceExpression
, reverse(substr( reverse( json_extract(epr.value, '$.alias_parameter')) , 0, instr(reverse(json_extract(epr.value, '$.alias_parameter')), ' '))) as DV_COL_ID, dm.*
, dm.name as DataMartName

from objects fd, json_each(fd.json, '$.fact_dim_columns') colmap, json_each(colmap.value, '$.expression.alias') epr
JOIN objects dm
on dm.id=json_extract(fd.json,'$.datamart_id')
WHERE fd.type = 'FactDim'

)


-- Final result: lineage from landing source through the vault to the data marts.
SELECT
T.LogicalEntity ,
IFNULL( M.MappingName, '** NOT MAPPED **') as MappingName ,
T.PhysicalVaultTable,
T.ColumnName,
T.AttType,
T.ColumnOrdinal,
M.LandingDB,
M.LandingTable,
M.ColMapType,
IFNULL(M.Expression, M.SourceColumnName) as SourceExpression ,
M.LKUSourceTable,
M.LookupCondition,
DataMartName,
FD.DataMartObject as DataMartObjectType,
FD.DimType,
FD.DMObjectDisplayName,
FD.DMPhysicalTableName ,
FD.DMColumnName,
FD.DMSourceExpression as DM_Expression,
M.MappingID,

replace(REPLACE(QuerySource, CHAR(13),' '), CHAR(10), ' ') as QuerySource

FROM TABLEINFO T
LEFT JOIN FactDims FD
on FD.DV_COL_ID = T.COL_ID
LEFT JOIN Mapping M
on M.ETY_ID = T.ETY_ID
AND M.STAGING_COL_NAME_INT = T.RelatedETYID

--and DMColumnName = 'TestAtt'
ORDER BY LogicalEntity, MappingName, PhysicalVaultTable, ColumnOrdinal, DataMartName, DataMartObjectType, DMObjectDisplayName, DMColumnName
;
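Until such an export button exists, the query above can be run by hand. Below is a minimal sketch of the export step: it runs an arbitrary query against a SQLite file and writes the result set, headers included, to CSV. The file name `project_metadata.sqlite` and the placeholder query are assumptions; point it at your own Compose project repository db and paste in the full lineage query above.

```python
import csv
import sqlite3

def export_query_to_csv(db_path, sql, out_path):
    """Run a read-only query against a SQLite db and dump all rows to CSV."""
    con = sqlite3.connect(db_path)
    try:
        cur = con.execute(sql)
        headers = [col[0] for col in cur.description]
        with open(out_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(headers)  # column names first
            writer.writerows(cur)     # then every result row
    finally:
        con.close()

# Hypothetical invocation -- substitute your project's metadata file and
# replace LINEAGE_SQL with the full lineage query posted above.
LINEAGE_SQL = "SELECT name, type FROM objects ORDER BY name"
# export_query_to_csv("project_metadata.sqlite", LINEAGE_SQL, "lineage.csv")
```

The resulting lineage.csv opens directly in Excel, which covers the sharing-with-external-clients use case from the original idea.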

Tzachi_Nissim
Employee
Employee

Hi DWH4Fun,

Thank you for your suggestion. It obviously makes sense, so I'm opening this to the community for feedback and voting that will help prioritize.

Regards,

Tzachi

Status changed to: Open - Collecting Feedback
jfgiudicelli
Contributor II

Hello,

I'm returning to an old subject, but one that is still relevant for new Compose users:

Another way we get this kind of information is to use the command line to generate a set of CSV files:

composecli connect https://serveurCompose   (to connect to the Compose server)
"C:\Program Files\Qlik\Compose\bin\"ComposeCli.exe export_csv --project {project_name} --outfolder {folder_destination}


This produces files such as:

Attribute.csv
Attributedomain.csv
custometl.csv
datamart.csv
entities.csv
mappings.csv
mappingsmetadat.csv
relationships.csv
and more
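Once the export_csv command has run, the files above can be pulled into one place for browsing or for building a lineage workbook. A small sketch, assuming only that the output folder contains the CSVs listed (their column layouts vary by project, so rows are kept as generic dicts rather than a fixed schema):

```python
import csv
import pathlib

def load_exported_csvs(folder):
    """Read every CSV that export_csv wrote into a dict of
    file stem -> list of row dicts, so the metadata can be browsed."""
    tables = {}
    for path in sorted(pathlib.Path(folder).glob("*.csv")):
        with open(path, newline="", encoding="utf-8") as f:
            tables[path.stem] = list(csv.DictReader(f))
    return tables

# Hypothetical output folder from the export_csv command above.
# tables = load_exported_csvs(r"C:\compose_export")
# for name, rows in tables.items():
#     print(name, len(rows))
```

From here, joining entities.csv to mappings.csv (or simply opening each file in Excel) gets close to the attribute-level lineage the original idea asks for.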

I hope this will help you.

Meghann_MacDonald

From now on, please track this idea from the Ideation portal. 

Link to new idea

Meghann

NOTE: Upon clicking this link, 2 tabs may open - please feel free to close the one with a login page. If you only see 1 tab with the login page, try clicking this link first: Authenticate me! Then try the link above again. Ensure your pop-up blocker is off.

Ideation
Explorer II
Status changed to: Closed - Archived