Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
110
Creator
Creator

Reload lineage over time for Data files / Sharepoint connections? ( Reload Monitor App )

Quick question hopefully  -  I'm using the reload monitor to pull just the apps / connections that are using sharepoint excel files, or uploaded excel files .

Within the reload monitor - I've added the following to the script to filter just Excel files.

TempReloadAuditData:
NoConcatenate
LOAD
_ReloadExecutionConcurrencyKey as 'ExcelConcurrencyKey',
    ReloadID as 'ExcelReloadID',
    _AuditReloadKey    // keep key for join to AuditReloadLineage
RESIDENT Reloads;


LEFT JOIN (TempReloadAuditData)
LOAD
    _AuditReloadKey,
    FileType as 'ExcelReloadFileType',
    FileName as 'ExcelReloadFileName',
    DataConnectionID
RESIDENT AuditReloadLineage
where
FileType ='xlsx' or FileType ='xlsm' 
 
 
However - this only appears to pull in the latest loaded Excel lineage files, and not the historical ones. 
 
Should I just start populating this into a QVD or is there a way to see this data historically also?
 
Labels (1)
1 Solution

Accepted Solutions
diegozecchini
Specialist
Specialist

Hi!

I agree with @Chanty4u the recommended solution is to store lineage rows into your own QVD keyed by the execution, then reuse it.

somenthing like
// Historical lineage store
LET vLineageQvd = 'lib://YourDataLake/ReloadLineageHistory.qvd';

// Load existing history (if any)
LineageHistory:
LOAD
ReloadID,
_AuditReloadKey,
FileType,
FileName,
DataConnectionID,
Timestamp // whatever execution timestamp field you have (from Reloads)
FROM [$(vLineageQvd)] (qvd)
WHERE 1=1; // handles first-run if file exists

// New lineage from this reload's ingestion
// (Assumes you also have Reloads with ReloadID, Timestamp)
LEFT JOIN (AuditReloadLineage)
LOAD
_AuditReloadKey,
ReloadID,
Timestamp
RESIDENT Reloads;

// Append only new execution+file combos
CONCATENATE (LineageHistory)
LOAD DISTINCT
ReloadID,
_AuditReloadKey,
FileType,
FileName,
DataConnectionID,
Timestamp
RESIDENT AuditReloadLineage
WHERE NOT EXISTS(ReloadID) // coarse grain; refine key if one ReloadID can emit multiple rows
OR NOT EXISTS(_AuditReloadKey); // safer: use a composite key

// Save back
STORE LineageHistory INTO [$(vLineageQvd)] (qvd);

// Use LineageHistory in your UI instead of the volatile AuditReloadLineage
If a single reload execution can produce multiple lineage rows, build a composite key like Hash128(ReloadID, FileType, FileName, DataConnectionID) and use that in the NOT EXISTS() filter.

View solution in original post

2 Replies
Chanty4u
MVP
MVP

Hi

In your modified Reload Monitor script, store the relevant lineage data (FileType, FileName, DataConnectionID, ReloadID, AppID, ReloadStartTime) into a QVD on every reload.

Then in the next reload, append new rows (rather than overwriting).

This builds your own historical lineage repository.

diegozecchini
Specialist
Specialist

Hi!

I agree with @Chanty4u the recommended solution is to store lineage rows into your own QVD keyed by the execution, then reuse it.

somenthing like
// Historical lineage store
LET vLineageQvd = 'lib://YourDataLake/ReloadLineageHistory.qvd';

// Load existing history (if any)
LineageHistory:
LOAD
ReloadID,
_AuditReloadKey,
FileType,
FileName,
DataConnectionID,
Timestamp // whatever execution timestamp field you have (from Reloads)
FROM [$(vLineageQvd)] (qvd)
WHERE 1=1; // handles first-run if file exists

// New lineage from this reload's ingestion
// (Assumes you also have Reloads with ReloadID, Timestamp)
LEFT JOIN (AuditReloadLineage)
LOAD
_AuditReloadKey,
ReloadID,
Timestamp
RESIDENT Reloads;

// Append only new execution+file combos
CONCATENATE (LineageHistory)
LOAD DISTINCT
ReloadID,
_AuditReloadKey,
FileType,
FileName,
DataConnectionID,
Timestamp
RESIDENT AuditReloadLineage
WHERE NOT EXISTS(ReloadID) // coarse grain; refine key if one ReloadID can emit multiple rows
OR NOT EXISTS(_AuditReloadKey); // safer: use a composite key

// Save back
STORE LineageHistory INTO [$(vLineageQvd)] (qvd);

// Use LineageHistory in your UI instead of the volatile AuditReloadLineage
If a single reload execution can produce multiple lineage rows, build a composite key like Hash128(ReloadID, FileType, FileName, DataConnectionID) and use that in the NOT EXISTS() filter.