Quick question, hopefully: I'm using the Reload Monitor to pull just the apps/connections that use SharePoint Excel files or uploaded Excel files.
Within the Reload Monitor, I've added the following to the script to filter for just Excel files.
TempReloadAuditData:
NoConcatenate
LOAD
    _ReloadExecutionConcurrencyKey as ExcelConcurrencyKey,
    ReloadID as ExcelReloadID,
    _AuditReloadKey // keep key for join to AuditReloadLineage
RESIDENT Reloads;

LEFT JOIN (TempReloadAuditData)
LOAD
    _AuditReloadKey,
    FileType as ExcelReloadFileType,
    FileName as ExcelReloadFileName,
    DataConnectionID
RESIDENT AuditReloadLineage
WHERE FileType = 'xlsx' OR FileType = 'xlsm';
Hi!
I agree with @Chanty4u: the recommended approach is to store the lineage rows in your own QVD, keyed by the execution, and then reuse it.
Something like:
// Historical lineage store
LET vLineageQvd = 'lib://YourDataLake/ReloadLineageHistory.qvd';

// Load existing history if the QVD is already there; otherwise start with an
// empty table so the CONCATENATE and the Exists() check below still work on
// the very first run.
IF NOT IsNull(FileSize('$(vLineageQvd)')) THEN
    LineageHistory:
    LOAD
        ReloadID,
        _AuditReloadKey,
        FileType,
        FileName,
        DataConnectionID,
        Timestamp, // whatever execution timestamp field you have (from Reloads)
        %LineageKey // composite key used for the incremental Exists() check
    FROM [$(vLineageQvd)] (qvd);
ELSE
    LineageHistory:
    LOAD * INLINE [
        ReloadID, _AuditReloadKey, FileType, FileName, DataConnectionID, Timestamp, %LineageKey
    ]; // zero rows, but the fields exist
END IF

// New lineage from this reload's ingestion
// (Assumes you also have Reloads with ReloadID, Timestamp)
LEFT JOIN (AuditReloadLineage)
LOAD
    _AuditReloadKey,
    ReloadID,
    Timestamp
RESIDENT Reloads;

// Append only new execution+file combos. The check must use a field that
// exists ONLY in LineageHistory: Exists(ReloadID) would match the values
// already loaded in Reloads and filter out every row.
CONCATENATE (LineageHistory)
LOAD DISTINCT
    ReloadID,
    _AuditReloadKey,
    FileType,
    FileName,
    DataConnectionID,
    Timestamp,
    ReloadID & '|' & _AuditReloadKey as %LineageKey
RESIDENT AuditReloadLineage
WHERE NOT Exists(%LineageKey, ReloadID & '|' & _AuditReloadKey);

// Save back
STORE LineageHistory INTO [$(vLineageQvd)] (qvd);
// Use LineageHistory in your UI instead of the volatile AuditReloadLineage
If a single reload execution can produce multiple lineage rows, build a composite key like Hash128(ReloadID, FileType, FileName, DataConnectionID) and use that in the NOT EXISTS() filter.
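A sketch of that wider-grained key, applied on both sides of the incremental load (`%RowKey` and the QVD path are illustrative names, not Reload Monitor fields):

// History side: rebuild the same composite key when reading the QVD
LineageHistory:
LOAD *,
    Hash128(ReloadID, FileType, FileName, DataConnectionID) as %RowKey
FROM [lib://YourDataLake/ReloadLineageHistory.qvd] (qvd);

// Incremental side: keep only rows whose composite key is unseen
CONCATENATE (LineageHistory)
LOAD DISTINCT *,
    Hash128(ReloadID, FileType, FileName, DataConnectionID) as %RowKey
RESIDENT AuditReloadLineage
WHERE NOT Exists(%RowKey, Hash128(ReloadID, FileType, FileName, DataConnectionID));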
Hi
In your modified Reload Monitor script, store the relevant lineage data (FileType, FileName, DataConnectionID, ReloadID, AppID, ReloadStartTime) into a QVD on every reload.
Then in the next reload, append new rows (rather than overwriting).
This builds your own historical lineage repository.
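A minimal sketch of that append pattern, assuming the lineage rows sit in `AuditReloadLineage`; the QVD path and the `LineageRepo` table name are illustrative:

// Current execution's lineage rows
LET vHistQvd = 'lib://YourSpace/LineageHistory.qvd'; // hypothetical path

LineageRepo:
LOAD FileType, FileName, DataConnectionID, ReloadID, AppID, ReloadStartTime
RESIDENT AuditReloadLineage;

// Append the prior history if the QVD already exists (skipped on first run)
IF NOT IsNull(FileSize('$(vHistQvd)')) THEN
    CONCATENATE (LineageRepo)
    LOAD * FROM [$(vHistQvd)] (qvd);
END IF

// Write the combined history back, so each reload grows the repository
STORE LineageRepo INTO [$(vHistQvd)] (qvd);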