Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi All,
I have been trying to wrap my head around Canonical dates and was wondering if I could please get some guidance.
I am trying to Load/Join 2 Tables with the same name Date fields into 1 MasterCalendar.
If I just perform a straight load of the 2 tables, I will get syn keys.
See my script
///////////////////////////////////////// Load Product Files
Set Path = 'lib://AttachedFiles/';
For each File in filelist(Path & '*.xlsx')
If(NOT(Wildmatch('$(File)', '*Schedule*'))) Then
Fact:
LOAD
"ProductID",
"ProductName",
"ReportDate",
FileName() as FilenameA
FROM
[$(File)]
(ooxml, embedded labels, table is Sheet1)
;
End If;
Next File;
///////////////////////////////////////// Load Schedule Files
For each File in filelist(Path & '*.xlsx')
If(Wildmatch('$(File)', '*Schedule*')) Then
KCT:
LOAD
"ProductID",
"TestingResults",
"ValidationStatus",
"AssessmentDate",
"ReportDate",
FileName() as FilenameB,
FROM
[$(File)]
(ooxml, embedded labels, header is 3 lines, table is [Schedule])
;
End If;
Next File;
Interestingly after the Syn Keys load - The data model/outcome is exactly how I wanted it.
But I want to avoid Syn keys for best data model practice. So I am asking for some guidance here.
Please see the tables:
Fact:
ProductID | ProductName | ReportDate | FileNameA |
CR1 | Sample1 | 31/01/2022 | Jan-22.xlsx |
CR1 | Sample1 | 28/02/2022 | Feb-22.xlsx |
CR1 | Sample1 | 31/03/2022 | Mar-22.xlsx |
CR1 | Sample1 | 30/04/2022 | Apr-22.xlsx |
CR1 | Sample1 | 31/05/2022 | May-22.xlsx |
CR1 | Sample1 | 30/06/2022 | Jun-22.xlsx |
KCT:
ProductID | TestingResults | ValidationStatus | AssessmentDate | ReportDate | FileNameB |
CR1 | - | - | - | 31/01/2022 | Schedule_Jan22.xlsx |
CR1 | Preparing | Validated | 15/02/2022 | 28/02/2022 | Schedule_Feb22.xlsx |
CR1 | Started | Validated | 08/03/2022 | 31/03/2022 | Schedule_Mar22.xlsx |
CR1 | Completed | Validated | 18/04/2022 | 30/04/2022 | Schedule_Apr22.xlsx |
CR1 | - | - | - | 31/05/2022 | Schedule_May22.xlsx |
CR1 | - | - | - | 30/06/2022 | Schedule_Jun22.xlsx |
Outcome I am looking for in a Straight Table
ProductID | ProductName | TestingResults | ValidationStatus | AssessmentDate | ReportDate | FileNameA | FileNameB |
CR1 | Sample1 | - | - | - | 31/01/2022 | Jan-22.xlsx | Schedule_Jan22.xlsx |
CR1 | Sample1 | Preparing | Validated | 15/02/2022 | 28/02/2022 | Feb-22.xlsx | Schedule_Feb22.xlsx |
CR1 | Sample1 | Started | Validated | 08/03/2022 | 31/03/2022 | Mar-22.xlsx | Schedule_Mar22.xlsx |
CR1 | Sample1 | Completed | Validated | 18/04/2022 | 30/04/2022 | Apr-22.xlsx | Schedule_Apr22.xlsx |
CR1 | Sample1 | - | - | - | 31/05/2022 | May-22.xlsx | Schedule_May22.xlsx |
CR1 | Sample1 | - | - | - | 30/06/2022 | Jun-22.xlsx | Schedule_Jun22.xlsx |
I am currently using 1 MasterCalendar with the below field
QuartersMap:
MAPPING LOAD
rowno() as Month,
'Q' & Ceil (rowno()/3) as Quarter
AUTOGENERATE (12);
Temp:
Load
min([ReportDate]) as minDate,
max([ReportDate]) as maxDate
Resident Fact;
Let varMinDate = Num(Peek('minDate', 0, 'Temp'));
Let varMaxDate = Num(Peek('maxDate', 0, 'Temp'));
DROP Table Temp;
TempCalendar:
LOAD
$(varMinDate) + Iterno()-1 As Num,
Date($(varMinDate) + IterNo() - 1) as TempDate
AutoGenerate 1 While $(varMinDate) + IterNo() -1 <= $(varMaxDate);
MasterCalendar:
Load
TempDate AS [ReportDate],
week(TempDate) As Week,
Year(TempDate) As Year,
Month(TempDate) As Month,
Day(TempDate) As Day,
// Month(MonthStart(TempDate)) & '-' & Year(TempDate) As MonthYear,
// month(monthstart(TempDate)) & '-' & Year(TempDate) as MonthYear,
Date(Monthstart(TempDate), 'MMM-YYYY') as MonthYear,
ApplyMap('QuartersMap', month(TempDate), Null()) as Quarter,
Week(weekstart(TempDate)) & '-' & WeekYear(TempDate) as WeekYear,
WeekDay(TempDate) as WeekDay
Resident TempCalendar
Order By TempDate ASC;
Drop Table TempCalendar;
If I could get this solved with 1 MasterCalendar that would be great. But I am open to other possible solutions.
Thank you
Hi, the first script is creating a syn key because both table has the "ProductID" and the ReportDate fields, and when two tables has more than one common field it will create a syn key.
To avoid this key you can load a composed key in both tables:
"ProductID" &'_'& ReportDate as ProductDateKey
If one of the table will hve all the keys, you can keep both fields in that table and remove from the other. If this can't be assured you can have another table related with this same field, that also splits the key in the two fields:
LOAD
ProductDateKey,
SubField(ProductDateKey,'_',1) as ProductID,
SubField(ProductDateKey,'_',2) as ReportDate,
LOAD
FieldValue('ProductDateKey', RecNo()) as ProductDateKey,
AutoGenerate FieldValueCount('ProductDateKey')
;
Hi, the first script is creating a syn key because both table has the "ProductID" and the ReportDate fields, and when two tables has more than one common field it will create a syn key.
To avoid this key you can load a composed key in both tables:
"ProductID" &'_'& ReportDate as ProductDateKey
If one of the table will hve all the keys, you can keep both fields in that table and remove from the other. If this can't be assured you can have another table related with this same field, that also splits the key in the two fields:
LOAD
ProductDateKey,
SubField(ProductDateKey,'_',1) as ProductID,
SubField(ProductDateKey,'_',2) as ReportDate,
LOAD
FieldValue('ProductDateKey', RecNo()) as ProductDateKey,
AutoGenerate FieldValueCount('ProductDateKey')
;
Thank you Ruben, your suggestion to use Composed key did it for me. I also incorporated @MK_QSL solution to a similar question to my script to make it work.
My updated script:
///////////////////////////////////////// Load Product Files
Set Path = 'lib://AttachedFiles/';
For each File in filelist(Path & '*.xlsx')
If(NOT(Wildmatch('$(File)', '*Schedule*'))) Then
Fact:
LOAD
"ProductID",
"ProductName",
"ReportDate",
[ProductID]&[ReportDate] as ProductDateKey,
FileName() as FilenameA
FROM
[$(File)]
(ooxml, embedded labels, table is Sheet1)
;
End If;
Next File;
///////////////////////////////////////// Load Schedule Files
For each File in filelist(Path & '*.xlsx')
If(Wildmatch('$(File)', '*Schedule*')) Then
KCT:
LOAD
"ProductID",
"TestingResults",
"ValidationStatus",
"AssessmentDate",
"ReportDate",
[ProductID]&[ReportDate] as ProductDateKey,
FileName() as FilenameB
FROM
[$(File)]
(ooxml, embedded labels, header is 3 lines, table is [Schedule])
;
End If;
Next File;
LinkTable:
LOAD Distinct
ProductDateKey,
ProductDateKey as TEMP_ProductDateKey,
[ProductID],
[ReportDate]
Resident Fact;
Concatenate (LinkTable)
LOAD Distinct
ProductDateKey,
[ProductID],
[ReportDate]
Resident KCT
Where Not Exists(TEMP_ProductDateKey,ProductDateKey);
Drop Field TEMP_ProductDateKey;
DROP Fields
[ProductID],
[ReportDate]
From Fact;
DROP Fields
[ProductID],
[ReportDate]
From KCT;
Hi, all can I ask for a follow-up scenario?
What if the Date field on my 2nd table (KCT) is a new field that is derived from the filename?
See example
Date(MonthEnd(Date(Date#(mid(FileName(),14,5),'MMMYY'),'DD-MM-YYYY'))) as [ReportDate]
in KCT Table
///////////////////////////////////////// Load Product Files
Set Path = 'lib://AttachedFiles/';
For each File in filelist(Path & '*.xlsx')
If(NOT(Wildmatch('$(File)', '*Schedule*'))) Then
Fact:
LOAD
"ProductID",
"ProductName",
"ReportDate",
[ProductID]&[ReportDate] as ProductDateKey,
FileName() as FilenameA
FROM
[$(File)]
(ooxml, embedded labels, table is Sheet1)
;
End If;
Next File;
///////////////////////////////////////// Load Schedule Files
For each File in filelist(Path & '*.xlsx')
If(Wildmatch('$(File)', '*Schedule*')) Then
KCT:
LOAD
"ProductID",
"TestingResults",
"ValidationStatus",
"AssessmentDate",
Date(MonthEnd(Date(Date#(mid(FileName(),14,5),'MMMYY'),'DD-MM-YYYY'))) as [ReportDate],
FileName() as FilenameB
FROM
[$(File)]
(ooxml, embedded labels, header is 3 lines, table is [Schedule])
;
End If;
Next File;
How would I get the composite key to work?
Thank you