Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi all,
Problem: I have a situation where a uniqueID can be 2 different results. See below:
Library Item Identifier | Program Name |
site-induction-program-zillmere-child-health-centre | General Evacuation and First Response |
site-induction-program-zillmere-child-health-centre | Site Induction |
I think I need to handle this through two mapping tables, so I have created two spreadsheets where the first has the following:
Library Item Identifier | Program Name |
site-induction-program-zillmere-child-health-centre | General Evacuation and First Response |
And the second has the following:
Library Item Identifier | Program Name |
site-induction-program-zillmere-child-health-centre | Site Induction |
My logic is to load the same table of data twice, but applymap the first table in the first time round, and then applymap the second table the second time round. Then concatenate the two tables together.
Unfortunately I keep getting a "Field names must be unique within table" error. I'm stumped, please help if you can!
Here is the full load script:
//Load and map the mandatory training library items to their program name
ProgramName:
Mapping LOAD [Library Item Identifier],
[Program Name]
FROM
MappingTables\MandatoryTrainingPrograms.xls
(biff, embedded labels, table is Sheet1$);
//Load and map the duplicate mandatory training library items to their program name
ProgramName2:
Mapping LOAD [Library Item Identifier],
[Program Name]
FROM
MappingTables\MandatoryTrainingProgramsDuplicate.xls
(biff, embedded labels, table is Sheet1$);
//Load the TEACHQ data
Training1:
LOAD Username as PID,
Email,
[Library Item Identifier],
[Library Item],
date(floor([Completed Date]),'DD/MM/YYYY') as CompletedDate
FROM
QlikCorporateMandatoryTrainingReportAllEmployees.xlsx
(ooxml, embedded labels, table is Sheet1);
//Filter the TEACHQ data so it only shows the last completed date for each employees library item
Left Join (Training1)
Load PID,
[Library Item],
date(max(CompletedDate)) as maxCompletedDate
Resident Training1
Group by PID,[Library Item];
//Create a new table that only shows the last completed date for each library item by employee
Noconcatenate
TrainingData:
Load *
Resident Training1
WHERE CompletedDate=maxCompletedDate;
Drop table Training1;
//Load the current contracts table to identify an employees earliest start date in CHQ
EmployeesStartDate:
LOAD Pers.ID as PID,
[Start Date]
FROM
EmployeeContracts.xls
(biff, embedded labels, table is Sheet1$);
Left Join (EmployeesStartDate)
LOAD PID,
date(min([Start Date])) as minstartdate
Resident EmployeesStartDate
Group by PID;
//Create a new table of employees that has their PID and their minStartDate in CHQ
Noconcatenate
CleanEmployee:
LOAD PID,
[Start Date] as MinStartDate
Resident EmployeesStartDate
WHERE [Start Date]=minstartdate;
Drop table EmployeesStartDate;
//Create a new table of employees that has their PID, their minStartDate in CHQ and the last time they completed each library item and map the program name
Left Join (CleanEmployee)
LOAD
*,
ApplyMap('ProgramName', [Library Item Identifier], 'UpdateMappingTable') as [Program Name]
Resident TrainingData;
Drop table TrainingData;
//Reload the table of employees that has their PID, their minStartDate in CHQ and the last time they completed each library item and map the other program name so it concatenates
concatenate LOAD
*,
ApplyMap('ProgramName2', [Library Item Identifier], 'UpdateMappingTable') as [Program Name]
Resident CleanEmployee;
In your last LOAD, [Program Name] is already part of the input table.
You would need to list all fields except [Program Name] from CleanEmployee instead of using the star symbol * to get around this.
But if you want to duplicate your records, why don't you join your employee table with a table with your two records:
Library Item Identifier | Program Name |
site-induction-program-zillmere-child-health-centre | General Evacuation and First Response |
site-induction-program-zillmere-child-health-centre | Site Induction |
?
Or create a linked table for program name instead duplicating your employee records?
In your last LOAD, [Program Name] is already part of the input table.
You would need to list all fields except [Program Name] from CleanEmployee instead of using the star symbol * to get around this.
But if you want to duplicate your records, why don't you join your employee table with a table with your two records:
Library Item Identifier | Program Name |
site-induction-program-zillmere-child-health-centre | General Evacuation and First Response |
site-induction-program-zillmere-child-health-centre | Site Induction |
?
Or create a linked table for program name instead duplicating your employee records?
Thank you so much Stefan! I feel like a moron though for not realising that at first though... Haha.
If you have the time, can you explain your other 2 solutions as I don't understand the logic?
The other two options may look like (untested code):
//Create a single table for lib item Program names:
//assuming each table shows a single entry per item
ProgramName:
LOAD [Library Item Identifier],
[Program Name],
'Original' as Source
FROM
MappingTables\MandatoryTrainingPrograms.xls
(biff, embedded labels, table is Sheet1$);
Concatenate (ProgramName)
LOAD [Library Item Identifier],
[Program Name],
'Duplicate' as Source
FROM
MappingTables\MandatoryTrainingProgramsDuplicate.xls
(biff, embedded labels, table is Sheet1$);
//Load the TEACHQ data
Training1:
LOAD Username as PID,
Email,
[Library Item Identifier],
[Library Item],
date(floor([Completed Date]),'DD/MM/YYYY') as CompletedDate
FROM
QlikCorporateMandatoryTrainingReportAllEmployees.xlsx
(ooxml, embedded labels, table is Sheet1);
//Filter the TEACHQ data so it only shows the last completed date for each employees library item
Left Join (Training1)
Load PID,
[Library Item],
date(max(CompletedDate)) as maxCompletedDate
Resident Training1
Group by PID,[Library Item];
//Create a new table that only shows the last completed date for each library item by employee
Noconcatenate
TrainingData:
Load *
Resident Training1
WHERE CompletedDate=maxCompletedDate;
Drop table Training1;
//Load the current contracts table to identify an employees earliest start date in CHQ
EmployeesStartDate:
LOAD Pers.ID as PID,
[Start Date]
FROM
EmployeeContracts.xls
(biff, embedded labels, table is Sheet1$);
Left Join (EmployeesStartDate)
LOAD PID,
date(min([Start Date])) as minstartdate
Resident EmployeesStartDate
Group by PID;
//Create a new table of employees that has their PID and their minStartDate in CHQ
Noconcatenate
CleanEmployee:
LOAD PID,
[Start Date] as MinStartDate
Resident EmployeesStartDate
WHERE [Start Date]=minstartdate;
Drop table EmployeesStartDate;
//Create a new table of employees that has their PID, their minStartDate in CHQ and the last time they completed each library item and map the program name
Left Join (CleanEmployee)
LOAD
*,
Resident TrainingData;
Drop table TrainingData;
Now you should have your clean table and a linked ProgramData table, which you can join to the CleanEmployee table if you want, duplicating the records as with your mapping approach.
But maybe you don't need to join, just leave the tables linked to each other.