Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello guys, I'm new to Qlik and I'm already facing my first big challenge so I would like to know if you've faced a similar problem before.
I have a table with many (many) columns ,but basically it is just a versioned list of documents that looks like this:
You can see that there are some documents that don't have a first version (Version='1') so what I need to do is create a script to check if a first version exists for every document, and if this is not the case, add it using the info that I have from the second version (Version='2').
My expected result is the following:
What could I do from the Data Load Editor?
Thanks a lot for taking the time to help me.
Hi try this:
Data:
LOAD * Inline [
Document_ID, Version
1, 2
1, 3
1, 4
1, 5
2, 1
2, 2
3, 1
4, 1
5, 1
6, 1
7, 2
7, 3
7, 4
8, 1
];
TEMP:
LOAD distinct
Document_ID AS ID
RESIDENT Data
Where Version = 1;
Concatenate(Data)
Load distinct Document_ID,
1 AS Version
RESIDENT Data
Where Not Exists(ID, Document_ID);
DROP TABLE TEMP;
Best regards Son
Hi try this:
Data:
LOAD * Inline [
Document_ID, Version
1, 2
1, 3
1, 4
1, 5
2, 1
2, 2
3, 1
4, 1
5, 1
6, 1
7, 2
7, 3
7, 4
8, 1
];
TEMP:
LOAD distinct
Document_ID AS ID
RESIDENT Data
Where Version = 1;
Concatenate(Data)
Load distinct Document_ID,
1 AS Version
RESIDENT Data
Where Not Exists(ID, Document_ID);
DROP TABLE TEMP;
Best regards Son
This script will fill all missing rows with the next existing entry for all documents.
If versions 1, 2 and 4 are missing, but 3, 5 and 6 exist, 1 and 2 will use the values from 3 and 4 those from 5.
// Loading data
// I've removed a few more entries from the data...
Data:
NoConcatenate Load * Inline [
Document_ID, Version, LastVersion, Author_ID, Client_ID
1, 4, N, 100, 888
1, 5, Y, 100, 888
2, 1, N, 133, 777
2, 2, Y, 133, 777
3, 1, Y, 100, 666
4, 1, Y, 100, 555
5, 1, Y, 133, 444
6, 1, Y, 189, 333
7, 3, N, 133, 222
7, 4, Y, 133, 222
8, 1, Y, 100, 999
];
// Creating a list of all versions for all documents
AllDocuments:
NoConcatenate Load Distinct
Document_ID
Resident Data;
MaxVersion:
NoConcatenate Load
Max(Version) as MaxVersion
Resident Data;
Join(AllDocuments) Load
RecNo() as Version
AutoGenerate Peek('MaxVersion');
Drop Table MaxVersion;
// Restricting the list to all versions <= the max. version for that document
Right Join(AllDocuments) IntervalMatch(Version, Document_ID) Load Distinct
1 as MinVersion,
Max(Version) as MaxVersion,
Document_ID
Resident Data
Group By Document_ID;
// Adding other fields to new table
Left Join(AllDocuments) Load * Resident Data;
Drop Table Data;
// Creating the final table and filling the missing versions from the next filled version
// This works for ALL missing versions, not only the first one and also when more then one is missing.
Final:
NoConcatenate Load
Document_ID,
Version,
Coalesce(LastVersion, Peek(LastVersion)) as LastVersion,
Coalesce(Author_ID, Peek(Author_ID)) as Author_ID,
Coalesce(Client_ID, Peek(Client_ID)) as Client_ID
Resident AllDocuments
Order By Document_ID, Version desc;
Drop Table AllDocuments;
You could even remove the LastVersion field from the input data and set the flag in the script. This way there can't be 2 LastVersions for any document.
Wow! You really thought of everything. I'm not sure if I need to implement this but it's clearly a flawless solution.
Thanks a lot!
- Marcelo
Hello Son, your solution works perfectly!
That was exactly what I needed, thank you so much 😉
- Marcelo