Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi Everyone- I'm hoping someone on here can help me please?
I need to use QlikView to compare two different CSV files and report back on any differences between them.
For example the file below, if I wanted to compare it against a duplicate of the same file, it should compare each cell and then report any differences. For example, if Age for 001 was 44 on the other file, it should report this as a difference as this one shows 43.
NOTE: It needs to use the ID as a lookup, as on the other file, the row numbers may not be the same.
I hope this all makes sense and somebody can help!!!
Thanks
ID | Age | Height | Colour |
001 | 43 | 176 | W |
002 | 31 | 154 | B |
003 | 22 | 132 | W |
004 | 27 | 110 | B |
005 | 23 | 156 | W |
006 | 47 | 133 | B |
007 | 43 | 186 | W |
008 | 38 | 199 | B |
009 | 21 | 164 | W |
010 | 35 | 132 | B |
I guess there are several ways of comparing the two files. Here is what I'd do:
1. Load both files into 2 separate tables.
2. Prior to the load, QUALIFY all fields except for the ID.
3. In both tables, add a new field that contains all different fields concatenated together - call it AllFields.
4. Create a Straight Table and limit the view to only those IDs where AllFields are different.
cheers,
Oleg Troyansky
Upgrade your Qlik skills at the Masters Summit for Qlik - coming soon to Austin, Texas and Johannesburg, South Africa!
Hi Freddie,
Try:
Data:
LOAD * INLINE [
File, ID, Age, Height, Colour
1, 001, 43, 176, W
1, 002, 31, 154, B
1, 003, 22, 132, W
1, 004, 27, 110, B
1, 005, 23, 156, W
1, 006, 47, 133, B
1, 007, 43, 186, W
1, 008, 38, 199, B
1, 009, 21, 164, W
1, 010, 35, 132, B
];
LOAD * INLINE [
File, ID, Age, Height, Colour
2, 001, 44, 176, W
2, 002, 31, 154, B
2, 003, 22, 132, W
2, 004, 27, 110, B
2, 005, 23, 156, W
2, 006, 47, 133, B
2, 007, 43, 186, W
2, 008, 38, 199, B
2, 009, 21, 164, W
2, 010, 35, 132, B
];
//These concatenated inline tables represent your two csv files. There is a field to identify the different source files.
CrossData:
CrossTable(Field,Value,2) LOAD * Resident Data;
drop Table Data;
//at this stage we get a table like this:
File | ID | Field | Value |
---|---|---|---|
1 | 001 | Age | 43 |
2 | 001 | Age | 44 |
1 | 001 | Colour | W |
2 | 001 | Colour | W |
1 | 001 | Height | 176 |
2 | 001 | Height | 176 |
1 | 002 | Age | 31 |
2 | 002 | Age | 31 |
1 | 002 | Colour | B |
2 | 002 | Colour | B |
1 | 002 | Height | 154 |
2 | 002 | Height | 154 |
//we want to discard any rows that duplicate ID, Field and Value combinations
//the bottom half of this preceding load groups on ID, Field and Value and counts the number of values of File.
//the count of file will be one only if a combo of ID, Field and Value is unique.
//the top half returns only the ID and Fields with differing values in the two files
//the Inner Keep operation with the CrossData table discards the records that match in the two files
Answer:
Inner Keep(CrossData)
LOAD
ID,
Field
Where NumFiles = 1;
LOAD
ID,
Field,
Value,
count(File) as NumFiles
Resident CrossData Group by ID, Field,Value;
DROP Table Answer;
After all that we get the answer I hope you're looking for:
ID | Field | File | 1 | 2 |
---|---|---|---|---|
001 | Age | 43 | 44 |
this is a pivot table with dims ID,Field and file and Expression Only(Value).
Good Luck
Andrew