Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

How to compare two CSV files using QlikView?

Hi Everyone- I'm hoping someone on here can help me please?

I need to use QlikView to compare two different CSV files and report back on any differences between them.

For example the file below, if I wanted to compare it against a duplicate of the same file, it should compare each cell and then report any differences. For example, if Age for 001 was 44 on the other file, it should report this as a difference as this one shows 43.

NOTE: It needs to use the ID as a lookup, as on the other file, the row numbers may not be the same.

I hope this all makes sense and somebody can help!!!

Thanks

IDAgeHeightColour
00143176W
00231154B
00322132W
00427110B
00523156W
00647133B
00743186W
00838199B
00921164W
01035132B
2 Replies
Oleg_Troyansky
Partner Ambassador/MVP
Partner Ambassador/MVP

I guess there are several ways of comparing the two files. Here is what I'd do:

1. Load both files into 2 separate tables.

2. Prior to the load, QUALIFY all fields except for the ID.

3. In both tables, add a new field that contains all different fields concatenated together - call it AllFields.

4. Create a Straight Table and limit the view to only those IDs where AllFields are different.

cheers,

Oleg Troyansky

Upgrade your Qlik skills at the Masters Summit for Qlik - coming soon to Austin, Texas and Johannesburg, South Africa!

effinty2112
Master
Master

Hi Freddie,

Try:

Data:

LOAD * INLINE [

    File, ID, Age, Height, Colour

    1, 001, 43, 176, W

    1, 002, 31, 154, B

    1, 003, 22, 132, W

    1, 004, 27, 110, B

    1, 005, 23, 156, W

    1, 006, 47, 133, B

    1, 007, 43, 186, W

    1, 008, 38, 199, B

    1, 009, 21, 164, W

    1, 010, 35, 132, B

];

LOAD * INLINE [

    File, ID, Age, Height, Colour

    2, 001, 44, 176, W

    2, 002, 31, 154, B

    2, 003, 22, 132, W

    2, 004, 27, 110, B

    2, 005, 23, 156, W

    2, 006, 47, 133, B

    2, 007, 43, 186, W

    2, 008, 38, 199, B

    2, 009, 21, 164, W

    2, 010, 35, 132, B

];

//These concatenated inline tables represent your two csv files. There is a field to identify the different source files.

CrossData:

CrossTable(Field,Value,2) LOAD * Resident Data;

drop Table Data;

//at this stage we get a table like this:

File ID Field Value
1001Age43
2001Age44
1001ColourW
2001ColourW
1001Height176
2001Height176
1002Age31
2002Age31
1002ColourB
2002ColourB
1002Height154
2002Height154


//we want to discard any rows that duplicate ID, Field and Value combinations

//the bottom half of this preceding load groups on ID, Field and Value and counts the number of values of File.

//the count of file will be one only if a combo of ID, Field and Value is unique.

//the top half returns only the ID and Fields with differing values in the two files

//the Inner Keep operation with the CrossData table discards the records that match in the two files

Answer:

Inner Keep(CrossData)

LOAD

ID,

Field

Where NumFiles = 1;

LOAD

ID,

Field,

Value,

count(File) as NumFiles

Resident CrossData Group by ID, Field,Value;

DROP Table Answer;

After all that we get the answer I hope you're looking for:

ID Field File 12
001Age 4344

this is a pivot table with dims ID,Field and file and Expression Only(Value).

Good Luck

Andrew