Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I have loaded 2 tables with exactly the same columns: TABLE_A and TABLE_B.
I want to be able to view the rows that are in TABLE_B but not in TABLE_A. The 2 tables are very large (around 160 million rows) and there are around 5 million extra rows in TABLE_B so the most efficient way of doing this would be very helpful. Just being able to see some of the rows that are different would still be helpful.
Many thanks!
Hi Afraz,
You can try "not exits" function.
See the attachment.
Thanks.
Narender
Take a look at the document Test_Join posted here
Hi Afraz,
You can try "not exits" function.
See the attachment.
Thanks.
Narender
Thanks Andrea.
I have had a look at your 'Test Join' file and it almost looks like what I need. However, instead of seeing matching rows I need to see the rows that don't match. How should I do this?
The qvw I suggested contains a lisbox named "Flag"...select "Table 1 rows" or "Table 2 rows" and you will see only rows that belongs to Table 1 or Table 2.
Another way to achieve your needs is to use not exists in your load statement.
Thanks Narender.
Will your example also work if I am doing a SQL SELECT from a DB? Something like:
Table_1:
LOAD
Key,
'QVD' as Source
FROM[QVD Path](qvd);
join
Table_2:
SQL SELECT
Key,
'DB' as Source
FROM "DB"
WHERE NOT EXISTS(Key);
Yes it will work.
NOT Exist also used in sql.
Narender.
I am still having trouble with this. When I execute the above script I get a SQL Select error. I don't want the 'Where not exists' to be executed in the DB, it should be done once the data has been loaded when joining to Table_1. Is there another way to do this or have I not understood the 'Where not Exists' correctly?
When you fetch the data from your sql database then then it looks like as
LOAD "OFFICE_CODE",
"LOCATION_CODE",
NAME,
ACTIVE,
SQL SELECT *
FROM CIIS.dbo."REF_OFFICE_LOCATION" WHERE NOT EXISTS(yourfield);
Please check.
Assuming your tables have a primary key and you want to load all the data from both tables. You can create a flag field that indicates if the record from table B is a duplicate:
Data:
sql select PrimaryKeyField, field2, field3, ...., 'source1' as source' from sourcetable1;
concatenate(Data)
load *, if(exists(PrimaryKeyField),1,0) as FlagDuplicate;
sql select PrimaryKeyField, field2, field3, ...., 'source2' as source' from sourcetable2;
You can then select 1 in a listbox that has FlagDuplicate as field to find the records that are duplicates, or select 0 to find the records from table B that are not a duplicate.