I'm trying to create a yes-no field that I can use in a filter within my app. I'm ending up with duplicate rows. The point of the app is to provide a way for people to filter IDs that are in db#1 based on whether or not they exist in db#2. Here's a sample dataset that mimics the raw data I'm working with:
ID, dbnum, dbID
1, 1, EX4502
1, 2, 68465832
2, 1, VM640
3, 1, DX1201
3, 2, 49674828
4, 2, 3857393837
5, 1, VX12
Here's how I'd like to transform the data:
ID, db1ID, db2ID, Has_db2ID
1, EX4502, 68465832, Yes
2, VM640, , No
3, DX1201, 49674828, Yes
4, , 3857393837, Yes
5, VX12, , No
Here's the load query I'm using:
SET cYES = Dual('Yes', 1);
SET cNO = Dual('No', 0);
DB2:
LOAD
ID,
If(dbnum = '1', dbID) as db1ID
FROM [file]
(qvd)
WHERE match(dbnum, '1');
left keep LOAD
ID,
If(dbnum = '2', dbID) as db2ID,
If(db = '2', $(cYES), $(cNO)) as Has_db2ID
FROM [file]
(qvd)
WHERE match(dbnum, '2', '1');
I end up with data that looks and behaves like this:
ID, db1ID, db2ID, Has_db2ID
1, EX4502, , No
1, EX4502, 68465832, Yes
Within a table that displays the data, if I filter by ID = 1 I get multiple rows as above. I have a map in my app that lights up for ID = 1 when I filter for both yes and no values for Has_db2ID (after I load this data I'm also loading geospatial & other data that contains ID).
I've tried with resident loads, concatenating & other ways of creating the yes/no field, but still got the same result (or introduced other issues that still didn't get rid of the extra row). I need a data model that gives me one row per ID. Thanks for any advice!
Try this may be:
Table:
LOAD * Inline [
ID, dbnum, dbID
1, 1, EX4502
1, 2, 68465832
2, 1, VM640
3, 1, DX1201
3, 2, 49674828
4, 2, 3857393837
5, 1, VX12
];
TempTable:
LOAD ID,
dbID as db1ID
Resident Table
Where dbnum = 1;
Join (TempTable)
LOAD ID,
dbID as db2ID,
'Yes' as Has_db2ID
Resident Table
Where dbnum = 2;
FinalTable:
NoConcatenate
LOAD ID,
db1ID,
db2ID,
If(IsNull(Has_db2ID), 'No', Has_db2ID) as Has_db2ID
Resident TempTable;
DROP Table Table, TempTable;
Try this may be:
Table:
LOAD * Inline [
ID, dbnum, dbID
1, 1, EX4502
1, 2, 68465832
2, 1, VM640
3, 1, DX1201
3, 2, 49674828
4, 2, 3857393837
5, 1, VX12
];
TempTable:
LOAD ID,
dbID as db1ID
Resident Table
Where dbnum = 1;
Join (TempTable)
LOAD ID,
dbID as db2ID,
'Yes' as Has_db2ID
Resident Table
Where dbnum = 2;
FinalTable:
NoConcatenate
LOAD ID,
db1ID,
db2ID,
If(IsNull(Has_db2ID), 'No', Has_db2ID) as Has_db2ID
Resident TempTable;
DROP Table Table, TempTable;
Seems close, but it's not retaining the 'yes' values for Has_db2ID. It gives me only this result:
1, EX4502, , No
And cuts this result, which is the one I need:
1, EX4502, 68465832, Yes
Experimenting with the If(IsNull... line in FinalTable to see if I can make it work.
Is this not the output you are looking for? I got this using the same script I provided above
Yep, you're right. Your file does it perfectly.
In my app, I was limiting some of the data on ingest, and it was causing the issue. I think I"m all set now.
Thank you, Sunny T! Much appreciated.