Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I'm using the code below to take a first and last name from one table (135K records)
and search 2.15 million records in a 2nd table for the first and last name to be in a field called FullName.
I tried building a crossreference table and it was too much for the server so resorted to doing it one by one
but it is taking forever (estimating 30 hours) at about one sec per search. Also the DO WHILE LOOP code has some kind of bug becasue at the end of my sample runs, the next command is skipped (with or without a semicolon after LOOP)
WL_CANDIDATES:
LOAD rowNum, WL_ID, fName, lName, lenfName
FROM ..\QVD\WL_FirstLastCandidates.QVD (qvd);
CUST_DATA:
LOAD IND_PARTY_KEY, FullName
FROM ..\QVD\CUST_DATA_SEARCH.QVD (qvd);
LET MAX = NoOfRows('WL_CANDIDATES');
SET lowRnge=0;
DO WHILE lowRnge<MAX
LET fName = Peek('fName',$(lowRnge),'WL_CANDIDATES');
LET lName = Peek('lName',$(lowRnge),'WL_CANDIDATES');
LET wlID = Peek('WL_ID',$(lowRnge),'WL_CANDIDATES');
HITLIST:
LOAD
IND_PARTY_KEY,
Num($(wlID)) as WL_ID,
'Y' as matchFound
Resident CUST_DATA
WHERE index(FullName,'$(fName)')>0
AND index(Mid(FullName,index(FullName,'$(fName)') + 1 + Len('$(fName)')),'$(lName)')>0;
LET lowRnge=lowRnge+1;
LOOP
Trace wlID = $(wlID);
Drop Table WL_CANDIDATES;
STORE HITLIST INTO ..\QVD\HITLIST.QVD (QVD);
DROP TABLES HITLIST, CUST_DATA;
// Interesting anomaly: The first time I ran this I had a lot of code after the loop, so I put an “exit script” after the LOOP.
// When it ran, the exit script command was completely ignored and it proceeded to run the rest of the code.
// I then moved this code to the end. I still needed to drop a temp table so initially the last command was: Drop Table WL_CANDIDATES;
// I added a few more commands as shown below in the hopes that if it skipped over the first one, the nextwould actually execute.
Trace wlID = $(wlID);
Drop Table WL_CANDIDATES;
STORE HITLIST INTO ..\QVD\HITLIST.QVD (QVD);
DROP TABLES HITLIST, CUST_DATA;
// Unfortunately, I never got to see this work as the process never completed again. It ran for 5 hours and completed around 3000 of the 135,502 records.
Can anyone please recommend more efficient code that would do this quicker?
hi!
just to be clear, you have the table [WL_CANDIDATES] and you want to lookup the fname and lname in [CUST_DATA] and create the field [Full Name] and store this into the [HITLIST] table?
Have you tried doing joins or mapping table?
I'm taking a first and last name from one table (135K records)
and searching 2.15 million records in a 2nd table for the first and last name to be in a field called FullName.
There is no key to join on.
In the end I end up with HITLIST table that connects a WL_ID to an IND_PARTY_ID with a matchFound='Y' so I can filter cust data by matches found.
It's an absolute performance killer to iterate over a larger table and query the record values using PEEK(). Always try to avoid these kind of loops.
You should be able to use a mapping table with your search names and MapSubstring() in your CustData table to filter your data, something along what is demonstrated in e.g.
I thought about that, but then my search logic is a bit more complicated than a simple wildcard search.
I "Capitalize" the fName, lName and FullName ahead of time
then search for fName in FullName case sensitive
I then search for lName to be in remainder of string after where fName was found, also case sensitive
The results are really good.
Is there a better way to run my query without PEEK?
Should I query the smaller table for one row, then query the large?
To demonstrate your complications, could you post some sample lines of (mock-up) data for both tables?
WL_CANDIDATES:
fName lName
A Lee
Amit Hubar
David John
CUST_DATA: HITLIST:
FullName matchFound
Mr Alexander Lee Y
Ami Glee
Happy Lee
Amit Chin Hubarendi Y
Kamit Hubar
Davidson Johnson Y
And of course if the last name preceeds the first name that is a fail (no match)
Something like
WL_CANDIDATES:
LOAD * INLINE [
fName, lName
A, Lee
Amit, Hubar
David, John
];
MAP1:
MAPPING LOAD
fName, '\'&1&'/'
Resident WL_CANDIDATES;
MAP2:
MAPPING LOAD
lName, '\'&2&'/'
Resident WL_CANDIDATES;
CUST_DATA:
LOAD *, If(KeepChar(Mapped,'\/12') ='\1/\2/', 'Yes') as Match;
LOAD *, MapSubString('MAP2',MapSubString('MAP1',FullName)) as Mapped
INLINE [
FullName
Mr Alexander Lee
Ami Glee
Happy Lee
Amit Chin Hubarendi
Kamit Hubar
Davidson Johnson
];
Hmmm, very nice! The only problem I see, unless I read this wrong, is that you are matching a mix of all last names and first names. the two must be used exclusively together.