Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi Experts
I have a table with several data columns.
And one of the field is a string type. (This field is feed from a data source where free form description allowed)
SOURCE_TABLE:
F1, F2, F3
ABC , 100, $20000
abc Inc., 220, $25000
Abc Plastics, 230, $23000
3M, 100, 12000
3M Care, 175, 6500
And the real data contains 100s of variations of F1
In this example, I want the resultant table with F1 and F3, grouped by F1
RESULTANT_TABLE:
F1, F2
ABC, $68000
3M, $18500
Can the resultant F1 be stored in a mapping table
and
add resultant F1 as a column to the source table through apply map based on 'fuzzy' match ?
If so, can you pl share sample script ?
Also pl suggest other options to achieve this ?
Thanks,
Umashankar
Hi Umashankar,
As an variant
//load the source table
TABLE:
LOAD * Inline [
F1, F2, F3
ABC , 100, $20000
abc Inc., 220, $25000
Abc Plastics, 230, $23000
3M, 100, 12000
3M Care, 175, 6500];
NoConcatenate
SOURCE_TABLE:
LOAD
F1,
Lower(SubField(F1, ' ', 1)) as F1Key, //create key
F2,
Money#(F3, '$#') as F3 //convert money format
Resident TABLE;
Left Join
LOAD
F1Key,
Money(Sum(F3), '$ #,##') as SumF3 //convert money format
Resident SOURCE_TABLE
Group By F1Key;
DROP Table TABLE;
//Then you can remove the unneeded fields from the resulting table
//DROP Fields ... From SOURCE_TABLE;
Result
Regards,
Andrey
You have to create a lookup table for values in the F1 and the value it should take in the resultant table. Can use Applymap and create a table and group by.
TABLE:
load * inline [
F1, F2, F3
ABC , 100, $20000
abc Inc., 220, $25000
Abc Plastics, 230, $23000
3M, 100, 12000
3M Care, 175, 6500];
//Creating a mapping table
lookup:
Mapping
load * inline
[
F1,F1_value
ABC,ABC
abc Inc.,ABC
Abc Plastics,ABC
3M,3M
3M Care,3M];
FinalTable:
LEFT join(TABLE)
load F1,ApplyMap('lookup',F1) AS RES_F1 Resident TABLE;
lEFT JOIN(TABLE)
LOAD RES_F1,SUM(F3)
Resident TABLE
GROUP BY RES_F1;
Hope it helps!!
Hi Umashankar,
As an variant
//load the source table
TABLE:
LOAD * Inline [
F1, F2, F3
ABC , 100, $20000
abc Inc., 220, $25000
Abc Plastics, 230, $23000
3M, 100, 12000
3M Care, 175, 6500];
NoConcatenate
SOURCE_TABLE:
LOAD
F1,
Lower(SubField(F1, ' ', 1)) as F1Key, //create key
F2,
Money#(F3, '$#') as F3 //convert money format
Resident TABLE;
Left Join
LOAD
F1Key,
Money(Sum(F3), '$ #,##') as SumF3 //convert money format
Resident SOURCE_TABLE
Group By F1Key;
DROP Table TABLE;
//Then you can remove the unneeded fields from the resulting table
//DROP Fields ... From SOURCE_TABLE;
Result
Regards,
Andrey
Thanks for the reply
Since F1 variations are more, we would need dynamic expression to handle
Thanks for the reply
The suggestion helps
Additionally, if there are data variations / spelling mismatch (within Subfield)....is there a way to handle by means of fuzzy match within the expression ?
example data for F1
BOTICELLI
BOTTICELLI Inc
BOTICELI Brands
For this it is necessary to apply a mapping table. Look at this, please here Mapping ‒ QlikView
and ApplyMap ‒ QlikView
In my view try to identify errors in the words automatically is not very reasonable. If two words differ by one or more letters and have different meanings - is not always clear, it is a mistake or not.
Regards,
Andrey
I think . Unless you have a solid lookup table, we cant always trust the numbers you get. Its hard to find out each case where data anomaly occurs and use some function to group it.
You still want to automatically link
BOTICELLI
BOTTICELLI Inc
BOTICELI Brands in a single value?
Yes, I would like to link in a single value
Thanks