Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi Qlikies
I am looking for solution on wild match/ pick match/ applymap in two different tables columns. I have huge data set to find matching and create new derived fields . Unfortunately I do not have key
LookUpData:
Load * inline
[Cat,SubCat
CASTINGS & FORGINGS,AIRFOIL CASTINGS
CASTINGS & FORGINGS,OTHER CASTINGS (DIE CASTINGS, ETC)
CASTINGS & FORGINGS,SAND CASTING
CASTINGS & FORGINGS,STRUCTURAL INVESTMENT
CASTINGS & FORGINGS,ISOTHERMAL
CASTINGS & FORGINGS,SEAMLESS RINGS
ELECTRONICS,PCBS / PCBAS / FLEXPRINTS
ELECTRONICS,PCBS / PCBAS / FLEXPRINTS
ELECTRONICS,BOX ASSEMBLIES
MACHINING,FAN BLADES & AIRFOILS
ENGINEERED ITEMS,"ADPTRS,CPLRS,VALVES"
ENGINEERED ITEMS,OTHER / UNCATEGORIZED
LOGISTICS,Warehousing and 3PL];
DataTab:
Load * inline
[CatData,SubCatData
BEARINGS,BALL
BEARINGS,CYLINDRICAL ROLLER
CASTINGS,AIRFOIL
CASTINGS,OTHER CASTINGS (DIE CASTINGS, ETC)
CASTINGS,SAND
CASTINGS,STRUCTURAL INVESTMENT
FORGINGS,CLOSED DIE
ENGINEERED ITEMS,VALVES
MACHINING,AIRFOILS
MACHINING,ASSEMBLIES
LOGISTICS,3PL
ELECTRONICS,PCB
ELECTRONICS,PCBA
ENGINEERED,ITEMS OTHER
TRAVEL,AGENCY
];
The final output should look like
1. Wildmatch CatData field from DataTab and pick value from LookUpTab Cat Field
2. If WildMatch data on CatData field then do wild match on second column SubCat column for matching CatData
3. If CatData not matched then CatData and SubCatData as it is.
4. Pick up all non match Cat and SubCat from Lookptab
CatData | SubCommData | Derived | SubCommDataDerived | MatchFlag |
ELECTRONICS | PCB | ELECTRONICS | PCBS / PCBAS / FLEXPRINTS | Y |
ELECTRONICS | PCBA | ELECTRONICS | PCBS / PCBAS / FLEXPRINTS | Y |
BEARINGS | BALL | BEARINGS | BALL | N |
BEARINGS | CYLINDRICAL ROLLER | BEARINGS | CYLINDRICAL ROLLER | N |
CASTINGS | AIRFOIL | CASTINGS & FORGINGS | AIRFOIL CASTINGS | Y |
CASTINGS | OTHER CASTINGS (DIE CASTINGS, ETC) | CASTINGS & FORGINGS | OTHER CASTINGS (DIE CASTINGS, ETC) | Y |
CASTINGS | SAND | CASTINGS & FORGINGS | SAND CASTING | Y |
CASTINGS | STRUCTURAL INVESTMENT | CASTINGS & FORGINGS | STRUCTURAL INVESTMENT | Y |
MACHINING | AIRFOILS | MACHINING | FAN BLADES & AIRFOILS | Y |
MACHINING | ASSEMBLIES | MACHINING | ASSEMBLIES | Y |
ENGINEERED ITEMS | VALVES | ENGINEERED ITEMS | ADPTRS,CPLRS,VALVES | Y |
ENGINEERED ITEMS | OTHER | ENGINEERED ITEMS | OTHER / UNCATEGORIZED | Y |
LOGISTICS | 3PL | LOGISTICS | Warehousing and 3PL | Y |
TRAVEL | AGENCY | TRAVEL | AGENCY | N |
FORGINGS | CLOSED DIE | CASTINGS & FORGINGS | CLOSED DIE | Y |
CASTINGS & FORGINGS | ISOTHERMAL | N | ||
CASTINGS & FORGINGS | SEAMLESS RINGS | N | ||
ELECTRONICS | BOX ASSEMBLIES | N |
Thank you
Dev
One solution is .
DataTab:
Load RowNo() As RowID, * inline
[CatData,SubCatData
BEARINGS,BALL
BEARINGS,CYLINDRICAL ROLLER
CASTINGS,AIRFOIL
CASTINGS,OTHER CASTINGS (DIE CASTINGS, ETC)
CASTINGS,SAND
CASTINGS,STRUCTURAL INVESTMENT
FORGINGS,CLOSED DIE
ENGINEERED ITEMS,VALVES
MACHINING,AIRFOILS
MACHINING,ASSEMBLIES
LOGISTICS,3PL
ELECTRONICS,PCB
ELECTRONICS,PCBA
ENGINEERED,ITEMS OTHER
TRAVEL,AGENCY
];
Join
LookUpData:
Load * inline
[Cat,SubCat
CASTINGS & FORGINGS,AIRFOIL CASTINGS
CASTINGS & FORGINGS,OTHER CASTINGS (DIE CASTINGS, ETC)
CASTINGS & FORGINGS,SAND CASTING
CASTINGS & FORGINGS,STRUCTURAL INVESTMENT
CASTINGS & FORGINGS,ISOTHERMAL
CASTINGS & FORGINGS,SEAMLESS RINGS
ELECTRONICS,PCBS / PCBAS / FLEXPRINTS
ELECTRONICS,PCBS / PCBAS / FLEXPRINTS
ELECTRONICS,BOX ASSEMBLIES
MACHINING,FAN BLADES & AIRFOILS
ENGINEERED ITEMS,"ADPTRS,CPLRS,VALVES"
ENGINEERED ITEMS,OTHER / UNCATEGORIZED
LOGISTICS,Warehousing and 3PL];
tab1:
LOAD RowID, CatData, SubCatData, Cat, SubCat, -(Cat Like CatData&'*' And SubCat Like SubCatData&'*') As Result
Resident DataTab;
Left Join(tab1)
LOAD RowID, Max(Result) As MR
Resident tab1
Group By RowID;
tab2:
LOAD DISTINCT RowID, CatData, SubCatData, Cat As Derived, SubCat As SubCommDataDerived
Resident tab1
Where Result=1 And MR=1;
LOAD DISTINCT RowID, CatData, SubCatData, CatData As Derived, SubCatData As SubCommDataDerived
Resident tab1
Where Result=0 And MR=0;
Drop Table DataTab, tab1;
Any Help please?
Take a look at this post to see some potential solutions.
https://qlikviewcookbook.com/2020/03/creating-temporary-script-associations/
-Rob
http://masterssummit.com
http://qlikviewcookbook.com
http://www.easyqlik.com
Hi Rob ( @rwunderlich ),
Thank you very much for your reply. I checked that earlier as well. My problem is little different, not able to wrap head around it.
My data field has smaller string/ word and lookup has two or more words, So wild match kind of looks reversed.
e.g. good scenario would be 'Qlik is the best BI platform' is data and i am looking up world like 'Qlik' in look table then it will wild match.
In my case it reverse. My data is 'Qlik' and look up wildmatch 'Qlik is the best BI platform' and assign 'Qlik is the best BI platform' in new field. It is also two level wild match
Hope I am explaining it properly.
Thank you for your help.
Dev
Here are a couple of recipes that may have some useful pieces for your requirement.
Qlikview Cookbook: Mapping With Wildcards http://qlikviewcookbook.com/recipes/download-info/mapping-with-wildcards/
Qlikview Cookbook: Indexing Keywords in Text http://qlikviewcookbook.com/recipes/download-info/indexing-keywords-in-text/
Let me know if those help. It's an interesting problem.
-Rob
Hi Rob,
Thank you very much, I will look into it and update you.
Thanks
Dev
Hi Rob,
Thank you very much for , I tried following, i am still struggling, Cat and Catdata kind of match, but the subcat and subcatdata not able to match.
LookUpData:
Load *,RecNo() as LookUpId inline
[Cat,SubCat
CASTINGS & FORGINGS,AIRFOIL CASTINGS
CASTINGS & FORGINGS,OTHER CASTINGS (DIE CASTINGS, ETC)
CASTINGS & FORGINGS,SAND CASTING
CASTINGS & FORGINGS,STRUCTURAL INVESTMENT
CASTINGS & FORGINGS,ISOTHERMAL
CASTINGS & FORGINGS,SEAMLESS RINGS
ELECTRONICS,PCBS / PCBAS / FLEXPRINTS
ELECTRONICS,PCBS / PCBAS / FLEXPRINTS
ELECTRONICS,BOX ASSEMBLIES
MACHINING,FAN BLADES & AIRFOILS
ENGINEERED ITEMS,"ADPTRS,CPLRS,VALVES"
ENGINEERED ITEMS,OTHER / UNCATEGORIZED
LOGISTICS,Warehousing and 3PL];
DataTab:
Load *,RecNo() as DataId inline
[CatData,SubCatData
BEARINGS,BALL
BEARINGS,CYLINDRICAL ROLLER
CASTINGS,AIRFOIL
CASTINGS,OTHER CASTINGS (DIE CASTINGS, ETC)
CASTINGS,SAND
CASTINGS,STRUCTURAL INVESTMENT
FORGINGS,CLOSED DIE
ENGINEERED ITEMS,VALVES
MACHINING,AIRFOILS
MACHINING,ASSEMBLIES
LOGISTICS,3PL
ELECTRONICS,PCB
ELECTRONICS,PCBA
ENGINEERED,ITEMS OTHER
TRAVEL,AGENCY
];
Left Join(DataTab) // Inner Join -- only Keywords that have a matching Review will be kept. Change to "Left Join" to keep all keywords.
LOAD
subfield( // Try to join on each individual word
PurgeChar(Cat, '",.!?&') // Remove punctuation
, ' ' // subfield - Words are delimited by space
) as CatData,
Cat as CatDerived
RESIDENT LookUpData
;
Left Join(DataTab) // Inner Join -- only Keywords that have a matching Review will be kept. Change to "Left Join" to keep all keywords.
LOAD Cat as CatDerived,
subfield( // Try to join on each individual word
PurgeChar(Upper(SubCat), '"&/ '), // Remove punctuation
, ' ' // subfield - Words are delimited by space
) as SubCatData,
SubCat as SubCatDerived
RESIDENT LookUpData
;
There might be small change i am missing , Following example is not matching as well.
CatData | SubCommData | Derived | SubCommDataDerived | MatchFlag |
ELECTRONICS | PCB | ELECTRONICS | PCBS / PCBAS / FLEXPRINTS | Y |
ELECTRONICS | PCBA | ELECTRONICS | PCBS / PCBAS / FLEXPRINTS | Y |
Thank you
Dev
One solution is .
DataTab:
Load RowNo() As RowID, * inline
[CatData,SubCatData
BEARINGS,BALL
BEARINGS,CYLINDRICAL ROLLER
CASTINGS,AIRFOIL
CASTINGS,OTHER CASTINGS (DIE CASTINGS, ETC)
CASTINGS,SAND
CASTINGS,STRUCTURAL INVESTMENT
FORGINGS,CLOSED DIE
ENGINEERED ITEMS,VALVES
MACHINING,AIRFOILS
MACHINING,ASSEMBLIES
LOGISTICS,3PL
ELECTRONICS,PCB
ELECTRONICS,PCBA
ENGINEERED,ITEMS OTHER
TRAVEL,AGENCY
];
Join
LookUpData:
Load * inline
[Cat,SubCat
CASTINGS & FORGINGS,AIRFOIL CASTINGS
CASTINGS & FORGINGS,OTHER CASTINGS (DIE CASTINGS, ETC)
CASTINGS & FORGINGS,SAND CASTING
CASTINGS & FORGINGS,STRUCTURAL INVESTMENT
CASTINGS & FORGINGS,ISOTHERMAL
CASTINGS & FORGINGS,SEAMLESS RINGS
ELECTRONICS,PCBS / PCBAS / FLEXPRINTS
ELECTRONICS,PCBS / PCBAS / FLEXPRINTS
ELECTRONICS,BOX ASSEMBLIES
MACHINING,FAN BLADES & AIRFOILS
ENGINEERED ITEMS,"ADPTRS,CPLRS,VALVES"
ENGINEERED ITEMS,OTHER / UNCATEGORIZED
LOGISTICS,Warehousing and 3PL];
tab1:
LOAD RowID, CatData, SubCatData, Cat, SubCat, -(Cat Like CatData&'*' And SubCat Like SubCatData&'*') As Result
Resident DataTab;
Left Join(tab1)
LOAD RowID, Max(Result) As MR
Resident tab1
Group By RowID;
tab2:
LOAD DISTINCT RowID, CatData, SubCatData, Cat As Derived, SubCat As SubCommDataDerived
Resident tab1
Where Result=1 And MR=1;
LOAD DISTINCT RowID, CatData, SubCatData, CatData As Derived, SubCatData As SubCommDataDerived
Resident tab1
Where Result=0 And MR=0;
Drop Table DataTab, tab1;
Hi Saran
Thank you very much for your your solution.
Row 7 in solution, Derived should be "Casting and Forging" for "Forging" and SubCat should be "Closed Die"
I am coming very close to solution. It looks promising, only concern is cross join on both table with large data will create bottle neck.
Thank you
Dev