Skip to main content
Announcements
Global Transformation Awards! Applications are now open. Submit Entry
cancel
Showing results for 
Search instead for 
Did you mean: 
SDT
Creator
Creator

Join table on partial product code?

Hello,

I have one table with product codes that are missing parts of the code.

The product codes are always 42 characters.

The missing parts are identified by '?' characters (can easily be changed to another character).

The product codes are all numeric.

I have a second table of complete product codes (no ? s)

How can I join the two in a one to many relationship where a product code like 111???222333 would join up to products 111555222333, 111777222333, 111000222333, etc?

 

Thank you in advance for any assistance. I assume I need to use wildmatch somehow. 

I am also OK with creating a master dimension if that works better.

 

 

 

Labels (1)
1 Solution

Accepted Solutions
SDT
Creator
Creator
Author

After some banging about, this seems to have worked. I had a problem with my original PEEK statement.

 

[ProductBridgeTemp]:
LOAD DISTINCT
EUProductCode
RESIDENT EUParts;
 
OUTER JOIN (ProductBridgeTemp)
LOAD DISTINCT
    ProductCodeNK
RESIDENT USProducts;
 
[ProductBridgeTemp1]:
NOCONCATENATE LOAD DISTINCT
ProductCodeNK,
    EUProductCode,
    WILDMATCH(ProductCodeNK,PEEK(EUProductCode,RowNo(),'ProductBridgeTemp')) as MatchingCodes
RESIDENT ProductBridgeTemp;
 
[ProductBridge]:
NOCONCATENATE LOAD
ProductCodeNK,
    EUProductCode
RESIDENT ProductBridgeTemp1
WHERE MatchingCodes=1;
 
DROP TABLE ProductBridgeTemp, ProductBridgeTemp1;

View solution in original post

7 Replies
Miguel_Angel_Baeyens

IF every code is always 42 characters and the rest of the characters remain the same, you could use Left() Right() and Mid() functions to break down that longer key into smaller keys.

In your example, "111", "222" and "333" are the same, so you could create a key with the value "111222333" and use that value to link between the tables, for example by using

...

Left('111???222333', 3) & Right('111???222333', 6) AS JoinKey,

...

The idea is that both fields can get to a value that is unique in both tables that will allow for the JOIN.

vincent_ardiet_
Specialist
Specialist

You can maybe use intervalmatch.
In your first table, create 2 new fields, MinProductCode and MaxProductCode, for the first one you replace all ? with 0, in the second one, you replace all ? with 9.
You will end up with something like this:

ProductCode MinProductCode MaxProductCode otherFields...
111???222333 111000222333 111999222333 ...
111123??2333 111123002333 111123992333 ...
...      

 

And then, you can use intervalmatch to join this table with the reference table of products.

SDT
Creator
Creator
Author

Thank you Vincent. Interesting idea. I think it might give me too many results though. There can be multiple groups of ??? in the product code and if one occurred early in the string it would give a huge set of results between the two 000 - 999.

 

Maybe I need to do this in SQL with a 'like' in the join or a subquery?

vincent_ardiet_
Specialist
Specialist

Ah ok, I see, you want only the same number. For example '???' should replace 000, 111, 222... but not 123, right?
Could you have a single '?', or multiple? Or always 3?
Could you have '?' at different places (like 12?34??6778) or always grouped together?

 

SDT
Creator
Creator
Author

Hello Vincent,

Always in groups of three. Could be any three numbers replacing the three ???.

I tried this method that did not work. I thought I could then run through the table and eliminate items with no wildmatch. The field MatchingCodes was 0 for all records.

[ProductBridgeTemp]:
LOAD DISTINCT
     EUProductCode   //This code has the ??? parts in it
RESIDENT EUParts;
 
OUTER JOIN (ProductBridgeTemp)
LOAD DISTINCT
    ProductCodeNK //Complete codes...no ??? parts in them
RESIDENT USProducts;
 
[ProductBridge]:
NOCONCATENATE LOAD DISTINCT
    ProductCodeNK,
    EUProductCode,
    WILDMATCH(ProductCodeNK,PEEK(EUProductCode,0)) as MatchingCodes
RESIDENT ProductBridgeTemp;
 
DROP TABLE ProductBridgeTemp;

 

vincent_ardiet_
Specialist
Specialist

I'm still thinking that intervalmatch can... match.

Of course you have thousand possible values between 000 and 999.
However, do you really have thousand products with a code between 111000222333 and 111999222333?
And, if this is the case, how could you decide which ones are matching with 111???222333, if I understand well, all of them should match.
And if you have only 10 products in this interval, only those 10 will be linked to your first table.
You can create a link table in between, your first table will still contents 111???222333, the product reference table 111456222333 and the link table 111???222333 and 111456222333.

SDT
Creator
Creator
Author

After some banging about, this seems to have worked. I had a problem with my original PEEK statement.

 

[ProductBridgeTemp]:
LOAD DISTINCT
EUProductCode
RESIDENT EUParts;
 
OUTER JOIN (ProductBridgeTemp)
LOAD DISTINCT
    ProductCodeNK
RESIDENT USProducts;
 
[ProductBridgeTemp1]:
NOCONCATENATE LOAD DISTINCT
ProductCodeNK,
    EUProductCode,
    WILDMATCH(ProductCodeNK,PEEK(EUProductCode,RowNo(),'ProductBridgeTemp')) as MatchingCodes
RESIDENT ProductBridgeTemp;
 
[ProductBridge]:
NOCONCATENATE LOAD
ProductCodeNK,
    EUProductCode
RESIDENT ProductBridgeTemp1
WHERE MatchingCodes=1;
 
DROP TABLE ProductBridgeTemp, ProductBridgeTemp1;