Hello everyone,
I am trying to find a way to identify a country code within a string. The string is of variable length and may or may not contain a country code.
The goal is to check whether a country code exists in the text, and then compare it with a separate country code field.
For example:
Project Name                    | Text Country | Impacted Country | Outcome |
Project A: Simon says jump - NL | NL           | NL               | TRUE    |
Project B: Simon says sing - UK | UK           | NL               | FALSE   |
Some of the problems that I have:
Any ideas anyone?
Maurice
Does the country code always come at the end? Also, is the length always 2?
If so, why not take the last 2 characters of the string and compare using those?
Unfortunately, that is an additional problem: the country code is not always at the end.
For example:
'Project C: Simon says dance in NL (if you want)'
The easiest approach is a JScript macro, where you can use a regular expression (VB doesn't know regular expressions).
In the macro editor it looks like this:
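// Compiled pattern, shared between the two macro functions.
var regExp;
// Build the pattern from a "|"-separated list of codes, e.g. "NL|RO|US|EN".
function InitRegExp(codes){
    regExp=new RegExp("(\\s|-)("+codes+")(\\s|-|\$)");
    return 0;
}
// Return the first country code found in text, or "" if there is none.
function GetCountryCode(text){
    var res=regExp.exec(text);
    return res ? res[2] : "";
}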
And the load script, for example:
Countries:
LOAD * INLINE [
Code
NL
RO
US
EN
];
t:
LOAD
InitRegExp(Concat(DISTINCT Code,'|')) as InitRegExp
Resident Countries;
drop table t;
Projects:
LOAD *,
GetCountryCode(Project) as Country
INLINE [
Project
Project A: Simon says jump - NL
Project A: Simon says jump -RO
Project A: Simon says jump -Ro NL
Project A: Simon says jump US
US Project A: Simon says jump -EN-
Project A: Simon NL says jump
Project A: SimonNL says jump
Project A: Simon NLsays jump
Project A: Simon -US- says jump
Project A: Simon -EN says jump
];
In the example the regular expression is built from "(\\s|-)(NL|US|EN|RO)(\\s|-|\$)".
This means it consists of three groups:
1. "(\\s|-)" is the start of the match; it must be a whitespace character or "-" (so a code at the very beginning of the searched text is not found)
2. followed by "(NL|US|EN|RO)", one of the country codes
3. followed by "(\\s|-|\$)", the end of the match; it can be a whitespace character, a "-", or the end of the text.
Because of the three groups, the result of exec() is an array with 4 elements. The first is the whole matched string; the others are the results for each group. So the function returns only the result for the second group. If nothing matches, exec() returns null and the function returns an empty string.
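To make the shape of that array concrete, here is a small sketch that runs the same pattern outside QlikView, for instance in any JavaScript console. The variable names res, code and noMatch are only for illustration and are not part of the macro above.

```javascript
// Same pattern that InitRegExp() builds for the codes NL, US, EN and RO.
var regExp = new RegExp("(\\s|-)(NL|US|EN|RO)(\\s|-|$)");

var res = regExp.exec("Project A: Simon says jump - NL");
// res[0] is the whole match, res[1] the leading delimiter,
// res[2] the country code, res[3] the trailing delimiter
// (an empty string here, because the code sits at the end of the text).
var code = res ? res[2] : "";

// "NL" is glued to "Simon", so no delimiter precedes it and exec() gives null.
var noMatch = regExp.exec("Project A: SimonNL says jump");
```

Only res[2] is of interest for the comparison with the country field; the delimiters in res[1] and res[3] are there to stop the codes from matching inside longer words.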
The regular expression is case sensitive! It only searches for country codes in capital letters.
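If mixed-case codes such as "-Ro" should be found as well, one possible variant (a sketch, not part of the macro above) is to pass the "i" flag to the RegExp constructor and upper-case the result. The function name getCountryCodeAnyCase and its codes parameter are hypothetical.

```javascript
// Hypothetical case-insensitive variant of GetCountryCode().
// codes is a "|"-separated list such as "NL|RO|US|EN".
function getCountryCodeAnyCase(text, codes) {
    // The "i" flag makes the match ignore upper/lower case.
    var re = new RegExp("(\\s|-)(" + codes + ")(\\s|-|$)", "i");
    var res = re.exec(text);
    // Normalise to capitals so the result can be compared with the code field.
    return res ? res[2].toUpperCase() : "";
}

var mixed = getCountryCodeAnyCase("Project A: Simon says jump -Ro NL", "NL|RO|US|EN");
var word = getCountryCodeAnyCase("Simon says tell us more", "NL|RO|US|EN");
```

Here mixed is "RO", because "-Ro" is the first match in the text. The price is that ordinary lower-case words can now be mistaken for codes: word comes back as "US", although the text only contains the pronoun "us". Whether that trade-off is acceptable depends on the project names.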