Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
The goal is to split a column by varying delimiters. All of the items to split have different character lengths. How would you go about this in the script editor? Please see below (Current Example Column vs. Expected Output)
Current Example Column | Expected Output | ||||
Location | City | State | Country | Address | Location_Code |
San Francisco, CA, USA - One ABC Street (9501) | San Francisco | CA | USA | One ABC Street | 9501 |
Columbia, SC, USA - 120 QlikView Drive (B800) | Columbia | SC | USA | 120 QlikView Drive | B800 |
Murfreesboro, TN, USA - 1000 Medical Center Parkway, Ste. 9999 (A585) | Murfreesboro | TN | USA | 1000 Medical Center Parkway, Ste. 9999 | A585 |
Irving, TX, USA - 5000 North State Highway 1000 (P001) | Irving | TX | USA | 5000 North State Highway 1000 | P001 |
Maye be :
Data:
LOAD
SubField(Location, ',',1) as City,
SubField(Location, ',',2) as State,
SubField(SubField(Location, ',',3),'-',1) as Country,
Subfield(SubField(SubField(Location, ',',3)&SubField(Location, ',',4),'-',2) ,'(',1) as Address,
Purgechar(Subfield(SubField(SubField(Location, ',',3)&SubField(Location, ',',4),'-',2),'(',2),'()') as Location_Code,
*
INLINE [
Location
"Columbia, SC, USA - 120 QlikView Drive (B800)"
"Irving, TX, USA - 5000 North State Highway 1000 (P001)"
"Murfreesboro, TN, USA - 1000 Medical Center Parkway, Ste. 9999 (A585)"
"San Francisco, CA, USA - One ABC Street (9501)"
];
Maye be :
Data:
LOAD
SubField(Location, ',',1) as City,
SubField(Location, ',',2) as State,
SubField(SubField(Location, ',',3),'-',1) as Country,
Subfield(SubField(SubField(Location, ',',3)&SubField(Location, ',',4),'-',2) ,'(',1) as Address,
Purgechar(Subfield(SubField(SubField(Location, ',',3)&SubField(Location, ',',4),'-',2),'(',2),'()') as Location_Code,
*
INLINE [
Location
"Columbia, SC, USA - 120 QlikView Drive (B800)"
"Irving, TX, USA - 5000 North State Highway 1000 (P001)"
"Murfreesboro, TN, USA - 1000 Medical Center Parkway, Ste. 9999 (A585)"
"San Francisco, CA, USA - One ABC Street (9501)"
];
You will have to use Subfield to get what you want.
DATA:
LOAD *,
SubField(CmpId0, ')',1) AS 'CmpIdFinal'
;
LOAD *,
SubField(Street0, '(',1) AS 'StreetFinal',
SubField(Street0, '(',2) AS 'CmpId0'
;
LOAD *,
SubField(Location, ',',1) AS 'City',
SubField(Location, ',',2) AS 'State',
SubField(Location, '-',2) AS 'Street0'
;
Load * Inline
[
'Location'
'San Francisco, CA, USA - One ABC Street (9501)'
'Columbia, SC, USA - 120 QlikView Drive (B800)'
'Murfreesboro, TN, USA - 1000 Medical Center Parkway, Ste. 9999 (A585)'
'Irving, TX, USA - 5000 North State Highway 1000 (P001)'
]
;
EXIT Script
;
@Taoufiq_Zarra Thank you so much!! I have another question for you. There are some values that are not coming out correct. Your initial syntax worked great but how would you approach this scenario?
Current Example Column | Expected Output | ||||
Location | City | State | Country | Address | Location_Code |
Work at Home - Texas, USA (All Zones) (WTXA) | Work at Home | Texas | USA | All Zones | WTXA |
Work at Home - New York, USA (Zone 4) (WNY4) | Work at Home | New York | USA | Zone 4 | WNY4 |
Current Example Column | Current Output (Incorrect) | ||||
Location | City | State | Country | Address | Location_Code |
Work at Home - Texas, USA (All Zones) (WTXA) | Work at Home - Texas | USA (All Zones) (WTXA) | |||
Work at Home - New York, USA (Zone 4) (WNY4) | Work at Home - New York | USA (Zone 4) (WNY4) |
subfield() needs a minimum set of rules to work.
I suggest you identify groups that have delimiters in common, then we can apply for example various rules .
Hi,
If you need something generic, may be for Cities and State you can pull the data from the below URL.
https://en.wikipedia.org/wiki/List_of_United_States_cities_by_population
May be use the below code:
tabUS:
LOAD [2018rank],
SubField(City,'[') As City,
"State[c]",
[2018estimate],
[2010Census],
Change,
[2016 land area],
[2016 land area1],
[2016 population density],
[2016 population density1],
Location
FROM
"C:\Users\sd80250\AppData\Local\Microsoft\Windows\INetCache\IE\C9AUOZON\List_of_United_States_cities_by_population[1]"
(html, codepage is 1252, embedded labels, table is @5);
//Exit Script;
MapCity:
Mapping
LOAD City, '@start1@'&City&'@end1@' As B
Resident tabUS;
MapState:
Mapping
LOAD DISTINCT "State[c]", '@start2@'&"State[c]"&'@end2@' As B
Resident tabUS;
Data:
LOAD *
,TextBetween(MapSubString('MapCity',Location),'@start1@','@end1@') As City
,TextBetween(MapSubString('MapState',Location),'@start2@','@end2@') As State
;
Load * Inline
[
Location
'San Francisco, California, USA - One ABC Street (9501)'
'Columbia, South Carolina, USA - 120 QlikView Drive (B800)'
'Murfreesboro, Tennessee, USA - 1000 Medical Center Parkway, Ste. 9999 (A585)'
'Irving, Texas, USA - 5000 North State Highway 1000 (P001)'
]
;
DROP Table tabUS;