Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us in Bucharest on Sept 18th for Qlik's AI Reality Tour! Register Now
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

tExtractDelimitedFields on variable location field

I have a location column that I'm processing through tExtractDelimitedFields to break out into city, region, country

However, the input isn't consistent, sometimes just containing a country, sometimes a region and country, sometimes a city, region, and country:

 

 

|=--------------------------------=|
|location                          |
|=--------------------------------=| |Bangkok, Thailand | |Sydney, New South Wales, Australia| |Atlanta, Georgia, United States | |Zurich, Switzerland | |Peru | |=--------------------------------=|

Which is obviously and correctly split by tExtractDelimitedFields to:

|=----------------+-----------------------------+--------------=|
|city             |region                       |country        |
|=----------------+-----------------------------+--------------=|
|Bangkok          | Thailand                    |               |
|Sydney           | New South Wales             | Australia     |
|Atlanta          | Georgia                     | United States |
|Zurich           | Switzerland                 |               |
|Peru             |                             |               |
|=----------------|-----------------------------|--------------=|

Now, I'd like to correct the locations of the different elements of the location field. My thought was to extract the fields into variables, and shift them to the right until the country field is no longer blank. But it occurred to me that

  1. This might be a solved problem that I'm unable to find by simple looking for "location" or "tExtractDelimitedFields" in the community
  2. I'm also looking for some pointers on how to do the shifting if that's the solution
  3. While I'm at it, I'm seeing the whitespace after the split in the output. I can easily use tMap to apply a trim. I used tMap immediately before the tExtractDelimitedFields to drop two columns. Is it better to carry the two rows and use tMap just once to drop and trim at the same time, or to use it twice, to drop unneeded columns early, then trim later?

 

 

 

Labels (2)
1 Solution

Accepted Solutions
TRF
Champion II
Champion II

Here is the solution I propose as a more elegant way (let you decide this point!)

Use a tFileInputFullRow to read the file as it is (schema is one field called "line").
Use tMap to left pad the line field with the missing "," if any:
0683p000009Ltuw.png
Here is the magic, get a substring from ",," starting a position depending of the number of "," in the line, then add the line to this substring which contains 0, 1 or 2 "," (for ex: ",Bangkok, Thailand" or ",,Peru").
Now you can use tExtractDelimitedFields as you want (just have to change the Field separator to "," and to define the desired output schema):
0683p000009LuCB.png

Check the "Trim Column" option on Advanced setting tabs.

And the result as expected with just 1 line of Java code:
0683p000009LuTJ.png
Hope this helps.

View solution in original post

4 Replies
Anonymous
Not applicable
Author

I assume I brute-forced it with tJavaRow:

 

if ("" == input_row.country && "" == input_row.region) {
	output_row.city = "";
	output_row.region = "";
	output_row.country = input_row.city;
} else if ("" == input_row.country) {
	output_row.city = "";
	output_row.region = input_row.city;
	output_row.country = input_row.region;
} else {
	output_row.city = input_row.city;
	output_row.region = input_row.region;
	output_row.country = input_row.country;
}

If there's a more elegant way to do it, please let me know!

TRF
Champion II
Champion II

Here is the solution I propose as a more elegant way (let you decide this point!)

Use a tFileInputFullRow to read the file as it is (schema is one field called "line").
Use tMap to left pad the line field with the missing "," if any:
0683p000009Ltuw.png
Here is the magic, get a substring from ",," starting a position depending of the number of "," in the line, then add the line to this substring which contains 0, 1 or 2 "," (for ex: ",Bangkok, Thailand" or ",,Peru").
Now you can use tExtractDelimitedFields as you want (just have to change the Field separator to "," and to define the desired output schema):
0683p000009LuCB.png

Check the "Trim Column" option on Advanced setting tabs.

And the result as expected with just 1 line of Java code:
0683p000009LuTJ.png
Hope this helps.

TRF
Champion II
Champion II

@jwhite_tc, did my proposition help you to solve your case?

Anonymous
Not applicable
Author

It did indeed! Basically pad the front of the location field with commas to force it to three total. And the trim in the advanced settings. Lovely!