
XML transformation trouble

I'm having real trouble with an XML file and am leaning towards writing some XSLT to convert it to CSV or something similar for the import.
But before that I just want to make sure that there really is no way back with QV 😃

Consider the following example:

<category id="15" desc="Rent_a_car_facility">
<name language="arabic">????? ??????</name>
<name language="czech">Půjčovna aut</name>
<name language="danish">Biludlejningssted</name>
<name language="german">Autoverleih</name>
<name language="english">Car rental</name>
</category>

And I have about a thousand of these.

I want to extract the following:
- category id (15 above)
- "Car rental" (i.e. categoryDescription)

All others are irrelevant and can be ignored.

I've tried some different concatenations and loads on loads and stuff but I think I am missing some things. Any tips or tricks?

1 Solution

Accepted Solutions
Not applicable
Author

And for the curious one.


#!/usr/bin/perl
#
# Converts the POI category tree to CSV.
#
# The final product of this script is a CSV file in the following format,
# one line per category:
#   categoryId;categoryDescription
#
use strict;
use warnings;
use XML::Simple;

# Language to use in the description field.
# Note that special characters (Russian, Swedish, Arabic and so on) have to
# go through the Unicode converter.
my $language = "english";
# Path to the POI category tree.
my $filename = "statData/poi_category_tree.xml";

# Below this point you really should not be editing anything 😃

my $xml  = XML::Simple->new;
my $data = $xml->XMLin($filename);

foreach my $group (keys %{ $data->{categories} }) {
    my $list = $data->{categories}{$group};
    # Only lists of elements (array references) hold categories.
    next unless ref($list) eq 'ARRAY';

    foreach my $category (@$list) {
        my $names = $category->{name};
        next unless ref($names) eq 'ARRAY';

        # Find the translation whose language attribute matches $language.
        # XML::Simple stores the element text under the key "content".
        foreach my $name (@$names) {
            next unless $name->{language} eq lc($language);
            # Print the ID and the description together, so a category
            # without a matching translation leaves no partial line.
            print $category->{id} . ";" . $name->{content} . "\n";
        }
    }
}



4 Replies
Not applicable
Author


In reply to Martin Bagge's question above:


Hi,

please have a look at the attached example.

Best regards
Stefan

Not applicable
Author

Well.
That was far from the solution. I want the text from <name></name>, and more specifically I only want the name element whose language is English. However, that restriction can be left out as long as I can get the connection from the id to the text in <name>. (I've started writing a Perl script to transform the XML to a CSV for now; I have to be done tomorrow.)
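
For anyone who wants to see the id-to-name mapping described here spelled out, below is a minimal sketch in Python using only the standard library. It is not the Perl script mentioned above, just an illustration of the same transformation. The wrapping <categories> root element, the sample data, and the function names category_rows and to_csv are assumptions made for the example; the real file layout is not shown in this thread.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample; the <categories> wrapper is an assumption.
SAMPLE = """<categories>
  <category id="15" desc="Rent_a_car_facility">
    <name language="danish">Biludlejningssted</name>
    <name language="german">Autoverleih</name>
    <name language="english">Car rental</name>
  </category>
</categories>"""


def category_rows(xml_text, language="english"):
    """Yield (id, name) for each <category> that has a <name> in `language`."""
    root = ET.fromstring(xml_text)
    for category in root.iter("category"):
        for name in category.findall("name"):
            if name.get("language") == language:
                yield category.get("id"), name.text
                break  # one description per category is enough


def to_csv(xml_text, language="english"):
    """Return the rows as semicolon-separated text, one category per line."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter=";", lineterminator="\n")
    writer.writerows(category_rows(xml_text, language))
    return out.getvalue()


print(to_csv(SAMPLE), end="")
```

Categories without a <name> in the requested language are skipped entirely, so every output line has both fields and can be loaded as a plain two-column file.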

Not applicable
Author

And here is the Perl solution for anyone who wants to know.


#!/usr/bin/perl
#
# Converts the POI category tree to CSV.
#
use strict;
use warnings;
use XML::Simple;

# Language to use in the description field.
# Note that special characters (Russian, Swedish, Arabic and so on) have to
# go through the Unicode converter.
my $language = "english";
# Path to the POI category tree.
my $filename = "statData/poi_category_tree.xml";

# Below this point you really should not be editing anything 😃

my $xml  = XML::Simple->new;
my $data = $xml->XMLin($filename);

foreach my $group (keys %{ $data->{categories} }) {
    my $list = $data->{categories}{$group};
    # Only lists of elements (array references) hold categories.
    next unless ref($list) eq 'ARRAY';

    foreach my $category (@$list) {
        my $names = $category->{name};
        next unless ref($names) eq 'ARRAY';

        # Find the translation whose language attribute matches $language.
        # XML::Simple stores the element text under the key "content".
        foreach my $name (@$names) {
            next unless $name->{language} eq lc($language);
            # Print the ID and the description together, so a category
            # without a matching translation leaves no partial line.
            print $category->{id} . ";" . $name->{content} . "\n";
        }
    }
}

