XML transformation trouble

I am having real trouble with an XML file and am leaning towards writing some XSLT to convert it to CSV or something similar for the import.
But before that, I just want to make sure that there really is no way to do this directly in QV 😃

Consider the following example:

<category id="15" desc="Rent_a_car_facility">
<name language="arabic">????? ??????</name>
<name language="czech">Půjčovna aut</name>
<name language="danish">Biludlejningssted</name>
<name language="german">Autoverleih</name>
<name language="english">Car rental</name>
</category>

And I have about a thousand of these.

I want to extract the following:
- category id (15 above)
- "Car rental" (i.e. categoryDescription)

All others are irrelevant and can be ignored.

I've tried various concatenations and loads on top of loads, but I think I am missing something. Any tips or tricks?

1 Solution

Accepted Solutions
Not applicable
Author

And for the curious, here is the script.


#!/usr/bin/perl -w
#
# Converts the POI category tree to CSV.
#
# The final product of this script is a CSV file with one line per category:
#   categoryId;categoryDescription
#
use strict;
use XML::Simple;

# Language to use in the description field.
# Note: special characters (Russian, Swedish, Arabic, etc.) need to go through a Unicode conversion.
my $language = "english";
# Path to the POI category tree.
my $filename = "statData/poi_category_tree.xml";

# Below this point you really should not need to edit anything 😃

my $xml  = XML::Simple->new;
my $data = $xml->XMLin($filename);

foreach my $group (keys %{ $data->{categories} }) {
    # Only lists of categories are of interest; skip anything else.
    next unless ref $data->{categories}{$group} eq 'ARRAY';

    # Loop over the categories and, for each one, print the ID and
    # the translation whose language matches $language.
    foreach my $category (@{ $data->{categories}{$group} }) {
        print $category->{id} . ";";

        # The <name> elements are parsed into an array of hashes; the
        # element text is stored under the "content" key.
        foreach my $name (@{ $category->{name} }) {
            if ($name->{language} eq lc($language)) {
                print $name->{content};
                last;
            }
        }
        # Always end the line, even when no translation matched.
        print "\n";
    }
}
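
For comparison, here is a rough Python sketch of the same transformation using only the standard library. It assumes the <category> elements from the question can sit at any depth of the tree. The <categories> wrapper in SAMPLE and the function name category_rows are made up for illustration and are not taken from the real file. Unlike the Perl script, it skips categories that have no translation in the chosen language.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the <categories> wrapper element is an assumption.
SAMPLE = """<categories>
  <category id="15" desc="Rent_a_car_facility">
    <name language="danish">Biludlejningssted</name>
    <name language="german">Autoverleih</name>
    <name language="english">Car rental</name>
  </category>
</categories>"""


def category_rows(xml_text, language="english"):
    """Return 'categoryId;categoryDescription' lines for one language."""
    root = ET.fromstring(xml_text)
    rows = []
    # iter() finds <category> at any depth, so the exact nesting does not matter.
    for category in root.iter("category"):
        for name in category.findall("name"):
            if name.get("language", "").lower() == language.lower():
                rows.append(f"{category.get('id')};{name.text or ''}")
                break  # one description per category is enough
    return rows


if __name__ == "__main__":
    print("\n".join(category_rows(SAMPLE)))
```

Calling category_rows(SAMPLE, "german") picks the German name instead, which mirrors what changing $language does in the Perl version.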



4 Replies
Not applicable
Author




Hi,

Please have a look at the attached example.

Best regards
Stefan

Not applicable
Author

Well.
That was far from the solution. I want the text from <name></name>, and more specifically only the name element where language is English. However, the restriction can be left out as long as I get the connection from id to the text in <name>. (I've started writing a Perl script to transform the XML to a CSV for now; I have to be done tomorrow.)
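
To make the "connection from id to the text in <name>" concrete without the language restriction, here is a minimal Python sketch that flattens every translation into one row per (id, language, text). It is only an illustration under assumptions: the <categories> wrapper, the function names name_table and to_csv, and the column headers are hypothetical, and the real file may be nested differently.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample; the <categories> wrapper element is an assumption.
SAMPLE = """<categories>
  <category id="15" desc="Rent_a_car_facility">
    <name language="danish">Biludlejningssted</name>
    <name language="english">Car rental</name>
  </category>
</categories>"""


def name_table(xml_text):
    """Return one (category id, language, name text) tuple per <name> element."""
    root = ET.fromstring(xml_text)
    return [
        (category.get("id"), name.get("language"), name.text or "")
        for category in root.iter("category")
        for name in category.findall("name")
    ]


def to_csv(rows):
    """Serialize the rows as semicolon-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=";", lineterminator="\n")
    writer.writerow(["categoryId", "language", "categoryDescription"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(name_table(SAMPLE)), end="")
```

With all translations in one flat file, the English-only filter can then be applied at load time instead of in the conversion step.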

Not applicable
Author

And here is the Perl solution for anyone who wants to see it.


#!/usr/bin/perl -w
#
# Converts the POI category tree to CSV.
#
use strict;
use XML::Simple;

# Language to use in the description field.
# Note: special characters (Russian, Swedish, Arabic, etc.) need to go through a Unicode conversion.
my $language = "english";
# Path to the POI category tree.
my $filename = "statData/poi_category_tree.xml";

# Below this point you really should not need to edit anything 😃

my $xml  = XML::Simple->new;
my $data = $xml->XMLin($filename);

foreach my $group (keys %{ $data->{categories} }) {
    # Only lists of categories are of interest; skip anything else.
    next unless ref $data->{categories}{$group} eq 'ARRAY';

    # Loop over the categories and, for each one, print the ID and
    # the translation whose language matches $language.
    foreach my $category (@{ $data->{categories}{$group} }) {
        print $category->{id} . ";";

        # The <name> elements are parsed into an array of hashes; the
        # element text is stored under the "content" key.
        foreach my $name (@{ $category->{name} }) {
            if ($name->{language} eq lc($language)) {
                print $name->{content};
                last;
            }
        }
        # Always end the line, even when no translation matched.
        print "\n";
    }
}

