Skip to main content
Announcements
Introducing a new Enhanced File Management feature in Qlik Cloud! GET THE DETAILS!
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Turn XML with dynamic N children elements into N dynamic columns

I have XML inbound that looks like the below, note there are some standard elements, but then N dynamic "property" children for N unknown properties. I'd like to get a new row for each species with its 3 standard properties + all dynamic properties appended after it (comma separated). See below.
<root>
<species>
<type>BEAR</type>
<identifier>2203A</identifier>
<label>Animal</label>
<properties>
  <property>
 <name>description</name>
 <value>desc1</name>
  </property>
  <property>
 <name>region</name>
 <value>abc001</value>
  </property>
  </properties>
</species>
<species>
<type>DOG</type>
<identifier>8777</identifier>
<label>gear</label>
<properties>
  <property>
 <name>description</name>
 <value>apple</name>
  </property>
  <property>
 <name>region</name>
 <value>cccc</value>
  </property>
  <property>
 <name>prop9</name>
 <value>dddd</value>
  </property>
  <property>
 <name>prop10</name>
 <value>dlldld</value>
  </property>
  </properties>
</species>
</root>


I want rows like this, (i.e. note the dynamic N properties are appended to end)
bear,2203a,animal,desc1,abc001
dog,8777,gear,apple,cccc,dddd,dlldld


How can I do this in Talend when the <property> elements are unknown and can vary?
Labels (3)
4 Replies
Anonymous
Not applicable
Author

Selecting Unknown Nodes
XPath wildcards can be used to select unknown XML nodes.
Wildcard
Description
*Matches any element node
@*Matches any attribute node
node()Matches any node of any kind

SOURCE: www.w3schools.com/xsl/xpath_syntax.asp
Anonymous
Not applicable
Author

Aware of that, but how does essentially iterate over all subnodes of properties and create additional column values for each one encountered, like what component should I be using in studio. tExtractXmlField or tXmlMap don't seem to be able to do this
This properties block could contain 0-N properties
Anonymous
Not applicable
Author

Is this waht you want :
Starting job textxml at 15:38 25/04/2016.
connecting to socket on port 3952
connected
.------+-----.
| tLogRow_2 |
|=-----+----=|
|key |id |
|=-----+----=|
|desc1 |2203A|
|abc001|2203A|
|apple |8777 |
|cccc |8777 |
|dddd |8777 |
|dlldld|8777 |
'------+-----'
.----------------------+-----.
| tLogRow_3 |
|=---------------------+----=|
|key |id |
|=---------------------+----=|
|apple,cccc,dddd,dlldld|8777 |
|desc1,abc001 |2203A|
'----------------------+-----'
.----+-----+------.
| tLogRow_1 |
|=---+-----+-----=|
|type|id |label |
|=---+-----+-----=|
|BEAR|2203A|Animal|
|DOG |8777 |gear |
'----+-----+------'
.----+-----+------+----------------------.
| tLogRow_4 |
|=---+-----+------+---------------------=|
|type|id |label |key |
|=---+-----+------+---------------------=|
|BEAR|2203A|Animal|desc1,abc001 |
|DOG |8777 |gear |apple,cccc,dddd,dlldld|
'----+-----+------+----------------------'
disconnected
Job textxml ended at 15:38 25/04/2016.
Anonymous
Not applicable
Author

Here the script, i deleted the tlogrow components to save space
SCRIPT_VERSION=4.2,
DEFAULT_CONTEXT: Default,
ContextType {
 NAME: Default
}
addParameters {
 addElementParameters {
  JOB_RUN_VM_ARGUMENTS : " -Xms256M -Xmx1024M",
  JOB_RUN_VM_ARGUMENTS_OPTION : "false",
  SCREEN_OFFSET_X : "0",
  SCREEN_OFFSET_Y : "0",
  SCHEMA_OPTIONS : "none",,,
  IMPLICITCONTEXT_USE_PROJECT_SETTINGS : "true",
  STATANDLOG_USE_PROJECT_SETTINGS : "true",
  MULTI_THREAD_EXECATION : "false",
  IMPLICIT_TCONTEXTLOAD_FILE : "\"\"",
  FIELDSEPARATOR : "\"\"",,,
  DISABLE_WARNINGS : "false",
  DISABLE_INFO : "false",
  FILE_PATH : "\"E:/X163842/Workspace/.metadata\"",
  ENCODING:ENCODING_TYPE : "ISO-8859-15",

To see the whole post, download it here
OriginalPost.pdf