Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Learn how to migrate to Qlik Cloud Analytics™: On-Demand Briefing!
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Combining data from delimited files to produce xml doc for posting to solr

Hi All,

New to Talend

I am attempting to combine 2 delimited files to produce a document that can be submitted to solr
My inputs are approx. as follows (simplified, but hopefully communicates the requirement)
I would eventually expect this requirement to increase to cover more than just x2 delimited files

prod.csv
prod_id  name

sku.csv
sku_id  prod_id  name

For each row in prod.csv, I would like to generate a document structure that can be posted to solr in the following format

<doc>
  <field name="prod_uid">prod.csv_prod_id</field>
  <field name="prod_skus">[sku.csv.sku_id, sku.csv.sku_id,sku.csv.sku_id,]</field>
</doc>

 

I have spent time trying to work with tMAP and tXMLMap (and various other components) and whilst I have got somewhere, I typically end up posting (viatRESTClient) individual documents for every product+sku combination, rather than grouping the sku_ids in a single element for a product
There are lots of examples in the forum, but nothing I have been table to take to solve this (what I am assuming is) simple problem


All help welcomed, thanks

Labels (4)
1 Reply
nfz11
Creator III
Creator III

The key component you need here is tAggregateRow.  If you group by prod_id and aggregate the sku_ids into a list it should be straightforward:

Here is my test input and output:

Starting job TestXmlArray at 00:17 07/07/2019.

[statistics] connecting to socket on port 3925
[statistics] connected
.--------------------.
|   #1. tLogRow_1    |
+-----------+--------+
| key       | value  |
+-----------+--------+
| prod_id   | 1      |
| prod_name | Orange |
+-----------+--------+

.-------------------.
|   #2. tLogRow_1   |
+-----------+-------+
| key       | value |
+-----------+-------+
| prod_id   | 2     |
| prod_name | Apple |
+-----------+-------+

.-------------------.
|   #3. tLogRow_1   |
+-----------+-------+
| key       | value |
+-----------+-------+
| prod_id   | 3     |
| prod_name | Pear  |
+-----------+-------+

.-----------------.
|  #1. tLogRow_3  |
+---------+-------+
| key     | value |
+---------+-------+
| sku_id  | 10    |
| prod_id | 1     |
+---------+-------+

.-----------------.
|  #2. tLogRow_3  |
+---------+-------+
| key     | value |
+---------+-------+
| sku_id  | 21    |
| prod_id | 2     |
+---------+-------+

.-----------------.
|  #3. tLogRow_3  |
+---------+-------+
| key     | value |
+---------+-------+
| sku_id  | 22    |
| prod_id | 2     |
+---------+-------+

.-----------------.
|  #4. tLogRow_3  |
+---------+-------+
| key     | value |
+---------+-------+
| sku_id  | 31    |
| prod_id | 3     |
+---------+-------+

.-----------------.
|  #5. tLogRow_3  |
+---------+-------+
| key     | value |
+---------+-------+
| sku_id  | 32    |
| prod_id | 3     |
+---------+-------+

.-----------------.
|  #6. tLogRow_3  |
+---------+-------+
| key     | value |
+---------+-------+
| sku_id  | 33    |
| prod_id | 3     |
+---------+-------+

.--------------------------------------------------------------------------------------------------------------------------------------------------------.
|                                                                     #1. tLogRow_2                                                                      |
+-----+--------------------------------------------------------------------------------------------------------------------------------------------------+
| key | value                                                                                                                                            |
+-----+--------------------------------------------------------------------------------------------------------------------------------------------------+
| xml | <?xml version="1.0" encoding="ISO-8859-15"?>

<doc>
  <prod_id>1</prod_id>
  <prod_name>Orange</prod_name>
  <prod_skus>[10]</prod_skus>
</doc>
 |
+-----+--------------------------------------------------------------------------------------------------------------------------------------------------+

.-----------------------------------------------------------------------------------------------------------------------------------------------------------.
|                                                                       #2. tLogRow_2                                                                       |
+-----+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| key | value                                                                                                                                               |
+-----+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| xml | <?xml version="1.0" encoding="ISO-8859-15"?>

<doc>
  <prod_id>2</prod_id>
  <prod_name>Apple</prod_name>
  <prod_skus>[21, 22]</prod_skus>
</doc>
 |
+-----+-----------------------------------------------------------------------------------------------------------------------------------------------------+

.--------------------------------------------------------------------------------------------------------------------------------------------------------------.
|                                                                        #3. tLogRow_2                                                                         |
+-----+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| key | value                                                                                                                                                  |
+-----+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| xml | <?xml version="1.0" encoding="ISO-8859-15"?>

<doc>
  <prod_id>3</prod_id>
  <prod_name>Pear</prod_name>
  <prod_skus>[31, 32, 33]</prod_skus>
</doc>
 |
+-----+--------------------------------------------------------------------------------------------------------------------------------------------------------+

[statistics] disconnected

Job TestXmlArray ended at 00:17 07/07/2019. [exit code=0]

See attached screenshot of the test job with the details of the tAggregateRow.  prod_sku_ids is of type List in the schema.

0683p000009M6C6.png

In the tMap you do an inner join on the prod id:
0683p000009M6CB.png