MAnywar
Contributor III

Help with split-mapping xml file to template schema to produce multiple output xml files from a single file.

I have several XML files of a particular structure that I need to map to a template.

Sample file.

<root>
  <header>
    <generated>2013-11-29 00:00:00</generated>
    <somestuff/>
  </header>
  <catalog>
    <article>
      <id>111111</id> <!-- split / aggregate by this -->
      <article-details>
        <description-short>nice article</description-short>
        <description-long>really nice article</description-long>
        <keyword>keyword A</keyword>
        <keyword>keyword B</keyword>
        <keyword>keyword C</keyword>
        <keyword>keyword D</keyword>
        <keyword>keyword E</keyword>
        <keyword>keyword F</keyword>
        <keyword>keyword G</keyword>
      </article-details>
    </article>
  </catalog>
</root>

I need to map each keyword, together with some of the header details, to its own output file. The detail fields stay the same across files, but their values do not. For example:

<root>
  <header>
    <generated>2013-11-29 00:00:00</generated>
    <somestuff/>
  </header>
  <catalog>
    <article>
      <id>date</id>
      <article-details>
        <description-short>nice article</description-short>
        <description-long>really nice article</description-long>
        <keyword>keyword A</keyword>
      </article-details>
    </article>
  </catalog>
</root>

However, the files have a varying number of keywords, with different values.

What would be the best way to approach this? I could do:

tFiles_Input ----> tFiles_Extract ----> (not sure what to have here) ----> tMap_xml (I don't know whether the splitting could be done from here either) ----> tFilesOutputXML
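To make the target concrete, here is a rough plain-Java sketch of the split itself, outside Talend. It is only an illustration, not a Talend job: the class name `KeywordSplitter` and the method `splitByKeyword` are made up for this example, and it assumes the keywords are nested inside a well-formed document. For every `<keyword>` it copies the whole document and removes all the other keywords, so the header and article details are repeated in each output.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of Talend.
final class KeywordSplitter {

    /** Returns one XML string per <keyword>; everything else is copied unchanged. */
    static List<String> splitByKeyword(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document source = builder.parse(new InputSource(new StringReader(xml)));
            int keywordCount = source.getElementsByTagName("keyword").getLength();

            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            List<String> outputs = new ArrayList<>();
            for (int keep = 0; keep < keywordCount; keep++) {
                // Deep-copy the whole tree into a fresh document.
                Document copy = builder.newDocument();
                copy.appendChild(copy.importNode(source.getDocumentElement(), true));

                // getElementsByTagName returns a live list, so snapshot it before removing nodes.
                NodeList live = copy.getElementsByTagName("keyword");
                List<Node> keywords = new ArrayList<>();
                for (int i = 0; i < live.getLength(); i++) {
                    keywords.add(live.item(i));
                }
                // Drop every keyword except the one this output is for.
                for (int i = 0; i < keywords.size(); i++) {
                    if (i != keep) {
                        Node keyword = keywords.get(i);
                        keyword.getParentNode().removeChild(keyword);
                    }
                }

                StringWriter out = new StringWriter();
                serializer.transform(new DOMSource(copy), new StreamResult(out));
                outputs.add(out.toString());
            }
            return outputs;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not split XML by keyword", e);
        }
    }

    public static void main(String[] args) {
        String sample = "<root><header><generated>2013-11-29 00:00:00</generated></header>"
                + "<catalog><article><id>111111</id><article-details>"
                + "<description-short>nice article</description-short>"
                + "<keyword>keyword A</keyword><keyword>keyword B</keyword>"
                + "</article-details></article></catalog></root>";
        for (String doc : splitByKeyword(sample)) {
            System.out.println(doc);
        }
    }
}
```

A file with N keywords gives N outputs, which is the behaviour I am after whatever the number of keywords in a given file.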

I would be grateful for any help.

Thank you.

12 Replies
MAnywar
Contributor III
Author

Hi @Shicong Hong​ 

Thanks for the help. Here is the resulting output when comparing both approaches.

First, the latter:

With this I am getting more than 20 tables mapped for 2 files, where one file should produce 4 individual mapped files and the other just 2.

[two screenshots attached]

With the first process I had designed, however, the 2nd file produces exactly the two expected output files, but the first file in the folder, which has 4 expected outputs, produces 8 outputs, which is wrong. The expected output is exactly 4. Could this be an issue with the joins? Thank you.

Michael.

[two screenshots attached]

Anonymous
Not applicable

Hi

You are iterating over multiple files, so make sure the 'Clear cache after reading' box is checked on the tHashInput component, so that the data for the current file is cleared once it has been read.
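As a plain-Java analogy only (this is not how Talend implements the hash components, and the class `SharedCacheDemo` is made up for illustration), the sketch below shows why a buffer shared across file iterations inflates the output count when it is not cleared after each read. It does not try to reproduce the exact counts reported above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical analogy for a cache shared across file iterations; not Talend code.
final class SharedCacheDemo {

    /** Returns, per file, how many rows are read back from the shared cache. */
    static List<Integer> outputsPerFile(List<List<String>> files, boolean clearAfterRead) {
        List<String> cache = new ArrayList<>();   // stands in for the shared hash cache
        List<Integer> counts = new ArrayList<>();
        for (List<String> rows : files) {
            cache.addAll(rows);                   // write the current file's rows
            counts.add(cache.size());             // read back: one output per cached row
            if (clearAfterRead) {
                cache.clear();                    // the equivalent of clearing the cache after reading
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> files = Arrays.asList(
                Arrays.asList("r1", "r2"),
                Arrays.asList("r1", "r2", "r3", "r4"));
        System.out.println(outputsPerFile(files, false)); // [2, 6]
        System.out.println(outputsPerFile(files, true));  // [2, 4]
    }
}
```

Without clearing, rows from earlier files are still in the buffer when a later file is read, so the later file appears to produce more outputs than it has rows.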

I don't understand the job design in your first screenshot. I see you are using a tRunJob to call a child job, but you have not moved the processing into the child job as I suggested.

[screenshot attached]

MAnywar
Contributor III
Author

@Shicong Hong​ I sorted the issue.

This worked. Let me know if you have any comments or adjustments to recommend.

[two screenshots attached]

Even though I passed the file path directly from the tFiles straight to the tFileInputXML_1, it seemed not to have an effect (i.e. not having the main job).

Oh, the other thing is the file naming. This worked perfectly well:

"D:/Directory/FilesFolder"+((String)globalMap.get("tFileList_1_CURRENT_FILE"))+"_"+((String)globalMap.get("JoinedTable.DG1_1_Number"))+".xml"

The output is just as expected. Test3.xml_6.xml is missing because that row (6) is also missing in the source file (Test3). I believe the naming convention is appropriate in case one expects to trace back to the source file.
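For anyone who wants to try the naming outside a job, here is a small stand-alone sketch of the same concatenation. The class `OutputNameDemo`, the method `outputPath`, and the use of a plain `HashMap` in place of the job's globalMap are assumptions for illustration; the two keys are the ones from my expression. Note that the sketch inserts a "/" between the folder and the file name, which the expression as posted does not, so treat that separator as an assumption about the intended layout.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone version of the naming expression; not generated Talend code.
final class OutputNameDemo {

    /** Builds <outputDir>/<source file name>_<row number>.xml from the two map entries. */
    static String outputPath(Map<String, Object> globalMap, String outputDir) {
        String sourceFile = (String) globalMap.get("tFileList_1_CURRENT_FILE");
        String rowNumber = (String) globalMap.get("JoinedTable.DG1_1_Number");
        // Explicit separator: an assumption, the posted expression has none.
        return outputDir + "/" + sourceFile + "_" + rowNumber + ".xml";
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tFileList_1_CURRENT_FILE", "Test3.xml");
        globalMap.put("JoinedTable.DG1_1_Number", "5");
        System.out.println(outputPath(globalMap, "D:/Directory/FilesFolder"));
        // prints D:/Directory/FilesFolder/Test3.xml_5.xml
    }
}
```

Keeping the source file name in the output name is what makes each output traceable back to the file and row it came from.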

[screenshot attached]