Hi all,
I am trying to use tFileInputXML to extract information from an existing XML file and send it to a tLogRow, but I keep getting an error with this component.
Here is the very simple job that I am trying to run, without success: a tFileInputXML reads the content of an XML file and sends it to a tLogRow. I am testing it with a very simple XML file that I created for this purpose (see the component configuration in one of the screenshots).
The content of the file "C:/talend/bis_project/pit-project.xml" is the following:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE fmresultset PUBLIC "-//FMI//DTD fmresultset//EN" "/fmi/xml/fmresultset.dtd">
<resultset>
<record id="1">data1</record>
<record id="2">data2</record>
<record id="3">data3</record>
<record id="4">data564654</record>
<record id="5">data534534</record>
<record id="6">data45</record>
<record id="7">data78</record>
</resultset>
It is saved as UTF-8. The configuration of the XML component is in the screenshot too.
But the job doesn't work: I keep getting the error 'Invalid character constant'. What does that mean? How can I solve this?
Any help would be appreciated!
Thank you
Hello
As the error message says, there are invalid characters in the following line:
<!DOCTYPE fmresultset PUBLIC "-//FMI//DTD fmresultset//EN" "/fmi/xml/fmresultset.dtd">
With this line, the component does not accept the file as well-formed XML.
Before using tFileInputXML to read the file, you need to delete that line from the XML file, for example:
tFileInputFullLine---tFilterRow--tFileOutputDelimited
|
onsubjobok
|
tFileInputXML---tLogRow
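As a rough illustration of what the first subjob (tFileInputFullLine, tFilterRow, tFileOutputDelimited) does, here is a minimal plain-Java sketch that copies a file while dropping the DOCTYPE line. The class and method names are made up for this example, and it assumes the DOCTYPE declaration sits on a single line, as in the sample file. It is not the code Talend generates.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only.
public class DoctypeLineFilter {

    // Keep every line except the ones that start a DOCTYPE declaration.
    // Assumption: the whole declaration fits on one line.
    public static List<String> dropDoctypeLines(List<String> lines) {
        return lines.stream()
                .filter(line -> !line.trim().startsWith("<!DOCTYPE"))
                .collect(Collectors.toList());
    }

    // Usage: java DoctypeLineFilter <input.xml> <output.xml>
    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        Files.write(Paths.get(args[1]), dropDoctypeLines(lines), StandardCharsets.UTF_8);
    }
}
```

The cleaned output file is then the one that the second subjob (tFileInputXML, tLogRow) reads.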
For more details, please see my screenshots.
Best regards
Shong
Thank you very much Mister Shong.
This helps me. While waiting for an answer here, I had found that this line was the source of my problem, and your detailed explanation will help me automate the solution.
But isn't it normal to have a doctype in an XML file? Could the error also be because, in the example I gave, the name in the doctype (fmresultset) and the name of the root tag I used (resultset) were different? I hope Talend allows the use of XML files that have a doctype defined.
Hello,
I'm using Talend TIS 4.2.3 and I have to read an XML file with a doctype declaration. I would like first to validate the XML file against the DTD, and then to read it. How can I do that, given that the DTD validation component expects a DTD file?
For example, an XML file:
<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<!DOCTYPE images SYSTEM "http://www.electre.com/xml/images.dtd">
<images>
<record><ean>9782952972772</ean><couverture>1</couverture><imagette>1</imagette></record>
<record><ean>9791092708004</ean><couverture>1</couverture><imagette>1</imagette></record>
<record><ean>9782358630849</ean><couverture>1</couverture><imagette>1</imagette></record>
<record><ean>9782358630832</ean><couverture>1</couverture><imagette>1</imagette></record>
<record><ean>9782818605493</ean><couverture>1</couverture><imagette>1</imagette></record>
<record><ean>9782092548707</ean><couverture>1</couverture><imagette>1</imagette></record>
</images>
Thanks for your help.
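For reference, outside of any Talend component, the "validate against the DTD first" step can be sketched in plain Java with the standard JAXP parser. This is only a sketch under assumptions: the class name is hypothetical, and the DTD text is supplied from a local string through an entity resolver, because the real content of images.dtd is not shown in this thread.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper, for illustration only.
public class DtdValidationSketch {

    // Parses the XML with DTD validation switched on and returns the
    // validation error messages (an empty list means the document is valid).
    public static List<String> validate(String xml, String dtdText) {
        List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // validate against the DTD named in the DOCTYPE
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Assumption: serve a local copy of the DTD instead of fetching the system id.
            builder.setEntityResolver(
                    (publicId, systemId) -> new InputSource(new StringReader(dtdText)));
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored in this sketch */ }
                public void error(SAXParseException e) { problems.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Not well-formed, or the parser could not be configured.
            throw new IllegalStateException("XML could not be parsed", e);
        }
        return problems;
    }
}
```

If the returned list is empty, the file can go on to the reading step; otherwise the messages say which records break the DTD.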