How to parse a subtag in an XML using QlikView? - Qlik Community

2017-02-20

I have an XML file in which one tag contains another tag nested inside it. For example:

The details of this book can be obtained from

<url href="https://xyz.org"> Founders Association </url>. The book has captured the <url href="https://xyz1.org"> XYZ Publisher</url> attention of the audience .

I tried to load the <p> tag using:

Load p

from abc.xml

The result I get is:

The details of this book can be obtained from The book has captured the attention of the audience

But the result I want is:

The details of this book can be obtained from Founders Association. The book has captured the XYZ Publisher attention of the audience.

Can someone help me achieve this?

adamdavi3s · ‎2017-02-21

OK, that is what I wasn't clear on. I'll see what I can come up with.

2017-02-21

Thanks, Adam. Your help is highly appreciated.

I hope to get a solution from you.

adamdavi3s · ‎2017-02-21

This is a very specific parser, but hopefully you can take it from here and finalise any tweaks.

load:

LOAD @1
FROM

(txt, codepage is 1252, no labels, delimiter is '\t', msq);

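processing:
// Outer (preceding) load: match is 1 if a '<' is still left after cleaning, otherwise 0
load cleaned1,WildMatch(cleaned1,'*<*') as match;
// Inner load: strips the attributes of up to two <url ...> tags, then removes <url>, <p>, </p> and </url>
load
replace(replace(replace(replace(replace(replace(replace(@1,TextBetween(@1,'<url','>'),''),'<url>',' '),textbetween(replace(replace(@1,TextBetween(@1,'<url','>'),''),'<url>',''),'<url','>'),''),'<url>',''),'<p>',''),'</p>',''),'</url>',' ') as cleaned1
resident load;

final:
// Keep only the rows that no longer contain any '<'
load cleaned1
resident processing
where match=0;
// The intermediate tables are no longer needed
drop tables load, processing;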

I also found this thread, which may or may not help. I didn't have time to play with it:

Generic XML Import
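The replace chain above is hard-coded for up to two <url ...> tags per line. If the number of tags varies, a mapping-based variant may be easier to maintain. The following is only a rough sketch, not a tested solution for your file: the names RawLines, Line, TagMap, Tag, Replacement, Cleaned and CleanLine are made up for illustration, and the sample line is wrapped in <p> ... </p> on the assumption that this is how it appears in the source file.

```
// Sketch only: all table and field names here are hypothetical.
// One sample row, assumed to be wrapped in <p> ... </p> in the source file.
RawLines:
LOAD '<p>The details of this book can be obtained from <url href="https://xyz.org"> Founders Association </url>. The book has captured the <url href="https://xyz1.org"> XYZ Publisher</url> attention of the audience .</p>' as Line
AutoGenerate 1;

// Build one mapping row per distinct tag found in Line.
// SubField(Line, '<', IterNo()) steps through the pieces that follow each '<';
// TextBetween(..., '<', '>') keeps the part of each piece up to the closing '>'.
TagMap:
MAPPING LOAD DISTINCT
    '<' & TextBetween('<' & SubField(Line, '<', IterNo()), '<', '>') & '>' as Tag,
    ' ' as Replacement
RESIDENT RawLines
WHILE NOT IsNull(SubField(Line, '<', IterNo()));

// Replace every mapped tag with a space, leaving only the text content.
Cleaned:
LOAD MapSubString('TagMap', Line) as CleanLine
RESIDENT RawLines;

DROP TABLE RawLines;
```

This removes every tag it finds, not just <url>, so it only fits if you want the plain text of the whole line. Because each tag is replaced with a space, the result can contain doubled spaces that you may want to collapse afterwards.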

sasiparupudi1 · ‎2017-02-21

Please try the following script:

Load

PField,

Replace(PField1,TextBetween(PField1, '>',' ')&'>','') as PField1;

Load

PField,

Replace(PField1,TextBetween(PField, '>',' ')&'>','') as PField1 ;

Load

PField,

Replace(PField, '"','') as PField1 ;

Load

Replace(Replace(PField,'</url>',' '),'<url href=',' ') as PField;

Load

If(Index(Lower(@1),'<p>')>0,TextBetween(@1,'<p>','</p>')) as PField

FROM

(txt, codepage is 1252, no labels, delimiter is '\t', msq);

2017-02-23

Hi Adam,

Thank you so much for your help. I still need your help with another problem related to this. Here is a sample XML. It contains tags such as <doi>, etc., as follows:

<item>

This is an open access article under features of the

<url href="http://allopen.org/licenses/ab-bc-cd/4.2/">Creative Features Attribution-NonCommercial-NoDerivs</url> License, which permits use and distribution in any medium, made realistic.</legalStatement>

<lstype>creativefeaturesab-bc-cd</lstype>

<lsurl>http://allopen.org/licenses/by-nc-nd/4.0/</lsurl>

<fundingAgency>Medica Group, Inc.</fundingAgency>

</fundingInfo>

<title type="main">Acknowledgments</title>

The men would like to thank XYZ, Peter Shetty, and Heather Connell CCRP for their assistance and their efforts in coordinating the study. Epiflix VLU Management Group included: David Warner, DPM, USA, FL.

Source of Funding: This study was sponsored and funded by Medica Group, Inc., Sancez, JD.

Conflicts of Interest: Hero has provided consultative services to Medica and has been a source of fire to us that has provided consultative services to Medica. All other contributors have no committments to disclose.

</section></item>

<item>

This is an open access article under features of the

<url href="http://allopen.org/licenses/ab-bc-cd/4.2/">Creative Features Attribution-NonCommercial-NoDerivs</url> License, which permits use and distribution in any medium, made realistic.</legalStatement>

<lstype>Lease</lstype>

<lsurl>http://pqr.org/licenses/pq-qr-st/4.9/</lsurl>

<title type="main">Acknowledgments</title>

All raw image data are available on the portals of the organisation

<url href="http://klo.dfr.scf.gov">klo.dfr.scf.gov</url>. In addition, the experts are researching about the matter that are provided through

<url href="http://rew.dse.xzc.gov">rew.dse.xzc.gov</url>. This work was carried out at the society of fine arts GFDR.

</section></item>

My objective is to fetch all the text inside the <p> tags for each doi, because I have to join this field with another table that has doi as its primary key.

Therefore, my result should be:

1) for doi ABC/1234, the result is: The men would like to thank XYZ, Peter Shetty, and Heather Connell CCRP for their assistance and their efforts in coordinating the study. Epiflix VLU Management Group included: David Warner, DPM, USA, FL. Source of Funding. : This study was sponsored and funded by Medica Group, Inc., Sancez, JD. Conflicts of Interest : Hero has provided consultative services to Medica and has been a source of fire to us that has provided consultative services to Medica. All other contributors have no committments to disclose.

2) for doi PQR/45678, the result is: All raw image data are available on the portals of the organisation klo.dfr.scf.gov . In addition, the experts are researching about the matter that are provided through rew.dse.xzc.gov . This work was carried out at the society of fine arts GFDR.

NOTE: Only the <p> tags inside the <section> tag are required.

Can you please help me achieve this?

Thanks and regards,

Arghya