Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
mfchmielowski
Creator II
Creator II

Loading data from html table - problem with br tag

Hello all.

I'm fetching data from url The Denied Persons List‌ There is a table with column: Name and address.

Name is devided from address with html br tag. When qlikview is parsing the value in this cell it's giving name and address concatenated and i've no idea how to split this two values. I need to have name and address in two different columns in qlikview.

I've tried subfield([Name and addres], chr(10), 1) and subfield([Name and addres], chr(13), 1) but none of this worked.

My load script:

DeniedPersonsTmp:

LOAD [Name and Address],

     [Effective Date],

     [Expiration Date],

     [Type of Denial],

     F5,

     RowNo() bitand 1 as rowNoBitAnd

FROM

[https://www.bis.doc.gov/dpl/public/dpl.php]

(html, utf8, embedded labels, table is @1);

NoConcatenate

DeniedPersons:

load

  *

  , dpEffectiveDateYear&'-'&dpEffectiveDateMonth&'-'&dpEffectiveDateDay as dpEffectiveDateNormal

  , if(dpExpirationDateRaw <> '', dpExpirationDateYear&'-'&dpExpirationDateMonth&'-'&dpExpirationDateDay) as dpExpirationDate

  , trim(SubField(dpNameAndAddress, ',', 1)) as dpName

  , trim(SubField(dpNameAndAddress, ',', 2)) as dpAddress

  , trim(SubField(dpNameAndAddress, ',', 3)) as dpAddress2

;

load

  'bis.doc.gov' as dpDataSource,

  [Name and Address] as dpNameAndAddress,

  SubField([Effective Date], '/', 3) as dpEffectiveDateYear,

  SubField([Effective Date], '/', 2) as dpEffectiveDateDay,

  SubField([Effective Date], '/', 1) as dpEffectiveDateMonth,

  SubField([Expiration Date], '/', 3) as dpExpirationDateYear,

  SubField([Expiration Date], '/', 2) as dpExpirationDateDay,

  SubField([Expiration Date], '/', 1) as dpExpirationDateMonth,

  [Expiration Date] as dpExpirationDateRaw,

  [Type of Denial] as dpTypeOfDenial

Resident DeniedPersonsTmp Where rowNoBitAnd = 1;

drop table DeniedPersonsTmp;

drop fields dpExpirationDateRaw from DeniedPersons;

1 Solution

Accepted Solutions
rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

I think you've run up against a limitation of the  QV HTML loader. Each <td> tag is treated as a field, but tags within the <td> are stripped, giving you no way to separate Name and Address. However, a nice csv version of the same data is available here:

https://www.bis.doc.gov/dpl/dpl.txt

That file offers a nice breakout of the individual attributes.

-Rob

http://masterssummit.com

http://qlikviewcookbook.com

View solution in original post

4 Replies
marcus_sommer

Try it with: subfield([Name and addres], chr(13) & chr(10), 1)

- Marcus

mfchmielowski
Creator II
Creator II
Author

I've tried that already.

DeniedPersonsTmp:

LOAD

  SubField([Name and Address], ',', 1) as dpName,

  index([Name and Address], chr(13)) as isChr13Raw,

  index([Name and Address], chr(10)) as isChr10Raw,

FROM

[https://www.bis.doc.gov/dpl/public/dpl.php]

(html, utf8, embedded labels, table is @1);

index.png

Looks like qlikview internal html parser makes something like replace(cellValue, '<br>', '')

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

I think you've run up against a limitation of the  QV HTML loader. Each <td> tag is treated as a field, but tags within the <td> are stripped, giving you no way to separate Name and Address. However, a nice csv version of the same data is available here:

https://www.bis.doc.gov/dpl/dpl.txt

That file offers a nice breakout of the individual attributes.

-Rob

http://masterssummit.com

http://qlikviewcookbook.com

mfchmielowski
Creator II
Creator II
Author

Thank You very much