Qlik Community

QlikView Scripting

Discussion Board for collaboration on QlikView Scripting.

Announcements
QlikView Fans! We’d love to hear from you.
Share your QlikView feedback with the product team… Click here to participate in our 5-minute survey.
Rules, plus terms and conditions, can be found here.
Not applicable

Extract Datas from Html non parsed

Hi!!!

Im needing a little help again with other file to extract data.

I have an html file and i need the following datas:

UNIVERSIDAD XXXXX  into a Field called "Obra"

Hasta: (in this case the value its) 16/07/2015

Neto Rendido: (in this case the value its) $51.814,10

This Extraction its not only for this File. I have many files with the same format, and i need to read them all

Can Someone help me?

Regards!!!

Germán

Tags (3)
6 Replies
Not applicable

Re: Extract Datas from Html non parsed

Hi Please find the below sample script.

TEMP:

LOAD

  RowNo() AS RN ,

  LN1 &'|'& LN2 &'|'& LN3 &'|'& LN4 &'|'& LN5 &'|'& LN6 &'|'& LN7 &'|'& LN8 AS LINE

;

LOAD trim(@1) AS LN1,

     trim(@2) AS LN2,

     trim(@3) AS LN3,

     trim(@4) AS LN4,

     trim(@5) AS LN5,

     trim(@6) AS LN6,

     trim(@7) AS LN7,

     trim(@8) AS LN8

FROM [PL0716_15628_3.HTM] (html, codepage is 1252, no labels, table is @1);

TEMP2:

LOAD

  * ,

  IF(Previous(RN)=RN, IF(Previous(LINES)= LINES , Peek(SN) , RangeSum(1,Peek(SN)) ) ,1) AS SN

Where  Len(Trim(LINES))

;

LOAD

  RN,

  Trim(SubField(LINE ,'|')) AS LINES

Resident TEMP ;

DROP Table TEMP;

SourceFile:

LOAD

  RN,

  Concat( DISTINCT LINES , ' ' , SN ) AS HLINE

Resident TEMP2

Where Len(Trim(LINES))

Group By RN

;

DROP Table TEMP2;

FinalTemp:

LOAD

  DATASTRING ,

  SubField(DATASTRING,'|',1) AS Obra,

  subfield(SubField(DATASTRING,2),':',-1) AS Hasta ,

  subfield(SubField(DATASTRING,3),':',-1) AS [Neto Rendido]

;

LOAD

  Concat( HLINE , '|' , RN ) AS DATASTRING

Resident SourceFile

Where WildMatch( HLINE , 'UNIVERSIDAD*', '*Hasta*' ,'*Neto Rendido*');

DROP Table SourceFile;

Not applicable

Re: Extract Datas from Html non parsed

Hi Dathu!!!

Thanks!!

But this only works for one file, isnt it?

I need the same but for many files into the same folder.

How could do i?

Regards!!!

sasiparupudi1
Honored Contributor III

Re: Extract Datas from Html non parsed

May be this post can help you to loop through the html files in a folder

Re: loop through to load all files from a folder and its subfolders?

Not applicable

Re: Extract Datas from Html non parsed

Hi, If all files have same format create a for loop to load the data from each file.

Since we have the html files we can't load like sub folders or folders.

First Create a small excel or Inline table having all the html file names. (This must be a manual or some how you get from other source)

HtmlFilesList:

LOAD * INLINE [

HtmlFileName

file1

file2

file3

.

.

];

SET vTableName = TransformedData:;

For i=1 to FieldvalueCount('HtmlFileName')

Let vHtmlFile = FieldValue('HtmlFileName') ;

// **** Your Load Script Logic for each html page ****

TEMP:

LOAD

  RowNo() AS RN ,

  LN1 &'|'& LN2 &'|'& LN3 &'|'& LN4 &'|'& LN5 &'|'& LN6 &'|'& LN7 &'|'& LN8 AS LINE

;

LOAD trim(@1) AS LN1,

     trim(@2) AS LN2,

     trim(@3) AS LN3,

     trim(@4) AS LN4,

     trim(@5) AS LN5,

     trim(@6) AS LN6,

     trim(@7) AS LN7,

     trim(@8) AS LN8

FROM [$(vHtmlFile)] (html, codepage is 1252, no labels, table is @1);

TEMP2:

LOAD

  * ,

  IF(Previous(RN)=RN, IF(Previous(LINES)= LINES , Peek(SN) , RangeSum(1,Peek(SN)) ) ,1) AS SN

Where  Len(Trim(LINES))

;

LOAD

  RN,

  Trim(SubField(LINE ,'|')) AS LINES

Resident TEMP ;

 

DROP Table TEMP;

 

SourceFile:

LOAD

  RN,

  Concat( DISTINCT LINES , ' ' , SN ) AS HLINE

Resident TEMP2

Where Len(Trim(LINES))

Group By RN

;

 

DROP Table TEMP2;

$(vTableName)

LOAD

  DATASTRING ,

  SubField(DATASTRING,'|',1) AS Obra,

  subfield(SubField(DATASTRING,2),':',-1) AS Hasta ,

  subfield(SubField(DATASTRING,3),':',-1) AS [Neto Rendido]

;

LOAD

  Concat( HLINE , '|' , RN ) AS DATASTRING

Resident SourceFile

Where WildMatch( HLINE , 'UNIVERSIDAD*', '*Hasta*' ,'*Neto Rendido*');

 

DROP Table SourceFile;

Let vHtmlFile = ;

SETvTableName = Concatenate(TransformedData);

Next i

Drop Table HtmlFilesList ;

Not applicable

Re: Extract Datas from Html non parsed

I Apreciate your help but something dosnt look good on datas. Its not parsing datas in the correct way as you can see on the screen attached.

For instance, look the date data, it dosnt include the year. Or there's some Neto Value that its $29.67 when you can see on DataString that this value its $29.673,86

Qlikview_Ripsa.jpg

Something its not working.

Thanks anyway.

sasiparupudi1
Honored Contributor III

Re: Extract Datas from Html non parsed

may be this post can help you

Re: Importing HTML data into QlikView

Community Browser