Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi!!!
Im needing a little help again with other file to extract data.
I have an html file and i need the following datas:
UNIVERSIDAD XXXXX into a Field called "Obra"
Hasta: (in this case the value its) 16/07/2015
Neto Rendido: (in this case the value its) $51.814,10
This Extraction its not only for this File. I have many files with the same format, and i need to read them all
Can Someone help me?
Regards!!!
Germán
Hi Please find the below sample script.
TEMP:
LOAD
RowNo() AS RN ,
LN1 &'|'& LN2 &'|'& LN3 &'|'& LN4 &'|'& LN5 &'|'& LN6 &'|'& LN7 &'|'& LN8 AS LINE
;
LOAD trim(@1) AS LN1,
trim(@2) AS LN2,
trim(@3) AS LN3,
trim(@4) AS LN4,
trim(@5) AS LN5,
trim(@6) AS LN6,
trim(@7) AS LN7,
trim(@8) AS LN8
FROM [PL0716_15628_3.HTM] (html, codepage is 1252, no labels, table is @1);
TEMP2:
LOAD
* ,
IF(Previous(RN)=RN, IF(Previous(LINES)= LINES , Peek(SN) , RangeSum(1,Peek(SN)) ) ,1) AS SN
Where Len(Trim(LINES))
;
LOAD
RN,
Trim(SubField(LINE ,'|')) AS LINES
Resident TEMP ;
DROP Table TEMP;
SourceFile:
LOAD
RN,
Concat( DISTINCT LINES , ' ' , SN ) AS HLINE
Resident TEMP2
Where Len(Trim(LINES))
Group By RN
;
DROP Table TEMP2;
FinalTemp:
LOAD
DATASTRING ,
SubField(DATASTRING,'|',1) AS Obra,
subfield(SubField(DATASTRING,2),':',-1) AS Hasta ,
subfield(SubField(DATASTRING,3),':',-1) AS [Neto Rendido]
;
LOAD
Concat( HLINE , '|' , RN ) AS DATASTRING
Resident SourceFile
Where WildMatch( HLINE , 'UNIVERSIDAD*', '*Hasta*' ,'*Neto Rendido*');
DROP Table SourceFile;
Hi Dathu!!!
Thanks!!
But this only works for one file, isnt it?
I need the same but for many files into the same folder.
How could do i?
Regards!!!
May be this post can help you to loop through the html files in a folder
Re: loop through to load all files from a folder and its subfolders?
Hi, If all files have same format create a for loop to load the data from each file.
Since we have the html files we can't load like sub folders or folders.
First Create a small excel or Inline table having all the html file names. (This must be a manual or some how you get from other source)
HtmlFilesList:
LOAD * INLINE [
HtmlFileName
file1
file2
file3
.
.
];
SET vTableName = TransformedData:;
For i=1 to FieldvalueCount('HtmlFileName')
Let vHtmlFile = FieldValue('HtmlFileName') ;
// **** Your Load Script Logic for each html page ****
TEMP:
LOAD
RowNo() AS RN ,
LN1 &'|'& LN2 &'|'& LN3 &'|'& LN4 &'|'& LN5 &'|'& LN6 &'|'& LN7 &'|'& LN8 AS LINE
;
LOAD trim(@1) AS LN1,
trim(@2) AS LN2,
trim(@3) AS LN3,
trim(@4) AS LN4,
trim(@5) AS LN5,
trim(@6) AS LN6,
trim(@7) AS LN7,
trim(@8) AS LN8
FROM [$(vHtmlFile)] (html, codepage is 1252, no labels, table is @1);
TEMP2:
LOAD
* ,
IF(Previous(RN)=RN, IF(Previous(LINES)= LINES , Peek(SN) , RangeSum(1,Peek(SN)) ) ,1) AS SN
Where Len(Trim(LINES))
;
LOAD
RN,
Trim(SubField(LINE ,'|')) AS LINES
Resident TEMP ;
DROP Table TEMP;
SourceFile:
LOAD
RN,
Concat( DISTINCT LINES , ' ' , SN ) AS HLINE
Resident TEMP2
Where Len(Trim(LINES))
Group By RN
;
DROP Table TEMP2;
$(vTableName)
LOAD
DATASTRING ,
SubField(DATASTRING,'|',1) AS Obra,
subfield(SubField(DATASTRING,2),':',-1) AS Hasta ,
subfield(SubField(DATASTRING,3),':',-1) AS [Neto Rendido]
;
LOAD
Concat( HLINE , '|' , RN ) AS DATASTRING
Resident SourceFile
Where WildMatch( HLINE , 'UNIVERSIDAD*', '*Hasta*' ,'*Neto Rendido*');
DROP Table SourceFile;
Let vHtmlFile = ;
SETvTableName = Concatenate(TransformedData);
Next i
Drop Table HtmlFilesList ;
I Apreciate your help but something dosnt look good on datas. Its not parsing datas in the correct way as you can see on the screen attached.
For instance, look the date data, it dosnt include the year. Or there's some Neto Value that its $29.67 when you can see on DataString that this value its $29.673,86
Something its not working.
Thanks anyway.
may be this post can help you