Hi all,
Is there any way, or any extension available, to read data from a PDF file,
just as we read from other sources such as Excel, a database, etc.?
No, but there are converters from PDF to Excel.
Interesting... Could you post an example PDF file to understand the use case?
- Ralf
Interesting... Could you post an example PDF file to understand the use case?
-Brijesh
Please find attached a sample.
Hi Nitin,
this is quite a long road. You can do it with a file conversion using pdftohtml.exe from SourceForge:
// Set path of source file
Set vPath = C:\Projekte\QVPDF\;
// Set amount of columns
Set vCols = 2;
// convert PDF file to XML
EXECUTE cmd.exe /C pdftohtml.exe -xml $(vPath)sample.pdf;
// Load from XML (this depends heavily on the PDF layout!)
RawData:
LOAD text%Table as value
FROM [$(vPath)sample.xml] (XmlSimple, Table is [pdf2xml/page/text]);
// Load field names from header for later renaming
HeaderMap:
Mapping First $(vCols) LOAD '@' & RecNo() as x, value as y
Resident RawData;
// build a proper input table (row key and column attribute derived from vCols)
InputTable:
LOAD ceil(RecNo()/$(vCols))-1 as %key, '@' & (Mod(RecNo()-1, $(vCols))+1) as attribute, value
Resident RawData
Where RecNo()>$(vCols);
// generic load from input table
GenTable:
Generic LOAD * Resident InputTable;
// consolidation of tables created by generic load
ResultTable:
LOAD Distinct %key Resident InputTable;
FOR i = 0 to NoOfTables()
TableList:
LOAD TableName($(i)) as Tablename AUTOGENERATE 1
WHERE WildMatch(TableName($(i)), 'GenTable.*');
NEXT i
FOR i = 1 to FieldValueCount('Tablename')
LET vTable = FieldValue('Tablename', $(i));
LEFT JOIN (ResultTable) LOAD * RESIDENT [$(vTable)];
DROP TABLE [$(vTable)];
NEXT i
DROP TABLES RawData, TableList, InputTable;
RENAME Fields Using HeaderMap;
To run an external command, you have to change these settings.
Open the dialog with Shift-Ctrl-M:
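If the EXECUTE statement is not permitted, the XML file is never written and the subsequent LOAD fails. A minimal sketch of a guard that could follow the EXECUTE line is shown here. It assumes that FileSize() returns NULL when the file does not exist, and the trace message is only illustrative:

```qlikview
// Assumed guard, not part of the original script:
// stop the reload if pdftohtml did not produce sample.xml
Set vPath = C:\Projekte\QVPDF\;

EXECUTE cmd.exe /C pdftohtml.exe -xml $(vPath)sample.pdf;

// Alt() turns a NULL file size (assumed for a missing file) into 0
IF Alt(FileSize('$(vPath)sample.xml'), 0) = 0 THEN
    TRACE sample.xml was not created. Check the EXECUTE permission and the pdftohtml path;
    EXIT Script;
END IF
```

With this in place, a blocked or failed conversion shows up in the script log instead of as an XML load error.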
- Ralf
Hi,
I have done this to fetch data from a voter list provided on a government site.
As the files are in PDF format, I used the free Weeny Excel converter to convert them to Excel format.
Now you can easily load the Excel files into QV.
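As a rough illustration, loading such a converted workbook could look like the sketch below. The file path, table label, and sheet name are placeholders and not taken from the actual voter list files:

```qlikview
// Placeholder path and sheet name (assumptions for illustration only)
Set vExcelFile = C:\Data\voterlist.xls;

VoterList:
LOAD *
FROM [$(vExcelFile)] (biff, embedded labels, table is [Sheet1$]);

// For an .xlsx file the format spec would instead be:
// (ooxml, embedded labels, table is Sheet1)
```

How clean the result is depends on how well the converter preserved the table layout, so the column names may need renaming afterwards.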
Regards
Arun
Hi Ralf,
this is really interesting. It's working
great!
I happened to download this file.
DON'T
Win32/Vigram.A virus
Hi Ralf,
I ran the sample you submitted, but I got the error below. Could there be a problem with the version? I am using QlikView version 12.20.
Error text:
The top level of the document is invalid.
On line number: 2. On column number: 11. System ID: sample.xml.
RawData:
LOAD text%Table as value
FROM [C:\PDFtoQVD\sample.xml] (XmlSimple, Table is [pdf2xml/page/text])