Search through documents with QlikView - Qlik Community

Jason_Michaelides · ‎2011-02-16

Hello all,

I have just started using a sparkly new Win7Pro x64 PC at work. The primary reason was to make use of more RAM for QlikView applications so that I can stop developing on the server! However, one of the first issues I have hit is Microsoft's (deliberate) omission of indexing for network shares unless you make them offline folders (not a practical option!). A colleague was telling me about Google Search Appliance, which got me thinking about a QlikView solution.

I've tried to see if someone's already done this but haven't found anything. Do any of you gentlemen or ladies have an idea of how QlikView could be used to "index" a folder of documents for easy searching? Maybe even pull some content from them, given the new open .docx format?

Would anyone like to take up the challenge of building one!? Just a thought...

Cheers,

Jason

2011-02-16

Hi Jason,

This is doable; we recently did it for one of our customers, indexing PDF files 😄

You can write a recursive subroutine which reads folder by folder and file by file. You may want to exclude some directories, such as C:\Windows, or certain file types. However, the basic script is as simple as the following:

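SUB GetPaths(dir)
    // Visit every immediate subdirectory of dir
    FOR EACH subdir IN DirList('$(dir)' & '\*')
        docpath:
        LOAD '$(subdir)' AS docpath
        AUTOGENERATE 1;
        // Recurse into the subdirectory just found
        CALL GetPaths('$(subdir)')
    NEXT subdir
END SUB

CALL GetPaths('C:');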

You can then do something similar for files:

for each file in filelist(subdir & '\*.*')

...

Within this piece of code you can check the file type and, if the file is readable (e.g. txt), load it into QlikView either as a single value, line by line, or however you like. I never tried that for *.docx, but it seems to work, except that you will lose some metadata such as page numbers. You also need to handle the header "crap", but I didn't say it is easy 😉
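As a rough illustration of the file loop for plain-text files, a minimal sketch might look like the following. GetFiles, doclines, line_no, line_text and the C:\Docs start folder are made-up names for this example, and the [@1:n] fixed-record reference is an assumption about how to read a whole line as one field; check it against what the file wizard generates in your QlikView version.

```
// Sketch only: one row per line of every *.txt file below a start folder.
// GetFiles, doclines, line_no and line_text are illustrative names.
SUB GetFiles(dir)
    // Load each text file in the current folder, line by line
    FOR EACH file IN FileList('$(dir)' & '\*.txt')
        doclines:
        LOAD
            '$(file)' AS docpath,    // full path of the source file
            RecNo()   AS line_no,    // line number within that file
            [@1:n]    AS line_text   // assumed: whole line as a single field
        FROM [$(file)] (fix, codepage is 1252);
    NEXT file

    // Then descend into the subfolders
    FOR EACH subdir IN DirList('$(dir)' & '\*')
        CALL GetFiles('$(subdir)')
    NEXT subdir
END SUB

CALL GetFiles('C:\Docs');   // hypothetical start folder
```

Because every LOAD produces the same three fields, the rows should end up concatenated in one table, so line_text can be searched across all files and docpath tells you where a hit came from. Paths containing an apostrophe would break the '$(file)' expansion and need extra handling.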

Feel free to ask further...

cheers

Florian