Jason_Michaelides
Partner - Master II

Search through documents with QlikView

Hello all,

I have just started using a sparkly new Win7Pro x64 PC at work. The primary reason for this was to utilise higher levels of RAM for QlikView applications so I stop developing on the server! However, one of the first issues I have hit is the (deliberate) omission by Microsoft of indexing for network shares unless you make them offline folders (not a practical option!). A colleague was telling me about Google Search Appliance (details here) which got me thinking about a QlikView solution.

I've tried to see if someone's already done this but not found anything. Do any of you gentlemen or ladies have some idea about how QlikView could be used to "index" a folder of documents for easy searching? Maybe even get some content from them given the new .docx open format?

Would anyone like to take up the challenge of building one!? Just a thought...[8-|]

Cheers,

Jason

1 Reply
Not applicable

Hi Jason,

This is doable. We did it recently for one of our customers, indexing PDF files 😄

You can write a recursive subroutine which reads folder by folder and file by file. You may want to exclude some directories such as C:\Windows or file types. However the basic script is as simple as the following.

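SUB GetPaths(dir)
  // Visit every subdirectory directly below dir
  FOR EACH subdir in dirlist( '$(dir)' & '\*' )
    // Append one row holding the subdirectory path
    docpath:
    LOAD '$(subdir)' as docpath
    AUTOGENERATE 1;
    // Recurse into the subdirectory
    CALL GetPaths('$(subdir)')
  NEXT
END SUB

CALL GetPaths('C:');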
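
The exclusion of directories mentioned above could look roughly like the sketch below. It is only an illustration: the variable vExcludeDirs, the sub name GetPathsFiltered and the two listed folders are made-up names, and it assumes dirlist returns full paths such as C:\Windows. MixMatch is used so that the comparison ignores case.

```qlikview
// Sketch only: vExcludeDirs and GetPathsFiltered are hypothetical names.
// Comma-separated, quoted list of directories to skip entirely.
SET vExcludeDirs = 'C:\Windows','C:\Program Files';

SUB GetPathsFiltered(dir)
  FOR EACH subdir IN dirlist('$(dir)' & '\*')
    // MixMatch() returns 0 when subdir is not in the exclusion list
    IF MixMatch('$(subdir)', $(vExcludeDirs)) = 0 THEN
      docpath:
      LOAD '$(subdir)' AS docpath
      AUTOGENERATE 1;
      // Only recurse into directories that were not excluded
      CALL GetPathsFiltered('$(subdir)')
    END IF
  NEXT subdir
END SUB

CALL GetPathsFiltered('C:');
```

Because an excluded directory is never recursed into, everything below it is skipped as well, which also keeps the reload time down.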



You can then do something similar for files:

for each file in filelist(subdir & '\*.*')

...

Within this piece of code you can check the file type and, if the file is readable (e.g. txt), load it into QlikView either as a single value, line by line, or however you like. I have never tried that for *.docx, but it seems to work, except that you will lose some metadata such as page numbers. You also need to handle the header "crap", but I didn't say it is easy 😉
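
As a rough sketch of the file loop described above, the following sub records every file in a directory and reads the content of .txt files line by line. The names IndexFiles, fileindex, doclines and the field names are invented for illustration, and codepage 1252 is an assumption about how the text files are encoded. It also assumes file paths contain no apostrophes or square brackets, which would break the quoting.

```qlikview
// Sketch only: IndexFiles, fileindex and doclines are hypothetical names.
SUB IndexFiles(dir)
  FOR EACH file IN filelist('$(dir)' & '\*.*')
    // One row per file: full path, size in bytes, last modification time
    fileindex:
    LOAD '$(file)' AS filepath,
         FileSize('$(file)') AS filesize,
         FileTime('$(file)') AS filetime
    AUTOGENERATE 1;

    // Only plain-text files are read for content in this sketch
    IF WildMatch('$(file)', '*.txt') > 0 THEN
      doclines:
      LOAD '$(file)' AS filepath,
           RecNo() AS linenumber,
           [@1:n] AS linetext      // whole line, read as a fixed record
      FROM [$(file)] (fix, codepage is 1252);
    END IF
  NEXT file
END SUB
```

One way to use it would be to add CALL IndexFiles('$(subdir)') inside the FOR EACH loop of GetPaths, so each visited directory is indexed as it is found. The two tables then associate on filepath, and a search object on linetext would show which files contain a given term.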

Feel free to ask further...

cheers

Florian