
Loading Log files from huge directory

Hi,

I am loading log files from a huge directory, about 1000 files in total.

The load takes a long time, which is fine. However, the directory receives new log files every 15 minutes, and I need to reload to include them.

When I reload, is there any way to load only the new log files and add them to the ones already loaded?

Thanks 

7 Replies
rbecher
MVP

Hi Badr,

Yes, it's possible. Store the names of all loaded files in a table, then check not exists(filename) before loading each log file.
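To illustrate the idea in isolation, here is a minimal sketch of how Exists() can filter out names that are already in a tracking table. The inline tables, the field name candidate, and the sample file names are made up for illustration; only the field name filename follows the suggestion above.

```
// Hypothetical tracking table: names of files loaded in earlier runs
files:
LOAD * INLINE [
filename
cdr_001
cdr_002
];

// Hypothetical candidate list; only names not yet present in the
// field 'filename' pass the WHERE filter (here: cdr_003)
new_files:
LOAD candidate
WHERE not Exists(filename, candidate);
LOAD * INLINE [
candidate
cdr_002
cdr_003
];
```

In a real script the tracking table would be persisted between reloads, for example in a QVD file, so that the check still works after the document is reloaded from scratch.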

- Ralf

Data & AI Engineer at Orionbelt.ai - a GenAI Semantic Layer Venture, Inventor of Astrato Engine
Not applicable
Author

Could you please give me more details?

Thanks

rbecher
MVP

Can you post your script, or at least the parts that deal with the log files?

Not applicable
Author

Directory;

LOAD cdrRecordType,
     globalCallID_callManagerId,
     globalCallID_callId,
     origLegCallIdentifier,
     dateTimeOrigination,
     origNodeId,
     origSpan,
     authorizationCodeValue
FROM
X:\CallRecordsArchive\cdr\cdr_*
(txt, codepage is 1252, embedded labels, delimiter is ',', msq);

This script loads all the files in that directory whose names start with cdr_.

These are phone log files, and the directory receives new ones throughout the day as people make calls.

I want to schedule a task that reloads every 60 minutes.

As it stands, each reload would load all the files again, which takes forever and is unnecessary, because the log files that were loaded previously never change.

rbecher
MVP

This is a suggestion:

if len(filesize('data.qvd')) > 0 then
    data:
    LOAD * FROM data.qvd (qvd);
end if

if len(filesize('files.qvd')) > 0 then
    files:
    LOAD * FROM files.qvd (qvd);
else
    files:
    LOAD '' as filename AutoGenerate(0);
end if

for each file in filelist('X:\CallRecordsArchive\cdr\cdr_*')
    if not exists('filename', '$(file)') then
        files:
        LOAD '$(file)' as filename autogenerate(1);

        data:
        LOAD
            cdrRecordType,
            globalCallID_callManagerId,
            globalCallID_callId,
            origLegCallIdentifier,
            dateTimeOrigination,
            origNodeId,
            origSpan,
            authorizationCodeValue
        FROM [$(file)]
        (txt, codepage is 1252, embedded labels, delimiter is ',', msq);
    end if
next file

store data into data.qvd (qvd);
store files into files.qvd (qvd);
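One edge case worth considering: on the very first run, if the directory holds no matching files yet, the data table is never created, and the store statement would then probably fail. A minimal sketch of a guard follows, assuming TableNumber() returns NULL for a table that does not exist. This is an optional addition, not part of the suggestion above.

```
// Store the data table only if it was actually created,
// either from data.qvd or from at least one log file
if not IsNull(TableNumber('data')) then
    store data into data.qvd (qvd);
end if
```

The files table needs no such guard, because the script always creates it, even if it is empty.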

- Ralf

Message was edited by: Ralf Becher. There was an ELSE missing in the 2nd IF.

Not applicable
Author

I am going to wait until the current load finishes and then edit the script.

Thank you so much for your help. One more thing: I am trying to understand the script. Would you mind adding a little documentation next to the code?

If that is too much, don't worry about it.

Thanks

rbecher
MVP

You're welcome! When you reload a qvw file, all previously loaded tables and data are lost unless you use the partial reload option (a more powerful solution, but more complicated if you have multiple tables). Therefore, the best practice is to store the already loaded tables in QVD files (QlikView Data files), which can be loaded quickly at the start of the next reload.

// load the stored data from preceding loads
if len(filesize('data.qvd')) > 0 then
    data:
    LOAD * FROM data.qvd (qvd);
end if

// load the stored filenames (from preceding loads)
if len(filesize('files.qvd')) > 0 then
    files:
    LOAD * FROM files.qvd (qvd);
else
    // first-time load: create an empty files table
    files:
    LOAD '' as filename AutoGenerate(0);
end if

// loop over every file matching the pattern
for each file in filelist('X:\CallRecordsArchive\cdr\cdr_*')
    // check whether the file was loaded in a preceding load (filename is a field of table files)
    if not exists('filename', '$(file)') then
        // save the current filename to prevent double processing
        files:
        LOAD '$(file)' as filename autogenerate(1);

        // load the data of the current file and append it to the data from preceding loads
        data:
        LOAD
            cdrRecordType,
            globalCallID_callManagerId,
            globalCallID_callId,
            origLegCallIdentifier,
            dateTimeOrigination,
            origNodeId,
            origSpan,
            authorizationCodeValue
        FROM [$(file)]
        (txt, codepage is 1252, embedded labels, delimiter is ',', msq);
    end if
next file

// store the data for the next load
store data into data.qvd (qvd);

// store the names of all files processed so far
store files into files.qvd (qvd);
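For comparison, the partial reload option mentioned above could look roughly like the sketch below. It uses the Add Only prefix, which is disregarded during a normal reload and appends to the existing table during a partial reload. The file name cdr_new.txt and the shortened field list are hypothetical, and no duplicate check is performed, so you would still need some way to decide which files are new.

```
// Normal reload: loads every matching file into the table 'data'.
// Partial reload: this statement is skipped and the existing rows are kept.
data:
LOAD cdrRecordType,
     globalCallID_callId,
     dateTimeOrigination
FROM [X:\CallRecordsArchive\cdr\cdr_*]
(txt, codepage is 1252, embedded labels, delimiter is ',', msq);

// Partial reload only: append the rows of one new file (hypothetical name).
// Disregarded during a normal reload.
Add Only LOAD cdrRecordType,
     globalCallID_callId,
     dateTimeOrigination
FROM [X:\CallRecordsArchive\cdr\cdr_new.txt]
(txt, codepage is 1252, embedded labels, delimiter is ',', msq);
```

With a single table and a steady stream of new files, the QVD approach above is likely simpler to schedule, since it needs no distinction between normal and partial reloads.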
