Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
joey_lutes
Partner - Creator
Partner - Creator

Loop Through Directory Structure to load data

I have an elaborate load script (involving hash, flags, etc) that I need to run on every file found in a directory structure very similar to this one from csv files.

In a test load, I used Henric's  hic subroutine and called it - I guess I expected it to create the table 'FileList' which would be visible in the Data Model Viewer - which it is not.  The script runs without error, I'm just not sure how to implement it.

Here's the sub I used:

Set vConcatenate = ;

sub ScanFolder(Root)

          for each FileExtension in 'csv'

                    for each FoundFile in filelist( Root & '\*.' & FileExtension)

                              FileList:

                              $(vConcatenate)

                              LOAD *, '$(FoundFile)' as SourceFile

                              FROM [$(FoundFile)] (txt, codepage is 1252, embedded labels, delimiter is ',', msq, header is 2 lines);

                              Set vConcatenate = Concatenate;

                    next FoundFile

          next FileExtension

          for each SubDirectory in dirlist( Root & '\*' )

                    call ScanFolder(SubDirectory)

          next SubDirectory

end sub

Call ScanFolder('C:\datafiles') ;

In its most simplistic form, I have the following directory structure:

   C:\datafiles\20161101

          file1_20161101.csv

          file2_20161101.csv

    C:\datafiles\20161102

          file1_20161102.csv

          file2_20161102.csv

Once I have a filelist I need to do a few things with it:

For each FoundFile

1)  Extract the date.

2)  if filedate > LastLoadedFileDate,

3)  Load *, FileDate as FileDate

4)  Update vLastLoadedFileDate (set command?)

Next

So the end result would be comparable to:

For each FoundFile > vLastLoadedFileDate

File1Table:

      Load*,

     FileDate (probably wrapped in Date())

From: FoundFile.csv (File1_20161101.csv)

Join (File1Table)

     Load*,

     FileDate

From: FoundFile.csv (File1_20161102.csv)

etc.

Any assistance would be greatly appreciated!!

4 Replies
Gysbert_Wassenaar

Put this somewhere at the start of the script: SET vLastLoadedFileDate = ReloadTime();

And change the subroutine by adding an if statement to turn the name of the subfolder into a date and then compare it with the last reload time:

          for each SubDirectory in dirlist( Root & '\*' )

               if( date#('$(SubDirectory)','YYYYMMDD') > $(vLastLoadedFileDate) then

                    call ScanFolder(SubDirectory)

               endif

          next SubDirectory


talk is cheap, supply exceeds demand
joey_lutes
Partner - Creator
Partner - Creator
Author

Thanks so much!

Would this be an additional sub or should I integrate that / add it to the existing one?  Obviously I'll need to loop through the files as well....

Also, I'm still confused as to how this manifests - where my data actually gets loaded into a table, can you shed light on that?

Gysbert_Wassenaar

In this case 'change' means 'modify', not 'remove and replace with'.

The data from the files should get loaded by the subroutine into a table named FileList


talk is cheap, supply exceeds demand
joey_lutes
Partner - Creator
Partner - Creator
Author

Ok, thank you.

Here is my current code (I added the SET statement earlier)

Set vConcatenate = ;

sub ScanFolder(Root)

          for each FileExtension in 'csv'

                    for each FoundFile in filelist( Root & '\*.' & FileExtension)

                              FileList:

                              $(vConcatenate)

                              LOAD *, '$(FoundFile)' as SourceFile

                              FROM [$(FoundFile)] (txt, codepage is 1252, embedded labels, delimiter is ',', msq, header is 2 lines);

                              Set vConcatenate = Concatenate;

                    next FoundFile

          next FileExtension

          for each SubDirectory in dirlist( Root & '\*' )

               if( date#('$(SubDirectory)','YYYYMMDD') > $(vLastLoadedFileDate) then

                    call ScanFolder(SubDirectory)

               endif

          next SubDirectory

end sub

Call ScanFolder('C:\datafiles') ;

The script does indeed run, or seems to, with no errors however no table is generated.  Weird.  Files are present.

And - of course I need to complicate it more  

The name of the table should be the name of the file preceding the _ (file1_20161101 should create table file1:)

I would think I could get that value from a

subfield(FieldBaseName,'_',1) - just not sure how to insert that as the table name.


OUTPUT:

12:17:17 TT

App successfully saved.

---
Started loading data
---
---
App saved
---
Finished successfully
0 forced error(s)
0 synthetic key(s)
I really appreciate the help - this sub thing is new to me and I'm a bit lost.  I'm doing my due diligence on the net to try to learn.