Qlik Replicate - S3 Endpoint - Dynamic Folder Name

mp-sf
Contributor II

Good morning Qlik Community!

For once-daily Full Load operations, it would be *VERY* nice to have the capability of adding a dynamic folder name to the Target Folder field of an S3 Target Endpoint, or even to a Replication Task type that is targeted for S3 replication tasks.

For example, adding a dynamic DATE function so that each day the Replication Task will generate a new subfolder based on the date, and upload the daily parquet file to that subfolder.

For example:

  • Target Endpoint S3 Bucket Name = my-s3-bucket-name
  • Target Endpoint S3 Target Folder = dir1/dir2/dir3/subfolder/{DATE_VAR}

qr-s3-endpoint-dynamic.jpg

Would result in:

  • Full Path (from 12/25/2022) = my-s3-bucket-name/dir1/dir2/dir3/subfolder/12-25-2022/table_name/fileload00001.parquet
  • Full Path (from 12/26/2022) = my-s3-bucket-name/dir1/dir2/dir3/subfolder/12-26-2022/table_name/fileload00001.parquet
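
To make the request concrete, here is a minimal sketch of the substitution being asked for. It is not an existing Qlik Replicate feature; `{DATE_VAR}` is the placeholder proposed above, and the function name `dated_key` is hypothetical.

```python
from datetime import date


def dated_key(target_folder: str, run_date: date, table: str, filename: str) -> str:
    """Build the object key, replacing {DATE_VAR} with the run date as MM-DD-YYYY."""
    folder = target_folder.replace("{DATE_VAR}", run_date.strftime("%m-%d-%Y"))
    return f"{folder}/{table}/{filename}"


bucket = "my-s3-bucket-name"
key = dated_key(
    "dir1/dir2/dir3/subfolder/{DATE_VAR}",
    date(2022, 12, 25),
    "table_name",
    "fileload00001.parquet",
)
# Prints my-s3-bucket-name/dir1/dir2/dir3/subfolder/12-25-2022/table_name/fileload00001.parquet
print(f"{bucket}/{key}")
```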

 

The DATE capability is the need for my current use-case, but I can think of at least a couple more instances where a dynamic folder name would be incredibly worthwhile, especially in an environment as large as ours. 

 

An ALTERNATE suggestion (See Here:  Qlik Replicate - S3 Replication Task Type (plus dy... - Qlik Community - 2020407) would be an S3 Replication Profile option during New Task creation, where the Target Endpoint data would be treated more like files instead of Database Schema/Table/Column/etc... fields, and the Target Endpoint itself would simply be a pointer to a top level directory.

For example, in an S3-Replication-Profile task, I would select the Source Schema/Tables/Columns - but for the S3 Task type, instead of "Map to target table" it would have "Map to Folder" options (that would include the capability to add a dynamic formula field- ie: DATE).

qr-s3-replication-profile.jpg

For example:

  • Target Endpoint S3 Bucket Name = my-s3-bucket-name
  • Target Endpoint S3 Target Folder = dir1/dir2/dir3/subfolder
  • S3 Replication Task Settings Option:  Target Folder = {DATE_VAR}
    •  qr-s3-task-folder.jpg
  • -or, less desirable, a setting within each Table Mapping of each Replication Task-
    • Target Subfolder Name (instead of Table Schema/Table Name fields) = {DATE_VAR}
    • qr-s3-table-folder.jpg

Would result in:

  • Full Path (from 12/25/2022) = my-s3-bucket-name/dir1/dir2/dir3/subfolder/12-25-2022/table_name/fileload00001.parquet
  • Full Path (from 12/26/2022) = my-s3-bucket-name/dir1/dir2/dir3/subfolder/12-26-2022/table_name/fileload00001.parquet

 

Thanks for your time!

-mp

 

 

 

5 Comments
pedrobergo
Employee

Hi @mp-sf 

You can use a Global Transformation to rename the tables so that a dated subfolder is created for you, using the date and time functions available in expressions.

Create a Global Transformation rule that renames all tables with the following expression:

Date('Now', 'LocalTime') || '/' || $AR_M_SOURCE_TABLE_NAME

With each Full Load, Qlik Replicate will create a subfolder named with the date in YYYY-MM-DD format, followed by / and the table name.

pedrobergo_0-1673445835922.png

pedrobergo_1-1673445917296.png
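
To preview what the expression evaluates to outside Replicate, the sketch below runs the same string concatenation in SQLite through Python's `sqlite3` module. It assumes the expression follows SQLite syntax, which is how the `date('now', 'localtime')` call reads; the bound parameter `:table_name` stands in for `$AR_M_SOURCE_TABLE_NAME`, and `renamed_table` is a hypothetical helper name.

```python
import sqlite3

# Same shape as the rename expression, with a bound parameter standing in
# for $AR_M_SOURCE_TABLE_NAME (which only exists inside Replicate).
EXPR = "date('now', 'localtime') || '/' || :table_name"


def renamed_table(table_name: str) -> str:
    """Evaluate the expression in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        (value,) = conn.execute(f"SELECT {EXPR}", {"table_name": table_name}).fetchone()
        return value
    finally:
        conn.close()


# Prints today's local date, a slash, and the table name.
print(renamed_table("ORDERS"))
```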

 

Regards,

Pedro

mp-sf
Contributor II

@pedrobergo Thanks for that feedback, looks like it will do the trick.  I've got 2 quick follow-up questions...

 

With that Global Rule, my replication structure is as below (not including <target-bucket> in example):

  • Original:  <target-bucket>/<schema>.<table> 
  • Example:  ABC.123, ABC.456, ABC.789

 

  • Global Rule:  <target-bucket>/<schema>.DATE/<table>
  • Example: ABC.2023-01-11/123, ABC.2023-01-11/456, ABC.2023-01-11/789
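
A small sketch of why the schema prefix stays in front of the date, assuming the folder name is built as `<schema>.<table>` (the layout observed above) and that the Global Rule only rewrites the table-name part. Both function names are hypothetical.

```python
def target_folder(schema: str, table_name: str) -> str:
    # Layout observed in this thread: <schema>.<table name>
    return f"{schema}.{table_name}"


def apply_global_rule(table_name: str, run_date: str) -> str:
    # Mirrors the rename expression: date || '/' || source table name
    return f"{run_date}/{table_name}"


tables = ["123", "456", "789"]
original = [target_folder("ABC", t) for t in tables]
renamed = [target_folder("ABC", apply_global_rule(t, "2023-01-11")) for t in tables]

print(original)  # ['ABC.123', 'ABC.456', 'ABC.789']
print(renamed)   # ['ABC.2023-01-11/123', 'ABC.2023-01-11/456', 'ABC.2023-01-11/789']
```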

 

I'm not specifically mapping anything to use the source <schema> ID in the target name, and I don't see a setting that I can enable/disable for that.

 

So 2 questions:

  1. Is there a setting I can toggle to leave off the <schema> name, or is that going to require another Global Rule?
  2. Do you know of any way to modify the filename from "LOAD00000001.parquet" to something different?
pedrobergo
Employee

Hi @mp-sf 

To answer your questions:

1. For an S3 target, you cannot hide or remove the source schema.

2. You cannot change the name of the file.

However, you can run a post-processing command, such as a batch file or script, that executes right after each file is uploaded to the target and copies the file under another name. Please take a look at https://help.qlik.com/en-US/replicate/May2021/Content/Replicate/Main/Amazon%20S3/advanced_options_fi...
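
As a rough idea of what such a script could look like, here is a minimal Python sketch. How Replicate passes the uploaded file's location to the command is not shown in this thread, so the command-line arguments (bucket, object key, new base name) are assumptions, as are the function names. The copy uses the third-party `boto3` package, which must be installed and configured with AWS credentials separately.

```python
import posixpath
import sys


def renamed_key(src_key: str, new_name: str) -> str:
    """Return the key for the copy: same folder and extension, new base name."""
    folder, old_name = posixpath.split(src_key)
    extension = posixpath.splitext(old_name)[1]
    return posixpath.join(folder, new_name + extension)


def copy_object(bucket: str, src_key: str, dst_key: str) -> None:
    """Copy the object within the same bucket under the new key."""
    import boto3  # third-party; imported here so the helper above works without it

    boto3.client("s3").copy_object(
        Bucket=bucket,
        Key=dst_key,
        CopySource={"Bucket": bucket, "Key": src_key},
    )


# Assumed invocation: script.py <bucket> <object key> <new base name>
if __name__ == "__main__" and len(sys.argv) == 4:
    bucket, src_key, new_name = sys.argv[1:4]
    copy_object(bucket, src_key, renamed_key(src_key, new_name))
```

For instance, `renamed_key("ABC.2023-01-11/123/LOAD00000001.parquet", "orders_full")` gives `ABC.2023-01-11/123/orders_full.parquet`. The original file stays in place; deleting it afterwards would be a separate step.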

Regards,

Pedro

Meghann_MacDonald

From now on, please track this idea from the Ideation portal. 

Link to new idea

Meghann

NOTE: Upon clicking this link 2 tabs may open - please feel free to close the one with a login page. If you only see 1 tab with the login page, please try clicking this link first: Authenticate me! then try the link above again. Ensure pop-up blocker is off.

Ideation
Explorer II
 
Status changed to: Closed - Archived