Qlik Replicate - S3 Endpoint - Dynamic Folder Name

mp-sf
Contributor II

Good morning Qlik Community!

For once-daily Full Load operations, it would be *VERY* nice to be able to add a dynamic folder name to the Target Folder field of an S3 Target Endpoint, or even to have a Replication Task type aimed specifically at S3 replication.

For example, a dynamic DATE function would let the Replication Task generate a new subfolder each day, named after the date, and upload the daily parquet file to that subfolder.

E.g.:

  • Target Endpoint S3 Bucket Name = my-s3-bucket-name
  • Target Endpoint S3 Target Folder = dir1/dir2/dir3/subfolder/{DATE_VAR}

qr-s3-endpoint-dynamic.jpg

Would result in:

  • Full Path (from 12/25/2022) = my-s3-bucket-name/dir1/dir2/dir3/subfolder/12-25-2022/table_name/fileload00001.parquet
  • Full Path (from 12/26/2022) = my-s3-bucket-name/dir1/dir2/dir3/subfolder/12-26-2022/table_name/fileload00001.parquet
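
To make the request concrete, here is a minimal Python sketch of the expansion being asked for. `{DATE_VAR}` is the placeholder proposed above, not an existing Qlik Replicate feature, and `expand_target_folder` and `full_path` are hypothetical helper names used only for illustration.

```python
from datetime import date


def expand_target_folder(template: str, run_date: date) -> str:
    """Replace the proposed {DATE_VAR} placeholder with the run date (MM-DD-YYYY)."""
    return template.replace("{DATE_VAR}", run_date.strftime("%m-%d-%Y"))


def full_path(bucket: str, folder_template: str, table: str,
              filename: str, run_date: date) -> str:
    """Join bucket, expanded target folder, table folder, and file name."""
    folder = expand_target_folder(folder_template, run_date)
    return "/".join([bucket, folder, table, filename])


# Hypothetical run on 12/25/2022, using the values from the example above.
print(full_path("my-s3-bucket-name", "dir1/dir2/dir3/subfolder/{DATE_VAR}",
                "table_name", "fileload00001.parquet", date(2022, 12, 25)))
```

Each daily run would pass its own date, so the 12/26/2022 load would land under `.../subfolder/12-26-2022/...` without any change to the endpoint settings.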

 

The DATE capability is the need for my current use-case, but I can think of at least a couple more instances where a dynamic folder name would be incredibly worthwhile, especially in an environment as large as ours. 

 

An ALTERNATE suggestion (see here: Qlik Replicate - S3 Replication Task Type (plus dy... - Qlik Community - 2020407) would be an S3 Replication Profile option during New Task creation, where the Target Endpoint data is treated as files rather than as Database Schema/Table/Column/etc. fields, and the Target Endpoint itself is simply a pointer to a top-level directory.

For example, in an S3-Replication-Profile task, I would select the Source Schema/Tables/Columns, but instead of "Map to target table" the S3 task type would offer "Map to Folder" options, including the ability to add a dynamic formula field (e.g. DATE).

qr-s3-replication-profile.jpg

E.g.:

  • Target Endpoint S3 Bucket Name = my-s3-bucket-name
  • Target Endpoint S3 Target Folder = dir1/dir2/dir3/subfolder
  • S3 Replication Task Settings Option:  Target Folder = {DATE_VAR}
    •  qr-s3-task-folder.jpg
  • -or, less desirable, a setting within each Table Mapping of each Replication Task-
    • Target Subfolder Name (instead of Table Schema/Table Name fields) = {DATE_VAR}
    • qr-s3-table-folder.jpg

Would result in:

  • Full Path (from 12/25/2022) = my-s3-bucket-name/dir1/dir2/dir3/subfolder/12-25-2022/table_name/fileload00001.parquet
  • Full Path (from 12/26/2022) = my-s3-bucket-name/dir1/dir2/dir3/subfolder/12-26-2022/table_name/fileload00001.parquet

 

Thanks for your time!

-mp

 

 

 

5 Comments
pedrobergo
Employee

Hi @mp-sf 

You can use Global Transformations to rename the tables so that subfolders are created for you, using the date/time functions.

Please create a Global Rule Transformation that renames all tables with the following expression:

Date('Now', 'LocalTime') || '/' || $AR_M_SOURCE_TABLE_NAME

After each Full Load, Qlik will create a subfolder named with the date in YYYY-MM-DD format, followed by / and the table name.
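
A quick way to preview what the expression produces is to run it through SQLite, on the assumption that Replicate's expression builder follows SQLite syntax (which the `Date('Now', 'LocalTime')` and `||` forms suggest). In this sketch the table name is passed as a bound parameter standing in for `$AR_M_SOURCE_TABLE_NAME`; `preview_target_name` is a hypothetical helper, not part of Replicate.

```python
import sqlite3


def preview_target_name(source_table_name: str) -> str:
    """Evaluate the rename expression in SQLite.

    The ? parameter stands in for $AR_M_SOURCE_TABLE_NAME (an assumption
    made for this preview only).
    """
    conn = sqlite3.connect(":memory:")
    try:
        row = conn.execute(
            "SELECT date('now', 'localtime') || '/' || ?",
            (source_table_name,),
        ).fetchone()
        return row[0]
    finally:
        conn.close()


# Prints today's local date, a slash, and the table name.
print(preview_target_name("table_name"))
```

The slash inside the new table name is what makes the S3 target treat the date as a folder level above the table.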

pedrobergo_0-1673445835922.png

pedrobergo_1-1673445917296.png

 

Regards,

Pedro

mp-sf
Contributor II

@pedrobergo Thanks for that feedback; it looks like it will do the trick. I've got 2 quick follow-up questions...

 

With that Global Rule, my replication structure is as below (the examples omit <target-bucket>):

  • Original:  <target-bucket>/<schema>.<table> 
  • Example:  ABC.123, ABC.456, ABC.789

 

  • Global Rule:  <target-bucket>/<schema>.DATE/<table>
  • Example: ABC.2023-01-11/123, ABC.2023-01-11/456, ABC.2023-01-11/789
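
The layout above can be reproduced with a small sketch. It assumes, based only on the folder names observed here, that the S3 target builds each table folder as `<schema>.<table>` and that the Global Rule changes just the `<table>` part; `target_folder` is a hypothetical helper.

```python
def target_folder(schema: str, table: str, date_prefix: str = "") -> str:
    """Build the folder as <schema>.<table>.

    When date_prefix is given, it mimics the Global Rule by prepending
    '<date>/' to the table name before the schema is attached.
    """
    renamed_table = f"{date_prefix}/{table}" if date_prefix else table
    return f"{schema}.{renamed_table}"


for table in ("123", "456", "789"):
    print(target_folder("ABC", table), "->", target_folder("ABC", table, "2023-01-11"))
```

Because the rule only rewrites the table name, the schema prefix stays attached to the front and ends up as part of the date folder's name.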

 

I'm not specifically mapping anything to use the source <schema> ID in the target name, and I don't see a setting that I can enable/disable for that.

 

So 2 questions:

  1. Is there a setting I can toggle to leave off the <schema> name, or is that going to require another Global Rule?
  2. Do you know of any way to modify the filename from "LOAD00000001.parquet" to something different?
pedrobergo
Employee

Hi @mp-sf 

To answer your questions:

1. For an S3 target, you cannot hide or remove the source schema.

2. You cannot change the name of the file.

You can, however, run a post-processing command, such as a batch file or script, that is executed right after each file is uploaded to the target; that command can then copy the file to another name. Please take a look at https://help.qlik.com/en-US/replicate/May2021/Content/Replicate/Main/Amazon%20S3/advanced_options_fi...
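
As a rough illustration of that workaround, a post-processing script could derive a new object name from the uploaded one and then copy the file. The sketch below covers only the name derivation; how Replicate passes the uploaded file's path to the command, and the copy itself (for instance with the AWS CLI or an SDK), are left out because they depend on the setup described in the linked help page. `renamed_key` and the `daily_extract` name are hypothetical.

```python
import posixpath


def renamed_key(original_key: str, new_stem: str) -> str:
    """Return the key with its file name replaced by new_stem.

    The folder part and the file extension are kept unchanged.
    """
    folder, filename = posixpath.split(original_key)
    _, extension = posixpath.splitext(filename)
    return posixpath.join(folder, new_stem + extension)


# Hypothetical key, following the folder layout discussed in this thread.
print(renamed_key("ABC.2023-01-11/123/LOAD00000001.parquet", "daily_extract"))
```

Keeping the name logic in a small pure function makes it easy to check before wiring it into whatever command actually performs the copy.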

Regards,

Pedro

Meghann_MacDonald

From now on, please track this idea from the Ideation portal. 

Link to new idea

Meghann

NOTE: Upon clicking this link 2 tabs may open - please feel free to close the one with a login page. If you only see 1 tab with the login page, please try clicking this link first: Authenticate me! then try the link above again. Ensure pop-up blocker is off.

Ideation
Explorer II
 
Status changed to: Closed - Archived