Piping Data Straight into QVD files

Dalton_Ruer
Last Update:

Jun 13, 2023 7:15:35 AM

Updated By:

Dalton_Ruer

Created date:

Jun 13, 2023 7:15:35 AM

Destinations

The wonderful thing about Qlik Cloud Data Integration is that you can have your data flowing from so many sources to so many Cloud Data Warehouse platforms. You know, like Snowflake. Or Google BigQuery. Or Azure Synapse. Or Databricks. Or Amazon Redshift. Hydrating those targets and keeping them all fresh with Change Data Capture. You gotta love that.

For those analytically minded folks in the crowd, Qlik Data Integration offers the chance to open up so many new avenues to surface fresh data. Back in November I posted about the simplicity of combining the CDC nature of Qlik Cloud Data Integration directly in your Qlik Sense ap... It showcased the magic of utilizing the MERGE function in an application, which greatly simplified the Incremental Load process.

I know what some of you were thinking when you read the article... "It can't possibly get any easier than that." 

I also know that others were instead thinking ... "It would be so cool if Qlik would just pipe all that data directly to QVD files for me." 

Whichever crowd you might have been/be in, you are in luck. One of the destinations where all that sparkling fresh water, I mean data, can flow to is Qlik Cloud.  

CreatingProjectForQlikCloud.png

That's right, baby. All of the data, including the changes, can be sent directly to QVD files in Qlik Cloud.

Too Good to be True

I know that sounds too good to be true, so I'm going to walk you through how to do it and show you what the result is. Obviously, the first step is to choose Qlik Cloud as the Target Platform for your new Qlik Cloud Data Integration project, as pictured above.

Once you do, you will be prompted to select where to store the resulting QVD files. As pictured below, you can choose "Qlik managed storage" or "Customer managed storage." Qlik managed storage means that the QVD files will be created in the same Space that your project was created in, whereas selecting "Customer managed storage" will prompt you to identify an Amazon S3 bucket where you wish to store the output files.

Regardless of the location you choose to store the QVD files, you will be prompted to identify an Amazon S3 bucket where the initial load and the Change Data Capture (CDC) changes can be written to. You will need to input the Bucket Name, the Access Key and the Secret Key. That connection is like others, so once you create it, you will be able to select it in the future for other projects if you have any. 

MustSetupS3BucketAsStaging.png

Once you configure that, you will see a project very similar to all others you have done, or those you have seen in my previous posts about Qlik Cloud Data Integration. You can name them, Prepare, and Run them in the exact same manner as other platforms. 

LandingAndStorage.png

Amazon S3 Staging Area

If you use any type of tool to browse your Amazon S3 bucket, you will see folders in much the same structure as you will see in your Cloud Data Warehouses. Each Dataset will be represented in a similar fashion as well, with a folder for its name and one for the changes, suffixed _ct. Below you will notice that after I ran the full load I can see a CSV file for the Patients dataset, but the change table version has no other structures underneath.

FullLoad_InS3Location.png

Fifteen minutes after doing the full load I went and made changes to the Source data. The Change Data Capture process read the logs for that source, and voila, I began seeing other CSV files show up in my Amazon S3 bucket under that _ct folder. You can tell by the timestamps when the changes were processed.

MultipleChangesInS3.png
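
If you would rather script that inspection than click through a bucket browser, here is a minimal sketch in Python. It works only on a list of object keys, however you obtained them. The key layout, the file names, and the `group_keys` helper are assumptions made for illustration; only the `_ct` suffix comes from what I described above, so check your own bucket before relying on it.

```python
from collections import defaultdict

CT_SUFFIX = "_ct"  # change-folder suffix described in the post


def group_keys(keys):
    """Split object keys into full-load files and change files per dataset.

    Assumes (hypothetically) that the first path segment is the dataset
    folder and the last segment is the file name.
    """
    full_load = defaultdict(list)
    changes = defaultdict(list)
    for key in keys:
        parts = key.split("/")
        top, filename = parts[0], parts[-1]
        if top.endswith(CT_SUFFIX):
            dataset = top[: -len(CT_SUFFIX)]
            changes[dataset].append(filename)
        else:
            full_load[top].append(filename)
    return dict(full_load), dict(changes)


# Hypothetical keys, invented for this example.
example_keys = [
    "Patients/LOAD00000001.csv",
    "Patients_ct/batch-a/0001.csv",
    "Patients_ct/batch-a/0002.csv",
    "Patients_ct/batch-b/0001.csv",
]
full_load_files, change_files = group_keys(example_keys)
print(full_load_files)
print(change_files)
```

Right after a full load, `change_files` would simply be empty for the dataset, which matches the empty _ct folder I saw before making any source changes.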

Pop Quiz

Rather than me telling you, can you guess (or even take an educated guess) as to why 2 of the folder structures in the Amazon S3 bucket for the _ct for Patients have multiple files, while 2 have only 1 file? Seriously, look at the image and try to guess.

While you are thinking ... Here is an example of what you will find in those change CSV files. 

ChangeFilesLookLikeThis.png

If you actually took time to think, hopefully you arrived at the conclusion that it was probably the "Change processing interval" I had scheduled for my project. Just like all other project types, the changes are accumulated in the Landing zone prior to being "batched" together in the Storage zone.

LandingProcessingTime.png
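
One way to picture that batching is the small Python sketch below. It is not how Qlik implements it internally; it only assumes that change files are grouped into fixed windows counted from a start time. The function name `files_per_interval`, the timestamps, and the 15-minute interval are all made up for the example.

```python
from collections import Counter
from datetime import datetime, timedelta


def files_per_interval(change_times, start, interval_minutes):
    """Count how many change files fall into each processing window."""
    window = timedelta(minutes=interval_minutes)
    counts = Counter()
    for t in change_times:
        index = (t - start) // window  # whole windows elapsed since start
        counts[start + index * window] += 1
    return dict(sorted(counts.items()))


# Hypothetical arrival times of change files in the Landing zone.
start = datetime(2023, 6, 13, 9, 0)
arrivals = [
    datetime(2023, 6, 13, 9, 2),
    datetime(2023, 6, 13, 9, 10),
    datetime(2023, 6, 13, 9, 20),
    datetime(2023, 6, 13, 9, 31),
    datetime(2023, 6, 13, 9, 40),
]
batches = files_per_interval(arrivals, start, interval_minutes=15)
for window_start, count in batches.items():
    print(window_start.strftime("%H:%M"), count)
```

With these invented arrivals, some windows hold two files and one holds a single file, which is the same pattern as in the pop quiz.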

Let's Get to the QVD file part of this post

Important Note: The QVD files generated should be considered typical Stage 1 QVD files with the raw values. Within the Qlik Cloud Data Integration project for QVDs, unlike with other Storage Platform targets, you cannot transform data in them or build data marts. The "why" is simple ... Qlik Cloud Data Integration is formed around the concept of pushing all work down to the underlying platforms. There is simply no way to graphically let you drag/drop and create all of the variations of how you might want your Qlik Script to work for your environment. Today you might want to concatenate tables, tomorrow you may need to create a linked table. You very well may already have a lot of Qlik Script files (QVS) with subroutines to do the work you need done that it simply wouldn't know about. Long story short, it's a good thing that it focuses on the raw delivery of the data and the incremental changes, and allows you to control the other layers.

If you added your project in a "clean" Space like below it's super easy to see the output QVD files.

QVDInSpacePurposely.png

If you create the project in your Personal space and happen to have hundreds and hundreds of files like the Qlik Dork, they don't immediately jump out at you. 

QVDInPersonalSpace.png

But that's no reason to rip your bow-tie off, just use the Search feature and voila ... you can now see the file that was created. 

QVDSearch.png

Like any QVD in your Catalog, you have the ability to Open the Dataset to view the Metadata about it. However, before I press the button to do that, I want you to examine the file name closely. Notice that the friendly name is "aaDimension_Patients.qvd", which matches our Dataset, while the name slightly above it is the more complete name with a subfolder structure.

OpenDataSet.png

The "path" part of the name is configurable via the project settings. Notice that I had accepted the default, which included my very long project name, but I could have opted to place the QVD in the "Root" of the Space with no folder structure, or given it another Folder structure. The name also includes the name for the Landing zone, which I had also made very long. All this to say, think carefully about your naming convention.

SettingsForName.png

Note: It's not really a path structure. All of the files will appear in your DataFiles connection for the Space, but the "path" is reflected in the actual name, so that should you have different projects, your QVD files won't be overwriting each other. For a dork like me, that's important.

SelectingDataFiles.png

Metadata

Sorry to have taken a side route when hovering over that "Open dataset" button above. Let's go ahead and press it. Voila, as expected, the data from our Qlik Cloud Data Integration project was written right into the QVD file.

PatientFileMetadata.png

What do you need to do in order to handle the Incremental Load changes? Not a thing. At the appropriate Change processing intervals you defined, the changes will automatically be included in the QVD file(s) for every Dataset. 

MetadataAfterChange.png
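
If you are curious what "including the changes" means conceptually, here is a rough Python sketch of applying a batch of inserts, updates, and deletes to a keyed table. It is purely illustrative and says nothing about how the QVD files are actually maintained; the `apply_changes` helper, the `op` values, and the sample rows are all hypothetical.

```python
def apply_changes(rows, changes, key="id"):
    """Return a new list of rows with the changes applied in order."""
    by_key = {row[key]: dict(row) for row in rows}
    for change in changes:
        record = change["row"]
        if change["op"] == "delete":
            by_key.pop(record[key], None)  # ignore deletes for unknown keys
        else:
            # "insert" and "update" both upsert the record
            by_key[record[key]] = dict(record)
    return list(by_key.values())


# Hypothetical current contents and one batch of changes.
patients = [
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": "Bob"},
]
batch = [
    {"op": "update", "row": {"id": 2, "name": "Robert"}},
    {"op": "insert", "row": {"id": 3, "name": "Cy"}},
    {"op": "delete", "row": {"id": 1}},
]
refreshed = apply_changes(patients, batch)
print(refreshed)
```

The point of the project type described here is that you never have to write anything like this yourself; the sketch only shows the kind of bookkeeping that is being handled for you.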

[Disclaimer: This post isn't meant to spark a debate as to whether sending data directly to Qlik QVD files is a good or bad thing. It's not intended to suggest that you don't need a Data Warehouse to surface data via other methods. It is ONLY INTENDED to be used to demonstrate HOW TO take advantage of the functionality if that is your desire.]
