STT - Exploring Qlik Cloud Data Integration

Troy_Raney
Digital Support

Last Update:

May 17, 2024 2:33:14 AM

Updated By:

Troy_Raney

Created date:

May 17, 2024 2:33:14 AM

Environment

  • Qlik Cloud Data Integration

 

Transcript

Hello everyone and welcome to the May edition of Techspert Talks. I’m Troy Raney and I’ll be your host for today's session. Today's Techspert Talk session is Exploring Qlik Cloud Data Integration with our own Tim Garrod. Tim, why don't you tell us a little bit about yourself?
Yeah, hi Troy. I’m head of Cloud Data Transformation in CDC here at Qlik, which means I’m in the product organization and have a team of product managers. Our focus area is what we would sort of consider heritage Qlik products. So, that's Qlik Replicate, Compose, and the data Pipeline capabilities that were built in Qlik Cloud Data Integration. My history has always been in data. I used to work in financial services. I ran data teams and warehouse teams there; then I moved into the software industry with Oracle; and I've spent the last five years here at Qlik.
Great, thanks. Today we're going to be talking about what Qlik Cloud Data Integration can do; some of the differences from Replicate; how it can be used; and we'll be looking a lot at Cloud data architecture and all the pieces that are involved there. Now Tim, for people who are familiar with Qlik Replicate but aren't that familiar with Qlik Cloud Data Integration, how does it all work?
Yeah, so let's give them a little bit of a visual here, right? So, many of our customers are leveraging Replicate today for all types of mission-critical workloads; and as those customers will know, it's a log-based change data capture (CDC) data replication solution, as you can see here; capturing from sources and delivering to heterogeneous targets, right? 40-50 sources, 40-50 different targets. What we've done in our Cloud environment is give you the ability to leverage that capability through our Cloud. Replicate is the foundation of what we do for pipelining, and so you can deploy it on-premise. We also have the ability to deploy it in the Cloud, and what this really means is that we're not deploying Replicate in the Cloud; we want Replicate, or the replication engine, to be close to your sources, because our customers have large data volumes that they need to move with low latency. That Gateway you see down there is a core part of our Cloud architecture. While you will (you know) deploy or design your Pipelines in the Cloud, the Gateway is a component that you deploy on-premise that allows us to move data from Source to Target; and then we support a couple of different use-cases. We support 90% of the same sources that we support in Replicate: relational databases, SAP, Mainframe. But what we've done is extend the Source capability to include SaaS applications. So, these are the ServiceNows, the Workdays, the HubSpots, the Salesforces of the world, where you're accessing data through REST APIs. And the other thing that we've done is extend what we do in the Target; we've added Data Transformation capability. Movement is very similar, but the architecture is different, and I think we're going to talk about that a little bit later, where we automatically manage a Type 2; and we've really built this to support the modern Data Lakehouse environment and the Medallion architecture.
So, ingestion supports an automated Bronze Layer (if you will); and then Transformations are all pushed down, what we would call ELT, and these are against your popular Cloud Warehouse or Lakehouse targets. So, a Snowflake, a Databricks, BigQuery, Redshift, Fabric, etc. So, that's sort of in a nutshell what Qlik Cloud Data Integration is. It's around Data Movement and Data Transformation as an integrated part of a Pipeline.
Thanks. Well, this is all very theoretical. Can you show us what a Data Pipeline looks like?
Of course, yeah.
All right. So, right now we're looking at your Qlik Cloud tenant and the Data Integration section. What are we looking at here with these data projects?
Yeah. So, a Project is a collection of tasks, and the tasks strung together make a Pipeline. Now we do have 2 different types of projects. We have what's called a Replication Project that supports a pure data replication-only scenario.
Okay.
And then we have our Pipeline Projects where, as you can see here, they're predicated on a specific Target, such as Snowflake, Databricks, or SQL Server.
Can we take a look at one of those Pipelines?
Yeah.
All right. So, we got a lot of layers here, what's happening?
Yeah. So, this is a Pipeline Project. It's (as I said) on Snowflake, and we have different tasks like I mentioned. So, here you can see we have different types of sources that we're ingesting.
Okay.
We're going to (you know) automate the ingest process for you, very much like we do with Replicate. This initial set of tasks is doing what we call Landing.
Okay.
So, we have a very explicit architecture that we built for Cloud Warehouses that uses what we call a Delayed Merge. You'll hear the terminology Deferred Merge out there as well, but it's really built around FinOps, cost, and performance. So, it's slightly different from what Replicate does.
So, I see all those different sources there; are they doing a Full Load?
We have different capabilities based off of what you need to do, right? So, for example: this SQL Server Source here, we're actually doing log-based change data capture, so…
Okay.
For all of our sources, we can do what you're calling a Full Load; which, if people aren't familiar with the Replicate terminology, is also referred to in the industry as an Initial Load, right? Instantiating the target. So, we can do that, but we can also do either Incremental or log-based CDC processing. So, for relational databases, your SAPs, your Mainframes of the world, we have all these log-based capabilities for low-latency (you know) incremental processing.
Yeah.
And then for things like Shopify, which are API-based, that's where we do what we refer to as Incremental Load, and the terminology is different because a lot of SaaS applications and REST APIs have a rate limit. And so, we will adhere to that rate limit. So, that may be more of an every 10 or 15 minute incremental refresh.
So, does all this end up in Snowflake at the end of the Pipeline here?
Yeah, in this case it's Snowflake. It could be (you know), like I said earlier, Databricks or another platform, but we land data in Snowflake. These layers here with the database icon are the long-term persisted storage layer.
Okay.
So, this is where we manage and curate the Type 2; so, history of data. And then beyond that, we have the ability to add in Transformations and also an automated Data Mart. These icons here represent a different type of task. You can see that here these additional tasks we have.
Yeah, okay.
Yeah, and so, each task type has a specific function: am I moving data? Am I storing the history long term? Am I Transforming it into fit for purpose? or am I automating a Data Mart pattern?
All right. Well, can you walk us through how to build one of these from scratch?
Yeah, yeah. So, let's jump back out to a blank project here that I have and show you how to start this process.
So, what's the difference between Onboarding and Registering?
Yeah, it's a good question. So, we obviously have all these capabilities that we've mentioned about moving data: getting it from those sources, loading it; but you may have a different technology loading data into your Snowflake; maybe you're using Stitch; maybe you're using Talend; or maybe you've got Python scripting. The Register Data basically says: Look, I’ve already got data in my target platform and I want to use it in a Pipeline and Transform that data; whether it's on its own or with data that we've loaded as well, through this onboard process.
Okay. So, Onboarding is moving data to make it more readable, and Registering data is taking data that's already on-boarded and Transforming it or doing something with it. So, I guess we should start with Onboarding?
Right. And the idea is we're going to automate this processing. So, we come in and we're going to be able to select our connection. Now if I don't have a connection already created (you know) to our Source system, this is where I create a connection, right? And you can see all the different types of applications that we can access: databases, etc.
Right.
A really wide variety.
Right. This is what you're mentioning, 90% of the sources available in Replicate plus the Cloud-based ones.
Yeah, yeah. And on the 90% in Replicate, it's probably a question people might ask: there are some of what we would consider almost Legacy sources in Replicate, VSAM and Informix, things like that, that we don't support currently in our Cloud environment.
Since we're in the Cloud, do all these sources need to be in the Cloud, or can they be on-premise as well?
No they can be on-premise or in the Cloud. So, if we think back to that slide where I talked a little bit about a Gateway. That's a component the customer can deploy in their on-premise environment; it actually connects outbound to our Cloud over 443, so it's a secure outbound connection; then it's a mutually authenticated websocket which allows two-way communication; the channel is opened up outbound which is really important from a security perspective; and that Gateway allows us to connect locally to your source systems, without having to do things like manage SSH tunnels; without having to have our Cloud reach in and pull data out of your data center. The sources can be on-premise, the targets can be on-premise.
Okay.
The command and control is done through our Cloud.
All right. So, in this example, what's our Source?
Yes, we'll do a SQL Server like we showed earlier, and the design process is the same regardless of the Source. So, I’m going to ask it to show me Tables and Views; and let's pull a schema here: my sales schema. And this is interacting with that Gateway, right? So, it's doing the communication here to get the metadata. We're just going to grab everything, and note how some say CDC and one of them says Reload. So, we have the intelligence for all the sources that we leverage to know whether we can support CDC for an object or not. If I only wanted to do a Full Load, I could do this reload and compare process. Otherwise, I’m going to use CDC, which is the preferred mechanism; and then it tells me: hey, this is the Pipeline we're going to build for you.
All right. So, why do we have two Pipelines here?
Yeah. So, remember I said we had those two types of objects? One that was Reload only and one that's CDC?
Yeah.
So, when we have that type of scenario, customers want to be able to schedule the reload operation to occur, maybe once a day, maybe once every couple of hours; and then CDC is going to run all the time. So, we build two sets of processes on purpose, so that you can manage them separately.
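The split Tim describes can be sketched in a few lines of Python. This is an illustration only, not how Qlik Cloud Data Integration is implemented: the `SourceTable` class, the `supports_cdc` flag, and the table names are all hypothetical, and in the product the CDC-or-Reload decision is made per source object by the tool itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceTable:
    name: str
    supports_cdc: bool  # hypothetical flag; the product determines this per object


def plan_tasks(tables):
    """Split the selected objects into a continuous CDC task and a scheduled reload task."""
    plan = {"cdc": [], "reload": []}
    for table in tables:
        task = "cdc" if table.supports_cdc else "reload"
        plan[task].append(table.name)
    return plan


# Made-up selection from a "sales" schema: two tables and one reload-only object.
selected = [
    SourceTable("sales.customers", True),
    SourceTable("sales.orders", True),
    SourceTable("sales.order_summary", False),
]

plan = plan_tasks(selected)
print(plan)
```

Keeping the two lists separate is what lets the reload task carry its own schedule while the CDC task runs continuously.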
Okay. So, it's built out the reload and CDC tasks, what's the next step?
In our Cloud, we have a methodology that we call Prepare and Run. The first thing we can do is come in and prepare our tasks. Now we could go in and edit these tasks as well; we can show that in a minute. But what Prepare does is actually create the target artifacts: database tables or objects, any views that we're leveraging as part of our architecture, things like that are created for you automatically. The other thing that we're doing here is writing a lot of this information to the Catalog inside Qlik Cloud, which is obviously your conduit to getting that data into Analytics. Also, as we go through a Pipeline, which could be complex with Transformations, we're unraveling what the lineage looks like and we're sending lineage to the Catalog as well.
Oh wow.
So, now we have all our artifacts created, and we can run our tasks. I can run just my CDC; I can run the reload task (you know); I can select the objects to run here; and it's going to trigger them. Now the reload, I may want to schedule. So, all of these tasks have the ability to come in and add a schedule to them as well.
Oh nice.
So, maybe I want to run this hourly, right? Every hour; or, say, every 3 hours, and I can create that schedule for this task as well.
Okay. What's inside Storage?
Yeah. So, let's trigger one of the Storages to run real quick. So, this is where we're doing the long-term sort of persistent storage. So, you can see here our Full Load status, right? We're processing the tables. All these guys are complete already; and this is all pushed down inside Snowflake. So, if I want to see like the details of what's happening; I can come in and (you know) take a look at those details: how long it took to process (obviously it's a small data set for the demo purposes).
It's cool you're able to monitor the task like that. This looks a lot like Replicate. What's the difference between Replicate and Qlik Cloud Data Integration?
Yeah, one of the big differences, if we come over to Design, is the ability to manage a Type 2; so, history of data. And this is really important when you start to think about Analytics. So, I can for each object determine “do I want history or not?” The big difference is in the architecture and the tracking of that Type 2 history. So, I’ll give you an example of why it's important. This is our SQL Server Source.
Okay.
This is all completely mock data. Harry Kane (actually a real person), and he's my scoring vice president; he's in Berlin right now, but let's move him to Jamaica. We're just going to update the Source here; we set the country to Jamaica. But if we look at the data inside Cloud Data Integration, where before we had, say, one customers object, right? Now you've got this history, this live history. So, these are different perspectives of data that we curate for you automatically. But here you've got a From timestamp and a To timestamp. So, that's the point in time at which changes occurred.
So, we'll be able to actually see when he was living in what city?
Yeah, exactly. So, if we filter on the customer ID, here's Harry Kane, Berlin and Jamaica.
That's really cool. So, that's that historical data, that Type 2 you were talking about, so we can see what changed and when?
Exactly.
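A minimal sketch of the Type 2 idea from the demo, assuming a simple in-memory list of row versions. The column names (`customer_id`, `location`, `from_ts`, `to_ts`) and the dates are invented for illustration; the real storage layout in Qlik Cloud Data Integration is not shown in the session.

```python
from datetime import datetime

OPEN_END = None  # marks the current, still-valid version of a row


def apply_change(history, customer_id, new_values, changed_at):
    """Close the current version of the customer and append the new version."""
    for row in history:
        if row["customer_id"] == customer_id and row["to_ts"] is OPEN_END:
            row["to_ts"] = changed_at
    history.append(
        {"customer_id": customer_id, **new_values, "from_ts": changed_at, "to_ts": OPEN_END}
    )


def as_of(history, customer_id, ts):
    """Return the version of the customer that was valid at time ts, if any."""
    for row in history:
        if row["customer_id"] != customer_id:
            continue
        if row["from_ts"] <= ts and (row["to_ts"] is OPEN_END or ts < row["to_ts"]):
            return row
    return None


# Illustrative starting state: one open-ended version for the mock customer.
history = [
    {
        "customer_id": 1,
        "name": "Harry Kane",
        "location": "Berlin",
        "from_ts": datetime(2024, 1, 1),
        "to_ts": OPEN_END,
    }
]

# The source update from the demo: the customer moves to Jamaica.
apply_change(history, 1, {"name": "Harry Kane", "location": "Jamaica"}, datetime(2024, 5, 1))

for row in history:
    print(row["location"], row["from_ts"], row["to_ts"])
```

The update never overwrites the Berlin row; it only sets its To timestamp, which is what makes the "where was he living, and when?" question answerable later through a lookup such as `as_of`.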
Very cool. So, this is data from SQL. Does it work differently with Cloud sources?
A little bit, yeah. So, we still provide the history. Let me jump back out here and let's go to Connections. And I’m going to filter down to the Shopify connection that we were looking at earlier.
Okay.
SaaS sources are REST API based. So, the difference here is that an API provides a JSON payload, right? We generate metadata. The important thing here is: every customer's implementation, even of a SaaS application, could be different. What you'll see here is: we've completed 2 for this metadata generation; 2 objects. But then down here, you can see many more. So, let's use Customers as an example.
Okay.
In Shopify, Customers gives you a default address. It also gives you an array of addresses that the customer has used. So, rather than just dumping that in Snowflake, or whatever your target is, and making the customer normalize that, we do that work for you as part of the loading process.
Wow. So, when you create a SaaS data connection, it's customizing that to whatever the implementation requires and automates that normalization process, so the customer doesn't have to?
Exactly.
That's so cool.
Right? If we just delivered it as (you know) a variant as some solutions do; you would have to go in and write the code to pull out all these items; and so, we're going to do that for you as part of our automation.
That's fantastic. Because yeah, I’m not a JSON person. That immediately starts to overwhelm me, just the idea of it.
Right.
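To make the normalization step concrete, here is a small sketch of turning one nested customer payload into a parent row plus child address rows. The payload shape and field names are illustrative assumptions, not a verified copy of the Shopify API response, and the function is a stand-in for what the product automates during loading.

```python
def normalize_customer(payload):
    """Split one nested customer payload into a flat customer row and address rows."""
    # Everything except the address array stays on the parent row.
    customer = {key: value for key, value in payload.items() if key != "addresses"}

    # Flatten the single default address into prefixed columns on the parent row.
    default_address = customer.pop("default_address", None) or {}
    for field, value in default_address.items():
        customer[f"default_address_{field}"] = value

    # Each entry of the address array becomes its own child row, keyed back to the customer.
    addresses = [
        {"customer_id": payload["id"], "position": position, **address}
        for position, address in enumerate(payload.get("addresses", []))
    ]
    return customer, addresses


# Hypothetical payload for illustration.
payload = {
    "id": 42,
    "email": "a@example.com",
    "default_address": {"city": "Berlin"},
    "addresses": [{"city": "Berlin"}, {"city": "Kingston"}],
}

customer, addresses = normalize_customer(payload)
print(customer)
print(addresses)
```

Delivering the payload as a single variant column would leave this unpacking to whoever queries the data; doing it at load time gives the target two relational tables that can be joined on `customer_id`.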
Before, we were looking at the design and monitor of the task, we could actually monitor everything that's happening with one task, but you've got lots of tasks. How do you keep track of them all?
Yeah. So, we have this concept called Monitor Views.
Okay.
We just filter down to the ones I own, as opposed to the ones that I can see. I’ve got a view here called my Snowflake Test.
So, is that a custom monitor?
Yeah. Within an organization, different users or different personas may be tasked with monitoring different types of tasks. A good example of that is: whoever is doing the Onboarding or loading process might be a DBA, because it needs elevated permissions on the Source. So, they care about their Db2 tasks or their Oracle tasks, because those are the systems that they manage. So, what this allows me to do, if we look at our filters over here on the left, is filter and save a perspective of tasks that I care about. It could be based off of a specific Type of task; it could be based off of a specific Source; it could even be based off of Lineage. If a Data Mart is important to me, I could say: show me everything that impacts that Data Mart within the monitor view.
Can you walk us through an example?
Yeah. So, now we can see a list of tasks here, right? This guy's in Error. If I select this task though, what I can see down here is operational lineage: CDC streaming has a problem, but that is going to impact these Public Employee views. It's going to impact this Data Mart task.
So, you really get context and can see what's impacted Downstream? That's really cool.
Yeah, and then I can even maybe; this is my impact filter. And so, now using that and these other filters that we have up here; it's reduced even further the number of tasks I need to look at, and it's everything that that task is impacting.
Okay.
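The impact filter amounts to walking the lineage graph downstream from the failing task. A minimal sketch, assuming lineage is available as a mapping from each task to the tasks that consume its output; the task names and the `LINEAGE` structure are invented, and the product's own lineage model is not described in the session.

```python
from collections import deque

# Hypothetical lineage: each task maps to the tasks that read its output.
LINEAGE = {
    "cdc_streaming": ["storage_employees"],
    "storage_employees": ["public_employee_views"],
    "public_employee_views": ["data_mart"],
    "shopify_landing": ["storage_shopify"],
    "storage_shopify": [],
    "data_mart": [],
}


def downstream(lineage, failed_task):
    """Return every task reachable from failed_task, in breadth-first order."""
    impacted = []
    queue = deque(lineage.get(failed_task, []))
    while queue:
        task = queue.popleft()
        if task in impacted:
            continue  # already visited through another path
        impacted.append(task)
        queue.extend(lineage.get(task, []))
    return impacted


impacted = downstream(LINEAGE, "cdc_streaming")
print(impacted)
```

Tasks on the unrelated Shopify branch never appear in the result, which is why the filtered view shrinks to only what the failing task affects.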
If I’m the guy who's responsible for this, I may want to know the other tasks that are being impacted. And now I could save this as another view. So, this is my CDC impact perspective; and I can save this in a space, which then allows you to leverage it, Troy.
That's fantastic, and then you can customize it however you need.
Yeah, exactly.
Now you saved it to a space. How do you handle Access Control in spaces?
Yeah. So, this is a really important concept in Qlik Cloud. Let's do it through the Connections layer. I can apply a filter here to a specific space, like my Techspert Space here; and then I can look at the space details and the members of the space. I can add members; I can grant Roles to those users. Spaces allow for a couple of different things. Like I said, it's role-based access control. So, I could set you up, Troy, as a viewer only, so you could view tasks; you could view things from a monitoring perspective, because you need to know what's going on, but you don't need to edit or manage or actually operate the tasks. I could set an operator up. And where we showed data earlier in the Type 2 history, you're probably not going to want that turned on in production; you can turn that off, so that nobody has access to that role in a production space. So, spaces give us role-based access control; like I could say: look, for my SAP connections, only certain users can actually use that data in a Pipeline. And then I can also use spaces as Dev, QA, and Prod. So, these are really important concepts around security within our Cloud.
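As a rough illustration of role-based access control scoped to spaces, the sketch below checks whether a user holds a permission in a given space. The role names, permission names, and space names are all hypothetical; they are not the actual Qlik Cloud role list.

```python
# Hypothetical roles and the permissions each one grants.
ROLE_PERMISSIONS = {
    "viewer": {"view_tasks", "view_monitoring"},
    "operator": {"view_tasks", "view_monitoring", "run_tasks"},
    "editor": {"view_tasks", "view_monitoring", "edit_tasks"},
    "data_viewer": {"view_data"},
}

# Hypothetical spaces used as environments, each with its own member-to-roles mapping.
SPACES = {
    "dev": {"troy": {"viewer", "data_viewer"}, "tim": {"editor", "operator", "data_viewer"}},
    "prod": {"troy": {"viewer"}, "tim": {"operator"}},
}


def can(space, user, permission):
    """True if any role the user holds in the space grants the permission."""
    roles = SPACES.get(space, {}).get(user, set())
    return any(permission in ROLE_PERMISSIONS[role] for role in roles)


print(can("prod", "troy", "view_tasks"))  # viewer role allows monitoring
print(can("prod", "troy", "run_tasks"))   # viewer role cannot operate tasks
print(can("prod", "tim", "view_data"))    # nobody holds data_viewer in prod
```

Because roles are assigned per space, the same user can preview data in a development space and be limited to operating tasks in production.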
So, all this is preparation (you know) governing data so, it's prepared for Analytics side. What layer is actually used in Analytics? Is that what the Data Mart is?
Yeah, I mean you could use any layer that you want really. So, I’ve got ingestion now; I could be doing ingestion only, and I could use the data sets, the customers, the customer history; I could load that up into Qlik Sense, use the associative engine there, and do all the Analytics. I can leverage Transformation capabilities to do fit-for-purpose Transforms, and that could be my output. Like here, I’ve got some data related to a Shipping Domain. Or I could automate the Data Mart. We don't force a specific architecture on the customers. It's up to the customer to decide how they want to implement their data warehouse or data Lakehouse. Now the Mart most commonly will be used as a consumption point, but it doesn't have to be.
Very cool. Now Transforming is a pretty powerful feature that we might need to set up an entire session just to talk about it; but can we take a peek at it and look at that quickly?
Yeah, Transform is a big focus for us. I think the important thing to know is that we do Transformations through what's known as an ELT process. So, it's all pushed-down SQL, but we give you different ways to actually Transform data. So, I can use what we call Transformation flows, and I’ll give you a sneak peek of a Transformation flow here. And what you can see here is drag and drop. I can drag these components; like if I wanted to filter my shipping zones first, I could drag them in and I could come in and say: give me anything where the name is not null, and save that object. It's a drag and drop capability to design Transformations. You have a data preview; so, you can see that data.
Oh nice.
As you're going through every step to see what the component's doing. You can see the SQL that's being generated as well, so you know that what's happening under the covers is what you're expecting.
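The filter step in the demo ("give me anything where the name is not null") can be pictured as a component that emits SQL for the target engine to run. The sketch below uses Python's built-in `sqlite3` as a stand-in for the warehouse; the helper function, table, and view names are hypothetical, and the SQL that the product actually generates is not shown in the session.

```python
import sqlite3


def filter_not_null(source, column, target):
    """Generate pushed-down SQL for a 'column is not null' filter component."""
    return (
        f'CREATE VIEW "{target}" AS '
        f'SELECT * FROM "{source}" WHERE "{column}" IS NOT NULL'
    )


# In-memory database standing in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shipping_zones (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO shipping_zones VALUES (?, ?)",
    [(1, "Domestic"), (2, None), (3, "International")],
)

sql = filter_not_null("shipping_zones", "name", "shipping_zones_filtered")
print(sql)  # the generated statement a user could inspect

conn.execute(sql)  # the filter runs inside the database engine, not in Python
rows = conn.execute(
    "SELECT id, name FROM shipping_zones_filtered ORDER BY id"
).fetchall()
print(rows)
```

The point of the ELT pattern is visible here: the data never leaves the database, and only the generated statement is sent to it.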
Yeah, for coders who want to be able to see the code.
Yep. So, we have that capability. And then we also have the ability for you to bring your own SQL; and then we automate around that custom SQL. But like you said, Transformation is a broad topic, and maybe we could do a deep dive on that next time. One way we do that is through a Transformation, and the other way we do it is: we have a specific task that will automate the Data Mart; and that does everything you need for Data Mart processing: it manages Surrogate Keys; it manages lookups; it automates denormalization to build your Dimensions; late arriving Dimensions; so, there's a lot to unpack there.
Yeah, that - I’m not sure if we have time for, but this is really cool. We've covered a lot. Is there anything else you wanted to address?
No, but Cloud Data Integration is sort of the next step for us, right? It's the evolution of what we do with Replicate. For customers that use Replicate, if you wanted to extend Replicate with Qlik Cloud Data Integration for Transformation workloads, that register capability that we talked about does have specific configurations to support Replicate. So, we do support customers even in a hybrid configuration. It's not a forced migration.
Great. Well, now it's time for Q&A. Please submit your questions through the Q&A tool on the left hand side of your On24 console. I see some questions have already come in; so, I’ll just read them from the top. First question: how many data sources can be connected with the free Qlik Cloud Data Integration edition?
Okay. So, that's a bit of a trick question. Currently there is no free edition of Qlik Cloud Data Integration. We are looking to work toward trials, but if there's anybody out there that's interested in trying Qlik Cloud Data Integration, my recommendation would be to talk to your Qlik representative, and we can help set up a proof of concept, or PoC, with you. But to answer the general question: today we have a Standard Edition, a Premium Edition, and an Enterprise Edition. In Standard, you can leverage any Source except SAP and Mainframe to any Target. Premium adds in Transformations; and then SAP and Mainframe obviously come up in some of the higher editions. But it's all capacity-based licensing metrics. You can leverage really any Source to any Target.
Great. All right, next question: would it still make sense when Source and Target databases are on-premise, or does the Target have to be a Cloud database?
So, earlier on, we showed the architecture diagram with the Gateway. That Gateway can be deployed on-premise. So, if I need to go Oracle to SQL Server, and they're all on-premise and it's just a pure replication scenario, then we can handle that from our Cloud. The data path stays on-premise, because of that Gateway. Now we can also deliver SaaS data to an on-premise database as well. So, if you want to take Workday data as an example, and push it into your on-premise SQL Server, Oracle, etc., we can handle that. The only difference that people should be aware of currently is that we only support SQL Server for the full Transformation use-case from an on-premise perspective. All the Transformation workloads currently are really predicated around those Cloud data warehouse targets, like Snowflake and Databricks.
Great. Okay. Next question: how safe is your Qlik Cloud, and what security is in place around the different data connections?
It's a common question, right? Cloud and security. We have a lot of certifications, SOC 2 Type 2, etc., so I would recommend that customers look at the Qlik.com/Trust website.
Okay.
And that explains all the certifications that we have within our Cloud. We also have capabilities around customer-managed keys to ensure data is encrypted in our Cloud. Don't forget that our Cloud is a complete end-to-end set of capabilities including Analytics. So, (you know) when you think about Analytics and the fact that data is coming into our Cloud for Analytics, obviously there's been a lot of work there around security. The other thing to consider is: we do have a US Government Cloud. And so, while you may be a commercial customer, the rigors and controls that we put in place around the government Cloud capabilities actually also kind of apply to everything that we build for commercial Cloud as well. On the connection front, we have Role-based access controls, as I've mentioned a lot of times. We're using ODBC or JDBC drivers for connectivity to these relational systems. We can leverage any of the SSL capabilities that those drivers provide to ensure that the connection is encrypted.
Great, thank you. Okay, next question: could you talk a little more about Transformations and DBT Transform libraries and capabilities?
Yeah. So, we don't use DBT in our Cloud. I know a lot of other providers give features around DBT. Our capabilities are really built around Automation. We have a very similar experience in that you can do Custom SQL. We didn't show it today; we showed the graphical design. You're going to have different skill sets within your organization. Some people are going to want to write SQL; some people are not going to want to write SQL, or may not have the skill set to write complex enough SQL to get your Transformation workloads done. You can mix and match in our environment: SQL, similar to how you might do it with DBT; graphical design capabilities; and automation capabilities. We didn't really cover this, but there's a feature in our Cloud called Qlik Application Automation that you may have done a Techspert Talks on previously, Troy.
And I was thinking the editor looks very much like that.
Yeah. So, (you know) if customers are leveraging us for ingestion and DBT Cloud, as an example, there are blocks in Qlik Application Automation to call out and orchestrate third-party solutions as part of an overall solution. Obviously, we'd love for customers to leverage everything we have end-to-end. We think the integrated Transformation with ingestion is a big value proposition for our customers: understanding when data is loaded, and therefore triggering Transformations off of that; but we're also very open.
Great, next question: does (and this is kind of specific), does encryption occur On Open, On, or Add? Just trying to understand if data is read on the front-end or after data encryption?
That's an interesting question. So, we have different capabilities based off different Source systems as well. So, if your source is SQL Server or Oracle, and you've got some type of encryption, there are certain types of encryption that we can support on the sourcing side. If the database is encrypted, that's all very Source dependent. In general, the connection can be created to leverage SSL to ensure encryption over the wire. Any data payload that we have in the Gateway is managed for a very short time, and then loaded to the Target, again through an SSL-based connection. Now when we process data with SaaS sources, that data is in our Cloud, again for a transient period; that is encrypted, and that follows the encryption-at-rest methodology with the customer-managed keys capability to ensure obviously it's not accessible by anybody at Qlik or anywhere else.
Great, thank you. Next question: is it possible to connect to QVD files in Qlik Cloud Data Integration?
We don't support the ability to Source from QVD files; but Qlik Cloud Data Integration can be leveraged to deliver data to QVD files. Now today, we don't support Transformation features on that; it's very lightweight, sort of row-by-row Transformation; but we can leverage CDC to deliver data to what we call Active QVDs.
Oh.
And so, that Onboarding process, that wizard that we went through; it's very much the same as that, but we land data into an S3 bucket; and then we handle the processing to update and reflect the updates to QVDs.
Very cool. Next question: is Section Access something that can be applied on the Data Pipeline or is that more of the analytic side?
That is much more for the analytic side. From a security perspective which is obviously where this question is going; when you write data to Snowflake, we're going to be the conduit to writing that data, right? We're going to manage that Pipeline for you. You've got the Role-based access controls that I mentioned for who gets to do what from a Pipeline perspective; but we're going to let you manage the security within that Snowflake or that Data Warehouse environment. Section Access is very much for Analytics.
Next question: when loading Snowflake data using the data Gateway, is the ODBC driver necessary?
Yes. So, when you deploy the Gateway, we do not auto-deploy the ODBC drivers for you. There is a utility on the Gateway, and it's well documented, that will do the installation. So, you can just run the driver management utility, and it'll deploy that driver. But again, the Gateway is controlling that data path; so, if you're coming from an SAP or an Oracle, you've got large data volumes, and you don't want that data going through our Cloud. You want to push it directly to Snowflake. So, that driver is required as part of the connectivity to Snowflake or whatever the target may be.
Okay. Great, next question: are there any courses available to learn more about setting up data Marts?
Yes. So, there is a series of courses that have been released. They are part of the academy, and there's a core set of courses around both Replicate and Qlik Cloud Data Integration as part of the academy now, available to Partners and customers.
Great and I think there's actually a couple courses up there that are free. But I’ll throw a link in here along with the recording so people can check it out. And last question we have time for today: how is this different from Talend or Stitch? And can you give us some examples of use-cases?
Yeah. So, this is a common question that we get from customers. We have a very broad portfolio now on the Data Integration front. We have Talend Studio and Talend Data Fabric, which is a Data Integration ETL tool that also supports graphical design of APIs through the route capability. We have Stitch, which is purely a data loader in the Cloud; and we have Replicate and QCDI. So, the easiest way to delineate all of those, number one: Qlik Cloud Data Integration is for data replication and ELT. And ELT means push-down SQL, running that payload against the data in that Target platform. So, in Snowflake, in Databricks, using their compute engine. Stitch doesn't support all the database sources that we have today, or the same type of volume, because it is a pure Cloud implementation, and only around data ingestion. It doesn't have the capabilities of Transformations. But when you think about Talend as well: while Talend has some ELT features, mostly Talend is used for Spark processing for ETL. So, if you want to do a lot of Transformation workload before going into Snowflake, you could certainly use Talend. If you are a customer who leverages Stitch, and you want to add Transformations, you could continue to use Stitch and extend it with Cloud Data Integration for Transformations. But Talend is a Swiss army knife that can support hundreds of different use-cases. Qlik Cloud Data Integration is really predicated on the data Lakehouse, or the Cloud data warehouse, with data ingestion and Transformations in that realm.
Great. Thank you for clarifying that. Well Tim, I really appreciate all the time you took to walk us through everything and show off what Qlik Cloud Data Integration can do. I hope this will be helpful for people in the future, and we definitely need to see if we can schedule a deep dive into Transformations in the future, but I appreciate you putting this together for us.
Thanks, Troy. I appreciate you inviting me and I appreciate (you know) everybody listening to this for coming in and spending some time with us to understand more about Qlik Cloud Data Integration.
Great. Thank you everyone. We hope you enjoyed this session. And special thanks to Tim for presenting. We always appreciate getting experts like Tim to share with us. Here's our legal disclaimer. And thank you once again. Have a great rest of your day.

 
