STT - Qlik Replicate Troubleshooting

Troy_Raney · Sep 16, 2022 1:50:47 AM

Environment

Qlik Replicate

Transcript

Hello and welcome to the September edition of Techspert Talks. I'm Troy Raney and I'll be your host for today's session. Today's presentation is Qlik Replicate Troubleshooting with Barbara Fill. Barbara, why don't you tell us a little bit about yourself?
Hi everyone! My name is Barbara Fill. I'm a Senior Technical Support Engineer that supports the QDI software (that's Replicate as you may know it) and anything to do with Data Integration and I've been working with Qlik and / Attunity for six years now.
That’s great. Okay, well, today we're going to talk about how Replicate works; what to do when the unexpected happens; we're going to take a look in detail at some log files; understand how to read those and get a better idea of how to resolve issues quickly. Now Barbara, for those of us who are kind of new to Qlik Replicate, could you explain from a high level what it is and what it does?
Sure. Qlik Replicate is a tool that helps you move your enterprise data from one Source to another Target. We call that a Full Load.
Okay.
It alSo, applies the changes; and Troy, we call that Change Data Capture. And the great thing Troy is that there's no coding involved and it happens in a near real-time fashion.
Great. Can we take a look at the product and see how this all works?
Sure. So, this is Qlik Replicate. Can you show us how to create a task? This whole no-code-thing you're talking about?
Sure. This is amazingly easy. It's a GUI tool that allows you to create new tasks in multiple fashions. You can make a single direction going from Source to Target, Bidirectional, Log Stream, Cloud Landing. And we're going to show you today how to create just a simple task going from a Source to a Target, a Unidirectional type of task. So, we have Full Load and Apply Changes, give it a task name. First, I'm going to call it TECH2. And what it does is it brings me to a palette in which I can very easily drag and drop Sources and Targets, combine them together and choose a table and performing a Full Load and the CDC changes.
Okay. So, what is the Source that we're going to be copying today?
Well today, we're going to use DB2 z/OS version 12. So, I'm going to simply drag and drop this endpoint that I've already pre-defined and I'm going to now grab the Target and drag and drop that onto the palette.
Well, that was easy. As I understand it that's DB2 on Mainframe?
Yes, it is.
And that's going to copy everything to Microsoft SQL; and what table are we going to be copying?
Simply click on the Table Selection button.
Okay.
It's going to actually connect to my z/OS system, talk to DB2 and present me with a list of tables under the schema called BARBARA. I'm going to choose my CUSTOMER table today. So, simply click on the table name or drag and drop. And it drops over here, and click OK.
Okay, that seems out pretty easy. So, this task is going to perform both of those actions, right? The load of all the data in the Source and the continuous updating and keeping the data live with the Change Processing, is that right?
That's correct, because the two symbols on this center button here. The blue solid line represents a Full Load. So, Replicate will perform essentially a select * from the DB2 table, and it's going to pull all the rows down to the SQL Server Target. The dashed line or the dotted line is going to read the DB2 logs and it's going to pull down all those changes: all the inserts, updates, deletes, whatever is happening against that Source table; and what wait for that Full Load to finish, and start applying the changes as soon as it's done.
That's cool. And that's basically just to keep everything very efficient, right? It doesn't have to copy over everything all the time and replace it. It's just only copying over changes once the Full Load is complete?
That's correct. It's a continuous applying of any new changes coming in to that Source table.
Okay. Well, let's get this thing started then.
Great. I'm going to Save the task, and I'm going to run the task now. Simply click on the Run > Start Processing.
Okay, starting up.
It's very fast.
But I like that it's telling you what it's doing as it does it. Okay, it stopped. That can't be good.
Right.
Okay, I'm seeing that red X there. What does that tell you or how can you investigate further what happened there?
Sure. The way to understand what happened is simply go to the Monitor tab. The Monitor tab is geared towards the runtime aspect of your task.
Okay.
Because the Designer tab is only geared towards picking and choosing tables. It's all about designing; there's no active running. But the Monitor tab shows you the health and the status of that task as it's running.
All right. So, what should have happened here?
Right. The Full Load should have started, and then when it was completed then the CDC changes should have begun.
Okay.
You can see, Troy, that there's a tab here called Full Load. The Full Load is still in a Queued up fashion.
Oh, it should have gone to Loading and then Completed, but right now it's stuck in Queued?
That's correct.
Okay.
And as you see, you can click upon each one of these columns and it'll begin to show you the status of that Full Load; but we're stuck in the Queued state.
Okay, I see a little red X in the bottom right, the messages window.
Correct. So, what we're going to do then is look inside the Task Log.
Okay.
Click on View the Logs; and this is where we can really get into (you know) what are the different components of this Task Log.
Now for most people, I think when they look at this; it's just a ton of jargon and doesn't make any sense. When you're looking at this, what information stands out to you or what are you looking for when you're looking through a Task Log?
Yes. When I'm analyzing a Task Log (especially if it's coming multiple test logs coming from a diagnostic package) first, I want to see what is the version of Replicate that the customer is using. You would want to be at the most current version of Replicate. In this case, Replicate and R4Z. R4Z is the Mainframe component that runs the API which talks to DB2 for the changes. So, in this case, I can see I'm at May 2022 release, the current Service Pack. It's a good strong release.
Okay.
And the next thing is: is the customer (you know) licensed for this Source or Target? And yes. So, that's all fine. And then I sometimes look at the task information and from Task Manager. These are the different components by the way in the left-hand side here. In this case, I see it's a Full Load and CDC type of task.
All right. So, what else do you look for?
Things like the ODBC driver version. It's always good to locate what version of the drivers for your Source or Target endpoints. Look for them in the requirements section in the user guide.
Okay.
And further on down. Yes, this is a valid client version going to DB2. The next thing I look for is: is there any color in this Task Log?
We got a fatal error there, a warning.
Right. And that's a W and that's in like an orange color.
Oh.
Right. It failed on the Target Apply.
Okay. The E is for Error I'm assuming?
That's correct. I is for information; W is for warning; E is for error.
Okay. All right. So, what is this error telling you? Table DBO does not exist?
Yes. It says it either does not exist or it has insufficient privileges or no supported columns. But I know I don't have permissions to write to a schema called DBO on my Target SQL Server database. Yeah, it's trying to use the default BARBARA instead it went to DBO; that - What does that mean: DBO?
Oh sure. DBO meaning the Data-Base Owner.
Okay. It's like a default schema then?
Yes.
Okay. So, basically it's trying to write to a schema that we didn't want it to, and it's failing because we didn't have the right permissions?
That's correct, yeah.
How would you resolve that?
Sure. I'm going to close this Task Log, and I'm going to go back into Designer, because that's a special table that it actually failed on. Into the Task Settings. All these items on the left-hand side control the behavior of Replicate.
Okay.
Starting at the top, we have the Metadata and the Target Metadata is blank, because while I do want to use BARBARA as my Target schema for my data tables; it failed, Troy, on the Apply Exceptions table.
Ah, right.
It went to the wrong schema. So, I'm going to change this and just simply direct it to the correct schema. So, we're good.
Okay.
That's all I need to change here. Now also, just a little bit of a description here: we have Full Load processing turned on; we also have Change Processing turned on. And these are places where you can fine-tune the behavior of the Full Load and the Change Processing. You know the Professional Services team that we have here in Qlik are a fantastic group to help customers perform Performance Tuning. It takes a little time, if you take a half-step to the left and a half-step to the right, and you eventually come to the optimum performance tuning for your Full Loads and your Change Data Processing, all depending on the different Sources and Targets that you may have. So, just want to add a little plug there.
Okay.
So, clicking OK. I can now Save my changes. And let's run the task again. I'm going to reload this Target table.
All right. So, it's telling us what it's doing again. We're starting up and hopefully we'll get no errors this time. All right! So, it’s moved from Queued to Loading, and Loading to Completed. So, the Full Load completed. That's progress.
Yes, this is beautiful. So, the Full Load has completed very cleanly. I have just eight records in my Source table. It's just a little test table I'm playing with today. And let's take a look at Change Processing now. And I do have some Inserts that I'd like to run.
Okay.
Just to see that my CDC processing is working correctly as expected.
All right. So, you're going to adjust the table a little bit on the Source side, and we should see those changes be reflected over and copied over as a part of the Replicate task to the Target?
That's correct.
All right.
Simply going to run all these SQL statements. I can always take a quick look at the table, and yes, I have my 16 rows of data because I had eight from the Full Load and I just inserted 8 more.
Okay.
Let's go back to Replicate and make sure that all the changes came across.
Yeah, it doesn't look like anything has happened. So, where would you go to figure out why those changes weren't picked up?
I would simply click on the View Logs. Let's look for any Warnings or any Errors, whatever.
Okay.
I'm simply going to scroll down. I know that the action I was performing was a Source Capture.
Okay. All right, we got a Warning there.
Yes. Data changes of table BARBARA.CUSTOMER will not be captured.
That's not what we want.
Right. Because it is not set to Capture Data Changes, and Replicate is not set to enable this option automatically.
Okay. You may know what that means, but for those of us who aren't really familiar with that warning message, how would you investigate this further? What do you recommend?
The first thing I would do is go directly to the Community and search on this error message.
Oh, look at that: No CDC coming through.
That's great. So, let's click on the link.
Seems like the same issue.
Yeah, it seems we have. Yes, my R4Z I know is installed, but no changes are coming through. There's a resolution to this.
Okay.
I can see that it says: make sure the table has been altered for CDC. So, all I have to do is go back to the Mainframe and apply my Alter Table statement.
And the other option?
Yeah, the other option says: go to the DB2 Endpoint and in the Advanced tab. We can click on the button that says Automatically Enable the Data Capture Changes.
So, one option is for making some changes on the Mainframe side; and the other is to make the adjustments on the Replicate side?
That's correct.
Can we take a look at both?
Sure. I just want to go to the Mainframe first.
All right. This is kind of cool. I'm not used to looking at Mainframe.
Yes, the wonderful world of Mainframe. This is a very menu driven type of system environment.
Now based on that article, it's suggesting that Data Changes are not being turned on. Is there a way to check to see if that's actually the case in our mainframe?
Yes, and I have a query all set.
Okay.
The one I performed, Troy, was this one here. And it says: is Data Capture Changes turned on? I can see Data Capture there, but this field here is blank.
All right. So, instead of saying None or No it's just blank?
Correct.
What would it say if it was actually turned on?
It would show the letter Y for yes.
I guess we need to turn that on. Can you show us how to do that?
Sure. Go back into that same list of queries, and uncomment out this one; which performs the Alter Table Data Capture Changes.
Okay. Putting those little hyphens in there just comments out those commands? Kind of de-selects them?
That's correct.
Okay.
Executed successfully; and it says so right there.
Okay.
I'm gonna page down or PF8 key, and it's showing: Data Capture: Yes.
Great. So, we see that it's now turned on. If we weren't an Admin or didn't know how to adjust settings in Mainframe, where would you correct that from the Replicate side?
Sure. The place to correct that is simply by going into the Endpoint definition, and going to the Advanced tab. There is an option here called Changed Data Capture. Have this radio button turned on so that automatically the Data Capture Changes is turned on for all the tables in their DB2 tasks. And this way the DBA doesn't need to get involved.
And what's the DBA?
The DB2 Database Administrator, the person who manages all the DB2 databases and tables and etc.
It's so much easier to do it from the Replicate side if you have the right privileges.
It Is.
Would you recommend that as a best practice to always have that checked, the Automatically Enabled Data Capture Changes?
Yes, as a best practice it's much easier to allow the Data Capture Changes to be turned on; so, that the DBA doesn't accidentally miss one table in production, for example. They don't want to miss out on a table not being captured because they accidentally made a mistake and I forgot to turn it on.
Okay.
I'm going to save this information anyway with the new setting.
All right. The task never stopped. It's still running.
It's still running.
Trying to pick up changes?
That's correct.
But it didn't pick up that previous insert that you ran. Can we test it out again?
Right. So, I'm just going to execute a new set of changes.
Okay.
And run these inserts just to make sure that my Data Capture Changes actually does work. All three sets have been inserted into the table.
This is the Source table we're looking at; and we're going to see the customer key is different all the way down.

Correct.
You've got 24 rows there; before you only had 16. Okay.
So, now I'll go back to Replicate.
Ah, and I see 8 inserts.
Correct.
Okay.
So, we have 8 inserts on the SQL Server Target. I should now see about 16. Not 24; why? Because remember Data Capture Changes was turned off?
Right. Okay. So, to pick up those inserts that happened before Data Changes were turned on, what would you need to do to pick those up?
Yes, you probably need to reload the table.
Okay.
So, to redo the Full Load. Would just a Stop and Start on the task do that?
Yes, you can stop the task and then reload.
That's great. Okay, now it's time for a Q&A. Go ahead and submit your questions through the Q&A panel on the left side of your On24 console. Barbara, how about if we just start from the top?
Sure.
The first question is: how to handle replication of tables without primary keys?
Ah yes. That's a very good question. Sometimes tables don't have a primary key on the Source. But what I do recommend is that you pick a column, I'm going to go into this table here in the Designer tab.
Okay.
Pick a column, and when you double-click to the left of that column name, you'll see a key symbol pop up. Try to pick a unique column that would have unique data all the time and use that as a primary key on the Target. There would be (for example) like, no symbol here and no symbol here. So, there will be no keys, and just pick one of the columns, make it a key; because otherwise, when it comes to deletes and updates, you're going to probably have lots of latency.
Okay. That's very helpful. All right, next question: what are the steps to troubleshoot missing records issue at Target?
Oh fantastic. We have a great 2 or 3 different links, and as you can see there are several links here: was the data missing during a Full Load? If so, you can click on this link to get the steps to collect that. If there's missing data during CDC, and this is for finding why there could be missing data. You can see there are several steps and options. We tell the customer how to turn on some extra logging; most importantly is to turn on the Change Processing for Store Changes.
Right. Okay, well I see there's a lot of options there and it's definitely a good idea to check in the Community.
Correct.
Because there's resources available. Next question: what are the best practices while creating Full Load and CDC tasks?
Well, that's a really good question. Well, when it comes to a best practice for Full Load in CDC; let's go into the task settings and take a look for Full Load. Most customers would either have a DROP and CREATE listed here, or a TRUNCATE. But the I think what you're truly trying to ask is what about performance tuning? Well, as you can see there is a Full Load tuning, and we usually encourage customers to work with Professional Services team, and you can engage with them. And what they will do is help you determine the best tuning factors here.
Just specific to their individual databases and environment?
That's correct; and according to your business needs.
Right.
And the Professional Services team will also, then help you with the Change Processing tuning. So, it's always good to engage with the Professional Services team.
That's a great tip. Okay, moving on, our next question: where can I find help with installing Qlik Replicate?
When it comes to installing Replicate, you can go to the User Guide, simply by going to the Help. Here we have the User Guide which shows you how to install Replicate. And we have a link here: Installing Qlik Replicate.
Ah, that's great. Recommended to hardware configurations, instructions for Windows, Linux. Very nice.
And security considerations further down. And then it continues on with working with the Endpoints, but these are the main topics here for installing Replicate.
All right. Next question: we're trying to load some huge tables, and after 22 hours, we get a ‘communication error has been detected’ error message. Any ideas what could be the problem?
Yes. Well, it really could be one of quite a few different reasons actually.
Yeah.
It could be a network timeout or maybe a timeout from the relational database management system, whether it's like SQL server or DB2 or Oracle. There's different types of timeouts, but there are different settings within the different Source and Target environments that can allow you to tell Replicate to wait a little bit longer or allow the execution of the queries to last a little bit longer for the environment so, they're not rejected. So, I also recommend that you go to the Community and just see if there's anything there. But if not, I would highly recommend that you open up a case.
That's a great tip as well. If you can't find a solution in Community, you can always create a case and we'll investigate further. Next question: can Replicate run multiple tasks using the same Source endpoints simultaneously?
Well, not within the same task. But you can have multiple tasks all reading from the same Source.
Right.
At the same time, this could add extra stress on your Source environment. To sort of answer your question: and it may be best to use the Log Stream to reduce the number of tasks that are hitting the same Source.
Okay. Yeah, I remember the Log Stream is one of the options is a type of task you could run, right?
That's correct.
Next question: where can I find information about what drivers Qlik Replicate should be using in case they need to be updated? FYI, we're using SQL server and DB2.
Great. So, for example, if you're using SQL Server if it's a Source, you can scroll down while you're in this user guide (which you know, I got here by clicking on the Help window in the upper right-hand corner of the Replicate console) and then just scroll down and click on the words Adding and Managing the Source Endpoints. And as you scroll down.
Wow.
Yeah. Let's say it's DB2, if I'm using say DB2 z/OS for example, I could come down to the prerequisites here. It can tell me what the client or ODBC driver requirements would be. For example, in DB2z if I'm on the Windows server that Replicate is installed on, then it would say find (you know), find from the IBM website the IBM data server driver for ODBC and CLI version 11.5.6.
That's great.
This is great. Yes, you would do that for each Source and each Target environment as well.
Okay. that was excellent. Next question is: how can we translate the version build number of Qlik Replicate to the release name? For example: May 2021?
There is a release notes addendum on the support portal I believe. It will show you the Replicate version and the patch release notes, and it'll show you the build number.
Okay.
Just like to add: sometimes a customer’s R4Z or Mainframe agent that's running on the Mainframe is it not from the same release, and it's very important to have both of those from the same release, and if it's an older release, it should be from the same patch release.
That makes sense. Okay, this next question is kind of specific but I’ll read it: ‘We are trying to perform a Full Load from postgreSQL DB to Mainframe DB2 Target. We are matching with DDLs, but still getting below errors. It's an SQL error: assignment of a null value to a not null column N is not allowed. What is the N column in the message? Any help there?
Yeah, you'd have to turn on some Verbose logging for the Source Capture and Target Apply. This is a Full Load; so, it would be - Actually let me show you. On this in the task settings. I would go to the logging component, and because it's a Full Load, I would actually turn on to for both. I would do Metadata Manager too, Source Unload and Target Load. These are all the corresponding components.
Okay.
Leave this on only until the error occurred. Otherwise, there would be too much logging using up too much of your disk space. The particular error we would need more information about.
All right. That's interesting seeing where you can raise the logging levels. Next question: where can we find documentation on using Kafka as a Source endpoint?
Yes. Interesting question. There's always new Sources and Targets being added to the Replicate repertoire (if I can say it that way). There's different Sources and Targets of course, and we know that Kafka can be used as a Target, but we don't have it listed here as Kafka being used as a Source.
Okay.
I would recommend that you probably go to the Ideation in Community and open up an enhancement request.
That's a good tip. I can see we're getting a lot of questions from specific log messages. This one with the following log message: how do I find the table causing the message ‘update on table 79 changes PK to a PK that was previously updated in the same bulk.’? Any ideas?
Yes, that's a great question. Again, what you would want to do is come to the logging and add either the Trace or Verbose logging on, and this was CDC Target apply; so, turn it on for the Metadata Manager Trace or Verbose; the Source Capture; and the Target Apply. What this will do is: a combination of these will help expose what the table ID is. You would then want to stop your task and then resume your task, and it will show in the Task Log that a certain table is associated with a certain table ID.
Great. Well, a nice follow-up to that question is the next one: what's the difference between restarting a task and resume processing?
When it comes to like the Advanced Run or Restarting of your task; what happens is: Replicate will refresh the metadata. Then you are starting from a certain point in in time. So, let me just illustrate that to you. Advanced Run is allowing you to go back in time through your database log. Whether it's a few days ago or a few hours ago, and start your task; but continuing on only from that point onwards.
That's great. It's like going back in time. What about Resume Processing?
Resume Processing means it's going to reuse the currently saved information. So, your last transaction that you were running, it's going to continue processing from where it last left off.
Okay, great. Next question: can you set up a task with several Sources to a single Target?
You can have multiple tasks pointing to the same Source, and then going to a single Target.
Okay. Now we're running a little low on time. So, the last couple questions: what log files are there and what is the difference between them?
So, there's two types of logs that Replicate retains. The first one is your standard Task Log that we see here in the monitor.
Okay.
And you can access it either via the View Logs in the bottom right hand corner here, or under Tools. You can do View Logs here. There's also, another log for the Replicate Server itself, that is located in the Logging Option, here. And as you can see, it has similar types of logging components. You can also view the log. And as I scroll down, you can see it's a similar layout. So, those are the two different types of logs.
Great. And that kind of leads into the last question we have time for today: can you help us really understand more about log files and what we're looking at?
Sure. Viewing the logs, it's very important to observe which logger Component had a failure. Either an E error or a W for warning, and turn on either Trace or Verbose logging. Also, definitely put that logging back to Info when you're done.
Right.
There's two places where you can turn on the logging. This one in the Designer phase; and you put logging on. That can be useful when you're starting up a task. But let's say your task is running. In the running of your task, just simply go to the Tools pull down here, then do your Log Management, and then turn on your Trace of Verbose logging.
Ah! So, here you can adjust the verbosity of a task that's already running?
Yes.
So, the login gets turned-on on-the-fly.
That's very cool. What about if the issue is intermittent and you don't want to have Verbose logging on all the time?
I'm going to show you a fantastic option here. It's called ‘Store the Trace/Verbose Logging in Memory.’ You can check the box, and Replicate for whatever loggers that you you've turned on the Trace or Verbose logging for; it's going to record all the Trace and Verbose messages to like a little bucket of memory, and it's about 10 MB in size which can be increased. And what it does, Troy, is it begins to collect all the logging messages for these Trace and Verbose logging components, then Replicate asks himself “did I see a red E somewhere? No?” then it takes that full bucket of 10 MB of Verbose logging and it throws it away. And it repeats that process over and over and over again. And when it comes upon an E error, it then says: “Great! Let's look in that temporary 10 MB buffer and let's grab those Verbose or Trace logging messages” and he writes it to the Task Log.
That's awesome.
It's a very powerful feature.
Well, thank you very much, Barbara! I think this will help a lot of people get a better understanding of what's happening on their system when errors occur, and how to resolve those issues quickly without having to create a support case.
Yes, and if you are getting errors, and you see that it's a SQL error (some sort of SQL error), first check with your DBAs. Maybe they can help you. I'd like to call them your local friendly DBA; so, you can get to know them better. And they would be glad to help you, I'm sure. The other thing you can do is just Google the error, because if it's not a Replicate failure message, and it's a SQL message it's a - then it means it's a relational database error. So, there's a true environment error. So, look it up, use Google, use your DBAs, talk to them, find out what the error message means. If you can't resolve it, it's not a problem, go to Qlik Community, because chances are and you can probably find an answer there. And if you cannot, simply open up a case and the tech Support team will be glad to help you.
Well, thank you, Barbara.
Oh, yes. Thanks Troy. And thank you everybody for attending; and I'm looking forward to serving you guys in any way that I can.
Okay, great. Thank you everyone. We hope you enjoyed this session. Thank you to Barbara for presenting. We appreciate getting experts like Barbara to share with us. Here's our legal disclaimer. Thank you once again. Have a great rest of your day.

STT - Qlik Replicate Troubleshooting