Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I'm working with a SQL Server AlwaysOn cluster and multiple databases. I noticed some issues with the task after failing over. The cause may be due to the DBID of the databases differing between the nodes.
I tested this by running the task for the Boats db in node 1 then failing over to node 2 and running the task for the Cars db. This created two duplicate publications with the names AR_PUBLICATION_00008. From here errors kept popping up.
After multiple tests I concluded that creating all the publications on one node before failing over seems to fix the issues. I couldn't find anything on this in the documentation or anyone that has experienced this before. Any input or extra knowledge on this will be great. Thanks.
Kind regards,
Mohammed
Hi Mahendra,
From what I remember about this issue, it turned out to be a Availability Group setup issue. Main things to check I think would be that you are connecting to the SQL Cluster listener and that when a publication is created on the primary, check your secondary instance to make sure that the publication has also been created there. If it isn't there's your problem. However there are other contributing factors to check so here are some extra notes that I created on the matter that should hopefully help you get a better understanding to fix your issue.
Hello Team,
Please refer below articles on "AlwaysOn Availability"
Solved: Backup on read-only secondary Node - Qlik Community - 1898033
Point Qlik Replicate at an AG Secondary Replica in... - Qlik Community - 1906012
Working with Microsoft SQL Server AlwaysOn Availab... - Qlik Community - 1742049
Regards,
Suresh
You're not specific on what errors are happening. When you have an AOG in MS-SQL, it's best practice to have a separate distributor server.
Also, the publications that are created are to mark what tables (articles) are being replicated. When on Node A and a transactional task is created, a publication is created as you listed on Node A. When you failover another publication is created on Node B and the one on Node A will more than likely have the articles removed. So when you switch back to Node A, you will have two publications, one with articles and one without. It's safe to remove the empty publications on either Node. It's not really that harmful if you accidentally remove the "wrong" publication since Qlik will recreate them as needed. The other links that Suresh listed will also provide some more insight I'm sure.
Hi,
When you say that it's best practice to have a separate distributor server, I am confused as the documentation says to configure it so that it acts as its own distributor?
I have 2 nodes and setup distribution on each of them. Is this setup incorrect?
Thanks,
Mohammed
Mohammed,
Did you find a solution to this. I also ran into same issue.
Thanks,
Mahendra
Hi Mahendra,
From what I remember about this issue, it turned out to be a Availability Group setup issue. Main things to check I think would be that you are connecting to the SQL Cluster listener and that when a publication is created on the primary, check your secondary instance to make sure that the publication has also been created there. If it isn't there's your problem. However there are other contributing factors to check so here are some extra notes that I created on the matter that should hopefully help you get a better understanding to fix your issue.
I was able to start the task without any issue after 30 minutes. not sure what was stopping task not to come up for around 30 minutes after the failover of database from secondary to primary node in a cluster. When I started the task with advanced option it came up without any issue. Not sure if this kind of step need to be performed when ever there is a DB failover from one node to another node and CDC stops working.
Hi,
This is what I remember of running tasks with my sql cluster setup. After failing over to the secondary database, the task would stop for a short time then resume by itself. For me it was no longer than 1 minute. I believe this is the short time taken by the primary to actually switch over to the secondary.
Are you saying that after failover the task stops completely and you must wait 30 minutes before manually starting it again? What Advanced option did you use to start it.
Also, if these issues persist please consider opening a support case directly with Qlik. I'll try help the best I can for now.
Regards,
Mohammed
Thank you for all your assistance. I had already opened a case with Qlik.