Channel: High Availability (Clustering) forum
Viewing all 3614 articles

Cluster network name resource failed registration of one or more associated DNS name(s) for the following reason: DNS bad key. Event ID 1196


Hello,

We have a Server 2016 failover cluster with 1 clustered role on it. In the event log of the cluster we are getting the following:

"Cluster network name resource "SQL Network Name (SQLSRVIT)" failed registration of one or more associated DNS name(s) for the following reason: DNS bad key."

I have checked the DNS server settings in the NIC and they are pointing to a valid working DNS server.  

This error happens every 15 minutes.  
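For anyone comparing notes, the registration these events refer to can be re-triggered manually with the FailoverClusters PowerShell module. This is only a sketch; the resource name is taken from the event text above:

```powershell
# Find the network name resource named in the event (name from the event text above)
$res = Get-ClusterResource -Name "SQL Network Name (SQLSRVIT)"

# Force the resource to re-register its DNS records. "DNS bad key" usually means
# the existing A record is secured by a different account, so also check the
# ownership/permissions of the record in the DNS console before retrying.
$res | Update-ClusterNetworkNameResource
```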

What else can I check to try to fix this?

Thanks

James 



"No disks suitable for cluster disk": I have tried everything


We have a 2-node failover cluster running Server 2016. The physical servers and storage are part of a Dell VRTX blade chassis with 3 physical servers and onboard storage available to share with each host or its VMs.

I created virtual disks on the VRTX and gave access to each Hyper-V host. The disks show up in Disk Management on each host, but when trying to add them as Cluster Shared Volumes, Failover Cluster Manager gives me the message "No disks suitable for cluster disk".

Please note that I did all of this before and it worked fine; I had 3 Cluster Shared Volumes set up previously, but now only 2, as I removed CSV 3.

I am trying to create clustered shared volume 4. 

I have tried everything I can think of, such as:

- Formatting the disks numerous times using MBR or GPT

- Setting the disks online/offline, with and without drive letters

- Using PowerShell commands to clear the cluster disk reservation

- Restarting both nodes

- Shutting down the whole cluster

- Running cluster validation tests
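The reservation-clearing step might have looked something like this (a sketch; the disk number is a placeholder for whatever Get-Disk reports):

```powershell
# Release any stale persistent reservation on the disk (number is a placeholder)
Clear-ClusterDiskReservation -Disk 4 -Force

# List the disks the cluster considers eligible; an empty result matches the
# "No disks suitable for cluster disk" message
Get-ClusterAvailableDisk

# If the disk now appears, add it and convert it to a Cluster Shared Volume
Get-ClusterAvailableDisk | Add-ClusterDisk | Add-ClusterSharedVolume
```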

Please help, I don't know what else to try.

Two racks (fault domains) with two servers each: Cluster Shared Volume failed in test


We have four servers, two in each rack. With Set-ClusterFaultDomain I set the two rack IDs for the four servers, then created the S2D storage pool (FaultDomainAwarenessDefault = StorageRack) and added a cloud witness. I created four volumes with mirror resiliency (parity is not available), put a VHDX of a test server on each volume, and started one on every Hyper-V host. All looks good so far.
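A sketch of the kind of setup described above (node and rack names are placeholders):

```powershell
# Define the two racks and place two nodes in each (names are placeholders)
New-ClusterFaultDomain -Name "Rack1" -Type Rack
New-ClusterFaultDomain -Name "Rack2" -Type Rack
Set-ClusterFaultDomain -Name "Node1" -Parent "Rack1"
Set-ClusterFaultDomain -Name "Node2" -Parent "Rack1"
Set-ClusterFaultDomain -Name "Node3" -Parent "Rack2"
Set-ClusterFaultDomain -Name "Node4" -Parent "Rack2"

# Enable S2D; with rack fault domains defined, the pool picks up
# FaultDomainAwarenessDefault = StorageRack
Enable-ClusterStorageSpacesDirect

# One of the four two-way mirror volumes (friendly name and size are placeholders)
New-Volume -StoragePoolFriendlyName "S2D*" -FriendlyName "Vol1" `
    -FileSystem CSVFS_ReFS -ResiliencySettingName Mirror -Size 500GB
```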

Then I broke the network connection between the two racks, so the cluster goes into isolation mode. After some time, just one of the failed virtual servers returns to the surviving side of the cluster (the one with the internet connection). The other enters a failed state. After investigating the config, it looks like one of the volumes has failed, the one on which the failed server had its VHDX stored. I waited at least 15 minutes before helping the cluster to repair; it should have repaired itself in some way. Repairing the volume in the console gives "network path not found".

Why is one volume (with mirror resiliency) failed? It should always be available on one side of the cluster, because we have just two racks, and with mirror resiliency a copy should be available in each rack. How can I determine which volume will fail? And how do I prevent the failure of the Cluster Shared Volume, and thus of the virtual servers running on it?

Host unreachable warning in NLB Manager


We set up Windows NLB (unicast) on 2 Windows 2016 servers. The HA and load balancing work as expected. However, I get host unreachable warnings in the NLB Manager from both servers. Sometimes the warnings disappear, but most of the time they are there.

I saw that MS suggests using 2 NICs on each NLB host in unicast mode. But I also saw a reference saying this isn't needed on anything newer than 2003 SP1, because UnicastInterHostCommSupport is used instead, and I did see those registry keys in the OS.

Importance of Heartbeat network in 2008R2 cluster

So we have a 2-node 2008 R2 cluster with a SQL 2012 AlwaysOn AG running on top of it. I need to take the heartbeat network offline for 5 minutes to change it to a new heartbeat network. The current heartbeat network has no route to it. I am concerned about cluster problems when I make this change. I know I can set the AlwaysOn properties to manual failover to avoid accidental failovers. Is there anything else I should do at the cluster level to change my heartbeat network to something new? I was basically just planning to re-IP the same NICs, but am mostly concerned about cluster issues from taking the heartbeat network offline.
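One thing worth checking before the change is how the cluster currently classifies its networks, since 2008 R2 sends heartbeats over every cluster-enabled network rather than a single dedicated one. A quick look (a sketch):

```powershell
# Role 1 = cluster communication only (the typical heartbeat network),
# Role 3 = cluster and client, Role 0 = not used by the cluster.
# If another network still has Role 1 or 3, heartbeats keep flowing
# while the old heartbeat network is offline.
Import-Module FailoverClusters
Get-ClusterNetwork | Format-Table Name, Role, Address, State
```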

Dave


Cluster file server in Azure


hi

I am trying to create a clustered file server on top of a Storage Spaces Direct 2016 cluster, and it fails after a long wait.

If I pre-stage the file server computer object in AD, disable it, and give the cluster object permissions on it, it errors saying the object already exists in AD. If I don't create the AD object, it fails after 10 minutes and no AD object gets created.

I have tried full domain admin permissions, and have given the cluster object "Create Computer Objects" permission on the OU, without luck.
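For reference, the role creation itself was along these lines (a sketch; the role name is a placeholder):

```powershell
# Scale-Out File Server role on an S2D cluster (role name is a placeholder);
# this still creates a computer object in AD, so the cluster name object
# needs rights on the OU either way.
Add-ClusterScaleOutFileServerRole -Name "FS01"

# The classic General Use alternative also needs storage and an IP resource
# (disk name and address are placeholders):
# Add-ClusterFileServerRole -Name "FS02" -Storage "Cluster Disk 1" -StaticAddress 10.0.0.50
```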

any suggestions?

thanks


TA

Cannot move Role across sites - Element not found on Log disk

$
0
0

Greetings.

I am having an issue moving a WSFC Role from Site1 to Site2 or vice versa.

My configuration:

  • Windows Server 2016
  • WSFC, Stretch Cluster
  • 4 nodes Site1, 4 nodes Site2
  • File Services Roles
  • Storage Replica
  • Nimble CS7000 hybrid storage, iSCSI connected

I have two Roles currently created and running successfully. I CAN move the Roles between nodes at the SAME site without issue, however, when I try to move a Role to the opposite site, the disk configured as the LOG for that Storage Replica will briefly show "Element not Found." Error Code: 0x80070490

Has anyone run across this? I'm running a very similar setup in a Dev environment; the difference is that the iSCSI SAN volumes there are backed by a Nimble CS500 instead of the CS7000. I'm not sure if or how that would make a difference, but it seems like a possible explanation.

Any thoughts or ideas are welcomed.

T

Any good explanation for having a quorum for a Failover Cluster?

Can anyone give me a good explanation of why there is a quorum for a Failover Cluster? Why not have the service end only when the last server fails, since only one server is active at a time? Otherwise, why have 3 or 4 servers with the same service, with only one of them active at a time, if you are going to kill the service when only 2 servers fail? That sounds like a waste of resources. A quorum on a load-balancing cluster makes perfect sense if you need to maintain a fixed amount of capacity, but on a Failover Cluster?
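For context, the usual rationale is split-brain prevention rather than capacity: when the nodes can no longer see each other, only the partition holding a strict majority of votes keeps the cluster service running, so two isolated halves can never both bring the same service online. The arithmetic is just:

```powershell
# Votes needed for a partition to stay up (example: 4 nodes + 1 witness = 5 votes)
$votes  = 5
$needed = [math]::Floor($votes / 2) + 1   # a strict majority

# In a 2-node / 2-node split, whichever side also holds the witness reaches
# the majority and survives; the other side stops its cluster service rather
# than risk both sides bringing the same service online against the same data.
```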

Storage Spaces Direct: Cannot add disk to S2D storage pool


Hi everyone,

I have a two-node cluster with Storage Spaces Direct.

I have an unused SSD on each server.

When I try to add the disks to the storage pool, I get an error:

Add-PhysicalDisk : One or more storage devices are unresponsive.

I have tried to reset the disks, but it did not help. There are no events generated and the cluster is OK.
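The reset attempt might be sketched like this (the serial number is a placeholder):

```powershell
# Show which physical disks the storage subsystem flags as unhealthy;
# "One or more storage devices are unresponsive" usually corresponds to a
# non-OK OperationalStatus here
Get-PhysicalDisk |
    Where-Object OperationalStatus -ne "OK" |
    Format-Table FriendlyName, SerialNumber, OperationalStatus, HealthStatus

# Try resetting one of them (serial number is a placeholder)
$disk = Get-PhysicalDisk -SerialNumber "PLACEHOLDER"
Reset-PhysicalDisk -FriendlyName $disk.FriendlyName
```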

Does anyone have an idea?

Thanks
Thorsten


File Share Witness in Google Cloud Instance Group?


Had an interesting question today. Customer wanted to deploy a cluster in the Google Cloud and wanted to put their File Share Witness on a Windows Container in an Instance Group. The file share itself would reside on persistent storage. There would only ever be one container instance running, but health checks would ensure that if the instance crashed that a new instance would be automatically provisioned. 

Behind the scenes when a new instance is provisioned the computer name and SID would change, but the file share would still be reachable by the same CNAME and presumably have the same permissions and the exact same contents as the instance that crashed.

I'm not overly familiar with Instance Groups in Google Cloud, but if it works as described above it seems like it might be a viable option to host a File Share Witness. I'm going to check it out, but I'm curious to know what you think about this idea. Of course I know an Azure Cloud Witness could still work, but they would like to keep it all in the Google Cloud.


David A. Bermingham, MVP, Senior Technical Evangelist, SIOS Technology Corp

Cluster Group Sets (ClusterGroupSet) breaks live migration on Hyper-V Cluster


I followed the instructions on the Microsoft Blog - Failover Clustering Sets for Start Ordering, and it works great for starting the VMs in the correct order.  However, it causes problems with live migration.

The problem is best described with an easy-to-reproduce example:

Setup

  1. Create 3 VMs (DC-Server, DB-Server, Web-Server).  There doesn't need to be any OSes installed for this testing.
  2. Create 3 ClusterGroupSets: DC-Set, DB-Set, Web-Set
  3. Add the appropriate VM to each set
  4. Make the Web-Set dependent on the DB-Set.
  5. Make the DB-Set dependent on the DC-Set.
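The setup steps above correspond to these FailoverClusters cmdlets (a sketch; the VM names must match the cluster group names):

```powershell
# One set per tier
New-ClusterGroupSet -Name "DC-Set"
New-ClusterGroupSet -Name "DB-Set"
New-ClusterGroupSet -Name "Web-Set"

# Put each VM's cluster group into its set
Add-ClusterGroupToSet -Name "DC-Set"  -Group "DC-Server"
Add-ClusterGroupToSet -Name "DB-Set"  -Group "DB-Server"
Add-ClusterGroupToSet -Name "Web-Set" -Group "Web-Server"

# Web depends on DB, DB depends on DC (the Provider is the set that starts first)
Add-ClusterGroupSetDependency -Name "Web-Set" -Provider "DB-Set"
Add-ClusterGroupSetDependency -Name "DB-Set"  -Provider "DC-Set"
```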

Test

  1. Start the Web-Server VM.
  2. Web-Server will change to "Starting", and DC-Server will start.
  3. 20 seconds later, DB-Server will start.
  4. 20 seconds later, Web-Server will start.
  5. Good!

Problem

  1. Select the three, running VMs and live migrate them.
  2. DC-Server will migrate immediately and correctly.
  3. DB-Server and Web-Server will get stuck at 50-80%. While stuck, the VMs are not on the network.
  4. Sometimes, after 5+ minutes, the migration will complete.  But it usually errors out.
  5. You can manually cancel the live migration, but there is still an outage.

We first noticed this problem when we paused a cluster node to perform maintenance.  It failed to pause.  This also causes CAU to fail to drain nodes.

If you manually live migrate Web-Server, then DB-Server, then DC-Server, it works.

My Best Guess

When the DC-Server VM is live migrated, it is treated as a "fresh startup", and the StartupDelay causes the dependent Sets to wait to start the VMs for 20 seconds.  Hence, they get stuck near the end of migration.  I don't know why they just don't resume migrating after 20 seconds though.

Summary

Cluster Groups Sets are a must-have feature, but so is live migration.  I need both to work.


-Tony

How to cluster Hyper-V two node Failover Clustering with 4 VMs with Windows Server 2012 R2 Standard


Dear MS Support,

I have 2 nodes

Host1: Windows Server 2012 R2 Standard, supporting 2 VMs on Hyper-V

On Host 1 there are at most 2 virtual machines.

Host2: Windows Server 2012 R2 Standard, supporting 2 VMs on Hyper-V

My question is about deploying Hyper-V clustering with Windows Server 2012 R2 Standard with 2 VMs per host, because at any one point host 1 runs at most 2 virtual machines.

For example, if host 1 is running 2 VMs and host 2 fails, the 2 VMs from host 2 will migrate to host 1. Will there be any errors?

Thanks




VM-VM Affinity Rule


Hi, quick question. Is it possible to configure, for Hyper-V clustering, a VMware-like VM-VM affinity rule to keep virtual machines together on the same host? I know it's possible to set up anti-affinity rules (the opposite of what I'm looking for), but I can't find anything about VM-VM affinity rules. My objective is that whenever VM1 is moved (live migration or failover), VM2 should always end up on the same host as VM1.

I should add, I don't have SCVMM, and my failover cluster is running on Server 2016 Datacenter nodes.

Multi-Home File Server Cluster


Is this a supported configuration? I will have a two-node file server cluster with network cards connected to different networks. The management IP will be the same. I was not able to find any documentation on this. Can someone shed some light on this situation?

Thank you

access cluster through Failover Cluster Manager


I am currently experiencing a problem accessing my cluster through Failover Cluster Manager. The error message received is: "Cluster *** not found. Check the spelling of the cluster name. Otherwise, there might be a problem with your network. Make sure the cluster nodes are turned on and connected to the network."

The cluster consists of 3 Hyper-V servers. I created the cluster inside SCVMM 2016. Previously I was able to access the cluster through FCM, but that was a while ago, more than a month. SCVMM shows the cluster operating normally, and it is usable. I can also access the cluster using PowerShell commands. I just cannot manage the cluster through FCM. Has anyone seen this problem and resolved it?



'(Get-Cluster).S2DBusTypes=0x100' for a non production storage space on a production cluster


I have an 8-node Hyper-V cluster with 3 × 1.8 TB unused drives in each node (the VMs are on a SAN), connected through a Dell PERC RAID controller (the hosts are Dell 730s). The cluster is running 2012 R2, but I'm in the process of upgrading to 2016.

Now I would like to utilize these disks for my cluster.

S2D doesn't support drives connected through a RAID controller; even though I can set the controller to pass-through, the drives still report as bus type RAID in Windows.

You can override that readout with the command in the topic title, which is apparently unsupported but otherwise works just fine.

Real talk: I want to use this to have room for some additional(!) snapshots and VM backups. How much of a risk would you say this really is? I'm not going to put production VMs, or any VMs at all, on that storage. But it would present a nice opportunity to have room for more backups just in case, temporary space for migrating data, and so on.

Again, I'm not using this for production workloads, but it will be part of a production cluster. Under these circumstances, is it still a no-go?
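For anyone checking the same thing, the readout and the override in question look like this (the override is the unsupported part):

```powershell
# Confirm how the pass-through disks actually report; BusType "RAID" is what
# makes S2D reject them for pooling
Get-PhysicalDisk | Format-Table FriendlyName, BusType, CanPool

# The unsupported override from the topic title, applied cluster-wide:
# 0x100 is the bit for bus type RAID (1 shifted left by the BusType value 8)
(Get-Cluster).S2DBusTypes = 0x100
```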


Cluster service Error 1053 in Server 2012 R2


An error window pops up when trying to start the Cluster service with the domain admin user. It says: "Windows could not start the Cluster Service on Local Computer. Error 1053: The service did not respond to the start or control request in a timely fashion." Server 2012 R2 is installed on this PC; a capture is attached for reference. The cluster was created from Failover Cluster Manager per the procedures on the MS website.

Please note, we created this cluster in order to deploy Always On availability groups in SQL 2016, which are required to work with a video surveillance system.

What would be the cause of this issue? How to fix? Please advise.

thanks

John

Can an NLB host be a host in another NLB cluster?


Hello,

If I have an NLB cluster X that has hosts A & B, can I create another new NLB cluster Y (with a different TCP port of course) that has hosts B & C?

When I try to create the new NLB cluster Y on B, the "New Cluster: Connect" dialog box returns "The specified host has no interfaces available for installing a new cluster" when I connect to host B. Though each of my hosts (A, B, and C) has a single IP address, I have full control over the network IP addressing if needed to make this work.

Thanks,

Larry

VMs went into a stopping state during live migration


We have a 3-node Hyper-V cluster with around 50 VMs running on it. We initiated a live migration of one of the VMs from node 2 to node 1; it showed the VM had moved to node 1, but the status was still Live Migrating. Within a minute the VM resource failed, and after some time all of the VMs running on node 1 went into a stopping state. After a hard reboot of node 1, all the VMs failed over to other nodes and came online.

We have experienced this in different clusters over the past two months.



OS Name: Microsoft Windows Server 2016 Datacenter - System Model: PowerEdge R630

A component on the server did not respond in a timely fashion. This caused the cluster resource 'Virtual Machine CHNXXXXXPS01' (resource type 'Virtual Machine', DLL 'vmclusres.dll') to exceed its time-out threshold. As part of cluster health detection, recovery actions will be taken. The cluster will try to automatically recover by terminating and restarting the Resource Hosting Subsystem (RHS) process that is running this resource. Verify that the underlying infrastructure (such as storage, networking, or services) that are associated with the resource are functioning correctly.


Hyper-V stretched cluster with Storage Replica, unplanned failover


Hi,

We are using a Windows 2016 Hyper-V stretched cluster between two sites.

When we fail over a site, it fails over correctly. But when the site is restored, the failed-over CSVs do not fail back to their original site; we then have to move the cluster group manually.

Jiwan Sharma


