Channel: High Availability (Clustering) forum
Viewing all 3614 articles

s2d 4 nodes cluster

Dear all,

I'm facing a serious issue and I'm stuck.

My deployment consists of 4 Windows Server 2019 nodes deployed in an S2D cluster. Validation is successful.

All 4 nodes are up, with the same vSwitch created on all nodes via Switch Embedded Teaming with RDMA enabled.

The problem is with one node: every VM on this specific node is not accessible via RDP from client PCs, and the application fails.

If I move the VM to any of the other nodes, it works and can be accessed via RDP. Moreover, the host with the issue is itself accessible via RDP; it's just the guest VMs that are not. Ping is working, and there is no firewall. Everything else is working fine.

All users and servers are in the same subnet, and tracert works. I just don't know what the issue is.

Any help would be appreciated.

Best regards
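
One way to narrow this down, as a rough sketch: collect the virtual switch, SET team and jumbo-frame settings from every node and compare the problem node against the working ones. This assumes the Hyper-V PowerShell module is installed on each host and PowerShell remoting is enabled; the node names are hypothetical placeholders. It only gathers data for comparison and does not identify the cause by itself.

```powershell
# Hypothetical node names; replace with the real cluster nodes.
$nodes = 'NODE1', 'NODE2', 'NODE3', 'NODE4'

# Virtual switch settings that should match on every node.
Invoke-Command -ComputerName $nodes -ScriptBlock {
    Get-VMSwitch | Select-Object Name, SwitchType, EmbeddedTeamingEnabled, IovEnabled
} | Sort-Object PSComputerName |
    Format-Table PSComputerName, Name, SwitchType, EmbeddedTeamingEnabled, IovEnabled

# SET team mode and load-balancing algorithm on every node.
Invoke-Command -ComputerName $nodes -ScriptBlock {
    Get-VMSwitchTeam | Select-Object Name, TeamingMode, LoadBalancingAlgorithm
} | Sort-Object PSComputerName |
    Format-Table PSComputerName, Name, TeamingMode, LoadBalancingAlgorithm

# Jumbo packet setting of the adapters (a mismatch can let ping work while larger packets fail).
Invoke-Command -ComputerName $nodes -ScriptBlock {
    Get-NetAdapterAdvancedProperty -RegistryKeyword '*JumboPacket' -ErrorAction SilentlyContinue |
        Select-Object Name, DisplayName, DisplayValue
} | Sort-Object PSComputerName |
    Format-Table PSComputerName, Name, DisplayName, DisplayValue
```

Any row where the problem node differs from the other three is a reasonable place to start looking.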


Basic questions on failover clustering

We are running a 4-node file server cluster on Windows Server 2012.

It hosts 6 production roles, one for each of 6 file servers.

Question:

Say role FS6 has 6 storage volumes, and one volume goes offline.

With the logical operator AND defined across all the cluster storage volumes of file server FS6, the file server would go offline and the role FS6 would go into a stopped state, as shown in the figure below:

Questions:

1) If one volume goes offline and stops the role as in the figure above, what is expected of the remaining 5 volumes?
   Will they remain accessible while the role is in a stopped state and the volumes show as online?

2) A failover will be attempted as per the policies on the role and the resources.
   With the policies for the role and resources set as per the following screenshots:

   What is expected?
   - Will a failover be attempted immediately, or will the cluster first wait 15 minutes while trying to start the resources on the same node?
   - Will the resources move from the current owner node to a new node regardless of whether the volume comes online?
   - Will it attempt to fail over and start the resources, and stop trying after 3 failed attempts?

   - Will there be any disruption in accessing the available volumes during the failover?

I would appreciate precise answers for the scenario described above.

Thanks,
Shailesh
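
Since the screenshots are not reproduced here, the same policy values can be read from PowerShell. A minimal sketch, assuming the FailoverClusters module is available on a cluster node and that the role is named FS6 as in the post:

```powershell
Import-Module FailoverClusters

# 'FS6' is the role name used in the post; adjust as needed.
$role = Get-ClusterGroup -Name 'FS6'

# Role-level failover policy: maximum failures (FailoverThreshold)
# within the period in hours (FailoverPeriod).
$role | Format-List Name, State, OwnerNode, FailoverThreshold, FailoverPeriod, AutoFailbackType

# Resource-level restart policy for every resource in the role.
# RestartPeriod and RetryPeriodOnFailure are reported in milliseconds.
$role | Get-ClusterResource |
    Format-Table Name, ResourceType, State, RestartAction, RestartThreshold, RestartPeriod, RetryPeriodOnFailure -AutoSize

# Dependency expressions; this is where AND / OR between the disks shows up.
$role | Get-ClusterResource | Get-ClusterResourceDependency
```

Posting this output along with the question makes it easier to say how the restart and failover attempts will be sequenced.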

2012 R2 RPC Errors When attempting to add to cluster

Hi,

Fresh build of 2012 R2 on a host, all updates applied and Windows Firewall disabled completely. When attempting to add it to a cluster, or even to manage another Hyper-V host, I get RPC unavailable errors. If I run tnc IP -port 135, the port is listening, but any WMI queries result in the RPC unavailable error.

Can anyone help me :)
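
A successful test on port 135 only shows that the RPC endpoint mapper is reachable; WMI over DCOM then continues on a dynamically assigned port (49152-65535 by default on current Windows versions), which can be blocked separately. A rough sketch for separating the two, with a hypothetical target name:

```powershell
# Hypothetical target host; replace with the server that fails.
$target = 'HV-HOST02'

# 1. RPC endpoint mapper (the test already done in the post).
Test-NetConnection -ComputerName $target -Port 135

# 2. Dynamic port range configured on this machine.
netsh int ipv4 show dynamicport tcp

# 3. WMI over DCOM/RPC: needs port 135 plus a port from the dynamic range.
Get-WmiObject -Class Win32_OperatingSystem -ComputerName $target

# 4. The same query over WinRM (TCP 5985 by default), which does not use RPC.
Get-CimInstance -ClassName Win32_OperatingSystem -ComputerName $target
```

If step 4 works while step 3 fails, something between the hosts is probably filtering the dynamic RPC range rather than port 135.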

Windows Server 2019 error

I have a brand new server that was just installed, but the server reboots with Kernel-Power Event ID 41 and BugCheck 1001. Please help.
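
A small sketch for pulling the relevant events so the bug check code can be posted. It assumes the events are in the System log, which is where Kernel-Power 41 and BugCheck 1001 are normally written:

```powershell
# Last 20 occurrences of event IDs 41 and 1001 in the System log, oldest first.
Get-WinEvent -FilterHashtable @{ LogName = 'System'; Id = 41, 1001 } -MaxEvents 20 |
    Sort-Object TimeCreated |
    Format-List TimeCreated, Id, ProviderName, Message

# Minidump files, if the server is configured to write them.
Get-ChildItem -Path "$env:SystemRoot\Minidump" -ErrorAction SilentlyContinue
```

The message of event 1001 contains the bug check code and parameters, which is usually the first thing needed to go further.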

No communication on new "cluster only" network - how to troubleshoot?

Hi,

We've had a couple of Server 2016 vms working as File servers 1 and 2. To improve on best practices, I decided to add new vNICs to both VMs that were on a separate VLAN from the "cluster and client" traffic. The VLAN is wide open, with no gateway or DNS (it's only for cluster traffic), so I picked 192.168.11.1 for FS1 and 192.168.11.2 for FS2, both in 255.255.255.0.

The Windows Firewall rule for inbound and outbound UDP 1812 traffic is in place, but it's allow all, on any interface, within the network.

When I run cluster validation, I get:

Network interfaces FS1.ads.ssc.wisc.edu - Ethernet1 and FS2.ads.ssc.wisc.edu - Ethernet1 are on the same cluster network, yet address 192.168.11.2 is not reachable from 192.168.11.1 using UDP on port 3343.

I also notice that the nodes are unable to ping each other over this new network.

What sort of troubleshooting can I do to determine why communication isn't happening on the network?

I already ran netstat -rn and see (from FS1 in this case)

Destination      Netmask          Gateway    Interface       Metric
192.168.11.0     255.255.255.0    On-link    192.168.11.1    271

in the routing table, so I think that is right.

I've never done this before, so I'm not sure how to proceed with further testing. Our VM admin has set up two VMs quickly in vSphere that only use this new VLAN, and he confirmed they can communicate with each other. So the problem seems to be with FS1 and FS2.
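
A few checks that could be run on FS1 (and mirrored on FS2), as a sketch only. The two addresses are the ones from the post; everything else uses the built-in networking cmdlets. Note that Test-NetConnection only tests TCP, so it cannot confirm UDP 3343 directly.

```powershell
$localIp  = '192.168.11.1'   # FS1 address from the post
$remoteIp = '192.168.11.2'   # FS2 address from the post

# Is the address bound to the expected adapter, and is that adapter up?
Get-NetIPAddress -IPAddress $localIp |
    Format-List InterfaceAlias, IPAddress, PrefixLength, AddressState
Get-NetAdapter | Format-Table Name, Status, LinkSpeed, MacAddress, VlanID

# Ping from the cluster-only address so the new interface is used.
ping -S $localIp $remoteIp

# Did layer 2 resolve? An Unreachable or Incomplete entry points below IP
# (VLAN tagging, port group, wrong vNIC attached).
Get-NetNeighbor -IPAddress $remoteIp -ErrorAction SilentlyContinue

# Firewall rules that mention UDP 3343, and the profile applied to each network.
Get-NetFirewallPortFilter -Protocol UDP |
    Where-Object { $_.LocalPort -contains '3343' } |
    Get-NetFirewallRule |
    Format-Table DisplayName, Enabled, Direction, Action, Profile
Get-NetConnectionProfile
```

A network with no gateway often shows up as an unidentified network on the Public profile, so it is worth checking whether the rules in place actually apply to that profile.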

Change File Share Witness Location

Hi, 

Currently we have the FSW configured on a Windows Server 2008 server. As that server is reaching EOL, we are planning to migrate the witness from 2008 to 2016.

My question is: what is the procedure to change the FSW path?

1. Is downtime required?

2. Do we need to change the path on all three nodes?

3. Are there any prechecks?

Appreciate your assistance on this. 
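
For reference, a minimal sketch of how the witness is typically repointed from PowerShell. It assumes the cluster nodes run Windows Server 2012 R2 or later (where the -FileShareWitness parameter is available) and uses a hypothetical share path. The quorum setting is cluster-wide, so it is changed once rather than per node, and the cluster name object needs full control on the new share before the change.

```powershell
Import-Module FailoverClusters

# Current quorum configuration, for the record.
Get-ClusterQuorum | Format-List Cluster, QuorumResource, QuorumType

# Hypothetical new share on the 2016 server; replace with the real path.
$newWitness = '\\NEWFS01\ClusterWitness'

# Repoint the file share witness (run once, from any node).
Set-ClusterQuorum -FileShareWitness $newWitness

# Confirm the new witness resource is in place.
Get-ClusterQuorum | Format-List Cluster, QuorumResource, QuorumType
```

This is normally an online change, but it is sensible to make it while all nodes are up so that quorum does not depend on the witness at that moment.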

File Share Witness Path Changes

Hi 

We are planning to change the FSW path on an existing three-node cluster. Is downtime required? If not, what is the impact of changing the path while the cluster is running? Also, do we need to copy any old config files to the new share?

Appreciate your assistance

How can I add a 3rd host to my existing Hyper-V cluster?

Existing environment is hyper-v 2 node cluster.

I want to add my 3rd host to my cluster.

Please help with the step by step procedure.
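
The usual outline, as a hedged sketch: install the same roles on the new host, match its networking and storage connectivity to the existing nodes, validate, then add it. All host and cluster names below are hypothetical.

```powershell
# Hypothetical names; replace with the real hosts and cluster name.
$existingNodes = 'HV01', 'HV02'
$newNode       = 'HV03'
$clusterName   = 'HVCLUSTER'

# On the new host: install Hyper-V and Failover Clustering (a reboot is required).
Install-WindowsFeature -Name Hyper-V, Failover-Clustering -IncludeManagementTools -ComputerName $newNode

# From an existing node: validate all three hosts together and review the report.
Test-Cluster -Node ($existingNodes + $newNode)

# Add the new host to the cluster, then confirm membership.
Add-ClusterNode -Name $newNode -Cluster $clusterName
Get-ClusterNode -Cluster $clusterName
```

After adding a third node, it is also worth reviewing the quorum configuration, since the recommended witness setup can change with the node count.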



Windows Server 2016 - Failover Cluster failed

Hi, 

I have two Windows server 2016 VMs. Installed the failover cluster feature on both servers. Both servers were fully patched and could ping each other. However when I went to create a cluster on node A, it failed with an error:

https://imgur.com/a/M2KXipm

As soon as this error occurs, it instantly corrupts the network configuration on node B. I can ping from node B to A, but can't ping from node A to B. Something has gone horribly wrong. The issue I have is that these two VMs and the DC are hosted in Azure. The DC doesn't have DHCP installed; however, the create cluster wizard didn't give me the opportunity to assign a static IP to the cluster. Instead it states that it will obtain one via DHCP (which doesn't exist). I'm sure this is the root of the problem:

https://imgur.com/a/z2Vc8BI

The only thing I didn't do on the nodes was to enable WMI in Windows Firewall. Should I blow them away and start over, but with Windows Firewall disabled as a test, or can this situation be recovered?

Thanks,
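
Because the wizard does not offer a static address when the nodes themselves get their addresses via DHCP (as Azure VMs do), the cluster can be created from PowerShell with an explicit address instead. A minimal sketch; the cluster name, node names and IP address are hypothetical, and the address must be an unused one in the nodes' subnet:

```powershell
Import-Module FailoverClusters

# Hypothetical values; replace with real node names and a free IP in the same subnet.
$nodes     = 'NODEA', 'NODEB'
$clusterIp = '10.0.0.50'

# Create the cluster with an explicit static address and without claiming any disks.
New-Cluster -Name 'CLUSTER01' -Node $nodes -StaticAddress $clusterIp -NoStorage
```

Creating it this way avoids the cluster name picking up a duplicate of a node's own address, which matches the symptom of one node losing connectivity right after creation.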



FCM sluggish, some VMs changing state rapidly

I have a Hyper-V Failover Cluster where Failover Cluster Manager (FCM) is behaving very strangely.

The FCM GUI shows several VMs changing state rapidly (from Running -> Paused -> Resuming -> Running in a very rapid cycle). When I say rapidly, it's happening so fast that the right-click menus are flickering. FCM responsiveness is also sluggish.

These same VMs are shown as running fine and not changing state in Hyper-V Manager.

Does anyone have any suggestions as to what to look for? I'm assuming it's some communication issue.

Problem to create a windows failover cluster in Windows Server 2016

Hi everyone, I have two servers joined to a domain and I need to install a SQL Server failover cluster. Both servers already have the Failover Clustering feature installed, and the network cards are grouped with NIC Teaming using Teaming Mode "Switch Independent" and Load Balancing "Dynamic". The problem is that when I create the cluster using the wizard, I can add the first node without problems, but when adding the second node I get the following error: "The node cannot be contacted. Ensure that node is powered on and is connected to the network". Am I missing something? I need help with this, please.

Can two servers form a cluster for Hyper-V failover?

Hello,

I have already deployed two Windows Server 2016 Standard servers for Hyper-V and file sharing purposes. Do I need to deploy a third server, or install the failover clustering service on the DC?

Another question: I currently use iSCSI to map SAN storage. Should I deploy Hyper-V virtual Fibre Channel SANs?

I find it a little strange.

Hyper-V 2016 Cluster Crashing

Hello 

We have a three-node Windows Server 2016 cluster on Dell PowerEdge R640 servers with Dell SC3020 storage; the CSVs are connected using iSCSI.

1. We have a random issue where VMs sometimes stop responding on the network.
When we try to turn off or shut down such a VM, it goes into a Stopping-Critical state and we have to reboot that node to start the VM again.

2. Also, when we try to live migrate by draining roles, the migration fails and gets stuck at 80% to 84%, and we have to reboot that host.


Things we have tried:

A. Upgraded all Dell drivers, firmware and BIOS; all are at the latest versions now.

B. The cluster nodes have 50% RAM available.

C. Ran the Cluster Validation Wizard; all tests are green.

When I checked the event logs:

Event ID 1230 : A component on the server did not respond in a timely fashion. This caused the cluster resource 'Virtual Machine resource type 'Virtual Machine', DLL 'vmclusres.dll') to exceed its time-out threshold. As part of cluster health detection, recovery actions will be taken. The cluster will try to automatically recover by terminating and restarting the Resource Hosting Subsystem (RHS) process that is running this resource. Verify that the underlying infrastructure (such as storage, networking, or services) that are associated with the resource are functioning correctly.

Event ID : 5157 : Cluster Shared Volume 'Volume1' ('_CSV_01') has entered a paused state because of 'STATUS_CONNECTION_DISCONNECTED(c000020c)'. All I/O will temporarily be queued until a path to the volume is reestablished. This error is usually caused by an infrastructure failure. For example, losing connectivity to storage or the node owning the Cluster Shared Volume being removed from active cluster membership.

Event ID : 5120 :

Cluster Shared Volume 'Volume1' ('_CSV_01') has entered a paused state because of 'STATUS_VOLUME_DISMOUNTED(c000026e)'. All I/O will temporarily be queued until a path to the volume is reestablished.

Event ID:1069 :

Cluster resource 'Virtual Machine Configuration PCVM01' of type 'Virtual Machine Configuration' in clustered role 'PCVM01' failed. The error code was '0x2' ('The system cannot find the file specified.').

Any help would be much appreciated.

Thanks

Prakash



Thanks, Prakash. Please note: my posts are provided "AS IS" without warranty of any kind, either expressed or implied.
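
The events quoted above all point at the path between the nodes and the CSV, so a first step could be to correlate CSV state, the cluster events and the iSCSI sessions on each node. A sketch under the assumption that the events are in the System log and that the Microsoft iSCSI initiator is in use; the output folder is a hypothetical path:

```powershell
Import-Module FailoverClusters

# Per-node view of each CSV: direct I/O vs. redirected I/O, and why.
Get-ClusterSharedVolumeState |
    Format-Table Name, Node, StateInfo, FileSystemRedirectedIOReason, BlockRedirectedIOReason -AutoSize

# The event IDs from the post over the last 7 days, oldest first.
Get-WinEvent -FilterHashtable @{
    LogName   = 'System'
    Id        = 1069, 1230, 5120, 5157
    StartTime = (Get-Date).AddDays(-7)
} -ErrorAction SilentlyContinue |
    Sort-Object TimeCreated |
    Format-Table TimeCreated, Id, ProviderName, MachineName -AutoSize

# iSCSI session health on this node (run on every node).
Get-IscsiSession | Format-Table TargetNodeAddress, IsConnected, IsPersistent, NumberOfConnections -AutoSize

# Cluster log for the last 2 hours from every node, written to a hypothetical folder.
$logFolder = 'C:\Temp\ClusterLogs'
New-Item -Path $logFolder -ItemType Directory -Force | Out-Null
Get-ClusterLog -Destination $logFolder -TimeSpan 120 -UseLocalTime
```

If the 5120 events line up in time with dropped iSCSI sessions on one node, that narrows the problem to the storage network of that node rather than Hyper-V itself.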

On 2-Node Windows 2012 R2 Cluster w/ raid disks for shared storage, after adding a resource to a new File Server Role, Add File Share to Role results in an exception.

BLUF: Receiving the following exception when Add File Share is performed on an established 2-node cluster with shared storage.

Log Name: Microsoft-Windows-FileServices-ServerManager-EventProvider/Operational
EventID: 0; Version: 0; Level: 2;Task: 1; Opcode: 0, Keywords: 0x2000000000000001
Exception: Caught exception Microsoft.Management.Infrastructure.CimException: The xsi:type attribute (p1:MSCluster_Property_Group_PrivateProperties) does not identify an existing class.

   at Microsoft.Management.Infrastructure.Internal.Operations.CimSyncEnumeratorBase`1.MoveNext()
   at Microsoft.FileServer.Management.Plugin.Services.FSCimSession.EnumerateAssociatedInstances(String cimNamespace, ICimInstance sourceInstance, String associationClassName, String targetClassName)
   at Microsoft.FileServer.Management.Plugin.Services.ClusterAssociationService.GetResourceGroup(ICimSession session, ICimInstance resource)

First, I want to state that the configuration has been successfully fielded several times quite recently. Second, the domain in which the configuration is deployed is strictly controlled (Latest patches, strict GPO's, high UAC enabled, etc.).  With that said, I am trying to figure out why this particular installation is receiving an exception when adding file shares to the configured shared drives.
High level steps are:

On both nodes Install Dell PowerVault MD Storage Manager Software
Use Disk Manager to online, rename and organize three disk volumes
Create the Cluster with two servers as member nodes
Establish Raid Disks in the Cluster (select disks, set names and Quorum configuration)
Create the File Server Role (establish the Client Access Point and select RAID drives for storage)
At this point Add File Share works...
Add resource (Oracle Fail Safe adds standalone DB to the group) which is successful.
At this point Add File Share fails: after selecting SMB Share - Quick and Next, the Share Location page shows the server but no longer shows any volumes from the RAID, and "Type a custom path:" is selected but always fails.

Explanations of the exception and what might require fixing appreciated.

Cluster Manager

hello,

I would like to know what security level is required to run Failover Cluster Manager. I was asked this question, and I thought you have to be at least a Domain Admin to run it, but I want to make sure.

Thanks in advance.

John
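
As far as I know, Domain Admin is not required: managing a cluster generally needs local Administrators membership on the nodes or access granted on the cluster itself. A small sketch of how cluster access can be inspected and delegated; the account names are hypothetical:

```powershell
Import-Module FailoverClusters

# Who currently has access to the cluster?
Get-ClusterAccess

# Hypothetical accounts: read-only access for an operator, full control for an admin.
Grant-ClusterAccess -User 'CONTOSO\jdoe' -ReadOnly
Grant-ClusterAccess -User 'CONTOSO\clusteradmin' -Full
```

Creating a new cluster is a separate case, since that also needs rights to create the cluster computer object in Active Directory.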


HV replication to Azure - VM guest with shared vhds Set

What a mess it is. MS's own technology (shared disks in 2016/2019, VHD Sets) is simply not supported.

That is the only bit that stops me from using Azure for DR. And I do not fancy breaking my clusters (file & SQL).

Unless anybody has a better idea (than just a single VM)?

Seb


Technical Q's on S2D, Storage Replica and Hyper-V

So after reading a ton of Docs and watching a few videos, I'm still very unsure whether we should move to HyperConverged platform.

Currently we have a Hyper-V cluster that spans 2 sites, and it all sits on the old HP LeftHand solutions P4500s. Our VMs can swim around from site to site, and if one site fails everything can live happily in the other until the problem is resolved.

Now it's time to renew the infrastructure, and we're taking the opportunity to re-look at how we do things, and of course management is looking to keep costs down.

So obviously Azure Stack HCI is something that has come up.

My concerns are as follows, and I'm hoping someone will be able to fill in the blanks and correct anything I've got wrong...

With S2D the best fault tolerance you can currently have is 2 nodes. So with a HA cluster stretching 2 sites, the maximum number of nodes you can have is 4 nodes, 2 in each site, because as soon as you move to 6+ nodes and one site fails you will be over the 2 node limit and the storage will fail. So in this situation 4 nodes is the limit, which would be very tight for us.

We can add in Storage Replica, and use cluster-to-cluster replication. In this case we can have an S2D cluster in Site A, and replicate to another S2D cluster in Site B. However, there's no automatic fail-over, you have to manually bring everything up, and we can't have that.

You can have Storage Replica with a stretch cluster that does support automatic fail-over, but only with Storage Spaces, losing all the goodness of S2D.

The best of both worlds would be to have Storage Replica, stretch cluster, and S2D volumes, but this is not supported

Also, there appears to be a big question surrounding log placement for Storage Replica, as the log needs to be faster than the data disks. So in effect you can end up limiting the benefit of the cache in S2D, to ensure it's not as fast as the volume for the log.

Plus, it starts as a low-cost solution, but as soon as you start building it up the costs go up a lot.

I really like the look of S2D, it looks great. I just can't see how it will work for our environment. We're also looking at the HP Nimbles.

Andrew


Andrew France


Disabling Powershell 2.0 on Win2012R2 failover cluster

Hello,

For security reasons we are disabling PowerShell 2.0 on several servers.

Does anyone know if the failover cluster services depend on PowerShell 2.0 in order to work? Can it be disabled?

thanks


Cristian L Ruiz

cluster

HI,

We have set up a cluster for SQL AG on two of our Server 2012 R2 VMs, but today I see this error on Node2.

Can someone tell me what this error means?




Shahin

File Share Cluster for UPD

Hello everyone,

So I've been stuck for days on a problem. I have an RDS farm and a single-node file share for UPD.

I want to set up a file share cluster to provide high availability for the UPD profiles.

So I started creating the cluster on Azure.

Each node has 2 HDDs for cluster data. I have run Enable-ClusterS2D and created the disk in CSVFS_REFS format, and everything was fine up to that point. Then I installed the Scale-Out File Server role so the UPD share will always be available.

I configured a load balancer to point to the file share role IP. I can now connect to the file share from the RDCB, but when I try to add the shared path for the user profile disk I get this error.

I have set static ports for RPC in the registry.

# Restrict RPC dynamic ports to a static range
New-Item -Path "HKLM:\SOFTWARE\Microsoft\Rpc\Internet"
New-ItemProperty -Path "HKLM:\SOFTWARE\Microsoft\Rpc\Internet" -Name "Ports" -Value "50001-51024" -PropertyType MultiString -Force
New-ItemProperty -Path "HKLM:\SOFTWARE\Microsoft\Rpc\Internet" -Name "PortsInternetAvailable" -Value "Y" -PropertyType String
New-ItemProperty -Path "HKLM:\SOFTWARE\Microsoft\Rpc\Internet" -Name "UseInternetPorts" -Value "Y" -PropertyType String

Do I need to configure anything on the load balancer?

Sorry if I didn't explain it very well, as I'm new to these things.


