Channel: High Availability (Clustering) forum

Definitive answer needed please - Validating Clustered Storage??


Hi,

I am struggling to find a definitive answer as to how I can safely validate a Windows Server 2012 Hyper-V failover cluster. The cluster passes all non-storage validation tests, but I need to run a full validation, including storage.

I am led to believe, via this blog post, that if I present a scratch LUN from my storage to the cluster, add it as a clustered disk and publish it to CSV, I can perform a FULL validation by selecting only the newly presented "test" disk, which is not supporting any clustered workloads.

When I step through the validation wizard, however, the below screen makes me apprehensive...

The ..."select ADDITIONAL storage..." bit is a little confusing given that ALL of my storage is listed below. The Warning at the bottom regarding stopping all roles using the CSV also makes me want to back away slowly. It's just not clear whether at this point I am selecting a single disk to run tests on.

So, the question is: can I SAFELY run a full validation test with the lights on by using this method?
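
For reference, validation can also be scoped from PowerShell, which makes the disk selection explicit. A minimal sketch, assuming the scratch LUN was added as "Cluster Disk 9" (a placeholder name):

# Run only the storage tests, and only against the named scratch disk
Test-Cluster -Include "Storage" -Disk "Cluster Disk 9" -ReportName C:\Temp\StorageValidation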

Thanks in advance.

Ben


Hyper-V Cluster 2012 R2 - I can't see VMs which are Offline using Get-VM after restarting the cluster


Hi

I use Microsoft Hyper-V Server 2012 R2 (so without a GUI) on both nodes in the cluster.

I've got a problem managing VMs after shutting down the Hyper-V cluster (all nodes down) and turning it back on.

When I turn on all nodes, log on to one of them and run the command "Get-VM", I can't see any VMs.

Using the Cluster Manager MMC on Windows 8.1 I see all the VMs; they are Offline. So using PowerShell on a cluster node I can't see these machines, not even when I turn them on. I have to use the MMC console on Windows 8.1.

Why?

When I turn the VMs on using the Cluster Manager MMC and then turn them off, I can see all the VMs using the "Get-VM" command.

They are offline, as they should be.

What is the problem with the Get-VM command?

Or how can I list all VMs after turning the Hyper-V cluster off and then on again?
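
For reference, a sketch of listing clustered VMs via the cluster cmdlets instead, since Get-VM only shows VMs registered with the local host's VMM service (cluster name is a placeholder):

# List every clustered VM role, whichever node owns it, with its state
Get-ClusterGroup -Cluster MyCluster |
    Where-Object { $_.GroupType -eq 'VirtualMachine' } |
    Select-Object Name, OwnerNode, State

# Or ask Hyper-V on every node explicitly
Get-ClusterNode -Cluster MyCluster | ForEach-Object { Get-VM -ComputerName $_.Name }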


Kind Regards Tomasz

VMs on the cluster are not booting


Dear all,

I have a 2-node cluster hosting 8 VMs, connected to a QNAP TS-421U storage unit.

I don't know what happened, but I got a call telling me that all VMs were down. I connected and it was true: all the VMs were off.

I restarted the 2 nodes and tried to bring the VMs up again, but they failed; the OSes are corrupted and they don't boot anymore.

Can someone help me fix this?

1. The iSCSI initiators only connect after many failures/retries;

2. On the server side, node 1 shows connectivity issues but displays all the

3. On the storage side, 1 of the 4 disks is faulting; it's configured as RAID 5 with 4x2TB.

regards,

nelson chamba

Cluster Shared Volume X has entered a paused state because of 'c000009d'


We have two Hyper-V hosts (HVHOST1 and HVHOST2), both running Server 2012 R2. These servers communicate via iSCSI over a Dell switch to an MD3200i. All the VMs live on a single CSV virtual disk.

At first we thought the cluster was crashing just at points of high I/O, like during a backup. Then it crashed when I/O was very low. At each crash we see this error message: 'Cluster Shared Volume X has entered a paused state because of 'c000009d'. All I/O will temporarily be queued until a path to the volume is reestablished.' We lose connection to the CSV. Rebooting each HVHOST reestablishes the connection.

We see the below event occurring on both HV host servers. During heavy I/O the frequency of this message increases; during low I/O it is less frequent. 'Dell MD Series Device Specific Module for Multi-Path failed to return a path to \Device\MPIODisk1.' This MPIO Disk 1 is the CSV disk.

Windows Failover Cluster Networks

HMI - cluster and client traffic - LACP, 2Gb throughput - connections to a switch

RMC - cluster-only traffic - Dynamic, 1Gb throughput - point-to-point connection between HVHOST1 and HVHOST2

SAN/iSCSI - cluster traffic: none

Items already checked (on both hosts):

  • Updates installed.  https://support.microsoft.com/en-us/kb/2920151
  • No AV
  • Windows Firewall turned off
  • Network drivers failed to update on HVHOST2. Working with Dell to resolve this.
  • BIOS is outdated

I realize the outdated BIOS and the outdated drivers on HVHOST2 may be an issue. However, should either of these cause a loss of communication with the CSV?

Any other ideas?
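
In the meantime, a quick way to watch the MPIO path state from each host while the errors occur (mpclaim ships with the MPIO feature; the disk number below is an assumption):

# Show all MPIO disks and their load-balance policies
mpclaim -s -d
# Show the individual path states for MPIO Disk 1 (the CSV disk here)
mpclaim -s -d 1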

Thanks in advance.

Storage Spaces turn into Clustered Storage Spaces when creating a WFOC, but I don't want them to


Hi everybody and thank you in advance.

Environment:

  • I have 2 separate servers (physical) each running Windows Server 2012 R2.
  • Each one has its own local storage attached via DAS using HBAs (SATA SSD drives). None of the disks replicate to the other host, so they are completely separate. I used Storage Spaces to create a storage pool on each server, so the volumes are the sizes I want.
  • I enabled MPIO for iSCSI devices only
  • I created a Windows failover cluster with no special configuration.

Issue:

The cluster took the storage spaces and made them clustered storage spaces. In Server Manager the storage pools show that they are available to the cluster, but managed by both servers (both are listed under each one). The pools are not replicated, so of course this presents a problem.

When running validation tests, the cluster takes each node offline and detaches each of the virtual disks during the test. This is where the problem occurs: it is detaching virtual disks that are not replicated, so programs installed on these storage spaces crash instantly.

Question:

How do I stop the storage spaces from being clustered? I have searched the web, searched each cmdlet matching *cluster* and *disk*, removed the cluster and re-added it, and created it on completely separate hosts, which produces the same result. I tried creating the storage pools after the cluster had been created, but the primordial storage is owned by the cluster even though none of the disk IDs match up across the 2 hosts, so I am lost and frustrated beyond belief. Any advice would truly be appreciated. Searching here and Google and everywhere returns "How to... storage spaces" articles, but I'm not looking for how to create storage spaces. Thanks again. Here is a screenshot of how it detaches the virtual disks while running the validation tests.
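
For what it's worth, a sketch of the direction I mean, assuming the pool's cluster resource is named 'Cluster Pool 1' (a placeholder, and with no guarantee the pool comes back writable without the last line):

# Find the pool resources the cluster has claimed
Get-ClusterResource | Where-Object { $_.ResourceType -eq 'Storage Pool' }

# Drop the cluster's claim on the pool so it can be managed locally again
Remove-ClusterResource -Name 'Cluster Pool 1'

# De-clustered pools may come back read-only; clear that flag
Get-StoragePool -FriendlyName 'Pool1' | Set-StoragePool -IsReadOnly $false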

Unable to create AG Listener with public load balancer


Hi everyone, 

I'm following this link to set up a public load balancer with an AG Listener.

https://msdn.microsoft.com/en-us/library/azure/dn425027.aspx

However, when I run this script, I'm getting an error. It used to work previously; I'm not sure why it's not working now.

Set-ClusterParameter : Unable to save property changes for 'IPResourceName'.
    The parameter is incorrect

# Define variables
$ClusterNetworkName = "<MyClusterNetworkName>" # the cluster network name (use Get-ClusterNetwork on Windows Server 2012 or higher to find the name)
$IPResourceName = "<IPResourceName>" # the IP Address resource name
$ILBIP = "<X.X.X.X>" # the IP Address of the Internal Load Balancer (ILB)

Import-Module FailoverClusters

Get-ClusterResource $IPResourceName | Set-ClusterParameter -Multiple @{"Address"="$ILBIP";"ProbePort"="59999";"SubnetMask"="255.255.255.255";"Network"="$ClusterNetworkName";"OverrideAddressMatch"=1;"EnableDhcp"=0}
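
A first sanity check, since "The parameter is incorrect" can simply mean one of the placeholder names doesn't match a real resource (a sketch):

# Confirm the IP Address resource and cluster network names before re-running
Get-ClusterResource | Where-Object { $_.ResourceType -eq 'IP Address' } | Select-Object Name, OwnerGroup, State
Get-ClusterNetwork | Select-Object Name, Address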

More cluster NIC configurations...


Below are the NIC configurations for HV & SOFS nodes.

Hyper-V Cluster Nodes

2 x Production - allow cluster comms, with client access

4 x SMB - do not allow cluster comms

2 x Heartbeat/Cluster/Live Migration - allow cluster comms, do not allow client access

SOFS Cluster Nodes

2 x Production - allow cluster comms, with client access

4 x SMB - do not allow cluster comms

2 x Heartbeat/Cluster/Redirected I/O - allow cluster comms, do not allow client access

Question 1

Should an issue occur with the SAS cables connecting the SOFS nodes to the JBOD, redirected storage I/O traffic will flow across any network configured to allow cluster communication. Is there a way to restrict the potential redirected I/O traffic to the Heartbeat/Cluster/Redirected I/O NICs, to eliminate any traffic being transmitted over the production NICs? Any NIC eligible for cluster comms is technically available for any of the 3 types of cluster comms, CSV redirected traffic included!
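
For what it's worth, CSV/redirected traffic prefers the enabled cluster network with the lowest metric, so the preference (though not a hard restriction) can be pinned. A sketch, with the network name as a placeholder:

# Inspect current metrics (lower = preferred for cluster/CSV traffic)
Get-ClusterNetwork | Select-Object Name, Metric, AutoMetric, Role

# Make the redirected-I/O network the cheapest (this disables AutoMetric on it)
(Get-ClusterNetwork -Name 'Heartbeat-RedirectedIO').Metric = 100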

Question 2

When referring to an Aidan Finn blog post regarding SMB Multichannel (http://www.aidanfinn.com/2012/11/enabling-smb-multichannel-to-scale-out-file-server-cluster/), he states that the storage network should be configured to allow cluster comms and to allow client access, for the HV nodes to talk to the SOFS nodes. But in a 6hr clustering video with Symon Perriman and Elden Christensen, they state that by no means should cluster comms traffic be enabled for the storage network. Who is correct?

Question 3 

In a cluster-aware updating scenario, when an HV node is drained of VMs in preparation for patching/reboots, are VM states copied to the other HV node via the network enabled for live migration? Asking as I discovered today that in a 2-node cluster, no matter how many NICs are used for live migration, only one will be utilised. I'm concerned our 1Gbps NIC will cause VM states to be copied to the other node very slowly during a node patching session.
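
To see which networks the drained VM states would actually traverse, the Hyper-V side can be inspected per host; a sketch:

# Which networks this host will use for incoming/outgoing live migrations
Get-VMMigrationNetwork

# Host-level live migration settings, including the simultaneous-migration cap
Get-VMHost | Select-Object VirtualMachineMigrationEnabled, MaximumVirtualMachineMigrations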

10GbE NICs would be lovely, but we simply don't have the budget.

Any suggestions welcomed. 

Many thanks. 

Script running against Management node instead of SQL Resource name


We have a 2-node W2K8 R2 cluster running a SQL Server 2008 R2 cluster. The SCOM 2012 R2 agent is installed on the nodes.

The SCOM DiscoverSQL2008DBEngineDiscovery.vbs script runs every 4 hours. The problem is that the instance and DB are not discovered. After hours of troubleshooting I found the source of the problem: the script is running against the management node instead of the SQL resource name (SQL network name). I didn't find reports of issues with the discovery script on the net, but I don't know if this is a cluster configuration issue.

Thanks


Programmatically get Cluster Resource dependencies (WMI)


BLUF: I need to programmatically locate the dependencies of a cluster resource, specifically the "AND/OR" criterion.

Hi. I am familiar with the WMI class "MSCluster_ResourceToDependentResource", which will tell me the dependencies of a cluster resource. However, the Failover Cluster Manager GUI shows an additional property, "AND/OR", on the Dependencies tab. This property is not available in the "MSCluster_ResourceToDependentResource" class.

Is anyone aware of where this additional info is stored?
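
In case it points anyone in the right direction: the AND/OR logic isn't a property of the dependency rows themselves; it lives in the resource's dependency expression, which the FailoverClusters PowerShell module exposes directly (the resource name below is a placeholder):

# Returns an expression such as ([IP Address 10.0.0.1] or [IP Address 10.0.0.2])
Get-ClusterResourceDependency -Resource 'Cluster Name' |
    Select-Object -ExpandProperty DependencyExpression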

New Hyper-V Infrastructure Advice


Hi All!

I am a (mechanical) engineer with a small contract engineering firm who has, unfortunately, become the de facto CTO for our merry band. We have 10 engineers in house and 3-5 outside. This is likely to be a long post, so I'll thank you in advance for muddling through it...

OK, a little history about me and the company so you all have some perspective. I joined the company 3 years ago when there were only four of us and we were working off of 4 workstations with USB external drives mapped as network shares. We got a real backup when we were scared, or every couple of weeks, whichever came first. Realizing that this was not a very durable condition, I built our first server. This is an Essentials 2012 box with a 4-disk SATA hardware RAID 5 (on the motherboard controller) and a 3 TB internal disk as backup. Off-site backup is accomplished by rotating USB disks. It's a pretty capable little machine with a 6-core E5 Xeon, 32GB RAM, and redundant power supplies, and it is UPS protected. This worked great for a couple of years and 6 more users, and we never taxed the server at all. Typically under 10% CPU and 25% memory. Then we got offsite employees... all over the world. So, I configured Anywhere Access with a VPN and RemoteApp for a timekeeping database and away we went. Except that now my little server needs to be up 24/7 and I can hardly schedule a shutdown without affecting production, and if something really fails, we're &@#$ed. End of the history segment.

The current server handles the following: 1) Domain controller, AD Services 2) Anywhere Access 3) SQL Express Server 4) Autodesk Vault Server 5) File server.

So, what I am thinking is we need to build another matching server, install full 2012 R2, configure it as a Hyper-V host, and deploy the aforementioned workloads as VMs. Then take the original server down, install full 2012 R2, configure it as a Hyper-V host, and add it back into the environment. This is where I get lost. From what I have read, building a failover cluster requires a SAN. This could be an iSCSI box on the cheap side, but I don't see how to create a redundant iSCSI installation... I could also go direct-attached SAS with a dual-expander JBOD and clustered heads running full 2012 R2, but from what I understand, I would need THREE JBODs to be able to take one down for maintenance, or I'm back to a single point of failure at the JBOD.

I don't need to support hundreds of users; I have a hard time seeing us ever break fifty. I don't need extreme performance, but an improvement will help the boss feel good about spending the money. I need to be able to take any machine offline for maintenance without bringing the whole infrastructure down. I have a budget of around $6K. It will cost approx. $1600 to duplicate the original server, and I have 4 2012 R2 licenses available but no CALs. Figuring $1200 for CALs, that means I have about $3K for new hardware. BTW, we are 100% 1GbE network, and have just one 16-port switch.

What are my options? Do I need to go the cluster route, or is there a cheaper way to do this? Can you do redundant iSCSI? Where does Hyper-V Replica fit in? Storage Spaces seems to play a role here too, but seems super picky (read: super expensive) about SAS hardware. Holy S**t Batman, this is complicated.

Migrating VMs to a new cluster


Dear all,

I have a 3-node failover cluster running on Windows 2008 R2. Now I've got 2 powerful servers on which I can run Windows Server 2012 R2. How can I migrate my 25 VMs running on the 2008 R2 cluster to a new one running on Windows 2012 R2?

Can someone help me, or give me the fastest and safest way to do it?
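
One commonly suggested path, since 2008 R2 can't live-migrate to 2012 R2: shut each VM down, copy its VHDs to the new cluster's storage, re-create (or import) it on 2012 R2, and make it highly available; the Copy Cluster Roles wizard is the other usual route. A minimal sketch with every name and path a placeholder:

# On a 2012 R2 node, after copying VM01's disk to the new CSV
New-VM -Name 'VM01' -MemoryStartupBytes 4GB -VHDPath 'C:\ClusterStorage\Volume1\VM01\disk0.vhd'

# Make the VM highly available in the new cluster
Add-ClusterVirtualMachineRole -VMName 'VM01'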

Regards,

nelson chamba


What services and roles can be used in an Active/Active Windows cluster?

Hi, I am looking to set up a failover cluster with load balancing for a web service or similar. Can someone tell me what services and roles I can use with clustering in an Active/Active configuration?

pwnkmr

Cluster Config Question - For Noobie!


Team -

First, thank you for your help.  Sorry if this is a simple problem but I am stumped and Google has me running in circles.

My goal: set up a high-availability cluster, using a WD My Book World as disk witness.

Configuration:

2 identical servers (Dell R710) (RAID 50)
1 or 2 WD 1TB (Model WD100000H1NC) My Book World Edition NAS

Cluster purpose: service failover; storage isn't the main concern.

It was my understanding that I could make an HA cluster with the two servers using the WD My Book World as witness. However, I can't get validation to approve the cluster. I have set up virtual drives on the two servers, but validation tells me they are not valid for use in a cluster.

I even set up 5 virtual drives on server 1, and it still won't approve (it says it passed, but then the drive description tells me the drive is not suitable for a cluster).

What am I missing? 
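
One possibility worth checking: a NAS share generally cannot act as a disk witness (that requires shared block storage both nodes can see), but it can serve as a file share witness. A sketch, with the share path as a placeholder:

# Point the quorum at an SMB share on the My Book World instead of a disk
Set-ClusterQuorum -NodeAndFileShareMajority '\\WORLDBOOK\Witness'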

Dewayne

Most common services and roles used with Windows Cluster


Can someone help me understand which roles and services are commonly clustered in corporate environments?

Thanks 


pwnkmr

2-node Windows Server 2012 R2 cluster building error


Hi,

When trying to build the cluster I start the process with validation. The validation process returns everything OK (some network warnings, but no errors).

I turned off the firewall on both nodes.

When I try to build the 2-node cluster (for which validation is OK) I get this error message:

An error occurred while creating the cluster.

An error occurred creating cluster 'msCluster'.

This operation returned because the timeout period expired.


Here is the whole log file:

https://onedrive.live.com/redir?resid=F041C0A3DDD47C1F!107&authkey=!ALJzueb4mmyk6Ss&ithint=file%2clog

Here are some log lines where I found errors:

00001014.000011f0::2015/07/09-09:35:16.713 ERR   Exception in the InstallState is fatal (status = 1387)
00001014.0000134c::2015/07/09-09:35:16.713 DBG   [NETFTAPI] Signaled NetftRemoteUnreachable event, local address 10.1.1.1:3343 remote address 10.1.1.2:3343
00001014.0000134c::2015/07/09-09:35:16.713 DBG   [NETFTEVM] FTI NetFT event handler got event: Remote endpoint 10.1.1.2:~3343~ unreachable from 10.1.1.1:~3343~
00001014.000011f0::2015/07/09-09:35:16.713 INFO  [NETFT] Cluster Service preterminate succeeded.
00001014.0000134c::2015/07/09-09:35:16.713 DBG   [NETFTEVM] TM NetFT event handler got event: Remote endpoint 10.1.1.2:~3343~ unreachable from 10.1.1.1:~3343~
00001014.000011f0::2015/07/09-09:35:16.713 DBG   [NETFT] Disabling heartbeats
00001014.0000134c::2015/07/09-09:35:16.713 DBG   [WM] Filtering event NETFT_REMOTE_UNREACHABLE? 1
00001014.0000119c::2015/07/09-09:35:16.713 DBG   [NETFTEVM] FTI NetFT event dispatcher pushing event: Remote endpoint 10.1.1.2:~3343~ unreachable from 10.1.1.1:~3343~
00001014.0000162c::2015/07/09-09:35:16.713 DBG   [NETFTEVM] TM NetFT event dispatcher pushing event: Remote endpoint 10.1.1.2:~3343~ unreachable from 10.1.1.1:~3343~
00001014.0000162c::2015/07/09-09:35:16.713 INFO  [IM] got event: Remote endpoint 10.1.1.2:~3343~ unreachable from 10.1.1.1:~3343~
00001014.0000162c::2015/07/09-09:35:16.713 INFO  [IM] Marking Route from 10.1.1.1:~3343~ to 10.1.1.2:~3343~ as down
00001014.000011f0::2015/07/09-09:35:16.713 DBG   [NETFT] Enabling heartbeats
00001014.0000162c::2015/07/09-09:35:16.713 INFO  [NDP] Checking to see if all routes for route (virtual) local fe80::1047:79ac:baa5:8aa8:~0~ to remote fe80::2859:99c1:8eb2:875d:~0~ are down
00001014.0000162c::2015/07/09-09:35:16.713 INFO  [NDP] All routes for route (virtual) local fe80::1047:79ac:baa5:8aa8:~0~ to remote fe80::2859:99c1:8eb2:875d:~0~ are down
00001014.0000162c::2015/07/09-09:35:16.713 DBG   [IM] Ignoring event because adapter not found
00001014.0000162c::2015/07/09-09:35:16.713 INFO  [IM] Not sending connectivity report for probe routes, unknown adapters, and disconnected adapters
00001014.000013a0::2015/07/09-09:35:16.713 INFO  [CORE] Node 2: executing node 1 failed handlers on a dedicated thread
00001014.000013a0::2015/07/09-09:35:16.713 INFO  [NODE] Node 2: Cleaning up connections for n1.
00001014.000013a0::2015/07/09-09:35:16.713 DBG   [NODE] Node 2: Cancelling send queue with n1.
00001014.000013a0::2015/07/09-09:35:16.713 INFO  [MQ-NODE2] Clearing 0 unsent and 0 unacknowledged messages.
00001014.000013a0::2015/07/09-09:35:16.713 INFO  [NODE] Node 2: n1 node object is closing its connections
00001014.000013a0::2015/07/09-09:35:16.713 INFO  [NODE] Node 2: closing n1 node object channels
00001014.000013a0::2015/07/09-09:35:16.713 DBG   [NODE] Node 2: Closing 1 pullers for n1
00001014.000013a0::2015/07/09-09:35:16.713 DBG   [CHANNEL fe80::2859:99c1:8eb2:875d%25:~63594~] Close().
00001014.000013a0::2015/07/09-09:35:16.713 DBG   [NODE] Node 2: Scheduling async pullers join
00001014.0000085c::2015/07/09-09:35:16.713 DBG   [CHANNEL fe80::2859:99c1:8eb2:875d%25:~63594~]/recv: Socket was closed between initiating IO and getting result.
00001014.000013a0::2015/07/09-09:35:16.713 DBG   [NODE] Node 2: Clearing cookie 5c650178-3c8f-448e-9891-6116d4ea61fb for n1
00001014.000011f0::2015/07/09-09:35:16.713 DBG   Exception in the InstallState is fatal: set netft heartbeat interval to 900 seconds
00001014.000013a0::2015/07/09-09:35:16.713 INFO  [CORE] Node 2: Clearing cookie 5c650178-3c8f-448e-9891-6116d4ea61fb
00001014.000011f0::2015/07/09-09:35:16.713 ERR   Exception in the InstallState is fatal (status = 1387), executing OnStop
00001014.000011f0::2015/07/09-09:35:16.713 INFO  [DM]: Shutting down, so unloading the cluster database.
00001014.0000085c::2015/07/09-09:35:16.713 DBG   [CHANNEL fe80::2859:99c1:8eb2:875d%25:~63594~] Not closing handle because it is invalid.
00001014.000011f0::2015/07/09-09:35:16.713 INFO  [DM] Shutting down, so unloading the cluster database (waitForLock: false).
00001014.0000085c::2015/07/09-09:35:16.713 INFO  [CHANNEL fe80::2859:99c1:8eb2:875d%25:~63594~] graceful close, status (of previous failure, may not indicate problem) (0)
00001014.000013a0::2015/07/09-09:35:16.713 INFO  [StreamDb] Cleaning all routes for route (virtual) local fe80::1047:79ac:baa5:8aa8:~0~ to remote fe80::2859:99c1:8eb2:875d:~0~
00001014.000011f0::2015/07/09-09:35:16.713 DBG   [DM] Unloading Hive, Key \Registry\Machine\Cluster, discardCurrentChanges true
00001014.000013a0::2015/07/09-09:35:16.713 DBG   [NETFT] Could not find route to remove
00001014.000013a0::2015/07/09-09:35:16.713 DBG   [StreamDb] Route local 10.1.1.1:~0~ to route remote 10.1.1.2:~0~ removed from NetFT
00001014.000013a0::2015/07/09-09:35:16.713 INFO  [NODE] Node 2: Pausing queue sending for n1.
00001014.00001048::2015/07/09-09:35:16.713 DBG   [NETFTAPI] received NsiParameterNotification for fe80::1047:79ac:baa5:8aa8 (IpDadStateDeprecated)
00001014.0000085c::2015/07/09-09:35:16.713 WARN  [PULLER NODE2] ReadObject failed with GracefulClose(1226)' because of 'channel to remote endpoint fe80::2859:99c1:8eb2:875d%25:~63594~ is closed'
00001014.000013a0::2015/07/09-09:35:16.713 DBG   [JPM] Node 2: Cleaning up after dead node 1
00001014.0000085c::2015/07/09-09:35:16.713 ERR   [NODE] Node 2: Connection to Node 1 is broken. Reason GracefulClose(1226)' because of 'channel to remote endpoint fe80::2859:99c1:8eb2:875d%25:~63594~ is closed'
00001014.0000148c::2015/07/09-09:35:16.713 INFO  [RGP] node 2: Node Disconnected 1 00000000000000000000000000000000000000000000000000000000000000100
00001014.0000085c::2015/07/09-09:35:16.713 DBG   [NODE] Node 2: Stream has been configured so that delaying error reporting is not allowed, reporting node as failed.
00001014.0000148c::2015/07/09-09:35:16.713 INFO  [RGP] node 2: MergeAndRestart +() -()
00001014.000013a0::2015/07/09-09:35:16.713 INFO  [CORE] Node 2: executing node 1 failed handlers on a dedicated thread
00001014.000013a0::2015/07/09-09:35:16.713 INFO  [NODE] Node 2: Cleaning up connections for n1.
00001014.000013a0::2015/07/09-09:35:16.713 DBG   [NODE] Node 2: Cancelling send queue with n1.
00001014.000013a0::2015/07/09-09:35:16.713 INFO  [MQ-NODE2] Clearing 0 unsent and 0 unacknowledged messages.
00001014.000015d0::2015/07/09-09:35:16.713 INFO  [CORE] Node 2: Proposed View is <ViewChanged joiners=() downers=() newView=302(1 2) oldView=201(1 2) joiner=false form=false/>
00001014.000013a0::2015/07/09-09:35:16.713 INFO  [NODE] Node 2: Pausing queue sending for n1.
00001014.00000fd8::2015/07/09-09:35:16.713 DBG   [CHANNEL fe80::2859:99c1:8eb2:875d%25:~63594~] Not closing handle because it is invalid.
00001014.000013a0::2015/07/09-09:35:16.713 DBG   [JPM] Node 2: Cleaning up after dead node 1
00001014.0000148c::2015/07/09-09:35:16.713 INFO  [RGP] node 2: Node Disconnected 1 00000000000000000000000000000000000000000000000000000000000000100
00001014.00000fd8::2015/07/09-09:35:16.713 DBG   [CHANNEL 10.1.1.2:~63593~] Close().
00001014.000011f0::2015/07/09-09:35:16.713 ERR   FatalError is Calling Exit Process.
000012e8.00000ab8::2015/07/09-09:35:56.057 DBG   Cluster node cleanup thread started.
000012e8.00000ab8::2015/07/09-09:35:56.057 DBG   Starting cluster node cleanup...
000012e8.00000ab8::2015/07/09-09:35:56.057 DBG   Disabling the cluster service...
000012e8.00000ab8::2015/07/09-09:35:56.072 DBG   Releasing clustered storages...
000012e8.00000ab8::2015/07/09-09:35:56.072 DBG   Getting clustered disks...
000012e8.00000ab8::2015/07/09-09:35:56.072 DBG   Waiting for clusdsk to finish its cleanup...
000012e8.00000ab8::2015/07/09-09:35:56.072 DBG   Clearing the clusdisk database...
000012e8.00000ab8::2015/07/09-09:35:56.072 DBG   Waiting for clusdsk to finish its cleanup...
000012e8.00000ab8::2015/07/09-09:35:56.072 DBG   Relinquishing clustered disks...
000012e8.00000ab8::2015/07/09-09:35:56.072 DBG   Opening disk handle by index...
000012e8.00000ab8::2015/07/09-09:35:56.072 DBG   Getting disk ID from layout...
000012e8.00000ab8::2015/07/09-09:35:56.072 DBG   Reset CSV state ...
000012e8.00000ab8::2015/07/09-09:35:56.088 DBG   Relinquish disk if clustered...
000012e8.00000ab8::2015/07/09-09:35:56.088 DBG   Opening disk handle by index...
000012e8.00000ab8::2015/07/09-09:35:56.088 DBG   Getting disk ID from layout...
000012e8.00000ab8::2015/07/09-09:35:56.088 DBG   Reset CSV state ...
000012e8.00000ab8::2015/07/09-09:35:56.088 DBG   Relinquish disk if clustered...
000012e8.00000ab8::2015/07/09-09:35:56.104 DBG   Opening disk handle by index...
000012e8.00000ab8::2015/07/09-09:35:56.104 DBG   Getting disk ID from layout...
000012e8.00000ab8::2015/07/09-09:35:56.104 DBG   Reset CSV state ...
000012e8.00000ab8::2015/07/09-09:35:56.104 DBG   Relinquish disk if clustered...
000012e8.00000ab8::2015/07/09-09:35:56.104 DBG   Opening disk handle by index...
000012e8.00000ab8::2015/07/09-09:35:56.104 DBG   Resetting cluster registry entries...
000012e8.00000ab8::2015/07/09-09:35:56.104 DBG   Resetting NLBSFlags value ...
000012e8.00000ab8::2015/07/09-09:35:56.119 DBG   Unloading the cluster Windows registry hive...
000012e8.00000ab8::2015/07/09-09:35:56.119 DBG   Getting the cluster Windows registry hive file path...
000012e8.00000ab8::2015/07/09-09:35:56.119 DBG   Getting the cluster Windows registry hive file path...
000012e8.00000ab8::2015/07/09-09:35:56.119 DBG   Getting the cluster Windows registry hive file path...

Can anyone help me figure out what the problem is?
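
In case it helps anyone reproduce this, the same create can be attempted from PowerShell with storage left out, and a fresh cluster log pulled immediately afterwards (node names and address are placeholders):

# Try the create without touching storage, then grab the last 15 minutes of log
New-Cluster -Name msCluster -Node NODE1, NODE2 -NoStorage -StaticAddress 10.1.1.100
Get-ClusterLog -Destination C:\Temp -TimeSpan 15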



Non-coordinator node cannot access CSV volume


We have a 2008 R2 Hyper-V cluster with a couple of CSVs; their ownership is distributed among the nodes.

Today we had a problem - two of the CSVs became inaccessible from non-coordinating nodes. Both the problematic CSVs had the same coordinator. The rest of the CSVs were accessible just fine. As a result, all the VMs having storage on these CSVs crashed and wouldn't start.

It was obviously some problem with the node that owned them at the time. The problematic CSVs were shown as working and accessible in FCM, and I had to manually change the coordinating node; only then did the CSVs become accessible again.
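
For reference, the manual coordinator move that restored access can be scripted, which helps when this happens off-hours (names are placeholders):

# Move CSV ownership (the coordinator role) to a known-good node
Move-ClusterSharedVolume -Name 'Cluster Disk 2' -Node NODE2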

The error logged on non-coordinating nodes was: Cluster Shared Volume 'Volume3' is no longer available on this node because of 'STATUS_CONNECTION_DISCONNECTED(c000020c)'. All I/O will temporarily be queued until a path to the volume is reestablished.

Any idea what might have caused this? One strange thing is that on the problematic node 17 services crashed right after the CSVs became inaccessible.

MS Network Load Balancer Performance Issues


Greetings,

I am having performance issues with MS Network Load Balancer for my web servers, which run as virtual machines.

This issue occurs whether NLB is configured for unicast, multicast, or IGMP multicast.

I can get NLB to work in every configuration, but performance is abysmal.

The performance drop can be measured with IE developer tools and file transfers to the NLB versus accessing a server directly.

We have added the static ARP entry for the multicast address to our router.

We are using a FortiGate router/firewall, D-Link 1210 switches, and Broadcom NICs on Dell servers running Hyper-V.

Can I get some ideas to improve performance?

Multi-subnet clustering: how to configure the heartbeat network


We are setting up a SQL AlwaysOn Availability Group, so a Windows failover cluster is needed as a prerequisite. The nodes are in the Prod and DR datacenters: 2 nodes in Prod and 2 in DR, so it will be a 4-node cluster. Because the IP subnets differ between Prod and DR, we will end up with a multi-subnet cluster.

The question is more about the heartbeat network. Is it required to have the heartbeat IPs for all 4 nodes in the same subnet? Or can the 2 Prod nodes have heartbeat IPs in one subnet and the 2 DR nodes in another?

Thanks.

Was a Hyper-V Failover-Cluster the right choice, or am I missing something?


Hey there,

I always thought a Hyper-V failover cluster guarantees continuous availability of all virtual machines, so when one cluster node goes down, all the machines on that failing node remain online and available.

Either I am missing something, or I didn't configure it correctly... but when I disconnect a node from the network (simulating a hardware failure) nothing happens, besides the node showing as failed in the cluster manager.

I was expecting the remaining node to take over all the VMs with a one- or two-second delay, keeping the current state of each VM. In other words, I was expecting a Hyper-V failover cluster to appear exactly like a single Hyper-V server unless both nodes go down (on a 2-node cluster, that is).

Did I miss something, or did I forget to configure something? Did I misunderstand what a Hyper-V Failover-Cluster does?

iSCSI Target Server is missing after installing patches and rebooting


I have a 10-node Hyper-V cluster connected via the MS iSCSI initiator to an MS iSCSI Target Server (a single machine).

Let's forget the fact that my SAN/iSCSI is not fault tolerant and move on.

We did weekend maintenance and I rebooted the iSCSI target server after installing (a lot of) patches, and...

The nodes of the cluster were showing the "connecting" message... no "volume1" available...

On the SAN/iSCSI Windows server, the weirdest thing: no iSCSI Target service in the "services.msc" list of services!

The TCP/IP port was not there: no socket listening on 3260, as if the service was not installed... but it was.

So I uninstalled the role via Server Manager, rebooted, installed it again, rebooted... and!

Everything was working fine again...

Why is that? Why did the service mysteriously disappear from the list of services? Why was the socket not open? Why did Server Manager list the role as installed?

Last year I had a similar problem (after installing patches and rebooting) on the same machine, but slightly different: I had to restart the iSCSI target service (that time, the service was still listed) to make the VMs run.

Maybe some patch caused the bug?
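
For next time, a quick sanity check of each layer, so it's clear whether the role, the service, or just the socket vanished (the feature and service names are assumptions for 2012 R2; verify on your build):

# Is the role actually installed?
Get-WindowsFeature -Name FS-iSCSITarget-Server

# Is the service registered and running? (service name assumed to be WinTarget)
Get-Service -Name WinTarget -ErrorAction SilentlyContinue

# Is anything listening on the iSCSI port?
netstat -ano | findstr ":3260"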
