Channel: High Availability (Clustering) forum

How to check RCA for missing heartbeat

  • The cluster service was halted to prevent an inconsistency within the failover cluster. The error code was 1359.

  • Server: Windows Server 2016

    As per my investigation, a network adapter reset was observed at the same timestamp, i.e., 3:18:26 AM on 07-01-2019. Please note that the cluster log timestamps are in GMT.


    00000c64.00001950::2019/07/01-07:18:33.587 INFO  [IM - Cluster Network 1] Resetting interface state calculation state

    00000c64.00001950::2019/07/01-07:18:33.587 INFO  [IM] Leader is sending request for all interfaces in the current view

    00000c64.00000b44::2019/07/01-07:18:33.587 INFO  [DCM] Force disconnect payload: netname \xxxxxxx, requested disconnect status (0), src <null>, dest <null>

    00000c64.00000b44::2019/07/01-07:18:33.587 ERR   [DCM] Force disconnect failed on DisconnectSmbInstance::CSV, status (c000000d)

    00000c64.00000b44::2019/07/01-07:18:33.587 INFO  [DCM] Force disconnect(DisconnectAll): server \169.254.2.228, DisconnectSmbInstance::CSV

    00000c64.00000b44::2019/07/01-07:18:33.587 INFO  [DCM] Releasing RDR handle for target node id 2

    .000006ec::2019/07/01-07:19:02.884 ERR   [NODE] Node 1: Connection to Node 2 is broken. Reason (10054)' because of 'channel to remote endpoint 169.254.2.228:~3343~ has failed with status 10054'

    00000c64.000006ec::2019/07/01-07:19:02.884 WARN  [NODE] Node 1: Initiating reconnect with n2.

    00000c64.000006ec::2019/07/01-07:19:02.884 INFO  [MQ-thpqhms0] Pausing

    00000c64.000008dc::2019/07/01-07:19:02.884 INFO  [Reconnector-thpqhms0] Reconnector from epoch 1 to epoch 2 waited 00.000 so far.

    00000c64.00001930::2019/07/01-07:19:03.012 INFO  [IM] got event: Node with FaultTolerantAddress xxxxx:~0~ has gone down with fatal error\crash

    00000c64.00001930::2019/07/01-07:19:03.013 ERR   [IM] Couldn't find node id for remote virtual IP xxxxxxxx:~0~

    0000194c::2019/07/01-07:19:14.683 DBG   [NETFTAPI] Signaled NetftRemoteUnreachable event, local address 10.81.64.153:3343 remote address 10.81.65.25:3343

    00000c64.00001930::2019/07/01-07:19:14.683 INFO  [IM] got event: Remote endpoint 10.81.65.25:~3343~ unreachable from xxxxx

    00000c64.00001930::2019/07/01-07:19:14.683 INFO  [NDP] Checking to see if all routes for route (virtual) local xxxxx:~0~ to remote 169.254.2.228:~0~ are down

    00000c64.00001930::2019/07/01-07:19:14.683 WARN  [NDP] All routes for route (virtual) local 169.254.1.43:~0~ to remote xxxxxxxxx:~0~ are down

    00000c64.00001924::2019/07/01-07:19:14.683 INFO  [CORE] Node 1: executing node 2 failed handlers on a dedicated thread

  • I also found this in the event logs:

    07-02-2019           7:20:42 AM           Warning thpqghs0.prod.travp.net     10400    Microsoft-Windows-NDIS   N/A         N/A         The network interface 'vmxnet3 Ethernet Adapter' has begun resetting.  There will be a momentary disruption in network connectivity while the hardware resets. Reason: The network driver detected that its hardware has stopped responding to commands. This network interface has reset 1 time(s) since it was last initialized.
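To narrow down which entries matter, the cluster log excerpt above can be filtered for ERR and WARN lines. This is only a minimal sketch: the regular expression is my own guess at the line layout based on the lines pasted here (it is not an official cluster log format definition), and `parse_line` and `problems` are hypothetical helper names.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the pasted lines:
#   <process id>.<thread id>::<YYYY/MM/DD-HH:MM:SS.mmm> <LEVEL> [<component>] <message>
# The id part is matched loosely because some pasted lines have a truncated prefix.
LINE_RE = re.compile(
    r"^(?P<ids>[0-9a-f.]*)::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<component>[^\]]+)\]\s*"
    r"(?P<message>.*)$"
)

def parse_line(line):
    """Return a dict for one cluster log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["ts"] = datetime.strptime(rec["ts"], "%Y/%m/%d-%H:%M:%S.%f")
    return rec

def problems(lines):
    """Keep only the ERR and WARN entries, in their original order."""
    parsed = (parse_line(line) for line in lines)
    return [r for r in parsed if r and r["level"] in ("ERR", "WARN")]

# Three lines copied from the excerpt above.
SAMPLE = [
    "00000c64.00001950::2019/07/01-07:18:33.587 INFO  [IM] Leader is sending request for all interfaces in the current view",
    "00000c64.00000b44::2019/07/01-07:18:33.587 ERR   [DCM] Force disconnect failed on DisconnectSmbInstance::CSV, status (c000000d)",
    "00000c64.000006ec::2019/07/01-07:19:02.884 WARN  [NODE] Node 1: Initiating reconnect with n2.",
]

for rec in problems(SAMPLE):
    print(rec["ts"], rec["level"], rec["component"], rec["message"])
```

Running the same filter over the full cluster log would give a short timeline of errors and warnings to line up against the NDIS event.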

Please let me know if this is causing the issue.
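Because the cluster log is in GMT while the adapter reset was noted in local time, the two have to be put on the same clock before comparing them. A minimal sketch of that conversion follows. The UTC-4 offset is an assumption inferred from the times in this post (3:18 AM local against 07:18 in the log), and `local_to_cluster_log_time` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the server's local time is 4 hours behind GMT, inferred from
# the timestamps in this post. Adjust the offset for the actual server.
LOCAL_OFFSET = timezone(timedelta(hours=-4))

def local_to_cluster_log_time(local_str):
    """Convert 'MM-DD-YYYY H:MM:SS AM/PM' local time to the cluster log's GMT format."""
    local = datetime.strptime(local_str, "%m-%d-%Y %I:%M:%S %p")
    local = local.replace(tzinfo=LOCAL_OFFSET)
    return local.astimezone(timezone.utc).strftime("%Y/%m/%d-%H:%M:%S")

print(local_to_cluster_log_time("07-01-2019 3:18:26 AM"))
```

With that offset, the local time of the reset maps to 2019/07/01-07:18:26 in the log's format, a few seconds before the first cluster log entries quoted above.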

