* NFSv3 TCP socket stuck when all slots used and server goes away
From: Simon Kirby @ 2013-03-06  9:51 UTC (permalink / raw)
  To: linux-nfs

We had an issue with a Pacemaker/CRM HA-NFSv3 setup where one particular
export hit an XFS locking issue on one node and got completely stuck.
Upon failing over, service recovered for all clients that hadn't hit the
mount since the issue occurred, but almost all of the usual clients
(which also commonly call statfs as a monitoring check) sat forever (>20
minutes) without reconnecting.

It seems that the clients filled the RPC slots with requests over the TCP
socket to the NFS VIP and the server ack'd everything at the TCP layer,
but was not able to reply to anything due to the FS locking issue. When
we failed over the VIP to the other node, service was restored, but the
clients stuck this way continued to sit with nothing to tickle the TCP
layer. netstat shows a socket with no send-queue, in ESTABLISHED state,
and with no timer enabled:

tcp        0      0 c:724         s:2049       ESTABLISHED -                off (0.00/0/0)
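
For readers unfamiliar with the timer column, here is a minimal sketch that
splits the row above into fields and checks for the "nothing will ever fire"
condition described here. It assumes the nine-column layout produced by
something like `netstat -tnpo` (the exact flags are not stated above), and
the names `TcpRow`, `parse_netstat_row` and `has_nothing_pending` are
hypothetical, made up for this illustration. Real output with spaces in the
program column would need a more careful parser.

```python
from typing import NamedTuple, Tuple


class TcpRow(NamedTuple):
    proto: str
    recv_q: int
    send_q: int
    local: str
    remote: str
    state: str
    program: str          # "-" when no process owns the socket
    timer: str            # e.g. "off", "on", "keepalive", "timewait"
    timer_values: Tuple[float, ...]  # the numbers inside the parentheses


def parse_netstat_row(line: str) -> TcpRow:
    """Parse one TCP row, assuming exactly nine whitespace-separated fields."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}: {line!r}")
    proto, recv_q, send_q, local, remote, state, program, timer, raw = fields
    values = tuple(float(v) for v in raw.strip("()").split("/"))
    return TcpRow(proto, int(recv_q), int(send_q), local, remote,
                  state, program, timer, values)


def has_nothing_pending(row: TcpRow) -> bool:
    """True if the socket is established with no queued data and no timer."""
    return row.state == "ESTABLISHED" and row.send_q == 0 and row.timer == "off"


row = parse_netstat_row(
    "tcp        0      0 c:724         s:2049       ESTABLISHED -"
    "                off (0.00/0/0)"
)
stuck = has_nothing_pending(row)
```

With an empty send queue there is nothing to retransmit, and with the timer
off there is no keepalive probe scheduled either, so this check only
restates the observation above in code form.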

The mountpoint options used are: rw,hard,intr,tcp,vers=3

The export options are: rw,async,hide,no_root_squash,no_subtree_check,mp

Is this expected behaviour? I suspect if TCP keepalives were enabled, the
socket would eventually get torn down as soon as the client tries to send
something to the (effectively rebooted / swapped) NFS server and gets an
RST. However, as-is, there seems to be nothing here that would eventually
cause anything to happen. Am I missing something?
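
To illustrate the keepalive mechanism referred to above, here is a minimal
user-space sketch. It is not how the in-kernel NFS client configures its
socket (the kernel owns that socket, and no mount option for this is implied
here). It only shows the standard socket options involved. The helper names
and the 60/10/5 values are arbitrary assumptions, and the `TCP_KEEP*`
options are platform-specific, so they are guarded.

```python
import socket


def enable_keepalive(sock: socket.socket, idle: int = 60,
                     interval: int = 10, count: int = 5) -> None:
    """Turn on TCP keepalive probes for an otherwise idle connection."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket tuning is not available on every platform.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)


def rough_dead_peer_timeout(idle: int, interval: int, count: int) -> int:
    """Approximate seconds until a silent, unreachable peer is given up on."""
    return idle + interval * count


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
keepalive_enabled = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()

timeout_estimate = rough_dead_peer_timeout(idle=60, interval=10, count=5)
```

In the failover case described here the new node would presumably answer the
first probe with an RST, so the socket would be torn down after roughly the
idle period rather than the full estimate.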

Simon-


Thread overview: 4+ messages
2013-03-06  9:51 NFSv3 TCP socket stuck when all slots used and server goes away Simon Kirby
2013-03-06 14:06 ` Myklebust, Trond
2013-03-06 21:20   ` Simon Kirby
2013-03-06 21:31     ` Myklebust, Trond
