From: Peter Staubach <staubach@redhat.com>
To: Jeff Layton <jlayton@redhat.com>
Cc: linux-nfs@vger.kernel.org, nfsv4@linux-nfs.org,
nhorman@redhat.com, lhh@redhat.com
Subject: Re: rapid clustered nfs server failover and hung clients -- how best to close the sockets?
Date: Mon, 09 Jun 2008 11:03:53 -0400
Message-ID: <484D4659.9000105@redhat.com>
In-Reply-To: <20080609103137.2474aabd-RtJpwOs3+0O+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
Jeff Layton wrote:
> Apologies for the long email, but I ran into an interesting problem the
> other day and am looking for some feedback on my general approach to
> fixing it before I spend too much time on it:
>
> We (RH) have a cluster-suite product that some people use for making HA
> NFS services. When our QA folks test this, they often will start up
> some operations that do activity on an NFS mount from the cluster and
> then rapidly do failovers between cluster machines and make sure
> everything keeps moving along. The cluster is designed to not shut down
> nfsd's when a failover occurs. nfsd's are considered a "shared
> resource". It's possible that there could be multiple clustered
> services for NFS-sharing, so when a failover occurs, we just manipulate
> the exports table.
>
> The problem we've run into is that occasionally they fail over to the
> alternate machine and then back very rapidly. Because nfsd's are not
> shut down on failover, sockets are not closed. So what happens is
> something like this on TCP mounts:
>
> - client has NFS mount from clustered NFS service on one server
>
> - service fails over, new server doesn't know anything about the
> existing socket, so it sends a RST back to the client when data
> comes in. Client closes connection and reopens it and does some
> I/O on the socket.
>
> - service fails back to original server. The original socket there
> is still open, but now the TCP sequence numbers are off. When
> packets come into the server we end up with an ACK storm, and the
> client hangs for a long time.
>
> Neil Horman did a good writeup of this problem here for those that
> want the gory details:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=369991#c16
>
> I can think of 3 ways to fix this:
>
> 1) Add something like the recently added "unlock_ip" interface that
> was added for NLM. Maybe a "close_ip" that allows us to close all
> nfsd sockets connected to a given local IP address. So clustering
> software could do something like:
>
> # echo 10.20.30.40 > /proc/fs/nfsd/close_ip
>
> ...and make sure that all of the sockets are closed.
>
> 2) just use the same "unlock_ip" interface and just have it also
> close sockets in addition to dropping locks.
>
> 3) have an nfsd close all non-listening connections when it gets a
> certain signal (maybe SIGUSR1 or something). Connections on
> sockets that aren't failing over should just get a RST and would
> reopen their connections.
>
> ...my preference would probably be approach #1.
>
> I've only really done some rudimentary perusing of the code, so there
> may be roadblocks with some of these approaches I haven't considered.
> Does anyone have thoughts on the general problem or ideas for a solution?
>
> The situation is a bit specific to failover testing -- most people failing
> over don't do it so rapidly, but we'd still like to ensure that this
> problem doesn't occur if someone does do it.
>
> Thanks,
>
This doesn't sound like it would be an NFS-specific situation.
Why doesn't TCP handle this, without causing an ACK storm?
Thanx...
ps
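
For reference, here is a toy model of the sequence-number mismatch described
in the quoted text. It is only a sketch, assuming the RFC 793 rule that a
segment whose sequence number falls outside the receive window is answered
with an ACK. The names Endpoint and count_acks are hypothetical, the sequence
numbers are made up, and none of this is kernel code. Every segment is
treated as zero-length to keep the window check simple.

```python
from dataclasses import dataclass

MOD = 2**32  # TCP sequence numbers wrap at 32 bits


@dataclass
class Endpoint:
    name: str
    snd_nxt: int           # next sequence number this side will send
    rcv_nxt: int           # next sequence number this side expects
    rcv_wnd: int = 65535   # receive window size

    def acceptable(self, seq: int) -> bool:
        # Simplified RFC 793 test for a zero-length segment:
        # RCV.NXT <= SEG.SEQ < RCV.NXT + RCV.WND (modulo 2**32).
        return (seq - self.rcv_nxt) % MOD < self.rcv_wnd

    def receive(self, seq: int):
        # Return the sequence number of the ACK sent in reply,
        # or None when the segment is in window and no corrective
        # ACK is needed.
        if self.acceptable(seq):
            return None
        return self.snd_nxt


def count_acks(sender: Endpoint, receiver: Endpoint, max_rounds: int) -> int:
    # The sender transmits one segment; count how many corrective ACKs
    # bounce between the two sides before someone accepts a segment
    # (or max_rounds is reached).
    acks = 0
    seq = sender.snd_nxt
    for _ in range(max_rounds):
        reply = receiver.receive(seq)
        if reply is None:
            break
        acks += 1
        seq = reply
        sender, receiver = receiver, sender
    return acks


# Client state after reconnecting to the second server (made-up numbers).
client = Endpoint("client", snd_nxt=5_000_000, rcv_nxt=9_000_000)
# Stale socket left open on the original server, still at the old numbers.
stale_server = Endpoint("server-a", snd_nxt=1_000, rcv_nxt=2_000)

storm = count_acks(client, stale_server, max_rounds=10)

# For comparison: two sides whose sequence numbers agree.
ok_client = Endpoint("client", snd_nxt=2_000, rcv_nxt=1_000)
ok_server = Endpoint("server-a", snd_nxt=1_000, rcv_nxt=2_000)
quiet = count_acks(ok_client, ok_server, max_rounds=10)
```

In this model neither side ever sees an in-window segment from the other, so
the exchange only stops because of max_rounds. A real stack has timers, RSTs
and retransmission limits that the model leaves out.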