From: Rick Macklem <rmacklem@uoguelph.ca>
To: Bram Vandoren <brambi@gmail.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
Chuck Lever <chuck.lever@oracle.com>
Subject: Re: NFS client hangs after server reboot
Date: Fri, 31 May 2013 19:24:03 -0400 (EDT)
Message-ID: <785211889.113142.1370042643416.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <CACQjR_CsRY4O+S1Zx-QabXYBUtRMKHa8o7ymCmLqSHa9UdgNFQ@mail.gmail.com>
Bram Vandoren wrote:
> > Did both the client and server have the same IP addresses before the
> > reboot?
>
> Yes.
>
> > If not, the Linux client's nfs_client_id4.id SetClientID argument
> > will be different (it has the client-side IP# in it).
> > nfs_client_id4.id
> > isn't supposed to change for a given client when it is rebooted.
> > That will make the FreeBSD NFSv4 server see "new client" (which is
> > not in the
> > stablerestart file used to avoid certain reboot edge conditions) and
> > will not give it a grace period.
> > This is the only explanation I can think of for the NFS4ERR_NO_GRACE
> > reply shortly after the reboot.
>
> I checked some other clients and they all receive the
> NFS4ERR_NO_GRACE response from the server. It's not unique for the
> clients that hang. I was unable to reproduce this in a minimal test
> configuration. Perhaps the nfs-stablerestart file is corrupt on the
> server?
>
> I checked
> strings nfs-stablerestart
> and I see a lot of duplicate entries. In total there are ~10000 lines
> but we only have ~50 clients.
> Most clients have 3 types of entries:
> Linux NFSv4.0 a.b.c.d/e.f.g.h tcp
> Linux NFSv4.0 a.b.c.d/e.f.g.h tcp*
> Linux NFSv4.0 a.b.c.d/e.f.g.h tcp+
>
I'll take a look. I wrote that code about 10 years ago, so I don't remember
all the details w.r.t. the records in the stable restart file. If you truncate
the file, there won't be any recovery on the next reboot, so in that case you
need to unmount all the NFSv4 mounts of that server before rebooting it.
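To put numbers on the duplication seen in the quoted `strings nfs-stablerestart`
output, the lines can be grouped per client id. The following is a minimal
sketch only: the function name `summarize_entries` is hypothetical, and it
assumes the printable form of each record looks like the three quoted examples
(an id starting with "Linux NFSv4.0", optionally followed by `*` or `+`). It
does not interpret what the markers mean and does not parse the real on-disk
record format.

```python
from collections import Counter, defaultdict


def summarize_entries(lines):
    """Group client id strings by base id and count each trailing marker.

    Assumption: each relevant line starts with "Linux NFSv4.0 " and may end
    in "*" or "+", as in the quoted `strings` output. Other lines are skipped.
    """
    summary = defaultdict(Counter)
    for raw in lines:
        line = raw.strip()
        if not line.startswith("Linux NFSv4.0 "):
            continue  # not a client id string in the assumed format
        if line[-1] in "*+":
            base, marker = line[:-1], line[-1]
        else:
            base, marker = line, ""  # entry without a trailing marker
        summary[base][marker] += 1
    return {base: dict(counts) for base, counts in summary.items()}


if __name__ == "__main__":
    import sys

    # Usage: strings nfs-stablerestart | python3 this_script.py
    for base, counts in summarize_entries(sys.stdin).items():
        print(sum(counts.values()), base, counts)
```

With ~10000 lines and ~50 clients, the per-client totals show whether the
duplicates are spread evenly or concentrated on a few clients.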
What your packet trace didn't indicate was when the server was rebooted vs
when the client sent it a SYN that started a new connection. During the
approx. 4400 sec the server was down there should have been repeated attempts
to connect to it (a TCP packet with SYN in it) at least once every 30 sec.
After the server reboots, the client must establish a TCP connection and
attempt recovery within 2 minutes or it just isn't going to work.
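The two timing conditions above (a SYN at least every 30 sec while the server
is down, and a connection attempt within 2 minutes of the server coming back)
can be checked against timestamps pulled from a packet trace. This is a rough
sketch under stated assumptions: `check_reconnect` and its parameters are
hypothetical names, the SYN times and the server-up time are supplied by hand
in seconds, and a SYN inside the window is necessary but not sufficient, since
the client must also attempt recovery in that time.

```python
def check_reconnect(syn_times, server_up_at, max_gap=30.0, recovery_window=120.0):
    """Check client SYN timing against the two limits from the message.

    syn_times: times (seconds) at which the client sent a TCP SYN.
    server_up_at: time (seconds) at which the server was back up.
    max_gap and recovery_window default to the 30 sec and 2 minute figures.
    """
    syn_times = sorted(syn_times)
    # Largest interval between consecutive SYNs (0.0 if fewer than two SYNs).
    gaps = [later - earlier for earlier, later in zip(syn_times, syn_times[1:])]
    largest_gap = max(gaps, default=0.0)
    # First SYN sent at or after the moment the server came back.
    after = [t for t in syn_times if t >= server_up_at]
    delay = after[0] - server_up_at if after else None
    return {
        "largest_gap": largest_gap,
        "gaps_ok": largest_gap <= max_gap,
        "first_syn_delay": delay,
        "within_recovery_window": delay is not None and delay <= recovery_window,
    }
```

For example, `check_reconnect([0.0, 25.0, 50.0, 75.0], server_up_at=40.0)`
reports a largest gap of 25.0 sec and a first SYN 10.0 sec after the server
came back, so both checks pass.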
Btw, server reboot recovery doesn't get a lot of testing. Some of that is
logistics (no one pays for FreeBSD NFS development, etc.) and the rest is that
most assume a server will remain up for months/years at a time. If the FreeBSD
server is crashing, you need to try and resolve that. If the approx. 4400 sec
downtime was a scheduled maintenance type of thing, you should consider unmounting
the volumes before the server is shut down and doing fresh mounts after it
is rebooted.
rick
> Again, thanks a lot for looking into this.
>
> Bram.