From: "J. Bruce Fields" <bfields@fieldses.org>
To: Ulrich Gemkow <ulrich.gemkow@ikr.uni-stuttgart.de>
Cc: linux-nfs@vger.kernel.org
Subject: Re: NFSv4 mount fails on Sun Solaris 10 after reboot of client
Date: Tue, 25 Aug 2015 17:54:56 -0400
Message-ID: <20150825215456.GF8579@fieldses.org>
In-Reply-To: <201508251928.06201.ulrich.gemkow@ikr.uni-stuttgart.de>

On Tue, Aug 25, 2015 at 07:28:03PM +0200, Ulrich Gemkow wrote:
> Hello Bruce,
>
> On Monday 24 August 2015 22:14:01 J. Bruce Fields wrote:
> > On Mon, Aug 24, 2015 at 02:52:55PM +0200, Ulrich Gemkow wrote:
> > > we have a weird problem with Linux NFSv4.0 Server (Vanilla
> > > Kernel 4.1.6) and a Sun Solaris 10 client (all patches applied):
> > >
> > > When mounting a share on the Solaris client and then rebooting
> > > the client without unmounting the share first, after the reboot
> > > every attempt to mount the share again gives an I/O error on
> > > the client and the mount fails.
> > >
> > > After a long time (several hours) the v4 mount suddenly works
> > > again.
> > >
> > > Mounting a share with vers=2 works always even in times when
> > > the v4 mount fails.
> > >
> > > So it seems the Linux NFSv4 server holds a state for the client
> > > which prevents the re-mounting of the share and gives the
> > > I/O-error on the client.
> > >
> > > We use NFSv4 without idmapd.
> > >
> > > Is there any tip how to debug or solve this?
> >
> > Best is probably to get a packet trace. So something like:
> >
> > 	tcpdump -s0 -i em0 -w tmp.pcap
> >
> > and then try the client mount, then kill the tcpdump after the mount
> > fails, and send us tmp.pcap. (And/or take a look at tmp.pcap yourself
> > with wireshark. The interesting question is what kind of error the
> > server is returning when the client tries the mount after reboot.)
>
> Thank you for your reply. The tcpdump is attached, the relevant
> packets are 49..52. The error seems to be a SERVERFAULT. Can you
> see more from the dump?
>
> Thanks again and best regards

The SERVERFAULT is on SETCLIENTID_CONFIRM.

In nfsd4_setclientid_confirm():

	conf = find_confirmed_client(clid, false, nn);
	unconf = find_unconfirmed_client(clid, false, nn);
	/*
	 * We try hard to give out unique clientid's, so if we get an
	 * attempt to confirm the same clientid with a different cred,
	 * there's a bug somewhere. Let's charitably assume it's our
	 * bug.
	 */
	status = nfserr_serverfault;
	if (unconf && !same_creds(&unconf->cl_cred, &rqstp->rq_cred))
		goto out;
	if (conf && !same_creds(&conf->cl_cred, &rqstp->rq_cred))
		goto out;

The SETCLIENTID and SETCLIENTID_CONFIRM are done with identical
auth_unix creds.

The clientid that we're looking up there was returned from the previous
SETCLIENTID, generated by this logic:

	if (conf && same_verf(&conf->cl_verifier, &clverifier))
		/* case 1: probable callback update */
		copy_clid(new, conf);
	else /* case 4 (new client) or cases 2, 3 (client reboot): */
		gen_clid(new, nn);

So it should be a brand new clientid, unless the client was reusing the
old verifier.

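To make that concrete, here is a rough userspace sketch of the decision
above. It is not the kernel code: struct client_rec, choose_clientid()
and the next_clientid counter are made-up names for illustration only.
If a rebooted client presents the same verifier as the confirmed record,
the model hands back the old clientid instead of a fresh one:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define VERF_SIZE 8	/* NFSv4.0 verifiers are 8 opaque bytes */

/* Hypothetical stand-in for the server's confirmed-client record. */
struct client_rec {
	uint64_t clientid;
	unsigned char verifier[VERF_SIZE];
};

/* Hypothetical stand-in for gen_clid(): hands out a never-used id. */
static uint64_t next_clientid = 1;

static bool same_verifier(const unsigned char *a, const unsigned char *b)
{
	return memcmp(a, b, VERF_SIZE) == 0;
}

/*
 * Model of the SETCLIENTID choice: conf is the confirmed record for
 * this client name (NULL if none), verf is the verifier in the request.
 */
static uint64_t choose_clientid(const struct client_rec *conf,
				const unsigned char *verf)
{
	assert(verf != NULL);
	if (conf && same_verifier(conf->verifier, verf))
		return conf->clientid;	/* case 1: looks like a callback update */
	return next_clientid++;		/* new client, or reboot with a new verifier */
}
```

With a confirmed record in place, calling choose_clientid() with the old
verifier returns the old clientid; any other verifier gets a new one.
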
So perhaps the client is sending the SETCLIENTID with a verifier set to
what it used on the previous boot? That sounds like a client bug. The
Linux client uses a timestamp for the verifier, and it looks like the
Solaris client might too. Is there some reason the clock on this client
isn't advancing on reboot?

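As a sketch of why a stuck clock would matter: assume, purely for
illustration, that the verifier is just the boot-time clock reading
packed into 8 bytes. The layout and the names make_boot_verifier() and
verifier_changed() are hypothetical, not taken from either client's
source. Two boots that read the same clock value then produce the same
verifier, which the server cannot tell apart from a callback update:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VERF_SIZE 8	/* NFSv4.0 verifiers are 8 opaque bytes */

/* Hypothetical: build a verifier from the clock reading taken at boot. */
static void make_boot_verifier(uint32_t sec, uint32_t nsec,
			       unsigned char out[VERF_SIZE])
{
	assert(out != NULL);
	memcpy(out, &sec, sizeof(sec));
	memcpy(out + sizeof(sec), &nsec, sizeof(nsec));
}

/* Returns nonzero if the two verifiers differ, i.e. a reboot is visible. */
static int verifier_changed(const unsigned char *a, const unsigned char *b)
{
	return memcmp(a, b, VERF_SIZE) != 0;
}
```

If the clock really advances across the reboot, verifier_changed()
reports a difference and the server would take the "client reboot" path.
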
--b.