* NFSv4 mount fails on Sun Solaris 10 after reboot of client @ 2015-08-24 12:52 Ulrich Gemkow 2015-08-24 20:14 ` J. Bruce Fields 0 siblings, 1 reply; 14+ messages in thread From: Ulrich Gemkow @ 2015-08-24 12:52 UTC (permalink / raw) To: linux-nfs Hello, we have a weired problem with Linux NFSv4.0 Server (Vanilla Kernel 4.1.6) and a Sun Solaris 10 client (all patches applied): When mounting a share on the Solaris client and then rebooting the client without unmounting the share first, after the reboot every attempt to mount the share again gives an I/O error on the client and the mount fails. After a long time (serveral hours) the v4 mount suddenly works again. Mounting a share with vers=2 works always even in times when the v4 mount fails. So it seems the Linux NFSv4 server holds a state for the client which prevents the re-mounting of the share and gives the I/O-error on the client. We use NFSv4 without idmapd. Is there any tip how to debug or solve this? Thank you and best regards -Ulrich -- |----------------------------------------------------------------------- | Ulrich Gemkow | University of Stuttgart, Germany | Institute of Communication Networks and Computer Engineering (IKR) |----------------------------------------------------------------------- ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: NFSv4 mount fails on Sun Solaris 10 after reboot of client 2015-08-24 12:52 NFSv4 mount fails on Sun Solaris 10 after reboot of client Ulrich Gemkow @ 2015-08-24 20:14 ` J. Bruce Fields 2015-08-25 17:28 ` Ulrich Gemkow 0 siblings, 1 reply; 14+ messages in thread From: J. Bruce Fields @ 2015-08-24 20:14 UTC (permalink / raw) To: Ulrich Gemkow; +Cc: linux-nfs On Mon, Aug 24, 2015 at 02:52:55PM +0200, Ulrich Gemkow wrote: > we have a weired problem with Linux NFSv4.0 Server (Vanilla > Kernel 4.1.6) and a Sun Solaris 10 client (all patches applied): > > When mounting a share on the Solaris client and then rebooting > the client without unmounting the share first, after the reboot > every attempt to mount the share again gives an I/O error on > the client and the mount fails. > > After a long time (serveral hours) the v4 mount suddenly works > again. > > Mounting a share with vers=2 works always even in times when > the v4 mount fails. > > So it seems the Linux NFSv4 server holds a state for the client > which prevents the re-mounting of the share and gives the > I/O-error on the client. > > We use NFSv4 without idmapd. > > Is there any tip how to debug or solve this? Best is probably to get a packet trace. So something like: tcpdump -s0 -iem0 -wtmp.pcap and then try the client mount, then kill the tcpdump after the mount fails, and send us tmp.pcap. (And/or take a look at tmp.pcap yourself with wireshark. The interesting question is what kind of error the server is returning when the client tries the mount after reboot.) --b. > > Thank you and best regards > > -Ulrich > > -- > |----------------------------------------------------------------------- > | Ulrich Gemkow > | University of Stuttgart, Germany > | Institute of Communication Networks and Computer Engineering (IKR) > |----------------------------------------------------------------------- > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: NFSv4 mount fails on Sun Solaris 10 after reboot of client 2015-08-24 20:14 ` J. Bruce Fields @ 2015-08-25 17:28 ` Ulrich Gemkow 2015-08-25 21:54 ` J. Bruce Fields 0 siblings, 1 reply; 14+ messages in thread From: Ulrich Gemkow @ 2015-08-25 17:28 UTC (permalink / raw) To: J. Bruce Fields; +Cc: linux-nfs [-- Attachment #1: Type: text/plain, Size: 1873 bytes --] Hello Bruce, On Monday 24 August 2015 22:14:01 J. Bruce Fields wrote: > On Mon, Aug 24, 2015 at 02:52:55PM +0200, Ulrich Gemkow wrote: > > we have a weired problem with Linux NFSv4.0 Server (Vanilla > > Kernel 4.1.6) and a Sun Solaris 10 client (all patches applied): > > > > When mounting a share on the Solaris client and then rebooting > > the client without unmounting the share first, after the reboot > > every attempt to mount the share again gives an I/O error on > > the client and the mount fails. > > > > After a long time (serveral hours) the v4 mount suddenly works > > again. > > > > Mounting a share with vers=2 works always even in times when > > the v4 mount fails. > > > > So it seems the Linux NFSv4 server holds a state for the client > > which prevents the re-mounting of the share and gives the > > I/O-error on the client. > > > > We use NFSv4 without idmapd. > > > > Is there any tip how to debug or solve this? > > Best is probably to get a packet trace. So something like: > > tcpdump -s0 -iem0 -wtmp.pcap > > and then try the client mount, then kill the tcpdump after the mount > fails, and send us tmp.pcap. (And/or take a look at tmp.pcap yourself > with wireshark. The interesting question is what kind of error the > server is returning when the client tries the mount after reboot.) Thank you for your reply. The tcpdump is attached, the relevant packets are 49..52. The error seems to be a SERVERFAULT. Can you see more from the dump? Thanks again and best regards Ulrich > --b. > > > > > Thank you and best regards > > > > -Ulrich > > |----------------------------------------------------------------------- > > | Ulrich Gemkow > > | University of Stuttgart, Germany > > | Institute of Communication Networks and Computer Engineering (IKR) > > |----------------------------------------------------------------------- [-- Attachment #2: nfserr.pcap --] [-- Type: application/octet-stream, Size: 7439 bytes --] ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: NFSv4 mount fails on Sun Solaris 10 after reboot of client 2015-08-25 17:28 ` Ulrich Gemkow @ 2015-08-25 21:54 ` J. Bruce Fields 2015-08-26 19:54 ` Ulrich Gemkow 2015-08-27 6:43 ` Mkrtchyan, Tigran 0 siblings, 2 replies; 14+ messages in thread From: J. Bruce Fields @ 2015-08-25 21:54 UTC (permalink / raw) To: Ulrich Gemkow; +Cc: linux-nfs On Tue, Aug 25, 2015 at 07:28:03PM +0200, Ulrich Gemkow wrote: > Hello Bruce, > > On Monday 24 August 2015 22:14:01 J. Bruce Fields wrote: > > On Mon, Aug 24, 2015 at 02:52:55PM +0200, Ulrich Gemkow wrote: > > > we have a weired problem with Linux NFSv4.0 Server (Vanilla > > > Kernel 4.1.6) and a Sun Solaris 10 client (all patches applied): > > > > > > When mounting a share on the Solaris client and then rebooting > > > the client without unmounting the share first, after the reboot > > > every attempt to mount the share again gives an I/O error on > > > the client and the mount fails. > > > > > > After a long time (serveral hours) the v4 mount suddenly works > > > again. > > > > > > Mounting a share with vers=2 works always even in times when > > > the v4 mount fails. > > > > > > So it seems the Linux NFSv4 server holds a state for the client > > > which prevents the re-mounting of the share and gives the > > > I/O-error on the client. > > > > > > We use NFSv4 without idmapd. > > > > > > Is there any tip how to debug or solve this? > > > > Best is probably to get a packet trace. So something like: > > > > tcpdump -s0 -iem0 -wtmp.pcap > > > > and then try the client mount, then kill the tcpdump after the mount > > fails, and send us tmp.pcap. (And/or take a look at tmp.pcap yourself > > with wireshark. The interesting question is what kind of error the > > server is returning when the client tries the mount after reboot.) > > Thank you for your reply. The tcpdump is attached, the relevant > packets are 49..52. The error seems to be a SERVERFAULT. Can you > see more from the dump? > > Thanks again and best regards The SERVERFAULT is on SETCLIENTID_CONFIRM. In nfsd4_setclientid_confirm(): conf = find_confirmed_client(clid, false, nn); unconf = find_unconfirmed_client(clid, false, nn); /* * We try hard to give out unique clientid's, so if we get an * attempt to confirm the same clientid with a different cred, * there's a bug somewhere. Let's charitably assume it's our * bug. */ status = nfserr_serverfault; if (unconf && !same_creds(&unconf->cl_cred, &rqstp->rq_cred)) goto out; if (conf && !same_creds(&conf->cl_cred, &rqstp->rq_cred)) goto out; The SETCLIENTID and SETCLIENTID_CONFIRM are done with identical auth_unix creds. The clientid that were looking up there was returned from the previous SETCLIENTID, generated by this logic: if (conf && same_verf(&conf->cl_verifier, &clverifier)) /* case 1: probable callback update */ copy_clid(new, conf); else /* case 4 (new client) or cases 2, 3 (client reboot): */ gen_clid(new, nn); So it should be a brand new clientid, unless the client was reusing the old verifier. So perhaps the client is sending the SETCLIENTID with a verifier set to what it used on the previous boot? That sounds like a client bug. The linux client uses a timestamp for the verifier, looks like the Solaris client might too. Is there some reason the clock on this client isn't advancing on reboot? --b. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: NFSv4 mount fails on Sun Solaris 10 after reboot of client 2015-08-25 21:54 ` J. Bruce Fields @ 2015-08-26 19:54 ` Ulrich Gemkow 2015-08-26 20:09 ` J. Bruce Fields 2015-08-27 6:43 ` Mkrtchyan, Tigran 1 sibling, 1 reply; 14+ messages in thread From: Ulrich Gemkow @ 2015-08-26 19:54 UTC (permalink / raw) To: J. Bruce Fields; +Cc: linux-nfs Hello Bruce, On Tuesday 25 August 2015 23:54:56 J. Bruce Fields wrote: > On Tue, Aug 25, 2015 at 07:28:03PM +0200, Ulrich Gemkow wrote: > > Hello Bruce, > > > > On Monday 24 August 2015 22:14:01 J. Bruce Fields wrote: > > > On Mon, Aug 24, 2015 at 02:52:55PM +0200, Ulrich Gemkow wrote: > > > > we have a weired problem with Linux NFSv4.0 Server (Vanilla > > > > Kernel 4.1.6) and a Sun Solaris 10 client (all patches applied): > > > > > > > > When mounting a share on the Solaris client and then rebooting > > > > the client without unmounting the share first, after the reboot > > > > every attempt to mount the share again gives an I/O error on > > > > the client and the mount fails. > > > > > > > > After a long time (serveral hours) the v4 mount suddenly works > > > > again. > > > > > > > > Mounting a share with vers=2 works always even in times when > > > > the v4 mount fails. > > > > > > > > So it seems the Linux NFSv4 server holds a state for the client > > > > which prevents the re-mounting of the share and gives the > > > > I/O-error on the client. > > > > > > > > We use NFSv4 without idmapd. > > > > > > > > Is there any tip how to debug or solve this? > > > > > > Best is probably to get a packet trace. So something like: > > > > > > tcpdump -s0 -iem0 -wtmp.pcap > > > > > > and then try the client mount, then kill the tcpdump after the mount > > > fails, and send us tmp.pcap. (And/or take a look at tmp.pcap yourself > > > with wireshark. The interesting question is what kind of error the > > > server is returning when the client tries the mount after reboot.) > > > > Thank you for your reply. The tcpdump is attached, the relevant > > packets are 49..52. The error seems to be a SERVERFAULT. Can you > > see more from the dump? > > > > Thanks again and best regards > > The SERVERFAULT is on SETCLIENTID_CONFIRM. > > In nfsd4_setclientid_confirm(): > > conf = find_confirmed_client(clid, false, nn); > unconf = find_unconfirmed_client(clid, false, nn); > /* > * We try hard to give out unique clientid's, so if we get an > * attempt to confirm the same clientid with a different cred, > * there's a bug somewhere. Let's charitably assume it's our > * bug. > */ > status = nfserr_serverfault; > if (unconf && !same_creds(&unconf->cl_cred, &rqstp->rq_cred)) > goto out; > if (conf && !same_creds(&conf->cl_cred, &rqstp->rq_cred)) > goto out; > > The SETCLIENTID and SETCLIENTID_CONFIRM are done with identical > auth_unix creds. > > The clientid that were looking up there was returned from the previous > SETCLIENTID, generated by this logic: > > if (conf && same_verf(&conf->cl_verifier, &clverifier)) > /* case 1: probable callback update */ > copy_clid(new, conf); > else /* case 4 (new client) or cases 2, 3 (client reboot): */ > gen_clid(new, nn); > > So it should be a brand new clientid, unless the client was reusing the old > verifier. > > So perhaps the client is sending the SETCLIENTID with a verifier set to what it > used on the previous boot? That sounds like a client bug. The linux > client uses a timestamp for the verifier, looks like the Solaris client > might too. Is there some reason the clock on this client isn't > advancing on reboot? Thank you for the analysis. But the clock of the client advances regularely and as one would expect. The client is SPARC Solaris 10 with the latest patches applied - I cannot believe that this client has such a basic NFS bug. Can you think of any kind of server configuration bug (as said, Vanilla Linux 4.1.6) that causes this error? The NFS server startup system is "self-made"... Thanks again and best regards -Ulrich > > --b. > -- -- |----------------------------------------------------------------------- | Ulrich Gemkow | University of Stuttgart, Germany | Institute of Communication Networks and Computer Engineering (IKR) |----------------------------------------------------------------------- ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: NFSv4 mount fails on Sun Solaris 10 after reboot of client 2015-08-26 19:54 ` Ulrich Gemkow @ 2015-08-26 20:09 ` J. Bruce Fields 2015-08-31 12:08 ` Ulrich Gemkow 0 siblings, 1 reply; 14+ messages in thread From: J. Bruce Fields @ 2015-08-26 20:09 UTC (permalink / raw) To: Ulrich Gemkow; +Cc: linux-nfs On Wed, Aug 26, 2015 at 09:54:22PM +0200, Ulrich Gemkow wrote: > Hello Bruce, > > On Tuesday 25 August 2015 23:54:56 J. Bruce Fields wrote: > > The SERVERFAULT is on SETCLIENTID_CONFIRM. > > > > In nfsd4_setclientid_confirm(): > > > > conf = find_confirmed_client(clid, false, nn); > > unconf = find_unconfirmed_client(clid, false, nn); > > /* > > * We try hard to give out unique clientid's, so if we get an > > * attempt to confirm the same clientid with a different cred, > > * there's a bug somewhere. Let's charitably assume it's our > > * bug. > > */ > > status = nfserr_serverfault; > > if (unconf && !same_creds(&unconf->cl_cred, &rqstp->rq_cred)) > > goto out; > > if (conf && !same_creds(&conf->cl_cred, &rqstp->rq_cred)) > > goto out; > > > > The SETCLIENTID and SETCLIENTID_CONFIRM are done with identical > > auth_unix creds. > > > > The clientid that were looking up there was returned from the previous > > SETCLIENTID, generated by this logic: > > > > if (conf && same_verf(&conf->cl_verifier, &clverifier)) > > /* case 1: probable callback update */ > > copy_clid(new, conf); > > else /* case 4 (new client) or cases 2, 3 (client reboot): */ > > gen_clid(new, nn); > > > > So it should be a brand new clientid, unless the client was reusing the old > > verifier. > > > > So perhaps the client is sending the SETCLIENTID with a verifier set to what it > > used on the previous boot? That sounds like a client bug. The linux > > client uses a timestamp for the verifier, looks like the Solaris client > > might too. Is there some reason the clock on this client isn't > > advancing on reboot? > > Thank you for the analysis. But the clock of the client advances > regularely and as one would expect. OK, thanks for checking that. > The client is SPARC Solaris 10 with the latest patches > applied - I cannot believe that this client has such a > basic NFS bug. To confirm or deny my hypothesis, I think what we want is a longer capture that gets the failing SETCLIENTID_CONFIRM (as seen in the previous capture) but also shows what clientid the client was using before the reboot. So ideal might be something like: - start the capture - mount - create a file (I just want to make sure the client does at least one open) - reboot the client - mount again, see the failure - stop the capture > Can you think of any kind of server configuration bug > (as said, Vanilla Linux 4.1.6) that causes this error? > The NFS server startup system is "self-made"... I can't think of any off hand, if there's a server-side problem here I'd suspect the code before the configuration. --b. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: NFSv4 mount fails on Sun Solaris 10 after reboot of client 2015-08-26 20:09 ` J. Bruce Fields @ 2015-08-31 12:08 ` Ulrich Gemkow 2015-08-31 14:51 ` J. Bruce Fields 0 siblings, 1 reply; 14+ messages in thread From: Ulrich Gemkow @ 2015-08-31 12:08 UTC (permalink / raw) To: J. Bruce Fields; +Cc: linux-nfs Hallo Bruce, On Wednesday 26 August 2015 22:09:40 you wrote: > On Wed, Aug 26, 2015 at 09:54:22PM +0200, Ulrich Gemkow wrote: > > Hello Bruce, > > > > On Tuesday 25 August 2015 23:54:56 J. Bruce Fields wrote: > > > The SERVERFAULT is on SETCLIENTID_CONFIRM. > > > > > > In nfsd4_setclientid_confirm(): > > > > > > conf = find_confirmed_client(clid, false, nn); > > > unconf = find_unconfirmed_client(clid, false, nn); > > > /* > > > * We try hard to give out unique clientid's, so if we get an > > > * attempt to confirm the same clientid with a different cred, > > > * there's a bug somewhere. Let's charitably assume it's our > > > * bug. > > > */ > > > status = nfserr_serverfault; > > > if (unconf && !same_creds(&unconf->cl_cred, &rqstp->rq_cred)) > > > goto out; > > > if (conf && !same_creds(&conf->cl_cred, &rqstp->rq_cred)) > > > goto out; > > > > > > The SETCLIENTID and SETCLIENTID_CONFIRM are done with identical > > > auth_unix creds. > > > > > > The clientid that were looking up there was returned from the previous > > > SETCLIENTID, generated by this logic: > > > > > > if (conf && same_verf(&conf->cl_verifier, &clverifier)) > > > /* case 1: probable callback update */ > > > copy_clid(new, conf); > > > else /* case 4 (new client) or cases 2, 3 (client reboot): */ > > > gen_clid(new, nn); > > > > > > So it should be a brand new clientid, unless the client was reusing the old > > > verifier. > > > > > > So perhaps the client is sending the SETCLIENTID with a verifier set to what it > > > used on the previous boot? That sounds like a client bug. The linux > > > client uses a timestamp for the verifier, looks like the Solaris client > > > might too. Is there some reason the clock on this client isn't > > > advancing on reboot? > > > > Thank you for the analysis. But the clock of the client advances > > regularely and as one would expect. > > OK, thanks for checking that. > > > The client is SPARC Solaris 10 with the latest patches > > applied - I cannot believe that this client has such a > > basic NFS bug. > > To confirm or deny my hypothesis, I think what we want is a longer > capture that gets the failing SETCLIENTID_CONFIRM (as seen in the > previous capture) but also shows what clientid the client was using > before the reboot. So ideal might be something like: > > - start the capture > - mount > - create a file (I just want to make sure the client does at > least one open) > - reboot the client > - mount again, see the failure > - stop the capture I tried but probably made a mistake: To be sure to have a defined state for the test I rebooted the server while clearing all its NFS state and I reinstalled the client - both with the exact same configuration as before. And now the bug unfortunately does not happen again, the mount always succeeds. I did the reinstall of the client also before my first mail to be sure so it seems that the server may have reached an invalid state before - whatever this may has caused. I can only wait until the bug happens again (hoping not :-). Maybe you are able to find a reason from the information given before. I regret to be of no more help. If I can do something please tell me. Thank you very much again and best regards -Ulrich -- | Ulrich Gemkow | University of Stuttgart | Institute of Communication Networks and Computer Engineering (IKR) ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: NFSv4 mount fails on Sun Solaris 10 after reboot of client 2015-08-31 12:08 ` Ulrich Gemkow @ 2015-08-31 14:51 ` J. Bruce Fields 2015-08-31 15:52 ` Mkrtchyan, Tigran 0 siblings, 1 reply; 14+ messages in thread From: J. Bruce Fields @ 2015-08-31 14:51 UTC (permalink / raw) To: Ulrich Gemkow; +Cc: linux-nfs On Mon, Aug 31, 2015 at 02:08:08PM +0200, Ulrich Gemkow wrote: > Hallo Bruce, > > On Wednesday 26 August 2015 22:09:40 you wrote: > > On Wed, Aug 26, 2015 at 09:54:22PM +0200, Ulrich Gemkow wrote: > > > Hello Bruce, > > > > > > On Tuesday 25 August 2015 23:54:56 J. Bruce Fields wrote: > > > > The SERVERFAULT is on SETCLIENTID_CONFIRM. > > > > > > > > In nfsd4_setclientid_confirm(): > > > > > > > > conf = find_confirmed_client(clid, false, nn); > > > > unconf = find_unconfirmed_client(clid, false, nn); > > > > /* > > > > * We try hard to give out unique clientid's, so if we get an > > > > * attempt to confirm the same clientid with a different cred, > > > > * there's a bug somewhere. Let's charitably assume it's our > > > > * bug. > > > > */ > > > > status = nfserr_serverfault; > > > > if (unconf && !same_creds(&unconf->cl_cred, &rqstp->rq_cred)) > > > > goto out; > > > > if (conf && !same_creds(&conf->cl_cred, &rqstp->rq_cred)) > > > > goto out; > > > > > > > > The SETCLIENTID and SETCLIENTID_CONFIRM are done with identical > > > > auth_unix creds. > > > > > > > > The clientid that were looking up there was returned from the previous > > > > SETCLIENTID, generated by this logic: > > > > > > > > if (conf && same_verf(&conf->cl_verifier, &clverifier)) > > > > /* case 1: probable callback update */ > > > > copy_clid(new, conf); > > > > else /* case 4 (new client) or cases 2, 3 (client reboot): */ > > > > gen_clid(new, nn); > > > > > > > > So it should be a brand new clientid, unless the client was reusing the old > > > > verifier. > > > > > > > > So perhaps the client is sending the SETCLIENTID with a verifier set to what it > > > > used on the previous boot? That sounds like a client bug. The linux > > > > client uses a timestamp for the verifier, looks like the Solaris client > > > > might too. Is there some reason the clock on this client isn't > > > > advancing on reboot? > > > > > > Thank you for the analysis. But the clock of the client advances > > > regularely and as one would expect. > > > > OK, thanks for checking that. > > > > > The client is SPARC Solaris 10 with the latest patches > > > applied - I cannot believe that this client has such a > > > basic NFS bug. > > > > To confirm or deny my hypothesis, I think what we want is a longer > > capture that gets the failing SETCLIENTID_CONFIRM (as seen in the > > previous capture) but also shows what clientid the client was using > > before the reboot. So ideal might be something like: > > > > - start the capture > > - mount > > - create a file (I just want to make sure the client does at > > least one open) > > - reboot the client > > - mount again, see the failure > > - stop the capture > > I tried but probably made a mistake: To be sure to have a > defined state for the test I rebooted the server while clearing > all its NFS state and I reinstalled the client - both with the > exact same configuration as before. > > And now the bug unfortunately does not happen again, the mount > always succeeds. I did the reinstall of the client also before > my first mail to be sure so it seems that the server may have > reached an invalid state before - whatever this may has caused. That's interesting! > I can only wait until the bug happens again (hoping not :-). > > Maybe you are able to find a reason from the information > given before. I regret to be of no more help. If I can do > something please tell me. I'm not coming up with any ideas right now. Do let us know if you get into that state again. --b. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: NFSv4 mount fails on Sun Solaris 10 after reboot of client 2015-08-31 14:51 ` J. Bruce Fields @ 2015-08-31 15:52 ` Mkrtchyan, Tigran 0 siblings, 0 replies; 14+ messages in thread From: Mkrtchyan, Tigran @ 2015-08-31 15:52 UTC (permalink / raw) To: J. Bruce Fields; +Cc: Ulrich Gemkow, linux-nfs ----- Original Message ----- > From: "J. Bruce Fields" <bfields@fieldses.org> > To: "Ulrich Gemkow" <ulrich.gemkow@ikr.uni-stuttgart.de> > Cc: linux-nfs@vger.kernel.org > Sent: Monday, August 31, 2015 4:51:48 PM > Subject: Re: NFSv4 mount fails on Sun Solaris 10 after reboot of client > On Mon, Aug 31, 2015 at 02:08:08PM +0200, Ulrich Gemkow wrote: >> Hallo Bruce, >> >> On Wednesday 26 August 2015 22:09:40 you wrote: >> > On Wed, Aug 26, 2015 at 09:54:22PM +0200, Ulrich Gemkow wrote: >> > > Hello Bruce, >> > > >> > > On Tuesday 25 August 2015 23:54:56 J. Bruce Fields wrote: >> > > > The SERVERFAULT is on SETCLIENTID_CONFIRM. >> > > > >> > > > In nfsd4_setclientid_confirm(): >> > > > >> > > > conf = find_confirmed_client(clid, false, nn); >> > > > unconf = find_unconfirmed_client(clid, false, nn); >> > > > /* >> > > > * We try hard to give out unique clientid's, so if we get an >> > > > * attempt to confirm the same clientid with a different cred, >> > > > * there's a bug somewhere. Let's charitably assume it's our >> > > > * bug. >> > > > */ >> > > > status = nfserr_serverfault; >> > > > if (unconf && !same_creds(&unconf->cl_cred, &rqstp->rq_cred)) >> > > > goto out; >> > > > if (conf && !same_creds(&conf->cl_cred, &rqstp->rq_cred)) >> > > > goto out; >> > > > >> > > > The SETCLIENTID and SETCLIENTID_CONFIRM are done with identical >> > > > auth_unix creds. >> > > > >> > > > The clientid that were looking up there was returned from the previous >> > > > SETCLIENTID, generated by this logic: >> > > > >> > > > if (conf && same_verf(&conf->cl_verifier, &clverifier)) >> > > > /* case 1: probable callback update */ >> > > > copy_clid(new, conf); >> > > > else /* case 4 (new client) or cases 2, 3 (client reboot): */ >> > > > gen_clid(new, nn); >> > > > >> > > > So it should be a brand new clientid, unless the client was reusing the old >> > > > verifier. >> > > > >> > > > So perhaps the client is sending the SETCLIENTID with a verifier set to what it >> > > > used on the previous boot? That sounds like a client bug. The linux >> > > > client uses a timestamp for the verifier, looks like the Solaris client >> > > > might too. Is there some reason the clock on this client isn't >> > > > advancing on reboot? >> > > >> > > Thank you for the analysis. But the clock of the client advances >> > > regularely and as one would expect. >> > >> > OK, thanks for checking that. >> > >> > > The client is SPARC Solaris 10 with the latest patches >> > > applied - I cannot believe that this client has such a >> > > basic NFS bug. >> > >> > To confirm or deny my hypothesis, I think what we want is a longer >> > capture that gets the failing SETCLIENTID_CONFIRM (as seen in the >> > previous capture) but also shows what clientid the client was using >> > before the reboot. So ideal might be something like: >> > >> > - start the capture >> > - mount >> > - create a file (I just want to make sure the client does at >> > least one open) >> > - reboot the client >> > - mount again, see the failure >> > - stop the capture >> >> I tried but probably made a mistake: To be sure to have a >> defined state for the test I rebooted the server while clearing >> all its NFS state and I reinstalled the client - both with the >> exact same configuration as before. >> >> And now the bug unfortunately does not happen again, the mount >> always succeeds. I did the reinstall of the client also before >> my first mail to be sure so it seems that the server may have >> reached an invalid state before - whatever this may has caused. > > That's interesting! > >> I can only wait until the bug happens again (hoping not :-). >> >> Maybe you are able to find a reason from the information >> given before. I regret to be of no more help. If I can do >> something please tell me. > > I'm not coming up with any ideas right now. Do let us know if you get > into that state again. To me it sounds like server still has a reference by client's ownerid and fails to detect that verifier is not valid any more. Some kind of leak in client disconnect/reboot detection code (although code looks like doing the right thing). I don't have much inside Linux server to verify that. Tigran. > > --b. > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: NFSv4 mount fails on Sun Solaris 10 after reboot of client 2015-08-25 21:54 ` J. Bruce Fields 2015-08-26 19:54 ` Ulrich Gemkow @ 2015-08-27 6:43 ` Mkrtchyan, Tigran 2015-08-27 18:29 ` J. Bruce Fields 1 sibling, 1 reply; 14+ messages in thread From: Mkrtchyan, Tigran @ 2015-08-27 6:43 UTC (permalink / raw) To: J. Bruce Fields; +Cc: Ulrich Gemkow, linux-nfs ----- Original Message ----- > From: "J. Bruce Fields" <bfields@fieldses.org> > To: "Ulrich Gemkow" <ulrich.gemkow@ikr.uni-stuttgart.de> > Cc: linux-nfs@vger.kernel.org > Sent: Tuesday, August 25, 2015 11:54:56 PM > Subject: Re: NFSv4 mount fails on Sun Solaris 10 after reboot of client > On Tue, Aug 25, 2015 at 07:28:03PM +0200, Ulrich Gemkow wrote: >> Hello Bruce, >> >> On Monday 24 August 2015 22:14:01 J. Bruce Fields wrote: >> > On Mon, Aug 24, 2015 at 02:52:55PM +0200, Ulrich Gemkow wrote: >> > > we have a weired problem with Linux NFSv4.0 Server (Vanilla >> > > Kernel 4.1.6) and a Sun Solaris 10 client (all patches applied): >> > > >> > > When mounting a share on the Solaris client and then rebooting >> > > the client without unmounting the share first, after the reboot >> > > every attempt to mount the share again gives an I/O error on >> > > the client and the mount fails. >> > > >> > > After a long time (serveral hours) the v4 mount suddenly works >> > > again. >> > > >> > > Mounting a share with vers=2 works always even in times when >> > > the v4 mount fails. >> > > >> > > So it seems the Linux NFSv4 server holds a state for the client >> > > which prevents the re-mounting of the share and gives the >> > > I/O-error on the client. >> > > >> > > We use NFSv4 without idmapd. >> > > >> > > Is there any tip how to debug or solve this? >> > >> > Best is probably to get a packet trace. So something like: >> > >> > tcpdump -s0 -iem0 -wtmp.pcap >> > >> > and then try the client mount, then kill the tcpdump after the mount >> > fails, and send us tmp.pcap. (And/or take a look at tmp.pcap yourself >> > with wireshark. The interesting question is what kind of error the >> > server is returning when the client tries the mount after reboot.) >> >> Thank you for your reply. The tcpdump is attached, the relevant >> packets are 49..52. The error seems to be a SERVERFAULT. Can you >> see more from the dump? >> >> Thanks again and best regards > > The SERVERFAULT is on SETCLIENTID_CONFIRM. > > In nfsd4_setclientid_confirm(): > > conf = find_confirmed_client(clid, false, nn); > unconf = find_unconfirmed_client(clid, false, nn); > /* > * We try hard to give out unique clientid's, so if we get an > * attempt to confirm the same clientid with a different cred, > * there's a bug somewhere. Let's charitably assume it's our > * bug. > */ > status = nfserr_serverfault; > if (unconf && !same_creds(&unconf->cl_cred, &rqstp->rq_cred)) > goto out; > if (conf && !same_creds(&conf->cl_cred, &rqstp->rq_cred)) > goto out; > > The SETCLIENTID and SETCLIENTID_CONFIRM are done with identical > auth_unix creds. > > The clientid that were looking up there was returned from the previous > SETCLIENTID, generated by this logic: > > if (conf && same_verf(&conf->cl_verifier, &clverifier)) > /* case 1: probable callback update */ > copy_clid(new, conf); > else /* case 4 (new client) or cases 2, 3 (client reboot): */ > gen_clid(new, nn); > > So it should be a brand new clientid, unless the client was reusing the old > verifier. > > So perhaps the client is sending the SETCLIENTID with a verifier set to what it > used on the previous boot? That sounds like a client bug. The linux > client uses a timestamp for the verifier, looks like the Solaris client > might too. Is there some reason the clock on this client isn't > advancing on reboot? probably NFS4ERR_STALE_CLIENTID is a better error code for this scenario. Tigran. > > --b. > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: NFSv4 mount fails on Sun Solaris 10 after reboot of client 2015-08-27 6:43 ` Mkrtchyan, Tigran @ 2015-08-27 18:29 ` J. Bruce Fields 2015-08-27 20:36 ` Frank Filz 0 siblings, 1 reply; 14+ messages in thread From: J. Bruce Fields @ 2015-08-27 18:29 UTC (permalink / raw) To: Mkrtchyan, Tigran; +Cc: Ulrich Gemkow, linux-nfs On Thu, Aug 27, 2015 at 08:43:51AM +0200, Mkrtchyan, Tigran wrote: > > > ----- Original Message ----- > > From: "J. Bruce Fields" <bfields@fieldses.org> > > To: "Ulrich Gemkow" <ulrich.gemkow@ikr.uni-stuttgart.de> > > Cc: linux-nfs@vger.kernel.org > > Sent: Tuesday, August 25, 2015 11:54:56 PM > > Subject: Re: NFSv4 mount fails on Sun Solaris 10 after reboot of client > > > On Tue, Aug 25, 2015 at 07:28:03PM +0200, Ulrich Gemkow wrote: > >> Hello Bruce, > >> > >> On Monday 24 August 2015 22:14:01 J. Bruce Fields wrote: > >> > On Mon, Aug 24, 2015 at 02:52:55PM +0200, Ulrich Gemkow wrote: > >> > > we have a weired problem with Linux NFSv4.0 Server (Vanilla > >> > > Kernel 4.1.6) and a Sun Solaris 10 client (all patches applied): > >> > > > >> > > When mounting a share on the Solaris client and then rebooting > >> > > the client without unmounting the share first, after the reboot > >> > > every attempt to mount the share again gives an I/O error on > >> > > the client and the mount fails. > >> > > > >> > > After a long time (serveral hours) the v4 mount suddenly works > >> > > again. > >> > > > >> > > Mounting a share with vers=2 works always even in times when > >> > > the v4 mount fails. > >> > > > >> > > So it seems the Linux NFSv4 server holds a state for the client > >> > > which prevents the re-mounting of the share and gives the > >> > > I/O-error on the client. > >> > > > >> > > We use NFSv4 without idmapd. > >> > > > >> > > Is there any tip how to debug or solve this? > >> > > >> > Best is probably to get a packet trace. So something like: > >> > > >> > tcpdump -s0 -iem0 -wtmp.pcap > >> > > >> > and then try the client mount, then kill the tcpdump after the mount > >> > fails, and send us tmp.pcap. (And/or take a look at tmp.pcap yourself > >> > with wireshark. The interesting question is what kind of error the > >> > server is returning when the client tries the mount after reboot.) > >> > >> Thank you for your reply. The tcpdump is attached, the relevant > >> packets are 49..52. The error seems to be a SERVERFAULT. Can you > >> see more from the dump? > >> > >> Thanks again and best regards > > > > The SERVERFAULT is on SETCLIENTID_CONFIRM. > > > > In nfsd4_setclientid_confirm(): > > > > conf = find_confirmed_client(clid, false, nn); > > unconf = find_unconfirmed_client(clid, false, nn); > > /* > > * We try hard to give out unique clientid's, so if we get an > > * attempt to confirm the same clientid with a different cred, > > * there's a bug somewhere. Let's charitably assume it's our > > * bug. > > */ > > status = nfserr_serverfault; > > if (unconf && !same_creds(&unconf->cl_cred, &rqstp->rq_cred)) > > goto out; > > if (conf && !same_creds(&conf->cl_cred, &rqstp->rq_cred)) > > goto out; > > > > The SETCLIENTID and SETCLIENTID_CONFIRM are done with identical > > auth_unix creds. > > > > The clientid that were looking up there was returned from the previous > > SETCLIENTID, generated by this logic: > > > > if (conf && same_verf(&conf->cl_verifier, &clverifier)) > > /* case 1: probable callback update */ > > copy_clid(new, conf); > > else /* case 4 (new client) or cases 2, 3 (client reboot): */ > > gen_clid(new, nn); > > > > So it should be a brand new clientid, unless the client was reusing the old > > verifier. > > > > So perhaps the client is sending the SETCLIENTID with a verifier set to what it > > used on the previous boot? That sounds like a client bug. The linux > > client uses a timestamp for the verifier, looks like the Solaris client > > might too. Is there some reason the clock on this client isn't > > advancing on reboot? > > probably NFS4ERR_STALE_CLIENTID is a better error code for this scenario. SERVERFAULT is obviously lame, but I don't know that STALE_CLIENTID is right either. Another thing that's weird is: > After a long time (serveral hours) the v4 mount suddenly works > again. I'd expect the clent to expire after a lease period (default 90 seconds), I don't know what could be happening that would take hours. Also I don't know why those creds would change after a reboot. Anyway I think a trace covering the reboot is still the best hope of an explanation. --b. ^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: NFSv4 mount fails on Sun Solaris 10 after reboot of client 2015-08-27 18:29 ` J. Bruce Fields @ 2015-08-27 20:36 ` Frank Filz 2015-08-28 18:06 ` 'J. Bruce Fields' 0 siblings, 1 reply; 14+ messages in thread From: Frank Filz @ 2015-08-27 20:36 UTC (permalink / raw) To: 'J. Bruce Fields', 'Mkrtchyan, Tigran' Cc: 'Ulrich Gemkow', linux-nfs > On Thu, Aug 27, 2015 at 08:43:51AM +0200, Mkrtchyan, Tigran wrote: > > > > > > ----- Original Message ----- > > > From: "J. Bruce Fields" <bfields@fieldses.org> > > > To: "Ulrich Gemkow" <ulrich.gemkow@ikr.uni-stuttgart.de> > > > Cc: linux-nfs@vger.kernel.org > > > Sent: Tuesday, August 25, 2015 11:54:56 PM > > > Subject: Re: NFSv4 mount fails on Sun Solaris 10 after reboot of > > > client > > > > > On Tue, Aug 25, 2015 at 07:28:03PM +0200, Ulrich Gemkow wrote: > > >> Hello Bruce, > > >> > > >> On Monday 24 August 2015 22:14:01 J. Bruce Fields wrote: > > >> > On Mon, Aug 24, 2015 at 02:52:55PM +0200, Ulrich Gemkow wrote: > > >> > > we have a weired problem with Linux NFSv4.0 Server (Vanilla > > >> > > Kernel 4.1.6) and a Sun Solaris 10 client (all patches applied): > > >> > > > > >> > > When mounting a share on the Solaris client and then rebooting > > >> > > the client without unmounting the share first, after the reboot > > >> > > every attempt to mount the share again gives an I/O error on > > >> > > the client and the mount fails. > > >> > > > > >> > > After a long time (serveral hours) the v4 mount suddenly works > > >> > > again. > > >> > > > > >> > > Mounting a share with vers=2 works always even in times when > > >> > > the v4 mount fails. > > >> > > > > >> > > So it seems the Linux NFSv4 server holds a state for the client > > >> > > which prevents the re-mounting of the share and gives the > > >> > > I/O-error on the client. > > >> > > > > >> > > We use NFSv4 without idmapd. > > >> > > > > >> > > Is there any tip how to debug or solve this? > > >> > > > >> > Best is probably to get a packet trace. So something like: > > >> > > > >> > tcpdump -s0 -iem0 -wtmp.pcap > > >> > > > >> > and then try the client mount, then kill the tcpdump after the > > >> > mount fails, and send us tmp.pcap. (And/or take a look at > > >> > tmp.pcap yourself with wireshark. The interesting question is > > >> > what kind of error the server is returning when the client tries > > >> > the mount after reboot.) > > >> > > >> Thank you for your reply. The tcpdump is attached, the relevant > > >> packets are 49..52. The error seems to be a SERVERFAULT. Can you > > >> see more from the dump? > > >> > > >> Thanks again and best regards > > > > > > The SERVERFAULT is on SETCLIENTID_CONFIRM. > > > > > > In nfsd4_setclientid_confirm(): > > > > > > conf = find_confirmed_client(clid, false, nn); > > > unconf = find_unconfirmed_client(clid, false, nn); > > > /* > > > * We try hard to give out unique clientid's, so if we get an > > > * attempt to confirm the same clientid with a different cred, > > > * there's a bug somewhere. Let's charitably assume it's our > > > * bug. > > > */ > > > status = nfserr_serverfault; > > > if (unconf && !same_creds(&unconf->cl_cred, &rqstp->rq_cred)) > > > goto out; > > > if (conf && !same_creds(&conf->cl_cred, &rqstp->rq_cred)) > > > goto out; If the creds don't match, the return should be NFS4ERR_CLID_INUSE per section 16.34.5. IMPLEMENTATION first bullet after DRC discussion. At least the way I read RFC 7530... > > > The SETCLIENTID and SETCLIENTID_CONFIRM are done with identical > > > auth_unix creds. > > > > > > The clientid that were looking up there was returned from the > > > previous SETCLIENTID, generated by this logic: > > > > > > if (conf && same_verf(&conf->cl_verifier, &clverifier)) > > > /* case 1: probable callback update */ > > > copy_clid(new, conf); > > > else /* case 4 (new client) or cases 2, 3 (client reboot): */ > > > gen_clid(new, nn); > > > > > > So it should be a brand new clientid, unless the client was reusing > > > the old verifier. > > > > > > So perhaps the client is sending the SETCLIENTID with a verifier set > > > to what it used on the previous boot? That sounds like a client > > > bug. The linux client uses a timestamp for the verifier, looks like > > > the Solaris client might too. Is there some reason the clock on > > > this client isn't advancing on reboot? > > > > probably NFS4ERR_STALE_CLIENTID is a better error code for this scenario. > > SERVERFAULT is obviously lame, but I don't know that STALE_CLIENTID is > right either. > > Another thing that's weird is: > > > After a long time (serveral hours) the v4 mount suddenly works > > again. > > I'd expect the clent to expire after a lease period (default 90 seconds), I don't > know what could be happening that would take hours. > > Also I don't know why those creds would change after a reboot. > > Anyway I think a trace covering the reboot is still the best hope of an > explanation. Frank ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: NFSv4 mount fails on Sun Solaris 10 after reboot of client 2015-08-27 20:36 ` Frank Filz @ 2015-08-28 18:06 ` 'J. Bruce Fields' 2015-09-01 17:43 ` 'J. Bruce Fields' 0 siblings, 1 reply; 14+ messages in thread From: 'J. Bruce Fields' @ 2015-08-28 18:06 UTC (permalink / raw) To: Frank Filz Cc: 'Mkrtchyan, Tigran', 'Ulrich Gemkow', linux-nfs On Thu, Aug 27, 2015 at 01:36:38PM -0700, Frank Filz wrote: > > On Thu, Aug 27, 2015 at 08:43:51AM +0200, Mkrtchyan, Tigran wrote: > > > > > > > > > ----- Original Message ----- > > > > From: "J. Bruce Fields" <bfields@fieldses.org> > > > > To: "Ulrich Gemkow" <ulrich.gemkow@ikr.uni-stuttgart.de> > > > > Cc: linux-nfs@vger.kernel.org > > > > Sent: Tuesday, August 25, 2015 11:54:56 PM > > > > Subject: Re: NFSv4 mount fails on Sun Solaris 10 after reboot of > > > > client > > > > > > > On Tue, Aug 25, 2015 at 07:28:03PM +0200, Ulrich Gemkow wrote: > > > >> Hello Bruce, > > > >> > > > >> On Monday 24 August 2015 22:14:01 J. Bruce Fields wrote: > > > >> > On Mon, Aug 24, 2015 at 02:52:55PM +0200, Ulrich Gemkow wrote: > > > >> > > we have a weired problem with Linux NFSv4.0 Server (Vanilla > > > >> > > Kernel 4.1.6) and a Sun Solaris 10 client (all patches applied): > > > >> > > > > > >> > > When mounting a share on the Solaris client and then rebooting > > > >> > > the client without unmounting the share first, after the reboot > > > >> > > every attempt to mount the share again gives an I/O error on > > > >> > > the client and the mount fails. > > > >> > > > > > >> > > After a long time (serveral hours) the v4 mount suddenly works > > > >> > > again. > > > >> > > > > > >> > > Mounting a share with vers=2 works always even in times when > > > >> > > the v4 mount fails. > > > >> > > > > > >> > > So it seems the Linux NFSv4 server holds a state for the client > > > >> > > which prevents the re-mounting of the share and gives the > > > >> > > I/O-error on the client. > > > >> > > > > > >> > > We use NFSv4 without idmapd. > > > >> > > > > > >> > > Is there any tip how to debug or solve this? > > > >> > > > > >> > Best is probably to get a packet trace. So something like: > > > >> > > > > >> > tcpdump -s0 -iem0 -wtmp.pcap > > > >> > > > > >> > and then try the client mount, then kill the tcpdump after the > > > >> > mount fails, and send us tmp.pcap. (And/or take a look at > > > >> > tmp.pcap yourself with wireshark. The interesting question is > > > >> > what kind of error the server is returning when the client tries > > > >> > the mount after reboot.) > > > >> > > > >> Thank you for your reply. The tcpdump is attached, the relevant > > > >> packets are 49..52. The error seems to be a SERVERFAULT. Can you > > > >> see more from the dump? > > > >> > > > >> Thanks again and best regards > > > > > > > > The SERVERFAULT is on SETCLIENTID_CONFIRM. > > > > > > > > In nfsd4_setclientid_confirm(): > > > > > > > > conf = find_confirmed_client(clid, false, nn); > > > > unconf = find_unconfirmed_client(clid, false, nn); > > > > /* > > > > * We try hard to give out unique clientid's, so if we get an > > > > * attempt to confirm the same clientid with a different cred, > > > > * there's a bug somewhere. Let's charitably assume it's our > > > > * bug. > > > > */ > > > > status = nfserr_serverfault; > > > > if (unconf && !same_creds(&unconf->cl_cred, &rqstp->rq_cred)) > > > > goto out; > > > > if (conf && !same_creds(&conf->cl_cred, &rqstp->rq_cred)) > > > > goto out; > > If the creds don't match, the return should be NFS4ERR_CLID_INUSE per > section 16.34.5. IMPLEMENTATION first bullet after DRC discussion. > > At least the way I read RFC 7530... I assumed that was only the case when the long-form client-provided client identifier matched, but here we're looking at records matched by the shorthand server-generated clientid. Very weird that that we'd get to this case (and without hitting CLID_INUSE on the setclientid?). There's something we don't understand. Anyway, looking at the SETCLIENTID_CONFIRM description in 7530, I think you're right, they're recommending CLIDN_INUSE for this case. Doubt that would actually help in Ulrich's case, though. --b. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: NFSv4 mount fails on Sun Solaris 10 after reboot of client 2015-08-28 18:06 ` 'J. Bruce Fields' @ 2015-09-01 17:43 ` 'J. Bruce Fields' 0 siblings, 0 replies; 14+ messages in thread From: 'J. Bruce Fields' @ 2015-09-01 17:43 UTC (permalink / raw) To: Frank Filz Cc: 'Mkrtchyan, Tigran', 'Ulrich Gemkow', linux-nfs On Fri, Aug 28, 2015 at 02:06:12PM -0400, 'J. Bruce Fields' wrote: > On Thu, Aug 27, 2015 at 01:36:38PM -0700, Frank Filz wrote: > > If the creds don't match, the return should be NFS4ERR_CLID_INUSE per > > section 16.34.5. IMPLEMENTATION first bullet after DRC discussion. > > > > At least the way I read RFC 7530... > > I assumed that was only the case when the long-form client-provided > client identifier matched, but here we're looking at records matched by > the shorthand server-generated clientid. Very weird that that we'd get > to this case (and without hitting CLID_INUSE on the setclientid?). > There's something we don't understand. > > Anyway, looking at the SETCLIENTID_CONFIRM description in 7530, I think > you're right, they're recommending CLIDN_INUSE for this case. Doubt > that would actually help in Ulrich's case, though. May as well fix that anyway, I guess.--b. commit 3d70e158ea9b Author: J. Bruce Fields <bfields@redhat.com> Date: Tue Sep 1 13:40:53 2015 -0400 nfsd: return CLID_INUSE for unexpected SETCLIENTID_CONFIRM case Somebody with a Solaris client was hitting this case. We haven't figured out why yet, and don't have a reproducer. Meanwhile Frank noticed that RFC 7530 actually recommends CLID_INUSE for this case. Unlikely to help the original reporter, but may as well fix it. Reported-by: Frank Filz <ffilzlnx@mindspring.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index 416f32e34a33..08746ec1d44a 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -3111,10 +3111,11 @@ nfsd4_setclientid_confirm(struct svc_rqst *rqstp, /* * We try hard to give out unique clientid's, so if we get an * attempt to confirm the same clientid with a different cred, - * there's a bug somewhere. Let's charitably assume it's our - * bug. + * the client may be buggy; this should never happen. + * + * Nevertheless, RFC 7530 recommends INUSE for this case: */ - status = nfserr_serverfault; + status = nfserr_clid_inuse; if (unconf && !same_creds(&unconf->cl_cred, &rqstp->rq_cred)) goto out; if (conf && !same_creds(&conf->cl_cred, &rqstp->rq_cred)) ^ permalink raw reply related [flat|nested] 14+ messages in thread
end of thread, other threads:[~2015-09-01 17:43 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-08-24 12:52 NFSv4 mount fails on Sun Solaris 10 after reboot of client Ulrich Gemkow 2015-08-24 20:14 ` J. Bruce Fields 2015-08-25 17:28 ` Ulrich Gemkow 2015-08-25 21:54 ` J. Bruce Fields 2015-08-26 19:54 ` Ulrich Gemkow 2015-08-26 20:09 ` J. Bruce Fields 2015-08-31 12:08 ` Ulrich Gemkow 2015-08-31 14:51 ` J. Bruce Fields 2015-08-31 15:52 ` Mkrtchyan, Tigran 2015-08-27 6:43 ` Mkrtchyan, Tigran 2015-08-27 18:29 ` J. Bruce Fields 2015-08-27 20:36 ` Frank Filz 2015-08-28 18:06 ` 'J. Bruce Fields' 2015-09-01 17:43 ` 'J. Bruce Fields'
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).