linux-tegra.vger.kernel.org archive mirror
* Re: [PATCH] nfs: simplify and guarantee owner uniqueness.
       [not found]   ` <172652955677.17050.4744720185342907808@noble.neil.brown.name>
@ 2024-09-19  6:38     ` Jon Hunter
  0 siblings, 0 replies; 3+ messages in thread
From: Jon Hunter @ 2024-09-19  6:38 UTC
  To: NeilBrown, Steven Price
  Cc: Trond Myklebust, Anna Schumaker, linux-nfs,
	linux-tegra@vger.kernel.org

Hi Neil,

On 17/09/2024 00:32, NeilBrown wrote:
> On Tue, 17 Sep 2024, Steven Price wrote:
>>
>> Hi Neil,
>>
>> I'm seeing issues on a test board using an NFS root which I've bisected
>> to this commit in linux-next. The kernel spits out many errors of the form:
>>
>> [    7.478995] NFS: v4 server <ip>  returned a bad sequence-id error!
>> [    7.599462] NFS: v4 server <ip>  returned a bad sequence-id error!
>> [    7.600570] NFS: v4 server <ip>  returned a bad sequence-id error!
>> [    7.615243] NFS: v4 server <ip>  returned a bad sequence-id error!
>> [    7.636756] NFS: v4 server <ip>  returned a bad sequence-id error!
>> [    7.644808] NFS: v4 server <ip>  returned a bad sequence-id error!
>> [    7.653605] NFS: v4 server <ip>  returned a bad sequence-id error!
>> [    7.692836] NFS: nfs4_reclaim_open_state: unhandled error -10026
>> [    7.699573] NFSv4: state recovery failed for open file arm-linux-gnueabihf/libgpg-error.so.0.29.0, error = -10026
>> [    7.711055] NFSv4: state recovery failed for open file arm-linux-gnueabihf/libgpg-error.so.0.29.0, error = -10026
>>
>> (with the filename obviously varying)
>>
>> The NFS server is a standard Debian 12 system.
>>
>> Any ideas?
> 
> Not immediately.  It appears that when the client opens a file during
> recovery, the server doesn't like the seqid that it uses...
> 
> Recovery happens when the server restarts and when the client and server
> have been out of contact for an extended period of time (>90 seconds by
> default).
> Was either of those the case here?  Which one?


I am seeing various failures on -next and bisect is also pointing to
this commit. Reverting it does fix these issues. On one board I also
observed ...

[   12.674296] NFS: v4 server 192.168.99.1  returned a bad sequence-id error!
[   12.780476] NFS: v4 server 192.168.99.1  returned a bad sequence-id error!
[   12.829071] NFS: v4 server 192.168.99.1  returned a bad sequence-id error!
[   12.971432] NFS: v4 server 192.168.99.1  returned a bad sequence-id error!
[   13.102700] NFS: v4 server 192.168.99.1  returned a bad sequence-id error!
[   13.171315] NFS: v4 server 192.168.99.1  returned a bad sequence-id error!
[   13.216019] NFS: v4 server 192.168.99.1  returned a bad sequence-id error!
[   13.273610] NFS: v4 server 192.168.99.1  returned a bad sequence-id error!
[   13.298471] NFS: v4 server 192.168.99.1  returned a bad sequence-id error!

And on the same board I see ...

[   16.496417] NFS: nfs4_reclaim_open_state: unhandled error -10026
[   16.991736] NFS: nfs4_reclaim_open_state: unhandled error -10026
[   17.106226] NFS: nfs4_reclaim_open_state: unhandled error -10026

Jon

-- 
nvpublic


* Re: [PATCH] nfs: simplify and guarantee owner uniqueness.
       [not found] ` <172680136351.17050.10296437171546281772@noble.neil.brown.name>
@ 2024-09-22 12:56   ` Jon Hunter
  2024-09-22 23:21     ` NeilBrown
  0 siblings, 1 reply; 3+ messages in thread
From: Jon Hunter @ 2024-09-22 12:56 UTC
  To: NeilBrown, Steven Price
  Cc: Trond Myklebust, Anna Schumaker, linux-nfs,
	linux-tegra@vger.kernel.org

Hi Neil,

On 20/09/2024 04:02, NeilBrown wrote:
> On Thu, 19 Sep 2024, Steven Price wrote:
>> On 19/09/2024 02:29, NeilBrown wrote:
>>> On Wed, 18 Sep 2024, Steven Price wrote:
>>>> Hi Neil,
>>>>
>>>> (Dropping the list/others due to the attachment)
>>>
>>> (re-adding others now - thanks for the attachment).
>>>
>>>>
>>>> Attached, this is booting a kernel compiled from 00fd839ca761 ("nfs:
>>>> simplify and guarantee owner uniqueness.") which uses an NFS root with a
>>>> Debian bullseye userspace.
>>>
>>> This shows that the owner_id was always different - or almost always.
>>> Once it repeated we got an error because the seqid kept increasing.
>>> This is because the xdr encoding is broken.
>>>
>>> Please apply this incremental patch and confirm that it works now.
>>
>> Thanks, I've tested the below and I don't see NFS errors any more.
>>
>> Tested-by: Steven Price <steven.price@arm.com>
> 
> Thanks Steve.
> 
> Anna: could you please squash this fix into the commit?
> Jon: could you please confirm that this fixes your problem too?
> 
> Thanks,
> NeilBrown
> 
> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> index 1aaf908acc5d..88bcbcba1381 100644
> --- a/fs/nfs/nfs4xdr.c
> +++ b/fs/nfs/nfs4xdr.c
> @@ -1429,7 +1429,7 @@ static inline void encode_openhdr(struct xdr_stream *xdr, const struct nfs_opena
>   	*p++ = cpu_to_be32(28);
>   	p = xdr_encode_opaque_fixed(p, "open id:", 8);
>   	*p++ = cpu_to_be32(arg->server->s_dev);
> -	xdr_encode_hyper(p, arg->id.uniquifier);
> +	p = xdr_encode_hyper(p, arg->id.uniquifier);
>   	xdr_encode_hyper(p, arg->id.create_time);
>   }


Works for me!

Tested-by: Jon Hunter <jonathanh@nvidia.com>

Thanks
Jon

-- 
nvpublic


* Re: [PATCH] nfs: simplify and guarantee owner uniqueness.
  2024-09-22 12:56   ` Jon Hunter
@ 2024-09-22 23:21     ` NeilBrown
  0 siblings, 0 replies; 3+ messages in thread
From: NeilBrown @ 2024-09-22 23:21 UTC
  To: Jon Hunter
  Cc: Steven Price, Trond Myklebust, Anna Schumaker, linux-nfs,
	linux-tegra@vger.kernel.org

On Sun, 22 Sep 2024, Jon Hunter wrote:
> Hi Neil,
> 
> On 20/09/2024 04:02, NeilBrown wrote:
> > On Thu, 19 Sep 2024, Steven Price wrote:
> >> On 19/09/2024 02:29, NeilBrown wrote:
> >>> On Wed, 18 Sep 2024, Steven Price wrote:
> >>>> Hi Neil,
> >>>>
> >>>> (Dropping the list/others due to the attachment)
> >>>
> >>> (re-adding others now - thanks for the attachment).
> >>>
> >>>>
> >>>> Attached, this is booting a kernel compiled from 00fd839ca761 ("nfs:
> >>>> simplify and guarantee owner uniqueness.") which uses an NFS root with a
> >>>> Debian bullseye userspace.
> >>>
> >>> This shows that the owner_id was always different - or almost always.
> >>> Once it repeated we got an error because the seqid kept increasing.
> >>> This is because the xdr encoding is broken.
> >>>
> >>> Please apply this incremental patch and confirm that it works now.
> >>
> >> Thanks, I've tested the below and I don't see NFS errors any more.
> >>
> >> Tested-by: Steven Price <steven.price@arm.com>
> > 
> > Thanks Steve.
> > 
> > > Anna: could you please squash this fix into the commit?
> > > Jon: could you please confirm that this fixes your problem too?
> > 
> > Thanks,
> > NeilBrown
> > 
> > diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> > index 1aaf908acc5d..88bcbcba1381 100644
> > --- a/fs/nfs/nfs4xdr.c
> > +++ b/fs/nfs/nfs4xdr.c
> > @@ -1429,7 +1429,7 @@ static inline void encode_openhdr(struct xdr_stream *xdr, const struct nfs_opena
> >   	*p++ = cpu_to_be32(28);
> >   	p = xdr_encode_opaque_fixed(p, "open id:", 8);
> >   	*p++ = cpu_to_be32(arg->server->s_dev);
> > -	xdr_encode_hyper(p, arg->id.uniquifier);
> > +	p = xdr_encode_hyper(p, arg->id.uniquifier);
> >   	xdr_encode_hyper(p, arg->id.create_time);
> >   }
> 
> 
> Works for me!

Thanks Jon.
Anna has updated the patch so the fixed version is what will land
upstream.

NeilBrown
