* [PATCH] NFSv4: clear exception state on successful mkdir retry
       [not found] <177745671692.1474915.5018486129724109553@noble.neil.brown.name>
@ 2026-04-29 10:49 ` Igor Raits
  2026-05-13  7:18   ` Thorsten Leemhuis
  0 siblings, 1 reply; 2+ messages in thread
From: Igor Raits @ 2026-04-29 10:49 UTC
  To: Trond Myklebust, Anna Schumaker
  Cc: NeilBrown, Jan Čípa, linux-nfs, linux-kernel, stable

After a server returns NFS4ERR_DELAY for an NFSv4 CREATE issued by
mkdir(2), the client correctly waits and retries. When the retry
succeeds, however, mkdir(2) can still surface -EEXIST to userspace
even though the directory was just created on the server.

Reproducer (random 16-hex names so collisions are not the cause)
against an in-kernel Linux nfsd; reproduces under both NFSv4.0 and
NFSv4.2:

    N=2000000; base=/var/gdc/export
    for ((i=1; i<=N; i++)); do
        d=$base/$(openssl rand -hex 8)
        mkdir "$d" 2>/dev/null || echo "$(date +%T) failed loop=$i $d"
        rmdir "$d" 2>/dev/null
    done

Failures cluster at the cadence at which the server-side auth/export
cache refresh path causes nfsd to return NFS4ERR_DELAY for CREATE.

A wire trace of one failure (the three CREATE RPCs all come from a
single mkdir(2), generated by the do-while in nfs4_proc_mkdir()):

    client -> server  CREATE name=...  -> NFS4ERR_DELAY
    ~100 ms later
    client -> server  CREATE name=...  -> NFS4_OK (dir created)
    ~80 us later
    client -> server  CREATE name=...  -> NFS4ERR_EXIST (correct)

Since commit dd862da61e91 ("nfs: fix incorrect handling of large-number
NFS errors in nfs4_do_mkdir()"), nfs4_handle_exception() is called only
when _nfs4_proc_mkdir() returned an error. That gate breaks retry-state
hygiene: nfs4_do_handle_exception() resets exception.{delay,recovering,
retry} to 0 on entry, so calling it on success is what previously
cleared the retry flag set by the preceding NFS4ERR_DELAY iteration.
With the gate in place, exception.retry stays at 1 after the successful
retry, the loop runs once more, and the resulting CREATE for an
already-created name yields NFS4ERR_EXIST -> -EEXIST to userspace.

Drop the conditional and call nfs4_handle_exception() unconditionally,
matching every other do-while in fs/nfs/nfs4proc.c (nfs4_proc_symlink(),
nfs4_proc_link(), etc.). The dentry/status separation introduced by
that commit is preserved.

Fixes: dd862da61e91 ("nfs: fix incorrect handling of large-number NFS errors in nfs4_do_mkdir()")
Reported-and-tested-by: Jan Čípa <jan.cipa@gooddata.com>
Closes: https://lore.kernel.org/linux-nfs/CA+9S74hSp_tJu2Ffe2BPNC2T25gfkhgjjDkdgSsF5c2rnJq_wA@mail.gmail.com/
Reviewed-by: NeilBrown <neil@brown.name>
Cc: stable@vger.kernel.org
Signed-off-by: Igor Raits <igor.raits@gmail.com>
---
 fs/nfs/nfs4proc.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index a0885ae55abc..ffd14141ea1d 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5393,10 +5393,9 @@ static struct dentry *nfs4_proc_mkdir(struct inode *dir, struct dentry *dentry,
 	do {
 		alias = _nfs4_proc_mkdir(dir, dentry, sattr, label, &err);
 		trace_nfs4_mkdir(dir, &dentry->d_name, err);
+		err = nfs4_handle_exception(NFS_SERVER(dir), err, &exception);
 		if (err)
-			alias = ERR_PTR(nfs4_handle_exception(NFS_SERVER(dir),
-							      err,
-							      &exception));
+			alias = ERR_PTR(err);
 	} while (exception.retry);
 	nfs4_label_release_security(label);
 
-- 
2.53.0
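
The stale-retry sequence described in the commit message can be reproduced in a small userspace model. This is only a sketch under stated assumptions: every name below (sim_server, sim_exception, sim_create, sim_handle_exception, sim_mkdir_gated, sim_mkdir_ungated, and the SIM_* codes) is hypothetical and is not a kernel API. sim_handle_exception() models just the two properties the message relies on, namely that the handler clears the retry flag on entry and sets it again when the server answered with a delay.

```c
/* Hypothetical status codes; stand-ins for NFS4_OK, NFS4ERR_DELAY, NFS4ERR_EXIST. */
enum { SIM_OK = 0, SIM_DELAY = 1, SIM_EXIST = 2 };

/* Stand-in for the retry flag in the exception state. */
struct sim_exception {
	int retry;
};

/* Toy server: counts CREATE calls and remembers whether the name exists. */
struct sim_server {
	int calls;
	int created;
};

/* First CREATE gets DELAY; the next one creates the name; later ones see EXIST. */
static int sim_create(struct sim_server *s)
{
	s->calls++;
	if (s->calls == 1)
		return SIM_DELAY;
	if (s->created)
		return SIM_EXIST;
	s->created = 1;
	return SIM_OK;
}

/*
 * Simplified model of the handler: reset retry on entry, then ask for a
 * retry (and swallow the error) only when the server said DELAY.
 */
static int sim_handle_exception(int err, struct sim_exception *exc)
{
	exc->retry = 0;
	if (err == SIM_DELAY) {
		exc->retry = 1;
		return 0;
	}
	return err;
}

/* Gated variant: the handler runs only on error, so retry is never cleared on success. */
static int sim_mkdir_gated(struct sim_server *s)
{
	struct sim_exception exc = { 0 };
	int err, result;

	do {
		err = sim_create(s);
		result = 0;
		if (err)
			result = sim_handle_exception(err, &exc);
	} while (exc.retry);
	return result;
}

/* Ungated variant: the handler runs on every iteration, clearing retry on success. */
static int sim_mkdir_ungated(struct sim_server *s)
{
	struct sim_exception exc = { 0 };
	int err;

	do {
		err = sim_create(s);
		err = sim_handle_exception(err, &exc);
	} while (exc.retry);
	return err;
}
```

Starting from a zeroed struct sim_server, sim_mkdir_gated() issues three CREATEs and returns SIM_EXIST, which mirrors the wire trace above, while sim_mkdir_ungated() stops after two CREATEs and returns SIM_OK.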
* Re: [PATCH] NFSv4: clear exception state on successful mkdir retry
  2026-04-29 10:49 ` [PATCH] NFSv4: clear exception state on successful mkdir retry Igor Raits
@ 2026-05-13  7:18   ` Thorsten Leemhuis
  0 siblings, 0 replies; 2+ messages in thread
From: Thorsten Leemhuis @ 2026-05-13  7:18 UTC
  To: Trond Myklebust, Anna Schumaker
  Cc: NeilBrown, Jan Čípa, linux-nfs, linux-kernel, stable, Igor Raits,
	Linux kernel regressions list

[top-posting to facilitate processing]

@NFSv4 maintainers, just wondering, did this patch maybe fall through
the cracks? It fixes a regression, that's why it's on my radar. Or was
there some progress and I missed it?

Ciao, Thorsten

On 4/29/26 12:49, Igor Raits wrote:
> After a server returns NFS4ERR_DELAY for an NFSv4 CREATE issued by
> mkdir(2), the client correctly waits and retries. When the retry
> succeeds, however, mkdir(2) can still surface -EEXIST to userspace
> even though the directory was just created on the server.
>
> Reproducer (random 16-hex names so collisions are not the cause)
> against an in-kernel Linux nfsd; reproduces under both NFSv4.0 and
> NFSv4.2:
>
>     N=2000000; base=/var/gdc/export
>     for ((i=1; i<=N; i++)); do
>         d=$base/$(openssl rand -hex 8)
>         mkdir "$d" 2>/dev/null || echo "$(date +%T) failed loop=$i $d"
>         rmdir "$d" 2>/dev/null
>     done
>
> Failures cluster at the cadence at which the server-side auth/export
> cache refresh path causes nfsd to return NFS4ERR_DELAY for CREATE.
>
> A wire trace of one failure (the three CREATE RPCs all come from a
> single mkdir(2), generated by the do-while in nfs4_proc_mkdir()):
>
>     client -> server  CREATE name=...  -> NFS4ERR_DELAY
>     ~100 ms later
>     client -> server  CREATE name=...  -> NFS4_OK (dir created)
>     ~80 us later
>     client -> server  CREATE name=...  -> NFS4ERR_EXIST (correct)
>
> Since commit dd862da61e91 ("nfs: fix incorrect handling of large-number
> NFS errors in nfs4_do_mkdir()"), nfs4_handle_exception() is called only
> when _nfs4_proc_mkdir() returned an error. That gate breaks retry-state
> hygiene: nfs4_do_handle_exception() resets exception.{delay,recovering,
> retry} to 0 on entry, so calling it on success is what previously
> cleared the retry flag set by the preceding NFS4ERR_DELAY iteration.
> With the gate in place, exception.retry stays at 1 after the successful
> retry, the loop runs once more, and the resulting CREATE for an
> already-created name yields NFS4ERR_EXIST -> -EEXIST to userspace.
>
> Drop the conditional and call nfs4_handle_exception() unconditionally,
> matching every other do-while in fs/nfs/nfs4proc.c (nfs4_proc_symlink(),
> nfs4_proc_link(), etc.). The dentry/status separation introduced by
> that commit is preserved.
>
> Fixes: dd862da61e91 ("nfs: fix incorrect handling of large-number NFS errors in nfs4_do_mkdir()")
> Reported-and-tested-by: Jan Čípa <jan.cipa@gooddata.com>
> Closes: https://lore.kernel.org/linux-nfs/CA+9S74hSp_tJu2Ffe2BPNC2T25gfkhgjjDkdgSsF5c2rnJq_wA@mail.gmail.com/
> Reviewed-by: NeilBrown <neil@brown.name>
> Cc: stable@vger.kernel.org
> Signed-off-by: Igor Raits <igor.raits@gmail.com>
> ---
>  fs/nfs/nfs4proc.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index a0885ae55abc..ffd14141ea1d 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -5393,10 +5393,9 @@ static struct dentry *nfs4_proc_mkdir(struct inode *dir, struct dentry *dentry,
>  	do {
>  		alias = _nfs4_proc_mkdir(dir, dentry, sattr, label, &err);
>  		trace_nfs4_mkdir(dir, &dentry->d_name, err);
> +		err = nfs4_handle_exception(NFS_SERVER(dir), err, &exception);
>  		if (err)
> -			alias = ERR_PTR(nfs4_handle_exception(NFS_SERVER(dir),
> -							      err,
> -							      &exception));
> +			alias = ERR_PTR(err);
>  	} while (exception.retry);
>  	nfs4_label_release_security(label);
>