* [PATCH AUTOSEL 7.0-6.6] NFS: Use nlmclnt_shutdown_rpc_clnt() to safely shut down NLM
From: Sasha Levin @ 2026-04-28 10:40 UTC
To: patches, stable
Cc: Chuck Lever, Jeff Layton, Sasha Levin, trond.myklebust, anna,
linux-nfs, linux-kernel
From: Chuck Lever <chuck.lever@oracle.com>
[ Upstream commit 840621fd2ff23ada8b9262d90477e75232566e6b ]
A race condition exists in shutdown_store() when writing to the sysfs
"shutdown" file concurrently with nlm_shutdown_hosts_net(). Without
synchronization, the following sequence can occur:
1. shutdown_store() reads server->nlm_host (non-NULL)
2. nlm_shutdown_hosts_net() acquires nlm_host_mutex, calls
rpc_shutdown_client(), sets h_rpcclnt to NULL, and potentially
frees the host via nlm_gc_hosts()
3. shutdown_store() dereferences the now-stale or freed host
Introduce nlmclnt_shutdown_rpc_clnt(), which acquires nlm_host_mutex
before accessing h_rpcclnt. This synchronizes with
nlm_shutdown_hosts_net() and ensures the rpc_clnt pointer remains
valid during the shutdown operation.
This change also improves API layering: NFS client code no longer
needs to include the internal lockd header to access nlm_host fields.
The new helper resides in bind.h alongside other public lockd
interfaces.
Reported-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
# Analysis: NFS: Use nlmclnt_shutdown_rpc_clnt() to safely shut down NLM
## Phase 1: Commit Message Forensics
**Step 1.1 — Subject parsing:**
Record: Subsystem `NFS:` / action `Use` (implicit "fix by using") /
summary: replace direct h_rpcclnt dereference with a helper that adds
synchronization for safely shutting down NLM client RPC state.
**Step 1.2 — Tags:**
Record:
- `Reported-by: Jeff Layton <jlayton@kernel.org>` (subsystem expert)
- `Reviewed-by: Jeff Layton <jlayton@kernel.org>` (same person,
NFS/lockd maintainer-adjacent)
- `Signed-off-by: Chuck Lever <chuck.lever@oracle.com>` (NFSD
maintainer)
- No `Fixes:` tag, no `Cc: stable`, no `Link:`, no syzbot — expected
(this is a candidate under review).
**Step 1.3 — Body analysis:**
Record: Author describes a concrete race between `shutdown_store()` and
`nlm_shutdown_hosts_net()`:
1. `shutdown_store()` reads `server->nlm_host` (non-NULL).
2. Concurrent path takes `nlm_host_mutex`, shuts down the RPC client,
sets `h_rpcclnt = NULL`, may free host via `nlm_gc_hosts()`.
3. `shutdown_store()` dereferences freed/stale pointer.
The consequence described is a dereference of a stale or freed host,
i.e. a use-after-free (UAF) or NULL pointer dereference (NPD).
The mechanism described is a cross-subsystem race (NFS client code
vs. lockd).
**Step 1.4 — Hidden fix detection:**
Record: Not hidden — the message explicitly calls out a "race
condition". The v2->v3 changelog in the cover letter (found in mailing
list) explicitly states: "Changes since v2: Serialize client-side NLM
shutdown to avoid UAF and NPD." This confirms the commit targets a
UAF/NPD, not just cleanup.
## Phase 2: Diff Analysis
**Step 2.1 — Inventory:**
Record: 3 files changed, +32/-2.
- `fs/lockd/host.c`: +29 (new helper function
`nlmclnt_shutdown_rpc_clnt` + callback `nlmclnt_match_all` +
`EXPORT_SYMBOL_GPL`).
- `fs/nfs/sysfs.c`: 2 lines modified — header include switched from
`lockd.h` to `bind.h`; direct `h_rpcclnt` access replaced with new
helper call.
- `include/linux/lockd/bind.h`: +1 line (extern declaration).
Classification: small, contained, cross-module (lockd + NFS + public
header).
**Step 2.2 — Code flow change:**
Record:
- Before: `shutdown_client(server->nlm_host->h_rpcclnt)` — direct
unsynchronized dereference of internal struct field; if `h_rpcclnt` is
NULL or host was freed, immediate crash.
- After: `nlmclnt_shutdown_rpc_clnt(server->nlm_host)` — lockd-internal
helper acquires `nlm_host_mutex`, reads `h_rpcclnt`, NULL-checks it,
then sets `cl_shutdown=1` and cancels tasks under the mutex.
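
The before/after difference in Step 2.2 can be illustrated with a minimal userspace sketch. This is not the kernel code: `struct host`, `struct client`, `host_mutex`, `host_shutdown_client()` and `host_release_client()` are hypothetical stand-ins for `struct nlm_host`, `struct rpc_clnt`, `nlm_host_mutex`, the new helper and the release path, and a pthread mutex is assumed in place of the kernel mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct rpc_clnt and struct nlm_host. */
struct client {
	int shutdown;		/* models cl_shutdown */
	int cancelled;		/* counts cancel passes; models rpc_cancel_tasks() */
};

struct host {
	struct client *clnt;	/* models h_rpcclnt; may be cleared concurrently */
};

/* Models nlm_host_mutex: one lock shared by shutdown and release paths. */
static pthread_mutex_t host_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Release path: detaches the client while holding the shared mutex. */
static void host_release_client(struct host *h)
{
	assert(h != NULL);
	pthread_mutex_lock(&host_mutex);
	h->clnt = NULL;
	pthread_mutex_unlock(&host_mutex);
}

/*
 * Shutdown path: read, NULL-check and use the pointer under the same
 * mutex. Returns true if a client was actually shut down.
 */
static bool host_shutdown_client(struct host *h)
{
	bool done = false;

	assert(h != NULL);
	pthread_mutex_lock(&host_mutex);
	if (h->clnt) {
		h->clnt->shutdown = 1;
		h->clnt->cancelled++;
		done = true;
	}
	pthread_mutex_unlock(&host_mutex);
	return done;
}
```

Because both paths take the same lock, the shutdown path either sees a valid client or sees NULL, never a pointer that is cleared between the check and the use. The sketch deliberately leaves the lifetime of the host object itself to the caller.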
**Step 2.3 — Bug mechanism:**
Record: Category (b) synchronization / race + (d) memory safety (NULL
check).
- Adds `mutex_lock(&nlm_host_mutex)` around `h_rpcclnt` read and use.
This is the same mutex taken by `nlmclnt_release_host()` (via
`refcount_dec_and_mutex_lock`) and `nlm_shutdown_hosts_net()`
(explicit `mutex_lock`).
- Adds an explicit `if (clnt)` NULL check in the helper, previously
absent at the call site.
- Adds `EXPORT_SYMBOL_GPL` so the helper is callable from fs/nfs.
**Step 2.4 — Fix quality:**
Record: Standard mutex pattern. The helper semantics are clear (safe
when `h_rpcclnt` is NULL, serialized against release). Regression risk
is low: the operations under the mutex (set flag + cancel tasks) are
short and don't sleep on other locks that could deadlock against the
mutex. Slight concern: adds a new exported symbol, but this is a
standard idiom in kernel subsystems.
## Phase 3: Git History Investigation
**Step 3.1 — Blame:**
Record: The buggy callsite
`shutdown_client(server->nlm_host->h_rpcclnt)` was introduced by commit
`7d3e26a054c88` "NFS: Cancel all existing RPC tasks when shutdown" (in
v6.5-rc1). Prior to that, commit `d9615d166c7ed` "NFS: add sysfs
shutdown knob" (also v6.5-rc1) just set `cl_shutdown=1` without
cancelling tasks.
**Step 3.2 — Fixes: tag:**
Record: No explicit Fixes: tag in this commit. The targeted vulnerable
code was introduced in v6.5 (present in 6.6+ stable trees).
**Step 3.3 — File history:**
Record: `fs/nfs/sysfs.c` and `fs/lockd/host.c` are stable files with
steady maintenance. No related prerequisite series needed for the fix
itself (although other commits in the 14-patch series move headers
around, that movement is NOT required for this commit to apply).
**Step 3.4 — Author:**
Record: Chuck Lever is NFSD subsystem maintainer; Jeff Layton is a long-
term NFS/lockd developer. Both have deep expertise in this area.
**Step 3.5 — Dependencies:**
Record: This commit depends only on `nlm_host_mutex` and
`rpc_cancel_tasks()` both of which pre-date v6.5 by years. It does NOT
depend on the sibling header-relocation commits in the series
(`2c562c6e67156`, `4db2f8a016dc9`, `f4d5f8caadd85`) — those are
standalone refactoring.
## Phase 4: Mailing List Research
**Step 4.1 — Find original submission:**
Record: `b4 dig` located the patch at
https://lore.kernel.org/all/20260128151935.1646063-7-cel@kernel.org/ —
[PATCH v4 06/14]. Series title: "Clarify module API boundaries".
**Step 4.1 (evolution):** `b4 dig -a` shows revisions v1 → v2 → v3 → v4.
v2 (Message-ID 20260123185259.1215767-6-cel@kernel.org, subject "NFS:
Use nlmclnt_rpc_clnt() helper to retrieve nlm_host's rpc_clnt") had a
simpler fix: use existing `nlmclnt_rpc_clnt()` + NULL check. v3 upgraded
the fix to introduce `nlmclnt_shutdown_rpc_clnt()` with full mutex
serialization, because the author recognized the NULL check alone has a
race window (TOCTOU — read→check→use unsynchronized against clearing by
release path).
**Step 4.2 — Reviewers:**
Record: Original recipients: NeilBrown, Jeff Layton, Olga Kornievskaia,
Dai Ngo, Tom Talpey, linux-nfs list. Jeff Layton provided `Reviewed-by:`
— he is THE lockd/NFS domain expert. No NAKs or objections.
**Step 4.3 — Bug report:**
Record: No external bug report / Link / syzbot — the race was found via
code review by Jeff Layton (Reported-by).
**Step 4.4 — Related patches:**
Record: The surrounding 13 patches in the series are mostly header-moves
and minor lockd refactoring. This specific patch is self-contained. No
dependency on other patches in the series is required for the bug fix to
work.
**Step 4.5 — Stable ML:**
Record: No pre-existing stable nomination. Not mentioned on stable lists
(the patch was merged to mainline only recently, on 2026-04-20, via
Chuck Lever's nfsd-7.1 pull).
## Phase 5: Semantic / Call-Graph Analysis
**Step 5.1 — Key functions:**
Record: New: `nlmclnt_shutdown_rpc_clnt()`, `nlmclnt_match_all()`.
Modified: `shutdown_store()` in fs/nfs/sysfs.c.
**Step 5.2 — Callers of `shutdown_store`:**
Record: Called by sysfs when user writes "1" to
`/sys/fs/nfs/server-N/shutdown`. Attribute is `__ATTR_RW` → mode 0644 →
write requires root (CAP_SYS_ADMIN in practice).
**Step 5.3 — Callees:**
Record: `nlmclnt_shutdown_rpc_clnt()` calls
`mutex_lock(&nlm_host_mutex)`, reads `host->h_rpcclnt`, sets
`clnt->cl_shutdown`, and calls `rpc_cancel_tasks()`. All are existing,
stable APIs.
**Step 5.4 — Call chain reachability:**
Record: Trigger path: root writes `1` to sysfs shutdown file while an
NFS v2/v3 mount (with file lock traffic having triggered
`nlm_bind_host`) is being torn down. Reachable from userspace (as root).
**Step 5.5 — Similar patterns:**
Record: `nlmclnt_release_host()` already uses
`refcount_dec_and_mutex_lock(&h_count, &nlm_host_mutex)` — the fix's use
of the same mutex is consistent with the existing locking model.
`nlm_shutdown_hosts_net()` also acquires this mutex.
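
Step 5.5 leans on the "drop a reference, take the mutex only on the last put" idiom. The following is a rough userspace model of that idiom, not the kernel implementation: `dec_not_one()` and `dec_and_mutex_lock()` are hypothetical names, and C11 atomics plus a pthread mutex are assumed in place of `refcount_t` and the kernel mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Decrement *count unless it is exactly 1. Returns true if the
 * decrement happened, false if the count was 1 (left untouched).
 */
static bool dec_not_one(atomic_int *count)
{
	int old = atomic_load(count);

	while (old != 1) {
		assert(old > 1);	/* dropping a dead reference is a bug */
		if (atomic_compare_exchange_weak(count, &old, old - 1))
			return true;
		/* CAS failure reloaded 'old'; retry with the fresh value. */
	}
	return false;
}

/*
 * Drop one reference. Returns true with *lock held only when the
 * count reached 0; otherwise returns false with *lock not held.
 */
static bool dec_and_mutex_lock(atomic_int *count, pthread_mutex_t *lock)
{
	if (dec_not_one(count))
		return false;		/* fast path: not the last reference */

	pthread_mutex_lock(lock);
	if (atomic_fetch_sub(count, 1) != 1) {
		/* A new reference appeared meanwhile; not the last put. */
		pthread_mutex_unlock(lock);
		return false;
	}
	return true;			/* caller tears down, then unlocks */
}
```

The point of the idiom is that teardown of the last reference always happens with the mutex held, so any other path that takes the same mutex before touching the object is serialized against destruction.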
## Phase 6: Cross-Referencing Stable Trees
**Step 6.1 — Does the buggy code exist in stable?**
Record: `git show stable/linux-6.6.y:fs/nfs/sysfs.c` confirms the pre-
fix code `shutdown_client(server->nlm_host->h_rpcclnt)` is present at
the same location (line 288). Same for 6.12.y. Buggy code introduced in
v6.5, so present in 6.6+, 6.12+, and 6.15/7.0 stable trees.
**Step 6.2 — Backport complications:**
Record: `fs/lockd/host.c` and `fs/nfs/sysfs.c` in 6.6.y and 6.12.y are
very close to the pre-fix mainline state. The new helper can be added
cleanly. `include/linux/lockd/bind.h` in stable trees has the same
structure. Backport should apply with minimal or no adjustment. The
header include switch from `lockd.h` to `bind.h` in fs/nfs/sysfs.c will
still compile in stable because bind.h provides sufficient forward
declaration (struct nlm_host is used only as pointer type after the
fix).
**Step 6.3 — Related stable fixes:**
Record: No earlier fix for this race in stable trees.
## Phase 7: Subsystem Context
**Step 7.1 — Criticality:**
Record: `fs/nfs/` + `fs/lockd/` — IMPORTANT (affects all NFS client
users doing file locking on v2/v3 mounts; NFSv4 has its own locking and
is unaffected).
**Step 7.2 — Activity:**
Record: Mature subsystem with steady development. Bug has existed since
v6.5 (approximately 2 years); fix came from Chuck Lever/Jeff Layton as
part of a code audit / refactoring effort.
## Phase 8: Impact / Risk
**Step 8.1 — Affected users:**
Record: NFSv2/v3 users with shutdown sysfs knob actively used (used by
some admin tooling / container orchestration scenarios). Knob is root-
only.
**Step 8.2 — Trigger conditions:**
Record: Requires (a) root privilege; (b) simultaneous write to
`/sys/fs/nfs/server-N/shutdown` and NFS unmount (`nfs_free_server` path
through `nfs_destroy_server → nlmclnt_done → nlmclnt_release_host`); (c)
the unmount drops the nlm_host refcount to 0, triggering destruction.
Narrow timing window. Not userspace-triggerable by unprivileged users.
**Step 8.3 — Failure mode:**
Record: Use-after-free (host freed by release path while sysfs writer
dereferences `h_rpcclnt`) or NULL-pointer dereference (if `h_rpcclnt`
has been cleared) → kernel oops. Severity: HIGH (kernel crash, potential
memory corruption). Not security-critical in the strict sense (requires
root to trigger), but is a real UAF.
**Step 8.4 — Risk/Benefit:**
Record:
- BENEFIT: Fixes a real UAF / NPD race condition.
- RISK: 32-line change, adds a new EXPORT_SYMBOL_GPL, but semantics are
simple and reviewed by the domain expert. The added mutex is already
held by the peer paths, so no new locking model introduced.
- Ratio: Benefit clearly outweighs risk; fix is small and surgical.
## Phase 9: Synthesis
**Step 9.1 — Evidence summary:**
FOR:
- Real UAF / NPD in the sysfs shutdown path (author explicitly noted in
v2→v3 changelog: "Serialize client-side NLM shutdown to avoid UAF and
NPD").
- Reviewed by Jeff Layton (subsystem expert who is also the Reporter).
- Small, contained, well-structured fix using existing mutex.
- Buggy code exists in 6.6+, 6.12+ stable trees.
- Backport should apply cleanly; affected files in stable are similar to
pre-fix mainline.
AGAINST:
- Only recently merged to mainline (2026-04-20), so minimal time
under test.
- No `Cc: stable`, no `Fixes:` tag (expected for a candidate under
review).
- Adds a new exported symbol (small kABI expansion, not a blocker for
stable).
- Requires root to trigger, so not a privilege-escalation security bug.
- Part of a larger "API boundary" refactoring series (but this
particular commit is a bona fide fix, not refactoring alone).
UNRESOLVED:
- Exact race mechanism for client hosts vs. `nlm_shutdown_hosts_net()`
(which iterates `nlm_server_hosts`) is somewhat idealized in the
commit message; however, the more certain race between
`shutdown_store()` and `nlmclnt_release_host()` (via
`nfs_destroy_server()`) does exist — `nfs_destroy_server` runs before
`nfs_sysfs_remove_server`, so the sysfs file is still active while the
nlm_host refcount may hit zero and the host can be freed. The fix
addresses that path as well (both hold `nlm_host_mutex`).
**Step 9.2 — Rules checklist:**
1. Obviously correct and tested? Yes (code-review correct; tested in
linux-next via fs-next integration).
2. Fixes a real bug? Yes (UAF / NPD in sysfs shutdown path).
3. Important issue? Yes (kernel crash, potential UAF).
4. Small and contained? Yes (+32/-2 across 3 files).
5. No new features? Correct — adds a helper to support the fix, no new
user-facing functionality.
6. Applies to stable trees? Yes (target code is present and nearly
identical in 6.6.y / 6.12.y).
**Step 9.3 — Exception categories:** Not a quirk / device ID / DT /
build / doc fix. It's a straightforward bug fix.
**Step 9.4 — Decision:**
This is a legitimate UAF/NPD fix for a race in a user-reachable (though
root-gated) sysfs handler. The bug has been latent since v6.5. The fix
is small, well-reviewed by the subsystem expert, and will backport
cleanly. Although the commit is very fresh in mainline and the trigger
window is narrow, the severity (UAF / kernel oops) and the cleanliness
of the fix favor inclusion. The new exported symbol is a trivial API
expansion used solely to implement the fix.
## Verification
- [Phase 1] Parsed all commit tags: `Reported-by`/`Reviewed-by` Jeff
Layton, SOB Chuck Lever. No Fixes:, stable, Link: or syzbot (confirmed
by reading the full message).
- [Phase 1] `b4 dig -c 840621fd2ff23 -a` showed v1/v2/v3/v4 revisions;
v2→v3 cover-letter changelog says "Serialize client-side NLM shutdown
to avoid UAF and NPD" (read in mbox).
- [Phase 2] Diff confirmed: +29 in `fs/lockd/host.c`, +1 in
`include/linux/lockd/bind.h`, and +2/-2 in `fs/nfs/sysfs.c` (one
include line and one function call replaced). EXPORT_SYMBOL_GPL added.
- [Phase 2] Read `fs/lockd/host.c` around `nlmclnt_release_host()` —
confirmed same `nlm_host_mutex` is used, so new helper's lock is
consistent with existing release path.
- [Phase 3] `git describe --contains 7d3e26a054c88` → v6.5-rc1~91^2~6:
confirms buggy call `shutdown_client(server->nlm_host->h_rpcclnt)` was
introduced in v6.5.
- [Phase 3] `git describe --contains d9615d166c7ed` → v6.5-rc1~91^2~7:
confirms sysfs shutdown knob itself is from v6.5.
- [Phase 4] `b4 dig -c 840621fd2ff23` → confirmed lore URL and that
patch is 06/14 of "Clarify module API boundaries" series.
- [Phase 4] `b4 dig -c 840621fd2ff23 -w` → original recipients included
NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey,
linux-nfs list. Domain experts CC'd.
- [Phase 4] Read v2 and v4 of the patch from lore mbox; v2 used existing
`nlmclnt_rpc_clnt()` helper + NULL check; v3+ upgraded to proper mutex
serialization. Confirmed no NAKs.
- [Phase 5] Read `shutdown_store()` and confirmed `__ATTR_RW(shutdown)`
→ mode 0644 → root-only write.
- [Phase 5] Read `nfs_free_server()` in `fs/nfs/client.c` and confirmed
`server->destroy(server)` (which calls
`nlmclnt_done`/`nlmclnt_release_host`) runs BEFORE
`nfs_sysfs_remove_server()` — a real window during which the sysfs
file is still active while the nlm_host could be dropped.
- [Phase 6] `git show stable/linux-6.6.y:fs/nfs/sysfs.c` confirmed pre-
fix `shutdown_client(server->nlm_host->h_rpcclnt)` present at line 288
in 6.6.y.
- [Phase 6] `git show stable/linux-6.12.y:fs/lockd/host.c` and
`include/linux/lockd/bind.h` confirmed both files are structurally
compatible for a clean backport.
- [Phase 7] Read attribute macros confirming sysfs file mode.
- [Phase 8] Severity assessment: UAF of `struct nlm_host` → kernel oops
/ potential memory corruption = HIGH.
- UNVERIFIED: The exact claim that `nlm_shutdown_hosts_net()` races with
`shutdown_store()` on client hosts is partly unverified —
`nlm_shutdown_hosts_net()` iterates `nlm_server_hosts`, not
`nlm_client_hosts`. However, the more concrete race between
`shutdown_store()` and `nlmclnt_release_host()` (via NFS unmount path)
IS verified via code reading. The fix uses the same mutex and
addresses both serialization points. This detail doesn't change the
YES/NO decision — there is a real race the patch closes.
- UNVERIFIED: No concrete crash stack trace or user bug report exists —
the race was found by code review. This does not change the assessment
(UAF fixes from code audit are routinely backported).
The fix is small, closes a real UAF in a user-reachable (root-gated)
sysfs path present since v6.5, and is reviewed by the domain expert.
It is expected to backport cleanly to 6.6.y and 6.12.y.
**YES**
fs/lockd/host.c | 29 +++++++++++++++++++++++++++++
fs/nfs/sysfs.c | 4 ++--
include/linux/lockd/bind.h | 1 +
3 files changed, 32 insertions(+), 2 deletions(-)
diff --git a/fs/lockd/host.c b/fs/lockd/host.c
index 1a9582a10a86f..015900d2d4c22 100644
--- a/fs/lockd/host.c
+++ b/fs/lockd/host.c
@@ -306,6 +306,35 @@ void nlmclnt_release_host(struct nlm_host *host)
}
}
+/* Callback for rpc_cancel_tasks() - matches all tasks for cancellation */
+static bool nlmclnt_match_all(const struct rpc_task *task, const void *data)
+{
+ return true;
+}
+
+/**
+ * nlmclnt_shutdown_rpc_clnt - safely shut down NLM client RPC operations
+ * @host: nlm_host to shut down
+ *
+ * Cancels outstanding RPC tasks and marks the client as shut down.
+ * Synchronizes with nlmclnt_release_host() via nlm_host_mutex to prevent
+ * races between shutdown and host destruction. Safe to call if h_rpcclnt
+ * is NULL or already shut down.
+ */
+void nlmclnt_shutdown_rpc_clnt(struct nlm_host *host)
+{
+ struct rpc_clnt *clnt;
+
+ mutex_lock(&nlm_host_mutex);
+ clnt = host->h_rpcclnt;
+ if (clnt) {
+ clnt->cl_shutdown = 1;
+ rpc_cancel_tasks(clnt, -EIO, nlmclnt_match_all, NULL);
+ }
+ mutex_unlock(&nlm_host_mutex);
+}
+EXPORT_SYMBOL_GPL(nlmclnt_shutdown_rpc_clnt);
+
/**
* nlmsvc_lookup_host - Find an NLM host handle matching a remote client
* @rqstp: incoming NLM request
diff --git a/fs/nfs/sysfs.c b/fs/nfs/sysfs.c
index 1da4f707f9efe..3a197252a1329 100644
--- a/fs/nfs/sysfs.c
+++ b/fs/nfs/sysfs.c
@@ -13,7 +13,7 @@
#include <linux/nfs_fs.h>
#include <net/net_namespace.h>
#include <linux/rcupdate.h>
-#include <linux/lockd/lockd.h>
+#include <linux/lockd/bind.h>
#include "internal.h"
#include "nfs4_fs.h"
@@ -288,7 +288,7 @@ shutdown_store(struct kobject *kobj, struct kobj_attribute *attr,
shutdown_client(server->client_acl);
if (server->nlm_host)
- shutdown_client(server->nlm_host->h_rpcclnt);
+ nlmclnt_shutdown_rpc_clnt(server->nlm_host);
out:
shutdown_nfs_client(server->nfs_client);
return count;
diff --git a/include/linux/lockd/bind.h b/include/linux/lockd/bind.h
index c53c81242e727..40c124f932252 100644
--- a/include/linux/lockd/bind.h
+++ b/include/linux/lockd/bind.h
@@ -58,6 +58,7 @@ struct nlmclnt_initdata {
extern struct nlm_host *nlmclnt_init(const struct nlmclnt_initdata *nlm_init);
extern void nlmclnt_done(struct nlm_host *host);
extern struct rpc_clnt *nlmclnt_rpc_clnt(struct nlm_host *host);
+extern void nlmclnt_shutdown_rpc_clnt(struct nlm_host *host);
/*
* NLM client operations provide a means to modify RPC processing of NLM
--
2.53.0
* [PATCH AUTOSEL 7.0] NFS: fix writeback in presence of errors
From: Sasha Levin @ 2026-04-28 10:40 UTC
To: patches, stable
Cc: Olga Kornievskaia, Trond Myklebust, Sasha Levin, anna, linux-nfs,
linux-kernel
From: Olga Kornievskaia <okorniev@redhat.com>
[ Upstream commit 5d3869a41f3608101c00ff9c9c7c2364c555fa65 ]
When running xfstest generic/751, under certain conditions a
writeback IO can get stuck in one of the following two patterns.
Pattern#1: writeback IO experiences ENOSPC on an offset smaller
than the filesize. Example,
write offset=0 len=4096 how=unstable OK
write offset=8192 len=4096 how=unstable OK
write offset=12288 len=4096 how=unstable ENOSPC
write offset=4096 len=4096 how=unstable ENOSPC
The client sends a commit and receives a verifier which is different
from that of the last successful write. It marks the pages dirty and
writeback retries. But it again sends the writes as unstable and gets
into the same pattern, running into the ENOSPC error and sending a
commit because the writes were sent as unstable.
Pattern#2: an unstable write followed by a short write and ENOSPC.
write offset=0 len=4096 how=unstable OK
write offset=4096 len=4096 how=unstable returns OK but count=100
write offset=4197 len=3996 how=stable returns ENOSPC
The client sends a commit and receives a verifier different from
that of the last unstable write. The same behaviour is retried in a
loop.
Instead, this patch proposes to identify those conditions and mark
requests to be done synchronously. A previous solution tried to mark
this in the nfs_page, however that is not persistent, so mark it in
the nfs_open_context instead.
Furthermore, the same problem occurs in the localio code path, so
recognize that IO needs to be done sync in that case as well.
Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
# Analysis: NFS: fix writeback in presence of errors
## PHASE 1: COMMIT MESSAGE FORENSICS
**Step 1.1: Subject Line**
- Subsystem prefix: `NFS:`
- Action verb: `fix`
- Record: NFS / fix / writeback gets stuck in a loop in presence of
errors (ENOSPC, short writes, verifier mismatches).
**Step 1.2: Tags**
- Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> (Red Hat NFS
developer, frequent NFS contributor)
- Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> (NFS
subsystem maintainer)
- NO `Fixes:` tag
- NO `Cc: stable@vger.kernel.org`
- NO `Reported-by:` (the bug was found internally via xfstests
generic/751)
- NO `Tested-by:` / `Reviewed-by:` / `Acked-by:`
- NO `Link:` to lore/syzkaller/bugzilla
- Record: Two SOBs (author + maintainer). Absence of stable/Fixes tags
is exactly why this is in the review pipeline. No reviewer trailers
but maintainer has SOB indicating he applied it.
**Step 1.3: Commit Body**
- Bug description: Writeback IO gets stuck in an endless retry loop in
two patterns:
- Pattern #1: Unstable write → ENOSPC → COMMIT → verifier mismatch →
mark pages dirty → retry as unstable → loop.
- Pattern #2: Unstable write returning short (count=100) → stable
write at offset 4197 → ENOSPC → COMMIT → verifier mismatch → loop.
- Reproduction: `xfstest generic/751`.
- Mechanism explained: After short write/ENOSPC/verifier mismatch, mark
`nfs_open_context` so subsequent writes are forced to `NFS_FILE_SYNC`
(stable). v1 used a per-page `PG_SYNC` flag, which doesn't survive
page reallocation, so design moved to per-open-context flag.
- Record: Author clearly understands root cause; reproduction via
standard xfstests test; same problem in localio path is also fixed.
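
The short-write step in Pattern #2 comes down to simple range arithmetic: the remainder starts where the server stopped and covers whatever is left. A minimal sketch follows, with `struct write_range` and `remaining_after_short_write()` as hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one WRITE request's byte range. */
struct write_range {
	uint64_t offset;
	uint32_t len;
};

/*
 * Given a request and the byte count the server reported as written,
 * return the range that still has to be sent. A full write yields a
 * zero-length range.
 */
static struct write_range remaining_after_short_write(struct write_range req,
						      uint32_t written)
{
	struct write_range rest;

	assert(written <= req.len);
	rest.offset = req.offset + written;
	rest.len = req.len - written;
	return rest;
}
```

With the numbers from the commit message (offset=4096, len=4096, count=100), this yields a remainder of length 3996 starting at offset 4196. The length matches the trace in the commit message; the offset printed there is 4197, which differs by one and may simply be a typo in the example.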
**Step 1.4: Hidden Bug Fix Detection**
- Verb is "fix", not disguised. The change clearly fixes a hang.
- Record: This is an explicit bug fix (writeback livelock).
## PHASE 2: DIFF ANALYSIS
**Step 2.1: Inventory**
- `fs/nfs/localio.c`: +14 / -1 (in `nfs_local_call_write`,
`nfs_local_do_write`)
- `fs/nfs/pagelist.c`: +3 (in `__nfs_pageio_add_request`)
- `fs/nfs/write.c`: +9 (in `nfs_write_completion`,
`nfs_writeback_result`, `nfs_commit_release_pages`)
- `include/linux/nfs_fs.h`: +1 (new `NFS_CONTEXT_WRITE_SYNC` flag)
- Total: 27 insertions, 1 deletion across 4 files. Surgical, single-
subsystem.
- Record: Small, contained, NFS-only changes touching only error-
handling paths.
**Step 2.2: Code Flow Change**
- `nfs_writeback_result` (write.c): on short write, set
`NFS_CONTEXT_WRITE_SYNC` on the open context.
- `nfs_commit_release_pages` (write.c): on verifier mismatch (server
lost data), set the flag.
- `nfs_local_call_write` (localio.c): on short write in localio path,
set the flag.
- `nfs_write_completion` (write.c): clear flag when an unstable write
succeeds and needs commit.
- `__nfs_pageio_add_request` (pagelist.c): when flag set, OR
`FLUSH_STABLE` into `pg_ioflags` so future writes go stable.
- `nfs_local_do_write` (localio.c): when flag set, set `hdr->args.stable
= NFS_FILE_SYNC`.
- Record: Adds a sticky "force stable writes" flag set on error paths
and consulted on submission.
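
The set / clear / consult life cycle listed in Step 2.2 can be sketched as a tiny userspace state machine. All names here (`struct open_ctx`, `pick_stable_how()`, `note_short_write()`, `note_verifier_mismatch()`, `note_write_done()`) are hypothetical; only the idea of a sticky per-open-context flag is taken from the patch description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-open-context state. */
struct open_ctx {
	bool write_sync;	/* models NFS_CONTEXT_WRITE_SYNC */
};

enum stable_how { UNSTABLE, FILE_SYNC };

/* Submission: consult the flag to pick the stability level. */
static enum stable_how pick_stable_how(const struct open_ctx *ctx)
{
	assert(ctx != NULL);
	return ctx->write_sync ? FILE_SYNC : UNSTABLE;
}

/* Error paths: a short write or a COMMIT verifier mismatch set the flag. */
static void note_short_write(struct open_ctx *ctx)
{
	ctx->write_sync = true;
}

static void note_verifier_mismatch(struct open_ctx *ctx)
{
	ctx->write_sync = true;
}

/*
 * Write completion: only a successful unstable write (one that still
 * needs a COMMIT) clears the flag; stable writes leave it alone.
 */
static void note_write_done(struct open_ctx *ctx, enum stable_how how)
{
	if (how == UNSTABLE)
		ctx->write_sync = false;
}
```

In this model, once the flag is set every subsequent submission picks `FILE_SYNC`, and a `FILE_SYNC` completion never clears it, so the flag stays set unless something else resets it. That is the sticky behaviour the analysis flags as a possible performance cost.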
**Step 2.3: Bug Mechanism Class**
- Logic / liveness fix (livelock / infinite retry loop in NFS error
recovery) that corrects the write-recovery state machine. Not a
classic UAF, leak or race.
- Trigger: server ENOSPC during unstable write OR verifier mismatch on
commit (e.g., server crash/reboot or storage commit failure).
- Record: Liveness/livelock fix in NFS write recovery state machine.
**Step 2.4: Fix Quality**
- Reasoning is sound: forcing FILE_SYNC after error eliminates the
unstable-write→commit→verifier-mismatch retry cycle.
- One subtle concern: the flag is per-open-context and is only cleared
when `nfs_write_need_commit(hdr)` is true after a write completion.
Once the flag is set, all writes go stable; stable writes don't need
commit, so they don't clear the flag. The flag effectively persists
until a successful unstable write (which only happens if flag is
cleared first). This means after a single short-write event, the open
context becomes permanently sync. That's intentional fail-safe
behavior, but is a behavior change.
- No locking/memory-management concerns; no new allocations; no API
change visible to userspace.
- Record: Logic appears correct; small regression risk = sustained sync
writes after a transient short-write (perf cost only, not
correctness).
## PHASE 3: GIT HISTORY INVESTIGATION
**Step 3.1: Blame**
- The general `nfs_writeback_result` short-write handling has been
present since `6c75dc0d498c` ("NFS: merge _full and _partial write
rpc_ops", 2012, v3.5). The buggy retry loop is therefore in
essentially every stable kernel.
- Record: Bug present since 2012/v3.5. Affects all current LTS trees
(5.10 through 6.12+).
**Step 3.2: Fixes Tag**
- This commit has no `Fixes:` tag.
- The closely related predecessor `3a06bac55bf56` ("NFS: improve 'Server
wrote zero bytes' error") has `Fixes: 6c75dc0d498c` (2012). That
predecessor is the change the candidate's diff context (the `&&
!list_empty(&hdr->pages)` line) depends on.
- Record: No Fixes tag. Bug origin best matches `6c75dc0d498c` (2012)
per related commit.
**Step 3.3: File History / Series**
- `git log fs-next -- fs/nfs/write.c` shows the immediate predecessors:
`3a06bac55bf56` (Feb 2026, "NFS: improve 'Server wrote zero bytes'
error") then `5d3869a41f360` (this commit, Apr 2026).
- The diff shown in the candidate uses the post-`3a06bac` context (`if
(resp->count < argp->count && !list_empty(&hdr->pages))`). For a clean
apply on stable, `3a06bac55bf56` should also land (it has Fixes: 2012,
so likely already auto-selected).
- Standalone? Mostly. Localio bits depend on localio existing (v6.12+).
- Record: One reasonable prerequisite (`3a06bac55bf56`) for a clean
context match; not a hard logical dependency though - the actual hunk
additions only need pre-existing structure.
**Step 3.4: Author / Maintainer**
- Olga Kornievskaia: long-standing NFS client developer at Red Hat, many
commits in `fs/nfs/`.
- Trond Myklebust: NFS subsystem maintainer; he applied and signed off.
- Record: High-trust authorship.
**Step 3.5: Dependencies**
- The neighbouring flag `NFS_CONTEXT_FILE_OPEN`, next to which the new
flag is defined, was added in 2021 (commit `e97bc66377bca`, Trond
Myklebust); confirmed present in stable 5.15 and later.
- `fs/nfs/localio.c` only exists in v6.12+ (added by `70ba381e1a431` for
v6.12).
- Record: Pre-6.12 stable trees can take only the
`write.c`+`pagelist.c`+`nfs_fs.h` portions; `localio.c` portion does
not apply there.
## PHASE 4: MAILING LIST RESEARCH
**Step 4.1: b4 dig**
- `b4 dig -c 5d3869a41f360`: matched submission at
https://patch.msgid.link/20260413222423.90089-1-okorniev@redhat.com
(v3).
- `b4 dig -c 5d3869a41f360 -a`: three series found:
- v1 (RFC) 2026-03-12:
https://patch.msgid.link/20260312171526.85759-1-okorniev@redhat.com
- v2 2026-03-25:
https://patch.msgid.link/20260325180050.55186-1-okorniev@redhat.com
- v3 2026-04-13: applied version
- `b4 am` for v3 thread: `Analyzing 1 messages in the thread` /
`Analyzing 0 code-review messages` — no public review feedback, no
reviewer-suggested `Cc: stable`.
- Major design change between v1 and v2: v1 used a per-page `PG_SYNC`
bit; v2/v3 moved to per-open-context `NFS_CONTEXT_WRITE_SYNC` (because
per-page flag is not persistent through page recycling) and added the
localio path.
- v2 → v3 was a minor cleanup (compute `iov_iter_count` once into
`icount` variable).
- Record: Three revisions; significant design rework v1→v2; v3
essentially same as v2 with small refactor; no reviewer feedback
visible on lore.
**Step 4.2: b4 dig -w (Recipients)**
- To: trondmy@kernel.org, anna@kernel.org.
Cc: linux-nfs@vger.kernel.org.
- Both NFS maintainers (Trond Myklebust, Anna Schumaker) and the NFS
list were addressed.
- Record: Correct maintainer audience; Trond's SOB shows he
reviewed/applied.
**Step 4.3 / 4.4 / 4.5: Bug report / Series / Stable history**
- No external bug report / Reported-by / Link tags.
- Standalone fix; no patch series.
- Could not reach lore.kernel.org search interactively (Anubis bot
protection). Relying on b4 dig output.
- Record: No external bug report references; no public stable discussion
located.
## PHASE 5: CODE SEMANTIC ANALYSIS
**Step 5.1: Modified Functions**
- `nfs_local_call_write`, `nfs_local_do_write` (localio path)
- `__nfs_pageio_add_request` (write submission/coalescing path)
- `nfs_write_completion`, `nfs_writeback_result`,
`nfs_commit_release_pages` (write/commit completion)
**Step 5.2: Callers / Reachability**
- `nfs_writeback_result` is invoked as `rw_result` in `nfs_rw_write_ops`
— runs on every NFS WRITE RPC completion. Universal NFS write path.
- `nfs_commit_release_pages` runs on every NFS COMMIT completion.
- `nfs_write_completion` runs on completion of a `nfs_pgio_header` group
of writes.
- `__nfs_pageio_add_request` runs on every page added to a pageio
descriptor — the core write submission coalescer.
- `nfs_local_call_write` runs in the localio (loopback NFS-on-same-host)
write path.
- Record: All hot paths in NFS write submission and completion.
Reachable on every NFS write from userspace.
**Step 5.3 / 5.4: Reachability**
- Trigger: NFS server returns ENOSPC, short write, or different verifier
on commit. All are realistic in production (filling disk, quota,
server reboot).
- Record: Reachable from any unprivileged write(2) over NFS once disk
fills up.
**Step 5.5: Similar Patterns**
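- The `NFS_CONTEXT_WRITE_SYNC` flag is consulted in two places
(`__nfs_pageio_add_request` in pagelist.c, `nfs_local_do_write` in
localio.c); set in three places (`nfs_writeback_result`,
`nfs_commit_release_pages`, `nfs_local_call_write`); cleared in one
place (`nfs_write_completion`). Consistent design.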
- Record: No other writeback retry logic relies on the same pattern;
this is a fresh mechanism.
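The flag lifecycle described in Step 5.5 can be sketched as a small userspace model. This is an illustration under stated assumptions, not the kernel code: only the `NFS_CONTEXT_WRITE_SYNC` name and bit number come from the patch; every `model_*` name, the enum, and the C11-atomics stand-ins for `set_bit`/`clear_bit`/`test_bit` are hypothetical. In particular, "clear only when the completed write was unstable" is an approximation of the `nfs_write_need_commit()` check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NFS_CONTEXT_WRITE_SYNC 5	/* bit number taken from the patch */

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum model_stable { MODEL_UNSTABLE, MODEL_FILE_SYNC };
struct model_ctx { atomic_ulong flags; };

static void model_ctx_init(struct model_ctx *c)
{
	atomic_init(&c->flags, 0UL);
}

static void model_set_bit(int nr, struct model_ctx *c)
{
	atomic_fetch_or(&c->flags, 1UL << nr);
}

static void model_clear_bit(int nr, struct model_ctx *c)
{
	atomic_fetch_and(&c->flags, ~(1UL << nr));
}

static bool model_test_bit(int nr, struct model_ctx *c)
{
	return (atomic_load(&c->flags) >> nr) & 1UL;
}

/* Consulted at submission: a set flag forces a stable write. */
static enum model_stable model_submit_mode(struct model_ctx *c)
{
	return model_test_bit(NFS_CONTEXT_WRITE_SYNC, c) ?
		MODEL_FILE_SYNC : MODEL_UNSTABLE;
}

/* Write completion: set on a short write, clear on a full unstable write. */
static void model_write_done(struct model_ctx *c, enum model_stable how,
			     size_t asked, size_t written)
{
	if (written < asked)
		model_set_bit(NFS_CONTEXT_WRITE_SYNC, c);
	else if (how == MODEL_UNSTABLE)
		model_clear_bit(NFS_CONTEXT_WRITE_SYNC, c);
}

/* Commit completion: set on a verifier mismatch. */
static void model_commit_done(struct model_ctx *c, bool verifier_matches)
{
	if (!verifier_matches)
		model_set_bit(NFS_CONTEXT_WRITE_SYNC, c);
}

/* Walk-through: a mismatch sets the flag; a later stable write keeps it. */
static enum model_stable model_sticky_demo(void)
{
	struct model_ctx c;

	model_ctx_init(&c);
	assert(model_submit_mode(&c) == MODEL_UNSTABLE);
	model_commit_done(&c, false);			/* verifier mismatch */
	assert(model_submit_mode(&c) == MODEL_FILE_SYNC);
	model_write_done(&c, MODEL_FILE_SYNC, 4096, 4096); /* full, stable */
	return model_submit_mode(&c);
}
```

Calling `model_sticky_demo()` from a test harness walks through a verifier mismatch followed by a fully successful stable write. In this model the flag is still set afterwards, because every later submission is stable and so never reaches the clearing branch; that is the sticky-sync behavior Step 8.4 lists as a risk.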
## PHASE 6: STABLE TREE ANALYSIS
**Step 6.1: Code in stable trees?**
- `nfs_writeback_result` and the unstable-write retry logic exist in
essentially all currently-supported LTS (5.10, 5.15, 6.1, 6.6, 6.12+).
Verified for v6.1 and v6.6 directly.
- The bug pattern (server ENOSPC + verifier mismatch loop) thus exists
in all of them.
- `NFS_CONTEXT_FILE_OPEN` exists in v5.15 and later, so adding
`NFS_CONTEXT_WRITE_SYNC` next to it is safe.
- Record: Bug present in 5.10 onward; flag header location compatible
with 5.15+; for 5.10 a manual placement check would be needed.
**Step 6.2: Backport Difficulty**
- 6.12+ trees: Should mostly apply; the `nfs_writeback_result` hunk has
a context dependency on `3a06bac55bf56` (also a stable candidate, has
its own Fixes tag). Either land both, or fuzz/adapt one line of
context.
- Pre-6.12 trees: No `fs/nfs/localio.c` — the localio hunks must be
dropped. Core fix in `write.c`/`pagelist.c`/`nfs_fs.h` still applies.
- Record: Minor adjustment needed; not a clean-apply for all trees but
conceptually portable.
**Step 6.3: Already in stable?**
- Not in any stable backport branch (`for-greg/*`); `git branch --all
--contains 5d3869a41f360` returns only origin/master (mainline) and
the upstream repo.
- Record: Not yet backported.
## PHASE 7: SUBSYSTEM CONTEXT
**Step 7.1: Subsystem Criticality**
- `fs/nfs/` (NFS client) — IMPORTANT: very widely used (every distro,
enterprise, container/cloud workloads with NFS storage).
- Record: IMPORTANT / widespread use.
**Step 7.2: Activity**
- NFS client is actively maintained; Olga and Trond have been pushing
many fixes recently.
- Record: Active subsystem.
## PHASE 8: IMPACT AND RISK
**Step 8.1: Affected Users**
- Any NFS client user whose server can return ENOSPC or short writes,
or replay a different verifier (server reboot/crash mid-commit). That
is essentially all production NFS deployments.
- Record: NFS users — broad.
**Step 8.2: Trigger Conditions**
- Server-side disk full, quota hit, or NFS server reboot/commit-retry.
Common in real environments. Reproducer: xfstests generic/751.
- Triggered by unprivileged user writes.
- Record: Realistic and triggerable by any NFS user.
**Step 8.3: Severity**
- Failure mode: writeback `kworker` enters infinite retry loop. This
means: dirty pages never clear, fsync(2) never returns, eventually the
system exhibits hung task warnings, dirty memory accumulates,
balance_dirty_pages-style throttling stalls all writers, and the write
failure is never reported back to the application in userspace.
- Severity: HIGH (effective hang of writeback, eventual OOM-class
behavior, application data loss/incorrect error semantics; no kernel
oops but a livelock that operators notice as a hung NFS).
- Record: HIGH severity (livelock in writeback / data loss-class
symptoms).
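The retry loop described in Step 8.3 can be illustrated with a toy model. Everything here is an assumption made for illustration: the server behavior (unstable WRITE acknowledged, COMMIT always reporting a changed verifier, stable WRITE failing with ENOSPC) is invented and is not taken from the commit message, and all `model_*` names and the round cap are hypothetical. The sketch only shows the shape of the problem: without a sticky flag the retry path returns to the same state, while forcing a stable write lets the error surface.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_MAX_ROUNDS 1000	/* arbitrary cap so the toy always terminates */

struct model_outcome {
	int rounds;	/* write attempts made */
	int err;	/* 0, or a negative errno surfaced to the caller */
	bool capped;	/* true if the loop only ended because of the cap */
};

/*
 * Toy server on a full disk (invented behavior, for illustration only):
 * an unstable WRITE is acknowledged, the following COMMIT reports a
 * changed verifier, and a stable WRITE fails at once with -ENOSPC.
 */
static int model_server_write(bool stable)
{
	return stable ? -ENOSPC : 0;
}

static bool model_server_commit_verifier_ok(void)
{
	return false;
}

/* One page of writeback, with or without a sticky "write sync" flag. */
static struct model_outcome model_writeback(bool use_sync_flag)
{
	struct model_outcome out = { 0, 0, false };
	bool write_sync = false;

	while (out.rounds < MODEL_MAX_ROUNDS) {
		int status;

		out.rounds++;
		status = model_server_write(write_sync);
		if (status < 0) {
			out.err = status;	/* error finally surfaces */
			return out;
		}
		if (model_server_commit_verifier_ok())
			return out;		/* data is stable, done */
		/* Mismatch: the page is redirtied and written again. */
		if (use_sync_flag)
			write_sync = true;
	}
	out.capped = true;	/* stand-in for "would retry forever" */
	return out;
}

/* Self-check of the two cases discussed above. */
static void model_selfcheck(void)
{
	struct model_outcome a = model_writeback(false);
	struct model_outcome b = model_writeback(true);

	assert(a.capped && a.err == 0);
	assert(!b.capped && b.err == -ENOSPC);
}
```

In this model the unflagged variant never sees an error and only stops at the artificial cap, while the flagged variant stops on the second attempt with `-ENOSPC`. Real server and client behavior is more involved; the commit message and xfstests generic/751 remain the authoritative description.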
**Step 8.4: Risk vs Benefit**
- BENEFIT: Eliminates a real, reproducible writeback livelock affecting
all NFS users hitting ENOSPC or commit-verifier mismatch — high
benefit.
- RISK:
- Scope is medium (4 files / ~27 lines).
- Touches hot paths (every NFS write/commit completion).
- Behavior change: once a short-write/verifier-mismatch event happens
on a file, the open context becomes "sticky-sync" until a successful
unstable write+commit happens (which by construction can't happen
until the flag is cleared, so realistically until the file is
reopened). This is a permanent perf regression for the lifetime of
that open fd after a single transient error.
- The patch went through 3 iterations with a notable design change
v1→v2 — author had to redesign once. v3 vs v2 is trivial.
- No `Reviewed-by`/`Tested-by` from external parties on lore (only
maintainer SOB).
- Net: Benefit clearly outweighs risk: livelock is severe; perf
regression after error is acceptable; mechanism is contained to error
paths.
- Record: HIGH benefit, LOW–MEDIUM risk, ratio favors backport.
## PHASE 9: SYNTHESIS
**Step 9.1: Evidence**
FOR:
- Real, severe, reproducible bug (xfstests generic/751).
- Concrete two-pattern description with offsets/lengths/return codes.
- Maintainer (Trond Myklebust) signed off and applied directly.
- Bug exists in essentially all current stable trees (logic since 2012).
- Authoritative author/maintainer pair.
- Single subsystem; small line count.
AGAINST:
- No `Fixes:` tag.
- No `Cc: stable` tag.
- No public reviewer trailers / Tested-by.
- v1→v2 design change indicates the design space was non-trivial.
- The flag is sticky-sync once set (intentional but a behavior change).
- Localio portion only relevant for v6.12+; pre-6.12 needs manual trim.
- One contextual prerequisite (`3a06bac55bf56`) for a mechanical
clean-apply of the `write.c` hunk near `nfs_writeback_result`; that
prerequisite has its own `Fixes:` tag and is itself a likely stable
candidate.
**Step 9.2: Stable Rules Checklist**
1. Obviously correct? Mostly — design is straightforward; no public
review evidence to confirm "obvious", but maintainer applied it.
Acceptable.
2. Real bug? YES — xfstests generic/751 reproducer.
3. Important? YES — writeback livelock, applies to NFS users hitting
ENOSPC / server commit failures (= many production NFS users).
4. Small/contained? YES — 27 insertions / 1 deletion across 4 files in
one subsystem.
5. No new features/APIs? No userspace API; adds an internal-only flag
bit (NFS_CONTEXT_WRITE_SYNC). Acceptable.
6. Applies to stable? Mostly — needs context fuzz / drop of localio
hunks for pre-6.12.
**Step 9.3: Exception Categories** Not applicable (not device ID, quirk,
DT, build, or doc fix).
**Step 9.4: Decision**
This is a real fix for a serious, reproducible NFS writeback livelock
that has existed for years in all current stable kernels. The change is
bounded to NFS error-handling paths, comes from the NFS subsystem
maintainer, and has a clear root-cause explanation tied to a standard
xfstests reproducer. The principal downsides — sticky sync after an
error, multi-revision design history, no Fixes/stable tag — are real but
do not outweigh the benefit of breaking an infinite writeback retry loop
for end users. Pre-6.12 backports require dropping the localio hunks;
otherwise the core fix is portable.
## Verification
- [Phase 1] Parsed subject/body from the commit message provided and
from `git show 5d3869a41f360` — confirmed reproducer is xfstests
generic/751; two failure patterns with offsets/return codes; author =
Olga Kornievskaia, applied by Trond Myklebust. No
Fixes/Cc-stable/Reported-by/Tested-by/Reviewed-by/Link tags.
- [Phase 2] Diff inventory verified via `git show 5d3869a41f360 --stat`:
`fs/nfs/localio.c | 15 ++++++++++++++- ; fs/nfs/pagelist.c | 3 +++ ;
fs/nfs/write.c | 9 +++++++++ ; include/linux/nfs_fs.h | 1 + ; 4 files
changed, 27 insertions(+), 1 deletion(-)`.
- [Phase 2] Read current `fs/nfs/write.c` (lines 909–946 and 1545–1596)
and `fs/nfs/localio.c` (lines 848–920) to confirm pre-patch behavior
and where each hunk lands.
- [Phase 3] `git log master --oneline 5d3869a41f360 -2` confirms
predecessor `3a06bac55bf56` ("NFS: improve 'Server wrote zero bytes'
error"). `git show 3a06bac55bf56` shows it adds `&&
!list_empty(&hdr->pages)` and has `Fixes: 6c75dc0d498c` (2012, v3.5).
- [Phase 3] `git tag --contains e97bc66377bca` shows
`NFS_CONTEXT_FILE_OPEN` is in v5.15+ (and corresponding p-* internal
tags).
- [Phase 3] `git tag --contains 70ba381e1a431` shows `fs/nfs/localio.c`
was added in v6.12 only.
- [Phase 3] `git branch --all --contains 5d3869a41f360` shows the commit
is only on origin/master and stable/master (mainline) — not in any
backport branch.
- [Phase 4] `b4 dig -c 5d3869a41f360` returned the v3 lore URL:
https://patch.msgid.link/20260413222423.90089-1-okorniev@redhat.com.
- [Phase 4] `b4 dig -c 5d3869a41f360 -a` returned three series (v1 RFC
2026-03-12, v2 2026-03-25, v3 2026-04-13).
- [Phase 4] `b4 dig -c 5d3869a41f360 -w` confirmed To:
trondmy@kernel.org, anna@kernel.org; Cc: linux-nfs@vger.kernel.org.
- [Phase 4] Read v1 mbox (`/tmp/analysis/v1/...`): uses `PG_SYNC`
per-page bit, no localio handling.
- [Phase 4] Read v2 mbox (`/tmp/analysis/v2/...`) — uses
`NFS_CONTEXT_WRITE_SYNC`, includes localio handling. v2→v3 only
difference is computing `iov_iter_count` into `icount` once.
- [Phase 4] `b4 am` of v3: 0 code-review messages on lore. Could not
browse lore directly (Anubis bot wall) — relied on b4's mbox fetch.
- [Phase 5] Hunks inspected directly in `fs/nfs/write.c`,
`fs/nfs/pagelist.c`, `fs/nfs/localio.c`; confirmed
`nfs_writeback_result` is the standard `rw_result` callback for NFS
write RPC completion (universal write path).
- [Phase 6] `git show v6.6:fs/nfs/write.c` and `git show
v6.1:fs/nfs/write.c` confirm `nfs_writeback_result` and
`nfs_commit_release_pages` use the same pattern as mainline (so the
bug exists there).
- [Phase 6] `git show v6.6:fs/nfs/pagelist.c` confirms
`__nfs_pageio_add_request` is structurally the same in v6.6.
- [Phase 6] `git log for-greg/7.0-200 --oneline -- fs/nfs/write.c` does
not include `5d3869a41f360` or `3a06bac55bf56` — neither has been
backported yet.
- [Phase 8] Severity assessment: livelock in writeback is verified by
the reproducer described in the commit (xfstests generic/751) and by
reading the loop logic in `nfs_writeback_result` /
`nfs_commit_release_pages` showing how the retry path returns to the
same state.
- UNVERIFIED: I could not directly confirm a public Tested-by
(lore.kernel.org Anubis blocked manual browsing). b4 dig found no
review trailers, but I cannot rule out off-list/Hammerspace internal
testing. This does not drive my decision either way.
- UNVERIFIED: I did not separately verify that `3a06bac55bf56` will be
selected for stable; that affects only the cleanness of context match,
not correctness of the candidate's logic.
The fix addresses a serious writeback livelock with a concrete
reproducer, was applied by the NFS maintainer, is bounded to the NFS
subsystem, and targets a bug that exists across current stable trees.
Risks are modest (a perf cost after errors, minor backport adjustments
for pre-6.12). On balance, this is appropriate stable material.
**YES**
fs/nfs/localio.c | 15 ++++++++++++++-
fs/nfs/pagelist.c | 3 +++
fs/nfs/write.c | 9 +++++++++
include/linux/nfs_fs.h | 1 +
4 files changed, 27 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c
index 4c7d16a99ed61..e55c5977fcc3a 100644
--- a/fs/nfs/localio.c
+++ b/fs/nfs/localio.c
@@ -865,6 +865,8 @@ static void nfs_local_call_write(struct work_struct *work)
file_start_write(filp);
n_iters = atomic_read(&iocb->n_iters);
for (int i = 0; i < n_iters ; i++) {
+ size_t icount;
+
if (iocb->iter_is_dio_aligned[i]) {
iocb->kiocb.ki_flags |= IOCB_DIRECT;
/* Only use AIO completion if DIO-aligned segment is last */
@@ -881,8 +883,16 @@ static void nfs_local_call_write(struct work_struct *work)
if (status == -EIOCBQUEUED)
continue;
/* Break on completion, errors, or short writes */
+ icount = iov_iter_count(&iocb->iters[i]);
if (nfs_local_pgio_done(iocb, status) || status < 0 ||
- (size_t)status < iov_iter_count(&iocb->iters[i])) {
+ (size_t)status < icount) {
+ if ((size_t)status < icount) {
+ struct nfs_lock_context *ctx =
+ iocb->hdr->req->wb_lock_context;
+
+ set_bit(NFS_CONTEXT_WRITE_SYNC,
+ &ctx->open_context->flags);
+ }
nfs_local_write_iocb_done(iocb);
break;
}
@@ -901,6 +911,9 @@ static void nfs_local_do_write(struct nfs_local_kiocb *iocb,
__func__, hdr->args.count, hdr->args.offset,
(hdr->args.stable == NFS_UNSTABLE) ? "unstable" : "stable");
+ if (test_bit(NFS_CONTEXT_WRITE_SYNC,
+ &hdr->req->wb_lock_context->open_context->flags))
+ hdr->args.stable = NFS_FILE_SYNC;
switch (hdr->args.stable) {
default:
break;
diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index a9373de891c98..4a87b2fdb2e6e 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -1186,6 +1186,9 @@ static int __nfs_pageio_add_request(struct nfs_pageio_descriptor *desc,
nfs_page_group_lock(req);
+ if (test_bit(NFS_CONTEXT_WRITE_SYNC,
+ &req->wb_lock_context->open_context->flags))
+ desc->pg_ioflags |= FLUSH_STABLE;
subreq = req;
subreq_size = subreq->wb_bytes;
for(;;) {
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 1ed4b3590b1ac..ddae197d2d3f9 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -927,9 +927,13 @@ static void nfs_write_completion(struct nfs_pgio_header *hdr)
goto remove_req;
}
if (nfs_write_need_commit(hdr)) {
+ struct nfs_open_context *ctx =
+ hdr->req->wb_lock_context->open_context;
+
/* Reset wb_nio, since the write was successful. */
req->wb_nio = 0;
memcpy(&req->wb_verf, &hdr->verf.verifier, sizeof(req->wb_verf));
+ clear_bit(NFS_CONTEXT_WRITE_SYNC, &ctx->flags);
nfs_mark_request_commit(req, hdr->lseg, &cinfo,
hdr->ds_commit_idx);
goto next;
@@ -1553,7 +1557,10 @@ static void nfs_writeback_result(struct rpc_task *task,
if (resp->count < argp->count) {
static unsigned long complain;
+ struct nfs_open_context *ctx =
+ hdr->req->wb_lock_context->open_context;
+ set_bit(NFS_CONTEXT_WRITE_SYNC, &ctx->flags);
/* This a short write! */
nfs_inc_stats(hdr->inode, NFSIOS_SHORTWRITE);
@@ -1837,6 +1844,8 @@ static void nfs_commit_release_pages(struct nfs_commit_data *data)
/* We have a mismatch. Write the page again */
dprintk(" mismatch\n");
nfs_mark_request_dirty(req);
+ set_bit(NFS_CONTEXT_WRITE_SYNC,
+ &req->wb_lock_context->open_context->flags);
atomic_long_inc(&NFS_I(data->inode)->redirtied_pages);
next:
nfs_unlock_and_release_request(req);
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 8dd79a3f3d662..4623262da3c09 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -109,6 +109,7 @@ struct nfs_open_context {
#define NFS_CONTEXT_BAD (2)
#define NFS_CONTEXT_UNLOCK (3)
#define NFS_CONTEXT_FILE_OPEN (4)
+#define NFS_CONTEXT_WRITE_SYNC (5)
struct nfs4_threshold *mdsthreshold;
struct list_head list;
--
2.53.0