public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Olga Kornievskaia <okorniev@redhat.com>,
	Trond Myklebust <trond.myklebust@hammerspace.com>,
	Sasha Levin <sashal@kernel.org>,
	anna@kernel.org, linux-nfs@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH AUTOSEL 7.0] NFS: fix writeback in presence of errors
Date: Tue, 28 Apr 2026 06:40:42 -0400	[thread overview]
Message-ID: <20260428104133.2858589-31-sashal@kernel.org> (raw)
In-Reply-To: <20260428104133.2858589-1-sashal@kernel.org>

From: Olga Kornievskaia <okorniev@redhat.com>

[ Upstream commit 5d3869a41f3608101c00ff9c9c7c2364c555fa65 ]

After running xfstest generic/751, in certain conditions, can have
a writeback IO stuck while experiencing one of the two patterns.

Pattern#1: writeback IO experiences ENOSPC on an offset smaller
than the filesize. Example,
write offset=0 len=4096 how=unstable OK
write offset=8192 len=4096 how=unstable OK
write offset=12288 len=4096 how=unstable ENOSPC
write offset=4096 len=4096 how=unstable ENOSPC
client sends a commit and receives a verifier which is different
from the last successful write. It marks pages dirty and writeback
retries. But it again send writes unstable and gets into the same
pattern, running into the ENOSPC error and sending a commit because
writes were sent at unstable.

Pattern#2: an unstable write followed by a short write and ENOSPC.
write offset=0 len=4096 how=unstable OK
write offset=4096 len=4096 how=unstable returns OK but count=100
write offset=4197 len=3996 how=stable returns ENOSPC
client send a commit and receives a verifier different from
the last unstable write. The same behaviour is retried in a loop.

Instead, this patch proposes to identify those conditions and mark
requests to be done synchronously instead. Previous solution tried
to mark it in the nfs_page, however that's not persistent thus
instead mark it in the nfs_open_context.

Furthermore, the same problem occurs during localio code path so
recognize that IO needs to be done sync in that case as well.

Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I have enough information to provide the full analysis.

---

# Analysis: NFS: fix writeback in presence of errors

## PHASE 1: COMMIT MESSAGE FORENSICS

**Step 1.1: Subject Line**
- Subsystem prefix: `NFS:`
- Action verb: `fix`
- Record: NFS / fix / writeback gets stuck in a loop in presence of
  errors (ENOSPC, short writes, verifier mismatches).

**Step 1.2: Tags**
- Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> (Red Hat NFS
  developer, frequent NFS contributor)
- Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> (NFS
  subsystem maintainer)
- NO `Fixes:` tag
- NO `Cc: stable@vger.kernel.org`
- NO `Reported-by:` (the bug was found internally via xfstests
  generic/751)
- NO `Tested-by:` / `Reviewed-by:` / `Acked-by:`
- NO `Link:` to lore/syzkaller/bugzilla
- Record: Two SOBs (author + maintainer). Absence of stable/Fixes tags
  is exactly why this is in the review pipeline. No reviewer trailers
  but maintainer has SOB indicating he applied it.

**Step 1.3: Commit Body**
- Bug description: Writeback IO gets stuck in an endless retry loop in
  two patterns:
  - Pattern #1: Unstable write → ENOSPC → COMMIT → verifier mismatch →
    mark pages dirty → retry as unstable → loop.
  - Pattern #2: Unstable write returning short (count=100) → stable
    write at offset 4197 → ENOSPC → COMMIT → verifier mismatch → loop.
- Reproduction: `xfstest generic/751`.
- Mechanism explained: After short write/ENOSPC/verifier mismatch, mark
  `nfs_open_context` so subsequent writes are forced to `NFS_FILE_SYNC`
  (stable). v1 used a per-page `PG_SYNC` flag, which doesn't survive
  page reallocation, so design moved to per-open-context flag.
- Record: Author clearly understands root cause; reproduction via
  standard xfstests test; same problem in localio path is also fixed.

**Step 1.4: Hidden Bug Fix Detection**
- Verb is "fix", not disguised. Ports of this clearly fix a hang.
- Record: This is an explicit bug fix (writeback livelock).

## PHASE 2: DIFF ANALYSIS

**Step 2.1: Inventory**
- `fs/nfs/localio.c`: +14 / -1 (in `nfs_local_call_write`,
  `nfs_local_do_write`)
- `fs/nfs/pagelist.c`: +3 (in `__nfs_pageio_add_request`)
- `fs/nfs/write.c`: +9 (in `nfs_write_completion`,
  `nfs_writeback_result`, `nfs_commit_release_pages`)
- `include/linux/nfs_fs.h`: +1 (new `NFS_CONTEXT_WRITE_SYNC` flag)
- Total: 27 insertions, 1 deletion across 4 files. Surgical, single-
  subsystem.
- Record: Small, contained, NFS-only changes touching only error-
  handling paths.

**Step 2.2: Code Flow Change**
- `nfs_writeback_result` (write.c): on short write, set
  `NFS_CONTEXT_WRITE_SYNC` on the open context.
- `nfs_commit_release_pages` (write.c): on verifier mismatch (server
  lost data), set the flag.
- `nfs_local_call_write` (localio.c): on short write in localio path,
  set the flag.
- `nfs_write_completion` (write.c): clear flag when an unstable write
  succeeds and needs commit.
- `__nfs_pageio_add_request` (pagelist.c): when flag set, OR
  `FLUSH_STABLE` into `pg_ioflags` so future writes go stable.
- `nfs_local_do_write` (localio.c): when flag set, set `hdr->args.stable
  = NFS_FILE_SYNC`.
- Record: Adds a sticky "force stable writes" flag set on error paths
  and consulted on submission.

**Step 2.3: Bug Mechanism Class**
- Logic / liveness fix (livelock / infinite retry loop in NFS error
  recovery), with synchronization/state-machine corrective. Not a
  UAF/leak/race classic.
- Trigger: server ENOSPC during unstable write OR verifier mismatch on
  commit (e.g., server crash/reboot or storage commit failure).
- Record: Liveness/livelock fix in NFS write recovery state machine.

**Step 2.4: Fix Quality**
- Reasoning is sound: forcing FILE_SYNC after error eliminates the
  unstable-write→commit→verifier-mismatch retry cycle.
- One subtle concern: the flag is per-open-context and is only cleared
  when `nfs_write_need_commit(hdr)` is true after a write completion.
  Once the flag is set, all writes go stable; stable writes don't need
  commit, so they don't clear the flag. The flag effectively persists
  until a successful unstable write (which only happens if flag is
  cleared first). This means after a single short-write event, the open
  context becomes permanently sync. That's intentional fail-safe
  behavior, but is a behavior change.
- No locking/memory-management concerns; no new allocations; no API
  change visible to userspace.
- Record: Logic appears correct; small regression risk = sustained sync
  writes after a transient short-write (perf cost only, not
  correctness).

## PHASE 3: GIT HISTORY INVESTIGATION

**Step 3.1: Blame**
- The general `nfs_writeback_result` short-write handling has been
  present since `6c75dc0d498c` ("NFS: merge _full and _partial write
  rpc_ops", 2012, v3.5). The buggy retry loop is therefore in
  essentially every stable kernel.
- Record: Bug present since 2012/v3.5. Affects all current LTS trees
  (5.10 through 6.12+).

**Step 3.2: Fixes Tag**
- This commit has no `Fixes:` tag.
- The closely related predecessor `3a06bac55bf56` ("NFS: improve 'Server
  wrote zero bytes' error") has `Fixes: 6c75dc0d498c` (2012). That
  predecessor is the change the candidate's diff context (the `&&
  !list_empty(&hdr->pages)` line) depends on.
- Record: No Fixes tag. Bug origin best matches `6c75dc0d498c` (2012)
  per related commit.

**Step 3.3: File History / Series**
- `git log fs-next -- fs/nfs/write.c` shows the immediate predecessors:
  `3a06bac55bf56` (Feb 2026, "NFS: improve 'Server wrote zero bytes'
  error") then `5d3869a41f360` (this commit, Apr 2026).
- The diff shown in the candidate uses the post-`3a06bac` context (`if
  (resp->count < argp->count && !list_empty(&hdr->pages))`). For a clean
  apply on stable, `3a06bac55bf56` should also land (it has Fixes: 2012,
  so likely already auto-selected).
- Standalone? Mostly. Localio bits depend on localio existing (v6.12+).
- Record: One reasonable prerequisite (`3a06bac55bf56`) for a clean
  context match; not a hard logical dependency though - the actual hunk
  additions only need pre-existing structure.

**Step 3.4: Author / Maintainer**
- Olga Kornievskaia: long-standing NFS client developer at Red Hat, many
  commits in `fs/nfs/`.
- Trond Myklebust: NFS subsystem maintainer; he applied and signed off.
- Record: High-trust authorship.

**Step 3.5: Dependencies**
- Required field `NFS_CONTEXT_FILE_OPEN` was added in 2021 (commit
  `e97bc66377bca`, Trond Myklebust); confirmed present in stable 5.15
  and later.
- `fs/nfs/localio.c` only exists in v6.12+ (added by `70ba381e1a431` for
  v6.12).
- Record: Pre-6.12 stable trees can take only the
  `write.c`+`pagelist.c`+`nfs_fs.h` portions; `localio.c` portion does
  not apply there.

## PHASE 4: MAILING LIST RESEARCH

**Step 4.1: b4 dig**
- `b4 dig -c 5d3869a41f360`: matched submission at
  https://patch.msgid.link/20260413222423.90089-1-okorniev@redhat.com
  (v3).
- `b4 dig -c 5d3869a41f360 -a`: three series found:
  - v1 (RFC) 2026-03-12:
    https://patch.msgid.link/20260312171526.85759-1-okorniev@redhat.com
  - v2 2026-03-25:
    https://patch.msgid.link/20260325180050.55186-1-okorniev@redhat.com
  - v3 2026-04-13: applied version
- `b4 am` for v3 thread: `Analyzing 1 messages in the thread` /
  `Analyzing 0 code-review messages` — no public review feedback, no
  reviewer-suggested `Cc: stable`.
- Major design change between v1 and v2: v1 used a per-page `PG_SYNC`
  bit; v2/v3 moved to per-open-context `NFS_CONTEXT_WRITE_SYNC` (because
  per-page flag is not persistent through page recycling) and added the
  localio path.
- v2 → v3 was a minor cleanup (compute `iov_iter_count` once into
  `icount` variable).
- Record: Three revisions; significant design rework v1→v2; v3
  essentially same as v2 with small refactor; no reviewer feedback
  visible on lore.

**Step 4.2: b4 dig -w (Recipients)**
- To: trondmy@kernel.org, anna@kernel.org. Cc: linux-
  nfs@vger.kernel.org.
- Both NFS maintainers (Trond Myklebust, Anna Schumaker) and the NFS
  list were addressed.
- Record: Correct maintainer audience; Trond's SOB shows he
  reviewed/applied.

**Step 4.3 / 4.4 / 4.5: Bug report / Series / Stable history**
- No external bug report / Reported-by / Link tags.
- Standalone fix; no patch series.
- Could not reach lore.kernel.org search interactively (Anubis bot
  protection). Relying on b4 dig output.
- Record: No external bug report references; no public stable discussion
  located.

## PHASE 5: CODE SEMANTIC ANALYSIS

**Step 5.1: Modified Functions**
- `nfs_local_call_write`, `nfs_local_do_write` (localio path)
- `__nfs_pageio_add_request` (write submission/coalescing path)
- `nfs_write_completion`, `nfs_writeback_result`,
  `nfs_commit_release_pages` (write/commit completion)

**Step 5.2: Callers / Reachability**
- `nfs_writeback_result` is invoked as `rw_result` in `nfs_rw_write_ops`
  — runs on every NFS WRITE RPC completion. Universal NFS write path.
- `nfs_commit_release_pages` runs on every NFS COMMIT completion.
- `nfs_write_completion` runs on completion of a `nfs_pgio_header` group
  of writes.
- `__nfs_pageio_add_request` runs on every page added to a pageio
  descriptor — the core write submission coalescer.
- `nfs_local_call_write` runs in the localio (loopback NFS-on-same-host)
  write path.
- Record: All hot paths in NFS write submission and completion.
  Reachable on every NFS write from userspace.

**Step 5.3 / 5.4: Reachability**
- Trigger: NFS server returns ENOSPC, short write, or different verifier
  on commit. All are realistic in production (filling disk, quota,
  server reboot).
- Record: Reachable from any unprivileged write(2) over NFS once disk
  fills up.

**Step 5.5: Similar Patterns**
- The flag is consulted in two places (pagelist + localio); set in three
  places (writeback_result, commit_release_pages, local_call_write);
  cleared in one place (write_completion). Consistent design.
- Record: No other writeback retry logic relies on the same pattern;
  this is a fresh mechanism.

## PHASE 6: STABLE TREE ANALYSIS

**Step 6.1: Code in stable trees?**
- `nfs_writeback_result` and the unstable-write retry logic exist in
  essentially all currently-supported LTS (5.10, 5.15, 6.1, 6.6, 6.12+).
  Verified for v6.1 and v6.6 directly.
- The bug pattern (server ENOSPC + verifier mismatch loop) thus exists
  in all of them.
- `NFS_CONTEXT_FILE_OPEN` exists from v5.15+ — adding
  `NFS_CONTEXT_WRITE_SYNC` next to it is safe.
- Record: Bug present in 5.10 onward; flag header location compatible
  with 5.15+; for 5.10 a manual placement check would be needed.

**Step 6.2: Backport Difficulty**
- 6.12+ trees: Should mostly apply; the `nfs_writeback_result` hunk has
  a context dependency on `3a06bac55bf56` (also a stable candidate, has
  its own Fixes tag). Either land both, or fuzz/adapt one line of
  context.
- Pre-6.12 trees: No `fs/nfs/localio.c` — the localio hunks must be
  dropped. Core fix in `write.c`/`pagelist.c`/`nfs_fs.h` still applies.
- Record: Minor adjustment needed; not a clean-apply for all trees but
  conceptually portable.

**Step 6.3: Already in stable?**
- Not in any stable backport branch (`for-greg/*`); `git branch --all
  --contains 5d3869a41f360` returns only origin/master (mainline) and
  the upstream repo.
- Record: Not yet backported.

## PHASE 7: SUBSYSTEM CONTEXT

**Step 7.1: Subsystem Criticality**
- `fs/nfs/` (NFS client) — IMPORTANT: very widely used (every distro,
  enterprise, container/cloud workloads with NFS storage).
- Record: IMPORTANT / widespread use.

**Step 7.2: Activity**
- NFS client is actively maintained; Olga and Trond have been pushing
  many fixes recently.
- Record: Active subsystem.

## PHASE 8: IMPACT AND RISK

**Step 8.1: Affected Users**
- Any NFS client user where the server can return ENOSPC, short writes,
  or replays a different verifier (server reboot/crash mid-commit). That
  is essentially all production NFS deployments.
- Record: NFS users — broad.

**Step 8.2: Trigger Conditions**
- Server-side disk full, quota hit, or NFS server reboot/commit-retry.
  Common in real environments. Reproducer: xfstests generic/751.
- Triggered by unprivileged user writes.
- Record: Realistic and triggerable by any NFS user.

**Step 8.3: Severity**
- Failure mode: writeback `kworker` enters infinite retry loop. This
  means: dirty pages never clear, fsync(2) never returns, eventually the
  system exhibits hang task warnings, dirty memory accumulates,
  balance_dirty_pages-style throttling stalls all writers, application-
  level data is never reported back to userspace as failed.
- Severity: HIGH (effective hang of writeback, eventual OOM-class
  behavior, application data loss/incorrect error semantics; no kernel
  oops but a livelock that operators notice as a hung NFS).
- Record: HIGH severity (livelock in writeback / data loss-class
  symptoms).

**Step 8.4: Risk vs Benefit**
- BENEFIT: Eliminates a real, reproducible writeback livelock affecting
  all NFS users hitting ENOSPC or commit-verifier mismatch — high
  benefit.
- RISK:
  - Scope is medium (4 files / ~27 lines).
  - Touches hot paths (every NFS write/commit completion).
  - Behavior change: once a short-write/verifier-mismatch event happens
    on a file, the open context becomes "sticky-sync" until a successful
    unstable write+commit happens (which by construction can't happen
    until the flag is cleared, so realistically until the file is
    reopened). This is a permanent perf regression for the lifetime of
    that open fd after a single transient error.
  - The patch went through 3 iterations with a notable design change
    v1→v2 — author had to redesign once. v3 vs v2 is trivial.
  - No `Reviewed-by`/`Tested-by` from external parties on lore (only
    maintainer SOB).
- Net: Benefit clearly outweighs risk: livelock is severe; perf
  regression after error is acceptable; mechanism is contained to error
  paths.
- Record: HIGH benefit, LOW–MEDIUM risk, ratio favors backport.

## PHASE 9: SYNTHESIS

**Step 9.1: Evidence**

FOR:
- Real, severe, reproducible bug (xfstests generic/751).
- Concrete two-pattern description with offsets/lengths/return codes.
- Maintainer (Trond Myklebust) signed off and applied directly.
- Bug exists in essentially all current stable trees (logic since 2012).
- Authoritative author/maintainer pair.
- Single subsystem; small line count.

AGAINST:
- No `Fixes:` tag.
- No `Cc: stable` tag.
- No public reviewer trailers / Tested-by.
- v1→v2 design change indicates the design space was non-trivial.
- The flag is sticky-sync once set (intentional but a behavior change).
- Localio portion only relevant for v6.12+; pre-6.12 needs manual trim.
- One contextual prerequisite (`3a06bac55bf56`) for mechanical clean-
  apply on the `write.c` hunk near `nfs_writeback_result` — that
  prerequisite has its own `Fixes:` tag and is itself a likely stable
  candidate.

**Step 9.2: Stable Rules Checklist**
1. Obviously correct? Mostly — design is straightforward; no public
   review evidence to confirm "obvious", but maintainer applied it.
   Acceptable.
2. Real bug? YES — xfstests generic/751 reproducer.
3. Important? YES — writeback livelock, applies to NFS users hitting
   ENOSPC / server commit failures (= many production NFS users).
4. Small/contained? YES — 27 insertions / 1 deletion across 4 files in
   one subsystem.
5. No new features/APIs? No userspace API; adds an internal-only flag
   bit (NFS_CONTEXT_WRITE_SYNC). Acceptable.
6. Applies to stable? Mostly — needs context fuzz / drop of localio
   hunks for pre-6.12.

**Step 9.3: Exception Categories** Not applicable (not device ID, quirk,
DT, build, or doc fix).

**Step 9.4: Decision**

This is a real fix for a serious, reproducible NFS writeback livelock
that has existed for years in all current stable kernels. The change is
bounded to NFS error-handling paths, comes from the NFS subsystem
maintainer, and has a clear root-cause explanation tied to a standard
xfstests reproducer. The principal downsides — sticky sync after an
error, multi-revision design history, no Fixes/stable tag — are real but
do not outweigh the benefit of breaking an infinite writeback retry loop
for end users. Pre-6.12 backports require dropping the localio hunks;
otherwise the core fix is portable.

## Verification

- [Phase 1] Parsed subject/body from the commit message provided and
  from `git show 5d3869a41f360` — confirmed reproducer is xfstests
  generic/751; two failure patterns with offsets/return codes; author =
  Olga Kornievskaia, applied by Trond Myklebust. No Fixes/Cc-
  stable/Reported-by/Tested-by/Reviewed-by/Link tags.
- [Phase 2] Diff inventory verified via `git show 5d3869a41f360 --stat`:
  `fs/nfs/localio.c | 15 ++++++++++++++- ; fs/nfs/pagelist.c | 3 +++ ;
  fs/nfs/write.c | 9 +++++++++ ; include/linux/nfs_fs.h | 1 + ; 4 files
  changed, 27 insertions(+), 1 deletion(-)`.
- [Phase 2] Read current `fs/nfs/write.c` (lines 909–946 and 1545–1596)
  and `fs/nfs/localio.c` (lines 848–920) to confirm pre-patch behavior
  and where each hunk lands.
- [Phase 3] `git log master --oneline 5d3869a41f360 -2` confirms
  predecessor `3a06bac55bf56` ("NFS: improve 'Server wrote zero bytes'
  error"). `git show 3a06bac55bf56` shows it adds `&&
  !list_empty(&hdr->pages)` and has `Fixes: 6c75dc0d498c` (2012, v3.5).
- [Phase 3] `git tag --contains e97bc66377bca` shows
  `NFS_CONTEXT_FILE_OPEN` is in v5.15+ (and corresponding p-* internal
  tags).
- [Phase 3] `git tag --contains 70ba381e1a431` shows `fs/nfs/localio.c`
  was added in v6.12 only.
- [Phase 3] `git branch --all --contains 5d3869a41f360` shows the commit
  is only on origin/master and stable/master (mainline) — not in any
  backport branch.
- [Phase 4] `b4 dig -c 5d3869a41f360` returned the v3 lore URL:
  https://patch.msgid.link/20260413222423.90089-1-okorniev@redhat.com.
- [Phase 4] `b4 dig -c 5d3869a41f360 -a` returned three series (v1 RFC
  2026-03-12, v2 2026-03-25, v3 2026-04-13).
- [Phase 4] `b4 dig -c 5d3869a41f360 -w` confirmed To:
  trondmy@kernel.org, anna@kernel.org; Cc: linux-nfs@vger.kernel.org.
- [Phase 4] Read v1 mbox (`/tmp/analysis/v1/...`) — uses `PG_SYNC` per-
  page bit, no localio handling.
- [Phase 4] Read v2 mbox (`/tmp/analysis/v2/...`) — uses
  `NFS_CONTEXT_WRITE_SYNC`, includes localio handling. v2→v3 only
  difference is computing `iov_iter_count` into `icount` once.
- [Phase 4] `b4 am` of v3: 0 code-review messages on lore. Could not
  browse lore directly (Anubis bot wall) — relied on b4's mbox fetch.
- [Phase 5] Hunks inspected directly in `fs/nfs/write.c`,
  `fs/nfs/pagelist.c`, `fs/nfs/localio.c`; confirmed
  `nfs_writeback_result` is the standard `rw_result` callback for NFS
  write RPC completion (universal write path).
- [Phase 6] `git show v6.6:fs/nfs/write.c` and `git show
  v6.1:fs/nfs/write.c` confirm `nfs_writeback_result` and
  `nfs_commit_release_pages` use the same pattern as mainline (so the
  bug exists there).
- [Phase 6] `git show v6.6:fs/nfs/pagelist.c` confirms
  `__nfs_pageio_add_request` is structurally the same in v6.6.
- [Phase 6] `git log for-greg/7.0-200 --oneline -- fs/nfs/write.c` does
  not include `5d3869a41f360` or `3a06bac55bf56` — neither has been
  backported yet.
- [Phase 8] Severity assessment: livelock in writeback is verified by
  the reproducer described in the commit (xfstests generic/751) and by
  reading the loop logic in `nfs_writeback_result` /
  `nfs_commit_release_pages` showing how the retry path returns to the
  same state.
- UNVERIFIED: I could not directly confirm a public Tested-by
  (lore.kernel.org Anubis blocked manual browsing). b4 dig found no
  review trailers, but I cannot rule out off-list/Hammerspace internal
  testing. This does not drive my decision either way.
- UNVERIFIED: I did not separately verify that `3a06bac55bf56` will be
  selected for stable; that affects only the cleanness of context match,
  not correctness of the candidate's logic.

The fix addresses a serious writeback livelock with a concrete
reproducer, comes from the NFS maintainer, is bounded to the NFS
subsystem, and the bug exists across current stable trees. Risks are
modest (modest perf cost after errors, minor backport adjustments for
pre-6.12). On balance, this is appropriate stable material.

**YES**

 fs/nfs/localio.c       | 15 ++++++++++++++-
 fs/nfs/pagelist.c      |  3 +++
 fs/nfs/write.c         |  9 +++++++++
 include/linux/nfs_fs.h |  1 +
 4 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c
index 4c7d16a99ed61..e55c5977fcc3a 100644
--- a/fs/nfs/localio.c
+++ b/fs/nfs/localio.c
@@ -865,6 +865,8 @@ static void nfs_local_call_write(struct work_struct *work)
 	file_start_write(filp);
 	n_iters = atomic_read(&iocb->n_iters);
 	for (int i = 0; i < n_iters ; i++) {
+		size_t icount;
+
 		if (iocb->iter_is_dio_aligned[i]) {
 			iocb->kiocb.ki_flags |= IOCB_DIRECT;
 			/* Only use AIO completion if DIO-aligned segment is last */
@@ -881,8 +883,16 @@ static void nfs_local_call_write(struct work_struct *work)
 		if (status == -EIOCBQUEUED)
 			continue;
 		/* Break on completion, errors, or short writes */
+		icount = iov_iter_count(&iocb->iters[i]);
 		if (nfs_local_pgio_done(iocb, status) || status < 0 ||
-		    (size_t)status < iov_iter_count(&iocb->iters[i])) {
+		    (size_t)status < icount) {
+			if ((size_t)status < icount) {
+				struct nfs_lock_context *ctx =
+					iocb->hdr->req->wb_lock_context;
+
+				set_bit(NFS_CONTEXT_WRITE_SYNC,
+					&ctx->open_context->flags);
+			}
 			nfs_local_write_iocb_done(iocb);
 			break;
 		}
@@ -901,6 +911,9 @@ static void nfs_local_do_write(struct nfs_local_kiocb *iocb,
 		__func__, hdr->args.count, hdr->args.offset,
 		(hdr->args.stable == NFS_UNSTABLE) ?  "unstable" : "stable");
 
+	if (test_bit(NFS_CONTEXT_WRITE_SYNC,
+		     &hdr->req->wb_lock_context->open_context->flags))
+		hdr->args.stable = NFS_FILE_SYNC;
 	switch (hdr->args.stable) {
 	default:
 		break;
diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index a9373de891c98..4a87b2fdb2e6e 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -1186,6 +1186,9 @@ static int __nfs_pageio_add_request(struct nfs_pageio_descriptor *desc,
 
 	nfs_page_group_lock(req);
 
+	if (test_bit(NFS_CONTEXT_WRITE_SYNC,
+		     &req->wb_lock_context->open_context->flags))
+		desc->pg_ioflags |= FLUSH_STABLE;
 	subreq = req;
 	subreq_size = subreq->wb_bytes;
 	for(;;) {
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 1ed4b3590b1ac..ddae197d2d3f9 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -927,9 +927,13 @@ static void nfs_write_completion(struct nfs_pgio_header *hdr)
 			goto remove_req;
 		}
 		if (nfs_write_need_commit(hdr)) {
+			struct nfs_open_context *ctx =
+				hdr->req->wb_lock_context->open_context;
+
 			/* Reset wb_nio, since the write was successful. */
 			req->wb_nio = 0;
 			memcpy(&req->wb_verf, &hdr->verf.verifier, sizeof(req->wb_verf));
+			clear_bit(NFS_CONTEXT_WRITE_SYNC, &ctx->flags);
 			nfs_mark_request_commit(req, hdr->lseg, &cinfo,
 				hdr->ds_commit_idx);
 			goto next;
@@ -1553,7 +1557,10 @@ static void nfs_writeback_result(struct rpc_task *task,
 
 	if (resp->count < argp->count) {
 		static unsigned long    complain;
+		struct nfs_open_context *ctx =
+			hdr->req->wb_lock_context->open_context;
 
+		set_bit(NFS_CONTEXT_WRITE_SYNC, &ctx->flags);
 		/* This a short write! */
 		nfs_inc_stats(hdr->inode, NFSIOS_SHORTWRITE);
 
@@ -1837,6 +1844,8 @@ static void nfs_commit_release_pages(struct nfs_commit_data *data)
 		/* We have a mismatch. Write the page again */
 		dprintk(" mismatch\n");
 		nfs_mark_request_dirty(req);
+		set_bit(NFS_CONTEXT_WRITE_SYNC,
+			&req->wb_lock_context->open_context->flags);
 		atomic_long_inc(&NFS_I(data->inode)->redirtied_pages);
 	next:
 		nfs_unlock_and_release_request(req);
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 8dd79a3f3d662..4623262da3c09 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -109,6 +109,7 @@ struct nfs_open_context {
 #define NFS_CONTEXT_BAD			(2)
 #define NFS_CONTEXT_UNLOCK	(3)
 #define NFS_CONTEXT_FILE_OPEN		(4)
+#define NFS_CONTEXT_WRITE_SYNC		(5)
 
 	struct nfs4_threshold	*mdsthreshold;
 	struct list_head list;
-- 
2.53.0


  parent reply	other threads:[~2026-04-28 10:42 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-28 10:40 [PATCH AUTOSEL 7.0] ALSA: hda/realtek: add quirk for HONOR MRB-XXX M1020 Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-5.10] tools/power/x86/intel-speed-select: Avoid current base freq as maximum Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.18] um: fix address-of CMSG_DATA() rvalue in stub Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.12] tty: serial: samsung_tty: avoid dev_dbg deadlock Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.18] drm/amdgpu: fix CPER ring header parsing Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.18] io_uring/rsrc: unify nospec indexing for direct descriptors Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.18] um: avoid struct sigcontext redefinition with musl Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-5.15] leds: lgm-sso: Fix typo in macro for src offset Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-5.15] fs/ntfs3: increase CLIENT_REC name field size Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-5.15] ksmbd: fix CreateOptions sanitization clobbering the whole field Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.1] thunderbolt: Disable CLx on Titan Ridge-based devices with old firmware Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.6] NFS: Use nlmclnt_shutdown_rpc_clnt() to safely shut down NLM Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.12] smb: client: compress: fix buffer overrun in lz77_compress() Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.18] drm/amd/display: Pass min page size from SOC BB to dml2_1 plane config Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.12] usb: dwc3: Support USB3340x ULPI PHY high-speed negotiation Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.12] smb: client: compress: fix counting in LZ77 match finding Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.18] mfd: mt6397: Properly fix CID of MT6328, MT6331 and MT6332 Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.1] um: Disable GCOV_PROFILE_ALL on 32-bit UML with Clang 20/21 Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.12] ASoC: qcom: x1e80100: limit speaker volumes Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.12] smb: client: compress: fix bad encoding on last LZ77 flag Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-5.15] fs/ntfs3: fix potential double iput on d_make_root() failure Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-5.10] scsi: storvsc: Handle PERSISTENT_RESERVE_IN truncation for Hyper-V vFC Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0] fs: aio: set VMA_DONTCOPY_BIT in mmap to fix NULL-pointer-dereference error Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-5.10] dt-bindings: arm64: add Marvell 7k COMe boards Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-5.10] ecryptfs: Set s_time_gran to get correct time granularity Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-5.10] usb: usbip: fix OOB read/write in usbip_pad_iso() Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.18] scsi: lpfc: Remove unnecessary ndlp kref get in lpfc_check_nlp_post_devloss Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.18] leds: core: Implement fallback to software node name for LED names Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-5.15] ntfs3: reject inodes with zero non-DOS link count Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-5.10] f2fs: fix to skip empty sections in f2fs_get_victim Sasha Levin
2026-04-28 10:40 ` Sasha Levin [this message]
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.6] dt-bindings: rtc: microcrystal,rv3028: Allow to specify vdd-supply Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-5.10] fs: aio: reject partial mremap to avoid Null-pointer-dereference error Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-5.15] fs/ntfs3: fix $LXDEV xattr lookup Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.18] scsi: ufs: ufs-pci: Add support for Intel Nova Lake Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.1] scsi: lpfc: Fix incorrect txcmplq_cnt during cleanup in lpfc_sli_abort_ring() Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.18] drm/amdgpu: drop userq fence driver refs out of fence process() Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-5.15] ksmbd: fix O(N^2) DoS in smb2_lock via unbounded LockCount Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-5.10] usb: gadget: bdc: validate status-report endpoint indices Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.12] coda_flag_children(): fix a UAF Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-5.10] fbdev: savage: fix probe-path EDID cleanup leaks Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0] scsi: virtio_scsi: Move INIT_WORK calls to virtscsi_probe() Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-5.10] iio: ABI: fix current_trigger description Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-5.10] staging: octeon: fix free_irq dev_id mismatch in cvm_oct_rx_shutdown Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.12] mfd: intel-lpss: Add Intel Nova Lake-H PCI IDs Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-5.10] tty: serial: imx: keep dma request disabled before dma transfer setup Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.12] greybus: beagleplay: bound bootloader RX buffer copy Sasha Levin
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-5.10] serial: qcom-geni: Fix RTS behavior with flow control Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-5.10] selftests: fib_nexthops: test stale has_v4 on nexthop replace Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-6.1] ntfs3: fix OOB write in attr_wof_frame_info() Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-5.10] arm64: cputype: Add C1-Pro definitions Sasha Levin
2026-04-28 11:13   ` Mark Rutland
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-6.18] drm/amd/display: Fix HostVMMinPageSize unit mismatch in DML2.1 Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-6.12] 9p/trans_xen: make cleanup idempotent after dataring alloc errors Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0] drm/amdgpu: OR init_pte_flags into invalid leaf PTE updates Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-6.6] scsi: ufs: core: Disable timestamp for Kioxia THGJFJT0E25BAIP Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-6.18] f2fs: fix to freeze GC and discard threads quickly Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-5.10] scsi: esas2r: Fix __printf annotation on esas2r_log_master() Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-6.18] rtc: max77686: convert to i2c_new_ancillary_device Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-6.1] rtc: ti-k3: Add support to resume from IO DDR low power mode Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-6.12] bus: mhi: host: pci_generic: Add Telit FE912C04 modem support Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-5.10] usb: usbip: fix integer overflow in usbip_recv_iso() Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-5.10] clk: qcom: rcg2: expand frac table for mdss_pixel_clk_src Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-5.10] usb: usbip: validate iso frame actual_length in usbip_recv_iso() Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-6.12] bus: mhi: host: pci_generic: Add Qualcomm SDX35 modem Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0] drm/amd/display: Use overlay cursor when color pipeline is active Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-6.18] platform/x86: hp-wmi: Add support for Omen 16-wf1xxx (8C77) Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-6.12] smb: server: stop sending fake security descriptors Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-6.18] ALSA: usb-audio: Add quirk entries for NexiGo N930W webcam Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-5.15] ntfs3: fix memory leak in indx_create_allocate() Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-5.10] staging: fbtft: fix unchecked write return value in fb_agm1264k-fl Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-5.10] ipv6: Cap TLV scan in ip6_tnl_parse_tlv_enc_lim Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-6.18] scsi: lpfc: Add PCI ID support for LPe42100 series adapters Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-6.12] io_uring: take page references for NOMMU pbuf_ring mmaps Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-6.18] iio: imu: st_lsm6dsx: Add ACPI ID for SHIFT13mi gyroscope Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-5.15] dt-bindings: clock: qcom,gcc-sc8180x: Add missing GDSCs Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260428104133.2858589-31-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=anna@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=okorniev@redhat.com \
    --cc=patches@lists.linux.dev \
    --cc=stable@vger.kernel.org \
    --cc=trond.myklebust@hammerspace.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox