From: Mike Snitzer <snitzer@kernel.org>
To: Trond Myklebust <trondmy@kernel.org>, Anna Schumaker <anna@kernel.org>
Cc: linux-nfs@vger.kernel.org
Subject: [PATCH 2/2] NFSv4/flexfiles: honor FF_FLAGS_NO_IO_THRU_MDS in pg_get_mirror_count_write
Date: Thu, 4 Jun 2026 16:24:03 -0400
Message-ID: <20260604202403.20856-3-snitzer@kernel.org>
In-Reply-To: <20260604202403.20856-1-snitzer@kernel.org>

The FF_FLAGS_NO_IO_THRU_MDS flag lives on each lseg, so any fallback
decision made when there is no current lseg (e.g. between LAYOUTRETURN
and the next LAYOUTGET) cannot run the per-lseg check.

Introduce a sticky hdr-level mirror of FF_FLAGS_NO_IO_THRU_MDS in
struct nfs4_flexfile_layout::flags (the NFS4_FF_HDR_NO_IO_THRU_MDS bit),
set whenever ff_layout_alloc_lseg() parses an lseg with the flag. The
bit is never cleared for the lifetime of the layout hdr; the server is
assumed to be consistent in its no-fallback policy per file.
kzalloc() in ff_layout_alloc_layout_hdr() zero-initializes the field.

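For illustration only (not part of the patch), the sticky-bit idea can
be sketched in userspace C. Everything prefixed model_/MODEL_ is a
hypothetical stand-in: C11 atomics take the place of the kernel's
set_bit()/test_bit(), and the per-lseg mask value is chosen for the
example rather than taken from the kernel headers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the values are illustrative assumptions. */
#define MODEL_LSEG_NO_IO_THRU_MDS 0x2u	/* per-lseg flag mask */
#define MODEL_HDR_NO_IO_THRU_MDS  0	/* hdr-level bit index */

struct model_hdr {
	atomic_ulong flags;	/* models nfs4_flexfile_layout::flags */
};

/* Start with no bits set, as a kzalloc()'d hdr would. */
static void model_hdr_init(struct model_hdr *hdr)
{
	assert(hdr != NULL);
	atomic_init(&hdr->flags, 0UL);
}

/* Called once per parsed lseg: latch the hdr bit if the lseg has the flag. */
static void model_parse_lseg(struct model_hdr *hdr, uint32_t lseg_flags)
{
	assert(hdr != NULL);
	if (lseg_flags & MODEL_LSEG_NO_IO_THRU_MDS)
		atomic_fetch_or(&hdr->flags,
				1UL << MODEL_HDR_NO_IO_THRU_MDS);
	/* Deliberately no "else clear": the bit is sticky. */
}

/* Usable even when no lseg is currently held. */
static bool model_hdr_no_fallback(struct model_hdr *hdr)
{
	assert(hdr != NULL);
	return (atomic_load(&hdr->flags) &
		(1UL << MODEL_HDR_NO_IO_THRU_MDS)) != 0;
}
```

In this model, once any lseg with the flag has been parsed, later lsegs
without the flag do not clear the hdr-level bit, which matches the
"never cleared for the lifetime of the layout hdr" behavior described
above.
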
Use the new ff_layout_hdr_no_fallback_to_mds() helper to gate
ff_layout_pg_get_mirror_count_write(): when pnfs_update_layout() returns
NULL (e.g. NFS_LAYOUT_BULK_RECALL, pnfs_layout_io_test_failed,
pnfs_layoutgets_blocked) the existing code unconditionally calls
nfs_pageio_reset_write_mds(), which is a source of unwanted WRITEs to
the MDS. Fix this by checking the NFS4_FF_HDR_NO_IO_THRU_MDS bit and,
if it is set, surfacing -EAGAIN instead; the writepage-side caller
(nfs_do_writepage() for buffered, nfs_direct_write_reschedule() for
O_DIRECT) then redirties the request so writeback retries via pNFS.

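As a rough userspace sketch of the resulting decision (again not part of
the patch), the three outcomes can be modeled as below. The struct, enum
and function names are hypothetical; only -EAGAIN and the order of the
checks are taken from the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum model_write_path {
	MODEL_WRITE_VIA_DS,	/* an lseg is available: normal pNFS write */
	MODEL_WRITE_VIA_MDS,	/* fall back to the metadata server */
	MODEL_WRITE_RETRY,	/* pg_error set to -EAGAIN; caller redirties */
};

/* Hypothetical reduction of the state the real function consults. */
struct model_pgio {
	bool have_lseg;		/* models pgio->pg_lseg != NULL */
	bool have_hdr;		/* models NFS_I(inode)->layout != NULL */
	bool hdr_no_fallback;	/* models the sticky hdr-level bit */
	int pg_error;
};

static enum model_write_path model_pick_write_path(struct model_pgio *pgio)
{
	assert(pgio != NULL);

	if (pgio->have_lseg)
		return MODEL_WRITE_VIA_DS;

	/* No lseg: consult the hdr-level bit, but only if a hdr exists. */
	if (pgio->have_hdr && pgio->hdr_no_fallback) {
		pgio->pg_error = -EAGAIN;
		return MODEL_WRITE_RETRY;
	}

	/* Previous behavior: unconditional fallback to the MDS. */
	return MODEL_WRITE_VIA_MDS;
}
```

The hdr check comes before the MDS fallback, so a file whose server has
ever advertised the no-fallback policy takes the retry path whenever no
lseg is available.
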
Fixes: 260074cd8413 ("pNFS/flexfiles: Add support for FF_FLAGS_NO_IO_THRU_MDS")
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
---
 fs/nfs/flexfilelayout/flexfilelayout.c | 13 +++++++++++++
 fs/nfs/flexfilelayout/flexfilelayout.h | 16 ++++++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
index 38bcd260e0a91..a63f90be11dfd 100644
--- a/fs/nfs/flexfilelayout/flexfilelayout.c
+++ b/fs/nfs/flexfilelayout/flexfilelayout.c
@@ -636,6 +636,9 @@ ff_layout_alloc_lseg(struct pnfs_layout_hdr *lh,
 	if (!p)
 		goto out_sort_mirrors;
 	fls->flags = be32_to_cpup(p);
+	if (fls->flags & FF_FLAGS_NO_IO_THRU_MDS)
+		set_bit(NFS4_FF_HDR_NO_IO_THRU_MDS,
+			&FF_LAYOUT_FROM_HDR(lh)->flags);
 	p = xdr_inline_decode(&stream, 4);
 	if (!p)
@@ -1185,6 +1188,16 @@ ff_layout_pg_get_mirror_count_write(struct nfs_pageio_descriptor *pgio,
 			0, NFS4_MAX_UINT64, IOMODE_RW,
 			NFS_I(pgio->pg_inode)->layout,
 			pgio->pg_lseg);
+	if (NFS_I(pgio->pg_inode)->layout &&
+	    ff_layout_hdr_no_fallback_to_mds(NFS_I(pgio->pg_inode)->layout)) {
+		/*
+		 * FF_FLAGS_NO_IO_THRU_MDS: no current lseg but the server's
+		 * policy forbids MDS fallback. Surface -EAGAIN so writeback
+		 * retries rather than silently issuing the WRITE via MDS.
+		 */
+		pgio->pg_error = -EAGAIN;
+		goto out;
+	}
 	/* no lseg means that pnfs is not in use, so no mirroring here */
 	nfs_pageio_reset_write_mds(pgio);
 out:
diff --git a/fs/nfs/flexfilelayout/flexfilelayout.h b/fs/nfs/flexfilelayout/flexfilelayout.h
index 17a008c8e97ce..a5bd00f69e824 100644
--- a/fs/nfs/flexfilelayout/flexfilelayout.h
+++ b/fs/nfs/flexfilelayout/flexfilelayout.h
@@ -112,12 +112,16 @@ struct nfs4_ff_layout_segment {
 	struct nfs4_ff_layout_mirror *mirror_array[] __counted_by(mirror_array_cnt);
};
+/* nfs4_flexfile_layout::flags bit indices */
+#define NFS4_FF_HDR_NO_IO_THRU_MDS 0 /* any lseg has had FF_FLAGS_NO_IO_THRU_MDS */
+
struct nfs4_flexfile_layout {
 	struct pnfs_layout_hdr generic_hdr;
 	struct pnfs_ds_commit_info commit_info;
 	struct list_head mirrors;
 	struct list_head error_list; /* nfs4_ff_layout_ds_err */
 	ktime_t last_report_time; /* Layoutstat report times */
+	unsigned long flags;
};
struct nfs4_flexfile_layoutreturn_args {
@@ -184,6 +188,18 @@ ff_layout_no_fallback_to_mds(struct pnfs_layout_segment *lseg)
 	return FF_LAYOUT_LSEG(lseg)->flags & FF_FLAGS_NO_IO_THRU_MDS;
}
+/*
+ * Sticky hdr-level mirror of FF_FLAGS_NO_IO_THRU_MDS so callers that have
+ * no current lseg (e.g. between LAYOUTRETURN and the next LAYOUTGET) can
+ * still honor the no-MDS-fallback policy.
+ */
+static inline bool
+ff_layout_hdr_no_fallback_to_mds(struct pnfs_layout_hdr *lo)
+{
+	return test_bit(NFS4_FF_HDR_NO_IO_THRU_MDS,
+			&FF_LAYOUT_FROM_HDR(lo)->flags);
+}
+
static inline bool
ff_layout_no_read_on_rw(struct pnfs_layout_segment *lseg)
{
--
2.44.0