From: "Mkrtchyan, Tigran" <tigran.mkrtchyan@desy.de>
To: Mike Snitzer <snitzer@kernel.org>
Cc: Trond Myklebust <trondmy@kernel.org>,
Anna Schumaker <anna@kernel.org>,
linux-nfs <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH 1/2] NFSv4/flexfiles: honor FF_FLAGS_NO_IO_THRU_MDS on fatal DS connect errors
Date: Tue, 9 Jun 2026 19:14:02 +0200 (CEST)
Message-ID: <1103477872.3950199.1781025242395.JavaMail.zimbra@desy.de>
In-Reply-To: <20260604202403.20856-2-snitzer@kernel.org>
Just a heads-up: I built the proposed patches on top of 7.1.0-rc6 and
ran a bunch of tests against the dCache NFS server using the flexfiles
layout with tightly coupled NFSv4.1 DSes.
No smoking guns detected.
Thanks, Mike.
Best,
Tigran.
----- Original Message -----
> From: "Mike Snitzer" <snitzer@kernel.org>
> To: "Trond Myklebust" <trondmy@kernel.org>, "Anna Schumaker" <anna@kernel.org>
> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
> Sent: Thursday, 4 June, 2026 22:24:02
> Subject: [PATCH 1/2] NFSv4/flexfiles: honor FF_FLAGS_NO_IO_THRU_MDS on fatal DS connect errors
> Commit f06bedfa62d5 ("pNFS/flexfiles: don't attempt pnfs on fatal DS
> errors") teaches ff_layout_{read,write}_pagelist() to return
> PNFS_NOT_ATTEMPTED when nfs4_ff_layout_prepare_ds() fails with a
> nfs_error_is_fatal() errno (e.g. -ETIMEDOUT from a SOFTCONN connect
> deadline, -ENOMEM, -ERESTARTSYS), so that the client gives up instead
> of spinning. pnfs_do_{read,write}() then dispatches the I/O through
> pnfs_{read,write}_through_mds() → nfs_pageio_reset_{read,write}_mds().
>
> That fallback is unconditional and silently violates FF_FLAGS_NO_IO_THRU_MDS:
> when the layout segment carries the flag (typically single-mirror
> appliance layouts where MDS I/O is explicitly forbidden), the
> out_failed: path's `&& !ds_fatal_error` clause overrides the flag's
> short-circuit through ff_layout_avoid_mds_available_ds() and routes
> the I/O to the MDS file handle anyway.
>
> This is reachable in practice during a data-server restart: SOFTCONN
> exhaustion produces -ETIMEDOUT, which is fatal per nfs_error_is_fatal(),
> which triggers PNFS_NOT_ATTEMPTED, which silently goes to MDS.
>
> Preserve the upstream "don't spin on fatal errors" intent for layouts
> that permit MDS fallback. For layouts with FF_FLAGS_NO_IO_THRU_MDS
> set, mark the layout for return and request PNFS_TRY_AGAIN instead;
> if the server cannot supply a usable layout the failure now surfaces
> cleanly via pnfs_update_layout(), rather than via silent MDS I/O that
> contradicts the flag.
>
> Fixes: f06bedfa62d5 ("pNFS/flexfiles: don't attempt pnfs on fatal DS errors")
> Assisted-by: Claude:claude-opus-4-7
> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
> ---
> fs/nfs/flexfilelayout/flexfilelayout.c | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
>
> diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
> index 4d142f1fdf61a..38bcd260e0a91 100644
> --- a/fs/nfs/flexfilelayout/flexfilelayout.c
> +++ b/fs/nfs/flexfilelayout/flexfilelayout.c
> @@ -2204,6 +2204,14 @@ ff_layout_read_pagelist(struct nfs_pgio_header *hdr)
>  out_failed:
>  	if (ff_layout_avoid_mds_available_ds(lseg) && !ds_fatal_error)
>  		return PNFS_TRY_AGAIN;
> +	if (ff_layout_no_fallback_to_mds(lseg)) {
> +		/*
> +		 * FF_FLAGS_NO_IO_THRU_MDS: force fresh LAYOUTGET,
> +		 * never fall through to MDS I/O.
> +		 */
> +		pnfs_error_mark_layout_for_return(hdr->inode, lseg);
> +		return PNFS_TRY_AGAIN;
> +	}
>  	trace_pnfs_mds_fallback_read_pagelist(hdr->inode,
>  		hdr->args.offset, hdr->args.count,
>  		IOMODE_READ, NFS_I(hdr->inode)->layout, lseg);
> @@ -2289,6 +2297,14 @@ ff_layout_write_pagelist(struct nfs_pgio_header *hdr, int sync)
>  out_failed:
>  	if (ff_layout_avoid_mds_available_ds(lseg) && !ds_fatal_error)
>  		return PNFS_TRY_AGAIN;
> +	if (ff_layout_no_fallback_to_mds(lseg)) {
> +		/*
> +		 * FF_FLAGS_NO_IO_THRU_MDS: force fresh LAYOUTGET,
> +		 * never fall through to MDS I/O.
> +		 */
> +		pnfs_error_mark_layout_for_return(hdr->inode, lseg);
> +		return PNFS_TRY_AGAIN;
> +	}
>  	trace_pnfs_mds_fallback_write_pagelist(hdr->inode,
>  		hdr->args.offset, hdr->args.count,
>  		IOMODE_RW, NFS_I(hdr->inode)->layout, lseg);
> --
> 2.44.0
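
For readers following the thread, the ordering of checks at out_failed:
in the quoted patch can be modeled in userspace as below. This is only a
minimal sketch, not kernel code: the enum, the struct and
ff_out_failed_model() are hypothetical names, and the three booleans stand
in for ff_layout_avoid_mds_available_ds(lseg), ds_fatal_error and
ff_layout_no_fallback_to_mds(lseg). The final result is assumed to be
PNFS_NOT_ATTEMPTED, as the commit message describes; other side effects of
the real function are not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; these are not the kernel's definitions. */
enum model_result {
	MODEL_TRY_AGAIN,	/* models PNFS_TRY_AGAIN */
	MODEL_NOT_ATTEMPTED,	/* models PNFS_NOT_ATTEMPTED (I/O falls back to the MDS) */
};

struct model_outcome {
	enum model_result result;
	/* true only on the branch where the patch marks the layout for return */
	bool marked_for_return;
};

/*
 * Models the order of the checks at out_failed: after the patch.
 * Each argument is an assumed boolean summary of the corresponding
 * kernel helper or local variable.
 */
struct model_outcome ff_out_failed_model(bool avoid_mds_available_ds,
					 bool ds_fatal_error,
					 bool no_fallback_to_mds)
{
	struct model_outcome out = { MODEL_NOT_ATTEMPTED, false };

	/* Existing check: retry pNFS unless the DS error was fatal. */
	if (avoid_mds_available_ds && !ds_fatal_error) {
		out.result = MODEL_TRY_AGAIN;
		return out;
	}

	/* New check: the flag forbids MDS I/O, so ask for a fresh layout. */
	if (no_fallback_to_mds) {
		out.marked_for_return = true;
		out.result = MODEL_TRY_AGAIN;
		return out;
	}

	/* Otherwise the I/O is handed to the MDS path. */
	return out;
}
```

In this model, a fatal DS error on a layout that forbids MDS I/O, i.e.
ff_out_failed_model(true, true, true), yields MODEL_TRY_AGAIN with
marked_for_return set, while the same error on a layout that permits
fallback, ff_out_failed_model(false, true, false), yields
MODEL_NOT_ATTEMPTED.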