Linux NFS development
From: "Mkrtchyan, Tigran" <tigran.mkrtchyan@desy.de>
To: Mike Snitzer <snitzer@kernel.org>
Cc: Trond Myklebust <trondmy@kernel.org>,
	Anna Schumaker <anna@kernel.org>,
	 linux-nfs <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH 1/2] NFSv4/flexfiles: honor FF_FLAGS_NO_IO_THRU_MDS on fatal DS connect errors
Date: Tue, 9 Jun 2026 19:14:02 +0200 (CEST)
Message-ID: <1103477872.3950199.1781025242395.JavaMail.zimbra@desy.de>
In-Reply-To: <20260604202403.20856-2-snitzer@kernel.org>

Just a heads-up: I built the proposed patches on top of 7.1.0-rc6 and
ran a bunch of tests against the dCache NFS server using the flexfiles
layout with tightly coupled NFSv4.1 DSes.

No smoking guns detected.

Thanks, Mike.

Best,
  Tigran.



----- Original Message -----
> From: "Mike Snitzer" <snitzer@kernel.org>
> To: "Trond Myklebust" <trondmy@kernel.org>, "Anna Schumaker" <anna@kernel.org>
> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
> Sent: Thursday, 4 June, 2026 22:24:02
> Subject: [PATCH 1/2] NFSv4/flexfiles: honor FF_FLAGS_NO_IO_THRU_MDS on fatal DS connect errors

> Commit f06bedfa62d5 ("pNFS/flexfiles: don't attempt pnfs on fatal DS
> errors") teaches ff_layout_{read,write}_pagelist() to return
> PNFS_NOT_ATTEMPTED when nfs4_ff_layout_prepare_ds() fails with a
> nfs_error_is_fatal() errno (e.g. -ETIMEDOUT from a SOFTCONN connect
> deadline, -ENOMEM, -ERESTARTSYS), so that the client gives up instead
> of spinning.  pnfs_do_{read,write}() then dispatches the I/O through
> pnfs_{read,write}_through_mds() → nfs_pageio_reset_{read,write}_mds().
> 
> That fallback is unconditional and silently violates FF_FLAGS_NO_IO_THRU_MDS:
> when the layout segment carries the flag (typically single-mirror
> appliance layouts where MDS I/O is explicitly forbidden), the
> out_failed: path's `&& !ds_fatal_error` clause overrides the flag's
> short-circuit through ff_layout_avoid_mds_available_ds() and routes
> the I/O to the MDS file handle anyway.
> 
> This is reachable in practice during a data-server restart: SOFTCONN
> exhaustion produces -ETIMEDOUT, which is fatal per nfs_error_is_fatal(),
> which triggers PNFS_NOT_ATTEMPTED, which silently goes to MDS.
> 
> Preserve the upstream "don't spin on fatal errors" intent for layouts
> that permit MDS fallback.  For layouts with FF_FLAGS_NO_IO_THRU_MDS
> set, mark the layout for return and request PNFS_TRY_AGAIN instead;
> if the server cannot supply a usable layout the failure now surfaces
> cleanly via pnfs_update_layout(), rather than via silent MDS I/O that
> contradicts the flag.
> 
> Fixes: f06bedfa62d5 ("pNFS/flexfiles: don't attempt pnfs on fatal DS errors")
> Assisted-by: Claude:claude-opus-4-7
> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
> ---
> fs/nfs/flexfilelayout/flexfilelayout.c | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
> 
> diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
> index 4d142f1fdf61a..38bcd260e0a91 100644
> --- a/fs/nfs/flexfilelayout/flexfilelayout.c
> +++ b/fs/nfs/flexfilelayout/flexfilelayout.c
> @@ -2204,6 +2204,14 @@ ff_layout_read_pagelist(struct nfs_pgio_header *hdr)
> out_failed:
> 	if (ff_layout_avoid_mds_available_ds(lseg) && !ds_fatal_error)
> 		return PNFS_TRY_AGAIN;
> +	if (ff_layout_no_fallback_to_mds(lseg)) {
> +		/*
> +		 * FF_FLAGS_NO_IO_THRU_MDS: force fresh LAYOUTGET,
> +		 * never fall through to MDS I/O.
> +		 */
> +		pnfs_error_mark_layout_for_return(hdr->inode, lseg);
> +		return PNFS_TRY_AGAIN;
> +	}
> 	trace_pnfs_mds_fallback_read_pagelist(hdr->inode,
> 			hdr->args.offset, hdr->args.count,
> 			IOMODE_READ, NFS_I(hdr->inode)->layout, lseg);
> @@ -2289,6 +2297,14 @@ ff_layout_write_pagelist(struct nfs_pgio_header *hdr, int sync)
> out_failed:
> 	if (ff_layout_avoid_mds_available_ds(lseg) && !ds_fatal_error)
> 		return PNFS_TRY_AGAIN;
> +	if (ff_layout_no_fallback_to_mds(lseg)) {
> +		/*
> +		 * FF_FLAGS_NO_IO_THRU_MDS: force fresh LAYOUTGET,
> +		 * never fall through to MDS I/O.
> +		 */
> +		pnfs_error_mark_layout_for_return(hdr->inode, lseg);
> +		return PNFS_TRY_AGAIN;
> +	}
> 	trace_pnfs_mds_fallback_write_pagelist(hdr->inode,
> 			hdr->args.offset, hdr->args.count,
> 			IOMODE_RW, NFS_I(hdr->inode)->layout, lseg);
> --
> 2.44.0
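
For readers following the out_failed: logic in the quoted patch, here is a
minimal userspace sketch of the decision order after the change. It is a
model, not kernel code: the model_* names and the struct are hypothetical,
and the results of ff_layout_avoid_mds_available_ds() and
ff_layout_no_fallback_to_mds() are treated as plain boolean inputs rather
than re-implemented. The final MODEL_NOT_ATTEMPTED return is assumed from
the commit message's description of the MDS fallback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PNFS_* return values named in the patch. */
enum model_status {
	MODEL_NOT_ATTEMPTED,	/* caller falls back to I/O through the MDS */
	MODEL_TRY_AGAIN,	/* caller retries, which leads to a new layout */
};

/* Hypothetical summary of the layout segment state the out_failed path reads. */
struct model_lseg {
	bool avoid_mds_available_ds;	/* assumed result of ff_layout_avoid_mds_available_ds() */
	bool no_fallback_to_mds;	/* assumed result of ff_layout_no_fallback_to_mds() */
	bool marked_for_return;		/* set where the patch calls pnfs_error_mark_layout_for_return() */
};

/* Mirrors the order of checks at out_failed: with the patch applied. */
enum model_status model_out_failed(struct model_lseg *lseg, bool ds_fatal_error)
{
	assert(lseg != NULL);

	/* Existing check: retry unless the DS error was fatal. */
	if (lseg->avoid_mds_available_ds && !ds_fatal_error)
		return MODEL_TRY_AGAIN;

	/* New check: with the flag set, mark the layout and retry instead of using the MDS. */
	if (lseg->no_fallback_to_mds) {
		lseg->marked_for_return = true;
		return MODEL_TRY_AGAIN;
	}

	/* Layouts that permit MDS fallback keep the "don't spin" behaviour. */
	return MODEL_NOT_ATTEMPTED;
}
```

In this model, a fatal DS error on a segment with both inputs true yields
MODEL_TRY_AGAIN with marked_for_return set, while a segment with both
inputs false still yields MODEL_NOT_ATTEMPTED.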
