Linux NFS development
From: Laurence Oberman <loberman@redhat.com>
To: Trond Myklebust <trondmy@kernel.org>,
	Benjamin Coddington	 <bcodding@redhat.com>,
	Anna Schumaker <anna@kernel.org>, Tejun Heo <tj@kernel.org>,
	 Lai Jiangshan <jiangshanlai@gmail.com>
Cc: linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org,
	 djeffery@redhat.com
Subject: Re: [PATCH 2/2] NFS: Improve nfsiod workqueue detection for allocation flags
Date: Mon, 07 Jul 2025 16:28:03 -0400
Message-ID: <59530cbe001f5d02fa007ce642a860a7bade4422.camel@redhat.com>
In-Reply-To: <a7621e726227260396291e82363d2b82e5f2ad73.camel@kernel.org>

On Mon, 2025-07-07 at 12:25 -0700, Trond Myklebust wrote:
> On Mon, 2025-07-07 at 14:46 -0400, Benjamin Coddington wrote:
> > The NFS client writeback paths change which flags are passed to their
> > memory allocation calls based on whether the current task is running from
> > within a workqueue or not.  More specifically, it appears that during
> > writeback allocations with PF_WQ_WORKER set on current->flags will add
> > __GFP_NORETRY | __GFP_NOWARN.  Presumably this is because nfsiod can
> > simply fail quickly and later retry to write back that specific page should
> > the allocation fail.
> > 
> > However, the check for PF_WQ_WORKER is too general because tasks can enter NFS
> > writeback paths from other workqueues.  Specifically, the loopback driver
> > tends to perform writeback into backing files on NFS with PF_WQ_WORKER set,
> > and additionally sets PF_MEMALLOC_NOIO.  The combination of
> > PF_MEMALLOC_NOIO with __GFP_NORETRY can easily result in allocation
> > failures and the loopback driver has no retry functionality.  As a result,
> > after commit 0bae835b63c5 ("NFS: Avoid writeback threads getting stuck in
> > mempool_alloc()") users are seeing corrupted loop-mounted filesystems backed
> > by image files on NFS.
> > 
> > In a preceding patch, we introduced a function to allow NFS to detect if
> > the task is executing within a specific workqueue.  Here we use that helper
> > to set __GFP_NORETRY | __GFP_NOWARN only if the workqueue is nfsiod.
> > 
> > Fixes: 0bae835b63c5 ("NFS: Avoid writeback threads getting stuck in mempool_alloc()")
> > Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
> > ---
> >  fs/nfs/internal.h | 12 +++++++++++-
> >  1 file changed, 11 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
> > index 69c2c10ee658..173172afa3f5 100644
> > --- a/fs/nfs/internal.h
> > +++ b/fs/nfs/internal.h
> > @@ -12,6 +12,7 @@
> >  #include <linux/nfs_page.h>
> >  #include <linux/nfslocalio.h>
> >  #include <linux/wait_bit.h>
> > +#include <linux/workqueue.h>
> >  
> >  #define NFS_SB_MASK (SB_NOSUID|SB_NODEV|SB_NOEXEC|SB_SYNCHRONOUS)
> >  
> > @@ -669,9 +670,18 @@ nfs_write_match_verf(const struct nfs_writeverf *verf,
> >  		!nfs_write_verifier_cmp(&req->wb_verf, &verf->verifier);
> >  }
> >  
> > +static inline bool is_nfsiod(void)
> > +{
> > +	struct workqueue_struct *current_wq = current_workqueue();
> > +
> > +	if (current_wq)
> > +		return current_wq == nfsiod_workqueue;
> > +	return false;
> > +}
> > +
> >  static inline gfp_t nfs_io_gfp_mask(void)
> >  {
> > -	if (current->flags & PF_WQ_WORKER)
> > +	if (is_nfsiod())
> >  		return GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN;
> >  	return GFP_KERNEL;
> >  }
> 
> 
> Instead of trying to identify the nfsiod_workqueue, why not apply
> current_gfp_context() in order to weed out callers that set
> PF_MEMALLOC_NOIO and PF_MEMALLOC_NOFS?
> 
> i.e.
> 
> 
> static inline gfp_t nfs_io_gfp_mask(void)
> {
> 	gfp_t ret = current_gfp_context(GFP_KERNEL);
> 
> 	if ((current->flags & PF_WQ_WORKER) && ret == GFP_KERNEL)
> 		ret |= __GFP_NORETRY | __GFP_NOWARN;
> 	return ret;
> }
> 
> 

I am testing both patch options to see whether each one prevents the
failed writes without any other impact, and will report back.

The test is confined to the use case of an XFS file system backed by an
image file located on NFS, as that is where the failed writes were
seen.
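
For anyone following along, the three behaviours being compared (the
current PF_WQ_WORKER check, the is_nfsiod() check from the patch, and
the current_gfp_context() variant suggested above) can be modelled in
userspace. This is only a sketch under stated assumptions: the M_* bit
values, struct task_model, its on_nfsiod field and the mask_* function
names are made up for illustration and are not the kernel's
definitions, and model_gfp_context() mimics only the NOIO/NOFS handling
of current_gfp_context().

```c
#include <stdbool.h>

/* Illustrative bit values only; the real kernel constants differ. */
#define M_GFP_IO       (1u << 0)
#define M_GFP_FS       (1u << 1)
#define M_GFP_RECLAIM  (1u << 2)
#define M_GFP_NORETRY  (1u << 3)
#define M_GFP_NOWARN   (1u << 4)
#define M_GFP_KERNEL   (M_GFP_IO | M_GFP_FS | M_GFP_RECLAIM)

#define M_PF_WQ_WORKER      (1u << 0)
#define M_PF_MEMALLOC_NOIO  (1u << 1)
#define M_PF_MEMALLOC_NOFS  (1u << 2)

/* Hypothetical stand-in for the parts of "current" that matter here. */
struct task_model {
	unsigned int flags;  /* M_PF_* bits */
	bool on_nfsiod;      /* true if running on the nfsiod workqueue */
};

/* Models the NOIO/NOFS masking: NOIO clears IO and FS, NOFS clears FS. */
static unsigned int model_gfp_context(const struct task_model *t,
				      unsigned int gfp)
{
	if (t->flags & M_PF_MEMALLOC_NOIO)
		gfp &= ~(M_GFP_IO | M_GFP_FS);
	else if (t->flags & M_PF_MEMALLOC_NOFS)
		gfp &= ~M_GFP_FS;
	return gfp;
}

/* Existing behaviour: any workqueue worker gets the no-retry flags. */
static unsigned int mask_before(const struct task_model *t)
{
	if (t->flags & M_PF_WQ_WORKER)
		return M_GFP_KERNEL | M_GFP_NORETRY | M_GFP_NOWARN;
	return M_GFP_KERNEL;
}

/* Patch 2/2: only a worker on nfsiod gets the no-retry flags. */
static unsigned int mask_is_nfsiod(const struct task_model *t)
{
	if (t->on_nfsiod)
		return M_GFP_KERNEL | M_GFP_NORETRY | M_GFP_NOWARN;
	return M_GFP_KERNEL;
}

/* Suggested alternative: skip the no-retry flags if NOIO/NOFS applied. */
static unsigned int mask_gfp_context(const struct task_model *t)
{
	unsigned int ret = model_gfp_context(t, M_GFP_KERNEL);

	if ((t->flags & M_PF_WQ_WORKER) && ret == M_GFP_KERNEL)
		ret |= M_GFP_NORETRY | M_GFP_NOWARN;
	return ret;
}
```

In this model, a loop worker described as
{ M_PF_WQ_WORKER | M_PF_MEMALLOC_NOIO, false } gets the no-retry flags
only from mask_before(). mask_is_nfsiod() returns plain M_GFP_KERNEL
for it, and mask_gfp_context() returns a mask with the IO and FS bits
cleared and no no-retry flags. An nfsiod worker with no memalloc flags
gets the no-retry flags from all three.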




