linux-xfs.vger.kernel.org archive mirror
From: "NeilBrown" <neilb@suse.de>
To: "Christoph Hellwig" <hch@infradead.org>
Cc: "Christoph Hellwig" <hch@infradead.org>,
	"Dave Chinner" <david@fromorbit.com>,
	"Mike Snitzer" <snitzer@kernel.org>,
	linux-xfs@vger.kernel.org, "Brian Foster" <bfoster@redhat.com>,
	linux-nfs@vger.kernel.org
Subject: Re: [PATCH v2] xfs: enable WQ_MEM_RECLAIM on m_sync_workqueue
Date: Wed, 10 Jul 2024 09:12:58 +1000
Message-ID: <172056677808.15471.5200774043985229799@noble.neil.brown.name>
In-Reply-To: <ZoVdAPusEMugHBl8@infradead.org>

On Thu, 04 Jul 2024, Christoph Hellwig wrote:
> On Wed, Jul 03, 2024 at 09:29:00PM +1000, NeilBrown wrote:
> > I know nothing of this stance.  Do you have a reference?
> 
> No particular one.
> 
> > I have put a modest amount of work into ensuring that NFS to a server
> > on the same machine works and last I checked it did - though I'm more
> > confident of NFSv3 than NFSv4 because of the state manager thread.
> 
> How do you propagate the NOFS flag (and NOIO for a loop device) to
> the server and the workqueues run by the server and the file system
> called by it?  How do you ensure WQ_MEM_RECLAIM gets propagated to
> all workqueues that could be called by the file system on the
> server (the problem kicking off this discussion)?
> 

Do we need to propagate these?

NOFS is for deadlock avoidance.  A filesystem "backend" (Dave's term - I
think for the parts of the fs that handle write-back) might allocate
memory.  That allocation might block waiting for memory reclaim, and
memory reclaim might re-enter the filesystem backend and block on a lock
(or similar) that is held while allocating memory.  NOFS breaks that
deadlock.
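
As a rough illustration of that cycle, here is a userspace toy model in
Python (not kernel code).  The names fs_lock, reclaim() and alloc() are
invented for the sketch.  The nofs argument only stands in for what
GFP_NOFS / PF_MEMALLOC_NOFS ask of reclaim, and a short timeout stands
in for blocking forever.

```python
import threading

# Hypothetical stand-in for a lock the fs backend holds during write-back.
fs_lock = threading.Lock()

def reclaim(may_enter_fs):
    """Toy memory reclaim: re-entering the filesystem needs fs_lock."""
    if not may_enter_fs:
        return "skipped fs"           # NOFS: reclaim stays out of the fs
    if fs_lock.acquire(timeout=0.1):  # timeout models "would block forever"
        fs_lock.release()
        return "reclaimed via fs"
    return "would deadlock"

def alloc(nofs):
    """Toy allocation that always has to wait for reclaim."""
    return reclaim(may_enter_fs=not nofs)

with fs_lock:                         # backend allocates while holding its lock
    without_nofs = alloc(nofs=False)
    with_nofs = alloc(nofs=True)

print(without_nofs, "/", with_nofs)
```

In this model the allocation made without NOFS cannot make progress
because reclaim needs the lock the allocator already holds, while the
NOFS allocation simply keeps reclaim out of the filesystem.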

The important thing here isn't the NOFS flag, it is breaking any
possible deadlock.

Layered filesystems introduce a new complexity.  The backend for one
filesystem can call into the front end of another filesystem.  That
front-end is not required to use NOFS and even if we impose
PF_MEMALLOC_NOFS, the front-end might wait for some work-queue action
which doesn't inherit the NOFS flag.

But this doesn't necessarily matter.  Calling into the filesystem is not
the problem - blocking waiting for a reply is the problem.  It is
blocking that creates deadlocks.  So if the backend of one filesystem
queues to a separate thread the work for the front end of the other
filesystem and doesn't wait for the work to complete, then a deadlock
cannot be introduced.

/dev/loop uses the loop%d workqueue for this.  Loop-back NFS hands the
front-end work over to nfsd.  The proposed localio implementation uses an
nfslocaliod workqueue for exactly the same task.  These remove the
possibility of deadlock and mean that there is no need to pass NOFS
through to the front-end of the backing filesystem.
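
The hand-off pattern can be sketched in userspace Python (a toy model
under stated assumptions, not how loop, nfsd or localio are actually
written).  backend_lock, handoff and the function names are all
hypothetical; the queue plus worker thread plays the role of the
workqueue, and a short timeout again stands in for blocking forever.

```python
import queue
import threading

backend_lock = threading.Lock()  # hypothetical lock held by the upper fs backend
handoff = queue.Queue()          # plays the role of loop%d / nfsd / nfslocaliod
completed = []

def lower_front_end(req, timeout=-1):
    """Front end of the lower fs; in this model it always needs backend_lock."""
    if not backend_lock.acquire(timeout=timeout):
        return False             # would have blocked: the deadlock case
    try:
        completed.append(req)
        return True
    finally:
        backend_lock.release()

def worker():
    """Separate thread that runs the front-end work."""
    while True:
        req = handoff.get()
        if req is None:          # sentinel: no more work
            return
        lower_front_end(req)

def upper_backend_writeback(req):
    with backend_lock:
        handoff.put(req)         # queue and return; never wait for completion

# Calling the front end directly while holding the lock cannot make progress.
with backend_lock:
    direct_ok = lower_front_end("direct", timeout=0.1)

t = threading.Thread(target=worker)
t.start()
for i in range(3):
    upper_backend_writeback(i)
handoff.put(None)
t.join()
print(direct_ok, completed)
```

The worker may briefly wait for backend_lock, but because the backend
never waits for the worker while holding it, the lock is always released
and every queued request completes.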

Note that there is a separate question concerning pageout to a swap
file.  Pageout needs more than just deadlock avoidance.  It needs
guaranteed progress in low memory conditions.  It needs PF_MEMALLOC (or
mempools) and that cannot be finessed using work queues.  I don't think
that Linux is able to support pageout through layered filesystems.

So while I support loop-back NFS and swap-over-NFS, I don't support them
in combination.  We don't support swap on /dev/loop when it is backed by
a file - for that we have swap-to-file.

Thank you for challenging me on this - it helped me clarify my thoughts
and understanding for myself.

NeilBrown
