From: "NeilBrown" <neilb@suse.de>
To: "Christoph Hellwig" <hch@infradead.org>
Cc: "Christoph Hellwig" <hch@infradead.org>,
"Dave Chinner" <david@fromorbit.com>,
"Mike Snitzer" <snitzer@kernel.org>,
linux-xfs@vger.kernel.org, "Brian Foster" <bfoster@redhat.com>,
linux-nfs@vger.kernel.org
Subject: Re: [PATCH v2] xfs: enable WQ_MEM_RECLAIM on m_sync_workqueue
Date: Wed, 10 Jul 2024 09:12:58 +1000
Message-ID: <172056677808.15471.5200774043985229799@noble.neil.brown.name>
In-Reply-To: <ZoVdAPusEMugHBl8@infradead.org>

On Thu, 04 Jul 2024, Christoph Hellwig wrote:
> On Wed, Jul 03, 2024 at 09:29:00PM +1000, NeilBrown wrote:
> > I know nothing of this stance. Do you have a reference?
>
> No particular one.
>
> > I have put a modest amount of work into ensuring that NFS to a server on the
> > same machine works and last I checked it did - though I'm more
> > confident of NFSv3 than NFSv4 because of the state manager thread.
>
> How do you propagate the NOFS flag (and NOIO for a loop device) to
> the server and the workqueues run by the server and the file system
> called by it? How do you ensure WQ_MEM_RECLAIM gets propagated to
> all workqueues that could be called by the file system on the
> server (the problem kicking off this discussion)?
>

Do we need to propagate these?

NOFS is for deadlock avoidance. A filesystem "backend" (Dave's term - I
think for the parts of the fs that handle write-back) might allocate
memory; that allocation might block waiting for memory reclaim; memory
reclaim might re-enter the filesystem backend and block on a lock (or
similar) that is held while allocating memory. NOFS breaks that deadlock.

The important thing here isn't the NOFS flag, it is breaking any
possible deadlock.
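
As a rough user-space illustration of that cycle (a Python toy, not
kernel code; ToyFilesystem, fs_lock and the nofs argument are all names
made up for this sketch), a single lock can stand in for whatever the
backend holds while it allocates. In the kernel the flag would come from
GFP_NOFS or the memalloc_nofs_save() scope; here it is just a boolean:

```python
import threading


class ToyFilesystem:
    """Hypothetical model: one lock stands in for a filesystem-internal lock."""

    def __init__(self):
        self.fs_lock = threading.Lock()
        self.log = []

    def reclaim(self, nofs):
        # Reclaim wants to write back through the filesystem, which needs
        # fs_lock. With NOFS set, reclaim stays out of the filesystem.
        if nofs:
            self.log.append("reclaim skipped fs")
            return True
        # A kernel thread would block here forever. The toy uses a
        # non-blocking acquire so the would-be deadlock is observable.
        if not self.fs_lock.acquire(blocking=False):
            self.log.append("reclaim would deadlock on fs_lock")
            return False
        self.fs_lock.release()
        self.log.append("reclaim wrote back through fs")
        return True

    def backend_alloc(self, nofs):
        # The backend allocates while holding fs_lock, and the allocation
        # falls into reclaim because memory is short.
        with self.fs_lock:
            return self.reclaim(nofs)


fs = ToyFilesystem()
without_nofs = fs.backend_alloc(nofs=False)  # reclaim re-enters the fs
with_nofs = fs.backend_alloc(nofs=True)      # reclaim stays out of the fs
```

The flag matters only because it keeps reclaim away from the lock that
is already held; anything else that breaks the cycle would do as well.
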

Layered filesystems introduce a new complexity. The backend for one
filesystem can call into the front-end of another filesystem. That
front-end is not required to use NOFS and even if we impose
PF_MEMALLOC_NOFS, the front-end might wait for some work-queue action
which doesn't inherit the NOFS flag.

But this doesn't necessarily matter. Calling into the filesystem is not
the problem - blocking waiting for a reply is the problem. It is
blocking that creates deadlocks. So if the backend of one filesystem
hands the work for the front-end of the other filesystem to a separate
thread and doesn't wait for that work to complete, then a deadlock
cannot be introduced.

/dev/loop uses the loop%d workqueue for this. Loop-back NFS hands the
front-end work over to nfsd. The proposed localio implementation uses an
nfslocaliod workqueue for exactly the same task. These remove the
possibility of deadlock and mean that there is no need to pass NOFS
through to the front-end of the backing filesystem.
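
The hand-off pattern can be sketched in user space as well (again a
Python toy under assumed names: front_end_worker, backend_writeback and
backend_lock are invented here and only loosely stand in for the
loop%d / nfsd / nfslocaliod contexts). The backend queues the work while
holding its lock and returns without waiting, so the worker may block on
that lock for a while but can never be part of a cycle:

```python
import queue
import threading

work = queue.Queue()
backend_lock = threading.Lock()
events = []


def front_end_worker():
    # Runs front-end work in its own context. It may block on
    # backend_lock, but nobody is waiting on it while holding that lock.
    while True:
        item = work.get()
        if item is None:  # sentinel: no more work
            break
        with backend_lock:
            events.append("front-end handled " + item)


def backend_writeback(item):
    # The backend holds its lock, queues the front-end work, and returns
    # without waiting for the work to complete.
    with backend_lock:
        work.put(item)
        events.append("backend queued " + item)


worker = threading.Thread(target=front_end_worker)
worker.start()
backend_writeback("page-1")
work.put(None)
worker.join()
```

If backend_writeback() instead waited for the worker while still holding
backend_lock, the two would block each other, which is exactly the case
the separate thread is there to avoid.
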

Note that there is a separate question concerning pageout to a swap
file. Pageout needs more than just deadlock avoidance. It needs
guaranteed progress in low memory conditions. It needs PF_MEMALLOC (or
mempools) and that cannot be finessed using work queues. I don't think
that Linux is able to support pageout through layered filesystems.
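
To make "guaranteed progress" a little more concrete, here is a toy
model of the reserve idea behind mempools (a Python sketch only;
ToyMempool, failing_alloc and the sizes are invented for illustration
and this is not the kernel's mempool implementation). A few objects are
set aside up front, so an allocation can still succeed when the ordinary
allocator cannot:

```python
class ToyMempool:
    """Hypothetical reserve pool: min_nr objects are set aside up front."""

    def __init__(self, min_nr, make):
        self.min_nr = min_nr
        self.reserve = [make() for _ in range(min_nr)]

    def alloc(self, normal_alloc):
        # Try the ordinary allocator first and fall back to the reserve
        # only when it fails, so progress does not depend on reclaim.
        obj = normal_alloc()
        if obj is None and self.reserve:
            obj = self.reserve.pop()
        return obj

    def free(self, obj):
        # Refill the reserve before giving anything back to the system.
        if len(self.reserve) < self.min_nr:
            self.reserve.append(obj)


def failing_alloc():
    # Simulates an allocator that cannot succeed under memory pressure.
    return None


pool = ToyMempool(min_nr=2, make=lambda: bytearray(8))
a = pool.alloc(failing_alloc)
b = pool.alloc(failing_alloc)
c = pool.alloc(failing_alloc)  # reserve exhausted: the toy returns None
pool.free(a)
d = pool.alloc(failing_alloc)  # the freed object is handed out again
```

Where the toy returns None, a sleeping mempool allocation in the kernel
would instead wait for an element to be returned. The guarantee belongs
to whoever owns the reserve, which is why handing the work to another
thread does not provide it.
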

So while I support loop-back NFS and swap-over-NFS, I don't support them
in combination. We don't support swap on /dev/loop when it is backed by
a file - for that we have swap-to-file.

Thank you for challenging me on this - it helped me clarify my thoughts
and understanding for myself.

NeilBrown