From: Johannes Weiner <hannes@cmpxchg.org>
To: Yosry Ahmed <yosryahmed@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Nhat Pham <nphamcs@gmail.com>, Chris Li <chrisl@kernel.org>,
Chengming Zhou <zhouchengming@bytedance.com>,
Huang Ying <ying.huang@intel.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] mm: zswap: remove unnecessary tree cleanups in zswap_swapoff()
Date: Tue, 23 Jan 2024 10:38:51 -0500 [thread overview]
Message-ID: <20240123153851.GA1745986@cmpxchg.org> (raw)
In-Reply-To: <CAJD7tkaATS48HVuBfbOmPM3EvRUoPFr66WhF64UC4FkyVH5exg@mail.gmail.com>
On Mon, Jan 22, 2024 at 12:39:16PM -0800, Yosry Ahmed wrote:
> On Mon, Jan 22, 2024 at 12:19 PM Johannes Weiner <hannes@cmpxchg.org> wrote:
> >
> > On Sat, Jan 20, 2024 at 02:40:07AM +0000, Yosry Ahmed wrote:
> > > During swapoff, try_to_unuse() makes sure that zswap_invalidate() is
> > > called for all swap entries before zswap_swapoff() is called. This means
> > > that all zswap entries should already be removed from the tree. Simplify
> > > zswap_swapoff() by removing the tree cleanup loop, and leaving an
> > > assertion in its place.
> > >
> > > Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
> >
> > Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> >
> > That's a great simplification.
> >
> > Removing the tree->lock made me double take, but at this point the
> > swapfile and its cache should be fully dead and I don't see how any of
> > the zswap operations that take tree->lock could race at this point.
>
> It took me a while staring at the code to realize this loop is pointless.
>
> However, while I have your attention on the swapoff path, there's a
> slightly irrelevant problem that I think might be there, but I am not
> sure.
>
> It looks to me like swapoff can race with writeback, and there may be
> a chance of UAF for the zswap tree. For example, if zswap_swapoff()
> races with shrink_memcg_cb(), I feel like we may free the tree as it
> is being used. For example if zswap_swapoff()->kfree(tree) happen
> right before shrink_memcg_cb()->list_lru_isolate(l, item).
>
> Please tell me that I am being paranoid and that there is some
> protection against zswap writeback racing with swapoff. It feels like
> we are very careful with zswap entries refcounting, but not with the
> zswap tree itself.
Hm, I don't see how.
Writeback operates on entries from the LRU. By the time
zswap_swapoff() is called, try_to_unuse() -> zswap_invalidate() should
will have emptied out the LRU and tree.
Writeback could have gotten a refcount to the entry and dropped the
tree->lock. But then it does __read_swap_cache_async(), and while
holding the page lock checks the tree under lock once more; if that
finds the entry valid, it means try_to_unuse() hasn't started on this
page yet, and would be held up by the page lock/writeback state.
> > > Chengming, Chris, I think this should make the tree split and the xarray
> > > conversion patches simpler (especially the former). If others agree,
> > > both changes can be rebased on top of this.
> >
> > The resulting code is definitely simpler, but this patch is not a
> > completely trivial cleanup, either. If you put it before Chengming's
> > patch and it breaks something, it would be difficult to pull out
> > without affecting the tree split.
>
> Are you suggesting I rebase this on top of Chengming's patches? I can
> definitely do this, but the patch will be slightly less
> straightforward, and if the tree split patches break something it
> would be difficult to pull out as well. If you feel like this patch is
> more likely to break things, I can rebase.
Yeah I think it's more subtle. I'd only ask somebody to rebase an
already tested patch on a newer one if the latter were an obvious,
low-risk, prep-style patch. Your patch is good, but it doesn't quite
fit into this particular category, so I'd say no jumping the queue ;)
next prev parent reply other threads:[~2024-01-23 15:39 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-20 2:40 [PATCH 0/2] mm: zswap: simplify zswap_swapoff() Yosry Ahmed
2024-01-20 2:40 ` [PATCH 1/2] mm: swap: update inuse_pages after all cleanups are done Yosry Ahmed
2024-01-22 13:17 ` Chengming Zhou
2024-01-23 8:59 ` Huang, Ying
2024-01-23 9:40 ` Yosry Ahmed
2024-01-23 9:54 ` Yosry Ahmed
2024-01-24 3:13 ` Huang, Ying
2024-01-24 3:20 ` Yosry Ahmed
2024-01-24 3:27 ` Huang, Ying
2024-01-24 4:15 ` Yosry Ahmed
2024-01-20 2:40 ` [PATCH 2/2] mm: zswap: remove unnecessary tree cleanups in zswap_swapoff() Yosry Ahmed
2024-01-22 13:13 ` Chengming Zhou
2024-01-22 20:19 ` Johannes Weiner
2024-01-22 20:39 ` Yosry Ahmed
2024-01-23 15:38 ` Johannes Weiner [this message]
2024-01-23 15:54 ` Yosry Ahmed
2024-01-23 20:12 ` Johannes Weiner
2024-01-23 21:02 ` Yosry Ahmed
2024-01-24 6:57 ` Yosry Ahmed
2024-01-25 5:28 ` Chris Li
2024-01-25 7:59 ` Yosry Ahmed
2024-01-25 18:55 ` Chris Li
2024-01-25 20:57 ` Yosry Ahmed
2024-01-25 22:31 ` Chris Li
2024-01-25 22:33 ` Yosry Ahmed
2024-01-26 1:09 ` Chris Li
2024-01-24 7:20 ` Chengming Zhou
2024-01-25 5:44 ` Chris Li
2024-01-25 8:01 ` Yosry Ahmed
2024-01-25 19:03 ` Chris Li
2024-01-25 21:01 ` Yosry Ahmed
2024-01-25 7:53 ` Yosry Ahmed
2024-01-25 8:03 ` Yosry Ahmed
2024-01-25 8:30 ` Chengming Zhou
2024-01-25 8:42 ` Yosry Ahmed
2024-01-25 8:52 ` Chengming Zhou
2024-01-25 9:03 ` Yosry Ahmed
2024-01-25 9:22 ` Chengming Zhou
2024-01-25 9:26 ` Yosry Ahmed
2024-01-25 9:38 ` Chengming Zhou
2024-01-26 0:03 ` Chengming Zhou
2024-01-26 0:05 ` Yosry Ahmed
2024-01-26 0:10 ` Chengming Zhou
2024-01-23 20:30 ` Nhat Pham
2024-01-23 21:04 ` Yosry Ahmed
2024-01-22 21:21 ` Nhat Pham
2024-01-22 22:31 ` Chris Li
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240123153851.GA1745986@cmpxchg.org \
--to=hannes@cmpxchg.org \
--cc=akpm@linux-foundation.org \
--cc=chrisl@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=nphamcs@gmail.com \
--cc=ying.huang@intel.com \
--cc=yosryahmed@google.com \
--cc=zhouchengming@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.