From: Johannes Weiner <hannes@cmpxchg.org>
To: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Cc: sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com,
akpm@linux-foundation.org, nphamcs@gmail.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: Re: [PATCH] mm: zswap: fix potential memory corruption on duplicate store
Date: Mon, 25 Sep 2023 08:25:13 -0400 [thread overview]
Message-ID: <20230925122513.GA347250@cmpxchg.org> (raw)
In-Reply-To: <CA+CLi1hT30jtGGVwWh8LBoLq3ijRoYdxiMB301Jc97Z9=JLHGA@mail.gmail.com>
On Mon, Sep 25, 2023 at 10:36:06AM +0200, Domenico Cerasuolo wrote:
> On Fri, Sep 22, 2023 at 7:42 PM Johannes Weiner <hannes@cmpxchg.org> wrote:
> >
> > On Fri, Sep 22, 2023 at 07:22:11PM +0200, Domenico Cerasuolo wrote:
> > > While stress-testing zswap a memory corruption was happening when writing
> > > back pages. __frontswap_store used to check for duplicate entries before
> > > attempting to store a page in zswap, this was because if the store fails
> > > the old entry isn't removed from the tree. This change removes duplicate
> > > entries in zswap_store before the actual attempt.
> > >
> > > Based on commit ce9ecca0238b ("Linux 6.6-rc2")
> > >
> > > Fixes: 42c06a0e8ebe ("mm: kill frontswap")
> > > Signed-off-by: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
> >
> > Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> >
> > > @@ -1218,6 +1218,19 @@ bool zswap_store(struct folio *folio)
> > > if (!zswap_enabled || !tree)
> > > return false;
> > >
> > > + /*
> > > + * If this is a duplicate, it must be removed before attempting to store
> > > + * it, otherwise, if the store fails the old page won't be removed from
> > > + * the tree, and it might be written back overriding the new data.
> > > + */
> > > + spin_lock(&tree->lock);
> > > + dupentry = zswap_rb_search(&tree->rbroot, offset);
> > > + if (dupentry) {
> > > + zswap_duplicate_entry++;
> > > + zswap_invalidate_entry(tree, dupentry);
> > > + }
> > > + spin_unlock(&tree->lock);
> >
> > Do we still need the dupe handling at the end of the function then?
> >
> > The dupe store happens because a page that's already in swapcache has
> > changed and we're trying to swap_writepage() it again with new data.
> >
> > But the page is locked at this point, pinning the swap entry. So even
> > after the tree lock is dropped I don't see how *another* store to the
> > tree at this offset could occur while we're compressing.
>
> My reasoning here was that frontswap used to invalidate a dupe right before
> calling store(), so I thought that the check at the end of zswap_store() was
> actually needed.
Ok the git history is actually really enlightening of how this came to
be. Initially, frontswap didn't invalidate. Only zswap did. Then
someone ran into exactly the scenario that you encountered:
commit fb993fa1a2f669215fa03a09eed7848f2663e336
Author: Weijie Yang <weijie.yang@samsung.com>
Date: Tue Dec 2 15:59:25 2014 -0800
mm: frontswap: invalidate expired data on a dup-store failure
If a frontswap dup-store failed, it should invalidate the expired page
in the backend, or it could trigger some data corruption issue.
Such as:
1. use zswap as the frontswap backend with writeback feature
2. store a swap page(version_1) to entry A, success
3. dup-store a newer page(version_2) to the same entry A, fail
4. use __swap_writepage() write version_2 page to swapfile, success
5. zswap do shrink, writeback version_1 page to swapfile
6. version_2 page is overwrited by version_1, data corrupt.
This patch fixes this issue by invalidating expired data immediately
when meet a dup-store failure.
This split the invalidation duty: On store success, zswap would
invalidate. On store failure, frontswap would.
Then this patch moved the invalidate in frontswap to before the store:
commit d1dc6f1bcf1e998e7ce65fc120da371ab047a999
Author: Dan Streetman <ddstreet@ieee.org>
Date: Wed Jun 24 16:58:18 2015 -0700
frontswap: allow multiple backends
at which point the after-store invalidate in zswap became unnecessary.
> Since we acquire the tree lock anyway to add the new entry to the LRU, wouldn't
> checking the result of zswap_rb_insert be a very cheap sanity check? We could
> treat it as an error and fail the store(), or just add a warning and still
> invalidate the dupe?
Yeah, it's less about performance and more about code clarity.
A warning probably makes the most sense. It gives us the sanity check
and an opportunity to document expectations wrt the swapcache, while
keeping the code resilient against data corruption.
next prev parent reply other threads:[~2023-09-25 12:25 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-22 17:22 [PATCH] mm: zswap: fix potential memory corruption on duplicate store Domenico Cerasuolo
2023-09-22 17:42 ` Johannes Weiner
2023-09-25 8:36 ` Domenico Cerasuolo
2023-09-25 12:25 ` Johannes Weiner [this message]
2023-09-22 18:17 ` Nhat Pham
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230925122513.GA347250@cmpxchg.org \
--to=hannes@cmpxchg.org \
--cc=akpm@linux-foundation.org \
--cc=cerasuolodomenico@gmail.com \
--cc=ddstreet@ieee.org \
--cc=kernel-team@meta.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=nphamcs@gmail.com \
--cc=sjenning@redhat.com \
--cc=vitaly.wool@konsulko.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox