From: Yosry Ahmed <yosryahmed@google.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Zhongkun He <hezhongkun.hzk@bytedance.com>,
Chengming Zhou <zhouchengming@bytedance.com>,
Barry Song <21cnbao@gmail.com>, Chris Li <chrisl@kernel.org>,
Nhat Pham <nphamcs@gmail.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: zswap: fix data loss on SWP_SYNCHRONOUS_IO devices
Date: Mon, 25 Mar 2024 11:41:05 -0700
Message-ID: <CAJD7tkakaLzB7TU9kDRLGTCUJ-WdkSTSt1z4eZR5vUfS3-n+ew@mail.gmail.com>
In-Reply-To: <20240325163003.GA42450@cmpxchg.org>

On Mon, Mar 25, 2024 at 9:30 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> On Sun, Mar 24, 2024 at 02:22:46PM -0700, Yosry Ahmed wrote:
> > On Sun, Mar 24, 2024 at 2:04 PM Johannes Weiner <hannes@cmpxchg.org> wrote:
> > >
> > > Zhongkun He reports data corruption when combining zswap with zram.
> > >
> > > The issue is the exclusive loads we're doing in zswap. They assume
> > > that all reads are going into the swapcache, which can assume
> > > authoritative ownership of the data and so the zswap copy can go.
> > >
> > > However, zram files are marked SWP_SYNCHRONOUS_IO, and faults will try
> > > to bypass the swapcache. This results in an optimistic read of the
> > > swap data into a page that will be dismissed if the fault fails due to
> > > races. In this case, zswap mustn't drop its authoritative copy.
> > >
> > > Link: https://lore.kernel.org/all/CACSyD1N+dUvsu8=zV9P691B9bVq33erwOXNTmEaUbi9DrDeJzw@mail.gmail.com/
> > > Reported-by: Zhongkun He <hezhongkun.hzk@bytedance.com>
> > > Fixes: b9c91c43412f ("mm: zswap: support exclusive loads")
> > > Cc: stable@vger.kernel.org [6.5+]
> > > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > > Tested-by: Zhongkun He <hezhongkun.hzk@bytedance.com>
> >
> > Do we also want to mention somewhere (commit log or comment) that
> > keeping the entry in the tree is fine because we are still protected
> > from concurrent loads/invalidations/writeback by swapcache_prepare()
> > setting SWAP_HAS_CACHE or so?
>
> I don't think it's necessary, as zswap isn't doing anything special
> here. It's up to the caller to follow the generic swap exclusion
> protocol that zswap also adheres to. So IMO the relevant comment
> should be, and is, above that swapcache_prepare() in do_swap_page().

From the perspective of someone looking at the zswap code, it isn't
immediately clear what protects the zswap entry in the non-exclusive
load case from being freed from under us. At some point we had a
refcount, then we used to remove the entry from the tree under the lock
so others wouldn't have access to it. Now it's less clear because we
rely on protection outside of the zswap code.

We also document other places where we rely on the swapcache for
synchronization, so I think it may be worth briefly mentioning this
here as well, especially since this code explicitly checks for the
folio not being in the swapcache. That said, I don't feel strongly
about it. Tracking down the SWP_SYNCHRONOUS_IO code should eventually
make it clear. Also, the commit log will end up having a link to this
thread anyway, so the details are not completely unfindable :)
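
For readers following along, here is a minimal userspace sketch of the
load-side rule being discussed, not the actual patch. All names in it
(toy_zswap, toy_store, toy_load) are hypothetical, and the in_swapcache
flag only stands in for the swapcache check on the folio that the real
code does (folio_test_swapcache(), as far as I understand it). It does
not model locking or the SWAP_HAS_CACHE exclusion from
swapcache_prepare(), which is exactly the part that lives outside zswap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_SLOTS 8
#define TOY_PAGE  16

/* Hypothetical stand-in for a zswap tree: at most one copy per swap slot. */
struct toy_zswap {
	bool present[TOY_SLOTS];
	char data[TOY_SLOTS][TOY_PAGE];
};

/* Keep a copy of a TOY_PAGE-sized buffer for the given slot. */
static void toy_store(struct toy_zswap *z, size_t slot, const char *src)
{
	assert(slot < TOY_SLOTS);
	memcpy(z->data[slot], src, TOY_PAGE);
	z->present[slot] = true;
}

/*
 * Copy the slot's data into dst. Returns false if there is no copy.
 *
 * in_swapcache == true:  the destination is owned by the swapcache, so
 *                        the stored copy is dropped (exclusive load).
 * in_swapcache == false: the destination is a private page that may be
 *                        thrown away if the fault fails, so the stored
 *                        copy stays authoritative and is kept.
 */
static bool toy_load(struct toy_zswap *z, size_t slot, char *dst,
		     bool in_swapcache)
{
	assert(slot < TOY_SLOTS);
	if (!z->present[slot])
		return false;
	memcpy(dst, z->data[slot], TOY_PAGE);
	if (in_swapcache)
		z->present[slot] = false;
	return true;
}
```

In this model, a load with in_swapcache == false can be repeated after a
failed fault and still finds the data, while a load with in_swapcache ==
true leaves nothing behind. What the model cannot show is why nobody
else frees or writes back the kept entry in the meantime; in the kernel
that guarantee comes from the caller, not from zswap.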