From: Chengming Zhou <chengming.zhou@linux.dev>
To: Johannes Weiner <hannes@cmpxchg.org>,
Andrew Morton <akpm@linux-foundation.org>
Cc: Zhongkun He <hezhongkun.hzk@bytedance.com>,
Chengming Zhou <zhouchengming@bytedance.com>,
Yosry Ahmed <yosryahmed@google.com>,
Barry Song <21cnbao@gmail.com>, Chris Li <chrisl@kernel.org>,
Nhat Pham <nphamcs@gmail.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: zswap: fix data loss on SWP_SYNCHRONOUS_IO devices
Date: Mon, 25 Mar 2024 08:01:24 +0800 [thread overview]
Message-ID: <dc4481a6-37f9-4648-bcf7-44f54a737523@linux.dev> (raw)
In-Reply-To: <20240324210447.956973-1-hannes@cmpxchg.org>
On 2024/3/25 05:04, Johannes Weiner wrote:
> Zhongkun He reports data corruption when combining zswap with zram.
>
> The issue is the exclusive loads we're doing in zswap. They assume
> that all reads are going into the swapcache, which can assume
> authoritative ownership of the data and so the zswap copy can go.
>
> However, zram files are marked SWP_SYNCHRONOUS_IO, and faults will try
> to bypass the swapcache. This results in an optimistic read of the
> swap data into a page that will be dismissed if the fault fails due to
> races. In this case, zswap mustn't drop its authoritative copy.
>
> Link: https://lore.kernel.org/all/CACSyD1N+dUvsu8=zV9P691B9bVq33erwOXNTmEaUbi9DrDeJzw@mail.gmail.com/
> Reported-by: Zhongkun He <hezhongkun.hzk@bytedance.com>
> Fixes: b9c91c43412f ("mm: zswap: support exclusive loads")
> Cc: stable@vger.kernel.org [6.5+]
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> Tested-by: Zhongkun He <hezhongkun.hzk@bytedance.com>
Very nice solution!
Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev>
Thanks.
> ---
> mm/zswap.c | 23 +++++++++++++++++++----
> 1 file changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 535c907345e0..41a1170f7cfe 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -1622,6 +1622,7 @@ bool zswap_load(struct folio *folio)
> swp_entry_t swp = folio->swap;
> pgoff_t offset = swp_offset(swp);
> struct page *page = &folio->page;
> + bool swapcache = folio_test_swapcache(folio);
> struct zswap_tree *tree = swap_zswap_tree(swp);
> struct zswap_entry *entry;
> u8 *dst;
> @@ -1634,7 +1635,20 @@ bool zswap_load(struct folio *folio)
> spin_unlock(&tree->lock);
> return false;
> }
> - zswap_rb_erase(&tree->rbroot, entry);
> + /*
> + * When reading into the swapcache, invalidate our entry. The
> + * swapcache can be the authoritative owner of the page and
> + * its mappings, and the pressure that results from having two
> + * in-memory copies outweighs any benefits of caching the
> + * compression work.
> + *
> + * (Most swapins go through the swapcache. The notable
> + * exception is the singleton fault on SWP_SYNCHRONOUS_IO
> + * files, which reads into a private page and may free it if
> + * the fault fails. We remain the primary owner of the entry.)
> + */
> + if (swapcache)
> + zswap_rb_erase(&tree->rbroot, entry);
> spin_unlock(&tree->lock);
>
> if (entry->length)
> @@ -1649,9 +1663,10 @@ bool zswap_load(struct folio *folio)
> if (entry->objcg)
> count_objcg_event(entry->objcg, ZSWPIN);
>
> - zswap_entry_free(entry);
> -
> - folio_mark_dirty(folio);
> + if (swapcache) {
> + zswap_entry_free(entry);
> + folio_mark_dirty(folio);
> + }
>
> return true;
> }
next prev parent reply other threads:[~2024-03-25 0:01 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-24 21:04 [PATCH] mm: zswap: fix data loss on SWP_SYNCHRONOUS_IO devices Johannes Weiner
2024-03-24 21:22 ` Yosry Ahmed
2024-03-25 4:54 ` Barry Song
2024-03-25 7:06 ` Yosry Ahmed
2024-03-25 7:33 ` Chengming Zhou
2024-03-25 8:38 ` Yosry Ahmed
2024-03-25 9:22 ` Chengming Zhou
2024-03-25 9:40 ` Yosry Ahmed
2024-03-25 9:46 ` Chengming Zhou
2024-03-25 18:35 ` Yosry Ahmed
2024-03-25 16:30 ` Johannes Weiner
2024-03-25 18:41 ` Yosry Ahmed
2024-03-25 0:01 ` Chengming Zhou [this message]
2024-03-25 3:01 ` [External] " Zhongkun He
2024-03-25 17:09 ` Nhat Pham
2024-03-25 21:27 ` Chris Li
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=dc4481a6-37f9-4648-bcf7-44f54a737523@linux.dev \
--to=chengming.zhou@linux.dev \
--cc=21cnbao@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=chrisl@kernel.org \
--cc=hannes@cmpxchg.org \
--cc=hezhongkun.hzk@bytedance.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=nphamcs@gmail.com \
--cc=yosryahmed@google.com \
--cc=zhouchengming@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.