From: Chengming Zhou <chengming.zhou@linux.dev>
To: Yosry Ahmed <yosryahmed@google.com>, Barry Song <21cnbao@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
Andrew Morton <akpm@linux-foundation.org>,
Zhongkun He <hezhongkun.hzk@bytedance.com>,
Chengming Zhou <zhouchengming@bytedance.com>,
Chris Li <chrisl@kernel.org>, Nhat Pham <nphamcs@gmail.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Kairui Song <kasong@tencent.com>
Subject: Re: [PATCH] mm: zswap: fix data loss on SWP_SYNCHRONOUS_IO devices
Date: Mon, 25 Mar 2024 15:33:20 +0800
Message-ID: <1e7ce417-b9dd-4d62-9f54-0adf1ccdae35@linux.dev>
In-Reply-To: <CAJD7tka5K69q20bxTsBk38JC7mdPr3UsxXpsnggDO_iQA=qxug@mail.gmail.com>

On 2024/3/25 15:06, Yosry Ahmed wrote:
> On Sun, Mar 24, 2024 at 9:54 PM Barry Song <21cnbao@gmail.com> wrote:
>>
>> On Mon, Mar 25, 2024 at 10:23 AM Yosry Ahmed <yosryahmed@google.com> wrote:
>>>
>>> On Sun, Mar 24, 2024 at 2:04 PM Johannes Weiner <hannes@cmpxchg.org> wrote:
>>>>
>>>> Zhongkun He reports data corruption when combining zswap with zram.
>>>>
>>>> The issue is the exclusive loads we're doing in zswap. They assume
>>>> that all reads are going into the swapcache, which can assume
>>>> authoritative ownership of the data and so the zswap copy can go.
>>>>
>>>> However, zram files are marked SWP_SYNCHRONOUS_IO, and faults will try
>>>> to bypass the swapcache. This results in an optimistic read of the
>>>> swap data into a page that will be dismissed if the fault fails due to
>>>> races. In this case, zswap mustn't drop its authoritative copy.
>>>>
>>>> Link: https://lore.kernel.org/all/CACSyD1N+dUvsu8=zV9P691B9bVq33erwOXNTmEaUbi9DrDeJzw@mail.gmail.com/
>>>> Reported-by: Zhongkun He <hezhongkun.hzk@bytedance.com>
>>>> Fixes: b9c91c43412f ("mm: zswap: support exclusive loads")
>>>> Cc: stable@vger.kernel.org [6.5+]
>>>> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
>>>> Tested-by: Zhongkun He <hezhongkun.hzk@bytedance.com>
>>
>> Acked-by: Barry Song <baohua@kernel.org>
>>
>>>
>>> Do we also want to mention somewhere (commit log or comment) that
>>> keeping the entry in the tree is fine because we are still protected
>>> from concurrent loads/invalidations/writeback by swapcache_prepare()
>>> setting SWAP_HAS_CACHE or so?
>>
>> It seems that Kairui's patch comprehensively addresses the issue at hand.
>> Johannes's solution, on the other hand, appears to align zswap behavior
>> more closely with that of a traditional swap device, only releasing an entry
>> when the corresponding swap slot is freed, particularly in the sync-io case.
>
> It actually worked out quite well that Kairui's fix landed shortly
> before this bug was reported, as this fix wouldn't have been possible
> without it as far as I can tell.
>
>>
>> Johannes' patch has inspired me to consider whether zRAM could achieve
>> a comparable outcome by immediately releasing objects in swap cache
>> scenarios. When I have the opportunity, I plan to experiment with zRAM.
>
> That would be interesting. I am curious if it would be as
> straightforward in zram to just mark the folio as dirty in this case
> like zswap does, given its implementation as a block device.
>
This makes me wonder: who is responsible for marking the folio dirty in this
swapcache bypass case? Should we call folio_mark_dirty() after swap_read_folio()?
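
For readers following the thread, the failure mode described in the quoted
commit message can be illustrated with a toy userspace model. This is only a
sketch under stated assumptions, not kernel code: struct toy_zswap,
toy_zswap_load() and toy_fault_with_retry() are hypothetical names, and the
"exclusive" flag merely stands in for the exclusive-load behavior that the
commit message says must not apply when the fault bypasses the swapcache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (hypothetical): one swap slot whose only copy lives in the store. */
struct toy_zswap {
	bool has_entry;	/* does the store still hold its copy? */
	int data;	/* stand-in for the page contents */
};

/*
 * Copy the stored data into *out. With exclusive set, the store drops its
 * copy after a successful load, on the assumption that the caller now owns
 * the data for good.
 */
static bool toy_zswap_load(struct toy_zswap *z, bool exclusive, int *out)
{
	assert(z != NULL && out != NULL);

	if (!z->has_entry)
		return false;

	*out = z->data;
	if (exclusive)
		z->has_entry = false;
	return true;
}

/*
 * Model a fault that bypasses the swapcache: the first read goes into a
 * private page that is thrown away because the fault "fails due to races",
 * then the fault is retried. Returns true if the retry still finds the data.
 */
static bool toy_fault_with_retry(struct toy_zswap *z, bool exclusive)
{
	int page;

	if (!toy_zswap_load(z, exclusive, &page))
		return false;

	/* The fault is abandoned here; the private copy in page is discarded. */

	return toy_zswap_load(z, exclusive, &page);
}
```

In this model, calling toy_fault_with_retry() with exclusive set loses the
data on the retry, because nothing else took ownership of the first read.
With exclusive cleared, the store keeps its copy and the retry succeeds.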