From: Yosry Ahmed <yosry.ahmed@linux.dev>
To: Nhat Pham <nphamcs@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
akpm@linux-foundation.org, chengming.zhou@linux.dev,
linux-mm@kvack.org, kernel-team@meta.com,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] zswap: do not crash the kernel on decompression failure
Date: Thu, 27 Feb 2025 00:00:27 +0000 [thread overview]
Message-ID: <Z7-rG7Y3DK33MyCl@google.com> (raw)
In-Reply-To: <CAKEwX=PR3tJM4X00hSua-w-FNR_ZwQ1oRqdT2Cgj_FV9cCUing@mail.gmail.com>
On Wed, Feb 26, 2025 at 03:20:13PM -0800, Nhat Pham wrote:
> On Wed, Feb 26, 2025 at 7:33 AM Yosry Ahmed <yosry.ahmed@linux.dev> wrote:
> >
> > On Tue, Feb 25, 2025 at 11:57:27PM -0500, Johannes Weiner wrote:
> > > On Wed, Feb 26, 2025 at 03:12:35AM +0000, Yosry Ahmed wrote:
> > > > On Tue, Feb 25, 2025 at 01:32:00PM -0800, Nhat Pham wrote:
> > > > > Currently, we crash the kernel when a decompression failure occurs in
> > > > > zswap (either because of memory corruption, or a bug in the compression
> > > > > algorithm). This is overkill. We should only SIGBUS the unfortunate
> > > > > process asking for the zswap entry on zswap load, and skip the corrupted
> > > > > entry in zswap writeback.
> > > >
> > > > Some relevant observations/questions, but not really actionable for this
> > > > patch, perhaps some future work, or more likely some incoherent
> > > > illogical thoughts :
> > > >
> > > > (1) It seems like not making the folio uptodate will cause shmem faults
> > > > to mark the swap entry as hwpoisoned, but I don't see similar handling
> > > > for do_swap_page(). So it seems like even if we SIGBUS the process,
> > > > other processes mapping the same page could follow in the same
> > > > footsteps.
> > >
> > > It's analogous to what __end_swap_bio_read() does for block backends,
> > > so it's hitchhiking on the standard swap protocol for read failures.
> >
> > Right, that's also how I got the idea when I did the same for large
> > folios handling.
>
> And your handling of the large folio (along with the comment in the
> other thread) was how I got the idea for this patch :)
>
> >
> > >
> > > The page sticks around if there are other users. It can get reclaimed,
> > > but since it's not marked dirty, it won't get overwritten. Another
> > > access will either find it in the swapcache and die on !uptodate; if
> > > it was reclaimed, it will attempt another decompression. If all
> > > references have been killed, zswap_invalidate() will finally drop it.
> > >
> > > Swapoff actually poisons the page table as well (unuse_pte).
> >
> > Right. My question was basically why don't we also poison the page table
> > in do_swap_page() in this case. It's like that we never swapoff.
>
> That would require a rmap walk right? To also poison the other PTEs
> that point to the faulty (z)swap entry?
>
> Or am I misunderstanding your point :)
Oh I meant why not just mark the entry where the fault happened as
poisoned at least. Finding other PTEs that point to the swap entry is a
different story. I don't think we can even use the rmap here.
next prev parent reply other threads:[~2025-02-27 0:00 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-25 21:32 [PATCH] zswap: do not crash the kernel on decompression failure Nhat Pham
2025-02-25 22:25 ` Andrew Morton
2025-02-25 22:28 ` Nhat Pham
2025-02-26 0:51 ` Johannes Weiner
2025-02-26 2:45 ` Yosry Ahmed
2025-02-26 3:57 ` Johannes Weiner
2025-02-26 15:42 ` Yosry Ahmed
2025-02-26 23:39 ` Nhat Pham
2025-02-26 2:40 ` Yosry Ahmed
2025-02-26 23:16 ` Nhat Pham
2025-02-27 0:03 ` Yosry Ahmed
2025-02-26 3:12 ` Yosry Ahmed
2025-02-26 4:57 ` Johannes Weiner
2025-02-26 15:33 ` Yosry Ahmed
2025-02-26 23:20 ` Nhat Pham
2025-02-27 0:00 ` Yosry Ahmed [this message]
2025-02-26 23:29 ` Nhat Pham
2025-02-26 23:58 ` Yosry Ahmed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z7-rG7Y3DK33MyCl@google.com \
--to=yosry.ahmed@linux.dev \
--cc=akpm@linux-foundation.org \
--cc=chengming.zhou@linux.dev \
--cc=hannes@cmpxchg.org \
--cc=kernel-team@meta.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=nphamcs@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.