Linux-mm Archive on lore.kernel.org
From: Boris Brezillon <boris.brezillon@collabora.com>
To: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Chia-I Wu <olvaffe@gmail.com>,
	Andrey Ryabinin <ryabinin.a.a@gmail.com>,
	Alexander Potapenko <glider@google.com>,
	Andrey Konovalov <andreyknvl@gmail.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	Vincenzo Frascino <vincenzo.frascino@arm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>, Kairui Song <kasong@tencent.com>,
	kasan-dev@googlegroups.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC v2] mm/shmem: set __GFP_SKIP_KASAN for swap_cluster_readahead
Date: Thu, 21 May 2026 10:51:48 +0200
Message-ID: <20260521105148.5497c281@fedora>
In-Reply-To: <aa244c61-b936-4c53-af7c-2f5190867a7f@linux.alibaba.com>

On Thu, 21 May 2026 15:05:21 +0800
Baolin Wang <baolin.wang@linux.alibaba.com> wrote:

> On 5/21/26 1:06 AM, Chia-I Wu wrote:
> > On Wed, May 20, 2026 at 3:04 AM Baolin Wang
> > <baolin.wang@linux.alibaba.com> wrote:  
> >>
> >> CC Kairui,
> >>
> >> On 5/20/26 12:31 PM, Chia-I Wu via B4 Relay wrote:  
> >>> From: Chia-I Wu <olvaffe@gmail.com>
> >>>
> >>> swap_cluster_readahead can allocate folios for other mappings. If the
> >>> gfp flags do not have __GFP_SKIP_KASAN, but the other mappings have
> >>> PROT_MTE, we can end up with false KASAN errors such as
> >>>
> >>>     BUG: KASAN: invalid-access in swap_writepage+0xb0/0x21c
> >>>     Read at addr f5ffff81aa71dff8 by task WM.task-4/6956
> >>>     Pointer tag: [f5], memory tag: [f9]
> >>>
> >>> In the above example, because __GFP_SKIP_KASAN was missing, KASAN set
> >>> both pointer tag and memory tag to 0xf5 when swap_cluster_readahead
> >>> allocated the folio. But userspace had already set the memory tag to
> >>> 0xf9 before the page was swapped out. arch_swap_restore restored the
> >>> memory tag to 0xf9, leading to the mismatch.
> >>>
> >>> Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
> >>> ---
> >>> Changes in v2:
> >>> - set __GFP_SKIP_KASAN for shmem instead of drm/panthor
> >>> - Link to v1: https://patch.msgid.link/20260512-panthor-kasan-v1-1-d8d3e275d71b@gmail.com
> >>> ---
> >>>    mm/shmem.c | 5 +++++
> >>>    1 file changed, 5 insertions(+)
> >>>
> >>> diff --git a/mm/shmem.c b/mm/shmem.c
> >>> index 3b5dc21b323c2..db9130a8c5b76 100644
> >>> --- a/mm/shmem.c
> >>> +++ b/mm/shmem.c
> >>> @@ -1784,6 +1784,11 @@ static struct folio *shmem_swapin_cluster(swp_entry_t swap, gfp_t gfp,
> >>>        pgoff_t ilx;
> >>>        struct folio *folio;
> >>>
> >>> +     /* swap_cluster_readahead might cross the mapping boundary and
> >>> +      * allocate pages for other mappings. We have to skip KASAN.
> >>> +      */
> >>> +     gfp |= __GFP_SKIP_KASAN;
> >>> +
> >>>        mpol = shmem_get_pgoff_policy(info, index, 0, &ilx);
> >>>        folio = swap_cluster_readahead(swap, gfp, mpol, ilx);
> >>>        mpol_cond_put(mpol);  
> >>
> >> If we force __GFP_SKIP_KASAN, would this cause issues for mappings that
> >> explicitly should NOT have the flag? Your v1 link already mentions
> >> this scenario.  
> > We lose the benefits of kasan hw tags (other modes are not affected)
> > by forcing the flag.
> > 
> > The other mappings swap_cluster_readahead can affect are anon
> > mappings, regular shmem mappings, or gpu shmem mappings. I think only
> > gpu shmem mappings miss __GFP_SKIP_KASAN. That might not even be
> > intentional, because gpu shmem mappings pick GFP_HIGHUSER over
> > GFP_HIGHUSER_MOVABLE to avoid __GFP_MOVABLE. That was before
> > __GFP_SKIP_KASAN was added to GFP_HIGHUSER_MOVABLE.  
> 
> It sounds like the right approach would be to explicitly set 
> __GFP_SKIP_KASAN for GPU shmem mappings, no? I think having users 
> explicitly set __GFP_SKIP_KASAN makes the implications clearer than 
> having shmem core set it implicitly.

It's a bit of a shame that we have to explicitly set the
__GFP_SKIP_KASAN flag whenever we select GFP_HIGHUSER, though. It means
a lot of patching in drivers/gpu/drm/, because basically every driver
relying on shmem for its buffer allocation uses GFP_HIGHUSER.

Also, it feels like KASAN poisoning for these pages would be interesting
to have, since we know we won't allow PROT_MTE on userspace mappings
anyway. Oh, and some buffers might even be kernel-only (no mmap()
allowed), which makes them even better candidates for poisoning.

> 
> We could also consider adding a VM_WARN in shmem_swapin_cluster() to 
> detect any mappings missing the __GFP_SKIP_KASAN flag.

If the general consensus is that all shmem-backed allocations must have
__GFP_SKIP_KASAN, yes, it'd make sense to add a VM_WARN.
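
To make the suggested check concrete, here is a minimal userspace sketch of
a warn-once test on the gfp mask. Everything prefixed with model_ or MODEL_
is a hypothetical stand-in: the bit values are made up and do not match the
kernel's, and in shmem_swapin_cluster() itself the check would presumably
be a VM_WARN_ON_ONCE() on the incoming gfp instead of this hand-rolled
helper.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical bit values, for illustration only (not the kernel's). */
#define MODEL_GFP_HIGHUSER   0x1u
#define MODEL_GFP_SKIP_KASAN 0x2u

/* Number of warnings actually emitted (at most one). */
static int model_warn_count;

/*
 * Returns true when the mask lacks the skip-KASAN bit. The warning is
 * printed only the first time, mimicking warn-once semantics.
 */
static bool model_check_swapin_gfp(unsigned int gfp)
{
	static bool warned;
	bool missing = !(gfp & MODEL_GFP_SKIP_KASAN);

	if (missing && !warned) {
		warned = true;
		model_warn_count++;
		fprintf(stderr, "swapin: gfp 0x%x lacks SKIP_KASAN\n", gfp);
	}
	return missing;
}
```

With such a check in place, a mapping set up with only the GFP_HIGHUSER
stand-in would be reported on its first swap-in, while one that also
carries the skip-KASAN bit would pass silently.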

> 
> > I guess what I am trying to say is these are all user pages. We have
> > to skip kasan when user pages can be mapped PROT_MTE. The  
> 
> Yes, regular shmem mappings typically default to GFP_HIGHUSER_MOVABLE, 
> while GPU shmem mappings are a special case.

They are not that special; they are just not MOVABLE because the GPU
might also access the same pages under the hood. If it's assumed that
any page being exposed through mmap() must have __GFP_SKIP_KASAN, why
does GFP_HIGHUSER not have that flag too?

> 
> > justification for gpu shmem mappings is that they cannot be mapped
> > PROT_MTE. But if readahead can affect non-gpu shmem mappings, it seems
> > we have to either force __GFP_SKIP_KASAN or to cap/disable readahead.  

I'm no MM expert, so it's probably me not understanding how this
swap-readahead logic is supposed to work, but the whole idea of using
different flags from those that were requested by the f_mapping seems
fragile. I mean, this comment [1] proves it's not the first time the
problem has been considered, and I'm wondering why __GFP_SKIP_KASAN
should be treated differently from zones. Yes, that's an extra copy if
the SKIP_KASAN flags don't match but the zones do, but in practice, won't
we have GFP_HIGHUSER and GFP_HIGHUSER_MOVABLE in different zones? Or is
the problem that, even with a copy, it's already too late to restore
the tags because they have been overwritten during KASAN unpoisoning?

[1] https://elixir.bootlin.com/linux/v7.0.9/source/mm/shmem.c#L2112
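
For reference, the mismatch from the quoted commit message can be modeled
in a few lines of userspace C. This is only a sketch under stated
assumptions: the model_ helpers and MODEL_ constants are hypothetical, the
flag value is invented, and the only thing taken from KASAN is the idea
that a 0xff pointer tag is a match-all tag that is never checked against
the memory tag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; the flag value is not the kernel's. */
#define MODEL_GFP_SKIP_KASAN 0x1u
#define MODEL_TAG_MATCH_ALL  0xffu /* match-all kernel pointer tag */

struct model_page {
	uint8_t ptr_tag; /* tag carried by the kernel pointer to the page */
	uint8_t mem_tag; /* tag stored in memory for the page */
};

/* Allocation: without the skip flag, KASAN tags pointer and memory alike. */
static struct model_page model_alloc(unsigned int gfp, uint8_t kasan_tag)
{
	struct model_page p;

	if (gfp & MODEL_GFP_SKIP_KASAN) {
		p.ptr_tag = MODEL_TAG_MATCH_ALL;
		p.mem_tag = 0;
	} else {
		p.ptr_tag = kasan_tag;
		p.mem_tag = kasan_tag;
	}
	return p;
}

/* Models arch_swap_restore(): puts back the tag userspace had set. */
static void model_swap_restore(struct model_page *p, uint8_t saved_user_tag)
{
	p->mem_tag = saved_user_tag;
}

/* A kernel access faults unless the pointer tag is match-all or equal. */
static bool model_access_ok(const struct model_page *p)
{
	return p->ptr_tag == MODEL_TAG_MATCH_ALL || p->ptr_tag == p->mem_tag;
}
```

Allocating with tag 0xf5 and then restoring 0xf9 makes model_access_ok()
fail, which corresponds to the "Pointer tag: [f5], memory tag: [f9]"
report. Allocating with the skip flag keeps the pointer at the match-all
tag, so the restored user tag no longer matters to kernel accesses.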

