From: Kiryl Shutsemau <kirill@shutemov.name>
To: Ihor Solodrai <ihor.solodrai@linux.dev>
Cc: Borislav Petkov <bp@alien8.de>, Thomas Gleixner <tglx@kernel.org>,
	 Ingo Molnar <mingo@redhat.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org,  "H. Peter Anvin" <hpa@zytor.com>,
	Andrey Konovalov <andreyknvl@gmail.com>,
	 Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	 Daniel Borkmann <daniel@iogearbox.net>,
	Eduard Zingerman <eddyz87@gmail.com>,
	 Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	Andrey Ryabinin <ryabinin.a.a@gmail.com>,
	 Andrew Morton <akpm@linux-foundation.org>,
	bpf@vger.kernel.org, kasan-dev@googlegroups.com,
	 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1] kasan: Fix false-positive wild-memory-access on x86 under 5-level paging
Date: Fri, 12 Jun 2026 17:30:02 +0100
Message-ID: <aiwxWk0gOV4ZlKcT@thinkstation>
In-Reply-To: <20260610175651.647515-1-ihor.solodrai@linux.dev>

On Wed, Jun 10, 2026 at 10:56:51AM -0700, Ihor Solodrai wrote:
> On x86_64 with 5-level paging (LA57) and inline generic KASAN, the
> following flaky splat may be observed on boot:
> 
>     BUG: KASAN: wild-memory-access in do_raw_spin_lock+0xcf/0x260
>     Write of size 4 at addr ff110001000c90b8 by task swapper/0/0
> 
>     CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 7.1.0-rc5-gcba33e0b2907 #1 PREEMPT(full)
>     Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
>     Call Trace:
>      <IRQ>
>      dump_stack_lvl+0x54/0x70
>      kasan_report+0x117/0x150
>      ? do_raw_spin_lock+0xcf/0x260
>      kasan_check_range+0x264/0x2c0
>      do_raw_spin_lock+0xcf/0x260
>      handle_edge_irq+0x35/0x770
>      ? do_raw_spin_unlock+0x51/0x2a0
>      __common_interrupt+0xae/0x120
>      common_interrupt+0x7c/0x90
>      </IRQ>
>      <TASK>
>      asm_common_interrupt+0x26/0x40
>     RIP: 0010:identify_cpu+0x2b2/0x3460
>     Code: 00 41 c7 07 00 00 00 00 4d 89 e6 49 c1 ee 03 43 0f b6 04 06 84 c0 0f 85 a3 1c 00 00 41 c7 04 24 00 00 00 00 31 c0 31 c9 0f a2 <89> c7 42 0f b6 44 05 00 84 c0 0f 85 ad 1c 00 00 41 89 3f 48 8b 44
>     RSP: 0000:ffffffff97807df0 EFLAGS: 00000246
>     RAX: 0000000000000020 RBX: 00000000756e6547 RCX: 000000006c65746e
>     RDX: 0000000049656e69 RSI: 0000000000000000 RDI: ffffffff98632fd8
>     RBP: 1ffffffff30c65fc R08: dffffc0000000000 R09: 0000000000000004
>     R10: ffffffff98632fc4 R11: fffffbfff30c65fb R12: ffffffff98633050
>     R13: ffffffff98633048 R14: 1ffffffff30c660a R15: ffffffff98632fe0
>      identify_boot_cpu+0xd/0xd0
>      arch_cpu_finalize_init+0x24/0x1f0
>      start_kernel+0x31e/0x3e0
>      x86_64_start_reservations+0x24/0x30
>      x86_64_start_kernel+0x13a/0x140
>      common_startup_64+0x12c/0x137
>      </TASK>
> 
> It fires very early in boot. If kasan_multi_shot is set, the reports
> are non-fatal and keep repeating, and the boot CPU wedges before
> userspace is reached. The accessed addresses are valid 5-level kernel
> pointers, so the report is a false positive.
> 
> The root cause is in generic KASAN not seeing
> cpu_feature_enabled(X86_FEATURE_LA57) set, because the bit is cleared
> in identify_cpu() when the offending interrupt happens [1]:
> 
>   memset(&c->x86_capability, 0, ...);   /* clears X86_FEATURE_LA57 */
>   ...
>   get_cpu_cap(c);                       /* re-reads CPUID, restores it */
> 
> addr_has_metadata() then uses the 4-level threshold, and 5-level
> kernel addresses fall below it, so kasan_check_range() reports them as
> wild-memory-access.
> 
> Define USE_EARLY_PGTABLE_L5 in mm/kasan/generic.c so
> addr_has_metadata() uses the stable variable, as
> arch/x86/mm/kasan_init_64.c already does.
> 

I'd rather not push USE_EARLY_PGTABLE_L5 into generic KASAN code.

It's an x86 paging detail in arch-independent files. It's incomplete
(report.c and report_generic.c also call addr_has_metadata()). And it's
a permanent slowdown on the KASAN hot path -- pgtable_l5_enabled()
becomes a runtime load of __pgtable_l5_enabled on every check, whereas
cpu_feature_enabled() gets patched to a constant after alternatives.

And it leaves the real bug in place: the window where
boot_cpu_data.x86_capability reads back zero is visible to *any*
cpu_feature_enabled() caller in interrupt context, not just KASAN.
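
To make the failure mode concrete, here is a minimal userspace sketch of
the threshold check. It is not kernel code: model_addr_has_metadata() and
lowest_tracked_addr() are hypothetical names, and the 47/56-bit shifts are
assumed from the x86_64 4-level and 5-level layouts rather than quoted
from this thread. It evaluates the address from the splat with the LA57
bit seen as set and as cleared:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed x86_64 virtual address shifts (not quoted in this thread). */
#define VA_SHIFT_4LEVEL 47
#define VA_SHIFT_5LEVEL 56

/* Lowest kernel address that would have shadow metadata for a given shift. */
static uint64_t lowest_tracked_addr(unsigned int va_shift)
{
	assert(va_shift < 64);
	return ~UINT64_C(0) << va_shift;
}

/*
 * Hypothetical model of addr_has_metadata(): the bound depends on the
 * LA57 bit as observed at the moment of the call.
 */
static bool model_addr_has_metadata(uint64_t addr, bool la57_seen)
{
	unsigned int shift = la57_seen ? VA_SHIFT_5LEVEL : VA_SHIFT_4LEVEL;

	return addr >= lowest_tracked_addr(shift);
}
```

With la57_seen == true, ff110001000c90b8 is accepted. With
la57_seen == false it falls below the 4-level bound, which matches the
wild-memory-access case in the report.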

The window is opened by identify_cpu() itself, so fix it there:

	diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
	--- a/arch/x86/kernel/cpu/common.c
	+++ b/arch/x86/kernel/cpu/common.c
	@@ -2003,6 +2003,7 @@ static void generic_identify(struct cpuinfo_x86 *c)
	  */
	 static void identify_cpu(struct cpuinfo_x86 *c)
	 {
	+	unsigned long flags;
	 	int i;
	@@ -2022,12 +2023,21 @@ static void identify_cpu(struct cpuinfo_x86 *c)
	 	c->x86_cache_alignment = c->x86_clflush_size;
	+
	+	/*
	+	 * x86_capability is cleared and repopulated from CPUID below. On
	+	 * the boot CPU this runs with IRQs on and before alternatives are
	+	 * patched, so cpu_feature_enabled() reads the live bits; an
	+	 * interrupt in this window sees e.g. X86_FEATURE_LA57 as disabled.
	+	 */
	+	local_irq_save(flags);
	 	memset(&c->x86_capability, 0, sizeof(c->x86_capability));
	 #ifdef CONFIG_X86_VMX_FEATURE_NAMES
	 	memset(&c->vmx_capability, 0, sizeof(c->vmx_capability));
	 #endif
	 	generic_identify(c);
	+	local_irq_restore(flags);

local_irq_save()/local_irq_restore(), rather than an unconditional
disable/enable pair, keeps it correct for the secondary-CPU callers that
already run with IRQs off.
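
A small userspace model of why the saved state matters, as a sketch only:
irqs_enabled is a hypothetical stand-in for the CPU interrupt flag, and the
model_*() helpers are invented for illustration, not kernel APIs. The
function returns the interrupt state after the critical section, which
should equal the state on entry for both kinds of caller:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CPU interrupt-enable flag. */
static bool irqs_enabled = true;

/* Model of local_irq_save(): remember the old state, then disable. */
static bool model_irq_save(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

/* Model of local_irq_restore(): return to whatever state was saved. */
static void model_irq_restore(bool flags)
{
	irqs_enabled = flags;
}

/*
 * Model of the guarded region in identify_cpu(). Returns the interrupt
 * state observed after the region has been left.
 */
static bool model_identify(bool irqs_on_at_entry)
{
	bool flags;

	irqs_enabled = irqs_on_at_entry;

	flags = model_irq_save();
	/* memset() + generic_identify() would run here. */
	assert(!irqs_enabled);
	model_irq_restore(flags);

	return irqs_enabled;
}
```

model_identify(true) models the boot CPU and ends with interrupts on again;
model_identify(false) models a secondary CPU and ends with interrupts still
off. An unconditional enable at the end would turn them on in the second
case.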

I reproduced your splat with parallel TCG guests (-cpu max, kasan_multi_shot):
~10% of boots hit it without the change, and 0 of ~200 with it.

I am not sure how wide the irq-off window is supposed to be. I scoped it to
memset() .. generic_identify(), where LA57 is restored. Later code
(apply_forced_caps(), ->c_init(), setup_sm*p()) only refines bits.

Widen it to be defensive, or keep it tight?

Any better solution?

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

