[PATCH v2] x86/kexec: Disable KCOV instrumentation after load

public inbox for stable@vger.kernel.org

* [PATCH v2] x86/kexec: Disable KCOV instrumentation after load_segments()
@ 2026-03-17 22:03 Aleksandr Nogikh
  2026-03-20 15:56 ` Borislav Petkov
  2026-03-22  0:08 ` Andrew Morton
  0 siblings, 2 replies; 4+ messages in thread
From: Aleksandr Nogikh @ 2026-03-17 22:03 UTC (permalink / raw)
  To: bp, tglx, mingo
  Cc: x86, linux-kernel, dvyukov, kasan-dev, linux-mm, Aleksandr Nogikh,
	stable

The load_segments() function changes segment registers, invalidating
GS base (which KCOV relies on for per-cpu data). When CONFIG_KCOV is
enabled, any subsequent instrumented C code call (e.g.
native_gdt_invalidate()) begins crashing the kernel in an endless
loop.

To reproduce the problem, it's sufficient to do kexec on a
KCOV-instrumented kernel:
$ kexec -l /boot/otherKernel
$ kexec -e

The real-world context for this problem is enabling crash dump
collection in syzkaller. For this, the tool loads a panic kernel
before fuzzing and then calls makedumpfile after the panic. This
workflow requires both CONFIG_KEXEC and CONFIG_KCOV to be enabled
simultaneously.

Adding safeguards directly to the KCOV fast-path
(__sanitizer_cov_trace_pc()) is also undesirable as it would
introduce an extra performance overhead.

Disabling instrumentation for the individual functions would be too
fragile, so let's fix the bug by disabling KCOV instrumentation for
the entire machine_kexec_64.c and physaddr.c. If coverage-guided
fuzzing ever needs these components in the future, we should consider
other approaches.

The problem is not relevant for 32 bit kernels as CONFIG_KCOV is not
supported there.

Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Aleksandr Nogikh <nogikh@google.com>
Cc: stable@vger.kernel.org
---
v2:
Updated the comments to explain the underlying context.

v1:
https://lore.kernel.org/all/20260216173716.2279847-1-nogikh@google.com/
---
 arch/x86/kernel/Makefile | 10 ++++++++++
 arch/x86/mm/Makefile     | 10 ++++++++++
 2 files changed, 20 insertions(+)

diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index e9aeeeafad173..41b1333907ded 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -43,6 +43,16 @@ KCOV_INSTRUMENT_dumpstack_$(BITS).o			:= n
 KCOV_INSTRUMENT_unwind_orc.o				:= n
 KCOV_INSTRUMENT_unwind_frame.o				:= n
 KCOV_INSTRUMENT_unwind_guess.o				:= n
+# Disable KCOV to prevent crashes during kexec: load_segments() invalidates
+# the GS base, which KCOV relies on for per-CPU data.
+# As KCOV && KEXEC compatibility should be preserved (e.g. syzkaller is
+# using it to collect crash dumps during kernel fuzzing), we could either
+# selectively disable KCOV instrumentation, which can be fragile, or add
+# more checks to KCOV, which would slow it down.
+# As a compromise solution, let's disable KCOV instrumentation for the
+# whole file. If its coverage is ever needed, we should consider other
+# approaches.
+KCOV_INSTRUMENT_machine_kexec_64.o			:= n
 
 CFLAGS_head32.o := -fno-stack-protector
 CFLAGS_head64.o := -fno-stack-protector
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 5b9908f13dcfd..ea3a31b54e49e 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -4,6 +4,16 @@ KCOV_INSTRUMENT_tlb.o			:= n
 KCOV_INSTRUMENT_mem_encrypt.o		:= n
 KCOV_INSTRUMENT_mem_encrypt_amd.o	:= n
 KCOV_INSTRUMENT_pgprot.o		:= n
+# Disable KCOV to prevent crashes during kexec: load_segments() invalidates
+# the GS base, which KCOV relies on for per-CPU data.
+# As KCOV && KEXEC compatibility should be preserved (e.g. syzkaller is
+# using it to collect crash dumps during kernel fuzzing), we could either
+# selectively disable KCOV instrumentation, which can be fragile, or add
+# more checks to KCOV, which would slow it down.
+# As a compromise solution, let's disable KCOV instrumentation for the
+# whole file. If its coverage is ever needed, we should consider other
+# approaches.
+KCOV_INSTRUMENT_physaddr.o		:= n
 
 KASAN_SANITIZE_mem_encrypt.o		:= n
 KASAN_SANITIZE_mem_encrypt_amd.o	:= n

base-commit: f338e77383789c0cae23ca3d48adcc5e9e137e3c
-- 
2.53.0.959.g497ff81fa9-goog



* Re: [PATCH v2] x86/kexec: Disable KCOV instrumentation after load_segments()
  2026-03-17 22:03 [PATCH v2] x86/kexec: Disable KCOV instrumentation after load_segments() Aleksandr Nogikh
@ 2026-03-20 15:56 ` Borislav Petkov
  2026-03-22  0:08 ` Andrew Morton
  1 sibling, 0 replies; 4+ messages in thread
From: Borislav Petkov @ 2026-03-20 15:56 UTC (permalink / raw)
  To: Aleksandr Nogikh
  Cc: tglx, mingo, x86, linux-kernel, dvyukov, kasan-dev, linux-mm,
	stable

On Tue, Mar 17, 2026 at 11:03:19PM +0100, Aleksandr Nogikh wrote:
> The load_segments() function changes segment registers, invalidating
> GS base (which KCOV relies on for per-cpu data). When CONFIG_KCOV is
> enabled, any subsequent instrumented C code call (e.g.
> native_gdt_invalidate()) begins crashing the kernel in an endless
> loop.
> 
> To reproduce the problem, it's sufficient to do kexec on a
> KCOV-instrumented kernel:
> $ kexec -l /boot/otherKernel
> $ kexec -e
> 
> The real-world context for this problem is enabling crash dump
> collection in syzkaller. For this, the tool loads a panic kernel
> before fuzzing and then calls makedumpfile after the panic. This
> workflow requires both CONFIG_KEXEC and CONFIG_KCOV to be enabled
> simultaneously.
> 
> Adding safeguards directly to the KCOV fast-path
> (__sanitizer_cov_trace_pc()) is also undesirable as it would
> introduce an extra performance overhead.
> 
> Disabling instrumentation for the individual functions would be too
> fragile, so let's fix the bug by disabling KCOV instrumentation for
> the entire machine_kexec_64.c and physaddr.c. If coverage-guided
> fuzzing ever needs these components in the future, we should consider
						     ^^

Please use passive voice in your commit message: no "we" or "I", etc,
and describe your changes in imperative mood.

Also in the comments below.

> other approaches.
> 
> The problem is not relevant for 32 bit kernels as CONFIG_KCOV is not
> supported there.
> 
> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
> Signed-off-by: Aleksandr Nogikh <nogikh@google.com>
> Cc: stable@vger.kernel.org
> ---
> v2:
> Updated the comments to explain the underlying context.
> 
> v1:
> https://lore.kernel.org/all/20260216173716.2279847-1-nogikh@google.com/
> ---
>  arch/x86/kernel/Makefile | 10 ++++++++++
>  arch/x86/mm/Makefile     | 10 ++++++++++
>  2 files changed, 20 insertions(+)


./scripts/checkpatch.pl /tmp/current.patch 

...
 
WARNING: The commit message has 'stable@', perhaps it also needs a 'Fixes:' tag?

> 
> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
> index e9aeeeafad173..41b1333907ded 100644
> --- a/arch/x86/kernel/Makefile
> +++ b/arch/x86/kernel/Makefile
> @@ -43,6 +43,16 @@ KCOV_INSTRUMENT_dumpstack_$(BITS).o			:= n
>  KCOV_INSTRUMENT_unwind_orc.o				:= n
>  KCOV_INSTRUMENT_unwind_frame.o				:= n
>  KCOV_INSTRUMENT_unwind_guess.o				:= n
> +# Disable KCOV to prevent crashes during kexec: load_segments() invalidates
> +# the GS base, which KCOV relies on for per-CPU data.
> +# As KCOV && KEXEC compatibility should be preserved (e.g. syzkaller is
> +# using it to collect crash dumps during kernel fuzzing), we could either
> +# selectively disable KCOV instrumentation, which can be fragile, or add
> +# more checks to KCOV, which would slow it down.
> +# As a compromise solution, let's disable KCOV instrumentation for the
> +# whole file. If its coverage is ever needed, we should consider other
> +# approaches.
> +KCOV_INSTRUMENT_machine_kexec_64.o			:= n
>  
>  CFLAGS_head32.o := -fno-stack-protector
>  CFLAGS_head64.o := -fno-stack-protector
> diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
> index 5b9908f13dcfd..ea3a31b54e49e 100644
> --- a/arch/x86/mm/Makefile
> +++ b/arch/x86/mm/Makefile
> @@ -4,6 +4,16 @@ KCOV_INSTRUMENT_tlb.o			:= n
>  KCOV_INSTRUMENT_mem_encrypt.o		:= n
>  KCOV_INSTRUMENT_mem_encrypt_amd.o	:= n
>  KCOV_INSTRUMENT_pgprot.o		:= n
> +# Disable KCOV to prevent crashes during kexec: load_segments() invalidates
> +# the GS base, which KCOV relies on for per-CPU data.
> +# As KCOV && KEXEC compatibility should be preserved (e.g. syzkaller is
> +# using it to collect crash dumps during kernel fuzzing), we could either
> +# selectively disable KCOV instrumentation, which can be fragile, or add
> +# more checks to KCOV, which would slow it down.
> +# As a compromise solution, let's disable KCOV instrumentation for the
> +# whole file. If its coverage is ever needed, we should consider other
> +# approaches.

Instead of repeating this big comment block, just say something along the
lines of:

# See "Disable KCOV" comment in arch/x86/kernel/Makefile
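
A minimal sketch of how the arch/x86/mm/Makefile addition might look with
that suggestion applied. The comment wording is taken from the line above;
the resulting hunk is an assumption, not a posted revision:

```make
# See "Disable KCOV" comment in arch/x86/kernel/Makefile
KCOV_INSTRUMENT_physaddr.o		:= n
```

The full explanation would then live only in arch/x86/kernel/Makefile, next
to KCOV_INSTRUMENT_machine_kexec_64.o.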

> +KCOV_INSTRUMENT_physaddr.o		:= n
>  
>  KASAN_SANITIZE_mem_encrypt.o		:= n
>  KASAN_SANITIZE_mem_encrypt_amd.o	:= n
> 
> base-commit: f338e77383789c0cae23ca3d48adcc5e9e137e3c
> -- 

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v2] x86/kexec: Disable KCOV instrumentation after load_segments()
  2026-03-17 22:03 [PATCH v2] x86/kexec: Disable KCOV instrumentation after load_segments() Aleksandr Nogikh
  2026-03-20 15:56 ` Borislav Petkov
@ 2026-03-22  0:08 ` Andrew Morton
  2026-03-23 10:46   ` Aleksandr Nogikh
  1 sibling, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2026-03-22  0:08 UTC (permalink / raw)
  To: Aleksandr Nogikh
  Cc: bp, tglx, mingo, x86, linux-kernel, dvyukov, kasan-dev, linux-mm,
	stable

On Tue, 17 Mar 2026 23:03:19 +0100 Aleksandr Nogikh <nogikh@google.com> wrote:

> The load_segments() function changes segment registers, invalidating
> GS base (which KCOV relies on for per-cpu data). When CONFIG_KCOV is
> enabled, any subsequent instrumented C code call (e.g.
> native_gdt_invalidate()) begins crashing the kernel in an endless
> loop.
> 
> ...
> 
> Disabling instrumentation for the individual functions would be too
> fragile, so let's fix the bug by disabling KCOV instrumentation for
> the entire machine_kexec_64.c and physaddr.c. If coverage-guided
> fuzzing ever needs these components in the future, we should consider
> other approaches.
> 

AI review has questions:
	https://sashiko.dev/#/patchset/20260317220319.788561-1-nogikh@google.com


* Re: [PATCH v2] x86/kexec: Disable KCOV instrumentation after load_segments()
  2026-03-22  0:08 ` Andrew Morton
@ 2026-03-23 10:46   ` Aleksandr Nogikh
  0 siblings, 0 replies; 4+ messages in thread
From: Aleksandr Nogikh @ 2026-03-23 10:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: bp, tglx, mingo, x86, linux-kernel, dvyukov, kasan-dev, linux-mm,
	stable

On Sun, Mar 22, 2026 at 1:08 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Tue, 17 Mar 2026 23:03:19 +0100 Aleksandr Nogikh <nogikh@google.com> wrote:
>
> > The load_segments() function changes segment registers, invalidating
> > GS base (which KCOV relies on for per-cpu data). When CONFIG_KCOV is
> > enabled, any subsequent instrumented C code call (e.g.
> > native_gdt_invalidate()) begins crashing the kernel in an endless
> > loop.
> >
> > ...
> >
> > Disabling instrumentation for the individual functions would be too
> > fragile, so let's fix the bug by disabling KCOV instrumentation for
> > the entire machine_kexec_64.c and physaddr.c. If coverage-guided
> > fuzzing ever needs these components in the future, we should consider
> > other approaches.
> >
>
> AI review has questions:
>         https://sashiko.dev/#/patchset/20260317220319.788561-1-nogikh@google.com

Regarding the comments:

> Does this fix cover the CONFIG_KEXEC_JUMP path where execution returns to a KCOV-instrumented kernel?

It doesn't. The fix only covers the main kexec functionality because
that's where the problem manifested: on syzbot we only use `kexec -p`,
not CONFIG_KEXEC_JUMP.

For CONFIG_KEXEC_JUMP, it should be (hopefully) enough to disable the
KCOV instrumentation for `arch/x86/power/cpu.c`, but I am not sure if
we want to also cover it here.
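
An untested sketch of what that could look like, assuming cpu.c is built as
cpu.o from arch/x86/power/Makefile and that the same per-object
KCOV_INSTRUMENT_ override used in this patch would be enough there (neither
assumption has been verified against the KEXEC_JUMP path):

```make
# arch/x86/power/Makefile (hypothetical addition, not part of this patch)
# See "Disable KCOV" comment in arch/x86/kernel/Makefile
KCOV_INSTRUMENT_cpu.o	:= n
```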

> Is disabling KCOV for all of physaddr.c an overly broad fix that causes
> unnecessary loss of coverage for core memory primitives like __phys_addr()?

Disabling the instrumentation at a more granular level would be more
fragile (this was discussed in the v1 series and mentioned in the v2
commit message). When preparing the patch, I tried annotating
individual functions to resolve the problem, but it turned into quite
a game of whack-a-mole.

Regarding the __phys_addr coverage: so far, it hasn't been super
important during kernel fuzzing. If necessary, we can easily
reconsider the approach later - for now it's just a few lines in
Makefiles.

-- 
Aleksandr

