* [BUG] x86/virt/tdx: tdx_offline_cpu() violates tdx_cpu_flush_cache() preemption assert
@ 2026-05-11 21:33 David CARLIER
From: David CARLIER @ 2026-05-11 21:33 UTC
To: Vishal Verma, Rick Edgecombe, Dave Hansen
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
Kai Huang, Chao Gao, Kiryl Shutsemau, Sean Christopherson,
Paolo Bonzini, Adrian Hunter, x86, kvm, linux-coco,
open list:SCHEDULER
Hi,
In commit 597bdf6e068e ("x86/virt/tdx: Pull kexec cache flush logic into
arch/x86"), tdx_offline_cpu() gained a call to tdx_cpu_flush_cache(),
which starts with lockdep_assert_preemption_disabled().
tdx_offline_cpu() is registered at CPUHP_AP_ONLINE_DYN. ONLINE-section
teardown callbacks run from the pinned per-CPU hotplug thread with
preemption and interrupts enabled (Documentation/core-api/cpu_hotplug.rst,
and cpuhp_thread_fun() only disables IRQs for atomic states).
The other callers — tdx_shutdown_cpu() via on_each_cpu(), and the
crash path — satisfy the assertion. Only the offline path doesn't, and
the splat should fire on every offline once the TDX module is
initialized and the done: path is taken.
Wrapping the call with preempt_disable() / preempt_enable() at the
offline site keeps the contract for the kexec/shutdown callers.
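To make the proposed wrapping concrete, here is a minimal userspace sketch, not kernel code. Every model_* name is hypothetical and only stands in for the kernel symbol named in its comment. The real lockdep_assert_preemption_disabled() also depends on lockdep being enabled and on the hardirq state, which this model ignores. It only illustrates why the unwrapped offline call would trip the assertion while the wrapped one would not, assuming the callback is entered with preemption enabled as described above.

```c
/* Userspace model only; none of these functions exist in the kernel. */
#include <assert.h>

/* Stand-in for the current CPU's preempt count (0 means preemptible). */
static int model_preempt_count;
/* Number of times the modelled assertion would have produced a splat. */
static int model_splats;

/* Stand-ins for preempt_disable() / preempt_enable(). */
static void model_preempt_disable(void) { model_preempt_count++; }
static void model_preempt_enable(void)  { model_preempt_count--; }

/* Stand-in for lockdep_assert_preemption_disabled(): count instead of warn. */
static void model_assert_preemption_disabled(void)
{
	if (model_preempt_count == 0)
		model_splats++;
}

/* Stand-in for tdx_cpu_flush_cache(): it starts with the assertion. */
static void model_flush_cache(void)
{
	model_assert_preemption_disabled();
	/* The real cache flush is omitted from this model. */
}

/* Offline callback as it stands today: called while preemptible. */
static int model_offline_cpu_unwrapped(void)
{
	model_flush_cache();
	return 0;
}

/* Offline callback with the proposed wrapping at the call site. */
static int model_offline_cpu_wrapped(void)
{
	model_preempt_disable();
	model_flush_cache();
	model_preempt_enable();
	return 0;
}
```

In this model the unwrapped variant bumps model_splats once per call and the wrapped variant never does. The assertion inside the flush helper is untouched, so callers that already run with preemption disabled see no change.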
Not yet reproduced on a debug kernel; reporting on inspection.
Fixes: 597bdf6e068e ("x86/virt/tdx: Pull kexec cache flush logic
into arch/x86")
Cheers,
David
* Re: [BUG] x86/virt/tdx: tdx_offline_cpu() violates tdx_cpu_flush_cache() preemption assert
From: Huang, Kai @ 2026-05-12 1:00 UTC
To: Verma, Vishal L, devnexen@gmail.com, Edgecombe, Rick P,
dave.hansen@linux.intel.com
Cc: Gao, Chao, linux-kernel@vger.kernel.org, seanjc@google.com,
bp@alien8.de, kas@kernel.org, hpa@zytor.com, mingo@redhat.com,
Hunter, Adrian, x86@kernel.org, tglx@kernel.org,
pbonzini@redhat.com, linux-coco@lists.linux.dev,
kvm@vger.kernel.org
On Mon, 2026-05-11 at 22:33 +0100, David CARLIER wrote:
> Hi,
>
> In commit 597bdf6e068e ("x86/virt/tdx: Pull kexec cache flush logic into
> arch/x86"), tdx_offline_cpu() gained a call to tdx_cpu_flush_cache(),
> which starts with lockdep_assert_preemption_disabled().
>
> tdx_offline_cpu() is registered at CPUHP_AP_ONLINE_DYN. ONLINE-section
> teardown callbacks run from the pinned per-CPU hotplug thread with
> preemption and interrupts enabled (Documentation/core-api/cpu_hotplug.rst,
> and cpuhp_thread_fun() only disables IRQs for atomic states).
>
> The other callers — tdx_shutdown_cpu() via on_each_cpu(), and the
> crash path — satisfy the assertion. Only the offline path doesn't, and
> the splat should fire on every offline once the TDX module is
> initialized and the done: path is taken.
>
> Wrapping the call with preempt_disable() / preempt_enable() at the
> offline site keeps the contract for the kexec/shutdown callers.
>
> Not yet reproduced on a debug kernel; reporting on inspection.
>
> Fixes: 597bdf6e068e ("x86/virt/tdx: Pull kexec cache flush logic
> into arch/x86")
Right, the lockdep_assert_preemption_disabled() is wrong when
tdx_cpu_flush_cache() is called from CPUHP context (there's no functional
issue, though; only the lockdep assertion is wrong).
It was introduced when the TDX host kexec support was added, so the commit above
is not the right one to blame. Previously, tdx_cpu_flush_cache() was called
from KVM's module unload path, also in CPUHP context. The commit above
only moved the call to the TDX core's CPU offline path.
The latest version of the fix is:
https://lore.kernel.org/lkml/20260407233333.1608820-1-kai.huang@intel.com/
but it needs rebasing now.