From: Juergen Gross <jgross@suse.com>
To: "Reshetova, Elena" <elena.reshetova@intel.com>,
"Annapurve, Vishal" <vannapurve@google.com>,
"Hansen, Dave" <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"bp@alien8.de" <bp@alien8.de>,
"tglx@linutronix.de" <tglx@linutronix.de>,
"peterz@infradead.org" <peterz@infradead.org>,
"mingo@redhat.com" <mingo@redhat.com>,
"hpa@zytor.com" <hpa@zytor.com>,
"thomas.lendacky@amd.com" <thomas.lendacky@amd.com>,
"x86@kernel.org" <x86@kernel.org>,
"kas@kernel.org" <kas@kernel.org>,
"Edgecombe, Rick P" <rick.p.edgecombe@intel.com>,
"dwmw@amazon.co.uk" <dwmw@amazon.co.uk>,
"Huang, Kai" <kai.huang@intel.com>,
"seanjc@google.com" <seanjc@google.com>,
"Chatre, Reinette" <reinette.chatre@intel.com>,
"Yamahata, Isaku" <isaku.yamahata@intel.com>,
"Williams, Dan J" <dan.j.williams@intel.com>,
"ashish.kalra@amd.com" <ashish.kalra@amd.com>,
"nik.borisov@suse.com" <nik.borisov@suse.com>,
"Gao, Chao" <chao.gao@intel.com>,
"sagis@google.com" <sagis@google.com>,
"Chen, Farrah" <farrah.chen@intel.com>,
Binbin Wu <binbin.wu@linux.intel.com>
Subject: Re: [PATCH 4/7] x86/kexec: Disable kexec/kdump on platforms with TDX partial write erratum
Date: Thu, 2 Oct 2025 09:46:54 +0200
Message-ID: <27d19ea5-d078-405b-a963-91d19b4229c8@suse.com>
In-Reply-To: <DM8PR11MB575071F87791817215355DD8E7E7A@DM8PR11MB5750.namprd11.prod.outlook.com>
On 02.10.25 08:59, Reshetova, Elena wrote:
>> On Wed, Oct 1, 2025 at 7:32 AM Dave Hansen <dave.hansen@intel.com>
>> wrote:
>>>
>>> On 9/30/25 19:05, Vishal Annapurve wrote:
>>> ...
>>>>> Any workarounds are going to be slow and probably imperfect. That's not
>>>>
>>>> Do we really need to deploy workarounds that are complex and slow to
>>>> get kdump working for the majority of the scenarios? Is there any
>>>> analysis done for the risk with imperfect and simpler workarounds vs
>>>> benefits of kdump functionality?
>>>>
>>>>> a great match for kdump. I'm perfectly happy waiting for fixed hardware
>>>>> from what I've seen.
>>>>
>>>> IIUC SPR/EMR - two CPU generations out there are impacted by this
>>>> erratum and just disabling kdump functionality IMO is not the best
>>>> solution here.
>>>
>>> That's an eminently reasonable position. But we're speaking in broad
>>> generalities and I'm unsure what you don't like about the status quo or
>>> how you'd like to see things change.
>>
>> Looks like the decision to disable kdump was taken between [1] -> [2].
>> "The kernel currently doesn't track which page is TDX private memory.
>> It's not trivial to reset TDX private memory. For simplicity, this
>> series simply disables kexec/kdump for such platforms. This will be
>> enhanced in the future."
>>
>> A patch [3] from the series[1], describes the issue as:
>> "This problem is triggered by "partial" writes where a write transaction
>> of less than cacheline lands at the memory controller. The CPU does
>> these via non-temporal write instructions (like MOVNTI), or through
>> UC/WC memory mappings. The issue can also be triggered away from the
>> CPU by devices doing partial writes via DMA."
>>
>> And also mentions:
>> "Also note only the normal kexec needs to worry about this problem, but
>> not the crash kexec: 1) The kdump kernel only uses the special memory
>> reserved by the first kernel, and the reserved memory can never be used
>> by TDX in the first kernel; 2) The /proc/vmcore, which reflects the
>> first (crashed) kernel's memory, is only for read. The read will never
>> "poison" TDX memory thus cause unexpected machine check (only partial
>> write does)."
>
> While the statement that the read will never poison the memory is correct,
> the situation we can theoretically worry about is the following in my understanding:
>
> 1. During its execution on platform with partial write problem, host OS or other
> actor executing outside of SEAM mode triggers partial write into a cache line that
> originally belonged to TDX private memory.
> This is smth that host OS or other entities should not do, but it could happen due
> to host OS bugs, etc.
> 2. The above causes the specified cache line to be poisoned by mem controller.
> However, here we assume that no one accesses this cache line from TDX module,
> TD guests or Host OS for the time being and the problem remains hidden.
> 3. Host OS crashes due to some other issue, kdump crash kernel is triggered,
> and kdump starts to read all the memory from the previous host kernel to dump
> the diagnostics info.
> 4. At some point of time, kdump crash kernel reaches the memory with the poisoned
> cache line, consumes poison, and the #MC is issued for the kernel space.
>
> Isn't this the reason for also disabling kdump? Or do I miss smth?
So let's compare the two cases, kdump enabled and kdump disabled, in your
scenario (crash of the host OS):
kdump enabled: No dump can be produced due to the #MC, and the system is rebooted.
kdump disabled: No dump is produced, and the system is rebooted after the crash.
What is the main concern with kdump enabled? I don't see any disadvantage in
enabling it, only the advantage that in many cases a dump will be written.
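The comparison above can be sketched as a small decision table. This is only an illustration of the argument, not kernel code: the names `Outcome` and `crash_outcome` and the `dump_hits_poison` flag are made up for this sketch, and it assumes that consuming poison during the dump always ends in a reboot without a dump.

```python
from enum import Enum


class Outcome(Enum):
    DUMP_WRITTEN = "dump written, then reboot"
    NO_DUMP = "no dump, reboot"


def crash_outcome(kdump_enabled: bool, dump_hits_poison: bool) -> Outcome:
    """Model what happens after a host OS crash (hypothetical helper)."""
    if not kdump_enabled:
        # No crash kernel is started, so no dump can be written.
        return Outcome.NO_DUMP
    if dump_hits_poison:
        # Reading a poisoned cache line raises #MC before the dump completes.
        return Outcome.NO_DUMP
    return Outcome.DUMP_WRITTEN


# Print all four combinations of the two inputs.
for enabled in (True, False):
    for poisoned in (True, False):
        result = crash_outcome(enabled, poisoned)
        print(f"kdump_enabled={enabled} poison_hit={poisoned}: {result.value}")
```

In this model, kdump enabled is never worse than kdump disabled: the outcomes match when poison is hit, and a dump is gained when it is not.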
Juergen
Thread overview: 38+ messages
2025-09-01 16:09 [PATCH v8 0/7] TDX host: kexec/kdump support Paolo Bonzini
2025-09-01 16:09 ` [PATCH 1/7] x86/kexec: Consolidate relocate_kernel() function parameters Paolo Bonzini
2025-09-01 16:09 ` [PATCH 2/7] x86/sme: Use percpu boolean to control WBINVD during kexec Paolo Bonzini
2025-09-01 16:09 ` [PATCH 3/7] x86/virt/tdx: Mark memory cache state incoherent when making SEAMCALL Paolo Bonzini
2025-09-01 16:09 ` [PATCH 4/7] x86/kexec: Disable kexec/kdump on platforms with TDX partial write erratum Paolo Bonzini
2025-09-30 1:38 ` Vishal Annapurve
2025-09-30 21:32 ` Dave Hansen
2025-10-01 2:05 ` Vishal Annapurve
2025-10-01 14:32 ` Dave Hansen
2025-10-01 17:17 ` Vishal Annapurve
2025-10-01 18:00 ` Dave Hansen
2025-10-01 21:19 ` Huang, Kai
2025-10-02 6:59 ` Reshetova, Elena
2025-10-02 7:46 ` Juergen Gross [this message]
2025-10-02 8:10 ` Reshetova, Elena
2025-10-02 15:06 ` Dave Hansen
2025-10-02 16:09 ` Vishal Annapurve
2025-10-18 15:54 ` Vishal Annapurve
2025-10-21 17:08 ` Dave Hansen
2025-10-22 2:50 ` Vishal Annapurve
2025-10-22 21:05 ` Huang, Kai
2025-10-23 16:54 ` Vishal Annapurve
2025-10-07 13:31 ` Jürgen Groß
2025-10-08 15:40 ` Dave Hansen
2025-10-08 18:13 ` Jürgen Groß
2025-10-26 23:33 ` Vishal Annapurve
2025-10-27 0:50 ` Huang, Kai
2025-10-27 16:23 ` Edgecombe, Rick P
2025-10-27 21:28 ` Huang, Kai
2025-10-28 0:07 ` Vishal Annapurve
2025-10-28 9:31 ` Huang, Kai
2025-11-03 16:44 ` Vishal Annapurve
2025-09-01 16:09 ` [PATCH 5/7] x86/virt/tdx: Remove the !KEXEC_CORE dependency Paolo Bonzini
2025-09-01 16:09 ` [PATCH 6/7] x86/virt/tdx: Update the kexec section in the TDX documentation Paolo Bonzini
2025-09-01 16:09 ` [PATCH 7/7] KVM: TDX: Explicitly do WBINVD when no more TDX SEAMCALLs Paolo Bonzini
2025-10-03 13:09 ` [PATCH v8 0/7] TDX host: kexec/kdump support Paolo Bonzini
2025-10-03 13:54 ` David Woodhouse
2025-10-03 14:05 ` Dave Hansen