From: Dave Hansen <dave.hansen@intel.com>
To: "Huang, Kai" <kai.huang@intel.com>,
	"Luck, Tony" <tony.luck@intel.com>,
	"Hunter, Adrian" <adrian.hunter@intel.com>,
	"Annapurve, Vishal" <vannapurve@google.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"Li, Xiaoyao" <xiaoyao.li@intel.com>,
	"Zhao, Yan Y" <yan.y.zhao@intel.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"kirill.shutemov@linux.intel.com"
	<kirill.shutemov@linux.intel.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"seanjc@google.com" <seanjc@google.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"Yamahata, Isaku" <isaku.yamahata@intel.com>,
	"tony.lindgren@linux.intel.com" <tony.lindgren@linux.intel.com>,
	"binbin.wu@linux.intel.com" <binbin.wu@linux.intel.com>,
	"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"Chatre, Reinette" <reinette.chatre@intel.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"Edgecombe, Rick P" <rick.p.edgecombe@intel.com>,
	"bp@alien8.de" <bp@alien8.de>, "Gao, Chao" <chao.gao@intel.com>,
	"x86@kernel.org" <x86@kernel.org>
Subject: Re: [PATCH 2/2] KVM: TDX: Do not clear poisoned pages
Date: Thu, 26 Jun 2025 15:33:11 -0700
Message-ID: <b439abd6-9fd9-4f51-82e2-c8b1304e7cca@intel.com>
In-Reply-To: <f51e62543aa765da3b4f4ed19aa13340881fbc89.camel@intel.com>

On 6/26/25 15:20, Huang, Kai wrote:
> But IMHO we may should just have a simple policy that when a page is marked
> as poisoned, it should never be touched again.  It's only one page anyway
> (for one TD) so losing that doesn't seem bad to me.  If we want to clear the
> poisoned page, then perhaps we should mark that page to be not-poisoned
> again.

The simplest policy is to do nothing.

The kernel only has 29 places that check PageHWPoison(). I'd guess that
roughly half of those are the memory-failure.c infrastructure and
bare-minimum code to handle poison, like not allowing pages to go back
into the allocator.

There are something like 5,000 lines of code in the kernel that deal
with a literal 'struct page'. 29 checks for ~5,000 sites is pretty
minuscule. We obviously don't have a policy that every place that uses
'struct page' needs to check for poison. We also don't even have a
policy where writes to or reads from a page check for poison.

Why is this TDX code so special that PageHWPoison() needs to be checked?
For instance:

$ grep -r PageHWPoison arch/x86/
arch/x86/kernel/cpu/mce/core.c:	SetPageHWPoison(p);
arch/x86/kernel/cpu/mce/core.c:	SetPageHWPoison(p);

In other words, this would be the *ONLY* arch/x86 site. Why?
