From: William Roche <william.roche@oracle.com>
To: Borislav Petkov <bp@alien8.de>
Cc: yazen.ghannam@amd.com, tony.luck@intel.com, tglx@kernel.org,
mingo@redhat.com, dave.hansen@linux.intel.com, x86@kernel.org,
hpa@zytor.com, linux-edac@vger.kernel.org,
linux-kernel@vger.kernel.org, John.Allen@amd.com,
jane.chu@oracle.com
Subject: Re: [PATCH v3 1/1] x86/mce/amd: Guard SMCA DESTAT access on non-SMCA machines
Date: Tue, 17 Mar 2026 22:52:50 +0100
Message-ID: <34ffc43b-bb11-49a7-8c64-3f4abc0d4bb3@oracle.com>
In-Reply-To: <20260317202414.GGabm4bu6rDjVcspDH@fat_crate.local>
On 3/17/26 21:24, Borislav Petkov wrote:
> On Tue, Mar 17, 2026 at 09:06:54PM +0100, William Roche wrote:
>> Relaying the error to the guest doesn't only have a value to target a VM
>> process but also deal with free memory or clean file cache memory impacted
>> etc... Cases where a memory error may not crash the kernel can benefit to
>> the VM too
>
> I don't understand - what do you mean with "free memory or clean file cache
> memory"?

The physical address of an uncorrected memory error (if/when it can be
identified) gives the kernel a chance to react, depending on the state
(and type) of the affected memory, as implemented in
mm/memory-failure.c with error_states[], me_pagecache_clean() or
try_memory_failure()...

The kernel can try to "deal" with the error. The process case (with its
SIGBUS) is probably the most common one, but some kernel memory pages
hit by a memory error can also be isolated (poisoned) without requiring
a kernel crash. Free memory pages and clean page cache pages are
examples of that: they are poisoned and must not be used by the system
afterwards. The kernel can also return an EIO error when an access to a
poisoned page cache page fails, etc...
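
To illustrate the idea of reacting according to the page state, here is
a minimal user-space sketch of a first-match state table. It is only
loosely modelled on the error_states[] approach and is not the kernel
code: the PF_* flags, struct state_entry, the handlers and classify()
are all hypothetical names invented for this example, and the outcomes
are deliberately simplified.

```c
#include <stddef.h>

/* Hypothetical, simplified page-state bits (NOT the kernel's page flags). */
enum {
	PF_FREE  = 1u << 0,	/* page is in the free lists */
	PF_CACHE = 1u << 1,	/* page belongs to the page cache */
	PF_DIRTY = 1u << 2,	/* page cache page holds unwritten data */
	PF_ANON  = 1u << 3,	/* anonymous page mapped by a process */
};

enum outcome { OUTCOME_RECOVERED, OUTCOME_FAILED };

/* One table row: the entry matches when (flags & mask) == res. */
struct state_entry {
	unsigned int mask;
	unsigned int res;
	const char *name;
	enum outcome (*action)(void);
};

/* Illustrative handlers: each one only reports the simplified result. */
static enum outcome handle_free(void)        { return OUTCOME_RECOVERED; }
static enum outcome handle_clean_cache(void) { return OUTCOME_RECOVERED; }
static enum outcome handle_process(void)     { return OUTCOME_RECOVERED; }
static enum outcome handle_unknown(void)     { return OUTCOME_FAILED; }

/* Ordered table: the first matching row wins, the last row matches all. */
static const struct state_entry states[] = {
	{ PF_FREE,             PF_FREE,  "free page",        handle_free },
	{ PF_CACHE | PF_DIRTY, PF_CACHE, "clean page cache", handle_clean_cache },
	{ PF_ANON,             PF_ANON,  "process page",     handle_process },
	{ 0,                   0,        "unknown",          handle_unknown },
};

/* Return the first row whose mask/res pair matches the given flags. */
const struct state_entry *classify(unsigned int flags)
{
	const struct state_entry *e = states;

	/* Terminates because the final { 0, 0 } row always matches. */
	while ((flags & e->mask) != e->res)
		e++;
	return e;
}
```

With this table, classify(PF_CACHE) selects the "clean page cache" row,
while a page with flags that no earlier row describes (for instance
PF_CACHE | PF_DIRTY here) falls through to the catch-all row and is
reported as not recovered.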

These mechanisms are implemented for the kernel running on bare metal,
but what is really interesting when relaying the error to a VM is that
the guest kernel can, in some cases, also benefit from them. And having
a chance (even a small one) to avoid a VM crash is a significant gain
for virtualized workloads.

Just giving my point of view on why we care about VM relayed memory
errors :)

William.