From: Yazen Ghannam <yazen.ghannam@amd.com>
To: Tony Luck <tony.luck@intel.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
Borislav Petkov <bp@alien8.de>, Hanjun Guo <guohanjun@huawei.com>,
Mauro Carvalho Chehab <mchehab@kernel.org>,
linux-acpi@vger.kernel.org, linux-kernel@vger.kernel.org,
patches@lists.linux.dev, Andi Kleen <andi.kleen@intel.com>
Subject: Re: [PATCH] ACPI: APEI: GHES: Improve ghes_notify_nmi() status check
Date: Wed, 5 Nov 2025 16:19:24 -0500
Message-ID: <20251105211924.GA1264471@yaz-khff2.amd.com>
In-Reply-To: <20251103230547.8715-1-tony.luck@intel.com>
On Mon, Nov 03, 2025 at 03:05:47PM -0800, Tony Luck wrote:
> ghes_notify_nmi() is called for every NMI and must check whether the NMI was
> generated because an error was signalled by platform firmware.
>
> This check is very expensive as for each registered GHES NMI source it reads
> from the acpi generic address attached to this error source to get the physical
> address of the acpi_hest_generic_status block. It then checks the "block_status"
> to see if an error was logged.
>
> The ACPI/APEI code must create virtual mappings for each of those physical
> addresses, and tear them down afterwards. On an Icelake system this takes around
> 15,000 TSC cycles. Enough to disturb efforts to profile system performance.
>
> If that were not bad enough, there are some atomic accesses in the code path
> that will cause cache line bounces between CPUs. A problem that gets worse as
> the core count increases.
>
> But BIOS changes neither the acpi generic address nor the physical address of
> the acpi_hest_generic_status block. So this walk can be done once when the NMI is
> registered to save the virtual address (unmapping if the NMI is ever unregistered).
> The "block_status" can be checked directly in the NMI handler. This can be done
> without any atomic accesses.
>
> Resulting time to check that there is not an error record is around 900 cycles.
>
> Reported-by: Andi Kleen <andi.kleen@intel.com>
> Signed-off-by: Tony Luck <tony.luck@intel.com>
>
> ---
> N.B. I only talked to an Intel BIOS expert about this. GHES code is shared by
> other architectures, so it would be wise to get confirmation on whether this
> assumption applies to all, or is Intel (or X86) specific.
I think that is how the ACPI spec describes it.
https://uefi.org/specs/ACPI/6.5/18_Platform_Error_Interfaces.html?highlight=hest#error-source-discovery
The HEST and other tables are fixed at init time. There's an ACPI notify
event for if/when a device method needs to be re-evaluated, but I don't
think anything in APEI expects that.
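
For illustration only, here is a minimal userspace sketch of the idea in
the patch description: resolve the status block address once at
registration, then have the NMI path do a single plain read of
"block_status". This is not the patch itself. The names status_block,
nmi_source and nmi_source_*() are hypothetical, and the "mapping" is
reduced to storing a pointer, where the kernel would map the physical
address.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the generic status block; only the field used here. */
struct status_block {
	uint32_t block_status;
};

/* Hypothetical per-source state: the address is cached at registration time. */
struct nmi_source {
	volatile struct status_block *status;	/* NULL while not registered */
};

/* Registration: done once, outside NMI context. */
static void nmi_source_register(struct nmi_source *src, struct status_block *blk)
{
	src->status = blk;
}

/* Unregistration: drop the cached address (the kernel would unmap here). */
static void nmi_source_unregister(struct nmi_source *src)
{
	src->status = NULL;
}

/* NMI path: one plain read, no mapping work and no atomic operations. */
static bool nmi_source_has_error(const struct nmi_source *src)
{
	return src->status && src->status->block_status != 0;
}
```

The sketch relies on the same assumption discussed above: the address of
the status block does not change after init, so the cached pointer stays
valid until the source is unregistered.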
Thanks,
Yazen