linux-kernel.vger.kernel.org archive mirror
* [PATCH v4 00/22] AMD MCA interrupts rework
@ 2025-06-24 14:15 Yazen Ghannam
  2025-06-24 14:15 ` [PATCH v4 01/22] x86/mce: Don't remove sysfs if thresholding sysfs init fails Yazen Ghannam
                   ` (21 more replies)
  0 siblings, 22 replies; 39+ messages in thread
From: Yazen Ghannam @ 2025-06-24 14:15 UTC
  To: x86, Tony Luck, Rafael J. Wysocki, Len Brown
  Cc: linux-kernel, linux-edac, Smita.KoralahalliChannabasappa,
	Qiuxu Zhuo, linux-acpi, Yazen Ghannam

Hi all,

This set unifies the AMD MCA interrupt handlers with common MCA code.
The goal is to avoid duplicating functionality like reading and clearing
MCA banks.

Based on feedback, this revision also includes changes to the MCA init
flow.

Patches 1-8:
General fixes and cleanups.

Patches 9-14:
Add BSP-only init flow and related changes.

Patches 15-18:
Updates from v1 set.

Patches 19-21:
Interrupt storm handling rebased on current set.

Patch 22:
Add support to get threshold limit from APEI HEST.

Thanks,
Yazen

---
Changes in v4:
- Rebase on v6.16-rc3.
- Address comments from Boris about function names.
- Redo DFR handler integration.
- Drop AMD APIC LVT rework.
- Include more AMD thresholding reworks and fixes.
- Add support to get threshold limit from APEI HEST.
- Reorder patches so most fixes and reworks are at the beginning.
- Link to v3: https://lore.kernel.org/r/20250415-wip-mca-updates-v3-0-8ffd9eb4aa56@amd.com

Changes in v3:
- Rebased on tip/x86/merge rather than tip/master.
- Updated MSR access helpers (*msrl -> *msrq).
- Add patch to fix polling after a storm.
- Link to v2: https://lore.kernel.org/r/20250213-wip-mca-updates-v2-0-3636547fe05f@amd.com

Changes in v2:
- Add general cleanup pre-patches.
- Add changes for BSP-only init.
- Add interrupt storm handling for AMD.
- Link to v1: https://lore.kernel.org/r/20240523155641.2805411-1-yazen.ghannam@amd.com

---
Borislav Petkov (1):
      x86/mce: Cleanup bank processing on init

Smita Koralahalli (1):
      x86/mce: Handle AMD threshold interrupt storms

Yazen Ghannam (20):
      x86/mce: Don't remove sysfs if thresholding sysfs init fails
      x86/mce: Restore poll settings after storm subsides
      x86/mce/amd: Add default names for MCA banks and blocks
      x86/mce/amd: Fix threshold limit reset
      x86/mce/amd: Rename threshold restart function
      x86/mce/amd: Remove return value for mce_threshold_{create,remove}_device()
      x86/mce/amd: Remove smca_banks_map
      x86/mce/amd: Put list_head in threshold_bank
      x86/mce: Remove __mcheck_cpu_init_early()
      x86/mce: Define BSP-only init
      x86/mce: Define BSP-only SMCA init
      x86/mce: Do 'UNKNOWN' vendor check early
      x86/mce: Separate global and per-CPU quirks
      x86/mce: Move machine_check_poll() status checks to helper functions
      x86/mce: Unify AMD THR handler with MCA Polling
      x86/mce: Unify AMD DFR handler with MCA Polling
      x86/mce/amd: Support SMCA Corrected Error Interrupt
      x86/mce/amd: Remove redundant reset_block()
      x86/mce/amd: Define threshold restart function for banks
      x86/mce: Save and use APEI corrected threshold limit

 arch/x86/include/asm/mce.h          |  23 ++-
 arch/x86/kernel/acpi/apei.c         |   2 +
 arch/x86/kernel/cpu/common.c        |   1 +
 arch/x86/kernel/cpu/mce/amd.c       | 366 ++++++++++++++----------------------
 arch/x86/kernel/cpu/mce/core.c      | 363 +++++++++++++++++------------------
 arch/x86/kernel/cpu/mce/intel.c     |  18 ++
 arch/x86/kernel/cpu/mce/internal.h  |  12 ++
 arch/x86/kernel/cpu/mce/threshold.c |  16 ++
 8 files changed, 379 insertions(+), 422 deletions(-)
---
base-commit: 86731a2a651e58953fc949573895f2fa6d456841
change-id: 20250210-wip-mca-updates-bed2a67c9c57



Thread overview: 39+ messages
2025-06-24 14:15 [PATCH v4 00/22] AMD MCA interrupts rework Yazen Ghannam
2025-06-24 14:15 ` [PATCH v4 01/22] x86/mce: Don't remove sysfs if thresholding sysfs init fails Yazen Ghannam
2025-06-28 11:33   ` [tip: ras/urgent] " tip-bot2 for Yazen Ghannam
2025-06-24 14:15 ` [PATCH v4 02/22] x86/mce: Restore poll settings after storm subsides Yazen Ghannam
2025-06-25 13:22   ` Nikolay Borisov
2025-06-28 11:33   ` [tip: ras/urgent] x86/mce: Ensure user polling settings are honored when restarting timer tip-bot2 for Yazen Ghannam
2025-06-24 14:15 ` [PATCH v4 03/22] x86/mce/amd: Add default names for MCA banks and blocks Yazen Ghannam
2025-06-28 11:33   ` [tip: ras/urgent] " tip-bot2 for Yazen Ghannam
2025-06-24 14:15 ` [PATCH v4 04/22] x86/mce/amd: Fix threshold limit reset Yazen Ghannam
2025-06-28 11:33   ` [tip: ras/urgent] " tip-bot2 for Yazen Ghannam
2025-06-24 14:16 ` [PATCH v4 05/22] x86/mce/amd: Rename threshold restart function Yazen Ghannam
2025-06-24 14:16 ` [PATCH v4 06/22] x86/mce/amd: Remove return value for mce_threshold_{create,remove}_device() Yazen Ghannam
2025-06-25 14:57   ` Nikolay Borisov
2025-06-24 14:16 ` [PATCH v4 07/22] x86/mce/amd: Remove smca_banks_map Yazen Ghannam
2025-06-24 14:16 ` [PATCH v4 08/22] x86/mce/amd: Put list_head in threshold_bank Yazen Ghannam
2025-06-25 16:52   ` Nikolay Borisov
2025-06-27 11:14     ` Nikolay Borisov
2025-06-30 12:57       ` Yazen Ghannam
2025-08-25 13:59     ` Borislav Petkov
2025-06-24 14:16 ` [PATCH v4 09/22] x86/mce: Cleanup bank processing on init Yazen Ghannam
2025-06-24 14:16 ` [PATCH v4 10/22] x86/mce: Remove __mcheck_cpu_init_early() Yazen Ghannam
2025-06-26  8:03   ` Nikolay Borisov
2025-06-30 12:58     ` Yazen Ghannam
2025-06-24 14:16 ` [PATCH v4 11/22] x86/mce: Define BSP-only init Yazen Ghannam
2025-06-25 11:04   ` Nikolay Borisov
2025-06-25 11:26     ` Borislav Petkov
2025-06-24 14:16 ` [PATCH v4 12/22] x86/mce: Define BSP-only SMCA init Yazen Ghannam
2025-06-24 14:16 ` [PATCH v4 13/22] x86/mce: Do 'UNKNOWN' vendor check early Yazen Ghannam
2025-06-24 14:16 ` [PATCH v4 14/22] x86/mce: Separate global and per-CPU quirks Yazen Ghannam
2025-06-27 11:02   ` Nikolay Borisov
2025-06-30 13:00     ` Yazen Ghannam
2025-06-24 14:16 ` [PATCH v4 15/22] x86/mce: Move machine_check_poll() status checks to helper functions Yazen Ghannam
2025-06-24 14:16 ` [PATCH v4 16/22] x86/mce: Unify AMD THR handler with MCA Polling Yazen Ghannam
2025-06-24 14:16 ` [PATCH v4 17/22] x86/mce: Unify AMD DFR " Yazen Ghannam
2025-06-24 14:16 ` [PATCH v4 18/22] x86/mce/amd: Support SMCA Corrected Error Interrupt Yazen Ghannam
2025-06-24 14:16 ` [PATCH v4 19/22] x86/mce/amd: Remove redundant reset_block() Yazen Ghannam
2025-06-24 14:16 ` [PATCH v4 20/22] x86/mce/amd: Define threshold restart function for banks Yazen Ghannam
2025-06-24 14:16 ` [PATCH v4 21/22] x86/mce: Handle AMD threshold interrupt storms Yazen Ghannam
2025-06-24 14:16 ` [PATCH v4 22/22] x86/mce: Save and use APEI corrected threshold limit Yazen Ghannam
