Linux EDAC development
* [PATCH v7 0/8] AMD MCA interrupts rework
From: Yazen Ghannam @ 2025-10-16 16:37 UTC
  To: x86, Tony Luck, Rafael J. Wysocki, Len Brown
  Cc: linux-kernel, linux-edac, Smita.KoralahalliChannabasappa,
	Qiuxu Zhuo, Nikolay Borisov, Bert Karwatzki, linux-acpi,
	Yazen Ghannam

Hi all,

This set unifies the AMD MCA interrupt handlers with common MCA code.
The goal is to avoid duplicating functionality like reading and clearing
MCA banks.

Based on feedback, this revision also includes changes to the MCA init
flow.

Patches 1-2:
Unify AMD interrupt handlers with common MCE code.

Patches 3-4:
SMCA Corrected Error Interrupt support.

Patches 5-7:
Interrupt storm handling rebased on current set.

Patch 8:
Add support to get threshold limit from APEI HEST.

Thanks,
Yazen

---
Changes in v7:
- Rework DFR error handling to avoid reporting bogus errors.
- Don't modify polling banks for AMD systems after an interrupt storm.
- Link to v6: https://lore.kernel.org/r/20250908-wip-mca-updates-v6-0-eef5d6c74b9c@amd.com
- Link to "spurious errors" thread:
  https://lore.kernel.org/r/20250915010010.3547-1-spasswolf@web.de

Changes in v6:
- Rebase on tip/ras/core.
- Address comments from Boris for patches 1, 8, and 10.
- Link to v5: https://lore.kernel.org/r/20250825-wip-mca-updates-v5-0-865768a2eef8@amd.com

Changes in v5:
- Rebase on v6.17-rc1.
- Add tags and address comments from Nikolay.
- Added back patch that was dropped from v4.
- Link to v4: https://lore.kernel.org/r/20250624-wip-mca-updates-v4-0-236dd74f645f@amd.com

Changes in v4:
- Rebase on v6.16-rc3.
- Address comments from Boris about function names.
- Redo DFR handler integration.
- Drop AMD APIC LVT rework.
- Include more AMD thresholding reworks and fixes.
- Add support to get threshold limit from APEI HEST.
- Reorder patches so most fixes and reworks are at the beginning.
- Link to v3: https://lore.kernel.org/r/20250415-wip-mca-updates-v3-0-8ffd9eb4aa56@amd.com

Changes in v3:
- Rebased on tip/x86/merge rather than tip/master.
- Updated MSR access helpers (*msrl -> *msrq).
- Add patch to fix polling after a storm.
- Link to v2: https://lore.kernel.org/r/20250213-wip-mca-updates-v2-0-3636547fe05f@amd.com

Changes in v2:
- Add general cleanup pre-patches.
- Add changes for BSP-only init.
- Add interrupt storm handling for AMD.
- Link to v1: https://lore.kernel.org/r/20240523155641.2805411-1-yazen.ghannam@amd.com

---
Smita Koralahalli (1):
      x86/mce: Handle AMD threshold interrupt storms

Yazen Ghannam (7):
      x86/mce: Unify AMD THR handler with MCA Polling
      x86/mce: Unify AMD DFR handler with MCA Polling
      x86/mce/amd: Enable interrupt vectors once per-CPU on SMCA systems
      x86/mce/amd: Support SMCA Corrected Error Interrupt
      x86/mce/amd: Remove redundant reset_block()
      x86/mce/amd: Define threshold restart function for banks
      x86/mce: Save and use APEI corrected threshold limit

 arch/x86/include/asm/mce.h          |  13 ++
 arch/x86/kernel/acpi/apei.c         |   2 +
 arch/x86/kernel/cpu/mce/amd.c       | 340 ++++++++++++++----------------------
 arch/x86/kernel/cpu/mce/core.c      |  51 +++++-
 arch/x86/kernel/cpu/mce/internal.h  |   4 +
 arch/x86/kernel/cpu/mce/threshold.c |  19 +-
 6 files changed, 216 insertions(+), 213 deletions(-)
---
base-commit: 5c6f123c419b6e20f84ac1683089a52f449273aa
change-id: 20250210-wip-mca-updates-bed2a67c9c57



Thread overview (23+ messages, newest 2025-11-02 12:32 UTC):
2025-10-16 16:37 [PATCH v7 0/8] AMD MCA interrupts rework Yazen Ghannam
2025-10-16 16:37 ` [PATCH v7 1/8] x86/mce: Unify AMD THR handler with MCA Polling Yazen Ghannam
2025-10-16 16:37 ` [PATCH v7 2/8] x86/mce: Unify AMD DFR " Yazen Ghannam
2025-10-24 15:03   ` Borislav Petkov
2025-10-24 20:30     ` Yazen Ghannam
2025-10-24 21:27       ` Borislav Petkov
2025-10-25 15:03         ` Borislav Petkov
2025-10-27 13:35           ` Yazen Ghannam
2025-10-27 14:11             ` Yazen Ghannam
2025-10-28 15:22               ` Borislav Petkov
2025-10-28 15:42                 ` Yazen Ghannam
2025-10-28 17:46                   ` Borislav Petkov
2025-10-28 20:37                     ` Yazen Ghannam
2025-10-28 23:18                       ` Borislav Petkov
2025-10-29 15:09                         ` Yazen Ghannam
2025-10-29 16:02                           ` Borislav Petkov
2025-10-16 16:37 ` [PATCH v7 3/8] x86/mce/amd: Enable interrupt vectors once per-CPU on SMCA systems Yazen Ghannam
2025-10-16 16:37 ` [PATCH v7 4/8] x86/mce/amd: Support SMCA Corrected Error Interrupt Yazen Ghannam
2025-10-16 16:37 ` [PATCH v7 5/8] x86/mce/amd: Remove redundant reset_block() Yazen Ghannam
2025-10-16 16:37 ` [PATCH v7 6/8] x86/mce/amd: Define threshold restart function for banks Yazen Ghannam
2025-10-16 16:37 ` [PATCH v7 7/8] x86/mce: Handle AMD threshold interrupt storms Yazen Ghannam
2025-10-16 16:37 ` [PATCH v7 8/8] x86/mce: Save and use APEI corrected threshold limit Yazen Ghannam
2025-11-02 12:32   ` Borislav Petkov