public inbox for linux-kernel@vger.kernel.org
From: Mario Limonciello <superm1@kernel.org>
To: "Borislav Petkov" <bp@alien8.de>,
	"Jean Delvare" <jdelvare@suse.com>,
	"Andi Shyti" <andi.shyti@kernel.org>,
	"Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>,
	Mario Limonciello <mario.limonciello@amd.com>,
	Yazen Ghannam <yazen.ghannam@amd.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)),
	"H . Peter Anvin" <hpa@zytor.com>,
	Shyam Sundar S K <Shyam-sundar.S-k@amd.com>,
	Hans de Goede <hdegoede@redhat.com>,
	linux-doc@vger.kernel.org (open list:DOCUMENTATION),
	linux-kernel@vger.kernel.org (open list),
	linux-i2c@vger.kernel.org (open list:I2C/SMBUS CONTROLLER
	DRIVERS FOR PC),
	platform-driver-x86@vger.kernel.org (open list:AMD PMC DRIVER)
Subject: [PATCH v5 5/5] x86/CPU/AMD: Print the reason for the last reset
Date: Tue, 22 Apr 2025 18:48:30 -0500
Message-ID: <20250422234830.2840784-6-superm1@kernel.org>
In-Reply-To: <20250422234830.2840784-1-superm1@kernel.org>

From: Yazen Ghannam <yazen.ghannam@amd.com>

The following register contains bits that indicate the cause for the
previous reset.

        PMx000000C0 (FCH::PM::S5_RESET_STATUS)

This is useful for debugging. The reset reasons fall into six high-level
categories. Decode the register by category and print the result during
boot.

Specifics within a category are split off into debugging documentation.

The register is accessed indirectly through a "PM" port in the FCH. Use
MMIO access in order to avoid restrictions with legacy port access.

Use a late_initcall() to ensure that MMIO has been set up before trying
to access the register.

This register was introduced with AMD Family 17h, so avoid access on
older families. There is no CPUID feature bit for this register.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Co-developed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
v4:
 * Use loop that can output multiple reasons
 * Drop "Unknown" condition and have dedicated message
v3:
 * Align strings in the CSV and code.
 * Switch to an array of strings
 * Switch to looking up bit of first value
 * Re-order message to have number first (makes grepping easier)
 * Add x86/amd prefix to message
v2:
 * Add string for each reason, but still include value in case multiple
   values are set.
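
For readers who want to decode a raw S5_RESET_STATUS value outside the
kernel (for example, one copied from a bug report), the bit-to-string
lookup in this patch can be sketched in userspace C. This is only an
illustration and not part of the patch: decode_reset_status() and
print_reset_status() are hypothetical helper names, the strings are copied
from s5_reset_reason_txt[] in the diff, and the printed text merely
imitates the kernel message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define NR_RESET_BITS 32

/* Reason strings copied from the patch; undocumented bits stay NULL. */
static const char * const reset_reason_txt[NR_RESET_BITS] = {
	[0]  = "thermal pin BP_THERMTRIP_L was tripped",
	[1]  = "power button was pressed for 4 seconds",
	[2]  = "shutdown pin was shorted",
	[4]  = "remote ASF power off command was received",
	[9]  = "internal CPU thermal limit was tripped",
	[16] = "system reset pin BP_SYS_RST_L was tripped",
	[17] = "software issued PCI reset",
	[18] = "software wrote 0x4 to reset control register 0xCF9",
	[19] = "software wrote 0x6 to reset control register 0xCF9",
	[20] = "software wrote 0xE to reset control register 0xCF9",
	[21] = "ACPI power state transition occurred",
	[22] = "keyboard reset pin KB_RST_L was asserted",
	[23] = "internal CPU shutdown event occurred",
	[24] = "system failed to boot before failed boot timer expired",
	[25] = "hardware watchdog timer expired",
	[26] = "remote ASF reset command was received",
	[27] = "an uncorrected error caused a data fabric sync flood event",
	[29] = "FCH and MP1 failed warm reset handshake",
	[30] = "a parity error occurred",
	[31] = "a software sync flood event occurred",
};

/*
 * Hypothetical helper (not a kernel API): collect the reason string for
 * every set, documented bit of 'value' into 'out', lowest bit first.
 * Stores at most 'max' strings and returns the total number found.
 */
size_t decode_reset_status(uint32_t value, const char **out, size_t max)
{
	size_t n = 0;

	assert(out != NULL || max == 0);

	for (unsigned int bit = 0; bit < NR_RESET_BITS; bit++) {
		if (!(value & (UINT32_C(1) << bit)))
			continue;	/* bit not set */
		if (!reset_reason_txt[bit])
			continue;	/* set, but no documented reason */
		if (n < max)
			out[n] = reset_reason_txt[bit];
		n++;
	}

	return n;
}

/* Hypothetical helper: print one line per reason, imitating the patch. */
void print_reset_status(uint32_t value)
{
	const char *reasons[NR_RESET_BITS];
	size_t n = decode_reset_status(value, reasons, NR_RESET_BITS);

	if (n == 0) {
		printf("Previous system reset reason [0x%08x]: Unknown\n",
		       (unsigned int)value);
		return;
	}

	for (size_t i = 0; i < n; i++)
		printf("Previous system reset reason [0x%08x]: %s\n",
		       (unsigned int)value, reasons[i]);
}
```

With the made-up value 0x00080800, print_reset_status() reports only
bit 19 ("software wrote 0x6 to reset control register 0xCF9"). Bit 11 is
also set in that value but has no documented reason, so it is skipped.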
---
 Documentation/arch/x86/amd-debugging.rst | 41 +++++++++++++++
 arch/x86/include/asm/amd/fch.h           |  1 +
 arch/x86/kernel/cpu/amd.c                | 64 ++++++++++++++++++++++++
 3 files changed, 106 insertions(+)

diff --git a/Documentation/arch/x86/amd-debugging.rst b/Documentation/arch/x86/amd-debugging.rst
index 01427cf97ee33..32a3f99409c7a 100644
--- a/Documentation/arch/x86/amd-debugging.rst
+++ b/Documentation/arch/x86/amd-debugging.rst
@@ -319,3 +319,44 @@ messages.  To help with this, a tool has been created at
 `amd-debug-tools <https://git.kernel.org/pub/scm/linux/kernel/git/superm1/amd-debug-tools.git/about/>`_
 to help parse the messages.
 
+Random reboot issues
+====================
+When a random reboot occurs, the high-level reason for the reboot is stored
+in a register that will persist onto the next boot.
+
+There are 6 classes of reasons for the reboot:
+ * Software induced
+ * Power state transition
+ * Pin induced
+ * Hardware induced
+ * Remote reset
+ * Internal CPU event
+
+.. csv-table::
+   :header: "Bit", "Type", "Reason"
+   :align: left
+
+   "0",  "Pin",      "thermal pin BP_THERMTRIP_L was tripped"
+   "1",  "Pin",      "power button was pressed for 4 seconds"
+   "2",  "Pin",      "shutdown pin was shorted"
+   "4",  "Remote",   "remote ASF power off command was received"
+   "9",  "Internal", "internal CPU thermal limit was tripped"
+   "16", "Pin",      "system reset pin BP_SYS_RST_L was tripped"
+   "17", "Software", "software issued PCI reset"
+   "18", "Software", "software wrote 0x4 to reset control register 0xCF9"
+   "19", "Software", "software wrote 0x6 to reset control register 0xCF9"
+   "20", "Software", "software wrote 0xE to reset control register 0xCF9"
+   "21", "Sleep",    "ACPI power state transition occurred"
+   "22", "Pin",      "keyboard reset pin KB_RST_L was asserted"
+   "23", "Internal", "internal CPU shutdown event occurred"
+   "24", "Hardware", "system failed to boot before failed boot timer expired"
+   "25", "Hardware", "hardware watchdog timer expired"
+   "26", "Remote",   "remote ASF reset command was received"
+   "27", "Internal", "an uncorrected error caused a data fabric sync flood event"
+   "29", "Internal", "FCH and MP1 failed warm reset handshake"
+   "30", "Internal", "a parity error occurred"
+   "31", "Internal", "a software sync flood event occurred"
+
+The kernel reads this register at boot and prints the decoded reasons to the
+kernel ring buffer. After a random reboot, this message can help determine
+which component to debug next.
diff --git a/arch/x86/include/asm/amd/fch.h b/arch/x86/include/asm/amd/fch.h
index 9b32e8a03193e..4a6e1e3b685a4 100644
--- a/arch/x86/include/asm/amd/fch.h
+++ b/arch/x86/include/asm/amd/fch.h
@@ -9,5 +9,6 @@
 #define FCH_PM_DECODEEN			0x00
 #define FCH_PM_DECODEEN_SMBUS0SEL	GENMASK(20, 19)
 #define FCH_PM_SCRATCH			0x80
+#define FCH_PM_S5_RESET_STATUS		0xC0
 
 #endif
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 1f7925e45b46d..aed82b9ccf8ce 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -9,6 +9,7 @@
 #include <linux/sched/clock.h>
 #include <linux/random.h>
 #include <linux/topology.h>
+#include <asm/amd/fch.h>
 #include <asm/processor.h>
 #include <asm/apic.h>
 #include <asm/cacheinfo.h>
@@ -1237,3 +1238,66 @@ void amd_check_microcode(void)
 	if (cpu_feature_enabled(X86_FEATURE_ZEN2))
 		on_each_cpu(zenbleed_check_cpu, NULL, 1);
 }
+
+static const char * const s5_reset_reason_txt[] = {
+	[0] = "thermal pin BP_THERMTRIP_L was tripped",
+	[1] = "power button was pressed for 4 seconds",
+	[2] = "shutdown pin was shorted",
+	[4] = "remote ASF power off command was received",
+	[9] = "internal CPU thermal limit was tripped",
+	[16] = "system reset pin BP_SYS_RST_L was tripped",
+	[17] = "software issued PCI reset",
+	[18] = "software wrote 0x4 to reset control register 0xCF9",
+	[19] = "software wrote 0x6 to reset control register 0xCF9",
+	[20] = "software wrote 0xE to reset control register 0xCF9",
+	[21] = "ACPI power state transition occurred",
+	[22] = "keyboard reset pin KB_RST_L was asserted",
+	[23] = "internal CPU shutdown event occurred",
+	[24] = "system failed to boot before failed boot timer expired",
+	[25] = "hardware watchdog timer expired",
+	[26] = "remote ASF reset command was received",
+	[27] = "an uncorrected error caused a data fabric sync flood event",
+	[29] = "FCH and MP1 failed warm reset handshake",
+	[30] = "a parity error occurred",
+	[31] = "a software sync flood event occurred",
+};
+
+static __init int print_s5_reset_status_mmio(void)
+{
+	void __iomem *addr;
+	unsigned long value;
+	int nr_reasons = 0;
+	int bit = -1;
+
+	if (!cpu_feature_enabled(X86_FEATURE_ZEN))
+		return 0;
+
+	addr = ioremap(FCH_PM_BASE + FCH_PM_S5_RESET_STATUS, sizeof(value));
+	if (!addr)
+		return 0;
+
+	value = ioread32(addr);
+	iounmap(addr);
+
+	/* Iterate on each bit in the 'value' mask: */
+	while (true) {
+		bit = find_next_bit(&value, BITS_PER_LONG, bit + 1);
+
+		/* Reached the end of the word, no more bits: */
+		if (bit >= BITS_PER_LONG) {
+			if (!nr_reasons)
+				pr_info("x86/amd: Previous system reset reason [0x%08lx]: Unknown\n", value);
+			break;
+		}
+
+		if (!s5_reset_reason_txt[bit])
+			continue;
+
+		nr_reasons++;
+		pr_info("x86/amd: Previous system reset reason [0x%08lx]: %s\n",
+			value, s5_reset_reason_txt[bit]);
+	}
+
+	return 0;
+}
+late_initcall(print_s5_reset_status_mmio);
-- 
2.43.0

