From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Borislav Petkov, Feng Tang, Ira Weiny, "Rafael J . Wysocki",
 Sasha Levin, rafael@kernel.org, dave.jiang@intel.com,
 Jonathan.Cameron@huawei.com, dan.j.williams@intel.com,
 u.kleine-koenig@baylibre.com, peterz@infradead.org,
 linux-acpi@vger.kernel.org
Subject: [PATCH AUTOSEL 6.12 18/29] APEI: GHES: Have GHES honor the panic= setting
Date: Sun, 26 Jan 2025 10:01:59 -0500
Message-Id: <20250126150210.955385-18-sashal@kernel.org>
X-Mailer: git-send-email 2.39.5
In-Reply-To: <20250126150210.955385-1-sashal@kernel.org>
References: <20250126150210.955385-1-sashal@kernel.org>
Precedence: bulk
X-Mailing-List: linux-acpi@vger.kernel.org
MIME-Version: 1.0
X-stable: review
X-Patchwork-Hint: Ignore
X-stable-base: Linux 6.12.11
Content-Transfer-Encoding: 8bit

From: Borislav Petkov

[ Upstream commit 5c0e00a391dd0099fe95991bb2f962848d851916 ]

The GHES driver overrides the panic= setting by force-rebooting the
system after a fatal hw error has been reported. The intent being that
such an error would be reported earlier.

However, this is not optimal when a hard-to-debug issue requires long
time to reproduce and when that happens, the box will get rebooted after
30 seconds and thus destroy the whole hw context of when the error
happened.

So rip out the default GHES panic timeout and honor the global one.

In the panic disabled (panic=0) case, the error will still be logged to
dmesg for later inspection and if panic after a hw error is really
required, then that can be controlled the usual way - use panic= on the
cmdline or set it in the kernel .config's CONFIG_PANIC_TIMEOUT.

Reported-by: Feng Tang
Signed-off-by: Borislav Petkov (AMD)
Reviewed-by: Feng Tang
Reviewed-by: Ira Weiny
Link: https://patch.msgid.link/20250113125224.GFZ4UMiNtWIJvgpveU@fat_crate.local
Signed-off-by: Rafael J. Wysocki
Signed-off-by: Sasha Levin
---
 drivers/acpi/apei/ghes.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index ada93cfde9ba1..cff6685fa6cc6 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -173,8 +173,6 @@ static struct gen_pool *ghes_estatus_pool;
 static struct ghes_estatus_cache __rcu *ghes_estatus_caches[GHES_ESTATUS_CACHES_SIZE];
 static atomic_t ghes_estatus_cache_alloced;
 
-static int ghes_panic_timeout __read_mostly = 30;
-
 static void __iomem *ghes_map(u64 pfn, enum fixed_addresses fixmap_idx)
 {
 	phys_addr_t paddr;
@@ -983,14 +981,16 @@ static void __ghes_panic(struct ghes *ghes,
 			 struct acpi_hest_generic_status *estatus,
 			 u64 buf_paddr, enum fixed_addresses fixmap_idx)
 {
+	const char *msg = GHES_PFX "Fatal hardware error";
+
 	__ghes_print_estatus(KERN_EMERG, ghes->generic, estatus);
 
 	ghes_clear_estatus(ghes, estatus, buf_paddr, fixmap_idx);
 
-	/* reboot to log the error! */
 	if (!panic_timeout)
-		panic_timeout = ghes_panic_timeout;
-	panic("Fatal hardware error!");
+		pr_emerg("%s but panic disabled\n", msg);
+
+	panic(msg);
 }
 
 static int ghes_proc(struct ghes *ghes)
-- 
2.39.5