All of lore.kernel.org
 help / color / mirror / Atom feed
From: Breno Leitao <leitao@debian.org>
To: Miaohe Lin <linmiaohe@huawei.com>,
	 Andrew Morton <akpm@linux-foundation.org>,
	 David Hildenbrand <david@kernel.org>,
	Lorenzo Stoakes <ljs@kernel.org>,
	 Vlastimil Babka <vbabka@kernel.org>,
	Mike Rapoport <rppt@kernel.org>,
	 Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>,  Shuah Khan <shuah@kernel.org>,
	Naoya Horiguchi <nao.horiguchi@gmail.com>,
	 Steven Rostedt <rostedt@goodmis.org>,
	 Masami Hiramatsu <mhiramat@kernel.org>,
	 Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	 Jonathan Corbet <corbet@lwn.net>,
	Shuah Khan <skhan@linuxfoundation.org>,
	 "Liam R. Howlett" <liam@infradead.org>,
	 "Liam R. Howlett" <liam@infradead.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	 linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org,
	 Breno Leitao <leitao@debian.org>,
	linux-trace-kernel@vger.kernel.org,  kernel-team@meta.com
Subject: [PATCH v7 6/6] Documentation: document panic_on_unrecoverable_memory_failure sysctl
Date: Wed, 13 May 2026 08:39:37 -0700	[thread overview]
Message-ID: <20260513-ecc_panic-v7-6-be2e578e61da@debian.org> (raw)
In-Reply-To: <20260513-ecc_panic-v7-0-be2e578e61da@debian.org>

Add documentation for the new vm.panic_on_unrecoverable_memory_failure
sysctl, describing which failures trigger a panic (kernel-owned pages
the handler cannot recover) and which are intentionally left out
(transient allocator races and unclassified pages).

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 Documentation/admin-guide/sysctl/vm.rst | 80 +++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)

diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index 97e12359775c9..452c2ab25b35e 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -67,6 +67,7 @@ Currently, these files are in /proc/sys/vm:
 - page-cluster
 - page_lock_unfairness
 - panic_on_oom
+- panic_on_unrecoverable_memory_failure
 - percpu_pagelist_high_fraction
 - stat_interval
 - stat_refresh
@@ -925,6 +926,85 @@ panic_on_oom=2+kdump gives you very strong tool to investigate
 why oom happens. You can get snapshot.
 
 
+panic_on_unrecoverable_memory_failure
+======================================
+
+When a hardware memory error (e.g. multi-bit ECC) hits a kernel page
+that cannot be recovered by the memory failure handler, the default
+behaviour is to ignore the error and continue operation.  This is
+dangerous because the corrupted data remains accessible to the kernel,
+risking silent data corruption or a delayed crash when the poisoned
+memory is next accessed.
+
+When enabled, this sysctl triggers a panic on memory failure events
+hitting reserved (``PageReserved``) memory: firmware reservations,
+the kernel image, vDSO, the zero page, and similar memblock-reserved
+regions.  These are owned by the kernel, are not managed by the page
+allocator, and cannot be recovered by the memory failure handler.
+
+Other unrecoverable kernel-owned populations (slab, vmalloc, page
+tables, kernel stacks, ...) are not currently covered by this
+sysctl.  The handler cannot reliably distinguish them from a
+userspace folio temporarily off the LRU during migration or
+compaction, and the cost of a false-positive panic on a recoverable
+userspace page is too high.  Such pages still go through the
+standard MF_MSG_GET_HWPOISON path: ``PG_hwpoison`` is set on them
+and a delayed crash on the next access remains possible.  Coverage
+may grow in the future as the handler gains stronger
+kernel-ownership signals.
+
+Recoverable failure paths are also intentionally left out: in-flight
+buddy allocations and other transient races with the page allocator
+can reach the same diagnostic, and panicking on them would risk
+killing the box for a page destined for userspace where the standard
+SIGBUS recovery path applies.  Pages whose state could not be
+classified at all are not covered either, since an unknown state is
+not a sound basis for a panic decision.
+
+For many environments it is preferable to panic immediately with a clean
+crash dump that captures the original error context, rather than to
+continue and face a random crash later whose cause is difficult to
+diagnose.
+
+Use cases
+---------
+
+This option is most useful in environments where unattributed crashes
+are expensive to debug or where data integrity must take precedence
+over availability:
+
+* Large fleets, where multi-bit ECC errors on kernel pages are observed
+  regularly and post-mortem analysis of an unrelated downstream crash
+  (often seconds to minutes after the original error) consumes
+  significant engineering effort.
+
+* Systems configured with kdump, where panicking at the moment of the
+  hardware error produces a vmcore that still contains the faulting
+  address, the affected page state, and the originating MCE/GHES
+  record — context that is typically lost by the time a delayed crash
+  occurs.
+
+* High-availability clusters that rely on fast, deterministic node
+  failure for failover, and prefer an immediate panic over silent data
+  corruption propagating to replicas or persistent storage.
+
+* Kernel and platform developers reproducing hwpoison issues with
+  tools such as ``mce-inject`` or error-injection debugfs interfaces,
+  where panicking on the unrecoverable path makes regressions
+  immediately visible instead of surfacing as later, unrelated
+  failures.
+
+= =====================================================================
+0 Try to continue operation (default).
+1 Panic immediately.  If the ``panic`` sysctl is also non-zero then the
+  machine will be rebooted.
+= =====================================================================
+
+Example::
+
+     echo 1 > /proc/sys/vm/panic_on_unrecoverable_memory_failure
+
+
 percpu_pagelist_high_fraction
 =============================
 

-- 
2.53.0-Meta


      parent reply	other threads:[~2026-05-13 15:40 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-13 15:39 [PATCH v7 0/6] mm/memory-failure: add panic option for unrecoverable pages Breno Leitao
2026-05-13 15:39 ` [PATCH v7 1/6] mm/memory-failure: drop dead error_states[] entry for reserved pages Breno Leitao
2026-05-13 20:10   ` David Hildenbrand (Arm)
2026-05-14 10:55     ` Breno Leitao
2026-05-14  9:12   ` Lance Yang
2026-05-15  2:48   ` Miaohe Lin
2026-05-13 15:39 ` [PATCH v7 2/6] mm/memory-failure: surface unhandlable kernel pages as -ENOTRECOVERABLE Breno Leitao
2026-05-14 13:28   ` Lance Yang
2026-05-14 14:37     ` Breno Leitao
2026-05-15  7:03       ` Lance Yang
2026-05-15  3:04   ` Miaohe Lin
2026-05-13 15:39 ` [PATCH v7 3/6] mm/memory-failure: report MF_MSG_KERNEL for unrecoverable kernel pages Breno Leitao
2026-05-13 15:39 ` [PATCH v7 4/6] mm/memory-failure: short-circuit PG_reserved before get_hwpoison_page() Breno Leitao
2026-05-13 19:49   ` David Hildenbrand (Arm)
2026-05-14 11:06     ` Breno Leitao
2026-05-13 15:39 ` [PATCH v7 5/6] mm/memory-failure: add panic option for unrecoverable pages Breno Leitao
2026-05-13 15:39 ` Breno Leitao [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260513-ecc_panic-v7-6-be2e578e61da@debian.org \
    --to=leitao@debian.org \
    --cc=akpm@linux-foundation.org \
    --cc=corbet@lwn.net \
    --cc=david@kernel.org \
    --cc=kernel-team@meta.com \
    --cc=liam@infradead.org \
    --cc=linmiaohe@huawei.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=ljs@kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mhiramat@kernel.org \
    --cc=mhocko@suse.com \
    --cc=nao.horiguchi@gmail.com \
    --cc=rostedt@goodmis.org \
    --cc=rppt@kernel.org \
    --cc=shuah@kernel.org \
    --cc=skhan@linuxfoundation.org \
    --cc=surenb@google.com \
    --cc=vbabka@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.