From: Miaohe Lin <linmiaohe@huawei.com>
To: Breno Leitao <leitao@debian.org>
Cc: <linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>,
<linux-doc@vger.kernel.org>, <kernel-team@meta.com>,
Naoya Horiguchi <nao.horiguchi@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Jonathan Corbet <corbet@lwn.net>,
Shuah Khan <skhan@linuxfoundation.org>,
David Hildenbrand <david@kernel.org>,
Lorenzo Stoakes <ljs@kernel.org>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@kernel.org>,
Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>
Subject: Re: [PATCH v4 3/3] Documentation: document panic_on_unrecoverable_memory_failure sysctl
Date: Wed, 22 Apr 2026 11:43:16 +0800
Message-ID: <7b4a6659-e2e5-5e63-2952-c7a840ffcdec@huawei.com>
In-Reply-To: <20260415-ecc_panic-v4-3-2d0277f8f601@debian.org>
On 2026/4/15 20:55, Breno Leitao wrote:
> Add documentation for the new vm.panic_on_unrecoverable_memory_failure
> sysctl, describing the three categories of failures that trigger a
> panic and noting which kernel page types are not yet covered.
>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
> Documentation/admin-guide/sysctl/vm.rst | 37 +++++++++++++++++++++++++++++++++
> 1 file changed, 37 insertions(+)
>
> diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
> index 97e12359775c9..592ce9ec38c4b 100644
> --- a/Documentation/admin-guide/sysctl/vm.rst
> +++ b/Documentation/admin-guide/sysctl/vm.rst
> @@ -67,6 +67,7 @@ Currently, these files are in /proc/sys/vm:
> - page-cluster
> - page_lock_unfairness
> - panic_on_oom
> +- panic_on_unrecoverable_memory_failure
> - percpu_pagelist_high_fraction
> - stat_interval
> - stat_refresh
> @@ -925,6 +926,42 @@ panic_on_oom=2+kdump gives you very strong tool to investigate
> why oom happens. You can get snapshot.
>
>
> +panic_on_unrecoverable_memory_failure
> +======================================
> +
> +When a hardware memory error (e.g. multi-bit ECC) hits a kernel page
> +that cannot be recovered by the memory failure handler, the default
> +behaviour is to ignore the error and continue operation. This is
> +dangerous because the corrupted data remains accessible to the kernel,
> +risking silent data corruption or a delayed crash when the poisoned
> +memory is next accessed.
> +
> +When enabled, this sysctl triggers a panic on three categories of
> +unrecoverable failures: reserved kernel pages, non-buddy kernel pages
> +with zero refcount (e.g. tail pages of high-order allocations), and
> +pages whose state cannot be classified as recoverable.
> +
> +Note that some kernel page types — such as slab objects, vmalloc
> +allocations, kernel stacks, and page tables — share a failure path
> +with transient refcount races and are not currently covered by this
> +option. I.e, do not panic when not confident of the page status.
> +
> +For many environments it is preferable to panic immediately with a clean
> +crash dump that captures the original error context, rather than to
> +continue and face a random crash later whose cause is difficult to
> +diagnose.
Should we add some useful examples that show real-world usage scenarios?
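For instance, one scenario might be a fleet that prefers a prompt crash dump
over running on with poisoned kernel memory. A rough sketch, not tested against
this series: enable_mf_panic and its arguments are made-up names for
illustration only, the vm sysctl path is the one proposed in this patch, and
kernel.panic is the existing reboot-timeout sysctl.

```shell
# Hypothetical boot-time helper (name and arguments are illustrative only).
# $1: sysctl root, defaults to /proc/sys
# $2: value for kernel.panic (seconds before reboot after a panic); optional
enable_mf_panic() {
    root="${1:-/proc/sys}"
    reboot_secs="${2:-}"
    knob="$root/vm/panic_on_unrecoverable_memory_failure"

    # Kernels without this patch will not expose the file.
    if [ ! -w "$knob" ]; then
        echo "sysctl not available: $knob" >&2
        return 1
    fi

    # Panic instead of continuing after an unrecoverable kernel page error.
    echo 1 > "$knob" || return 1

    # Optionally reboot after the panic, as the table in the patch describes.
    if [ -n "$reboot_secs" ]; then
        echo "$reboot_secs" > "$root/kernel/panic" || return 1
    fi
}
```

This would presumably run as root early in boot, e.g.
"enable_mf_panic /proc/sys 10", on machines where kdump is already set up.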
Thanks.
> +
> += =====================================================================
> +0 Try to continue operation (default).
> +1 Panic immediately. If the ``panic`` sysctl is also non-zero then the
> + machine will be rebooted.
> += =====================================================================
> +
> +Example::
> +
> + echo 1 > /proc/sys/vm/panic_on_unrecoverable_memory_failure
> +
> +
> percpu_pagelist_high_fraction
> =============================
>
>