From: Breno Leitao <leitao@debian.org>
To: Miaohe Lin <linmiaohe@huawei.com>
Cc: "David Hildenbrand (Arm)" <david@kernel.org>,
	linux-mm@kvack.org,  linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org,
	 linux-trace-kernel@vger.kernel.org, kernel-team@meta.com,
	Lance Yang <lance.yang@linux.dev>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Lorenzo Stoakes <ljs@kernel.org>,
	 Vlastimil Babka <vbabka@kernel.org>,
	Mike Rapoport <rppt@kernel.org>,
	 Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>, Shuah Khan <shuah@kernel.org>,
	 Naoya Horiguchi <nao.horiguchi@gmail.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	 Masami Hiramatsu <mhiramat@kernel.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	 Jonathan Corbet <corbet@lwn.net>,
	Shuah Khan <skhan@linuxfoundation.org>,
	 "Liam R. Howlett" <liam@infradead.org>
Subject: Re: [PATCH v8 2/6] mm/memory-failure: surface unhandlable kernel pages as -ENOTRECOVERABLE
Date: Fri, 5 Jun 2026 02:35:23 -0700
Message-ID: <aiKXrovzrNN-gExm@gmail.com>
In-Reply-To: <4b27467e-935f-5587-2f48-5a794c30a592@huawei.com>

On Wed, Jun 03, 2026 at 10:33:04AM +0800, Miaohe Lin wrote:
> On 2026/6/2 17:41, David Hildenbrand (Arm) wrote:
> > On 6/2/26 05:08, Miaohe Lin wrote:
> >> On 2026/6/1 21:22, David Hildenbrand (Arm) wrote:
> >>> On 6/1/26 14:28, Miaohe Lin wrote:
> >>>>
> >>>> Thanks for your patch.
> >>>>
> >>>>
> >>>> Once shake_page finds a lightweight range-based way to shrink slab, slab pages could be freed
> >>>> into buddy and above PageSlab test should be removed then. Maybe add a TODO or XXX here?
> >>>>
> >>>>
> >>>> I'm not sure but is it safe or a common way to test PageReserved, PageSlab,
> >>>> PageTable and PageLargeKmalloc without extra page refcnt?
> >>>
> >>> Checking typed pages in a racy fashion is fine (PageSlab, PageTable,
> >>> PageLargeKmalloc).
> >>
> >> Got it. Thanks.
> >>
> >>> Checking PageReserved in a racy fashion is fine as well. TESTPAGEFLAG() will
> >>> allow checking it on compound pages.
> >>
> >> It seems PageReserved is not intended to be set on compound pages. I see there is a PF_NO_COMPOUND
> >> policy in its definition: PAGEFLAG(Reserved, reserved, PF_NO_COMPOUND).
> >>
> >>>
> >>> For PageLargeKmalloc, we would want to check the head page, though. The page
> >>> type is only stored for the head page.
> >>
> >> Maybe we should check the head page for PageSlab and PageTable too? alloc_slab_page only
> >> sets PageSlab on the head page, and __pagetable_ctor uses __folio_set_pgtable to set PageTable
> >> on the folio.
> >>
> >>>
> >>> So maybe we want to lookup the compound head (if any) and perform the type
> >>> checks against that?
> >>
> >> Maybe we should or we might miss some pages that could have been handled. And
> >> if compound head is required, should we hold an extra page refcnt to guard against
> >> possible folio split race?
> > 
> > Races are fine. We might miss some pages, but that can happen on races either way.
> > 
> > 
> > I'd just do something like
> > 
> > if (PageReserved(page))
> > 	return true;
> > 
> > head = compound_head(page);
> 
> Suppose @head is split just after compound_head(), and @head is then freed into buddy and re-allocated
> as a slab page while @page is still in the buddy. We would panic in this scenario because @head is
> PageSlab, but we were supposed to handle @page successfully. Or am I missing something?

You're right that it is racy, but I think it is an acceptable race here.

For it to happen, the poisoned @page has to be a tail of a live compound page
at the time of the fault, and then -- in the few instructions between
compound_head() and the PageSlab(head) test -- that compound page has to be
split, the old head freed to buddy, and that head re-allocated as a slab page,
all while @page lands back in the buddy.  It cannot happen without concurrent
split/free/alloc activity in that exact window.
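
To make the window concrete, here is a minimal userspace model of the check
being discussed. It is only a sketch and not the code from the patch: struct
page_model, enum page_kind and every model_*() helper are hypothetical
stand-ins for struct page, the page flags/types and compound_head(). The point
it illustrates is that the type check runs against a head pointer sampled
earlier, so a split in between can leave that pointer stale.

```c
/* Userspace model only. All names below are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum page_kind {
	KIND_BUDDY,		/* free page sitting in the buddy allocator */
	KIND_USER,		/* ordinary, recoverable page */
	KIND_RESERVED,
	KIND_SLAB,
	KIND_TABLE,
	KIND_LARGE_KMALLOC,
};

struct page_model {
	enum page_kind kind;
	struct page_model *head;	/* NULL unless this is a tail page */
};

/* Stand-in for compound_head(): a tail resolves to its head, else itself. */
static struct page_model *model_compound_head(struct page_model *page)
{
	assert(page != NULL);
	return page->head ? page->head : page;
}

/* The type checks, performed on whatever head pointer was sampled earlier. */
static bool model_type_check(const struct page_model *head)
{
	return head->kind == KIND_SLAB ||
	       head->kind == KIND_TABLE ||
	       head->kind == KIND_LARGE_KMALLOC;
}

/* Reserved is tested on the page itself, the typed checks on the head. */
static bool model_is_unhandlable(struct page_model *page)
{
	struct page_model *head;

	if (page->kind == KIND_RESERVED)
		return true;

	head = model_compound_head(page);
	/* The race window is here: nothing pins head before the check below. */
	return model_type_check(head);
}
```

In this model the race is reproduced by sampling the head with
model_compound_head(), then clearing the tail's head pointer (the split),
marking the tail KIND_BUDDY and the old head KIND_SLAB (free plus
re-allocation), and only then calling model_type_check() on the sampled
pointer. That call reports "unhandlable" even though a fresh
model_is_unhandlable() on the tail would not.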

It is also worth noting the page in question genuinely took an unrecoverable ECC
error, and panic_on_unrecoverable_memory_failure is opt-in -- an operator who
enables it has explicitly chosen to crash rather than risk running on corrupted
memory.  Mis-attributing one such rare, genuinely-poisoned page as
unrecoverable is within that contract.

Thanks for the review and discussions,
--breno

Thread overview: 19+ messages
2026-05-27 14:06 [PATCH v8 0/6] mm/memory-failure: add panic option for unrecoverable pages Breno Leitao
2026-05-27 14:06 ` [PATCH v8 1/6] mm/memory-failure: drop dead error_states[] entry for reserved pages Breno Leitao
2026-05-27 14:06 ` [PATCH v8 2/6] mm/memory-failure: surface unhandlable kernel pages as -ENOTRECOVERABLE Breno Leitao
2026-06-01 12:28   ` Miaohe Lin
2026-06-01 13:22     ` David Hildenbrand (Arm)
2026-06-02  3:08       ` Miaohe Lin
2026-06-02  9:41         ` David Hildenbrand (Arm)
2026-06-03  2:33           ` Miaohe Lin
2026-06-05  9:35             ` Breno Leitao [this message]
2026-06-05  9:42               ` David Hildenbrand (Arm)
2026-05-27 14:06 ` [PATCH v8 3/6] mm/memory-failure: report MF_MSG_KERNEL for unrecoverable kernel pages Breno Leitao
2026-06-01 13:24   ` David Hildenbrand (Arm)
2026-06-02  3:31   ` Miaohe Lin
2026-05-27 14:06 ` [PATCH v8 4/6] mm/memory-failure: add panic option for unrecoverable pages Breno Leitao
2026-06-02  7:05   ` Miaohe Lin
2026-06-05  9:37     ` Breno Leitao
2026-05-27 14:06 ` [PATCH v8 5/6] Documentation: document panic_on_unrecoverable_memory_failure sysctl Breno Leitao
2026-05-27 14:06 ` [PATCH v8 6/6] selftests/mm: add hwpoison-panic destructive test Breno Leitao
2026-05-27 19:39 ` [PATCH v8 0/6] mm/memory-failure: add panic option for unrecoverable pages Andrew Morton