From: Breno Leitao <leitao@debian.org>
To: Miaohe Lin <linmiaohe@huawei.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linux-doc@vger.kernel.org, kernel-team@meta.com,
Naoya Horiguchi <nao.horiguchi@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Jonathan Corbet <corbet@lwn.net>,
Shuah Khan <skhan@linuxfoundation.org>,
David Hildenbrand <david@kernel.org>,
Lorenzo Stoakes <ljs@kernel.org>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@kernel.org>,
Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>
Subject: Re: [PATCH v4 2/3] mm/memory-failure: add panic option for unrecoverable pages
Date: Mon, 27 Apr 2026 07:49:34 -0700
Message-ID: <ae93PTScNGzxo7aR@gmail.com>
In-Reply-To: <5e05384e-740e-b374-2370-01f96d1dac9f@huawei.com>
On Mon, Apr 27, 2026 at 10:44:55AM +0800, Miaohe Lin wrote:
> On 2026/4/24 20:01, Breno Leitao wrote:
> > On Thu, Apr 23, 2026 at 10:38:19AM +0800, Miaohe Lin wrote:
> >>> are you suggesting I drop MF_MSG_KERNEL_HIGH_ORDER from here, or, document this
> >>> will not hit userspace pages?
> >>
> >> No, maybe we should rule out or document above rare case if I'm not miss something.
> >
> > Good catch. A buddy page being concurrently allocated to userspace can
> > briefly satisfy get_hwpoison_page() == 0 && !is_free_buddy_page(), and
> > that page is recoverable via the standard SIGBUS path — panicking on
> > it would be wrong.
> >
> > The page allocator can't filter it out either.
> >
> > check_new_pages() is gated by is_check_pages_enabled() and is a no-op
> > when CONFIG_DEBUG_VM=n.
> >
> > For v6 I'll try to rule out the race inside panic_on_unrecoverable_mf() so
> > action_result() stays unchanged:
> >
> >     case MF_MSG_KERNEL_HIGH_ORDER:
> >             p = pfn_to_online_page(pfn);
> >             if (!p)
> >                     return true;
> >             cpu_relax();
> >             return page_count(p) == 0 &&
> >                    !PageLRU(p) &&
> >                    !page_mapped(p) &&
> >                    !page_folio(p)->mapping &&
> >                    !is_free_buddy_page(p);
> >
> >
> > A buddy page being allocated must transit rmqueue() → prep_new_page() →
> > post_alloc_hook() before the caller can use it. Each step either bumps
> > _refcount or sets state we can observe (PageLRU, ->mapping). cpu_relax()
> > lets that remote-CPU progress become visible before we resample.
> >
> > A genuine non-buddy high-order kernel tail page stays unowned across the
> > recheck, so the panic still fires on the case this series targets.
> >
> > The window is much narrowed now, not eliminated — I'll say so in the changelog.
> >
> > I also added a selftest that enables the sysctl, injects MADV_HWPOISON
> > on a userspace anon page in a forked child, and asserts SIGBUS (not a
> > panic). I've been running this in a loop for hours, and I haven't seen any
> > false positive.
>
> The userspace anon pages are already allocated. Those pages are in a stable state.
> So your selftest cannot test above window. Or am I miss something?

You're right, the test doesn't directly hit the race window. By the time
madvise(MADV_HWPOISON) runs, the page is fully owned by the process and goes
through the steady-state SIGBUS path; the buddy→user transition that the
recheck guards against is already over.

What the test actually proves is the negative: the recheck didn't break the
common, non-racing path. That is, a normal recoverable userspace page still
ends in SIGBUS for the owning process rather than a panic. It's a smoke test
against gross regressions of the recheck logic, not a reproducer of the
original race.

Reproducing the race from userspace is hard because the window is
microseconds wide inside the allocator.

If you think this test doesn't bring any value, I am happy to just drop it.
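
For reference, the steady-state check described above can be sketched roughly
as follows. This is a minimal illustration, not the selftest from the series:
the names run_hwpoison_sigbus_check(), poison_and_touch() and the CHILD_* exit
codes are made up here, it does not toggle the sysctl, and it reports a skip
when MADV_HWPOISON is refused (for example without CAP_SYS_ADMIN or without
CONFIG_MEMORY_FAILURE).

```c
/*
 * Hypothetical smoke check: poisoning an already-mapped anon page should
 * end with the child being killed by SIGBUS. All names are illustrative.
 */
#define _GNU_SOURCE
#include <signal.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MADV_HWPOISON
#define MADV_HWPOISON 100	/* value from the Linux UAPI headers */
#endif

enum check_result { CHECK_PASS, CHECK_SKIP, CHECK_FAIL };

/* Child exit codes used to report back to the parent (assumed values). */
#define CHILD_SKIP	77	/* injection not permitted or not supported */
#define CHILD_NO_SIGNAL	1	/* access after poisoning did not fault */

static void poison_and_touch(void)
{
	long psz = sysconf(_SC_PAGESIZE);
	volatile char *p = mmap(NULL, psz, PROT_READ | PROT_WRITE,
				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		_exit(CHILD_SKIP);

	p[0] = 1;	/* fault the page in so this task fully owns it */

	/* Fails with an error when injection is unavailable: treat as skip. */
	if (madvise((void *)p, psz, MADV_HWPOISON) != 0)
		_exit(CHILD_SKIP);

	p[0] = 2;	/* expected to raise SIGBUS on the poisoned page */
	_exit(CHILD_NO_SIGNAL);
}

enum check_result run_hwpoison_sigbus_check(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return CHECK_SKIP;
	if (pid == 0)
		poison_and_touch();	/* never returns */

	if (waitpid(pid, &status, 0) != pid)
		return CHECK_FAIL;
	if (WIFSIGNALED(status) && WTERMSIG(status) == SIGBUS)
		return CHECK_PASS;
	if (WIFEXITED(status) && WEXITSTATUS(status) == CHILD_SKIP)
		return CHECK_SKIP;
	return CHECK_FAIL;
}
```

Because the page is faulted in before madvise() runs, this only exercises the
already-allocated path; it says nothing about the allocation race itself.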
Thread overview: 21+ messages
2026-04-15 12:54 [PATCH v4 0/3] mm/memory-failure: add panic option for unrecoverable pages Breno Leitao
2026-04-15 12:55 ` [PATCH v4 1/3] mm/memory-failure: report MF_MSG_KERNEL for reserved pages Breno Leitao
2026-04-22 2:50 ` Miaohe Lin
2026-04-15 12:55 ` [PATCH v4 2/3] mm/memory-failure: add panic option for unrecoverable pages Breno Leitao
2026-04-22 3:36 ` Miaohe Lin
2026-04-22 15:21 ` Breno Leitao
2026-04-23 2:38 ` Miaohe Lin
2026-04-24 12:01 ` Breno Leitao
2026-04-27 2:44 ` Miaohe Lin
2026-04-27 14:49 ` Breno Leitao [this message]
2026-04-28 2:12 ` Miaohe Lin
2026-04-15 12:55 ` [PATCH v4 3/3] Documentation: document panic_on_unrecoverable_memory_failure sysctl Breno Leitao
2026-04-22 3:43 ` Miaohe Lin
2026-04-22 15:23 ` Breno Leitao
2026-04-23 2:05 ` Miaohe Lin
2026-04-15 20:56 ` [PATCH v4 0/3] mm/memory-failure: add panic option for unrecoverable pages Jiaqi Yan
2026-04-16 15:32 ` Breno Leitao
2026-04-16 16:26 ` Jiaqi Yan
2026-04-17 9:10 ` Breno Leitao
2026-04-18 0:18 ` Jiaqi Yan
2026-04-22 2:49 ` Miaohe Lin