public inbox for linux-mm@kvack.org
From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Sasha Levin <sashal@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>,
	akpm@linux-foundation.org, corbet@lwn.net, ljs@kernel.org,
	Liam.Howlett@oracle.com, vbabka@kernel.org, rppt@kernel.org,
	surenb@google.com, mhocko@suse.com, skhan@linuxfoundation.org,
	jackmanb@google.com, hannes@cmpxchg.org, ziy@nvidia.com,
	linux-mm@kvack.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, Sasha Levin <sashal@nvidia.com>,
	Sanif Veeras <sveeras@nvidia.com>,
	"Claude:claude-opus-4-7" <noreply@anthropic.com>
Subject: Re: [RFC 4/7] mm: add page consistency checker implementation
Date: Mon, 27 Apr 2026 17:40:34 +0200
Message-ID: <c50aeb03-538b-4dca-9a81-ab84bc19a333@kernel.org>
In-Reply-To: <ae9ucgtCNf0JQtGu@laps>

>>>
>>> For something like a datacenter deployment I'd agree with you - the odds are
>>> too low to care. For an unsupervised self driving vehicle, where there's no
>>> human (locally or remotely) available to take over, I'd like the odds to be as
>>> low as possible :)
>>
>> I thought that people usually use special RT OSes (with proven logic etc) for
>> any safety-related systems. Using Linux on the core safety system sounds ...
>> scary.
> 
> RT OSes are indeed the current approach.
> 
> s/scary/exciting ;)

Not so exciting for the passengers ;)

> 
>> But I'd expect corruption of other data (user pages? page tables?) to be a much
>> bigger problem than corruption of page allocator metadata. What am I missing
>> that makes this, in the context of the bigger problems there, a thing we
>> particularly care about?
> 
> You are very correct! The allocator work was fairly standalone, so it was an
> easy first project to tackle.

But in general, wouldn't we just expect ECC memory to give us an MCE, so we can
detect what was corrupted and act accordingly?

That's how it usually works: hw detects a memory corruption and injects an MCE.
We detect that we corrupted memmap state and kill the kernel.

Why does ECC not help here?

> 
> In general, the approach depends on what we're trying to defend from:
> 
> 1. bugs: an ASI-like MMU enforced "context" system.
> 2. physics: just like in most other areas - lots of redundancy. For example,
> consider redundant variables in safety critical code which exists as two
> copies: var_v1 = value and var_v2 = value XOR mask. When accessing them, read
> both copies, XOR the second back, compare.
> 
> There were a few sessions back in LPC about this. Here's the one from Bryan
> Huntsman which gives a good overview:
> https://www.youtube.com/watch?v=ie_ClBCed94
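
For reference, the redundant-variable scheme quoted in item 2 above can be
sketched roughly as follows. This is only an illustration under stated
assumptions, not code from the RFC series: the type `redundant_u64`, the
helpers `redundant_write()` / `redundant_read()` and the mask value are all
hypothetical names picked for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mask; any fixed non-zero pattern works for this sketch. */
#define REDUNDANT_MASK 0xA5A5A5A5A5A5A5A5ULL

/* Hypothetical type: a value kept as two copies, the second XOR-masked. */
struct redundant_u64 {
	uint64_t v1;	/* plain copy:  value */
	uint64_t v2;	/* masked copy: value ^ REDUNDANT_MASK */
};

/* Store both copies of the value. */
static void redundant_write(struct redundant_u64 *r, uint64_t value)
{
	r->v1 = value;
	r->v2 = value ^ REDUNDANT_MASK;
}

/*
 * Read both copies, XOR the second back, compare.
 * Returns true and stores the value in *out when the copies agree;
 * returns false and leaves *out untouched when they disagree.
 */
static bool redundant_read(const struct redundant_u64 *r, uint64_t *out)
{
	uint64_t a = r->v1;
	uint64_t b = r->v2 ^ REDUNDANT_MASK;

	if (a != b)
		return false;	/* copies disagree: corruption detected */

	*out = a;
	return true;
}
```

With a non-zero mask, a fault that hits both words the same way (for example,
both being zeroed) still shows up as a mismatch, which two identical copies
would not catch. Real kernel code would additionally have to keep the compiler
from merging the two loads; that is left out of this sketch.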

Thanks, but I fundamentally don't understand how RAS capabilities interact here?
We have mm/memory-failure.c for a reason :)

-- 
Cheers,

David


