From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: David Hildenbrand <david@redhat.com>
Cc: Kalesh Singh <kaleshsingh@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Suren Baghdasaryan <surenb@google.com>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Matthew Wilcox <willy@infradead.org>,
Vlastimil Babka <vbabka@suse.cz>,
"Paul E . McKenney" <paulmck@kernel.org>,
Jann Horn <jannh@google.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Shuah Khan <shuah@kernel.org>,
linux-kselftest@vger.kernel.org, linux-api@vger.kernel.org,
John Hubbard <jhubbard@nvidia.com>,
Juan Yescas <jyescas@google.com>
Subject: Re: [PATCH 0/4] mm: permit guard regions for file-backed/shmem mappings
Date: Thu, 20 Feb 2025 09:04:58 +0000 [thread overview]
Message-ID: <b0a95f2c-093c-45fd-b4a2-2ba5cbc37e2c@lucifer.local> (raw)
In-Reply-To: <4aa97b5c-3ddc-442b-8ec9-cc43ebe9e599@redhat.com>
On Thu, Feb 20, 2025 at 09:57:37AM +0100, David Hildenbrand wrote:
> On 20.02.25 09:51, Lorenzo Stoakes wrote:
> > On Wed, Feb 19, 2025 at 12:56:31PM -0800, Kalesh Singh wrote:
> > > > We also can't change smaps in the way you want, it _has_ to still give
> > > > output per VMA information.
> > >
> > > Sorry I wasn't suggesting to change the entries in smaps, rather
> > > agreeing to your marker suggestion. Maybe a set of ranges for each
> > > smaps entry that has guards? It doesn't solve the use case, but does
> > > make these regions visible to userspace.
> >
> > No, you are not providing a usecase for this. /proc/$pid/pagemaps does not
> > contaminate the smaps output, mess with efforts to make it RCU readable,
> > require updating the ioctl interface, etc. so it is clearly the better
> > choice.
> >
> > >
> > > >
> > > > The proposed change that would be there would be a flag or something
> > > > indicating that the VMA has guard regions _SOMEWHERE_ in it.
> > > >
> > > > Since this doesn't solve your problem, adds complexity, and nobody else
> > > > seems to need it, I would suggest this is not worthwhile and I'd rather not
> > > > do this.
> > > >
> > > > Therefore for your needs there are literally only two choices here:
> > > >
> > > > 1. Add a bit to /proc/$pid/pagemap OR
> > > > 2. a new interface.
> > > >
> > > > I am not in favour of a new interface here, if we can just extend pagemap.
> > > >
> > > > What you'd have to do is:
> > > >
> > > > 1. Find virtual ranges via /proc/$pid/maps
> > > > 2. iterate through /proc/$pid/pagemaps to retrieve state for all ranges.
> > > >
> > >
> > > Could we also consider an smaps field like:
> > >
> > > VmGuards: [AAA, BBB), [CCC, DDD), ...
> > >
> > > or something of that sort?
> >
> > No, absolutely, categorically not. You realise these could be thousands of
> > characters long right?
> >
> > /proc/$pid/pagemaps resolves this without contaminating this output.
> >
> > > > Well I'm glad that you guys find it useful for _something_ ;)
> > > >
> > > > Again this wasn't written only for you (it is broadly a good feature for
> > > > upstream), but I did have your use case in mind, so I'm a little
> > > > disappointed that it doesn't help, as I like to solve problems.
> > > >
> > > > But I'm glad it solves at least some for you...
> > >
> > > I recall Liam had a proposal to store the guard ranges in the maple tree?
> > >
> > > I wonder if that can be used in combination with this approach to have
> > > a better representation of this?
> >
> > This was an alternative proposal made prior to the feature being
> > implemented (and you and others at Google were welcome to comment and many
> > were cc'd, etc.).
> >
> > There is no 'in combination with'. This feature would take weeks/months to
> > implement, fundamentally impact the maple tree VMA implementation
> > and... not actually achieve anything + immediately be redundant.
> >
> > Plus it'd likely be slower, have locking implications, would have kernel
> > memory allocation implications, a lot more complexity and probably other
> > problems besides (we discussed this at length at the time and a number of
> > issues came up, I can't recall all of them).
> >
> > To be crystal clear - we are empathically NOT changing /proc/$pid/maps to
> > lie about VMAs regardless of underlying implementation, nor adding
> > thousands of characters to /proc/$pid/smaps entries.
>
> Yes. Calling it a "guard region" might be part of the problem
> (/"misunderstanding"), because it reminds people of "virtual memory
> regions".
>
> "Guard markers" or similar might have been clearer that these operate on
> individual PTEs, require page table scanning etc ... which makes them a lot
> more scalable and fine-grained and provides all these benfits, with the
> downside being that we don't end up with that many "virtual memory regions"
> that maps/smaps operate on.
Honestly David you and the naming... :P
I disagree, sorry. Saying 'guard' anything might make people think one
thing or another. We can't account for that. I mean don't get me started on
'pinning' or any of the million other overloaded terms we use...
I _hugely_ publicly went out of my way to express the limitations, I gave a
talk, we had meetings, I mentioned it in the series.
Honestly if at that point you still don't realise, that's not a naming
problem. It's a 'did not participate with upstream' problem.
I like guard regions, as they're not pages as we previously referred to
them. People have no idea what a marker is, it doesn't sound like it spans
ranges, no don't like it sorry.
And sorry but this naming topic is closed :) I already let you change the
naming of the MADV_'s, which broke my heart, there will not be a second
heart breaking...
>
> [...]
>
> >
> > As I said to you earlier, the _best_ we could do in smaps would be to add a
> > flag like 'Grd' or something to indicate some part of the VMA is
> > guarded. But I won't do that unless somebody has an -actual use case- for
> > it.
>
> Right, and that would limit where you have to manually scan. Something
> similar is being done with uffd-wp markers IIRC.
Yeah that's a good point, but honestly if you're reading smaps that reads
the page tables, then reading /proc/$pid/pagemaps and reading page tables
TWICE that seems inefficient vs. just reading /proc/$pid/maps, then reading
/proc/$pid/pagemaps and reading page tables once.
>
> --
> Cheers,
>
> David / dhildenb
>
next prev parent reply other threads:[~2025-02-20 9:05 UTC|newest]
Thread overview: 64+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-13 18:16 [PATCH 0/4] mm: permit guard regions for file-backed/shmem mappings Lorenzo Stoakes
2025-02-13 18:17 ` [PATCH 1/4] mm: allow guard regions in file-backed and read-only mappings Lorenzo Stoakes
2025-02-18 14:15 ` Vlastimil Babka
2025-02-18 16:01 ` David Hildenbrand
2025-02-18 16:12 ` Lorenzo Stoakes
2025-02-18 16:17 ` David Hildenbrand
2025-02-18 16:21 ` Lorenzo Stoakes
2025-02-18 16:27 ` David Hildenbrand
2025-02-18 16:49 ` Lorenzo Stoakes
2025-02-18 17:00 ` David Hildenbrand
2025-02-18 17:04 ` Lorenzo Stoakes
2025-02-24 14:02 ` Lorenzo Stoakes
2025-02-13 18:17 ` [PATCH 2/4] selftests/mm: rename guard-pages to guard-regions Lorenzo Stoakes
2025-02-18 14:15 ` Vlastimil Babka
2025-03-02 8:35 ` Lorenzo Stoakes
2025-02-13 18:17 ` [PATCH 3/4] tools/selftests: expand all guard region tests to file-backed Lorenzo Stoakes
2025-02-18 14:17 ` Vlastimil Babka
2025-04-22 10:37 ` Ryan Roberts
2025-04-22 10:47 ` Lorenzo Stoakes
2025-04-22 11:03 ` Ryan Roberts
2025-04-22 11:07 ` Lorenzo Stoakes
2025-04-22 11:11 ` Ryan Roberts
2025-02-13 18:17 ` [PATCH 4/4] tools/selftests: add file/shmem-backed mapping guard region tests Lorenzo Stoakes
2025-02-18 14:18 ` Vlastimil Babka
2025-02-18 12:12 ` [PATCH 0/4] mm: permit guard regions for file-backed/shmem mappings Vlastimil Babka
2025-02-18 13:05 ` Lorenzo Stoakes
2025-02-18 14:35 ` David Hildenbrand
2025-02-18 14:53 ` Lorenzo Stoakes
2025-02-18 15:20 ` David Hildenbrand
2025-02-18 16:43 ` Lorenzo Stoakes
2025-02-18 17:14 ` David Hildenbrand
2025-02-18 17:20 ` Lorenzo Stoakes
2025-02-18 17:25 ` David Hildenbrand
2025-02-18 17:28 ` Lorenzo Stoakes
2025-02-18 17:31 ` David Hildenbrand
2025-02-25 15:54 ` Vlastimil Babka
2025-02-25 16:31 ` David Hildenbrand
2025-02-25 16:37 ` Lorenzo Stoakes
2025-02-25 16:48 ` David Hildenbrand
2025-02-19 8:25 ` Kalesh Singh
2025-02-19 8:35 ` Kalesh Singh
2025-02-19 9:15 ` Lorenzo Stoakes
2025-02-19 17:32 ` Liam R. Howlett
2025-02-19 9:03 ` Lorenzo Stoakes
2025-02-19 9:15 ` David Hildenbrand
2025-02-19 9:17 ` Lorenzo Stoakes
2025-02-19 18:52 ` Kalesh Singh
2025-02-19 19:20 ` Lorenzo Stoakes
2025-02-19 20:56 ` Kalesh Singh
2025-02-20 8:51 ` Lorenzo Stoakes
2025-02-20 8:57 ` David Hildenbrand
2025-02-20 9:04 ` Lorenzo Stoakes [this message]
2025-02-20 9:23 ` David Hildenbrand
2025-02-20 9:47 ` Lorenzo Stoakes
2025-02-20 10:03 ` David Hildenbrand
2025-02-20 10:15 ` Lorenzo Stoakes
2025-02-20 12:44 ` David Hildenbrand
2025-02-20 13:18 ` Lorenzo Stoakes
2025-02-20 16:21 ` Suren Baghdasaryan
2025-02-20 18:08 ` Kalesh Singh
2025-02-21 11:04 ` Lorenzo Stoakes
2025-02-21 17:24 ` Kalesh Singh
2025-02-20 9:22 ` Vlastimil Babka
2025-02-20 9:53 ` Lorenzo Stoakes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=b0a95f2c-093c-45fd-b4a2-2ba5cbc37e2c@lucifer.local \
--to=lorenzo.stoakes@oracle.com \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=david@redhat.com \
--cc=jannh@google.com \
--cc=jhubbard@nvidia.com \
--cc=jyescas@google.com \
--cc=kaleshsingh@google.com \
--cc=linux-api@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=paulmck@kernel.org \
--cc=shuah@kernel.org \
--cc=surenb@google.com \
--cc=vbabka@suse.cz \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).