linux-api.vger.kernel.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Suren Baghdasaryan <surenb@google.com>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	Matthew Wilcox <willy@infradead.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	Jann Horn <jannh@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Muchun Song <muchun.song@linux.dev>,
	Richard Henderson <richard.henderson@linaro.org>,
	Ivan Kokshaysky <ink@jurassic.park.msu.ru>,
	Matt Turner <mattst88@gmail.com>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	"James E . J . Bottomley" <James.Bottomley@hansenpartnership.com>,
	Helge Deller <deller@gmx.de>, Chris Zankel <chris@zankel.net>,
	Max Filippov <jcmvbkbc@gmail.com>, Arnd Bergmann <arnd@arndb.de>,
	linux-alpha@vger.kernel.org, linux-mips@vger.kernel.org,
	linux-parisc@vger.kernel.org, linux-arch@vger.kernel.org,
	Shuah Khan <shuah@kernel.org>,
	Christian Brauner <brauner@kernel.org>,
	linux-kselftest@vger.kernel.org,
	Sidhartha Kumar <sidhartha.kumar@oracle.com>,
	Jeff Xu <jeffxu@chromium.org>,
	Christoph Hellwig <hch@infradead.org>,
	linux-api@vger.kernel.org, John Hubbard <jhubbard@nvidia.com>
Subject: Re: [PATCH v2 3/5] mm: madvise: implement lightweight guard page mechanism
Date: Mon, 21 Oct 2024 19:23:32 +0200
Message-ID: <c8272b9d-5c33-4b44-9d6d-1d25c7ac23dd@redhat.com>
In-Reply-To: <b2bca752-77f3-4b63-abe9-348a5fc2a5cc@lucifer.local>

On 21.10.24 19:15, Lorenzo Stoakes wrote:
> On Mon, Oct 21, 2024 at 07:05:27PM +0200, David Hildenbrand wrote:
>> On 20.10.24 18:20, Lorenzo Stoakes wrote:
>>> Implement a new lightweight guard page feature, that is regions of userland
>>> virtual memory that, when accessed, cause a fatal signal to arise.
>>>
>>> Currently users must establish PROT_NONE ranges to achieve this.
>>>
>>> However this is very costly memory-wise - we need a VMA for each and every
>>> one of these regions AND they become unmergeable with surrounding VMAs.
>>>
>>> In addition repeated mmap() calls require repeated kernel context switches
>>> and contention of the mmap lock to install these ranges, potentially also
>>> having to unmap memory if installed over existing ranges.
>>>
>>> The lightweight guard approach eliminates the VMA cost altogether - rather
>>> than establishing a PROT_NONE VMA, it operates at the level of page table
>>> entries - poisoning PTEs such that accesses to them cause a fault followed
>>> by a SIGSEGV signal being raised.
>>>
>>> This is achieved through the PTE marker mechanism, which a previous commit
>>> in this series extended to permit this to be done, installed via the
>>> generic page walking logic, also extended by a prior commit for this
>>> purpose.
>>>
>>> These poison ranges are established with MADV_GUARD_POISON. If the range
>>> in which they are installed contains any existing mappings, those mappings
>>> are zapped, i.e. the range is freed and its memory unmapped (thus mimicking
>>> the behaviour of MADV_DONTNEED in this respect).
>>>
>>> Any existing poison entries will be left untouched. There is no nesting of
>>> poisoned pages.
>>>
>>> Poisoned ranges are NOT cleared by MADV_DONTNEED, as this would be rather
>>> unexpected behaviour, but are cleared on process teardown or unmapping of
>>> memory ranges.
>>>
>>> Ranges can have the poison property removed by MADV_GUARD_UNPOISON -
>>> 'remedying' the poisoning. Any non-poison entries in the range over which
>>> this is applied are left untouched; only poison entries are cleared.
>>>
>>> We permit this operation on anonymous memory only, and only VMAs which are
>>> non-special, non-huge and not mlock()'d (if we permitted this we'd have to
>>> drop locked pages which would be rather counterintuitive).
>>>
>>> Suggested-by: Vlastimil Babka <vbabka@suse.cz>
>>> Suggested-by: Jann Horn <jannh@google.com>
>>> Suggested-by: David Hildenbrand <david@redhat.com>
>>> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>>> ---
>>>    arch/alpha/include/uapi/asm/mman.h     |   3 +
>>>    arch/mips/include/uapi/asm/mman.h      |   3 +
>>>    arch/parisc/include/uapi/asm/mman.h    |   3 +
>>>    arch/xtensa/include/uapi/asm/mman.h    |   3 +
>>>    include/uapi/asm-generic/mman-common.h |   3 +
>>>    mm/madvise.c                           | 168 +++++++++++++++++++++++++
>>>    mm/mprotect.c                          |   3 +-
>>>    mm/mseal.c                             |   1 +
>>>    8 files changed, 186 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h
>>> index 763929e814e9..71e13f27742d 100644
>>> --- a/arch/alpha/include/uapi/asm/mman.h
>>> +++ b/arch/alpha/include/uapi/asm/mman.h
>>> @@ -78,6 +78,9 @@
>>>    #define MADV_COLLAPSE	25		/* Synchronous hugepage collapse */
>>> +#define MADV_GUARD_POISON 102		/* fatal signal on access to range */
>>> +#define MADV_GUARD_UNPOISON 103		/* revoke guard poisoning */
>>
>> Just to raise it here: MADV_GUARD_INSTALL / MADV_GUARD_REMOVE or sth. like
>> that would have been even clearer, at least to me.
> 
> :)
> 
> It still feels like poisoning to me because we're explicitly putting
> something in the page tables to make a range have different fault behaviour
> like a HW poisoning, and 'installing' suggests backing or something like
> this, I think that's more confusing.

I connect "poison" to "SIGBUS" and "corrupt memory state", not to "there 
is nothing and there must not be anything". Thus my thinking. But again, 
not the end of the world, just wanted to raise it ...

> 
>>
>> But no strong opinion, just if somebody else reading along was wondering
>> about the same.
>>
>>
>> I'm hoping to find more time to have a closer look at this this week, but in
>> general, the concept sounds reasonable to me.
> 
> Thanks!

Thank you for implementing this and up-streaming it!

-- 
Cheers,

David / dhildenb
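
For anyone reading along, here is a minimal userspace sketch of how the
proposed advice might be used next to the existing PROT_NONE approach
described in the quoted commit message. It is an illustration under stated
assumptions, not code from the series: the MADV_GUARD_POISON name and the
value 102 are taken from the alpha hunk quoted above and may change before
merging (the naming is still under discussion in this very thread), and
install_guard() and map_with_tail_guard() are hypothetical helpers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * ASSUMPTION: name and value are copied from the alpha hunk of this v2
 * patch. They are a proposal, not a settled ABI.
 */
#ifndef MADV_GUARD_POISON
#define MADV_GUARD_POISON 102
#endif

enum guard_method {
	GUARD_FAILED = -1,
	GUARD_MADVISE = 0,	/* lightweight guard installed via madvise() */
	GUARD_PROT_NONE = 1,	/* traditional guard installed via mprotect() */
};

/* Hypothetical helper: make [addr, addr + len) fault on any access. */
static enum guard_method install_guard(void *addr, size_t len)
{
	/* Proposed path: operates on page table entries, no extra VMA. */
	if (madvise(addr, len, MADV_GUARD_POISON) == 0)
		return GUARD_MADVISE;

	/*
	 * Kernels that do not know the advice fail the call (typically with
	 * EINVAL). Fall back to PROT_NONE, which splits the VMA.
	 */
	if (mprotect(addr, len, PROT_NONE) == 0)
		return GUARD_PROT_NONE;

	return GUARD_FAILED;
}

/*
 * Hypothetical helper: map `pages` usable anonymous pages followed by one
 * guard page. Returns the base address, or NULL on failure; *how reports
 * which mechanism ended up protecting the tail page.
 */
static void *map_with_tail_guard(size_t pages, enum guard_method *how)
{
	long psz = sysconf(_SC_PAGESIZE);
	size_t page, total;
	char *base;

	assert(psz > 0 && pages > 0);
	page = (size_t)psz;
	total = (pages + 1) * page;

	base = mmap(NULL, total, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return NULL;

	*how = install_guard(base + pages * page, page);
	if (*how == GUARD_FAILED) {
		munmap(base, total);
		return NULL;
	}
	return base;
}
```

A caller would use the first `pages` pages normally and expect a fatal
signal on touching the last one. With the fallback path the mapping ends up
as two VMAs; with the proposed advice it should stay a single VMA, which is
the memory saving the commit message is after.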


