From: "Jörn Engel" <joern@purestorage.com>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Uday Shankar <ushankar@purestorage.com>,
Muchun Song <muchun.song@linux.dev>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, "Liam R. Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>, Jann Horn <jannh@google.com>,
Oscar Salvador <osalvador@suse.de>
Subject: Re: [bug report?] unintuitive behavior when mapping over hugepage-backed PROT_NONE regions
Date: Fri, 7 Feb 2025 11:35:40 -0800
Message-ID: <Z6ZgjJnL0utjKy_P@cork>
In-Reply-To: <b113c695-4c8d-4276-bdc4-409195b636dd@lucifer.local>
On Fri, Feb 07, 2025 at 01:12:33PM +0000, Lorenzo Stoakes wrote:
>
> So TL;DR is - aggregate operations failing means any or all of the
> operation failed, you can no longer rely on the mapping state being what
> you expected.
Coming back to the "what should the interface be?" question, I can see
three reasonable answers:

1. Failure should result in no change. We have a bug and will fix it.
2. Failure should result in no change. But fixing things is exceedingly
   hard and we may have to live with current reality for a long time.
3. Failure should result in undefined behavior.

I think you convincingly argue against the first answer. It might still
be useful to also argue against the third answer.
For background, I wrote a somewhat weird memory allocator in 2017,
called "big_allocate". The underlying problem is that your favorite
malloc tends to do a reasonable job for small to medium objects, but
eventually gives up and calls mmap()/munmap() for large objects. With a
heavily threaded process, the combination of mmap_sem and TLB shootdown
via IPI is a big performance-killer. The solution is a specialized
allocator for large objects instead of mmap()/munmap().
The original (and still current) design of big_allocate has a mapping
structure somewhat similar to "struct page" in the kernel. It relies on
having a large chunk of virtual memory space that it directly controls,
so that it can have a simple 1:1 mapping between virtual memory and
"struct page".
To get a large chunk of virtual memory space, big_allocate does a
PROT_NONE mmap(). It then later does a read-write mmap() over parts of
that range to allocate memory, often combined with MAP_HUGETLB for
obvious performance reasons. (Side note: I wish a "MAP_RW" shorthand
for PROT_READ|PROT_WRITE existed in the headers.)
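
To make the last two paragraphs concrete, here is a minimal sketch of
the reserve-then-map pattern and the 1:1 descriptor lookup. It is not
the actual big_allocate code: the names big_page, big_region,
big_reserve, big_commit and big_addr_to_page are made up for
illustration, and the 2 MiB slot size, MAP_NORESERVE and MAP_FIXED are
assumptions rather than something stated above.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define BIG_PAGE_SHIFT 21               /* assumption: 2 MiB slots */
#define BIG_PAGE_SIZE  ((size_t)1 << BIG_PAGE_SHIFT)

struct big_page {                       /* hypothetical per-slot descriptor */
	uint32_t flags;
};

struct big_region {
	char *base;                     /* start of the reserved range */
	size_t npages;                  /* number of BIG_PAGE_SIZE slots */
	struct big_page *pages;         /* one descriptor per slot */
};

/* Reserve address space only: PROT_NONE, nothing is accessible yet. */
static void *big_reserve(size_t len)
{
	void *p = mmap(NULL, len, PROT_NONE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
	return p == MAP_FAILED ? NULL : p;
}

/* Map read-write memory over part of the reservation (assumes MAP_FIXED). */
static void *big_commit(void *base, size_t off, size_t len, int extra_flags)
{
	void *p = mmap((char *)base + off, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | extra_flags,
		       -1, 0);
	return p == MAP_FAILED ? NULL : p;
}

/* 1:1 lookup: the descriptor index is the slot offset from base. */
static struct big_page *big_addr_to_page(const struct big_region *r,
					 const void *addr)
{
	uintptr_t a = (uintptr_t)addr;
	uintptr_t b = (uintptr_t)r->base;
	size_t idx;

	if (a < b)
		return NULL;
	idx = (size_t)(a - b) >> BIG_PAGE_SHIFT;
	return idx < r->npages ? &r->pages[idx] : NULL;
}
```

A caller wanting hugepages would pass MAP_HUGETLB as extra_flags, which
additionally requires the target address and length to be
hugepage-aligned (the sketch does not align base). If big_commit()
returns NULL, then per the quoted point above the caller should not
assume the range is still the original PROT_NONE mapping.
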
If memory serves, big_allocate resulted in a 2-3% macrobenchmark
improvement.
Current big_allocate has a number of ugly warts I rather dislike. One
of those warts is that you now have existing users that rely on mmap()
over existing PROT_NONE mappings working, at least under the special
set of conditions we care about.
I have some plans to rewrite big_allocate with a different design. But
for now we have existing code that may make your life harder than you
wished for.
Jörn
--
Without congressional action or a strong judicial precedent, I would
_strongly_ recommend against anyone trusting their private data to a
company with physical ties to the United States.
-- Ladar Levison