From: David Hildenbrand <david@redhat.com>
To: Rongwei Wang <rongwei.wang@linux.alibaba.com>,
Matthew Wilcox <willy@infradead.org>
Cc: linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org,
"xuyu@linux.alibaba.com" <xuyu@linux.alibaba.com>
Subject: Re: [PATCH RFC v2 0/4] Add support for sharing page tables across processes (Previously mshare)
Date: Mon, 31 Jul 2023 18:30:22 +0200 [thread overview]
Message-ID: <d3d03475-7977-fc55-188d-7df350ee0f29@redhat.com> (raw)
In-Reply-To: <9faea1cf-d3da-47ff-eb41-adc5bd73e5ca@linux.alibaba.com>
On 31.07.23 18:19, Rongwei Wang wrote:
>
> On 2023/7/31 20:50, David Hildenbrand wrote:
>> On 31.07.23 14:25, Matthew Wilcox wrote:
>>> On Mon, Jul 31, 2023 at 12:35:00PM +0800, Rongwei Wang wrote:
>>>> Hi Matthew
>>>>
>>>> May I ask you another question about mshare under this RFC? I
>>>> remember you said at the last MM alignment session that you would
>>>> redesign mshare to be per-VMA rather than per-mapping (apologies if
>>>> I remember wrongly), and I followed that direction when re-coding
>>>> this part in our internal version (based on this RFC). It seems
>>>> that per-VMA sharing can simplify the structure of page table
>>>> sharing and does not need to care about the different permissions
>>>> of file mappings; those are the (possible) advantages I can
>>>> imagine. But IMHO that does not seem like a strong enough reason
>>>> to switch from per-mapping to per-VMA.
>>>>
>>>> And I can't think of other upstream considerations. Can you share
>>>> the reason for redesigning this in a per-VMA way? Is it for
>>>> integration with hugetlbfs page table sharing or anonymous page
>>>> sharing?
>>>
>>> It was David who wants page table sharing to be per-VMA. I think
>>> he is advocating for the wrong approach. In any case, I don't have time
>>> to work on mshare and Khalid is on leave until September, so I don't
>>> think anybody is actively working on mshare.
>>
>> Note that I also don't have any time to look into this, but my comment
>> essentially was that we should try to decouple page table sharing
>> (reduced memory consumption, shorter rmap walks) from the
>> mprotect(PROT_READ) use case.
>
> Hi David, Matthew
>
> Thanks for your reply.
>
> Uh, sorry, I can't see the relationship between decoupling page table
> sharing and the per-VMA design. And I think mprotect(PROT_READ) has to
> modify all shared page tables of the related tasks. It seems I am
> missing something about per-VMA in your words.
Assume we do the page table sharing at mmap() time, if the flags are
right. Let's focus on the most common case:

mmap(memfd, PROT_READ | PROT_WRITE, MAP_SHARED)

with each and every process doing the same mapping.
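Just to make that pattern concrete, a minimal userspace sketch (region
size, names, and the number of processes are arbitrary; error handling
is omitted):

#define _GNU_SOURCE
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#define LEN	(64UL << 20)	/* 64 MiB, arbitrary */

int main(void)
{
	int fd = memfd_create("shared", 0);

	ftruncate(fd, LEN);

	for (int i = 0; i < 4; i++) {
		if (fork() == 0) {
			/*
			 * Every process maps the same memfd with the same
			 * flags; this is the case where page tables could
			 * be shared.
			 */
			char *p = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
				       MAP_SHARED, fd, 0);

			p[0] = 1;	/* ordinary shared write access */
			munmap(p, LEN);
			_exit(0);
		}
	}
	while (wait(NULL) > 0)
		;
	return 0;
}

With page table sharing at mmap() time, those identical mappings could
reference the same page tables instead of populating one copy per
process.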
Having the original design do an mprotect(PROT_READ) in each and every
process is just absolutely inefficient for protecting a memfd page.

For that case, my thought was that you actually want to write-protect
the pages at the memfd level.

So instead of doing mprotect(PROT_READ) in 999 processes, or doing
mprotect(PROT_READ) on mshare(), you would have a memfd feature that
protects pages from any write access -- not using virtual addresses but
using an offset into the memfd.
Assume such a (badly imagined) memfd_protect(PROT_READ) would make sure
that:
(1) Any page table mappings of the page are write-protected,
(2) Any write access through those page table mappings triggers
    write-notify, and
(3) Any other access -- e.g., write() -- similarly informs the memfd.
Without page table sharing, (1) would have to walk all mappings via the
rmap. With page table sharing, it would only have to walk one page table.
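To make that contrast concrete, a rough sketch; memfd_protect() is the
badly imagined interface from above, so its name and prototype are
invented here purely for illustration, while mprotect() is the existing
interface:

#include <sys/mman.h>
#include <sys/types.h>

/*
 * Status quo: each of the N processes (or mshare() on their behalf)
 * write-protects its own virtual mapping, so the kernel modifies N
 * sets of page tables, found via the rmap.
 */
static void protect_via_mprotect(void *addr, size_t len)
{
	mprotect(addr, len, PROT_READ);	/* repeated in every process */
}

/*
 * Idea: protect by (offset, len) on the memfd itself, with no virtual
 * address involved.  This call does not exist today; the prototype is
 * made up purely to illustrate the idea.
 */
int memfd_protect(int memfd, off_t offset, size_t len, int prot);

static void protect_via_memfd(int memfd, off_t offset, size_t len)
{
	memfd_protect(memfd, offset, len, PROT_READ);	/* one call */
}

The second form names the pages by (memfd, offset) once; how many page
tables the kernel has to touch for that (a full rmap walk, or just the
one shared page table) then becomes an internal detail.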
But the features would be two separate things.
What memfd would do with that write notification (inject a signal,
something like uffd) would be a different story.
Again, just an idea and maybe complete garbage.
--
Cheers,
David / dhildenb