linux-mm.kvack.org archive mirror
From: Anthony Yznaga <anthony.yznaga@oracle.com>
To: David Hildenbrand <david@redhat.com>,
	James Houghton <jthoughton@google.com>
Cc: akpm@linux-foundation.org, willy@infradead.org,
	markhemm@googlemail.com, viro@zeniv.linux.org.uk,
	khalid@kernel.org, andreyknvl@gmail.com, dave.hansen@intel.com,
	luto@kernel.org, brauner@kernel.org, arnd@arndb.de,
	ebiederm@xmission.com, catalin.marinas@arm.com,
	linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, mhiramat@kernel.org, rostedt@goodmis.org,
	vasily.averin@linux.dev, xhao@linux.alibaba.com, pcc@google.com,
	neilb@suse.de, maz@kernel.org
Subject: Re: [RFC PATCH v3 06/10] mm/mshare: Add vm flag for shared PTEs
Date: Mon, 7 Oct 2024 16:03:33 -0700
Message-ID: <4ce254ec-8b7c-4a83-8a7f-af6a8963cd09@oracle.com>
In-Reply-To: <04c4314e-7958-47bd-8281-23c3e35fc10e@redhat.com>


On 10/7/24 3:24 AM, David Hildenbrand wrote:
> On 04.09.24 01:40, James Houghton wrote:
>> On Tue, Sep 3, 2024 at 4:23 PM Anthony Yznaga <anthony.yznaga@oracle.com> wrote:
>>>
>>> From: Khalid Aziz <khalid@kernel.org>
>>>
>>> Add a bit to vm_flags to indicate a vma shares PTEs with others. Add
>>> a function to determine if a vma shares PTEs by checking this flag.
>>> This is to be used to find the shared page table entries on page fault
>>> for vmas sharing PTEs.
>>>
>>> Signed-off-by: Khalid Aziz <khalid@kernel.org>
>>> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
>>> Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com>
>>> ---
>>>   include/linux/mm.h             | 7 +++++++
>>>   include/trace/events/mmflags.h | 3 +++
>>>   mm/internal.h                  | 5 +++++
>>>   3 files changed, 15 insertions(+)
>>>
>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>> index 6549d0979b28..3aa0b3322284 100644
>>> --- a/include/linux/mm.h
>>> +++ b/include/linux/mm.h
>>> @@ -413,6 +413,13 @@ extern unsigned int kobjsize(const void *objp);
>>>   #define VM_DROPPABLE           VM_NONE
>>>   #endif
>>>
>>> +#ifdef CONFIG_64BIT
>>> +#define VM_SHARED_PT_BIT       41
>>> +#define VM_SHARED_PT           BIT(VM_SHARED_PT_BIT)
>>> +#else
>>> +#define VM_SHARED_PT           VM_NONE
>>> +#endif
>>> +
>>>   #ifdef CONFIG_64BIT
>>>   /* VM is sealed, in vm_flags */
>>>   #define VM_SEALED      _BITUL(63)
>>> diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
>>> index b63d211bd141..e1ae1e60d086 100644
>>> --- a/include/trace/events/mmflags.h
>>> +++ b/include/trace/events/mmflags.h
>>> @@ -167,8 +167,10 @@ IF_HAVE_PG_ARCH_X(arch_3)
>>>
>>>   #ifdef CONFIG_64BIT
>>>   # define IF_HAVE_VM_DROPPABLE(flag, name) {flag, name},
>>> +# define IF_HAVE_VM_SHARED_PT(flag, name) {flag, name},
>>>   #else
>>>   # define IF_HAVE_VM_DROPPABLE(flag, name)
>>> +# define IF_HAVE_VM_SHARED_PT(flag, name)
>>>   #endif
>>>
>>>   #define __def_vmaflag_names \
>>> @@ -204,6 +206,7 @@ IF_HAVE_VM_SOFTDIRTY(VM_SOFTDIRTY, "softdirty"     )               \
>>>          {VM_HUGEPAGE,                   "hugepage" },              \
>>>          {VM_NOHUGEPAGE,                 "nohugepage" },              \
>>>   IF_HAVE_VM_DROPPABLE(VM_DROPPABLE,     "droppable" )               \
>>> +IF_HAVE_VM_SHARED_PT(VM_SHARED_PT,     "sharedpt" )               \
>>>          {VM_MERGEABLE,                  "mergeable" }               \
>>>
>>>   #define show_vma_flags(flags) \
>>> diff --git a/mm/internal.h b/mm/internal.h
>>> index b4d86436565b..8005d5956b6e 100644
>>> --- a/mm/internal.h
>>> +++ b/mm/internal.h
>>> @@ -1578,4 +1578,9 @@ void unlink_file_vma_batch_init(struct unlink_vma_file_batch *);
>>>   void unlink_file_vma_batch_add(struct unlink_vma_file_batch *, struct vm_area_struct *);
>>>   void unlink_file_vma_batch_final(struct unlink_vma_file_batch *);
>>>
>>
>> Hi Anthony,
>>
>> I'm really excited to see this series on the mailing list again! :) I
>> won't have time to review this series in too much detail, but I hope
>> something like it gets merged eventually.
>>
>>> +static inline bool vma_is_shared(const struct vm_area_struct *vma)
>>> +{
>>> +       return VM_SHARED_PT && (vma->vm_flags & VM_SHARED_PT);
>>> +}
>>
>> Tiny comment - I find vma_is_shared() to be a bit of a confusing name,
>> especially given how vma_is_shared_maywrite() is defined. (Sorry if
>> this has already been discussed before.)
>>
>> How about vma_is_shared_pt()?
>
> vma_is_mshare() ? ;)

vma_is_vmas()? :-D


>
> The whole "shared PT / shared PTE" is a bit misleading IMHO and a bit 
> too dominant in the series. Yes, we're sharing PTEs/page tables, but 
> the main point is that a single mshare VMA might cover multiple 
> different VMAs (in a different process).
>
> I would describe mshare VMAs as being something that shares page 
> tables with another MM, BUT, also that the VMA is a container and what 
> exactly the *actual* VMAs in there are (including holes), only the 
> owner knows.
>
> E.g., is_vm_hugetlb_page() might be *false* for an mshare VMA, but 
> there might be hugetlb folios mapped into the page tables, described 
> by a is_vm_hugetlb_page() VMA in the owner MM.
>
> So again, it's not just "sharing page tables".

Understood. I'm okay with something like vma_is_mshare() or some other 
shorthand for a "container" VMA. I also recognize that I need to identify 
which code paths need to be made aware of container VMAs, which should 
expect to operate on a real VMA, and which don't care.
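
For illustration only, here is a rough userspace sketch of what the rename 
could look like. It is not part of the posted patch. It keeps the 
VM_SHARED_PT flag and bit 41 from the quoted diff and only renames the 
helper to vma_is_mshare(), as suggested above. The vm_flags_t typedef, the 
one-field struct vm_area_struct, the MODEL_64BIT switch (standing in for 
CONFIG_64BIT) and the VM_HUGETLB bit position are stand-ins assumed for 
this sketch, not the kernel's real definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins for kernel definitions (assumptions for this sketch). */
typedef uint64_t vm_flags_t;

#define VM_NONE     ((vm_flags_t)0)
#define VM_HUGETLB  ((vm_flags_t)1 << 22)   /* illustrative bit position */

/* MODEL_64BIT is a hypothetical stand-in for CONFIG_64BIT. */
#ifndef MODEL_64BIT
#define MODEL_64BIT 1
#endif

#if MODEL_64BIT
#define VM_SHARED_PT_BIT 41                 /* bit number from the quoted patch */
#define VM_SHARED_PT     ((vm_flags_t)1 << VM_SHARED_PT_BIT)
#else
#define VM_SHARED_PT     VM_NONE            /* flag compiles away on 32-bit */
#endif

/* Minimal model: only the field the helpers look at. */
struct vm_area_struct {
	vm_flags_t vm_flags;
};

/*
 * Same test as vma_is_shared() in the quoted patch, under the name
 * suggested in this thread. The leading "VM_SHARED_PT &&" makes the
 * helper constant-false when the flag is defined as VM_NONE.
 */
static inline bool vma_is_mshare(const struct vm_area_struct *vma)
{
	return VM_SHARED_PT && (vma->vm_flags & VM_SHARED_PT);
}

/* Models a per-VMA hugetlb check: it only sees the flags of the VMA it is given. */
static inline bool is_vm_hugetlb_page(const struct vm_area_struct *vma)
{
	return (vma->vm_flags & VM_HUGETLB) != 0;
}
```

With this model, a container VMA that has only VM_SHARED_PT set reports 
vma_is_mshare() as true and is_vm_hugetlb_page() as false, even if a VMA 
in the owner MM carries VM_HUGETLB. That is the case David describes, and 
it is why callers cannot infer what is mapped from the container's flags 
alone.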


Anthony


