linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Zach O'Keefe <zokeefe@google.com>
Cc: Yang Shi <shy828301@gmail.com>, Hugh Dickins <hughd@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jann Horn <jannh@google.com>, Song Liu <songliubraving@fb.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm/khugepaged: fix collapse_pte_mapped_thp() to allow anon_vma
Date: Mon, 9 Jan 2023 09:50:12 +0100	[thread overview]
Message-ID: <d50eb6f2-0585-7441-081b-cadaa5901c6e@redhat.com> (raw)
In-Reply-To: <CAAa6QmSByYrWkp+8K0NK+pocKT0CVj83RaVUB1VqMPvuPHnpNQ@mail.gmail.com>

>>>>>>
>>>>>> Side note: set_huge_pmd() wins the award of "ugliest mm function of early
>>>>>> 2023". I was briefly concerned how do_set_pmd() decides whether the PMD can be
>>>>>> writable or not. Turns out it's communicated via vm_fault->flags. Just
>>>>>> horrible.
> 
> My first Linux award! :) At least it's not "worst mm security issue of
> early 2023". I'll take it!

Good that you're not taking my words the wrong way.

MADV_COLLAPSE is a very useful feature (especially also for THP tests 
[1]). I wish I could have looked at some of the patches earlier. But we 
cannot wait forever to get something merged, otherwise we'd never get 
bigger changes upstream.

... so there is plenty of time left in 2023 to cleanup khugepaged.c :P


[1] https://lkml.kernel.org/r/20230104144905.460075-1-david@redhat.com

[...]


>> For example: why even *care* about the complexity of installing a PMD in
>> collapse_pte_mapped_thp() using set_huge_pmd() just for MADV_COLLAPSE?
>>
>> Sure, we avoid a single page fault afterwards, but is this *really*
>> worth the extra code here? I mean, after we installed the PMD, the page
>> could just get reclaimed either way, so there is no guarantee that we
>> have a PMD mapped once we return to user space IIUC.
> 
> A valid question. The first reason is just semantic symmetry for
> MADV_COLLAPSE called on anon vs file/shmem memory. It would be nice to
> say that "on success, the memory range provided will be backed by
> PMD-mapped hugepages", rather than special-casing file/shmem.

But there will never be such a guarantee, right? We could even see a 
split before just before we return to user space IIRC.

> 
> The second reason has a more practical use case. In userfaultfd-based
> live migration (using  UFFDIO_REGISTER_MODE_MINOR) pages are migrated
> at 4KiB granularity, and it may take a long (O(many minutes)) for the
> transfer of all pages to complete. To avoid severe performance
> degradation on the target guest, the vmm wants to MADV_COLLAPSE
> hugepage-sized regions as they fill up. Since the guest memory is
> still uffd-registered, requiring refault post-MADV_COLLAPSE won't
> work, since the uffd machinery will intercept the fault, and no PMD
> will be mapped. As such, either uffd needs to be taught to install PMD
> mappings, or the PMD mapping already must be in-place.

That's an interesting point, thanks. I assume we'd get another minor 
fault and when resolving that, we'll default to a PTE mapping.

-- 
Thanks,

David / dhildenb



  reply	other threads:[~2023-01-09  8:50 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-22 20:41 [PATCH] mm/khugepaged: fix collapse_pte_mapped_thp() to allow anon_vma Hugh Dickins
2023-01-02 12:19 ` David Hildenbrand
2023-01-03 20:40   ` Hugh Dickins
2023-01-04  9:20     ` David Hildenbrand
2023-01-05  0:03       ` Yang Shi
2023-01-05 10:44         ` David Hildenbrand
2023-01-05 21:29           ` Zach O'Keefe
2023-01-09  8:50             ` David Hildenbrand [this message]
2023-01-17 23:00               ` Zach O'Keefe
2023-01-23 11:09                 ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d50eb6f2-0585-7441-081b-cadaa5901c6e@redhat.com \
    --to=david@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=jannh@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=shy828301@gmail.com \
    --cc=songliubraving@fb.com \
    --cc=zokeefe@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).