Re: [RFC PATCH] mm/hmm: Add userfaultfd support to fault handling

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Cc: jgg@ziepe.ca, leon@kernel.org, akpm@linux-foundation.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] mm/hmm: Add userfaultfd support to fault handling
Date: Wed, 22 Apr 2026 20:14:24 +0200	[thread overview]
Message-ID: <b47ba580-2443-482f-8a92-558af82d13f2@kernel.org> (raw)
In-Reply-To: <aejiV8rQIOgrlCkh@skinsburskii.localdomain>

On 4/22/26 16:59, Stanislav Kinsburskii wrote:
> On Wed, Apr 22, 2026 at 09:27:44AM +0200, David Hildenbrand (Arm) wrote:
>> On 4/1/26 00:24, Stanislav Kinsburskii wrote:
>>> Add support for userfaultfd-enabled VMAs to the HMM framework.
>>>
>>> Extract fault handling logic into hmm_handle_mm_fault() to handle both
>>> regular and userfaultfd-backed mappings. The implementation follows
>>> fixup_user_fault() for handling VM_FAULT_RETRY and VM_FAULT_COMPLETED, with
>>> a key difference: instead of retrying or moving forward respectively,
>>> return -EBUSY after reacquiring mmap_read_lock. Since the lock was
>>> released, the VMA could have changed, so defer retry logic to the caller.
>>>
>>> This approach is inefficient for userfaultfd-backed VMAs, as HMM can only
>>> populate one page at a time, but keeps the framework simple by avoiding
>>> complex retry logic within HMM itself.
>>>
>>> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
>>> ---
>>>  mm/hmm.c |   40 ++++++++++++++++++++++++++++++++++++----
>>>  1 file changed, 36 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/mm/hmm.c b/mm/hmm.c
>>> index f6c4ddff4bd6..d04d68e21473 100644
>>> --- a/mm/hmm.c
>>> +++ b/mm/hmm.c
>>> @@ -59,6 +59,35 @@ static int hmm_pfns_fill(unsigned long addr, unsigned long end,
>>>  	return 0;
>>>  }
>>>  
>>> +static int hmm_handle_mm_fault(struct vm_area_struct *vma,
>>> +			       unsigned long addr,
>>> +			       unsigned int fault_flags)
>>> +{
>>> +	int ret;
>>> +
>>> +	if (userfaultfd_missing(vma)) {
>>> +		struct mm_struct *mm = vma->vm_mm;
>>> +
>>> +		fault_flags |= FAULT_FLAG_ALLOW_RETRY |
>>> +			       FAULT_FLAG_USER;
>>> +
>>> +		ret = handle_mm_fault(vma, addr, fault_flags, NULL);
>>> +
>>> +		if (ret & (VM_FAULT_COMPLETED | VM_FAULT_RETRY)) {
>>> +			mmap_read_lock(mm);
>>> +			return -EBUSY;
>>> +		}
>>> +
>>> +		if (ret & VM_FAULT_ERROR)
>>> +			return vm_fault_to_errno(ret, 0);
>>> +	} else {
>>> +		ret = handle_mm_fault(vma, addr, fault_flags, NULL);
>>> +		if (ret & VM_FAULT_ERROR)
>>> +			return vm_fault_to_errno(ret, 0);
>>> +	}
>>
>> I'm surprised that there is userfaultfd_missing() logic required here at
>> all.
>>
>> What prevents us from always calling handle_mm_fault() in a way +
>> handling return values, such that we will just do the right thing
>> independent of userfaultfd_missing()?
>>
> 
> Essentially, the main issue with the current HMM framework is that
> handle_mm_fault() with FAULT_FLAG_ALLOW_RETRY (needed for userfaultfd)
> can drop the mm lock. The framework does not expect or support that yet.
> That is why I had to add the branching.
> 
> The worst part is that handle_mm_fault() can unlock via
> mm_read_unlock(), while svm_range_restore_work() takes the mm write
> lock and this RFC patch doesn't address this problem.
> 
> The main question where I’d like feedback is: should userfaultfd be
> supported in the HMM framework at all, given the complexity around
> unlocking and retry handling (similar to other gup helpers) or should
> this complexity be offloaded to the caller side?
> 
> If it should be suported, then I see two ways to implement this:
> 
> 1. Introduce a new HMM_NEED_USER_FAULT flag to tell the framework that
>    the caller needs userfaultfd support, and extend the framework
>    accordingly. This is not intrusive to existing code. But if we want lazy
>    migration for GPU state in the future (and we likely do), GPU drivers
>    will have to be updated to set this flag later. The only real user of
>    this feature right now is the MSHV driver and I'd extend it to set
>    this flag right away and leave GPU HMM users for later.
> 
> 2. Add retry logic to hmm_handle_mm_fault() to handle VM_FAULT_RETRY and
>    VM_FAULT_COMPLETED, as required for userfaultfd support. This would be
>    transparent to users. But it would require significant changes: mm
>    relocking, VMA lookup, and other parts. Also, before doing that, the
>    kfd_svm driver would need to be changed to downgrade the mm lock to
>    read.

Exactly this. If we want this, it should be done the proper way, by teaching
calling code that the lock was dropped etc.

There should not be userfaultfd special-casing anywhere, it's just a matter of
teaching the code that we might get the mmap lock dropped.

Now, that might be a bit of work :)

What we do in mm/gup.c, is letting callers opt-in whether they can handle
getting the mmap lock dropped -- not opting in to userfaultfd.

See FOLL_UNLOCKABLE. Such an approach enables a step-wise implementation.
But note that FOLL_UNLOCKABLE is an internal flag. We set it automatically when
someone passes us a "locked" variable, indicating that they can deal with it.


See populate_vma_page_range() as one example.

-- 
Cheers,

David

next prev parent reply	other threads:[~2026-04-22 18:14 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-31 22:24 [RFC PATCH] mm/hmm: Add userfaultfd support to fault handling Stanislav Kinsburskii
2026-04-22  7:27 ` David Hildenbrand (Arm)
2026-04-22 14:59   ` Stanislav Kinsburskii
2026-04-22 18:14     ` David Hildenbrand (Arm) [this message]
2026-04-30 17:36       ` Stanislav Kinsburskii
  -- strict thread matches above, loose matches on Subject: below --
2026-04-02 16:37 Stanislav Kinsburskii

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b47ba580-2443-482f-8a92-558af82d13f2@kernel.org \
    --to=david@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=jgg@ziepe.ca \
    --cc=leon@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=skinsburskii@linux.microsoft.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.