Re: [RFC PATCH] mm/hmm: Add userfaultfd support to fault handling

public inbox for linux-mm@kvack.org
 help / color / mirror / Atom feed

From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Cc: jgg@ziepe.ca, leon@kernel.org, akpm@linux-foundation.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] mm/hmm: Add userfaultfd support to fault handling
Date: Wed, 22 Apr 2026 20:14:24 +0200	[thread overview]
Message-ID: <b47ba580-2443-482f-8a92-558af82d13f2@kernel.org> (raw)
In-Reply-To: <aejiV8rQIOgrlCkh@skinsburskii.localdomain>

On 4/22/26 16:59, Stanislav Kinsburskii wrote:
> On Wed, Apr 22, 2026 at 09:27:44AM +0200, David Hildenbrand (Arm) wrote:
>> On 4/1/26 00:24, Stanislav Kinsburskii wrote:
>>> Add support for userfaultfd-enabled VMAs to the HMM framework.
>>>
>>> Extract fault handling logic into hmm_handle_mm_fault() to handle both
>>> regular and userfaultfd-backed mappings. The implementation follows
>>> fixup_user_fault() for handling VM_FAULT_RETRY and VM_FAULT_COMPLETED, with
>>> a key difference: instead of retrying or moving forward respectively,
>>> return -EBUSY after reacquiring mmap_read_lock. Since the lock was
>>> released, the VMA could have changed, so defer retry logic to the caller.
>>>
>>> This approach is inefficient for userfaultfd-backed VMAs, as HMM can only
>>> populate one page at a time, but keeps the framework simple by avoiding
>>> complex retry logic within HMM itself.
>>>
>>> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
>>> ---
>>>  mm/hmm.c |   40 ++++++++++++++++++++++++++++++++++++----
>>>  1 file changed, 36 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/mm/hmm.c b/mm/hmm.c
>>> index f6c4ddff4bd6..d04d68e21473 100644
>>> --- a/mm/hmm.c
>>> +++ b/mm/hmm.c
>>> @@ -59,6 +59,35 @@ static int hmm_pfns_fill(unsigned long addr, unsigned long end,
>>>  	return 0;
>>>  }
>>>  
>>> +static int hmm_handle_mm_fault(struct vm_area_struct *vma,
>>> +			       unsigned long addr,
>>> +			       unsigned int fault_flags)
>>> +{
>>> +	int ret;
>>> +
>>> +	if (userfaultfd_missing(vma)) {
>>> +		struct mm_struct *mm = vma->vm_mm;
>>> +
>>> +		fault_flags |= FAULT_FLAG_ALLOW_RETRY |
>>> +			       FAULT_FLAG_USER;
>>> +
>>> +		ret = handle_mm_fault(vma, addr, fault_flags, NULL);
>>> +
>>> +		if (ret & (VM_FAULT_COMPLETED | VM_FAULT_RETRY)) {
>>> +			mmap_read_lock(mm);
>>> +			return -EBUSY;
>>> +		}
>>> +
>>> +		if (ret & VM_FAULT_ERROR)
>>> +			return vm_fault_to_errno(ret, 0);
>>> +	} else {
>>> +		ret = handle_mm_fault(vma, addr, fault_flags, NULL);
>>> +		if (ret & VM_FAULT_ERROR)
>>> +			return vm_fault_to_errno(ret, 0);
>>> +	}
>>
>> I'm surprised that there is userfaultfd_missing() logic required here at
>> all.
>>
>> What prevents us from always calling handle_mm_fault() in a way +
>> handling return values, such that we will just do the right thing
>> independent of userfaultfd_missing()?
>>
> 
> Essentially, the main issue with the current HMM framework is that
> handle_mm_fault() with FAULT_FLAG_ALLOW_RETRY (needed for userfaultfd)
> can drop the mm lock. The framework does not expect or support that yet.
> That is why I had to add the branching.
> 
> The worst part is that handle_mm_fault() can unlock via
> mm_read_unlock(), while svm_range_restore_work() takes the mm write
> lock and this RFC patch doesn't address this problem.
> 
> The main question where I’d like feedback is: should userfaultfd be
> supported in the HMM framework at all, given the complexity around
> unlocking and retry handling (similar to other gup helpers) or should
> this complexity be offloaded to the caller side?
> 
> If it should be suported, then I see two ways to implement this:
> 
> 1. Introduce a new HMM_NEED_USER_FAULT flag to tell the framework that
>    the caller needs userfaultfd support, and extend the framework
>    accordingly. This is not intrusive to existing code. But if we want lazy
>    migration for GPU state in the future (and we likely do), GPU drivers
>    will have to be updated to set this flag later. The only real user of
>    this feature right now is the MSHV driver and I'd extend it to set
>    this flag right away and leave GPU HMM users for later.
> 
> 2. Add retry logic to hmm_handle_mm_fault() to handle VM_FAULT_RETRY and
>    VM_FAULT_COMPLETED, as required for userfaultfd support. This would be
>    transparent to users. But it would require significant changes: mm
>    relocking, VMA lookup, and other parts. Also, before doing that, the
>    kfd_svm driver would need to be changed to downgrade the mm lock to
>    read.

Exactly this. If we want this, it should be done the proper way, by teaching
calling code that the lock was dropped etc.

There should not be userfaultfd special-casing anywhere, it's just a matter of
teaching the code that we might get the mmap lock dropped.

Now, that might be a bit of work :)

What we do in mm/gup.c, is letting callers opt-in whether they can handle
getting the mmap lock dropped -- not opting in to userfaultfd.

See FOLL_UNLOCKABLE. Such an approach enables a step-wise implementation.
But note that FOLL_UNLOCKABLE is an internal flag. We set it automatically when
someone passes us a "locked" variable, indicating that they can deal with it.


See populate_vma_page_range() as one example.

-- 
Cheers,

David

next prev parent reply	other threads:[~2026-04-22 18:14 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-31 22:24 [RFC PATCH] mm/hmm: Add userfaultfd support to fault handling Stanislav Kinsburskii
2026-04-22  7:27 ` David Hildenbrand (Arm)
2026-04-22 14:59   ` Stanislav Kinsburskii
2026-04-22 18:14     ` David Hildenbrand (Arm) [this message]
  -- strict thread matches above, loose matches on Subject: below --
2026-04-02 16:37 Stanislav Kinsburskii

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b47ba580-2443-482f-8a92-558af82d13f2@kernel.org \
    --to=david@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=jgg@ziepe.ca \
    --cc=leon@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=skinsburskii@linux.microsoft.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox