From: "David Hildenbrand (Arm)" <david@kernel.org>
To: "Boone, Max" <mboone@akamai.com>
Cc: Alex Williamson <alex@shazbot.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@suse.cz>, Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Tottenham, Max" <mtottenh@akamai.com>,
	"Hunt, Joshua" <johunt@akamai.com>,
	"Pelland, Matt" <mpelland@akamai.com>
Subject: Re: [RFC 1/1] mm/pagewalk: don't split device-backed huge pfnmaps
Date: Wed, 11 Mar 2026 11:45:52 +0100
Message-ID: <d7f3c613-92f5-4400-812e-a368bee45a41@kernel.org>
In-Reply-To: <0A734214-9E4B-4A65-AFB1-055883639358@akamai.com>

On 3/11/26 11:34, Boone, Max wrote:
> 
> 
>> On Mar 11, 2026, at 10:59 AM, David Hildenbrand (Arm) <david@kernel.org> wrote:
>>
>>> The -EINVAL originates from:
>>>
>>> vfio_dma_do_map -> vfio_pin_map_dma -> vfio_pin_pages_remote
>>> -> vaddr_get_pfns -> pin_user_pages_remote (mm/gup.c)
>>>
>>> Possibly that’s also the origin of the concurrent PUD modification that requires
>>> the retry in the walker in this patch.
>>
>> We'd have to find out why we manage to trigger a -EINVAL here. I don't
>> see how anything that this patch does could trigger that. So maybe a
>> problem in user space? (calling it on unsupported VMAs?).
>>
> 
> It looks like I was mistaken about the EINVAL coming from
> pin_user_pages_remote; it actually originates from:
> 
> vfio_dma_do_map
>   -> vfio_pin_map_dma
>     -> vfio_pin_pages_remote
>       -> vaddr_get_pfns
>         -> follow_fault_pfn
>           -> follow_pfnmap_start (mm/memory.c)
> 
> In vfio_iommu_type1.c, follow_fault_pfn first checks whether follow_pfnmap_start
> returns an error; if it does, it calls fixup_user_fault to fault the mapping in and then 
> retries follow_pfnmap_start to obtain the PFN.
> 
> It sounds to me like the walker is re-splitting the PUD entry between
> the fixup_user_fault and follow_pfnmap_start calls?

Could be. IIRC, there are also scenarios where handle_mm_fault() just
returns success even though nothing was faulted in (e.g., when it
detects some races).

The code in follow_fault_pfn() should likely be updated to handle more
than one attempt. That's also what GUP does.

Likely, follow_fault_pfn() was never taught about PFNMAP mappings that
can be faulted+zapped (in the past they were always static).

If you turn that into a (possibly) endless loop, does the problem go away?

-- 
Cheers,

David

