From: Alex Williamson <alex@shazbot.org>
To: "David Hildenbrand (Arm)" <david@kernel.org>
Cc: "Boone, Max" <mboone@akamai.com>,
Max Boone via B4 Relay <devnull+mboone.akamai.com@kernel.org>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"stable@vger.kernel.org" <stable@vger.kernel.org>,
alex@shazbot.org
Subject: Re: [PATCH] vfio/type1: Retry follow_pfnmap_start() when PFNMAP is zapped
Date: Thu, 19 Mar 2026 08:30:39 -0600 [thread overview]
Message-ID: <20260319083039.03865989@shazbot.org> (raw)
In-Reply-To: <45e50068-751c-4e8c-a6b0-62cf8d1e58e6@kernel.org>
On Thu, 19 Mar 2026 14:18:49 +0100
"David Hildenbrand (Arm)" <david@kernel.org> wrote:
> On 3/19/26 09:36, Boone, Max wrote:
> >
> >
> >> On Mar 18, 2026, at 10:22 PM, Alex Williamson <alex@shazbot.org> wrote:
> >>
> >> […]
> >>
> >>> + /*
> >>> + * follow_pfnmap_start() returns -EINVAL for
> >>> + * invalid parameters and non-present entries.
> >>> + * If that happens here after a successful
> >>> + * fixup_user_fault(), it is likely that the
> >>> + * pfnmap has been zapped. Retry instead of
> >>> + * failing.
> >>> + */
> >>
> >> It's a little stronger than that, right? We're betting that the only
> >> remaining non-zero return is due to a race and we can introduce what
> >> appears to be potential for an infinite loop here because -EAGAIN will
> >> get kicked out to redo the vma_lookup() and fixup_user_fault() should
> >> return a genuine error if we're completely in the weeds. Should we
> >> make this a little stronger and more specific? Thanks,
> >
> > I’d say that the best case would be to have follow_pfnmap_start() return
> > -EINVAL or -ENOENT w.r.t. which of the two return values it is. But then
> > again, we could theoretically run into an infinite loop I guess - as the zap
> > and faulting could run in lockstep (the race window is extremely small
> > though).
>
> Well, in theory :) To hit that race repeatedly, you'd really have to be
> quite lucky I guess.
>
> But the real question is: if user space triggered the pinning, and user
> space keeps hurting itself to make progress, is that a real problem?
I'd say no, that's not a problem. It's really just that masking all
non-zero returns as -EAGAIN makes some assumptions about the other
error conditions that could change over time, so minimally those
dependencies should be clearly stated. Even better would be if we
didn't need to make those assumptions. Thanks,
Alex
> I guess the crucial part would be to
>
> a) Have some cond_resched(() in there?
> b) Checking for fatal signals somewhere?
> c) Possibly drop locks (mmap lock?) every now and then?
>
> For GUP, a) and b) are in place in __get_user_pages().
>
> c) might be done, but I think it's less deterministic.
>
> >
> > We could make the retry above bounded, and bubble up a -EBUSY such
> > that users of the ioctl can decide to retry instead of fail?
>
> Would that be a possible ABI break? You'd really have to only do that in
> a case where user space does stupid things, I guess.
>
> >
> > David, you mentioned that gup already has retry logic that we don’t have
> > with follow_fault_pfn() -> follow_pfnmap_start(). Would we potentially run
> > into an infinite loop with this change?
>
> GUP triggers page faults through faultin_page(). If handle_mm_fault()
> returns
>
> * VM_FAULT_COMPLETED we return -EAGAIN
> * VM_FAULT_ERROR we return the error
> * VM_FAULT_RETRY we return -EBUSY
> * Otherwise 0
>
> In the caller __get_user_pages(), we
> * Retry immediately with ret == 0
> * Return to the GUP caller (letting it retry) with -EBUSY/-EAGAIN
>
> Having at least a) and b) sounds reasonable. Not sure about having c),
> might be tricky if we are not allowed to drop the lock.
>
next prev parent reply other threads:[~2026-03-19 14:30 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-17 15:07 [PATCH] vfio/type1: Retry follow_pfnmap_start() when PFNMAP is zapped Max Boone via B4 Relay
2026-03-18 21:22 ` Alex Williamson
2026-03-19 8:36 ` Boone, Max
2026-03-19 13:18 ` David Hildenbrand (Arm)
2026-03-19 14:30 ` Alex Williamson [this message]
2026-03-19 19:44 ` David Hildenbrand (Arm)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260319083039.03865989@shazbot.org \
--to=alex@shazbot.org \
--cc=david@kernel.org \
--cc=devnull+mboone.akamai.com@kernel.org \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mboone@akamai.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox