From: Sean Christopherson <seanjc@google.com>
To: Yu Zhang <yu.c.zhang@linux.intel.com>
Cc: kvm@vger.kernel.org, Marc Zyngier <maz@kernel.org>,
linux-kernel@vger.kernel.org, Peter Xu <peterx@redhat.com>,
David Stevens <stevensd@chromium.org>,
kvmarm@lists.linux.dev, linuxppc-dev@lists.ozlabs.org,
linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v7 2/8] KVM: Introduce __kvm_follow_pfn function
Date: Fri, 4 Aug 2023 15:03:41 -0700
Message-ID: <ZM11vUK3vIjjykaz@google.com>
In-Reply-To: <20230706145247.ddjqsvmfdeimzva6@linux.intel.com>
On Thu, Jul 06, 2023, Yu Zhang wrote:
> On Thu, Jul 06, 2023 at 02:29:24PM +0900, David Stevens wrote:
> > On Wed, Jul 5, 2023 at 7:53 PM Yu Zhang <yu.c.zhang@linux.intel.com> wrote:
> > >
> > > On Wed, Jul 05, 2023 at 06:22:59PM +0900, David Stevens wrote:
> > > > On Wed, Jul 5, 2023 at 12:10 PM Yu Zhang <yu.c.zhang@linux.intel.com> wrote:
> > > > >
> > > > > > @@ -2514,35 +2512,26 @@ static bool hva_to_pfn_fast(unsigned long addr, bool write_fault,
> > > > > > * The slow path to get the pfn of the specified host virtual address,
> > > > > > * 1 indicates success, -errno is returned if error is detected.
> > > > > > */
> > > > > > -static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault,
> > > > > > - bool interruptible, bool *writable, kvm_pfn_t *pfn)
> > > > > > +static int hva_to_pfn_slow(struct kvm_follow_pfn *foll, kvm_pfn_t *pfn)
> > > > > > {
> > > > > > - unsigned int flags = FOLL_HWPOISON;
> > > > > > + unsigned int flags = FOLL_HWPOISON | FOLL_GET | foll->flags;
> > > > > > struct page *page;
> > > > > > int npages;
> > > > > >
> > > > > > might_sleep();
> > > > > >
> > > > > > - if (writable)
> > > > > > - *writable = write_fault;
> > > > > > -
> > > > > > - if (write_fault)
> > > > > > - flags |= FOLL_WRITE;
> > > > > > - if (async)
> > > > > > - flags |= FOLL_NOWAIT;
> > > > > > - if (interruptible)
> > > > > > - flags |= FOLL_INTERRUPTIBLE;
> > > > > > -
> > > > > > - npages = get_user_pages_unlocked(addr, 1, &page, flags);
> > > > > > + npages = get_user_pages_unlocked(foll->hva, 1, &page, flags);
> > > > > > if (npages != 1)
> > > > > > return npages;
> > > > > >
> > > > > > + foll->writable = (foll->flags & FOLL_WRITE) && foll->allow_write_mapping;
> > > > > > +
> > > > > > /* map read fault as writable if possible */
> > > > > > - if (unlikely(!write_fault) && writable) {
> > > > > > + if (unlikely(!foll->writable) && foll->allow_write_mapping) {
> > > > >
> > > > > I guess !foll->writable should be !(foll->flags & FOLL_WRITE) here.
> > > >
> > > > The two statements are logically equivalent, although I guess using
> > > > !(foll->flags & FOLL_WRITE) may be a little clearer, if a little more
> > > > verbose.
> > >
> > > Well, as the comment says, we wanna try to map the read fault as writable
> > > whenever possible. And __gfn_to_pfn_memslot() will only set the FOLL_WRITE
> > > for write faults. So I guess using !foll->writable will not allow this.
> > > Did I miss anything?
> >
> > We just set the foll->writable out parameter to be equal to
> > ((foll->flags & FOLL_WRITE) && foll->allow_write_mapping). Taking a =
> > foll->flags & FOLL_WRITE and b = foll->allow_write_mapping, we have
> > !(a && b) && b -> (!a || !b) && b -> (!a && b) || (!b && b) -> !a &&
> > b.
>
> Ouch, my bad again... I typed "!foll->writable", but missed the "!" in
> my head while calculating... Thanks! :)
The code is funky and confusing though. Specifically, FOLL_WRITE without
allow_write_mapping is nonsensical, and yields the even more nonsensical output
of a successful FOLL_WRITE with foll->writable==%false.
It "works" because callers only consume foll->writable when foll->allow_write_mapping
is true, but relying on that is ugly and completely unnecessary. Similarly, the
"allow" terminology is misleading. FOLL_WRITE *always* allows writable mappings.
This wasn't as much of a problem in the previous code because the lower levels took
the "bool *writable" pointer, i.e. avoided the "allow" terminology entirely.
So we should either keep that behavior, i.e. replace "bool allow_write_mapping"
with "bool *writable", or rename allow_write_mapping to something like
opportunistically_map_writable, and then unconditionally set foll->writable
whenever KVM obtains a writable mapping, i.e. regardless of whether the original
fault was a read or a write.
My vote is for the latter. If opportunistically_map_writable is too verbose,
try_map_writable would be another option. Hmm, I'll make "try_map_writable" my
official vote.
Ah, and I also vote to use an if-else instead of unconditionally setting foll->writable.
That makes the relationship between FOLL_WRITE and try_map_writable a bit more
obvious IMO. E.g.
static bool hva_to_pfn_fast(struct kvm_follow_pfn *foll, kvm_pfn_t *pfn)
{
	struct page *page[1];

	/*
	 * Fast pin a writable pfn only if it is a write fault request
	 * or the caller allows to map a writable pfn for a read fault
	 * request.
	 */
	if (!((foll->flags & FOLL_WRITE) || foll->try_map_writable))
		return false;

	if (get_user_page_fast_only(foll->hva, FOLL_WRITE, page)) {
		*pfn = page_to_pfn(page[0]);
		foll->writable = true;
		return true;
	}

	return false;
}
/*
 * The slow path to get the pfn of the specified host virtual address,
 * 1 indicates success, -errno is returned if error is detected.
 */
static int hva_to_pfn_slow(struct kvm_follow_pfn *foll, kvm_pfn_t *pfn)
{
	unsigned int flags = FOLL_HWPOISON | FOLL_GET | foll->flags;
	struct page *page;
	int npages;

	might_sleep();

	npages = get_user_pages_unlocked(foll->hva, 1, &page, flags);
	if (npages != 1)
		return npages;

	if (foll->flags & FOLL_WRITE) {
		foll->writable = true;
	} else if (foll->try_map_writable) {
		struct page *wpage;

		/* map read fault as writable if possible */
		if (get_user_page_fast_only(foll->hva, FOLL_WRITE, &wpage)) {
			foll->writable = true;
			put_page(page);
			page = wpage;
		}
	}

	*pfn = page_to_pfn(page);
	return npages;
}