linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Zhi Wang <zhi.wang.linux@gmail.com>
Cc: kvm@vger.kernel.org, Marc Zyngier <maz@kernel.org>,
	linux-kernel@vger.kernel.org, Peter Xu <peterx@redhat.com>,
	David Stevens <stevensd@chromium.org>,
	kvmarm@lists.linux.dev, linuxppc-dev@lists.ozlabs.org,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v7 3/8] KVM: Make __kvm_follow_pfn not imply FOLL_GET
Date: Tue, 11 Jul 2023 14:59:59 -0700	[thread overview]
Message-ID: <ZK3Q34WNLjGVQQw+@google.com> (raw)
In-Reply-To: <20230711203348.00000fb8.zhi.wang.linux@gmail.com>

On Tue, Jul 11, 2023, Zhi Wang wrote:
> On Thu, 6 Jul 2023 15:49:39 +0900
> David Stevens <stevensd@chromium.org> wrote:
> 
> > On Wed, Jul 5, 2023 at 10:19___PM Zhi Wang <zhi.wang.linux@gmail.com> wrote:
> > >
> > > On Tue,  4 Jul 2023 16:50:48 +0900
> > > David Stevens <stevensd@chromium.org> wrote:
> > > If yes, do we have to use FOLL_GET to resolve GFN associated with a tail page?
> > > It seems gup can tolerate gup_flags without FOLL_GET, but it is more like a
> > > temporary solution. I don't think it is a good idea to play tricks with
> > > a temporary solution, more like we are abusing the toleration.  
> > 
> > I'm not sure I understand what you're getting at. This series never
> > calls gup without FOLL_GET.
> > 
> > This series aims to provide kvm_follow_pfn as a unified API on top of
> > gup+follow_pte. Since one of the major clients of this API uses an mmu
> > notifier, it makes sense to support returning a pfn without taking a
> > reference. And we indeed need to do that for certain types of memory.
> > 
> 
> I am not having prob with taking a pfn without taking a ref. I am
> questioning if using !FOLL_GET in struct kvm_follow_pfn to indicate taking
> a pfn without a ref is a good idea or not, while there is another flag
> actually showing it.
> 
> I can understand that using FOLL_XXX in kvm_follow_pfn saves some
> translation between struct kvm_follow_pfn.{write, async, xxxx} and GUP
> flags. However FOLL_XXX is for GUP. Using FOLL_XXX for reflecting the
> requirements of GUP in the code path that going to call GUP is reasonable.
> 
> But using FOLL_XXX with purposes that are not related to GUP call really
> feels off.

I agree, assuming you're talking specifically about the logic in hva_to_pfn_remapped()
that handles non-refcounted pages, i.e. this

	if (get_page_unless_zero(page)) {
		foll->is_refcounted_page = true;
		if (!(foll->flags & FOLL_GET))
			put_page(page);
	} else if (foll->flags & FOLL_GET) {
		r = -EFAULT;
	}

should be

	if (get_page_unless_zero(page)) {
		foll->is_refcounted_page = true;
		if (!(foll->flags & FOLL_GET))
			put_page(page);
	else if (!foll->guarded_by_mmu_notifier)
		r = -EFAULT;

because it's not the desire to grab a reference that makes getting non-refcounted
pfns "safe", it's whether or not the caller is plugged into the MMU notifiers.

Though that highlights that checking guarded_by_mmu_notifier should be done for
*all* non-refcounted pfns, not just non-refcounted struct page memory.

As for the other usage of FOLL_GET in this series (using it to conditionally do
put_page()), IMO that's very much related to the GUP call.  Invoking put_page()
is a hack to workaround the fact that GUP doesn't provide a way to get the pfn
without grabbing a reference to the page.  In an ideal world, KVM would NOT pass
FOLL_GET to the various GUP helpers, i.e. FOLL_GET would be passed as-is and KVM
wouldn't "need" to kinda sorta overload FOLL_GET to manually drop the reference.

I do think it's worth providing a helper to consolidate and document that hacky
code, e.g. add a kvm_follow_refcounted_pfn() helper.

All in all, I think the below (completely untested) is what we want?

David (and others), I am planning on doing a full review of this series "soon",
but it will likely be a few weeks until that happens.  I jumped in on this
specific thread because this caught my eye and I really don't want to throw out
*all* of the FOLL_GET usage.

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 5b5afd70f239..90d424990e0a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2481,6 +2481,25 @@ static inline int check_user_page_hwpoison(unsigned long addr)
        return rc == -EHWPOISON;
 }
 
+static kvm_pfn_t kvm_follow_refcounted_pfn(struct kvm_follow_pfn *foll,
+                                          struct page *page)
+{
+       kvm_pfn_t pfn = page_to_pfn(page);
+
+       foll->is_refcounted_page = true;
+
+       /*
+        * FIXME: Ideally, KVM wouldn't pass FOLL_GET to gup() when the caller
+        * doesn't want to grab a reference, but gup() doesn't support getting
+        * just the pfn, i.e. FOLL_GET is effectively mandatory.  If that ever
+        * changes, drop this and simply don't pass FOLL_GET to gup().
+        */
+       if (!(foll->flags & FOLL_GET))
+               put_page(page);
+
+       return pfn;
+}
+
 /*
  * The fast path to get the writable pfn which will be stored in @pfn,
  * true indicates success, otherwise false is returned.  It's also the
@@ -2500,11 +2519,9 @@ static bool hva_to_pfn_fast(struct kvm_follow_pfn *foll, kvm_pfn_t *pfn)
                return false;
 
        if (get_user_page_fast_only(foll->hva, FOLL_WRITE, page)) {
-               *pfn = page_to_pfn(page[0]);
                foll->writable = foll->allow_write_mapping;
-               foll->is_refcounted_page = true;
-               if (!(foll->flags & FOLL_GET))
-                       put_page(page[0]);
+
+               *pfn = kvm_follow_refcounted_pfn(foll, page[0]);
                return true;
        }
 
@@ -2528,7 +2545,6 @@ static int hva_to_pfn_slow(struct kvm_follow_pfn *foll, kvm_pfn_t *pfn)
                return npages;
 
        foll->writable = (foll->flags & FOLL_WRITE) && foll->allow_write_mapping;
-       foll->is_refcounted_page = true;
 
        /* map read fault as writable if possible */
        if (unlikely(!foll->writable) && foll->allow_write_mapping) {
@@ -2540,9 +2556,8 @@ static int hva_to_pfn_slow(struct kvm_follow_pfn *foll, kvm_pfn_t *pfn)
                        page = wpage;
                }
        }
-       *pfn = page_to_pfn(page);
-       if (!(foll->flags & FOLL_GET))
-               put_page(page);
+
+       *pfn = kvm_follow_refcounted_pfn(foll, page);
        return npages;
 }
 
@@ -2610,17 +2625,16 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma, struct kvm_follow_pfn
        if (!page)
                goto out;
 
-       if (get_page_unless_zero(page)) {
-               foll->is_refcounted_page = true;
-               if (!(foll->flags & FOLL_GET))
-                       put_page(page);
-       } else if (foll->flags & FOLL_GET) {
-               r = -EFAULT;
-       }
-
+       if (get_page_unless_zero(page))
+               WARN_ON_ONCE(kvm_follow_refcounted_pfn(foll, page) != pfn);
 out:
        pte_unmap_unlock(ptep, ptl);
-       *p_pfn = pfn;
+
+       if (!foll->is_refcounted_page && !foll->guarded_by_mmu_notifier &&
+           !allow_unsafe_mappings)
+               r = -EFAULT;
+       else
+               *p_pfn = pfn;
 
        return r;
 }


  reply	other threads:[~2023-07-11 22:01 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-04  7:50 [PATCH v7 0/8] KVM: allow mapping non-refcounted pages David Stevens
2023-07-04  7:50 ` [PATCH v7 1/8] KVM: Assert that a page's refcount is elevated when marking accessed/dirty David Stevens
2023-07-04  7:50 ` [PATCH v7 2/8] KVM: Introduce __kvm_follow_pfn function David Stevens
2023-07-05  3:10   ` Yu Zhang
2023-07-05  9:22     ` David Stevens
2023-07-05 10:53       ` Yu Zhang
2023-07-06  5:29         ` David Stevens
2023-07-06 14:52           ` Yu Zhang
2023-08-04 22:03             ` Sean Christopherson
2023-07-05  8:47   ` Zhi Wang
2023-07-05  9:08     ` David Stevens
2023-07-11 17:37       ` Zhi Wang
2023-07-06  1:34   ` Isaku Yamahata
2023-07-06  5:52     ` David Stevens
2023-08-04 22:13   ` Sean Christopherson
2023-07-04  7:50 ` [PATCH v7 3/8] KVM: Make __kvm_follow_pfn not imply FOLL_GET David Stevens
2023-07-05  7:23   ` Yu Zhang
2023-07-05 11:56   ` Yu Zhang
2023-07-06  6:09     ` David Stevens
2023-07-05 13:19   ` Zhi Wang
2023-07-06  6:49     ` David Stevens
2023-07-11 17:33       ` Zhi Wang
2023-07-11 21:59         ` Sean Christopherson [this message]
2023-09-05  8:26           ` David Stevens
2023-09-06  0:45             ` Sean Christopherson
2023-09-06  3:24               ` David Stevens
2023-09-06 22:03                 ` Sean Christopherson
2023-07-04  7:50 ` [PATCH v7 4/8] KVM: x86/mmu: Migrate to __kvm_follow_pfn David Stevens
2023-07-05  8:07   ` Yu Zhang
2023-08-04 22:30     ` Sean Christopherson
2023-07-06  1:54   ` Isaku Yamahata
2023-08-24  8:03     ` David Stevens
2023-07-04  7:50 ` [PATCH v7 5/8] KVM: x86/mmu: Don't pass FOLL_GET " David Stevens
2023-07-05 10:18   ` Yu Zhang
2023-07-05 14:17     ` Yu Zhang
2023-07-06  4:52     ` David Stevens
2023-07-06  7:19       ` Yu Zhang
2023-07-06 15:58       ` Isaku Yamahata
2023-07-07  1:35         ` David Stevens
2023-07-10 16:34           ` Isaku Yamahata
2023-07-11  2:59             ` David Stevens
2023-08-04 22:45       ` Sean Christopherson
2023-07-05 10:25   ` Yu Zhang
2023-08-24  8:03     ` David Stevens
2023-08-24 15:15       ` Sean Christopherson
2023-08-25  1:38         ` David Stevens
2023-08-31 21:18           ` Sean Christopherson
2023-07-06  2:10   ` Isaku Yamahata
2023-07-06  5:18     ` David Stevens
2023-07-19  6:09   ` Yan Zhao
2023-07-19  7:16     ` David Stevens
2023-07-04  7:50 ` [PATCH v7 6/8] KVM: arm64: Migrate " David Stevens
2023-07-04  7:50 ` [PATCH v7 7/8] KVM: PPC: " David Stevens
2023-07-04  7:50 ` [PATCH v7 8/8] KVM: remove __gfn_to_pfn_memslot David Stevens
2023-08-04 22:47 ` [PATCH v7 0/8] KVM: allow mapping non-refcounted pages Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZK3Q34WNLjGVQQw+@google.com \
    --to=seanjc@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=maz@kernel.org \
    --cc=peterx@redhat.com \
    --cc=stevensd@chromium.org \
    --cc=zhi.wang.linux@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).