All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: David Hildenbrand <david@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	David Matlack <dmatlack@google.com>,
	kvm@vger.kernel.org,  linux-kernel@vger.kernel.org,
	David Stevens <stevensd@chromium.org>,
	 Matthew Wilcox <willy@infradead.org>
Subject: Re: [RFC PATCH 0/4] KVM: x86/mmu: Rework marking folios dirty/accessed
Date: Fri, 5 Apr 2024 06:59:58 -0700	[thread overview]
Message-ID: <ZhAD3hQwI0ltYnFp@google.com> (raw)
In-Reply-To: <7a1c58d7-ddd9-40fc-a4ef-81c548de2b07@redhat.com>

On Fri, Apr 05, 2024, David Hildenbrand wrote:
> On 05.04.24 11:37, Paolo Bonzini wrote:
> > On Fri, Apr 5, 2024 at 8:53 AM David Hildenbrand <david@redhat.com> wrote:
> > > >        mmu_notifier_invalidate_range_start(&range);
> > > >        tlb_start_vma(&tlb, vma);
> > > >        walk_page_range(vma->vm_mm, range.start, range.end,
> > > >                        &madvise_free_walk_ops, &tlb);
> > > >        tlb_end_vma(&tlb, vma);
> > > >        mmu_notifier_invalidate_range_end(&range);
> > > > 
> > > 
> > > Indeed, we do setup the MMU notifier invalidation. We do the start/end
> > > ... I was looking for PTE notifications.
> > > 
> > > I spotted the set_pte_at(), not a set_pte_at_notify() like we do in
> > > other code. Maybe that's not required here (digging through
> > > documentation I'm still left clueless). [...]
> > > Absolutely unclear to me when we *must* to use it, or if it is. Likely
> > > its a pure optimization when we're *changing* a PTE.
> > 
> > Yes, .change_pte() is just an optimization. The original point of it
> > was for KSM, so that KVM could flip the sPTE to a new location without
> > first zapping it. At the time there was also an .invalidate_page()
> > callback, and both of them were *not* bracketed by calls to
> > mmu_notifier_invalidate_range_{start,end}()
> > 
> > Later on, both callbacks were changed to occur within an
> > invalidate_range_start/end() block.
> > 
> > Commit 6bdb913f0a70 ("mm: wrap calls to set_pte_at_notify with
> > invalidate_range_start and invalidate_range_end", 2012-10-09) did so
> > for .change_pte(). The reason to do so was a bit roundabout, namely to
> > allow sleepable .invalidate_page() hooks (because .change_pte() is not
> > sleepable and at the time .invalidate_page() was used as a fallback
> > for .change_pte()).
> > 
> > This however made KVM's usage of the .change_pte() callback completely
> > moot, because KVM unmaps the sPTEs during .invalidate_range_start()
> > and therefore .change_pte() has no hope of succeeding.
> > 
> > (Commit 369ea8242c0f ("mm/rmap: update to new mmu_notifier semantic
> > v2", 2017-08-31) is where the other set of non-bracketed calls to MMU
> > notifier callbacks was changed; calls to
> > mmu_notifier_invalidate_page() were replaced by calls to
> > mmu_notifier_invalidate_range(), bracketed by calls to
> > mmu_notifier_invalidate_range_{start,end}()).
> > 
> > Since KVM is the only user of .change_pte(), we can remove
> > .change_pte() and set_pte_at_notify() completely.
> 
> Nice, thanks for all that information!

Ya, from commit c13fda237f08 ("KVM: Assert that notifier count is elevated in
.change_pte()"):

    x86 and MIPS are clearcut nops if the old SPTE is not-present, and that
    is guaranteed due to the prior invalidation.  PPC simply unmaps the SPTE,
    which again should be a nop due to the invalidation.  arm64 is a bit
    murky, but it's also likely a nop because kvm_pgtable_stage2_map() is
    called without a cache pointer, which means it will map an entry if and
    only if an existing PTE was found.

I'm 100% in favor of removing .change_pte().  As I've said multiple times, the
only reason I haven't sent a patch is because I didn't want it to prompt someone
into resurrecting the original behavior. :-)

  reply	other threads:[~2024-04-05 14:00 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-20  0:50 [RFC PATCH 0/4] KVM: x86/mmu: Rework marking folios dirty/accessed Sean Christopherson
2024-03-20  0:50 ` [RFC PATCH 1/4] KVM: x86/mmu: Skip the "try unsync" path iff the old SPTE was a leaf SPTE Sean Christopherson
2024-03-20  0:50 ` [RFC PATCH 2/4] KVM: x86/mmu: Mark folio dirty when creating SPTE, not when zapping/modifying Sean Christopherson
2024-03-20  0:50 ` [RFC PATCH 3/4] KVM: x86/mmu: Mark page/folio accessed only when zapping leaf SPTEs Sean Christopherson
2024-03-20  0:50 ` [RFC PATCH 4/4] KVM: x86/mmu: Don't force flush if SPTE update clears Accessed bit Sean Christopherson
2024-03-20 12:56 ` [RFC PATCH 0/4] KVM: x86/mmu: Rework marking folios dirty/accessed David Hildenbrand
2024-04-02 17:38   ` David Matlack
2024-04-02 18:31     ` David Hildenbrand
2024-04-03  0:17       ` Sean Christopherson
2024-04-03 21:43         ` David Hildenbrand
2024-04-03 22:19           ` Sean Christopherson
2024-04-04 15:44             ` David Hildenbrand
2024-04-04 17:31               ` Sean Christopherson
2024-04-04 18:23                 ` David Hildenbrand
2024-04-04 22:02                   ` Sean Christopherson
2024-04-05  6:53                     ` David Hildenbrand
2024-04-05  9:37                       ` Paolo Bonzini
2024-04-05 10:14                         ` David Hildenbrand
2024-04-05 13:59                           ` Sean Christopherson [this message]
2024-04-05 14:06                             ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZhAD3hQwI0ltYnFp@google.com \
    --to=seanjc@google.com \
    --cc=david@redhat.com \
    --cc=dmatlack@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=stevensd@chromium.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.