All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH] Documentation: kvm: clarify SRCU locking order
Date: Thu, 5 Jan 2023 22:38:33 +0000	[thread overview]
Message-ID: <Y7dRaY+spKan+VcV@google.com> (raw)
In-Reply-To: <Y7RpB+trpnhVRhQW@google.com>

On Tue, Jan 03, 2023, Sean Christopherson wrote:
> On Wed, Dec 28, 2022, Paolo Bonzini wrote:
> > Currently only the locking order of SRCU vs kvm->slots_arch_lock
> > and kvm->slots_lock is documented.  Extend this to kvm->lock
> > since Xen emulation got it terribly wrong.
> > 
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> >  Documentation/virt/kvm/locking.rst | 19 ++++++++++++++-----
> >  1 file changed, 14 insertions(+), 5 deletions(-)
> > 
> > diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
> > index 845a561629f1..a3ca76f9be75 100644
> > --- a/Documentation/virt/kvm/locking.rst
> > +++ b/Documentation/virt/kvm/locking.rst
> > @@ -16,17 +16,26 @@ The acquisition orders for mutexes are as follows:
> >  - kvm->slots_lock is taken outside kvm->irq_lock, though acquiring
> >    them together is quite rare.
> >  
> > -- Unlike kvm->slots_lock, kvm->slots_arch_lock is released before
> > -  synchronize_srcu(&kvm->srcu).  Therefore kvm->slots_arch_lock
> > -  can be taken inside a kvm->srcu read-side critical section,
> > -  while kvm->slots_lock cannot.
> > -
> >  - kvm->mn_active_invalidate_count ensures that pairs of
> >    invalidate_range_start() and invalidate_range_end() callbacks
> >    use the same memslots array.  kvm->slots_lock and kvm->slots_arch_lock
> >    are taken on the waiting side in install_new_memslots, so MMU notifiers
> >    must not take either kvm->slots_lock or kvm->slots_arch_lock.
> >  
> > +For SRCU:
> > +
> > +- ``synchronize_srcu(&kvm->srcu)`` is called _inside_
> > +  the kvm->slots_lock critical section, therefore kvm->slots_lock
> > +  cannot be taken inside a kvm->srcu read-side critical section.
> > +  Instead, kvm->slots_arch_lock is released before the call
> > +  to ``synchronize_srcu()`` and _can_ be taken inside a
> > +  kvm->srcu read-side critical section.
> > +
> > +- kvm->lock is taken inside kvm->srcu, therefore
> 
> Prior to the recent Xen change, is this actually true?

I was thinking of a different change, but v5.19 is still kinda recent, so I'll
count it.  Better to be lucky than good :-)

Commit 2fd6df2f2b47 ("KVM: x86/xen: intercept EVTCHNOP_send from guests") introduced
the only case I can find where kvm->srcu is taken inside kvm->lock.

> There are many instances where kvm->srcu is taken inside kvm->lock, but I
> can't find any existing cases where the reverse is true.  Logically, it makes
> sense to take kvm->lock first since kvm->srcu can be taken deep in helpers,
> e.g. for accessing guest memory.  It's also more consistent to take kvm->lock
> first since kvm->srcu is taken inside vcpu->mutex, and vcpu->mutex is taken
> inside kvm->lock.
> 
> Disallowing synchronize_srcu(kvm->srcu) inside kvm->lock isn't probelmatic per se,
> but it's going to result in a weird set of rules because synchronize_scru() can,
> and is, called while holding a variety of other locks.
> 
> In other words, IMO taking kvm->srcu outside of kvm->lock in the Xen code is the
> real bug.

I'm doubing down on this.  Taking kvm->srcu outside of kvm->lock is all kinds of
sketchy, and likely indicates a larger problem.  The aformentioned commit that
introduced the problematic kvm->srcu vs. kvm->lock also blatantly violates ordering
between kvm->lock and vcpu->mutex.  Details in the link[*].

The vast majority of flows that take kvm->srcu will already hold a lock of some
kind, otherwise the task can't safely deference any VM/vCPU/device data and thus
has no reason to acquire kvm->srcu.  E.g. taking kvm->srcu to read guest memory
is nonsensical without a stable guest physical address to work with.

There are exceptions, e.g. evtchn_set_fn() and maybe some ioctls(), but in most
cases, taking kvm->lock inside kvm->srcu is just asking for problems.

[*] https://lore.kernel.org/all/Y7dN0Negds7XUbvI@google.com

  reply	other threads:[~2023-01-05 22:38 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-28 11:04 [PATCH] Documentation: kvm: clarify SRCU locking order Paolo Bonzini
2023-01-03 17:42 ` Sean Christopherson
2023-01-05 22:38   ` Sean Christopherson [this message]
2023-01-11 10:26     ` David Woodhouse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y7dRaY+spKan+VcV@google.com \
    --to=seanjc@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.