linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Keir Fraser <keirf@google.com>
To: Sean Christopherson <seanjc@google.com>
Cc: linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Eric Auger <eric.auger@redhat.com>,
	Oliver Upton <oliver.upton@linux.dev>,
	Marc Zyngier <maz@kernel.org>, Will Deacon <will@kernel.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Li RongQing <lirongqing@baidu.com>
Subject: Re: [PATCH 3/3] KVM: Avoid synchronize_srcu() in kvm_io_bus_register_dev()
Date: Mon, 30 Jun 2025 09:59:10 +0000	[thread overview]
Message-ID: <aGJf7v9EQoEZiQUk@google.com> (raw)
In-Reply-To: <aFrANSe6fJOfMpOC@google.com>

On Tue, Jun 24, 2025 at 08:11:49AM -0700, Sean Christopherson wrote:
> +Li
> 
> On Tue, Jun 24, 2025, Keir Fraser wrote:
> > Device MMIO registration may happen quite frequently during VM boot,
> > and the SRCU synchronization each time has a measurable effect
> > on VM startup time. In our experiments it can account for around 25%
> > of a VM's startup time.
> > 
> > Replace the synchronization with a deferred free of the old kvm_io_bus
> > structure.
> > 
> > Signed-off-by: Keir Fraser <keirf@google.com>
> > ---
> >  include/linux/kvm_host.h |  1 +
> >  virt/kvm/kvm_main.c      | 10 ++++++++--
> >  2 files changed, 9 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index 3bde4fb5c6aa..28a63f1ad314 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -205,6 +205,7 @@ struct kvm_io_range {
> >  struct kvm_io_bus {
> >  	int dev_count;
> >  	int ioeventfd_count;
> > +	struct rcu_head rcu;
> >  	struct kvm_io_range range[];
> >  };
> >  
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index eec82775c5bf..b7d4da8ba0b2 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -5924,6 +5924,13 @@ int kvm_io_bus_read(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
> >  }
> >  EXPORT_SYMBOL_GPL(kvm_io_bus_read);
> >  
> > +static void __free_bus(struct rcu_head *rcu)
> > +{
> > +	struct kvm_io_bus *bus = container_of(rcu, struct kvm_io_bus, rcu);
> > +
> > +	kfree(bus);
> > +}
> > +
> >  int kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr,
> >  			    int len, struct kvm_io_device *dev)
> >  {
> > @@ -5962,8 +5969,7 @@ int kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr,
> >  	memcpy(new_bus->range + i + 1, bus->range + i,
> >  		(bus->dev_count - i) * sizeof(struct kvm_io_range));
> >  	rcu_assign_pointer(kvm->buses[bus_idx], new_bus);
> > -	synchronize_srcu_expedited(&kvm->srcu);
> > -	kfree(bus);
> > +	call_srcu(&kvm->srcu, &bus->rcu, __free_bus);
> 
> I'm 99% certain this will break ABI.  KVM needs to ensure all readers are guaranteed
> to see the new device prior to returning to userspace.

I'm not sure I understand this. How can userspace (or a guest VCPU)
know that it is executing *after* the MMIO registration, except via
some form of synchronization or other ordering of its own? For
example, that PCI BAR setup happens as part of PCI probing happening
early in device registration in the guest OS, strictly before the MMIO
region will be accessed. Otherwise the access is inherently racy
against the registration?

> I'm quite confident there
> are other flows that rely on the synchronization, the vGIC case is simply the one
> that's documented.

If they're in the kernel they can be fixed? If necessary I'll go audit the callers.

 Regards,
  Keir

> https://lore.kernel.org/all/aAkAY40UbqzQNr8m@google.com


  reply	other threads:[~2025-06-30 10:10 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-24  9:22 [PATCH 0/3] KVM: Speed up MMIO registrations Keir Fraser
2025-06-24  9:22 ` [PATCH 1/3] KVM: arm64: vgic-init: Remove vgic_ready() macro Keir Fraser
2025-06-24  9:22 ` [PATCH 2/3] KVM: arm64: vgic: Explicitly implement vgic_dist::ready ordering Keir Fraser
2025-06-24  9:22 ` [PATCH 3/3] KVM: Avoid synchronize_srcu() in kvm_io_bus_register_dev() Keir Fraser
2025-06-24 15:11   ` Sean Christopherson
2025-06-30  9:59     ` Keir Fraser [this message]
2025-07-07 18:49       ` Sean Christopherson
2025-07-10  6:51         ` Keir Fraser
2025-07-10 13:34           ` Sean Christopherson
2025-07-15 12:43         ` Keir Fraser
2025-07-15 14:01           ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aGJf7v9EQoEZiQUk@google.com \
    --to=keirf@google.com \
    --cc=eric.auger@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lirongqing@baidu.com \
    --cc=maz@kernel.org \
    --cc=oliver.upton@linux.dev \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).