All of lore.kernel.org
 help / color / mirror / Atom feed
From: Elena Afanasova <eafanasova@gmail.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: kvm@vger.kernel.org, jag.raman@oracle.com, elena.ufimtseva@oracle.com
Subject: Re: [RFC v2 3/4] KVM: add support for ioregionfd cmds/replies serialization
Date: Wed, 03 Feb 2021 06:10:25 -0800	[thread overview]
Message-ID: <dc35fdd3eb2febbe49cfd6561da6faf045f12ee3.camel@gmail.com> (raw)
In-Reply-To: <20210130185415.GD98016@stefanha-x1.localdomain>

On Sat, 2021-01-30 at 18:54 +0000, Stefan Hajnoczi wrote:
> On Thu, Jan 28, 2021 at 09:32:22PM +0300, Elena Afanasova wrote:
> > Add ioregionfd context and kvm_io_device_ops->prepare/finish()
> > in order to serialize all bytes requested by guest.
> > 
> > Signed-off-by: Elena Afanasova <eafanasova@gmail.com>
> > ---
> >  arch/x86/kvm/x86.c       |  19 ++++++++
> >  include/kvm/iodev.h      |  14 ++++++
> >  include/linux/kvm_host.h |   4 ++
> >  virt/kvm/ioregion.c      | 102 +++++++++++++++++++++++++++++++++
> > ------
> >  virt/kvm/kvm_main.c      |  32 ++++++++++++
> >  5 files changed, 157 insertions(+), 14 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index a04516b531da..393fb0f4bf46 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -5802,6 +5802,8 @@ static int vcpu_mmio_write(struct kvm_vcpu
> > *vcpu, gpa_t addr, int len,
> >  	int ret = 0;
> >  	bool is_apic;
> >  
> > +	kvm_io_bus_prepare(vcpu, KVM_MMIO_BUS, addr, len);
> > +
> >  	do {
> >  		n = min(len, 8);
> >  		is_apic = lapic_in_kernel(vcpu) &&
> > @@ -5823,8 +5825,10 @@ static int vcpu_mmio_write(struct kvm_vcpu
> > *vcpu, gpa_t addr, int len,
> >  	if (ret == -EINTR) {
> >  		vcpu->run->exit_reason = KVM_EXIT_INTR;
> >  		++vcpu->stat.signal_exits;
> > +		return handled;
> >  	}
> >  #endif
> > +	kvm_io_bus_finish(vcpu, KVM_MMIO_BUS, addr, len);
> 
> Hmm...it would be nice for kvm_io_bus_prepare() to return the idx or
> the
> device pointer so the devices don't need to be searched in
> read/write/finish. However, it's complicated by the loop which may
> access multiple devices.
> 
Agree

> > @@ -9309,6 +9325,7 @@ static int complete_ioregion_mmio(struct
> > kvm_vcpu *vcpu)
> >  		vcpu->mmio_cur_fragment++;
> >  	}
> >  
> > +	vcpu->ioregion_ctx.dev->ops->finish(vcpu->ioregion_ctx.dev);
> >  	vcpu->mmio_needed = 0;
> >  	if (!vcpu->ioregion_ctx.in) {
> >  		srcu_read_unlock(&vcpu->kvm->srcu, idx);
> > @@ -9333,6 +9350,7 @@ static int complete_ioregion_pio(struct
> > kvm_vcpu *vcpu)
> >  		vcpu->ioregion_ctx.val += vcpu->ioregion_ctx.len;
> >  	}
> >  
> > +	vcpu->ioregion_ctx.dev->ops->finish(vcpu->ioregion_ctx.dev);
> >  	if (vcpu->ioregion_ctx.in)
> >  		r = kvm_emulate_instruction(vcpu, EMULTYPE_NO_DECODE);
> >  	srcu_read_unlock(&vcpu->kvm->srcu, idx);
> > @@ -9352,6 +9370,7 @@ static int complete_ioregion_fast_pio(struct
> > kvm_vcpu *vcpu)
> >  	complete_ioregion_access(vcpu, vcpu->ioregion_ctx.addr,
> >  				 vcpu->ioregion_ctx.len,
> >  				 vcpu->ioregion_ctx.val);
> > +	vcpu->ioregion_ctx.dev->ops->finish(vcpu->ioregion_ctx.dev);
> >  	srcu_read_unlock(&vcpu->kvm->srcu, idx);
> >  
> >  	if (vcpu->ioregion_ctx.in) {
> 
> Normally userspace will invoke ioctl(KVM_RUN) and reach one of these
> completion functions, but what if the vcpu fd is closed instead?
> ->finish() should still be called to avoid leaks.
> 
Will fix

> > diff --git a/include/kvm/iodev.h b/include/kvm/iodev.h
> > index d75fc4365746..db8a3c69b7bb 100644
> > --- a/include/kvm/iodev.h
> > +++ b/include/kvm/iodev.h
> > @@ -25,6 +25,8 @@ struct kvm_io_device_ops {
> >  		     gpa_t addr,
> >  		     int len,
> >  		     const void *val);
> > +	void (*prepare)(struct kvm_io_device *this);
> > +	void (*finish)(struct kvm_io_device *this);
> >  	void (*destructor)(struct kvm_io_device *this);
> >  };
> >  
> > @@ -55,6 +57,18 @@ static inline int kvm_iodevice_write(struct
> > kvm_vcpu *vcpu,
> >  				 : -EOPNOTSUPP;
> >  }
> >  
> > +static inline void kvm_iodevice_prepare(struct kvm_io_device *dev)
> > +{
> > +	if (dev->ops->prepare)
> > +		dev->ops->prepare(dev);
> > +}
> > +
> > +static inline void kvm_iodevice_finish(struct kvm_io_device *dev)
> > +{
> > +	if (dev->ops->finish)
> > +		dev->ops->finish(dev);
> > +}
> 
> A performance optimization: keep a separate list of struct
> kvm_io_devices that implement prepare/finish. That way the search
> doesn't need to iterate over devices that don't support this
> interface.
> 
Thanks for the idea

> Before implementing an optimization like this it would be good to
> check
> how this patch affects performance on guests with many in-kernel
> devices
> (e.g. a guest that has many multi-queue virtio-net/blk devices with
> ioeventfd). ioregionfd shouldn't reduce performance of existing KVM
> configurations, so it's worth measuring.
> 
> > diff --git a/virt/kvm/ioregion.c b/virt/kvm/ioregion.c
> > index da38124e1418..3474090ccc8c 100644
> > --- a/virt/kvm/ioregion.c
> > +++ b/virt/kvm/ioregion.c
> > @@ -1,6 +1,6 @@
> >  // SPDX-License-Identifier: GPL-2.0-only
> >  #include <linux/kvm_host.h>
> > -#include <linux/fs.h>
> > +#include <linux/wait.h>
> >  #include <kvm/iodev.h>
> >  #include "eventfd.h"
> >  #include <uapi/linux/ioregion.h>
> > @@ -12,15 +12,23 @@ kvm_ioregionfd_init(struct kvm *kvm)
> >  	INIT_LIST_HEAD(&kvm->ioregions_pio);
> >  }
> >  
> > +/* Serializes ioregionfd cmds/replies */
> 
> Please expand on this comment:
> 
>   ioregions that share the same rfd are serialized so that only one
> vCPU
>   thread sends a struct ioregionfd_cmd to userspace at a time. This
>   ensures that the struct ioregionfd_resp received from userspace
> will
>   be processed by the one and only vCPU thread that sent it.
> 
>   A waitqueue is used to wake up waiting vCPU threads in order. Most
> of
>   the time the waitqueue is unused and the lock is not contended.
>   For best performance userspace should set up ioregionfds so that
> there
>   is no contention (e.g. dedicated ioregionfds for queue doorbell
>   registers on multi-queue devices).
> 
> A comment along these lines will give readers an idea of why the code
> does this.

Ok, thank you


  reply	other threads:[~2021-02-03 14:11 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-28 18:32 [RFC v2 0/4] Introduce MMIO/PIO dispatch file descriptors (ioregionfd) Elena Afanasova
2021-01-28 18:32 ` [RFC v2 2/4] KVM: x86: add support for ioregionfd signal handling Elena Afanasova
2021-01-30 16:58   ` Stefan Hajnoczi
2021-02-03 14:00     ` Elena Afanasova
2021-02-09  6:21   ` Jason Wang
2021-02-09 14:49     ` Stefan Hajnoczi
2021-02-10 19:06     ` Elena Afanasova
2021-02-09  6:26   ` Jason Wang
2021-01-28 18:32 ` [RFC v2 3/4] KVM: add support for ioregionfd cmds/replies serialization Elena Afanasova
2021-01-30 18:54   ` Stefan Hajnoczi
2021-02-03 14:10     ` Elena Afanasova [this message]
2021-01-28 18:32 ` [RFC v2 4/4] KVM: enforce NR_IOBUS_DEVS limit if kmemcg is disabled Elena Afanasova
2021-01-29 18:48 ` [RESEND RFC v2 1/4] KVM: add initial support for KVM_SET_IOREGION Elena Afanasova
2021-01-30 15:04   ` Stefan Hajnoczi
2021-02-04 13:03   ` Cornelia Huck
2021-02-05 18:39     ` Elena Afanasova
2021-02-08 11:49       ` Cornelia Huck
2021-02-08  6:21   ` Jason Wang
2021-02-09 14:59     ` Stefan Hajnoczi
2021-02-18  6:17       ` Jason Wang
2021-02-10 19:31     ` Elena Afanasova
2021-02-11 14:59       ` Stefan Hajnoczi
2021-02-17 23:05         ` Elena Afanasova
2021-02-18  6:22         ` Jason Wang
2021-02-18  6:20       ` Jason Wang
2021-01-30 14:56 ` [RFC v2 0/4] Introduce MMIO/PIO dispatch file descriptors (ioregionfd) Stefan Hajnoczi
2021-02-02 14:59 ` Stefan Hajnoczi
2021-02-08  6:02 ` Jason Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dc35fdd3eb2febbe49cfd6561da6faf045f12ee3.camel@gmail.com \
    --to=eafanasova@gmail.com \
    --cc=elena.ufimtseva@oracle.com \
    --cc=jag.raman@oracle.com \
    --cc=kvm@vger.kernel.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.