From: Christoffer Dall <christoffer.dall@linaro.org>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org, "Andre Przywara" <andre.przywara@arm.com>,
"Radim Krčmář" <rkrcmar@redhat.com>,
"Alexander Graf" <agraf@suse.de>
Subject: Re: [PATCH v2] KVM: Synchronize accesses to the kvm->devices list
Date: Wed, 3 Aug 2016 13:51:49 +0200 [thread overview]
Message-ID: <20160803115149.GX32244@cbox> (raw)
In-Reply-To: <2efb2d21-102b-a210-b851-96eb9936b7d7@redhat.com>
On Wed, Aug 03, 2016 at 01:24:41PM +0200, Paolo Bonzini wrote:
>
>
> On 03/08/2016 12:39, Christoffer Dall wrote:
> > The list of KVM devices is not currently protected with any lock. As
> > far as I can see nothing prevents multiple threads from creating devices
> > on the KVM fd simultaneously, potentially leading to corruption.
> > Further, we already have VFIO looping over this list without any form of
> > synchronization, and we are about to add similar functionality for the
> > ITS on arm64.
> >
> > Note that the traversal in kvm_destroy_devices does not appear to be a
> > problem because only the last reference to the KVM struct can actually
> > perform this action.
> >
> > I use a mutex to protect the list, since the arm64 ITS code is going to
> > need to iterate over the devices (to find all ITSes) and register KVM IO
> > devices for each one, which in turn grabs another mutex and kmallocs
> > stuff.
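(For context, the iteration that rules out an RCU read-side critical section
would look roughly like the sketch below. This is my hypothetical sketch of
the upcoming ITS code, not part of this patch; `kvm_arm_vgic_its_ops` and
`its_register_iodev` are made-up names for illustration.)

```c
	/* Hypothetical sketch: walking kvm->devices while doing work
	 * that can sleep, which is why a mutex (not RCU, not a
	 * spinlock) has to protect the list. */
	struct kvm_device *dev;
	int ret = 0;

	mutex_lock(&kvm->devices_lock);
	list_for_each_entry(dev, &kvm->devices, vm_node) {
		if (dev->ops == &kvm_arm_vgic_its_ops) {
			/* Registering a KVM IO device takes another
			 * mutex and allocates with GFP_KERNEL, i.e.
			 * it may sleep. */
			ret = its_register_iodev(kvm, dev);
			if (ret)
				break;
		}
	}
	mutex_unlock(&kvm->devices_lock);
```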
> >
> > Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> > ---
> > Changes since v1:
> > - My RCU knowledge was too rusty and was both incorrectly implemented
> > and would not address the arm64 need. Just use a mutex instead.
> >
> > include/linux/kvm_host.h | 1 +
> > virt/kvm/kvm_main.c | 8 ++++++++
> > virt/kvm/vfio.c | 11 +++++++++--
> > 3 files changed, 18 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index aafd702..de32196 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -421,6 +421,7 @@ struct kvm {
> > long mmu_notifier_count;
> > #endif
> > long tlbs_dirty;
> > + struct mutex devices_lock; /* protects the devices list */
> > struct list_head devices;
> > struct dentry *debugfs_dentry;
> > struct kvm_stat_data **debugfs_stat_data;
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 61b31a5..33e0579 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -616,6 +616,7 @@ static struct kvm *kvm_create_vm(unsigned long type)
> > mutex_init(&kvm->irq_lock);
> > mutex_init(&kvm->slots_lock);
> > atomic_set(&kvm->users_count, 1);
> > + mutex_init(&kvm->devices_lock);
> > INIT_LIST_HEAD(&kvm->devices);
> >
> > r = kvm_arch_init_vm(kvm, type);
> > @@ -694,6 +695,11 @@ static void kvm_destroy_devices(struct kvm *kvm)
> > {
> > struct kvm_device *dev, *tmp;
> >
> > + /*
> > + * We do not need to take the devices_lock here, because nobody else
> > + * has a reference to the struct kvm at this point and therefore
> > + * cannot access the devices list anyhow.
> > + */
> > list_for_each_entry_safe(dev, tmp, &kvm->devices, vm_node) {
> > list_del(&dev->vm_node);
> > dev->ops->destroy(dev);
> > @@ -2842,7 +2848,9 @@ static int kvm_ioctl_create_device(struct kvm *kvm,
> > return ret;
> > }
> >
> > + mutex_lock(&kvm->devices_lock);
> > list_add(&dev->vm_node, &kvm->devices);
> > + mutex_unlock(&kvm->devices_lock);
>
> This is not enough for the VFIO test to be robust. I think it's easier
> to move ops->create entirely under the kvm_lock. This requires some
> auditing of the create functions, but many of them (e.g.
> kvm_vgic_create) already use that lock.
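If I read the suggestion right, the tail of kvm_ioctl_create_device would end
up roughly like this. (Untested sketch: the exact error handling and the shape
of the ops->init hook are my guesses at what Paolo describes, not code from
this thread.)

```c
	mutex_lock(&kvm->lock);
	ret = ops->create(dev, type);
	if (ret < 0) {
		mutex_unlock(&kvm->lock);
		kfree(dev);
		return ret;
	}
	list_add(&dev->vm_node, &kvm->devices);
	mutex_unlock(&kvm->lock);

	/* Called outside kvm->lock and not allowed to fail, so work
	 * like xics_debugfs_init could move here. */
	if (ops->init)
		ops->init(dev);
```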
Duh, I fell into the classic trap of just addressing our own needs
instead of considering what this actually does. Apologies.
>
> The only complex case is xics. It takes kvm_lock only for part of the
> initialization, and the call to xics_debugfs_init would be better moved
> outside kvm_lock. For that we can add an ops->init(dev) function that
> is called outside kvm_lock and cannot fail.
>
> For the others:
>
> - vgic creation is already entirely under kvm_lock, only kvm_vgic_create
> needs adjustment
>
> - mpic uses kvm_set_irq_routing, which is okay because irq_lock nests
> inside kvm_lock
>
> - flic_create is trivial, and would have its multiple-creation bug fixed
>
> - likewise for kvm_vfio_create
>
> Would you like to do this?
>
I'll have a look and see what I can put together.
Our ITS code (just merged) currently exposes a somewhat broken ABI to
userspace, so would it be acceptable to implement the ITS stuff with the
same broken approach (i.e. without any locks/synchronization) that the
VFIO device currently takes, merge that more or less immediately, and
fix it together with the rest afterwards?
(By afterwards I mean that I will prioritize this.)
-Christoffer
Thread overview: 6+ messages
2016-08-03 10:15 [PATCH] KVM: Synchronize accesses to the kvm->devices list Christoffer Dall
2016-08-03 10:39 ` [PATCH v2] " Christoffer Dall
2016-08-03 11:24 ` Paolo Bonzini
2016-08-03 11:51 ` Christoffer Dall [this message]
2016-08-08 19:40 ` Christoffer Dall
2016-08-08 22:18 ` Paolo Bonzini