From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.6 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_PASS,USER_AGENT_MUTT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5927EC10F14 for ; Mon, 15 Apr 2019 03:32:35 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 1D62F20850 for ; Mon, 15 Apr 2019 03:32:35 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=gibson.dropbear.id.au header.i=@gibson.dropbear.id.au header.b="EsYq4UUb" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726313AbfDODcb (ORCPT ); Sun, 14 Apr 2019 23:32:31 -0400 Received: from bilbo.ozlabs.org ([203.11.71.1]:38531 "EHLO ozlabs.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726042AbfDODca (ORCPT ); Sun, 14 Apr 2019 23:32:30 -0400 Received: by ozlabs.org (Postfix, from userid 1007) id 44jDZy6rwPz9s55; Mon, 15 Apr 2019 13:32:26 +1000 (AEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=gibson.dropbear.id.au; s=201602; t=1555299146; bh=iRwfS35S3mT+lPArqMldFLCW2MAujzsyzPH7hdyeaK4=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=EsYq4UUbV15cJ85mJMxbPalykeH6pTXG7tw5WyGejjVlfQwqeIJxbVZlC3Mk61v1V C+jyhNGZ64Lojoxs0vgB7vTfudpk4tOSw+mtQ8PtCpJfPYeTJKCmYOuoD1WpnqKU7e CZLAke3avURibV2JpqJfe6/3+tzmYai3xxuOehDs= Date: Mon, 15 Apr 2019 13:32:19 +1000 From: David Gibson To: =?iso-8859-1?Q?C=E9dric?= Le Goater Cc: kvm-ppc@vger.kernel.org, Paul Mackerras , kvm@vger.kernel.org Subject: Re: [RFC PATCH v4 17/17] KVM: PPC: Book3S HV: XIVE: introduce a 'release' device operation Message-ID: <20190415033219.GC32705@umbus.fritz.box> References: <20190320083751.27001-1-clg@kaod.org> <20190409141347.3029-1-clg@kaod.org> <20190409141347.3029-2-clg@kaod.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="z4+8/lEcDcG5Ke9S" Content-Disposition: inline In-Reply-To: <20190409141347.3029-2-clg@kaod.org> User-Agent: Mutt/1.11.3 (2019-02-01) Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org --z4+8/lEcDcG5Ke9S Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Apr 09, 2019 at 04:13:47PM +0200, C=E9dric Le Goater wrote: > When the VM boots, the CAS negotiation process determines which > interrupt mode to use and invokes a machine reset. At that time, any > links to the previous KVM interrupt device should be 'destroyed' > before the new chosen one is created. >=20 > To perform the necessary cleanups in KVM, we extend the KVM device > interface with a new 'release' operation which is called when the file > descriptor of the device is closed. >=20 > Such operations are defined for the XICS-on-XIVE and the XIVE native > KVM devices. They clear the vCPU interrupt presenters that could be > attached and then destroy the device. >=20 > Signed-off-by: C=E9dric Le Goater > --- > include/linux/kvm_host.h | 1 + > arch/powerpc/kvm/book3s_xive.c | 50 +++++++++++++++++++++++++-- > arch/powerpc/kvm/book3s_xive_native.c | 23 ++++++++++++ > virt/kvm/kvm_main.c | 13 +++++++ > 4 files changed, 85 insertions(+), 2 deletions(-) >=20 > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h > index 831d963451d8..3b444620d8fc 100644 > --- a/include/linux/kvm_host.h > +++ b/include/linux/kvm_host.h > @@ -1246,6 +1246,7 @@ struct kvm_device_ops { > long (*ioctl)(struct kvm_device *dev, unsigned int ioctl, > unsigned long arg); > int (*mmap)(struct kvm_device *dev, struct vm_area_struct *vma); > + void (*release)(struct kvm_device *dev); > }; > =20 > void kvm_device_get(struct kvm_device *dev); > diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xiv= e.c > index 4d4e1730de84..ba777db849d7 100644 > --- a/arch/powerpc/kvm/book3s_xive.c > +++ b/arch/powerpc/kvm/book3s_xive.c > @@ -1100,11 +1100,19 @@ void kvmppc_xive_disable_vcpu_interrupts(struct k= vm_vcpu *vcpu) > void kvmppc_xive_cleanup_vcpu(struct kvm_vcpu *vcpu) > { > struct kvmppc_xive_vcpu *xc =3D vcpu->arch.xive_vcpu; > - struct kvmppc_xive *xive =3D xc->xive; > + struct kvmppc_xive *xive; > int i; > =20 > + if (!kvmppc_xics_enabled(vcpu)) > + return; > + > + if (!xc) > + return; > + > pr_devel("cleanup_vcpu(cpu=3D%d)\n", xc->server_num); > =20 > + xive =3D xc->xive; > + > /* Ensure no interrupt is still routed to that VP */ > xc->valid =3D false; > kvmppc_xive_disable_vcpu_interrupts(vcpu); > @@ -1141,6 +1149,10 @@ void kvmppc_xive_cleanup_vcpu(struct kvm_vcpu *vcp= u) > } > /* Free the VP */ > kfree(xc); > + > + /* Cleanup the vcpu */ > + vcpu->arch.irq_type =3D KVMPPC_IRQ_DEFAULT; > + vcpu->arch.xive_vcpu =3D NULL; > } > =20 > int kvmppc_xive_connect_vcpu(struct kvm_device *dev, > @@ -1158,7 +1170,7 @@ int kvmppc_xive_connect_vcpu(struct kvm_device *dev, > } > if (xive->kvm !=3D vcpu->kvm) > return -EPERM; > - if (vcpu->arch.irq_type) > + if (vcpu->arch.irq_type !=3D KVMPPC_IRQ_DEFAULT) > return -EBUSY; > if (kvmppc_xive_find_server(vcpu->kvm, cpu)) { > pr_devel("Duplicate !\n"); > @@ -1855,6 +1867,39 @@ static void kvmppc_xive_free(struct kvm_device *de= v) > kfree(dev); > } > =20 > +static void kvmppc_xive_release(struct kvm_device *dev) > +{ > + struct kvmppc_xive *xive =3D dev->private; > + struct kvm *kvm =3D xive->kvm; > + struct kvm_vcpu *vcpu; > + int i; > + > + pr_devel("Releasing xive device\n"); > + > + /* > + * When releasing the KVM device fd, the vCPUs can still be > + * running and we should clean up the vCPU interrupt > + * presenters first. > + */ > + if (atomic_read(&kvm->online_vcpus) !=3D 0) { What prevents online_vcpus from becoming non-zero after this test, but before the kvmppc_xive_free()? Is the test actually necessary? The operations below should be safe even if there are no online cpus, yes? > + /* > + * call kick_all_cpus_sync() to ensure that all CPUs > + * have executed any pending interrupts > + */ > + if (is_kvmppc_hv_enabled(kvm)) > + kick_all_cpus_sync(); > + > + /* > + * TODO: There is still a race window with the early > + * checks in kvmppc_native_connect_vcpu() > + */ That's... not reassuring. What are the consequences of that race, and what do you plan to do about it? > + kvm_for_each_vcpu(i, vcpu, kvm) > + kvmppc_xive_cleanup_vcpu(vcpu); > + } > + > + kvmppc_xive_free(dev); > +} > + > struct kvmppc_xive *kvmppc_xive_get_device(struct kvm *kvm, u32 type) > { > struct kvmppc_xive *xive; > @@ -2043,6 +2088,7 @@ struct kvm_device_ops kvm_xive_ops =3D { > .name =3D "kvm-xive", > .create =3D kvmppc_xive_create, > .init =3D kvmppc_xive_init, > + .release =3D kvmppc_xive_release, > .destroy =3D kvmppc_xive_free, > .set_attr =3D xive_set_attr, > .get_attr =3D xive_get_attr, > diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/boo= k3s_xive_native.c > index 092db0efe628..629da7bf2a89 100644 > --- a/arch/powerpc/kvm/book3s_xive_native.c > +++ b/arch/powerpc/kvm/book3s_xive_native.c > @@ -996,6 +996,28 @@ static void kvmppc_xive_native_free(struct kvm_devic= e *dev) > kfree(dev); > } > =20 > +static void kvmppc_xive_native_release(struct kvm_device *dev) > +{ > + struct kvmppc_xive *xive =3D dev->private; > + struct kvm *kvm =3D xive->kvm; > + struct kvm_vcpu *vcpu; > + int i; > + > + pr_devel("Releasing xive native device\n"); > + > + /* > + * When releasing the KVM device fd, the vCPUs can still be > + * running and we should clean up the vCPU interrupt > + * presenters first. > + */ > + if (atomic_read(&kvm->online_vcpus) !=3D 0) { Likewise here. > + kvm_for_each_vcpu(i, vcpu, kvm) > + kvmppc_xive_native_cleanup_vcpu(vcpu); > + } > + > + kvmppc_xive_native_free(dev); > +} > + > static int kvmppc_xive_native_create(struct kvm_device *dev, u32 type) > { > struct kvmppc_xive *xive; > @@ -1187,6 +1209,7 @@ struct kvm_device_ops kvm_xive_native_ops =3D { > .name =3D "kvm-xive-native", > .create =3D kvmppc_xive_native_create, > .init =3D kvmppc_xive_native_init, > + .release =3D kvmppc_xive_native_release, > .destroy =3D kvmppc_xive_native_free, > .set_attr =3D kvmppc_xive_native_set_attr, > .get_attr =3D kvmppc_xive_native_get_attr, > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > index ea2018ae1cd7..ea2619d5ca98 100644 > --- a/virt/kvm/kvm_main.c > +++ b/virt/kvm/kvm_main.c > @@ -2938,6 +2938,19 @@ static int kvm_device_release(struct inode *inode,= struct file *filp) > struct kvm_device *dev =3D filp->private_data; > struct kvm *kvm =3D dev->kvm; > =20 > + if (!dev) > + return -ENODEV; > + > + if (dev->kvm !=3D kvm) > + return -EPERM; > + > + if (dev->ops->release) { > + mutex_lock(&kvm->lock); > + list_del(&dev->vm_node); > + dev->ops->release(dev); > + mutex_unlock(&kvm->lock); > + } > + Wasn't there a big comment that explained that release replaced destroy somewhere? > kvm_put_kvm(kvm); > return 0; > } --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson --z4+8/lEcDcG5Ke9S Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAlyz+0MACgkQbDjKyiDZ s5KIIA/+JNeQnyTu/d51WDqtoyP2n25ftrbz8ImZlkTVWL/xMGufbdobOXDR/Pow WMuH6gi10mqXXT7DkqEnxHXDwqoVybdREGaRq4pCxPo1TsFv9zYo0oGz7152kgwM mXV4/uZMFOwqs7pA1rM6mAlgx07mbKkVgy39oAdhbo0hu1swZeXou2y1e10j8MA+ vXPqx6hqbaRmsJXm8pueFZGmhsCzpIwrZAWj9uKYSzLDwy42dle1qV4LFYBozQd0 NSrYjlVq3eDvoP45G/JBnapw9rQMXVKFEpWG+XAAkogXLqXInH3XrCWYn3vA5Vd8 rOCZ605XaVQ43mcb2R7Lh1MYf5qbGYrgOBmgvLX1KvVTQYwvn6fxjGhjv7eimVmg bH8WHZV9/LYRlElcVQocpsf2ga2L5u6N3X3SzwNhYEhwpiK+L5CpFdH7iI+P8yvo XOSl52kATLdGKwniNEm/3fp2xM+aIBKeg82PbcJ02VKVFh+D8EcAA+qQggDKdeTr TLVI+Exr4P20NsGShHOEfR54XPgDe4fCJnUJ2JCsIls6EUolsFX/yuRExnK4xmvf 9ZQfJw7IMxQ9TeeekO5g0vCe8ZSAlNAP7WaWBcUGzEbRFB4Zyl5KNIjHbIFXgyie wf92FNIc618h3vRS1RWX4zbdyP/j0qdiorSia4oK5si5pP9idow= =36VA -----END PGP SIGNATURE----- --z4+8/lEcDcG5Ke9S--