From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:37226) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1T4pMp-0005tU-1X for qemu-devel@nongnu.org; Fri, 24 Aug 2012 04:35:53 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1T4pMk-0005lU-80 for qemu-devel@nongnu.org; Fri, 24 Aug 2012 04:35:46 -0400 Received: from mout.web.de ([212.227.15.3]:58180) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1T4pMj-0005lK-QF for qemu-devel@nongnu.org; Fri, 24 Aug 2012 04:35:42 -0400 Message-ID: <50373CD6.3070002@web.de> Date: Fri, 24 Aug 2012 10:35:34 +0200 From: Jan Kiszka MIME-Version: 1.0 References: <50371D06.3010706@web.de> <20120824082048.GC7830@redhat.com> <50373998.3030708@web.de> In-Reply-To: <50373998.3030708@web.de> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enigA2E0A5982052F199D8E08CC2" Subject: Re: [Qemu-devel] [PATCH v2][for 1.2?] msix: Drop tracking of used vectors List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Michael S. Tsirkin" Cc: Cam Macdonell , qemu-devel , Hannes Reinecke This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enigA2E0A5982052F199D8E08CC2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable On 2012-08-24 10:21, Jan Kiszka wrote: > On 2012-08-24 10:20, Michael S. Tsirkin wrote: >> On Fri, Aug 24, 2012 at 08:19:50AM +0200, Jan Kiszka wrote: >>> From: Jan Kiszka >>> >>> This optimization was once used in qemu-kvm to keep KVM route usage l= ow. >>> But now we solved that problem via lazy updates. It also tried to han= dle >>> the case of vectors shared between different sources of the same devi= ce. >>> However, this never really worked and will have to be addressed >>> differently anyway. So drop this obsolete interface. >>> >>> We still need interfaces to clear pending vectors though. Provide >>> msix_clear_vector and msix_clear_all_vectors for this. >>> >>> This also unbreaks MSI-X after reset for ivshmem and megasas as devic= e >>> models can't easily mark their vectors used afterward (megasas didn't= >>> even try). >>> >>> Signed-off-by: Jan Kiszka >>> --- >>> >>> This patch has been posted some moons again, and we had a discussion >>> about the impact on the existing users. At that time, MSI-X refactori= ng >>> for KVM support was not yet merged. Now it is, and it should be bette= r >>> much clearer that vector usage tracking has no business with that >>> feature. >>> >>> hw/ivshmem.c | 20 ------------------- >>> hw/megasas.c | 4 --- >>> hw/msix.c | 57 ++++++++++++---------------------------------= --------- >>> hw/msix.h | 5 +-- >>> hw/pci.h | 2 - >>> hw/virtio-pci.c | 20 +++++++----------- >>> 6 files changed, 23 insertions(+), 85 deletions(-) >>> >>> diff --git a/hw/ivshmem.c b/hw/ivshmem.c >>> index 47f2a16..5ffff48 100644 >>> --- a/hw/ivshmem.c >>> +++ b/hw/ivshmem.c >>> @@ -523,28 +523,11 @@ static void ivshmem_read(void *opaque, const ui= nt8_t * buf, int flags) >>> return; >>> } >>> =20 >>> -/* Select the MSI-X vectors used by device. >>> - * ivshmem maps events to vectors statically, so >>> - * we just enable all vectors on init and after reset. */ >>> -static void ivshmem_use_msix(IVShmemState * s) >>> -{ >>> - int i; >>> - >>> - if (!msix_present(&s->dev)) { >>> - return; >>> - } >>> - >>> - for (i =3D 0; i < s->vectors; i++) { >>> - msix_vector_use(&s->dev, i); >>> - } >>> -} >>> - >>> static void ivshmem_reset(DeviceState *d) >>> { >>> IVShmemState *s =3D DO_UPCAST(IVShmemState, dev.qdev, d); >>> =20 >>> s->intrstatus =3D 0; >>> - ivshmem_use_msix(s); >>> return; >>> } >>> =20 >>> @@ -586,8 +569,6 @@ static void ivshmem_setup_msi(IVShmemState * s) >>> =20 >>> /* allocate QEMU char devices for receiving interrupts */ >>> s->eventfd_table =3D g_malloc0(s->vectors * sizeof(EventfdEntry)= ); >>> - >>> - ivshmem_use_msix(s); >>> } >>> =20 >>> static void ivshmem_save(QEMUFile* f, void *opaque) >>> @@ -629,7 +610,6 @@ static int ivshmem_load(QEMUFile* f, void *opaque= , int version_id) >>> =20 >>> if (ivshmem_has_feature(proxy, IVSHMEM_MSI)) { >>> msix_load(&proxy->dev, f); >>> - ivshmem_use_msix(proxy); >>> } else { >>> proxy->intrstatus =3D qemu_get_be32(f); >>> proxy->intrmask =3D qemu_get_be32(f); >>> diff --git a/hw/megasas.c b/hw/megasas.c >>> index c35a15d..4766117 100644 >>> --- a/hw/megasas.c >>> +++ b/hw/megasas.c >>> @@ -2121,10 +2121,6 @@ static int megasas_scsi_init(PCIDevice *dev) >>> pci_register_bar(&s->dev, 2, PCI_BASE_ADDRESS_SPACE_IO, &s->port= _io); >>> pci_register_bar(&s->dev, 3, bar_type, &s->queue_io); >>> =20 >>> - if (megasas_use_msix(s)) { >>> - msix_vector_use(&s->dev, 0); >>> - } >>> - >>> if (!s->sas_addr) { >>> s->sas_addr =3D ((NAA_LOCALLY_ASSIGNED_ID << 24) | >>> IEEE_COMPANY_LOCALLY_ASSIGNED) << 36; >>> diff --git a/hw/msix.c b/hw/msix.c >>> index aea340b..163ce4c 100644 >>> --- a/hw/msix.c >>> +++ b/hw/msix.c >>> @@ -273,7 +273,6 @@ int msix_init(struct PCIDevice *dev, unsigned sho= rt nentries, >>> =20 >>> dev->msix_table =3D g_malloc0(table_size); >>> dev->msix_pba =3D g_malloc0(pba_size); >>> - dev->msix_entry_used =3D g_malloc0(nentries * sizeof *dev->msix_= entry_used); >>> =20 >>> msix_mask_all(dev, nentries); >>> =20 >>> @@ -326,16 +325,6 @@ int msix_init_exclusive_bar(PCIDevice *dev, unsi= gned short nentries, >>> return 0; >>> } >>> =20 >>> -static void msix_free_irq_entries(PCIDevice *dev) >>> -{ >>> - int vector; >>> - >>> - for (vector =3D 0; vector < dev->msix_entries_nr; ++vector) { >>> - dev->msix_entry_used[vector] =3D 0; >>> - msix_clr_pending(dev, vector); >>> - } >>> -} >>> - >>> /* Clean up resources for the device. */ >>> void msix_uninit(PCIDevice *dev, MemoryRegion *table_bar, MemoryRegi= on *pba_bar) >>> { >>> @@ -344,7 +333,6 @@ void msix_uninit(PCIDevice *dev, MemoryRegion *ta= ble_bar, MemoryRegion *pba_bar) >>> } >>> pci_del_capability(dev, PCI_CAP_ID_MSIX, MSIX_CAP_LENGTH); >>> dev->msix_cap =3D 0; >>> - msix_free_irq_entries(dev); >>> dev->msix_entries_nr =3D 0; >>> memory_region_del_subregion(pba_bar, &dev->msix_pba_mmio); >>> memory_region_destroy(&dev->msix_pba_mmio); >>> @@ -354,8 +342,6 @@ void msix_uninit(PCIDevice *dev, MemoryRegion *ta= ble_bar, MemoryRegion *pba_bar) >>> memory_region_destroy(&dev->msix_table_mmio); >>> g_free(dev->msix_table); >>> dev->msix_table =3D NULL; >>> - g_free(dev->msix_entry_used); >>> - dev->msix_entry_used =3D NULL; >>> dev->cap_present &=3D ~QEMU_PCI_CAP_MSIX; >>> return; >>> } >>> @@ -390,7 +376,6 @@ void msix_load(PCIDevice *dev, QEMUFile *f) >>> return; >>> } >>> =20 >>> - msix_free_irq_entries(dev); >>> qemu_get_buffer(f, dev->msix_table, n * PCI_MSIX_ENTRY_SIZE); >>> qemu_get_buffer(f, dev->msix_pba, (n + 7) / 8); >>> msix_update_function_masked(dev); >>> @@ -419,8 +404,9 @@ void msix_notify(PCIDevice *dev, unsigned vector)= >>> { >>> MSIMessage msg; >>> =20 >>> - if (vector >=3D dev->msix_entries_nr || !dev->msix_entry_used[ve= ctor]) >>> + if (vector >=3D dev->msix_entries_nr) { >>> return; >>> + } >>> if (msix_is_masked(dev, vector)) { >>> msix_set_pending(dev, vector); >>> return; >>> @@ -436,7 +422,7 @@ void msix_reset(PCIDevice *dev) >>> if (!msix_present(dev)) { >>> return; >>> } >>> - msix_free_irq_entries(dev); >>> + msix_clear_all_vectors(dev); >>> dev->config[dev->msix_cap + MSIX_CONTROL_OFFSET] &=3D >>> ~dev->wmask[dev->msix_cap + MSIX_CONTROL_OFFSET]; >>> memset(dev->msix_table, 0, dev->msix_entries_nr * PCI_MSIX_ENTRY= _SIZE); >>> @@ -444,41 +430,24 @@ void msix_reset(PCIDevice *dev) >>> msix_mask_all(dev, dev->msix_entries_nr); >>> } >>> =20 >>> -/* PCI spec suggests that devices make it possible for software to c= onfigure >>> - * less vectors than supported by the device, but does not specify a= standard >>> - * mechanism for devices to do so. >>> - * >>> - * We support this by asking devices to declare vectors software is = going to >>> - * actually use, and checking this on the notification path. Devices= that >>> - * don't want to follow the spec suggestion can declare all vectors = as used. */ >>> - >>> -/* Mark vector as used. */ >>> -int msix_vector_use(PCIDevice *dev, unsigned vector) >>> +/* Clear pending vector. */ >>> +void msix_clear_vector(PCIDevice *dev, unsigned vector) >>> { >>> - if (vector >=3D dev->msix_entries_nr) >>> - return -EINVAL; >>> - dev->msix_entry_used[vector]++; >>> - return 0; >>> -} >>> - >> >> I keep thinking we should instead call vector notifier >> here, so that virtio can detect and handle cases where >> kvm_virtio_pci_vq_vector_use fails. >> ATM it just asserts. >> >> Maybe I am wrong. >> >> Let's delay this decision until 1.3. >=20 > Yep, that feature can be added on top of this patch in 1.3, but we > should still remove the usage tracking to fix its issues. There is no > feature regression related to this. OK, to address that virtio requirement (in 1.3) we can simply preallocate an MSI route on VIRTIO_MSI_CONFIG/QUEUE_VECTOR and update that route (I have a patch for kvm_irqchip_update_msi_route here already) on vector_use. The point is: That all is pure virtio business, nothing the generic MSI-X layer nor any normal MSI-X device or even pci-assign/vfio have to deal with. Jan --------------enigA2E0A5982052F199D8E08CC2 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAlA3PNYACgkQitSsb3rl5xRXYQCgq8k7y44UnB8vX70LAhrBK994 1iQAoL+N1jw2YVRUFZobVI9HnuacyQYp =TcUv -----END PGP SIGNATURE----- --------------enigA2E0A5982052F199D8E08CC2--