Message-ID: <4FBEEEA4.2060504@web.de>
Date: Thu, 24 May 2012 23:29:56 -0300
From: Jan Kiszka
References: <4FBDE6D6.80700@ozlabs.ru> <4FBE2349.6040800@siemens.com> <4FBEDDF3.20108@ozlabs.ru>
In-Reply-To: <4FBEDDF3.20108@ozlabs.ru>
Subject: Re: [Qemu-devel] [RFC PATCH] PCI: Introduce INTx check & mask API
To: Alexey Kardashevskiy
Cc: Alex Williamson, David Gibson, qemu-devel@nongnu.org, kvm@vger.kernel.org, Alex Graf

On 2012-05-24 22:18, Alexey Kardashevskiy wrote:
> On 24/05/12 22:02, Jan Kiszka wrote:
>> On 2012-05-24 04:44, Alexey Kardashevskiy wrote:
>>> [Found while debugging VFIO on POWER but it is platform independent]
>>>
>>> There is a feature in PCI (>=2.3?) to mask/unmask INTx via PCI_COMMAND and
>>> PCI_STATUS registers.
>>
>> Yes, 2.3 introduced this. Masking is done via command register, checking
>> if the source was the PCI in question via the status register. The
>> latter is important for supporting IRQ sharing - and that's why we
>> introduced this masking API to the PCI layer.
>
>
> Is not it just a quite small optimization to not to disable interrupts on all devices which share
> the same IRQ but just on those who fired an interrupt? If so, do PCI devices really often share
> IRQs? Does not supporting this mean real slowdown on such devices?
>
> As far as I understand, everyone who cares about performance uses MSI/MSIX, no?

Not everyone is blessed with MSI-only PCI devices. From my notebook:

# cat /proc/interrupts
[...]
 22: [...] IO-APIC-fasteoi ehci_hcd:usb1, ehci_hcd:usb2

So, if I want to assign one EHCI controller to a guest, I have to
disable the other as well. The same can happen quickly if you attach a
few legacy PCI adapters to a system and want to pass them through.

>
>
>>> And there is some API to support that (commit a2e27787f893621c5a6b865acf6b7766f8671328).
>>>
>>> I have a network adapter:
>>> 0001:00:01.0 Ethernet controller: Chelsio Communications Inc T310 10GbE Single Port Adapter
>>> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
>>> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
>>>
>>> pci_intx_mask_supported() reports that the feature is supported for this adapter
>>> BUT the adapter does not set PCI_STATUS_INTERRUPT so pci_check_and_set_intx_mask()
>>> never changes PCI_COMMAND and INTx does not work on it when we use it as VFIO-PCI device.
>>>
>>> If I remove the check of this bit, it works fine as it is called from an interrupt handler and
>>> Status bit check is redundant.
>>>
>>> Opened a spec:
>>> PCI LOCAL BUS SPECIFICATION, REV. 3.0, Table 6-2: Status Register Bits
>>> ===
>>> 3 This read-only bit reflects the state of the interrupt in the
>>> device/function. Only when the Interrupt Disable bit in the command
>>> register is a 0 and this Interrupt Status bit is a 1, will the
>>> device’s/function’s INTx# signal be asserted. Setting the Interrupt
>>> Disable bit to a 1 has no effect on the state of this bit.
>>> ===
>>> With this adapter, INTx# is asserted but Status bit is still 0.
>>>
>>> Is it mandatory for a device to set Status bit if it supports INTx masking?
>>>
>>> 2 Alex: if it is mandatory, then we need to be able to disable pci_2_3 in VFIO-PCI
>>> somehow.
>>
>> Since PCI 2.3, this bit is mandatory, and it should be independent of
>> the masking bit. The question is, if your device is supposed to support
>> 2.3, thus is just buggy, or if our detection algorithm is unreliable. It
>> basically builds on the assumption that, if we can flip the mask bit,
>> the feature should be present. I guess that is the best we can do. Maybe
>> we can augment this with a blacklist of devices that "support" flipping
>> without actually providing the feature.
>
> It is a good moment to start :)
> Not sure where - in VFIO or along with that PCI INTx API.

At PCI level as the API is VFIO agnostic (it was introduced for
"classic" KVM device assignment, in fact).

>
> Here is that broken device:
> aik@vpl2:~$ lspci -s 1:1:0.0
> 0001:01:00.0 Ethernet controller: Chelsio Communications Inc T310 10GbE Single Port Adapter
> aik@vpl2:~$ lspci -ns 1:1:0.0
> 0001:01:00.0 0200: 1425:0030

A patch to add the infrastructure as well would be even more welcome. :)
You could have a look at drivers/pci/quirks.c for patterns how to do this.

Jan
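
For illustration only, the check-and-mask logic and the blacklist idea from this thread can be modelled in plain user-space C. This is a rough sketch under stated assumptions, not the kernel implementation: `struct model_dev`, `broken_intx_masking[]`, and the `model_*` functions are hypothetical names, and the "device" is just a pair of 16-bit variables standing in for the PCI_COMMAND and PCI_STATUS registers. Only the two bit values (as found in Linux's pci_regs.h) and the 1425:0030 ID from the lspci output are taken from real sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Register bits, same values as in Linux's pci_regs.h */
#define PCI_COMMAND_INTX_DISABLE 0x400 /* command register: INTx disable */
#define PCI_STATUS_INTERRUPT     0x08  /* status register: interrupt pending */

/* Hypothetical stand-in for a device's config space (not struct pci_dev). */
struct model_dev {
    uint16_t vendor;
    uint16_t device;
    uint16_t command;        /* models PCI_COMMAND */
    uint16_t status;         /* models PCI_STATUS */
    bool mask_bit_writable;  /* models "we can flip the mask bit" */
};

/* Hypothetical blacklist: the mask bit flips, but the status bit never
 * reports a pending INTx, so the check below can never succeed. */
struct broken_id {
    uint16_t vendor;
    uint16_t device;
};

const struct broken_id broken_intx_masking[] = {
    { 0x1425, 0x0030 }, /* IDs from the "lspci -ns" output in this thread */
};

bool model_intx_masking_broken(const struct model_dev *dev)
{
    size_t n = sizeof(broken_intx_masking) / sizeof(broken_intx_masking[0]);

    assert(dev != NULL);
    for (size_t i = 0; i < n; i++) {
        if (broken_intx_masking[i].vendor == dev->vendor &&
            broken_intx_masking[i].device == dev->device)
            return true;
    }
    return false;
}

/* Detection as described in the thread ("if we can flip the mask bit, the
 * feature should be present"), augmented with the blacklist. */
bool model_intx_mask_supported(const struct model_dev *dev)
{
    if (model_intx_masking_broken(dev))
        return false;
    return dev->mask_bit_writable;
}

/* Mask INTx only if this device reports that it raised the interrupt.
 * Returns true if the device was the source and is now masked. */
bool model_check_and_mask_intx(struct model_dev *dev)
{
    assert(dev != NULL);
    if (!(dev->status & PCI_STATUS_INTERRUPT))
        return false; /* not our interrupt: leave the shared line alone */
    dev->command |= PCI_COMMAND_INTX_DISABLE;
    return true;
}
```

With a device that sets its status bit, `model_check_and_mask_intx()` masks it and returns true. With the modelled 1425:0030 device the status bit stays 0, so the check never masks anything, which is the failure described above; the blacklist lookup lets `model_intx_mask_supported()` report "unsupported" for it up front so a caller can fall back to another strategy.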