Date: Fri, 24 Aug 2012 11:11:36 +0300
From: "Michael S. Tsirkin"
Message-ID: <20120824081136.GB7830@redhat.com>
References: <5037182A.7080902@web.de>
In-Reply-To: <5037182A.7080902@web.de>
Subject: Re: [Qemu-devel] MSI-X bug with ivshmem since msix_reset moved to PCI
To: Jan Kiszka
Cc: Cam Macdonell, "qemu-devel@nongnu.org Developers"

On Fri, Aug 24, 2012 at 07:59:06AM +0200, Jan Kiszka wrote:
> On 2012-08-24 01:13, Cam Macdonell wrote:
> > Hi Jan,
> >
> > I've bisected a bug in which MSI interrupts are not being delivered to
> > the following patch, where msix_reset was moved into the PCI core.
> >
> > commit cbd2d4342b3d42ab33baa99f5b7a23491b5692f2
> > Author: Jan Kiszka
> > Date:   Tue May 15 20:09:56 2012 -0300
> >
> >     msi: Invoke msi/msix_reset from PCI core
> >
> >     There is no point in pushing this burden to the devices, they tend to
> >     forget to call them (like intel-hda, ahci, xhci did). Instead, reset
> >     functions are now called from pci_device_reset. They do nothing if
> >     MSI/MSI-X is not in use.
> >
> > I've been debugging and it seems that when msix_notify() is triggered
> > the second test in the "if" fails
> >
> > /* Send an MSI-X message */
> > void msix_notify(PCIDevice *dev, unsigned vector)
> > {
> >     MSIMessage msg;
> >
> >     if (vector >= dev->msix_entries_nr || !dev->msix_entry_used[vector])
> >         return;
> >
> >     …
> > }
> >
> > Here are some MSI-X debugging statements:
> >
> > msix_init
> > IVSHMEM: msix initialized (1 vectors)
> > IVSHMEM: using vector 0
> > IVSHMEM: ivshmem_reset
> > IVSHMEM: using vector 0
> > msix_reset
> > msix_free_irq_entries 0x7fd52d1cea20
> >
> > msix_free_irq_entries() sets dev->msix_entries_nr to 0, so I think it
> > may be the cause.
>
> I suppose you mean it sets the msix_entry_used array to 0.
>
> >
> > Shouldn't ivshmem's reset (which reenables the vectors) be triggered
> > by the msix_reset?
>
> Actually, the whole msix vector usage tracking is useless today, this
> just shows its downsides (in the absence of benefits). Megasas is
> affected by this problem as well, virtio not as it calls msix_vector_use
> during the configuration process the guest driver triggers.
>
> Two options:
>  - I can send my removal patch for msix_vector_use/unuse that I was
>    only planning for 1.3 so far, and we kill this pitfall earlier.
>  - We re-add msix_vector_use calls to the affected device models for
>    1.2 and drop them later again for 1.3 when removing usage tracking.
> [The third option to keep the usage tracking is a non-option for me. ;)]
>
> Michael?

Second option seems more prudent to me.
Can you send a patch pls?

> >
> > Thanks,
> > Cam
> >
> > p.s. And apologies, I should've caught this bug closer to that patch
> > being merged.
>
> No problem. I should have seen this issue while changing the code.
>
> Jan
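
For reference, the failure mode discussed above can be reproduced with a small standalone model. This is only a sketch under stated assumptions, not QEMU code: ModelDev and every model_* function are hypothetical stand-ins, and the ordering (device reset hook first, core reset afterwards) is inferred from the debug output quoted above rather than taken from the source tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_MAX_VECTORS 4

/* Hypothetical stand-in for per-device MSI-X state; not QEMU's PCIDevice. */
typedef struct {
    unsigned entries_nr;                  /* number of vectors set up at init */
    bool entry_used[MODEL_MAX_VECTORS];   /* per-vector usage tracking */
    unsigned delivered;                   /* count of notifications that got through */
} ModelDev;

/* Set up a device with nr vectors, none of them marked as used. */
void model_init(ModelDev *d, unsigned nr)
{
    assert(nr <= MODEL_MAX_VECTORS);
    memset(d, 0, sizeof(*d));
    d->entries_nr = nr;
}

/* Models a device marking a vector as in use (assumed analogue of msix_vector_use). */
void model_vector_use(ModelDev *d, unsigned vector)
{
    assert(vector < d->entries_nr);
    d->entry_used[vector] = true;
}

/* Models the core-side reset clearing the usage tracking for every vector. */
void model_core_reset(ModelDev *d)
{
    memset(d->entry_used, 0, sizeof(d->entry_used));
}

/* Same guard as the quoted msix_notify(): out-of-range or unused vectors are dropped. */
bool model_notify(ModelDev *d, unsigned vector)
{
    if (vector >= d->entries_nr || !d->entry_used[vector]) {
        return false;   /* silently dropped, as in the reported bug */
    }
    d->delivered++;
    return true;
}
```

In this model, calling model_vector_use() and then model_core_reset() leaves the vector unused, so the next model_notify() is dropped. Marking the vector again after the core reset restores delivery, which corresponds to the second option (re-adding the use calls in the affected device models).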