* [Qemu-devel] MSI-X bug with ivshmem since msix_reset moved to PCI
@ 2012-08-23 23:13 Cam Macdonell
2012-08-24 5:59 ` Jan Kiszka
0 siblings, 1 reply; 7+ messages in thread
From: Cam Macdonell @ 2012-08-23 23:13 UTC (permalink / raw)
To: Jan Kiszka; +Cc: qemu-devel@nongnu.org Developers, Michael S. Tsirkin
Hi Jan,
I've bisected a bug in which MSI interrupts are not being delivered to
the following patch, where msix_reset was moved in tot he PCI core.
commit cbd2d4342b3d42ab33baa99f5b7a23491b5692f2
Author: Jan Kiszka <jan.kiszka@siemens.com>
Date: Tue May 15 20:09:56 2012 -0300
msi: Invoke msi/msix_reset from PCI core
There is no point in pushing this burden to the devices, they tend to
forget to call them (like intel-hda, ahci, xhci did). Instead, reset
functions are now called from pci_device_reset. They do nothing if
MSI/MSI-X is not in use.
I've been debugging and it seems that when msix_notify() is triggered
the second test in the "if" fails
/* Send an MSI-X message */
void msix_notify(PCIDevice *dev, unsigned vector)
{
MSIMessage msg;
if (vector >= dev->msix_entries_nr || !dev->msix_entry_used[vector])
return;
…
}
here is some MSI-X debugging statements
msix_init
IVSHMEM: msix initialized (1 vectors)
IVSHMEM: using vector 0
IVSHMEM: ivshmem_reset
IVSHMEM: using vector 0
msix_reset
msix_free_irq_entries 0x7fd52d1cea20
msix_free_irq_entries() sets dev->msix_entries_nr to 0, so I think it
may be the cause.
Shouldn't ivshmem's reset (which reenables the vectors) be triggered
by the msix_reset?
Thanks,
Cam
p.s. And apologies, I should've caught this bug closer to that patch
being merged.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] MSI-X bug with ivshmem since msix_reset moved to PCI
2012-08-23 23:13 [Qemu-devel] MSI-X bug with ivshmem since msix_reset moved to PCI Cam Macdonell
@ 2012-08-24 5:59 ` Jan Kiszka
2012-08-24 8:11 ` Michael S. Tsirkin
0 siblings, 1 reply; 7+ messages in thread
From: Jan Kiszka @ 2012-08-24 5:59 UTC (permalink / raw)
To: Cam Macdonell; +Cc: qemu-devel@nongnu.org Developers, Michael S. Tsirkin
[-- Attachment #1: Type: text/plain, Size: 2361 bytes --]
On 2012-08-24 01:13, Cam Macdonell wrote:
> Hi Jan,
>
> I've bisected a bug in which MSI interrupts are not being delivered to
> the following patch, where msix_reset was moved in tot he PCI core.
>
> commit cbd2d4342b3d42ab33baa99f5b7a23491b5692f2
> Author: Jan Kiszka <jan.kiszka@siemens.com>
> Date: Tue May 15 20:09:56 2012 -0300
>
> msi: Invoke msi/msix_reset from PCI core
>
> There is no point in pushing this burden to the devices, they tend to
> forget to call them (like intel-hda, ahci, xhci did). Instead, reset
> functions are now called from pci_device_reset. They do nothing if
> MSI/MSI-X is not in use.
>
> I've been debugging and it seems that when msix_notify() is triggered
> the second test in the "if" fails
>
> /* Send an MSI-X message */
> void msix_notify(PCIDevice *dev, unsigned vector)
> {
> MSIMessage msg;
>
> if (vector >= dev->msix_entries_nr || !dev->msix_entry_used[vector])
> return;
>
> …
> }
>
> here is some MSI-X debugging statements
>
> msix_init
> IVSHMEM: msix initialized (1 vectors)
> IVSHMEM: using vector 0
> IVSHMEM: ivshmem_reset
> IVSHMEM: using vector 0
> msix_reset
> msix_free_irq_entries 0x7fd52d1cea20
>
> msix_free_irq_entries() sets dev->msix_entries_nr to 0, so I think it
> may be the cause.
I suppose you mean it sets the msix_entry_used array to 0.
>
> Shouldn't ivshmem's reset (which reenables the vectors) be triggered
> by the msix_reset?
Actually, the whole msix vector usage tracking is useless today, this
just shows its downsides (in the absence of benefits). Megasas is
affected by this problem as well, virtio not as it calls msix_vector_use
during the configuration process the guest driver triggers.
Two options:
- I can send my removal patch for msix_vector_use/unuse that I was
only planning for 1.3 so far, and we kill this pitfall earlier.
- We re-add msix_vector_use calls to the affected device models for
1.2 and drop them later again for 1.3 when removing usage tracking.
[The third option to keep the usage tracking is a non-option for me. ;)]
Michael?
>
> Thanks,
> Cam
>
> p.s. And apologies, I should've caught this bug closer to that patch
> being merged.
No problem. I should have seen this issue while changing the code.
Jan
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] MSI-X bug with ivshmem since msix_reset moved to PCI
2012-08-24 5:59 ` Jan Kiszka
@ 2012-08-24 8:11 ` Michael S. Tsirkin
2012-08-24 8:15 ` Jan Kiszka
0 siblings, 1 reply; 7+ messages in thread
From: Michael S. Tsirkin @ 2012-08-24 8:11 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Cam Macdonell, qemu-devel@nongnu.org Developers
On Fri, Aug 24, 2012 at 07:59:06AM +0200, Jan Kiszka wrote:
> On 2012-08-24 01:13, Cam Macdonell wrote:
> > Hi Jan,
> >
> > I've bisected a bug in which MSI interrupts are not being delivered to
> > the following patch, where msix_reset was moved in tot he PCI core.
> >
> > commit cbd2d4342b3d42ab33baa99f5b7a23491b5692f2
> > Author: Jan Kiszka <jan.kiszka@siemens.com>
> > Date: Tue May 15 20:09:56 2012 -0300
> >
> > msi: Invoke msi/msix_reset from PCI core
> >
> > There is no point in pushing this burden to the devices, they tend to
> > forget to call them (like intel-hda, ahci, xhci did). Instead, reset
> > functions are now called from pci_device_reset. They do nothing if
> > MSI/MSI-X is not in use.
> >
> > I've been debugging and it seems that when msix_notify() is triggered
> > the second test in the "if" fails
> >
> > /* Send an MSI-X message */
> > void msix_notify(PCIDevice *dev, unsigned vector)
> > {
> > MSIMessage msg;
> >
> > if (vector >= dev->msix_entries_nr || !dev->msix_entry_used[vector])
> > return;
> >
> > …
> > }
> >
> > here is some MSI-X debugging statements
> >
> > msix_init
> > IVSHMEM: msix initialized (1 vectors)
> > IVSHMEM: using vector 0
> > IVSHMEM: ivshmem_reset
> > IVSHMEM: using vector 0
> > msix_reset
> > msix_free_irq_entries 0x7fd52d1cea20
> >
> > msix_free_irq_entries() sets dev->msix_entries_nr to 0, so I think it
> > may be the cause.
>
> I suppose you mean it sets the msix_entry_used array to 0.
>
> >
> > Shouldn't ivshmem's reset (which reenables the vectors) be triggered
> > by the msix_reset?
>
> Actually, the whole msix vector usage tracking is useless today, this
> just shows its downsides (in the absence of benefits). Megasas is
> affected by this problem as well, virtio not as it calls msix_vector_use
> during the configuration process the guest driver triggers.
>
> Two options:
> - I can send my removal patch for msix_vector_use/unuse that I was
> only planning for 1.3 so far, and we kill this pitfall earlier.
> - We re-add msix_vector_use calls to the affected device models for
> 1.2 and drop them later again for 1.3 when removing usage tracking.
> [The third option to keep the usage tracking is a non-option for me. ;)]
>
> Michael?
Second option seems more prudent to me. Can you send a patch pls?
> >
> > Thanks,
> > Cam
> >
> > p.s. And apologies, I should've caught this bug closer to that patch
> > being merged.
>
> No problem. I should have seen this issue while changing the code.
>
> Jan
>
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] MSI-X bug with ivshmem since msix_reset moved to PCI
2012-08-24 8:11 ` Michael S. Tsirkin
@ 2012-08-24 8:15 ` Jan Kiszka
2012-08-24 8:36 ` Michael S. Tsirkin
0 siblings, 1 reply; 7+ messages in thread
From: Jan Kiszka @ 2012-08-24 8:15 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: Cam Macdonell, qemu-devel@nongnu.org Developers
[-- Attachment #1: Type: text/plain, Size: 2843 bytes --]
On 2012-08-24 10:11, Michael S. Tsirkin wrote:
> On Fri, Aug 24, 2012 at 07:59:06AM +0200, Jan Kiszka wrote:
>> On 2012-08-24 01:13, Cam Macdonell wrote:
>>> Hi Jan,
>>>
>>> I've bisected a bug in which MSI interrupts are not being delivered to
>>> the following patch, where msix_reset was moved in tot he PCI core.
>>>
>>> commit cbd2d4342b3d42ab33baa99f5b7a23491b5692f2
>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>>> Date: Tue May 15 20:09:56 2012 -0300
>>>
>>> msi: Invoke msi/msix_reset from PCI core
>>>
>>> There is no point in pushing this burden to the devices, they tend to
>>> forget to call them (like intel-hda, ahci, xhci did). Instead, reset
>>> functions are now called from pci_device_reset. They do nothing if
>>> MSI/MSI-X is not in use.
>>>
>>> I've been debugging and it seems that when msix_notify() is triggered
>>> the second test in the "if" fails
>>>
>>> /* Send an MSI-X message */
>>> void msix_notify(PCIDevice *dev, unsigned vector)
>>> {
>>> MSIMessage msg;
>>>
>>> if (vector >= dev->msix_entries_nr || !dev->msix_entry_used[vector])
>>> return;
>>>
>>> …
>>> }
>>>
>>> here is some MSI-X debugging statements
>>>
>>> msix_init
>>> IVSHMEM: msix initialized (1 vectors)
>>> IVSHMEM: using vector 0
>>> IVSHMEM: ivshmem_reset
>>> IVSHMEM: using vector 0
>>> msix_reset
>>> msix_free_irq_entries 0x7fd52d1cea20
>>>
>>> msix_free_irq_entries() sets dev->msix_entries_nr to 0, so I think it
>>> may be the cause.
>>
>> I suppose you mean it sets the msix_entry_used array to 0.
>>
>>>
>>> Shouldn't ivshmem's reset (which reenables the vectors) be triggered
>>> by the msix_reset?
>>
>> Actually, the whole msix vector usage tracking is useless today, this
>> just shows its downsides (in the absence of benefits). Megasas is
>> affected by this problem as well, virtio not as it calls msix_vector_use
>> during the configuration process the guest driver triggers.
>>
>> Two options:
>> - I can send my removal patch for msix_vector_use/unuse that I was
>> only planning for 1.3 so far, and we kill this pitfall earlier.
>> - We re-add msix_vector_use calls to the affected device models for
>> 1.2 and drop them later again for 1.3 when removing usage tracking.
>> [The third option to keep the usage tracking is a non-option for me. ;)]
>>
>> Michael?
>
> Second option seems more prudent to me. Can you send a patch pls?
In fact, it's not as easy. ivshmem already tries to restore the usage
flag but fails due to reset handler ordering. I do not see ATM where
there is some "enable device" during re-initialization, at least for
ivshmem. Haven't checked megasas yet.
I tend to think removing is simpler and less risky. Please have a look
at the patch I sent earlier.
Jan
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] MSI-X bug with ivshmem since msix_reset moved to PCI
2012-08-24 8:15 ` Jan Kiszka
@ 2012-08-24 8:36 ` Michael S. Tsirkin
2012-08-24 8:39 ` Jan Kiszka
0 siblings, 1 reply; 7+ messages in thread
From: Michael S. Tsirkin @ 2012-08-24 8:36 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Cam Macdonell, qemu-devel@nongnu.org Developers
On Fri, Aug 24, 2012 at 10:15:33AM +0200, Jan Kiszka wrote:
> On 2012-08-24 10:11, Michael S. Tsirkin wrote:
> > On Fri, Aug 24, 2012 at 07:59:06AM +0200, Jan Kiszka wrote:
> >> On 2012-08-24 01:13, Cam Macdonell wrote:
> >>> Hi Jan,
> >>>
> >>> I've bisected a bug in which MSI interrupts are not being delivered to
> >>> the following patch, where msix_reset was moved in tot he PCI core.
> >>>
> >>> commit cbd2d4342b3d42ab33baa99f5b7a23491b5692f2
> >>> Author: Jan Kiszka <jan.kiszka@siemens.com>
> >>> Date: Tue May 15 20:09:56 2012 -0300
> >>>
> >>> msi: Invoke msi/msix_reset from PCI core
> >>>
> >>> There is no point in pushing this burden to the devices, they tend to
> >>> forget to call them (like intel-hda, ahci, xhci did). Instead, reset
> >>> functions are now called from pci_device_reset. They do nothing if
> >>> MSI/MSI-X is not in use.
> >>>
> >>> I've been debugging and it seems that when msix_notify() is triggered
> >>> the second test in the "if" fails
> >>>
> >>> /* Send an MSI-X message */
> >>> void msix_notify(PCIDevice *dev, unsigned vector)
> >>> {
> >>> MSIMessage msg;
> >>>
> >>> if (vector >= dev->msix_entries_nr || !dev->msix_entry_used[vector])
> >>> return;
> >>>
> >>> …
> >>> }
> >>>
> >>> here is some MSI-X debugging statements
> >>>
> >>> msix_init
> >>> IVSHMEM: msix initialized (1 vectors)
> >>> IVSHMEM: using vector 0
> >>> IVSHMEM: ivshmem_reset
> >>> IVSHMEM: using vector 0
> >>> msix_reset
> >>> msix_free_irq_entries 0x7fd52d1cea20
> >>>
> >>> msix_free_irq_entries() sets dev->msix_entries_nr to 0, so I think it
> >>> may be the cause.
> >>
> >> I suppose you mean it sets the msix_entry_used array to 0.
> >>
> >>>
> >>> Shouldn't ivshmem's reset (which reenables the vectors) be triggered
> >>> by the msix_reset?
> >>
> >> Actually, the whole msix vector usage tracking is useless today, this
> >> just shows its downsides (in the absence of benefits). Megasas is
> >> affected by this problem as well, virtio not as it calls msix_vector_use
> >> during the configuration process the guest driver triggers.
> >>
> >> Two options:
> >> - I can send my removal patch for msix_vector_use/unuse that I was
> >> only planning for 1.3 so far, and we kill this pitfall earlier.
> >> - We re-add msix_vector_use calls to the affected device models for
> >> 1.2 and drop them later again for 1.3 when removing usage tracking.
> >> [The third option to keep the usage tracking is a non-option for me. ;)]
> >>
> >> Michael?
> >
> > Second option seems more prudent to me. Can you send a patch pls?
>
> In fact, it's not as easy. ivshmem already tries to restore the usage
> flag but fails due to reset handler ordering. I do not see ATM where
> there is some "enable device" during re-initialization, at least for
> ivshmem. Haven't checked megasas yet.
>
> I tend to think removing is simpler and less risky. Please have a look
> at the patch I sent earlier.
>
> Jan
>
>
Actually virtio is the only one that changes vector use
so the only one that needs to reset it.
I am thinking we should find a way to move vector use
out from core to virtio-pci.
But for now something like the below should do the trick.
Untested unfortunately, will test virtio Sunday
but would appreciate review and testing reports
esp with ivshmem etc.
-->
msix: avoid need to use/unuse vectors for most devices
The facility to use/unuse vectors is helpful for virtio
but little else since everyone just seems to use
vectors in their init function.
Avoid clearing msix vector use info on reset and load.
For virtio, clear it explicitly.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
diff --git a/hw/msix.c b/hw/msix.c
index 800fc32..d040cc2 100644
--- a/hw/msix.c
+++ b/hw/msix.c
@@ -340,6 +340,15 @@ static void msix_free_irq_entries(PCIDevice *dev)
}
}
+static void msix_clear_all_vectors(PCIDevice *dev)
+{
+ int vector;
+
+ for (vector = 0; vector < dev->msix_entries_nr; ++vector) {
+ msix_clr_pending(dev, vector);
+ }
+}
+
/* Clean up resources for the device. */
void msix_uninit(PCIDevice *dev, MemoryRegion *table_bar, MemoryRegion *pba_bar)
{
@@ -394,7 +403,7 @@ void msix_load(PCIDevice *dev, QEMUFile *f)
return;
}
- msix_free_irq_entries(dev);
+ msix_clear_all_vectors(dev);
qemu_get_buffer(f, dev->msix_table, n * PCI_MSIX_ENTRY_SIZE);
qemu_get_buffer(f, dev->msix_pba, (n + 7) / 8);
msix_update_function_masked(dev);
@@ -440,7 +449,7 @@ void msix_reset(PCIDevice *dev)
if (!msix_present(dev)) {
return;
}
- msix_free_irq_entries(dev);
+ msix_clear_all_vectors(dev);
dev->config[dev->msix_cap + MSIX_CONTROL_OFFSET] &=
~dev->wmask[dev->msix_cap + MSIX_CONTROL_OFFSET];
memset(dev->msix_table, 0, dev->msix_entries_nr * PCI_MSIX_ENTRY_SIZE);
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 125eded..ca0b204 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -131,6 +131,7 @@ static int virtio_pci_load_config(void * opaque, QEMUFile *f)
if (ret) {
return ret;
}
+ msix_unuse_all_vectors(&proxy->pci_dev);
msix_load(&proxy->pci_dev, f);
if (msix_present(&proxy->pci_dev)) {
qemu_get_be16s(f, &proxy->vdev->config_vector);
@@ -246,6 +247,7 @@ void virtio_pci_reset(DeviceState *d)
VirtIOPCIProxy *proxy = container_of(d, VirtIOPCIProxy, pci_dev.qdev);
virtio_pci_stop_ioeventfd(proxy);
virtio_reset(proxy->vdev);
+ msix_unuse_all_vectors(&proxy->pci_dev);
proxy->flags &= ~VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
}
--
MST
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] MSI-X bug with ivshmem since msix_reset moved to PCI
2012-08-24 8:36 ` Michael S. Tsirkin
@ 2012-08-24 8:39 ` Jan Kiszka
2012-08-24 9:20 ` Michael S. Tsirkin
0 siblings, 1 reply; 7+ messages in thread
From: Jan Kiszka @ 2012-08-24 8:39 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: Cam Macdonell, qemu-devel@nongnu.org Developers
[-- Attachment #1: Type: text/plain, Size: 6136 bytes --]
On 2012-08-24 10:36, Michael S. Tsirkin wrote:
> On Fri, Aug 24, 2012 at 10:15:33AM +0200, Jan Kiszka wrote:
>> On 2012-08-24 10:11, Michael S. Tsirkin wrote:
>>> On Fri, Aug 24, 2012 at 07:59:06AM +0200, Jan Kiszka wrote:
>>>> On 2012-08-24 01:13, Cam Macdonell wrote:
>>>>> Hi Jan,
>>>>>
>>>>> I've bisected a bug in which MSI interrupts are not being delivered to
>>>>> the following patch, where msix_reset was moved in tot he PCI core.
>>>>>
>>>>> commit cbd2d4342b3d42ab33baa99f5b7a23491b5692f2
>>>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>>>>> Date: Tue May 15 20:09:56 2012 -0300
>>>>>
>>>>> msi: Invoke msi/msix_reset from PCI core
>>>>>
>>>>> There is no point in pushing this burden to the devices, they tend to
>>>>> forget to call them (like intel-hda, ahci, xhci did). Instead, reset
>>>>> functions are now called from pci_device_reset. They do nothing if
>>>>> MSI/MSI-X is not in use.
>>>>>
>>>>> I've been debugging and it seems that when msix_notify() is triggered
>>>>> the second test in the "if" fails
>>>>>
>>>>> /* Send an MSI-X message */
>>>>> void msix_notify(PCIDevice *dev, unsigned vector)
>>>>> {
>>>>> MSIMessage msg;
>>>>>
>>>>> if (vector >= dev->msix_entries_nr || !dev->msix_entry_used[vector])
>>>>> return;
>>>>>
>>>>> …
>>>>> }
>>>>>
>>>>> here is some MSI-X debugging statements
>>>>>
>>>>> msix_init
>>>>> IVSHMEM: msix initialized (1 vectors)
>>>>> IVSHMEM: using vector 0
>>>>> IVSHMEM: ivshmem_reset
>>>>> IVSHMEM: using vector 0
>>>>> msix_reset
>>>>> msix_free_irq_entries 0x7fd52d1cea20
>>>>>
>>>>> msix_free_irq_entries() sets dev->msix_entries_nr to 0, so I think it
>>>>> may be the cause.
>>>>
>>>> I suppose you mean it sets the msix_entry_used array to 0.
>>>>
>>>>>
>>>>> Shouldn't ivshmem's reset (which reenables the vectors) be triggered
>>>>> by the msix_reset?
>>>>
>>>> Actually, the whole msix vector usage tracking is useless today, this
>>>> just shows its downsides (in the absence of benefits). Megasas is
>>>> affected by this problem as well, virtio not as it calls msix_vector_use
>>>> during the configuration process the guest driver triggers.
>>>>
>>>> Two options:
>>>> - I can send my removal patch for msix_vector_use/unuse that I was
>>>> only planning for 1.3 so far, and we kill this pitfall earlier.
>>>> - We re-add msix_vector_use calls to the affected device models for
>>>> 1.2 and drop them later again for 1.3 when removing usage tracking.
>>>> [The third option to keep the usage tracking is a non-option for me. ;)]
>>>>
>>>> Michael?
>>>
>>> Second option seems more prudent to me. Can you send a patch pls?
>>
>> In fact, it's not as easy. ivshmem already tries to restore the usage
>> flag but fails due to reset handler ordering. I do not see ATM where
>> there is some "enable device" during re-initialization, at least for
>> ivshmem. Haven't checked megasas yet.
>>
>> I tend to think removing is simpler and less risky. Please have a look
>> at the patch I sent earlier.
>>
>> Jan
>>
>>
>
> Actually virtio is the only one that changes vector use
> so the only one that needs to reset it.
>
> I am thinking we should find a way to move vector use
> out from core to virtio-pci.
> But for now something like the below should do the trick.
> Untested unfortunately, will test virtio Sunday
> but would appreciate review and testing reports
> esp with ivshmem etc.
>
> -->
>
> msix: avoid need to use/unuse vectors for most devices
>
> The facility to use/unuse vectors is helpful for virtio
> but little else since everyone just seems to use
> vectors in their init function.
>
> Avoid clearing msix vector use info on reset and load.
> For virtio, clear it explicitly.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>
> ---
>
> diff --git a/hw/msix.c b/hw/msix.c
> index 800fc32..d040cc2 100644
> --- a/hw/msix.c
> +++ b/hw/msix.c
> @@ -340,6 +340,15 @@ static void msix_free_irq_entries(PCIDevice *dev)
> }
> }
>
> +static void msix_clear_all_vectors(PCIDevice *dev)
> +{
> + int vector;
> +
> + for (vector = 0; vector < dev->msix_entries_nr; ++vector) {
> + msix_clr_pending(dev, vector);
> + }
> +}
> +
> /* Clean up resources for the device. */
> void msix_uninit(PCIDevice *dev, MemoryRegion *table_bar, MemoryRegion *pba_bar)
> {
> @@ -394,7 +403,7 @@ void msix_load(PCIDevice *dev, QEMUFile *f)
> return;
> }
>
> - msix_free_irq_entries(dev);
> + msix_clear_all_vectors(dev);
> qemu_get_buffer(f, dev->msix_table, n * PCI_MSIX_ENTRY_SIZE);
> qemu_get_buffer(f, dev->msix_pba, (n + 7) / 8);
> msix_update_function_masked(dev);
> @@ -440,7 +449,7 @@ void msix_reset(PCIDevice *dev)
> if (!msix_present(dev)) {
> return;
> }
> - msix_free_irq_entries(dev);
> + msix_clear_all_vectors(dev);
> dev->config[dev->msix_cap + MSIX_CONTROL_OFFSET] &=
> ~dev->wmask[dev->msix_cap + MSIX_CONTROL_OFFSET];
> memset(dev->msix_table, 0, dev->msix_entries_nr * PCI_MSIX_ENTRY_SIZE);
> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> index 125eded..ca0b204 100644
> --- a/hw/virtio-pci.c
> +++ b/hw/virtio-pci.c
> @@ -131,6 +131,7 @@ static int virtio_pci_load_config(void * opaque, QEMUFile *f)
> if (ret) {
> return ret;
> }
> + msix_unuse_all_vectors(&proxy->pci_dev);
> msix_load(&proxy->pci_dev, f);
> if (msix_present(&proxy->pci_dev)) {
> qemu_get_be16s(f, &proxy->vdev->config_vector);
> @@ -246,6 +247,7 @@ void virtio_pci_reset(DeviceState *d)
> VirtIOPCIProxy *proxy = container_of(d, VirtIOPCIProxy, pci_dev.qdev);
> virtio_pci_stop_ioeventfd(proxy);
> virtio_reset(proxy->vdev);
> + msix_unuse_all_vectors(&proxy->pci_dev);
> proxy->flags &= ~VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
> }
>
>
This just prolongs the life of a wrong API, but I'm fine for 1.2 if it
helps. However, I will push on removing all this for 1.3.
Jan
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] MSI-X bug with ivshmem since msix_reset moved to PCI
2012-08-24 8:39 ` Jan Kiszka
@ 2012-08-24 9:20 ` Michael S. Tsirkin
0 siblings, 0 replies; 7+ messages in thread
From: Michael S. Tsirkin @ 2012-08-24 9:20 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Cam Macdonell, qemu-devel@nongnu.org Developers
On Fri, Aug 24, 2012 at 10:39:15AM +0200, Jan Kiszka wrote:
> On 2012-08-24 10:36, Michael S. Tsirkin wrote:
> > On Fri, Aug 24, 2012 at 10:15:33AM +0200, Jan Kiszka wrote:
> >> On 2012-08-24 10:11, Michael S. Tsirkin wrote:
> >>> On Fri, Aug 24, 2012 at 07:59:06AM +0200, Jan Kiszka wrote:
> >>>> On 2012-08-24 01:13, Cam Macdonell wrote:
> >>>>> Hi Jan,
> >>>>>
> >>>>> I've bisected a bug in which MSI interrupts are not being delivered to
> >>>>> the following patch, where msix_reset was moved in tot he PCI core.
> >>>>>
> >>>>> commit cbd2d4342b3d42ab33baa99f5b7a23491b5692f2
> >>>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
> >>>>> Date: Tue May 15 20:09:56 2012 -0300
> >>>>>
> >>>>> msi: Invoke msi/msix_reset from PCI core
> >>>>>
> >>>>> There is no point in pushing this burden to the devices, they tend to
> >>>>> forget to call them (like intel-hda, ahci, xhci did). Instead, reset
> >>>>> functions are now called from pci_device_reset. They do nothing if
> >>>>> MSI/MSI-X is not in use.
> >>>>>
> >>>>> I've been debugging and it seems that when msix_notify() is triggered
> >>>>> the second test in the "if" fails
> >>>>>
> >>>>> /* Send an MSI-X message */
> >>>>> void msix_notify(PCIDevice *dev, unsigned vector)
> >>>>> {
> >>>>> MSIMessage msg;
> >>>>>
> >>>>> if (vector >= dev->msix_entries_nr || !dev->msix_entry_used[vector])
> >>>>> return;
> >>>>>
> >>>>> …
> >>>>> }
> >>>>>
> >>>>> here is some MSI-X debugging statements
> >>>>>
> >>>>> msix_init
> >>>>> IVSHMEM: msix initialized (1 vectors)
> >>>>> IVSHMEM: using vector 0
> >>>>> IVSHMEM: ivshmem_reset
> >>>>> IVSHMEM: using vector 0
> >>>>> msix_reset
> >>>>> msix_free_irq_entries 0x7fd52d1cea20
> >>>>>
> >>>>> msix_free_irq_entries() sets dev->msix_entries_nr to 0, so I think it
> >>>>> may be the cause.
> >>>>
> >>>> I suppose you mean it sets the msix_entry_used array to 0.
> >>>>
> >>>>>
> >>>>> Shouldn't ivshmem's reset (which reenables the vectors) be triggered
> >>>>> by the msix_reset?
> >>>>
> >>>> Actually, the whole msix vector usage tracking is useless today, this
> >>>> just shows its downsides (in the absence of benefits). Megasas is
> >>>> affected by this problem as well, virtio not as it calls msix_vector_use
> >>>> during the configuration process the guest driver triggers.
> >>>>
> >>>> Two options:
> >>>> - I can send my removal patch for msix_vector_use/unuse that I was
> >>>> only planning for 1.3 so far, and we kill this pitfall earlier.
> >>>> - We re-add msix_vector_use calls to the affected device models for
> >>>> 1.2 and drop them later again for 1.3 when removing usage tracking.
> >>>> [The third option to keep the usage tracking is a non-option for me. ;)]
> >>>>
> >>>> Michael?
> >>>
> >>> Second option seems more prudent to me. Can you send a patch pls?
> >>
> >> In fact, it's not as easy. ivshmem already tries to restore the usage
> >> flag but fails due to reset handler ordering. I do not see ATM where
> >> there is some "enable device" during re-initialization, at least for
> >> ivshmem. Haven't checked megasas yet.
> >>
> >> I tend to think removing is simpler and less risky. Please have a look
> >> at the patch I sent earlier.
> >>
> >> Jan
> >>
> >>
> >
> > Actually virtio is the only one that changes vector use
> > so the only one that needs to reset it.
> >
> > I am thinking we should find a way to move vector use
> > out from core to virtio-pci.
> > But for now something like the below should do the trick.
> > Untested unfortunately, will test virtio Sunday
> > but would appreciate review and testing reports
> > esp with ivshmem etc.
> >
> > -->
> >
> > msix: avoid need to use/unuse vectors for most devices
> >
> > The facility to use/unuse vectors is helpful for virtio
> > but little else since everyone just seems to use
> > vectors in their init function.
> >
> > Avoid clearing msix vector use info on reset and load.
> > For virtio, clear it explicitly.
> >
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >
> > ---
> >
> > diff --git a/hw/msix.c b/hw/msix.c
> > index 800fc32..d040cc2 100644
> > --- a/hw/msix.c
> > +++ b/hw/msix.c
> > @@ -340,6 +340,15 @@ static void msix_free_irq_entries(PCIDevice *dev)
> > }
> > }
> >
> > +static void msix_clear_all_vectors(PCIDevice *dev)
> > +{
> > + int vector;
> > +
> > + for (vector = 0; vector < dev->msix_entries_nr; ++vector) {
> > + msix_clr_pending(dev, vector);
> > + }
> > +}
> > +
> > /* Clean up resources for the device. */
> > void msix_uninit(PCIDevice *dev, MemoryRegion *table_bar, MemoryRegion *pba_bar)
> > {
> > @@ -394,7 +403,7 @@ void msix_load(PCIDevice *dev, QEMUFile *f)
> > return;
> > }
> >
> > - msix_free_irq_entries(dev);
> > + msix_clear_all_vectors(dev);
> > qemu_get_buffer(f, dev->msix_table, n * PCI_MSIX_ENTRY_SIZE);
> > qemu_get_buffer(f, dev->msix_pba, (n + 7) / 8);
> > msix_update_function_masked(dev);
> > @@ -440,7 +449,7 @@ void msix_reset(PCIDevice *dev)
> > if (!msix_present(dev)) {
> > return;
> > }
> > - msix_free_irq_entries(dev);
> > + msix_clear_all_vectors(dev);
> > dev->config[dev->msix_cap + MSIX_CONTROL_OFFSET] &=
> > ~dev->wmask[dev->msix_cap + MSIX_CONTROL_OFFSET];
> > memset(dev->msix_table, 0, dev->msix_entries_nr * PCI_MSIX_ENTRY_SIZE);
> > diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> > index 125eded..ca0b204 100644
> > --- a/hw/virtio-pci.c
> > +++ b/hw/virtio-pci.c
> > @@ -131,6 +131,7 @@ static int virtio_pci_load_config(void * opaque, QEMUFile *f)
> > if (ret) {
> > return ret;
> > }
> > + msix_unuse_all_vectors(&proxy->pci_dev);
> > msix_load(&proxy->pci_dev, f);
> > if (msix_present(&proxy->pci_dev)) {
> > qemu_get_be16s(f, &proxy->vdev->config_vector);
> > @@ -246,6 +247,7 @@ void virtio_pci_reset(DeviceState *d)
> > VirtIOPCIProxy *proxy = container_of(d, VirtIOPCIProxy, pci_dev.qdev);
> > virtio_pci_stop_ioeventfd(proxy);
> > virtio_reset(proxy->vdev);
> > + msix_unuse_all_vectors(&proxy->pci_dev);
> > proxy->flags &= ~VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
> > }
> >
> >
>
> This just prolongs the life of a wrong API, but I'm fine for 1.2 if it
> helps. However, I will push on removing all this for 1.3.
>
> Jan
>
For 1.3 we'll move use tracking into virtio, yes.
--
MST
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2012-08-24 9:19 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-08-23 23:13 [Qemu-devel] MSI-X bug with ivshmem since msix_reset moved to PCI Cam Macdonell
2012-08-24 5:59 ` Jan Kiszka
2012-08-24 8:11 ` Michael S. Tsirkin
2012-08-24 8:15 ` Jan Kiszka
2012-08-24 8:36 ` Michael S. Tsirkin
2012-08-24 8:39 ` Jan Kiszka
2012-08-24 9:20 ` Michael S. Tsirkin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).