* Xen 4.1 rc5 outstanding bugs
@ 2011-02-18 18:59 Stefano Stabellini
  2011-02-21  8:32 ` Jan Beulich
  2011-03-08 15:43 ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 10+ messages in thread
From: Stefano Stabellini @ 2011-02-18 18:59 UTC (permalink / raw)
  To: xen-devel

Hi all,
I went through the list of bugs affecting the latest Xen 4.1 RC and
made a list of those that seem to be the most serious.
All of them affect PCI passthrough and seem to be hypervisor/qemu-xen
bugs, apart from the last one, which is a libxenlight/xl bug.


 *  VF passthrough does not work
Passing through a normal NIC seems to work but passing through a VF
doesn't.
The device appears in the guest but it cannot exchange packets; the
guest kernel version doesn't seem to matter.
From the qemu logs:

pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already function.

It might be the same problem as in the following two bug reports:
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1709
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1742


 *  Xen panic on guest shutdown with PCI Passthrough
http://xen.1045712.n5.nabble.com/Xen-panic-on-guest-shutdown-with-PCI-Passthrough-tt3371337.html#none
When a guest with a passthrough PCI device is shut down, Xen panics
on an NMI - MEMORY ERROR.
(XEN) Xen call trace:
(XEN)    [<ffff82c48015f032>] msi_set_mask_bit+0xea/0x121
(XEN)    [<ffff82c48015f087>] mask_msi_irq+0xe/0x10
(XEN)    [<ffff82c480162fb1>] __pirq_guest_unbind+0x298/0x2aa
(XEN)    [<ffff82c48016315c>] unmap_domain_pirq+0x199/0x307
(XEN)    [<ffff82c480163321>] free_domain_pirqs+0x57/0x83
(XEN)    [<ffff82c48015642a>] arch_domain_destroy+0x30/0x2e3
(XEN)    [<ffff82c480104c59>] complete_domain_destroy+0x6e/0x12a
(XEN)    [<ffff82c48012adfd>] rcu_process_callbacks+0x173/0x1e1
(XEN)    [<ffff82c480123327>] __do_softirq+0x88/0x99
(XEN)    [<ffff82c4801233a2>] do_softirq+0x6a/0x7a
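
Xen prints the innermost frame first, so the trace above reads bottom-up. A small script can extract the symbol names and list them in call order. This is only a sketch: the helper name `call_chain` and the regular expression are illustrative, and it assumes the `[<address>] symbol+0xoffset/0xlength` layout shown in the trace.

```python
import re

# Matches frames such as:
#   (XEN)    [<ffff82c48015f032>] msi_set_mask_bit+0xea/0x121
# and captures only the symbol name.
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def call_chain(trace):
    """Return symbol names ordered from outermost caller to innermost callee."""
    frames = FRAME_RE.findall(trace)
    # The trace lists the innermost frame first; reverse it for call order.
    return frames[::-1]


TRACE = """
(XEN) Xen call trace:
(XEN)    [<ffff82c48015f032>] msi_set_mask_bit+0xea/0x121
(XEN)    [<ffff82c48015f087>] mask_msi_irq+0xe/0x10
(XEN)    [<ffff82c480162fb1>] __pirq_guest_unbind+0x298/0x2aa
(XEN)    [<ffff82c48016315c>] unmap_domain_pirq+0x199/0x307
(XEN)    [<ffff82c480163321>] free_domain_pirqs+0x57/0x83
(XEN)    [<ffff82c48015642a>] arch_domain_destroy+0x30/0x2e3
(XEN)    [<ffff82c480104c59>] complete_domain_destroy+0x6e/0x12a
(XEN)    [<ffff82c48012adfd>] rcu_process_callbacks+0x173/0x1e1
(XEN)    [<ffff82c480123327>] __do_softirq+0x88/0x99
(XEN)    [<ffff82c4801233a2>] do_softirq+0x6a/0x7a
"""

if __name__ == "__main__":
    print(" -> ".join(call_chain(TRACE)))
```

Read this way, the panic happens on the domain destruction path: the softirq runs the RCU callback, which destroys the domain, frees its pirqs and finally masks the MSI.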


 *  possible Xen pirq leak at domain shutdown
If a guest doesn't support PCI hot-unplug (or is malicious), it
won't do anything in response to the ACPI SCI interrupt we send when the
domain is destroyed; therefore unregister_real_device will never be
called in qemu and we might be leaking MSIs in Xen (to be
verified).
http://xen.1045712.n5.nabble.com/template/NamlServlet.jtp?macro=print_post&node=3369367
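
The concern can be illustrated with a toy model. This is not Xen or qemu code: the class `ToyHost`, its methods and the pirq number 54 are all hypothetical, and the `sweep` flag merely stands in for whether domain teardown frees interrupts the guest never released.

```python
class ToyHost:
    """Toy model of pirq ownership; purely illustrative, not Xen code."""

    def __init__(self):
        self.pirqs = {}  # pirq number -> owning domain id

    def map_pirq(self, domid, pirq):
        self.pirqs[pirq] = domid

    def guest_acks_unplug(self, domid, pirq):
        # Cooperative path: the guest reacts to the SCI and the device
        # model releases the interrupt on its behalf.
        if self.pirqs.get(pirq) == domid:
            del self.pirqs[pirq]

    def destroy_domain(self, domid, sweep):
        # With sweep=True, teardown frees everything the domain still owns,
        # whether or not the guest cooperated.
        if sweep:
            self.pirqs = {p: d for p, d in self.pirqs.items() if d != domid}


def leaked_after_destroy(guest_cooperates, sweep):
    """Return the pirqs still mapped after domain 1 has been destroyed."""
    host = ToyHost()
    host.map_pirq(1, 54)
    if guest_cooperates:
        host.guest_acks_unplug(1, 54)
    host.destroy_domain(1, sweep)
    return sorted(host.pirqs)


if __name__ == "__main__":
    print(leaked_after_destroy(guest_cooperates=False, sweep=False))
    print(leaked_after_destroy(guest_cooperates=False, sweep=True))
```

In this model a leak is only possible when cleanup depends entirely on the guest, which is the case that needs verifying.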


 *  Xen warns about MSIs when assigning a PCI device to a guest
also known as "Xen complains msi error when startup"
At startup Xen prints multiple:
(XEN) Xen WARN at msi.c:635
(XEN) Xen WARN at msi.c:648
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1732


 *  PCI hot-unplug causes a guest crash
also known as "fail to detach NIC from guest"
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1736


 * Passthrough of multiple PCI devices to PV guests or HVM guests with
   stubdoms is broken with the xl toolstack
Cannot assign more than one PCI passthrough device at domain creation time,
because libxl creates a bus for the first device and increments the
"num_devs" node in xenstore for each subsequent device, but pciback cannot
cope with num_devs changing while the guest is not running to respond to
the reconfiguration request. A fix would be to create the entire bus in a
single cold-plug operation at start of day.
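
The proposed fix can be sketched as follows. This is a rough illustration rather than libxl code: the function `coldplug_bus` is hypothetical, a plain dict stands in for the xenstore backend directory, and the `dev-N` key naming is an assumption (only `num_devs` is named above).

```python
def coldplug_bus(devices):
    """Build every backend entry for a domain in one pass.

    `devices` is a list of PCI BDF strings. The returned dict represents
    what would be written in a single operation before the guest starts,
    so num_devs never changes while nobody is there to acknowledge it.
    """
    entries = {}
    for index, bdf in enumerate(devices):
        entries["dev-%d" % index] = bdf  # assumed key naming
    # num_devs is written once, with its final value.
    entries["num_devs"] = str(len(devices))
    return entries


if __name__ == "__main__":
    for key, value in sorted(coldplug_bus(["0000:02:00.0", "0000:02:00.1"]).items()):
        print(key, "=", value)
```

The point of the design is that the backend sees a complete, consistent bus description from the start instead of a sequence of reconfiguration requests.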



 - Stefano


* Re: Xen 4.1 rc5 outstanding bugs
  2011-02-18 18:59 Xen 4.1 rc5 outstanding bugs Stefano Stabellini
@ 2011-02-21  8:32 ` Jan Beulich
  2011-02-21 11:05   ` Stefano Stabellini
  2011-03-08 15:43 ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2011-02-21  8:32 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel

>>> On 18.02.11 at 19:59, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
>  *  Xen panic on guest shutdown with PCI Passthrough
> http://xen.1045712.n5.nabble.com/Xen-panic-on-guest-shutdown-with-PCI-Passthrough-tt3371337.html#none
> When the guest with a passthrough pci device is shut down, Xen panic
> on a NMI - MEMORY ERROR.
> (XEN) Xen call trace:
> (XEN)    [<ffff82c48015f032>] msi_set_mask_bit+0xea/0x121
> (XEN)    [<ffff82c48015f087>] mask_msi_irq+0xe/0x10
> (XEN)    [<ffff82c480162fb1>] __pirq_guest_unbind+0x298/0x2aa
> (XEN)    [<ffff82c48016315c>] unmap_domain_pirq+0x199/0x307
> (XEN)    [<ffff82c480163321>] free_domain_pirqs+0x57/0x83
> (XEN)    [<ffff82c48015642a>] arch_domain_destroy+0x30/0x2e3
> (XEN)    [<ffff82c480104c59>] complete_domain_destroy+0x6e/0x12a
> (XEN)    [<ffff82c48012adfd>] rcu_process_callbacks+0x173/0x1e1
> (XEN)    [<ffff82c480123327>] __do_softirq+0x88/0x99
> (XEN)    [<ffff82c4801233a2>] do_softirq+0x6a/0x7a

Are you sure you want to consider this one severe? Xen by itself
can't cause NMIs (unless sending IPIs as such, which certainly isn't
the case here), and iirc from the mailing list post corresponding to
this it's being observed only on a single (with the above presumably
buggy) system.

Jan


* Re: Xen 4.1 rc5 outstanding bugs
  2011-02-21  8:32 ` Jan Beulich
@ 2011-02-21 11:05   ` Stefano Stabellini
  2011-02-21 11:25     ` Jan Beulich
  2011-02-21 17:29     ` Anthony PERARD
  0 siblings, 2 replies; 10+ messages in thread
From: Stefano Stabellini @ 2011-02-21 11:05 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel@lists.xensource.com, Stefano Stabellini

On Mon, 21 Feb 2011, Jan Beulich wrote:
> >>> On 18.02.11 at 19:59, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
> >  *  Xen panic on guest shutdown with PCI Passthrough
> > http://xen.1045712.n5.nabble.com/Xen-panic-on-guest-shutdown-with-PCI-Passthrough-tt3371337.html#none
> > When the guest with a passthrough pci device is shut down, Xen panic
> > on a NMI - MEMORY ERROR.
> > (XEN) Xen call trace:
> > (XEN)    [<ffff82c48015f032>] msi_set_mask_bit+0xea/0x121
> > (XEN)    [<ffff82c48015f087>] mask_msi_irq+0xe/0x10
> > (XEN)    [<ffff82c480162fb1>] __pirq_guest_unbind+0x298/0x2aa
> > (XEN)    [<ffff82c48016315c>] unmap_domain_pirq+0x199/0x307
> > (XEN)    [<ffff82c480163321>] free_domain_pirqs+0x57/0x83
> > (XEN)    [<ffff82c48015642a>] arch_domain_destroy+0x30/0x2e3
> > (XEN)    [<ffff82c480104c59>] complete_domain_destroy+0x6e/0x12a
> > (XEN)    [<ffff82c48012adfd>] rcu_process_callbacks+0x173/0x1e1
> > (XEN)    [<ffff82c480123327>] __do_softirq+0x88/0x99
> > (XEN)    [<ffff82c4801233a2>] do_softirq+0x6a/0x7a
> 
> Are you sure you want to consider this one severe? Xen by itself
> can't cause NMIs (unless sending IPIs as such, which certainly isn't
> the case here), and iirc from the mailing list post corresponding to
> this it's being observed only on a single (with the above presumably
> buggy) system.
 
Good point. We should have another identical machine; we'll try to
reproduce it there.


* Re: Xen 4.1 rc5 outstanding bugs
  2011-02-21 11:05   ` Stefano Stabellini
@ 2011-02-21 11:25     ` Jan Beulich
  2011-02-21 17:29     ` Anthony PERARD
  1 sibling, 0 replies; 10+ messages in thread
From: Jan Beulich @ 2011-02-21 11:25 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel@lists.xensource.com

>>> On 21.02.11 at 12:05, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
> On Mon, 21 Feb 2011, Jan Beulich wrote:
>> >>> On 18.02.11 at 19:59, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
>> >  *  Xen panic on guest shutdown with PCI Passthrough
>> > http://xen.1045712.n5.nabble.com/Xen-panic-on-guest-shutdown-with-PCI-Passthrough-tt3371337.html#none
>> > When the guest with a passthrough pci device is shut down, Xen panic
>> > on a NMI - MEMORY ERROR.
>> > (XEN) Xen call trace:
>> > (XEN)    [<ffff82c48015f032>] msi_set_mask_bit+0xea/0x121
>> > (XEN)    [<ffff82c48015f087>] mask_msi_irq+0xe/0x10
>> > (XEN)    [<ffff82c480162fb1>] __pirq_guest_unbind+0x298/0x2aa
>> > (XEN)    [<ffff82c48016315c>] unmap_domain_pirq+0x199/0x307
>> > (XEN)    [<ffff82c480163321>] free_domain_pirqs+0x57/0x83
>> > (XEN)    [<ffff82c48015642a>] arch_domain_destroy+0x30/0x2e3
>> > (XEN)    [<ffff82c480104c59>] complete_domain_destroy+0x6e/0x12a
>> > (XEN)    [<ffff82c48012adfd>] rcu_process_callbacks+0x173/0x1e1
>> > (XEN)    [<ffff82c480123327>] __do_softirq+0x88/0x99
>> > (XEN)    [<ffff82c4801233a2>] do_softirq+0x6a/0x7a
>> 
>> Are you sure you want to consider this one severe? Xen by itself
>> can't cause NMIs (unless sending IPIs as such, which certainly isn't
>> the case here), and iirc from the mailing list post corresponding to
>> this it's being observed only on a single (with the above presumably
>> buggy) system.
>  
> Good point. We should have another identical machine, we'll try to
> reproduce it there.

Another identical machine may have the same problem (if it's a
design flaw rather than broken hardware) - reproducing on a
different machine would be of much more interest.

Jan


* Re: Xen 4.1 rc5 outstanding bugs
  2011-02-21 11:05   ` Stefano Stabellini
  2011-02-21 11:25     ` Jan Beulich
@ 2011-02-21 17:29     ` Anthony PERARD
  2011-02-22  6:17       ` Zhang, Yang Z
  1 sibling, 1 reply; 10+ messages in thread
From: Anthony PERARD @ 2011-02-21 17:29 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel@lists.xensource.com, Jan Beulich

On Mon, Feb 21, 2011 at 11:05, Stefano Stabellini
<stefano.stabellini@eu.citrix.com> wrote:
> On Mon, 21 Feb 2011, Jan Beulich wrote:
>> >>> On 18.02.11 at 19:59, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
>> >  *  Xen panic on guest shutdown with PCI Passthrough
>> > http://xen.1045712.n5.nabble.com/Xen-panic-on-guest-shutdown-with-PCI-Passthrough-tt3371337.html#none
>> > When the guest with a passthrough pci device is shut down, Xen panic
>> > on a NMI - MEMORY ERROR.
>> > (XEN) Xen call trace:
>> > (XEN)    [<ffff82c48015f032>] msi_set_mask_bit+0xea/0x121
>> > (XEN)    [<ffff82c48015f087>] mask_msi_irq+0xe/0x10
>> > (XEN)    [<ffff82c480162fb1>] __pirq_guest_unbind+0x298/0x2aa
>> > (XEN)    [<ffff82c48016315c>] unmap_domain_pirq+0x199/0x307
>> > (XEN)    [<ffff82c480163321>] free_domain_pirqs+0x57/0x83
>> > (XEN)    [<ffff82c48015642a>] arch_domain_destroy+0x30/0x2e3
>> > (XEN)    [<ffff82c480104c59>] complete_domain_destroy+0x6e/0x12a
>> > (XEN)    [<ffff82c48012adfd>] rcu_process_callbacks+0x173/0x1e1
>> > (XEN)    [<ffff82c480123327>] __do_softirq+0x88/0x99
>> > (XEN)    [<ffff82c4801233a2>] do_softirq+0x6a/0x7a
>>
>> Are you sure you want to consider this one severe? Xen by itself
>> can't cause NMIs (unless sending IPIs as such, which certainly isn't
>> the case here), and iirc from the mailing list post corresponding to
>> this it's being observed only on a single (with the above presumably
>> buggy) system.
>
> Good point. We should have another identical machine, we'll try to
> reproduce it there.

I have the same issue on another identical machine.

Host: Dell System PowerEdge R310

Dom0: Debian Squeeze with Kernel dom0 2.6.32.27
Xen 4.1-rc6-pre
Guest: Debian lenny with default kernel.

pci devices:
Broadcom Corporation NetXtreme II BCM5716 Gigabit Ethernet
02:00.0 -> passthrough to guest
02:00.1 -> keep as network card for dom0

A bug report has been filed here:
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1744

-- 
Anthony PERARD


* RE: Xen 4.1 rc5 outstanding bugs
  2011-02-21 17:29     ` Anthony PERARD
@ 2011-02-22  6:17       ` Zhang, Yang Z
  2011-02-23 18:58         ` Stefano Stabellini
  0 siblings, 1 reply; 10+ messages in thread
From: Zhang, Yang Z @ 2011-02-22  6:17 UTC (permalink / raw)
  To: Anthony PERARD, Stefano Stabellini
  Cc: Jan Beulich, xen-devel@lists.xensource.com

I used rhel5u5 as the guest and cannot reproduce this issue on my Westmere-EP. Can you try with a rhel5u5 guest?

Host: westmere-ep

Software:
Xen 4.1-rc6-pre
Pvops dom0: rhel5u5 with 2.6.32.27
Guest: rhel5u5

best regards
yang


> -----Original Message-----
> From: xen-devel-bounces@lists.xensource.com
> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Anthony
> PERARD
> Sent: Tuesday, February 22, 2011 1:30 AM
> To: Stefano Stabellini
> Cc: xen-devel@lists.xensource.com; Jan Beulich
> Subject: Re: [Xen-devel] Xen 4.1 rc5 outstanding bugs
> 
> On Mon, Feb 21, 2011 at 11:05, Stefano Stabellini
> <stefano.stabellini@eu.citrix.com> wrote:
> > On Mon, 21 Feb 2011, Jan Beulich wrote:
> >> >>> On 18.02.11 at 19:59, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
> >> >  *  Xen panic on guest shutdown with PCI Passthrough
> >> > http://xen.1045712.n5.nabble.com/Xen-panic-on-guest-shutdown-with-PCI-Passthrough-tt3371337.html#none
> >> > When the guest with a passthrough pci device is shut down, Xen panic
> >> > on a NMI - MEMORY ERROR.
> >> > (XEN) Xen call trace:
> >> > (XEN)    [<ffff82c48015f032>] msi_set_mask_bit+0xea/0x121
> >> > (XEN)    [<ffff82c48015f087>] mask_msi_irq+0xe/0x10
> >> > (XEN)    [<ffff82c480162fb1>] __pirq_guest_unbind+0x298/0x2aa
> >> > (XEN)    [<ffff82c48016315c>] unmap_domain_pirq+0x199/0x307
> >> > (XEN)    [<ffff82c480163321>] free_domain_pirqs+0x57/0x83
> >> > (XEN)    [<ffff82c48015642a>] arch_domain_destroy+0x30/0x2e3
> >> > (XEN)    [<ffff82c480104c59>] complete_domain_destroy+0x6e/0x12a
> >> > (XEN)    [<ffff82c48012adfd>] rcu_process_callbacks+0x173/0x1e1
> >> > (XEN)    [<ffff82c480123327>] __do_softirq+0x88/0x99
> >> > (XEN)    [<ffff82c4801233a2>] do_softirq+0x6a/0x7a
> >>
> >> Are you sure you want to consider this one severe? Xen by itself
> >> can't cause NMIs (unless sending IPIs as such, which certainly isn't
> >> the case here), and iirc from the mailing list post corresponding to
> >> this it's being observed only on a single (with the above presumably
> >> buggy) system.
> >
> > Good point. We should have another identical machine, we'll try to
> > reproduce it there.
> 
> I have the same issue on another identical machine.
> 
> Host: Dell System PowerEdge R310
> 
> Dom0: Debian Squeeze with Kernel dom0 2.6.32.27
> Xen 4.1-rc6-pre
> Guest: Debian lenny with default kernel.
> 
> pci devices:
> Broadcom Corporation NetXtreme II BCM5716 Gigabit Ethernet
> 02:00.0 -> passthrough to guest
> 02:00.1 -> keep as network card for dom0
> 
> A bug report have been filled here:
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1744
> 
> --
> Anthony PERARD
> 

* RE: Xen 4.1 rc5 outstanding bugs
  2011-02-22  6:17       ` Zhang, Yang Z
@ 2011-02-23 18:58         ` Stefano Stabellini
  0 siblings, 0 replies; 10+ messages in thread
From: Stefano Stabellini @ 2011-02-23 18:58 UTC (permalink / raw)
  To: Zhang, Yang Z
  Cc: Anthony Perard, xen-devel@lists.xensource.com, Jan Beulich,
	Stefano Stabellini

On Tue, 22 Feb 2011, Zhang, Yang Z wrote:
> I used rhel5u5 as guest and cannot reproduce this issue in my westmere-EP. Can you have a try with rhel5u5 guest?

I didn't try with a rhel5.5 guest but I noticed that the device we are
trying to passthrough is a multifunction device:

02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5716 Gigabit Ethernet (rev 20)
02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5716 Gigabit Ethernet (rev 20)

When the VM is shut down, Xen receives an NMI while trying to mask an
MSI-X entry from __pirq_guest_unbind.
The NMI happens even if we leave the other function unused or if we
assign it to the same VM.
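
For anyone checking their own hardware, listings like the one above can be grouped by bus and device number to spot devices that expose more than one function. This is a minimal sketch: the function name `multifunction_slots` is illustrative, it only handles the short `bus:dev.fn` form printed above (no PCI domain prefix), and seeing several functions is a heuristic rather than a read of the header-type register.

```python
import re
from collections import defaultdict

# Matches the leading "bus:dev.fn" of each lspci line, e.g. "02:00.1".
BDF_RE = re.compile(r"^([0-9a-f]{2}):([0-9a-f]{2})\.([0-7])\s", re.MULTILINE)


def multifunction_slots(lspci_output):
    """Map "bus:dev" to its function numbers, for slots with >1 function."""
    slots = defaultdict(list)
    for bus, dev, fn in BDF_RE.findall(lspci_output):
        slots["%s:%s" % (bus, dev)].append(int(fn))
    return {slot: fns for slot, fns in slots.items() if len(fns) > 1}


LSPCI = """\
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5716 Gigabit Ethernet (rev 20)
02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5716 Gigabit Ethernet (rev 20)
"""

if __name__ == "__main__":
    print(multifunction_slots(LSPCI))
```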


* Re: Xen 4.1 rc5 outstanding bugs
  2011-02-18 18:59 Xen 4.1 rc5 outstanding bugs Stefano Stabellini
  2011-02-21  8:32 ` Jan Beulich
@ 2011-03-08 15:43 ` Konrad Rzeszutek Wilk
  2011-03-08 17:13   ` Stefano Stabellini
  1 sibling, 1 reply; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-03-08 15:43 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel

On Fri, Feb 18, 2011 at 06:59:55PM +0000, Stefano Stabellini wrote:
> Hi all,
> I went through the list of bugs affecting the latest Xen 4.1 RC and
> I made a list of what they seem to be the most serious.

What is the status of these? I know you made some strides in fixing
most if not all of them both in the hypervisor and Linux kernel.

> All of them affect PCI passthrough and seem to be hypervisor/qemu-xen
> bugs apart from the last one that is a libxenlight/xl bug.
> 
> 
>  *  VF passthrough does not work
> Passing through a normal NIC seem to work but passing through a VF
> doesn't.
> The device appears in the guest but it cannot exchange packets, the
> guest kernel version doesn't seem to matter.
> From the qemu logs:
> 
> pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already                 function.
> 
> It might be the same problem of the two following bug reports:
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1709
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1742
> 
> 
>  *  Xen panic on guest shutdown with PCI Passthrough
> http://xen.1045712.n5.nabble.com/Xen-panic-on-guest-shutdown-with-PCI-Passthrough-tt3371337.html#none
> When the guest with a passthrough pci device is shut down, Xen panic
> on a NMI - MEMORY ERROR.
> (XEN) Xen call trace:
> (XEN)    [<ffff82c48015f032>] msi_set_mask_bit+0xea/0x121
> (XEN)    [<ffff82c48015f087>] mask_msi_irq+0xe/0x10
> (XEN)    [<ffff82c480162fb1>] __pirq_guest_unbind+0x298/0x2aa
> (XEN)    [<ffff82c48016315c>] unmap_domain_pirq+0x199/0x307
> (XEN)    [<ffff82c480163321>] free_domain_pirqs+0x57/0x83
> (XEN)    [<ffff82c48015642a>] arch_domain_destroy+0x30/0x2e3
> (XEN)    [<ffff82c480104c59>] complete_domain_destroy+0x6e/0x12a
> (XEN)    [<ffff82c48012adfd>] rcu_process_callbacks+0x173/0x1e1
> (XEN)    [<ffff82c480123327>] __do_softirq+0x88/0x99
> (XEN)    [<ffff82c4801233a2>] do_softirq+0x6a/0x7a
> 
> 
>  *  possible Xen pirq leak at domain shutdown
> If a guest doesn't support pci hot-unplug (or a malicious guest), it
> won't do anything in response to the acpi SCI interrupt we send when the
> domain is destroyed, therefore unregister_real_device will never be
> called in qemu and we might be leaking MSIs in the Xen (to be
> verified).
> http://xen.1045712.n5.nabble.com/template/NamlServlet.jtp?macro=print_post&node=3369367
> 
> 
>  *  Xen warns about MSIs when assigning a PCI device to a guest
> also known as "Xen complains msi error when startup"
> At startup Xen prints multiple:
> (XEN) Xen WARN at msi.c:635
> (XEN) Xen WARN at msi.c:648
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1732
> 
> 
>  *  PCI hot-unplug causes a guest crash
> also know as "fail to detach NIC from guest"
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1736
> 
> 
>  * multiple PCI devices passthrough to PV guests or HVM guests with
>    stubdoms is broken with the xl toolstack
> Cannot assign >1 PCI passthrough devices as domain creation time because
> libxl creates a bus for the first device and increments "num_devs" node
> in xenstore for each subsequent device but pciback cannot cope with
> num_devs changing while the guest is not running to respond to the
> reconfiguration request. A fix would be to create the entire bus in a
> single cold-plug operation at start of day.
> 
> 
> 
>  - Stefano
> 


* Re: Xen 4.1 rc5 outstanding bugs
  2011-03-08 15:43 ` Konrad Rzeszutek Wilk
@ 2011-03-08 17:13   ` Stefano Stabellini
  2011-03-09 17:10     ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 10+ messages in thread
From: Stefano Stabellini @ 2011-03-08 17:13 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel@lists.xensource.com, Stefano Stabellini

On Tue, 8 Mar 2011, Konrad Rzeszutek Wilk wrote:
> On Fri, Feb 18, 2011 at 06:59:55PM +0000, Stefano Stabellini wrote:
> > Hi all,
> > I went through the list of bugs affecting the latest Xen 4.1 RC and
> > I made a list of what they seem to be the most serious.
> 
> What is the status of these? I know you made some strides in fixing
> most if not all of them both in the hypervisor and Linux kernel.
> 

We have fixed most of them; see below.


> > All of them affect PCI passthrough and seem to be hypervisor/qemu-xen
> > bugs apart from the last one that is a libxenlight/xl bug.
> > 
> > 
> >  *  VF passthrough does not work
> > Passing through a normal NIC seem to work but passing through a VF
> > doesn't.
> > The device appears in the guest but it cannot exchange packets, the
> > guest kernel version doesn't seem to matter.
> > From the qemu logs:
> > 
> > pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already                 function.
> > 
> > It might be the same problem of the two following bug reports:
> > http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1709
> > http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1742

This one is a Linux kernel bug and is fixed by "set current_state to D0
in register_slot".


> >  *  Xen panic on guest shutdown with PCI Passthrough
> > http://xen.1045712.n5.nabble.com/Xen-panic-on-guest-shutdown-with-PCI-Passthrough-tt3371337.html#none
> > When the guest with a passthrough pci device is shut down, Xen panic
> > on a NMI - MEMORY ERROR.
> > (XEN) Xen call trace:
> > (XEN)    [<ffff82c48015f032>] msi_set_mask_bit+0xea/0x121
> > (XEN)    [<ffff82c48015f087>] mask_msi_irq+0xe/0x10
> > (XEN)    [<ffff82c480162fb1>] __pirq_guest_unbind+0x298/0x2aa
> > (XEN)    [<ffff82c48016315c>] unmap_domain_pirq+0x199/0x307
> > (XEN)    [<ffff82c480163321>] free_domain_pirqs+0x57/0x83
> > (XEN)    [<ffff82c48015642a>] arch_domain_destroy+0x30/0x2e3
> > (XEN)    [<ffff82c480104c59>] complete_domain_destroy+0x6e/0x12a
> > (XEN)    [<ffff82c48012adfd>] rcu_process_callbacks+0x173/0x1e1
> > (XEN)    [<ffff82c480123327>] __do_softirq+0x88/0x99
> > (XEN)    [<ffff82c4801233a2>] do_softirq+0x6a/0x7a
> > 

This is a Xen bug (it should cope with NMIs instead of crashing) and is
fixed by "NMI: continue in case of PCI SERR erros".
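
The change in behaviour can be pictured with a toy decision function. This does not reproduce the patch: the function `nmi_action` and the `tolerate_serr` flag are hypothetical, and the bit value assumes the conventional PC NMI status port (0x61) layout, in which bit 7 reports a system error (SERR# or memory parity).

```python
# Assumed layout of the NMI status byte read from port 0x61.
NMI_REASON_SERR = 0x80  # bit 7: system error (PCI SERR# / memory parity)


def nmi_action(reason, tolerate_serr):
    """Decide what a hypervisor might do for a given NMI reason byte."""
    if reason & NMI_REASON_SERR:
        # Previously this case was fatal; tolerating it means logging the
        # event and carrying on instead of bringing the host down.
        return "log-and-continue" if tolerate_serr else "panic"
    return "other"


if __name__ == "__main__":
    print(nmi_action(0x80, tolerate_serr=False))
    print(nmi_action(0x80, tolerate_serr=True))
```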


> >  *  possible Xen pirq leak at domain shutdown
> > If a guest doesn't support pci hot-unplug (or a malicious guest), it
> > won't do anything in response to the acpi SCI interrupt we send when the
> > domain is destroyed, therefore unregister_real_device will never be
> > called in qemu and we might be leaking MSIs in the Xen (to be
> > verified).
> > http://xen.1045712.n5.nabble.com/template/NamlServlet.jtp?macro=print_post&node=3369367

I think we have verified that there is no pirq leak.


> >  *  Xen warns about MSIs when assigning a PCI device to a guest
> > also known as "Xen complains msi error when startup"
> > At startup Xen prints multiple:
> > (XEN) Xen WARN at msi.c:635
> > (XEN) Xen WARN at msi.c:648
> > http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1732

The warning is still there.


> >  *  PCI hot-unplug causes a guest crash
> > also know as "fail to detach NIC from guest"
> > http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1736

I cannot reproduce this bug; I think someone at Intel is working on it.


> >  * multiple PCI devices passthrough to PV guests or HVM guests with
> >    stubdoms is broken with the xl toolstack
> > Cannot assign >1 PCI passthrough devices as domain creation time because
> > libxl creates a bus for the first device and increments "num_devs" node
> > in xenstore for each subsequent device but pciback cannot cope with
> > num_devs changing while the guest is not running to respond to the
> > reconfiguration request. A fix would be to create the entire bus in a
> > single cold-plug operation at start of day.

The problem still persists with stubdoms, but IanJ fixed the PV
passthrough bug; see "libxl: Multi-device passthrough coldplug: do not
wait for unstarted guests".


* Re: Xen 4.1 rc5 outstanding bugs
  2011-03-08 17:13   ` Stefano Stabellini
@ 2011-03-09 17:10     ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-03-09 17:10 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel@lists.xensource.com

On Tue, Mar 08, 2011 at 05:13:46PM +0000, Stefano Stabellini wrote:
> On Tue, 8 Mar 2011, Konrad Rzeszutek Wilk wrote:
> > On Fri, Feb 18, 2011 at 06:59:55PM +0000, Stefano Stabellini wrote:
> > > Hi all,
> > > I went through the list of bugs affecting the latest Xen 4.1 RC and
> > > I made a list of what they seem to be the most serious.
> > 
> > What is the status of these? I know you made some strides in fixing
> > most if not all of them both in the hypervisor and Linux kernel.
> > 
> 
> we have fixed most of them, see below

Indeed!

.. snip..
> > >  *  PCI hot-unplug causes a guest crash
> > > also know as "fail to detach NIC from guest"
> > > http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1736
> 
> I cannot repro this bug, I think someone at Intel is working on it.

Ok, asked the Intel submitter to see if your assumptions are correct.
