From: Andrew Cooper <andrew.cooper3@citrix.com>
To: xen-devel@lists.xensource.com
Subject: Re: Re: Still struggling with HVM: tx timeouts on emulated nics
Date: Tue, 4 Oct 2011 11:07:15 +0100 [thread overview]
Message-ID: <4E8ADAD3.9050404@citrix.com> (raw)
In-Reply-To: <alpine.DEB.2.00.1110031905200.3519@kaball-desktop>
On 03/10/11 19:13, Stefano Stabellini wrote:
> CC'ing Jan, that probably is going to have an opinion on this.
>
> Let me add a bit of background: Stefan found out that PV on HVM guests
> could loose level interrupts coming from emulated devices. Looking
> through the code I realized that we need to add some logic to inject a
> pirq in the guest if a level interrupt has been raised while the guest
> is servicing the first one.
> While this is all very specific to interrupt remapping and emulated
> devices, I realized that something similar could happen even with dom0
> or other PV guests with PCI passthrough:
>
> 1) the device raises a level interrupt and xen injects it into the
> guest;
>
> 2) the guest is temporarely stuck: it does not ack it or eoi it;
>
> 3) the xen timer kicks in and eois the interrupt;
>
> 4) the device thinks it is all fine and sends a second interrupt;
>
> 5) Xen fails to inject the second interrupt into the guest because the
> guest has still the event channel pending bit set;
>
> at this point the guest looses the second interrupt notification, that
> is not supposed to happen with level interrupts and I think it might
> cause problems with some devices.
>
> Jan, do you think we should try to handle this case, or is it too
> unlikely?
I am not certain whether this is relevant, but the ICH10 IO-APIC
documentation indicated that early EOI'ing of a line level interrupt
should not have this effect. Specifically, it states that EOI'ing a
line level interrupt whos line is still asserted will cause the
interrupt to be "re-raised" from the IO-APIC. It uses this to assert
that it is fine to use multiple IO-APIC entries with the same vector,
with a broadcast of vector number alone to EOI the interrupt.
In this case, while Xen sees two interrupts, from the devices point of
view, only I has happened.
In the case where the device has dropped its line level interrupt of its
own accord, then I would agree that the current Xen behavior is wrong.
I cant offhand think of a good reason why this would occur.
I know it is not helpful in this case, but as a rule of thumb, line
level interrupts should not be used with Xen. The average response time
on an unloaded system is ~30ms, ranging from 5 to 150. (On a sample set
of a Dell R710, Xen 4.1.0 and 2.6.32 dom0, over 2 weeks of debugging
another line level interrupt bug).
~Andrew
> In any case we need to handle the PV on HVM remapping bug, that because
> of the way interrupts are emulated is much more likely to happen...
>
>
> On Mon, 3 Oct 2011, Stefano Stabellini wrote:
>> On Fri, 30 Sep 2011, Stefan Bader wrote:
>>> Also I did not completely remove the section that would return the status
>>> without setting needsEOI. I just changed the if condition to be <0 instead of
>>> <=0 (I knew from the tests that the mapping was always 0 and maybe the <0 check
>>> could be useful for something.
>>>
>>> irq_status_query.flags = 0;
>>> if ( is_hvm_domain(v->domain) &&
>>> domain_pirq_to_irq(v->domain, irq) < 0 )
>>> {
>>> ret = copy_to_guest(arg, &irq_status_query, 1) ? -EFAULT : 0;
>>> break;
>>> }
>>>
>> You need to remove the entire test because we want to receive
>> notifications in all cases.
>>
>>
>>> With that a quick test shows both the re-sends done sometimes and the domU doing
>>> EOIs. And there is no stall apparent. Did the same quick test with the e1000
>>> emulated NIC and that still seems ok. Those were not very thorough tests but at
>>> least I would have observed a stall pretty quick otherwise.
>> I am glad it fixes the problem for you.
>>
>> I am going to send a different patch upstream for Xen 4.2, because I
>> would also like it to cover the very unlikely scenario in which a PV
>> guest (like dom0 or a PV guest with PCI passthrough) is loosing level
>> interrupts because when Xen tries to set the corresponding event channel
>> pending the bit is alreay set. The codebase is different enough that
>> making the same change on 4.1 is non-trivial. I am appending the new
>> patch to this email, it would be great if you could test it. You just
>> need a 4.2 hypervisor, not the entire system. You should be able to
>> perform the test updating only xen.gz.
>> If you have trouble if xen-unstable.hg tip, try changeset 23843.
>>
>> ---
>>
>>
>> diff -r bf533533046c xen/arch/x86/hvm/irq.c
>> --- a/xen/arch/x86/hvm/irq.c Fri Sep 30 14:12:35 2011 +0000
>> +++ b/xen/arch/x86/hvm/irq.c Mon Oct 03 16:54:51 2011 +0000
>> @@ -36,7 +36,8 @@ static void assert_gsi(struct domain *d,
>>
>> if ( hvm_domain_use_pirq(d, pirq) )
>> {
>> - send_guest_pirq(d, pirq);
>> + if ( send_guest_pirq(d, pirq) && ioapic_gsi >= NR_ISAIRQS )
>> + pirq->lost++;
>> return;
>> }
>> vioapic_irq_positive_edge(d, ioapic_gsi);
>> @@ -63,6 +64,7 @@ static void __hvm_pci_intx_assert(
>> {
>> struct hvm_irq *hvm_irq = &d->arch.hvm_domain.irq;
>> unsigned int gsi, link, isa_irq;
>> + struct pirq *pirq;
>>
>> ASSERT((device <= 31) && (intx <= 3));
>>
>> @@ -72,6 +74,11 @@ static void __hvm_pci_intx_assert(
>> gsi = hvm_pci_intx_gsi(device, intx);
>> if ( hvm_irq->gsi_assert_count[gsi]++ == 0 )
>> assert_gsi(d, gsi);
>> + else {
>> + pirq = pirq_info(d, domain_emuirq_to_pirq(d, gsi));
>> + if ( hvm_domain_use_pirq(d, pirq) )
>> + pirq->lost++;
>> + }
>>
>> link = hvm_pci_intx_link(device, intx);
>> isa_irq = hvm_irq->pci_link.route[link];
>> diff -r bf533533046c xen/arch/x86/irq.c
>> --- a/xen/arch/x86/irq.c Fri Sep 30 14:12:35 2011 +0000
>> +++ b/xen/arch/x86/irq.c Mon Oct 03 16:54:51 2011 +0000
>> @@ -965,7 +965,11 @@ static void __do_IRQ_guest(int irq)
>> !test_and_set_bool(pirq->masked) )
>> action->in_flight++;
>> if ( !hvm_do_IRQ_dpci(d, pirq) )
>> - send_guest_pirq(d, pirq);
>> + {
>> + if ( send_guest_pirq(d, pirq) &&
>> + action->ack_type == ACKTYPE_EOI )
>> + pirq->lost++;
>> + }
>> }
>>
>> if ( action->ack_type != ACKTYPE_NONE )
>> diff -r bf533533046c xen/arch/x86/physdev.c
>> --- a/xen/arch/x86/physdev.c Fri Sep 30 14:12:35 2011 +0000
>> +++ b/xen/arch/x86/physdev.c Mon Oct 03 16:54:51 2011 +0000
>> @@ -11,6 +11,7 @@
>> #include <asm/current.h>
>> #include <asm/io_apic.h>
>> #include <asm/msi.h>
>> +#include <asm/hvm/irq.h>
>> #include <asm/hypercall.h>
>> #include <public/xen.h>
>> #include <public/physdev.h>
>> @@ -270,6 +271,10 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
>> if ( !is_hvm_domain(v->domain) ||
>> domain_pirq_to_irq(v->domain, eoi.irq) > 0 )
>> pirq_guest_eoi(pirq);
>> + if ( pirq->lost > 0) {
>> + if ( !send_guest_pirq(v->domain, pirq) )
>> + pirq->lost--;
>> + }
>> spin_unlock(&v->domain->event_lock);
>> ret = 0;
>> break;
>> @@ -328,9 +333,10 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
>> break;
>> irq_status_query.flags = 0;
>> if ( is_hvm_domain(v->domain) &&
>> - domain_pirq_to_irq(v->domain, irq) <= 0 )
>> + domain_pirq_to_irq(v->domain, irq) <= 0 &&
>> + domain_pirq_to_emuirq(v->domain, irq) == IRQ_UNBOUND )
>> {
>> - ret = copy_to_guest(arg, &irq_status_query, 1) ? -EFAULT : 0;
>> + ret = -EINVAL;
>> break;
>> }
>>
>> diff -r bf533533046c xen/include/xen/irq.h
>> --- a/xen/include/xen/irq.h Fri Sep 30 14:12:35 2011 +0000
>> +++ b/xen/include/xen/irq.h Mon Oct 03 16:54:51 2011 +0000
>> @@ -146,6 +146,7 @@ struct pirq {
>> int pirq;
>> u16 evtchn;
>> bool_t masked;
>> + u32 lost;
>> struct rcu_head rcu_head;
>> struct arch_pirq arch;
>> };
>>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
next prev parent reply other threads:[~2011-10-04 10:07 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <4E7B4768.8060103@canonical.com>
2011-09-22 17:44 ` Re: Still struggling with HVM: tx timeouts on emulated nics Stefano Stabellini
2011-09-30 9:13 ` Stefan Bader
2011-09-30 14:09 ` Stefano Stabellini
2011-09-30 16:06 ` Stefan Bader
2011-09-30 17:59 ` Stefan Bader
2011-10-03 17:24 ` Stefano Stabellini
2011-10-03 18:13 ` Stefano Stabellini
2011-10-04 10:07 ` Andrew Cooper [this message]
2011-10-04 14:13 ` Stefano Stabellini
2011-10-05 16:10 ` Stefan Bader
2011-10-06 10:12 ` Stefano Stabellini
2011-10-06 12:16 ` Stefan Bader
2011-10-27 10:37 ` [PATCH] xen: do not loose level interrupt notifications Stefano Stabellini
2011-10-27 11:18 ` Keir Fraser
2011-10-27 11:42 ` Stefano Stabellini
2011-10-27 12:17 ` Keir Fraser
2011-09-21 13:03 Still struggling with HVM: tx timeouts on emulated nics Stefan Bader
2011-09-21 13:31 ` Stefano Stabellini
2011-09-21 15:34 ` Stefan Bader
2011-09-22 10:30 ` Stefano Stabellini
2011-09-22 11:58 ` Stefan Bader
2011-09-22 14:32 ` Stefan Bader
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4E8ADAD3.9050404@citrix.com \
--to=andrew.cooper3@citrix.com \
--cc=xen-devel@lists.xensource.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.