From: "Thimo E." <abc@digithi.de>
To: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Keir Fraser <keir@xen.org>, Jan Beulich <JBeulich@suse.com>,
	"Dong, Eddie" <eddie.dong@intel.com>,
	Xen-develList <xen-devel@lists.xen.org>,
	"Nakajima, Jun" <jun.nakajima@intel.com>,
	"Zhang, Yang Z" <yang.z.zhang@intel.com>,
	"Zhang, Xiantao" <xiantao.zhang@intel.com>
Subject: Re: cpuidle and un-eoid interrupts at the local apic
Date: Mon, 19 Aug 2013 17:14:38 +0200
Message-ID: <5212365E.7010803@digithi.de>
In-Reply-To: <5208CF6B.7030505@citrix.com>


Hello,

After one week of testing, here is an intermediate result:

Since I set iommu=no-intremap, no crash has occurred. The server has 
never run this long without a crash. So a careful "it's working", but 
because only 7 days have passed so far, not a final hooray.

Even if this option really avoids the problem, I classify it as nothing 
more than a workaround. Obviously a good one, because it's working, but 
still a workaround.

Where could the source of the problem be? A bug in hardware? A bug in 
software?

And what does interrupt remapping really do? Does disabling remapping 
have a performance impact?

Best regards
   Thimo

On 12.08.2013 14:04, Andrew Cooper wrote:
> On 12/08/13 12:52, Thimo E wrote:
>> Hello Yang,
>>
>> attached you'll find the kernel dmesg, xen dmesg, lspci and output of 
>> /proc/interrupts. If you want to see further logfiles, please let me 
>> know.
>>
>> The processor is a Core i5-4670. The board is an Intel DH87MC 
>> mainboard. I am really not sure if it supports APICv, but VT-d is 
>> supported and enabled.
>>
>>
>>> 4. The status of IRQ 29 is 10, which means the guest has already issued 
>>> the EOI because the IRQ_GUEST_EOI_PENDING bit is cleared, so there 
>>> should be no pending EOI in the EOI stack. If possible, can you add 
>>> some debug messages in the guest EOI code path (like 
>>> _irq_guest_eoi()) to track the EOI?
>>>
>> I don't see the IRQ29 in /proc/interrupts, what I see is:
>> cat xen-dmesg.txt | grep "29":
>>   (XEN) allocated vector 29 for irq 20
>> cat dmesg.txt | grep "eth0":
>>   [   23.152355] e1000e 0000:00:19.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
>>   [   23.330408] e1000e 0000:00:19.0: eth0: Intel(R) PRO/1000 Network Connection
>>
>> So is the Ethernet IRQ the bad one? That is an onboard Intel network 
>> adapter.
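
For reference, the vector-to-device lookup done by hand with grep above can
be sketched as a small Python script. This is only an illustration: the two
regular expressions are modeled on the log lines quoted above, the function
names are made up for this example, and other Xen or kernel versions may
word these messages differently.

```python
import re

# Assumption: patterns mirror the two quoted log lines; they are not taken
# from any Xen or kernel documentation.
XEN_VECTOR_RE = re.compile(r"\(XEN\) allocated vector (\S+) for irq (\d+)")
KERNEL_IRQ_RE = re.compile(
    r"(\S+) (\S+): PCI INT (\w) -> GSI (\d+) .*-> IRQ (\d+)"
)


def vector_to_irq(xen_dmesg):
    """Map each vector (kept as the string printed in the log) to its irq."""
    return {vec: int(irq) for vec, irq in XEN_VECTOR_RE.findall(xen_dmesg)}


def irq_to_device(kernel_dmesg):
    """Map each dom0 IRQ number to a (driver, PCI address) pair."""
    return {
        int(irq): (driver, address)
        for driver, address, _pin, _gsi, irq in KERNEL_IRQ_RE.findall(kernel_dmesg)
    }


def device_for_vector(vector, xen_dmesg, kernel_dmesg):
    """Return (driver, PCI address) for a vector, or None if unknown."""
    irq = vector_to_irq(xen_dmesg).get(vector)
    if irq is None:
        return None
    return irq_to_device(kernel_dmesg).get(irq)


if __name__ == "__main__":
    # Hypothetical file names, matching the ones used with grep above.
    with open("xen-dmesg.txt") as xen_log, open("dmesg.txt") as kernel_log:
        print(device_for_vector("29", xen_log.read(), kernel_log.read()))
```

With the two lines quoted above as input, vector 29 resolves to irq 20 and
from there to the e1000e device at 0000:00:19.0.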
>
> That would be consistent with the crash seen with our hardware in 
> XenServer
>
>>
>>> 6. I guess interrupt remapping is enabled on your machine. Can 
>>> you try to disable IR to see whether it is still reproducible?
>>>
>> Just to be sure, your proposal is to try the parameter "no-intremap"?
>
> specifically, iommu=no-intremap
>
>>
>> Best regards
>>   Thimo
>
> ~Andrew
>
>>
>> On 12.08.2013 10:49, Zhang, Yang Z wrote:
>>>
>>> Hi Thimo,
>>>
>>> Your previous experience and logs show:
>>>
>>> 1. The interrupt that triggers the issue is an MSI.
>>>
>>> 2. MSIs are normally treated as edge-triggered interrupts, except when 
>>> there is no way to mask the device. In this case, your previous log 
>>> indicates the device is unmaskable. (What special device are you 
>>> using? A modern PCI device should be maskable.)
>>>
>>> 3. IRQ 29 belongs to dom0, so it seems this is not an HVM-related issue.
>>>
>>> 4. The status of IRQ 29 is 10, which means the guest has already issued 
>>> the EOI because the IRQ_GUEST_EOI_PENDING bit is cleared, so there 
>>> should be no pending EOI in the EOI stack. If possible, can you add 
>>> some debug messages in the guest EOI code path (like 
>>> _irq_guest_eoi()) to track the EOI?
>>>
>>> 5. Both logs show that when the issue occurred, most of the other 
>>> interrupts owned by dom0 were in IRQ_MOVE_PENDING status. Is 
>>> it a coincidence? Or does it happen only under a special condition, like 
>>> heavy IRQ migration? Perhaps you can disable IRQ balancing in dom0 
>>> and pin the IRQs manually.
>>>
>>> 6. I guess interrupt remapping is enabled on your machine. Can 
>>> you try to disable IR to see whether it is still reproducible?
>>>
>>> Also, please provide the whole Xen log.
>>>
>>> Best regards,
>>>
>>> Yang
>>>
>>
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel



