xen-devel.lists.xenproject.org archive mirror
From: "Thimo E." <abc@digithi.de>
To: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Keir Fraser <keir@xen.org>, Jan Beulich <JBeulich@suse.com>,
	"Dong, Eddie" <eddie.dong@intel.com>,
	Xen-develList <xen-devel@lists.xen.org>,
	"Nakajima, Jun" <jun.nakajima@intel.com>,
	"Zhang, Yang Z" <yang.z.zhang@intel.com>,
	"Zhang, Xiantao" <xiantao.zhang@intel.com>
Subject: Re: cpuidle and un-eoid interrupts at the local apic
Date: Wed, 04 Sep 2013 21:56:40 +0200
Message-ID: <52279078.3030701@digithi.de>
In-Reply-To: <5227821A.9090201@citrix.com>

Hello Andrew,

thanks for your response. At least I have already seen the trigger of the new
crash (2e) before, so they seem to belong together.

I can't imagine that I am the only one in the world who is using a Haswell
board. And as I haven't seen any other Xen bug/crash reports
like mine (and, once, yours) nor bug reports from users of other
operating systems, I ask myself whether only my hardware is buggy
or whether other operating systems handle those "spurious" interrupts
differently?

What does "ioapic_ack=old" change?

Best regards
   Thimo

On 04.09.2013 20:55, Andrew Cooper wrote:
> On 04/09/13 19:32, Thimo E. wrote:
>> Hello again,
>>
>> for the last two weeks there was no crash with dom0_vcpus_pin and
>> dom0 restricted to 1 cpu. But yesterday it crashed again, so I changed
>> the command line again to:
>>
>> iommu=no-intremap noirqbalance com1=115200,8n1,0xe050,0
>> console=com1,vga mem=1024G dom0_max_vcpus=4 dom0_mem=752M,max:752M
>> watchdog_timeout=300 lowmem_emergency_pool=1M crashkernel=64M@32M
>> cpuid_mask_xsave_eax=0
>>
>> Today the server crashed again and produced a lot of debugging
>> messages, see attached. The "..." in the log files means that the
>> message above the dots was repeated many times.
>>
>> My summary so far:
>> - With only 1 cpu attached to dom0 the server was stable for 2 weeks;
>> the crash there did not really show any irq problems, see
>> crash20130903.txt
>>     You can find Andrew's ideas on this in
>> http://forums.citrix.com/thread.jspa?messageID=1760771#1760771
>> - With more than 1 cpu and irqbalance the server produced the crashes
>> I've already posted before
>> - Without irqbalance it crashed with some different output, see
>> crash20130904.txt
>>
>> Next step is to change the network card.
>>
>> Zhang, any update from your side? Or do the others have any ideas?
>> Could "ioapic_ack=old" help somewhere?
>>
>> Best regards
>>    Thimo
>>
> Ok - the second attachment (crash20130903.txt) is the one I have triaged
> before, and the crash is impossible given the expected code flow through
> the function.
>
> %r14 is calculated as the per-cpu cpu_info, which cannot possibly be
> -1 at the point of the fault.  The only explanation is that the
> pagefault is a result of a spurious jump to this location.
>
> From a quick glance at the other crash, vector 2e was the problematic
> one (iirc).  The "Bad vmexit (reason 3)" at the top would suggest that
> something on the system has sent an INIT to pcpu 2, which seems antisocial.
>
> As we have identified that the hardware is delivering invalid
> interrupts, I wouldn't necessarily read any more into this new crash;
> something is very broken in the hardware.
>
> I would be interested for any update from Intel regarding the ISR violation.
>
> ~Andrew
