From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Tim Deegan <tim@xen.org>
Cc: "Keir (Xen.org)" <keir@xen.org>, Jan Beulich <JBeulich@suse.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH] x86/watchdog: Use real timestamps for watchdog timeout
Date: Fri, 24 May 2013 11:03:25 +0100
Message-ID: <519F3AED.2090209@citrix.com>
In-Reply-To: <20130524093712.GA54769@ocelot.phlegethon.org>

On 24/05/13 10:37, Tim Deegan wrote:
> At 21:32 +0100 on 23 May (1369344726), Andrew Cooper wrote:
>> Do not assume that we will only receive interrupts at a rate of nmi_hz.  On a
>> test system being debugged, I observed a PCI SERR being continuously asserted
>> without the SERR bit being set.  The result was Xen "exceeding" a 300 second
>> timeout within 1 second.
> Sounds like the CPU is indeed stuck, and the watchdog has just optimized
> away the 5 minutes of back-to-back NMIs. :)
>
> Handling this case is nice, but I wonder whether this patch ought to
> detect and report ludicrous NMI rates rather than silently ignoring
> them.  I guess that's hard to do in an NMI handler, other than by
> adjusting the printk when we crash.
>
> Tim.

Actually I suspect the system was livelocked with PCI SERRs being issued
from a PCIe switch.  I only have second granularity on the serial
console, but can confirm that cpu0 was perfectly alive and well within
the same second as the watchdog supposedly expiring.
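
To illustrate why the count-based check can fire early, here is a minimal
sketch. It is not the code from the patch or from Xen; NMI_HZ,
TIMEOUT_SECONDS and both function names are hypothetical, and the nominal
rate of one NMI per second is an assumption made only for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values, for illustration only. */
#define NMI_HZ           1u    /* assumed nominal watchdog NMI rate */
#define TIMEOUT_SECONDS  300u  /* the 300 second timeout mentioned above */

/*
 * Count-based check: each NMI is assumed to represent 1/NMI_HZ seconds,
 * so a burst of unrelated NMIs (e.g. from a PCI SERR) inflates the count.
 */
static bool expired_by_count(uint64_t nmis_since_progress)
{
    return nmis_since_progress >= (uint64_t)TIMEOUT_SECONDS * NMI_HZ;
}

/*
 * Timestamp-based check: compare real elapsed time in nanoseconds,
 * independent of how many NMIs were delivered in the interval.
 */
static bool expired_by_time(uint64_t now_ns, uint64_t last_progress_ns)
{
    return now_ns - last_progress_ns >=
           (uint64_t)TIMEOUT_SECONDS * 1000000000ull;
}
```

With 300 NMIs delivered inside one second, expired_by_count(300) is true
while expired_by_time(1000000000, 0) is false, which matches the early
"timeout" seen on the test system.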

I was considering trying to work around a ludicrous rate of interrupts,
but decided to go for the easier patch first.

~Andrew
