Re: [PATCH] x86/watchdog: Use real timestamps for watchdog timeout

xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed

From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Tim Deegan <tim@xen.org>
Cc: "Keir (Xen.org)" <keir@xen.org>, Jan Beulich <JBeulich@suse.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH] x86/watchdog: Use real timestamps for watchdog timeout
Date: Fri, 24 May 2013 13:48:47 +0100	[thread overview]
Message-ID: <519F61AF.2070203@citrix.com> (raw)
In-Reply-To: <20130524124158.GC54769@ocelot.phlegethon.org>

On 24/05/13 13:41, Tim Deegan wrote:
> At 12:36 +0100 on 24 May (1369398982), Jan Beulich wrote:
>>>>> On 24.05.13 at 12:13, Tim Deegan <tim@xen.org> wrote:
>>>>> Handling this case it nice, but I wonder whether this patch ought to
>>>>> detect and report ludicrous NMI rates rather than silently ignoring
>>>>> them.  I guess that's hard to do in an NMI handler, other than by
>>>>> adjusting the printk when we crash.
>>> Actually on second thoughts it's easier: as well as having this patch
>>> (or near equivalent) to avoid premature watchdog expiry, we cna detect
>>> the NMI rate in, say, the timer softirq and report if it's gone mad.
>> I may not get how you imagine this to work, but why would you
>> assume the timer softirq gets any chance to run anymore with an
>> absurd NMI rate (or at least get run enough times to prevent the
>> watchdog to fire)?
> In that case, the watchdog will fire, if we can add something to the
> watchdog printk if the NMI count is mugh higher than (timeout * nmi_hz).
> I was worrying about the case where there are lots of NMIs but not so
> many that the system livelocks entirely.  It would be nice to have some
> non-fatal warning that something is wrong.  A one-time (or at least very
> limited) printk in the softirq would do that for not much extra cost.
>
> At 12:42 +0100 on 24 May (1369399344), Jan Beulich wrote:
>>>>> On 24.05.13 at 12:33, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>> On 24/05/13 11:13, Tim Deegan wrote:
>>>> At 10:57 +0100 on 24 May (1369393060), Andrew Cooper wrote:
>>>>> On 24/05/13 08:09, Jan Beulich wrote:
>>>>>> You can't use NOW() here - while the time updating code is safe
>>>>>> against normal interrupts, it's not atomic wrt NMIs.
>>>>> But NMIs are latched at the hardware level.  If we get a nested NMI the
>>>>> Xen will be toast on the exit path anyway.
>>>> The problem is that an NMI can arrive while local_time_calibration() is
>>>> writing its results, so calling NOW() in the NMI handler might return
>>>> garbage. 
>>> Aah - I see.  Sorry - I misunderstood the original point.
>>>
>>> Yes - that is an issue.
>>>
>>> Two solutions come to mind.
>>>
>>> 1) Along with the local_irq_disable()/enable() pairs in
>>> local_time_calibration, having an atomic_t indicating "time data update
>>> in progress", allowing the NMI handler to decide to bail early.
>>>
>>> 2) Modify local_time_calibration() to fill in a shadow cpu_time set, and
>>> a different atomic_t to indicate which one is consistent.  This would
>>> allow the NMI handler to always use one consistent set of timing
>>> information.
> Of those two, I prefer (1), just because it doesn't add any cost to the
> normal users of NOW().
>
> Using TSC to gate the actual watchdog crash might get a bit messy,
> especially if it ends up adding code to the users of write_tsc().  But
> all we need is to double-check the existing count -- I think it would be
> enough to say "the watchdog fired but it looks like it was caused by an
> NMI storm" and crash anyway.  (At least assuming timeout * nmi_hz is a a
> largish number.)  And nmi_watchdog_tick() can just check regs->eip as a
> hint not to trust the scale factors. :)
>
> Tim.

I was not planning to make any requirement to change users of NOW(). 
The only consumers which need to care about the shadow set or "update in
progress" flag would be the NMI and MCE handlers, being the only ones
which can interrupt the current structure update.  The shadow set can be
neatly hidden away from anyone who doesn't wish to know.

~Andrew

next prev parent reply	other threads:[~2013-05-24 12:48 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-23 20:32 [PATCH] x86/watchdog: Use real timestamps for watchdog timeout Andrew Cooper
2013-05-24  7:09 ` Jan Beulich
2013-05-24  9:57   ` Andrew Cooper
2013-05-24 10:13     ` Tim Deegan
2013-05-24 10:33       ` Andrew Cooper
2013-05-24 11:42         ` Jan Beulich
2013-05-24 12:00           ` Andrew Cooper
2013-05-24 13:11             ` Jan Beulich
2013-05-24 11:36       ` Jan Beulich
2013-05-24 12:41         ` Tim Deegan
2013-05-24 12:48           ` Andrew Cooper [this message]
2013-05-24 13:55             ` Tim Deegan
2013-05-24 14:29               ` Andrew Cooper
2013-05-24 17:10                 ` Tim Deegan
2013-05-24 17:27                   ` Andrew Cooper
2013-05-24 13:17           ` Jan Beulich
2013-05-24 14:01             ` Tim Deegan
2013-05-24  9:37 ` Tim Deegan
2013-05-24 10:03   ` Andrew Cooper

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=519F61AF.2070203@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=keir@xen.org \
    --cc=tim@xen.org \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).