Re: [PATCH] Improve clocksource unstable warning

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: john stultz <johnstul@us.ibm.com>
To: Andrew Lutomirski <luto@mit.edu>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org, pc@us.ibm.com
Subject: Re: [PATCH] Improve clocksource unstable warning
Date: Fri, 12 Nov 2010 15:40:21 -0800	[thread overview]
Message-ID: <1289605221.3292.53.camel@localhost.localdomain> (raw)
In-Reply-To: <AANLkTimAfULHTkyLVpGv5r3DcSfVXzsgGiHgTdamNpt2@mail.gmail.com>

On Fri, 2010-11-12 at 16:52 -0500, Andrew Lutomirski wrote:
> On Fri, Nov 12, 2010 at 4:31 PM, john stultz <johnstul@us.ibm.com> wrote:
> > Ideas:
> > 1) Maybe should we check that we get two sequential failures where the
> > cpu seems fast before we throw out the TSC? This will still fall over
> > in some stall cases (ie: a poor rt task hogging the cpu  for 10
> > minutes, pausing for a 10th of a second and then continuing to hog the
> > cpu).
> >
> > 2) We could look at the TSC delta, and if it appears outside the order
> > of 2-10x faster (i don't think any cpus scale up even close to 10x in
> > freq, but please correct me if so), then assume we just have been
> > blocked from running and don't throw out the TSC.
> >
> > 3) Similar to #2 we could look at the max interval that the watchdog
> > clocksource provides, and if the TSC delta is greater then that, avoid
> > throwing things out. This combined with #2 might narrow out the false
> > positives fairly well.
> >
> > Any additional thoughts here?
> 
> Yes.  As far as I know, the watchdog doesn't give arbitrary values
> when it wraps; it just wraps.  Here's a possible heuristic, in
> pseudocode:
> 
> wd_now_1 = (read watchdog)
> cs_now = (read clocksource)
> 
> cs_elapsed = cs_now - cs_last;
> wd_elapsed = wd_now_1 - wd_last;
> 
> if ( abs(wd_elapsed - cs_elapsed) < MAX_DELTA)
>   return;  // We're OK.
> 
> wd_now_2 = (read watchdog again)
> if (abs(wd_now_1 - wd_now_2) > MAX_DELTA / 2)
>   bail;  // The clocksource might be unstable, but we either just
> lagged or the watchdog is unstable, and in either case we don't gain
> anything by marking the clocksource unstable.

This is more easily done by just bounding the clocksource read:
wd_now_1 = watchdog->read()
cs_now = clocksource->read()
wd_now_2 = watchdog->read()

if (((wd_now_2 - wd_now_1)&watchdog->mask) > SOMETHING_SMALL)
	bail; // hit an SMI or some sort of long preemption

> if ( wd_elapsed < cs_elapsed and ( (cs_elapsed - wd_elapsed) %
> wd_wrapping_time ) < (something fairly small) )
>   bail;  // The watchdog most likely wrapped.

Huh. The modulo bit may need tweaking as its not immediately clear its
right. Maybe the following is clearer?:

if ((cs_elapsed > wd_wrapping_time) 
	&& (abs((cs_elapsed % wd_wrapping_time)-wd_elapsed) < MAX_DELTA)
	// should be ok.


> The idea is that there are three obvious causes for false positives.
> 
> 1. Something caused the watchdog itself to lag.  In this case, the
> second watchdog read will differ from the first one by a lot and we'll
> bail.
> 
> 2. The TSC wrapped in the time between two watchdog checks.  I don't
> think that this is very likely on x86 CPUs.  (Tsn't the TSC always 64
> bits?  By the time that a 20GHz CPU wraps a 64-bit TSC, you're
> probably already replaced the computer.)

Yea, current TSCs won't wrap in our lifetime. And even if it ever does
get close, it will have to wrap between watchdog checks to be an issue
(since twos-compliment math makes the subtraction work even if it does
roll over).


> 3. The watchdog wrapped in the time between two watchdog checks.  In
> this case, unless something else went wrong simultaneously, we'll
> detect it.
> 
> On the other hand, the chance that the TSC was unstable by almost
> exactly a multiple of the watchdog wrapping time is pretty unlikely.
> 
> 
> If we're using a non-TSC clocksource, then you could get false
> positives if that clocksource wraps, but maybe the watchdog is
> unnecessary in that case.

Yea. Some similar extra modulo conditions could probably handle that
too. But again, the only clocksource that needs the watchdog is the TSC.

thanks
-john

next prev parent reply	other threads:[~2010-11-12 23:40 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-11-10 22:16 [PATCH] Improve clocksource unstable warning Andy Lutomirski
2010-11-10 22:28 ` Thomas Gleixner
2010-11-10 22:42   ` [PATCH v2] " Andy Lutomirski
2010-11-12 21:31 ` [PATCH] " john stultz
2010-11-12 21:51   ` john stultz
2010-11-12 21:52   ` Andrew Lutomirski
2010-11-12 23:40     ` john stultz [this message]
2010-11-12 23:48       ` Andrew Lutomirski
2010-11-12 23:51         ` Andrew Lutomirski
2010-11-13  0:22           ` john stultz
2010-11-13  0:58             ` john stultz
2010-11-17  0:05               ` Andrew Lutomirski
2010-11-17  0:26                 ` john stultz
2010-11-17  0:54                   ` Andrew Lutomirski
2010-11-17  1:19                     ` john stultz
2010-11-17  1:24                       ` Andrew Lutomirski
2010-11-17  1:54                         ` john stultz
2010-11-12 23:52         ` john stultz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1289605221.3292.53.camel@localhost.localdomain \
    --to=johnstul@us.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@mit.edu \
    --cc=pc@us.ibm.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox