Re: [PATCH] Improve clocksource unstable warning

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: john stultz <johnstul@us.ibm.com>
To: Andrew Lutomirski <luto@mit.edu>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org, pc@us.ibm.com
Subject: Re: [PATCH] Improve clocksource unstable warning
Date: Sat, 13 Nov 2010 00:22:02 +0000	[thread overview]
Message-ID: <1289607722.3292.84.camel@localhost.localdomain> (raw)
In-Reply-To: <AANLkTikd0rstGDDcNdb8u2_H09giaZVxPY1Y5qaiy6_O@mail.gmail.com>

On Fri, 2010-11-12 at 18:51 -0500, Andrew Lutomirski wrote:
> On Fri, Nov 12, 2010 at 6:48 PM, Andrew Lutomirski <luto@mit.edu> wrote:
> > On Fri, Nov 12, 2010 at 6:40 PM, john stultz <johnstul@us.ibm.com> wrote:
> >> On Fri, 2010-11-12 at 16:52 -0500, Andrew Lutomirski wrote:
> >>> On Fri, Nov 12, 2010 at 4:31 PM, john stultz <johnstul@us.ibm.com> wrote:
> >>> > Ideas:
> >>> > 1) Maybe should we check that we get two sequential failures where the
> >>> > cpu seems fast before we throw out the TSC? This will still fall over
> >>> > in some stall cases (ie: a poor rt task hogging the cpu  for 10
> >>> > minutes, pausing for a 10th of a second and then continuing to hog the
> >>> > cpu).
> >>> >
> >>> > 2) We could look at the TSC delta, and if it appears outside the order
> >>> > of 2-10x faster (i don't think any cpus scale up even close to 10x in
> >>> > freq, but please correct me if so), then assume we just have been
> >>> > blocked from running and don't throw out the TSC.
> >>> >
> >>> > 3) Similar to #2 we could look at the max interval that the watchdog
> >>> > clocksource provides, and if the TSC delta is greater then that, avoid
> >>> > throwing things out. This combined with #2 might narrow out the false
> >>> > positives fairly well.
> >>> >
> >>> > Any additional thoughts here?
> >>>
> >>> Yes.  As far as I know, the watchdog doesn't give arbitrary values
> >>> when it wraps; it just wraps.  Here's a possible heuristic, in
> >>> pseudocode:
> >>>
> >>> wd_now_1 = (read watchdog)
> >>> cs_now = (read clocksource)
> >>>
> >>> cs_elapsed = cs_now - cs_last;
> >>> wd_elapsed = wd_now_1 - wd_last;
> >>>
> >>> if ( abs(wd_elapsed - cs_elapsed) < MAX_DELTA)
> >>>   return;  // We're OK.
> >>>
> >>> wd_now_2 = (read watchdog again)
> >>> if (abs(wd_now_1 - wd_now_2) > MAX_DELTA / 2)
> >>>   bail;  // The clocksource might be unstable, but we either just
> >>> lagged or the watchdog is unstable, and in either case we don't gain
> >>> anything by marking the clocksource unstable.
> >>
> >> This is more easily done by just bounding the clocksource read:
> >> wd_now_1 = watchdog->read()
> >> cs_now = clocksource->read()
> >> wd_now_2 = watchdog->read()
> >>
> >> if (((wd_now_2 - wd_now_1)&watchdog->mask) > SOMETHING_SMALL)
> >>        bail; // hit an SMI or some sort of long preemption
> >>
> >>> if ( wd_elapsed < cs_elapsed and ( (cs_elapsed - wd_elapsed) %
> >>> wd_wrapping_time ) < (something fairly small) )
> >>>   bail;  // The watchdog most likely wrapped.
> >>
> >> Huh. The modulo bit may need tweaking as its not immediately clear its
> >> right. Maybe the following is clearer?:
> >>
> >> if ((cs_elapsed > wd_wrapping_time)
> >>        && (abs((cs_elapsed % wd_wrapping_time)-wd_elapsed) < MAX_DELTA)
> >>        // should be ok.
> >
> > I think this is wrong if wd_elapsed is large (which could happen if
> > the real wd time is something like (2 * wd_wrapping_time -
> > MAX_DELTA/4)).
> 
> Also wrong if cs_elapsed is just slightly less than wd_wrapping_time
> but the wd clocksource runs enough faster that it wrapped.

Ok. Good point, that's a problem. Hrmmmm. Too much math for Friday. :)

And your implementation above hits the same issue..  hrm.

I want to say we can get away with it using the same twos-complement
masking trick we use for subtracting around an overflow to calculate
cycle deltas, but I'm not confident it will work when we've moved from
clean bit field masks to nanosecond intervals.

I think I'll have to revisit this on Monday.

Thanks again for pointing out this think-o.
-john

next prev parent reply	other threads:[~2010-11-13  0:22 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-11-10 22:16 [PATCH] Improve clocksource unstable warning Andy Lutomirski
2010-11-10 22:28 ` Thomas Gleixner
2010-11-10 22:42   ` [PATCH v2] " Andy Lutomirski
2010-11-12 21:31 ` [PATCH] " john stultz
2010-11-12 21:51   ` john stultz
2010-11-12 21:52   ` Andrew Lutomirski
2010-11-12 23:40     ` john stultz
2010-11-12 23:48       ` Andrew Lutomirski
2010-11-12 23:51         ` Andrew Lutomirski
2010-11-13  0:22           ` john stultz [this message]
2010-11-13  0:58             ` john stultz
2010-11-17  0:05               ` Andrew Lutomirski
2010-11-17  0:26                 ` john stultz
2010-11-17  0:54                   ` Andrew Lutomirski
2010-11-17  1:19                     ` john stultz
2010-11-17  1:24                       ` Andrew Lutomirski
2010-11-17  1:54                         ` john stultz
2010-11-12 23:52         ` john stultz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1289607722.3292.84.camel@localhost.localdomain \
    --to=johnstul@us.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@mit.edu \
    --cc=pc@us.ibm.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox