public inbox for linux-kernel@vger.kernel.org
From: Feng Tang <feng.tang@intel.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Chris Bainbridge <chris.bainbridge@gmail.com>,
	<tglx@linutronix.de>, <sboyd@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: PROBLEM: skew message does not handle negative ns skew
Date: Thu, 8 Jun 2023 14:29:30 +0800
Message-ID: <ZIF1SsbmR2GHzQ//@feng-clx>
In-Reply-To: <86521835-f13f-4d43-9a38-9a55abae0b89@paulmck-laptop>

On Wed, Jun 07, 2023 at 12:04:49PM -0700, Paul E. McKenney wrote:
> On Tue, Jun 06, 2023 at 09:52:11PM +0800, Feng Tang wrote:
> > On Tue, Jun 06, 2023 at 02:09:08PM +0100, Chris Bainbridge wrote:
> > > On Tue, 6 Jun 2023 at 13:50, Feng Tang <feng.tang@intel.com> wrote:
> > > >
> > > > And I have no idea if there is a real hardware/firmware issue
> > > > or just a false alarm.
> > > 
> > > Is a negative reported skew valid? I don't know, I had assumed so, so
> > > the problem was the conversion from -878159 ns to 18446744073708 ms.
> > 
> > I think it's valid. The related code is from kernel/time/clocksource.c: 
> > 
> > 	"
> > 	cs_wd_msec = div_u64_rem(cs_nsec - wd_nsec, 1000U * 1000U, &wd_rem);
> > 	wd_msec = div_u64_rem(wd_nsec, 1000U * 1000U, &wd_rem);
> > 	pr_warn("                      Clocksource '%s' skewed %lld ns (%lld ms) over watchdog '%s' interval of %lld ns (%lld ms)\n",
> > 		cs->name, cs_nsec - wd_nsec, cs_wd_msec, watchdog->name, wd_nsec, wd_msec);
> > 	"
> > 
> > The negative value just means the watchdog ran faster than the
> > TSC over the 512 ms checking interval. The 18446744073708 ms comes
> > from converting the s64 ns value (-878159) to a u64 ns value, and
> > then to u64 ms.
> 
> That is a bit user-unfriendly.  Does the following fix address this
> issue at your end?
> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> commit 8eb836f2dd44cb1e80dfc603cf47c03603dadcdb
> Author: Paul E. McKenney <paulmck@kernel.org>
> Date:   Wed Jun 7 11:59:49 2023 -0700
> 
>     clocksource: Handle negative skews in "skew is too large" messages
>     
>     The nanosecond-to-millisecond skew computation uses unsigned arithmetic,
>     which produces user-unfriendly large positive numbers for negative skews.
>     Therefore, use signed arithmetic for this computation in order to preserve
>     the negativity.

It does make the error message more consistent and less confusing. Thanks.

Reviewed-by: Feng Tang <feng.tang@intel.com>

>     
>     Reported-by: Chris Bainbridge <chris.bainbridge@gmail.com>
>     Reported-by: Feng Tang <feng.tang@intel.com>
>     Fixes: dd029269947a ("clocksource: Improve "skew is too large" messages")
>     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> 
> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index 91836b727cef..0600e16dbafe 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -473,8 +473,8 @@ static void clocksource_watchdog(struct timer_list *unused)
>  		/* Check the deviation from the watchdog clocksource. */
>  		md = cs->uncertainty_margin + watchdog->uncertainty_margin;
>  		if (abs(cs_nsec - wd_nsec) > md) {
> -			u64 cs_wd_msec;
> -			u64 wd_msec;
> +			s64 cs_wd_msec;
> +			s64 wd_msec;
>  			u32 wd_rem;
>  
>  			pr_warn("timekeeping watchdog on CPU%d: Marking clocksource '%s' as unstable because the skew is too large:\n",
> @@ -483,8 +483,8 @@ static void clocksource_watchdog(struct timer_list *unused)
>  				watchdog->name, wd_nsec, wdnow, wdlast, watchdog->mask);
>  			pr_warn("                      '%s' cs_nsec: %lld cs_now: %llx cs_last: %llx mask: %llx\n",
>  				cs->name, cs_nsec, csnow, cslast, cs->mask);
> -			cs_wd_msec = div_u64_rem(cs_nsec - wd_nsec, 1000U * 1000U, &wd_rem);
> -			wd_msec = div_u64_rem(wd_nsec, 1000U * 1000U, &wd_rem);
> +			cs_wd_msec = div_s64_rem(cs_nsec - wd_nsec, 1000 * 1000, &wd_rem);
> +			wd_msec = div_s64_rem(wd_nsec, 1000 * 1000, &wd_rem);
>  			pr_warn("                      Clocksource '%s' skewed %lld ns (%lld ms) over watchdog '%s' interval of %lld ns (%lld ms)\n",
>  				cs->name, cs_nsec - wd_nsec, cs_wd_msec, watchdog->name, wd_nsec, wd_msec);
>  			if (curr_clocksource == cs)
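
For reference, the conversion discussed above can be reproduced in user
space. The following is only a minimal sketch, not the kernel code: the
helper names skew_ms_unsigned() and skew_ms_signed() are hypothetical,
plain C division stands in for div_u64_rem()/div_s64_rem(), and the
interval values used below are made up so that the difference matches the
reported -878159 ns.

```c
#include <stdint.h>

#define NSEC_PER_MSEC 1000000

/*
 * Old behaviour (assumed equivalent): the signed ns delta is
 * reinterpreted as an unsigned 64-bit value before the division,
 * so a negative skew wraps around to a huge positive number.
 */
static uint64_t skew_ms_unsigned(int64_t cs_nsec, int64_t wd_nsec)
{
	return (uint64_t)(cs_nsec - wd_nsec) / NSEC_PER_MSEC;
}

/*
 * New behaviour (assumed equivalent): signed division, which in C
 * truncates toward zero and therefore keeps the sign of the skew.
 */
static int64_t skew_ms_signed(int64_t cs_nsec, int64_t wd_nsec)
{
	return (cs_nsec - wd_nsec) / NSEC_PER_MSEC;
}
```

With a hypothetical cs_nsec of 511121841 and wd_nsec of 512000000 (a
difference of -878159 ns), skew_ms_unsigned() returns 18446744073708,
which is the value from the report, while skew_ms_signed() returns 0
because the skew is smaller than one millisecond. A skew of -3500000 ns
would give -3 with the signed version.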
