From: Neeraj Upadhyay <Neeraj.Upadhyay@kernel.org>
To: riel@surriel.com
Cc: "Paul E. McKenney" <paulmck@kernel.org>,
linux-kernel@vger.kernel.org, kernel-team@meta.com,
neeraj.upadhyay@kernel.org, mingo@kernel.org,
rostedt@goodmis.org, Leonardo Bras <leobras@redhat.com>
Subject: Re: [PATCH] smp: print only local CPU info when sched_clock goes backward
Date: Wed, 24 Jul 2024 22:49:44 +0530
Message-ID: <20240724171944.GA811274@neeraj.linux>
In-Reply-To: <88d281fe-d101-47d9-b70e-bb6a8959f5ff@paulmck-laptop>
On Mon, Jul 15, 2024 at 11:07:30AM -0700, Paul E. McKenney wrote:
> On Mon, Jul 15, 2024 at 01:49:41PM -0400, Rik van Riel wrote:
> > About 40% of all csd_lock warnings observed in our fleet appear to
> > be due to sched_clock() going backward in time (usually only a little
> > bit), resulting in ts0 being larger than ts2.
> >
> > When the local CPU is at fault, we should print out a message reflecting
> > that, rather than trying to get the remote CPU's stack trace.
> >
> > Signed-off-by: Rik van Riel <riel@surriel.com>
>
> Tested-by: Paul E. McKenney <paulmck@kernel.org>
>
I have included this patch in the CSD-lock diagnostics series, which
has been submitted for review and is planned for v6.12 [1]. I have also
included it in the RCU tree [2] for more testing.
[1] https://lore.kernel.org/lkml/20240722133559.GA667117@neeraj.linux/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/neeraj.upadhyay/linux-rcu.git/log/?h=next
- Neeraj
> > ---
> > kernel/smp.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/kernel/smp.c b/kernel/smp.c
> > index f085ebcdf9e7..5656ef63ea82 100644
> > --- a/kernel/smp.c
> > +++ b/kernel/smp.c
> > @@ -237,6 +237,14 @@ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, in
> >  	if (likely(ts_delta <= csd_lock_timeout_ns || csd_lock_timeout_ns == 0))
> >  		return false;
> >
> > +	if (ts0 > ts2) {
> > +		/* Our own sched_clock went backward; don't blame another CPU. */
> > +		ts_delta = ts0 - ts2;
> > +		pr_alert("sched_clock on CPU %d went backward by %llu ns\n", raw_smp_processor_id(), ts_delta);
> > +		*ts1 = ts2;
> > +		return false;
> > +	}
> > +
> >  	firsttime = !*bug_id;
> >  	if (firsttime)
> >  		*bug_id = atomic_inc_return(&csd_bug_count);
> > --
> > 2.45.2
> >
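
For readers following along, here is a minimal userspace sketch of why a
backward clock step can look like a timeout. It assumes the elapsed time
is computed with unsigned 64-bit subtraction (the u64 parameters in the
hunk header suggest this, but the full function is not quoted here). The
names wait_too_long() and TIMEOUT_NS, and the timeout value, are
hypothetical and not taken from kernel/smp.c.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical timeout for illustration only; not the kernel's value. */
#define TIMEOUT_NS UINT64_C(5000000000)

/*
 * ts0: time the wait started, ts1: time of the last check, ts2: now.
 * Returns true only when the caller should go on to blame a remote CPU.
 */
static bool wait_too_long(uint64_t ts0, uint64_t ts1, uint64_t ts2)
{
	/* Unsigned subtraction wraps to a huge value if ts2 < ts1. */
	uint64_t ts_delta = ts2 - ts1;

	if (ts_delta <= TIMEOUT_NS)
		return false;

	if (ts0 > ts2) {
		/* The local clock stepped backward; report that instead. */
		printf("clock went backward by %" PRIu64 " ns\n", ts0 - ts2);
		return false;
	}

	return true;
}
```

With ts0 = ts1 = 1000 and ts2 = 900, the wrapped delta is far above the
timeout, so without the ts0 > ts2 check the function would return true
and the caller would go looking for a stuck remote CPU.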
Thread overview (6+ messages):
2024-07-15 17:49 [PATCH] smp: print only local CPU info when sched_clock goes backward Rik van Riel
2024-07-15 18:07 ` Paul E. McKenney
2024-07-24 17:19 ` Neeraj Upadhyay [this message]
2024-07-16 9:04 ` Peter Zijlstra
2024-07-16 13:10 ` Rik van Riel
2024-07-16 13:47 ` Paul E. McKenney