* [PATCH 1/3] locking/csd_lock: Print large numbers as negatives
2024-07-13 16:56 [PATCH 0/3] CSD-lock diagnostics enhancements Neeraj Upadhyay
@ 2024-07-13 16:58 ` neeraj.upadhyay
2024-07-13 17:14 ` Rik van Riel
2024-07-13 16:58 ` [PATCH 2/3] locking/csd_lock: Provide an indication of ongoing CSD-lock stall neeraj.upadhyay
2024-07-13 16:58 ` [PATCH 3/3] locking/csd-lock: Use backoff for repeated reports of same incident neeraj.upadhyay
2 siblings, 1 reply; 8+ messages in thread
From: neeraj.upadhyay @ 2024-07-13 16:58 UTC (permalink / raw)
To: linux-kernel
Cc: rcu, kernel-team, rostedt, mingo, peterz, paulmck, leobras,
imran.f.khan, riel, neeraj.upadhyay
From: "Paul E. McKenney" <paulmck@kernel.org>
The CSD-lock-hold diagnostics from CONFIG_CSD_LOCK_WAIT_DEBUG are
printed in nanoseconds as unsigned long longs, which is a bit obtuse for
human readers when timing bugs result in negative CSD-lock hold times.
Yes, there are some people to whom it is immediately obvious that
18446744073709551615 is really -1, but for the rest of us...
Therefore, print these numbers as signed long longs, making the negative
hold times immediately apparent.
Reported-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Imran Khan <imran.f.khan@oracle.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Leonardo Bras <leobras@redhat.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
kernel/smp.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/smp.c b/kernel/smp.c
index f085ebcdf9e7..81f7083a53e2 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -248,8 +248,8 @@ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, in
cpu_cur_csd = smp_load_acquire(&per_cpu(cur_csd, cpux)); /* Before func and info. */
/* How long since this CSD lock was stuck. */
ts_delta = ts2 - ts0;
- pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %llu ns for CPU#%02d %pS(%ps).\n",
- firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), ts_delta,
+ pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %lld ns for CPU#%02d %pS(%ps).\n",
+ firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), (s64)ts_delta,
cpu, csd->func, csd->info);
/*
* If the CSD lock is still stuck after 5 minutes, it is unlikely
--
2.40.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 1/3] locking/csd_lock: Print large numbers as negatives
2024-07-13 16:58 ` [PATCH 1/3] locking/csd_lock: Print large numbers as negatives neeraj.upadhyay
@ 2024-07-13 17:14 ` Rik van Riel
0 siblings, 0 replies; 8+ messages in thread
From: Rik van Riel @ 2024-07-13 17:14 UTC (permalink / raw)
To: neeraj.upadhyay, linux-kernel
Cc: rcu, kernel-team, rostedt, mingo, peterz, paulmck, leobras,
imran.f.khan
On Sat, 2024-07-13 at 22:28 +0530, neeraj.upadhyay@kernel.org wrote:
> From: "Paul E. McKenney" <paulmck@kernel.org>
>
> The CSD-lock-hold diagnostics from CONFIG_CSD_LOCK_WAIT_DEBUG are
> printed in nanoseconds as unsigned long longs, which is a bit obtuse
> for human readers when timing bugs result in negative CSD-lock hold
> times.
> Yes, there are some people to whom it is immediately obvious that
> 18446744073709551615 is really -1, but for the rest of us...
>
To clarify the report a little bit: it appears that, on some CPU
models, occasionally sched_clock() values jump backward, on the
same CPU.
Looking at the number of systems where this happened over time,
leaving out the exact numbers, the distribution looks something
like this:
- 1 day: N systems
- 3 days: 3N systems
- 1 week: 7N systems
- 1 month: 26N systems
- 90 days: 72N systems
This does not appear to be a case of a few systems with bad
hardware, where it happens constantly to the same systems, but
something that many systems experience occasionally, and then
not again for months.
The systems in question advertise CONSTANT_TSC, NONSTOP_TSC,
and generally seem to have stable, nonstop, monotonic TSC
values, but sometimes the values go back in time a little bit.
The cycles_2_ns data does not appear to change during the
episodes of sched_clock() going backward.
The csd_lock code is not the only thing that breaks when the
sched_clock values go backward, but it seems to be the best tool
we have right now to detect it.
I don't know whether adding more detection of this issue would
increase the number of systems where backwards sched_clock is
observed.
Many of the systems with backwards-going TSC values seem to
encounter a bunch of them across some time period, end up
getting rebooted, and then behave well for months afterward.
> Reported-by: Rik van Riel <riel@surriel.com>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> Cc: Imran Khan <imran.f.khan@oracle.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Leonardo Bras <leobras@redhat.com>
> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> Cc: Rik van Riel <riel@surriel.com>
> Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
>
Reviewed-by: Rik van Riel <riel@surriel.com>
--
All Rights Reversed.
* [PATCH 2/3] locking/csd_lock: Provide an indication of ongoing CSD-lock stall
2024-07-13 16:56 [PATCH 0/3] CSD-lock diagnostics enhancements Neeraj Upadhyay
2024-07-13 16:58 ` [PATCH 1/3] locking/csd_lock: Print large numbers as negatives neeraj.upadhyay
@ 2024-07-13 16:58 ` neeraj.upadhyay
2024-07-13 17:16 ` Rik van Riel
2024-07-13 16:58 ` [PATCH 3/3] locking/csd-lock: Use backoff for repeated reports of same incident neeraj.upadhyay
2 siblings, 1 reply; 8+ messages in thread
From: neeraj.upadhyay @ 2024-07-13 16:58 UTC (permalink / raw)
To: linux-kernel
Cc: rcu, kernel-team, rostedt, mingo, peterz, paulmck, leobras,
imran.f.khan, riel, neeraj.upadhyay
From: "Paul E. McKenney" <paulmck@kernel.org>
If a CSD-lock stall goes on long enough, it will cause an RCU CPU
stall warning. This additional warning adds considerable console-log
traffic while providing little additional information. Therefore,
provide a new csd_lock_is_stuck() function that returns true if there
is an ongoing CSD-lock stall. This function will be used by the RCU
CPU stall warnings to provide a one-line indication of the stall when
this function returns true.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Imran Khan <imran.f.khan@oracle.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Leonardo Bras <leobras@redhat.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
include/linux/smp.h | 6 ++++++
kernel/smp.c | 16 ++++++++++++++++
2 files changed, 22 insertions(+)
diff --git a/include/linux/smp.h b/include/linux/smp.h
index fcd61dfe2af3..3871bd32018f 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -294,4 +294,10 @@ int smpcfd_prepare_cpu(unsigned int cpu);
int smpcfd_dead_cpu(unsigned int cpu);
int smpcfd_dying_cpu(unsigned int cpu);
+#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
+bool csd_lock_is_stuck(void);
+#else
+static inline bool csd_lock_is_stuck(void) { return false; }
+#endif
+
#endif /* __LINUX_SMP_H */
diff --git a/kernel/smp.c b/kernel/smp.c
index 81f7083a53e2..c3e8241e9cbb 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -207,6 +207,19 @@ static int csd_lock_wait_getcpu(call_single_data_t *csd)
return -1;
}
+static atomic_t n_csd_lock_stuck;
+
+/**
+ * csd_lock_is_stuck - Has a CSD-lock acquisition been stuck too long?
+ *
+ * Returns @true if a CSD-lock acquisition is stuck and has been stuck
+ * long enough for a "non-responsive CSD lock" message to be printed.
+ */
+bool csd_lock_is_stuck(void)
+{
+ return !!atomic_read(&n_csd_lock_stuck);
+}
+
/*
* Complain if too much time spent waiting. Note that only
* the CSD_TYPE_SYNC/ASYNC types provide the destination CPU,
@@ -228,6 +241,7 @@ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, in
cpu = csd_lock_wait_getcpu(csd);
pr_alert("csd: CSD lock (#%d) got unstuck on CPU#%02d, CPU#%02d released the lock.\n",
*bug_id, raw_smp_processor_id(), cpu);
+ atomic_dec(&n_csd_lock_stuck);
return true;
}
@@ -251,6 +265,8 @@ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, in
pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %lld ns for CPU#%02d %pS(%ps).\n",
firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), (s64)ts_delta,
cpu, csd->func, csd->info);
+ if (firsttime)
+ atomic_dec(&n_csd_lock_stuck);
/*
* If the CSD lock is still stuck after 5 minutes, it is unlikely
* to become unstuck. Use a signed comparison to avoid triggering
--
2.40.1
* Re: [PATCH 2/3] locking/csd_lock: Provide an indication of ongoing CSD-lock stall
2024-07-13 16:58 ` [PATCH 2/3] locking/csd_lock: Provide an indication of ongoing CSD-lock stall neeraj.upadhyay
@ 2024-07-13 17:16 ` Rik van Riel
2024-07-13 19:32 ` Paul E. McKenney
0 siblings, 1 reply; 8+ messages in thread
From: Rik van Riel @ 2024-07-13 17:16 UTC (permalink / raw)
To: neeraj.upadhyay, linux-kernel
Cc: rcu, kernel-team, rostedt, mingo, peterz, paulmck, leobras,
imran.f.khan
On Sat, 2024-07-13 at 22:28 +0530, neeraj.upadhyay@kernel.org wrote:
>
> @@ -228,6 +241,7 @@ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, in
>  		cpu = csd_lock_wait_getcpu(csd);
>  		pr_alert("csd: CSD lock (#%d) got unstuck on CPU#%02d, CPU#%02d released the lock.\n",
>  			 *bug_id, raw_smp_processor_id(), cpu);
> +		atomic_dec(&n_csd_lock_stuck);
>  		return true;
>  	}
>
So we decrement it when it gets unstuck. Good.
> @@ -251,6 +265,8 @@ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, in
>  	pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %lld ns for CPU#%02d %pS(%ps).\n",
>  		 firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), (s64)ts_delta,
>  		 cpu, csd->func, csd->info);
> +	if (firsttime)
> +		atomic_dec(&n_csd_lock_stuck);
>
However, I don't see any place where it is incremented when things
get stuck, and this line decrements it when a CPU gets stuck for
the first time?
Should this be an atomic_inc?
--
All Rights Reversed.
* Re: [PATCH 2/3] locking/csd_lock: Provide an indication of ongoing CSD-lock stall
2024-07-13 17:16 ` Rik van Riel
@ 2024-07-13 19:32 ` Paul E. McKenney
0 siblings, 0 replies; 8+ messages in thread
From: Paul E. McKenney @ 2024-07-13 19:32 UTC (permalink / raw)
To: Rik van Riel
Cc: neeraj.upadhyay, linux-kernel, rcu, kernel-team, rostedt, mingo,
peterz, leobras, imran.f.khan
On Sat, Jul 13, 2024 at 01:16:47PM -0400, Rik van Riel wrote:
> On Sat, 2024-07-13 at 22:28 +0530, neeraj.upadhyay@kernel.org wrote:
> >
> > @@ -228,6 +241,7 @@ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, in
> >  		cpu = csd_lock_wait_getcpu(csd);
> >  		pr_alert("csd: CSD lock (#%d) got unstuck on CPU#%02d, CPU#%02d released the lock.\n",
> >  			 *bug_id, raw_smp_processor_id(), cpu);
> > +		atomic_dec(&n_csd_lock_stuck);
> >  		return true;
> >  	}
>
> So we decrement it when it gets unstuck. Good.
>
> > @@ -251,6 +265,8 @@ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, in
> >  	pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %lld ns for CPU#%02d %pS(%ps).\n",
> >  		 firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), (s64)ts_delta,
> >  		 cpu, csd->func, csd->info);
> > +	if (firsttime)
> > +		atomic_dec(&n_csd_lock_stuck);
> >
>
> However, I don't see any place where it is incremented when things
> get stuck, and this line decrements it when a CPU gets stuck for
> the first time?
>
> Should this be an atomic_inc?
Good catch, thank you! I will go get that brown paper bag...
Thanx, Paul
* [PATCH 3/3] locking/csd-lock: Use backoff for repeated reports of same incident
2024-07-13 16:56 [PATCH 0/3] CSD-lock diagnostics enhancements Neeraj Upadhyay
2024-07-13 16:58 ` [PATCH 1/3] locking/csd_lock: Print large numbers as negatives neeraj.upadhyay
2024-07-13 16:58 ` [PATCH 2/3] locking/csd_lock: Provide an indication of ongoing CSD-lock stall neeraj.upadhyay
@ 2024-07-13 16:58 ` neeraj.upadhyay
2024-07-14 0:59 ` Rik van Riel
2 siblings, 1 reply; 8+ messages in thread
From: neeraj.upadhyay @ 2024-07-13 16:58 UTC (permalink / raw)
To: linux-kernel
Cc: rcu, kernel-team, rostedt, mingo, peterz, paulmck, leobras,
imran.f.khan, riel, neeraj.upadhyay
From: "Paul E. McKenney" <paulmck@kernel.org>
Currently, the CSD-lock diagnostics in CONFIG_CSD_LOCK_WAIT_DEBUG=y
kernels are emitted at five-second intervals. Although this has proven
to be a good time interval for the first diagnostic, if the target CPU
keeps interrupts disabled for way longer than five seconds, the ratio
of useful new information to pointless repetition decreases considerably.
Therefore, back off the time period for repeated reports of the same
incident, increasing linearly with the number of reports and
logarithmically with the number of online CPUs.
[ paulmck: Apply Dan Carpenter feedback. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Imran Khan <imran.f.khan@oracle.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Leonardo Bras <leobras@redhat.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
kernel/smp.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/kernel/smp.c b/kernel/smp.c
index c3e8241e9cbb..80c1173ce369 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -225,7 +225,7 @@ bool csd_lock_is_stuck(void)
* the CSD_TYPE_SYNC/ASYNC types provide the destination CPU,
* so waiting on other types gets much less information.
*/
-static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id)
+static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id, unsigned long *nmessages)
{
int cpu = -1;
int cpux;
@@ -248,7 +248,9 @@ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, in
ts2 = sched_clock();
/* How long since we last checked for a stuck CSD lock.*/
ts_delta = ts2 - *ts1;
- if (likely(ts_delta <= csd_lock_timeout_ns || csd_lock_timeout_ns == 0))
+ if (likely(ts_delta <= csd_lock_timeout_ns * (*nmessages + 1) *
+ (!*nmessages ? 1 : (ilog2(num_online_cpus()) / 2 + 1)) ||
+ csd_lock_timeout_ns == 0))
return false;
firsttime = !*bug_id;
@@ -265,6 +267,7 @@ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, in
pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %lld ns for CPU#%02d %pS(%ps).\n",
firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), (s64)ts_delta,
cpu, csd->func, csd->info);
+ (*nmessages)++;
if (firsttime)
atomic_dec(&n_csd_lock_stuck);
/*
@@ -305,12 +308,13 @@ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, in
*/
static void __csd_lock_wait(call_single_data_t *csd)
{
+ unsigned long nmessages = 0;
int bug_id = 0;
u64 ts0, ts1;
ts1 = ts0 = sched_clock();
for (;;) {
- if (csd_lock_wait_toolong(csd, ts0, &ts1, &bug_id))
+ if (csd_lock_wait_toolong(csd, ts0, &ts1, &bug_id, &nmessages))
break;
cpu_relax();
}
--
2.40.1
* Re: [PATCH 3/3] locking/csd-lock: Use backoff for repeated reports of same incident
2024-07-13 16:58 ` [PATCH 3/3] locking/csd-lock: Use backoff for repeated reports of same incident neeraj.upadhyay
@ 2024-07-14 0:59 ` Rik van Riel
0 siblings, 0 replies; 8+ messages in thread
From: Rik van Riel @ 2024-07-14 0:59 UTC (permalink / raw)
To: neeraj.upadhyay, linux-kernel
Cc: rcu, kernel-team, rostedt, mingo, peterz, paulmck, leobras,
imran.f.khan
On Sat, 2024-07-13 at 22:28 +0530, neeraj.upadhyay@kernel.org wrote:
> From: "Paul E. McKenney" <paulmck@kernel.org>
>
> Currently, the CSD-lock diagnostics in CONFIG_CSD_LOCK_WAIT_DEBUG=y
> kernels are emitted at five-second intervals. Although this has proven
> to be a good time interval for the first diagnostic, if the target CPU
> keeps interrupts disabled for way longer than five seconds, the ratio
> of useful new information to pointless repetition decreases considerably.
>
> Therefore, back off the time period for repeated reports of the same
> incident, increasing linearly with the number of reports and
> logarithmically with the number of online CPUs.
>
> [ paulmck: Apply Dan Carpenter feedback. ]
>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> Cc: Imran Khan <imran.f.khan@oracle.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Leonardo Bras <leobras@redhat.com>
> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> Cc: Rik van Riel <riel@surriel.com>
> Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
Reviewed-by: Rik van Riel <riel@surriel.com>
--
All Rights Reversed.