public inbox for linux-kernel@vger.kernel.org
* [PATCH v2 1/2] clocksource: Use pr_info() for "Checking clocksource synchronization" message
@ 2025-01-25  1:54 Waiman Long
  2025-01-25  1:54 ` [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus() Waiman Long
  2025-01-27  9:36 ` [tip: timers/urgent] clocksource: Use pr_info() for "Checking clocksource synchronization" message tip-bot2 for Waiman Long
  0 siblings, 2 replies; 10+ messages in thread
From: Waiman Long @ 2025-01-25  1:54 UTC (permalink / raw)
  To: John Stultz, Thomas Gleixner, Stephen Boyd, Feng Tang,
	Paul E. McKenney, Sebastian Andrzej Siewior, Clark Williams,
	Steven Rostedt
  Cc: linux-kernel, linux-rt-devel, Waiman Long

The "Checking clocksource synchronization" message is normally printed
when clocksource_verify_percpu() is called for a given clocksource if
both the CLOCK_SOURCE_UNSTABLE and CLOCK_SOURCE_VERIFY_PERCPU flags
are set. It is an informational message and so pr_info() should be used
instead of pr_warn().

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: John Stultz <jstultz@google.com>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/time/clocksource.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 7304d7cf47f2..77d9566d3aa6 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -382,7 +382,8 @@ void clocksource_verify_percpu(struct clocksource *cs)
 		return;
 	}
 	testcpu = smp_processor_id();
-	pr_warn("Checking clocksource %s synchronization from CPU %d to CPUs %*pbl.\n", cs->name, testcpu, cpumask_pr_args(&cpus_chosen));
+	pr_info("Checking clocksource %s synchronization from CPU %d to CPUs %*pbl.\n",
+		cs->name, testcpu, cpumask_pr_args(&cpus_chosen));
 	for_each_cpu(cpu, &cpus_chosen) {
 		if (cpu == testcpu)
 			continue;
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus()
  2025-01-25  1:54 [PATCH v2 1/2] clocksource: Use pr_info() for "Checking clocksource synchronization" message Waiman Long
@ 2025-01-25  1:54 ` Waiman Long
  2025-01-25  2:11   ` Waiman Long
                     ` (2 more replies)
  2025-01-27  9:36 ` [tip: timers/urgent] clocksource: Use pr_info() for "Checking clocksource synchronization" message tip-bot2 for Waiman Long
  1 sibling, 3 replies; 10+ messages in thread
From: Waiman Long @ 2025-01-25  1:54 UTC (permalink / raw)
  To: John Stultz, Thomas Gleixner, Stephen Boyd, Feng Tang,
	Paul E. McKenney, Sebastian Andrzej Siewior, Clark Williams,
	Steven Rostedt
  Cc: linux-kernel, linux-rt-devel, Waiman Long

The following bug report happened in a PREEMPT_RT kernel.

[   30.957705] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
[   30.957711] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2012, name: kwatchdog
[   30.962673] preempt_count: 1, expected: 0
[   30.962676] RCU nest depth: 0, expected: 0
[   30.962680] 3 locks held by kwatchdog/2012:
[   30.962684]  #0: ffffffff8af2da60 (clocksource_mutex){+.+.}-{3:3}, at: clocksource_watchdog_kthread+0x13/0x50
[   30.967703]  #1: ffffffff8aa8d4d0 (cpu_hotplug_lock){++++}-{0:0}, at: clocksource_verify_percpu.part.0+0x5c/0x330
[   30.972774]  #2: ffff9fe02f5f33e0 ((batched_entropy_u32.lock)){+.+.}-{2:2}, at: get_random_u32+0x4f/0x110
[   30.977827] Preemption disabled at:
[   30.977830] [<ffffffff88c1fe56>] clocksource_verify_percpu.part.0+0x66/0x330
[   30.982837] CPU: 33 PID: 2012 Comm: kwatchdog Not tainted 5.14.0-503.23.1.el9_5.x86_64+rt-debug #1
[   30.982843] Hardware name: HPE ProLiant DL385 Gen10 Plus/ProLiant DL385 Gen10 Plus, BIOS A42 04/29/2021
[   30.982846] Call Trace:
[   30.982850]  <TASK>
[   30.983821]  dump_stack_lvl+0x57/0x81
[   30.983821]  __might_resched.cold+0xf4/0x12f
[   30.983824]  rt_spin_lock+0x4c/0x100
[   30.988833]  get_random_u32+0x4f/0x110
[   30.988833]  clocksource_verify_choose_cpus+0xab/0x1a0
[   30.988833]  clocksource_verify_percpu.part.0+0x6b/0x330
[   30.993894]  __clocksource_watchdog_kthread+0x193/0x1a0
[   30.993898]  clocksource_watchdog_kthread+0x18/0x50
[   30.993898]  kthread+0x114/0x140
[   30.993898]  ret_from_fork+0x2c/0x50
[   31.002864]  </TASK>

It is due to the fact that get_random_u32() is called in
clocksource_verify_choose_cpus() with preemption disabled. If
crng_ready() is true by the time get_random_u32() is called, the
batched_entropy_32 local lock will be acquired. In a PREEMPT_RT kernel,
it is an rtmutex, which cannot be acquired with preemption disabled.

Fix this problem by using the less random get_random_bytes() function,
which does not take any lock. In fact, it has the same randomness as
get_random_u32_below() when crng_ready() is false.

Fixes: 7560c02bdffb ("clocksource: Check per-CPU clock synchronization when marked unstable")
Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/time/clocksource.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 77d9566d3aa6..659c4b79119c 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -340,9 +340,13 @@ static void clocksource_verify_choose_cpus(void)
 	 * and no replacement CPU is selected.  This gracefully handles
 	 * situations where verify_n_cpus is greater than the number of
 	 * CPUs that are currently online.
+	 *
+	 * The get_random_bytes() is used here to avoid taking lock with
+	 * preemption disabled.
 	 */
 	for (i = 1; i < n; i++) {
-		cpu = get_random_u32_below(nr_cpu_ids);
+		get_random_bytes(&cpu, sizeof(cpu));
+		cpu %= nr_cpu_ids;
 		cpu = cpumask_next(cpu - 1, cpu_online_mask);
 		if (cpu >= nr_cpu_ids)
 			cpu = cpumask_first(cpu_online_mask);
-- 
2.47.1



* Re: [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus()
  2025-01-25  1:54 ` [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus() Waiman Long
@ 2025-01-25  2:11   ` Waiman Long
  2025-01-25  4:46     ` Paul E. McKenney
  2025-01-27  9:36   ` [tip: timers/urgent] " tip-bot2 for Waiman Long
  2025-01-29 16:34   ` [PATCH v2 2/2] " Sebastian Andrzej Siewior
  2 siblings, 1 reply; 10+ messages in thread
From: Waiman Long @ 2025-01-25  2:11 UTC (permalink / raw)
  To: John Stultz, Thomas Gleixner, Stephen Boyd, Feng Tang,
	Paul E. McKenney, Sebastian Andrzej Siewior, Clark Williams,
	Steven Rostedt
  Cc: linux-kernel, linux-rt-devel


On 1/24/25 8:54 PM, Waiman Long wrote:
> The following bug report happened in a PREEMPT_RT kernel.
>
> [   30.957705] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
> [   30.957711] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2012, name: kwatchdog
> [   30.962673] preempt_count: 1, expected: 0
> [   30.962676] RCU nest depth: 0, expected: 0
> [   30.962680] 3 locks held by kwatchdog/2012:
> [   30.962684]  #0: ffffffff8af2da60 (clocksource_mutex){+.+.}-{3:3}, at: clocksource_watchdog_kthread+0x13/0x50
> [   30.967703]  #1: ffffffff8aa8d4d0 (cpu_hotplug_lock){++++}-{0:0}, at: clocksource_verify_percpu.part.0+0x5c/0x330
> [   30.972774]  #2: ffff9fe02f5f33e0 ((batched_entropy_u32.lock)){+.+.}-{2:2}, at: get_random_u32+0x4f/0x110
> [   30.977827] Preemption disabled at:
> [   30.977830] [<ffffffff88c1fe56>] clocksource_verify_percpu.part.0+0x66/0x330
> [   30.982837] CPU: 33 PID: 2012 Comm: kwatchdog Not tainted 5.14.0-503.23.1.el9_5.x86_64+rt-debug #1
> [   30.982843] Hardware name: HPE ProLiant DL385 Gen10 Plus/ProLiant DL385 Gen10 Plus, BIOS A42 04/29/2021
> [   30.982846] Call Trace:
> [   30.982850]  <TASK>
> [   30.983821]  dump_stack_lvl+0x57/0x81
> [   30.983821]  __might_resched.cold+0xf4/0x12f
> [   30.983824]  rt_spin_lock+0x4c/0x100
> [   30.988833]  get_random_u32+0x4f/0x110
> [   30.988833]  clocksource_verify_choose_cpus+0xab/0x1a0
> [   30.988833]  clocksource_verify_percpu.part.0+0x6b/0x330
> [   30.993894]  __clocksource_watchdog_kthread+0x193/0x1a0
> [   30.993898]  clocksource_watchdog_kthread+0x18/0x50
> [   30.993898]  kthread+0x114/0x140
> [   30.993898]  ret_from_fork+0x2c/0x50
> [   31.002864]  </TASK>
>
> It is due to the fact that get_random_u32() is called in
> clocksource_verify_choose_cpus() with preemption disabled.
> If crng_ready() is true by the time get_random_u32() is called, The
> batched_entropy_32 local lock will be acquired. In PREEMPT_RT kernel,
> it is a rtmutex and we can't acquire it with preemption disabled.
>
> Fix this problem by using the less random get_random_bytes() function
> which will not take any lock. In fact, it has the same random-ness as
> get_random_u32_below() when crng_ready() is false.
>
> Fixes: 7560c02bdffb ("clocksource: Check per-CPU clock synchronization when marked unstable")

Oh, I should have added

Suggested-by: Paul E. McKenney <paulmck@kernel.org>

> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>   kernel/time/clocksource.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index 77d9566d3aa6..659c4b79119c 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -340,9 +340,13 @@ static void clocksource_verify_choose_cpus(void)
>   	 * and no replacement CPU is selected.  This gracefully handles
>   	 * situations where verify_n_cpus is greater than the number of
>   	 * CPUs that are currently online.
> +	 *
> +	 * The get_random_bytes() is used here to avoid taking lock with
> +	 * preemption disabled.
>   	 */
>   	for (i = 1; i < n; i++) {
> -		cpu = get_random_u32_below(nr_cpu_ids);
> +		get_random_bytes(&cpu, sizeof(cpu));
> +		cpu %= nr_cpu_ids;
>   		cpu = cpumask_next(cpu - 1, cpu_online_mask);
>   		if (cpu >= nr_cpu_ids)
>   			cpu = cpumask_first(cpu_online_mask);



* Re: [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus()
  2025-01-25  2:11   ` Waiman Long
@ 2025-01-25  4:46     ` Paul E. McKenney
  0 siblings, 0 replies; 10+ messages in thread
From: Paul E. McKenney @ 2025-01-25  4:46 UTC (permalink / raw)
  To: Waiman Long
  Cc: John Stultz, Thomas Gleixner, Stephen Boyd, Feng Tang,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt,
	linux-kernel, linux-rt-devel

On Fri, Jan 24, 2025 at 09:11:27PM -0500, Waiman Long wrote:
> 
> On 1/24/25 8:54 PM, Waiman Long wrote:
> > The following bug report happened in a PREEMPT_RT kernel.
> > 
> > [   30.957705] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
> > [   30.957711] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2012, name: kwatchdog
> > [   30.962673] preempt_count: 1, expected: 0
> > [   30.962676] RCU nest depth: 0, expected: 0
> > [   30.962680] 3 locks held by kwatchdog/2012:
> > [   30.962684]  #0: ffffffff8af2da60 (clocksource_mutex){+.+.}-{3:3}, at: clocksource_watchdog_kthread+0x13/0x50
> > [   30.967703]  #1: ffffffff8aa8d4d0 (cpu_hotplug_lock){++++}-{0:0}, at: clocksource_verify_percpu.part.0+0x5c/0x330
> > [   30.972774]  #2: ffff9fe02f5f33e0 ((batched_entropy_u32.lock)){+.+.}-{2:2}, at: get_random_u32+0x4f/0x110
> > [   30.977827] Preemption disabled at:
> > [   30.977830] [<ffffffff88c1fe56>] clocksource_verify_percpu.part.0+0x66/0x330
> > [   30.982837] CPU: 33 PID: 2012 Comm: kwatchdog Not tainted 5.14.0-503.23.1.el9_5.x86_64+rt-debug #1
> > [   30.982843] Hardware name: HPE ProLiant DL385 Gen10 Plus/ProLiant DL385 Gen10 Plus, BIOS A42 04/29/2021
> > [   30.982846] Call Trace:
> > [   30.982850]  <TASK>
> > [   30.983821]  dump_stack_lvl+0x57/0x81
> > [   30.983821]  __might_resched.cold+0xf4/0x12f
> > [   30.983824]  rt_spin_lock+0x4c/0x100
> > [   30.988833]  get_random_u32+0x4f/0x110
> > [   30.988833]  clocksource_verify_choose_cpus+0xab/0x1a0
> > [   30.988833]  clocksource_verify_percpu.part.0+0x6b/0x330
> > [   30.993894]  __clocksource_watchdog_kthread+0x193/0x1a0
> > [   30.993898]  clocksource_watchdog_kthread+0x18/0x50
> > [   30.993898]  kthread+0x114/0x140
> > [   30.993898]  ret_from_fork+0x2c/0x50
> > [   31.002864]  </TASK>
> > 
> > It is due to the fact that get_random_u32() is called in
> > clocksource_verify_choose_cpus() with preemption disabled.
> > If crng_ready() is true by the time get_random_u32() is called, The
> > batched_entropy_32 local lock will be acquired. In PREEMPT_RT kernel,
> > it is a rtmutex and we can't acquire it with preemption disabled.
> > 
> > Fix this problem by using the less random get_random_bytes() function
> > which will not take any lock. In fact, it has the same random-ness as
> > get_random_u32_below() when crng_ready() is false.
> > 
> > Fixes: 7560c02bdffb ("clocksource: Check per-CPU clock synchronization when marked unstable")
> 
> Oh, I should have added
> 
> Suggested-by: Paul E. McKenney <paulmck@kernel.org>
> 
> > Signed-off-by: Waiman Long <longman@redhat.com>

Works for me!

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

> > ---
> >   kernel/time/clocksource.c | 6 +++++-
> >   1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> > index 77d9566d3aa6..659c4b79119c 100644
> > --- a/kernel/time/clocksource.c
> > +++ b/kernel/time/clocksource.c
> > @@ -340,9 +340,13 @@ static void clocksource_verify_choose_cpus(void)
> >   	 * and no replacement CPU is selected.  This gracefully handles
> >   	 * situations where verify_n_cpus is greater than the number of
> >   	 * CPUs that are currently online.
> > +	 *
> > +	 * The get_random_bytes() is used here to avoid taking lock with
> > +	 * preemption disabled.
> >   	 */
> >   	for (i = 1; i < n; i++) {
> > -		cpu = get_random_u32_below(nr_cpu_ids);
> > +		get_random_bytes(&cpu, sizeof(cpu));
> > +		cpu %= nr_cpu_ids;
> >   		cpu = cpumask_next(cpu - 1, cpu_online_mask);
> >   		if (cpu >= nr_cpu_ids)
> >   			cpu = cpumask_first(cpu_online_mask);
> 


* [tip: timers/urgent] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus()
  2025-01-25  1:54 ` [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus() Waiman Long
  2025-01-25  2:11   ` Waiman Long
@ 2025-01-27  9:36   ` tip-bot2 for Waiman Long
  2025-01-29 16:34   ` [PATCH v2 2/2] " Sebastian Andrzej Siewior
  2 siblings, 0 replies; 10+ messages in thread
From: tip-bot2 for Waiman Long @ 2025-01-27  9:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Waiman Long, Thomas Gleixner, Paul E. McKenney, stable, x86,
	linux-kernel

The following commit has been merged into the timers/urgent branch of tip:

Commit-ID:     01cfc84024e9a6b619696a35d2e5662255001cd0
Gitweb:        https://git.kernel.org/tip/01cfc84024e9a6b619696a35d2e5662255001cd0
Author:        Waiman Long <longman@redhat.com>
AuthorDate:    Fri, 24 Jan 2025 20:54:42 -05:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Mon, 27 Jan 2025 10:30:59 +01:00

clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus()

The following bug report happened in a PREEMPT_RT kernel.

 BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2012, name: kwatchdog
 preempt_count: 1, expected: 0
 RCU nest depth: 0, expected: 0
 3 locks held by kwatchdog/2012:
  #0: ffffffff8af2da60 (clocksource_mutex){+.+.}-{3:3}, at: clocksource_watchdog_kthread+0x13/0x50
  #1: ffffffff8aa8d4d0 (cpu_hotplug_lock){++++}-{0:0}, at: clocksource_verify_percpu.part.0+0x5c/0x330
  #2: ffff9fe02f5f33e0 ((batched_entropy_u32.lock)){+.+.}-{2:2}, at: get_random_u32+0x4f/0x110
 Preemption disabled at:
 [<ffffffff88c1fe56>] clocksource_verify_percpu.part.0+0x66/0x330
 CPU: 33 PID: 2012 Comm: kwatchdog Not tainted 5.14.0-503.23.1.el9_5.x86_64+rt-debug #1
 Call Trace:
  <TASK>
  __might_resched.cold+0xf4/0x12f
  rt_spin_lock+0x4c/0x100
  get_random_u32+0x4f/0x110
  clocksource_verify_choose_cpus+0xab/0x1a0
  clocksource_verify_percpu.part.0+0x6b/0x330
  __clocksource_watchdog_kthread+0x193/0x1a0
  clocksource_watchdog_kthread+0x18/0x50
  kthread+0x114/0x140
  ret_from_fork+0x2c/0x50
  </TASK>

This happens due to the fact that get_random_u32() is called in
clocksource_verify_choose_cpus() with preemption disabled.  If crng_ready()
is true by the time get_random_u32() is called, the batched_entropy_32
local lock will be acquired. In a PREEMPT_RT enabled kernel, it is an
rtmutex, which cannot be acquired with preemption disabled.

Fix this problem by using the less random get_random_bytes() function which
will not take any lock. In fact, it has the same randomness as
get_random_u32_below() when crng_ready() is false.

Fixes: 7560c02bdffb ("clocksource: Check per-CPU clock synchronization when marked unstable")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250125015442.3740588-2-longman@redhat.com
---
 kernel/time/clocksource.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 77d9566..659c4b7 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -340,9 +340,13 @@ static void clocksource_verify_choose_cpus(void)
 	 * and no replacement CPU is selected.  This gracefully handles
 	 * situations where verify_n_cpus is greater than the number of
 	 * CPUs that are currently online.
+	 *
+	 * The get_random_bytes() is used here to avoid taking lock with
+	 * preemption disabled.
 	 */
 	for (i = 1; i < n; i++) {
-		cpu = get_random_u32_below(nr_cpu_ids);
+		get_random_bytes(&cpu, sizeof(cpu));
+		cpu %= nr_cpu_ids;
 		cpu = cpumask_next(cpu - 1, cpu_online_mask);
 		if (cpu >= nr_cpu_ids)
 			cpu = cpumask_first(cpu_online_mask);


* [tip: timers/urgent] clocksource: Use pr_info() for "Checking clocksource synchronization" message
  2025-01-25  1:54 [PATCH v2 1/2] clocksource: Use pr_info() for "Checking clocksource synchronization" message Waiman Long
  2025-01-25  1:54 ` [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus() Waiman Long
@ 2025-01-27  9:36 ` tip-bot2 for Waiman Long
  1 sibling, 0 replies; 10+ messages in thread
From: tip-bot2 for Waiman Long @ 2025-01-27  9:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Waiman Long, Thomas Gleixner, Paul E. McKenney, John Stultz, x86,
	linux-kernel

The following commit has been merged into the timers/urgent branch of tip:

Commit-ID:     1f566840a82982141f94086061927a90e79440e5
Gitweb:        https://git.kernel.org/tip/1f566840a82982141f94086061927a90e79440e5
Author:        Waiman Long <longman@redhat.com>
AuthorDate:    Fri, 24 Jan 2025 20:54:41 -05:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Mon, 27 Jan 2025 10:30:59 +01:00

clocksource: Use pr_info() for "Checking clocksource synchronization" message

The "Checking clocksource synchronization" message is normally printed
when clocksource_verify_percpu() is called for a given clocksource if
both the CLOCK_SOURCE_UNSTABLE and CLOCK_SOURCE_VERIFY_PERCPU flags
are set.

It is an informational message and so pr_info() is the correct choice.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: John Stultz <jstultz@google.com>
Link: https://lore.kernel.org/all/20250125015442.3740588-1-longman@redhat.com

---
 kernel/time/clocksource.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 7304d7c..77d9566 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -382,7 +382,8 @@ void clocksource_verify_percpu(struct clocksource *cs)
 		return;
 	}
 	testcpu = smp_processor_id();
-	pr_warn("Checking clocksource %s synchronization from CPU %d to CPUs %*pbl.\n", cs->name, testcpu, cpumask_pr_args(&cpus_chosen));
+	pr_info("Checking clocksource %s synchronization from CPU %d to CPUs %*pbl.\n",
+		cs->name, testcpu, cpumask_pr_args(&cpus_chosen));
 	for_each_cpu(cpu, &cpus_chosen) {
 		if (cpu == testcpu)
 			continue;


* Re: [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus()
  2025-01-25  1:54 ` [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus() Waiman Long
  2025-01-25  2:11   ` Waiman Long
  2025-01-27  9:36   ` [tip: timers/urgent] " tip-bot2 for Waiman Long
@ 2025-01-29 16:34   ` Sebastian Andrzej Siewior
  2025-01-29 17:03     ` Waiman Long
  2 siblings, 1 reply; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-01-29 16:34 UTC (permalink / raw)
  To: Waiman Long
  Cc: John Stultz, Thomas Gleixner, Stephen Boyd, Feng Tang,
	Paul E. McKenney, Clark Williams, Steven Rostedt, linux-kernel,
	linux-rt-devel

On 2025-01-24 20:54:42 [-0500], Waiman Long wrote:
> The following bug report happened in a PREEMPT_RT kernel.
> 
> [   30.957705] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
> [   30.957711] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2012, name: kwatchdog
> [   30.962673] preempt_count: 1, expected: 0
> [   30.962676] RCU nest depth: 0, expected: 0
> [   30.962680] 3 locks held by kwatchdog/2012:
> [   30.962684]  #0: ffffffff8af2da60 (clocksource_mutex){+.+.}-{3:3}, at: clocksource_watchdog_kthread+0x13/0x50
> [   30.967703]  #1: ffffffff8aa8d4d0 (cpu_hotplug_lock){++++}-{0:0}, at: clocksource_verify_percpu.part.0+0x5c/0x330
> [   30.972774]  #2: ffff9fe02f5f33e0 ((batched_entropy_u32.lock)){+.+.}-{2:2}, at: get_random_u32+0x4f/0x110
> [   30.977827] Preemption disabled at:
> [   30.977830] [<ffffffff88c1fe56>] clocksource_verify_percpu.part.0+0x66/0x330
> [   30.982837] CPU: 33 PID: 2012 Comm: kwatchdog Not tainted 5.14.0-503.23.1.el9_5.x86_64+rt-debug #1
> [   30.982843] Hardware name: HPE ProLiant DL385 Gen10 Plus/ProLiant DL385 Gen10 Plus, BIOS A42 04/29/2021
> [   30.982846] Call Trace:
> [   30.982850]  <TASK>
> [   30.983821]  dump_stack_lvl+0x57/0x81
> [   30.983821]  __might_resched.cold+0xf4/0x12f
> [   30.983824]  rt_spin_lock+0x4c/0x100
> [   30.988833]  get_random_u32+0x4f/0x110
> [   30.988833]  clocksource_verify_choose_cpus+0xab/0x1a0
> [   30.988833]  clocksource_verify_percpu.part.0+0x6b/0x330
> [   30.993894]  __clocksource_watchdog_kthread+0x193/0x1a0
> [   30.993898]  clocksource_watchdog_kthread+0x18/0x50
> [   30.993898]  kthread+0x114/0x140
> [   30.993898]  ret_from_fork+0x2c/0x50
> [   31.002864]  </TASK>
> 
> It is due to the fact that get_random_u32() is called in
> clocksource_verify_choose_cpus() with preemption disabled.
> If crng_ready() is true by the time get_random_u32() is called, The
> batched_entropy_32 local lock will be acquired. In PREEMPT_RT kernel,
> it is a rtmutex and we can't acquire it with preemption disabled.
> 
> Fix this problem by using the less random get_random_bytes() function
> which will not take any lock. In fact, it has the same random-ness as
> get_random_u32_below() when crng_ready() is false.

So how does get_random_bytes() not take any locks? It takes locks in my
tree. Do you two have a lockless tree?

In case your tree is not lockless yet, couldn't we perform the loop
verify_n_cpus + 1 times without preemption disabled? Then disable
preemption after returning from clocksource_verify_choose_cpus(), and
either remove the current CPU from the list if it is present or remove a
random one, so that we get back to a set of verify_n_cpus CPUs.

Alternatively (and this might be easier), use migrate_disable() instead
of preempt_disable(), and only use preempt_disable() within the
for_each_cpu() loop if the delta is important (which I assume it is).

But all this would avoid having to run with disabled preemption within
clocksource_verify_choose_cpus() while having the guarantees you need.

> Fixes: 7560c02bdffb ("clocksource: Check per-CPU clock synchronization when marked unstable")
> Signed-off-by: Waiman Long <longman@redhat.com>

Sebastian


* Re: [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus()
  2025-01-29 16:34   ` [PATCH v2 2/2] " Sebastian Andrzej Siewior
@ 2025-01-29 17:03     ` Waiman Long
  2025-01-29 19:55       ` Thomas Gleixner
  2025-01-29 20:29       ` Sebastian Andrzej Siewior
  0 siblings, 2 replies; 10+ messages in thread
From: Waiman Long @ 2025-01-29 17:03 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: John Stultz, Thomas Gleixner, Stephen Boyd, Feng Tang,
	Paul E. McKenney, Clark Williams, Steven Rostedt, linux-kernel,
	linux-rt-devel

On 1/29/25 11:34 AM, Sebastian Andrzej Siewior wrote:
> On 2025-01-24 20:54:42 [-0500], Waiman Long wrote:
>> The following bug report happened in a PREEMPT_RT kernel.
>>
>> [   30.957705] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
>> [   30.957711] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2012, name: kwatchdog
>> [   30.962673] preempt_count: 1, expected: 0
>> [   30.962676] RCU nest depth: 0, expected: 0
>> [   30.962680] 3 locks held by kwatchdog/2012:
>> [   30.962684]  #0: ffffffff8af2da60 (clocksource_mutex){+.+.}-{3:3}, at: clocksource_watchdog_kthread+0x13/0x50
>> [   30.967703]  #1: ffffffff8aa8d4d0 (cpu_hotplug_lock){++++}-{0:0}, at: clocksource_verify_percpu.part.0+0x5c/0x330
>> [   30.972774]  #2: ffff9fe02f5f33e0 ((batched_entropy_u32.lock)){+.+.}-{2:2}, at: get_random_u32+0x4f/0x110
>> [   30.977827] Preemption disabled at:
>> [   30.977830] [<ffffffff88c1fe56>] clocksource_verify_percpu.part.0+0x66/0x330
>> [   30.982837] CPU: 33 PID: 2012 Comm: kwatchdog Not tainted 5.14.0-503.23.1.el9_5.x86_64+rt-debug #1
>> [   30.982843] Hardware name: HPE ProLiant DL385 Gen10 Plus/ProLiant DL385 Gen10 Plus, BIOS A42 04/29/2021
>> [   30.982846] Call Trace:
>> [   30.982850]  <TASK>
>> [   30.983821]  dump_stack_lvl+0x57/0x81
>> [   30.983821]  __might_resched.cold+0xf4/0x12f
>> [   30.983824]  rt_spin_lock+0x4c/0x100
>> [   30.988833]  get_random_u32+0x4f/0x110
>> [   30.988833]  clocksource_verify_choose_cpus+0xab/0x1a0
>> [   30.988833]  clocksource_verify_percpu.part.0+0x6b/0x330
>> [   30.993894]  __clocksource_watchdog_kthread+0x193/0x1a0
>> [   30.993898]  clocksource_watchdog_kthread+0x18/0x50
>> [   30.993898]  kthread+0x114/0x140
>> [   30.993898]  ret_from_fork+0x2c/0x50
>> [   31.002864]  </TASK>
>>
>> It is due to the fact that get_random_u32() is called in
>> clocksource_verify_choose_cpus() with preemption disabled.
>> If crng_ready() is true by the time get_random_u32() is called, The
>> batched_entropy_32 local lock will be acquired. In PREEMPT_RT kernel,
>> it is a rtmutex and we can't acquire it with preemption disabled.
>>
>> Fix this problem by using the less random get_random_bytes() function
>> which will not take any lock. In fact, it has the same random-ness as
>> get_random_u32_below() when crng_ready() is false.
> So how does get_random_bytes() not take any locks? It takes locks in my
> tree. You two have a lock less tree?

You are right. I forgot to check the crng_make_state() call in 
_get_random_bytes(), which does take a lock.


>
> In case your tree is not lock less yet, couldn't we perform the loop
> verify_n_cpus+1 times without disabled preemption? Then disable
> preemption after return from clocksource_verify_choose_cpus() and then
> either remove current CPU from the list if it is or remove a random one
> so that we get back to verify_n_cpus CPUs set.
>
> Alternatively, (and this might be easier) use migrate_disable() instead
> of preempt_disable() and only use preempt_disable() within the
> for_each_cpu() loop if delta is important (which I assume it is).
>
> But all this would avoid having to run with disabled preemption within
> clocksource_verify_choose_cpus() while having the guarantees you need.

I guess we will have to break clocksource_verify_choose_cpus() into two 
separate parts, one with preemption enabled and the other with 
preemption disabled. I don't think it is a good idea to just use 
migrate_disable(), as we may get too much latency, which can affect the 
test result.

I will send out a v3 patch to fix that.

Thanks,
Longman



* Re: [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus()
  2025-01-29 17:03     ` Waiman Long
@ 2025-01-29 19:55       ` Thomas Gleixner
  2025-01-29 20:29       ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 10+ messages in thread
From: Thomas Gleixner @ 2025-01-29 19:55 UTC (permalink / raw)
  To: Waiman Long, Sebastian Andrzej Siewior
  Cc: John Stultz, Stephen Boyd, Feng Tang, Paul E. McKenney,
	Clark Williams, Steven Rostedt, linux-kernel, linux-rt-devel

On Wed, Jan 29 2025 at 12:03, Waiman Long wrote:
> On 1/29/25 11:34 AM, Sebastian Andrzej Siewior wrote:
>> But all this would avoid having to run with disabled preemption within
>> clocksource_verify_choose_cpus() while having the guarantees you need.
>
> I guess we will have to break clocksource_verify_choose_cpus() into 2 
> separate parts, one without preemption disabled and other one with 
> preemption disabled. I don't think it is a good idea to just use 
> migrate_disable() as we may get too much latency that can affect the 
> test result.
>
> I will send out a v3 patch to fix that.

I'll zap the topmost commit in timers/urgent then.


* Re: [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus()
  2025-01-29 17:03     ` Waiman Long
  2025-01-29 19:55       ` Thomas Gleixner
@ 2025-01-29 20:29       ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-01-29 20:29 UTC (permalink / raw)
  To: Waiman Long
  Cc: John Stultz, Thomas Gleixner, Stephen Boyd, Feng Tang,
	Paul E. McKenney, Clark Williams, Steven Rostedt, linux-kernel,
	linux-rt-devel

On 2025-01-29 12:03:34 [-0500], Waiman Long wrote:
> I guess we will have to break clocksource_verify_choose_cpus() into 2
> separate parts, one without preemption disabled and other one with
> preemption disabled. I don't think it is a good idea to just use
> migrate_disable() as we may get too much latency that can affect the test
> result.

Something like

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 7304d7cf47f2d..bb7c845d7248c 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -373,10 +373,10 @@ void clocksource_verify_percpu(struct clocksource *cs)
 	cpumask_clear(&cpus_ahead);
 	cpumask_clear(&cpus_behind);
 	cpus_read_lock();
-	preempt_disable();
+	migrate_disable();
 	clocksource_verify_choose_cpus();
 	if (cpumask_empty(&cpus_chosen)) {
-		preempt_enable();
+		migrate_enable();
 		cpus_read_unlock();
 		pr_warn("Not enough CPUs to check clocksource '%s'.\n", cs->name);
 		return;
@@ -386,6 +386,7 @@ void clocksource_verify_percpu(struct clocksource *cs)
 	for_each_cpu(cpu, &cpus_chosen) {
 		if (cpu == testcpu)
 			continue;
+		preempt_disable();
 		csnow_begin = cs->read(cs);
 		smp_call_function_single(cpu, clocksource_verify_one_cpu, cs, 1);
 		csnow_end = cs->read(cs);
@@ -400,8 +401,9 @@ void clocksource_verify_percpu(struct clocksource *cs)
 			cs_nsec_max = cs_nsec;
 		if (cs_nsec < cs_nsec_min)
 			cs_nsec_min = cs_nsec;
+		preempt_enable();
 	}
-	preempt_enable();
+	migrate_enable();
 	cpus_read_unlock();
 	if (!cpumask_empty(&cpus_ahead))
 		pr_warn("        CPUs %*pbl ahead of CPU %d for clocksource %s.\n",

> I will send out a v3 patch to fix that.

should do the job. It is untested…

> Thanks,
> Longman

Sebastian


end of thread, other threads:[~2025-01-29 20:29 UTC | newest]

Thread overview: 10+ messages
2025-01-25  1:54 [PATCH v2 1/2] clocksource: Use pr_info() for "Checking clocksource synchronization" message Waiman Long
2025-01-25  1:54 ` [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus() Waiman Long
2025-01-25  2:11   ` Waiman Long
2025-01-25  4:46     ` Paul E. McKenney
2025-01-27  9:36   ` [tip: timers/urgent] " tip-bot2 for Waiman Long
2025-01-29 16:34   ` [PATCH v2 2/2] " Sebastian Andrzej Siewior
2025-01-29 17:03     ` Waiman Long
2025-01-29 19:55       ` Thomas Gleixner
2025-01-29 20:29       ` Sebastian Andrzej Siewior
2025-01-27  9:36 ` [tip: timers/urgent] clocksource: Use pr_info() for "Checking clocksource synchronization" message tip-bot2 for Waiman Long
