All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 1/2] clocksource: Use pr_info() for "Checking clocksource synchronization" message
@ 2025-01-25  1:54 Waiman Long
  2025-01-25  1:54 ` [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus() Waiman Long
  2025-01-27  9:36 ` [tip: timers/urgent] clocksource: Use pr_info() for "Checking clocksource synchronization" message tip-bot2 for Waiman Long
  0 siblings, 2 replies; 10+ messages in thread
From: Waiman Long @ 2025-01-25  1:54 UTC (permalink / raw)
  To: John Stultz, Thomas Gleixner, Stephen Boyd, Feng Tang,
	Paul E. McKenney, Sebastian Andrzej Siewior, Clark Williams,
	Steven Rostedt
  Cc: linux-kernel, linux-rt-devel, Waiman Long

The "Checking clocksource synchronization" message is normally printed
when clocksource_verify_percpu() is called for a given clocksource if
both the CLOCK_SOURCE_UNSTABLE and CLOCK_SOURCE_VERIFY_PERCPU flags
are set. It is an informational message and so pr_info() should be used
instead of pr_warn().

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: John Stultz <jstultz@google.com>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/time/clocksource.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 7304d7cf47f2..77d9566d3aa6 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -382,7 +382,8 @@ void clocksource_verify_percpu(struct clocksource *cs)
 		return;
 	}
 	testcpu = smp_processor_id();
-	pr_warn("Checking clocksource %s synchronization from CPU %d to CPUs %*pbl.\n", cs->name, testcpu, cpumask_pr_args(&cpus_chosen));
+	pr_info("Checking clocksource %s synchronization from CPU %d to CPUs %*pbl.\n",
+		cs->name, testcpu, cpumask_pr_args(&cpus_chosen));
 	for_each_cpu(cpu, &cpus_chosen) {
 		if (cpu == testcpu)
 			continue;
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus()
  2025-01-25  1:54 [PATCH v2 1/2] clocksource: Use pr_info() for "Checking clocksource synchronization" message Waiman Long
@ 2025-01-25  1:54 ` Waiman Long
  2025-01-25  2:11   ` Waiman Long
                     ` (2 more replies)
  2025-01-27  9:36 ` [tip: timers/urgent] clocksource: Use pr_info() for "Checking clocksource synchronization" message tip-bot2 for Waiman Long
  1 sibling, 3 replies; 10+ messages in thread
From: Waiman Long @ 2025-01-25  1:54 UTC (permalink / raw)
  To: John Stultz, Thomas Gleixner, Stephen Boyd, Feng Tang,
	Paul E. McKenney, Sebastian Andrzej Siewior, Clark Williams,
	Steven Rostedt
  Cc: linux-kernel, linux-rt-devel, Waiman Long

The following bug report happened in a PREEMPT_RT kernel.

[   30.957705] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
[   30.957711] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2012, name: kwatchdog
[   30.962673] preempt_count: 1, expected: 0
[   30.962676] RCU nest depth: 0, expected: 0
[   30.962680] 3 locks held by kwatchdog/2012:
[   30.962684]  #0: ffffffff8af2da60 (clocksource_mutex){+.+.}-{3:3}, at: clocksource_watchdog_kthread+0x13/0x50
[   30.967703]  #1: ffffffff8aa8d4d0 (cpu_hotplug_lock){++++}-{0:0}, at: clocksource_verify_percpu.part.0+0x5c/0x330
[   30.972774]  #2: ffff9fe02f5f33e0 ((batched_entropy_u32.lock)){+.+.}-{2:2}, at: get_random_u32+0x4f/0x110
[   30.977827] Preemption disabled at:
[   30.977830] [<ffffffff88c1fe56>] clocksource_verify_percpu.part.0+0x66/0x330
[   30.982837] CPU: 33 PID: 2012 Comm: kwatchdog Not tainted 5.14.0-503.23.1.el9_5.x86_64+rt-debug #1
[   30.982843] Hardware name: HPE ProLiant DL385 Gen10 Plus/ProLiant DL385 Gen10 Plus, BIOS A42 04/29/2021
[   30.982846] Call Trace:
[   30.982850]  <TASK>
[   30.983821]  dump_stack_lvl+0x57/0x81
[   30.983821]  __might_resched.cold+0xf4/0x12f
[   30.983824]  rt_spin_lock+0x4c/0x100
[   30.988833]  get_random_u32+0x4f/0x110
[   30.988833]  clocksource_verify_choose_cpus+0xab/0x1a0
[   30.988833]  clocksource_verify_percpu.part.0+0x6b/0x330
[   30.993894]  __clocksource_watchdog_kthread+0x193/0x1a0
[   30.993898]  clocksource_watchdog_kthread+0x18/0x50
[   30.993898]  kthread+0x114/0x140
[   30.993898]  ret_from_fork+0x2c/0x50
[   31.002864]  </TASK>

It is due to the fact that get_random_u32() is called in
clocksource_verify_choose_cpus() with preemption disabled.
If crng_ready() is true by the time get_random_u32() is called, The
batched_entropy_32 local lock will be acquired. In PREEMPT_RT kernel,
it is a rtmutex and we can't acquire it with preemption disabled.

Fix this problem by using the less random get_random_bytes() function
which will not take any lock. In fact, it has the same random-ness as
get_random_u32_below() when crng_ready() is false.

Fixes: 7560c02bdffb ("clocksource: Check per-CPU clock synchronization when marked unstable")
Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/time/clocksource.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 77d9566d3aa6..659c4b79119c 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -340,9 +340,13 @@ static void clocksource_verify_choose_cpus(void)
 	 * and no replacement CPU is selected.  This gracefully handles
 	 * situations where verify_n_cpus is greater than the number of
 	 * CPUs that are currently online.
+	 *
+	 * The get_random_bytes() is used here to avoid taking lock with
+	 * preemption disabled.
 	 */
 	for (i = 1; i < n; i++) {
-		cpu = get_random_u32_below(nr_cpu_ids);
+		get_random_bytes(&cpu, sizeof(cpu));
+		cpu %= nr_cpu_ids;
 		cpu = cpumask_next(cpu - 1, cpu_online_mask);
 		if (cpu >= nr_cpu_ids)
 			cpu = cpumask_first(cpu_online_mask);
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus()
  2025-01-25  1:54 ` [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus() Waiman Long
@ 2025-01-25  2:11   ` Waiman Long
  2025-01-25  4:46     ` Paul E. McKenney
  2025-01-27  9:36   ` [tip: timers/urgent] " tip-bot2 for Waiman Long
  2025-01-29 16:34   ` [PATCH v2 2/2] " Sebastian Andrzej Siewior
  2 siblings, 1 reply; 10+ messages in thread
From: Waiman Long @ 2025-01-25  2:11 UTC (permalink / raw)
  To: John Stultz, Thomas Gleixner, Stephen Boyd, Feng Tang,
	Paul E. McKenney, Sebastian Andrzej Siewior, Clark Williams,
	Steven Rostedt
  Cc: linux-kernel, linux-rt-devel


On 1/24/25 8:54 PM, Waiman Long wrote:
> The following bug report happened in a PREEMPT_RT kernel.
>
> [   30.957705] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
> [   30.957711] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2012, name: kwatchdog
> [   30.962673] preempt_count: 1, expected: 0
> [   30.962676] RCU nest depth: 0, expected: 0
> [   30.962680] 3 locks held by kwatchdog/2012:
> [   30.962684]  #0: ffffffff8af2da60 (clocksource_mutex){+.+.}-{3:3}, at: clocksource_watchdog_kthread+0x13/0x50
> [   30.967703]  #1: ffffffff8aa8d4d0 (cpu_hotplug_lock){++++}-{0:0}, at: clocksource_verify_percpu.part.0+0x5c/0x330
> [   30.972774]  #2: ffff9fe02f5f33e0 ((batched_entropy_u32.lock)){+.+.}-{2:2}, at: get_random_u32+0x4f/0x110
> [   30.977827] Preemption disabled at:
> [   30.977830] [<ffffffff88c1fe56>] clocksource_verify_percpu.part.0+0x66/0x330
> [   30.982837] CPU: 33 PID: 2012 Comm: kwatchdog Not tainted 5.14.0-503.23.1.el9_5.x86_64+rt-debug #1
> [   30.982843] Hardware name: HPE ProLiant DL385 Gen10 Plus/ProLiant DL385 Gen10 Plus, BIOS A42 04/29/2021
> [   30.982846] Call Trace:
> [   30.982850]  <TASK>
> [   30.983821]  dump_stack_lvl+0x57/0x81
> [   30.983821]  __might_resched.cold+0xf4/0x12f
> [   30.983824]  rt_spin_lock+0x4c/0x100
> [   30.988833]  get_random_u32+0x4f/0x110
> [   30.988833]  clocksource_verify_choose_cpus+0xab/0x1a0
> [   30.988833]  clocksource_verify_percpu.part.0+0x6b/0x330
> [   30.993894]  __clocksource_watchdog_kthread+0x193/0x1a0
> [   30.993898]  clocksource_watchdog_kthread+0x18/0x50
> [   30.993898]  kthread+0x114/0x140
> [   30.993898]  ret_from_fork+0x2c/0x50
> [   31.002864]  </TASK>
>
> It is due to the fact that get_random_u32() is called in
> clocksource_verify_choose_cpus() with preemption disabled.
> If crng_ready() is true by the time get_random_u32() is called, The
> batched_entropy_32 local lock will be acquired. In PREEMPT_RT kernel,
> it is a rtmutex and we can't acquire it with preemption disabled.
>
> Fix this problem by using the less random get_random_bytes() function
> which will not take any lock. In fact, it has the same random-ness as
> get_random_u32_below() when crng_ready() is false.
>
> Fixes: 7560c02bdffb ("clocksource: Check per-CPU clock synchronization when marked unstable")

Oh, I should have added

Suggested-by: Paul E. McKenney <paulmck@kernel.org>

> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>   kernel/time/clocksource.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index 77d9566d3aa6..659c4b79119c 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -340,9 +340,13 @@ static void clocksource_verify_choose_cpus(void)
>   	 * and no replacement CPU is selected.  This gracefully handles
>   	 * situations where verify_n_cpus is greater than the number of
>   	 * CPUs that are currently online.
> +	 *
> +	 * The get_random_bytes() is used here to avoid taking lock with
> +	 * preemption disabled.
>   	 */
>   	for (i = 1; i < n; i++) {
> -		cpu = get_random_u32_below(nr_cpu_ids);
> +		get_random_bytes(&cpu, sizeof(cpu));
> +		cpu %= nr_cpu_ids;
>   		cpu = cpumask_next(cpu - 1, cpu_online_mask);
>   		if (cpu >= nr_cpu_ids)
>   			cpu = cpumask_first(cpu_online_mask);


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus()
  2025-01-25  2:11   ` Waiman Long
@ 2025-01-25  4:46     ` Paul E. McKenney
  0 siblings, 0 replies; 10+ messages in thread
From: Paul E. McKenney @ 2025-01-25  4:46 UTC (permalink / raw)
  To: Waiman Long
  Cc: John Stultz, Thomas Gleixner, Stephen Boyd, Feng Tang,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt,
	linux-kernel, linux-rt-devel

On Fri, Jan 24, 2025 at 09:11:27PM -0500, Waiman Long wrote:
> 
> On 1/24/25 8:54 PM, Waiman Long wrote:
> > The following bug report happened in a PREEMPT_RT kernel.
> > 
> > [   30.957705] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
> > [   30.957711] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2012, name: kwatchdog
> > [   30.962673] preempt_count: 1, expected: 0
> > [   30.962676] RCU nest depth: 0, expected: 0
> > [   30.962680] 3 locks held by kwatchdog/2012:
> > [   30.962684]  #0: ffffffff8af2da60 (clocksource_mutex){+.+.}-{3:3}, at: clocksource_watchdog_kthread+0x13/0x50
> > [   30.967703]  #1: ffffffff8aa8d4d0 (cpu_hotplug_lock){++++}-{0:0}, at: clocksource_verify_percpu.part.0+0x5c/0x330
> > [   30.972774]  #2: ffff9fe02f5f33e0 ((batched_entropy_u32.lock)){+.+.}-{2:2}, at: get_random_u32+0x4f/0x110
> > [   30.977827] Preemption disabled at:
> > [   30.977830] [<ffffffff88c1fe56>] clocksource_verify_percpu.part.0+0x66/0x330
> > [   30.982837] CPU: 33 PID: 2012 Comm: kwatchdog Not tainted 5.14.0-503.23.1.el9_5.x86_64+rt-debug #1
> > [   30.982843] Hardware name: HPE ProLiant DL385 Gen10 Plus/ProLiant DL385 Gen10 Plus, BIOS A42 04/29/2021
> > [   30.982846] Call Trace:
> > [   30.982850]  <TASK>
> > [   30.983821]  dump_stack_lvl+0x57/0x81
> > [   30.983821]  __might_resched.cold+0xf4/0x12f
> > [   30.983824]  rt_spin_lock+0x4c/0x100
> > [   30.988833]  get_random_u32+0x4f/0x110
> > [   30.988833]  clocksource_verify_choose_cpus+0xab/0x1a0
> > [   30.988833]  clocksource_verify_percpu.part.0+0x6b/0x330
> > [   30.993894]  __clocksource_watchdog_kthread+0x193/0x1a0
> > [   30.993898]  clocksource_watchdog_kthread+0x18/0x50
> > [   30.993898]  kthread+0x114/0x140
> > [   30.993898]  ret_from_fork+0x2c/0x50
> > [   31.002864]  </TASK>
> > 
> > It is due to the fact that get_random_u32() is called in
> > clocksource_verify_choose_cpus() with preemption disabled.
> > If crng_ready() is true by the time get_random_u32() is called, The
> > batched_entropy_32 local lock will be acquired. In PREEMPT_RT kernel,
> > it is a rtmutex and we can't acquire it with preemption disabled.
> > 
> > Fix this problem by using the less random get_random_bytes() function
> > which will not take any lock. In fact, it has the same random-ness as
> > get_random_u32_below() when crng_ready() is false.
> > 
> > Fixes: 7560c02bdffb ("clocksource: Check per-CPU clock synchronization when marked unstable")
> 
> Oh, I should have added
> 
> Suggested-by: Paul E. McKenney <paulmck@kernel.org>
> 
> > Signed-off-by: Waiman Long <longman@redhat.com>

Works for me!

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

> > ---
> >   kernel/time/clocksource.c | 6 +++++-
> >   1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> > index 77d9566d3aa6..659c4b79119c 100644
> > --- a/kernel/time/clocksource.c
> > +++ b/kernel/time/clocksource.c
> > @@ -340,9 +340,13 @@ static void clocksource_verify_choose_cpus(void)
> >   	 * and no replacement CPU is selected.  This gracefully handles
> >   	 * situations where verify_n_cpus is greater than the number of
> >   	 * CPUs that are currently online.
> > +	 *
> > +	 * The get_random_bytes() is used here to avoid taking lock with
> > +	 * preemption disabled.
> >   	 */
> >   	for (i = 1; i < n; i++) {
> > -		cpu = get_random_u32_below(nr_cpu_ids);
> > +		get_random_bytes(&cpu, sizeof(cpu));
> > +		cpu %= nr_cpu_ids;
> >   		cpu = cpumask_next(cpu - 1, cpu_online_mask);
> >   		if (cpu >= nr_cpu_ids)
> >   			cpu = cpumask_first(cpu_online_mask);
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [tip: timers/urgent] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus()
  2025-01-25  1:54 ` [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus() Waiman Long
  2025-01-25  2:11   ` Waiman Long
@ 2025-01-27  9:36   ` tip-bot2 for Waiman Long
  2025-01-29 16:34   ` [PATCH v2 2/2] " Sebastian Andrzej Siewior
  2 siblings, 0 replies; 10+ messages in thread
From: tip-bot2 for Waiman Long @ 2025-01-27  9:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Waiman Long, Thomas Gleixner, Paul E. McKenney, stable, x86,
	linux-kernel

The following commit has been merged into the timers/urgent branch of tip:

Commit-ID:     01cfc84024e9a6b619696a35d2e5662255001cd0
Gitweb:        https://git.kernel.org/tip/01cfc84024e9a6b619696a35d2e5662255001cd0
Author:        Waiman Long <longman@redhat.com>
AuthorDate:    Fri, 24 Jan 2025 20:54:42 -05:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Mon, 27 Jan 2025 10:30:59 +01:00

clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus()

The following bug report happened in a PREEMPT_RT kernel.

 BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2012, name: kwatchdog
 preempt_count: 1, expected: 0
 RCU nest depth: 0, expected: 0
 3 locks held by kwatchdog/2012:
  #0: ffffffff8af2da60 (clocksource_mutex){+.+.}-{3:3}, at: clocksource_watchdog_kthread+0x13/0x50
  #1: ffffffff8aa8d4d0 (cpu_hotplug_lock){++++}-{0:0}, at: clocksource_verify_percpu.part.0+0x5c/0x330
  #2: ffff9fe02f5f33e0 ((batched_entropy_u32.lock)){+.+.}-{2:2}, at: get_random_u32+0x4f/0x110
 Preemption disabled at:
 [<ffffffff88c1fe56>] clocksource_verify_percpu.part.0+0x66/0x330
 CPU: 33 PID: 2012 Comm: kwatchdog Not tainted 5.14.0-503.23.1.el9_5.x86_64+rt-debug #1
 Call Trace:
  <TASK>
  __might_resched.cold+0xf4/0x12f
  rt_spin_lock+0x4c/0x100
  get_random_u32+0x4f/0x110
  clocksource_verify_choose_cpus+0xab/0x1a0
  clocksource_verify_percpu.part.0+0x6b/0x330
  __clocksource_watchdog_kthread+0x193/0x1a0
  clocksource_watchdog_kthread+0x18/0x50
  kthread+0x114/0x140
  ret_from_fork+0x2c/0x50
  </TASK>

This happens due to the fact that get_random_u32() is called in
clocksource_verify_choose_cpus() with preemption disabled.  If crng_ready()
is true by the time get_random_u32() is called, The batched_entropy_32
local lock will be acquired. In a PREEMPT_RT enabled kernel, it is a
rtmutex, which can't be acquireq with preemption disabled.

Fix this problem by using the less random get_random_bytes() function which
will not take any lock. In fact, it has the same random-ness as
get_random_u32_below() when crng_ready() is false.

Fixes: 7560c02bdffb ("clocksource: Check per-CPU clock synchronization when marked unstable")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250125015442.3740588-2-longman@redhat.com
---
 kernel/time/clocksource.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 77d9566..659c4b7 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -340,9 +340,13 @@ static void clocksource_verify_choose_cpus(void)
 	 * and no replacement CPU is selected.  This gracefully handles
 	 * situations where verify_n_cpus is greater than the number of
 	 * CPUs that are currently online.
+	 *
+	 * The get_random_bytes() is used here to avoid taking lock with
+	 * preemption disabled.
 	 */
 	for (i = 1; i < n; i++) {
-		cpu = get_random_u32_below(nr_cpu_ids);
+		get_random_bytes(&cpu, sizeof(cpu));
+		cpu %= nr_cpu_ids;
 		cpu = cpumask_next(cpu - 1, cpu_online_mask);
 		if (cpu >= nr_cpu_ids)
 			cpu = cpumask_first(cpu_online_mask);

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [tip: timers/urgent] clocksource: Use pr_info() for "Checking clocksource synchronization" message
  2025-01-25  1:54 [PATCH v2 1/2] clocksource: Use pr_info() for "Checking clocksource synchronization" message Waiman Long
  2025-01-25  1:54 ` [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus() Waiman Long
@ 2025-01-27  9:36 ` tip-bot2 for Waiman Long
  1 sibling, 0 replies; 10+ messages in thread
From: tip-bot2 for Waiman Long @ 2025-01-27  9:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Waiman Long, Thomas Gleixner, Paul E. McKenney, John Stultz, x86,
	linux-kernel

The following commit has been merged into the timers/urgent branch of tip:

Commit-ID:     1f566840a82982141f94086061927a90e79440e5
Gitweb:        https://git.kernel.org/tip/1f566840a82982141f94086061927a90e79440e5
Author:        Waiman Long <longman@redhat.com>
AuthorDate:    Fri, 24 Jan 2025 20:54:41 -05:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Mon, 27 Jan 2025 10:30:59 +01:00

clocksource: Use pr_info() for "Checking clocksource synchronization" message

The "Checking clocksource synchronization" message is normally printed
when clocksource_verify_percpu() is called for a given clocksource if
both the CLOCK_SOURCE_UNSTABLE and CLOCK_SOURCE_VERIFY_PERCPU flags
are set.

It is an informational message and so pr_info() is the correct choice.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: John Stultz <jstultz@google.com>
Link: https://lore.kernel.org/all/20250125015442.3740588-1-longman@redhat.com

---
 kernel/time/clocksource.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 7304d7c..77d9566 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -382,7 +382,8 @@ void clocksource_verify_percpu(struct clocksource *cs)
 		return;
 	}
 	testcpu = smp_processor_id();
-	pr_warn("Checking clocksource %s synchronization from CPU %d to CPUs %*pbl.\n", cs->name, testcpu, cpumask_pr_args(&cpus_chosen));
+	pr_info("Checking clocksource %s synchronization from CPU %d to CPUs %*pbl.\n",
+		cs->name, testcpu, cpumask_pr_args(&cpus_chosen));
 	for_each_cpu(cpu, &cpus_chosen) {
 		if (cpu == testcpu)
 			continue;

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus()
  2025-01-25  1:54 ` [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus() Waiman Long
  2025-01-25  2:11   ` Waiman Long
  2025-01-27  9:36   ` [tip: timers/urgent] " tip-bot2 for Waiman Long
@ 2025-01-29 16:34   ` Sebastian Andrzej Siewior
  2025-01-29 17:03     ` Waiman Long
  2 siblings, 1 reply; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-01-29 16:34 UTC (permalink / raw)
  To: Waiman Long
  Cc: John Stultz, Thomas Gleixner, Stephen Boyd, Feng Tang,
	Paul E. McKenney, Clark Williams, Steven Rostedt, linux-kernel,
	linux-rt-devel

On 2025-01-24 20:54:42 [-0500], Waiman Long wrote:
> The following bug report happened in a PREEMPT_RT kernel.
> 
> [   30.957705] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
> [   30.957711] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2012, name: kwatchdog
> [   30.962673] preempt_count: 1, expected: 0
> [   30.962676] RCU nest depth: 0, expected: 0
> [   30.962680] 3 locks held by kwatchdog/2012:
> [   30.962684]  #0: ffffffff8af2da60 (clocksource_mutex){+.+.}-{3:3}, at: clocksource_watchdog_kthread+0x13/0x50
> [   30.967703]  #1: ffffffff8aa8d4d0 (cpu_hotplug_lock){++++}-{0:0}, at: clocksource_verify_percpu.part.0+0x5c/0x330
> [   30.972774]  #2: ffff9fe02f5f33e0 ((batched_entropy_u32.lock)){+.+.}-{2:2}, at: get_random_u32+0x4f/0x110
> [   30.977827] Preemption disabled at:
> [   30.977830] [<ffffffff88c1fe56>] clocksource_verify_percpu.part.0+0x66/0x330
> [   30.982837] CPU: 33 PID: 2012 Comm: kwatchdog Not tainted 5.14.0-503.23.1.el9_5.x86_64+rt-debug #1
> [   30.982843] Hardware name: HPE ProLiant DL385 Gen10 Plus/ProLiant DL385 Gen10 Plus, BIOS A42 04/29/2021
> [   30.982846] Call Trace:
> [   30.982850]  <TASK>
> [   30.983821]  dump_stack_lvl+0x57/0x81
> [   30.983821]  __might_resched.cold+0xf4/0x12f
> [   30.983824]  rt_spin_lock+0x4c/0x100
> [   30.988833]  get_random_u32+0x4f/0x110
> [   30.988833]  clocksource_verify_choose_cpus+0xab/0x1a0
> [   30.988833]  clocksource_verify_percpu.part.0+0x6b/0x330
> [   30.993894]  __clocksource_watchdog_kthread+0x193/0x1a0
> [   30.993898]  clocksource_watchdog_kthread+0x18/0x50
> [   30.993898]  kthread+0x114/0x140
> [   30.993898]  ret_from_fork+0x2c/0x50
> [   31.002864]  </TASK>
> 
> It is due to the fact that get_random_u32() is called in
> clocksource_verify_choose_cpus() with preemption disabled.
> If crng_ready() is true by the time get_random_u32() is called, The
> batched_entropy_32 local lock will be acquired. In PREEMPT_RT kernel,
> it is a rtmutex and we can't acquire it with preemption disabled.
> 
> Fix this problem by using the less random get_random_bytes() function
> which will not take any lock. In fact, it has the same random-ness as
> get_random_u32_below() when crng_ready() is false.

So how does get_random_bytes() not take any locks? It takes locks in my
tree. You two have a lock less tree?

In case your tree is not lock less yet, couldn't we perform the loop
verify_n_cpus+1 times without disabled preemption? Then disable
preemption after return from clocksource_verify_choose_cpus() and then
either remove current CPU from the list if it is or remove a random one
so that we get back to verify_n_cpus CPUs set.

Alternatively, (and this might be easier) use migrate_disable() instead
of preempt_disable() and only use preempt_disable() within the
for_each_cpu() loop if delta is important (which I assume it is). 

But all this would avoid having to run with disabled preemption within
clocksource_verify_choose_cpus() while having the guarantees you need.

> Fixes: 7560c02bdffb ("clocksource: Check per-CPU clock synchronization when marked unstable")
> Signed-off-by: Waiman Long <longman@redhat.com>

Sebastian

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus()
  2025-01-29 16:34   ` [PATCH v2 2/2] " Sebastian Andrzej Siewior
@ 2025-01-29 17:03     ` Waiman Long
  2025-01-29 19:55       ` Thomas Gleixner
  2025-01-29 20:29       ` Sebastian Andrzej Siewior
  0 siblings, 2 replies; 10+ messages in thread
From: Waiman Long @ 2025-01-29 17:03 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: John Stultz, Thomas Gleixner, Stephen Boyd, Feng Tang,
	Paul E. McKenney, Clark Williams, Steven Rostedt, linux-kernel,
	linux-rt-devel

On 1/29/25 11:34 AM, Sebastian Andrzej Siewior wrote:
> On 2025-01-24 20:54:42 [-0500], Waiman Long wrote:
>> The following bug report happened in a PREEMPT_RT kernel.
>>
>> [   30.957705] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
>> [   30.957711] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2012, name: kwatchdog
>> [   30.962673] preempt_count: 1, expected: 0
>> [   30.962676] RCU nest depth: 0, expected: 0
>> [   30.962680] 3 locks held by kwatchdog/2012:
>> [   30.962684]  #0: ffffffff8af2da60 (clocksource_mutex){+.+.}-{3:3}, at: clocksource_watchdog_kthread+0x13/0x50
>> [   30.967703]  #1: ffffffff8aa8d4d0 (cpu_hotplug_lock){++++}-{0:0}, at: clocksource_verify_percpu.part.0+0x5c/0x330
>> [   30.972774]  #2: ffff9fe02f5f33e0 ((batched_entropy_u32.lock)){+.+.}-{2:2}, at: get_random_u32+0x4f/0x110
>> [   30.977827] Preemption disabled at:
>> [   30.977830] [<ffffffff88c1fe56>] clocksource_verify_percpu.part.0+0x66/0x330
>> [   30.982837] CPU: 33 PID: 2012 Comm: kwatchdog Not tainted 5.14.0-503.23.1.el9_5.x86_64+rt-debug #1
>> [   30.982843] Hardware name: HPE ProLiant DL385 Gen10 Plus/ProLiant DL385 Gen10 Plus, BIOS A42 04/29/2021
>> [   30.982846] Call Trace:
>> [   30.982850]  <TASK>
>> [   30.983821]  dump_stack_lvl+0x57/0x81
>> [   30.983821]  __might_resched.cold+0xf4/0x12f
>> [   30.983824]  rt_spin_lock+0x4c/0x100
>> [   30.988833]  get_random_u32+0x4f/0x110
>> [   30.988833]  clocksource_verify_choose_cpus+0xab/0x1a0
>> [   30.988833]  clocksource_verify_percpu.part.0+0x6b/0x330
>> [   30.993894]  __clocksource_watchdog_kthread+0x193/0x1a0
>> [   30.993898]  clocksource_watchdog_kthread+0x18/0x50
>> [   30.993898]  kthread+0x114/0x140
>> [   30.993898]  ret_from_fork+0x2c/0x50
>> [   31.002864]  </TASK>
>>
>> It is due to the fact that get_random_u32() is called in
>> clocksource_verify_choose_cpus() with preemption disabled.
>> If crng_ready() is true by the time get_random_u32() is called, The
>> batched_entropy_32 local lock will be acquired. In PREEMPT_RT kernel,
>> it is a rtmutex and we can't acquire it with preemption disabled.
>>
>> Fix this problem by using the less random get_random_bytes() function
>> which will not take any lock. In fact, it has the same random-ness as
>> get_random_u32_below() when crng_ready() is false.
> So how does get_random_bytes() not take any locks? It takes locks in my
> tree. You two have a lock less tree?

You are right. I forgot to check the crng_make_state() call in 
_get_random_bytes() which does take lock.


>
> In case your tree is not lock less yet, couldn't we perform the loop
> verify_n_cpus+1 times without disabled preemption? Then disable
> preemption after return from clocksource_verify_choose_cpus() and then
> either remove current CPU from the list if it is or remove a random one
> so that we get back to verify_n_cpus CPUs set.
>
> Alternatively, (and this might be easier) use migrate_disable() instead
> of preempt_disable() and only use preempt_disable() within the
> for_each_cpu() loop if delta is important (which I assume it is).
>
> But all this would avoid having to run with disabled preemption within
> clocksource_verify_choose_cpus() while having the guarantees you need.

I guess we will have to break clocksource_verify_choose_cpus() into 2 
separate parts, one without preemption disabled and other one with 
preemption disabled. I don't think it is a good idea to just use 
migrate_disable() as we may get too much latency that can affect the 
test result.

I will send out a v3 patch to fix that.

Thanks,
Longman


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus()
  2025-01-29 17:03     ` Waiman Long
@ 2025-01-29 19:55       ` Thomas Gleixner
  2025-01-29 20:29       ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 10+ messages in thread
From: Thomas Gleixner @ 2025-01-29 19:55 UTC (permalink / raw)
  To: Waiman Long, Sebastian Andrzej Siewior
  Cc: John Stultz, Stephen Boyd, Feng Tang, Paul E. McKenney,
	Clark Williams, Steven Rostedt, linux-kernel, linux-rt-devel

On Wed, Jan 29 2025 at 12:03, Waiman Long wrote:
> On 1/29/25 11:34 AM, Sebastian Andrzej Siewior wrote:
>> But all this would avoid having to run with disabled preemption within
>> clocksource_verify_choose_cpus() while having the guarantees you need.
>
> I guess we will have to break clocksource_verify_choose_cpus() into 2 
> separate parts, one without preemption disabled and other one with 
> preemption disabled. I don't think it is a good idea to just use 
> migrate_disable() as we may get too much latency that can affect the 
> test result.
>
> I will send out a v3 patch to fix that.

I zap the topmost commit in timers/urgent then.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus()
  2025-01-29 17:03     ` Waiman Long
  2025-01-29 19:55       ` Thomas Gleixner
@ 2025-01-29 20:29       ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-01-29 20:29 UTC (permalink / raw)
  To: Waiman Long
  Cc: John Stultz, Thomas Gleixner, Stephen Boyd, Feng Tang,
	Paul E. McKenney, Clark Williams, Steven Rostedt, linux-kernel,
	linux-rt-devel

On 2025-01-29 12:03:34 [-0500], Waiman Long wrote:
> I guess we will have to break clocksource_verify_choose_cpus() into 2
> separate parts, one without preemption disabled and other one with
> preemption disabled. I don't think it is a good idea to just use
> migrate_disable() as we may get too much latency that can affect the test
> result.

Something like

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 7304d7cf47f2d..bb7c845d7248c 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -373,10 +373,10 @@ void clocksource_verify_percpu(struct clocksource *cs)
 	cpumask_clear(&cpus_ahead);
 	cpumask_clear(&cpus_behind);
 	cpus_read_lock();
-	preempt_disable();
+	migrate_disable();
 	clocksource_verify_choose_cpus();
 	if (cpumask_empty(&cpus_chosen)) {
-		preempt_enable();
+		migrate_enable();
 		cpus_read_unlock();
 		pr_warn("Not enough CPUs to check clocksource '%s'.\n", cs->name);
 		return;
@@ -386,6 +386,7 @@ void clocksource_verify_percpu(struct clocksource *cs)
 	for_each_cpu(cpu, &cpus_chosen) {
 		if (cpu == testcpu)
 			continue;
+		preempt_disable();
 		csnow_begin = cs->read(cs);
 		smp_call_function_single(cpu, clocksource_verify_one_cpu, cs, 1);
 		csnow_end = cs->read(cs);
@@ -400,8 +401,9 @@ void clocksource_verify_percpu(struct clocksource *cs)
 			cs_nsec_max = cs_nsec;
 		if (cs_nsec < cs_nsec_min)
 			cs_nsec_min = cs_nsec;
+		preempt_enable();
 	}
-	preempt_enable();
+	migrate_enable();
 	cpus_read_unlock();
 	if (!cpumask_empty(&cpus_ahead))
 		pr_warn("        CPUs %*pbl ahead of CPU %d for clocksource %s.\n",

> I will send out a v3 patch to fix that.

should do the job. It is untested…

> Thanks,
> Longman

Sebastian

^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2025-01-29 20:29 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-25  1:54 [PATCH v2 1/2] clocksource: Use pr_info() for "Checking clocksource synchronization" message Waiman Long
2025-01-25  1:54 ` [PATCH v2 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus() Waiman Long
2025-01-25  2:11   ` Waiman Long
2025-01-25  4:46     ` Paul E. McKenney
2025-01-27  9:36   ` [tip: timers/urgent] " tip-bot2 for Waiman Long
2025-01-29 16:34   ` [PATCH v2 2/2] " Sebastian Andrzej Siewior
2025-01-29 17:03     ` Waiman Long
2025-01-29 19:55       ` Thomas Gleixner
2025-01-29 20:29       ` Sebastian Andrzej Siewior
2025-01-27  9:36 ` [tip: timers/urgent] clocksource: Use pr_info() for "Checking clocksource synchronization" message tip-bot2 for Waiman Long

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.