public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/4] sched/smt: Fix error handling in sched_cpu_deactivate()
@ 2024-07-03  3:16 Yang Yingliang
  2024-07-03  3:16 ` [PATCH 1/4] sched/smt: Introduce sched_smt_present_inc/dec() helper Yang Yingliang
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Yang Yingliang @ 2024-07-03  3:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, bristot, vschneid, tglx, yu.c.chen,
	tim.c.chen, yangyingliang, liwei391

From: Yang Yingliang <yangyingliang@huawei.com>

The sched_smt_present decrement and set_rq_offline() both happen before
cpuset_cpu_inactive() is called. If cpuset_cpu_inactive() fails, both
need to be rolled back.

Yang Yingliang (4):
  sched/smt: Introduce sched_smt_present_inc/dec() helper
  sched/smt: fix unbalance sched_smt_present dec/inc
  sched/core: Introduce sched_set_rq_on/offline() helper
  sched/core: fix unbalance set_rq_online/offline() in
    sched_cpu_deactivate()

 kernel/sched/core.c | 68 +++++++++++++++++++++++++++++++--------------
 1 file changed, 47 insertions(+), 21 deletions(-)

-- 
2.25.1



* [PATCH 1/4] sched/smt: Introduce sched_smt_present_inc/dec() helper
  2024-07-03  3:16 [PATCH 0/4] sched/smt: Fix error handling in sched_cpu_deactivate() Yang Yingliang
@ 2024-07-03  3:16 ` Yang Yingliang
  2024-07-29 10:34   ` [tip: sched/core] " tip-bot2 for Yang Yingliang
  2024-07-03  3:16 ` [PATCH 2/4] sched/smt: fix unbalance sched_smt_present dec/inc Yang Yingliang
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Yang Yingliang @ 2024-07-03  3:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, bristot, vschneid, tglx, yu.c.chen,
	tim.c.chen, yangyingliang, liwei391

From: Yang Yingliang <yangyingliang@huawei.com>

Introduce the sched_smt_present_inc/dec() helpers so they can be called
easily from both the normal and error paths. No functional change.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
---
 kernel/sched/core.c | 26 +++++++++++++++++++-------
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index bcf2c4cc0522..880c4c03ef8a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -9654,6 +9654,22 @@ static int cpuset_cpu_inactive(unsigned int cpu)
 	return 0;
 }
 
+static inline void sched_smt_present_inc(int cpu)
+{
+#ifdef CONFIG_SCHED_SMT
+	if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
+		static_branch_inc_cpuslocked(&sched_smt_present);
+#endif
+}
+
+static inline void sched_smt_present_dec(int cpu)
+{
+#ifdef CONFIG_SCHED_SMT
+	if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
+		static_branch_dec_cpuslocked(&sched_smt_present);
+#endif
+}
+
 int sched_cpu_activate(unsigned int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
@@ -9665,13 +9681,10 @@ int sched_cpu_activate(unsigned int cpu)
 	 */
 	balance_push_set(cpu, false);
 
-#ifdef CONFIG_SCHED_SMT
 	/*
 	 * When going up, increment the number of cores with SMT present.
 	 */
-	if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
-		static_branch_inc_cpuslocked(&sched_smt_present);
-#endif
+	sched_smt_present_inc(cpu);
 	set_cpu_active(cpu, true);
 
 	if (sched_smp_initialized) {
@@ -9740,13 +9753,12 @@ int sched_cpu_deactivate(unsigned int cpu)
 	}
 	rq_unlock_irqrestore(rq, &rf);
 
-#ifdef CONFIG_SCHED_SMT
 	/*
 	 * When going down, decrement the number of cores with SMT present.
 	 */
-	if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
-		static_branch_dec_cpuslocked(&sched_smt_present);
+	sched_smt_present_dec(cpu);
 
+#ifdef CONFIG_SCHED_SMT
 	sched_core_cpu_deactivate(cpu);
 #endif
 
-- 
2.25.1
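
For readers wondering why both helpers test for a sibling-mask weight of exactly 2, here is a minimal userspace sketch, not kernel code. The names `online_threads`, `smt_present`, `model_cpu_activate()` and `model_cpu_deactivate()` are hypothetical stand-ins, and the sketch assumes the sibling mask already contains the CPU when it is activated and still contains it when it is deactivated.

```c
#include <assert.h>

#define NR_CORES 4

/* Hypothetical model state, not kernel data structures. */
static int online_threads[NR_CORES];	/* models cpumask_weight(cpu_smt_mask(cpu)) */
static int smt_present;			/* models the sched_smt_present count */

/* One hardware thread of @core comes up. */
static void model_cpu_activate(int core)
{
	assert(core >= 0 && core < NR_CORES);

	/* Assumption: the thread is already in the sibling mask here. */
	online_threads[core]++;

	/* Only the second thread turns the core into an SMT core. */
	if (online_threads[core] == 2)
		smt_present++;
}

/* One hardware thread of @core goes down. */
static void model_cpu_deactivate(int core)
{
	assert(core >= 0 && core < NR_CORES);
	assert(online_threads[core] > 0);

	/* Going from two threads to one: the core stops being an SMT core. */
	if (online_threads[core] == 2)
		smt_present--;

	online_threads[core]--;
}
```

In this model the count tracks cores with at least two online threads, so a third or fourth sibling on the same core changes nothing, and every increment is matched by exactly one decrement as long as each deactivation either completes or is fully undone.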



* [PATCH 2/4] sched/smt: fix unbalance sched_smt_present dec/inc
  2024-07-03  3:16 [PATCH 0/4] sched/smt: Fix error handling in sched_cpu_deactivate() Yang Yingliang
  2024-07-03  3:16 ` [PATCH 1/4] sched/smt: Introduce sched_smt_present_inc/dec() helper Yang Yingliang
@ 2024-07-03  3:16 ` Yang Yingliang
  2024-07-29 10:34   ` [tip: sched/core] sched/smt: Fix " tip-bot2 for Yang Yingliang
  2024-07-03  3:16 ` [PATCH 3/4] sched/core: Introduce sched_set_rq_on/offline() helper Yang Yingliang
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Yang Yingliang @ 2024-07-03  3:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, bristot, vschneid, tglx, yu.c.chen,
	tim.c.chen, yangyingliang, liwei391

From: Yang Yingliang <yangyingliang@huawei.com>

I got the following warning while running a stress test:

jump label: negative count!
WARNING: CPU: 3 PID: 38 at kernel/jump_label.c:263 static_key_slow_try_dec+0x9d/0xb0
Call Trace:
 <TASK>
 __static_key_slow_dec_cpuslocked+0x16/0x70
 sched_cpu_deactivate+0x26e/0x2a0
 cpuhp_invoke_callback+0x3ad/0x10d0
 cpuhp_thread_fun+0x3f5/0x680
 smpboot_thread_fn+0x56d/0x8d0
 kthread+0x309/0x400
 ret_from_fork+0x41/0x70
 ret_from_fork_asm+0x1b/0x30
 </TASK>

When cpuset_cpu_inactive() fails in sched_cpu_deactivate(), the CPU
offline fails, but sched_smt_present has already been decremented
before the call to cpuset_cpu_inactive(). This leads to an unbalanced
dec/inc, so fix it by incrementing sched_smt_present in the error path.

Fixes: c5511d03ec09 ("sched/smt: Make sched_smt_present track topology")
Reviewed-by: Chen Yu <yu.c.chen@intel.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
---
 kernel/sched/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 880c4c03ef8a..5cff01046685 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -9768,6 +9768,7 @@ int sched_cpu_deactivate(unsigned int cpu)
 	sched_update_numa(cpu, false);
 	ret = cpuset_cpu_inactive(cpu);
 	if (ret) {
+		sched_smt_present_inc(cpu);
 		balance_push_set(cpu, false);
 		set_cpu_active(cpu, true);
 		sched_update_numa(cpu, true);
-- 
2.25.1
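
The imbalance described in this patch can be reproduced in a small userspace model. This is only a sketch under stated assumptions: `struct smt_model`, `model_dec()`, `model_inc()` and `model_deactivate()` are hypothetical names, and a decrement attempted at zero is simply counted as a warning, which simplifies what the real jump label code does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the counter behind sched_smt_present. */
struct smt_model {
	int count;		/* cores with SMT present */
	int negative_warnings;	/* decrements attempted at zero */
};

static void model_dec(struct smt_model *m)
{
	assert(m);
	if (m->count == 0) {
		/* Models the "jump label: negative count!" warning. */
		m->negative_warnings++;
		return;
	}
	m->count--;
}

static void model_inc(struct smt_model *m)
{
	assert(m);
	m->count++;
}

/*
 * Models one offline attempt: the decrement happens first, then a later
 * step may fail. @rollback selects whether the error path undoes the
 * decrement, which is what this patch adds.
 */
static int model_deactivate(struct smt_model *m, bool later_step_fails,
			    bool rollback)
{
	model_dec(m);
	if (later_step_fails) {
		if (rollback)
			model_inc(m);
		return -1;
	}
	return 0;
}
```

Without the rollback, one failed attempt followed by a successful one decrements twice for a single earlier increment. With the rollback, the failed attempt leaves the count where it started.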



* [PATCH 3/4] sched/core: Introduce sched_set_rq_on/offline() helper
  2024-07-03  3:16 [PATCH 0/4] sched/smt: Fix error handling in sched_cpu_deactivate() Yang Yingliang
  2024-07-03  3:16 ` [PATCH 1/4] sched/smt: Introduce sched_smt_present_inc/dec() helper Yang Yingliang
  2024-07-03  3:16 ` [PATCH 2/4] sched/smt: fix unbalance sched_smt_present dec/inc Yang Yingliang
@ 2024-07-03  3:16 ` Yang Yingliang
  2024-07-16 13:22   ` Markus Elfring
  2024-07-29 10:34   ` [tip: sched/core] " tip-bot2 for Yang Yingliang
  2024-07-03  3:16 ` [PATCH 4/4] sched/core: fix unbalance set_rq_online/offline() in sched_cpu_deactivate() Yang Yingliang
  2024-07-16  9:54 ` [PATCH 0/4] sched/smt: Fix error handling " Peter Zijlstra
  4 siblings, 2 replies; 11+ messages in thread
From: Yang Yingliang @ 2024-07-03  3:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, bristot, vschneid, tglx, yu.c.chen,
	tim.c.chen, yangyingliang, liwei391

From: Yang Yingliang <yangyingliang@huawei.com>

Introduce the sched_set_rq_on/offline() helpers so they can be called
easily from both the normal and error paths. No functional change.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
---
 kernel/sched/core.c | 40 ++++++++++++++++++++++++++--------------
 1 file changed, 26 insertions(+), 14 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 5cff01046685..2e114bce517a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -9604,6 +9604,30 @@ void set_rq_offline(struct rq *rq)
 	}
 }
 
+static inline void sched_set_rq_online(struct rq *rq, int cpu)
+{
+	struct rq_flags rf;
+
+	rq_lock_irqsave(rq, &rf);
+	if (rq->rd) {
+		BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
+		set_rq_online(rq);
+	}
+	rq_unlock_irqrestore(rq, &rf);
+}
+
+static inline void sched_set_rq_offline(struct rq *rq, int cpu)
+{
+	struct rq_flags rf;
+
+	rq_lock_irqsave(rq, &rf);
+	if (rq->rd) {
+		BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
+		set_rq_offline(rq);
+	}
+	rq_unlock_irqrestore(rq, &rf);
+}
+
 /*
  * used to mark begin/end of suspend/resume:
  */
@@ -9673,7 +9697,6 @@ static inline void sched_smt_present_dec(int cpu)
 int sched_cpu_activate(unsigned int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
-	struct rq_flags rf;
 
 	/*
 	 * Clear the balance_push callback and prepare to schedule
@@ -9702,12 +9725,7 @@ int sched_cpu_activate(unsigned int cpu)
 	 * 2) At runtime, if cpuset_cpu_active() fails to rebuild the
 	 *    domains.
 	 */
-	rq_lock_irqsave(rq, &rf);
-	if (rq->rd) {
-		BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
-		set_rq_online(rq);
-	}
-	rq_unlock_irqrestore(rq, &rf);
+	sched_set_rq_online(rq, cpu);
 
 	return 0;
 }
@@ -9715,7 +9733,6 @@ int sched_cpu_activate(unsigned int cpu)
 int sched_cpu_deactivate(unsigned int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
-	struct rq_flags rf;
 	int ret;
 
 	/*
@@ -9746,12 +9763,7 @@ int sched_cpu_deactivate(unsigned int cpu)
 	 */
 	synchronize_rcu();
 
-	rq_lock_irqsave(rq, &rf);
-	if (rq->rd) {
-		BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
-		set_rq_offline(rq);
-	}
-	rq_unlock_irqrestore(rq, &rf);
+	sched_set_rq_offline(rq, cpu);
 
 	/*
 	 * When going down, decrement the number of cores with SMT present.
-- 
2.25.1



* [PATCH 4/4] sched/core: fix unbalance set_rq_online/offline() in sched_cpu_deactivate()
  2024-07-03  3:16 [PATCH 0/4] sched/smt: Fix error handling in sched_cpu_deactivate() Yang Yingliang
                   ` (2 preceding siblings ...)
  2024-07-03  3:16 ` [PATCH 3/4] sched/core: Introduce sched_set_rq_on/offline() helper Yang Yingliang
@ 2024-07-03  3:16 ` Yang Yingliang
  2024-07-29 10:34   ` [tip: sched/core] sched/core: Fix " tip-bot2 for Yang Yingliang
  2024-07-16  9:54 ` [PATCH 0/4] sched/smt: Fix error handling " Peter Zijlstra
  4 siblings, 1 reply; 11+ messages in thread
From: Yang Yingliang @ 2024-07-03  3:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, bristot, vschneid, tglx, yu.c.chen,
	tim.c.chen, yangyingliang, liwei391

From: Yang Yingliang <yangyingliang@huawei.com>

If cpuset_cpu_inactive() fails, set_rq_online() needs to be called to
roll back the earlier set_rq_offline().

Fixes: 120455c514f7 ("sched: Fix hotplug vs CPU bandwidth control")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
---
 kernel/sched/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2e114bce517a..01172d8bfe02 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -9781,6 +9781,7 @@ int sched_cpu_deactivate(unsigned int cpu)
 	ret = cpuset_cpu_inactive(cpu);
 	if (ret) {
 		sched_smt_present_inc(cpu);
+		sched_set_rq_online(rq, cpu);
 		balance_push_set(cpu, false);
 		set_cpu_active(cpu, true);
 		sched_update_numa(cpu, true);
-- 
2.25.1



* Re: [PATCH 0/4] sched/smt: Fix error handling in sched_cpu_deactivate()
  2024-07-03  3:16 [PATCH 0/4] sched/smt: Fix error handling in sched_cpu_deactivate() Yang Yingliang
                   ` (3 preceding siblings ...)
  2024-07-03  3:16 ` [PATCH 4/4] sched/core: fix unbalance set_rq_online/offline() in sched_cpu_deactivate() Yang Yingliang
@ 2024-07-16  9:54 ` Peter Zijlstra
  4 siblings, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2024-07-16  9:54 UTC (permalink / raw)
  To: Yang Yingliang
  Cc: linux-kernel, mingo, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
	tglx, yu.c.chen, tim.c.chen, yangyingliang, liwei391

On Wed, Jul 03, 2024 at 11:16:06AM +0800, Yang Yingliang wrote:
> From: Yang Yingliang <yangyingliang@huawei.com>
> 
> The sched_smt_present decrement and set_rq_offline() both happen before
> cpuset_cpu_inactive() is called. If cpuset_cpu_inactive() fails, both
> need to be rolled back.
> 
> Yang Yingliang (4):
>   sched/smt: Introduce sched_smt_present_inc/dec() helper
>   sched/smt: fix unbalance sched_smt_present dec/inc
>   sched/core: Introduce sched_set_rq_on/offline() helper
>   sched/core: fix unbalance set_rq_online/offline() in
>     sched_cpu_deactivate()
> 
>  kernel/sched/core.c | 68 +++++++++++++++++++++++++++++++--------------
>  1 file changed, 47 insertions(+), 21 deletions(-)

Thanks!


* Re: [PATCH 3/4] sched/core: Introduce sched_set_rq_on/offline() helper
  2024-07-03  3:16 ` [PATCH 3/4] sched/core: Introduce sched_set_rq_on/offline() helper Yang Yingliang
@ 2024-07-16 13:22   ` Markus Elfring
  2024-07-29 10:34   ` [tip: sched/core] " tip-bot2 for Yang Yingliang
  1 sibling, 0 replies; 11+ messages in thread
From: Markus Elfring @ 2024-07-16 13:22 UTC (permalink / raw)
  To: Yang Yingliang, kernel-janitors
  Cc: LKML, Ben Segall, Chen Yu, Daniel Bristot de Oliveira,
	Dietmar Eggemann, Ingo Molnar, Juri Lelli, Mel Gorman,
	Peter Zijlstra, Steven Rostedt, Tim Chen, Thomas Gleixner,
	Valentin Schneider, Vincent Guittot, Wei Li

> Introduce the sched_set_rq_on/offline() helpers so they can be called
> easily from both the normal and error paths. No functional change.

Would you like to improve this change description a bit further?


…
> +++ b/kernel/sched/core.c
> @@ -9604,6 +9604,30 @@ void set_rq_offline(struct rq *rq)
> +static inline void sched_set_rq_online(struct rq *rq, int cpu)
> +{
> +	rq_lock_irqsave(rq, &rf);
> +	if (rq->rd) {
> +	}
> +	rq_unlock_irqrestore(rq, &rf);
> +}
…

Under which circumstances would you consider using a statement
like “guard(rq_lock_irqsave)(rq);”?
https://elixir.bootlin.com/linux/v6.10/source/kernel/sched/sched.h#L1741

Regards,
Markus
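
As background on the guard() suggestion above: the kernel's scope-based guards are built on the compiler's cleanup attribute, which runs a function when a variable goes out of scope. The following is a minimal userspace sketch of that idea using a pthread mutex, not the kernel's implementation. `DEMO_GUARD`, `demo_unlock()`, `demo_lock`, `demo_online` and `demo_set_online()` are hypothetical names, and the attribute requires GCC or Clang.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static int demo_online;

/* Cleanup handler: receives a pointer to the guarded variable. */
static void demo_unlock(pthread_mutex_t **lockp)
{
	assert(lockp && *lockp);
	pthread_mutex_unlock(*lockp);
}

/*
 * Take @lock now and release it automatically when the enclosing scope
 * ends, whichever way the scope is left.
 */
#define DEMO_GUARD(lock)						\
	pthread_mutex_t *demo_guard_					\
		__attribute__((cleanup(demo_unlock), unused)) =	\
		(pthread_mutex_lock(lock), (lock))

/* Rough analogue of the helper's shape: lock, check, act, unlock. */
static void demo_set_online(int has_rd)
{
	DEMO_GUARD(&demo_lock);

	if (!has_rd)
		return;		/* lock is dropped here automatically */

	demo_online = 1;
}				/* ... and here */
```

With this pattern the explicit unlock call disappears from the function body, so an early return cannot leave the lock held. Whether that is worth applying to the helpers in this series is a style question for the maintainers.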


* [tip: sched/core] sched/core: Fix unbalance set_rq_online/offline() in sched_cpu_deactivate()
  2024-07-03  3:16 ` [PATCH 4/4] sched/core: fix unbalance set_rq_online/offline() in sched_cpu_deactivate() Yang Yingliang
@ 2024-07-29 10:34   ` tip-bot2 for Yang Yingliang
  0 siblings, 0 replies; 11+ messages in thread
From: tip-bot2 for Yang Yingliang @ 2024-07-29 10:34 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: stable, Yang Yingliang, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     fe7a11c78d2a9bdb8b50afc278a31ac177000948
Gitweb:        https://git.kernel.org/tip/fe7a11c78d2a9bdb8b50afc278a31ac177000948
Author:        Yang Yingliang <yangyingliang@huawei.com>
AuthorDate:    Wed, 03 Jul 2024 11:16:10 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Mon, 29 Jul 2024 12:22:33 +02:00

sched/core: Fix unbalance set_rq_online/offline() in sched_cpu_deactivate()

If cpuset_cpu_inactive() fails, set_rq_online() needs to be called to
roll back the earlier set_rq_offline().

Fixes: 120455c514f7 ("sched: Fix hotplug vs CPU bandwidth control")
Cc: stable@kernel.org
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240703031610.587047-5-yangyingliang@huaweicloud.com
---
 kernel/sched/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4d119e9..f3951e4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8022,6 +8022,7 @@ int sched_cpu_deactivate(unsigned int cpu)
 	ret = cpuset_cpu_inactive(cpu);
 	if (ret) {
 		sched_smt_present_inc(cpu);
+		sched_set_rq_online(rq, cpu);
 		balance_push_set(cpu, false);
 		set_cpu_active(cpu, true);
 		sched_update_numa(cpu, true);


* [tip: sched/core] sched/core: Introduce sched_set_rq_on/offline() helper
  2024-07-03  3:16 ` [PATCH 3/4] sched/core: Introduce sched_set_rq_on/offline() helper Yang Yingliang
  2024-07-16 13:22   ` Markus Elfring
@ 2024-07-29 10:34   ` tip-bot2 for Yang Yingliang
  1 sibling, 0 replies; 11+ messages in thread
From: tip-bot2 for Yang Yingliang @ 2024-07-29 10:34 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: stable, Yang Yingliang, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     2f027354122f58ee846468a6f6b48672fff92e9b
Gitweb:        https://git.kernel.org/tip/2f027354122f58ee846468a6f6b48672fff92e9b
Author:        Yang Yingliang <yangyingliang@huawei.com>
AuthorDate:    Wed, 03 Jul 2024 11:16:09 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Mon, 29 Jul 2024 12:22:32 +02:00

sched/core: Introduce sched_set_rq_on/offline() helper

Introduce the sched_set_rq_on/offline() helpers so they can be called
easily from both the normal and error paths. No functional change.

Cc: stable@kernel.org
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240703031610.587047-4-yangyingliang@huaweicloud.com
---
 kernel/sched/core.c | 40 ++++++++++++++++++++++++++--------------
 1 file changed, 26 insertions(+), 14 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 949473e..4d119e9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7845,6 +7845,30 @@ void set_rq_offline(struct rq *rq)
 	}
 }
 
+static inline void sched_set_rq_online(struct rq *rq, int cpu)
+{
+	struct rq_flags rf;
+
+	rq_lock_irqsave(rq, &rf);
+	if (rq->rd) {
+		BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
+		set_rq_online(rq);
+	}
+	rq_unlock_irqrestore(rq, &rf);
+}
+
+static inline void sched_set_rq_offline(struct rq *rq, int cpu)
+{
+	struct rq_flags rf;
+
+	rq_lock_irqsave(rq, &rf);
+	if (rq->rd) {
+		BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
+		set_rq_offline(rq);
+	}
+	rq_unlock_irqrestore(rq, &rf);
+}
+
 /*
  * used to mark begin/end of suspend/resume:
  */
@@ -7914,7 +7938,6 @@ static inline void sched_smt_present_dec(int cpu)
 int sched_cpu_activate(unsigned int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
-	struct rq_flags rf;
 
 	/*
 	 * Clear the balance_push callback and prepare to schedule
@@ -7943,12 +7966,7 @@ int sched_cpu_activate(unsigned int cpu)
 	 * 2) At runtime, if cpuset_cpu_active() fails to rebuild the
 	 *    domains.
 	 */
-	rq_lock_irqsave(rq, &rf);
-	if (rq->rd) {
-		BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
-		set_rq_online(rq);
-	}
-	rq_unlock_irqrestore(rq, &rf);
+	sched_set_rq_online(rq, cpu);
 
 	return 0;
 }
@@ -7956,7 +7974,6 @@ int sched_cpu_activate(unsigned int cpu)
 int sched_cpu_deactivate(unsigned int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
-	struct rq_flags rf;
 	int ret;
 
 	/*
@@ -7987,12 +8004,7 @@ int sched_cpu_deactivate(unsigned int cpu)
 	 */
 	synchronize_rcu();
 
-	rq_lock_irqsave(rq, &rf);
-	if (rq->rd) {
-		BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
-		set_rq_offline(rq);
-	}
-	rq_unlock_irqrestore(rq, &rf);
+	sched_set_rq_offline(rq, cpu);
 
 	/*
 	 * When going down, decrement the number of cores with SMT present.


* [tip: sched/core] sched/smt: Fix unbalance sched_smt_present dec/inc
  2024-07-03  3:16 ` [PATCH 2/4] sched/smt: fix unbalance sched_smt_present dec/inc Yang Yingliang
@ 2024-07-29 10:34   ` tip-bot2 for Yang Yingliang
  0 siblings, 0 replies; 11+ messages in thread
From: tip-bot2 for Yang Yingliang @ 2024-07-29 10:34 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: stable, Yang Yingliang, Peter Zijlstra (Intel), Chen Yu, Tim Chen,
	x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     e22f910a26cc2a3ac9c66b8e935ef2a7dd881117
Gitweb:        https://git.kernel.org/tip/e22f910a26cc2a3ac9c66b8e935ef2a7dd881117
Author:        Yang Yingliang <yangyingliang@huawei.com>
AuthorDate:    Wed, 03 Jul 2024 11:16:08 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Mon, 29 Jul 2024 12:22:32 +02:00

sched/smt: Fix unbalance sched_smt_present dec/inc

I got the following warning while running a stress test:

jump label: negative count!
WARNING: CPU: 3 PID: 38 at kernel/jump_label.c:263 static_key_slow_try_dec+0x9d/0xb0
Call Trace:
 <TASK>
 __static_key_slow_dec_cpuslocked+0x16/0x70
 sched_cpu_deactivate+0x26e/0x2a0
 cpuhp_invoke_callback+0x3ad/0x10d0
 cpuhp_thread_fun+0x3f5/0x680
 smpboot_thread_fn+0x56d/0x8d0
 kthread+0x309/0x400
 ret_from_fork+0x41/0x70
 ret_from_fork_asm+0x1b/0x30
 </TASK>

When cpuset_cpu_inactive() fails in sched_cpu_deactivate(), the CPU
offline fails, but sched_smt_present has already been decremented
before the call to cpuset_cpu_inactive(). This leads to an unbalanced
dec/inc, so fix it by incrementing sched_smt_present in the error path.

Fixes: c5511d03ec09 ("sched/smt: Make sched_smt_present track topology")
Cc: stable@kernel.org
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Chen Yu <yu.c.chen@intel.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Link: https://lore.kernel.org/r/20240703031610.587047-3-yangyingliang@huaweicloud.com
---
 kernel/sched/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index acc04ed..949473e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8009,6 +8009,7 @@ int sched_cpu_deactivate(unsigned int cpu)
 	sched_update_numa(cpu, false);
 	ret = cpuset_cpu_inactive(cpu);
 	if (ret) {
+		sched_smt_present_inc(cpu);
 		balance_push_set(cpu, false);
 		set_cpu_active(cpu, true);
 		sched_update_numa(cpu, true);


* [tip: sched/core] sched/smt: Introduce sched_smt_present_inc/dec() helper
  2024-07-03  3:16 ` [PATCH 1/4] sched/smt: Introduce sched_smt_present_inc/dec() helper Yang Yingliang
@ 2024-07-29 10:34   ` tip-bot2 for Yang Yingliang
  0 siblings, 0 replies; 11+ messages in thread
From: tip-bot2 for Yang Yingliang @ 2024-07-29 10:34 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: stable, Yang Yingliang, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     31b164e2e4af84d08d2498083676e7eeaa102493
Gitweb:        https://git.kernel.org/tip/31b164e2e4af84d08d2498083676e7eeaa102493
Author:        Yang Yingliang <yangyingliang@huawei.com>
AuthorDate:    Wed, 03 Jul 2024 11:16:07 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Mon, 29 Jul 2024 12:22:32 +02:00

sched/smt: Introduce sched_smt_present_inc/dec() helper

Introduce the sched_smt_present_inc/dec() helpers so they can be called
easily from both the normal and error paths. No functional change.

Cc: stable@kernel.org
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240703031610.587047-2-yangyingliang@huaweicloud.com
---
 kernel/sched/core.c | 26 +++++++++++++++++++-------
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a9f6550..acc04ed 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7895,6 +7895,22 @@ static int cpuset_cpu_inactive(unsigned int cpu)
 	return 0;
 }
 
+static inline void sched_smt_present_inc(int cpu)
+{
+#ifdef CONFIG_SCHED_SMT
+	if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
+		static_branch_inc_cpuslocked(&sched_smt_present);
+#endif
+}
+
+static inline void sched_smt_present_dec(int cpu)
+{
+#ifdef CONFIG_SCHED_SMT
+	if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
+		static_branch_dec_cpuslocked(&sched_smt_present);
+#endif
+}
+
 int sched_cpu_activate(unsigned int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
@@ -7906,13 +7922,10 @@ int sched_cpu_activate(unsigned int cpu)
 	 */
 	balance_push_set(cpu, false);
 
-#ifdef CONFIG_SCHED_SMT
 	/*
 	 * When going up, increment the number of cores with SMT present.
 	 */
-	if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
-		static_branch_inc_cpuslocked(&sched_smt_present);
-#endif
+	sched_smt_present_inc(cpu);
 	set_cpu_active(cpu, true);
 
 	if (sched_smp_initialized) {
@@ -7981,13 +7994,12 @@ int sched_cpu_deactivate(unsigned int cpu)
 	}
 	rq_unlock_irqrestore(rq, &rf);
 
-#ifdef CONFIG_SCHED_SMT
 	/*
 	 * When going down, decrement the number of cores with SMT present.
 	 */
-	if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
-		static_branch_dec_cpuslocked(&sched_smt_present);
+	sched_smt_present_dec(cpu);
 
+#ifdef CONFIG_SCHED_SMT
 	sched_core_cpu_deactivate(cpu);
 #endif
 

