* [PATCH 0/4] sched/smt: Fix error handling in sched_cpu_deactivate()
@ 2024-07-03 3:16 Yang Yingliang
2024-07-03 3:16 ` [PATCH 1/4] sched/smt: Introduce sched_smt_present_inc/dec() helper Yang Yingliang
` (4 more replies)
0 siblings, 5 replies; 11+ messages in thread
From: Yang Yingliang @ 2024-07-03 3:16 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
rostedt, bsegall, mgorman, bristot, vschneid, tglx, yu.c.chen,
tim.c.chen, yangyingliang, liwei391
From: Yang Yingliang <yangyingliang@huawei.com>
The sched_smt_present decrement and set_rq_offline() are called before
cpuset_cpu_inactive(); if cpuset_cpu_inactive() fails, these two
things need to be rolled back.
Yang Yingliang (4):
sched/smt: Introduce sched_smt_present_inc/dec() helper
sched/smt: fix unbalance sched_smt_present dec/inc
sched/core: Introduce sched_set_rq_on/offline() helper
sched/core: fix unbalance set_rq_online/offline() in
sched_cpu_deactivate()
kernel/sched/core.c | 68 +++++++++++++++++++++++++++++++--------------
1 file changed, 47 insertions(+), 21 deletions(-)
--
2.25.1
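Editor's sketch of the rollback this series describes, written as a small user-space model rather than kernel code. Every name prefixed with toy_ is hypothetical, the three fields only stand in for the state the cover letter mentions, and the error value is chosen arbitrarily. It assumes nothing about the real scheduler beyond what the cover letter states: two steps run before a fallible one, so both must be undone when it fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-ins for scheduler state; none of this is kernel code. */
struct toy_cpu {
	int  smt_present;   /* models the sched_smt_present count */
	bool rq_online;     /* models the runqueue's online state  */
	bool active;        /* models cpu_active()                 */
};

/* Hypothetical fallible step, standing in for cpuset_cpu_inactive(). */
static int toy_cpuset_inactive(bool fail)
{
	return fail ? -EBUSY : 0;   /* error value is arbitrary here */
}

static int toy_deactivate(struct toy_cpu *c, bool fail)
{
	int ret;

	assert(c->active);          /* only an active CPU can go down */

	c->active = false;
	c->rq_online = false;       /* like sched_set_rq_offline()  */
	c->smt_present--;           /* like sched_smt_present_dec() */

	ret = toy_cpuset_inactive(fail);
	if (ret) {
		/* Undo every step taken above before reporting failure. */
		c->smt_present++;       /* like sched_smt_present_inc() */
		c->rq_online = true;    /* like sched_set_rq_online()   */
		c->active = true;
		return ret;
	}
	return 0;
}
```

After a failed toy_deactivate() call the structure compares equal to its state before the call, which is the property patches 2/4 and 4/4 restore for the real function.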
* [PATCH 1/4] sched/smt: Introduce sched_smt_present_inc/dec() helper
2024-07-03 3:16 [PATCH 0/4] sched/smt: Fix error handling in sched_cpu_deactivate() Yang Yingliang
@ 2024-07-03 3:16 ` Yang Yingliang
2024-07-29 10:34 ` [tip: sched/core] " tip-bot2 for Yang Yingliang
2024-07-03 3:16 ` [PATCH 2/4] sched/smt: fix unbalance sched_smt_present dec/inc Yang Yingliang
` (3 subsequent siblings)
4 siblings, 1 reply; 11+ messages in thread
From: Yang Yingliang @ 2024-07-03 3:16 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
rostedt, bsegall, mgorman, bristot, vschneid, tglx, yu.c.chen,
tim.c.chen, yangyingliang, liwei391
From: Yang Yingliang <yangyingliang@huawei.com>
Introduce the sched_smt_present_inc/dec() helpers so they can be called
simply in both the normal and error paths. No functional change.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
---
kernel/sched/core.c | 26 +++++++++++++++++++-------
1 file changed, 19 insertions(+), 7 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index bcf2c4cc0522..880c4c03ef8a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -9654,6 +9654,22 @@ static int cpuset_cpu_inactive(unsigned int cpu)
return 0;
}
+static inline void sched_smt_present_inc(int cpu)
+{
+#ifdef CONFIG_SCHED_SMT
+ if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
+ static_branch_inc_cpuslocked(&sched_smt_present);
+#endif
+}
+
+static inline void sched_smt_present_dec(int cpu)
+{
+#ifdef CONFIG_SCHED_SMT
+ if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
+ static_branch_dec_cpuslocked(&sched_smt_present);
+#endif
+}
+
int sched_cpu_activate(unsigned int cpu)
{
struct rq *rq = cpu_rq(cpu);
@@ -9665,13 +9681,10 @@ int sched_cpu_activate(unsigned int cpu)
*/
balance_push_set(cpu, false);
-#ifdef CONFIG_SCHED_SMT
/*
* When going up, increment the number of cores with SMT present.
*/
- if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
- static_branch_inc_cpuslocked(&sched_smt_present);
-#endif
+ sched_smt_present_inc(cpu);
set_cpu_active(cpu, true);
if (sched_smp_initialized) {
@@ -9740,13 +9753,12 @@ int sched_cpu_deactivate(unsigned int cpu)
}
rq_unlock_irqrestore(rq, &rf);
-#ifdef CONFIG_SCHED_SMT
/*
* When going down, decrement the number of cores with SMT present.
*/
- if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
- static_branch_dec_cpuslocked(&sched_smt_present);
+ sched_smt_present_dec(cpu);
+#ifdef CONFIG_SCHED_SMT
sched_core_cpu_deactivate(cpu);
#endif
--
2.25.1
* [PATCH 2/4] sched/smt: fix unbalance sched_smt_present dec/inc
2024-07-03 3:16 [PATCH 0/4] sched/smt: Fix error handling in sched_cpu_deactivate() Yang Yingliang
2024-07-03 3:16 ` [PATCH 1/4] sched/smt: Introduce sched_smt_present_inc/dec() helper Yang Yingliang
@ 2024-07-03 3:16 ` Yang Yingliang
2024-07-29 10:34 ` [tip: sched/core] sched/smt: Fix " tip-bot2 for Yang Yingliang
2024-07-03 3:16 ` [PATCH 3/4] sched/core: Introduce sched_set_rq_on/offline() helper Yang Yingliang
` (2 subsequent siblings)
4 siblings, 1 reply; 11+ messages in thread
From: Yang Yingliang @ 2024-07-03 3:16 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
rostedt, bsegall, mgorman, bristot, vschneid, tglx, yu.c.chen,
tim.c.chen, yangyingliang, liwei391
From: Yang Yingliang <yangyingliang@huawei.com>
I got the following warning while running a stress test:
jump label: negative count!
WARNING: CPU: 3 PID: 38 at kernel/jump_label.c:263 static_key_slow_try_dec+0x9d/0xb0
Call Trace:
<TASK>
__static_key_slow_dec_cpuslocked+0x16/0x70
sched_cpu_deactivate+0x26e/0x2a0
cpuhp_invoke_callback+0x3ad/0x10d0
cpuhp_thread_fun+0x3f5/0x680
smpboot_thread_fn+0x56d/0x8d0
kthread+0x309/0x400
ret_from_fork+0x41/0x70
ret_from_fork_asm+0x1b/0x30
</TASK>
This happens because, when cpuset_cpu_inactive() fails in
sched_cpu_deactivate(), the CPU offline fails, but sched_smt_present has
already been decremented before cpuset_cpu_inactive() is called. That
leads to an unbalanced dec/inc, so fix it by incrementing
sched_smt_present in the error path.
Fixes: c5511d03ec09 ("sched/smt: Make sched_smt_present track topology")
Reviewed-by: Chen Yu <yu.c.chen@intel.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
---
kernel/sched/core.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 880c4c03ef8a..5cff01046685 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -9768,6 +9768,7 @@ int sched_cpu_deactivate(unsigned int cpu)
sched_update_numa(cpu, false);
ret = cpuset_cpu_inactive(cpu);
if (ret) {
+ sched_smt_present_inc(cpu);
balance_push_set(cpu, false);
set_cpu_active(cpu, true);
sched_update_numa(cpu, true);
--
2.25.1
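Editor's sketch of how the imbalance can turn into the "negative count" warning, as a user-space toy. The toy_key type and its functions are hypothetical; the check in toy_key_dec() is a deliberate simplification and does not reproduce the exact condition tested in kernel/jump_label.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy counter standing in for the sched_smt_present static key. */
struct toy_key {
	int  count;
	bool warned;    /* set where the kernel would complain */
};

static void toy_key_inc(struct toy_key *k)
{
	k->count++;
}

static void toy_key_dec(struct toy_key *k)
{
	if (k->count <= 0) {    /* nothing left to drop: flag it */
		k->warned = true;
		return;
	}
	k->count--;
}

/*
 * One failed offline attempt: the decrement has already happened when
 * the later step fails. 'rollback' selects the behaviour with the fix.
 */
static void toy_failed_offline(struct toy_key *k, bool rollback)
{
	toy_key_dec(k);
	/* ... the step modelling cpuset_cpu_inactive() fails here ... */
	if (rollback)
		toy_key_inc(k);
}
```

Starting from a count of 1, two calls to toy_failed_offline() without rollback set the warned flag on the second call, because the CPU never actually went offline and nothing re-incremented the counter. With rollback enabled the count returns to 1 after every failed attempt.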
* [PATCH 3/4] sched/core: Introduce sched_set_rq_on/offline() helper
2024-07-03 3:16 [PATCH 0/4] sched/smt: Fix error handling in sched_cpu_deactivate() Yang Yingliang
2024-07-03 3:16 ` [PATCH 1/4] sched/smt: Introduce sched_smt_present_inc/dec() helper Yang Yingliang
2024-07-03 3:16 ` [PATCH 2/4] sched/smt: fix unbalance sched_smt_present dec/inc Yang Yingliang
@ 2024-07-03 3:16 ` Yang Yingliang
2024-07-16 13:22 ` Markus Elfring
2024-07-29 10:34 ` [tip: sched/core] " tip-bot2 for Yang Yingliang
2024-07-03 3:16 ` [PATCH 4/4] sched/core: fix unbalance set_rq_online/offline() in sched_cpu_deactivate() Yang Yingliang
2024-07-16 9:54 ` [PATCH 0/4] sched/smt: Fix error handling " Peter Zijlstra
4 siblings, 2 replies; 11+ messages in thread
From: Yang Yingliang @ 2024-07-03 3:16 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
rostedt, bsegall, mgorman, bristot, vschneid, tglx, yu.c.chen,
tim.c.chen, yangyingliang, liwei391
From: Yang Yingliang <yangyingliang@huawei.com>
Introduce the sched_set_rq_on/offline() helpers so they can be called
simply in both the normal and error paths. No functional change.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
---
kernel/sched/core.c | 40 ++++++++++++++++++++++++++--------------
1 file changed, 26 insertions(+), 14 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 5cff01046685..2e114bce517a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -9604,6 +9604,30 @@ void set_rq_offline(struct rq *rq)
}
}
+static inline void sched_set_rq_online(struct rq *rq, int cpu)
+{
+ struct rq_flags rf;
+
+ rq_lock_irqsave(rq, &rf);
+ if (rq->rd) {
+ BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
+ set_rq_online(rq);
+ }
+ rq_unlock_irqrestore(rq, &rf);
+}
+
+static inline void sched_set_rq_offline(struct rq *rq, int cpu)
+{
+ struct rq_flags rf;
+
+ rq_lock_irqsave(rq, &rf);
+ if (rq->rd) {
+ BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
+ set_rq_offline(rq);
+ }
+ rq_unlock_irqrestore(rq, &rf);
+}
+
/*
* used to mark begin/end of suspend/resume:
*/
@@ -9673,7 +9697,6 @@ static inline void sched_smt_present_dec(int cpu)
int sched_cpu_activate(unsigned int cpu)
{
struct rq *rq = cpu_rq(cpu);
- struct rq_flags rf;
/*
* Clear the balance_push callback and prepare to schedule
@@ -9702,12 +9725,7 @@ int sched_cpu_activate(unsigned int cpu)
* 2) At runtime, if cpuset_cpu_active() fails to rebuild the
* domains.
*/
- rq_lock_irqsave(rq, &rf);
- if (rq->rd) {
- BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
- set_rq_online(rq);
- }
- rq_unlock_irqrestore(rq, &rf);
+ sched_set_rq_online(rq, cpu);
return 0;
}
@@ -9715,7 +9733,6 @@ int sched_cpu_activate(unsigned int cpu)
int sched_cpu_deactivate(unsigned int cpu)
{
struct rq *rq = cpu_rq(cpu);
- struct rq_flags rf;
int ret;
/*
@@ -9746,12 +9763,7 @@ int sched_cpu_deactivate(unsigned int cpu)
*/
synchronize_rcu();
- rq_lock_irqsave(rq, &rf);
- if (rq->rd) {
- BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
- set_rq_offline(rq);
- }
- rq_unlock_irqrestore(rq, &rf);
+ sched_set_rq_offline(rq, cpu);
/*
* When going down, decrement the number of cores with SMT present.
--
2.25.1
* [PATCH 4/4] sched/core: fix unbalance set_rq_online/offline() in sched_cpu_deactivate()
2024-07-03 3:16 [PATCH 0/4] sched/smt: Fix error handling in sched_cpu_deactivate() Yang Yingliang
` (2 preceding siblings ...)
2024-07-03 3:16 ` [PATCH 3/4] sched/core: Introduce sched_set_rq_on/offline() helper Yang Yingliang
@ 2024-07-03 3:16 ` Yang Yingliang
2024-07-29 10:34 ` [tip: sched/core] sched/core: Fix " tip-bot2 for Yang Yingliang
2024-07-16 9:54 ` [PATCH 0/4] sched/smt: Fix error handling " Peter Zijlstra
4 siblings, 1 reply; 11+ messages in thread
From: Yang Yingliang @ 2024-07-03 3:16 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
rostedt, bsegall, mgorman, bristot, vschneid, tglx, yu.c.chen,
tim.c.chen, yangyingliang, liwei391
From: Yang Yingliang <yangyingliang@huawei.com>
If cpuset_cpu_inactive() fails, set_rq_online() needs to be called to roll back.
Fixes: 120455c514f7 ("sched: Fix hotplug vs CPU bandwidth control")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
---
kernel/sched/core.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2e114bce517a..01172d8bfe02 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -9781,6 +9781,7 @@ int sched_cpu_deactivate(unsigned int cpu)
ret = cpuset_cpu_inactive(cpu);
if (ret) {
sched_smt_present_inc(cpu);
+ sched_set_rq_online(rq, cpu);
balance_push_set(cpu, false);
set_cpu_active(cpu, true);
sched_update_numa(cpu, true);
--
2.25.1
* Re: [PATCH 0/4] sched/smt: Fix error handling in sched_cpu_deactivate()
2024-07-03 3:16 [PATCH 0/4] sched/smt: Fix error handling in sched_cpu_deactivate() Yang Yingliang
` (3 preceding siblings ...)
2024-07-03 3:16 ` [PATCH 4/4] sched/core: fix unbalance set_rq_online/offline() in sched_cpu_deactivate() Yang Yingliang
@ 2024-07-16 9:54 ` Peter Zijlstra
4 siblings, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2024-07-16 9:54 UTC (permalink / raw)
To: Yang Yingliang
Cc: linux-kernel, mingo, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
tglx, yu.c.chen, tim.c.chen, yangyingliang, liwei391
On Wed, Jul 03, 2024 at 11:16:06AM +0800, Yang Yingliang wrote:
> From: Yang Yingliang <yangyingliang@huawei.com>
>
> The sched_smt_present decrement and set_rq_offline() are called before
> cpuset_cpu_inactive(); if cpuset_cpu_inactive() fails, these two
> things need to be rolled back.
>
> Yang Yingliang (4):
> sched/smt: Introduce sched_smt_present_inc/dec() helper
> sched/smt: fix unbalance sched_smt_present dec/inc
> sched/core: Introduce sched_set_rq_on/offline() helper
> sched/core: fix unbalance set_rq_online/offline() in
> sched_cpu_deactivate()
>
> kernel/sched/core.c | 68 +++++++++++++++++++++++++++++++--------------
> 1 file changed, 47 insertions(+), 21 deletions(-)
Thanks!
* Re: [PATCH 3/4] sched/core: Introduce sched_set_rq_on/offline() helper
2024-07-03 3:16 ` [PATCH 3/4] sched/core: Introduce sched_set_rq_on/offline() helper Yang Yingliang
@ 2024-07-16 13:22 ` Markus Elfring
2024-07-29 10:34 ` [tip: sched/core] " tip-bot2 for Yang Yingliang
1 sibling, 0 replies; 11+ messages in thread
From: Markus Elfring @ 2024-07-16 13:22 UTC (permalink / raw)
To: Yang Yingliang, kernel-janitors
Cc: LKML, Ben Segall, Chen Yu, Daniel Bristot de Oliveira,
Dietmar Eggemann, Ingo Molnar, Juri Lelli, Mel Gorman,
Peter Zijlstra, Steven Rostedt, Tim Chen, Thomas Gleixner,
Valentin Schneider, Vincent Guittot, Wei Li
> Introduce the sched_set_rq_on/offline() helpers so they can be called
> simply in both the normal and error paths. No functional change.
Would you like to improve such a change description another bit?
…
> +++ b/kernel/sched/core.c
> @@ -9604,6 +9604,30 @@ void set_rq_offline(struct rq *rq)
…
> +static inline void sched_set_rq_online(struct rq *rq, int cpu)
> +{
…
> + rq_lock_irqsave(rq, &rf);
> + if (rq->rd) {
…
> + }
> + rq_unlock_irqrestore(rq, &rf);
> +}
…
Under which circumstances would you become interested to apply a statement
like “guard(rq_lock_irqsave)(rq);”?
https://elixir.bootlin.com/linux/v6.10/source/kernel/sched/sched.h#L1741
Regards,
Markus
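Editor's sketch of the scope-based locking idea raised in this reply, using the GCC/Clang cleanup variable attribute in user space. TOY_GUARD and the toy_lock functions are hypothetical and only illustrate the general shape of a guard; they are not the kernel's guard() macro and the lock is a plain flag, not a spinlock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock; a stand-in for the runqueue lock, not a real spinlock. */
struct toy_lock {
	bool held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
}

/* Cleanup handlers receive a pointer to the annotated variable. */
static void toy_lock_release(struct toy_lock **lp)
{
	(*lp)->held = false;
}

/* Hypothetical macro: take the lock now, drop it when the scope ends. */
#define TOY_GUARD(l)                                                     \
	struct toy_lock *toy_guard_                                      \
		__attribute__((cleanup(toy_lock_release), unused)) =     \
		(toy_lock_acquire(l), (l))

/* Returns whether the lock was held while the body ran. */
static bool toy_set_online(struct toy_lock *l, bool have_rd, bool *online)
{
	TOY_GUARD(l);

	if (!have_rd)
		return l->held;   /* early return still releases the lock */
	*online = true;
	return l->held;
}
```

The return expression is evaluated before the cleanup handler runs, so toy_set_online() reports true while the caller observes the lock released afterwards on both paths. The trade-off against the explicit lock/unlock pair in the patch is mostly one of style: no unlock call can be forgotten on an early return, at the cost of relying on a compiler extension.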
* [tip: sched/core] sched/core: Fix unbalance set_rq_online/offline() in sched_cpu_deactivate()
2024-07-03 3:16 ` [PATCH 4/4] sched/core: fix unbalance set_rq_online/offline() in sched_cpu_deactivate() Yang Yingliang
@ 2024-07-29 10:34 ` tip-bot2 for Yang Yingliang
0 siblings, 0 replies; 11+ messages in thread
From: tip-bot2 for Yang Yingliang @ 2024-07-29 10:34 UTC (permalink / raw)
To: linux-tip-commits
Cc: stable, Yang Yingliang, Peter Zijlstra (Intel), x86, linux-kernel
The following commit has been merged into the sched/core branch of tip:
Commit-ID: fe7a11c78d2a9bdb8b50afc278a31ac177000948
Gitweb: https://git.kernel.org/tip/fe7a11c78d2a9bdb8b50afc278a31ac177000948
Author: Yang Yingliang <yangyingliang@huawei.com>
AuthorDate: Wed, 03 Jul 2024 11:16:10 +08:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Mon, 29 Jul 2024 12:22:33 +02:00
sched/core: Fix unbalance set_rq_online/offline() in sched_cpu_deactivate()
If cpuset_cpu_inactive() fails, set_rq_online() needs to be called to roll back.
Fixes: 120455c514f7 ("sched: Fix hotplug vs CPU bandwidth control")
Cc: stable@kernel.org
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240703031610.587047-5-yangyingliang@huaweicloud.com
---
kernel/sched/core.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4d119e9..f3951e4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8022,6 +8022,7 @@ int sched_cpu_deactivate(unsigned int cpu)
ret = cpuset_cpu_inactive(cpu);
if (ret) {
sched_smt_present_inc(cpu);
+ sched_set_rq_online(rq, cpu);
balance_push_set(cpu, false);
set_cpu_active(cpu, true);
sched_update_numa(cpu, true);
* [tip: sched/core] sched/core: Introduce sched_set_rq_on/offline() helper
2024-07-03 3:16 ` [PATCH 3/4] sched/core: Introduce sched_set_rq_on/offline() helper Yang Yingliang
2024-07-16 13:22 ` Markus Elfring
@ 2024-07-29 10:34 ` tip-bot2 for Yang Yingliang
1 sibling, 0 replies; 11+ messages in thread
From: tip-bot2 for Yang Yingliang @ 2024-07-29 10:34 UTC (permalink / raw)
To: linux-tip-commits
Cc: stable, Yang Yingliang, Peter Zijlstra (Intel), x86, linux-kernel
The following commit has been merged into the sched/core branch of tip:
Commit-ID: 2f027354122f58ee846468a6f6b48672fff92e9b
Gitweb: https://git.kernel.org/tip/2f027354122f58ee846468a6f6b48672fff92e9b
Author: Yang Yingliang <yangyingliang@huawei.com>
AuthorDate: Wed, 03 Jul 2024 11:16:09 +08:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Mon, 29 Jul 2024 12:22:32 +02:00
sched/core: Introduce sched_set_rq_on/offline() helper
Introduce the sched_set_rq_on/offline() helpers so they can be called
simply in both the normal and error paths. No functional change.
Cc: stable@kernel.org
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240703031610.587047-4-yangyingliang@huaweicloud.com
---
kernel/sched/core.c | 40 ++++++++++++++++++++++++++--------------
1 file changed, 26 insertions(+), 14 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 949473e..4d119e9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7845,6 +7845,30 @@ void set_rq_offline(struct rq *rq)
}
}
+static inline void sched_set_rq_online(struct rq *rq, int cpu)
+{
+ struct rq_flags rf;
+
+ rq_lock_irqsave(rq, &rf);
+ if (rq->rd) {
+ BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
+ set_rq_online(rq);
+ }
+ rq_unlock_irqrestore(rq, &rf);
+}
+
+static inline void sched_set_rq_offline(struct rq *rq, int cpu)
+{
+ struct rq_flags rf;
+
+ rq_lock_irqsave(rq, &rf);
+ if (rq->rd) {
+ BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
+ set_rq_offline(rq);
+ }
+ rq_unlock_irqrestore(rq, &rf);
+}
+
/*
* used to mark begin/end of suspend/resume:
*/
@@ -7914,7 +7938,6 @@ static inline void sched_smt_present_dec(int cpu)
int sched_cpu_activate(unsigned int cpu)
{
struct rq *rq = cpu_rq(cpu);
- struct rq_flags rf;
/*
* Clear the balance_push callback and prepare to schedule
@@ -7943,12 +7966,7 @@ int sched_cpu_activate(unsigned int cpu)
* 2) At runtime, if cpuset_cpu_active() fails to rebuild the
* domains.
*/
- rq_lock_irqsave(rq, &rf);
- if (rq->rd) {
- BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
- set_rq_online(rq);
- }
- rq_unlock_irqrestore(rq, &rf);
+ sched_set_rq_online(rq, cpu);
return 0;
}
@@ -7956,7 +7974,6 @@ int sched_cpu_activate(unsigned int cpu)
int sched_cpu_deactivate(unsigned int cpu)
{
struct rq *rq = cpu_rq(cpu);
- struct rq_flags rf;
int ret;
/*
@@ -7987,12 +8004,7 @@ int sched_cpu_deactivate(unsigned int cpu)
*/
synchronize_rcu();
- rq_lock_irqsave(rq, &rf);
- if (rq->rd) {
- BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
- set_rq_offline(rq);
- }
- rq_unlock_irqrestore(rq, &rf);
+ sched_set_rq_offline(rq, cpu);
/*
* When going down, decrement the number of cores with SMT present.
* [tip: sched/core] sched/smt: Fix unbalance sched_smt_present dec/inc
2024-07-03 3:16 ` [PATCH 2/4] sched/smt: fix unbalance sched_smt_present dec/inc Yang Yingliang
@ 2024-07-29 10:34 ` tip-bot2 for Yang Yingliang
0 siblings, 0 replies; 11+ messages in thread
From: tip-bot2 for Yang Yingliang @ 2024-07-29 10:34 UTC (permalink / raw)
To: linux-tip-commits
Cc: stable, Yang Yingliang, Peter Zijlstra (Intel), Chen Yu, Tim Chen,
x86, linux-kernel
The following commit has been merged into the sched/core branch of tip:
Commit-ID: e22f910a26cc2a3ac9c66b8e935ef2a7dd881117
Gitweb: https://git.kernel.org/tip/e22f910a26cc2a3ac9c66b8e935ef2a7dd881117
Author: Yang Yingliang <yangyingliang@huawei.com>
AuthorDate: Wed, 03 Jul 2024 11:16:08 +08:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Mon, 29 Jul 2024 12:22:32 +02:00
sched/smt: Fix unbalance sched_smt_present dec/inc
I got the following warning while running a stress test:
jump label: negative count!
WARNING: CPU: 3 PID: 38 at kernel/jump_label.c:263 static_key_slow_try_dec+0x9d/0xb0
Call Trace:
<TASK>
__static_key_slow_dec_cpuslocked+0x16/0x70
sched_cpu_deactivate+0x26e/0x2a0
cpuhp_invoke_callback+0x3ad/0x10d0
cpuhp_thread_fun+0x3f5/0x680
smpboot_thread_fn+0x56d/0x8d0
kthread+0x309/0x400
ret_from_fork+0x41/0x70
ret_from_fork_asm+0x1b/0x30
</TASK>
This happens because, when cpuset_cpu_inactive() fails in
sched_cpu_deactivate(), the CPU offline fails, but sched_smt_present has
already been decremented before cpuset_cpu_inactive() is called. That
leads to an unbalanced dec/inc, so fix it by incrementing
sched_smt_present in the error path.
Fixes: c5511d03ec09 ("sched/smt: Make sched_smt_present track topology")
Cc: stable@kernel.org
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Chen Yu <yu.c.chen@intel.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Link: https://lore.kernel.org/r/20240703031610.587047-3-yangyingliang@huaweicloud.com
---
kernel/sched/core.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index acc04ed..949473e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8009,6 +8009,7 @@ int sched_cpu_deactivate(unsigned int cpu)
sched_update_numa(cpu, false);
ret = cpuset_cpu_inactive(cpu);
if (ret) {
+ sched_smt_present_inc(cpu);
balance_push_set(cpu, false);
set_cpu_active(cpu, true);
sched_update_numa(cpu, true);
* [tip: sched/core] sched/smt: Introduce sched_smt_present_inc/dec() helper
2024-07-03 3:16 ` [PATCH 1/4] sched/smt: Introduce sched_smt_present_inc/dec() helper Yang Yingliang
@ 2024-07-29 10:34 ` tip-bot2 for Yang Yingliang
0 siblings, 0 replies; 11+ messages in thread
From: tip-bot2 for Yang Yingliang @ 2024-07-29 10:34 UTC (permalink / raw)
To: linux-tip-commits
Cc: stable, Yang Yingliang, Peter Zijlstra (Intel), x86, linux-kernel
The following commit has been merged into the sched/core branch of tip:
Commit-ID: 31b164e2e4af84d08d2498083676e7eeaa102493
Gitweb: https://git.kernel.org/tip/31b164e2e4af84d08d2498083676e7eeaa102493
Author: Yang Yingliang <yangyingliang@huawei.com>
AuthorDate: Wed, 03 Jul 2024 11:16:07 +08:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Mon, 29 Jul 2024 12:22:32 +02:00
sched/smt: Introduce sched_smt_present_inc/dec() helper
Introduce the sched_smt_present_inc/dec() helpers so they can be called
simply in both the normal and error paths. No functional change.
Cc: stable@kernel.org
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240703031610.587047-2-yangyingliang@huaweicloud.com
---
kernel/sched/core.c | 26 +++++++++++++++++++-------
1 file changed, 19 insertions(+), 7 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a9f6550..acc04ed 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7895,6 +7895,22 @@ static int cpuset_cpu_inactive(unsigned int cpu)
return 0;
}
+static inline void sched_smt_present_inc(int cpu)
+{
+#ifdef CONFIG_SCHED_SMT
+ if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
+ static_branch_inc_cpuslocked(&sched_smt_present);
+#endif
+}
+
+static inline void sched_smt_present_dec(int cpu)
+{
+#ifdef CONFIG_SCHED_SMT
+ if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
+ static_branch_dec_cpuslocked(&sched_smt_present);
+#endif
+}
+
int sched_cpu_activate(unsigned int cpu)
{
struct rq *rq = cpu_rq(cpu);
@@ -7906,13 +7922,10 @@ int sched_cpu_activate(unsigned int cpu)
*/
balance_push_set(cpu, false);
-#ifdef CONFIG_SCHED_SMT
/*
* When going up, increment the number of cores with SMT present.
*/
- if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
- static_branch_inc_cpuslocked(&sched_smt_present);
-#endif
+ sched_smt_present_inc(cpu);
set_cpu_active(cpu, true);
if (sched_smp_initialized) {
@@ -7981,13 +7994,12 @@ int sched_cpu_deactivate(unsigned int cpu)
}
rq_unlock_irqrestore(rq, &rf);
-#ifdef CONFIG_SCHED_SMT
/*
* When going down, decrement the number of cores with SMT present.
*/
- if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
- static_branch_dec_cpuslocked(&sched_smt_present);
+ sched_smt_present_dec(cpu);
+#ifdef CONFIG_SCHED_SMT
sched_core_cpu_deactivate(cpu);
#endif