linux-kernel.vger.kernel.org archive mirror
* [PATCH v2] sched/core: Prioritize migrating eligible tasks in sched_balance_rq()
@ 2024-12-23  9:14 Hao Jia
  2024-12-23 20:50 ` Markus Elfring
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Hao Jia @ 2024-12-23  9:14 UTC (permalink / raw)
  To: mingo, peterz, mingo, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, vschneid
  Cc: linux-kernel, Hao Jia

From: Hao Jia <jiahao1@lixiang.com>

When the PLACE_LAG scheduling feature is enabled and
dst_cfs_rq->nr_queued is greater than 1, a task that is
ineligible (lag < 0) on the source CPU runqueue will also
be ineligible after it is migrated to the destination CPU
runqueue, because place_entity() keeps the task's original
equivalent lag.

So in sched_balance_rq(), we prioritize migrating eligible
tasks, and we soft-limit ineligible tasks, allowing them
to migrate only when nr_balance_failed is non-zero to
avoid load-balancing trying very hard to balance the load.
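
For illustration only (not part of the patch), the soft limit
described above can be modelled in user space. This is a minimal
sketch: the toy_* names and struct fields are hypothetical
stand-ins for kernel state, and only the new check is modelled,
not the other tests in can_migrate_task().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for kernel state; not kernel code. */
struct toy_task {
	long vlag;			/* lag on the source runqueue; < 0 means ineligible */
};

struct toy_env {
	bool place_lag;			/* models sched_feat(PLACE_LAG) */
	unsigned int dst_nr_queued;	/* models dst_cfs_rq->nr_queued */
	unsigned int nr_balance_failed;	/* models env->sd->nr_balance_failed */
};

/* With lag preserved on a busy destination, an ineligible task stays ineligible. */
static bool toy_ineligible_on_dst(const struct toy_task *p,
				  const struct toy_env *env)
{
	return env->place_lag && env->dst_nr_queued && p->vlag < 0;
}

/* Soft limit: skip ineligible tasks until a previous balance attempt has failed. */
static bool toy_can_migrate(const struct toy_task *p, const struct toy_env *env)
{
	if (!env->nr_balance_failed && toy_ineligible_on_dst(p, env))
		return false;
	return true;
}

/* Small self-check showing the three cases discussed in the text. */
static void toy_self_check(void)
{
	struct toy_task ineligible = { .vlag = -5 };
	struct toy_task eligible = { .vlag = 5 };
	struct toy_env env = {
		.place_lag = true, .dst_nr_queued = 2, .nr_balance_failed = 0,
	};

	assert(!toy_can_migrate(&ineligible, &env));	/* skipped on first pass */
	assert(toy_can_migrate(&eligible, &env));	/* eligible tasks go first */

	env.nr_balance_failed = 1;
	assert(toy_can_migrate(&ineligible, &env));	/* allowed after a failure */
}
```

In this model an ineligible task is passed over only while
nr_balance_failed is zero and the destination already has queued
tasks; an empty destination is not restricted.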

Below are some benchmark test results. From my test results,
this patch shows a slight improvement on hackbench.

Benchmark
=========

All of the benchmarks are done inside a normal cpu cgroup in a
clean environment with cpu turbo disabled, and test machine is:

Single NUMA machine model is 13th Gen Intel(R) Core(TM)
i7-13700, 12 Core/24 HT.

Based on master b86545e02e8c.

Results
=======

hackbench-process-pipes
                      vanilla                  patched
Amean     1       0.5837 (   0.00%)      0.5733 (   1.77%)
Amean     4       1.4423 (   0.00%)      1.4503 (  -0.55%)
Amean     7       2.5147 (   0.00%)      2.4773 (   1.48%)
Amean     12      3.9347 (   0.00%)      3.8880 (   1.19%)
Amean     21      5.3943 (   0.00%)      5.3873 (   0.13%)
Amean     30      6.7840 (   0.00%)      6.6660 (   1.74%)
Amean     48      9.8313 (   0.00%)      9.6100 (   2.25%)
Amean     79     15.4403 (   0.00%)     14.9580 (   3.12%)
Amean     96     18.4970 (   0.00%)     17.9533 (   2.94%)

hackbench-process-sockets
                      vanilla                  patched
Amean     1       0.6297 (   0.00%)      0.6223 (   1.16%)
Amean     4       2.1517 (   0.00%)      2.0887 (   2.93%)
Amean     7       3.6377 (   0.00%)      3.5670 (   1.94%)
Amean     12      6.1277 (   0.00%)      5.9290 (   3.24%)
Amean     21     10.0380 (   0.00%)      9.7623 (   2.75%)
Amean     30     14.1517 (   0.00%)     13.7513 (   2.83%)
Amean     48     24.7253 (   0.00%)     24.2287 (   2.01%)
Amean     79     43.9523 (   0.00%)     43.2330 (   1.64%)
Amean     96     54.5310 (   0.00%)     53.7650 (   1.40%)

tbench4 Throughput
                      vanilla                  patched
Hmean     1       255.97 (   0.00%)      275.01 (   7.44%)
Hmean     2       511.60 (   0.00%)      544.27 (   6.39%)
Hmean     4       996.70 (   0.00%)     1006.57 (   0.99%)
Hmean     8      1646.46 (   0.00%)     1649.15 (   0.16%)
Hmean     16     2259.42 (   0.00%)     2274.35 (   0.66%)
Hmean     32     4725.48 (   0.00%)     4735.57 (   0.21%)
Hmean     64     4411.47 (   0.00%)     4400.05 (  -0.26%)
Hmean     96     4284.31 (   0.00%)     4267.39 (  -0.39%)

Signed-off-by: Hao Jia <jiahao1@lixiang.com>
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
Previous discussion link: https://lore.kernel.org/all/20241128084858.25220-1-jiahao.kernel@gmail.com
Link to v1: https://lore.kernel.org/all/20241218080203.80556-1-jiahao.kernel@gmail.com

v1 to v2:
 - Modify dst_cfs_rq->nr_running to dst_cfs_rq->nr_queued to
   resolve conflicts with commit 736c55a02c47 ("sched/fair:
   Rename cfs_rq.nr_running into nr_queued").

 kernel/sched/fair.c | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5599b0c1ba9b..c884bf631e66 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9396,6 +9396,30 @@ static inline int migrate_degrades_locality(struct task_struct *p,
 }
 #endif
 
+/*
+ * Check whether the task is ineligible on the destination cpu
+ *
+ * When the PLACE_LAG scheduling feature is enabled and
+ * dst_cfs_rq->nr_queued is greater than 1, if the task
+ * is ineligible, it will also be ineligible when
+ * it is migrated to the destination cpu.
+ */
+static inline int task_is_ineligible_on_dst_cpu(struct task_struct *p, int dest_cpu)
+{
+	struct cfs_rq *dst_cfs_rq;
+
+#ifdef CONFIG_FAIR_GROUP_SCHED
+	dst_cfs_rq = task_group(p)->cfs_rq[dest_cpu];
+#else
+	dst_cfs_rq = &cpu_rq(dest_cpu)->cfs;
+#endif
+	if (sched_feat(PLACE_LAG) && dst_cfs_rq->nr_queued &&
+	    !entity_eligible(task_cfs_rq(p), &p->se))
+		return 1;
+
+	return 0;
+}
+
 /*
  * can_migrate_task - may task p from runqueue rq be migrated to this_cpu?
  */
@@ -9420,6 +9444,16 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
 	if (throttled_lb_pair(task_group(p), env->src_cpu, env->dst_cpu))
 		return 0;
 
+	/*
+	 * We want to prioritize the migration of eligible tasks.
+	 * For ineligible tasks we soft-limit them and only allow
+	 * them to migrate when nr_balance_failed is non-zero to
+	 * avoid load-balancing trying very hard to balance the load.
+	 */
+	if (!env->sd->nr_balance_failed &&
+	    task_is_ineligible_on_dst_cpu(p, env->dst_cpu))
+		return 0;
+
 	/* Disregard percpu kthreads; they are where they need to be. */
 	if (kthread_is_per_cpu(p))
 		return 0;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] sched/core: Prioritize migrating eligible tasks in sched_balance_rq()
  2024-12-23  9:14 [PATCH v2] sched/core: Prioritize migrating eligible tasks in sched_balance_rq() Hao Jia
@ 2024-12-23 20:50 ` Markus Elfring
  2024-12-24  1:53   ` Hao Jia
  2025-01-13  9:21 ` [PATCH v2] " Hao Jia
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Markus Elfring @ 2024-12-23 20:50 UTC (permalink / raw)
  To: Hao Jia, kernel-janitors, Ben Segall, Dietmar Eggemann,
	Ingo Molnar, Juri Lelli, Mel Gorman, Peter Zijlstra,
	Steven Rostedt, Valentin Schneider, Vincent Guittot
  Cc: Hao Jia, LKML, Ingo Molnar

…
> All of the benchmarks are done inside a normal cpu cgroup in a
> clean environment with cpu turbo disabled, and test machine is:
…
                         CPU?

You may occasionally put more than 63 characters into text lines
of such a change description.

Regards,
Markus


* Re: [PATCH v2] sched/core: Prioritize migrating eligible tasks in sched_balance_rq()
  2024-12-23 20:50 ` Markus Elfring
@ 2024-12-24  1:53   ` Hao Jia
  2024-12-24  8:55     ` [v2] " Markus Elfring
  0 siblings, 1 reply; 16+ messages in thread
From: Hao Jia @ 2024-12-24  1:53 UTC (permalink / raw)
  To: Markus Elfring, Hao Jia, kernel-janitors, Ben Segall,
	Dietmar Eggemann, Ingo Molnar, Juri Lelli, Mel Gorman,
	Peter Zijlstra, Steven Rostedt, Valentin Schneider,
	Vincent Guittot
  Cc: LKML, Ingo Molnar



On 2024/12/24 04:50, Markus Elfring wrote:
> …
>> All of the benchmarks are done inside a normal cpu cgroup in a
>> clean environment with cpu turbo disabled, and test machine is:
> …
>                           CPU?
Thanks for your review, will fix it.

> 
> You may occasionally put more than 63 characters into text lines
> of such a change description.

I checked the patch using ./scripts/checkpatch.pl before sending it, and 
found no warnings or errors. The commit log should preferably have less 
than 75 characters per line. If I'm wrong, please correct me, thank you.

Thanks,
Hao



* Re: [v2] sched/core: Prioritize migrating eligible tasks in sched_balance_rq()
  2024-12-24  1:53   ` Hao Jia
@ 2024-12-24  8:55     ` Markus Elfring
  0 siblings, 0 replies; 16+ messages in thread
From: Markus Elfring @ 2024-12-24  8:55 UTC (permalink / raw)
  To: Hao Jia, kernel-janitors, Ben Segall, Dietmar Eggemann,
	Ingo Molnar, Juri Lelli, Mel Gorman, Peter Zijlstra,
	Steven Rostedt, Valentin Schneider, Vincent Guittot
  Cc: Hao Jia, LKML, Ingo Molnar

>> You may occasionally put more than 63 characters into text lines
>> of such a change description.
>
> I checked the patch using ./scripts/checkpatch.pl before sending it, and found no warnings or errors. The commit log should preferably have less than 75 characters per line.

Would the text look nicer if the word wrapping were adjusted a bit more?

Regards,
Markus


* Re: [PATCH v2] sched/core: Prioritize migrating eligible tasks in sched_balance_rq()
  2024-12-23  9:14 [PATCH v2] sched/core: Prioritize migrating eligible tasks in sched_balance_rq() Hao Jia
  2024-12-23 20:50 ` Markus Elfring
@ 2025-01-13  9:21 ` Hao Jia
  2025-01-13 16:40   ` Vincent Guittot
  2025-01-15  9:17 ` [tip: sched/core] " tip-bot2 for Hao Jia
  2025-01-15 13:18 ` [PATCH v2] " Luis Machado
  3 siblings, 1 reply; 16+ messages in thread
From: Hao Jia @ 2025-01-13  9:21 UTC (permalink / raw)
  To: mingo, peterz, mingo, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, vschneid
  Cc: linux-kernel, Hao Jia

Friendly ping...




* Re: [PATCH v2] sched/core: Prioritize migrating eligible tasks in sched_balance_rq()
  2025-01-13  9:21 ` [PATCH v2] " Hao Jia
@ 2025-01-13 16:40   ` Vincent Guittot
  2025-01-14  3:18     ` Hao Jia
  0 siblings, 1 reply; 16+ messages in thread
From: Vincent Guittot @ 2025-01-13 16:40 UTC (permalink / raw)
  To: Hao Jia
  Cc: mingo, peterz, mingo, juri.lelli, dietmar.eggemann, rostedt,
	bsegall, mgorman, vschneid, linux-kernel, Hao Jia

On Mon, 13 Jan 2025 at 10:21, Hao Jia <jiahao.kernel@gmail.com> wrote:
>
> Friendly ping...
>
>
> On 2024/12/23 17:14, Hao Jia wrote:
> > From: Hao Jia <jiahao1@lixiang.com>
> >
> > When the PLACE_LAG scheduling feature is enabled and
> > dst_cfs_rq->nr_queued is greater than 1, a task that is
> > ineligible (lag < 0) on the source CPU runqueue will also
> > be ineligible after it is migrated to the destination CPU
> > runqueue, because place_entity() keeps the task's original
> > equivalent lag.
> >
> > So in sched_balance_rq(), we prioritize migrating eligible
> > tasks, and we soft-limit ineligible tasks, allowing them
> > to migrate only when nr_balance_failed is non-zero to
> > avoid load-balancing trying very hard to balance the load.

Could you explain why you think it's better to prioritize eligible
tasks when balancing and potentially skip a load balance?

I can see an interest for idle and newly_idle load balance, in order to
favor fairness as tasks will become eligible, but I don't see why it
would be helpful if dst already has some runnable tasks. Furthermore,
when a CPU is idle or newly idle, we really want to migrate a task,
even a non-eligible one, instead of possibly skipping this load
balance round. With your patch, we might end up not pulling any task,
incrementing nr_balance_failed and waiting for the next load balance.



* Re: [PATCH v2] sched/core: Prioritize migrating eligible tasks in sched_balance_rq()
  2025-01-13 16:40   ` Vincent Guittot
@ 2025-01-14  3:18     ` Hao Jia
  2025-01-14  8:07       ` Vincent Guittot
  0 siblings, 1 reply; 16+ messages in thread
From: Hao Jia @ 2025-01-14  3:18 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: mingo, peterz, mingo, juri.lelli, dietmar.eggemann, rostedt,
	bsegall, mgorman, vschneid, linux-kernel, Hao Jia



On 2025/1/14 00:40, Vincent Guittot wrote:
> On Mon, 13 Jan 2025 at 10:21, Hao Jia <jiahao.kernel@gmail.com> wrote:
>>
>> Friendly ping...
>>
>>
>> On 2024/12/23 17:14, Hao Jia wrote:
>>> From: Hao Jia <jiahao1@lixiang.com>
>>>
>>> When the PLACE_LAG scheduling feature is enabled and
>>> dst_cfs_rq->nr_queued is greater than 1, a task that is
>>> ineligible (lag < 0) on the source CPU runqueue will also
>>> be ineligible after it is migrated to the destination CPU
>>> runqueue, because place_entity() keeps the task's original
>>> equivalent lag.
>>>
>>> So in sched_balance_rq(), we prioritize migrating eligible
>>> tasks, and we soft-limit ineligible tasks, allowing them
>>> to migrate only when nr_balance_failed is non-zero to
>>> avoid load-balancing trying very hard to balance the load.
> 
> Could you explain why you think it's better to prioritize eligible
> tasks when balancing and potentially skip a load balance?

In place_entity(), we maintain the task's original equivalent lag, so
even if we migrate the task to dst_rq, its eligibility does not change.

When there are multiple tasks on src_rq, and the dst_cpu has some 
runnable tasks, migrating ineligible tasks to dst_rq will not allow them 
to run. Therefore, such task migration is inefficient. We should 
prioritize migrating tasks that can run on dst_rq.

In other words, migrating ineligible tasks is merely moving them to 
another runqueue to wait until they become eligible.
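
As a worked illustration of "maintaining the equivalent lag" (a
simplified user-space model, not the kernel's place_entity(); the
toy_* names are hypothetical): when an entity of weight w joins a
queue of total weight W, the weighted average moves toward it, so the
lag is first scaled by (W + w) / W. After the average is updated, the
entity's lag relative to the new average is the original value again.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy runqueue: total weight and weighted-average vruntime. */
struct toy_rq {
	int64_t load;	/* sum of the weights of queued entities */
	int64_t avg;	/* weighted average vruntime, the zero-lag point */
};

/*
 * Place an entity of weight w so that, once it is counted in the
 * average, its lag (avg - vruntime) equals vlag. Returns its vruntime.
 */
static int64_t toy_place(struct toy_rq *rq, int64_t w, int64_t vlag)
{
	int64_t inflated, vruntime;

	assert(rq->load > 0);	/* an empty queue is a different case */

	/* Adding the entity drags the average toward it, so inflate first. */
	inflated = vlag * (rq->load + w) / rq->load;
	vruntime = rq->avg - inflated;

	/* Fold the new entity into the weighted average. */
	rq->avg = (rq->avg * rq->load + vruntime * w) / (rq->load + w);
	rq->load += w;

	return vruntime;
}

/* Lag relative to the current average; negative means ineligible. */
static int64_t toy_lag(const struct toy_rq *rq, int64_t vruntime)
{
	return rq->avg - vruntime;
}
```

With load = 2048 and avg = 10000, placing a weight-1024 entity with
vlag = -300 gives it vruntime 10450, moves the average to 10150, and
leaves its lag at -300: it was ineligible on the source and it is
still ineligible on the destination.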


> 
> I can see an interest for idle and newly_idle load balance, in order to
> favor fairness as tasks will become eligible, but I don't see why it
> would be helpful if dst already has some runnable tasks. Furthermore,
> when a CPU is idle or newly idle, we really want to migrate a task,
> even a non-eligible one, instead of possibly skipping this load
> balance round. With your patch, we might end up not pulling any task,
> incrementing nr_balance_failed and waiting for the next load balance.
> 

If I understand correctly, when the destination CPU is idle, my patch
does not change the original behavior. It only prevents the migration of
ineligible tasks when dst_cfs_rq->nr_queued is greater than 1.

If I missed something, please correct me.


Thanks,
Hao




* Re: [PATCH v2] sched/core: Prioritize migrating eligible tasks in sched_balance_rq()
  2025-01-14  3:18     ` Hao Jia
@ 2025-01-14  8:07       ` Vincent Guittot
  2025-01-15  8:55         ` Hao Jia
  0 siblings, 1 reply; 16+ messages in thread
From: Vincent Guittot @ 2025-01-14  8:07 UTC (permalink / raw)
  To: Hao Jia
  Cc: mingo, peterz, mingo, juri.lelli, dietmar.eggemann, rostedt,
	bsegall, mgorman, vschneid, linux-kernel, Hao Jia

On Tue, 14 Jan 2025 at 04:18, Hao Jia <jiahao.kernel@gmail.com> wrote:
>
>
>
> On 2025/1/14 00:40, Vincent Guittot wrote:
> > On Mon, 13 Jan 2025 at 10:21, Hao Jia <jiahao.kernel@gmail.com> wrote:
> >>
> >> Friendly ping...
> >>
> >>
> >> On 2024/12/23 17:14, Hao Jia wrote:
> >>> From: Hao Jia <jiahao1@lixiang.com>
> >>>
> >>> When the PLACE_LAG scheduling feature is enabled and
> >>> dst_cfs_rq->nr_queued is greater than 1, a task that is
> >>> ineligible (lag < 0) on the source CPU runqueue will also
> >>> be ineligible after it is migrated to the destination CPU
> >>> runqueue, because place_entity() keeps the task's original
> >>> equivalent lag.
> >>>
> >>> So in sched_balance_rq(), we prioritize migrating eligible
> >>> tasks, and we soft-limit ineligible tasks, allowing them
> >>> to migrate only when nr_balance_failed is non-zero to
> >>> avoid load-balancing trying very hard to balance the load.
> >
> > Could you explain why you think it's better to prioritize eligible
> > tasks when balancing and potentially skip a load balance?
>
> In place_entity(), we maintain the task's original equivalent lag, so
> even if we migrate the task to dst_rq, its eligibility does not change.

Yes, but that doesn't answer the question of why it's better to select
an eligible task over a non-eligible task.

>
> When there are multiple tasks on src_rq, and the dst_cpu has some
> runnable tasks, migrating ineligible tasks to dst_rq will not allow them
> to run. Therefore, such task migration is inefficient. We should

Why is it inefficient? Load balancing is about evenly balancing the
number of tasks or the load between CPUs; nothing says that the newly
migrated task should run immediately.

> prioritize migrating tasks that can run on dst_rq.
>
> In other words, migrating ineligible tasks is merely moving them to
> another runqueue to wait until they become eligible.

But I don't see why that's a problem. Migrating an eligible task might
delay its scheduling because of its deadline relative to other tasks
already eligible on the dst_rq. Eligible and ineligible tasks are all
runnable; they differ only in how much they have already run. In
addition, migrating an eligible task will clear its positive vlag with
DELAY_ZERO, which is unfair IMO.

>
>
> >
> > I can see an interest for idle and newly_idle load balance in order to
> > favor fairness as tasks will become eligible but I don't see why it
> > would be helpful if dst already has some runnable tasks. Furthermore,
> > when a cpu is idle or newly idle, we really want to migrate a task
> > even an non eligible one instead of possibly skipping this load
> > balance round. With your patch, we might end up not pulling any task,
> > increasing the nr_balance_failed and waiting next load balance
> >
>
> If I understand correctly, when the destination CPU is idle, my patch
> does not change the original behavior. it only prevents the migration of
> ineligible tasks when dst_cfs_rq->nr_queued is greater than 1.

It does change the behavior. My concern is that migrating an eligible
task when the dst rq already has runnable tasks doesn't guarantee that
it gives this eligible task any advantage.

On an idle or newly idle CPU, any runnable task that is pulled will
immediately start to run, whether or not it was eligible on the src
CPU. In such a situation, we could consider that selecting an eligible
task, which has had less running time (positive lag) than the others,
is fairer because the eligible task will run immediately. But this
only holds as long as we migrate a single task. If you migrate several
tasks, an eligible task on the src rq could even become ineligible
sooner on the dst CPU than on the src CPU, as it has lost its lag.

>
> If I missed something, please correct me.
>
>
> Thanks,
> Hao
>
>
> >>>
> >>> Below are some benchmark test results. From my test results,
> >>> this patch shows a slight improvement on hackbench.
> >>>
> >>> Benchmark
> >>> =========
> >>>
> >>> All of the benchmarks are done inside a normal cpu cgroup in a
> >>> clean environment with cpu turbo disabled, and test machine is:
> >>>
> >>> Single NUMA machine model is 13th Gen Intel(R) Core(TM)
> >>> i7-13700, 12 Core/24 HT.
> >>>
> >>> Based on master b86545e02e8c.
> >>>
> >>> Results
> >>> =======
> >>>
> >>> hackbench-process-pipes
> >>>                         vanilla                  patched
> >>> Amean     1       0.5837 (   0.00%)      0.5733 (   1.77%)
> >>> Amean     4       1.4423 (   0.00%)      1.4503 (  -0.55%)
> >>> Amean     7       2.5147 (   0.00%)      2.4773 (   1.48%)
> >>> Amean     12      3.9347 (   0.00%)      3.8880 (   1.19%)
> >>> Amean     21      5.3943 (   0.00%)      5.3873 (   0.13%)
> >>> Amean     30      6.7840 (   0.00%)      6.6660 (   1.74%)
> >>> Amean     48      9.8313 (   0.00%)      9.6100 (   2.25%)
> >>> Amean     79     15.4403 (   0.00%)     14.9580 (   3.12%)
> >>> Amean     96     18.4970 (   0.00%)     17.9533 (   2.94%)
> >>>
> >>> hackbench-process-sockets
> >>>                         vanilla                  patched
> >>> Amean     1       0.6297 (   0.00%)      0.6223 (   1.16%)
> >>> Amean     4       2.1517 (   0.00%)      2.0887 (   2.93%)
> >>> Amean     7       3.6377 (   0.00%)      3.5670 (   1.94%)
> >>> Amean     12      6.1277 (   0.00%)      5.9290 (   3.24%)
> >>> Amean     21     10.0380 (   0.00%)      9.7623 (   2.75%)
> >>> Amean     30     14.1517 (   0.00%)     13.7513 (   2.83%)
> >>> Amean     48     24.7253 (   0.00%)     24.2287 (   2.01%)
> >>> Amean     79     43.9523 (   0.00%)     43.2330 (   1.64%)
> >>> Amean     96     54.5310 (   0.00%)     53.7650 (   1.40%)
> >>>
> >>> tbench4 Throughput
> >>>                         vanilla                  patched
> >>> Hmean     1       255.97 (   0.00%)      275.01 (   7.44%)
> >>> Hmean     2       511.60 (   0.00%)      544.27 (   6.39%)
> >>> Hmean     4       996.70 (   0.00%)     1006.57 (   0.99%)
> >>> Hmean     8      1646.46 (   0.00%)     1649.15 (   0.16%)
> >>> Hmean     16     2259.42 (   0.00%)     2274.35 (   0.66%)
> >>> Hmean     32     4725.48 (   0.00%)     4735.57 (   0.21%)
> >>> Hmean     64     4411.47 (   0.00%)     4400.05 (  -0.26%)
> >>> Hmean     96     4284.31 (   0.00%)     4267.39 (  -0.39%)
> >>>
> >>> Signed-off-by: Hao Jia <jiahao1@lixiang.com>
> >>> Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> >>> ---
> >>> Previous discussion link: https://lore.kernel.org/all/20241128084858.25220-1-jiahao.kernel@gmail.com
> >>> Link to v1: https://lore.kernel.org/all/20241218080203.80556-1-jiahao.kernel@gmail.com
> >>>
> >>> v1 to v2:
> >>>    - Modify dst_cfs_rq->nr_running to dst_cfs_rq->nr_queued to
> >>>      resolve conflicts with commit 736c55a02c47 ("sched/fair:
> >>>      Rename cfs_rq.nr_running into nr_queued").
> >>>
> >>>    kernel/sched/fair.c | 34 ++++++++++++++++++++++++++++++++++
> >>>    1 file changed, 34 insertions(+)
> >>>
> >>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >>> index 5599b0c1ba9b..c884bf631e66 100644
> >>> --- a/kernel/sched/fair.c
> >>> +++ b/kernel/sched/fair.c
> >>> @@ -9396,6 +9396,30 @@ static inline int migrate_degrades_locality(struct task_struct *p,
> >>>    }
> >>>    #endif
> >>>
> >>> +/*
> >>> + * Check whether the task is ineligible on the destination cpu
> >>> + *
> >>> + * When the PLACE_LAG scheduling feature is enabled and
> >>> + * dst_cfs_rq->nr_queued is greater than 1, if the task
> >>> + * is ineligible, it will also be ineligible when
> >>> + * it is migrated to the destination cpu.
> >>> + */
> >>> +static inline int task_is_ineligible_on_dst_cpu(struct task_struct *p, int dest_cpu)
> >>> +{
> >>> +     struct cfs_rq *dst_cfs_rq;
> >>> +
> >>> +#ifdef CONFIG_FAIR_GROUP_SCHED
> >>> +     dst_cfs_rq = task_group(p)->cfs_rq[dest_cpu];
> >>> +#else
> >>> +     dst_cfs_rq = &cpu_rq(dest_cpu)->cfs;
> >>> +#endif
> >>> +     if (sched_feat(PLACE_LAG) && dst_cfs_rq->nr_queued &&
> >>> +         !entity_eligible(task_cfs_rq(p), &p->se))
> >>> +             return 1;
> >>> +
> >>> +     return 0;
> >>> +}
> >>> +
> >>>    /*
> >>>     * can_migrate_task - may task p from runqueue rq be migrated to this_cpu?
> >>>     */
> >>> @@ -9420,6 +9444,16 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
> >>>        if (throttled_lb_pair(task_group(p), env->src_cpu, env->dst_cpu))
> >>>                return 0;
> >>>
> >>> +     /*
> >>> +      * We want to prioritize the migration of eligible tasks.
> >>> +      * For ineligible tasks we soft-limit them and only allow
> >>> +      * them to migrate when nr_balance_failed is non-zero to
> >>> +      * avoid load-balancing trying very hard to balance the load.
> >>> +      */
> >>> +     if (!env->sd->nr_balance_failed &&
> >>> +         task_is_ineligible_on_dst_cpu(p, env->dst_cpu))
> >>> +             return 0;
> >>> +
> >>>        /* Disregard percpu kthreads; they are where they need to be. */
> >>>        if (kthread_is_per_cpu(p))
> >>>                return 0;


* Re: [PATCH v2] sched/core: Prioritize migrating eligible tasks in sched_balance_rq()
  2025-01-14  8:07       ` Vincent Guittot
@ 2025-01-15  8:55         ` Hao Jia
  2025-01-15  9:28           ` Vincent Guittot
  0 siblings, 1 reply; 16+ messages in thread
From: Hao Jia @ 2025-01-15  8:55 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: mingo, peterz, mingo, juri.lelli, dietmar.eggemann, rostedt,
	bsegall, mgorman, vschneid, linux-kernel, Hao Jia



On 2025/1/14 16:07, Vincent Guittot wrote:
> On Tue, 14 Jan 2025 at 04:18, Hao Jia <jiahao.kernel@gmail.com> wrote:
>>
>>
>>
>> On 2025/1/14 00:40, Vincent Guittot wrote:
>>> On Mon, 13 Jan 2025 at 10:21, Hao Jia <jiahao.kernel@gmail.com> wrote:
>>>>
>>>> Friendly ping...
>>>>
>>>>
>>>> On 2024/12/23 17:14, Hao Jia wrote:
>>>>> From: Hao Jia <jiahao1@lixiang.com>
>>>>>
>>>>> When the PLACE_LAG scheduling feature is enabled and
>>>>> dst_cfs_rq->nr_queued is greater than 1, if a task is
>>>>> ineligible (lag < 0) on the source cpu runqueue, it will
>>>>> also be ineligible when it is migrated to the destination
>>>>> cpu runqueue. Because we will keep the original equivalent
>>>>> lag of the task in place_entity(). So if the task was
>>>>> ineligible before, it will still be ineligible after
>>>>> migration.
>>>>>
>>>>> So in sched_balance_rq(), we prioritize migrating eligible
>>>>> tasks, and we soft-limit ineligible tasks, allowing them
>>>>> to migrate only when nr_balance_failed is non-zero to
>>>>> avoid load-balancing trying very hard to balance the load.
>>>
>>> Could you explain why you think it's better to balance eligible tasks
>>> in priority and potentially skip a load balance ?
>>
>> In place_entity(), we maintain the task's original equivalent lag, even
>> if we migrate the task to dst_rq, this does not change its eligibility
>> attribute.
> 
> Yes, but you don't answer the question why it's better to select an
> eligible task vs a non eligible task.
> 
>>
>> When there are multiple tasks on src_rq, and the dst_cpu has some
>> runnable tasks, migrating ineligible tasks to dst_rq will not allow them
>> to run. Therefore, such task migration is inefficient. We should
> 
> Why is it inefficient ? load balance is about evenly balancing the
> number of tasks or the load between CPUs, it never says that the newly
> migrated task should run immediately


My initial thought was that when we need to migrate some tasks during
load balancing, migrating ineligible tasks to dst_cpu means that, at
that point in time, they definitely cannot run there. Therefore, I
prefer to keep them on src_cpu to avoid the overhead of dequeueing and
enqueueing ineligible tasks.

Migrating eligible tasks to dst_cpu does not guarantee that they will
run earlier than on src_cpu. It depends on too many factors.
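
To illustrate why an ineligible task stays ineligible on a busy dst_cpu,
here is a minimal userspace sketch of lag-preserving placement. It is a
toy model, not kernel code: the toy_* names, the fixed-size arrays and
the integer arithmetic are all assumptions made for the example, and it
only mimics the idea behind place_entity() with PLACE_LAG as described
in the changelog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model of an EEVDF runqueue; not kernel code. */
#define TOY_MAX 8

struct toy_rq {
	int64_t vruntime[TOY_MAX];	/* per-task virtual runtime */
	int64_t weight[TOY_MAX];	/* per-task load weight */
	int nr_queued;			/* number of queued tasks */
};

/* Total weight W of the queue. */
static int64_t toy_load(const struct toy_rq *rq)
{
	int64_t load = 0;

	for (int i = 0; i < rq->nr_queued; i++)
		load += rq->weight[i];
	return load;
}

/* Weighted average vruntime V; 0 for an empty queue. */
static int64_t toy_avg_vruntime(const struct toy_rq *rq)
{
	int64_t sum = 0, load = toy_load(rq);

	for (int i = 0; i < rq->nr_queued; i++)
		sum += rq->vruntime[i] * rq->weight[i];
	return load ? sum / load : 0;
}

/* lag = V - v_i, with the task itself counted in V. */
static int64_t toy_lag(const struct toy_rq *rq, int i)
{
	return toy_avg_vruntime(rq) - rq->vruntime[i];
}

/* A task is eligible when its lag is not negative. */
static bool toy_eligible(const struct toy_rq *rq, int i)
{
	return toy_lag(rq, i) >= 0;
}

/*
 * Enqueue a task that carries 'lag' from its old queue and return its
 * index. On a non-empty queue the lag is inflated by (W + w) / W so
 * that, once the task itself is counted in V, its effective lag is
 * 'lag' again. On an empty queue the lag is dropped.
 */
static int toy_place(struct toy_rq *rq, int64_t lag, int64_t w)
{
	int64_t load = toy_load(rq);
	int64_t v = toy_avg_vruntime(rq);
	int i = rq->nr_queued;

	assert(i < TOY_MAX && w > 0);
	if (load)
		v -= lag * (load + w) / load;
	rq->vruntime[i] = v;
	rq->weight[i] = w;
	rq->nr_queued++;
	return i;
}
```

With three equal-weight tasks at vruntime 100, 200 and 300 on the source
queue, the last one has a lag of -100. Placing it on a queue that holds
two equal-weight tasks at 1000 and 1200 puts it at 1250, the average
becomes 1150, and its lag is -100 again, so it is still ineligible. On
an empty queue the same task ends up with a lag of 0 and is eligible.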



> 
>> prioritize migrating tasks that can run on dst_rq.
>>
>> In other words, migrating ineligible tasks is merely moving them to
>> another runqueue to wait until they become eligible.
> 
> But I don't get why it's a problem. Migrating an eligible task might
> delay its scheduling because of its deadline vs other tasks already
> eligible on the dst_rq. Eligible and non eligible tasks are all
> runnable, it's just how much they have already run. In addition,
> migrating an eligible task will clear its positive vlag with
> DELAY_ZERO which is unfair IMO


Sorry, I'd like to ask about something that confuses me: why would
migrating an eligible task clear its positive vlag?

In detach_task(), the ENQUEUE_DELAYED and DEQUEUE_SLEEP flags are not 
set, and in dequeue_entity(), eligible tasks will not set sched_delayed, 
so they will be dequeued normally with se->on_rq being 0.

Similarly, attach_task() does not set the ENQUEUE_DELAYED and 
DEQUEUE_SLEEP flags, and since se->on_rq is 0, it will not call 
requeue_delayed_entity().


Thanks,
Hao

> 
>>
>>
>>>
>>> I can see an interest for idle and newly_idle load balance in order to
>>> favor fairness as tasks will become eligible but I don't see why it
>>> would be helpful if dst already has some runnable tasks. Furthermore,
>>> when a cpu is idle or newly idle, we really want to migrate a task
>>> even an non eligible one instead of possibly skipping this load
>>> balance round. With your patch, we might end up not pulling any task,
>>> increasing the nr_balance_failed and waiting next load balance
>>>
>>
>> If I understand correctly, when the destination CPU is idle, my patch
>> does not change the original behavior. it only prevents the migration of
>> ineligible tasks when dst_cfs_rq->nr_queued is greater than 1.
> 
> It changes the behavior. My concern is that migrating an eligible task
> when dst rq already has runnable tasks, doesn't assure you that it
> will give any advantage to this eligible task.
> 
> On an idle or newly idle cpu, any runnable tasks that will be pulled
> will immediately start to run whatever it was eligible or not on src
> cpu. In such a situation, we could consider that selecting an eligible
> task which has less running time (positive lag) than others could be
> more fair because the eligible task will immediately run. But this is
> true as long as we migrate only 1 task. If you migrate several tasks,
> an eligible task on src rq could even become ineligible quicker on dst
> cpu than on src cpu has it lost its lag
> 


* [tip: sched/core] sched/core: Prioritize migrating eligible tasks in sched_balance_rq()
  2024-12-23  9:14 [PATCH v2] sched/core: Prioritize migrating eligible tasks in sched_balance_rq() Hao Jia
  2024-12-23 20:50 ` Markus Elfring
  2025-01-13  9:21 ` [PATCH v2] " Hao Jia
@ 2025-01-15  9:17 ` tip-bot2 for Hao Jia
  2025-01-15 13:18 ` [PATCH v2] " Luis Machado
  3 siblings, 0 replies; 16+ messages in thread
From: tip-bot2 for Hao Jia @ 2025-01-15  9:17 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Peter Zijlstra (Intel), Hao Jia, x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     873199d27bb25889ab7ccca03c8f30c67f18ae52
Gitweb:        https://git.kernel.org/tip/873199d27bb25889ab7ccca03c8f30c67f18ae52
Author:        Hao Jia <jiahao1@lixiang.com>
AuthorDate:    Mon, 23 Dec 2024 17:14:46 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Mon, 13 Jan 2025 14:10:23 +01:00

sched/core: Prioritize migrating eligible tasks in sched_balance_rq()

When the PLACE_LAG scheduling feature is enabled and
dst_cfs_rq->nr_queued is greater than 1, if a task is
ineligible (lag < 0) on the source cpu runqueue, it will
also be ineligible when it is migrated to the destination
cpu runqueue. Because we will keep the original equivalent
lag of the task in place_entity(). So if the task was
ineligible before, it will still be ineligible after
migration.

So in sched_balance_rq(), we prioritize migrating eligible
tasks, and we soft-limit ineligible tasks, allowing them
to migrate only when nr_balance_failed is non-zero to
avoid load-balancing trying very hard to balance the load.

Below are some benchmark test results. From my test results,
this patch shows a slight improvement on hackbench.

Benchmark
=========

All of the benchmarks are done inside a normal cpu cgroup in a
clean environment with cpu turbo disabled, and test machine is:

Single NUMA machine model is 13th Gen Intel(R) Core(TM)
i7-13700, 12 Core/24 HT.

Based on master b86545e02e8c.

Results
=======

hackbench-process-pipes
                      vanilla                  patched
Amean     1       0.5837 (   0.00%)      0.5733 (   1.77%)
Amean     4       1.4423 (   0.00%)      1.4503 (  -0.55%)
Amean     7       2.5147 (   0.00%)      2.4773 (   1.48%)
Amean     12      3.9347 (   0.00%)      3.8880 (   1.19%)
Amean     21      5.3943 (   0.00%)      5.3873 (   0.13%)
Amean     30      6.7840 (   0.00%)      6.6660 (   1.74%)
Amean     48      9.8313 (   0.00%)      9.6100 (   2.25%)
Amean     79     15.4403 (   0.00%)     14.9580 (   3.12%)
Amean     96     18.4970 (   0.00%)     17.9533 (   2.94%)

hackbench-process-sockets
                      vanilla                  patched
Amean     1       0.6297 (   0.00%)      0.6223 (   1.16%)
Amean     4       2.1517 (   0.00%)      2.0887 (   2.93%)
Amean     7       3.6377 (   0.00%)      3.5670 (   1.94%)
Amean     12      6.1277 (   0.00%)      5.9290 (   3.24%)
Amean     21     10.0380 (   0.00%)      9.7623 (   2.75%)
Amean     30     14.1517 (   0.00%)     13.7513 (   2.83%)
Amean     48     24.7253 (   0.00%)     24.2287 (   2.01%)
Amean     79     43.9523 (   0.00%)     43.2330 (   1.64%)
Amean     96     54.5310 (   0.00%)     53.7650 (   1.40%)

tbench4 Throughput
                      vanilla                  patched
Hmean     1       255.97 (   0.00%)      275.01 (   7.44%)
Hmean     2       511.60 (   0.00%)      544.27 (   6.39%)
Hmean     4       996.70 (   0.00%)     1006.57 (   0.99%)
Hmean     8      1646.46 (   0.00%)     1649.15 (   0.16%)
Hmean     16     2259.42 (   0.00%)     2274.35 (   0.66%)
Hmean     32     4725.48 (   0.00%)     4735.57 (   0.21%)
Hmean     64     4411.47 (   0.00%)     4400.05 (  -0.26%)
Hmean     96     4284.31 (   0.00%)     4267.39 (  -0.39%)

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Hao Jia <jiahao1@lixiang.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20241223091446.90208-1-jiahao.kernel@gmail.com
---
 kernel/sched/fair.c | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7ec2587..52f7278 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9407,6 +9407,30 @@ static inline long migrate_degrades_locality(struct task_struct *p,
 #endif
 
 /*
+ * Check whether the task is ineligible on the destination cpu
+ *
+ * When the PLACE_LAG scheduling feature is enabled and
+ * dst_cfs_rq->nr_queued is greater than 1, if the task
+ * is ineligible, it will also be ineligible when
+ * it is migrated to the destination cpu.
+ */
+static inline int task_is_ineligible_on_dst_cpu(struct task_struct *p, int dest_cpu)
+{
+	struct cfs_rq *dst_cfs_rq;
+
+#ifdef CONFIG_FAIR_GROUP_SCHED
+	dst_cfs_rq = task_group(p)->cfs_rq[dest_cpu];
+#else
+	dst_cfs_rq = &cpu_rq(dest_cpu)->cfs;
+#endif
+	if (sched_feat(PLACE_LAG) && dst_cfs_rq->nr_queued &&
+	    !entity_eligible(task_cfs_rq(p), &p->se))
+		return 1;
+
+	return 0;
+}
+
+/*
  * can_migrate_task - may task p from runqueue rq be migrated to this_cpu?
  */
 static
@@ -9432,6 +9456,16 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
 	if (throttled_lb_pair(task_group(p), env->src_cpu, env->dst_cpu))
 		return 0;
 
+	/*
+	 * We want to prioritize the migration of eligible tasks.
+	 * For ineligible tasks we soft-limit them and only allow
+	 * them to migrate when nr_balance_failed is non-zero to
+	 * avoid load-balancing trying very hard to balance the load.
+	 */
+	if (!env->sd->nr_balance_failed &&
+	    task_is_ineligible_on_dst_cpu(p, env->dst_cpu))
+		return 0;
+
 	/* Disregard percpu kthreads; they are where they need to be. */
 	if (kthread_is_per_cpu(p))
 		return 0;


* Re: [PATCH v2] sched/core: Prioritize migrating eligible tasks in sched_balance_rq()
  2025-01-15  8:55         ` Hao Jia
@ 2025-01-15  9:28           ` Vincent Guittot
  2025-01-15 11:55             ` Hao Jia
  0 siblings, 1 reply; 16+ messages in thread
From: Vincent Guittot @ 2025-01-15  9:28 UTC (permalink / raw)
  To: Hao Jia
  Cc: mingo, peterz, mingo, juri.lelli, dietmar.eggemann, rostedt,
	bsegall, mgorman, vschneid, linux-kernel, Hao Jia

On Wed, 15 Jan 2025 at 09:55, Hao Jia <jiahao.kernel@gmail.com> wrote:
>
>
>
> On 2025/1/14 16:07, Vincent Guittot wrote:
> > On Tue, 14 Jan 2025 at 04:18, Hao Jia <jiahao.kernel@gmail.com> wrote:
> >>
> >>
> >>
> >> On 2025/1/14 00:40, Vincent Guittot wrote:
> >>> On Mon, 13 Jan 2025 at 10:21, Hao Jia <jiahao.kernel@gmail.com> wrote:
> >>>>
> >>>> Friendly ping...
> >>>>
> >>>>
> >>>> On 2024/12/23 17:14, Hao Jia wrote:
> >>>>> From: Hao Jia <jiahao1@lixiang.com>
> >>>>>
> >>>>> When the PLACE_LAG scheduling feature is enabled and
> >>>>> dst_cfs_rq->nr_queued is greater than 1, if a task is
> >>>>> ineligible (lag < 0) on the source cpu runqueue, it will
> >>>>> also be ineligible when it is migrated to the destination
> >>>>> cpu runqueue. Because we will keep the original equivalent
> >>>>> lag of the task in place_entity(). So if the task was
> >>>>> ineligible before, it will still be ineligible after
> >>>>> migration.
> >>>>>
> >>>>> So in sched_balance_rq(), we prioritize migrating eligible
> >>>>> tasks, and we soft-limit ineligible tasks, allowing them
> >>>>> to migrate only when nr_balance_failed is non-zero to
> >>>>> avoid load-balancing trying very hard to balance the load.
> >>>
> >>> Could you explain why you think it's better to balance eligible tasks
> >>> in priority and potentially skip a load balance ?
> >>
> >> In place_entity(), we maintain the task's original equivalent lag, even
> >> if we migrate the task to dst_rq, this does not change its eligibility
> >> attribute.
> >
> > Yes, but you don't answer the question why it's better to select an
> > eligible task vs a non eligible task.
> >
> >>
> >> When there are multiple tasks on src_rq, and the dst_cpu has some
> >> runnable tasks, migrating ineligible tasks to dst_rq will not allow them
> >> to run. Therefore, such task migration is inefficient. We should
> >
> > Why is it inefficient ? load balance is about evenly balancing the
> > number of tasks or the load between CPUs, it never says that the newly
> > migrated task should run immediately
>
>
> My initial thought is that when we need to migrate some tasks during
> load balancing, at the current point in time, migrating ineligible tasks
> to dst_cpu means they definitely cannot run there. Therefore, I prefer
> to keep them on src_cpu to reduce the overhead of dequeueing and
> enqueueing ineligible tasks.

Sorry, but I still don't get why this is important or would make a
difference. They are all runnable; the ineligible tasks simply got more
runtime than the others at that point in time, so there is no real
difference.

>
> Migrating eligible tasks to dst_cpu does not guarantee that they will
> run earlier than on src_cpu. it depends on too many factors.
>
>
>
> >
> >> prioritize migrating tasks that can run on dst_rq.
> >>
> >> In other words, migrating ineligible tasks is merely moving them to
> >> another runqueue to wait until they become eligible.
> >
> > But I don't get why it's a problem. Migrating an eligible task might
> > delay its scheduling because of its deadline vs other tasks already
> > eligible on the dst_rq. Eligible and non eligible tasks are all
> > runnable, it's just how much they have already run. In addition,
> > migrating an eligible task will clear its positive vlag with
> > DELAY_ZERO which is unfair IMO
>
>
> Sorry, I'd like to ask you a question that confuses me: Why does
> migrating eligible task will clear the positive vlag?

Sorry, I mixed things up; that only applies to delayed-dequeue tasks.

>
> In detach_task(), the ENQUEUE_DELAYED and DEQUEUE_SLEEP flags are not
> set, and in dequeue_entity(), eligible tasks will not set sched_delayed,
> so they will be dequeued normally with se->on_rq being 0.
>
> Similarly, attach_task() does not set the ENQUEUE_DELAYED and
> DEQUEUE_SLEEP flags, and since se->on_rq is 0, it will not call
> requeue_delayed_entity().
>
>
> Thanks,
> Hao
>
> >
> >>
> >>
> >>>
> >>> I can see an interest for idle and newly_idle load balance in order to
> >>> favor fairness as tasks will become eligible but I don't see why it
> >>> would be helpful if dst already has some runnable tasks. Furthermore,
> >>> when a cpu is idle or newly idle, we really want to migrate a task
> >>> even an non eligible one instead of possibly skipping this load
> >>> balance round. With your patch, we might end up not pulling any task,
> >>> increasing the nr_balance_failed and waiting next load balance
> >>>
> >>
> >> If I understand correctly, when the destination CPU is idle, my patch
> >> does not change the original behavior. it only prevents the migration of
> >> ineligible tasks when dst_cfs_rq->nr_queued is greater than 1.
> >
> > It changes the behavior. My concern is that migrating an eligible task
> > when dst rq already has runnable tasks, doesn't assure you that it
> > will give any advantage to this eligible task.
> >
> > On an idle or newly idle cpu, any runnable tasks that will be pulled
> > will immediately start to run whatever it was eligible or not on src
> > cpu. In such a situation, we could consider that selecting an eligible
> > task which has less running time (positive lag) than others could be
> > more fair because the eligible task will immediately run. But this is
> > true as long as we migrate only 1 task. If you migrate several tasks,
> > an eligible task on src rq could even become ineligible quicker on dst
> > cpu than on src cpu has it lost its lag
> >


* Re: [PATCH v2] sched/core: Prioritize migrating eligible tasks in sched_balance_rq()
  2025-01-15  9:28           ` Vincent Guittot
@ 2025-01-15 11:55             ` Hao Jia
  2025-01-16 11:26               ` Vincent Guittot
  0 siblings, 1 reply; 16+ messages in thread
From: Hao Jia @ 2025-01-15 11:55 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: mingo, peterz, mingo, juri.lelli, dietmar.eggemann, rostedt,
	bsegall, mgorman, vschneid, linux-kernel, Hao Jia



On 2025/1/15 17:28, Vincent Guittot wrote:
> On Wed, 15 Jan 2025 at 09:55, Hao Jia <jiahao.kernel@gmail.com> wrote:
>>
>>
>>
>> On 2025/1/14 16:07, Vincent Guittot wrote:
>>> On Tue, 14 Jan 2025 at 04:18, Hao Jia <jiahao.kernel@gmail.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2025/1/14 00:40, Vincent Guittot wrote:
>>>>> On Mon, 13 Jan 2025 at 10:21, Hao Jia <jiahao.kernel@gmail.com> wrote:
>>>>>>
>>>>>> Friendly ping...
>>>>>>
>>>>>>
>>>>>> On 2024/12/23 17:14, Hao Jia wrote:
>>>>>>> From: Hao Jia <jiahao1@lixiang.com>
>>>>>>>
>>>>>>> When the PLACE_LAG scheduling feature is enabled and
>>>>>>> dst_cfs_rq->nr_queued is greater than 1, if a task is
>>>>>>> ineligible (lag < 0) on the source cpu runqueue, it will
>>>>>>> also be ineligible when it is migrated to the destination
>>>>>>> cpu runqueue. Because we will keep the original equivalent
>>>>>>> lag of the task in place_entity(). So if the task was
>>>>>>> ineligible before, it will still be ineligible after
>>>>>>> migration.
>>>>>>>
>>>>>>> So in sched_balance_rq(), we prioritize migrating eligible
>>>>>>> tasks, and we soft-limit ineligible tasks, allowing them
>>>>>>> to migrate only when nr_balance_failed is non-zero to
>>>>>>> avoid load-balancing trying very hard to balance the load.
>>>>>
>>>>> Could you explain why you think it's better to balance eligible tasks
>>>>> in priority and potentially skip a load balance ?
>>>>
>>>> In place_entity(), we maintain the task's original equivalent lag, even
>>>> if we migrate the task to dst_rq, this does not change its eligibility
>>>> attribute.
>>>
>>> Yes, but you don't answer the question why it's better to select an
>>> eligible task vs a non eligible task.
>>>
>>>>
>>>> When there are multiple tasks on src_rq, and the dst_cpu has some
>>>> runnable tasks, migrating ineligible tasks to dst_rq will not allow them
>>>> to run. Therefore, such task migration is inefficient. We should
>>>
>>> Why is it inefficient ? load balance is about evenly balancing the
>>> number of tasks or the load between CPUs, it never says that the newly
>>> migrated task should run immediately
>>
>>
>> My initial thought is that when we need to migrate some tasks during
>> load balancing, at the current point in time, migrating ineligible tasks
>> to dst_cpu means they definitely cannot run there. Therefore, I prefer
>> to keep them on src_cpu to reduce the overhead of dequeueing and
>> enqueueing ineligible tasks.
> 
> Sorry but I still don't get why it's important and would make a
> difference. They are all runnable but ineligible tasks got more
> runtime than other at that point in time so there is no real
> difference


I adopt a lazy strategy for ineligible tasks. At the current point in
time, even if we migrate ineligible tasks to the dst CPU, they still
have to wait there until they become eligible. We do not see a clear
benefit from migrating ineligible tasks, while dequeueing and
enqueueing them incurs overhead.

Instead, let them wait on the src CPU until they become eligible before
migrating them. This can reduce the number of task migrations.
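
For reference, the gate that implements this lazy strategy can be
sketched in isolation as follows. This is a simplified model of the
check the patch adds to can_migrate_task(), not the kernel code itself:
struct toy_env and the toy_* helpers are hypothetical stand-ins, and the
CONFIG_FAIR_GROUP_SCHED lookup of the destination cfs_rq is reduced to a
plain counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the fields the check reads; not kernel types. */
struct toy_env {
	unsigned int nr_balance_failed;	/* failed balance attempts so far */
	unsigned int dst_nr_queued;	/* tasks queued on the destination cfs_rq */
	bool place_lag;			/* PLACE_LAG feature enabled */
};

/*
 * A task that is ineligible on the source is treated as ineligible on
 * the destination when PLACE_LAG is on and the destination is not empty.
 */
static bool toy_ineligible_on_dst(bool eligible_on_src,
				  const struct toy_env *env)
{
	return env->place_lag && env->dst_nr_queued && !eligible_on_src;
}

/*
 * Soft limit: ineligible tasks are skipped on the first attempt and
 * allowed through once nr_balance_failed is non-zero.
 */
static bool toy_may_migrate(bool eligible_on_src, const struct toy_env *env)
{
	if (!env->nr_balance_failed &&
	    toy_ineligible_on_dst(eligible_on_src, env))
		return false;

	return true;
}
```

In this model an ineligible task is refused only when the destination
already has queued tasks and no earlier balance attempt has failed; an
empty destination or a non-zero nr_balance_failed lets it through.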



Thanks,
Hao


> 
>>
>> Migrating eligible tasks to dst_cpu does not guarantee that they will
>> run earlier than on src_cpu. it depends on too many factors.
>>
>>
>>
>>>
>>>> prioritize migrating tasks that can run on dst_rq.
>>>>
>>>> In other words, migrating ineligible tasks is merely moving them to
>>>> another runqueue to wait until they become eligible.
>>>
>>> But I don't get why it's a problem. Migrating an eligible task might
>>> delay its scheduling because of its deadline vs other tasks already
>>> eligible on the dst_rq. Eligible and non eligible tasks are all
>>> runnable, it's just how much they have already run. In addition,
>>> migrating an eligible task will clear its positive vlag with
>>> DELAY_ZERO which is unfair IMO
>>
>>
>> Sorry, I'd like to ask you a question that confuses me: Why does
>> migrating eligible task will clear the positive vlag?
> 
> sorry I mess up everything that only for delayed dequeue task
> 
>>
>> In detach_task(), the ENQUEUE_DELAYED and DEQUEUE_SLEEP flags are not
>> set, and in dequeue_entity(), eligible tasks will not set sched_delayed,
>> so they will be dequeued normally with se->on_rq being 0.
>>
>> Similarly, attach_task() does not set the ENQUEUE_DELAYED and
>> DEQUEUE_SLEEP flags, and since se->on_rq is 0, it will not call
>> requeue_delayed_entity().
>>
>>
>> Thanks,
>> Hao



* Re: [PATCH v2] sched/core: Prioritize migrating eligible tasks in sched_balance_rq()
  2024-12-23  9:14 [PATCH v2] sched/core: Prioritize migrating eligible tasks in sched_balance_rq() Hao Jia
                   ` (2 preceding siblings ...)
  2025-01-15  9:17 ` [tip: sched/core] " tip-bot2 for Hao Jia
@ 2025-01-15 13:18 ` Luis Machado
  3 siblings, 0 replies; 16+ messages in thread
From: Luis Machado @ 2025-01-15 13:18 UTC (permalink / raw)
  To: Hao Jia, mingo, peterz, mingo, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, vschneid
  Cc: linux-kernel, Hao Jia

On 12/23/24 09:14, Hao Jia wrote:
> From: Hao Jia <jiahao1@lixiang.com>
> 
> When the PLACE_LAG scheduling feature is enabled and
> dst_cfs_rq->nr_queued is greater than 1, if a task is
> ineligible (lag < 0) on the source cpu runqueue, it will
> also be ineligible when it is migrated to the destination
> cpu runqueue. Because we will keep the original equivalent
> lag of the task in place_entity(). So if the task was
> ineligible before, it will still be ineligible after
> migration.
> 
> So in sched_balance_rq(), we prioritize migrating eligible
> tasks, and we soft-limit ineligible tasks, allowing them
> to migrate only when nr_balance_failed is non-zero to
> avoid load-balancing trying very hard to balance the load.
> 
> Below are some benchmark test results. From my test results,
> this patch shows a slight improvement on hackbench.
> 
> Benchmark
> =========
> 
> All of the benchmarks are done inside a normal cpu cgroup in a
> clean environment with cpu turbo disabled, and test machine is:
> 
> Single NUMA machine model is 13th Gen Intel(R) Core(TM)
> i7-13700, 12 Core/24 HT.
> 
> Based on master b86545e02e8c.
> 
> Results
> =======
> 
> hackbench-process-pipes
>                       vanilla                  patched
> Amean     1       0.5837 (   0.00%)      0.5733 (   1.77%)
> Amean     4       1.4423 (   0.00%)      1.4503 (  -0.55%)
> Amean     7       2.5147 (   0.00%)      2.4773 (   1.48%)
> Amean     12      3.9347 (   0.00%)      3.8880 (   1.19%)
> Amean     21      5.3943 (   0.00%)      5.3873 (   0.13%)
> Amean     30      6.7840 (   0.00%)      6.6660 (   1.74%)
> Amean     48      9.8313 (   0.00%)      9.6100 (   2.25%)
> Amean     79     15.4403 (   0.00%)     14.9580 (   3.12%)
> Amean     96     18.4970 (   0.00%)     17.9533 (   2.94%)
> 
> hackbench-process-sockets
>                       vanilla                  patched
> Amean     1       0.6297 (   0.00%)      0.6223 (   1.16%)
> Amean     4       2.1517 (   0.00%)      2.0887 (   2.93%)
> Amean     7       3.6377 (   0.00%)      3.5670 (   1.94%)
> Amean     12      6.1277 (   0.00%)      5.9290 (   3.24%)
> Amean     21     10.0380 (   0.00%)      9.7623 (   2.75%)
> Amean     30     14.1517 (   0.00%)     13.7513 (   2.83%)
> Amean     48     24.7253 (   0.00%)     24.2287 (   2.01%)
> Amean     79     43.9523 (   0.00%)     43.2330 (   1.64%)
> Amean     96     54.5310 (   0.00%)     53.7650 (   1.40%)
> 
> tbench4 Throughput
>                       vanilla                  patched
> Hmean     1       255.97 (   0.00%)      275.01 (   7.44%)
> Hmean     2       511.60 (   0.00%)      544.27 (   6.39%)
> Hmean     4       996.70 (   0.00%)     1006.57 (   0.99%)
> Hmean     8      1646.46 (   0.00%)     1649.15 (   0.16%)
> Hmean     16     2259.42 (   0.00%)     2274.35 (   0.66%)
> Hmean     32     4725.48 (   0.00%)     4735.57 (   0.21%)
> Hmean     64     4411.47 (   0.00%)     4400.05 (  -0.26%)
> Hmean     96     4284.31 (   0.00%)     4267.39 (  -0.39%)
> 
> Signed-off-by: Hao Jia <jiahao1@lixiang.com>
> Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
> Previous discussion link: https://lore.kernel.org/all/20241128084858.25220-1-jiahao.kernel@gmail.com
> Link to v1: https://lore.kernel.org/all/20241218080203.80556-1-jiahao.kernel@gmail.com
> 
> v1 to v2:
>  - Modify dst_cfs_rq->nr_running to dst_cfs_rq->nr_queued to
>    resolve conflicts with commit 736c55a02c47 ("sched/fair:
>    Rename cfs_rq.nr_running into nr_queued").
> 
>  kernel/sched/fair.c | 34 ++++++++++++++++++++++++++++++++++
>  1 file changed, 34 insertions(+)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 5599b0c1ba9b..c884bf631e66 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9396,6 +9396,30 @@ static inline int migrate_degrades_locality(struct task_struct *p,
>  }
>  #endif
>  
> +/*
> + * Check whether the task is ineligible on the destination cpu
> + *
> + * When the PLACE_LAG scheduling feature is enabled and
> + * dst_cfs_rq->nr_queued is greater than 1, if the task
> + * is ineligible, it will also be ineligible when
> + * it is migrated to the destination cpu.
> + */
> +static inline int task_is_ineligible_on_dst_cpu(struct task_struct *p, int dest_cpu)
> +{
> +	struct cfs_rq *dst_cfs_rq;
> +
> +#ifdef CONFIG_FAIR_GROUP_SCHED
> +	dst_cfs_rq = task_group(p)->cfs_rq[dest_cpu];
> +#else
> +	dst_cfs_rq = &cpu_rq(dest_cpu)->cfs;
> +#endif
> +	if (sched_feat(PLACE_LAG) && dst_cfs_rq->nr_queued &&
> +	    !entity_eligible(task_cfs_rq(p), &p->se))
> +		return 1;
> +
> +	return 0;
> +}
> +
>  /*
>   * can_migrate_task - may task p from runqueue rq be migrated to this_cpu?
>   */
> @@ -9420,6 +9444,16 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
>  	if (throttled_lb_pair(task_group(p), env->src_cpu, env->dst_cpu))
>  		return 0;
>  
> +	/*
> +	 * We want to prioritize the migration of eligible tasks.
> +	 * For ineligible tasks we soft-limit them and only allow
> +	 * them to migrate when nr_balance_failed is non-zero to
> +	 * avoid load-balancing trying very hard to balance the load.
> +	 */
> +	if (!env->sd->nr_balance_failed &&
> +	    task_is_ineligible_on_dst_cpu(p, env->dst_cpu))
> +		return 0;
> +
>  	/* Disregard percpu kthreads; they are where they need to be. */
>  	if (kthread_is_per_cpu(p))
>  		return 0;

Just a general comment.

If we throw tasks with custom slices into the mix, I wonder what kinds of impacts
we would see from migrating eligible tasks with very short slice lengths (they may
run earlier) or very long ones.

We could've picked an ineligible task before, which wouldn't have an impact other than
the overhead of the migration. But the patch might make it so we now pick an eligible
one with a custom slice.


* Re: [PATCH v2] sched/core: Prioritize migrating eligible tasks in sched_balance_rq()
  2025-01-15 11:55             ` Hao Jia
@ 2025-01-16 11:26               ` Vincent Guittot
  2025-01-20  5:48                 ` Hao Jia
  0 siblings, 1 reply; 16+ messages in thread
From: Vincent Guittot @ 2025-01-16 11:26 UTC (permalink / raw)
  To: Hao Jia
  Cc: mingo, peterz, mingo, juri.lelli, dietmar.eggemann, rostedt,
	bsegall, mgorman, vschneid, linux-kernel, Hao Jia

On Wed, 15 Jan 2025 at 12:55, Hao Jia <jiahao.kernel@gmail.com> wrote:
>
>
>
> On 2025/1/15 17:28, Vincent Guittot wrote:
> > On Wed, 15 Jan 2025 at 09:55, Hao Jia <jiahao.kernel@gmail.com> wrote:
> >>
> >>
> >>
> >> On 2025/1/14 16:07, Vincent Guittot wrote:
> >>> On Tue, 14 Jan 2025 at 04:18, Hao Jia <jiahao.kernel@gmail.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 2025/1/14 00:40, Vincent Guittot wrote:
> >>>>> On Mon, 13 Jan 2025 at 10:21, Hao Jia <jiahao.kernel@gmail.com> wrote:
> >>>>>>
> >>>>>> Friendly ping...
> >>>>>>
> >>>>>>
> >>>>>> On 2024/12/23 17:14, Hao Jia wrote:
> >>>>>>> From: Hao Jia <jiahao1@lixiang.com>
> >>>>>>>
> >>>>>>> When the PLACE_LAG scheduling feature is enabled and
> >>>>>>> dst_cfs_rq->nr_queued is greater than 1, if a task is
> >>>>>>> ineligible (lag < 0) on the source cpu runqueue, it will
> >>>>>>> also be ineligible when it is migrated to the destination
> >>>>>>> cpu runqueue. Because we will keep the original equivalent
> >>>>>>> lag of the task in place_entity(). So if the task was
> >>>>>>> ineligible before, it will still be ineligible after
> >>>>>>> migration.
> >>>>>>>
> >>>>>>> So in sched_balance_rq(), we prioritize migrating eligible
> >>>>>>> tasks, and we soft-limit ineligible tasks, allowing them
> >>>>>>> to migrate only when nr_balance_failed is non-zero to
> >>>>>>> avoid load-balancing trying very hard to balance the load.
> >>>>>
> >>>>> Could you explain why you think it's better to balance eligible tasks
> >>>>> in priority and potentially skip a load balance ?
> >>>>
> >>>> In place_entity(), we maintain the task's original equivalent lag, even
> >>>> if we migrate the task to dst_rq, this does not change its eligibility
> >>>> attribute.
> >>>
> >>> Yes, but you don't answer the question why it's better to select an
> >>> eligible task vs a non eligible task.
> >>>
> >>>>
> >>>> When there are multiple tasks on src_rq, and the dst_cpu has some
> >>>> runnable tasks, migrating ineligible tasks to dst_rq will not allow them
> >>>> to run. Therefore, such task migration is inefficient. We should
> >>>
> >>> Why is it inefficient ? load balance is about evenly balancing the
> >>> number of tasks or the load between CPUs, it never says that the newly
> >>> migrated task should run immediately
> >>
> >>
> >> My initial thought is that when we need to migrate some tasks during
> >> load balancing, at the current point in time, migrating ineligible tasks
> >> to dst_cpu means they definitely cannot run there. Therefore, I prefer
> >> to keep them on src_cpu to reduce the overhead of dequeueing and
> >> enqueueing ineligible tasks.
> >
> > Sorry but I still don't get why it's important and would make a
> > difference. They are all runnable but ineligible tasks got more
> > runtime than other at that point in time so there is no real
> > difference
>
>
> I adopt a lazy strategy for ineligible tasks. At the current point in
> time, even if we migrate ineligible tasks to the dst CPU, they still
> have to wait on the dst CPU until they become eligible. We do not see
> clear benefits from migrating ineligible tasks, but their dequeueing and
> enqueueing would instead incur overhead.

But your explanation doesn't make sense.
Not migrating an ineligible task only makes sense for delayed_dequeue
tasks, because they don't really want to run but only exhaust their lag,
and this is already taken into account by
61b82dfb6b7e ("sched/fair: Do not try to migrate delayed dequeue task")

Did you run your benchmark on top of this change?

>
> Let them wait on the src CPU until they become eligible before migrating
> them. this can reduce the number of task migrations.
>
>
>
> Thanks,
> Hao
>
>
> >
> >>
> >> Migrating eligible tasks to dst_cpu does not guarantee that they will
> >> run earlier than on src_cpu. it depends on too many factors.
> >>
> >>
> >>
> >>>
> >>>> prioritize migrating tasks that can run on dst_rq.
> >>>>
> >>>> In other words, migrating ineligible tasks is merely moving them to
> >>>> another runqueue to wait until they become eligible.
> >>>
> >>> But I don't get why it's a problem. Migrating an eligible task might
> >>> delay its scheduling because of its deadline vs other tasks already
> >>> eligible on the dst_rq. Eligible and non eligible tasks are all
> >>> runnable, it's just how much they have already run. In addition,
> >>> migrating an eligible task will clear its positive vlag with
> >>> DELAY_ZERO which is unfair IMO
> >>
> >>
> >> Sorry, I'd like to ask you a question that confuses me: Why does
> >> migrating eligible task will clear the positive vlag?
> >
> > sorry I mess up everything that only for delayed dequeue task
> >
> >>
> >> In detach_task(), the ENQUEUE_DELAYED and DEQUEUE_SLEEP flags are not
> >> set, and in dequeue_entity(), eligible tasks will not set sched_delayed,
> >> so they will be dequeued normally with se->on_rq being 0.
> >>
> >> Similarly, attach_task() does not set the ENQUEUE_DELAYED and
> >> DEQUEUE_SLEEP flags, and since se->on_rq is 0, it will not call
> >> requeue_delayed_entity().
> >>
> >>
> >> Thanks,
> >> Hao
>


* Re: [PATCH v2] sched/core: Prioritize migrating eligible tasks in sched_balance_rq()
  2025-01-16 11:26               ` Vincent Guittot
@ 2025-01-20  5:48                 ` Hao Jia
  2025-02-17  6:13                   ` Hao Jia
  0 siblings, 1 reply; 16+ messages in thread
From: Hao Jia @ 2025-01-20  5:48 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: mingo, peterz, mingo, juri.lelli, dietmar.eggemann, rostedt,
	bsegall, mgorman, vschneid, linux-kernel, Hao Jia



On 2025/1/16 19:26, Vincent Guittot wrote:
> On Wed, 15 Jan 2025 at 12:55, Hao Jia <jiahao.kernel@gmail.com> wrote:
>>
>>
>>
>> On 2025/1/15 17:28, Vincent Guittot wrote:
>>> On Wed, 15 Jan 2025 at 09:55, Hao Jia <jiahao.kernel@gmail.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2025/1/14 16:07, Vincent Guittot wrote:
>>>>> On Tue, 14 Jan 2025 at 04:18, Hao Jia <jiahao.kernel@gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2025/1/14 00:40, Vincent Guittot wrote:
>>>>>>> On Mon, 13 Jan 2025 at 10:21, Hao Jia <jiahao.kernel@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Friendly ping...
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2024/12/23 17:14, Hao Jia wrote:
>>>>>>>>> From: Hao Jia <jiahao1@lixiang.com>
>>>>>>>>>
>>>>>>>>> When the PLACE_LAG scheduling feature is enabled and
>>>>>>>>> dst_cfs_rq->nr_queued is greater than 1, if a task is
>>>>>>>>> ineligible (lag < 0) on the source cpu runqueue, it will
>>>>>>>>> also be ineligible when it is migrated to the destination
>>>>>>>>> cpu runqueue. Because we will keep the original equivalent
>>>>>>>>> lag of the task in place_entity(). So if the task was
>>>>>>>>> ineligible before, it will still be ineligible after
>>>>>>>>> migration.
>>>>>>>>>
>>>>>>>>> So in sched_balance_rq(), we prioritize migrating eligible
>>>>>>>>> tasks, and we soft-limit ineligible tasks, allowing them
>>>>>>>>> to migrate only when nr_balance_failed is non-zero to
>>>>>>>>> avoid load-balancing trying very hard to balance the load.
>>>>>>>
>>>>>>> Could you explain why you think it's better to balance eligible tasks
>>>>>>> in priority and potentially skip a load balance ?
>>>>>>
>>>>>> In place_entity(), we maintain the task's original equivalent lag, even
>>>>>> if we migrate the task to dst_rq, this does not change its eligibility
>>>>>> attribute.
>>>>>
>>>>> Yes, but you don't answer the question why it's better to select an
>>>>> eligible task vs a non eligible task.
>>>>>
>>>>>>
>>>>>> When there are multiple tasks on src_rq, and the dst_cpu has some
>>>>>> runnable tasks, migrating ineligible tasks to dst_rq will not allow them
>>>>>> to run. Therefore, such task migration is inefficient. We should
>>>>>
>>>>> Why is it inefficient ? load balance is about evenly balancing the
>>>>> number of tasks or the load between CPUs, it never says that the newly
>>>>> migrated task should run immediately
>>>>
>>>>
>>>> My initial thought is that when we need to migrate some tasks during
>>>> load balancing, at the current point in time, migrating ineligible tasks
>>>> to dst_cpu means they definitely cannot run there. Therefore, I prefer
>>>> to keep them on src_cpu to reduce the overhead of dequeueing and
>>>> enqueueing ineligible tasks.
>>>
>>> Sorry but I still don't get why it's important and would make a
>>> difference. They are all runnable but ineligible tasks got more
>>> runtime than other at that point in time so there is no real
>>> difference
>>
>>
>> I adopt a lazy strategy for ineligible tasks. At the current point in
>> time, even if we migrate ineligible tasks to the dst CPU, they still
>> have to wait on the dst CPU until they become eligible. We do not see
>> clear benefits from migrating ineligible tasks, but their dequeueing and
>> enqueueing would instead incur overhead.
> 
> But your explanation doesn't make sense.
> Not migrating an ineligible task only make sense for delayed_dequeue
> tasks because they don't really want to run but only exhaust their lag
> but this is already taken into account by
> 61b82dfb6b7e ("sched/fair: Do not try to migrate delayed dequeue task")
> 

Thank you for your suggestion.

Yes, as you mentioned, commit 61b82dfb6b7e ("sched/fair: Do not try
to migrate delayed dequeue task") reduces the migration of
delayed_dequeue tasks, but it does not cover ineligible RUNNING tasks,
nor the case where the migration_type is migrate_load.


> Did you run your benchmark on top of this change ?

My previous benchmark tests were based on the torvalds/linux/master 
branch, which does not include commit 61b82dfb6b7e ("sched/fair: Do not 
try to migrate delayed dequeue task"). I will include this commit and 
retest on my machine after my leave ends.

Thanks,
Hao

> 
>>
>> Let them wait on the src CPU until they become eligible before migrating
>> them. this can reduce the number of task migrations.
>>
>>
>>
>> Thanks,
>> Hao
>>
>>
>>>
>>>>
>>>> Migrating eligible tasks to dst_cpu does not guarantee that they will
>>>> run earlier than on src_cpu. it depends on too many factors.
>>>>
>>>>
>>>>
>>>>>
>>>>>> prioritize migrating tasks that can run on dst_rq.
>>>>>>
>>>>>> In other words, migrating ineligible tasks is merely moving them to
>>>>>> another runqueue to wait until they become eligible.
>>>>>
>>>>> But I don't get why it's a problem. Migrating an eligible task might
>>>>> delay its scheduling because of its deadline vs other tasks already
>>>>> eligible on the dst_rq. Eligible and non eligible tasks are all
>>>>> runnable, it's just how much they have already run. In addition,
>>>>> migrating an eligible task will clear its positive vlag with
>>>>> DELAY_ZERO which is unfair IMO
>>>>
>>>>
>>>> Sorry, I'd like to ask you a question that confuses me: Why does
>>>> migrating eligible task will clear the positive vlag?
>>>
>>> sorry I mess up everything that only for delayed dequeue task
>>>
>>>>
>>>> In detach_task(), the ENQUEUE_DELAYED and DEQUEUE_SLEEP flags are not
>>>> set, and in dequeue_entity(), eligible tasks will not set sched_delayed,
>>>> so they will be dequeued normally with se->on_rq being 0.
>>>>
>>>> Similarly, attach_task() does not set the ENQUEUE_DELAYED and
>>>> DEQUEUE_SLEEP flags, and since se->on_rq is 0, it will not call
>>>> requeue_delayed_entity().
>>>>
>>>>
>>>> Thanks,
>>>> Hao
>>


* Re: [PATCH v2] sched/core: Prioritize migrating eligible tasks in sched_balance_rq()
  2025-01-20  5:48                 ` Hao Jia
@ 2025-02-17  6:13                   ` Hao Jia
  0 siblings, 0 replies; 16+ messages in thread
From: Hao Jia @ 2025-02-17  6:13 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: mingo, peterz, mingo, juri.lelli, dietmar.eggemann, rostedt,
	bsegall, mgorman, vschneid, linux-kernel, Hao Jia



On 2025/1/20 13:48, Hao Jia wrote:
> 
> 
> On 2025/1/16 19:26, Vincent Guittot wrote:
>> On Wed, 15 Jan 2025 at 12:55, Hao Jia <jiahao.kernel@gmail.com> wrote:
>>>
>>>
>>>
>>> On 2025/1/15 17:28, Vincent Guittot wrote:
>>>> On Wed, 15 Jan 2025 at 09:55, Hao Jia <jiahao.kernel@gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 2025/1/14 16:07, Vincent Guittot wrote:
>>>>>> On Tue, 14 Jan 2025 at 04:18, Hao Jia <jiahao.kernel@gmail.com> 
>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 2025/1/14 00:40, Vincent Guittot wrote:
>>>>>>>> On Mon, 13 Jan 2025 at 10:21, Hao Jia <jiahao.kernel@gmail.com> 
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Friendly ping...
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2024/12/23 17:14, Hao Jia wrote:
>>>>>>>>>> From: Hao Jia <jiahao1@lixiang.com>
>>>>>>>>>>
>>>>>>>>>> When the PLACE_LAG scheduling feature is enabled and
>>>>>>>>>> dst_cfs_rq->nr_queued is greater than 1, if a task is
>>>>>>>>>> ineligible (lag < 0) on the source cpu runqueue, it will
>>>>>>>>>> also be ineligible when it is migrated to the destination
>>>>>>>>>> cpu runqueue. Because we will keep the original equivalent
>>>>>>>>>> lag of the task in place_entity(). So if the task was
>>>>>>>>>> ineligible before, it will still be ineligible after
>>>>>>>>>> migration.
>>>>>>>>>>
>>>>>>>>>> So in sched_balance_rq(), we prioritize migrating eligible
>>>>>>>>>> tasks, and we soft-limit ineligible tasks, allowing them
>>>>>>>>>> to migrate only when nr_balance_failed is non-zero to
>>>>>>>>>> avoid load-balancing trying very hard to balance the load.
>>>>>>>>
>>>>>>>> Could you explain why you think it's better to balance eligible 
>>>>>>>> tasks
>>>>>>>> in priority and potentially skip a load balance ?
>>>>>>>
>>>>>>> In place_entity(), we maintain the task's original equivalent 
>>>>>>> lag, even
>>>>>>> if we migrate the task to dst_rq, this does not change its 
>>>>>>> eligibility
>>>>>>> attribute.
>>>>>>
>>>>>> Yes, but you don't answer the question why it's better to select an
>>>>>> eligible task vs a non eligible task.
>>>>>>
>>>>>>>
>>>>>>> When there are multiple tasks on src_rq, and the dst_cpu has some
>>>>>>> runnable tasks, migrating ineligible tasks to dst_rq will not 
>>>>>>> allow them
>>>>>>> to run. Therefore, such task migration is inefficient. We should
>>>>>>
>>>>>> Why is it inefficient ? load balance is about evenly balancing the
>>>>>> number of tasks or the load between CPUs, it never says that the 
>>>>>> newly
>>>>>> migrated task should run immediately
>>>>>
>>>>>
>>>>> My initial thought is that when we need to migrate some tasks during
>>>>> load balancing, at the current point in time, migrating ineligible 
>>>>> tasks
>>>>> to dst_cpu means they definitely cannot run there. Therefore, I prefer
>>>>> to keep them on src_cpu to reduce the overhead of dequeueing and
>>>>> enqueueing ineligible tasks.
>>>>
>>>> Sorry but I still don't get why it's important and would make a
>>>> difference. They are all runnable but ineligible tasks got more
>>>> runtime than other at that point in time so there is no real
>>>> difference
>>>
>>>
>>> I adopt a lazy strategy for ineligible tasks. At the current point in
>>> time, even if we migrate ineligible tasks to the dst CPU, they still
>>> have to wait on the dst CPU until they become eligible. We do not see
>>> clear benefits from migrating ineligible tasks, but their dequeueing and
>>> enqueueing would instead incur overhead.
>>
>> But your explanation doesn't make sense.
>> Not migrating an ineligible task only make sense for delayed_dequeue
>> tasks because they don't really want to run but only exhaust their lag
>> but this is already taken into account by
>> 61b82dfb6b7e ("sched/fair: Do not try to migrate delayed dequeue task")
>>
> 
> Thank you for your suggestion.
> 
> Yes, as you mentioned, this commit 61b82dfb6b7e ("sched/fair: Do not try 
> to migrate delayed dequeue task") reduces the migration of 
> delayed_dequeue tasks, but it doesn't work for ineligible RUNNING tasks 
> and when the migration_type is migrate_load.
> 
> 
>> Did you run your benchmark on top of this change ?
> 
> My previous benchmark tests were based on the torvalds/linux/master 
> branch, which does not include commit 61b82dfb6b7e ("sched/fair: Do not 
> try to migrate delayed dequeue task"). I will include this commit and 
> retest on my machine after my leave ends.
> 

I'm really sorry for being away for so long. I have rerun hackbench on
my machine on a kernel that includes commit 61b82dfb6b7e
("sched/fair: Do not try to migrate delayed dequeue task"). Based on the
hackbench test results, this patch still brings a slight performance
improvement.

vanilla: Includes commit 61b82dfb6b7e ("sched/fair: Do not try to 
migrate delayed dequeue task"), but does not include my patch.

patched: Includes both the above commit and my patch.


hackbench-process-pipes
                       vanilla                  patched
Amean     1       0.4087 (   0.00%)      0.4003 (   2.04%)
Amean     4       1.7033 (   0.00%)      1.7100 (  -0.39%)
Amean     7       2.9020 (   0.00%)      2.8750 (   0.93%)
Amean     12      4.2543 (   0.00%)      4.2980 (  -1.03%)
Amean     21      5.8633 (   0.00%)      5.7507 (   1.92%)
Amean     30      7.3757 (   0.00%)      7.2887 (   1.18%)
Amean     48     10.5360 (   0.00%)     10.2647 (   2.58%)
Amean     79     16.5480 (   0.00%)     15.9820 (   3.42%)
Amean     96     19.7873 (   0.00%)     19.1347 (   3.30%)


hackbench-process-sockets
                       vanilla                  patched
Amean     1       0.7520 (   0.00%)      0.7377 (   1.91%)
Amean     4       2.5760 (   0.00%)      2.5103 (   2.55%)
Amean     7       4.3927 (   0.00%)      4.2653 (   2.90%)
Amean     12      7.3923 (   0.00%)      7.1427 (   3.38%)
Amean     21     12.3733 (   0.00%)     11.9760 (   3.21%)
Amean     30     17.2617 (   0.00%)     16.7987 (   2.68%)
Amean     48     28.8577 (   0.00%)     28.1980 (   2.29%)
Amean     79     50.0687 (   0.00%)     49.0887 (   1.96%)
Amean     96     62.1603 (   0.00%)     61.1177 (   1.68%)
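
For readers checking the tables: the percentage column appears to be the
relative change of "patched" against "vanilla". The helper below is only a
sketch of that arithmetic; the name improvement_pct() is made up for this
note, and it assumes the Amean values are hackbench run times, so lower is
better and a positive percentage means the patched kernel was faster.
Because the table entries are rounded, recomputing from them can differ from
the printed percentage in the last digit.

```c
#include <assert.h>

/*
 * Relative change of "patched" against "vanilla", in percent.
 * Assumption: both values are run times (lower is better), so a
 * positive result means the patched kernel was faster.
 * improvement_pct() is a hypothetical helper, not part of any tool.
 */
static double improvement_pct(double vanilla, double patched)
{
	assert(vanilla > 0.0);	/* a zero or negative baseline is meaningless */
	return (vanilla - patched) / vanilla * 100.0;
}
```

For example, the 79-group pipes row would be improvement_pct(16.5480, 15.9820),
and the 12-group pipes row, where the patched kernel was slower, comes out
negative.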

FYI.

After the performance tests, I added some code to can_migrate_task() to
count the different reasons why tasks could not be migrated during the
hackbench run.


hit_delayed_dequeue_cnt: incremented when (p->se.sched_delayed) &&
(env->migration_type != migrate_load)


hit_ineligible_cnt: incremented when the hit_delayed_dequeue condition
is not met and !env->sd->nr_balance_failed &&
task_is_ineligible_on_dst_cpu()


Count results:

hit_delayed_dequeue_cnt    378432
hit_ineligible_cnt        1862099
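
The two counter definitions above can be read as the following userspace
model. This is a sketch, not the instrumentation that was actually added to
the kernel: struct task_model, struct env_model, struct reject_counts and
count_reject() are hypothetical names, and the boolean ineligible_on_dst
simply stands in for the result of task_is_ineligible_on_dst_cpu(). It only
illustrates that the delayed-dequeue check is evaluated first, so a given
rejection is counted in at most one of the two counters.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel state named in the mail. */
enum migration_type { migrate_load, migrate_util, migrate_task, migrate_misfit };

struct task_model {
	bool sched_delayed;	/* models p->se.sched_delayed */
	bool ineligible_on_dst;	/* models task_is_ineligible_on_dst_cpu() */
};

struct env_model {
	enum migration_type migration_type;
	unsigned int nr_balance_failed;	/* models env->sd->nr_balance_failed */
};

struct reject_counts {
	unsigned long hit_delayed_dequeue_cnt;
	unsigned long hit_ineligible_cnt;
};

/*
 * Returns true if one of the two modelled checks rejects the task, and
 * bumps the matching counter. The delayed-dequeue check runs first, so
 * the ineligible counter only sees tasks that check let through.
 */
static bool count_reject(const struct task_model *p,
			 const struct env_model *env,
			 struct reject_counts *c)
{
	assert(p && env && c);

	if (p->sched_delayed && env->migration_type != migrate_load) {
		c->hit_delayed_dequeue_cnt++;
		return true;
	}

	if (!env->nr_balance_failed && p->ineligible_on_dst) {
		c->hit_ineligible_cnt++;
		return true;
	}

	return false;
}
```

In this model a delayed task seen under migrate_load falls through to the
second check, which is one of the two cases the existing filter does not
cover. With the counts reported above, the ineligible check fired roughly
4.9 times as often as the delayed-dequeue check (1862099 vs 378432).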



Thanks,
Hao


end of thread, other threads:[~2025-02-17  6:13 UTC | newest]

Thread overview: 16+ messages
2024-12-23  9:14 [PATCH v2] sched/core: Prioritize migrating eligible tasks in sched_balance_rq() Hao Jia
2024-12-23 20:50 ` Markus Elfring
2024-12-24  1:53   ` Hao Jia
2024-12-24  8:55     ` [v2] " Markus Elfring
2025-01-13  9:21 ` [PATCH v2] " Hao Jia
2025-01-13 16:40   ` Vincent Guittot
2025-01-14  3:18     ` Hao Jia
2025-01-14  8:07       ` Vincent Guittot
2025-01-15  8:55         ` Hao Jia
2025-01-15  9:28           ` Vincent Guittot
2025-01-15 11:55             ` Hao Jia
2025-01-16 11:26               ` Vincent Guittot
2025-01-20  5:48                 ` Hao Jia
2025-02-17  6:13                   ` Hao Jia
2025-01-15  9:17 ` [tip: sched/core] " tip-bot2 for Hao Jia
2025-01-15 13:18 ` [PATCH v2] " Luis Machado
