public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] sched/fair: Fix CPU bandwidth limit bypass during CPU hotplug
@ 2024-11-26  6:48 Vishal Chourasia
  2024-11-27 18:31 ` Madadi Vineeth Reddy
  2024-12-06  9:07 ` Srikar Dronamraju
  0 siblings, 2 replies; 4+ messages in thread
From: Vishal Chourasia @ 2024-11-26  6:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, vschneid, sshegde, Vishal Chourasia

CPU controller limits are not properly enforced during CPU hotplug operations,
particularly during CPU offline. When a CPU goes offline, throttled
processes are unintentionally being unthrottled across all CPUs in the system,
allowing them to exceed their assigned quota limits.

Assigning 6.25% bandwidth limit to a cgroup in a 8 CPU system, where, workload
is running 8 threads for 20 seconds at 100% CPU utilization,
expected (user+sys) time = 10 seconds.

# cat /sys/fs/cgroup/test/cpu.max
50000 100000

# ./ebizzy -t 8 -S 20        // non-hotplug case
real 20.00 s
user 10.81 s                 // intented behaviour
sys   0.00 s

# ./ebizzy -t 8 -S 20        // hotplug case
real 20.00 s
user 14.43 s                 // Workload is able to run for 14 secs
sys   0.00 s                 // when it should have only run for 10 secs

During CPU hotplug, scheduler domains are rebuilt and cpu_attach_domain
is called for every active CPU to update the root domain. That ends up
calling rq_offline_fair which un-throttles any throttled hierarchies.

Unthrottling should only occur for the CPU being hotplugged to allow its
throttled processes to become runnable and get migrated to other CPUs.

With current patch applied,
# ./ebizzy -t 8 -S 20        // hotplug case
real 21.00 s
user 10.16 s                 // intented behaviour
sys   0.00 s

Note: hotplug operation (online, offline) was performed in while(1) loop

Signed-off-by: Vishal Chourasia <vishalc@linux.ibm.com>
---
 kernel/sched/fair.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index fbdca89c677f..c436e2307e6f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6684,7 +6684,8 @@ static void __maybe_unused unthrottle_offline_cfs_rqs(struct rq *rq)
 	list_for_each_entry_rcu(tg, &task_groups, list) {
 		struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)];
 
-		if (!cfs_rq->runtime_enabled)
+		/* Don't unthrottle an active cfs_rq unnecessarily */
+		if (!cfs_rq->runtime_enabled || cpumask_test_cpu(cpu_of(rq), cpu_active_mask))
 			continue;
 
 		/*
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] sched/fair: Fix CPU bandwidth limit bypass during CPU hotplug
  2024-11-26  6:48 [PATCH] sched/fair: Fix CPU bandwidth limit bypass during CPU hotplug Vishal Chourasia
@ 2024-11-27 18:31 ` Madadi Vineeth Reddy
  2024-12-06  9:07 ` Srikar Dronamraju
  1 sibling, 0 replies; 4+ messages in thread
From: Madadi Vineeth Reddy @ 2024-11-27 18:31 UTC (permalink / raw)
  To: Vishal Chourasia, linux-kernel
  Cc: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, vschneid, sshegde,
	Madadi Vineeth Reddy

Hi Vishal,

On 26/11/24 12:18, Vishal Chourasia wrote:
> CPU controller limits are not properly enforced during CPU hotplug operations,
> particularly during CPU offline. When a CPU goes offline, throttled
> processes are unintentionally being unthrottled across all CPUs in the system,
> allowing them to exceed their assigned quota limits.
> 
> Assigning 6.25% bandwidth limit to a cgroup in a 8 CPU system, where, workload
> is running 8 threads for 20 seconds at 100% CPU utilization,
> expected (user+sys) time = 10 seconds.
> 
> # cat /sys/fs/cgroup/test/cpu.max
> 50000 100000
> 
> # ./ebizzy -t 8 -S 20        // non-hotplug case
> real 20.00 s
> user 10.81 s                 // intented behaviour
> sys   0.00 s
> 
> # ./ebizzy -t 8 -S 20        // hotplug case
> real 20.00 s
> user 14.43 s                 // Workload is able to run for 14 secs
> sys   0.00 s                 // when it should have only run for 10 secs
> 
> During CPU hotplug, scheduler domains are rebuilt and cpu_attach_domain
> is called for every active CPU to update the root domain. That ends up
> calling rq_offline_fair which un-throttles any throttled hierarchies.
> 
> Unthrottling should only occur for the CPU being hotplugged to allow its
> throttled processes to become runnable and get migrated to other CPUs.
> 
> With current patch applied,
> # ./ebizzy -t 8 -S 20        // hotplug case
> real 21.00 s
> user 10.16 s                 // intented behaviour
> sys   0.00 s
> 
> Note: hotplug operation (online, offline) was performed in while(1) loop

Tested with and without this patch for the ebizzy workload as mentioned.

Without the patch:
------------------
19376 records/s
real 20.00 s
user 12.49 s
sys   0.00 s

With the patch:
---------------
17708 records/s
real 20.00 s
user 10.07 s
sys   0.00 s

Hence,
Tested-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>

Thanks,
Madadi Vineeth Reddy

> 
> Signed-off-by: Vishal Chourasia <vishalc@linux.ibm.com>
> ---
>  kernel/sched/fair.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index fbdca89c677f..c436e2307e6f 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6684,7 +6684,8 @@ static void __maybe_unused unthrottle_offline_cfs_rqs(struct rq *rq)
>  	list_for_each_entry_rcu(tg, &task_groups, list) {
>  		struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)];
>  
> -		if (!cfs_rq->runtime_enabled)
> +		/* Don't unthrottle an active cfs_rq unnecessarily */
> +		if (!cfs_rq->runtime_enabled || cpumask_test_cpu(cpu_of(rq), cpu_active_mask))
>  			continue;
>  
>  		/*


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] sched/fair: Fix CPU bandwidth limit bypass during CPU hotplug
  2024-11-26  6:48 [PATCH] sched/fair: Fix CPU bandwidth limit bypass during CPU hotplug Vishal Chourasia
  2024-11-27 18:31 ` Madadi Vineeth Reddy
@ 2024-12-06  9:07 ` Srikar Dronamraju
  2024-12-07  5:24   ` Vishal Chourasia
  1 sibling, 1 reply; 4+ messages in thread
From: Srikar Dronamraju @ 2024-12-06  9:07 UTC (permalink / raw)
  To: Vishal Chourasia
  Cc: linux-kernel, mingo, peterz, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, vschneid, sshegde

* Vishal Chourasia <vishalc@linux.ibm.com> [2024-11-26 12:18:13]:

> CPU controller limits are not properly enforced during CPU hotplug operations,
> particularly during CPU offline. When a CPU goes offline, throttled
> processes are unintentionally being unthrottled across all CPUs in the system,
> allowing them to exceed their assigned quota limits.
> 
<snip>

> Signed-off-by: Vishal Chourasia <vishalc@linux.ibm.com>
> ---
>  kernel/sched/fair.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index fbdca89c677f..c436e2307e6f 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6684,7 +6684,8 @@ static void __maybe_unused unthrottle_offline_cfs_rqs(struct rq *rq)
>  	list_for_each_entry_rcu(tg, &task_groups, list) {
>  		struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)];
>  
> -		if (!cfs_rq->runtime_enabled)
> +		/* Don't unthrottle an active cfs_rq unnecessarily */

Per the patch description, its not just unnecessary but unthrottling is
buggy. Unnecessary would mean its just an overhead. Here we dont want to
unthrottle. Other than that this seems to be fine to me.

> +		if (!cfs_rq->runtime_enabled || cpumask_test_cpu(cpu_of(rq), cpu_active_mask))
>  			continue;

-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] sched/fair: Fix CPU bandwidth limit bypass during CPU hotplug
  2024-12-06  9:07 ` Srikar Dronamraju
@ 2024-12-07  5:24   ` Vishal Chourasia
  0 siblings, 0 replies; 4+ messages in thread
From: Vishal Chourasia @ 2024-12-07  5:24 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: linux-kernel, mingo, peterz, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, vschneid, sshegde

On Fri, Dec 06, 2024 at 02:37:54PM +0530, Srikar Dronamraju wrote:
> * Vishal Chourasia <vishalc@linux.ibm.com> [2024-11-26 12:18:13]:
> 
> > CPU controller limits are not properly enforced during CPU hotplug operations,
> > particularly during CPU offline. When a CPU goes offline, throttled
> > processes are unintentionally being unthrottled across all CPUs in the system,
> > allowing them to exceed their assigned quota limits.
> > 
> <snip>
> 
> > Signed-off-by: Vishal Chourasia <vishalc@linux.ibm.com>
> > ---
> >  kernel/sched/fair.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index fbdca89c677f..c436e2307e6f 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -6684,7 +6684,8 @@ static void __maybe_unused unthrottle_offline_cfs_rqs(struct rq *rq)
> >  	list_for_each_entry_rcu(tg, &task_groups, list) {
> >  		struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)];
> >  
> > -		if (!cfs_rq->runtime_enabled)
> > +		/* Don't unthrottle an active cfs_rq unnecessarily */
> 
> Per the patch description, its not just unnecessary but unthrottling is
> buggy. Unnecessary would mean its just an overhead. Here we dont want to
> unthrottle. Other than that this seems to be fine to me.
sure.


Also, I missed running the patch against checkpatch.pl 

Will be sending out v2 with the updates.

vishalc
> 
> > +		if (!cfs_rq->runtime_enabled || cpumask_test_cpu(cpu_of(rq), cpu_active_mask))
> >  			continue;
> 
> -- 
> Thanks and Regards
> Srikar Dronamraju

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2024-12-07  5:25 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-26  6:48 [PATCH] sched/fair: Fix CPU bandwidth limit bypass during CPU hotplug Vishal Chourasia
2024-11-27 18:31 ` Madadi Vineeth Reddy
2024-12-06  9:07 ` Srikar Dronamraju
2024-12-07  5:24   ` Vishal Chourasia

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox