From: Chengming Zhou <chengming.zhou@linux.dev>
To: Chuyi Zhou <zhouchuyi@bytedance.com>,
mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
vschneid@redhat.com
Cc: linux-kernel@vger.kernel.org, joshdon@google.com
Subject: Re: [PATCH v2 1/2] sched/fair: Decrease cfs bandwidth usage in task_group destruction
Date: Wed, 24 Jul 2024 10:29:27 +0800
Message-ID: <6df6a8d3-6c5e-4ea7-8f55-08c2a56928f6@linux.dev>
In-Reply-To: <20240723122006.47053-2-zhouchuyi@bytedance.com>
On 2024/7/23 20:20, Chuyi Zhou wrote:
> The static key __cfs_bandwidth_used indicates whether bandwidth
> control is enabled in the system. Currently, it is only decreased when a
> task group disables bandwidth control. This is incorrect: if a task group
> that enabled bandwidth control existed in the past,
> __cfs_bandwidth_used will never go to zero, even if no task group is
> using bandwidth control now.
>
> This patch fixes the issue by decreasing bandwidth usage in
> destroy_cfs_bandwidth(). cfs_bandwidth_usage_dec() calls
> static_key_slow_dec_cpuslocked(), which needs the hotplug lock held, but
> the cfs bandwidth destroy may run in an RCU callback. Move the call to
> destroy_cfs_bandwidth() from unregister_fair_sched_group() to
> cpu_cgroup_css_free(), which runs in process context.
>
> Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Yeah, autogroup can't have bandwidth set, so it's ok to just destroy
bandwidth in .css_free().
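For anyone following along, the counting problem can be modelled in
userspace roughly as below. This is only a sketch, not kernel code: the
static key is replaced by a plain int, and every model_* name is made up
for illustration. It assumes the counter changes only on off<->on quota
transitions, as the commit message describes.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the kernel's "no quota" sentinel; the value is an assumption. */
#define RUNTIME_INF UINT64_MAX

/* Stand-in for the __cfs_bandwidth_used static key: a plain counter. */
static int bandwidth_used;

/* Hypothetical, stripped-down task group: only the quota matters here. */
struct model_tg {
	uint64_t quota;	/* RUNTIME_INF means bandwidth control is off */
};

static void model_tg_init(struct model_tg *tg)
{
	tg->quota = RUNTIME_INF;
}

/* Adjust the counter only when the group switches between off and on. */
static void model_tg_set_quota(struct model_tg *tg, uint64_t quota)
{
	int was_on = tg->quota != RUNTIME_INF;
	int now_on = quota != RUNTIME_INF;

	if (!was_on && now_on)
		bandwidth_used++;
	else if (was_on && !now_on)
		bandwidth_used--;
	tg->quota = quota;
}

/* Old behaviour: destroying a group leaves the counter untouched. */
static void model_tg_destroy_old(struct model_tg *tg)
{
	(void)tg;
}

/* Patched behaviour: drop the group's count if it still has a quota. */
static void model_tg_destroy_new(struct model_tg *tg)
{
	if (tg->quota != RUNTIME_INF)
		bandwidth_used--;
}
```

With the old destroy, a group that set a quota and was then removed
leaves bandwidth_used at 1 forever; with the new destroy it drops back
to 0. The model ignores the hotplug lock entirely.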
Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev>
Just some nits below:
> ---
> kernel/sched/core.c | 2 ++
> kernel/sched/fair.c | 13 +++++++------
> kernel/sched/sched.h | 2 ++
> 3 files changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 6d35c48239be..7720d34bd71b 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -8816,6 +8816,8 @@ static void cpu_cgroup_css_free(struct cgroup_subsys_state *css)
> {
> struct task_group *tg = css_tg(css);
>
> + destroy_cfs_bandwidth(tg_cfs_bandwidth(tg));
Instead of exporting tg_cfs_bandwidth(), how about changing the
parameter of init_cfs_bandwidth()/destroy_cfs_bandwidth() to tg?
That may be clearer, but it's your call.
Thanks.
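To illustrate the shape I have in mind, here is a userspace mock-up
(untested against the tree, not a real patch). The struct layouts are
reduced to the one field that matters, usage_count stands in for the
static key, and the timer teardown and cpus_read_lock handling are left
out on purpose.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the kernel's "no quota" sentinel; the value is an assumption. */
#define RUNTIME_INF UINT64_MAX

/* Reduced mock-ups of the kernel structures, for illustration only. */
struct cfs_bandwidth {
	uint64_t quota;
};

struct task_group {
	struct cfs_bandwidth cfs_bandwidth;
};

/* Stand-in for the __cfs_bandwidth_used static key. */
static int usage_count;

/* Stays static inline, private to the fair.c side. */
static inline struct cfs_bandwidth *tg_cfs_bandwidth(struct task_group *tg)
{
	return &tg->cfs_bandwidth;
}

/* The only symbol the core.c side would need to see. */
void destroy_cfs_bandwidth(struct task_group *tg)
{
	struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(tg);

	/* Drop the usage count only if this group still had a quota set. */
	if (cfs_b->quota != RUNTIME_INF)
		usage_count--;
}
```

The caller in cpu_cgroup_css_free() would then simply pass tg, and
sched.h would gain one extern declaration instead of two.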
> +
> /*
> * Relies on the RCU grace period between css_released() and this.
> */
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index da3cdd86ab2e..c56b6d5b8ed7 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5615,7 +5615,7 @@ void __refill_cfs_bandwidth_runtime(struct cfs_bandwidth *cfs_b)
> cfs_b->runtime_snap = cfs_b->runtime;
> }
>
> -static inline struct cfs_bandwidth *tg_cfs_bandwidth(struct task_group *tg)
> +struct cfs_bandwidth *tg_cfs_bandwidth(struct task_group *tg)
> {
> return &tg->cfs_bandwidth;
> }
> @@ -6438,7 +6438,7 @@ void start_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
> hrtimer_start_expires(&cfs_b->period_timer, HRTIMER_MODE_ABS_PINNED);
> }
>
> -static void destroy_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
> +void destroy_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
> {
> int __maybe_unused i;
>
> @@ -6472,6 +6472,9 @@ static void destroy_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
> local_irq_restore(flags);
> }
> #endif
> + guard(cpus_read_lock)();
> + if (cfs_b->quota != RUNTIME_INF)
> + cfs_bandwidth_usage_dec();
> }
>
> /*
> @@ -6614,11 +6617,11 @@ void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b, struct cfs_bandwidth *paren
> static void init_cfs_rq_runtime(struct cfs_rq *cfs_rq) {}
> #endif
>
> -static inline struct cfs_bandwidth *tg_cfs_bandwidth(struct task_group *tg)
> +struct cfs_bandwidth *tg_cfs_bandwidth(struct task_group *tg)
> {
> return NULL;
> }
> -static inline void destroy_cfs_bandwidth(struct cfs_bandwidth *cfs_b) {}
> +void destroy_cfs_bandwidth(struct cfs_bandwidth *cfs_b) {}
> static inline void update_runtime_enabled(struct rq *rq) {}
> static inline void unthrottle_offline_cfs_rqs(struct rq *rq) {}
> #ifdef CONFIG_CGROUP_SCHED
> @@ -12992,8 +12995,6 @@ void unregister_fair_sched_group(struct task_group *tg)
> struct rq *rq;
> int cpu;
>
> - destroy_cfs_bandwidth(tg_cfs_bandwidth(tg));
> -
> for_each_possible_cpu(cpu) {
> if (tg->se[cpu])
> remove_entity_load_avg(tg->se[cpu]);
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 8a071022bdec..d251842867ce 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -2938,6 +2938,8 @@ extern void init_dl_rq(struct dl_rq *dl_rq);
> extern void cfs_bandwidth_usage_inc(void);
> extern void cfs_bandwidth_usage_dec(void);
>
> +extern struct cfs_bandwidth *tg_cfs_bandwidth(struct task_group *tg);
> +extern void destroy_cfs_bandwidth(struct cfs_bandwidth *cfs_b);
> #ifdef CONFIG_NO_HZ_COMMON
>
> #define NOHZ_BALANCE_KICK_BIT 0
Thread overview: 7+ messages
2024-07-23 12:20 [PATCH v2 0/2] minor cpu bandwidth control fix Chuyi Zhou
2024-07-23 12:20 ` [PATCH v2 1/2] sched/fair: Decrease cfs bandwidth usage in task_group destruction Chuyi Zhou
2024-07-24 1:26 ` Benjamin Segall
2024-07-24 2:29 ` Chengming Zhou [this message]
2024-07-23 12:20 ` [PATCH v2 2/2] sched/core: Avoid unnecessary update in tg_set_cfs_bandwidth Chuyi Zhou
2024-07-24 1:27 ` Benjamin Segall
2024-07-24 2:31 ` Chengming Zhou