All of lore.kernel.org
 help / color / mirror / Atom feed
From: Benjamin Segall <bsegall@google.com>
To: Phil Auld <pauld@redhat.com>
Cc: linux-kernel@vger.kernel.org, Juri Lelli <juri.lelli@redhat.com>,
	Ingo Molnar <mingo@redhat.com>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Valentin Schneider <vschneid@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Mel Gorman <mgorman@suse.de>
Subject: Re: [PATCH RESEND] sched/nohz: Add HRTICK_BW for using cfs bandwidth with nohz_full
Date: Thu, 18 May 2023 14:29:04 -0700	[thread overview]
Message-ID: <xm26353twcpr.fsf@google.com> (raw)
In-Reply-To: <20230518132038.3534728-1-pauld@redhat.com> (Phil Auld's message of "Thu, 18 May 2023 09:20:38 -0400")

Phil Auld <pauld@redhat.com> writes:

> CFS bandwidth limits and NOHZ full don't play well together.  Tasks
> can easily run well past their quotas before a remote tick does
> accounting.  This leads to long, multi-period stalls before such
> tasks can run again.  Use the hrtick mechanism to set a sched
> tick to fire at remaining_runtime in the future if we are on
> a nohz full cpu, if the task has quota and if we are likely to
> disable the tick (nr_running == 1).  This allows for bandwidth
> accounting before tasks go too far over quota.
>
> A number of container workloads use a dynamic number of real
> nohz tasks but also have other work that is limited which ends
> up running on the "spare" nohz cpus.  This is an artifact of
> having to specify nohz_full cpus at boot. Adding this hrtick
> resolves the issue of long stalls on these tasks.
>
> Add the sched_feat HRTICK_BW off by default to allow users to
> enable this only when needed.
>
> Signed-off-by: Phil Auld <pauld@redhat.com>
> Suggested-by: Juri Lelli <jlelli@redhat.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Vincent Guittot <vincent.guittot@linaro.org>
> Cc: Juri Lelli <juri.lelli@redhat.com>
> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
> Cc: Valentin Schneider <vschneid@redhat.com>
> Cc: Ben Segall <bsegall@google.com>
> ---
>
> Resend with LKML address instead of rh list...
>
>  kernel/sched/core.c     |  2 +-
>  kernel/sched/fair.c     | 20 ++++++++++++++++++++
>  kernel/sched/features.h |  1 +
>  kernel/sched/sched.h    | 12 ++++++++++++
>  4 files changed, 34 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index a68d1276bab0..76425c377245 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6562,7 +6562,7 @@ static void __sched notrace __schedule(unsigned int sched_mode)
>  
>  	schedule_debug(prev, !!sched_mode);
>  
> -	if (sched_feat(HRTICK) || sched_feat(HRTICK_DL))
> +	if (sched_feat(HRTICK) || sched_feat(HRTICK_DL) || sched_feat(HRTICK_BW))
>  		hrtick_clear(rq);
>  
>  	local_irq_disable();
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 373ff5f55884..0dd1f6a874bc 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5309,6 +5309,22 @@ static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)
>  	return ret;
>  }
>  
> +#if defined(CONFIG_SCHED_HRTICK) && defined(CONFIG_NO_HZ_FULL)
> +static void start_hrtick_cfs_bw(struct rq *rq, struct cfs_rq *cfs_rq)
> +{
> +	if (!tick_nohz_full_cpu(rq->cpu) || !cfs_bandwidth_used() || !cfs_rq->runtime_enabled)
> +		return;
> +
> +	/* runtime_remaining should never be negative here but just in case */
> +	if (rq->nr_running == 1 && cfs_rq->runtime_remaining > 0)
> +		hrtick_start(rq, cfs_rq->runtime_remaining);
> +}
> +#else
> +static void start_hrtick_cfs_bw(struct rq *rq, struct cfs_rq *cfs_rq)
> +{
> +}
> +#endif
> +
>  static void __account_cfs_rq_runtime(struct cfs_rq *cfs_rq, u64 delta_exec)
>  {
>  	/* dock delta_exec before expiring quota (as it could span periods) */
> @@ -5481,6 +5497,7 @@ static bool throttle_cfs_rq(struct cfs_rq *cfs_rq)
>  	 */
>  	cfs_rq->throttled = 1;
>  	cfs_rq->throttled_clock = rq_clock(rq);
> +
>  	return true;
>  }
>  
> @@ -8096,6 +8113,9 @@ done: __maybe_unused;
>  	if (hrtick_enabled_fair(rq))
>  		hrtick_start_fair(rq, p);
>  
> +	if (hrtick_enabled_bw(rq))
> +		start_hrtick_cfs_bw(rq, task_cfs_rq(p));
> +
>  	update_misfit_status(p, rq);

Implementation-wise this winds up with a tick of
sysctl_sched_cfs_bandwidth_slice, which I suppose the admin could _also_
configure depending on how much NOHZ benefit vs cfsb issues they want.

It's not great that this implementation winds up going all the way
through schedule() for each 5ms-default tick, though.

  parent reply	other threads:[~2023-05-18 21:30 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-18 13:20 [PATCH RESEND] sched/nohz: Add HRTICK_BW for using cfs bandwidth with nohz_full Phil Auld
2023-05-18 13:47 ` Peter Zijlstra
2023-05-18 14:37   ` Phil Auld
2023-06-12 21:39     ` Phil Auld
2023-05-18 16:38 ` kernel test robot
2023-05-18 19:01   ` Phil Auld
2023-05-18 18:43 ` kernel test robot
2023-05-18 21:29 ` Benjamin Segall [this message]
2023-05-18 22:01   ` Phil Auld
2023-06-08 12:51   ` Phil Auld

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xm26353twcpr.fsf@google.com \
    --to=bsegall@google.com \
    --cc=bristot@redhat.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=pauld@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=vincent.guittot@linaro.org \
    --cc=vschneid@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.