Re: [RFC PATCH 1/2] sched: Rate limit migrations to 1 per 2ms per task


From: Peter Zijlstra <peterz@infradead.org>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@redhat.com>,
	Valentin Schneider <vschneid@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Swapnil Sapkal <Swapnil.Sapkal@amd.com>,
	Aaron Lu <aaron.lu@intel.com>,
	Julien Desfossez <jdesfossez@digitalocean.com>,
	x86@kernel.org
Subject: Re: [RFC PATCH 1/2] sched: Rate limit migrations to 1 per 2ms per task
Date: Wed, 6 Sep 2023 10:41:45 +0200
Message-ID: <20230906084145.GC38741@noisy.programming.kicks-ass.net>
In-Reply-To: <20230905171105.1005672-2-mathieu.desnoyers@efficios.com>

On Tue, Sep 05, 2023 at 01:11:04PM -0400, Mathieu Desnoyers wrote:
> Rate limit migrations to 1 migration per 2 milliseconds per task. On a
> kernel with EEVDF scheduler (commit b97d64c722598ffed42ece814a2cb791336c6679),

This is not in any way related to the actual EEVDF part; perhaps just
call it fair.


>  include/linux/sched.h |  2 ++
>  kernel/sched/core.c   |  1 +
>  kernel/sched/fair.c   | 14 ++++++++++++++
>  kernel/sched/sched.h  |  2 ++
>  4 files changed, 19 insertions(+)
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 177b3f3676ef..1111d04255cc 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -564,6 +564,8 @@ struct sched_entity {
>  
>  	u64				nr_migrations;
>  
> +	u64				next_migration_time;
> +
>  #ifdef CONFIG_FAIR_GROUP_SCHED
>  	int				depth;
>  	struct sched_entity		*parent;
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 479db611f46e..0d294fce261d 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4510,6 +4510,7 @@ static void __sched_fork(unsigned long clone_flags, struct task_struct *p)
>  	p->se.vruntime			= 0;
>  	p->se.vlag			= 0;
>  	p->se.slice			= sysctl_sched_base_slice;
> +	p->se.next_migration_time	= 0;
>  	INIT_LIST_HEAD(&p->se.group_node);
>  
>  #ifdef CONFIG_FAIR_GROUP_SCHED
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d92da2d78774..24ac69913005 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -960,6 +960,14 @@ int sched_update_scaling(void)
>  
>  static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se);
>  
> +static bool should_migrate_task(struct task_struct *p, int prev_cpu)
> +{
> +	/* Rate limit task migration. */
> +	if (sched_clock_cpu(prev_cpu) < p->se.next_migration_time)
> +		return false;
> +	return true;
> +}
> +
>  /*
>   * XXX: strictly: vd_i += N*r_i/w_i such that: vd_i > ve_i
>   * this is probably good enough.
> @@ -7897,6 +7905,9 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int wake_flags)
>  		want_affine = !wake_wide(p) && cpumask_test_cpu(cpu, p->cpus_ptr);
>  	}
>  
> +	if (want_affine && !should_migrate_task(p, prev_cpu))
> +		return prev_cpu;
> +
>  	rcu_read_lock();
>  	for_each_domain(cpu, tmp) {
>  		/*
> @@ -7944,6 +7955,9 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
>  {
>  	struct sched_entity *se = &p->se;
>  
> +	/* Rate limit task migration. */
> +	se->next_migration_time = sched_clock_cpu(new_cpu) + SCHED_MIGRATION_RATELIMIT_WINDOW;
> +
>  	if (!task_on_rq_migrating(p)) {
>  		remove_entity_load_avg(se);
>  
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index cf54fe338e23..c9b1a5976761 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -104,6 +104,8 @@ struct cpuidle_state;
>  #define TASK_ON_RQ_QUEUED	1
>  #define TASK_ON_RQ_MIGRATING	2
>  
> +#define SCHED_MIGRATION_RATELIMIT_WINDOW	2000000		/* 2 ms */
> +
>  extern __read_mostly int scheduler_running;
>  
>  extern unsigned long calc_load_update;

Urgh... so we already have much of this around task_hot() /
can_migrate_task(). And I would much rather see us extend those things
to this wakeup migration path than build a whole new parallel
thing.
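
As a rough illustration of that direction, here is a minimal user-space
sketch (not kernel code) of a task_hot()-style check gating a wakeup
migration. It assumes the usual cache-hotness idea, where a task counts
as hot if it last started running less than a migration-cost threshold
ago. struct task_model, model_task_hot(), model_select_wake_cpu() and
the 500 us constant are hypothetical names and values picked for the
example; the real helpers take more state into account.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the migration cost threshold, in ns. */
#define MODEL_MIGRATION_COST_NS 500000LL

/* Hypothetical, stripped-down model of the per-task state used here. */
struct task_model {
	uint64_t exec_start;	/* time the task last started running, ns */
	int prev_cpu;		/* CPU the task last ran on */
};

/*
 * Modelled on the idea behind task_hot(): the task is considered
 * cache hot on its previous CPU if it started running there recently.
 */
static bool model_task_hot(const struct task_model *p, uint64_t now)
{
	int64_t delta = (int64_t)(now - p->exec_start);

	return delta < MODEL_MIGRATION_COST_NS;
}

/*
 * Wakeup-path use: keep a cache-hot task on prev_cpu, otherwise
 * accept the candidate CPU picked by the rest of the selection logic.
 */
static int model_select_wake_cpu(const struct task_model *p, uint64_t now,
				 int candidate_cpu)
{
	assert(candidate_cpu >= 0);

	if (candidate_cpu != p->prev_cpu && model_task_hot(p, now))
		return p->prev_cpu;
	return candidate_cpu;
}
```

The point of the sketch is only that the wakeup path would reuse one
existing notion of "too soon to move" instead of carrying a second,
separate timestamp per task.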

Also:

> I have noticed that in order to observe the speedup, the workload needs
> to keep the CPUs sufficiently busy to cause runqueue lock contention,
> but not so busy that they don't go idle.

This would suggest inhibiting pulling tasks based on rq statistics,
instead of task stats. It doesn't matter when the task last migrated;
what matters is that this rq doesn't want new tasks at this point.

Those are not the same thing.
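
To make the distinction concrete, here is a minimal user-space sketch
(not kernel code) where the decision depends only on the destination
rq. Which rq statistic to use is left open above; this sketch assumes,
purely for illustration, a per-rq timestamp that blocks further pulls
for a fixed window after the rq last accepted a task. struct rq_model,
model_rq_accepts_pull(), model_rq_try_pull() and the window constant
are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical window, borrowed from the 2 ms value in the quoted patch. */
#define MODEL_RQ_PULL_WINDOW_NS 2000000ULL

/* Hypothetical per-runqueue state; not a real struct rq field. */
struct rq_model {
	uint64_t next_pull_time;	/* earliest time a new task is welcome, ns */
	unsigned int nr_pulled;		/* tasks accepted so far */
};

/* The decision depends only on the destination rq, not on the task. */
static bool model_rq_accepts_pull(const struct rq_model *rq, uint64_t now)
{
	return now >= rq->next_pull_time;
}

/* Try to move a task to @rq; returns true if the rq accepted it. */
static bool model_rq_try_pull(struct rq_model *rq, uint64_t now)
{
	assert(rq);

	if (!model_rq_accepts_pull(rq, now))
		return false;

	rq->nr_pulled++;
	rq->next_pull_time = now + MODEL_RQ_PULL_WINDOW_NS;
	return true;
}
```

With per-task state, two different tasks could still land on the same
rq back to back; with per-rq state, the second one is refused no matter
when it last migrated.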
