All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mel Gorman <mgorman@suse.de>
To: Valentin Schneider <valentin.schneider@arm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>,
	mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
	dietmar.eggemann@arm.com, rostedt@goodmis.org,
	bsegall@google.com, linux-kernel@vger.kernel.org,
	pauld@redhat.com, parth@linux.ibm.com, hdanton@sina.com
Subject: Re: [PATCH v2 2/5] sched/numa: Replace runnable_load_avg by load_avg
Date: Tue, 18 Feb 2020 15:38:01 +0000	[thread overview]
Message-ID: <20200218153801.GF3420@suse.de> (raw)
In-Reply-To: <b67ae78b-17ba-8f3f-9052-fecefb848e3d@arm.com>

On Tue, Feb 18, 2020 at 02:54:14PM +0000, Valentin Schneider wrote:
> On 2/14/20 3:27 PM, Vincent Guittot wrote:
> > @@ -1473,38 +1473,35 @@ bool should_numa_migrate_memory(struct task_struct *p, struct page * page,
> >  	       group_faults_cpu(ng, src_nid) * group_faults(p, dst_nid) * 4;
> >  }
> >  
> > -static inline unsigned long cfs_rq_runnable_load_avg(struct cfs_rq *cfs_rq);
> > -
> > -static unsigned long cpu_runnable_load(struct rq *rq)
> > -{
> > -	return cfs_rq_runnable_load_avg(&rq->cfs);
> > -}
> > +/*
> > + * 'numa_type' describes the node at the moment of load balancing.
> > + */
> > +enum numa_type {
> > +	/* The node has spare capacity that can be used to run more tasks.  */
> > +	node_has_spare = 0,
> > +	/*
> > +	 * The node is fully used and the tasks don't compete for more CPU
> > +	 * cycles. Nevertheless, some tasks might wait before running.
> > +	 */
> > +	node_fully_busy,
> > +	/*
> > +	 * The node is overloaded and can't provide expected CPU cycles to all
> > +	 * tasks.
> > +	 */
> > +	node_overloaded
> > +};
> 
> Could we reuse group_type instead? The definitions are the same modulo
> s/group/node/.
> 

I kept the naming because there is the remote possibility that NUMA
balancing will deviate in some fashion. Right now, it's harmless.

> >  
> >  /* Cached statistics for all CPUs within a node */
> >  struct numa_stats {
> >  	unsigned long load;
> > -
> > +	unsigned long util;
> >  	/* Total compute capacity of CPUs on a node */
> >  	unsigned long compute_capacity;
> > +	unsigned int nr_running;
> > +	unsigned int weight;
> > +	enum numa_type node_type;
> >  };
> >  
> > -/*
> > - * XXX borrowed from update_sg_lb_stats
> > - */
> > -static void update_numa_stats(struct numa_stats *ns, int nid)
> > -{
> > -	int cpu;
> > -
> > -	memset(ns, 0, sizeof(*ns));
> > -	for_each_cpu(cpu, cpumask_of_node(nid)) {
> > -		struct rq *rq = cpu_rq(cpu);
> > -
> > -		ns->load += cpu_runnable_load(rq);
> > -		ns->compute_capacity += capacity_of(cpu);
> > -	}
> > -
> > -}
> > -
> >  struct task_numa_env {
> >  	struct task_struct *p;
> >  
> > @@ -1521,6 +1518,47 @@ struct task_numa_env {
> >  	int best_cpu;
> >  };
> >  
> > +static unsigned long cpu_load(struct rq *rq);
> > +static unsigned long cpu_util(int cpu);
> > +
> > +static inline enum
> > +numa_type numa_classify(unsigned int imbalance_pct,
> > +			 struct numa_stats *ns)
> > +{
> > +	if ((ns->nr_running > ns->weight) &&
> > +	    ((ns->compute_capacity * 100) < (ns->util * imbalance_pct)))
> > +		return node_overloaded;
> > +
> > +	if ((ns->nr_running < ns->weight) ||
> > +	    ((ns->compute_capacity * 100) > (ns->util * imbalance_pct)))
> > +		return node_has_spare;
> > +
> > +	return node_fully_busy;
> > +}
> > +
> 
> As Mel pointed out, this is group_is_overloaded() and group_has_capacity().
> @Mel, you mentioned having a common helper, do you have that laying around?
> I haven't seen it in your reconciliation series.
> 

I didn't merge that part of the first version of my series. I was
waiting to see how the implementation for allowing a small degree of
imbalance looks like. If it's entirely confined in adjust_numa_balance
then I'll create the common helper at the same time. For now, I left the
possibility open that numa_classify would use something different than
group_is_overloaded or group_has_capacity even if I find that hard to
imagine at the moment.

> What I'm naively thinking here is that we could have either move the whole
> thing to just sg_lb_stats (AFAICT the fields of numa_stats are a subset of it),
> or if we really care about the stack we could tweak the ordering to ensure
> we can cast one into the other (not too enticed by that one though).
> 

Yikes, no I'd rather not do that. Basically all I did before was create
a common helper like __lb_has_capacity that only took basic types as
parameters. group_has_capacity and numa_has_capacity were simple wrappers
that read the correct fields from their respective stats structures.

-- 
Mel Gorman
SUSE Labs

  parent reply	other threads:[~2020-02-18 15:38 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-14 15:27 [PATCH v2 0/5] remove runnable_load_avg and improve group_classify Vincent Guittot
2020-02-14 15:27 ` [PATCH v2 1/5] sched/fair: Reorder enqueue/dequeue_task_fair path Vincent Guittot
2020-02-18 12:37   ` Dietmar Eggemann
2020-02-18 13:22     ` Peter Zijlstra
2020-02-18 14:15       ` Vincent Guittot
2020-02-19 11:07         ` Dietmar Eggemann
2020-02-19 16:26           ` Vincent Guittot
2020-02-20 13:38             ` Dietmar Eggemann
     [not found]               ` <20200222152541.GA11669@geo.homenetwork>
2020-02-26 16:30                 ` Vincent Guittot
2020-02-14 15:27 ` [PATCH v2 2/5] sched/numa: Replace runnable_load_avg by load_avg Vincent Guittot
2020-02-18 12:37   ` Dietmar Eggemann
2020-02-18 13:50     ` Mel Gorman
2020-02-18 14:17       ` Vincent Guittot
2020-02-18 14:42         ` Dietmar Eggemann
2020-02-18 14:54   ` Valentin Schneider
2020-02-18 15:33     ` Vincent Guittot
2020-02-18 15:38     ` Mel Gorman [this message]
2020-02-18 16:50       ` Valentin Schneider
2020-02-18 17:41         ` Mel Gorman
2020-02-18 17:54           ` Valentin Schneider
2020-02-18 16:51       ` Vincent Guittot
2020-02-14 15:27 ` [PATCH v2 3/5] sched/pelt: Remove unused runnable load average Vincent Guittot
2020-02-21  9:57   ` Dietmar Eggemann
2020-02-21 11:56     ` Vincent Guittot
2020-02-14 15:27 ` [PATCH v2 4/5] sched/pelt: Add a new runnable average signal Vincent Guittot
2020-02-18 14:54   ` Valentin Schneider
2020-02-18 15:12     ` Peter Zijlstra
2020-02-18 15:28     ` Vincent Guittot
2020-02-18 16:30       ` Valentin Schneider
2020-02-18 21:19   ` Valentin Schneider
2020-02-19  9:02     ` Vincent Guittot
2020-02-19  9:08     ` Mel Gorman
2020-02-19 12:55   ` [PATCH v3 " Vincent Guittot
2020-02-19 14:02     ` Mel Gorman
2020-02-19 20:10     ` Valentin Schneider
2020-02-20 14:36       ` Vincent Guittot
2020-02-20 16:11         ` Valentin Schneider
2020-02-21  8:56           ` Vincent Guittot
2020-02-24 15:57             ` Valentin Schneider
2020-02-21  9:04           ` Mel Gorman
2020-02-21  9:25             ` Vincent Guittot
2020-02-21 10:40               ` Mel Gorman
2020-02-21 13:28                 ` Vincent Guittot
2020-02-20 15:04     ` Dietmar Eggemann
2020-02-21  9:44     ` Dietmar Eggemann
2020-02-21 11:47       ` Vincent Guittot
2020-02-14 15:27 ` [PATCH v2 5/5] sched/fair: Take into account runnable_avg to classify group Vincent Guittot
2020-02-15 21:58 ` [PATCH v2 0/5] remove runnable_load_avg and improve group_classify Mel Gorman
2020-02-21  9:58 ` Dietmar Eggemann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200218153801.GF3420@suse.de \
    --to=mgorman@suse.de \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=hdanton@sina.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=parth@linux.ibm.com \
    --cc=pauld@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=valentin.schneider@arm.com \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.