From: Johannes Weiner <hannes@cmpxchg.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>, Mel Gorman <mgorman@suse.de>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kernel-team@fb.com
Subject: Re: [PATCH 3/3] mm/sched: memdelay: memory health interface for systems and workloads
Date: Sun, 30 Jul 2017 11:28:13 -0400
Message-ID: <20170730152813.GA26672@cmpxchg.org>
In-Reply-To: <20170729091055.GA6524@worktop.programming.kicks-ass.net>

On Sat, Jul 29, 2017 at 11:10:55AM +0200, Peter Zijlstra wrote:
> On Thu, Jul 27, 2017 at 11:30:10AM -0400, Johannes Weiner wrote:
> > +static void domain_cpu_update(struct memdelay_domain *md, int cpu,
> > +			      int old, int new)
> > +{
> > +	enum memdelay_domain_state state;
> > +	struct memdelay_domain_cpu *mdc;
> > +	unsigned long now, delta;
> > +	unsigned long flags;
> > +
> > +	mdc = per_cpu_ptr(md->mdcs, cpu);
> > +	spin_lock_irqsave(&mdc->lock, flags);
> 
> Afaict this is inside scheduler locks, this cannot be a spinlock. Also,
> do we really want to add more atomics there?

I think we should be able to get away without an additional lock and
rely on the rq lock instead. schedule, enqueue, dequeue already hold
it, memdelay_enter/leave could be added. I need to think about what to
do with try_to_wake_up in order to get the cpu move accounting inside
the locked section of ttwu_queue(), but that should be doable too.
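
As a rough userspace illustration of that idea (not the patch itself):
the counter update takes no lock of its own and relies on the caller
already holding the per-CPU lock, while paths that do not hold it yet
take it around the call. Everything here is a hypothetical stand-in:
struct fake_rq and its pthread mutex play the role of the runqueue and
the rq lock, and the TS_* values and function names are made up for the
sketch.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical task states; the names and values are assumptions. */
enum task_state { TS_NONE, TS_WORKING, TS_DELAYED, TS_DELAYED_ACTIVE, NR_TS };

/* Hypothetical stand-in for a per-CPU runqueue and its rq lock. */
struct fake_rq {
	pthread_mutex_t lock;
	unsigned int tasks[NR_TS];	/* protected by lock */
};

/*
 * Caller must hold rq->lock. No lock is taken here, so paths that
 * already run under the lock pay no extra locking cost.
 */
static void counts_update_locked(struct fake_rq *rq, int old, int new)
{
	if (old) {
		assert(rq->tasks[old] > 0);	/* would underflow otherwise */
		rq->tasks[old] -= 1;
	}
	if (new)
		rq->tasks[new] += 1;
}

/* A path that does not hold the lock yet takes it around the update. */
static void task_change_state(struct fake_rq *rq, int old, int new)
{
	pthread_mutex_lock(&rq->lock);
	counts_update_locked(rq, old, new);
	pthread_mutex_unlock(&rq->lock);
}
```

In the kernel the helper would typically document the requirement with
lockdep_assert_held() rather than a comment.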

> > +	if (old) {
> > +		WARN_ONCE(!mdc->tasks[old], "cpu=%d old=%d new=%d counter=%d\n",
> > +			  cpu, old, new, mdc->tasks[old]);
> > +		mdc->tasks[old] -= 1;
> > +	}
> > +	if (new)
> > +		mdc->tasks[new] += 1;
> > +
> > +	/*
> > +	 * The domain is somewhat delayed when a number of tasks are
> > +	 * delayed but there are still others running the workload.
> > +	 *
> > +	 * The domain is fully delayed when all non-idle tasks on the
> > +	 * CPU are delayed, or when a delayed task is actively running
> > +	 * and preventing productive tasks from making headway.
> > +	 *
> > +	 * The state times then add up over all CPUs in the domain: if
> > +	 * the domain is fully blocked on one CPU and there is another
> > +	 * one running the workload, the domain is considered fully
> > +	 * blocked 50% of the time.
> > +	 */
> > +	if (!mdc->tasks[MTS_DELAYED_ACTIVE] && !mdc->tasks[MTS_DELAYED])
> > +		state = MDS_NONE;
> > +	else if (mdc->tasks[MTS_WORKING])
> > +		state = MDS_SOME;
> > +	else
> > +		state = MDS_FULL;
> > +
> > +	if (mdc->state == state)
> > +		goto unlock;
> > +
> > +	now = ktime_to_ns(ktime_get());
> 
> ktime_get_ns(), also no ktime in scheduler code.

Okay.

I actually don't need a time source that's comparable across CPUs
since accounting periods are always fully contained within one
CPU. From the comment docs, it sounds like cpu_clock() is what I want
to use there?
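
To make the "fully contained within one CPU" point concrete, here is a
minimal userspace sketch, assuming the state time is charged when the
state changes. The classification mirrors the quoted hunk; the time
bookkeeping (state_start, times[]), the enum ordering, and the function
names are assumptions, since that part of the patch is not quoted here.
The caller passes in `now`, which would come from a per-CPU clock such
as cpu_clock(cpu). Both ends of every interval are read on the same
CPU, so the clock never has to be comparable across CPUs.

```c
#include <assert.h>
#include <stdint.h>

/* Enum ordering is an assumption; only the names come from the patch. */
enum mts { MTS_NONE, MTS_WORKING, MTS_DELAYED, MTS_DELAYED_ACTIVE, NR_MTS };
enum mds { MDS_NONE, MDS_SOME, MDS_FULL, NR_MDS };

/* Hypothetical per-CPU accounting record. */
struct cpu_acct {
	unsigned int tasks[NR_MTS];
	enum mds state;
	uint64_t state_start;		/* ns, from this CPU's own clock */
	uint64_t times[NR_MDS];		/* ns spent in each state */
};

/* Same classification as in the quoted hunk. */
static enum mds classify(const struct cpu_acct *c)
{
	if (!c->tasks[MTS_DELAYED_ACTIVE] && !c->tasks[MTS_DELAYED])
		return MDS_NONE;
	if (c->tasks[MTS_WORKING])
		return MDS_SOME;
	return MDS_FULL;
}

/* Move one task from state old to new; charge time on a state change. */
static void cpu_acct_update(struct cpu_acct *c, int old, int new,
			    uint64_t now)
{
	enum mds state;

	if (old) {
		assert(c->tasks[old] > 0);
		c->tasks[old] -= 1;
	}
	if (new)
		c->tasks[new] += 1;

	state = classify(c);
	if (state == c->state)
		return;

	/* Interval starts and ends on this CPU's clock. */
	c->times[c->state] += now - c->state_start;
	c->state = state;
	c->state_start = now;
}
```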

> > +	/* Account domain state changes */
> > +	rcu_read_lock();
> > +	memcg = mem_cgroup_from_task(task);
> > +	do {
> > +		struct memdelay_domain *md;
> > +
> > +		md = memcg_domain(memcg);
> > +		md->aggregate += delay;
> > +		domain_cpu_update(md, cpu, old, new);
> > +	} while (memcg && (memcg = parent_mem_cgroup(memcg)));
> > +	rcu_read_unlock();
> 
> We are _NOT_ going to do a 3rd cgroup iteration for every task action.

I'll look into that.

Thanks

