linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: hannes@cmpxchg.org, cgroups@vger.kernel.org, linux-mm@kvack.org,
	vdavydov@parallels.com, kernel-team@fb.com
Subject: Re: [PATCH 1/4] memcg: fix over-high reclaim amount
Date: Mon, 31 Aug 2015 09:38:40 -0400	[thread overview]
Message-ID: <20150831133840.GA2271@mtj.duckdns.org> (raw)
In-Reply-To: <20150831075133.GA29723@dhcp22.suse.cz>

Hello, Michal.

On Mon, Aug 31, 2015 at 09:51:33AM +0200, Michal Hocko wrote:
> The overall reclaim throughput will be higher with the parallel reclaim.

Is reclaim throughput as determined by CPU cycle bandwidth a
meaningful metric?  I'm having a bit of trouble imagining that this
actually would matter especially given that writeback is single
threaded per bdi_writeback.

Shoving in a large number of threads into the same path which walks
the same data structures when there's no clear benefit doesn't usually
end up buying anything.  Cachelines get thrown back and forth, locks
get contended, CPU cycles which could have been used for a lot of more
useful things get wasted and IO pattern becomes random.  We can still
choose to do that but I think we should have explicit justifications
(e.g. it really harms scalability otherwise).

> Threads might still get synchronized on the zone lru lock but this is only
> for isolating them from the LRU. In a larger hierarchies this even might
> not be the case because the hierarchy iterator tries to spread the reclaim
> over different memcgs.
> 
> So the per-memcg mutex would solve the potential over-reclaim but it
> will restrain the reclaim activity unnecessarily. Why is per-contribution
> reclaim such a big deal in the first place? If there are runaways
> allocation requests like GFP_NOWAIT then we should look after those. And
> I would argue that your delayed reclaim idea is a great fit for that. We
> just should track how many pages were charged over high limit in the
> process context and reclaim that amount on the way out from the kernel.

Per-contribution reclaim is not necessarily a "big deal" but is kinda
mushy on the edges which get more pronounced with async reclaim.

* It still can over reclaim to a considerable extent.  The reclaim
  path uses mean reclaim size of 1M and when the high limit is used as
  the main mechanism for reclaim rather than global limit, many
  threads performing simultaneous 1M reclaims will happen.

* We need to keep track of an additional state.  What if a task
  performs multiple NOWAIT try_charge()'s?  I guess we should add up
  those numbers.

* But even if we do that, what does that actually mean?  These numbers
  are arbitrary in nature.  A thread may have just performed
  high-order allocations back to back at the point where it steps over
  the limit with an order-0 allocation or maybe a different thread
  which wasn't consuming much memory can hit it right after.
  @nr_pages is a convenient number that we can use on the spot which
  will make the consumption converge on the limit but I'm not sure
  this is a number that we should keep track of.

Also, having a central point of control means that we can actually
define policies there - e.g. if the overage is less than 10%, let
tasks through as long as there's at least one reclaiming.

Thanks.

-- 
tejun

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2015-08-31 13:38 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-08-28 15:25 [PATCHSET] memcg: improve high limit behavior and always enable kmemcg on dfl hier Tejun Heo
2015-08-28 15:25 ` [PATCH 1/4] memcg: fix over-high reclaim amount Tejun Heo
2015-08-28 17:06   ` Michal Hocko
2015-08-28 18:32     ` Tejun Heo
2015-08-31  7:51       ` Michal Hocko
2015-08-31 13:38         ` Tejun Heo [this message]
2015-09-01 12:51           ` Michal Hocko
2015-09-01 18:33             ` Tejun Heo
2015-08-28 15:25 ` [PATCH 2/4] memcg: flatten task_struct->memcg_oom Tejun Heo
2015-08-28 17:11   ` Michal Hocko
2015-08-28 15:25 ` [PATCH 3/4] memcg: punt high overage reclaim to return-to-userland path Tejun Heo
2015-08-28 16:36   ` Vladimir Davydov
2015-08-28 16:48     ` Tejun Heo
2015-08-28 20:32       ` Vladimir Davydov
2015-08-28 20:44         ` Tejun Heo
2015-08-28 22:06           ` Tejun Heo
2015-08-29  7:59             ` Vladimir Davydov
2015-08-30 15:52         ` Vladimir Davydov
2015-08-28 17:13   ` Michal Hocko
2015-08-28 17:56     ` Tejun Heo
2015-08-28 20:45     ` Vladimir Davydov
2015-08-28 20:53       ` Tejun Heo
2015-08-28 21:07         ` Vladimir Davydov
2015-08-28 21:14           ` Tejun Heo
2015-08-28 15:25 ` [PATCH 4/4] memcg: always enable kmemcg on the default hierarchy Tejun Heo
2015-08-28 16:49   ` Vladimir Davydov
2015-08-28 16:56     ` Tejun Heo
2015-08-28 17:14     ` Michal Hocko
2015-08-28 17:41       ` Tejun Heo
2015-09-01 12:44         ` Michal Hocko
2015-09-01 18:51           ` Tejun Heo
2015-09-04 13:30             ` Michal Hocko
2015-09-04 15:38               ` Vladimir Davydov
2015-09-07  9:39                 ` Michal Hocko
2015-09-07 10:01                   ` Vladimir Davydov
2015-09-07 11:03                     ` Michal Hocko
2015-09-04 16:18               ` Tejun Heo
2015-09-07 10:54                 ` Michal Hocko
2015-09-08 18:50                   ` Tejun Heo
2015-11-05 17:30   ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150831133840.GA2271@mtj.duckdns.org \
    --to=tj@kernel.org \
    --cc=cgroups@vger.kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@fb.com \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=vdavydov@parallels.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).