From: Dave Chinner <dchinner@redhat.com>
To: Roman Gushchin <guro@fb.com>
Cc: "lsf-pc@lists.linux-foundation.org"
	<lsf-pc@lists.linux-foundation.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"mhocko@kernel.org" <mhocko@kernel.org>,
	"riel@surriel.com" <riel@surriel.com>,
	"guroan@gmail.com" <guroan@gmail.com>,
	Kernel Team <Kernel-team@fb.com>,
	"hannes@cmpxchg.org" <hannes@cmpxchg.org>
Subject: Re: [LSF/MM TOPIC] dying memory cgroups and slab reclaim issues
Date: Tue, 19 Feb 2019 13:04:48 +1100
Message-ID: <20190219020448.GY31397@rh>
In-Reply-To: <20190219003140.GA5660@castle.DHCP.thefacebook.com>

On Tue, Feb 19, 2019 at 12:31:45AM +0000, Roman Gushchin wrote:
> Sorry, resending with the fixed to/cc list. Please, ignore the first letter.

Please resend again with linux-fsdevel on the cc list, because this
isn't an MM topic, given that the regressions from the shrinker patches
have all been on the filesystem side of the shrinkers....

-Dave.

> --
> 
> Recent reverts of memcg leak fixes [1, 2] reintroduced the problem
> of accumulating dying memory cgroups. This is a serious problem:
> on most of our machines we've seen thousands of dying cgroups, and
> the corresponding memory footprint was measured in hundreds of megabytes.
> The problem was also independently discovered by other companies.
> 
> The fixes were reverted due to an XFS regression investigated by Dave Chinner.
> Simultaneously we've seen a very small (0.18%) CPU regression on some hosts,
> which caused Rik van Riel to propose a patch [3] aimed at fixing the
> regression. The idea is to accumulate small memory pressure and apply it
> periodically, so that we don't overscan small shrinker lists. According
> to Jan Kara's data [4], Rik's patch fixed the regression only partially.
> 
> The path forward isn't entirely clear now, and the status quo isn't acceptable
> due to the memcg leak bug. Dave and Michal's position is to focus on the dying
> memory cgroup case and apply some artificial memory pressure to the
> corresponding slabs (probably during the cgroup deletion process). This approach
> can theoretically be less harmful to the subtle scanning balance, and not cause
> any regressions.
> 
> In my opinion, it's not necessarily true. Slab objects can be shared between
> cgroups, and often can't be reclaimed on cgroup removal without an impact on the
> rest of the system. Applying constant artificial memory pressure precisely only
> on objects accounted to dying cgroups is challenging and will likely
> cause quite significant overhead. Also, by "forgetting" some slab objects
> under light or even moderate memory pressure, we're wasting memory that could be
> used for something useful. Dying cgroups just make this problem more
> obvious because of their size.
> 
> So, using "natural" memory pressure in such a way that all slab objects are
> scanned periodically seems to me the best solution. The devil is in the details,
> and how to do it without causing any regressions is an open question now.
> 
> Also, completely re-parenting slabs to the parent cgroup (not only shrinker lists)
> is a potential option to consider.
> 
> It would be nice to discuss the problem at LSF/MM, agree on a general path, and
> make a list of potential benchmarks that can be used to validate the solution.
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a9a238e83fbb0df31c3b9b67003f8f9d1d1b6c96
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=69056ee6a8a3d576ed31e38b3b14c70d6c74edcc
> [3] https://lkml.org/lkml/2019/1/28/1865
> [4] https://lkml.org/lkml/2019/2/8/336
> 

-- 
Dave Chinner
dchinner@redhat.com
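
For readers who want a concrete picture of the idea quoted above
(accumulate small memory pressure and apply it periodically, so that
small shrinker lists are not overscanned), here is a minimal Python
sketch. It is a toy model, not Rik's patch [3] and not kernel code: the
ToyShrinker class, the batch threshold, the freeable / 2^priority
pressure formula, and the assumption that every scanned object is freed
are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToyShrinker:
    """Toy model of one shrinker list; not the kernel's data structure."""
    freeable: int           # objects currently on the list
    batch: int = 32         # hypothetical minimum worthwhile scan size
    deferred: float = 0.0   # pressure carried over from earlier passes

    def apply_pressure(self, priority: int) -> int:
        """Add this pass's pressure; scan only once enough has built up."""
        # Fractional share of the list for this pass (assumed formula).
        self.deferred += self.freeable / (1 << priority)
        # A list smaller than one batch is scanned whole once pressure allows.
        target = min(self.batch, self.freeable)
        if target == 0 or self.deferred < target:
            return 0  # too little pressure so far: remember it, scan nothing
        to_scan = min(int(self.deferred), self.freeable)
        self.deferred -= to_scan
        self.freeable -= to_scan  # assumption: every scanned object is freed
        return to_scan


shrinker = ToyShrinker(freeable=10)
print([shrinker.apply_pressure(priority=2) for _ in range(5)])  # [0, 0, 0, 10, 0]
```

In this model a small list is never scanned on every pass, yet it is
not forgotten either: the pressure it receives keeps adding up until one
scan of the whole list is justified.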

