From: Dave Chinner <dchinner@redhat.com>
To: Roman Gushchin <guro@fb.com>
Cc: "lsf-pc@lists.linux-foundation.org"
<lsf-pc@lists.linux-foundation.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"mhocko@kernel.org" <mhocko@kernel.org>,
"riel@surriel.com" <riel@surriel.com>,
"guroan@gmail.com" <guroan@gmail.com>,
Kernel Team <Kernel-team@fb.com>,
"hannes@cmpxchg.org" <hannes@cmpxchg.org>
Subject: Re: [LSF/MM TOPIC] dying memory cgroups and slab reclaim issues
Date: Tue, 19 Feb 2019 13:04:48 +1100
Message-ID: <20190219020448.GY31397@rh>
In-Reply-To: <20190219003140.GA5660@castle.DHCP.thefacebook.com>
On Tue, Feb 19, 2019 at 12:31:45AM +0000, Roman Gushchin wrote:
> Sorry, resending with the fixed to/cc list. Please, ignore the first letter.
Please resend again with linux-fsdevel on the cc list, because this
isn't an MM topic given the regressions from the shrinker patches
have all been on the filesystem side of the shrinkers....
-Dave.
> --
>
> Recent reverts of memcg leak fixes [1, 2] reintroduced the problem
> of accumulating dying memory cgroups. This is a serious problem:
> on most of our machines we've seen thousands of dying cgroups, and
> the corresponding memory footprint was measured in hundreds of megabytes.
> The problem was also independently discovered by other companies.
>
> The fixes were reverted due to an xfs regression investigated by Dave Chinner.
> Simultaneously we've seen a very small (0.18%) cpu regression on some hosts,
> which prompted Rik van Riel to propose a patch [3] aimed at fixing the
> regression. The idea is to accumulate small amounts of memory pressure and
> apply them periodically, so that we don't overscan small shrinker lists.
> According to Jan Kara's data [4], Rik's patch partially fixed the regression,
> but not entirely.
>
> The path forward isn't entirely clear now, and the status quo isn't acceptable
> due to the memcg leak bug. Dave and Michal's position is to focus on the dying
> memory cgroup case and apply some artificial memory pressure on the
> corresponding slabs (probably during the cgroup deletion process). This
> approach could theoretically be less harmful to the subtle scanning balance,
> and not cause any regressions.
>
> In my opinion, that's not necessarily true. Slab objects can be shared between
> cgroups, and often can't be reclaimed on cgroup removal without an impact on the
> rest of the system. Applying constant artificial memory pressure precisely and
> only to objects accounted to dying cgroups is challenging and will likely
> cause quite significant overhead. Also, by "forgetting" some slab objects
> under light or even moderate memory pressure, we're wasting memory which could
> be used for something useful. Dying cgroups just make this problem more
> obvious because of their size.
>
> So, using "natural" memory pressure in a way that ensures all slab objects are
> scanned periodically seems to me the best solution. The devil is in the
> details, and how to do it without causing any regressions is an open question.
>
> Also, completely re-parenting slabs to the parent cgroup (not only the
> shrinker lists) is a potential option to consider.
>
> It would be nice to discuss the problem at LSF/MM, agree on a general path
> forward, and make a list of benchmarks that can be used to prove the solution.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a9a238e83fbb0df31c3b9b67003f8f9d1d1b6c96
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=69056ee6a8a3d576ed31e38b3b14c70d6c74edcc
> [3] https://lkml.org/lkml/2019/1/28/1865
> [4] https://lkml.org/lkml/2019/2/8/336
>
--
Dave Chinner
dchinner@redhat.com