From: "Michal Koutný" <mkoutny@suse.com>
To: Roman Gushchin <guro@fb.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Dennis Zhou <dennis@kernel.org>, Tejun Heo <tj@kernel.org>,
Christoph Lameter <cl@linux.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@kernel.org>,
Shakeel Butt <shakeelb@google.com>,
linux-mm@kvack.org, kernel-team@fb.com,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 4/5] mm: memcg: charge memcg percpu memory to the parent cgroup
Date: Tue, 11 Aug 2020 20:32:25 +0200
Message-ID: <20200811183225.GA62582@blackbook>
In-Reply-To: <20200811165527.GA1507044@carbon.DHCP.thefacebook.com>
On Tue, Aug 11, 2020 at 09:55:27AM -0700, Roman Gushchin <guro@fb.com> wrote:
> As I said, there are 2 problems with charging systemd (or a similar daemon):
> 1) It often belongs to the root cgroup.
This doesn't hold for systemd (if we agree that systemd is the most
common case).
> 2) OOMing or failing some random memory allocations is a bad way
> to "communicate" a memory shortage to the daemon.
> What we really want is to prevent creating a huge number of cgroups
There's cgroup.max.descendants for that...
> (including dying cgroups) in some specific sub-tree(s).
...oh, so is this limiting the number of cgroups or limiting resources
used?
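For reference, the descendant limit mentioned above is set by writing to an
interface file in the cgroup v2 hierarchy. Below is a minimal Python sketch,
assuming cgroup v2 is mounted and the caller is allowed to write to the target
directory. The helper names set_max_descendants and read_descendant_counts are
made up for illustration; only the file names cgroup.max.descendants and
cgroup.stat (which reports nr_descendants and nr_dying_descendants) come from
the kernel interface.

```python
from pathlib import Path


def set_max_descendants(cgroup_dir, limit):
    """Write a descendant limit for the cgroup at cgroup_dir.

    limit is a non-negative integer or the string "max" (no limit).
    Returns the path of the file that was written.
    """
    if limit != "max" and not (isinstance(limit, int) and limit >= 0):
        raise ValueError(f"invalid limit: {limit!r}")
    path = Path(cgroup_dir) / "cgroup.max.descendants"
    path.write_text(f"{limit}\n")
    return path


def read_descendant_counts(cgroup_dir):
    """Parse the 'key value' lines of cgroup.stat into a dict of ints."""
    counts = {}
    text = (Path(cgroup_dir) / "cgroup.stat").read_text()
    for line in text.splitlines():
        if not line.strip():
            continue
        key, value = line.split()
        counts[key] = int(value)
    return counts


# Hypothetical usage; the directory below is only an example path:
#   set_max_descendants("/sys/fs/cgroup/machine.slice", 1000)
#   read_descendant_counts("/sys/fs/cgroup/machine.slice")
```

Note that this caps the number of cgroups in a subtree, not the memory they
consume, which is the distinction the question above is asking about.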
> OOMing the daemon or returning -ENOMEM to some random syscalls
> will not help us to reach the goal and likely will bring a bad
> experience to a user.
If we reach a situation where memory for cgroup operations is tight,
it'll disappoint the user either way.
My premise is that a running workload is more valuable than the
accompanying manager.
> In a generic case I don't see how we can charge the cgroup which
> creates cgroups without solving these problems first.
In my understanding, "onbehalveness" is a concept useful for various
kernel threads doing deferred work. Here it's promoted to user processes
managing cgroups.
> And if there is a very special case where we have to limit it,
> we can just add an additional layer:
>
> ` root or delegated root
>   ` manager-parent-cgroup-with-a-limit
>     ` manager-cgroup (systemd, docker, ...)
>   ` [aggregation group(s)]
>     ` job-group-1
>     ` ...
>     ` job-group-n
If the charge goes to the parent of the created cgroup (job-group-i here),
then the layer adds nothing. Am I missing something?
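To make the objection concrete, here is a toy Python model of hierarchical
charging. It is not kernel code. It assumes the quoted tree places the
aggregation group next to manager-parent-cgroup-with-a-limit, both directly
under the root, and it assumes a made-up per-cgroup percpu cost of 500 bytes
and a made-up limit of 1000 bytes. The Cgroup class and create_child helper
are hypothetical names.

```python
class Cgroup:
    """Toy cgroup node with hierarchical memory accounting (not kernel code)."""

    def __init__(self, name, parent=None, limit=None):
        self.name = name
        self.parent = parent
        self.limit = limit  # bytes, or None for "no limit"
        self.usage = 0      # bytes charged to this node and its subtree

    def charge(self, nbytes):
        """Charge this cgroup and every ancestor, or fail without charging."""
        chain = []
        node = self
        while node is not None:
            chain.append(node)
            node = node.parent
        # Check all limits first so that a failure leaves usage untouched.
        for node in chain:
            if node.limit is not None and node.usage + nbytes > node.limit:
                raise MemoryError(f"{node.name}: limit exceeded")
        for node in chain:
            node.usage += nbytes


def create_child(parent, name, percpu_bytes):
    """Create a child and charge its percpu data to the parent (modeled policy)."""
    parent.charge(percpu_bytes)
    return Cgroup(name, parent)


# Assumed layout of the quoted tree.
root = Cgroup("root")
limited = Cgroup("manager-parent-cgroup-with-a-limit", root, limit=1000)
manager = Cgroup("manager-cgroup", limited)
aggregation = Cgroup("aggregation-group", root)

# The manager creates ten job cgroups under the aggregation group.
jobs = [create_child(aggregation, f"job-group-{i}", 500) for i in range(1, 11)]

print(limited.usage)      # 0
print(aggregation.usage)  # 5000
```

In this model the charge walks up from the aggregation group to the root and
never passes through the limited layer, so the 1000-byte limit is never
consulted no matter how many job cgroups are created.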
> I'd definitely charge the parent cgroup in all similar cases.
(This would mandate the controllers on the unified hierarchy, which is
fine IMO.) Then the order of enabling controllers on a subtree (e.g.
cpu,memory vs memory,cpu) by the manager would yield different charging.
This seems wrong^W confusing to me.
Thanks,
Michal