From: Roman Gushchin <roman.gushchin-fxUVXftIFDnyG1zEObXtfA@public.gmane.org>
To: Michal Hocko <mhocko-IBi9RG/b67k@public.gmane.org>
Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
Chris Frey <cdfrey-4/nNOD19pEMY+eTVAdjFZg@public.gmane.org>,
cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
Shakeel Butt <shakeelb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Muchun Song <songmuchun-EC8Uxl6Npydl57MIdRCFDg@public.gmane.org>
Subject: Re: an argument for keeping oom_control in cgroups v2
Date: Tue, 23 Aug 2022 09:10:37 -0700 [thread overview]
Message-ID: <YwT7/VFUTNmjarTh@P9FQF9L96D> (raw)
In-Reply-To: <YwRgOcfagx4FfQcY-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
On Tue, Aug 23, 2022 at 07:06:01AM +0200, Michal Hocko wrote:
> On Mon 22-08-22 17:22:53, Tejun Heo wrote:
> > (cc'ing memcg folks for visibility)
> >
> > On Mon, Aug 22, 2022 at 08:04:02AM -0400, Chris Frey wrote:
> > > In cgroups v1 we had:
> > >
> > > memory.soft_limit_in_bytes
> > > memory.limit_in_bytes
> > > memory.memsw.limit_in_bytes
> > > memory.oom_control
> > >
> > > Using these features, we could achieve:
> > >
> > > - cause programs that were memory hungry to suffer performance, but
> > > not stop (soft limit)
>
> There is memory.high, with much more sensible semantics and
> implementation, which achieves a similar effect.
>
> > > - cause programs to swap before the system actually ran out of memory
> > > (limit)
>
> Not sure what this is supposed to mean.
>
> > > - cause programs to be OOM-killed if they used too much swap
> > > (memsw.limit...)
>
>
> There is an explicit swap limit (memory.swap.max). It is true that the
> semantics are different, but do you have an example where you cannot
> achieve what you need with the swap limit?
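For reference, both memory.high and memory.swap.max are plain cgroup v2
control files that take a byte value (or the string "max"). A minimal
Python sketch of setting them from userspace; the cgroup path in the
usage comment is illustrative and assumes a delegated v2 hierarchy:

```python
from pathlib import Path

def set_limit(cgroup_dir: str, knob: str, value) -> None:
    """Write a limit to a cgroup v2 control file, e.g. memory.high or
    memory.swap.max. `value` is a byte count or the string "max"."""
    Path(cgroup_dir, knob).write_text(f"{value}\n")

# Illustrative usage (requires a delegated cgroup v2 hierarchy and
# write permission on the control files):
# set_limit("/sys/fs/cgroup/myapp", "memory.high", 512 * 1024 * 1024)
# set_limit("/sys/fs/cgroup/myapp", "memory.swap.max", "max")
```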
>
> > >
> > > - cause programs to halt instead of get killed (oom_control)
> > >
> > > That last feature is something I haven't seen duplicated in the settings
> > > for cgroups v2. In terms of handling a truly non-malicious memory-hungry
> > > program, it is a feature that has no equal, because the user may require
> > > time to free up memory elsewhere before allocating more to the program,
> > > and may want neither the performance degradation nor the loss of work
> > > that comes with the other options.
>
> Yes, this functionality is not available in v2 anymore. One reason is
> that the implementation had to be considerably reduced to only block on
> OOM for userspace-triggered page faults; see commit 3812c8c8f395 ("mm:
> memcg: do not trap chargers with full callstack on OOM"). The primary
> reason is, as Tejun indicated, that we cannot simply block a random
> kernel code path and wait for userspace, because that is a potential
> DoS on the rest of the system and on unrelated workloads, which is a
> trivial breakage of workload separation.
>
> This means that many other kernel paths which can cause a memcg OOM
> cannot be blocked, and so the feature is severely crippled. To support
> this feature we would essentially need, for every allocation (charging)
> kernel path, a safe place to wait for userspace where no locks are held
> and yet the allocation failure is not observed, and that is not
> feasible.
Btw, it's fairly easy to emulate the oom_control behavior using cgroups v2:
a userspace agent can listen for memory.high/max events and use the cgroup v2
freezer to stop the workload, then handle the OOM in the v1 oom_control style.
The agent can run at a high/real-time priority, so I guess the behavior will
be quite close to the v1 experience, and much safer.
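A rough Python sketch of such an agent's core step, assuming a delegated
cgroup v2 hierarchy; the cgroup path and the sleep-based polling in the
usage comment are illustrative (a real agent would poll(2) on the
memory.events file descriptor for notifications instead):

```python
import time
from pathlib import Path

def parse_memory_events(text: str) -> dict:
    """Parse the flat "key value" format of a memory.events file."""
    events = {}
    for line in text.splitlines():
        key, _, val = line.partition(" ")
        if val:
            events[key] = int(val)
    return events

def freeze_on_event(cgroup_dir: str, key: str, last_count: int) -> int:
    """Freeze the cgroup if the given memory.events counter (e.g. "high")
    advanced past last_count; return the current counter value."""
    text = Path(cgroup_dir, "memory.events").read_text()
    count = parse_memory_events(text).get(key, 0)
    if count > last_count:
        # cgroup v2 freezer: writing "1" stops every task in the cgroup;
        # the operator can free memory elsewhere, then write "0" to resume.
        Path(cgroup_dir, "cgroup.freeze").write_text("1")
    return count

# Illustrative main loop (path and interval are assumptions):
# seen = 0
# while True:
#     seen = freeze_on_event("/sys/fs/cgroup/myapp", "high", seen)
#     time.sleep(0.1)
```

Note that triggering on "high" events gives the agent time to act, since
memory.high only throttles the workload; by the time a max/oom event
fires, the kernel OOM killer may already have acted, since (as discussed
above) the kernel does not wait for userspace there.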
Thanks!
Thread overview: 5+ messages
2022-08-22 12:04 an argument for keeping oom_control in cgroups v2 Chris Frey
[not found] ` <20220822120402.GA20333-4/nNOD19pEMY+eTVAdjFZg@public.gmane.org>
2022-08-23 3:22 ` Tejun Heo
[not found] ` <YwRIDTmZJflhKP2n-NiLfg/pYEd1N0TnZuCh8vA@public.gmane.org>
2022-08-23 5:06 ` Michal Hocko
[not found] ` <YwRgOcfagx4FfQcY-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2022-08-23 16:10 ` Roman Gushchin [this message]
2022-08-24 9:30 ` Chris Frey