public inbox for cgroups@vger.kernel.org
From: Roman Gushchin <roman.gushchin-fxUVXftIFDnyG1zEObXtfA@public.gmane.org>
To: Michal Hocko <mhocko-IBi9RG/b67k@public.gmane.org>
Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Chris Frey <cdfrey-4/nNOD19pEMY+eTVAdjFZg@public.gmane.org>,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
	Shakeel Butt <shakeelb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Muchun Song <songmuchun-EC8Uxl6Npydl57MIdRCFDg@public.gmane.org>
Subject: Re: an argument for keeping oom_control in cgroups v2
Date: Tue, 23 Aug 2022 09:10:37 -0700
Message-ID: <YwT7/VFUTNmjarTh@P9FQF9L96D>
In-Reply-To: <YwRgOcfagx4FfQcY-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>

On Tue, Aug 23, 2022 at 07:06:01AM +0200, Michal Hocko wrote:
> On Mon 22-08-22 17:22:53, Tejun Heo wrote:
> > (cc'ing memcg folks for visibility)
> > 
> > On Mon, Aug 22, 2022 at 08:04:02AM -0400, Chris Frey wrote:
> > > In cgroups v1 we had:
> > > 
> > > 	memory.soft_limit_in_bytes
> > > 	memory.limit_in_bytes
> > > 	memory.memsw.limit_in_bytes
> > > 	memory.oom_control
> > > 
> > > Using these features, we could achieve:
> > > 
> > > 	- cause programs that were memory hungry to suffer performance, but
> > > 	  not stop (soft limit)
> 
> There is memory.high with a much more sensible semantic and
> implementation to achieve a similar thing.
> 
> > > 	- cause programs to swap before the system actually ran out of memory
> > > 	  (limit)
> 
> Not sure what this is supposed to mean.
> 
> > > 	- cause programs to be OOM-killed if they used too much swap
> > > 	  (memsw.limit...)
> 
> 
> There is an explicit swap limit. It is true that the semantic is
> different but do you have an example where you cannot really achieve
> what you need by the swap limit?
> 
> > > 
> > > 	- cause programs to halt instead of get killed (oom_control)
> > > 
> > > That last feature is something I haven't seen duplicated in the settings
> > > for cgroups v2.  In terms of handling a truly non-malicious memory hungry
> > > program, it is a feature that has no equal, because the user may require
> > > time to free up memory elsewhere before allocating more to the program,
> > > and he may not want the performance degradation, nor the loss of work,
> > > that comes from the other options.
> 
> Yes, this functionality is no longer available in v2. One reason is
> that the implementation had to be considerably reduced to block on OOM
> only for user space triggered page faults; see 3812c8c8f395 ("mm: memcg:
> do not trap chargers with full callstack on OOM"). The primary reason
> is, as Tejun indicated, that we cannot simply block a random kernel code
> path and wait for userspace, because that is a potential DoS on the rest
> of the system and on unrelated workloads, which trivially breaks
> workload separation.
> 
> This means that many other kernel paths which can cause memcg OOM cannot
> be blocked, and so the feature is severely crippled. To support this
> feature we would essentially need, for every allocating (charging)
> kernel path, a safe place to wait for userspace where no locks are held
> and the allocation failure has not yet been observed, and that is not
> feasible.

Btw, it's fairly easy to emulate the oom_control behaviour using cgroups v2:
a userspace agent can listen for memory.high/max events and use the cgroup v2
freezer to stop the workload and handle the OOM in v1 oom_control style.
The agent can run at a high/real-time priority, so I guess the behavior will
actually be quite close to the v1 experience. Much safer, though.
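
To make the idea concrete, here is a rough sketch of such an agent, not a
tested production tool. It assumes the cgroup v2 interface files
memory.events (one "key value" pair per line, including "high" and "max"
counters) and cgroup.freeze (write 1 to freeze, 0 to thaw). The function
names, the on_pressure callback and the plain polling loop are made up for
illustration; a real agent would more likely block in poll() or inotify on
memory.events rather than sleep.

```python
import os
import time


def read_events(cgroup_dir):
    """Parse memory.events ("key value" per line) into a dict of ints."""
    events = {}
    with open(os.path.join(cgroup_dir, "memory.events")) as f:
        for line in f:
            key, _, value = line.partition(" ")
            if value.strip():
                events[key] = int(value)
    return events


def set_frozen(cgroup_dir, frozen):
    """Write 1 or 0 to cgroup.freeze to stop or resume the cgroup's tasks."""
    with open(os.path.join(cgroup_dir, "cgroup.freeze"), "w") as f:
        f.write("1" if frozen else "0")


def check_once(cgroup_dir, last, keys=("high", "max")):
    """Freeze the cgroup if any watched counter grew since `last`.

    Returns (current_counters, froze).
    """
    now = read_events(cgroup_dir)
    grew = any(now.get(k, 0) > last.get(k, 0) for k in keys)
    if grew:
        set_frozen(cgroup_dir, True)
    return now, grew


def watch(cgroup_dir, on_pressure, interval=0.1):
    """Poll forever; on_pressure is a hypothetical policy hook.

    The hook runs while the workload is frozen, e.g. to free memory
    elsewhere or raise the limit, after which the workload is thawed.
    """
    last = read_events(cgroup_dir)
    while True:
        last, froze = check_once(cgroup_dir, last)
        if froze:
            on_pressure(cgroup_dir)
            set_frozen(cgroup_dir, False)
        time.sleep(interval)
```

Something like watch("/sys/fs/cgroup/myjob", handler) would be the entry
point, where the path and handler are placeholders. Setting memory.high
below memory.max presumably gives the agent room to react before the kernel
OOM killer gets involved.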

Thanks!
