From: Evgeniy Polyakov <zbr@ioremap.net>
To: Theodore Tso <tytso@mit.edu>,
David Rientjes <rientjes@google.com>,
Alan Cox <alan@lxorguk.ukuu.org.uk>,
linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: Linux killed Kenny, bastard!
Date: Wed, 14 Jan 2009 02:02:40 +0300
Message-ID: <20090113230240.GA30192@ioremap.net>
In-Reply-To: <20090113224941.GA14730@mit.edu>
On Tue, Jan 13, 2009 at 05:49:41PM -0500, Theodore Tso (tytso@mit.edu) wrote:
> > The user does not work with some magically calculated scores; he just
> > starts the processes and knows only their names. The user can specify
> > a pid, but with short-lived connections that is not possible. Changing
> > the parent's oom score opens a huge possibility of killing it, while
> > some application server (or database) should never be killed, and
> > only some of its clients (which work for the users and not for the
> > calculating backend, for example) have to be killed.
>
> The standard way this gets handled for resource limits is very simple:
>
> 1) parent forks the child process
> 2) in the child process we set up resource limits, adjust oom
> 3) exec the child's program.
>
> As Alan has already pointed out to you:
>
> (echo XXXX > /proc/self/oom_adj ; exec /usr/bin/program)
Yes, I saw that in the archive, but did not receive it myself, so I did
not answer. This works in the simple case above, but if we dig a little
into the case where there are children, the parent has to live, and not
all children should be treated equally by the oom-killer, things change
dramatically. And we cannot change the sources. Well, in my particular
case we can, but this is not about a single system :)
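For reference, the fork / adjust / exec sequence quoted above can be
sketched roughly as follows. This is only an illustration, assuming
Python's subprocess.Popen with preexec_fn (which runs in the child
between fork and exec). The helper name run_with_oom_adj and the
adj_path parameter are made up for the example, and lowering the value
in the real /proc/self/oom_adj normally requires privileges.

```python
import subprocess


def run_with_oom_adj(argv, oom_adj, adj_path="/proc/self/oom_adj"):
    """Fork, set oom_adj in the child only, then exec argv there.

    adj_path is a parameter only so the sketch can be tried against a
    scratch file; the helper itself is illustrative, not a kernel API.
    """

    def adjust_in_child():
        # Runs in the child after fork() and before exec(), so the
        # parent's own oom_adj is left untouched.
        with open(adj_path, "w") as f:
            f.write(f"{oom_adj}\n")

    proc = subprocess.Popen(argv, preexec_fn=adjust_in_child)
    return proc.wait()
```

Something like run_with_oom_adj(["/usr/bin/program"], 15) would then
roughly match the shell one-liner, except that the caller waits for the
program instead of being replaced by it. It still assumes that whoever
starts the program can be changed to do this, which is exactly what is
not possible in the case described.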
> There are two problems; one is whether or not the OOM protection is
> inherited, and how one sets OOM protection --- and I think you
> will find a huge resistance to using names as a way of expressing
> policy.
Yup, this whole thread shows this resistance quite well :)
> The second problem is that oom_adj scoring is a heuristic which is
> hard for system administrators to understand --- and these are
> separable problems. Don't try to conflate them, and don't try using
> the fact that a random score echo'ed into /proc/pid/oom_adj is hard to
> tune as a justification for using process executable names.
I tried it, and I do agree that it can be used to turn the oom-killer
on or off, but not for tuning. And even that does not really work in
the case shown, where we cannot change the application and the main
goal is to save the parent and kill only some subset of the short-lived
children. So we cannot really adjust the parent's oom score and have the
children inherit it, since this will put the parent and the important
children at risk.
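To make the inheritance problem concrete, here is a toy model (plain
Python, not kernel code; the Proc class and the process names are
invented for the illustration). It only assumes what is said above: a
child starts with a copy of its parent's adjustment.

```python
from dataclasses import dataclass, field


@dataclass
class Proc:
    name: str
    oom_adj: int = 0
    children: list = field(default_factory=list)

    def fork(self, name):
        # The child starts with a copy of the parent's oom_adj.
        child = Proc(name, self.oom_adj)
        self.children.append(child)
        return child


def family_adjustments(parent):
    """Return {name: oom_adj} for the parent and its direct children."""
    result = {parent.name: parent.oom_adj}
    for child in parent.children:
        result[child.name] = child.oom_adj
    return result


server = Proc("server")
server.oom_adj = 10              # try to make the clients preferred victims
server.fork("backend-worker")    # important, should survive
server.fork("client-handler")    # short-lived, fine to kill
print(family_adjustments(server))
# {'server': 10, 'backend-worker': 10, 'client-handler': 10}
```

All three end up with the same value, so a single adjustment made on
the parent cannot separate the short-lived clients from the parent and
the important children.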
> If you want to argue that using containers is too hard, and there ought
> to be a simpler tuning parameter where (for the sake of argument) all
> processes are given a number from 0 to 10, where 5 is the default, and
> higher numbers will be picked unconditionally over lower numbers, and
> the existing OOM score is used to distinguish between two processes with
> the same OOM protection, that's fine.
> How we set that OOM protection class, whether it is via setrlimit() or
> echoing into a magic /proc/pid/oom_protection file, and whether it
> inherits across fork and exec calls, are a separate question.
Let's leave containers out of the picture. They may or may not work,
but they are definitely not an issue on the given systems. Having
simpler tunables would be great, but we cannot change the existing ones,
since they are already an established ABI. The documentation could be
extended though; I can cook up a patch tomorrow if no one else does.
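Just to pin down the class-based selection described in the quote (it
is a proposal for the sake of argument, not an existing interface), it
could be sketched like this. Task, oom_class and pick_victim are
hypothetical names; oom_score stands for the existing heuristic score.

```python
from typing import NamedTuple


class Task(NamedTuple):
    name: str
    oom_class: int   # hypothetical class 0..10, default 5
    oom_score: int   # existing heuristic score, higher = likelier victim


def pick_victim(tasks):
    """Highest class wins unconditionally; oom_score only breaks ties."""
    return max(tasks, key=lambda t: (t.oom_class, t.oom_score))


tasks = [
    Task("database", 2, 9000),
    Task("shell", 5, 50),
    Task("client-a", 8, 120),
    Task("client-b", 8, 300),
]
print(pick_victim(tasks).name)
# client-b
```

The database has by far the highest score but is never chosen while a
task in a higher class exists; the score only decides between client-a
and client-b.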
--
Evgeniy Polyakov