linux-kernel.vger.kernel.org archive mirror
From: Mandeep Singh Baines <msb@chromium.org>
To: Bodo Eggert <7eggert@gmx.de>
Cc: David Rientjes <rientjes@google.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Ying Han <yinghan@google.com>, Bodo Eggert <7eggert@web.de>,
	"Figo.zhang" <figo1802@gmail.com>
Subject: Re: [PATCH] Revert oom rewrite series
Date: Tue, 16 Nov 2010 16:48:54 -0800
Message-ID: <20101117004854.GA7153@google.com>
In-Reply-To: <alpine.LSU.0.999.1011170035050.5484@be1.lrz>

Bodo Eggert (7eggert@gmx.de) wrote:
> On Mon, 15 Nov 2010, David Rientjes wrote:
> > On Tue, 16 Nov 2010, Bodo Eggert wrote:
> 
> > > > CAP_SYS_RESOURCE threads have full control over their oom killing priority
> > > > by /proc/pid/oom_score_adj
> > > 
> > > , but unless they were written in the last few months, designed for Linux,
> > > and their author took the time to research each external process invocation,
> > > they cannot be aware of this possibility.
> > > 
> > 
> > You're clearly wrong, CAP_SYS_RESOURCE has been required to modify oom_adj 
> > for over five years (as long as the git history).  8fb4fc68, merged into 
> > 2.6.20, allowed tasks to raise their own oom_adj but not decrease it.  
> > That is unchanged by the rewrite.
> 
> You are misunderstanding me. It was allowed to do this, but it did not need 
> to do it yet. It was enough to be a well-written POSIX application without 
> linux-specific OOM hacks for some specific kernel versions.
> 
> > > Besides that, if each process is supposed to change the default, the default
> > > is wrong.
> > 
> > That doesn't make any sense, if you want to protect a thread from the oom 
> > killer you're going to need to modify oom_score_adj, the kernel can't know 
> > what you perceive as being vital.  Having CAP_SYS_RESOURCE alone does not 
> > imply that, it only allows unbounded access to resources.  That's 
> > completely orthogonal to the goal of the oom killer heuristic, which is to 
> > find the most memory-hogging task to kill.
> 
> The old oom killer's task was to guess the best victim to kill. For me, it 
> did a good job (but the system kept thrashing for too long until it kicked

Here's a patch I've been working on to control thrashing.

http://lkml.org/lkml/2010/10/28/289

It works well for our app, a web browser: we'd rather OOM quickly and kill
a browser tab than thrash for a few minutes and then OOM. I'm still working
on a more generally useful solution.

> the offender). Looking at CAP_SYS_RESOURCE was one way to recognize 
> important processes.
> 
> > > 1) The exponential scale did have a low resolution.
> > > 
> > > 2) The heuristics were developed using much brain power and much
> > >    trial-and-error. You are going back to basics, and some people
> > >    are not convinced that this is better. I googled and I did not
> > >    find a discussion about how and why the new score was designed
> > >    this way.
> > >    looking at the output of:
> > >    cd /proc; for a in [0-9]*; do
> > >      echo `cat $a/oom_score` $a `perl -pes/'\0.*$'// < $a/cmdline`;
> > >    done|grep -v ^0|sort -n |less
> > >    , I'm not convinced, either.
> > > 
> > 
> > The old heuristics were a mixture of arbitrary values that didn't adjust 
> > scores based on a unit and would often cause the incorrect task to be 
> > targeted because there was no clear goal being achieved.  The new 
> > heuristic has a solid goal: to identify and kill the most memory-hogging 
> > task that is eligible given the context in which the oom occurs.  If you 
> > disagree with that goal and want any of the old heuristics reintroduced, 
> > please show that it makes sense in the oom killer.
> 
> The first old OOM killer did the same as you promise the current one does,
> except for your bugfixes. That's why it killed the wrong applications and
> all the heuristics were added until the complaints stopped.
> 
> Of course, I have not yet tested your OOM killer; maybe it really is better.
> Heuristics tend to rot and you did much work to make it right.
> 
> I don't want the old OOM killer back, but I don't want you to fall
> into the same pits as the pre-old OOM killer used to do.
> 
> > > PS) Mapping an exponential value to a linear score is bad. E.g. an
> > >     oom_adj of 8 should make a 1-MB process as likely to be killed as
> > >     a 256-MB process with oom_adj=0.
> > > 
> > 
> > To show that, you would have to show that an application that exists today 
> > uses an oom_adj for something other than polarization and is based on a 
> > calculation of allowable memory usage.  It simply doesn't exist.
> 
> No such application should exist because the OOM killer should DTRT.
> oom_adj was supposed to let the sysadmin lower his mission-critical
> DB's score to be just lower than the less-important tasks, or to
> point the kernel to his ever-faulty and easily-restarted browser.
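
To put numbers on the PS above, here is a rough sketch that is not taken
from the thread or from the kernel source. It assumes the old heuristic
applied oom_adj as a bit shift on the score, and that the rewrite converts a
legacy oom_adj to oom_score_adj by linear scaling (oom_adj * 1000 / 17, with
15 mapped to 1000). The function names, the use of megabytes as the base
score, and the 4096 MB machine are illustrative assumptions, and the other
terms of both heuristics are left out.

```python
# Assumed constants, modelled on the kernel's oom definitions of that era.
OOM_DISABLE = -17
OOM_ADJUST_MAX = 15
OOM_SCORE_ADJ_MAX = 1000


def old_badness(base_points: int, oom_adj: int) -> int:
    """Old scheme (simplified): oom_adj shifts the score by that many bits."""
    if oom_adj > 0:
        return max(base_points, 1) << oom_adj
    return base_points >> -oom_adj


def oom_adj_to_score_adj(oom_adj: int) -> int:
    """Assumed linear conversion from legacy oom_adj to oom_score_adj."""
    if oom_adj == OOM_ADJUST_MAX:
        return OOM_SCORE_ADJ_MAX
    # int() truncates toward zero, like C integer division.
    return int(oom_adj * OOM_SCORE_ADJ_MAX / -OOM_DISABLE)


def new_badness(mem_mb: int, total_mb: int, score_adj: int) -> int:
    """New scheme (simplified): memory share in 1/1000ths plus score_adj."""
    points = mem_mb * 1000 // total_mb + score_adj
    return min(max(points, 0), 1000)


# Old scheme: a 1 MB task at oom_adj=8 ties with a 256 MB task at oom_adj=0.
small_old = old_badness(1, 8)
large_old = old_badness(256, 0)

# New scheme on a hypothetical 4096 MB machine: the same settings no longer tie.
small_new = new_badness(1, 4096, oom_adj_to_score_adj(8))
large_new = new_badness(256, 4096, oom_adj_to_score_adj(0))
```

Under these assumptions the old scheme treats the two tasks as equally
bad, while the linear mapping ranks the 1 MB task far above the 256 MB one.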
> 
> > > PS2) Because I saw this in your presentation PDF: (@udev-people)
> > >     The -17 score of udevd is wrong, since it will even prevent
> > >     the OOM killer from working correctly if it grows to 100 MB:
> > > 
> > 
> > Threads with CAP_SYS_RESOURCE are free to lower the oom_score_adj of any 
> > thread they deem fit and that includes applications that lower their own 
> > oom_score_adj.  The kernel isn't going to prohibit users from setting 
> > their own oom_score_adj.
> 
> My point is: The udev people should not prevent the OOM killer 
> unconditionally, it has an important task in case something goes wrong.
> I just didn't want to start a new thread at that time of day.
> -- 
> How do I set my laser printer on stun?
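
For anyone who wants to repeat the oom_score survey from point 2 above
without the shell and perl one-liner, the following is a sketch of an
equivalent in Python. The function name oom_scores and its proc_root
parameter are made up for illustration; it reads the same
/proc/<pid>/oom_score and /proc/<pid>/cmdline files, drops zero scores, and
sorts by score.

```python
from pathlib import Path


def oom_scores(proc_root="/proc"):
    """Return sorted (score, pid, argv0) tuples for tasks with a nonzero score."""
    rows = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            score = int((entry / "oom_score").read_text().strip())
            cmdline = (entry / "cmdline").read_bytes()
        except (OSError, ValueError):
            continue  # the task exited or is unreadable; ignore it
        if score == 0:
            continue  # same effect as "grep -v ^0" in the shell version
        # Keep only argv[0], like the perl substitution that cuts at the first NUL.
        argv0 = cmdline.split(b"\0", 1)[0].decode(errors="replace")
        rows.append((score, int(entry.name), argv0))
    return sorted(rows)
```

Calling oom_scores() with no argument reads the live /proc on a Linux
system; passing a different directory is only useful for testing.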
