public inbox for linux-kernel@vger.kernel.org
* Protecting processes from the OOM killer
@ 2003-02-28  1:21 Dan Kegel
  2003-02-28 13:40 ` Alan Cox
  0 siblings, 1 reply; 9+ messages in thread
From: Dan Kegel @ 2003-02-28  1:21 UTC
  To: linux-kernel

For a while now, I've been trying to figure out how
to make the oom killer not kill important processes.

How about rewarding processes that have an
RSS limit if they stay well below it?
The operator can then mark processes that are important
by using 'ulimit -m'.
(This is orthogonal to Rik's recent patch.)

--- oom_kill.c.orig	2002-09-26 17:31:12.000000000 -0700
+++ oom_kill.c	2003-02-27 16:59:46.000000000 -0800
@@ -86,6 +90,18 @@
  		points *= 2;

  	/*
+	 * Processes which *have* an RSS limit, but which are under half of it,
+	 * are behaving well, so halve their badness points.
+	 * Do it again if they're under a quarter of their RSS limit.
+	 */
+	if (p->rlim[RLIMIT_RSS].rlim_max != ULONG_MAX) {
+		if (p->mm->rss < (p->rlim[RLIMIT_RSS].rlim_max >> (PAGE_SHIFT+1)))
+			points /= 2;
+		if (p->mm->rss < (p->rlim[RLIMIT_RSS].rlim_max >> (PAGE_SHIFT+2)))
+			points /= 2;
+	}
+
+	/*
  	 * Superuser processes are usually more important, so we make it
  	 * less likely that we kill those.
  	 */
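
As an illustration only, the discount in the hunk above can be modeled as a
standalone function. This is a sketch, not kernel code: `discount_points` and
`SKETCH_PAGE_SHIFT` are hypothetical names, and a 4 KiB page size is assumed.

```c
#include <limits.h>

/* Hypothetical stand-in for the kernel's PAGE_SHIFT; 12 means 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Apply the patch's discount to a badness score.
 * rss_pages:       resident set size in pages (p->mm->rss in the patch)
 * rss_limit_bytes: hard RSS limit in bytes (p->rlim[RLIMIT_RSS].rlim_max)
 */
unsigned long discount_points(unsigned long points,
                              unsigned long rss_pages,
                              unsigned long rss_limit_bytes)
{
    /* No limit set: the process gets no reward. */
    if (rss_limit_bytes == ULONG_MAX)
        return points;

    /* Under half of the limit: halve the score. */
    if (rss_pages < (rss_limit_bytes >> (SKETCH_PAGE_SHIFT + 1)))
        points /= 2;
    /* Under a quarter of the limit: halve it again. */
    if (rss_pages < (rss_limit_bytes >> (SKETCH_PAGE_SHIFT + 2)))
        points /= 2;

    return points;
}
```

With a 64 MiB limit (16384 pages), a process at 10000 pages keeps its score,
one at 5000 pages has it halved, and one at 1000 pages has it quartered. Under
this logic a limit of zero earns no discount, since no RSS is below zero.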

-- 
Dan Kegel
http://www.kegel.com
http://counter.li.org/cgi-bin/runscript/display-person.cgi?user=78045



* Re: Protecting processes from the OOM killer
  2003-02-28  1:21 Protecting processes from the OOM killer Dan Kegel
@ 2003-02-28 13:40 ` Alan Cox
  2003-02-28 14:19   ` Ville Herva
  2003-02-28 16:08   ` Dan Kegel
  0 siblings, 2 replies; 9+ messages in thread
From: Alan Cox @ 2003-02-28 13:40 UTC
  To: Dan Kegel; +Cc: Linux Kernel Mailing List

On Fri, 2003-02-28 at 01:21, Dan Kegel wrote:
> For a while now, I've been trying to figure out how
> to make the oom killer not kill important processes.

How about by not allowing your system to excessively overcommit?
Everything else is armwaving "works half the time" stuff. By the time
the OOM kicks in the game is already over. The rlimit one doesn't deal
with things like fork explosions where you have lots of processes
all under 1/4 of the rlimit range who cumulatively overcommit. In
fact you now pick harder on other tasks...



* Re: Protecting processes from the OOM killer
  2003-02-28 13:40 ` Alan Cox
@ 2003-02-28 14:19   ` Ville Herva
  2003-02-28 15:37     ` Alan Cox
  2003-02-28 16:08   ` Dan Kegel
  1 sibling, 1 reply; 9+ messages in thread
From: Ville Herva @ 2003-02-28 14:19 UTC
  To: Alan Cox; +Cc: Linux Kernel Mailing List

On Fri, Feb 28, 2003 at 01:40:19PM +0000, you [Alan Cox] wrote:
> 
> How about by not allowing your system to excessively overcommit?
> Everything else is armwaving "works half the time" stuff. 

Which invites the question: the strict overcommit stuff from -ac (the 'echo
{2,3} > /proc/sys/vm/overcommit_memory' stuff) hasn't found its way to
mainline yet, has it? I wonder if it would be compatible with up-to-date
-aa vm...


-- v --

v@iki.fi


* Re: Protecting processes from the OOM killer
  2003-02-28 14:19   ` Ville Herva
@ 2003-02-28 15:37     ` Alan Cox
  0 siblings, 0 replies; 9+ messages in thread
From: Alan Cox @ 2003-02-28 15:37 UTC
  To: Ville Herva; +Cc: Linux Kernel Mailing List

On Fri, 2003-02-28 at 14:19, Ville Herva wrote:
> On Fri, Feb 28, 2003 at 01:40:19PM +0000, you [Alan Cox] wrote:
> > 
> > How about by not allowing your system to excessively overcommit?
> > Everything else is armwaving "works half the time" stuff. 
> 
> Which invites the question: the strict overcommit stuff from -ac (the 'echo
> {2,3} > /proc/sys/vm/overcommit_memory' stuff) hasn't found its way to
> mainline yet, has it? I wonder if it would be compatible with up-to-date
> -aa vm...

Marcelo didn't want it for base. It's in 2.5 and in -ac. There is no
longer any rmap requirement on the code, so it should "just work" with
the -aa changes too.
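
For reference, the current mode can be read back from the same procfs file.
A minimal sketch follows; `read_overcommit_mode` is a hypothetical helper, and
which values are valid (the thread mentions 2 and 3 for the strict modes in
-ac) depends on the kernel in use.

```c
#include <stdio.h>

/*
 * Read the integer stored in a procfs-style file such as
 * /proc/sys/vm/overcommit_memory.
 * Returns that integer, or -1 if the file cannot be opened or parsed.
 */
int read_overcommit_mode(const char *path)
{
    FILE *f = fopen(path, "r");
    int mode;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}
```

Calling `read_overcommit_mode("/proc/sys/vm/overcommit_memory")` on a kernel
that has the file returns the value last written to it.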



* Re: Protecting processes from the OOM killer
  2003-02-28 13:40 ` Alan Cox
  2003-02-28 14:19   ` Ville Herva
@ 2003-02-28 16:08   ` Dan Kegel
  2003-02-28 22:13     ` James Antill
  2003-03-03 14:45     ` Jesse Pollard
  1 sibling, 2 replies; 9+ messages in thread
From: Dan Kegel @ 2003-02-28 16:08 UTC
  To: Alan Cox; +Cc: Linux Kernel Mailing List

Alan Cox wrote:
> On Fri, 2003-02-28 at 01:21, Dan Kegel wrote:
> 
>>For a while now, I've been trying to figure out how
>>to make the oom killer not kill important processes.
> 
> 
> How about by not allowing your system to excessively overcommit?

(I'm using 2.4.18; is
http://www.kernel.org/pub/linux/kernel/people/rml/vm/strict-overcommit/v2.4/vm-strict-overcommit-rml-2.4.18-1.patch
still the appropriate patch for that?)

> Everything else is armwaving "works half the time" stuff. By the time
> the OOM kicks in the game is already over.

Even with overcommit disallowed, the OOM killer is going to run
when my users try to run too big a job, so I would still like
the OOM killer to behave "well".

> The rlimit one doesn't deal
> with things like fork explosions where you have lots of processes
> all under 1/4 of the rlimit range who cumulatively overcommit. In
> fact you now pick harder on other tasks...

We do not see fork explosions in our workload, but if we did,
we could abuse the RSS limit for now by setting it to zero except for
the processes we wanted to protect from the OOM killer.
If that works in practice the same idea could be done without the abuse;
the RSS limit is just a handy knob.
- Dan
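
For readers who want to turn this knob from a program rather than the shell,
the RSS limit that `ulimit -m` adjusts is `RLIMIT_RSS`, set through
`setrlimit()`. A minimal sketch, where `set_rss_limit` is a hypothetical helper
name; the patch earlier in the thread tests the hard limit (`rlim_max`), so the
sketch sets both the soft and hard values.

```c
#include <sys/resource.h>

/*
 * Set both the soft and hard RSS limit of the calling process, in bytes.
 * Returns 0 on success, -1 on failure (errno is set by setrlimit).
 */
int set_rss_limit(rlim_t bytes)
{
    struct rlimit rl;

    rl.rlim_cur = bytes;  /* soft limit */
    rl.rlim_max = bytes;  /* hard limit, the value the patch inspects */
    return setrlimit(RLIMIT_RSS, &rl);
}
```

An unprivileged process can lower its hard limit but not raise it again, and
the limit is inherited by children, so a launcher would call this just before
exec'ing the process it wants to mark.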

-- 
Dan Kegel
http://www.kegel.com
http://counter.li.org/cgi-bin/runscript/display-person.cgi?user=78045



* Re: Protecting processes from the OOM killer
  2003-02-28 16:08   ` Dan Kegel
@ 2003-02-28 22:13     ` James Antill
  2003-03-03 14:45     ` Jesse Pollard
  1 sibling, 0 replies; 9+ messages in thread
From: James Antill @ 2003-02-28 22:13 UTC
  To: Dan Kegel; +Cc: Linux Kernel Mailing List

Dan Kegel <dank@kegel.com> writes:

> Alan Cox wrote:
> > On Fri, 2003-02-28 at 01:21, Dan Kegel wrote:
> >
> > Everything else is armwaving "works half the time" stuff. By the time
> > the OOM kicks in the game is already over.
> 
> Even with overcommit disallowed, the OOM killer is going to run
> when my users try to run too big a job, so I would still like
> the OOM killer to behave "well".

 If OOM is called you've overcommitted memory, so this isn't true
... no overcommit == NULL from malloc() etc.

-- 
# James Antill -- james@and.org
:0:
* ^From: .*james@and\.org
/dev/null


* Re: Protecting processes from the OOM killer
  2003-02-28 16:08   ` Dan Kegel
  2003-02-28 22:13     ` James Antill
@ 2003-03-03 14:45     ` Jesse Pollard
  2003-03-03 16:23       ` Alan Cox
  2003-03-03 16:41       ` Dan Kegel
  1 sibling, 2 replies; 9+ messages in thread
From: Jesse Pollard @ 2003-03-03 14:45 UTC
  To: Dan Kegel, Alan Cox; +Cc: Linux Kernel Mailing List

On Friday 28 February 2003 10:08 am, Dan Kegel wrote:
> Alan Cox wrote:
snip
> > Everything else is armwaving "works half the time" stuff. By the time
> > the OOM kicks in the game is already over.
>
> Even with overcommit disallowed, the OOM killer is going to run
> when my users try to run too big a job, so I would still like
> the OOM killer to behave "well".

Shouldn't - the process the user tries to run will not be started since
it must reserve the space first. malloc will fail immediately, allowing the
process to handle the event gracefully and exit.

Anything else is a bug in the application.
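
A sketch of the pattern described here, assuming strict overcommit so that
malloc() reports failure up front; `alloc_job_buffer` is a hypothetical helper,
not something from the thread.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Allocate and zero a work buffer of `bytes` bytes.
 * Returns NULL (after reporting on stderr) instead of crashing
 * when the allocator refuses the request.
 */
void *alloc_job_buffer(size_t bytes)
{
    void *buf = malloc(bytes);

    if (buf == NULL) {
        fprintf(stderr, "cannot reserve %zu bytes for job\n", bytes);
        return NULL;
    }
    /* Write to the buffer so its pages are actually allocated. */
    memset(buf, 0, bytes);
    return buf;
}
```

A caller would check the result and exit with an error status when it is NULL,
rather than continuing and faulting later.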

-- 
-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.


* Re: Protecting processes from the OOM killer
  2003-03-03 14:45     ` Jesse Pollard
@ 2003-03-03 16:23       ` Alan Cox
  2003-03-03 16:41       ` Dan Kegel
  1 sibling, 0 replies; 9+ messages in thread
From: Alan Cox @ 2003-03-03 16:23 UTC
  To: Jesse Pollard; +Cc: Dan Kegel, Linux Kernel Mailing List

On Mon, 2003-03-03 at 14:45, Jesse Pollard wrote:
> Shouldn't - the process the user tries to run will not be started since
> it must reserve the space first. malloc will fail immediately, allowing the
> process to handle the event gracefully and exit.
> 
> Anything else is a bug in the application.

The one case you can't cover cleanly in C is a stack grow exceeding memory
usage. At that point it requires a tiny bit of magic. You can do it, but
the overcommit blocker has to armwave a little for the kernel and other
things, so I've never seen it happen in a normal situation.



* Re: Protecting processes from the OOM killer
  2003-03-03 14:45     ` Jesse Pollard
  2003-03-03 16:23       ` Alan Cox
@ 2003-03-03 16:41       ` Dan Kegel
  1 sibling, 0 replies; 9+ messages in thread
From: Dan Kegel @ 2003-03-03 16:41 UTC
  To: Jesse Pollard; +Cc: Alan Cox, Linux Kernel Mailing List

Jesse Pollard wrote:
> On Friday 28 February 2003 10:08 am, Dan Kegel wrote:
> 
>>Alan Cox wrote:
> 
> snip
> 
>>>Everything else is armwaving "works half the time" stuff. By the time
>>>the OOM kicks in the game is already over.
>>
>>Even with overcommit disallowed, the OOM killer is going to run
>>when my users try to run too big a job, so I would still like
>>the OOM killer to behave "well".
> 
> 
> Shouldn't - the process the user tries to run will not be started since
> it must reserve the space first. malloc will fail immediately, allowing the
> process to handle the event gracefully and exit.

I thought of that about five minutes after I hit 'send'.

I have a feeling that there might still be a few cases
not perfectly covered by the strict overcommit patch.
Say, memory allocations due to incoming network traffic.
I guess if memory runs out during incoming traffic, the kernel
should simply drop the traffic.  Until all those situations
are nicely ironed out, there's still some chance the OOM killer
might run even on a strict overcommit system.

But enough talking; I need to go try it.
- Dan

-- 
Dan Kegel
http://www.kegel.com
http://counter.li.org/cgi-bin/runscript/display-person.cgi?user=78045


