* [PATCH] oom killer killing all threads
From: Eric Lammerts @ 2001-08-13 21:12 UTC
To: Rik van Riel; +Cc: linux-kernel
Hello,
I recently had the following problem: Roxen (a webserver that uses
threads) was running out of control, eating up more and more memory.
(btw, I'd rather run Apache but it's not my decision to make).
The oom killer kicked in and started killing roxen processes.
Apparently it didn't succeed in killing _all_ threads. So it didn't
help at all, and the machine had to be rebooted.
Wouldn't it be a good idea to kill all processes that have the same
->mm as the process that was selected to be killed? The patch below
implements it. I've tested it and it seems to work nicely.
Eric
--- linux-2.4.8-ac3/mm/oom_kill.c.orig Sat Jul 7 02:02:23 2001
+++ linux-2.4.8-ac3/mm/oom_kill.c Mon Aug 13 23:06:07 2001
@@ -132,34 +132,20 @@
}
}
read_unlock(&tasklist_lock);
return chosen;
}
/**
- * oom_kill - kill the "best" process when we run out of memory
- *
- * If we run out of memory, we have the choice between either
- * killing a random task (bad), letting the system crash (worse)
- * OR try to be smart about which process to kill. Note that we
- * don't have to be perfect here, we just have to be good.
- *
* We must be careful though to never send SIGKILL a process with
* CAP_SYS_RAW_IO set, send SIGTERM instead (but it's unlikely that
* we select a process with CAP_SYS_RAW_IO set).
*/
-void oom_kill(void)
+void oom_kill_task(struct task_struct *p)
{
-
- struct task_struct *p = select_bad_process();
-
- /* Found nothing?!?! Either we hang forever, or we panic. */
- if (p == NULL)
- panic("Out of memory and no killable processes...\n");
-
printk(KERN_ERR "Out of Memory: Killed process %d (%s).\n", p->pid, p->comm);
/*
* We give our sacrificial lamb high priority and access to
* all the memory it needs. That way it should be able to
* exit() and clear out its resources quickly...
*/
@@ -168,15 +154,39 @@
/* This process has hardware access, be more careful. */
if (cap_t(p->cap_effective) & CAP_TO_MASK(CAP_SYS_RAWIO)) {
force_sig(SIGTERM, p);
} else {
force_sig(SIGKILL, p);
}
+}
+/**
+ * oom_kill - kill the "best" process when we run out of memory
+ *
+ * If we run out of memory, we have the choice between either
+ * killing a random task (bad), letting the system crash (worse)
+ * OR try to be smart about which process to kill. Note that we
+ * don't have to be perfect here, we just have to be good.
+ */
+void oom_kill(void)
+{
+ struct task_struct *p = select_bad_process(), *q;
+
+ /* Found nothing?!?! Either we hang forever, or we panic. */
+ if (p == NULL)
+ panic("Out of memory and no killable processes...\n");
+
+ /* kill all processes that share the ->mm (i.e. all threads) */
+ read_lock(&tasklist_lock);
+ for_each_task(q) {
+ if(q->mm == p->mm) oom_kill_task(q);
+ }
+ read_unlock(&tasklist_lock);
+
/*
* Make kswapd go out of the way, so "p" has a good chance of
* killing itself before someone else gets the chance to ask
* for more memory.
*/
current->policy |= SCHED_YIELD;
schedule();
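
The loop the patch adds to oom_kill() can be illustrated outside the kernel. What follows is a minimal user-space sketch, not kernel code: struct mm, struct task, the killed flag and kill_mm_sharers() are hypothetical stand-ins for mm_struct, task_struct, the signal delivery in oom_kill_task() and the for_each_task() loop. It assumes the selected victim has a non-NULL mm, since tasks without an address space would otherwise all compare equal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's mm_struct. */
struct mm {
    int id;
};

/* Hypothetical stand-in for the kernel's task_struct. */
struct task {
    int pid;
    struct mm *mm;  /* NULL for tasks without a user address space */
    int killed;     /* set where the kernel would deliver a signal */
};

/*
 * Mark every task that shares the victim's address space as killed,
 * mirroring the "q->mm == p->mm" test in the patch.
 * Returns the number of tasks marked.
 */
size_t kill_mm_sharers(struct task *tasks, size_t n, const struct task *victim)
{
    size_t count = 0;

    /* Assumption of this sketch: the victim owns an address space. */
    assert(victim != NULL && victim->mm != NULL);

    for (size_t i = 0; i < n; i++) {
        if (tasks[i].mm == victim->mm) {
            tasks[i].killed = 1;
            count++;
        }
    }
    return count;
}
```

With four tasks where two share one mm, one has its own and one has none, selecting either of the first two as the victim marks both of them and leaves the others alone.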
* Re: [PATCH] oom killer killing all threads
From: dean gaudet @ 2001-08-16 4:20 UTC
To: Eric Lammerts; +Cc: Rik van Riel, linux-kernel
On Mon, 13 Aug 2001, Eric Lammerts wrote:
>
> Hello,
> I recently had the following problem: Roxen (a webserver that uses
> threads) was running out of control, eating up more and more memory.
> (btw, I'd rather run Apache but it's not my decision to make).
>
> The oom killer kicked in and started killing roxen processes.
> Apparently it didn't succeed in killing _all_ threads. So it didn't
> help at all, and the machine had to be rebooted.
this would be a bug in roxen. for the corresponding code in apache, and
our attempt to be robust in the face of such activity, take a look at the
sleep(10) in make_child(). basically apache limits how much it will spawn
each second, and if it gets any fork error the parent will cease all
activity for 10 seconds. roxen needs code to limit its own damage, i
don't see why the kernel should do it...
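
The throttling described above (a cap on spawns per second plus a 10 second pause after any fork error) can be sketched as follows. This is not Apache's make_child(); struct spawn_limiter, may_spawn() and record_fork() are hypothetical names, and where the description above has the parent sleep, this sketch records a deadline instead so the logic can be exercised without forking or waiting.

```c
#include <assert.h>
#include <time.h>

#define BACKOFF_SECS 10  /* mirrors the sleep(10) mentioned above */

/* Hypothetical bookkeeping for a parent process that spawns children. */
struct spawn_limiter {
    int    max_per_sec;    /* cap on successful forks per second */
    time_t window;         /* the second currently being counted */
    int    spawned;        /* forks recorded in that second */
    time_t blocked_until;  /* no spawning before this time */
};

/* Start counting afresh when the clock has moved to a new second. */
static void roll_window(struct spawn_limiter *s, time_t now)
{
    if (now != s->window) {
        s->window = now;
        s->spawned = 0;
    }
}

/* Returns 1 if the parent may fork at time now, 0 otherwise. */
int may_spawn(struct spawn_limiter *s, time_t now)
{
    assert(s != NULL);
    if (now < s->blocked_until)
        return 0;               /* still backing off after a failure */
    roll_window(s, now);
    return s->spawned < s->max_per_sec;
}

/* Record the outcome of a fork() attempt made at time now. */
void record_fork(struct spawn_limiter *s, time_t now, int succeeded)
{
    assert(s != NULL);
    if (succeeded) {
        roll_window(s, now);
        s->spawned++;
    } else {
        s->blocked_until = now + BACKOFF_SECS;
    }
}
```

A parent would call may_spawn(&s, time(NULL)) before each fork() and pass the result of the fork to record_fork(); a single failure then suppresses spawning for the next 10 seconds.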
> Wouldn't it be a good idea to kill all processes that have the same
> ->mm as the process that was selected to be killed? The patch below
> implements it. I've tested it and it seems to work nicely.
i think it's a good idea but not for the reason you give... it's a good
idea because a multithreaded process using userland mutexes will have an
unpredictable number of locked mutexes in each thread -- and killing a
thread could result in hangs in the remaining threads as they wait for
mutexes which will never be freed. (threads are kind of evil ;)
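
The mutex hazard can be shown with a small sketch using POSIX threads. It does not kill a thread with a signal; as a stand-in, the worker simply ends while still holding a default (non-robust) mutex, which leaves the lock in the same orphaned state. The names die_holding_lock() and lock_is_orphaned() are hypothetical, and pthread_mutex_trylock() is used so the check reports the stuck lock instead of hanging.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Worker that takes the lock and then goes away without releasing it. */
static void *die_holding_lock(void *arg)
{
    int rc = pthread_mutex_lock(&lock);

    (void)arg;
    assert(rc == 0);
    (void)rc;
    return NULL;  /* the thread ends here; the mutex stays locked */
}

/*
 * Returns 1 if the lock is still held after the worker is gone,
 * 0 if it could be acquired, and -1 on any other failure.
 */
int lock_is_orphaned(void)
{
    pthread_t t;
    int rc;

    if (pthread_create(&t, NULL, die_holding_lock, NULL) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;

    /* A plain pthread_mutex_lock() here would block forever. */
    rc = pthread_mutex_trylock(&lock);
    if (rc == 0) {
        pthread_mutex_unlock(&lock);
        return 0;
    }
    return rc == EBUSY ? 1 : -1;
}
```

Any surviving thread that calls pthread_mutex_lock() on such a mutex waits indefinitely, which is the hang described above. Build with -pthread where the C library requires it.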
hey -- is there a way to know when a task was OOM killed as opposed to
other forms of death?
-dean
* Re: [PATCH] oom killer killing all threads
From: Eric Lammerts @ 2001-08-16 7:52 UTC
To: dean gaudet; +Cc: Rik van Riel, linux-kernel
On Wed, Aug 15, 2001 at 09:20:36PM -0700, dean gaudet wrote:
> On Mon, 13 Aug 2001, Eric Lammerts wrote:
> > I recently had the following problem: Roxen (a webserver that uses
> > threads) was running out of control, eating up more and more memory.
>
> this would be a bug in roxen. for the corresponding code in apache, and
That wouldn't surprise me.
> roxen needs code to limit its own damage, i
> don't see why the kernel should do it...
Of course roxen needs to be fixed, but in the meantime I'd like to prevent
it from bringing the box down.
> hey -- is there a way to know when a task was OOM killed as opposed to
> other forms of death?
dmesg(8) ;-).
Eric
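
As a footnote to the dmesg suggestion: the kernel log line comes from the printk() in the patch, "Out of Memory: Killed process %d (%s).", so a tool reading the log could pick out the pid and task name. A minimal sketch follows, assuming that exact message format; parse_oom_line() and COMM_MAX are hypothetical names, and the 16 byte limit reflects the size of the task name buffer as far as I know.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define COMM_MAX 16  /* assumed size of the task name buffer, including NUL */

/*
 * If line contains the OOM killer's message, store the pid and task
 * name and return 1. Return 0 for any other line.
 */
int parse_oom_line(const char *line, int *pid, char comm[COMM_MAX])
{
    static const char prefix[] = "Out of Memory: Killed process ";
    const char *p;

    assert(line != NULL && pid != NULL && comm != NULL);

    /* Search rather than anchor at the start, so a log-level or
     * timestamp prefix in front of the message does not matter. */
    p = strstr(line, prefix);
    if (p == NULL)
        return 0;

    /* Expect "<pid> (<name>)" right after the prefix. */
    return sscanf(p + sizeof(prefix) - 1, "%d (%15[^)])", pid, comm) == 2;
}
```

Feeding each line of dmesg output through parse_oom_line() yields the pids the OOM killer chose; a task name containing ")" would be cut short by this simple pattern.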