* [PATCH 2/2] mm: fix mm_struct reference counting bugs in mm/oom_kill.c
@ 2006-04-13 21:52 Dave Peterson
2006-04-13 23:24 ` Andrew Morton
0 siblings, 1 reply; 10+ messages in thread
From: Dave Peterson @ 2006-04-13 21:52 UTC (permalink / raw)
To: linux-mm; +Cc: linux-kernel, riel, akpm
The patch below fixes some mm_struct reference counting bugs in
badness().
Signed-Off-By: David S. Peterson <dsp@llnl.gov>
---
Index: linux-2.6.17-rc1-oom/mm/oom_kill.c
===================================================================
--- linux-2.6.17-rc1-oom.orig/mm/oom_kill.c 2006-04-13 14:25:16.000000000 -0700
+++ linux-2.6.17-rc1-oom/mm/oom_kill.c 2006-04-13 14:31:47.000000000 -0700
@@ -47,14 +47,17 @@ unsigned long badness(struct task_struct
{
unsigned long points, cpu_time, run_time, s;
struct list_head *tsk;
+ struct mm_struct *mm, *child_mm;
- if (!p->mm)
+ mm = get_task_mm(p);
+
+ if (mm == NULL)
return 0;
/*
* The memory size of the process is the basis for the badness.
*/
- points = p->mm->total_vm;
+ points = mm->total_vm;
/*
* Processes which fork a lot of child processes are likely
@@ -67,10 +70,21 @@ unsigned long badness(struct task_struct
list_for_each(tsk, &p->children) {
struct task_struct *chld;
chld = list_entry(tsk, struct task_struct, sibling);
- if (chld->mm != p->mm && chld->mm)
- points += chld->mm->total_vm/2 + 1;
+
+ if (chld->mm == mm)
+ continue;
+
+ child_mm = get_task_mm(chld);
+
+ if (child_mm == NULL)
+ continue;
+
+ points += child_mm->total_vm/2 + 1;
+ mmput(child_mm);
}
+ mmput(mm);
+
/*
* CPU time is in tens of seconds and run time is in thousands
* of seconds. There is no particular reason for this other than
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] mm: fix mm_struct reference counting bugs in mm/oom_kill.c
2006-04-13 21:52 [PATCH 2/2] mm: fix mm_struct reference counting bugs in mm/oom_kill.c Dave Peterson
@ 2006-04-13 23:24 ` Andrew Morton
2006-04-14 0:44 ` Dave Peterson
0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2006-04-13 23:24 UTC (permalink / raw)
To: Dave Peterson; +Cc: linux-mm, linux-kernel, riel
Dave Peterson <dsp@llnl.gov> wrote:
>
> The patch below fixes some mm_struct reference counting bugs in
> badness().
hm, OK, afaict the code _is_ racy.
But you're now calling mmput() inside read_lock(&tasklist_lock), and
mmput() can sleep in exit_aio() or in exit_mmap()->unmap_vmas(). So
sterner stuff will be needed.
I'll put a might_sleep() into mmput - it's a bit unexpected.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] mm: fix mm_struct reference counting bugs in mm/oom_kill.c
2006-04-13 23:24 ` Andrew Morton
@ 2006-04-14 0:44 ` Dave Peterson
2006-04-14 7:26 ` Andrew Morton
0 siblings, 1 reply; 10+ messages in thread
From: Dave Peterson @ 2006-04-14 0:44 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, linux-kernel, riel
On Thursday 13 April 2006 16:24, Andrew Morton wrote:
> Dave Peterson <dsp@llnl.gov> wrote:
> > The patch below fixes some mm_struct reference counting bugs in
> > badness().
>
> hm, OK, afaict the code _is_ racy.
>
> But you're now calling mmput() inside read_lock(&tasklist_lock), and
> mmput() can sleep in exit_aio() or in exit_mmap()->unmap_vmas(). So
> sterner stuff will be needed.
>
> I'll put a might_sleep() into mmput - it's a bit unexpected.
Hmm... fixing this looks rather tricky. If get_task_mm()/mmput() was
only being done on a single mm_struct then I suppose badness() could
do something a bit ugly like passing the reference back to its caller
and letting the caller do the mmput() once tasklist_lock is no longer
held. However here we are iterating over a bunch of child tasks,
potentially doing a get_task_mm()/mmput() for a number of them.
I have a suggestion for a possible solution. Currently mmput() is
implemented as follows:
01 void mmput(struct mm_struct *mm)
02 {
03 if (atomic_dec_and_lock(&mm->mm_users, &mmlist_lock)) {
04 list_del(&mm->mmlist);
05 mmlist_nr--;
06 spin_unlock(&mmlist_lock);
07 exit_aio(mm);
08 exit_mmap(mm);
09 put_swap_token(mm);
10 mmdrop(mm);
11 }
12 }
Suppose we replace lines 07-10 with a little piece of code that adds
the mm_struct to a list. Then a kernel thread empties the list
(perhaps via the work queue mechanism), doing the stuff in lines
07-10 for each mm_struct. This would eliminate the possibility of
mmput() sleeping, potentially making things easier for other callers
of mmput() and causing fewer surprises. Any comments?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] mm: fix mm_struct reference counting bugs in mm/oom_kill.c
2006-04-14 0:44 ` Dave Peterson
@ 2006-04-14 7:26 ` Andrew Morton
2006-04-14 19:14 ` Dave Peterson
0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2006-04-14 7:26 UTC (permalink / raw)
To: Dave Peterson; +Cc: linux-mm, linux-kernel, riel
Dave Peterson <dsp@llnl.gov> wrote:
>
> On Thursday 13 April 2006 16:24, Andrew Morton wrote:
> > Dave Peterson <dsp@llnl.gov> wrote:
> > > The patch below fixes some mm_struct reference counting bugs in
> > > badness().
> >
> > hm, OK, afaict the code _is_ racy.
> >
> > But you're now calling mmput() inside read_lock(&tasklist_lock), and
> > mmput() can sleep in exit_aio() or in exit_mmap()->unmap_vmas(). So
> > sterner stuff will be needed.
> >
> > I'll put a might_sleep() into mmput - it's a bit unexpected.
>
> Hmm... fixing this looks rather tricky. If get_task_mm()/mmput() was
> only being done on a single mm_struct then I suppose badness() could
> do something a bit ugly like passing the reference back to its caller
> and letting the caller do the mmput() once tasklist_lock is no longer
> held. However here we are iterating over a bunch of child tasks,
> potentially doing a get_task_mm()/mmput() for a number of them.
>
> I have a suggestion for a possible solution. Currently mmput() is
> implemented as follows:
>
> 01 void mmput(struct mm_struct *mm)
> 02 {
> 03 if (atomic_dec_and_lock(&mm->mm_users, &mmlist_lock)) {
> 04 list_del(&mm->mmlist);
> 05 mmlist_nr--;
> 06 spin_unlock(&mmlist_lock);
> 07 exit_aio(mm);
> 08 exit_mmap(mm);
> 09 put_swap_token(mm);
> 10 mmdrop(mm);
> 11 }
> 12 }
>
> Suppose we replace lines 07-10 with a little piece of code that adds
> the mm_struct to a list. Then a kernel thread empties the list
> (perhaps via the work queue mechanism), doing the stuff in lines
> 07-10 for each mm_struct. This would eliminate the possibility of
> mmput() sleeping, potentially making things easier for other callers
> of mmput() and causing fewer surprises. Any comments?
task_lock() can be used to pin a task's ->mm. To use task_lock() in
badness() we'd need to either
a) nest task_lock()s. I don't know if we're doing that anywhere else,
but the parent->child ordering is a natural one. or
b) take a ref on the parent's mm_struct, drop the parent's task_lock()
while we walk the children, then do mmput() on the parent's mm outside
tasklist_lock. This is probably better.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] mm: fix mm_struct reference counting bugs in mm/oom_kill.c
2006-04-14 7:26 ` Andrew Morton
@ 2006-04-14 19:14 ` Dave Peterson
2006-04-14 19:45 ` Andrew Morton
0 siblings, 1 reply; 10+ messages in thread
From: Dave Peterson @ 2006-04-14 19:14 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, linux-kernel, riel
On Friday 14 April 2006 00:26, Andrew Morton wrote:
> task_lock() can be used to pin a task's ->mm. To use task_lock() in
> badness() we'd need to either
>
> a) nest task_lock()s. I don't know if we're doing that anywhere else,
> but the parent->child ordering is a natural one. or
>
> b) take a ref on the parent's mm_struct, drop the parent's task_lock()
> while we walk the children, then do mmput() on the parent's mm outside
> tasklist_lock. This is probably better.
Looking a bit more closely at the code, I see that
select_bad_process() iterates over all tasks, repeatedly calling
badness(). This would complicate option 'b' since the iteration is
done while holding tasklist_lock. An alternative to option 'a' that
avoids nesting task_lock()s would be to define a couple of new
functions that might look something like this:
void mmput_atomic(struct mm_struct *mm)
{
if (atomic_dec_and_test(&mm->mm_users)) {
add mm to a global list of expired mm_structs
}
}
void mmput_atomic_cleanup(void)
{
empty the global list of expired mm_structs and do
cleanup stuff for each one
}
Then you could call mmput_atomic() an arbitrary # of times in places
where sleeping is not permitted, as long as mmput_atomic_cleanup() is
later called in a place where sleeping is permissible. In the case
of the OOM killer code, a call to mmput_atomic_cleanup() could be
added to out_of_memory() in a place where we no longer hold
tasklist_lock. Let me know if you have a preference for either of
these options, or if you have other suggestions.
Thanks,
Dave
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] mm: fix mm_struct reference counting bugs in mm/oom_kill.c
2006-04-14 19:14 ` Dave Peterson
@ 2006-04-14 19:45 ` Andrew Morton
2006-04-14 20:49 ` Dave Peterson
0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2006-04-14 19:45 UTC (permalink / raw)
To: Dave Peterson; +Cc: linux-mm, linux-kernel, riel
Dave Peterson <dsp@llnl.gov> wrote:
>
> On Friday 14 April 2006 00:26, Andrew Morton wrote:
> > task_lock() can be used to pin a task's ->mm. To use task_lock() in
> > badness() we'd need to either
> >
> > a) nest task_lock()s. I don't know if we're doing that anywhere else,
> > but the parent->child ordering is a natural one. or
> >
> > b) take a ref on the parent's mm_struct, drop the parent's task_lock()
> > while we walk the children, then do mmput() on the parent's mm outside
> > tasklist_lock. This is probably better.
>
> Looking a bit more closely at the code, I see that
> select_bad_process() iterates over all tasks, repeatedly calling
> badness(). This would complicate option 'b' since the iteration is
> done while holding tasklist_lock. An alternative to option 'a' that
> avoids nesting task_lock()s would be to define a couple of new
> functions that might look something like this:
>
> void mmput_atomic(struct mm_struct *mm)
> {
> if (atomic_dec_and_test(&mm->mm_users)) {
> add mm to a global list of expired mm_structs
> }
> }
>
> void mmput_atomic_cleanup(void)
> {
> empty the global list of expired mm_structs and do
> cleanup stuff for each one
> }
I think that's way too elaborate.
What's wrong with this?
--- 25/mm/oom_kill.c~a Fri Apr 14 12:37:51 2006
+++ 25-akpm/mm/oom_kill.c Fri Apr 14 12:44:49 2006
@@ -47,15 +47,25 @@ int sysctl_panic_on_oom;
unsigned long badness(struct task_struct *p, unsigned long uptime)
{
unsigned long points, cpu_time, run_time, s;
- struct list_head *tsk;
+ struct mm_struct *mm;
+ struct task_struct *child;
- if (!p->mm)
+ task_lock(p);
+ mm = p->mm;
+ if (!mm) {
+ task_unlock(p);
return 0;
+ }
/*
* The memory size of the process is the basis for the badness.
*/
- points = p->mm->total_vm;
+ points = mm->total_vm;
+
+ /*
+ * After this unlock we can no longer dereference local variable `mm'
+ */
+ task_unlock(p);
/*
* Processes which fork a lot of child processes are likely
@@ -65,11 +75,11 @@ unsigned long badness(struct task_struct
* child is eating the vast majority of memory, adding only half
* to the parents will make the child our kill candidate of choice.
*/
- list_for_each(tsk, &p->children) {
- struct task_struct *chld;
- chld = list_entry(tsk, struct task_struct, sibling);
- if (chld->mm != p->mm && chld->mm)
- points += chld->mm->total_vm/2 + 1;
+ list_for_each_entry(child, &p->children, sibling) {
+ task_lock(child);
+ if (child->mm != mm && child->mm)
+ points += child->mm->total_vm/2 + 1;
+ task_unlock(child);
}
/*
_
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] mm: fix mm_struct reference counting bugs in mm/oom_kill.c
2006-04-14 19:45 ` Andrew Morton
@ 2006-04-14 20:49 ` Dave Peterson
2006-04-14 21:31 ` Andrew Morton
0 siblings, 1 reply; 10+ messages in thread
From: Dave Peterson @ 2006-04-14 20:49 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, linux-kernel, riel
On Friday 14 April 2006 12:45, Andrew Morton wrote:
> Dave Peterson <dsp@llnl.gov> wrote:
> > On Friday 14 April 2006 00:26, Andrew Morton wrote:
> > > task_lock() can be used to pin a task's ->mm. To use task_lock() in
> > > badness() we'd need to either
> > >
> > > a) nest task_lock()s. I don't know if we're doing that anywhere else,
> > > but the parent->child ordering is a natural one. or
> > >
> > > b) take a ref on the parent's mm_struct, drop the parent's task_lock()
> > > while we walk the children, then do mmput() on the parent's mm
> > > outside tasklist_lock. This is probably better.
> >
> > Looking a bit more closely at the code, I see that
> > select_bad_process() iterates over all tasks, repeatedly calling
> > badness(). This would complicate option 'b' since the iteration is
> > done while holding tasklist_lock. An alternative to option 'a' that
> > avoids nesting task_lock()s would be to define a couple of new
> > functions that might look something like this:
> >
> > void mmput_atomic(struct mm_struct *mm)
> > {
> > if (atomic_dec_and_test(&mm->mm_users)) {
> > add mm to a global list of expired mm_structs
> > }
> > }
> >
> > void mmput_atomic_cleanup(void)
> > {
> > empty the global list of expired mm_structs and do
> > cleanup stuff for each one
> > }
>
> I think that's way too elaborate.
>
> What's wrong with this?
Yes of course... no need to nest task_lock()s after all. Looks good
to me.
Another thing I noticed: oom_kill_task() calls mmput() while holding
tasklist_lock. Here the calls to get_task_mm() and mmput() appear to
be unnecessary. We shouldn't need to use any kind of locking or
reference counting since oom_kill_task() doesn't dereference into the
mm_struct or require the value of p->mm to stay constant. I believe
the following (untested) code changes should fix the problem (and
simplify some other parts of the code). Does this look correct?
diff -urNp -X /home/dsp/dontdiff linux-2.6.17-rc1/mm/oom_kill.c linux-2.6.17-rc1-fix/mm/oom_kill.c
--- linux-2.6.17-rc1/mm/oom_kill.c 2006-03-19 21:53:29.000000000 -0800
+++ linux-2.6.17-rc1-fix/mm/oom_kill.c 2006-04-14 13:22:15.000000000 -0700
@@ -244,17 +244,15 @@ static void __oom_kill_task(task_t *p, c
force_sig(SIGKILL, p);
}
-static struct mm_struct *oom_kill_task(task_t *p, const char *message)
+static int oom_kill_task(task_t *p, const char *message)
{
- struct mm_struct *mm = get_task_mm(p);
+ struct mm_struct *mm;
task_t * g, * q;
- if (!mm)
- return NULL;
- if (mm == &init_mm) {
- mmput(mm);
- return NULL;
- }
+ mm = p->mm;
+
+ if ((mm == NULL) || (mm == &init_mm))
+ return 1;
__oom_kill_task(p, message);
/*
@@ -266,13 +264,12 @@ static struct mm_struct *oom_kill_task(t
__oom_kill_task(q, message);
while_each_thread(g, q);
- return mm;
+ return 0;
}
-static struct mm_struct *oom_kill_process(struct task_struct *p,
- unsigned long points, const char *message)
+static int oom_kill_process(struct task_struct *p, unsigned long points,
+ const char *message)
{
- struct mm_struct *mm;
struct task_struct *c;
struct list_head *tsk;
@@ -283,9 +280,8 @@ static struct mm_struct *oom_kill_proces
c = list_entry(tsk, struct task_struct, sibling);
if (c->mm == p->mm)
continue;
- mm = oom_kill_task(c, message);
- if (mm)
- return mm;
+ if (!oom_kill_task(c, message))
+ return 0;
}
return oom_kill_task(p, message);
}
@@ -300,7 +296,6 @@ static struct mm_struct *oom_kill_proces
*/
void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask, int order)
{
- struct mm_struct *mm = NULL;
task_t *p;
unsigned long points = 0;
@@ -320,12 +315,12 @@ void out_of_memory(struct zonelist *zone
*/
switch (constrained_alloc(zonelist, gfp_mask)) {
case CONSTRAINT_MEMORY_POLICY:
- mm = oom_kill_process(current, points,
+ oom_kill_process(current, points,
"No available memory (MPOL_BIND)");
break;
case CONSTRAINT_CPUSET:
- mm = oom_kill_process(current, points,
+ oom_kill_process(current, points,
"No available memory in cpuset");
break;
@@ -347,8 +342,7 @@ retry:
panic("Out of memory and no killable processes...\n");
}
- mm = oom_kill_process(p, points, "Out of memory");
- if (!mm)
+ if (oom_kill_process(p, points, "Out of memory"))
goto retry;
break;
@@ -357,8 +351,6 @@ retry:
out:
read_unlock(&tasklist_lock);
cpuset_unlock();
- if (mm)
- mmput(mm);
/*
* Give "p" a good chance of killing itself before we
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] mm: fix mm_struct reference counting bugs in mm/oom_kill.c
2006-04-14 20:49 ` Dave Peterson
@ 2006-04-14 21:31 ` Andrew Morton
2006-04-14 23:52 ` Dave Peterson
0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2006-04-14 21:31 UTC (permalink / raw)
To: Dave Peterson; +Cc: linux-mm, linux-kernel, riel
Dave Peterson <dsp@llnl.gov> wrote:
>
> Another thing I noticed: oom_kill_task() calls mmput() while holding
> tasklist_lock.
Yes, that'll make my new might_sleep() get upset.
oom_kill_task() doesn't _have_ to run mmput() there - we could propagate
the mm back to the top-level and do the mmput() there.
> Here the calls to get_task_mm() and mmput() appear to
> be unnecessary. We shouldn't need to use any kind of locking or
> reference counting since oom_kill_task() doesn't dereference into the
> mm_struct or require the value of p->mm to stay constant. I believe
> the following (untested) code changes should fix the problem (and
> simplify some other parts of the code). Does this look correct?
But yes, this looks better.
>
> diff -urNp -X /home/dsp/dontdiff linux-2.6.17-rc1/mm/oom_kill.c linux-2.6.17-rc1-fix/mm/oom_kill.c
> --- linux-2.6.17-rc1/mm/oom_kill.c 2006-03-19 21:53:29.000000000 -0800
> +++ linux-2.6.17-rc1-fix/mm/oom_kill.c 2006-04-14 13:22:15.000000000 -0700
> @@ -244,17 +244,15 @@ static void __oom_kill_task(task_t *p, c
> force_sig(SIGKILL, p);
> }
>
> -static struct mm_struct *oom_kill_task(task_t *p, const char *message)
> +static int oom_kill_task(task_t *p, const char *message)
> {
> - struct mm_struct *mm = get_task_mm(p);
> + struct mm_struct *mm;
> task_t * g, * q;
Please put a loud comment in here explaining that `mm' may not be dereferenced.
> - if (!mm)
> - return NULL;
> - if (mm == &init_mm) {
> - mmput(mm);
> - return NULL;
> - }
> + mm = p->mm;
> +
> + if ((mm == NULL) || (mm == &init_mm))
I think the parenthesisation here is going a bit far ;)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] mm: fix mm_struct reference counting bugs in mm/oom_kill.c
2006-04-14 21:31 ` Andrew Morton
@ 2006-04-14 23:52 ` Dave Peterson
2006-04-15 0:00 ` Dave Peterson
0 siblings, 1 reply; 10+ messages in thread
From: Dave Peterson @ 2006-04-14 23:52 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, linux-kernel, riel
The patch below fixes oom_kill_task() so it doesn't call mmput()
(which may sleep) while holding tasklist_lock.
Signed-Off-By: David S. Peterson <dsp@llnl.gov>
---
Below is a revised version of my previous OOM killer patch. I expect
to be away from my computer until Monday, 4/24. If you have any
problems merging this into your tree, let me know and I'll fix things
up when I get back.
Dave
diff -urNp -X /home/dsp/dontdiff linux-2.6.17-rc1/mm/oom_kill.c linux-2.6.17-rc1-fix/mm/oom_kill.c
--- linux-2.6.17-rc1/mm/oom_kill.c 2006-03-19 21:53:29.000000000 -0800
+++ linux-2.6.17-rc1-fix/mm/oom_kill.c 2006-04-14 13:22:15.000000000 -0700
@@ -244,17 +244,15 @@ static void __oom_kill_task(task_t *p, c
force_sig(SIGKILL, p);
}
-static struct mm_struct *oom_kill_task(task_t *p, const char *message)
+static int oom_kill_task(task_t *p, const char *message)
{
- struct mm_struct *mm = get_task_mm(p);
+ struct mm_struct *mm;
task_t * g, * q;
- if (!mm)
- return NULL;
- if (mm == &init_mm) {
- mmput(mm);
- return NULL;
- }
+ mm = p->mm;
+
+ if ((mm == NULL) || (mm == &init_mm))
+ return 1;
__oom_kill_task(p, message);
/*
@@ -266,13 +264,12 @@ static struct mm_struct *oom_kill_task(t
__oom_kill_task(q, message);
while_each_thread(g, q);
- return mm;
+ return 0;
}
-static struct mm_struct *oom_kill_process(struct task_struct *p,
- unsigned long points, const char *message)
+static int oom_kill_process(struct task_struct *p, unsigned long points,
+ const char *message)
{
- struct mm_struct *mm;
struct task_struct *c;
struct list_head *tsk;
@@ -283,9 +280,8 @@ static struct mm_struct *oom_kill_proces
c = list_entry(tsk, struct task_struct, sibling);
if (c->mm == p->mm)
continue;
- mm = oom_kill_task(c, message);
- if (mm)
- return mm;
+ if (!oom_kill_task(c, message))
+ return 0;
}
return oom_kill_task(p, message);
}
@@ -300,7 +296,6 @@ static struct mm_struct *oom_kill_proces
*/
void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask, int order)
{
- struct mm_struct *mm = NULL;
task_t *p;
unsigned long points = 0;
@@ -320,12 +315,12 @@ void out_of_memory(struct zonelist *zone
*/
switch (constrained_alloc(zonelist, gfp_mask)) {
case CONSTRAINT_MEMORY_POLICY:
- mm = oom_kill_process(current, points,
+ oom_kill_process(current, points,
"No available memory (MPOL_BIND)");
break;
case CONSTRAINT_CPUSET:
- mm = oom_kill_process(current, points,
+ oom_kill_process(current, points,
"No available memory in cpuset");
break;
@@ -347,8 +342,7 @@ retry:
panic("Out of memory and no killable processes...\n");
}
- mm = oom_kill_process(p, points, "Out of memory");
- if (!mm)
+ if (oom_kill_process(p, points, "Out of memory"))
goto retry;
break;
@@ -357,8 +351,6 @@ retry:
out:
read_unlock(&tasklist_lock);
cpuset_unlock();
- if (mm)
- mmput(mm);
/*
* Give "p" a good chance of killing itself before we
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] mm: fix mm_struct reference counting bugs in mm/oom_kill.c
2006-04-14 23:52 ` Dave Peterson
@ 2006-04-15 0:00 ` Dave Peterson
0 siblings, 0 replies; 10+ messages in thread
From: Dave Peterson @ 2006-04-15 0:00 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, linux-kernel, riel
The patch below fixes oom_kill_task() so it doesn't call mmput()
(which may sleep) while holding tasklist_lock.
Signed-Off-By: David S. Peterson <dsp@llnl.gov>
---
On Friday 14 April 2006 16:52, I wrote:
> Below is a revised version of my previous OOM killer patch. I expect
> to be away from my computer until Monday, 4/24. If you have any
> problems merging this into your tree, let me know and I'll fix things
> up when I get back.
Arrgghh... I sent the wrong patch by mistake. Please ignore the previous
one and merge the patch below instead.
Thanks,
Dave
Index: work/mm/oom_kill.c
===================================================================
--- work.orig/mm/oom_kill.c 2006-04-14 16:21:05.000000000 -0700
+++ work/mm/oom_kill.c 2006-04-14 16:34:31.000000000 -0700
@@ -254,17 +254,24 @@ static void __oom_kill_task(task_t *p, c
force_sig(SIGKILL, p);
}
-static struct mm_struct *oom_kill_task(task_t *p, const char *message)
+static int oom_kill_task(task_t *p, const char *message)
{
- struct mm_struct *mm = get_task_mm(p);
+ struct mm_struct *mm;
task_t * g, * q;
- if (!mm)
- return NULL;
- if (mm == &init_mm) {
- mmput(mm);
- return NULL;
- }
+ mm = p->mm;
+
+ /* WARNING: mm may not be dereferenced since we did not obtain its
+ * value from get_task_mm(p). This is OK since all we need to do is
+ * compare mm to q->mm below.
+ *
+ * Furthermore, even if mm contains a non-NULL value, p->mm may
+ * change to NULL at any time since we do not hold task_lock(p).
+ * However, this is of no concern to us.
+ */
+
+ if (mm == NULL || mm == &init_mm)
+ return 1;
__oom_kill_task(p, message);
/*
@@ -276,13 +283,12 @@ static struct mm_struct *oom_kill_task(t
__oom_kill_task(q, message);
while_each_thread(g, q);
- return mm;
+ return 0;
}
-static struct mm_struct *oom_kill_process(struct task_struct *p,
- unsigned long points, const char *message)
+static int oom_kill_process(struct task_struct *p, unsigned long points,
+ const char *message)
{
- struct mm_struct *mm;
struct task_struct *c;
struct list_head *tsk;
@@ -293,9 +299,8 @@ static struct mm_struct *oom_kill_proces
c = list_entry(tsk, struct task_struct, sibling);
if (c->mm == p->mm)
continue;
- mm = oom_kill_task(c, message);
- if (mm)
- return mm;
+ if (!oom_kill_task(c, message))
+ return 0;
}
return oom_kill_task(p, message);
}
@@ -310,7 +315,6 @@ static struct mm_struct *oom_kill_proces
*/
void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask, int order)
{
- struct mm_struct *mm = NULL;
task_t *p;
unsigned long points = 0;
@@ -330,12 +334,12 @@ void out_of_memory(struct zonelist *zone
*/
switch (constrained_alloc(zonelist, gfp_mask)) {
case CONSTRAINT_MEMORY_POLICY:
- mm = oom_kill_process(current, points,
+ oom_kill_process(current, points,
"No available memory (MPOL_BIND)");
break;
case CONSTRAINT_CPUSET:
- mm = oom_kill_process(current, points,
+ oom_kill_process(current, points,
"No available memory in cpuset");
break;
@@ -357,8 +361,7 @@ retry:
panic("Out of memory and no killable processes...\n");
}
- mm = oom_kill_process(p, points, "Out of memory");
- if (!mm)
+ if (oom_kill_process(p, points, "Out of memory"))
goto retry;
break;
@@ -367,8 +370,6 @@ retry:
out:
read_unlock(&tasklist_lock);
cpuset_unlock();
- if (mm)
- mmput(mm);
/*
* Give "p" a good chance of killing itself before we
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2006-04-15 0:00 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-04-13 21:52 [PATCH 2/2] mm: fix mm_struct reference counting bugs in mm/oom_kill.c Dave Peterson
2006-04-13 23:24 ` Andrew Morton
2006-04-14 0:44 ` Dave Peterson
2006-04-14 7:26 ` Andrew Morton
2006-04-14 19:14 ` Dave Peterson
2006-04-14 19:45 ` Andrew Morton
2006-04-14 20:49 ` Dave Peterson
2006-04-14 21:31 ` Andrew Morton
2006-04-14 23:52 ` Dave Peterson
2006-04-15 0:00 ` Dave Peterson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).