From: Dave Peterson <dsp@llnl.gov>
To: Andrew Morton <akpm@osdl.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, riel@surriel.com
Subject: Re: [PATCH 2/2] mm: fix mm_struct reference counting bugs in mm/oom_kill.c
Date: Fri, 14 Apr 2006 13:49:02 -0700
Message-ID: <200604141349.02047.dsp@llnl.gov>
In-Reply-To: <20060414124530.24a36d51.akpm@osdl.org>
On Friday 14 April 2006 12:45, Andrew Morton wrote:
> Dave Peterson <dsp@llnl.gov> wrote:
> > On Friday 14 April 2006 00:26, Andrew Morton wrote:
> > > task_lock() can be used to pin a task's ->mm. To use task_lock() in
> > > badness() we'd need to either
> > >
> > > a) nest task_lock()s. I don't know if we're doing that anywhere else,
> > > but the parent->child ordering is a natural one. or
> > >
> > > b) take a ref on the parent's mm_struct, drop the parent's task_lock()
> > > while we walk the children, then do mmput() on the parent's mm
> > > outside tasklist_lock. This is probably better.
> >
> > Looking a bit more closely at the code, I see that
> > select_bad_process() iterates over all tasks, repeatedly calling
> > badness(). This would complicate option 'b' since the iteration is
> > done while holding tasklist_lock. An alternative to option 'a' that
> > avoids nesting task_lock()s would be to define a couple of new
> > functions that might look something like this:
> >
> > void mmput_atomic(struct mm_struct *mm)
> > {
> > if (atomic_dec_and_test(&mm->mm_users)) {
> > add mm to a global list of expired mm_structs
> > }
> > }
> >
> > void mmput_atomic_cleanup(void)
> > {
> > empty the global list of expired mm_structs and do
> > cleanup stuff for each one
> > }
>
> I think that's way too elaborate.
>
> What's wrong with this?

Yes, of course: there is no need to nest task_lock()s after all. Looks
good to me.

Another thing I noticed: oom_kill_task() calls mmput() while holding
tasklist_lock, which is a problem because mmput() may sleep. Here the
calls to get_task_mm() and mmput() appear to be unnecessary. We
shouldn't need any kind of locking or reference counting, since
oom_kill_task() never dereferences the mm_struct and doesn't require
the value of p->mm to stay constant. I believe the following (untested)
code changes should fix the problem (and simplify some other parts of
the code). Does this look correct?

diff -urNp -X /home/dsp/dontdiff linux-2.6.17-rc1/mm/oom_kill.c linux-2.6.17-rc1-fix/mm/oom_kill.c
--- linux-2.6.17-rc1/mm/oom_kill.c 2006-03-19 21:53:29.000000000 -0800
+++ linux-2.6.17-rc1-fix/mm/oom_kill.c 2006-04-14 13:22:15.000000000 -0700
@@ -244,17 +244,15 @@ static void __oom_kill_task(task_t *p, c
 	force_sig(SIGKILL, p);
 }
 
-static struct mm_struct *oom_kill_task(task_t *p, const char *message)
+static int oom_kill_task(task_t *p, const char *message)
 {
-	struct mm_struct *mm = get_task_mm(p);
+	struct mm_struct *mm;
 	task_t * g, * q;
 
-	if (!mm)
-		return NULL;
-	if (mm == &init_mm) {
-		mmput(mm);
-		return NULL;
-	}
+	mm = p->mm;
+
+	if ((mm == NULL) || (mm == &init_mm))
+		return 1;
 
 	__oom_kill_task(p, message);
 	/*
@@ -266,13 +264,12 @@ static struct mm_struct *oom_kill_task(t
 			__oom_kill_task(q, message);
 	while_each_thread(g, q);
 
-	return mm;
+	return 0;
 }
 
-static struct mm_struct *oom_kill_process(struct task_struct *p,
-				unsigned long points, const char *message)
+static int oom_kill_process(struct task_struct *p, unsigned long points,
+		const char *message)
 {
-	struct mm_struct *mm;
 	struct task_struct *c;
 	struct list_head *tsk;
 
@@ -283,9 +280,8 @@ static struct mm_struct *oom_kill_proces
 		c = list_entry(tsk, struct task_struct, sibling);
 		if (c->mm == p->mm)
 			continue;
-		mm = oom_kill_task(c, message);
-		if (mm)
-			return mm;
+		if (!oom_kill_task(c, message))
+			return 0;
 	}
 	return oom_kill_task(p, message);
 }
@@ -300,7 +296,6 @@ static struct mm_struct *oom_kill_proces
  */
 void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask, int order)
 {
-	struct mm_struct *mm = NULL;
 	task_t *p;
 	unsigned long points = 0;
 
@@ -320,12 +315,12 @@ void out_of_memory(struct zonelist *zone
 	 */
 	switch (constrained_alloc(zonelist, gfp_mask)) {
 	case CONSTRAINT_MEMORY_POLICY:
-		mm = oom_kill_process(current, points,
+		oom_kill_process(current, points,
 				"No available memory (MPOL_BIND)");
 		break;
 
 	case CONSTRAINT_CPUSET:
-		mm = oom_kill_process(current, points,
+		oom_kill_process(current, points,
 				"No available memory in cpuset");
 		break;
 
@@ -347,8 +342,7 @@ retry:
 			panic("Out of memory and no killable processes...\n");
 		}
 
-		mm = oom_kill_process(p, points, "Out of memory");
-		if (!mm)
+		if (oom_kill_process(p, points, "Out of memory"))
 			goto retry;
 
 		break;
@@ -357,8 +351,6 @@ retry:
 out:
 	read_unlock(&tasklist_lock);
 	cpuset_unlock();
-	if (mm)
-		mmput(mm);
 
 	/*
 	 * Give "p" a good chance of killing itself before we
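
For readers following the return-value change, here is a rough userspace
sketch of the convention the patch moves to: 0 means a task was killed,
nonzero means the candidate was not killable and the caller should pick
another. It is not kernel code. The struct layouts, init_mm_stub, the
"killed" flag, and the names kill_task() and kill_process() are all
hypothetical stand-ins, and the loop that kills other threads sharing
the same mm is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct mm { int id; };

struct task {
	struct mm *mm;          /* NULL models a task with no user address space */
	int killed;             /* set where the kernel would send SIGKILL */
	struct task *children;  /* first child, or NULL */
	struct task *sibling;   /* next child of the same parent */
};

static struct mm init_mm_stub;  /* plays the role of init_mm (assumption) */

/* Returns 0 if the task was killed, 1 if it is not killable. */
static int kill_task(struct task *p)
{
	struct mm *mm = p->mm;  /* only compared, never dereferenced */

	if (mm == NULL || mm == &init_mm_stub)
		return 1;
	p->killed = 1;
	return 0;
}

/* Try children that have a different mm first, then the parent itself. */
static int kill_process(struct task *p)
{
	struct task *c;

	for (c = p->children; c != NULL; c = c->sibling) {
		if (c->mm == p->mm)
			continue;
		if (!kill_task(c))
			return 0;
	}
	return kill_task(p);
}
```

Because the pointer is only compared against NULL and the init_mm
stand-in, nothing here needs a reference on the mm, and a caller can
simply retry with another candidate whenever kill_process() returns
nonzero.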