From: Andrew Morton <akpm@osdl.org>
To: Dave Peterson <dsp@llnl.gov>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, riel@surriel.com
Subject: Re: [PATCH 2/2] mm: fix mm_struct reference counting bugs in mm/oom_kill.c
Date: Fri, 14 Apr 2006 00:26:54 -0700
Message-ID: <20060414002654.76d1a6bc.akpm@osdl.org>
In-Reply-To: <200604131744.02114.dsp@llnl.gov>

Dave Peterson <dsp@llnl.gov> wrote:
>
> On Thursday 13 April 2006 16:24, Andrew Morton wrote:
> > Dave Peterson <dsp@llnl.gov> wrote:
> > > The patch below fixes some mm_struct reference counting bugs in
> > > badness().
> >
> > hm, OK, afaict the code _is_ racy.
> >
> > But you're now calling mmput() inside read_lock(&tasklist_lock), and
> > mmput() can sleep in exit_aio() or in exit_mmap()->unmap_vmas(). So
> > sterner stuff will be needed.
> >
> > I'll put a might_sleep() into mmput - it's a bit unexpected.
>
> Hmm... fixing this looks rather tricky. If get_task_mm()/mmput() was
> only being done on a single mm_struct then I suppose badness() could
> do something a bit ugly like passing the reference back to its caller
> and letting the caller do the mmput() once tasklist_lock is no longer
> held. However here we are iterating over a bunch of child tasks,
> potentially doing a get_task_mm()/mmput() for a number of them.
>
> I have a suggestion for a possible solution. Currently mmput() is
> implemented as follows:
>
> 01  void mmput(struct mm_struct *mm)
> 02  {
> 03          if (atomic_dec_and_lock(&mm->mm_users, &mmlist_lock)) {
> 04                  list_del(&mm->mmlist);
> 05                  mmlist_nr--;
> 06                  spin_unlock(&mmlist_lock);
> 07                  exit_aio(mm);
> 08                  exit_mmap(mm);
> 09                  put_swap_token(mm);
> 10                  mmdrop(mm);
> 11          }
> 12  }
>
> Suppose we replace lines 07-10 with a little piece of code that adds
> the mm_struct to a list. Then a kernel thread empties the list
> (perhaps via the work queue mechanism), doing the stuff in lines
> 07-10 for each mm_struct. This would eliminate the possibility of
> mmput() sleeping, potentially making things easier for other callers
> of mmput() and causing fewer surprises. Any comments?
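
The deferred-teardown idea quoted above can be modelled roughly as follows. This is a userspace sketch, not kernel code: struct mm_model, mmput_deferred(), drain_dead_list() and dead_list are hypothetical names invented for illustration, a plain int stands in for the atomic mm_users, and the list has no locking. In a real implementation the list would need a spinlock and the drain would run from a context that is allowed to sleep.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct mm_model {
	int mm_users;               /* stands in for the atomic mm_users */
	int torn_down;              /* set once the "sleeping" teardown ran */
	struct mm_model *next_dead; /* link on the deferred-teardown list */
};

/* Assumption: real code would protect this list with a spinlock. */
static struct mm_model *dead_list;

/*
 * Non-sleeping put: when the last user goes away, only queue the mm.
 * This replaces the work done at lines 07-10 of the quoted mmput().
 */
static void mmput_deferred(struct mm_model *mm)
{
	assert(mm->mm_users > 0);
	if (--mm->mm_users == 0) {
		mm->next_dead = dead_list;
		dead_list = mm;
	}
}

/*
 * Worker body: assumed to run where sleeping is allowed. Setting
 * torn_down stands in for exit_aio(), exit_mmap(), put_swap_token()
 * and mmdrop(). Returns the number of mms processed.
 */
static int drain_dead_list(void)
{
	int n = 0;

	while (dead_list) {
		struct mm_model *mm = dead_list;

		dead_list = mm->next_dead;
		mm->next_dead = NULL;
		mm->torn_down = 1;
		n++;
	}
	return n;
}
```

The point of the split is that the final mmput_deferred() only touches a counter and a list, so it could be called with tasklist_lock held; the cost is that the address space is released some time later rather than at the moment of the final put.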

task_lock() can be used to pin a task's ->mm.  To use task_lock() in
badness() we'd need to either

a) nest task_lock()s.  I don't know if we're doing that anywhere else,
   but the parent->child ordering is a natural one.  or

b) take a ref on the parent's mm_struct, drop the parent's task_lock()
   while we walk the children, then do mmput() on the parent's mm outside
   tasklist_lock.  This is probably better.
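
A rough userspace model of option (b), to make the lock and reference ordering concrete. Everything here is an assumption made for illustration: struct mm_model, struct task_model, badness_model(), score_task() and mmput_model() are hypothetical names, the locks are plain flags, and the scoring is a placeholder rather than the real badness() heuristic. The assert in mmput_model() plays the role of might_sleep().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct mm_model { int mm_users; };

struct task_model {
	struct mm_model *mm;
	int task_locked;               /* models task_lock()/task_unlock() */
	struct task_model **children;
	size_t nr_children;
};

static int tasklist_locked;            /* models read_lock(&tasklist_lock) */

/* May "sleep", so it must never run with tasklist_lock held. */
static void mmput_model(struct mm_model *mm)
{
	assert(!tasklist_locked);
	assert(mm->mm_users > 0);
	mm->mm_users--;
}

/* Scores p and hands the pinned parent mm back to the caller. */
static unsigned long badness_model(struct task_model *p,
				   struct mm_model **pinned)
{
	unsigned long points = 1;      /* placeholder score */
	size_t i;

	*pinned = NULL;
	p->task_locked = 1;            /* task_lock(parent) */
	if (!p->mm) {
		p->task_locked = 0;
		return 0;
	}
	p->mm->mm_users++;             /* take a ref on the parent's mm */
	*pinned = p->mm;
	p->task_locked = 0;            /* drop it before walking children */

	for (i = 0; i < p->nr_children; i++) {
		struct task_model *c = p->children[i];

		c->task_locked = 1;    /* pins c->mm; no nested task_lock */
		if (c->mm && c->mm != *pinned)
			points++;      /* placeholder: child owns its own mm */
		c->task_locked = 0;
	}
	return points;
}

static unsigned long score_task(struct task_model *p)
{
	struct mm_model *pinned;
	unsigned long points;

	tasklist_locked = 1;
	points = badness_model(p, &pinned);
	tasklist_locked = 0;
	if (pinned)
		mmput_model(pinned);   /* outside tasklist_lock */
	return points;
}
```

Only one reference is ever carried out of the locked region, the parent's, so the caller has a single mmput() to perform regardless of how many children were walked.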