From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Colin Cross <ccross@google.com>
Cc: David Rientjes <rientjes@google.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
werner <w.landgraf@ru.ru>, Rik van Riel <riel@redhat.com>,
Hugh Dickins <hughd@google.com>,
linux-kernel@vger.kernel.org, Oleg Nesterov <oleg@redhat.com>,
Rabin Vincent <rabin.vincent@stericsson.com>,
Christian Bejram <christian.bejram@stericsson.com>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Anton Vorontsov <anton.vorontsov@linaro.org>,
stable@vger.kernel.org
Subject: Re: v3.4-rc2 out-of-memory problems (was Re: 3.4-rc1 sticks-and-crashs)
Date: Mon, 9 Apr 2012 15:21:20 -0700 [thread overview]
Message-ID: <20120409222120.GA14149@kroah.com> (raw)
In-Reply-To: <CAMbhsRRUNVn9_eL_obZxcZv7LqDW_8SeRApdKzDePbaW_Y4uvg@mail.gmail.com>
On Mon, Apr 09, 2012 at 03:13:00PM -0700, Colin Cross wrote:
> On Mon, Apr 9, 2012 at 2:22 PM, David Rientjes <rientjes@google.com> wrote:
> > On Mon, 9 Apr 2012, Linus Torvalds wrote:
> >
> >> The real bug is actually that those notifiers are a f*cking joke, and
> >> the return value from the notifier is a mistake.
> >>
> >> So I personally think that the real problem is this code in
> >> profile_handoff_task:
> >>
> >> return (ret == NOTIFY_OK) ? 1 : 0;
> >>
> >> and ask yourself two questions:
> >>
> >> - what the hell does NOTIFY_OK/NOTIFY_DONE mean?
> >> - what happens if there are multiple notifiers that all (or some)
> >> return NOTIFY_OK?
> >>
> > NOTIFY_OK should never be a valid response for this notifier the way it's
> > currently implemented, it should be NOTIFY_STOP to stop iterating the call
> > chain to avoid a double free. Right now it doesn't matter because only
> > oprofile is actually freeing the task_struct and lowmemorykiller should be
> > using NOTIFY_DONE.
> >
> > Then we have a completeness issue if multiple callbacks want to return
> > NOTIFY_STOP and an ordering issue if the oprofile callback is invoked
> > before lowmemorykiller.
> >
> >> I'll tell you what my answers are:
> >>
> >> (a) NOTIFY_DONE is the "ok, everything is fine, you can free the
> >> task-struct". It's also what that handoff notifier thing returns if
> >> there are no notifiers registered at all.
> >>
> >> So the fix to the Android lowmemorykiller is as simple as just
> >> changing NOTIFY_OK to NOTIFY_DONE, which will mean that the caller
> >> will properly free the task struct.
> >>
> >
> > I don't think so for Werner's config who also has CONFIG_OPROFILE=y, so
> > oprofile would return NOTIFY_OK and queue the task_struct for free, then
> > the second notifier callback to the lowmemorykiller would return
> > NOTIFY_DONE which would result in put_task_struct() doing free_task()
> > itself for a double free.
> >
> >> The NOTIFY_OK/NOTIFY_DONE difference really does seem to be just
> >> "NOTIFY_OK means that I will free the task myself later". That's what
> >> the oprofile uses, and it frees the task.
> >>
> >> (b) But the whole interface is a total f*cking mess. If *multiple*
> >> people return NOTIFY_OK, they're royally fucked. And the whole (and
> >> only) point of notifiers is that you can register multiple different
> >> ones independently.
> >>
> >> So quite frankly, the *real* bug is not in that android driver
> >> (although I'd say that we should just make it return NOTIFY_DONE and
> >> be done with it). The real bug is that the whole f*cking notifier is a
> >> mistake, and checking the error return was the biggest mistake of all.
> >>
> >
> > Right, we can't handoff the freeing of the task_struct to more than one
> > notifier. It seems misdesigned from the beginning and what we really want
> > is to hijack task->usage for __put_task_struct(task) if we have such a
> > notifier callchain and require each one (currently just oprofile) to take
> > a reference on task->usage for NOTIFY_OK and then be responsible for
> > dropping the reference when it's done with it later instead of requiring
> > it to free the task_struct itself.
> >
> > That's _if_ we want to continue to have such an interface in the first
> > place where it's only really necessary right now for oprofile (and, hence,
> > wasn't implemented in an extendable way). I'm thinking the
> > lowmemorykiller, as I eluded to, could be written in a way where we can
> > detect if a thread we've already killed has exited yet before killing
> > another one. We can't just store a pointer to the task_struct of the
> > killed task since it could be reused for a fork later, but we could use
> > TIF_MEMDIE like the oom killer does.
>
> This was a known issue in 2010, in the android tree the use of
> task_handoff_register was dropped one day after it was added and
> replaced with a new task_free_register hook. I assume Greg dropped
> the fix during the android tree refresh in 3.0 because it depended on
> a change to kernel/fork.c. The two relevant patches are (using
> codeaurora's gitweb becase we don't have one right now):
>
> sched: Add a generic notifier when a task struct is about to be freed
> https://www.codeaurora.org/gitweb/quic/la/?p=kernel/common.git;a=commitdiff;h=667dffa787a87ef4ea43cc65957ce96077fdcd0a
Yes, I can't add a patch like that for this driver, that is why I
thought everyone was getting together to "properly" determine how to
solve this oom notifier problem. Has that work stalled somwhere?
greg k-h
next prev parent reply other threads:[~2012-04-09 22:21 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-04-09 2:42 v3.4-rc2 out-of-memory problems (was Re: 3.4-rc1 sticks-and-crashs) Linus Torvalds
2012-04-09 2:50 ` Andrew Morton
2012-04-09 3:11 ` Linus Torvalds
2012-04-09 7:04 ` Sven Joachim
2012-04-09 15:24 ` Linus Torvalds
2012-04-09 15:43 ` Sven Joachim
2012-04-09 15:57 ` Rik van Riel
2012-04-09 16:19 ` Sven Joachim
2012-04-09 16:33 ` Rik van Riel
2012-04-09 17:00 ` Pekka Enberg
2012-04-09 17:19 ` Sven Joachim
2012-04-09 17:00 ` Sven Joachim
2012-04-09 17:20 ` Rik van Riel
2012-04-09 10:15 ` David Rientjes
2012-04-09 15:39 ` Linus Torvalds
2012-04-09 21:22 ` David Rientjes
2012-04-09 22:09 ` Linus Torvalds
2012-04-09 23:25 ` David Rientjes
2012-04-09 23:55 ` Linus Torvalds
2012-04-10 0:04 ` David Rientjes
2012-04-14 20:50 ` Srivatsa S. Bhat
2012-04-09 23:56 ` [patch] android, lowmemorykiller: remove task handoff notifier David Rientjes
2012-04-10 1:23 ` Colin Cross
[not found] ` <web-723076709@zbackend1.aha.ru>
[not found] ` <alpine.DEB.2.00.1204091637280.21813@chino.kir.corp.google.com>
[not found] ` <web-723082731@zbackend1.aha.ru>
[not found] ` <alpine.DEB.2.00.1204091707580.21813@chino.kir.corp.google.com>
2012-04-10 7:09 ` v3.4-rc2 out-of-memory problems (was Re: 3.4-rc1 sticks-and-crashs) werner
2012-04-10 7:10 ` werner
2012-04-09 22:13 ` Colin Cross
2012-04-09 22:21 ` Greg Kroah-Hartman [this message]
2012-04-09 22:44 ` john stultz
2012-04-09 22:30 ` Linus Torvalds
2012-04-09 23:37 ` David Rientjes
2012-04-10 0:23 ` Colin Cross
2012-04-10 0:32 ` David Rientjes
2012-04-10 1:21 ` Colin Cross
2012-04-10 1:33 ` David Rientjes
2012-04-10 1:37 ` Colin Cross
-- strict thread matches above, loose matches on Subject: below --
2012-04-09 6:52 werner
2012-04-09 7:01 werner
2012-04-10 1:52 werner
2012-04-10 1:51 ` Rik van Riel
2012-04-10 2:13 ` werner
2012-04-10 12:53 werner
2012-04-14 19:38 werner
2012-04-14 19:58 ` Rik van Riel
2012-04-14 21:03 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120409222120.GA14149@kroah.com \
--to=gregkh@linuxfoundation.org \
--cc=akpm@linux-foundation.org \
--cc=anton.vorontsov@linaro.org \
--cc=ccross@google.com \
--cc=christian.bejram@stericsson.com \
--cc=hughd@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=oleg@redhat.com \
--cc=paulmck@linux.vnet.ibm.com \
--cc=rabin.vincent@stericsson.com \
--cc=riel@redhat.com \
--cc=rientjes@google.com \
--cc=stable@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=w.landgraf@ru.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox