From: Oleg Nesterov <oleg@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: apw@canonical.com, arjan@linux.intel.com, fhrbata@redhat.com,
	john.johansen@canonical.com, rientjes@google.com,
	rusty@rustcorp.com.au, tj@kernel.org,
	linux-kernel@vger.kernel.org,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Mandeep Singh Baines <msb@chromium.org>
Subject: [PATCH 1/1] hung_task: fix the broken rcu_lock_break() logic
Date: Fri, 17 Feb 2012 19:00:40 +0100
Message-ID: <20120217180040.GB8762@redhat.com>
In-Reply-To: <20120217180011.GA8762@redhat.com>

check_hung_uninterruptible_tasks()->rcu_lock_break() introduced by
"softlockup: check all tasks in hung_task" commit ce9dbe24 looks
absolutely wrong.

	- rcu_lock_break() does put_task_struct(). If the task has exited
	  it is not safe to even read its ->state, nothing protects this
	  task_struct.

	- The TASK_DEAD checks are wrong too. Contrary to the comment, we
	  can't use it to check if the task was unhashed. It can be unhashed
	  without TASK_DEAD, or it can be valid with TASK_DEAD.

	  For example, an autoreaping task can do release_task(current)
	  long before it sets TASK_DEAD in do_exit().

	  Or, a zombie task can have ->state == TASK_DEAD but release_task()
	  was not called, and in this case we must not break the loop.

Change this code to check pid_alive() instead, and do this before
we drop the reference to the task_struct.

Note: while_each_thread() under rcu_read_lock() is not really safe,
it can livelock. This will be fixed later, but fortunately in this
case the "max_count" logic saves us anyway.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Mandeep Singh Baines <msb@google.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
kernel/hung_task.c | 11 +++++++----
1 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index 2e48ec0..c21449f 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -119,15 +119,20 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
  * For preemptible RCU it is sufficient to call rcu_read_unlock in order
  * to exit the grace period. For classic RCU, a reschedule is required.
  */
-static void rcu_lock_break(struct task_struct *g, struct task_struct *t)
+static bool rcu_lock_break(struct task_struct *g, struct task_struct *t)
 {
+	bool can_cont;
+
 	get_task_struct(g);
 	get_task_struct(t);
 	rcu_read_unlock();
 	cond_resched();
 	rcu_read_lock();
+	can_cont = pid_alive(g) && pid_alive(t);
 	put_task_struct(t);
 	put_task_struct(g);
+
+	return can_cont;
 }
 
 /*
@@ -154,9 +159,7 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
 			goto unlock;
 		if (!--batch_count) {
 			batch_count = HUNG_TASK_BATCHING;
-			rcu_lock_break(g, t);
-			/* Exit if t or g was unhashed during refresh. */
-			if (t->state == TASK_DEAD || g->state == TASK_DEAD)
+			if (!rcu_lock_break(g, t))
 				goto unlock;
 		}
 		/* use "==" to skip the TASK_KILLABLE tasks waiting on NFS */
--
1.5.5.1