From: Oleg Nesterov <oleg@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Jones <davej@redhat.com>,
Linux Kernel <linux-kernel@vger.kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
rostedt <rostedt@goodmis.org>, dhowells <dhowells@redhat.com>,
Al Viro <viro@zeniv.linux.org.uk>
Subject: task_work_add() should not succeed unconditionally (Was: lockdep trace from posix timers)
Date: Fri, 17 Aug 2012 18:40:41 +0200
Message-ID: <20120817164041.GA12017@redhat.com>
In-Reply-To: <20120817151747.GA8248@redhat.com>
On 08/17, Oleg Nesterov wrote:
>
> On 08/17, Oleg Nesterov wrote:
> >
> > On 08/16, Peter Zijlstra wrote:
> > >
> > > write_lock_irq(&tasklist_lock)
> > > task_lock(parent) parent->alloc_lock
> >
> > And this is already wrong. See the comment above task_lock().
> >
> > > And since it_lock is IRQ-safe and alloc_lock isn't, you've got the IRQ
> > > inversion deadlock reported.
> >
> > Yes. Or, IOW, write_lock(tasklist) is IRQ-safe and thus it can't nest
> > with alloc_lock.
> >
> > > David, Al, anybody want to have a go at fixing this?
> >
> > I still think that task_work_add() should synchronize with exit_task_work()
> > itself and fail if necessary. But I wasn't able to convince Al ;)
>
> And this is my old patch: http://marc.info/?l=linux-kernel&m=134082268721700
> It should be re-diffed of course.

Something like below. Uncompiled/untested, I need to re-check and test.

Now we can remove that task_lock() and rely on task_work_add().

Al, what do you think?

Oleg.
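
To illustrate what "rely on task_work_add()" could look like from a caller's side, here is a minimal userspace sketch. It is not kernel code: demo_task, demo_work, demo_task_work_add() and demo_queue_for_parent() are hypothetical names, and the stub only mimics the proposed contract (0 on success, -ESRCH once the target has passed exit_task_work()).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names are kernel API. */
struct demo_work {
	int payload;
};

struct demo_task {
	bool past_exit_work;		/* target already ran exit_task_work() */
	struct demo_work *queued;	/* last work accepted by the stub */
};

/*
 * Stub with the proposed contract: 0 on success, -ESRCH if the target
 * can no longer run the work.
 */
static int demo_task_work_add(struct demo_task *t, struct demo_work *w)
{
	if (t->past_exit_work)
		return -ESRCH;
	t->queued = w;
	return 0;
}

/*
 * Caller side: no extra lock and no separate "is it exiting?" check.
 * The return value alone says whether the work was queued.
 */
int demo_queue_for_parent(struct demo_task *parent, int payload)
{
	struct demo_work *w = malloc(sizeof(*w));
	int err;

	if (!w)
		return -ENOMEM;
	w->payload = payload;

	err = demo_task_work_add(parent, w);
	if (err)
		free(w);	/* never queued, so the caller still owns it */
	return err;
}
```

The point of the sketch is only the ownership rule: on failure nothing was queued, so the caller cleans up and propagates the error.
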
--- x/include/linux/task_work.h
+++ x/include/linux/task_work.h
@@ -18,8 +18,7 @@ void task_work_run(void);
 
 static inline void exit_task_work(struct task_struct *task)
 {
-	if (unlikely(task->task_works))
-		task_work_run();
+	task_work_run();
 }
 
 #endif /* _LINUX_TASK_WORK_H */
--- x/kernel/task_work.c
+++ x/kernel/task_work.c
@@ -2,29 +2,35 @@
 #include <linux/task_work.h>
 #include <linux/tracehook.h>
 
+#define TWORK_EXITED ((struct callback_head *)1)
+
 int
 task_work_add(struct task_struct *task, struct callback_head *twork, bool notify)
 {
 	struct callback_head *last, *first;
 	unsigned long flags;
+	int err = -ESRCH;
 
 	/*
-	 * Not inserting the new work if the task has already passed
-	 * exit_task_work() is the responisbility of callers.
+	 * We must not insert the new work if the exiting task has already
+	 * passed task_work_run().
 	 */
 	raw_spin_lock_irqsave(&task->pi_lock, flags);
-	last = task->task_works;
-	first = last ? last->next : twork;
-	twork->next = first;
-	if (last)
-		last->next = twork;
-	task->task_works = twork;
+	if (likely(task->task_works != TWORK_EXITED)) {
+		last = task->task_works;
+		first = last ? last->next : twork;
+		twork->next = first;
+		if (last)
+			last->next = twork;
+		task->task_works = twork;
+		err = 0;
+	}
 	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
 
 	/* test_and_set_bit() implies mb(), see tracehook_notify_resume(). */
-	if (notify)
+	if (!err && notify)
 		set_notify_resume(task);
-	return 0;
+	return err;
 }
 
 struct callback_head *
@@ -35,7 +41,7 @@ task_work_cancel(struct task_struct *tas
 
 	raw_spin_lock_irqsave(&task->pi_lock, flags);
 	last = task->task_works;
-	if (last) {
+	if (last && last != TWORK_EXITED) {
 		struct callback_head *q = last, *p = q->next;
 		while (1) {
 			if (p->func == func) {
@@ -63,7 +69,12 @@ void task_work_run(void)
 	while (1) {
 		raw_spin_lock_irq(&task->pi_lock);
 		p = task->task_works;
-		task->task_works = NULL;
+		/*
+		 * twork->func() can do task_work_add(), do not
+		 * set TWORK_EXITED until the list becomes empty.
+		 */
+		task->task_works = (!p && (task->flags & PF_EXITING))
+					? TWORK_EXITED : NULL;
 		raw_spin_unlock_irq(&task->pi_lock);
 
 		if (unlikely(!p))
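
The sentinel idea in the patch can be tried outside the kernel. The following is a rough userspace model, assuming a pthread mutex in place of pi_lock and a plain bool in place of PF_EXITING. struct model_task, model_work_add(), model_work_run(), count_cb() and run_count are names made up for this sketch. As in the patch, the head points at the tail of a circular list and is only switched to TWORK_EXITED once a pass finds the list empty while the owner is exiting.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct callback_head {
	struct callback_head *next;
	void (*func)(struct callback_head *);
};

/* Sentinel stored in the list head once the owner is done running works. */
#define TWORK_EXITED ((struct callback_head *)1)

/* Hypothetical stand-in for the relevant task_struct fields. */
struct model_task {
	pthread_mutex_t lock;		/* plays the role of pi_lock */
	struct callback_head *works;	/* list tail, NULL, or TWORK_EXITED */
	bool exiting;			/* plays the role of PF_EXITING */
};

/* Queue w at the tail; fail with -ESRCH once the sentinel is set. */
int model_work_add(struct model_task *t, struct callback_head *w)
{
	int err = -ESRCH;

	pthread_mutex_lock(&t->lock);
	if (t->works != TWORK_EXITED) {
		struct callback_head *last = t->works;

		w->next = last ? last->next : w;	/* keep the list circular */
		if (last)
			last->next = w;
		t->works = w;
		err = 0;
	}
	pthread_mutex_unlock(&t->lock);
	return err;
}

/* Run queued works in FIFO order until a pass finds the list empty. */
void model_work_run(struct model_task *t)
{
	for (;;) {
		struct callback_head *p, *q;

		pthread_mutex_lock(&t->lock);
		p = t->works;
		/* A callback may queue more work, so only mark an empty list. */
		t->works = (!p && t->exiting) ? TWORK_EXITED : NULL;
		pthread_mutex_unlock(&t->lock);

		if (!p)
			return;

		q = p->next;	/* first element */
		p->next = NULL;	/* break the circle */
		while (q) {
			p = q->next;
			q->func(q);
			q = p;
		}
	}
}

/* Demo callback: counts how many works have run. */
static int run_count;

static void count_cb(struct callback_head *cb)
{
	(void)cb;
	run_count++;
}
```

In this model, works added before the final model_work_run() still run, and any model_work_add() after that run fails, so a caller cannot queue work that would never execute.
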