From: Oleg Nesterov <oleg@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Jones <davej@redhat.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	rostedt <rostedt@goodmis.org>, dhowells <dhowells@redhat.com>,
	Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: lockdep trace from posix timers
Date: Tue, 28 Aug 2012 19:01:21 +0200
Message-ID: <20120828170121.GA30165@redhat.com>
In-Reply-To: <1346171342.2296.4.camel@laptop>

On 08/28, Peter Zijlstra wrote:
>
> On Fri, 2012-08-24 at 20:56 +0200, Oleg Nesterov wrote:
> >
> > Peter, if you think it can work for you and if you agree with
> > the implementation I will be happy to send the patch.
>
> Yeah I think it would work, but I'm not sure why you're introducing the
> cmp_xchg helper just for this..

Please look at the patches 1-4 I sent (only 1-2 are relevant); I removed
this helper. I still think it makes sense, but of course not in
task_work.c.

>  struct callback_head *
>  task_work_cancel(struct task_struct *task, task_work_func_t func)
>  {
> -	unsigned long flags;
> -	struct callback_head *last, *res = NULL;
> -
> -	raw_spin_lock_irqsave(&task->pi_lock, flags);
> -	last = task->task_works;
> -	if (last) {
> -		struct callback_head *q = last, *p = q->next;
> -		while (1) {
> -			if (p->func == func) {
> -				q->next = p->next;
> -				if (p == last)
> -					task->task_works = q == p ? NULL : q;
> -				res = p;
> -				break;
> -			}
> -			if (p == last)
> -				break;
> -			q = p;
> -			p = q->next;
> +	struct callback_head **workp, *work;
> +
> +again:
> +	workp = &task->task_works;
> +	work = *workp;
> +	while (work) {
> +		if (work->func == func) {

But you can't dereference this pointer. Without some locking this
can race with another task_work_cancel() or task_work_run(), and by
then this work can already be freed/unmapped/reused.
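
A minimal userspace sketch of what "some locking" buys here, assuming
that every path which adds, runs or cancels work takes the same lock.
The atomic_flag spinlock stands in for task->pi_lock, the list is a
plain NULL-terminated one, and struct task, cancel_locked() and
cancel_demo() are hypothetical names, not kernel code:

```c
#include <stdatomic.h>
#include <stddef.h>

struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *);
};

/* Hypothetical stand-in for task_struct: a lock plus the work list. */
struct task {
	atomic_flag lock;
	struct cb_head *works;
};

static void task_lock(struct task *t)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
		;
}

static void task_unlock(struct task *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/*
 * Unlink and return the first entry whose ->func matches, or NULL.
 * work->func is only read while the lock is held, so (under the
 * assumption above) nobody can free or reuse the entry under us.
 */
static struct cb_head *cancel_locked(struct task *t,
				     void (*func)(struct cb_head *))
{
	struct cb_head **pprev, *work;

	task_lock(t);
	for (pprev = &t->works; (work = *pprev) != NULL; pprev = &work->next) {
		if (work->func == func) {
			*pprev = work->next;
			break;
		}
	}
	task_unlock(t);
	return work;
}

static void fa(struct cb_head *cb) { (void)cb; }
static void fb(struct cb_head *cb) { (void)cb; }

/* Returns 1 if cancelling fb unlinks exactly the fb entry. */
static int cancel_demo(void)
{
	struct cb_head n2 = { NULL, fb };
	struct cb_head n1 = { &n2, fa };
	struct task t;
	struct cb_head *hit, *miss;

	atomic_flag_clear(&t.lock);
	t.works = &n1;

	hit = cancel_locked(&t, fb);	/* finds and unlinks n2 */
	miss = cancel_locked(&t, fb);	/* nothing left to cancel */
	return hit == &n2 && miss == NULL &&
	       t.works == &n1 && n1.next == NULL;
}
```

The lockless walk in the quoted patch reads work->func with no such
guarantee, which is the problem.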

> +			if (cmpxchg(workp, work, work->next) == work)
> +				return work;

Or this can race with task_work_cancel(work) + task_work_add(work).
cmpxchg() can succeed even if work->func is already different.
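
This is the classic ABA problem: the compare-and-swap checks only the
pointer value, not what the entry currently holds. A single-threaded
userspace sketch that replays the interleaving by hand, with C11
atomic_compare_exchange_strong() standing in for the kernel's cmpxchg();
struct cb_head, func_a, func_b and aba_demo() are hypothetical names:

```c
#include <stdatomic.h>
#include <stddef.h>

struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *);
};

static void func_a(struct cb_head *cb) { (void)cb; }
static void func_b(struct cb_head *cb) { (void)cb; }

/*
 * Returns 1 if the canceller's compare-and-swap succeeds even though
 * the entry it matched on func_a now carries func_b.
 */
static int aba_demo(void)
{
	struct cb_head node = { NULL, func_a };
	_Atomic(struct cb_head *) head = &node;
	struct cb_head *work, *next, *expected;
	int matched, swapped;

	/* Canceller, step 1: sample the head and check ->func. */
	work = atomic_load(&head);
	matched = (work->func == func_a);
	next = work->next;

	/*
	 * Interleaved "other thread": cancel the same entry, then add it
	 * back with a different callback. The address is unchanged.
	 */
	atomic_store(&head, NULL);
	node.func = func_b;
	node.next = NULL;
	atomic_store(&head, &node);

	/* Canceller, step 2: the CAS sees the same pointer and succeeds. */
	expected = work;
	swapped = atomic_compare_exchange_strong(&head, &expected, next);

	return matched && swapped && node.func == func_b;
}
```

The caller asked to cancel func_a work and ends up removing func_b work.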

> +static callback_head *task_work_pop(void)
>  {
> -	struct task_struct *task = current;
> -	struct callback_head *p, *q;
> -
> -	while (1) {
> -		raw_spin_lock_irq(&task->pi_lock);
> -		p = task->task_works;
> -		task->task_works = NULL;
> -		raw_spin_unlock_irq(&task->pi_lock);
> -
> -		if (unlikely(!p))
> -			return;
> -
> -		q = p->next; /* head */
> -		p->next = NULL; /* cut it */
> -		while (q) {
> -			p = q->next;
> -			q->func(q);
> -			q = p;
> -		}
> +	struct callback_head **head = &current->task_work;
> +	struct callback_head *entry, *old_entry;
> +
> +	entry = *head;
> +	for (;;) {
> +		if (!entry || entry == &dead)
> +			return NULL;
> +
> +		old_entry = entry;
> +		entry = cmpxchg(head, entry, entry->next);

Well, this obviously means cmpxchg() for each entry...
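
To put a number on that cost, here is a userspace sketch that counts
atomic read-modify-write operations for a three-entry list. pop_each()
mirrors the pop-one-entry-per-cmpxchg shape of the quoted code;
detach_all() takes the whole list with a single exchange and is only an
illustration of the contrast, not a claim about what any patch in this
thread does. All names are hypothetical and C11 atomics stand in for the
kernel primitives:

```c
#include <stdatomic.h>
#include <stddef.h>

struct cb_head {
	struct cb_head *next;
	int ran;		/* how many times this entry was "run" */
};

/* Pop entries one at a time; returns the number of CAS attempts. */
static int pop_each(_Atomic(struct cb_head *) *head)
{
	struct cb_head *entry = atomic_load(head);
	int rmw = 0;

	while (entry) {
		rmw++;
		if (atomic_compare_exchange_strong(head, &entry, entry->next)) {
			entry->ran++;
			entry = atomic_load(head);
		}
		/* On failure, entry was reloaded with the current head. */
	}
	return rmw;
}

/* Detach the whole list at once, then walk it privately. */
static int detach_all(_Atomic(struct cb_head *) *head)
{
	struct cb_head *entry = atomic_exchange(head, NULL);

	while (entry) {
		struct cb_head *next = entry->next;

		entry->ran++;
		entry = next;
	}
	return 1;		/* a single atomic exchange */
}

/* Returns the RMW count, or -1 if some entry did not run exactly once. */
static int rmw_demo(int detach)
{
	struct cb_head n3 = { NULL, 0 };
	struct cb_head n2 = { &n3, 0 };
	struct cb_head n1 = { &n2, 0 };
	_Atomic(struct cb_head *) head = &n1;
	int rmw = detach ? detach_all(&head) : pop_each(&head);

	if (n1.ran != 1 || n2.ran != 1 || n3.ran != 1 ||
	    atomic_load(&head) != NULL)
		return -1;
	return rmw;
}
```

With no contention, rmw_demo(0) reports one CAS per entry while
rmw_demo(1) reports a single exchange regardless of list length.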

> ( And yeah, I know, its not FIFO ;-)

Cough. akpm didn't like fifo, Linus disliked it too...

And now you! What's going on??? ;)

Oleg.

