public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>,
	Thomas Gleixner <tglx@linutronix.de>,
	Mike Galbraith <efault@gmx.de>, Ingo Molnar <mingo@elte.hu>,
	LKML <linux-kernel@vger.kernel.org>,
	pm list <linux-pm@lists.linux-foundation.org>,
	Greg KH <gregkh@suse.de>, Jesse Barnes <jbarnes@virtuousgeek.org>,
	Oleg Nesterov <oleg@redhat.com>
Subject: Re: GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume
Date: Wed, 11 Nov 2009 17:08:27 +0900	[thread overview]
Message-ID: <4AFA70FB.60706@kernel.org> (raw)
In-Reply-To: <alpine.LFD.2.01.0911101343590.31845@localhost.localdomain>

Hello,

Linus Torvalds wrote:
> On Tue, 10 Nov 2009, Rafael J. Wysocki wrote:
> And the code really is pretty subtle. queue_delayed_work_on(), for 
> example, uses raw_smp_processor_id() to basically pick a target CPU for 
> the work ("whatever the CPU happens to be now"). But does that have to 
> match the CPU that the timeout will trigger on? I dunno.

It will queue on the cpu the timer callback runs, so delayed work will
follow timer which can be a little bit surprising at times, I suppose.

> I've walked through the code many times, but every time I end up
> forgetting all the subtleties a few days later.

A lot of those subtleties come from the fact struct work is not around
once the work function is invoked, so after that the workqueue code
doesn't have much to work with but the pointer value itself.

> The workqueue code is very fragile in many ways. I'm adding Oleg and Tejun 
> to the Cc, because Oleg is definitely one of the workqueue locking 
> masters, and Tejun has worked on it to make it way more robust, so they 
> may have ideas.

One thing that I can think of which might cause this early release is
self-requeueing works which assume that only one instance of the
function will be executed at any given time.  While preparing to bring
down a cpu, worker threads are unbound from the cpu.  After cpu is
brought down, the workqueue for that cpu is flushed.  This means that
any work which was running or on queue at the time of cpu down will
run on a different cpu.  So, let's assume there's a work function
which looks like the following,

void my_work_fn(struct work_struct *work)
{
	struct my_struct *me = container_of(work, something...);

	DO SOMETHING;

	if (--me->todo)
		schedule_work(work);
	else
		free(me);
}

Which will work perfectly as long as all cpus stay alive as the work
will be pinned on a single cpu and cwq guarantees that a single work
is never executed in parallel.  However, if a cpu is brought down
while my_work_fn() was doing SOMETHING and me->todo was above 1,
schedule_work() will schedule itself to a different cpu which will
happily execute the work in parallel.

As worker threads become unbound, they may bounce among different cpus
while executing and create more than two instances of parallel
execution of the work which can lead to freeings work which is still
on workqueue list with the above code but with different code just two
parallel executions should be able to achieve it.

Another related issue is the behavior flush_work() when a work ends up
scheduled on a different cpu.  flush_work() will only look at a single
cpu workqueue on each flush attempt and if the work is not on the cpu
or there but also running on other cpus, it won't do nothing about it.
So, it's not too difficult to write code where the caller incorrectly
assumes the work is done after flush_work() is finished while the work
actually ended up being scheduled on a different cpu.

One way to debug I can think of is to record work pointer -> function
mapping in a percpu ring buffer and when the bug occurs (can be
detected by matching the 6b poison pattern before dereferencing next
pointer) print out function pointer for the matching work pointer.
The async nature makes the problem very difficult to track down.  :-(

Thanks.

-- 
tejun

  reply	other threads:[~2009-11-11  8:09 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-11-09 11:50 Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd Rafael J. Wysocki
2009-11-09 12:02 ` Ingo Molnar
2009-11-09 12:24   ` Rafael J. Wysocki
2009-11-09 12:49     ` Ingo Molnar
2009-11-09 14:02       ` Thomas Gleixner
2009-11-09 14:16         ` Mike Galbraith
2009-11-09 14:27           ` Rafael J. Wysocki
2009-11-09 14:30             ` Mike Galbraith
2009-11-09 15:47               ` Rafael J. Wysocki
2009-11-09 16:19                 ` Mike Galbraith
2009-11-09 17:36                   ` Rafael J. Wysocki
2009-11-09 18:50                     ` Thomas Gleixner
2009-11-09 20:00                       ` Rafael J. Wysocki
2009-11-09 20:31                         ` [linux-pm] " Alan Stern
2009-11-09 20:48                           ` Rafael J. Wysocki
2009-11-09 21:24                             ` Alan Stern
2009-11-09 20:45                         ` GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume (was: Re: Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd) Rafael J. Wysocki
2009-11-09 21:42                           ` Linus Torvalds
2009-11-10  0:19                             ` Rafael J. Wysocki
2009-11-10 22:02                               ` Linus Torvalds
2009-11-11  8:08                                 ` Tejun Heo [this message]
2009-11-11 18:13                                   ` GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume Oleg Nesterov
2009-11-12  4:56                                     ` Tejun Heo
2009-11-12 18:35                                       ` Oleg Nesterov
2009-11-12 19:14                                         ` Tejun Heo
2009-11-16 11:01                                           ` Tejun Heo
2009-11-11 11:52                                 ` GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume (was: Re: Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd) Rafael J. Wysocki
2009-11-11 19:52                                   ` Linus Torvalds
2009-11-11 20:18                                     ` Marcel Holtmann
2009-11-11 20:25                                       ` Linus Torvalds
2009-11-11 21:18                                         ` Rafael J. Wysocki
2009-11-11 21:13                                       ` Oliver Neukum
2009-11-11 21:38                                         ` Linus Torvalds
2009-11-11 21:44                                           ` Oliver Neukum
2009-11-11 16:13                                 ` Oleg Nesterov
2009-11-11 20:00                                   ` Rafael J. Wysocki
2009-11-11 20:11                                     ` Linus Torvalds
2009-11-11 20:20                                       ` Marcel Holtmann
2009-11-11 20:24                                     ` Oleg Nesterov
2009-11-11 21:15                                       ` Oliver Neukum
2009-11-11 17:17                                 ` Oleg Nesterov
2009-11-12 17:33                                   ` Thomas Gleixner
2009-11-12 19:17                                     ` GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume Tejun Heo
2009-11-12 20:53                                       ` Thomas Gleixner
2009-11-12 20:53                                     ` GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume (was: Re: Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd) Rafael J. Wysocki
2009-11-12 20:55                                       ` Thomas Gleixner
2009-11-12 22:55                                         ` Rafael J. Wysocki
2009-11-12 23:08                                           ` Thomas Gleixner
2009-11-15 23:37                                     ` Frederic Weisbecker
2009-11-15 23:40                                       ` Frederic Weisbecker
2009-11-09 19:13                     ` Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd Thomas Gleixner
2009-11-09 20:03                       ` Rafael J. Wysocki
2009-11-09 14:26         ` Rafael J. Wysocki
2009-11-09 14:44           ` Mike Galbraith
2009-11-09 15:47             ` Rafael J. Wysocki
2009-11-09 15:57         ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4AFA70FB.60706@kernel.org \
    --to=tj@kernel.org \
    --cc=efault@gmx.de \
    --cc=gregkh@suse.de \
    --cc=jbarnes@virtuousgeek.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@lists.linux-foundation.org \
    --cc=mingo@elte.hu \
    --cc=oleg@redhat.com \
    --cc=rjw@sisk.pl \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox