From: Oleg Nesterov <oleg@redhat.com>
To: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	Thomas Gleixner <tglx@linutronix.de>,
	Mike Galbraith <efault@gmx.de>, Ingo Molnar <mingo@elte.hu>,
	LKML <linux-kernel@vger.kernel.org>,
	pm list <linux-pm@lists.linux-foundation.org>,
	Greg KH <gregkh@suse.de>, Jesse Barnes <jbarnes@virtuousgeek.org>
Subject: Re: GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume
Date: Wed, 11 Nov 2009 19:13:49 +0100
Message-ID: <20091111181349.GA30638@redhat.com>
In-Reply-To: <4AFA70FB.60706@kernel.org>

On 11/11, Tejun Heo wrote:
>
> One thing that I can think of which might cause this early release is
> self-requeueing works which assume that only one instance of the
> function will be executed at any given time.  While preparing to bring
> down a cpu, worker threads are unbound from the cpu.  After cpu is
> brought down, the workqueue for that cpu is flushed.  This means that
> any work which was running or on queue at the time of cpu down will
> run on a different cpu.  So, let's assume there's a work function
> which looks like the following,
>
> void my_work_fn(struct work_struct *work)
> {
> 	struct my_struct *me = container_of(work, something...);
>
> 	DO SOMETHING;
>
> 	if (--me->todo)
> 		schedule_work(work);
> 	else
> 		free(me);
> }
>
> Which will work perfectly as long as all cpus stay alive as the work
> will be pinned on a single cpu and cwq guarantees that a single work
> is never executed in parallel.  However, if a cpu is brought down
> while my_work_fn() was doing SOMETHING and me->todo was above 1,
> schedule_work() will schedule itself to a different cpu which will
> happily execute the work in parallel.
>
> As worker threads become unbound, they may bounce among different cpus
> while executing and create more than two instances

Well, "more than two instances" is not possible in this particular
case.

But in general I agree. If a self-requeueing work assumes it stays on
the same CPU or assumes it can never race with itself, it should hook
CPU_DOWN_PREPARE and cancel the work, as slab.c does with reap_work.

This is even documented; the comment above queue_work() says:

	* We queue the work to the CPU on which it was submitted, but if the CPU dies
	* it can be processed by another CPU.
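
To make the CPU_DOWN_PREPARE suggestion concrete, here is a minimal
userspace sketch. It is a toy model and not kernel code: every toy_*
name is hypothetical, the per-CPU queue is a plain array, and "running"
a work only records which CPU picked it up. It assumes the behaviour
described by the comment above, that works still queued on a dying CPU
are processed by another CPU, and shows that a work cancelled from a
down-prepare hook never migrates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS 2
#define TOY_QLEN    8

struct toy_work {
	bool pending;	/* queued and not yet started */
	int ran_on;	/* CPU that executed it last, -1 if it never ran */
};

/* Hypothetical per-CPU queue; stands in for the kernel's cwq. */
struct toy_cwq {
	struct toy_work *slot[TOY_QLEN];
	size_t nr;
};

static struct toy_cwq toy_cwq[TOY_NR_CPUS];

static bool toy_queue_work_on(int cpu, struct toy_work *w)
{
	if (w->pending)
		return false;
	assert(toy_cwq[cpu].nr < TOY_QLEN);
	w->pending = true;
	toy_cwq[cpu].slot[toy_cwq[cpu].nr++] = w;
	return true;
}

/* Remove a pending work from whichever queue holds it (order not kept). */
static bool toy_cancel_work(struct toy_work *w)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		struct toy_cwq *q = &toy_cwq[cpu];

		for (size_t i = 0; i < q->nr; i++) {
			if (q->slot[i] != w)
				continue;
			q->slot[i] = q->slot[--q->nr];
			w->pending = false;
			return true;
		}
	}
	return false;
}

typedef void (*toy_down_prepare_fn)(int cpu);

/*
 * Take @cpu down: call the optional down-prepare hook first, then let
 * the surviving CPU @other process whatever is still queued on @cpu.
 */
static void toy_cpu_down(int cpu, int other, toy_down_prepare_fn hook)
{
	struct toy_cwq *q = &toy_cwq[cpu];

	if (hook)
		hook(cpu);
	for (size_t i = 0; i < q->nr; i++) {
		q->slot[i]->pending = false;
		q->slot[i]->ran_on = other;
	}
	q->nr = 0;
}

/* A work that must not run on a foreign CPU, and its hook. */
static struct toy_work toy_reaper = { .pending = false, .ran_on = -1 };

static void toy_reaper_down_prepare(int cpu)
{
	(void)cpu;
	toy_cancel_work(&toy_reaper);
}
```

Queueing toy_reaper and an ordinary work on CPU 1 and then calling
toy_cpu_down(1, 0, toy_reaper_down_prepare) leaves toy_reaper cancelled,
while the ordinary work ends up executed by CPU 0.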

We can improve things, see http://marc.info/?l=linux-kernel&m=125562105103769

But then we should also change workqueue_cpu_callback(CPU_POST_DEAD).
Instead of flushing, we should carefully move the pending works to
another CPU, otherwise the self-requeueing work can block cpu_down().

> Another related issue is the behavior of flush_work() when a work ends
> up scheduled on a different cpu.  flush_work() will only look at a
> single cpu workqueue on each flush attempt and if the work is not on
> that cpu, or is there but also running on other cpus, it won't do
> anything about it.
> So, it's not too difficult to write code where the caller incorrectly
> assumes the work is done after flush_work() is finished while the work
> actually ended up being scheduled on a different cpu.

Yes, flush_work() is not even supposed to work "correctly" in this case.
Please note the changelog for db700897224b5ebdf852f2d38920ce428940d059,
in particular:

	More precisely, it "flushes" the result of the last
	queue_work() which is visible to the caller.

but we can add flush_work_sync().

But flush_work() does not have many callers. Instead people often
use flush_workqueue(), which just can't help if the work_struct is
self-requeueing or if it is a delayed_work.
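
The limitation with self-requeueing works can be seen in a small
userspace model. This is only a sketch under stated assumptions, not the
kernel implementation: the sim_* names are hypothetical, there is a
single queue, and "flush" is modelled as running exactly the entries
that were queued when the flush started. The callback has the same shape
as the quoted my_work_fn(), minus the free().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_QLEN 16

struct sim_work;
typedef void (*sim_work_fn)(struct sim_work *);

struct sim_work {
	sim_work_fn fn;
	bool pending;	/* queued, callback not started yet */
	int todo;	/* remaining iterations */
	int runs;	/* how many times fn has been called */
};

/* Hypothetical single queue: a ring of work pointers. */
static struct sim_work *sim_slot[SIM_QLEN];
static size_t sim_head, sim_tail;

static bool sim_queue_work(struct sim_work *w)
{
	if (w->pending)
		return false;
	assert(sim_tail - sim_head < SIM_QLEN);
	w->pending = true;
	sim_slot[sim_tail++ % SIM_QLEN] = w;
	return true;
}

static void sim_run_one(void)
{
	struct sim_work *w;

	assert(sim_head != sim_tail);
	w = sim_slot[sim_head++ % SIM_QLEN];
	w->pending = false;	/* cleared first, so fn may requeue w */
	w->fn(w);
}

/* Run only the entries that were queued when the flush started. */
static void sim_flush(void)
{
	size_t barrier = sim_tail;

	while (sim_head != barrier)
		sim_run_one();
}

/* Self-requeueing callback: run, then requeue until todo hits zero. */
static void sim_requeue_fn(struct sim_work *w)
{
	w->runs++;
	if (--w->todo)
		sim_queue_work(w);
}
```

With todo set to 3, one sim_flush() after queueing returns with the
callback having run once and the work pending again; in this model it
takes three flushes before the work is really finished.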

> One way to debug I can think of is to record work pointer -> function
> mapping in a percpu ring buffer

We can record work->func in work->entry.prev, which is either another
work or cwq. Please see the debugging patch I sent.

Not sure this patch will help, but I bet that the actual reason for
this bug is much simpler than the subtle races above ;)

Oleg.

