All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	Thomas Gleixner <tglx@linutronix.de>,
	Mike Galbraith <efault@gmx.de>, Ingo Molnar <mingo@elte.hu>,
	LKML <linux-kernel@vger.kernel.org>,
	pm list <linux-pm@lists.linux-foundation.org>,
	Greg KH <gregkh@suse.de>, Jesse Barnes <jbarnes@virtuousgeek.org>,
	Andi Kleen <ak@linux.intel.com>
Subject: Re: GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume
Date: Thu, 12 Nov 2009 19:35:08 +0100	[thread overview]
Message-ID: <20091112183508.GA14661@redhat.com> (raw)
In-Reply-To: <4AFB9595.1030002@kernel.org>

On 11/12, Tejun Heo wrote:
>
> Oleg Nesterov wrote:
> >
> > This is even documented, the comment above queue_work() says:
> >
> > 	* We queue the work to the CPU on which it was submitted, but if the CPU dies
> > 	* it can be processed by another CPU.
> >
> > We can improve things, see http://marc.info/?l=linux-kernel&m=125562105103769
> >
> > But then we should also change workqueue_cpu_callback(CPU_POST_DEAD).
> > Instead of flushing, we should carefully move the pending works to
> > another CPU, otherwise the self-requeueing work can block cpu_down().
>
> That looks like an excellent idea and I don't think it will add
> noticeable overhead even done by default and it will magically make
> the "how to implement single-threaded wq semantics in conccurrency
> managed wq" problem go away.  I'll work on it.

I am still not sure all work_structs should single-threaded by default.

To clarify, I am not arguing. Just I don't know. I mean, this change can
break the existing code, and it is not easy to notice the problem.

> If you look at the workqueue code itself very closely, all subtleties
> are properly defined and described.  The problem is that it's not very
> clear and way too subtle when seen from outside and workqueue is
> something used very widely.

Yes, agreed.

> making flush_work() behave as
> flush_work_sync() by default should be doable without too much
> overhead.  I'll give it a shot.

Well, I disagree. Imho it is better to have both flush_work() and
flush_work_sync(). flush_work() is "special" and should be used with
care. But this is minor, and if the work_stuct is single-threaded then
flush_work() == flush_work_sync().

(Perhaps this is what you meant)

> > Not sure this patch will help, but I bet that the actual reason for
> > this bug is much simpler than the subtle races above ;)
>
> And yes it was but still I'm fairly sure unexpected races described
> above are happening.

Yes, sure, I never argued.

My only point was, it is not that workqueues are buggy, they were
designed (and even documented) to work this way. I can't judge if it
was right or not, but personally I think everything is "logical".

That said, I agree that we have too many buggy users, perhaps we
should simplify the rules.

I just noticed that schedule_on_each_cpu() was recently changed by

	HWPOISON: Allow schedule_on_each_cpu() from keventd
	commit: 65a64464349883891e21e74af16c05d6e1eeb4e9

Surprisingly, even this simple change is not exactly right.

	/*
	 * when running in keventd don't schedule a work item on itself.
	 * Can just call directly because the work queue is already bound.
	 * This also is faster.
	 * Make this a generic parameter for other workqueues?
	 */
	if (current_is_keventd()) {
		orig = raw_smp_processor_id();
		INIT_WORK(per_cpu_ptr(works, orig), func);
		func(per_cpu_ptr(works, orig));
	}

OK, but this code should be moved down, under get_online_cpus().

schedule_on_each_cpu() should guarantee that func() can't race with
CPU hotplug, can safely use per-cpu data, etc. That is why flush_work()
is called before put_online_cpus().

Another reason to move this code down is that current_is_keventd()
itself is racy, the "preempt-safe: keventd is per-cpu" comment is not
right. I think do_boot_cpu() case is fine though.

(off-topic, but we can also simplify the code a little bit, the second
 "if (cpu != orig)" is not necessary).


Perhaps you and Linus are right, and we should simplify the rules
unconditionally. But note that the problem above has nothing to do with
single-threaded behaviour, and I do not think it is possible to guarantee
work->func() can't be moved to another CPU.

Oleg.


  parent reply	other threads:[~2009-11-12 18:41 UTC|newest]

Thread overview: 112+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-11-09 11:50 Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd Rafael J. Wysocki
2009-11-09 11:50 ` Rafael J. Wysocki
2009-11-09 12:02 ` Ingo Molnar
2009-11-09 12:24   ` Rafael J. Wysocki
2009-11-09 12:49     ` Ingo Molnar
2009-11-09 12:49       ` Ingo Molnar
2009-11-09 14:02       ` Thomas Gleixner
2009-11-09 14:02       ` Thomas Gleixner
2009-11-09 14:16         ` Mike Galbraith
2009-11-09 14:16         ` Mike Galbraith
2009-11-09 14:27           ` Rafael J. Wysocki
2009-11-09 14:27             ` Rafael J. Wysocki
2009-11-09 14:30             ` Mike Galbraith
2009-11-09 14:30             ` Mike Galbraith
2009-11-09 15:47               ` Rafael J. Wysocki
2009-11-09 15:47               ` Rafael J. Wysocki
2009-11-09 16:19                 ` Mike Galbraith
2009-11-09 17:36                   ` Rafael J. Wysocki
2009-11-09 17:36                   ` Rafael J. Wysocki
2009-11-09 18:50                     ` Thomas Gleixner
2009-11-09 20:00                       ` Rafael J. Wysocki
2009-11-09 20:00                       ` Rafael J. Wysocki
2009-11-09 20:31                         ` [linux-pm] " Alan Stern
2009-11-09 20:48                           ` Rafael J. Wysocki
2009-11-09 20:48                           ` [linux-pm] " Rafael J. Wysocki
2009-11-09 21:24                             ` Alan Stern
2009-11-09 21:24                             ` Alan Stern
2009-11-09 20:31                         ` Alan Stern
2009-11-09 20:45                         ` GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume (was: Re: Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd) Rafael J. Wysocki
2009-11-09 20:45                         ` Rafael J. Wysocki
2009-11-09 21:42                           ` Linus Torvalds
2009-11-09 21:42                             ` Linus Torvalds
2009-11-10  0:19                             ` Rafael J. Wysocki
2009-11-10 22:02                               ` Linus Torvalds
2009-11-10 22:02                                 ` Linus Torvalds
2009-11-11  8:08                                 ` GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume Tejun Heo
2009-11-11 18:13                                   ` Oleg Nesterov
2009-11-11 18:13                                   ` Oleg Nesterov
2009-11-12  4:56                                     ` Tejun Heo
2009-11-12 18:35                                       ` Oleg Nesterov
2009-11-12 18:35                                       ` Oleg Nesterov [this message]
2009-11-12 19:14                                         ` Tejun Heo
2009-11-12 19:14                                           ` Tejun Heo
2009-11-16 11:01                                           ` Tejun Heo
2009-11-16 11:01                                           ` Tejun Heo
2009-11-12  4:56                                     ` Tejun Heo
2009-11-11  8:08                                 ` Tejun Heo
2009-11-11 11:52                                 ` GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume (was: Re: Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd) Rafael J. Wysocki
2009-11-11 11:52                                 ` Rafael J. Wysocki
2009-11-11 19:52                                   ` Linus Torvalds
2009-11-11 19:52                                     ` Linus Torvalds
2009-11-11 20:18                                     ` Marcel Holtmann
2009-11-11 20:18                                     ` Marcel Holtmann
2009-11-11 20:25                                       ` Linus Torvalds
2009-11-11 20:25                                         ` Linus Torvalds
2009-11-11 21:18                                         ` Rafael J. Wysocki
2009-11-11 21:18                                         ` Rafael J. Wysocki
2009-11-11 21:13                                       ` Oliver Neukum
2009-11-11 21:13                                       ` Oliver Neukum
2009-11-11 21:38                                         ` Linus Torvalds
2009-11-11 21:38                                           ` Linus Torvalds
2009-11-11 21:44                                           ` Oliver Neukum
2009-11-11 21:44                                           ` Oliver Neukum
2009-11-11 16:13                                 ` Oleg Nesterov
2009-11-11 20:00                                   ` Rafael J. Wysocki
2009-11-11 20:00                                   ` Rafael J. Wysocki
2009-11-11 20:11                                     ` Linus Torvalds
2009-11-11 20:20                                       ` Marcel Holtmann
2009-11-11 20:20                                       ` Marcel Holtmann
2009-11-11 20:11                                     ` Linus Torvalds
2009-11-11 20:24                                     ` Oleg Nesterov
2009-11-11 20:24                                     ` Oleg Nesterov
2009-11-11 21:15                                       ` Oliver Neukum
2009-11-11 21:15                                       ` Oliver Neukum
2009-11-11 16:13                                 ` Oleg Nesterov
2009-11-11 17:17                                 ` Oleg Nesterov
2009-11-12 17:33                                   ` Thomas Gleixner
2009-11-12 17:33                                   ` Thomas Gleixner
2009-11-12 19:17                                     ` GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume Tejun Heo
2009-11-12 20:53                                       ` Thomas Gleixner
2009-11-12 20:53                                       ` Thomas Gleixner
2009-11-12 19:17                                     ` Tejun Heo
2009-11-12 20:53                                     ` GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume (was: Re: Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd) Rafael J. Wysocki
2009-11-12 20:55                                       ` Thomas Gleixner
2009-11-12 22:55                                         ` Rafael J. Wysocki
2009-11-12 23:08                                           ` Thomas Gleixner
2009-11-12 23:08                                           ` Thomas Gleixner
2009-11-12 22:55                                         ` Rafael J. Wysocki
2009-11-12 20:55                                       ` Thomas Gleixner
2009-11-12 20:53                                     ` Rafael J. Wysocki
2009-11-15 23:37                                     ` Frederic Weisbecker
2009-11-15 23:37                                     ` Frederic Weisbecker
2009-11-15 23:40                                       ` Frederic Weisbecker
2009-11-15 23:40                                       ` Frederic Weisbecker
2009-11-11 17:17                                 ` Oleg Nesterov
2009-11-10  0:19                             ` Rafael J. Wysocki
2009-11-09 18:50                     ` Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd Thomas Gleixner
2009-11-09 19:13                     ` Thomas Gleixner
2009-11-09 19:13                     ` Thomas Gleixner
2009-11-09 20:03                       ` Rafael J. Wysocki
2009-11-09 20:03                       ` Rafael J. Wysocki
2009-11-09 16:19                 ` Mike Galbraith
2009-11-09 14:26         ` Rafael J. Wysocki
2009-11-09 14:44           ` Mike Galbraith
2009-11-09 15:47             ` Rafael J. Wysocki
2009-11-09 15:47             ` Rafael J. Wysocki
2009-11-09 14:44           ` Mike Galbraith
2009-11-09 14:26         ` Rafael J. Wysocki
2009-11-09 15:57         ` Linus Torvalds
2009-11-09 15:57         ` Linus Torvalds
2009-11-09 12:24   ` Rafael J. Wysocki
2009-11-09 12:02 ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20091112183508.GA14661@redhat.com \
    --to=oleg@redhat.com \
    --cc=ak@linux.intel.com \
    --cc=efault@gmx.de \
    --cc=gregkh@suse.de \
    --cc=jbarnes@virtuousgeek.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@lists.linux-foundation.org \
    --cc=mingo@elte.hu \
    --cc=rjw@sisk.pl \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.