From: Oleg Nesterov <oleg@redhat.com>
To: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
"Rafael J. Wysocki" <rjw@sisk.pl>,
Thomas Gleixner <tglx@linutronix.de>,
Mike Galbraith <efault@gmx.de>, Ingo Molnar <mingo@elte.hu>,
LKML <linux-kernel@vger.kernel.org>,
pm list <linux-pm@lists.linux-foundation.org>,
Greg KH <gregkh@suse.de>, Jesse Barnes <jbarnes@virtuousgeek.org>,
Andi Kleen <ak@linux.intel.com>
Subject: Re: GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume
Date: Thu, 12 Nov 2009 19:35:08 +0100 [thread overview]
Message-ID: <20091112183508.GA14661@redhat.com> (raw)
In-Reply-To: <4AFB9595.1030002@kernel.org>
On 11/12, Tejun Heo wrote:
>
> Oleg Nesterov wrote:
> >
> > This is even documented, the comment above queue_work() says:
> >
> > * We queue the work to the CPU on which it was submitted, but if the CPU dies
> > * it can be processed by another CPU.
> >
> > We can improve things, see http://marc.info/?l=linux-kernel&m=125562105103769
> >
> > But then we should also change workqueue_cpu_callback(CPU_POST_DEAD).
> > Instead of flushing, we should carefully move the pending works to
> > another CPU, otherwise the self-requeueing work can block cpu_down().
>
> That looks like an excellent idea and I don't think it will add
> noticeable overhead even done by default and it will magically make
> the "how to implement single-threaded wq semantics in conccurrency
> managed wq" problem go away. I'll work on it.
I am still not sure all work_structs should single-threaded by default.
To clarify, I am not arguing. Just I don't know. I mean, this change can
break the existing code, and it is not easy to notice the problem.
> If you look at the workqueue code itself very closely, all subtleties
> are properly defined and described. The problem is that it's not very
> clear and way too subtle when seen from outside and workqueue is
> something used very widely.
Yes, agreed.
> making flush_work() behave as
> flush_work_sync() by default should be doable without too much
> overhead. I'll give it a shot.
Well, I disagree. Imho it is better to have both flush_work() and
flush_work_sync(). flush_work() is "special" and should be used with
care. But this is minor, and if the work_stuct is single-threaded then
flush_work() == flush_work_sync().
(Perhaps this is what you meant)
> > Not sure this patch will help, but I bet that the actual reason for
> > this bug is much simpler than the subtle races above ;)
>
> And yes it was but still I'm fairly sure unexpected races described
> above are happening.
Yes, sure, I never argued.
My only point was, it is not that workqueues are buggy, they were
designed (and even documented) to work this way. I can't judge if it
was right or not, but personally I think everything is "logical".
That said, I agree that we have too many buggy users, perhaps we
should simplify the rules.
I just noticed that schedule_on_each_cpu() was recently changed by
HWPOISON: Allow schedule_on_each_cpu() from keventd
commit: 65a64464349883891e21e74af16c05d6e1eeb4e9
Surprisingly, even this simple change is not exactly right.
/*
* when running in keventd don't schedule a work item on itself.
* Can just call directly because the work queue is already bound.
* This also is faster.
* Make this a generic parameter for other workqueues?
*/
if (current_is_keventd()) {
orig = raw_smp_processor_id();
INIT_WORK(per_cpu_ptr(works, orig), func);
func(per_cpu_ptr(works, orig));
}
OK, but this code should be moved down, under get_online_cpus().
schedule_on_each_cpu() should guarantee that func() can't race with
CPU hotplug, can safely use per-cpu data, etc. That is why flush_work()
is called before put_online_cpus().
Another reason to move this code down is that current_is_keventd()
itself is racy, the "preempt-safe: keventd is per-cpu" comment is not
right. I think do_boot_cpu() case is fine though.
(off-topic, but we can also simplify the code a little bit, the second
"if (cpu != orig)" is not necessary).
Perhaps you and Linus are right, and we should simplify the rules
unconditionally. But note that the problem above has nothing to do with
single-threaded behaviour, and I do not think it is possible to guarantee
work->func() can't be moved to another CPU.
Oleg.
next prev parent reply other threads:[~2009-11-12 18:41 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-11-09 11:50 Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd Rafael J. Wysocki
2009-11-09 12:02 ` Ingo Molnar
2009-11-09 12:24 ` Rafael J. Wysocki
2009-11-09 12:49 ` Ingo Molnar
2009-11-09 14:02 ` Thomas Gleixner
2009-11-09 14:16 ` Mike Galbraith
2009-11-09 14:27 ` Rafael J. Wysocki
2009-11-09 14:30 ` Mike Galbraith
2009-11-09 15:47 ` Rafael J. Wysocki
2009-11-09 16:19 ` Mike Galbraith
2009-11-09 17:36 ` Rafael J. Wysocki
2009-11-09 18:50 ` Thomas Gleixner
2009-11-09 20:00 ` Rafael J. Wysocki
2009-11-09 20:31 ` [linux-pm] " Alan Stern
2009-11-09 20:48 ` Rafael J. Wysocki
2009-11-09 21:24 ` Alan Stern
2009-11-09 20:45 ` GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume (was: Re: Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd) Rafael J. Wysocki
2009-11-09 21:42 ` Linus Torvalds
2009-11-10 0:19 ` Rafael J. Wysocki
2009-11-10 22:02 ` Linus Torvalds
2009-11-11 8:08 ` GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume Tejun Heo
2009-11-11 18:13 ` Oleg Nesterov
2009-11-12 4:56 ` Tejun Heo
2009-11-12 18:35 ` Oleg Nesterov [this message]
2009-11-12 19:14 ` Tejun Heo
2009-11-16 11:01 ` Tejun Heo
2009-11-11 11:52 ` GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume (was: Re: Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd) Rafael J. Wysocki
2009-11-11 19:52 ` Linus Torvalds
2009-11-11 20:18 ` Marcel Holtmann
2009-11-11 20:25 ` Linus Torvalds
2009-11-11 21:18 ` Rafael J. Wysocki
2009-11-11 21:13 ` Oliver Neukum
2009-11-11 21:38 ` Linus Torvalds
2009-11-11 21:44 ` Oliver Neukum
2009-11-11 16:13 ` Oleg Nesterov
2009-11-11 20:00 ` Rafael J. Wysocki
2009-11-11 20:11 ` Linus Torvalds
2009-11-11 20:20 ` Marcel Holtmann
2009-11-11 20:24 ` Oleg Nesterov
2009-11-11 21:15 ` Oliver Neukum
2009-11-11 17:17 ` Oleg Nesterov
2009-11-12 17:33 ` Thomas Gleixner
2009-11-12 19:17 ` GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume Tejun Heo
2009-11-12 20:53 ` Thomas Gleixner
2009-11-12 20:53 ` GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume (was: Re: Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd) Rafael J. Wysocki
2009-11-12 20:55 ` Thomas Gleixner
2009-11-12 22:55 ` Rafael J. Wysocki
2009-11-12 23:08 ` Thomas Gleixner
2009-11-15 23:37 ` Frederic Weisbecker
2009-11-15 23:40 ` Frederic Weisbecker
2009-11-09 19:13 ` Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd Thomas Gleixner
2009-11-09 20:03 ` Rafael J. Wysocki
2009-11-09 14:26 ` Rafael J. Wysocki
2009-11-09 14:44 ` Mike Galbraith
2009-11-09 15:47 ` Rafael J. Wysocki
2009-11-09 15:57 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20091112183508.GA14661@redhat.com \
--to=oleg@redhat.com \
--cc=ak@linux.intel.com \
--cc=efault@gmx.de \
--cc=gregkh@suse.de \
--cc=jbarnes@virtuousgeek.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pm@lists.linux-foundation.org \
--cc=mingo@elte.hu \
--cc=rjw@sisk.pl \
--cc=tglx@linutronix.de \
--cc=tj@kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox