From: Oleg Nesterov <oleg@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>,
Thomas Gleixner <tglx@linutronix.de>,
Mike Galbraith <efault@gmx.de>, Ingo Molnar <mingo@elte.hu>,
LKML <linux-kernel@vger.kernel.org>,
pm list <linux-pm@lists.linux-foundation.org>,
Greg KH <gregkh@suse.de>, Jesse Barnes <jbarnes@virtuousgeek.org>,
Tejun Heo <tj@kernel.org>
Subject: Re: GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume (was: Re: Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd)
Date: Wed, 11 Nov 2009 18:17:03 +0100 [thread overview]
Message-ID: <20091111171703.GA28506@redhat.com> (raw)
In-Reply-To: <alpine.LFD.2.01.0911101343590.31845@localhost.localdomain>
On 11/10, Linus Torvalds wrote:
>
> And the code really is pretty subtle. queue_delayed_work_on(), for
> example, uses raw_smp_processor_id() to basically pick a target CPU for
> the work ("whatever the CPU happens to be now"). But does that have to
> match the CPU that the timeout will trigger on?
Yes, but this doesn't matter.
queue_delayed_work_on() does:
/* This stores cwq for the moment, for the timer_fn */
set_wq_data(work, wq_per_cpu(wq, raw_smp_processor_id()));
this is only needed to ensure that delayed_work_timer_fn() which does
struct cpu_workqueue_struct *cwq = get_wq_data(&dwork->work);
struct workqueue_struct *wq = cwq->wq;
gets the correct workqueue_struct, cpu_workqueue_struct can be "wrong"
and even destroyed. queue_delayed_work_on() could use any CPU from
cpu_possible_mask instead of raw_smp_processor_id().
I still owe you the promised changes which should fix the problems
with the "overlapping" works (although I don't agree we never want
to run the same work entry on multiple CPU's at once, so I am not
sure work_struct's should be always "single-threaded"), and the very
first change will be
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -246,7 +246,8 @@ int queue_delayed_work_on(int cpu, struc
timer_stats_timer_set_start_info(&dwork->timer);
/* This stores cwq for the moment, for the timer_fn */
- set_wq_data(work, wq_per_cpu(wq, raw_smp_processor_id()));
+ if (!get_wq_data(work))
+ set_wq_data(work, wq_per_cpu(wq, raw_smp_processor_id()));
timer->expires = jiffies + delay;
timer->data = (unsigned long)dwork;
timer->function = delayed_work_timer_fn;
except this change is not always right. Not only the same work_struct
can run on multiple CPU's, it can run on different workqueues. Perhaps
nobody does this, but this is possible.
IOW, I agree it makes sense to introcude WORK_STRUCT_SINGLE_THREADED flag,
and perhaps it can be even set by default (not sure), but in any case I
think we need work_{set,clear}_single_threaded().
> The workqueue code is very fragile in many ways.
Well, yes. Because any buggy user can easily kill the system, and we
constantly have the bugs like this one.
I think we should teach workqueue.c to use debugobjects.c at least.
But otherwise I don't see how we can improve things very much. The
problem is not that the code itself is fragile, just it has a lot
of buggy users. Once queue_work(work) adds this work to ->worklist
the kernel depends on the owner of this work, it can crash the kernel
in many ways.
Oleg.
next prev parent reply other threads:[~2009-11-11 17:24 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-11-09 11:50 Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd Rafael J. Wysocki
2009-11-09 12:02 ` Ingo Molnar
2009-11-09 12:24 ` Rafael J. Wysocki
2009-11-09 12:49 ` Ingo Molnar
2009-11-09 14:02 ` Thomas Gleixner
2009-11-09 14:16 ` Mike Galbraith
2009-11-09 14:27 ` Rafael J. Wysocki
2009-11-09 14:30 ` Mike Galbraith
2009-11-09 15:47 ` Rafael J. Wysocki
2009-11-09 16:19 ` Mike Galbraith
2009-11-09 17:36 ` Rafael J. Wysocki
2009-11-09 18:50 ` Thomas Gleixner
2009-11-09 20:00 ` Rafael J. Wysocki
2009-11-09 20:31 ` [linux-pm] " Alan Stern
2009-11-09 20:48 ` Rafael J. Wysocki
2009-11-09 21:24 ` Alan Stern
2009-11-09 20:45 ` GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume (was: Re: Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd) Rafael J. Wysocki
2009-11-09 21:42 ` Linus Torvalds
2009-11-10 0:19 ` Rafael J. Wysocki
2009-11-10 22:02 ` Linus Torvalds
2009-11-11 8:08 ` GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume Tejun Heo
2009-11-11 18:13 ` Oleg Nesterov
2009-11-12 4:56 ` Tejun Heo
2009-11-12 18:35 ` Oleg Nesterov
2009-11-12 19:14 ` Tejun Heo
2009-11-16 11:01 ` Tejun Heo
2009-11-11 11:52 ` GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume (was: Re: Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd) Rafael J. Wysocki
2009-11-11 19:52 ` Linus Torvalds
2009-11-11 20:18 ` Marcel Holtmann
2009-11-11 20:25 ` Linus Torvalds
2009-11-11 21:18 ` Rafael J. Wysocki
2009-11-11 21:13 ` Oliver Neukum
2009-11-11 21:38 ` Linus Torvalds
2009-11-11 21:44 ` Oliver Neukum
2009-11-11 16:13 ` Oleg Nesterov
2009-11-11 20:00 ` Rafael J. Wysocki
2009-11-11 20:11 ` Linus Torvalds
2009-11-11 20:20 ` Marcel Holtmann
2009-11-11 20:24 ` Oleg Nesterov
2009-11-11 21:15 ` Oliver Neukum
2009-11-11 17:17 ` Oleg Nesterov [this message]
2009-11-12 17:33 ` Thomas Gleixner
2009-11-12 19:17 ` GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume Tejun Heo
2009-11-12 20:53 ` Thomas Gleixner
2009-11-12 20:53 ` GPF in run_workqueue()/list_del_init(cwq->worklist.next) on resume (was: Re: Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd) Rafael J. Wysocki
2009-11-12 20:55 ` Thomas Gleixner
2009-11-12 22:55 ` Rafael J. Wysocki
2009-11-12 23:08 ` Thomas Gleixner
2009-11-15 23:37 ` Frederic Weisbecker
2009-11-15 23:40 ` Frederic Weisbecker
2009-11-09 19:13 ` Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd Thomas Gleixner
2009-11-09 20:03 ` Rafael J. Wysocki
2009-11-09 14:26 ` Rafael J. Wysocki
2009-11-09 14:44 ` Mike Galbraith
2009-11-09 15:47 ` Rafael J. Wysocki
2009-11-09 15:57 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20091111171703.GA28506@redhat.com \
--to=oleg@redhat.com \
--cc=efault@gmx.de \
--cc=gregkh@suse.de \
--cc=jbarnes@virtuousgeek.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pm@lists.linux-foundation.org \
--cc=mingo@elte.hu \
--cc=rjw@sisk.pl \
--cc=tglx@linutronix.de \
--cc=tj@kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox