* [PATCH] sysfs: don't use global workqueue in sysfs_schedule_callback()
@ 2009-03-25 21:11 Alex Chiang
2009-04-03 20:52 ` Alex Chiang
0 siblings, 1 reply; 3+ messages in thread
From: Alex Chiang @ 2009-03-25 21:11 UTC (permalink / raw)
To: gregkh; +Cc: kaneshige.kenji, linux-kernel
A sysfs attribute using sysfs_schedule_callback() to commit suicide
may end up calling device_unregister(), which will eventually call
a driver's ->remove function.
Drivers may call flush_scheduled_work() in their shutdown routines,
in which case lockdep will complain with something like the following:
=============================================
[ INFO: possible recursive locking detected ]
2.6.29-rc8-kk #1
---------------------------------------------
events/4/56 is trying to acquire lock:
(events){--..}, at: [<ffffffff80257fc0>] flush_workqueue+0x0/0xa0
but task is already holding lock:
(events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
other info that might help us debug this:
3 locks held by events/4/56:
#0: (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
#1: (&ss->work){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
#2: (pci_remove_rescan_mutex){--..}, at: [<ffffffff803c10d1>] remove_callback+0x21/0x40
stack backtrace:
Pid: 56, comm: events/4 Not tainted 2.6.29-rc8-kk #1
Call Trace:
[<ffffffff8026dfcd>] validate_chain+0xb7d/0x1260
[<ffffffff8026eade>] __lock_acquire+0x42e/0xa40
[<ffffffff8026f148>] lock_acquire+0x58/0x80
[<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
[<ffffffff8025800d>] flush_workqueue+0x4d/0xa0
[<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
[<ffffffff80258070>] flush_scheduled_work+0x10/0x20
[<ffffffffa0144065>] e1000_remove+0x55/0xfe [e1000e]
[<ffffffff8033ee30>] ? sysfs_schedule_callback_work+0x0/0x50
[<ffffffff803bfeb2>] pci_device_remove+0x32/0x70
[<ffffffff80441da9>] __device_release_driver+0x59/0x90
[<ffffffff80441edb>] device_release_driver+0x2b/0x40
[<ffffffff804419d6>] bus_remove_device+0xa6/0x120
[<ffffffff8043e46b>] device_del+0x12b/0x190
[<ffffffff8043e4f6>] device_unregister+0x26/0x70
[<ffffffff803ba969>] pci_stop_dev+0x49/0x60
[<ffffffff803baab0>] pci_remove_bus_device+0x40/0xc0
[<ffffffff803c10d9>] remove_callback+0x29/0x40
[<ffffffff8033ee4f>] sysfs_schedule_callback_work+0x1f/0x50
[<ffffffff8025769a>] run_workqueue+0x15a/0x230
[<ffffffff80257648>] ? run_workqueue+0x108/0x230
[<ffffffff8025846f>] worker_thread+0x9f/0x100
[<ffffffff8025bce0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff802583d0>] ? worker_thread+0x0/0x100
[<ffffffff8025b89d>] kthread+0x4d/0x80
[<ffffffff8020d4ba>] child_rip+0xa/0x20
[<ffffffff8020cebc>] ? restore_args+0x0/0x30
[<ffffffff8025b850>] ? kthread+0x0/0x80
[<ffffffff8020d4b0>] ? child_rip+0x0/0x20
Although we know that the device_unregister path will never acquire
a lock that a driver might try to acquire in its ->remove, in general
we should never attempt to flush a workqueue from within the same
workqueue, and lockdep rightly complains.
So as long as sysfs attributes cannot commit suicide directly and we
are stuck with this callback mechanism, put the sysfs callbacks on
their own workqueue instead of the global one.
This has the side benefit that if a suicidal sysfs attribute kicks
off a long chain of ->remove callbacks, we no longer induce a long
delay on the global queue.
Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
---
This also fixes a missing module_put in the error path introduced
by sysfs-only-allow-one-scheduled-removal-callback-per-kobj.patch.
We never destroy the workqueue, but I'm not sure that's a
problem.
---
file.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
---
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 289c43a..979e937 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -667,6 +667,7 @@ struct sysfs_schedule_callback_struct {
struct work_struct work;
};
+static struct workqueue_struct *sysfs_workqueue;
static DEFINE_MUTEX(sysfs_workq_mutex);
static LIST_HEAD(sysfs_workq);
static void sysfs_schedule_callback_work(struct work_struct *work)
@@ -715,11 +716,20 @@ int sysfs_schedule_callback(struct kobject *kobj, void (*func)(void *),
mutex_lock(&sysfs_workq_mutex);
list_for_each_entry_safe(ss, tmp, &sysfs_workq, workq_list)
if (ss->kobj == kobj) {
+ module_put(owner);
mutex_unlock(&sysfs_workq_mutex);
return -EAGAIN;
}
mutex_unlock(&sysfs_workq_mutex);
+ if (sysfs_workqueue == NULL) {
+ sysfs_workqueue = create_workqueue("sysfsd");
+ if (sysfs_workqueue == NULL) {
+ module_put(owner);
+ return -ENOMEM;
+ }
+ }
+
ss = kmalloc(sizeof(*ss), GFP_KERNEL);
if (!ss) {
module_put(owner);
@@ -735,7 +745,7 @@ int sysfs_schedule_callback(struct kobject *kobj, void (*func)(void *),
mutex_lock(&sysfs_workq_mutex);
list_add_tail(&ss->workq_list, &sysfs_workq);
mutex_unlock(&sysfs_workq_mutex);
- schedule_work(&ss->work);
+ queue_work(sysfs_workqueue, &ss->work);
return 0;
}
EXPORT_SYMBOL_GPL(sysfs_schedule_callback);
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] sysfs: don't use global workqueue in sysfs_schedule_callback()
2009-03-25 21:11 [PATCH] sysfs: don't use global workqueue in sysfs_schedule_callback() Alex Chiang
@ 2009-04-03 20:52 ` Alex Chiang
2009-04-04 4:36 ` Greg KH
0 siblings, 1 reply; 3+ messages in thread
From: Alex Chiang @ 2009-04-03 20:52 UTC (permalink / raw)
To: gregkh; +Cc: kaneshige.kenji, linux-kernel
Hi Greg,
You're probably still working through your email backlog, but
just wanted to make sure this patch didn't get lost.
Thanks.
* Alex Chiang <achiang@hp.com>:
> A sysfs attribute using sysfs_schedule_callback() to commit suicide
> may end up calling device_unregister(), which will eventually call
> a driver's ->remove function.
>
> Drivers may call flush_scheduled_work() in their shutdown routines,
> in which case lockdep will complain with something like the following:
>
> =============================================
> [ INFO: possible recursive locking detected ]
> 2.6.29-rc8-kk #1
> ---------------------------------------------
> events/4/56 is trying to acquire lock:
> (events){--..}, at: [<ffffffff80257fc0>] flush_workqueue+0x0/0xa0
>
> but task is already holding lock:
> (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
>
> other info that might help us debug this:
> 3 locks held by events/4/56:
> #0: (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
> #1: (&ss->work){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
> #2: (pci_remove_rescan_mutex){--..}, at: [<ffffffff803c10d1>] remove_callback+0x21/0x40
>
> stack backtrace:
> Pid: 56, comm: events/4 Not tainted 2.6.29-rc8-kk #1
> Call Trace:
> [<ffffffff8026dfcd>] validate_chain+0xb7d/0x1260
> [<ffffffff8026eade>] __lock_acquire+0x42e/0xa40
> [<ffffffff8026f148>] lock_acquire+0x58/0x80
> [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
> [<ffffffff8025800d>] flush_workqueue+0x4d/0xa0
> [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
> [<ffffffff80258070>] flush_scheduled_work+0x10/0x20
> [<ffffffffa0144065>] e1000_remove+0x55/0xfe [e1000e]
> [<ffffffff8033ee30>] ? sysfs_schedule_callback_work+0x0/0x50
> [<ffffffff803bfeb2>] pci_device_remove+0x32/0x70
> [<ffffffff80441da9>] __device_release_driver+0x59/0x90
> [<ffffffff80441edb>] device_release_driver+0x2b/0x40
> [<ffffffff804419d6>] bus_remove_device+0xa6/0x120
> [<ffffffff8043e46b>] device_del+0x12b/0x190
> [<ffffffff8043e4f6>] device_unregister+0x26/0x70
> [<ffffffff803ba969>] pci_stop_dev+0x49/0x60
> [<ffffffff803baab0>] pci_remove_bus_device+0x40/0xc0
> [<ffffffff803c10d9>] remove_callback+0x29/0x40
> [<ffffffff8033ee4f>] sysfs_schedule_callback_work+0x1f/0x50
> [<ffffffff8025769a>] run_workqueue+0x15a/0x230
> [<ffffffff80257648>] ? run_workqueue+0x108/0x230
> [<ffffffff8025846f>] worker_thread+0x9f/0x100
> [<ffffffff8025bce0>] ? autoremove_wake_function+0x0/0x40
> [<ffffffff802583d0>] ? worker_thread+0x0/0x100
> [<ffffffff8025b89d>] kthread+0x4d/0x80
> [<ffffffff8020d4ba>] child_rip+0xa/0x20
> [<ffffffff8020cebc>] ? restore_args+0x0/0x30
> [<ffffffff8025b850>] ? kthread+0x0/0x80
> [<ffffffff8020d4b0>] ? child_rip+0x0/0x20
>
> Although we know that the device_unregister path will never acquire
> a lock that a driver might try to acquire in its ->remove, in general
> we should never attempt to flush a workqueue from within the same
> workqueue, and lockdep rightly complains.
>
> So as long as sysfs attributes cannot commit suicide directly and we
> are stuck with this callback mechanism, put the sysfs callbacks on
> their own workqueue instead of the global one.
>
> This has the side benefit that if a suicidal sysfs attribute kicks
> off a long chain of ->remove callbacks, we no longer induce a long
> delay on the global queue.
>
> Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> Signed-off-by: Alex Chiang <achiang@hp.com>
> ---
> This also fixes a missing module_put in the error path introduced
> by sysfs-only-allow-one-scheduled-removal-callback-per-kobj.patch.
>
> We never destroy the workqueue, but I'm not sure that's a
> problem.
> ---
> file.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
> ---
> diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
> index 289c43a..979e937 100644
> --- a/fs/sysfs/file.c
> +++ b/fs/sysfs/file.c
> @@ -667,6 +667,7 @@ struct sysfs_schedule_callback_struct {
> struct work_struct work;
> };
>
> +static struct workqueue_struct *sysfs_workqueue;
> static DEFINE_MUTEX(sysfs_workq_mutex);
> static LIST_HEAD(sysfs_workq);
> static void sysfs_schedule_callback_work(struct work_struct *work)
> @@ -715,11 +716,20 @@ int sysfs_schedule_callback(struct kobject *kobj, void (*func)(void *),
> mutex_lock(&sysfs_workq_mutex);
> list_for_each_entry_safe(ss, tmp, &sysfs_workq, workq_list)
> if (ss->kobj == kobj) {
> + module_put(owner);
> mutex_unlock(&sysfs_workq_mutex);
> return -EAGAIN;
> }
> mutex_unlock(&sysfs_workq_mutex);
>
> + if (sysfs_workqueue == NULL) {
> + sysfs_workqueue = create_workqueue("sysfsd");
> + if (sysfs_workqueue == NULL) {
> + module_put(owner);
> + return -ENOMEM;
> + }
> + }
> +
> ss = kmalloc(sizeof(*ss), GFP_KERNEL);
> if (!ss) {
> module_put(owner);
> @@ -735,7 +745,7 @@ int sysfs_schedule_callback(struct kobject *kobj, void (*func)(void *),
> mutex_lock(&sysfs_workq_mutex);
> list_add_tail(&ss->workq_list, &sysfs_workq);
> mutex_unlock(&sysfs_workq_mutex);
> - schedule_work(&ss->work);
> + queue_work(sysfs_workqueue, &ss->work);
> return 0;
> }
> EXPORT_SYMBOL_GPL(sysfs_schedule_callback);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] sysfs: don't use global workqueue in sysfs_schedule_callback()
2009-04-03 20:52 ` Alex Chiang
@ 2009-04-04 4:36 ` Greg KH
0 siblings, 0 replies; 3+ messages in thread
From: Greg KH @ 2009-04-04 4:36 UTC (permalink / raw)
To: Alex Chiang; +Cc: kaneshige.kenji, linux-kernel
On Fri, Apr 03, 2009 at 02:52:34PM -0600, Alex Chiang wrote:
> Hi Greg,
>
> You're probably still working through your email backlog, but
> just wanted to make sure this patch didn't get lost.
It's not lost, I'm digging through it all, and will get to it next week.
thanks,
greg k-h
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2009-04-04 4:53 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-03-25 21:11 [PATCH] sysfs: don't use global workqueue in sysfs_schedule_callback() Alex Chiang
2009-04-03 20:52 ` Alex Chiang
2009-04-04 4:36 ` Greg KH
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox