* lockdep report at resume @ 2009-02-07 16:01 Johannes Berg 2009-02-11 12:03 ` Jiri Kosina 2009-03-20 8:12 ` Johannes Berg 0 siblings, 2 replies; 21+ messages in thread From: Johannes Berg @ 2009-02-07 16:01 UTC (permalink / raw) To: Dmitry Torokhov; +Cc: linux-input Nothing special, just suspended/resumed with an external USB mouse, internal keyboard is USB as well: [20920.458587] ======================================================= [20920.458944] [ INFO: possible circular locking dependency detected ] [20920.459231] 2.6.29-rc3-wl-12787-g17d9115-dirty #38 [20920.459449] ------------------------------------------------------- [20920.459736] events/0/9 is trying to acquire lock: [20920.459951] (&dev->mutex){--..}, at: [<ffffffff804be0ac>] input_disconnect_device+0x2c/0xf0 [20920.460369] [20920.460370] but task is already holding lock: [20920.460637] (&usbhid->reset_work){--..}, at: [<ffffffff8025a933>] run_workqueue+0xb3/0x250 [20920.461049] [20920.461050] which lock already depends on the new lock. [20920.461051] [20920.461425] [20920.461425] the existing dependency chain (in reverse order) is: [20920.461767] [20920.461767] -> #4 (&usbhid->reset_work){--..}: [20920.462068] [<ffffffff80273c17>] check_prev_add+0x3b7/0x770 [20920.462361] [<ffffffff802745d6>] validate_chain+0x606/0x6c0 [20920.462653] [<ffffffff80274acf>] __lock_acquire+0x43f/0xa10 [20920.462945] [<ffffffff80275131>] lock_acquire+0x91/0xc0 [20920.463222] [<ffffffff8025a97f>] run_workqueue+0xff/0x250 [20920.463506] [<ffffffff8025ab7f>] worker_thread+0xaf/0x130 [20920.463791] [<ffffffff8025f1f9>] kthread+0x49/0x90 [20920.464048] [<ffffffff8020cdba>] child_rip+0xa/0x20 [20920.464310] [<ffffffffffffffff>] 0xffffffffffffffff [20920.464572] [20920.464572] -> #3 (events){--..}: [20920.464821] [<ffffffff80273c17>] check_prev_add+0x3b7/0x770 [20920.465113] [<ffffffff802745d6>] validate_chain+0x606/0x6c0 [20920.465405] [<ffffffff80274acf>] __lock_acquire+0x43f/0xa10 [20920.465697] [<ffffffff80275131>] lock_acquire+0x91/0xc0 [20920.465974] [<ffffffff8025ae0d>] cleanup_workqueue_thread+0x3d/0x80 [20920.466298] [<ffffffff805a4c45>] workqueue_cpu_callback+0x95/0x12a [20920.466619] [<ffffffff805b8dd5>] notifier_call_chain+0x65/0xa0 [20920.466924] [<ffffffff80264441>] raw_notifier_call_chain+0x11/0x20 [20920.467244] [<ffffffff805a2be3>] _cpu_down+0x1d3/0x350 [20920.467518] [<ffffffff802483fd>] disable_nonboot_cpus+0xdd/0x160 [20920.467831] [<ffffffff80284345>] hibernation_snapshot+0x135/0x170 [20920.468134] [<ffffffff80287b9d>] snapshot_ioctl+0x41d/0x610 [20920.468134] [<ffffffff802e8921>] vfs_ioctl+0x31/0xa0 [20920.468134] [<ffffffff802e8ce9>] do_vfs_ioctl+0x89/0x350 [20920.468134] [<ffffffff802e8ffa>] sys_ioctl+0x4a/0x80 [20920.468134] [<ffffffff8020bd1b>] system_call_fastpath+0x16/0x1b [20920.468134] [<ffffffffffffffff>] 0xffffffffffffffff [20920.468134] [20920.468134] -> #2 (cpu_add_remove_lock){--..}: [20920.468134] [<ffffffff80273c17>] check_prev_add+0x3b7/0x770 [20920.468134] [<ffffffff802745d6>] validate_chain+0x606/0x6c0 [20920.468134] [<ffffffff80274acf>] __lock_acquire+0x43f/0xa10 [20920.468134] [<ffffffff80275131>] lock_acquire+0x91/0xc0 [20920.468134] [<ffffffff805b328c>] mutex_lock_nested+0xec/0x380 [20920.468134] [<ffffffff80248292>] cpu_maps_update_begin+0x12/0x20 [20920.468134] [<ffffffff8025ae93>] destroy_workqueue+0x33/0xb0 [20920.468134] [<ffffffffa0783110>] input_close_polled_device+0x40/0x50 [input_polldev] [20920.468134] [<ffffffff804bdb27>] input_close_device+0x57/0x90 [20920.468134] [<ffffffffa028f1e9>] evdev_release+0xa9/0xd0 [evdev] [20920.468134] [<ffffffff802db470>] __fput+0xd0/0x1e0 [20920.468134] [<ffffffff802db59d>] fput+0x1d/0x30 [20920.468134] [<ffffffff802d7cf0>] filp_close+0x50/0x80 [20920.468134] [<ffffffff802d7dd9>] sys_close+0xb9/0x120 [20920.468134] [<ffffffff8020bd1b>] system_call_fastpath+0x16/0x1b [20920.468134] [<ffffffffffffffff>] 0xffffffffffffffff [20920.468134] [20920.468134] -> #1 (polldev_mutex){--..}: [20920.468134] [<ffffffff80273c17>] check_prev_add+0x3b7/0x770 [20920.468134] [<ffffffff802745d6>] validate_chain+0x606/0x6c0 [20920.468134] [<ffffffff80274acf>] __lock_acquire+0x43f/0xa10 [20920.468134] [<ffffffff80275131>] lock_acquire+0x91/0xc0 [20920.468134] [<ffffffff805b2e5c>] mutex_lock_interruptible_nested+0xec/0x430 [20920.468134] [<ffffffffa078313c>] input_open_polled_device+0x1c/0xc0 [input_polldev] [20920.468134] [<ffffffff804bdbf8>] input_open_device+0x98/0xc0 [20920.468134] [<ffffffffa028f460>] evdev_open+0x1c0/0x1e0 [evdev] [20920.468134] [<ffffffff804bd08f>] input_open_file+0xff/0x1e0 [20920.468134] [<ffffffff802dd397>] chrdev_open+0x137/0x210 [20920.468134] [<ffffffff802d80aa>] __dentry_open+0x10a/0x340 [20920.468134] [<ffffffff802d83e7>] nameidata_to_filp+0x57/0x70 [20920.468134] [<ffffffff802e702e>] do_filp_open+0x1fe/0x970 [20920.468134] [<ffffffff802d7eb8>] do_sys_open+0x78/0x110 [20920.468134] [<ffffffff802d7f7b>] sys_open+0x1b/0x20 [20920.468134] [<ffffffff8020bd1b>] system_call_fastpath+0x16/0x1b [20920.468134] [<ffffffffffffffff>] 0xffffffffffffffff [20920.468134] [20920.468134] -> #0 (&dev->mutex){--..}: [20920.468134] [<ffffffff802738b7>] check_prev_add+0x57/0x770 [20920.468134] [<ffffffff802745d6>] validate_chain+0x606/0x6c0 [20920.468134] [<ffffffff80274acf>] __lock_acquire+0x43f/0xa10 [20920.468134] [<ffffffff80275131>] lock_acquire+0x91/0xc0 [20920.468134] [<ffffffff805b328c>] mutex_lock_nested+0xec/0x380 [20920.468134] [<ffffffff804be0ac>] input_disconnect_device+0x2c/0xf0 [20920.468134] [<ffffffff804be18c>] input_unregister_device+0x1c/0x100 [20920.468134] [<ffffffffa03202f4>] bcm5974_disconnect+0x24/0x90 [bcm5974] [20920.468134] [<ffffffffa005c4f8>] usb_unbind_interface+0x68/0x170 [usbcore] [20920.468134] [<ffffffff80485eec>] __device_release_driver+0x7c/0xc0 [20920.468134] [<ffffffff8048603b>] device_release_driver+0x2b/0x40 [20920.468134] [<ffffffffa005c940>] usb_driver_release_interface+0xc0/0xe0 [usbcore] [20920.468134] [<ffffffffa005ca84>] usb_forced_unbind_intf+0x34/0x90 [usbcore] [20920.468134] [<ffffffffa0052035>] usb_reset_device+0xd5/0x220 [usbcore] [20920.468134] [<ffffffffa02d67ca>] hid_reset+0x18a/0x280 [usbhid] [20920.468134] [<ffffffff8025a985>] run_workqueue+0x105/0x250 [20920.468134] [<ffffffff8025ab7f>] worker_thread+0xaf/0x130 [20920.468134] [<ffffffff8025f1f9>] kthread+0x49/0x90 [20920.468134] [<ffffffff8020cdba>] child_rip+0xa/0x20 [20920.468134] [<ffffffffffffffff>] 0xffffffffffffffff [20920.468134] [20920.468134] other info that might help us debug this: [20920.468134] [20920.468134] 2 locks held by events/0/9: [20920.468134] #0: (events){--..}, at: [<ffffffff8025a933>] run_workqueue+0xb3/0x250 [20920.468134] #1: (&usbhid->reset_work){--..}, at: [<ffffffff8025a933>] run_workqueue+0xb3/0x250 [20920.468134] [20920.468134] stack backtrace: [20920.468134] Pid: 9, comm: events/0 Not tainted 2.6.29-rc3-wl-12787-g17d9115-dirty #38 [20920.468134] Call Trace: [20920.468134] [<ffffffff80273360>] print_circular_bug_tail+0xe0/0xf0 [20920.468134] [<ffffffff802738b7>] check_prev_add+0x57/0x770 [20920.468134] [<ffffffff80218afa>] ? save_stack_trace+0x2a/0x50 [20920.468134] [<ffffffff802745d6>] validate_chain+0x606/0x6c0 [20920.468134] [<ffffffff80274acf>] __lock_acquire+0x43f/0xa10 [20920.468134] [<ffffffff80274aed>] ? __lock_acquire+0x45d/0xa10 [20920.468134] [<ffffffff80275131>] lock_acquire+0x91/0xc0 [20920.468134] [<ffffffff804be0ac>] ? input_disconnect_device+0x2c/0xf0 [20920.468134] [<ffffffff805b328c>] mutex_lock_nested+0xec/0x380 [20920.468134] [<ffffffff804be0ac>] ? input_disconnect_device+0x2c/0xf0 [20920.468134] [<ffffffff80272d52>] ? trace_hardirqs_on_caller+0x182/0x1e0 [20920.468134] [<ffffffff804be0ac>] ? input_disconnect_device+0x2c/0xf0 [20920.468134] [<ffffffffa0056264>] ? usb_hcd_flush_endpoint+0x84/0x200 [usbcore] [20920.468134] [<ffffffff805b2b0d>] ? __mutex_unlock_slowpath+0x12d/0x1a0 [20920.468134] [<ffffffff804be0ac>] input_disconnect_device+0x2c/0xf0 [20920.468134] [<ffffffff804be18c>] input_unregister_device+0x1c/0x100 [20920.468134] [<ffffffffa03202f4>] bcm5974_disconnect+0x24/0x90 [bcm5974] [20920.468134] [<ffffffffa005c4f8>] usb_unbind_interface+0x68/0x170 [usbcore] [20920.468134] [<ffffffff80485eec>] __device_release_driver+0x7c/0xc0 [20920.468134] [<ffffffff8048603b>] device_release_driver+0x2b/0x40 [20920.468134] [<ffffffffa005c940>] usb_driver_release_interface+0xc0/0xe0 [usbcore] [20920.468134] [<ffffffffa005ca84>] usb_forced_unbind_intf+0x34/0x90 [usbcore] [20920.468134] [<ffffffffa0052035>] usb_reset_device+0xd5/0x220 [usbcore] [20920.468134] [<ffffffffa02d67ca>] hid_reset+0x18a/0x280 [usbhid] [20920.468134] [<ffffffffa02d6640>] ? hid_reset+0x0/0x280 [usbhid] [20920.468134] [<ffffffff8025a985>] run_workqueue+0x105/0x250 [20920.468134] [<ffffffff8025a933>] ? run_workqueue+0xb3/0x250 [20920.468134] [<ffffffff8025ab7f>] worker_thread+0xaf/0x130 [20920.468134] [<ffffffff8025f6c0>] ? autoremove_wake_function+0x0/0x40 [20920.468134] [<ffffffff8025aad0>] ? worker_thread+0x0/0x130 [20920.468134] [<ffffffff8025f1f9>] kthread+0x49/0x90 [20920.468134] [<ffffffff8020cdba>] child_rip+0xa/0x20 [20920.468134] [<ffffffff8020c780>] ? restore_args+0x0/0x30 [20920.468134] [<ffffffff8025f1b0>] ? kthread+0x0/0x90 [20920.468134] [<ffffffff8020cdb0>] ? child_rip+0x0/0x20 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: lockdep report at resume 2009-02-07 16:01 lockdep report at resume Johannes Berg @ 2009-02-11 12:03 ` Jiri Kosina 2009-02-11 14:33 ` Johannes Berg 2009-02-18 13:17 ` Johannes Berg 2009-03-20 8:12 ` Johannes Berg 1 sibling, 2 replies; 21+ messages in thread From: Jiri Kosina @ 2009-02-11 12:03 UTC (permalink / raw) To: Johannes Berg; +Cc: Dmitry Torokhov, linux-input, Jiri Slaby On Sat, 7 Feb 2009, Johannes Berg wrote: > Nothing special, just suspended/resumed with an external USB mouse, > internal keyboard is USB as well: > > [20920.458587] ======================================================= > [20920.458944] [ INFO: possible circular locking dependency detected ] > [20920.459231] 2.6.29-rc3-wl-12787-g17d9115-dirty #38 > [20920.459449] ------------------------------------------------------- > [20920.459736] events/0/9 is trying to acquire lock: > [20920.459951] (&dev->mutex){--..}, at: [<ffffffff804be0ac>] input_disconnect_device+0x2c/0xf0 > [20920.460369] > [20920.460370] but task is already holding lock: > [20920.460637] (&usbhid->reset_work){--..}, at: [<ffffffff8025a933>] run_workqueue+0xb3/0x250 > [20920.461049] > [20920.461050] which lock already depends on the new lock. > [20920.461051] > [20920.461425] > [20920.461425] the existing dependency chain (in reverse order) is: > [20920.461767] > [20920.461767] -> #4 (&usbhid->reset_work){--..}: > [20920.462068] [<ffffffff80273c17>] check_prev_add+0x3b7/0x770 > [20920.462361] [<ffffffff802745d6>] validate_chain+0x606/0x6c0 > [20920.462653] [<ffffffff80274acf>] __lock_acquire+0x43f/0xa10 > [20920.462945] [<ffffffff80275131>] lock_acquire+0x91/0xc0 > [20920.463222] [<ffffffff8025a97f>] run_workqueue+0xff/0x250 > [20920.463506] [<ffffffff8025ab7f>] worker_thread+0xaf/0x130 > [20920.463791] [<ffffffff8025f1f9>] kthread+0x49/0x90 > [20920.464048] [<ffffffff8020cdba>] child_rip+0xa/0x20 > [20920.464310] [<ffffffffffffffff>] 0xffffffffffffffff > [20920.464572] > [20920.464572] -> #3 (events){--..}: > [20920.464821] [<ffffffff80273c17>] check_prev_add+0x3b7/0x770 > [20920.465113] [<ffffffff802745d6>] validate_chain+0x606/0x6c0 > [20920.465405] [<ffffffff80274acf>] __lock_acquire+0x43f/0xa10 > [20920.465697] [<ffffffff80275131>] lock_acquire+0x91/0xc0 > [20920.465974] [<ffffffff8025ae0d>] cleanup_workqueue_thread+0x3d/0x80 > [20920.466298] [<ffffffff805a4c45>] workqueue_cpu_callback+0x95/0x12a > [20920.466619] [<ffffffff805b8dd5>] notifier_call_chain+0x65/0xa0 > [20920.466924] [<ffffffff80264441>] raw_notifier_call_chain+0x11/0x20 > [20920.467244] [<ffffffff805a2be3>] _cpu_down+0x1d3/0x350 > [20920.467518] [<ffffffff802483fd>] disable_nonboot_cpus+0xdd/0x160 > [20920.467831] [<ffffffff80284345>] hibernation_snapshot+0x135/0x170 > [20920.468134] [<ffffffff80287b9d>] snapshot_ioctl+0x41d/0x610 > [20920.468134] [<ffffffff802e8921>] vfs_ioctl+0x31/0xa0 > [20920.468134] [<ffffffff802e8ce9>] do_vfs_ioctl+0x89/0x350 > [20920.468134] [<ffffffff802e8ffa>] sys_ioctl+0x4a/0x80 > [20920.468134] [<ffffffff8020bd1b>] system_call_fastpath+0x16/0x1b > [20920.468134] [<ffffffffffffffff>] 0xffffffffffffffff > [20920.468134] > [20920.468134] -> #2 (cpu_add_remove_lock){--..}: > [20920.468134] [<ffffffff80273c17>] check_prev_add+0x3b7/0x770 > [20920.468134] [<ffffffff802745d6>] validate_chain+0x606/0x6c0 > [20920.468134] [<ffffffff80274acf>] __lock_acquire+0x43f/0xa10 > [20920.468134] [<ffffffff80275131>] lock_acquire+0x91/0xc0 > [20920.468134] [<ffffffff805b328c>] mutex_lock_nested+0xec/0x380 > [20920.468134] [<ffffffff80248292>] cpu_maps_update_begin+0x12/0x20 > [20920.468134] [<ffffffff8025ae93>] destroy_workqueue+0x33/0xb0 > [20920.468134] [<ffffffffa0783110>] input_close_polled_device+0x40/0x50 [input_polldev] > [20920.468134] [<ffffffff804bdb27>] input_close_device+0x57/0x90 > [20920.468134] [<ffffffffa028f1e9>] evdev_release+0xa9/0xd0 [evdev] > [20920.468134] [<ffffffff802db470>] __fput+0xd0/0x1e0 > [20920.468134] [<ffffffff802db59d>] fput+0x1d/0x30 > [20920.468134] [<ffffffff802d7cf0>] filp_close+0x50/0x80 > [20920.468134] [<ffffffff802d7dd9>] sys_close+0xb9/0x120 > [20920.468134] [<ffffffff8020bd1b>] system_call_fastpath+0x16/0x1b > [20920.468134] [<ffffffffffffffff>] 0xffffffffffffffff > [20920.468134] > [20920.468134] -> #1 (polldev_mutex){--..}: > [20920.468134] [<ffffffff80273c17>] check_prev_add+0x3b7/0x770 > [20920.468134] [<ffffffff802745d6>] validate_chain+0x606/0x6c0 > [20920.468134] [<ffffffff80274acf>] __lock_acquire+0x43f/0xa10 > [20920.468134] [<ffffffff80275131>] lock_acquire+0x91/0xc0 > [20920.468134] [<ffffffff805b2e5c>] mutex_lock_interruptible_nested+0xec/0x430 > [20920.468134] [<ffffffffa078313c>] input_open_polled_device+0x1c/0xc0 [input_polldev] > [20920.468134] [<ffffffff804bdbf8>] input_open_device+0x98/0xc0 > [20920.468134] [<ffffffffa028f460>] evdev_open+0x1c0/0x1e0 [evdev] > [20920.468134] [<ffffffff804bd08f>] input_open_file+0xff/0x1e0 > [20920.468134] [<ffffffff802dd397>] chrdev_open+0x137/0x210 > [20920.468134] [<ffffffff802d80aa>] __dentry_open+0x10a/0x340 > [20920.468134] [<ffffffff802d83e7>] nameidata_to_filp+0x57/0x70 > [20920.468134] [<ffffffff802e702e>] do_filp_open+0x1fe/0x970 > [20920.468134] [<ffffffff802d7eb8>] do_sys_open+0x78/0x110 > [20920.468134] [<ffffffff802d7f7b>] sys_open+0x1b/0x20 > [20920.468134] [<ffffffff8020bd1b>] system_call_fastpath+0x16/0x1b > [20920.468134] [<ffffffffffffffff>] 0xffffffffffffffff > [20920.468134] > [20920.468134] -> #0 (&dev->mutex){--..}: > [20920.468134] [<ffffffff802738b7>] check_prev_add+0x57/0x770 > [20920.468134] [<ffffffff802745d6>] validate_chain+0x606/0x6c0 > [20920.468134] [<ffffffff80274acf>] __lock_acquire+0x43f/0xa10 > [20920.468134] [<ffffffff80275131>] lock_acquire+0x91/0xc0 > [20920.468134] [<ffffffff805b328c>] mutex_lock_nested+0xec/0x380 > [20920.468134] [<ffffffff804be0ac>] input_disconnect_device+0x2c/0xf0 > [20920.468134] [<ffffffff804be18c>] input_unregister_device+0x1c/0x100 > [20920.468134] [<ffffffffa03202f4>] bcm5974_disconnect+0x24/0x90 [bcm5974] > [20920.468134] [<ffffffffa005c4f8>] usb_unbind_interface+0x68/0x170 [usbcore] > [20920.468134] [<ffffffff80485eec>] __device_release_driver+0x7c/0xc0 > [20920.468134] [<ffffffff8048603b>] device_release_driver+0x2b/0x40 > [20920.468134] [<ffffffffa005c940>] usb_driver_release_interface+0xc0/0xe0 [usbcore] > [20920.468134] [<ffffffffa005ca84>] usb_forced_unbind_intf+0x34/0x90 [usbcore] > [20920.468134] [<ffffffffa0052035>] usb_reset_device+0xd5/0x220 [usbcore] > [20920.468134] [<ffffffffa02d67ca>] hid_reset+0x18a/0x280 [usbhid] > [20920.468134] [<ffffffff8025a985>] run_workqueue+0x105/0x250 > [20920.468134] [<ffffffff8025ab7f>] worker_thread+0xaf/0x130 > [20920.468134] [<ffffffff8025f1f9>] kthread+0x49/0x90 > [20920.468134] [<ffffffff8020cdba>] child_rip+0xa/0x20 > [20920.468134] [<ffffffffffffffff>] 0xffffffffffffffff > [20920.468134] > [20920.468134] other info that might help us debug this: > [20920.468134] > [20920.468134] 2 locks held by events/0/9: > [20920.468134] #0: (events){--..}, at: [<ffffffff8025a933>] run_workqueue+0xb3/0x250 > [20920.468134] #1: (&usbhid->reset_work){--..}, at: [<ffffffff8025a933>] run_workqueue+0xb3/0x250 > [20920.468134] > [20920.468134] stack backtrace: > [20920.468134] Pid: 9, comm: events/0 Not tainted 2.6.29-rc3-wl-12787-g17d9115-dirty #38 > [20920.468134] Call Trace: > [20920.468134] [<ffffffff80273360>] print_circular_bug_tail+0xe0/0xf0 > [20920.468134] [<ffffffff802738b7>] check_prev_add+0x57/0x770 > [20920.468134] [<ffffffff80218afa>] ? save_stack_trace+0x2a/0x50 > [20920.468134] [<ffffffff802745d6>] validate_chain+0x606/0x6c0 > [20920.468134] [<ffffffff80274acf>] __lock_acquire+0x43f/0xa10 > [20920.468134] [<ffffffff80274aed>] ? __lock_acquire+0x45d/0xa10 > [20920.468134] [<ffffffff80275131>] lock_acquire+0x91/0xc0 > [20920.468134] [<ffffffff804be0ac>] ? input_disconnect_device+0x2c/0xf0 > [20920.468134] [<ffffffff805b328c>] mutex_lock_nested+0xec/0x380 > [20920.468134] [<ffffffff804be0ac>] ? input_disconnect_device+0x2c/0xf0 > [20920.468134] [<ffffffff80272d52>] ? trace_hardirqs_on_caller+0x182/0x1e0 > [20920.468134] [<ffffffff804be0ac>] ? input_disconnect_device+0x2c/0xf0 > [20920.468134] [<ffffffffa0056264>] ? usb_hcd_flush_endpoint+0x84/0x200 [usbcore] > [20920.468134] [<ffffffff805b2b0d>] ? __mutex_unlock_slowpath+0x12d/0x1a0 > [20920.468134] [<ffffffff804be0ac>] input_disconnect_device+0x2c/0xf0 > [20920.468134] [<ffffffff804be18c>] input_unregister_device+0x1c/0x100 > [20920.468134] [<ffffffffa03202f4>] bcm5974_disconnect+0x24/0x90 [bcm5974] > [20920.468134] [<ffffffffa005c4f8>] usb_unbind_interface+0x68/0x170 [usbcore] > [20920.468134] [<ffffffff80485eec>] __device_release_driver+0x7c/0xc0 > [20920.468134] [<ffffffff8048603b>] device_release_driver+0x2b/0x40 > [20920.468134] [<ffffffffa005c940>] usb_driver_release_interface+0xc0/0xe0 [usbcore] > [20920.468134] [<ffffffffa005ca84>] usb_forced_unbind_intf+0x34/0x90 [usbcore] > [20920.468134] [<ffffffffa0052035>] usb_reset_device+0xd5/0x220 [usbcore] > [20920.468134] [<ffffffffa02d67ca>] hid_reset+0x18a/0x280 [usbhid] > [20920.468134] [<ffffffffa02d6640>] ? hid_reset+0x0/0x280 [usbhid] > [20920.468134] [<ffffffff8025a985>] run_workqueue+0x105/0x250 > [20920.468134] [<ffffffff8025a933>] ? run_workqueue+0xb3/0x250 > [20920.468134] [<ffffffff8025ab7f>] worker_thread+0xaf/0x130 > [20920.468134] [<ffffffff8025f6c0>] ? autoremove_wake_function+0x0/0x40 > [20920.468134] [<ffffffff8025aad0>] ? worker_thread+0x0/0x130 > [20920.468134] [<ffffffff8025f1f9>] kthread+0x49/0x90 > [20920.468134] [<ffffffff8020cdba>] child_rip+0xa/0x20 > [20920.468134] [<ffffffff8020c780>] ? restore_args+0x0/0x30 > [20920.468134] [<ffffffff8025f1b0>] ? kthread+0x0/0x90 > [20920.468134] [<ffffffff8020cdb0>] ? child_rip+0x0/0x20 OK, so we are calling usb_reset_device() from hid_reset() workqueue. You have a composite device, mouse interface being handled by bcm5974 driver, and keyboard interface being handled by usbhid driver. bcm5974 driver doesn't have _pre_reset() and _post_reset() callbacks, so full unbind/bind cycle is performed by USB core. bcm5974_disconnect() calls input_unregister_device(), which creates reset_work -> input_dev->mutex dependency. Unfortunately I don't seem to see from the lockdep dumps where does the input_dev->mutex -> reset_work dependency come from, am I just completely blind? I will look into this. Thanks, -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: lockdep report at resume 2009-02-11 12:03 ` Jiri Kosina @ 2009-02-11 14:33 ` Johannes Berg 2009-02-18 13:17 ` Johannes Berg 1 sibling, 0 replies; 21+ messages in thread From: Johannes Berg @ 2009-02-11 14:33 UTC (permalink / raw) To: Jiri Kosina; +Cc: Dmitry Torokhov, linux-input, Jiri Slaby [-- Attachment #1: Type: text/plain, Size: 2098 bytes --] On Wed, 2009-02-11 at 13:03 +0100, Jiri Kosina wrote: > > [20920.459736] events/0/9 is trying to acquire lock: > > [20920.459951] (&dev->mutex){--..}, at: [<ffffffff804be0ac>] input_disconnect_device+0x2c/0xf0 > > [20920.460369] > > [20920.460370] but task is already holding lock: > > [20920.460637] (&usbhid->reset_work){--..}, at: [<ffffffff8025a933>] run_workqueue+0xb3/0x250 > > [20920.461049] > > [20920.461050] which lock already depends on the new lock. > > [20920.461051] > > [20920.461425] > > [20920.461425] the existing dependency chain (in reverse order) is: > > [20920.461767] > > [20920.461767] -> #4 (&usbhid->reset_work){--..}: > > [20920.464572] > > [20920.464572] -> #3 (events){--..}: > > [20920.468134] > > [20920.468134] -> #2 (cpu_add_remove_lock){--..}: > > [20920.468134] > > [20920.468134] -> #1 (polldev_mutex){--..}: > > [20920.468134] > > [20920.468134] -> #0 (&dev->mutex){--..}: > OK, so we are calling usb_reset_device() from hid_reset() workqueue. You > have a composite device, mouse interface being handled by bcm5974 driver, > and keyboard interface being handled by usbhid driver. Right. > bcm5974 driver doesn't have _pre_reset() and _post_reset() callbacks, so > full unbind/bind cycle is performed by USB core. bcm5974_disconnect() > calls input_unregister_device(), which creates reset_work -> > input_dev->mutex dependency. Yup > Unfortunately I don't seem to see from the lockdep dumps where does the > input_dev->mutex -> reset_work dependency come from, am I just completely > blind? Well, there's the entire chain of dependencies, so it's via polldev, and polldev depends on cpu_add_remove_lock, etc. So it's across a whole lot of different subsystems... the dependency is quite complex here! There's probably no direct dependency of input_dev->mutex -> reset_work, but reset_work is on the global workqueue ("events") which has the dependency via some _other_ work that some _other_ subsystem is also giving to schedule_work()! > I will look into this. Thanks. johannes [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: lockdep report at resume 2009-02-11 12:03 ` Jiri Kosina 2009-02-11 14:33 ` Johannes Berg @ 2009-02-18 13:17 ` Johannes Berg 1 sibling, 0 replies; 21+ messages in thread From: Johannes Berg @ 2009-02-18 13:17 UTC (permalink / raw) To: Jiri Kosina; +Cc: Dmitry Torokhov, linux-input, Jiri Slaby [-- Attachment #1: Type: text/plain, Size: 988 bytes --] On Wed, 2009-02-11 at 13:03 +0100, Jiri Kosina wrote: > OK, so we are calling usb_reset_device() from hid_reset() workqueue. You > have a composite device, mouse interface being handled by bcm5974 driver, > and keyboard interface being handled by usbhid driver. > > bcm5974 driver doesn't have _pre_reset() and _post_reset() callbacks, so > full unbind/bind cycle is performed by USB core. bcm5974_disconnect() > calls input_unregister_device(), which creates reset_work -> > input_dev->mutex dependency. > > Unfortunately I don't seem to see from the lockdep dumps where does the > input_dev->mutex -> reset_work dependency come from, am I just completely > blind? Any news on this? I just got it again on -rc5, entirely the same report, though it seems I need to have a mouse plugged in, then hibernate, then unplug mouse and finally resume to get this report. IOW it doesn't seem to happen if I don't change the hid devices during hibernate. johannes [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: lockdep report at resume 2009-02-07 16:01 lockdep report at resume Johannes Berg 2009-02-11 12:03 ` Jiri Kosina @ 2009-03-20 8:12 ` Johannes Berg 2009-03-20 14:37 ` Jiri Kosina 1 sibling, 1 reply; 21+ messages in thread From: Johannes Berg @ 2009-03-20 8:12 UTC (permalink / raw) To: Dmitry Torokhov; +Cc: linux-input, Jiri Kosina [-- Attachment #1: Type: text/plain, Size: 10207 bytes --] On Sat, 2009-02-07 at 17:02 +0100, Johannes Berg wrote: > Nothing special, just suspended/resumed with an external USB mouse, > internal keyboard is USB as well: > > [20920.458587] ======================================================= > [20920.458944] [ INFO: possible circular locking dependency detected ] > [20920.459231] 2.6.29-rc3-wl-12787-g17d9115-dirty #38 > [20920.459449] ------------------------------------------------------- > [20920.459736] events/0/9 is trying to acquire lock: > [20920.459951] (&dev->mutex){--..}, at: [<ffffffff804be0ac>] input_disconnect_device+0x2c/0xf0 > [20920.460369] > [20920.460370] but task is already holding lock: > [20920.460637] (&usbhid->reset_work){--..}, at: [<ffffffff8025a933>] run_workqueue+0xb3/0x250 > [20920.461049] > [20920.461050] which lock already depends on the new lock. Ping? I just ran into it again on 2.6.29-rc8-wl-18593-gef1cb6f-dirty. > [20920.461425] > [20920.461425] the existing dependency chain (in reverse order) is: > [20920.461767] > [20920.461767] -> #4 (&usbhid->reset_work){--..}: > [20920.462068] [<ffffffff80273c17>] check_prev_add+0x3b7/0x770 > [20920.462361] [<ffffffff802745d6>] validate_chain+0x606/0x6c0 > [20920.462653] [<ffffffff80274acf>] __lock_acquire+0x43f/0xa10 > [20920.462945] [<ffffffff80275131>] lock_acquire+0x91/0xc0 > [20920.463222] [<ffffffff8025a97f>] run_workqueue+0xff/0x250 > [20920.463506] [<ffffffff8025ab7f>] worker_thread+0xaf/0x130 > [20920.463791] [<ffffffff8025f1f9>] kthread+0x49/0x90 > [20920.464048] [<ffffffff8020cdba>] child_rip+0xa/0x20 > [20920.464310] [<ffffffffffffffff>] 0xffffffffffffffff > [20920.464572] > [20920.464572] -> #3 (events){--..}: > [20920.464821] [<ffffffff80273c17>] check_prev_add+0x3b7/0x770 > [20920.465113] [<ffffffff802745d6>] validate_chain+0x606/0x6c0 > [20920.465405] [<ffffffff80274acf>] __lock_acquire+0x43f/0xa10 > [20920.465697] [<ffffffff80275131>] lock_acquire+0x91/0xc0 > [20920.465974] [<ffffffff8025ae0d>] cleanup_workqueue_thread+0x3d/0x80 > [20920.466298] [<ffffffff805a4c45>] workqueue_cpu_callback+0x95/0x12a > [20920.466619] [<ffffffff805b8dd5>] notifier_call_chain+0x65/0xa0 > [20920.466924] [<ffffffff80264441>] raw_notifier_call_chain+0x11/0x20 > [20920.467244] [<ffffffff805a2be3>] _cpu_down+0x1d3/0x350 > [20920.467518] [<ffffffff802483fd>] disable_nonboot_cpus+0xdd/0x160 > [20920.467831] [<ffffffff80284345>] hibernation_snapshot+0x135/0x170 > [20920.468134] [<ffffffff80287b9d>] snapshot_ioctl+0x41d/0x610 > [20920.468134] [<ffffffff802e8921>] vfs_ioctl+0x31/0xa0 > [20920.468134] [<ffffffff802e8ce9>] do_vfs_ioctl+0x89/0x350 > [20920.468134] [<ffffffff802e8ffa>] sys_ioctl+0x4a/0x80 > [20920.468134] [<ffffffff8020bd1b>] system_call_fastpath+0x16/0x1b > [20920.468134] [<ffffffffffffffff>] 0xffffffffffffffff > [20920.468134] > [20920.468134] -> #2 (cpu_add_remove_lock){--..}: > [20920.468134] [<ffffffff80273c17>] check_prev_add+0x3b7/0x770 > [20920.468134] [<ffffffff802745d6>] validate_chain+0x606/0x6c0 > [20920.468134] [<ffffffff80274acf>] __lock_acquire+0x43f/0xa10 > [20920.468134] [<ffffffff80275131>] lock_acquire+0x91/0xc0 > [20920.468134] [<ffffffff805b328c>] mutex_lock_nested+0xec/0x380 > [20920.468134] [<ffffffff80248292>] cpu_maps_update_begin+0x12/0x20 > [20920.468134] [<ffffffff8025ae93>] destroy_workqueue+0x33/0xb0 > [20920.468134] [<ffffffffa0783110>] input_close_polled_device+0x40/0x50 [input_polldev] > [20920.468134] [<ffffffff804bdb27>] input_close_device+0x57/0x90 > [20920.468134] [<ffffffffa028f1e9>] evdev_release+0xa9/0xd0 [evdev] > [20920.468134] [<ffffffff802db470>] __fput+0xd0/0x1e0 > [20920.468134] [<ffffffff802db59d>] fput+0x1d/0x30 > [20920.468134] [<ffffffff802d7cf0>] filp_close+0x50/0x80 > [20920.468134] [<ffffffff802d7dd9>] sys_close+0xb9/0x120 > [20920.468134] [<ffffffff8020bd1b>] system_call_fastpath+0x16/0x1b > [20920.468134] [<ffffffffffffffff>] 0xffffffffffffffff > [20920.468134] > [20920.468134] -> #1 (polldev_mutex){--..}: > [20920.468134] [<ffffffff80273c17>] check_prev_add+0x3b7/0x770 > [20920.468134] [<ffffffff802745d6>] validate_chain+0x606/0x6c0 > [20920.468134] [<ffffffff80274acf>] __lock_acquire+0x43f/0xa10 > [20920.468134] [<ffffffff80275131>] lock_acquire+0x91/0xc0 > [20920.468134] [<ffffffff805b2e5c>] mutex_lock_interruptible_nested+0xec/0x430 > [20920.468134] [<ffffffffa078313c>] input_open_polled_device+0x1c/0xc0 [input_polldev] > [20920.468134] [<ffffffff804bdbf8>] input_open_device+0x98/0xc0 > [20920.468134] [<ffffffffa028f460>] evdev_open+0x1c0/0x1e0 [evdev] > [20920.468134] [<ffffffff804bd08f>] input_open_file+0xff/0x1e0 > [20920.468134] [<ffffffff802dd397>] chrdev_open+0x137/0x210 > [20920.468134] [<ffffffff802d80aa>] __dentry_open+0x10a/0x340 > [20920.468134] [<ffffffff802d83e7>] nameidata_to_filp+0x57/0x70 > [20920.468134] [<ffffffff802e702e>] do_filp_open+0x1fe/0x970 > [20920.468134] [<ffffffff802d7eb8>] do_sys_open+0x78/0x110 > [20920.468134] [<ffffffff802d7f7b>] sys_open+0x1b/0x20 > [20920.468134] [<ffffffff8020bd1b>] system_call_fastpath+0x16/0x1b > [20920.468134] [<ffffffffffffffff>] 0xffffffffffffffff > [20920.468134] > [20920.468134] -> #0 (&dev->mutex){--..}: > [20920.468134] [<ffffffff802738b7>] check_prev_add+0x57/0x770 > [20920.468134] [<ffffffff802745d6>] validate_chain+0x606/0x6c0 > [20920.468134] [<ffffffff80274acf>] __lock_acquire+0x43f/0xa10 > [20920.468134] [<ffffffff80275131>] lock_acquire+0x91/0xc0 > [20920.468134] [<ffffffff805b328c>] mutex_lock_nested+0xec/0x380 > [20920.468134] [<ffffffff804be0ac>] input_disconnect_device+0x2c/0xf0 > [20920.468134] [<ffffffff804be18c>] input_unregister_device+0x1c/0x100 > [20920.468134] [<ffffffffa03202f4>] bcm5974_disconnect+0x24/0x90 [bcm5974] > [20920.468134] [<ffffffffa005c4f8>] usb_unbind_interface+0x68/0x170 [usbcore] > [20920.468134] [<ffffffff80485eec>] __device_release_driver+0x7c/0xc0 > [20920.468134] [<ffffffff8048603b>] device_release_driver+0x2b/0x40 > [20920.468134] [<ffffffffa005c940>] usb_driver_release_interface+0xc0/0xe0 [usbcore] > [20920.468134] [<ffffffffa005ca84>] usb_forced_unbind_intf+0x34/0x90 [usbcore] > [20920.468134] [<ffffffffa0052035>] usb_reset_device+0xd5/0x220 [usbcore] > [20920.468134] [<ffffffffa02d67ca>] hid_reset+0x18a/0x280 [usbhid] > [20920.468134] [<ffffffff8025a985>] run_workqueue+0x105/0x250 > [20920.468134] [<ffffffff8025ab7f>] worker_thread+0xaf/0x130 > [20920.468134] [<ffffffff8025f1f9>] kthread+0x49/0x90 > [20920.468134] [<ffffffff8020cdba>] child_rip+0xa/0x20 > [20920.468134] [<ffffffffffffffff>] 0xffffffffffffffff > [20920.468134] > [20920.468134] other info that might help us debug this: > [20920.468134] > [20920.468134] 2 locks held by events/0/9: > [20920.468134] #0: (events){--..}, at: [<ffffffff8025a933>] run_workqueue+0xb3/0x250 > [20920.468134] #1: (&usbhid->reset_work){--..}, at: [<ffffffff8025a933>] run_workqueue+0xb3/0x250 > [20920.468134] > [20920.468134] stack backtrace: > [20920.468134] Pid: 9, comm: events/0 Not tainted 2.6.29-rc3-wl-12787-g17d9115-dirty #38 > [20920.468134] Call Trace: > [20920.468134] [<ffffffff80273360>] print_circular_bug_tail+0xe0/0xf0 > [20920.468134] [<ffffffff802738b7>] check_prev_add+0x57/0x770 > [20920.468134] [<ffffffff80218afa>] ? save_stack_trace+0x2a/0x50 > [20920.468134] [<ffffffff802745d6>] validate_chain+0x606/0x6c0 > [20920.468134] [<ffffffff80274acf>] __lock_acquire+0x43f/0xa10 > [20920.468134] [<ffffffff80274aed>] ? __lock_acquire+0x45d/0xa10 > [20920.468134] [<ffffffff80275131>] lock_acquire+0x91/0xc0 > [20920.468134] [<ffffffff804be0ac>] ? input_disconnect_device+0x2c/0xf0 > [20920.468134] [<ffffffff805b328c>] mutex_lock_nested+0xec/0x380 > [20920.468134] [<ffffffff804be0ac>] ? input_disconnect_device+0x2c/0xf0 > [20920.468134] [<ffffffff80272d52>] ? trace_hardirqs_on_caller+0x182/0x1e0 > [20920.468134] [<ffffffff804be0ac>] ? input_disconnect_device+0x2c/0xf0 > [20920.468134] [<ffffffffa0056264>] ? usb_hcd_flush_endpoint+0x84/0x200 [usbcore] > [20920.468134] [<ffffffff805b2b0d>] ? __mutex_unlock_slowpath+0x12d/0x1a0 > [20920.468134] [<ffffffff804be0ac>] input_disconnect_device+0x2c/0xf0 > [20920.468134] [<ffffffff804be18c>] input_unregister_device+0x1c/0x100 > [20920.468134] [<ffffffffa03202f4>] bcm5974_disconnect+0x24/0x90 [bcm5974] > [20920.468134] [<ffffffffa005c4f8>] usb_unbind_interface+0x68/0x170 [usbcore] > [20920.468134] [<ffffffff80485eec>] __device_release_driver+0x7c/0xc0 > [20920.468134] [<ffffffff8048603b>] device_release_driver+0x2b/0x40 > [20920.468134] [<ffffffffa005c940>] usb_driver_release_interface+0xc0/0xe0 [usbcore] > [20920.468134] [<ffffffffa005ca84>] usb_forced_unbind_intf+0x34/0x90 [usbcore] > [20920.468134] [<ffffffffa0052035>] usb_reset_device+0xd5/0x220 [usbcore] > [20920.468134] [<ffffffffa02d67ca>] hid_reset+0x18a/0x280 [usbhid] > [20920.468134] [<ffffffffa02d6640>] ? hid_reset+0x0/0x280 [usbhid] > [20920.468134] [<ffffffff8025a985>] run_workqueue+0x105/0x250 > [20920.468134] [<ffffffff8025a933>] ? run_workqueue+0xb3/0x250 > [20920.468134] [<ffffffff8025ab7f>] worker_thread+0xaf/0x130 > [20920.468134] [<ffffffff8025f6c0>] ? autoremove_wake_function+0x0/0x40 > [20920.468134] [<ffffffff8025aad0>] ? worker_thread+0x0/0x130 > [20920.468134] [<ffffffff8025f1f9>] kthread+0x49/0x90 > [20920.468134] [<ffffffff8020cdba>] child_rip+0xa/0x20 > [20920.468134] [<ffffffff8020c780>] ? restore_args+0x0/0x30 > [20920.468134] [<ffffffff8025f1b0>] ? kthread+0x0/0x90 > [20920.468134] [<ffffffff8020cdb0>] ? child_rip+0x0/0x20 > [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: lockdep report at resume 2009-03-20 8:12 ` Johannes Berg @ 2009-03-20 14:37 ` Jiri Kosina 2009-03-20 14:39 ` Johannes Berg 2009-03-23 9:29 ` Johannes Berg 0 siblings, 2 replies; 21+ messages in thread From: Jiri Kosina @ 2009-03-20 14:37 UTC (permalink / raw) To: Johannes Berg; +Cc: Dmitry Torokhov, linux-input On Fri, 20 Mar 2009, Johannes Berg wrote: > > [20920.458587] ======================================================= > > [20920.458944] [ INFO: possible circular locking dependency detected ] > > [20920.459231] 2.6.29-rc3-wl-12787-g17d9115-dirty #38 > > [20920.459449] ------------------------------------------------------- > > [20920.459736] events/0/9 is trying to acquire lock: > > [20920.459951] (&dev->mutex){--..}, at: [<ffffffff804be0ac>] input_disconnect_device+0x2c/0xf0 > > [20920.460369] > > [20920.460370] but task is already holding lock: > > [20920.460637] (&usbhid->reset_work){--..}, at: [<ffffffff8025a933>] run_workqueue+0xb3/0x250 > > [20920.461049] > > [20920.461050] which lock already depends on the new lock. > Ping? I just ran into it again on 2.6.29-rc8-wl-18593-gef1cb6f-dirty. I haven't still fully gone through the whole dependency chain, was quite overloaded lately, sorry. There is a patch from Oliver Neukum, that fixes workqueue usage semantics around hid_reset() on [1], could you please check whether it also fixes the report you are seeing? Thanks. [1] http://lkml.org/lkml/2009/3/18/39 -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: lockdep report at resume 2009-03-20 14:37 ` Jiri Kosina @ 2009-03-20 14:39 ` Johannes Berg 2009-03-23 9:29 ` Johannes Berg 1 sibling, 0 replies; 21+ messages in thread From: Johannes Berg @ 2009-03-20 14:39 UTC (permalink / raw) To: Jiri Kosina; +Cc: Dmitry Torokhov, linux-input [-- Attachment #1: Type: text/plain, Size: 1479 bytes --] On Fri, 2009-03-20 at 15:37 +0100, Jiri Kosina wrote: > On Fri, 20 Mar 2009, Johannes Berg wrote: > > > > [20920.458587] ======================================================= > > > [20920.458944] [ INFO: possible circular locking dependency detected ] > > > [20920.459231] 2.6.29-rc3-wl-12787-g17d9115-dirty #38 > > > [20920.459449] ------------------------------------------------------- > > > [20920.459736] events/0/9 is trying to acquire lock: > > > [20920.459951] (&dev->mutex){--..}, at: [<ffffffff804be0ac>] input_disconnect_device+0x2c/0xf0 > > > [20920.460369] > > > [20920.460370] but task is already holding lock: > > > [20920.460637] (&usbhid->reset_work){--..}, at: [<ffffffff8025a933>] run_workqueue+0xb3/0x250 > > > [20920.461049] > > > [20920.461050] which lock already depends on the new lock. > > Ping? I just ran into it again on 2.6.29-rc8-wl-18593-gef1cb6f-dirty. > > I haven't still fully gone through the whole dependency chain, was quite > overloaded lately, sorry. > > There is a patch from Oliver Neukum, that fixes workqueue usage semantics > around hid_reset() on [1], could you please check whether it also fixes > the report you are seeing? Thanks. I can't seem to reproduce this completely reliably for some reason (or it's because my own development sometimes runs into lock problems elsewhere and lockdep disables itself), but I'll try to find a way to reproduce it and test the patch, thanks. johannes [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: lockdep report at resume 2009-03-20 14:37 ` Jiri Kosina 2009-03-20 14:39 ` Johannes Berg @ 2009-03-23 9:29 ` Johannes Berg 2009-03-31 8:36 ` Jiri Kosina 1 sibling, 1 reply; 21+ messages in thread From: Johannes Berg @ 2009-03-23 9:29 UTC (permalink / raw) To: Jiri Kosina; +Cc: Dmitry Torokhov, linux-input [-- Attachment #1: Type: text/plain, Size: 1327 bytes --] On Fri, 2009-03-20 at 15:37 +0100, Jiri Kosina wrote: > On Fri, 20 Mar 2009, Johannes Berg wrote: > > > > [20920.458587] ======================================================= > > > [20920.458944] [ INFO: possible circular locking dependency detected ] > > > [20920.459231] 2.6.29-rc3-wl-12787-g17d9115-dirty #38 > > > [20920.459449] ------------------------------------------------------- > > > [20920.459736] events/0/9 is trying to acquire lock: > > > [20920.459951] (&dev->mutex){--..}, at: [<ffffffff804be0ac>] input_disconnect_device+0x2c/0xf0 > > > [20920.460369] > > > [20920.460370] but task is already holding lock: > > > [20920.460637] (&usbhid->reset_work){--..}, at: [<ffffffff8025a933>] run_workqueue+0xb3/0x250 > > > [20920.461049] > > > [20920.461050] which lock already depends on the new lock. > > Ping? I just ran into it again on 2.6.29-rc8-wl-18593-gef1cb6f-dirty. > > I haven't still fully gone through the whole dependency chain, was quite > overloaded lately, sorry. > > There is a patch from Oliver Neukum, that fixes workqueue usage semantics > around hid_reset() on [1], could you please check whether it also fixes > the report you are seeing? Thanks. > > [1] http://lkml.org/lkml/2009/3/18/39 That patch doesn't even apply against a current tree. johannes [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: lockdep report at resume 2009-03-23 9:29 ` Johannes Berg @ 2009-03-31 8:36 ` Jiri Kosina 2009-03-31 8:38 ` Johannes Berg 0 siblings, 1 reply; 21+ messages in thread From: Jiri Kosina @ 2009-03-31 8:36 UTC (permalink / raw) To: Johannes Berg; +Cc: Dmitry Torokhov, linux-input On Mon, 23 Mar 2009, Johannes Berg wrote: > > > > [20920.458587] ======================================================= > > > > [20920.458944] [ INFO: possible circular locking dependency detected ] > > > > [20920.459231] 2.6.29-rc3-wl-12787-g17d9115-dirty #38 > > > > [20920.459449] ------------------------------------------------------- > > > > [20920.459736] events/0/9 is trying to acquire lock: > > > > [20920.459951] (&dev->mutex){--..}, at: [<ffffffff804be0ac>] input_disconnect_device+0x2c/0xf0 > > > > [20920.460369] > > > > [20920.460370] but task is already holding lock: > > > > [20920.460637] (&usbhid->reset_work){--..}, at: [<ffffffff8025a933>] run_workqueue+0xb3/0x250 > > > > [20920.461049] > > > > [20920.461050] which lock already depends on the new lock. > > > Ping? I just ran into it again on 2.6.29-rc8-wl-18593-gef1cb6f-dirty. > > I haven't still fully gone through the whole dependency chain, was quite > > overloaded lately, sorry. > > > > There is a patch from Oliver Neukum, that fixes workqueue usage semantics > > around hid_reset() on [1], could you please check whether it also fixes > > the report you are seeing? Thanks. > > [1] http://lkml.org/lkml/2009/3/18/39 > That patch doesn't even apply against a current tree. I wasn't able to reproduce this report on my test systems; did you use only Linus' current tree, or did it have any other trees pulled in? (I am referring to the -wl- in the version string). Could you please send me your config? -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: lockdep report at resume 2009-03-31 8:36 ` Jiri Kosina @ 2009-03-31 8:38 ` Johannes Berg 2009-03-31 8:40 ` Jiri Kosina 0 siblings, 1 reply; 21+ messages in thread From: Johannes Berg @ 2009-03-31 8:38 UTC (permalink / raw) To: Jiri Kosina; +Cc: Dmitry Torokhov, linux-input [-- Attachment #1.1: Type: text/plain, Size: 1867 bytes --] On Tue, 2009-03-31 at 10:36 +0200, Jiri Kosina wrote: > On Mon, 23 Mar 2009, Johannes Berg wrote: > > > > > > [20920.458587] ======================================================= > > > > > [20920.458944] [ INFO: possible circular locking dependency detected ] > > > > > [20920.459231] 2.6.29-rc3-wl-12787-g17d9115-dirty #38 > > > > > [20920.459449] ------------------------------------------------------- > > > > > [20920.459736] events/0/9 is trying to acquire lock: > > > > > [20920.459951] (&dev->mutex){--..}, at: [<ffffffff804be0ac>] input_disconnect_device+0x2c/0xf0 > > > > > [20920.460369] > > > > > [20920.460370] but task is already holding lock: > > > > > [20920.460637] (&usbhid->reset_work){--..}, at: [<ffffffff8025a933>] run_workqueue+0xb3/0x250 > > > > > [20920.461049] > > > > > [20920.461050] which lock already depends on the new lock. > > > > Ping? I just ran into it again on 2.6.29-rc8-wl-18593-gef1cb6f-dirty. > > > I haven't still fully gone through the whole dependency chain, was quite > > > overloaded lately, sorry. > > > > > > There is a patch from Oliver Neukum, that fixes workqueue usage semantics > > > around hid_reset() on [1], could you please check whether it also fixes > > > the report you are seeing? Thanks. > > > [1] http://lkml.org/lkml/2009/3/18/39 > > That patch doesn't even apply against a current tree. > > I wasn't able to reproduce this report on my test systems; did you use > only Linus' current tree, or did it have any other trees pulled in? (I am > referring to the -wl- in the version string). -wl is the wireless-testing tree, but it has no input stuff in it. > Could you please send me your config? Sure, attached. I haven't yet tried to reproduce on .29 though, which this config is for (but I haven't changed it since, only taken it forward). johannes [-- Attachment #1.2: config.gz --] [-- Type: application/x-gzip, Size: 14788 bytes --] [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: lockdep report at resume 2009-03-31 8:38 ` Johannes Berg @ 2009-03-31 8:40 ` Jiri Kosina 2009-03-31 8:42 ` Johannes Berg 2009-04-01 9:40 ` Johannes Berg 0 siblings, 2 replies; 21+ messages in thread From: Jiri Kosina @ 2009-03-31 8:40 UTC (permalink / raw) To: Johannes Berg; +Cc: Dmitry Torokhov, linux-input On Tue, 31 Mar 2009, Johannes Berg wrote: > > > > > > [20920.458587] ======================================================= > > > > > > [20920.458944] [ INFO: possible circular locking dependency detected ] > > > > > > [20920.459231] 2.6.29-rc3-wl-12787-g17d9115-dirty #38 > > > > > > [20920.459449] ------------------------------------------------------- > > > > > > [20920.459736] events/0/9 is trying to acquire lock: > > > > > > [20920.459951] (&dev->mutex){--..}, at: [<ffffffff804be0ac>] input_disconnect_device+0x2c/0xf0 > > > > > > [20920.460369] > > > > > > [20920.460370] but task is already holding lock: > > > > > > [20920.460637] (&usbhid->reset_work){--..}, at: [<ffffffff8025a933>] run_workqueue+0xb3/0x250 > > > > > > [20920.461049] > > > > > > [20920.461050] which lock already depends on the new lock. > > > > > Ping? I just ran into it again on 2.6.29-rc8-wl-18593-gef1cb6f-dirty. > > > > I haven't still fully gone through the whole dependency chain, was quite > > > > overloaded lately, sorry. > > > > > > > > There is a patch from Oliver Neukum, that fixes workqueue usage semantics > > > > around hid_reset() on [1], could you please check whether it also fixes > > > > the report you are seeing? Thanks. > > > > [1] http://lkml.org/lkml/2009/3/18/39 > > > That patch doesn't even apply against a current tree. > > I wasn't able to reproduce this report on my test systems; did you use > > only Linus' current tree, or did it have any other trees pulled in? (I am > > referring to the -wl- in the version string). > -wl is the wireless-testing tree, but it has no input stuff in it. Well as the dependency chain is fairly non-trivial, even some non-input stuff could be involved in the circular depepndency. > > Could you please send me your config? > Sure, attached. I haven't yet tried to reproduce on .29 though, which > this config is for (but I haven't changed it since, only taken it > forward). Thanks, -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: lockdep report at resume 2009-03-31 8:40 ` Jiri Kosina @ 2009-03-31 8:42 ` Johannes Berg 2009-04-01 9:40 ` Johannes Berg 1 sibling, 0 replies; 21+ messages in thread From: Johannes Berg @ 2009-03-31 8:42 UTC (permalink / raw) To: Jiri Kosina; +Cc: Dmitry Torokhov, linux-input, Henrik Rydberg [-- Attachment #1: Type: text/plain, Size: 715 bytes --] On Tue, 2009-03-31 at 10:40 +0200, Jiri Kosina wrote: > > > I wasn't able to reproduce this report on my test systems; did you use > > > only Linus' current tree, or did it have any other trees pulled in? (I am > > > referring to the -wl- in the version string). > > -wl is the wireless-testing tree, but it has no input stuff in it. > > Well as the dependency chain is fairly non-trivial, even some non-input > stuff could be involved in the circular depepndency. I doubt it -- however, it seems bcm5974 is part of this report, and I have a patch to that applied http://johannes.sipsolutions.net/patches/kernel/all/LATEST/005-new-bcm-tp.patch (which doesn't change much, but anyway) johannes [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: lockdep report at resume 2009-03-31 8:40 ` Jiri Kosina 2009-03-31 8:42 ` Johannes Berg @ 2009-04-01 9:40 ` Johannes Berg 2009-04-01 10:00 ` Johannes Berg 2009-04-21 13:51 ` Jiri Kosina 1 sibling, 2 replies; 21+ messages in thread From: Johannes Berg @ 2009-04-01 9:40 UTC (permalink / raw) To: Jiri Kosina Cc: Dmitry Torokhov, linux-input, linux-kernel, Rafael J. Wysocki, Oleg Nesterov [-- Attachment #1: Type: text/plain, Size: 6960 bytes --] On Tue, 2009-03-31 at 10:40 +0200, Jiri Kosina wrote: > > > Could you please send me your config? > > Sure, attached. I haven't yet tried to reproduce on .29 though, which > > this config is for (but I haven't changed it since, only taken it > > forward). I've now gotten it again on 2.6.29-wl-20327-g8f2487d-dirty. I've analysed a bit more. Let's start from the bottom: -> #0 (&dev->mutex){--..}: [<ffffffff80276e87>] check_prev_add+0x57/0x770 [<ffffffff80277b96>] validate_chain+0x5f6/0x6b0 [<ffffffff8027808f>] __lock_acquire+0x43f/0xa10 [<ffffffff802786f1>] lock_acquire+0x91/0xc0 [<ffffffff805cb56c>] mutex_lock_nested+0xfc/0x390 [<ffffffff804d19e1>] input_disconnect_device+0x31/0xf0 [<ffffffff804d1aba>] input_unregister_device+0x1a/0x110 [<ffffffffa00df329>] bcm5974_disconnect+0x29/0x90 [bcm5974] [<ffffffffa001c9ed>] usb_unbind_interface+0x6d/0x180 [usbcore] [<ffffffff80498661>] __device_release_driver+0x81/0xc0 [<ffffffff804987c0>] device_release_driver+0x30/0x50 [<ffffffffa001ce48>] usb_driver_release_interface+0xc8/0xf0 [usbcore] [<ffffffffa001cf99>] usb_forced_unbind_intf+0x39/0x90 [usbcore] [<ffffffffa00121a5>] usb_reset_device+0xd5/0x220 [usbcore] [<ffffffffa00957fa>] hid_reset+0x18a/0x280 [usbhid] [<ffffffff8025d28d>] run_workqueue+0x10d/0x250 Here we have hid_reset being called off schedule_work. It eventually calls into bcm5974 which will, from its usb_driver disconnect call, call input_unregister_device(), which acquires &dev->mutex. -> #1 (polldev_mutex){--..}: [<ffffffff802771e7>] check_prev_add+0x3b7/0x770 [<ffffffff80277b96>] validate_chain+0x5f6/0x6b0 [<ffffffff8027808f>] __lock_acquire+0x43f/0xa10 [<ffffffff802786f1>] lock_acquire+0x91/0xc0 [<ffffffff805cb12c>] mutex_lock_interruptible_nested+0xec/0x430 [<ffffffffa0058151>] input_open_polled_device+0x21/0xd0 [input_polldev] [<ffffffff804d1528>] input_open_device+0x98/0xc0 [<ffffffffa009f468>] evdev_open+0x1c8/0x1f0 [evdev] [<ffffffff804d099f>] input_open_file+0x10f/0x200 [<ffffffff802e6e47>] chrdev_open+0x147/0x220 [<ffffffff802e18eb>] __dentry_open+0x11b/0x350 [<ffffffff802e1c37>] nameidata_to_filp+0x57/0x70 [<ffffffff802f0e6e>] do_filp_open+0x1fe/0x970 [<ffffffff802e16d0>] do_sys_open+0x80/0x110 [<ffffffff802e17a0>] sys_open+0x20/0x30 This is another code path -- evdev triggered here. Any input polldev will acquire polldev_mutex within its struct input_dev->open() callback, and thus create a dependency of &dev->mutex on polldev_mutex because input_open_device() is called with &dev->mutex held. -> #2 (cpu_add_remove_lock){--..}: [<ffffffff802771e7>] check_prev_add+0x3b7/0x770 [<ffffffff80277b96>] validate_chain+0x5f6/0x6b0 [<ffffffff8027808f>] __lock_acquire+0x43f/0xa10 [<ffffffff802786f1>] lock_acquire+0x91/0xc0 [<ffffffff805cb56c>] mutex_lock_nested+0xfc/0x390 [<ffffffff8024a447>] cpu_maps_update_begin+0x17/0x20 [<ffffffff8025d7c8>] destroy_workqueue+0x38/0xb0 [<ffffffffa0058115>] input_close_polled_device+0x45/0x60 [input_polldev] [<ffffffff804d145c>] input_close_device+0x5c/0x90 [<ffffffffa009f1e9>] evdev_release+0xa9/0xd0 [evdev] [<ffffffff802e4e25>] __fput+0xd5/0x1e0 [<ffffffff802e4f55>] fput+0x25/0x30 [<ffffffff802e14f8>] filp_close+0x58/0x90 [<ffffffff802e15ee>] sys_close+0xbe/0x120 [<ffffffff8020bf0b>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff This is cute. So input-polldev uses its own workqueue, and it's singlethread. But destroy_workqueue must stop CPU hotplug anyway, calls cpu_map_update_begin() which locks cpu_add_remove_lock. -> #3 (events){--..}: [<ffffffff802771e7>] check_prev_add+0x3b7/0x770 [<ffffffff80277b96>] validate_chain+0x5f6/0x6b0 [<ffffffff8027808f>] __lock_acquire+0x43f/0xa10 [<ffffffff802786f1>] lock_acquire+0x91/0xc0 [<ffffffff8025d722>] cleanup_workqueue_thread+0x42/0x90 [<ffffffff805bca8d>] workqueue_cpu_callback+0x9d/0x132 [<ffffffff805d1205>] notifier_call_chain+0x65/0xa0 [<ffffffff80267626>] raw_notifier_call_chain+0x16/0x20 [<ffffffff805ba91b>] _cpu_down+0x1db/0x350 [<ffffffff8024a5b5>] disable_nonboot_cpus+0xe5/0x170 [<ffffffff80287ef5>] hibernation_snapshot+0x135/0x170 [<ffffffff8028b8a5>] snapshot_ioctl+0x425/0x620 [<ffffffff802f27a6>] vfs_ioctl+0x36/0xb0 [<ffffffff802f2b89>] do_vfs_ioctl+0x89/0x350 [<ffffffff802f2e9f>] sys_ioctl+0x4f/0x80 [<ffffffff8020bf0b>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff Here we have hibernation, which needs to call disable_nonboot_cpus. This takes down all CPUs, and causes the workqueue code, now running off the workqueue_cpu_callback, to call cleanup_workqueue_thread(), which "acquires" the workqueue. I suspect this will also happen if you go into sysfs and disable a CPU manually, which may help you reproduce this. disable_nonboot_cpus calls cpu_map_update_begin() to avoid other things interfering, and thus creates the dependency of the workqueue on that. -> #4 (&usbhid->reset_work){--..}: [<ffffffff802771e7>] check_prev_add+0x3b7/0x770 [<ffffffff80277b96>] validate_chain+0x5f6/0x6b0 [<ffffffff8027808f>] __lock_acquire+0x43f/0xa10 [<ffffffff802786f1>] lock_acquire+0x91/0xc0 [<ffffffff8025d287>] run_workqueue+0x107/0x250 [<ffffffff8025d47f>] worker_thread+0xaf/0x130 [<ffffffff80261dae>] kthread+0x4e/0x90 [<ffffffff8020cfba>] child_rip+0xa/0x20 [<ffffffffffffffff>] 0xffffffffffffffff Now, of course, usbhid->reset_work runs off the schedule_work workqueue, which was stopped during hibernation, so it depends on that workqueue. Finally, we're back at the top, with input_disconnect_device() acquiring &dev->mutex. Now, how can a deadlock happen? I think it cannot -- unless you have a polled USB device. The two "&dev->mutex" instances here are from difference devices, but lockdep cannot tell them apart, and if you have a polled USB device then the same can happen. Assume you had a polled USB driver using input_unregister_polled_device, and thus input_unregister_device, in its usb_driver disconnect call. In that case you could potentially trigger the deadlock when you manage to get that usb device reset very very close before calling disable_nonboot_cpus, so close that the usb reset_work is still scheduled or something like that... I don't really see a good way to solve it -- but I hope the analysis helps some -- also adding lots of people to CC. johannes [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: lockdep report at resume 2009-04-01 9:40 ` Johannes Berg @ 2009-04-01 10:00 ` Johannes Berg 2009-04-21 13:51 ` Jiri Kosina 1 sibling, 0 replies; 21+ messages in thread From: Johannes Berg @ 2009-04-01 10:00 UTC (permalink / raw) To: Jiri Kosina Cc: Dmitry Torokhov, linux-input, linux-kernel, Rafael J. Wysocki, Oleg Nesterov On Wed, 2009-04-01 at 11:41 +0200, Johannes Berg wrote: > On Tue, 2009-03-31 at 10:40 +0200, Jiri Kosina wrote: > > > > > Could you please send me your config? > > > Sure, attached. I haven't yet tried to reproduce on .29 though, which > > > this config is for (but I haven't changed it since, only taken it > > > forward). > > I've now gotten it again on 2.6.29-wl-20327-g8f2487d-dirty. I guess I should've posted the entire report: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.29-wl-20327-g8f2487d-dirty #71 ------------------------------------------------------- events/0/9 is trying to acquire lock: (&dev->mutex){--..}, at: [<ffffffff804d19e1>] input_disconnect_device+0x31/0xf0 but task is already holding lock: (&usbhid->reset_work){--..}, at: [<ffffffff8025d23b>] run_workqueue+0xbb/0x250 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #4 (&usbhid->reset_work){--..}: [<ffffffff802771e7>] check_prev_add+0x3b7/0x770 [<ffffffff80277b96>] validate_chain+0x5f6/0x6b0 [<ffffffff8027808f>] __lock_acquire+0x43f/0xa10 [<ffffffff802786f1>] lock_acquire+0x91/0xc0 [<ffffffff8025d287>] run_workqueue+0x107/0x250 [<ffffffff8025d47f>] worker_thread+0xaf/0x130 [<ffffffff80261dae>] kthread+0x4e/0x90 [<ffffffff8020cfba>] child_rip+0xa/0x20 [<ffffffffffffffff>] 0xffffffffffffffff -> #3 (events){--..}: [<ffffffff802771e7>] check_prev_add+0x3b7/0x770 [<ffffffff80277b96>] validate_chain+0x5f6/0x6b0 [<ffffffff8027808f>] __lock_acquire+0x43f/0xa10 [<ffffffff802786f1>] lock_acquire+0x91/0xc0 [<ffffffff8025d722>] cleanup_workqueue_thread+0x42/0x90 [<ffffffff805bca8d>] workqueue_cpu_callback+0x9d/0x132 [<ffffffff805d1205>] notifier_call_chain+0x65/0xa0 [<ffffffff80267626>] raw_notifier_call_chain+0x16/0x20 [<ffffffff805ba91b>] _cpu_down+0x1db/0x350 [<ffffffff8024a5b5>] disable_nonboot_cpus+0xe5/0x170 [<ffffffff80287ef5>] hibernation_snapshot+0x135/0x170 [<ffffffff8028b8a5>] snapshot_ioctl+0x425/0x620 [<ffffffff802f27a6>] vfs_ioctl+0x36/0xb0 [<ffffffff802f2b89>] do_vfs_ioctl+0x89/0x350 [<ffffffff802f2e9f>] sys_ioctl+0x4f/0x80 [<ffffffff8020bf0b>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff -> #2 (cpu_add_remove_lock){--..}: [<ffffffff802771e7>] check_prev_add+0x3b7/0x770 [<ffffffff80277b96>] validate_chain+0x5f6/0x6b0 [<ffffffff8027808f>] __lock_acquire+0x43f/0xa10 [<ffffffff802786f1>] lock_acquire+0x91/0xc0 [<ffffffff805cb56c>] mutex_lock_nested+0xfc/0x390 [<ffffffff8024a447>] cpu_maps_update_begin+0x17/0x20 [<ffffffff8025d7c8>] destroy_workqueue+0x38/0xb0 [<ffffffffa0058115>] input_close_polled_device+0x45/0x60 [input_polldev] [<ffffffff804d145c>] input_close_device+0x5c/0x90 [<ffffffffa009f1e9>] evdev_release+0xa9/0xd0 [evdev] [<ffffffff802e4e25>] __fput+0xd5/0x1e0 [<ffffffff802e4f55>] fput+0x25/0x30 [<ffffffff802e14f8>] filp_close+0x58/0x90 [<ffffffff802e15ee>] sys_close+0xbe/0x120 [<ffffffff8020bf0b>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff -> #1 (polldev_mutex){--..}: [<ffffffff802771e7>] check_prev_add+0x3b7/0x770 [<ffffffff80277b96>] validate_chain+0x5f6/0x6b0 [<ffffffff8027808f>] __lock_acquire+0x43f/0xa10 [<ffffffff802786f1>] lock_acquire+0x91/0xc0 [<ffffffff805cb12c>] mutex_lock_interruptible_nested+0xec/0x430 [<ffffffffa0058151>] input_open_polled_device+0x21/0xd0 [input_polldev] [<ffffffff804d1528>] input_open_device+0x98/0xc0 [<ffffffffa009f468>] evdev_open+0x1c8/0x1f0 [evdev] [<ffffffff804d099f>] input_open_file+0x10f/0x200 [<ffffffff802e6e47>] chrdev_open+0x147/0x220 [<ffffffff802e18eb>] __dentry_open+0x11b/0x350 [<ffffffff802e1c37>] nameidata_to_filp+0x57/0x70 [<ffffffff802f0e6e>] do_filp_open+0x1fe/0x970 [<ffffffff802e16d0>] do_sys_open+0x80/0x110 [<ffffffff802e17a0>] sys_open+0x20/0x30 [<ffffffff8020bf0b>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff -> #0 (&dev->mutex){--..}: [<ffffffff80276e87>] check_prev_add+0x57/0x770 [<ffffffff80277b96>] validate_chain+0x5f6/0x6b0 [<ffffffff8027808f>] __lock_acquire+0x43f/0xa10 [<ffffffff802786f1>] lock_acquire+0x91/0xc0 [<ffffffff805cb56c>] mutex_lock_nested+0xfc/0x390 [<ffffffff804d19e1>] input_disconnect_device+0x31/0xf0 [<ffffffff804d1aba>] input_unregister_device+0x1a/0x110 [<ffffffffa00df329>] bcm5974_disconnect+0x29/0x90 [bcm5974] [<ffffffffa001c9ed>] usb_unbind_interface+0x6d/0x180 [usbcore] [<ffffffff80498661>] __device_release_driver+0x81/0xc0 [<ffffffff804987c0>] device_release_driver+0x30/0x50 [<ffffffffa001ce48>] usb_driver_release_interface+0xc8/0xf0 [usbcore] [<ffffffffa001cf99>] usb_forced_unbind_intf+0x39/0x90 [usbcore] [<ffffffffa00121a5>] usb_reset_device+0xd5/0x220 [usbcore] [<ffffffffa00957fa>] hid_reset+0x18a/0x280 [usbhid] [<ffffffff8025d28d>] run_workqueue+0x10d/0x250 [<ffffffff8025d47f>] worker_thread+0xaf/0x130 [<ffffffff80261dae>] kthread+0x4e/0x90 [<ffffffff8020cfba>] child_rip+0xa/0x20 [<ffffffffffffffff>] 0xffffffffffffffff other info that might help us debug this: 2 locks held by events/0/9: #0: (events){--..}, at: [<ffffffff8025d23b>] run_workqueue+0xbb/0x250 #1: (&usbhid->reset_work){--..}, at: [<ffffffff8025d23b>] run_workqueue+0xbb/0x250 stack backtrace: Pid: 9, comm: events/0 Not tainted 2.6.29-wl-20327-g8f2487d-dirty #71 Call Trace: [<ffffffff80276930>] print_circular_bug_tail+0xe0/0xf0 [<ffffffff80276e87>] check_prev_add+0x57/0x770 [<ffffffff802191bf>] ? save_stack_trace+0x2f/0x50 [<ffffffff80277b96>] validate_chain+0x5f6/0x6b0 [<ffffffff8027808f>] __lock_acquire+0x43f/0xa10 [<ffffffff802780ad>] ? __lock_acquire+0x45d/0xa10 [<ffffffff802786f1>] lock_acquire+0x91/0xc0 [<ffffffff804d19e1>] ? input_disconnect_device+0x31/0xf0 [<ffffffff805cb56c>] mutex_lock_nested+0xfc/0x390 [<ffffffff804d19e1>] ? input_disconnect_device+0x31/0xf0 [<ffffffff804d19e1>] ? input_disconnect_device+0x31/0xf0 [<ffffffff804d19e1>] input_disconnect_device+0x31/0xf0 [<ffffffff804d1aba>] input_unregister_device+0x1a/0x110 [<ffffffffa00df329>] bcm5974_disconnect+0x29/0x90 [bcm5974] [<ffffffffa001c9ed>] usb_unbind_interface+0x6d/0x180 [usbcore] [<ffffffff80498661>] __device_release_driver+0x81/0xc0 [<ffffffff804987c0>] device_release_driver+0x30/0x50 [<ffffffffa001ce48>] usb_driver_release_interface+0xc8/0xf0 [usbcore] [<ffffffffa001cf99>] usb_forced_unbind_intf+0x39/0x90 [usbcore] [<ffffffffa00121a5>] usb_reset_device+0xd5/0x220 [usbcore] [<ffffffffa00957fa>] hid_reset+0x18a/0x280 [usbhid] [<ffffffffa0095670>] ? hid_reset+0x0/0x280 [usbhid] [<ffffffff8025d28d>] run_workqueue+0x10d/0x250 [<ffffffff8025d23b>] ? run_workqueue+0xbb/0x250 [<ffffffff8025d47f>] worker_thread+0xaf/0x130 [<ffffffff80262280>] ? autoremove_wake_function+0x0/0x40 [<ffffffff8025d3d0>] ? worker_thread+0x0/0x130 [<ffffffff80261dae>] kthread+0x4e/0x90 [<ffffffff8020cfba>] child_rip+0xa/0x20 johannes ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: lockdep report at resume 2009-04-01 9:40 ` Johannes Berg 2009-04-01 10:00 ` Johannes Berg @ 2009-04-21 13:51 ` Jiri Kosina 2009-04-21 13:59 ` Oliver Neukum 1 sibling, 1 reply; 21+ messages in thread From: Jiri Kosina @ 2009-04-21 13:51 UTC (permalink / raw) To: Johannes Berg Cc: Dmitry Torokhov, linux-input, linux-kernel, Rafael J. Wysocki, Oleg Nesterov, Oliver Neukum [ thanks for the analysis Johannes ... involving also Oliver here; really seems like a false positive to me ] On Wed, 1 Apr 2009, Johannes Berg wrote: > On Tue, 2009-03-31 at 10:40 +0200, Jiri Kosina wrote: > > > > > Could you please send me your config? > > > Sure, attached. I haven't yet tried to reproduce on .29 though, which > > > this config is for (but I haven't changed it since, only taken it > > > forward). > > I've now gotten it again on 2.6.29-wl-20327-g8f2487d-dirty. > > I've analysed a bit more. > > Let's start from the bottom: > > -> #0 (&dev->mutex){--..}: > [<ffffffff80276e87>] check_prev_add+0x57/0x770 > [<ffffffff80277b96>] validate_chain+0x5f6/0x6b0 > [<ffffffff8027808f>] __lock_acquire+0x43f/0xa10 > [<ffffffff802786f1>] lock_acquire+0x91/0xc0 > [<ffffffff805cb56c>] mutex_lock_nested+0xfc/0x390 > [<ffffffff804d19e1>] input_disconnect_device+0x31/0xf0 > [<ffffffff804d1aba>] input_unregister_device+0x1a/0x110 > [<ffffffffa00df329>] bcm5974_disconnect+0x29/0x90 [bcm5974] > [<ffffffffa001c9ed>] usb_unbind_interface+0x6d/0x180 [usbcore] > [<ffffffff80498661>] __device_release_driver+0x81/0xc0 > [<ffffffff804987c0>] device_release_driver+0x30/0x50 > [<ffffffffa001ce48>] usb_driver_release_interface+0xc8/0xf0 [usbcore] > [<ffffffffa001cf99>] usb_forced_unbind_intf+0x39/0x90 [usbcore] > [<ffffffffa00121a5>] usb_reset_device+0xd5/0x220 [usbcore] > [<ffffffffa00957fa>] hid_reset+0x18a/0x280 [usbhid] > [<ffffffff8025d28d>] run_workqueue+0x10d/0x250 > > > Here we have hid_reset being called off schedule_work. It eventually > calls into bcm5974 which will, from its usb_driver disconnect call, call > input_unregister_device(), which acquires &dev->mutex. > > > -> #1 (polldev_mutex){--..}: > [<ffffffff802771e7>] check_prev_add+0x3b7/0x770 > [<ffffffff80277b96>] validate_chain+0x5f6/0x6b0 > [<ffffffff8027808f>] __lock_acquire+0x43f/0xa10 > [<ffffffff802786f1>] lock_acquire+0x91/0xc0 > [<ffffffff805cb12c>] mutex_lock_interruptible_nested+0xec/0x430 > [<ffffffffa0058151>] input_open_polled_device+0x21/0xd0 [input_polldev] > [<ffffffff804d1528>] input_open_device+0x98/0xc0 > [<ffffffffa009f468>] evdev_open+0x1c8/0x1f0 [evdev] > [<ffffffff804d099f>] input_open_file+0x10f/0x200 > [<ffffffff802e6e47>] chrdev_open+0x147/0x220 > [<ffffffff802e18eb>] __dentry_open+0x11b/0x350 > [<ffffffff802e1c37>] nameidata_to_filp+0x57/0x70 > [<ffffffff802f0e6e>] do_filp_open+0x1fe/0x970 > [<ffffffff802e16d0>] do_sys_open+0x80/0x110 > [<ffffffff802e17a0>] sys_open+0x20/0x30 > > This is another code path -- evdev triggered here. Any input polldev > will acquire polldev_mutex within its struct input_dev->open() callback, > and thus create a dependency of &dev->mutex on polldev_mutex because > input_open_device() is called with &dev->mutex held. > > -> #2 (cpu_add_remove_lock){--..}: > [<ffffffff802771e7>] check_prev_add+0x3b7/0x770 > [<ffffffff80277b96>] validate_chain+0x5f6/0x6b0 > [<ffffffff8027808f>] __lock_acquire+0x43f/0xa10 > [<ffffffff802786f1>] lock_acquire+0x91/0xc0 > [<ffffffff805cb56c>] mutex_lock_nested+0xfc/0x390 > [<ffffffff8024a447>] cpu_maps_update_begin+0x17/0x20 > [<ffffffff8025d7c8>] destroy_workqueue+0x38/0xb0 > [<ffffffffa0058115>] input_close_polled_device+0x45/0x60 [input_polldev] > [<ffffffff804d145c>] input_close_device+0x5c/0x90 > [<ffffffffa009f1e9>] evdev_release+0xa9/0xd0 [evdev] > [<ffffffff802e4e25>] __fput+0xd5/0x1e0 > [<ffffffff802e4f55>] fput+0x25/0x30 > [<ffffffff802e14f8>] filp_close+0x58/0x90 > [<ffffffff802e15ee>] sys_close+0xbe/0x120 > [<ffffffff8020bf0b>] system_call_fastpath+0x16/0x1b > [<ffffffffffffffff>] 0xffffffffffffffff > > This is cute. > So input-polldev uses its own workqueue, and it's singlethread. But > destroy_workqueue must stop CPU hotplug anyway, calls > cpu_map_update_begin() which locks cpu_add_remove_lock. > > -> #3 (events){--..}: > [<ffffffff802771e7>] check_prev_add+0x3b7/0x770 > [<ffffffff80277b96>] validate_chain+0x5f6/0x6b0 > [<ffffffff8027808f>] __lock_acquire+0x43f/0xa10 > [<ffffffff802786f1>] lock_acquire+0x91/0xc0 > [<ffffffff8025d722>] cleanup_workqueue_thread+0x42/0x90 > [<ffffffff805bca8d>] workqueue_cpu_callback+0x9d/0x132 > [<ffffffff805d1205>] notifier_call_chain+0x65/0xa0 > [<ffffffff80267626>] raw_notifier_call_chain+0x16/0x20 > [<ffffffff805ba91b>] _cpu_down+0x1db/0x350 > [<ffffffff8024a5b5>] disable_nonboot_cpus+0xe5/0x170 > [<ffffffff80287ef5>] hibernation_snapshot+0x135/0x170 > [<ffffffff8028b8a5>] snapshot_ioctl+0x425/0x620 > [<ffffffff802f27a6>] vfs_ioctl+0x36/0xb0 > [<ffffffff802f2b89>] do_vfs_ioctl+0x89/0x350 > [<ffffffff802f2e9f>] sys_ioctl+0x4f/0x80 > [<ffffffff8020bf0b>] system_call_fastpath+0x16/0x1b > [<ffffffffffffffff>] 0xffffffffffffffff > > Here we have hibernation, which needs to call disable_nonboot_cpus. This > takes down all CPUs, and causes the workqueue code, now running off the > workqueue_cpu_callback, to call cleanup_workqueue_thread(), which > "acquires" the workqueue. I suspect this will also happen if you go into > sysfs and disable a CPU manually, which may help you reproduce this. > > disable_nonboot_cpus calls cpu_map_update_begin() to avoid other things > interfering, and thus creates the dependency of the workqueue on that. > > > -> #4 (&usbhid->reset_work){--..}: > [<ffffffff802771e7>] check_prev_add+0x3b7/0x770 > [<ffffffff80277b96>] validate_chain+0x5f6/0x6b0 > [<ffffffff8027808f>] __lock_acquire+0x43f/0xa10 > [<ffffffff802786f1>] lock_acquire+0x91/0xc0 > [<ffffffff8025d287>] run_workqueue+0x107/0x250 > [<ffffffff8025d47f>] worker_thread+0xaf/0x130 > [<ffffffff80261dae>] kthread+0x4e/0x90 > [<ffffffff8020cfba>] child_rip+0xa/0x20 > [<ffffffffffffffff>] 0xffffffffffffffff > > Now, of course, usbhid->reset_work runs off the schedule_work workqueue, > which was stopped during hibernation, so it depends on that workqueue. > > > Finally, we're back at the top, with input_disconnect_device() acquiring > &dev->mutex. > > > > Now, how can a deadlock happen? > > I think it cannot -- unless you have a polled USB device. The two > "&dev->mutex" instances here are from difference devices, but lockdep > cannot tell them apart, and if you have a polled USB device then the > same can happen. > > Assume you had a polled USB driver using input_unregister_polled_device, > and thus input_unregister_device, in its usb_driver disconnect call. In > that case you could potentially trigger the deadlock when you manage to > get that usb device reset very very close before calling > disable_nonboot_cpus, so close that the usb reset_work is still > scheduled or something like that... > > > I don't really see a good way to solve it -- but I hope the analysis > helps some -- also adding lots of people to CC. > > johannes > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: lockdep report at resume 2009-04-21 13:51 ` Jiri Kosina @ 2009-04-21 13:59 ` Oliver Neukum 2009-04-21 14:11 ` Jiri Kosina 0 siblings, 1 reply; 21+ messages in thread From: Oliver Neukum @ 2009-04-21 13:59 UTC (permalink / raw) To: Jiri Kosina Cc: Johannes Berg, Dmitry Torokhov, linux-input, linux-kernel, Rafael J. Wysocki, Oleg Nesterov Am Dienstag 21 April 2009 15:51:16 schrieb Jiri Kosina: > [ thanks for the analysis Johannes ... involving also Oliver here; really > seems like a false positive to me ] I think he's right. But as a matter of principle I'd like to see flush_scheduled_work() avoided where possible, hence the patch. Regards Oliver ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: lockdep report at resume 2009-04-21 13:59 ` Oliver Neukum @ 2009-04-21 14:11 ` Jiri Kosina 2009-04-21 14:14 ` Oliver Neukum 2009-04-21 14:26 ` Johannes Berg 0 siblings, 2 replies; 21+ messages in thread From: Jiri Kosina @ 2009-04-21 14:11 UTC (permalink / raw) To: Oliver Neukum, Johannes Berg Cc: Dmitry Torokhov, linux-input, linux-kernel, Rafael J. Wysocki, Oleg Nesterov On Tue, 21 Apr 2009, Oliver Neukum wrote: > > [ thanks for the analysis Johannes ... involving also Oliver here; really > > seems like a false positive to me ] > I think he's right. But as a matter of principle I'd like to see > flush_scheduled_work() avoided where possible, hence the patch. I am not entirely sure that this is the same issue as the one from #13136 of kernel.org bugzilla. Anyway, for reference, the patch Oliver mentioned here is below. >From oliver@neukum.org Tue Apr 21 15:38:41 2009 Date: Tue, 21 Apr 2009 15:38:12 +0200 From: Oliver Neukum <oliver@neukum.org> To: Andrew Morton <akpm@linux-foundation.org>, casteyde.christian@free.fr Cc: Jiri Kosina <jkosina@suse.cz>, Rafael J. Wysocki <rjw@sisk.pl> Subject: Re: [Bugme-new] [Bug 13136] New: possible circular locking dependency detected while suspending laptop to S3 Am Dienstag 21 April 2009 01:01:13 schrieb Andrew Morton: > On Sat, 18 Apr 2009 11:17:47 GMT > > bugzilla-daemon@bugzilla.kernel.org wrote: > > http://bugzilla.kernel.org/show_bug.cgi?id=13136 > > Looks like a post-2.6.29 regression in usb-hid. usbhid_close() holding > hid_open_mut while doing flush_scheduled_work(). > > Could you gents please work out where the problem is and suitably > assign the report? Please test this patch. Regards Oliver --- commit 4bdc818cca662000cf195b1e83e8a0f8a3f0b9c6 Author: Oliver Neukum <oneukum@linux-d698.(none)> Date: Tue Apr 21 15:33:41 2009 +0200 avoid deadlock in usbhid_close by cancelling workqueues diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c index 4306cb1..900ce18 100644 --- a/drivers/hid/usbhid/hid-core.c +++ b/drivers/hid/usbhid/hid-core.c @@ -662,8 +662,8 @@ void usbhid_close(struct hid_device *hid) spin_lock_irq(&usbhid->lock); if (!--hid->open) { spin_unlock_irq(&usbhid->lock); + hid_cancel_delayed_stuff(usbhid); usb_kill_urb(usbhid->urbin); - flush_scheduled_work(); usbhid->intf->needs_remote_wakeup = 0; } else { spin_unlock_irq(&usbhid->lock); Thanks, -- Jiri Kosina SUSE Labs ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: lockdep report at resume 2009-04-21 14:11 ` Jiri Kosina @ 2009-04-21 14:14 ` Oliver Neukum 2009-04-21 14:26 ` Johannes Berg 1 sibling, 0 replies; 21+ messages in thread From: Oliver Neukum @ 2009-04-21 14:14 UTC (permalink / raw) To: Jiri Kosina Cc: Johannes Berg, Dmitry Torokhov, linux-input, linux-kernel, Rafael J. Wysocki, Oleg Nesterov Am Dienstag 21 April 2009 16:11:03 schrieb Jiri Kosina: > On Tue, 21 Apr 2009, Oliver Neukum wrote: > > > [ thanks for the analysis Johannes ... involving also Oliver here; > > > really seems like a false positive to me ] > > > > I think he's right. But as a matter of principle I'd like to see > > flush_scheduled_work() avoided where possible, hence the patch. > > I am not entirely sure that this is the same issue as the one from #13136 > of kernel.org bugzilla. Which flush_scheduled_work() do you think triggers it? Regards Oliver ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: lockdep report at resume 2009-04-21 14:11 ` Jiri Kosina 2009-04-21 14:14 ` Oliver Neukum @ 2009-04-21 14:26 ` Johannes Berg 2009-05-05 12:47 ` Jiri Kosina 1 sibling, 1 reply; 21+ messages in thread From: Johannes Berg @ 2009-04-21 14:26 UTC (permalink / raw) To: Jiri Kosina Cc: Oliver Neukum, Dmitry Torokhov, linux-input, linux-kernel, Rafael J. Wysocki, Oleg Nesterov [-- Attachment #1: Type: text/plain, Size: 674 bytes --] On Tue, 2009-04-21 at 16:11 +0200, Jiri Kosina wrote: > On Tue, 21 Apr 2009, Oliver Neukum wrote: > > > > [ thanks for the analysis Johannes ... involving also Oliver here; really > > > seems like a false positive to me ] > > I think he's right. But as a matter of principle I'd like to see > > flush_scheduled_work() avoided where possible, hence the patch. > > I am not entirely sure that this is the same issue as the one from #13136 > of kernel.org bugzilla. > > Anyway, for reference, the patch Oliver mentioned here is below. Thanks. Unfortunately I have another problem that turns off lockdep so I can't test this right now, sorry. johannes [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: lockdep report at resume 2009-04-21 14:26 ` Johannes Berg @ 2009-05-05 12:47 ` Jiri Kosina 2009-05-08 10:25 ` Johannes Berg 0 siblings, 1 reply; 21+ messages in thread From: Jiri Kosina @ 2009-05-05 12:47 UTC (permalink / raw) To: Johannes Berg Cc: Oliver Neukum, Dmitry Torokhov, linux-input, linux-kernel, Rafael J. Wysocki, Oleg Nesterov On Tue, 21 Apr 2009, Johannes Berg wrote: > > > > [ thanks for the analysis Johannes ... involving also Oliver here; really > > > > seems like a false positive to me ] > > > I think he's right. But as a matter of principle I'd like to see > > > flush_scheduled_work() avoided where possible, hence the patch. > > I am not entirely sure that this is the same issue as the one from #13136 > > of kernel.org bugzilla. > > Anyway, for reference, the patch Oliver mentioned here is below. > Thanks. Unfortunately I have another problem that turns off lockdep so I > can't test this right now, sorry. Are you able to reproduce the problem with current git (the expected fix is 89092ddd7a). Thanks, -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: lockdep report at resume 2009-05-05 12:47 ` Jiri Kosina @ 2009-05-08 10:25 ` Johannes Berg 0 siblings, 0 replies; 21+ messages in thread From: Johannes Berg @ 2009-05-08 10:25 UTC (permalink / raw) To: Jiri Kosina Cc: Oliver Neukum, Dmitry Torokhov, linux-input, linux-kernel, Rafael J. Wysocki, Oleg Nesterov [-- Attachment #1: Type: text/plain, Size: 902 bytes --] On Tue, 2009-05-05 at 14:47 +0200, Jiri Kosina wrote: > On Tue, 21 Apr 2009, Johannes Berg wrote: > > > > > > [ thanks for the analysis Johannes ... involving also Oliver here; really > > > > > seems like a false positive to me ] > > > > I think he's right. But as a matter of principle I'd like to see > > > > flush_scheduled_work() avoided where possible, hence the patch. > > > I am not entirely sure that this is the same issue as the one from #13136 > > > of kernel.org bugzilla. > > > Anyway, for reference, the patch Oliver mentioned here is below. > > Thanks. Unfortunately I have another problem that turns off lockdep so I > > can't test this right now, sorry. > > Are you able to reproduce the problem with current git (the expected fix > is 89092ddd7a). I don't know, unfortunately, because I keep getting a different lockdep report at _suspend_ time. johannes [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 801 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2009-05-08 10:30 UTC | newest] Thread overview: 21+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-02-07 16:01 lockdep report at resume Johannes Berg 2009-02-11 12:03 ` Jiri Kosina 2009-02-11 14:33 ` Johannes Berg 2009-02-18 13:17 ` Johannes Berg 2009-03-20 8:12 ` Johannes Berg 2009-03-20 14:37 ` Jiri Kosina 2009-03-20 14:39 ` Johannes Berg 2009-03-23 9:29 ` Johannes Berg 2009-03-31 8:36 ` Jiri Kosina 2009-03-31 8:38 ` Johannes Berg 2009-03-31 8:40 ` Jiri Kosina 2009-03-31 8:42 ` Johannes Berg 2009-04-01 9:40 ` Johannes Berg 2009-04-01 10:00 ` Johannes Berg 2009-04-21 13:51 ` Jiri Kosina 2009-04-21 13:59 ` Oliver Neukum 2009-04-21 14:11 ` Jiri Kosina 2009-04-21 14:14 ` Oliver Neukum 2009-04-21 14:26 ` Johannes Berg 2009-05-05 12:47 ` Jiri Kosina 2009-05-08 10:25 ` Johannes Berg
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).