public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* deadlock in debugfs synchronize_srcu() when unplugging USB
@ 2017-10-12 20:01 Tyler Hall
  2017-10-12 21:04 ` Paul E. McKenney
  2017-10-16  7:51 ` Greg Kroah-Hartman
  0 siblings, 2 replies; 16+ messages in thread
From: Tyler Hall @ 2017-10-12 20:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Nicolai Stange, Johannes Berg, Paul E.McKenney,
	Greg Kroah-Hartman

Hi,

I have a reproducible scenario wherein removing a USB device while
reading /sys/kernel/debug/usb/devices causes a deadlock. This should
not be specific to any USB device. Any USB device removal that causes
a call to debugfs_remove() has inverted lock ordering with respect to
the read() of debug/usb/devices.

e.g.
read thread: srcu_read_lock(&debugfs_srcu);
-- usb unplug --
  remove thread: mutex_lock(&usb_bus_idr_lock);
  remove thread: synchronize_srcu(&debugfs_srcu); <- blocked
read thread: mutex_lock(&usb_bus_idr_lock); <- blocked
read thread: srcu_read_unlock(&debugfs_srcu, ...);

This seems to be another flavor of what Johannes Berg reported:
deadlock in synchronize_srcu() in debugfs?
https://lkml.org/lkml/2017/3/23/415

I applied this patch set from Nicolai Stange and can no longer
reproduce the hang.
[RFC PATCH v2 0/9] debugfs: per-file removal protection
https://lkml.org/lkml/2017/5/3/292

As patch 2/9 in the series indicates, commit 49d200deaa68 ("debugfs:
prevent access to removed files' private data") is where this was
first introduced, and it is reproducible on v4.14-rc4.

How should we move forward with the resolution of this debugfs change?
It seems to me that the USB locking is reasonable but the debugfs
global srcu is overly restrictive. This could lead to unexpected lock
inversion any time a driver shares a mutex between its debugfs read
and removal paths.

Backtrace below. Thanks!

-Tyler Hall

This is easier to reproduce by adding a sleep before the
usb_bus_idr_lock, but I've seen it on an unmodified kernel.

diff --git a/drivers/usb/core/devices.c b/drivers/usb/core/devices.c
index 55dea2e7828f..534650cd0950 100644
--- a/drivers/usb/core/devices.c
+++ b/drivers/usb/core/devices.c
@@ -614,6 +614,7 @@ static ssize_t usb_device_read(struct file *file,
char __user *buf,
     if (!access_ok(VERIFY_WRITE, buf, nbytes))
         return -EFAULT;

+    msleep(1000);
     mutex_lock(&usb_bus_idr_lock);
     /* print devices for all busses */
     idr_for_each_entry(&usb_bus_idr, bus, id) {


[   24.240542] sysrq: SysRq : Show Blocked State
[   24.240765]   task                        PC stack   pid father
[   24.240975] kworker/0:2     D13840   881      2 0x80000000
[   24.241525] Workqueue: usb_hub_wq hub_event
[   24.241682] Call Trace:
[   24.242273]  __schedule+0x317/0x6d0
[   24.242442]  schedule+0x31/0x80
[   24.242514]  schedule_timeout+0x1d0/0x320
[   24.242603]  ? __queue_work+0x135/0x400
[   24.242689]  wait_for_completion+0x92/0xf0
[   24.242765]  ? wait_for_completion+0x92/0xf0
[   24.242841]  ? wake_up_q+0x70/0x70
[   24.242907]  __synchronize_srcu.part.14+0x71/0x90
[   24.242985]  ? trace_event_raw_event_rcu_torture_read+0xe0/0xe0
[   24.243169]  synchronize_srcu_expedited+0x22/0x30
[   24.243265]  ? synchronize_srcu_expedited+0x22/0x30
[   24.243347]  synchronize_srcu+0x9a/0xc0
[   24.243418]  debugfs_remove+0x6d/0xa0
[   24.243490]  bdi_unregister+0x8b/0x170
[   24.243558]  del_gendisk+0x139/0x220
[   24.243624]  sd_remove+0x5c/0xc0
[   24.243685]  device_release_driver_internal+0x150/0x210
[   24.243769]  device_release_driver+0xd/0x10
[   24.243841]  bus_remove_device+0xdb/0x120
[   24.243915]  device_del+0x1c3/0x2e0
[   24.243977]  __scsi_remove_device+0xff/0x130
[   24.244122]  scsi_forget_host+0x5b/0x60
[   24.244203]  scsi_remove_host+0x74/0x140
[   24.244281]  usb_stor_disconnect+0x54/0xc0
[   24.244357]  usb_unbind_interface+0x6d/0x260
[   24.244437]  device_release_driver_internal+0x150/0x210
[   24.244520]  device_release_driver+0xd/0x10
[   24.244591]  bus_remove_device+0xdb/0x120
[   24.244659]  device_del+0x1c3/0x2e0
[   24.244722]  usb_disable_device+0x97/0x1f0
[   24.244792]  usb_disconnect+0x88/0x230
[   24.244853]  hub_event+0x5b9/0x11e0
[   24.244915]  ? add_timer+0x10e/0x230
[   24.244984]  process_one_work+0x146/0x3e0
[   24.245124]  worker_thread+0x43/0x3e0
[   24.245204]  kthread+0x104/0x140
[   24.245266]  ? create_worker+0x190/0x190
[   24.245333]  ? kthread_create_on_node+0x40/0x40
[   24.245406]  ret_from_fork+0x22/0x30

[   24.245542] cat             D13712  1029   1018 0x00000000
[   24.245652] Call Trace:
[   24.245705]  __schedule+0x317/0x6d0
[   24.245770]  schedule+0x31/0x80
[   24.245830]  schedule_preempt_disabled+0x9/0x10
[   24.245903]  __mutex_lock.isra.2+0x225/0x470
[   24.245975]  __mutex_lock_slowpath+0xe/0x10
[   24.246110]  ? __mutex_lock_slowpath+0xe/0x10
[   24.246199]  mutex_lock+0x2a/0x30
[   24.246261]  usb_device_read+0xb6/0x140
[   24.246325]  full_proxy_read+0x4f/0x90
[   24.246394]  __vfs_read+0x23/0x120
[   24.246456]  ? security_file_permission+0x96/0xb0
[   24.246533]  ? rw_verify_area+0x49/0xb0
[   24.246593]  vfs_read+0x8e/0x130
[   24.246646]  SyS_read+0x41/0xa0
[   24.246698]  entry_SYSCALL_64_fastpath+0x13/0x94

^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2017-11-07 14:29 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-10-12 20:01 deadlock in debugfs synchronize_srcu() when unplugging USB Tyler Hall
2017-10-12 21:04 ` Paul E. McKenney
2017-10-16  7:51 ` Greg Kroah-Hartman
2017-10-16  8:15   ` Johannes Berg
2017-10-16  8:31     ` Greg Kroah-Hartman
2017-10-16 15:08       ` Tyler Hall
2017-10-30 23:15         ` [PATCH v3 0/8] debugfs: per-file removal protection Nicolai Stange
2017-10-30 23:15           ` [PATCH v3 1/8] debugfs: add support for more elaborate ->d_fsdata Nicolai Stange
2017-10-30 23:15           ` [PATCH v3 2/8] debugfs: implement per-file removal protection Nicolai Stange
2017-10-30 23:15           ` [PATCH v3 3/8] debugfs: debugfs_real_fops(): drop __must_hold sparse annotation Nicolai Stange
2017-10-30 23:15           ` [PATCH v3 4/8] debugfs: convert to debugfs_file_get() and -put() Nicolai Stange
2017-10-30 23:15           ` [PATCH v3 5/8] IB/hfi1: " Nicolai Stange
2017-11-07 14:29             ` Dennis Dalessandro
2017-10-30 23:15           ` [PATCH v3 6/8] debugfs: purge obsolete SRCU based removal protection Nicolai Stange
2017-10-30 23:15           ` [PATCH v3 7/8] debugfs: call debugfs_real_fops() only after debugfs_file_get() Nicolai Stange
2017-10-30 23:15           ` [PATCH v3 8/8] debugfs: defer debugfs_fsdata allocation to first usage Nicolai Stange

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox