* Race condition between userland and USB device attachment
From: Sergey Klyaus @ 2014-07-25 13:43 UTC
  To: linux-kernel; +Cc: Sergey Klyaus

Hello.

I am currently working on a project with thin clients running Citrix
Receiver 13 for Linux, and I have encountered an interesting problem
with USB device redirection.

The ctxusb/ctxusbd processes from Citrix Receiver use the inotify
mechanism to monitor the /dev/bus/usb filesystem. When a device arrives,
they try to open it but get ENODEV:

Jul 25 11:36:13 myaut-desktop ctxusbd[2664]: Failed to open device: No such device
Jul 25 11:36:13 myaut-desktop ctxusb[2751]: Failed to open device 001:003 (error 19 - No such device), bad id?

This is caused by the design of device_add(): it calls
devtmpfs_create_node() before bus_add_device(). Here is the sequence of
events:

1. device_add() calls devtmpfs_create_node(), which generates an inotify
   event.
2. ctxusb is woken by the inotify event and calls the ctxusbd daemon.
3. The ctxusbd daemon opens /dev/bus/usb/new-device, so usbdev_open() is
   called.
4. usbdev_open() calls usbdev_lookup_by_devt(). Because the device is not
   yet attached to the USB bus, the lookup returns NULL, and
   usbdev_open() returns -ENODEV.
5. Finally, device_add() calls bus_add_device(), and all subsequent calls
   to usbdev_open() succeed. However, ctxusb/ctxusbd have already
   reported an error and abandoned the device. The user is unsatisfied.

I was able to reproduce the issue on Ubuntu 10.04 with the 2.6.32 and
3.13 kernels. However, it only occurs on uniprocessor systems (!)

I see three ways to solve the issue:

1. Leave it to userland applications (e.g. a loop with retries and
   timeouts). However, I feel that it is a kernel issue: the application
   is notified before the device is ready.
2. Call bus_add_device() before devtmpfs_create_node(). Very rough, and
   it would probably break a lot of other kernel code.
3. Wait in usbdev_open() until reconfiguration is finished (e.g. by
   using some global lock between usb_new_device() and usbdev_open(), or
   by adding a completion and a special USB_STATE_CONNECTING state to
   the device).

P.S. Since I am not subscribed to the mailing list, could you add me to
CC? Thanks in advance.

Best Regards,
Sergey.

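
The first option in the report (retrying in userland) could look roughly like the sketch below. It is only an illustration: open_retry_enodev() and its parameters are hypothetical names, not part of Citrix Receiver or of any library, and the retry count and delay are arbitrary. It retries only on ENODEV, the error the report attributes to usbdev_open() being called before bus_add_device() has run.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper: open `path`, retrying only while open() fails
 * with ENODEV. Any other error is returned immediately.
 * Returns a file descriptor, or -1 with errno set by the last open().
 */
int open_retry_enodev(const char *path, int flags,
		      int max_attempts, long delay_ms)
{
	struct timespec delay;
	int attempt, fd;

	assert(path != NULL && max_attempts > 0 && delay_ms >= 0);
	delay.tv_sec = delay_ms / 1000;
	delay.tv_nsec = (delay_ms % 1000) * 1000000L;

	for (attempt = 1; ; attempt++) {
		fd = open(path, flags);
		if (fd >= 0 || errno != ENODEV || attempt == max_attempts)
			return fd;
		/* Assumption: a short sleep lets device registration finish. */
		nanosleep(&delay, NULL);
	}
}
```

A caller reacting to an inotify event might use something like open_retry_enodev(node_path, O_RDWR, 10, 20), where node_path is whatever device node the event named. This works around the race rather than fixing it, which is why the thread goes on to discuss a kernel-side change.
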
* Re: Race condition between userland and USB device attachment
From: Oliver Neukum @ 2014-07-28 9:31 UTC
  To: Sergey Klyaus; +Cc: linux-kernel, Alan Stern

On Fri, 2014-07-25 at 17:43 +0400, Sergey Klyaus wrote:
> [...]
> I see three ways to solve the issue:
> 1. Leave it to userland applications (e.g. a loop with retries and
> timeouts). However, I feel that it is a kernel issue: the application
> is notified before the device is ready.
> 2. Call bus_add_device() before devtmpfs_create_node(). Very rough, and
> it would probably break a lot of other kernel code.
> 3. Wait in usbdev_open() until reconfiguration is finished (e.g. by
> using some global lock between usb_new_device() and usbdev_open(), or
> by adding a completion and a special USB_STATE_CONNECTING state to
> the device).

No to your third option. This is not a USB problem; the issue is in the
generic code. The only clean fix is your suggestion (2).

	Regards
		Oliver

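
The ordering argument can be illustrated with a toy userspace model. This is only a sketch under stated assumptions: every name below (toy_device, toy_open, and so on) is a hypothetical stand-in rather than kernel API, and the watcher is assumed to open the node at the very instant it appears, which is the worst case described in the report.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; none of these names exist in the kernel. */
struct toy_device {
	bool on_bus;		/* set by the stand-in for bus_add_device() */
	bool node_visible;	/* set by the stand-in for devtmpfs_create_node() */
	int watcher_result;	/* what the watcher got when it opened the node */
};

/* Stand-in for usbdev_open(): fails until the device is on the bus. */
int toy_open(const struct toy_device *dev)
{
	assert(dev != NULL);
	if (!dev->node_visible)
		return -ENOENT;
	return dev->on_bus ? 0 : -ENODEV;
}

/* Making the node visible wakes the watcher, which opens it at once. */
static void toy_create_node(struct toy_device *dev)
{
	dev->node_visible = true;
	dev->watcher_result = toy_open(dev);
}

static void toy_bus_add(struct toy_device *dev)
{
	dev->on_bus = true;
}

/* Order described in the report: node first, bus second. */
int toy_add_node_first(struct toy_device *dev)
{
	toy_create_node(dev);
	toy_bus_add(dev);
	return dev->watcher_result;
}

/* Order proposed as suggestion (2): bus first, node last. */
int toy_add_node_last(struct toy_device *dev)
{
	toy_bus_add(dev);
	toy_create_node(dev);
	return dev->watcher_result;
}
```

In this model the watcher sees -ENODEV with the first ordering and 0 with the second, while a later toy_open() succeeds in both cases, matching step 5 of the report.
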
* Re: Race condition between userland and USB device attachment
From: Alan Stern @ 2014-07-28 14:22 UTC
  To: Oliver Neukum; +Cc: Sergey Klyaus, linux-kernel

On Mon, 28 Jul 2014, Oliver Neukum wrote:

> On Fri, 2014-07-25 at 17:43 +0400, Sergey Klyaus wrote:
> > [...]
> > 2. Call bus_add_device() before devtmpfs_create_node(). Very rough, and
> > it would probably break a lot of other kernel code.
> > [...]
>
> No to your third option. This is not a USB problem; the issue is in the
> generic code. The only clean fix is your suggestion (2).

I agree. That whole "if (MAJOR(dev->devt)) {" thing in device_add()
should come at the end, not in the middle.

Alan Stern

* [PATCH] driver core: fix race with userland in device_add()
From: Sergey Klyaus @ 2014-08-06 17:38 UTC
  To: Alan Stern, Oliver Neukum; +Cc: linux-kernel, Greg Kroah-Hartman

On 07/28/2014 06:22 PM, Alan Stern wrote:
> On Mon, 28 Jul 2014, Oliver Neukum wrote:
>> [...]
>> No to your third option. This is not a USB problem; the issue is in the
>> generic code. The only clean fix is your suggestion (2).
> I agree. That whole "if (MAJOR(dev->devt)) {" thing in device_add()
> should come at the end, not in the middle.
>
> Alan Stern

Hello.

I wrote a patch that fixes the problem described above; here is the
patch for the 3.16.0+ kernel (cloned from GitHub today). Maybe the
"if (MAJOR(dev->devt))" part has to go even after BUS_NOTIFY_ADD_DEVICE
and KOBJ_ADD? I put it before them because there is no rollback code in
device_add() for that part.

Here is the patch:

bus_add_device() should be called before devtmpfs_create_node(), so that
a userland application opening the device from devtmpfs does not get
ENODEV from the kernel because device_add() has not completed.

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 20da3ad..cc84ba8 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1019,18 +1029,6 @@ int device_add(struct device *dev)
 	if (error)
 		goto attrError;
 
-	if (MAJOR(dev->devt)) {
-		error = device_create_file(dev, &dev_attr_dev);
-		if (error)
-			goto ueventattrError;
-
-		error = device_create_sys_dev_entry(dev);
-		if (error)
-			goto devtattrError;
-
-		devtmpfs_create_node(dev);
-	}
-
 	error = device_add_class_symlinks(dev);
 	if (error)
 		goto SymlinkError;
@@ -1044,7 +1042,19 @@ int device_add(struct device *dev)
 	if (error)
 		goto DPMError;
 	device_pm_add(dev);
-
+
+	if (MAJOR(dev->devt)) {
+		error = device_create_file(dev, &dev_attr_dev);
+		if (error)
+			goto DevAttrError;
+
+		error = device_create_sys_dev_entry(dev);
+		if (error)
+			goto SysEntryError;
+
+		devtmpfs_create_node(dev);
+	}
+
 	/* Notify clients of device addition.  This call must come
 	 * after dpm_sysfs_add() and before kobject_uevent().
 	 */
@@ -1074,6 +1084,12 @@ int device_add(struct device *dev)
 done:
 	put_device(dev);
 	return error;
+ SysEntryError:
+	if (MAJOR(dev->devt))
+		device_remove_file(dev, &dev_attr_dev);
+ DevAttrError:
+	device_pm_remove(dev);
+	dpm_sysfs_remove(dev);
  DPMError:
 	bus_remove_device(dev);
  BusError:
@@ -1081,14 +1097,6 @@ done:
  AttrsError:
 	device_remove_class_symlinks(dev);
  SymlinkError:
-	if (MAJOR(dev->devt))
-		devtmpfs_delete_node(dev);
-	if (MAJOR(dev->devt))
-		device_remove_sys_dev_entry(dev);
- devtattrError:
-	if (MAJOR(dev->devt))
-		device_remove_file(dev, &dev_attr_dev);
- ueventattrError:
 	device_remove_file(dev, &dev_attr_uevent);
  attrError:
 	kobject_uevent(&dev->kobj, KOBJ_REMOVE);

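
The label changes in the patch follow the common kernel convention that error labels undo setup steps in reverse order, so moving a setup step later in a function means moving its cleanup earlier in the label chain. A minimal userspace sketch of that convention follows; toy_add(), step(), note() and toy_trace() are hypothetical names for illustration and are not taken from drivers/base/core.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical trace buffer: one character per setup or teardown step. */
static char trace[32];

static void note(char c)
{
	size_t n = strlen(trace);

	assert(n + 1 < sizeof(trace));
	trace[n] = c;
	trace[n + 1] = '\0';
}

/* A toy setup step: succeeds unless its number equals fail_at. */
static int step(int nr, int fail_at, char mark)
{
	if (nr == fail_at)
		return -1;
	note(mark);
	return 0;
}

/*
 * Three setup steps, undone in reverse order on failure.
 * Upper case marks a setup step, lower case its matching teardown.
 */
int toy_add(int fail_at)
{
	int error;

	trace[0] = '\0';

	error = step(1, fail_at, 'A');
	if (error)
		goto done;
	error = step(2, fail_at, 'B');
	if (error)
		goto undo_a;
	error = step(3, fail_at, 'C');
	if (error)
		goto undo_b;
	return 0;

undo_b:
	note('b');
undo_a:
	note('a');
done:
	return error;
}

const char *toy_trace(void)
{
	return trace;
}
```

Each failing step jumps to the label that undoes only what has already succeeded. By the same reasoning, once the devt block runs after device_pm_add() in the patch, its error labels have to undo device_pm_add() and dpm_sysfs_add() before falling through to DPMError.
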
* Re: [PATCH] driver core: fix race with userland in device_add()
From: Alan Stern @ 2014-08-06 20:18 UTC
  To: Sergey Klyaus; +Cc: Oliver Neukum, linux-kernel, Greg Kroah-Hartman

On Wed, 6 Aug 2014, Sergey Klyaus wrote:

> Hello.
>
> I wrote a patch that fixes the problem described above; here is the
> patch for the 3.16.0+ kernel (cloned from GitHub today). Maybe the
> "if (MAJOR(dev->devt))" part has to go even after BUS_NOTIFY_ADD_DEVICE
> and KOBJ_ADD? I put it before them because there is no rollback code in
> device_add() for that part.

I think this is fine. However, I suspect the order of the other calls
there isn't totally right. For instance, the

	if (parent)
		klist_add_tail(&dev->p->knode_parent,
			       &parent->p->klist_children);

part should probably be the first thing after we know the routine can't
abort.

I guess the time when bus_probe_device() gets called doesn't matter
much, because the driver might not even be loaded at this point. But
what about all the dev->class stuff at the end of device_add()? Should
that happen before any uevents are sent out?

Greg, have you looked at this?

Alan Stern

> Here is the patch:
>
> bus_add_device() should be called before devtmpfs_create_node(), so that
> a userland application opening the device from devtmpfs does not get
> ENODEV from the kernel because device_add() has not completed.
>
> [...]

* Re: [PATCH] driver core: fix race with userland in device_add()
From: Greg Kroah-Hartman @ 2014-08-09 14:20 UTC
  To: Alan Stern; +Cc: Sergey Klyaus, Oliver Neukum, linux-kernel

On Wed, Aug 06, 2014 at 04:18:38PM -0400, Alan Stern wrote:
> On Wed, 6 Aug 2014, Sergey Klyaus wrote:
>
> > Hello.
> >
> > I wrote a patch that fixes the problem described above; here is the
> > patch for the 3.16.0+ kernel (cloned from GitHub today). Maybe the
> > "if (MAJOR(dev->devt))" part has to go even after BUS_NOTIFY_ADD_DEVICE
> > and KOBJ_ADD? I put it before them because there is no rollback code in
> > device_add() for that part.
>
> I think this is fine. However, I suspect the order of the other calls
> there isn't totally right. For instance, the
>
> 	if (parent)
> 		klist_add_tail(&dev->p->knode_parent,
> 			       &parent->p->klist_children);
>
> part should probably be the first thing after we know the routine can't
> abort.
>
> I guess the time when bus_probe_device() gets called doesn't matter
> much, because the driver might not even be loaded at this point. But
> what about all the dev->class stuff at the end of device_add()? Should
> that happen before any uevents are sent out?
>
> Greg, have you looked at this?

I haven't, thanks for pointing it out. I'll put it on my list of things
to do this week.

greg k-h

* Re: [PATCH] driver core: fix race with userland in device_add()
From: Greg Kroah-Hartman @ 2014-09-08 22:53 UTC
  To: Alan Stern; +Cc: Sergey Klyaus, Oliver Neukum, linux-kernel

On Wed, Aug 06, 2014 at 04:18:38PM -0400, Alan Stern wrote:
> On Wed, 6 Aug 2014, Sergey Klyaus wrote:
> [...]
> Greg, have you looked at this?

I haven't, given that it's not in a format that I could apply it in,
even if I wanted to :(

greg k-h

Thread overview: 7+ messages

2014-07-25 13:43 Race condition between userland and USB device attachment Sergey Klyaus
2014-07-28  9:31 ` Oliver Neukum
2014-07-28 14:22   ` Alan Stern
2014-08-06 17:38     ` [PATCH] driver core: fix race with userland in device_add() Sergey Klyaus
2014-08-06 20:18       ` Alan Stern
2014-08-09 14:20         ` Greg Kroah-Hartman
2014-09-08 22:53         ` Greg Kroah-Hartman