* [PATCH RESEND^2] sd: fix infinite kernel/udev loop on non-removable Medium Not Present
@ 2013-04-26 16:39 Steven J. Magnani
2013-04-26 17:10 ` Greg KH
2013-04-26 17:30 ` James Bottomley
0 siblings, 2 replies; 4+ messages in thread
From: Steven J. Magnani @ 2013-04-26 16:39 UTC (permalink / raw)
To: James E.J. Bottomley
Cc: Tejun Heo, linux-kernel, linux-scsi, stable, Steven J. Magnani
Commit eface65c336eff420d70beb0fb6787a732e05ffb (2.6.38) altered
set_media_not_present() in a way that prevents the sd driver from
remembering that a non-removable device has reported "Medium Not Present".
This condition can occur on hotplug of a (i.e.) USB Mass Storage device
whose medium is offline due to an unrecoverable controller error,
but which is otherwise capable of SCSI communication (to download new
microcode, etc.).
Under these conditions, the changed code results in an infinite loop
between the kernel and udevd. When udevd attempts to open the device
in response to a change notification, a SCSI "Medium Not Present" error
occurs which causes the kernel to signal another change. The cycle
repeats until the device is unplugged, resulting in udevd consuming ever-
increasing amounts of CPU and virtual memory.
Resolve this by remembering "media not present" whether the device has
declared itself "removable" or not.
Signed-off-by: Steven J. Magnani <steve@digidescorp.com>
---
--- a/drivers/scsi/sd.c 2013-04-12 14:16:12.252531097 -0500
+++ b/drivers/scsi/sd.c 2013-04-12 14:21:55.197216521 -0500
@@ -1298,10 +1298,8 @@ out:
static void set_media_not_present(struct scsi_disk *sdkp)
{
- if (sdkp->media_present)
+ if (sdkp->media_present) {
sdkp->device->changed = 1;
-
- if (sdkp->device->removable) {
sdkp->media_present = 0;
sdkp->capacity = 0;
}
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH RESEND^2] sd: fix infinite kernel/udev loop on non-removable Medium Not Present 2013-04-26 16:39 [PATCH RESEND^2] sd: fix infinite kernel/udev loop on non-removable Medium Not Present Steven J. Magnani @ 2013-04-26 17:10 ` Greg KH 2013-04-26 17:30 ` James Bottomley 1 sibling, 0 replies; 4+ messages in thread From: Greg KH @ 2013-04-26 17:10 UTC (permalink / raw) To: Steven J. Magnani Cc: James E.J. Bottomley, Tejun Heo, linux-kernel, linux-scsi, stable On Fri, Apr 26, 2013 at 11:39:33AM -0500, Steven J. Magnani wrote: > Commit eface65c336eff420d70beb0fb6787a732e05ffb (2.6.38) altered > set_media_not_present() in a way that prevents the sd driver from > remembering that a non-removable device has reported "Medium Not Present". > This condition can occur on hotplug of a (i.e.) USB Mass Storage device > whose medium is offline due to an unrecoverable controller error, > but which is otherwise capable of SCSI communication (to download new > microcode, etc.). > > Under these conditions, the changed code results in an infinite loop > between the kernel and udevd. When udevd attempts to open the device > in response to a change notification, a SCSI "Medium Not Present" error > occurs which causes the kernel to signal another change. The cycle > repeats until the device is unplugged, resulting in udevd consuming ever- > increasing amounts of CPU and virtual memory. > > Resolve this by remembering "media not present" whether the device has > declared itself "removable" or not. > > Signed-off-by: Steven J. Magnani <steve@digidescorp.com> > --- <formletter> This is not the correct way to submit patches for inclusion in the stable kernel tree. Please read Documentation/stable_kernel_rules.txt for how to do this properly. </formletter> ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH RESEND^2] sd: fix infinite kernel/udev loop on non-removable Medium Not Present 2013-04-26 16:39 [PATCH RESEND^2] sd: fix infinite kernel/udev loop on non-removable Medium Not Present Steven J. Magnani 2013-04-26 17:10 ` Greg KH @ 2013-04-26 17:30 ` James Bottomley 2013-04-29 20:45 ` Steven J. Magnani 1 sibling, 1 reply; 4+ messages in thread From: James Bottomley @ 2013-04-26 17:30 UTC (permalink / raw) To: Steven J. Magnani; +Cc: Tejun Heo, linux-kernel, linux-scsi, stable On Fri, 2013-04-26 at 11:39 -0500, Steven J. Magnani wrote: > Commit eface65c336eff420d70beb0fb6787a732e05ffb (2.6.38) altered > set_media_not_present() in a way that prevents the sd driver from > remembering that a non-removable device has reported "Medium Not Present". > This condition can occur on hotplug of a (i.e.) USB Mass Storage device > whose medium is offline due to an unrecoverable controller error, > but which is otherwise capable of SCSI communication (to download new > microcode, etc.). This actually sounds to be a bug somewhere in the device. Only removable devices are supposed to signal medium not present. > Under these conditions, the changed code results in an infinite loop > between the kernel and udevd. When udevd attempts to open the device > in response to a change notification, a SCSI "Medium Not Present" error > occurs which causes the kernel to signal another change. The cycle > repeats until the device is unplugged, resulting in udevd consuming ever- > increasing amounts of CPU and virtual memory. > > Resolve this by remembering "media not present" whether the device has > declared itself "removable" or not. > > Signed-off-by: Steven J. Magnani <steve@digidescorp.com> > --- > --- a/drivers/scsi/sd.c 2013-04-12 14:16:12.252531097 -0500 > +++ b/drivers/scsi/sd.c 2013-04-12 14:21:55.197216521 -0500 > @@ -1298,10 +1298,8 @@ out: > > static void set_media_not_present(struct scsi_disk *sdkp) > { > - if (sdkp->media_present) > + if (sdkp->media_present) { > sdkp->device->changed = 1; > - > - if (sdkp->device->removable) { > sdkp->media_present = 0; > sdkp->capacity = 0; I don't really like this change because it will affect TUR failure as well, which looks like it would then zero the capacity of a non-removable device which we aren't expecting. Can we dig into what's going wrong with the device first. It sounds like it really is a removable device and it just needs to be flagged that way (either that or the USB SAT is so screwed up that we might need to apply other blacklists) James ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH RESEND^2] sd: fix infinite kernel/udev loop on non-removable Medium Not Present 2013-04-26 17:30 ` James Bottomley @ 2013-04-29 20:45 ` Steven J. Magnani 0 siblings, 0 replies; 4+ messages in thread From: Steven J. Magnani @ 2013-04-29 20:45 UTC (permalink / raw) To: James Bottomley; +Cc: Tejun Heo, linux-kernel, linux-scsi, stable On Fri, 2013-04-26 at 10:30 -0700, James Bottomley wrote: On Fri, 2013-04-26 at 11:39 -0500, Steven J. Magnani wrote: > > Commit eface65c336eff420d70beb0fb6787a732e05ffb (2.6.38) altered > > set_media_not_present() in a way that prevents the sd driver from > > remembering that a non-removable device has reported "Medium Not Present". > > This condition can occur on hotplug of a (i.e.) USB Mass Storage device > > whose medium is offline due to an unrecoverable controller error, > > but which is otherwise capable of SCSI communication (to download new > > microcode, etc.). > > This actually sounds to be a bug somewhere in the device. Only > removable devices are supposed to signal medium not present. I don't see that spelled out in the -2 family of standards (SAM-2 / SPC-2 / SBC-2). We're only claiming SPC-2 compliance; perhaps that's something that got clarified in a later version? > > Under these conditions, the changed code results in an infinite loop > > between the kernel and udevd. When udevd attempts to open the device > > in response to a change notification, a SCSI "Medium Not Present" error > > occurs which causes the kernel to signal another change. The cycle > > repeats until the device is unplugged, resulting in udevd consuming ever- > > increasing amounts of CPU and virtual memory. One precondition for the infinite loop that got lost between investigation of the problem and submission of the patch is that the device has to identify itself BOTH as "non-removable" and "write protected". The reason is this clause in sd_open(): if (sdev->removable || sdkp->write_prot) check_disk_change(bdev); ...which causes READ CAPACITY to be issued, to which the device responds MEDIUM NOT PRESENT, which causes set_media_not_present() to be invoked and to signal another "changed" event for userland. Apologies for the oversight. > > > Resolve this by remembering "media not present" whether the device has > > declared itself "removable" or not. > > > > Signed-off-by: Steven J. Magnani <steve@digidescorp.com> > > --- > > --- a/drivers/scsi/sd.c 2013-04-12 14:16:12.252531097 -0500 > > +++ b/drivers/scsi/sd.c 2013-04-12 14:21:55.197216521 -0500 > > @@ -1298,10 +1298,8 @@ out: > > > > static void set_media_not_present(struct scsi_disk *sdkp) > > { > > - if (sdkp->media_present) > > + if (sdkp->media_present) { > > sdkp->device->changed = 1; > > - > > - if (sdkp->device->removable) { > > sdkp->media_present = 0; > > sdkp->capacity = 0; > > I don't really like this change because it will affect TUR failure as > well, which looks like it would then zero the capacity of a > non-removable device which we aren't expecting. I think that concern is easily addressed by tweaking the patch to keep the "sdkp->capacity = 0" statement conditional on sdkp->device->removable. > Can we dig into what's > going wrong with the device first. It sounds like it really is a > removable device and it just needs to be flagged that way (either that > or the USB SAT is so screwed up that we might need to apply other > blacklists) Being a USB Mass Storage device, it is removable from a USB sense, but not I think in a SCSI sense - there is no ejectable cartridge or anything. Commands like PREVENT ALLOW MEDIUM REMOVAL, which identification as removable usually triggers, are nonsensical. The scenario is generally one of these two uncommon use cases: 1. The device has come up in "failsafe mode" because the user damaged the normal "operational" microcode by yanking power during an operational microcode update. Failsafe mode allows for microcode update, and not much else. B. The internal controller has experienced an unrecoverable error (reported with appropriate sense codes) during normal operation and the USB connection has since been cycled (disconnected and reconnected; the device is wall-powered, not bus-powered). I don't think either is invalid. I will stipulate that declaring a block device write-protected when no medium is present does not make a whole lot of sense, and that's something we will address going forward. However, I do think it's a Bad Thing (from a security perspective, at minimum) if simply connecting a device to a machine can bring down the system. The Windows FAT driver has such issues. Surely Linux can do better, at least in this particular case where only a minor tweak is needed, not even the addition of defensive code. Regards, ------------------------------------------------------------------------ Steven J. Magnani "I claim this network for MARS! www.digidescorp.com Earthling, return my space modulator!" #include <standard.disclaimer> ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2013-04-29 20:45 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-04-26 16:39 [PATCH RESEND^2] sd: fix infinite kernel/udev loop on non-removable Medium Not Present Steven J. Magnani 2013-04-26 17:10 ` Greg KH 2013-04-26 17:30 ` James Bottomley 2013-04-29 20:45 ` Steven J. Magnani
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox