* Linux 3.0 STILL dies on USB device hotplug - please merge fix ASAP
@ 2011-07-22 17:02 Andi Kleen
2011-07-22 18:23 ` Jonathan McDowell
2011-07-22 19:52 ` James Bottomley
0 siblings, 2 replies; 5+ messages in thread
From: Andi Kleen @ 2011-07-22 17:02 UTC (permalink / raw)
To: James.Bottomley, stern, linux-kernel, linux-scsi, torvalds
Cc: stable, Dan Williams
Hi,
3.0 still oopses and dies immediately on USB device hot unplug.
The same problem also triggers with SAS devices, according to Dan.
There was a lot of debugging on this a few weeks back and Alan Stern
posted a SCSI layer patch that fixed the problem (for both USB
and SAS):
http://68.183.106.108/lists/linux-usb/msg49001.html
But for some reason that patch didn't make it into 3.0, and 3.0 still
happily oopses just as the RCs did.
Can you please merge this patch ASAP? This should also go to stable.
At least for me it makes pure 3.0 very risky to use, because these USB
hotunplug events are not uncommon and I end up with a dead machine.
Thanks,
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
* Re: Linux 3.0 STILL dies on USB device hotplug - please merge fix ASAP
From: Jonathan McDowell @ 2011-07-22 18:23 UTC (permalink / raw)
To: Andi Kleen
Cc: James.Bottomley, stern, linux-kernel, linux-scsi, torvalds,
stable, Dan Williams
On Fri, Jul 22, 2011 at 07:02:14PM +0200, Andi Kleen wrote:
> 3.0 still oopses and dies immediately on USB device hot unplug.
> The same problem also triggered with SAS device according to Dan.
>
> There was a lot of debugging on this a few weeks back and Alan Stern
> posted a SCSI layer patch that fixed the problem (for both USB
> and SAS):
>
> http://68.183.106.108/lists/linux-usb/msg49001.html
It also affects disappearing Fibre Channel devices and Alan's patch
fixes that too.
J.
--
One-seventh of your life is spent on Monday.
* Re: Linux 3.0 STILL dies on USB device hotplug - please merge fix ASAP
From: James Bottomley @ 2011-07-22 19:52 UTC (permalink / raw)
To: Andi Kleen
Cc: stern, linux-kernel, linux-scsi, torvalds, stable, Dan Williams
On Fri, 2011-07-22 at 19:02 +0200, Andi Kleen wrote:
> Hi,
>
> 3.0 still oopses and dies immediately on USB device hot unplug.
> The same problem also triggered with SAS device according to Dan.
>
> There was a lot of debugging on this a few weeks back and Alan Stern
> posted a SCSI layer patch that fixed the problem (for both USB
> and SAS):
>
> http://68.183.106.108/lists/linux-usb/msg49001.html
>
> But for some reason that patch didn't make it into 3.0 and 3.0 still
> happily oopses as the RC*s.
>
> Can you please merge this patch ASAP? This should also go to stable.
>
> At least for me it makes pure 3.0 very risky to use, because these USB
> hotunplug events are not uncommon and I end up with a dead machine.
Like I said at the time, the patch is wrong because of the relocation of
the queue teardown. I posted a corrected version, but did anyone test?
Anyway, I merged it on the grounds that it worked for me, but if you
could confirm with linux-next, that would be great. (I'll send the pull
request shortly, since it now needs to go via the merge window and I'm
currently in mid journey to Russia.)
James
* Re: Linux 3.0 STILL dies on USB device hotplug - please merge fix ASAP
From: Alan Stern @ 2011-07-22 20:19 UTC (permalink / raw)
To: James Bottomley
Cc: Andi Kleen, linux-kernel, linux-scsi, torvalds, stable,
Dan Williams
On Fri, 22 Jul 2011, James Bottomley wrote:
> On Fri, 2011-07-22 at 19:02 +0200, Andi Kleen wrote:
> > Hi,
> >
> > 3.0 still oopses and dies immediately on USB device hot unplug.
> > The same problem also triggered with SAS device according to Dan.
> >
> > There was a lot of debugging on this a few weeks back and Alan Stern
> > posted a SCSI layer patch that fixed the problem (for both USB
> > and SAS):
> >
> > http://68.183.106.108/lists/linux-usb/msg49001.html
> >
> > But for some reason that patch didn't make it into 3.0 and 3.0 still
> > happily oopses as the RC*s.
> >
> > Can you please merge this patch ASAP? This should also go to stable.
> >
> > At least for me it makes pure 3.0 very risky to use, because these USB
> > hotunplug events are not uncommon and I end up with a dead machine.
>
> Like I said at the time, the patch is wrong because of the relocation of
> the queue teardown.
That argument doesn't seem right. The queue teardown (i.e., the call
to scsi_free_queue()) was moved by commit 86cbfb5607d4b81b ([SCSI] put
stricter guards on queue dead checks). Here's the changelog:
    SCSI uses request_queue->queuedata == NULL as a signal that the queue
    is dying. We set this state in the sdev release function. However,
    this allows a small window where we release the last reference but
    haven't quite got to this stage yet and so something will try to take
    a reference in scsi_request_fn and oops. It's very rare, but we had a
    report here, so we're pushing this as a bug fix

    The actual fix is to set request_queue->queuedata to NULL in
    scsi_remove_device() before we drop the reference. This causes
    correct automatic rejects from scsi_request_fn as people who hold
    additional references try to submit work and prevents anything from
    getting a new reference to the sdev that way.
It's quite evident that the point of the commit was to move the line
setting queue->queuedata to NULL; the scsi_free_queue() call merely
went along for the ride (by mistake perhaps?). I don't see any reason
why moving scsi_free_queue() back to where it was should cause a
problem.
Alan Stern
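The guard described in the changelog quoted above can be sketched in user-space C. This is only a simplified model under stated assumptions, not the kernel code: `model_sdev`, `model_queue`, `model_request_fn()` and `model_remove_device()` are hypothetical stand-ins for the SCSI device, its request queue, scsi_request_fn and scsi_remove_device(). It shows just the ordering the changelog describes (clear queuedata first, then drop the reference) and says nothing about where scsi_free_queue() belongs, which is the point disputed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the SCSI device; only the refcount is modelled. */
struct model_sdev {
    int refcount;
};

/* Hypothetical stand-in for the request queue. */
struct model_queue {
    void *queuedata;   /* NULL is the signal that the queue is dying */
    int dispatched;    /* submissions that reached the device */
    int rejected;      /* submissions turned away by the guard */
};

enum model_status { MODEL_DISPATCHED, MODEL_REJECTED };

/*
 * Models the check made on submission: if queuedata is NULL the queue is
 * dying, so the work is rejected without touching the device.
 */
enum model_status model_request_fn(struct model_queue *q)
{
    struct model_sdev *sdev;

    assert(q != NULL);
    sdev = q->queuedata;
    if (sdev == NULL) {
        q->rejected++;
        return MODEL_REJECTED;
    }
    q->dispatched++;
    return MODEL_DISPATCHED;
}

/*
 * Models the ordering from the changelog: mark the queue as dying first,
 * and only then drop the reference, so a late submitter never sees a
 * live-looking queue whose device is already on its way out.
 */
void model_remove_device(struct model_queue *q, struct model_sdev *sdev)
{
    assert(q != NULL && sdev != NULL);
    q->queuedata = NULL;   /* step 1: signal "dying" */
    sdev->refcount--;      /* step 2: drop the reference */
}
```

In this model, a submission made before `model_remove_device()` is dispatched, and any submission made afterwards is rejected, even by a caller that still holds a pointer to the queue.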
* Re: Linux 3.0 STILL dies on USB device hotplug - please merge fix ASAP
From: Andi Kleen @ 2011-07-22 21:44 UTC (permalink / raw)
To: James Bottomley
Cc: Andi Kleen, stern, linux-kernel, linux-scsi, torvalds, stable,
Dan Williams
> Like I said at the time, the patch is wrong because of the relocation of
> the queue teardown. I posted a corrected version, but did anyone test?
> Anyway, I merged it on the grounds that it worked for me, but if you
> could confirm with linux-next, that would be great (I'll send the pull
linux-next works for me too: no oops on 5 pulls or so.
> request shortly, since it now needs to go via the merge window and I'm
> currently in mid journey to Russia.
I would request that this be fixed in stable ASAP.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.