From mboxrd@z Thu Jan 1 00:00:00 1970 From: James Bottomley Subject: Re: [BUG] 2.6.39.1 crash in scsi_dispatch_cmd() Date: Wed, 06 Jul 2011 09:24:09 -0500 Message-ID: <1309962249.3282.1.camel@mulgrave> References: Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit Return-path: Received: from bedivere.hansenpartnership.com ([66.63.167.143]:40099 "EHLO bedivere.hansenpartnership.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752989Ab1GFOYM (ORCPT ); Wed, 6 Jul 2011 10:24:12 -0400 In-Reply-To: Sender: linux-scsi-owner@vger.kernel.org List-Id: linux-scsi@vger.kernel.org To: Alan Stern Cc: Roland Dreier , Heiko Carstens , Jens Axboe , linux-scsi@vger.kernel.org, Steffen Maier , "Manvanthara B. Puttashankar" , Tarak Reddy , "Seshagiri N. Ippili" On Wed, 2011-07-06 at 10:20 -0400, Alan Stern wrote: > On Wed, 6 Jul 2011, Roland Dreier wrote: > > > Alan Stern's patch looks a bit fishy -- the scsi_free_queue() is moved > > earlier than the > > > > /* cause the request function to reject all I/O requests */ > > sdev->request_queue->queuedata = NULL; > > > > which seems to leave a small window where the use-after-free can > > happen, and it's not clear to me why the scsi_free_queue() has to move > > at all. > > Looks can be deceiving. Although the scsi_free_queue() is higher up in > the source file, it actually runs later than this code. That's because > __scsi_remove_device() -- this code -- gets called when the device is > unregistered from the driver core, whereas > scsi_device_dev_release_usercontext() -- where the scsi_free_queue() is > moved to -- gets called when the last reference to the device is > dropped. > > Now, one of the things I'm not sure about (it would nice if James would > pick up this thread again and comment) is whether queuedata should be > set to NULL at unregistration time or later on, when the device and the > queue are about to be freed. Sorry, higher priority problems at the moment. Sorry about the ->queuedata cockup, was thinking of sdev->request_queue. Moving the queue free is wrong ... it recently moved to fix another oops. Problem most likely missing block guards on blk_execute_req() ... no check for QUEUE_DEAD. Will be back on Thursday. James