From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: "Jun'ichi Nomura" <j-nomura@ce.jp.nec.com>,
jaxboe@fusionio.com, roland@purestorage.com,
linux-scsi@vger.kernel.org,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
device-mapper development <dm-devel@redhat.com>,
Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Subject: Re: [BUG] Oops when SCSI device under multipath is removed
Date: Thu, 11 Aug 2011 10:05:31 -0500
Message-ID: <1313075131.4166.10.camel@mulgrave>
In-Reply-To: <Pine.LNX.4.44L0.1108111056540.1958-100000@iolanthe.rowland.org>

On Thu, 2011-08-11 at 10:59 -0400, Alan Stern wrote:
> On Thu, 11 Aug 2011, James Bottomley wrote:
>
> > > If the reason you moved scsi_free_queue into scsi_remove_device
> > > is marking the queue dead, how about the following patch?
> > > Do you think it's acceptable?
> >
> > Well, it's just hiding the problem. The essential problem is that only
> > block has the correctly refcounted knowledge to know the last release of
> > the queue reference. Until that time, the holder of the reference can
> > use the queue regardless of whether blk_cleanup_queue() has been called.
> > This is the race you complain about since use of the queue involves the
> > lock which should be guarded by QUEUE_DEAD checks.
> >
> > This is essentially unfixable with function calls. The only way to fix
> > it is to have a callback model for freeing the external lock.
>
> Assuming the queue is associated with a device, the queue could take a
> reference to the device, dropping that reference when the queue is
> freed. Then the lock could safely be freed at the same time as the
> device.

If that assumption is correct, there's no point refcounting the queue at
all because its use is entirely subordinated to the lifecycle of the
associated device. Plus all the wittering about my previous patch is
pointless, because blk_cleanup_queue() has to do the final put of the
queue in the lock free path (otherwise the assumption is violated).

However, much as I'd like to accept this rosy view, the original oops
that started all of this in 2.6.38 was someone catching something that
held a reference to a SCSI queue after the device release function had
been called.

James
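
To make the "callback model for freeing the external lock" mentioned in the quoted text concrete, here is a minimal userspace sketch. It is not kernel code and not a patch from this thread: struct queue, struct ext_lock, queue_get(), queue_put(), queue_cleanup() and demo_late_holder() are all hypothetical names invented for illustration, and the counter is deliberately non-atomic and single-threaded. The only point it tries to show is that the owner's release callback runs at the last put of the queue, not when cleanup is called, so a late holder of a reference never sees a freed lock.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the lock a driver lends to its queue. */
struct ext_lock {
    int freed;   /* set instead of really freeing, so the demo can inspect it */
};

/* Hypothetical stand-in for a request queue; not the kernel's structure. */
struct queue {
    int refs;                               /* outstanding references (not atomic) */
    int dead;                               /* set by cleanup, like a QUEUE_DEAD flag */
    struct ext_lock *lock;                  /* borrowed from the owner */
    void (*release)(struct ext_lock *lock); /* owner's callback, run at the last put */
};

/* Owner-supplied callback: the only place the external lock is given up. */
static void owner_release_lock(struct ext_lock *lock)
{
    lock->freed = 1;
}

static struct queue *queue_alloc(struct ext_lock *lock,
                                 void (*release)(struct ext_lock *lock))
{
    struct queue *q = calloc(1, sizeof(*q));

    if (!q)
        return NULL;
    q->refs = 1;            /* the owner's initial reference */
    q->lock = lock;
    q->release = release;
    return q;
}

static void queue_get(struct queue *q)
{
    assert(q->refs > 0);
    q->refs++;
}

/* Drop one reference; returns 1 if it was the last and the queue is gone. */
static int queue_put(struct queue *q)
{
    assert(q->refs > 0);
    if (--q->refs > 0)
        return 0;
    if (q->release)
        q->release(q->lock);    /* lock is released only here */
    free(q);
    return 1;
}

/* Owner teardown: mark the queue dead and drop the owner's reference. */
static int queue_cleanup(struct queue *q)
{
    q->dead = 1;
    return queue_put(q);
}

/* Returns 0 if the lock outlives cleanup and is released at the final put. */
int demo_late_holder(void)
{
    struct ext_lock lock = { 0 };
    struct queue *q = queue_alloc(&lock, owner_release_lock);

    if (!q)
        return -1;
    queue_get(q);               /* some other holder takes a reference */
    queue_cleanup(q);           /* owner goes away first */
    if (lock.freed)
        return 1;               /* lock vanished while a reference was held */
    if (!q->dead)
        return 2;               /* late holder must still see the dead flag */
    queue_put(q);               /* late holder drops the last reference */
    return lock.freed ? 0 : 3;
}
```

Calling demo_late_holder() from a small main() exercises the ordering discussed above: cleanup first, final put later. Under these assumptions the owner never has to guess when the last reference goes away, because the refcounted side tells it through the callback.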