From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Bart Van Assche <Bart.VanAssche@sandisk.com>,
"bblock@linux.vnet.ibm.com" <bblock@linux.vnet.ibm.com>
Cc: "linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
"maxg@mellanox.com" <maxg@mellanox.com>,
"israelr@mellanox.com" <israelr@mellanox.com>,
"hare@suse.de" <hare@suse.de>,
"martin.petersen@oracle.com" <martin.petersen@oracle.com>
Subject: Re: [PATCH v3 3/4] sd: Make synchronize cache upon shutdown asynchronous
Date: Fri, 28 Apr 2017 11:45:57 -0700 [thread overview]
Message-ID: <1493405157.15639.31.camel@HansenPartnership.com> (raw)
In-Reply-To: <1493232838.2632.9.camel@sandisk.com>
On Wed, 2017-04-26 at 18:53 +0000, Bart Van Assche wrote:
> On Tue, 2017-04-18 at 08:56 -0700, James Bottomley wrote:
> > The priority has got to be the removal we've been ordered to do
> > rather
> > than waiting around to see if the transport comes back so we can
> > send a
> > final flush.
> >
> > How about this approach. It goes straight to DEL if the device is
> > blocked (skipping CANCEL). This means that all the commands issued
> > in
> > ->shutdown will error in the mid-layer, thus making the removal
> > proceed
> > without being stopped.
> >
> > James
> >
> > ---
> >
> > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > index e5a2d590a104..31171204cfd1 100644
> > --- a/drivers/scsi/scsi_lib.c
> > +++ b/drivers/scsi/scsi_lib.c
> > @@ -2611,7 +2611,6 @@ scsi_device_set_state(struct scsi_device
> > *sdev, enum scsi_device_state state)
> > case SDEV_QUIESCE:
> > case SDEV_OFFLINE:
> > case SDEV_TRANSPORT_OFFLINE:
> > - case SDEV_BLOCK:
> > break;
> > default:
> > goto illegal;
> > @@ -2625,6 +2624,7 @@ scsi_device_set_state(struct scsi_device
> > *sdev, enum scsi_device_state state)
> > case SDEV_OFFLINE:
> > case SDEV_TRANSPORT_OFFLINE:
> > case SDEV_CANCEL:
> > + case SDEV_BLOCK:
> > case SDEV_CREATED_BLOCK:
> > break;
> > default:
> > diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> > index 82dfe07b1d47..788309e307e9 100644
> > --- a/drivers/scsi/scsi_sysfs.c
> > +++ b/drivers/scsi/scsi_sysfs.c
> > @@ -1282,8 +1282,14 @@ void __scsi_remove_device(struct scsi_device
> > *sdev)
> > return;
> >
> > if (sdev->is_visible) {
> > + /*
> > + * If blocked, we go straight to DEL so any
> > commands
> > + * issued during the driver shutdown (like sync
> > cache)
> > + * are errored
> > + */
> > if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0)
> > - return;
> > + if (scsi_device_set_state(sdev, SDEV_DEL)
> > != 0)
> > + return;
> >
> > bsg_unregister_queue(sdev->request_queue);
> > device_unregister(&sdev->sdev_dev);
> >
> >
>
> Hello James,
>
> How about modifying the above patch by adding a mutex to avoid that
> the transition to SDEV_DEL and blocking the request queue happen in
> the wrong order, e.g. as in the attached three patches?
If you've hammered it with your usual testing and it survived, I think
that's enough to prove its a concurrency problem we have to solve, so
the critical section introduction looks good. The only refinement I
think I'd ask for is rather than creating an entirely new mutex, what
about using the host->scan_mutex? It simplifies your code in
__scsi_remove_device because the scan mutex is already held on entry.
I'm also not very happy with a conditional mutex: that's usually
something we try to avoid. I'd prefer an underscore version of
scsi_internal_device_block that doesn't take the mutex and which
mpt3sas uses (and make the non underscore version take the mutex and
call the underscore version).
Thanks,
James
next prev parent reply other threads:[~2017-04-28 18:46 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-04-17 17:34 [PATCH v3 0/4] Avoid that __scsi_remove_device() hangs Bart Van Assche
2017-04-17 17:34 ` [PATCH v3 1/4] Introduce scsi_start_queue() Bart Van Assche
2017-04-17 17:34 ` [PATCH v3 2/4] Introduce scsi_execute_async() Bart Van Assche
2017-04-17 17:34 ` [PATCH v3 3/4] sd: Make synchronize cache upon shutdown asynchronous Bart Van Assche
2017-04-18 14:44 ` Benjamin Block
2017-04-18 15:34 ` Bart Van Assche
2017-04-18 15:56 ` James Bottomley
2017-04-18 16:06 ` Bart Van Assche
2017-04-18 23:47 ` Bart Van Assche
2017-04-18 23:56 ` James Bottomley
2017-04-19 0:02 ` Bart Van Assche
2017-04-19 0:05 ` James Bottomley
2017-04-19 18:42 ` Bart Van Assche
2017-04-20 21:59 ` Bart Van Assche
2017-04-20 22:13 ` James Bottomley
2017-04-20 22:27 ` Bart Van Assche
2017-04-20 22:52 ` Bart Van Assche
2017-04-23 17:28 ` James Bottomley
2017-04-24 21:46 ` Bart Van Assche
2017-04-26 18:53 ` Bart Van Assche
2017-04-28 18:45 ` James Bottomley [this message]
2017-04-24 7:14 ` [lkp-robot] [sd] ab1218235c: INFO:possible_recursive_locking_detected kernel test robot
2017-04-17 17:34 ` [PATCH v3 4/4] Avoid that __scsi_remove_device() hangs Bart Van Assche
2017-04-18 11:58 ` [PATCH v3 0/4] " Israel Rukshin
2017-04-18 15:40 ` Bart Van Assche
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1493405157.15639.31.camel@HansenPartnership.com \
--to=james.bottomley@hansenpartnership.com \
--cc=Bart.VanAssche@sandisk.com \
--cc=bblock@linux.vnet.ibm.com \
--cc=hare@suse.de \
--cc=israelr@mellanox.com \
--cc=linux-scsi@vger.kernel.org \
--cc=martin.petersen@oracle.com \
--cc=maxg@mellanox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox