From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
David Miller <davem@davemloft.net>,
Bart Van Assche <Bart.VanAssche@sandisk.com>,
Christoph Hellwig <hch@infradead.org>,
jbaron@akamai.com,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
sagi@grimberg.me, Sathya Prakash <sathya.prakash@broadcom.com>,
Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>,
Hannes Reinecke <hare@suse.de>,
"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
Christoph Hellwig <hch@lst.de>,
Chaitra Basappa <chaitra.basappa@broadcom.com>,
dledford@redhat.com
Subject: Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands
Date: Tue, 17 Jan 2017 06:44:14 -0800
Message-ID: <1484664254.2433.23.camel@HansenPartnership.com>
In-Reply-To: <CAK=zhgpFKLWUpQMp8VOfmHsUCQiF6E54FBP2juaHYNVvR+TOmw@mail.gmail.com>
On Tue, 2017-01-17 at 19:43 +0530, Sreekanth Reddy wrote:
> On Tue, Jan 17, 2017 at 1:31 AM, James Bottomley
> <James.Bottomley@hansenpartnership.com> wrote:
> > From 91d249409546569444897a1ffde65c421e064899 Mon Sep 17 00:00:00 2001
> > From: James Bottomley <James.Bottomley@HansenPartnership.com>
> > Date: Sun, 1 Jan 2017 09:39:24 -0800
> > Subject: [PATCH] scsi: mpt3sas: fix hang on ata passthrough commands
> >
> > mpt3sas has a firmware failure where it can only handle one pass
> > through ATA command at a time. If another comes in, contrary to the
> > SAT standard, it will hang until the first one completes (causing long
> > commands like secure erase to timeout). The original fix was to block
> > the device when an ATA command came in, but this caused a regression
> > with
> >
> > commit 669f044170d8933c3d66d231b69ea97cb8447338
> > Author: Bart Van Assche <bart.vanassche@sandisk.com>
> > Date: Tue Nov 22 16:17:13 2016 -0800
> >
> > scsi: srp_transport: Move queuecommand() wait code to SCSI core
> >
> > So fix the original fix of the secure erase timeout by properly
> > returning SAM_STAT_BUSY like the SAT recommends. The original patch
> > also had a concurrency problem since scsih_qcmd is lockless at that
> > point (this is fixed by using atomic bitops to set and test the
> > flag).
> >
> > Fixes: 18f6084a989ba1b (mpt3sas: Fix secure erase premature termination)
> > Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
> >
> > ---
> >
> > v2 - use bitops for lockless atomicity
> > v3 - update description, change function name
> > ---
> > drivers/scsi/mpt3sas/mpt3sas_base.h | 12 +++++++++++
> > drivers/scsi/mpt3sas/mpt3sas_scsih.c | 40 +++++++++++++++++++++++-------------
> > 2 files changed, 38 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
> > index 394fe13..dcb33f4 100644
> > --- a/drivers/scsi/mpt3sas/mpt3sas_base.h
> > +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
> > @@ -393,6 +393,7 @@ struct MPT3SAS_TARGET {
> > * @eedp_enable: eedp support enable bit
> > * @eedp_type: 0(type_1), 1(type_2), 2(type_3)
> > * @eedp_block_length: block size
> > + * @ata_command_pending: SATL passthrough outstanding for device
> > */
> > struct MPT3SAS_DEVICE {
> > struct MPT3SAS_TARGET *sas_target;
> > @@ -404,6 +405,17 @@ struct MPT3SAS_DEVICE {
> > u8 ignore_delay_remove;
> > /* Iopriority Command Handling */
> > u8 ncq_prio_enable;
> > + /*
> > + * Bug workaround for SATL handling: the mpt2/3sas firmware
> > + * doesn't return BUSY or TASK_SET_FULL for subsequent
> > + * commands while a SATL pass through is in operation as the
> > + * spec requires, it simply does nothing with them until the
> > + * pass through completes, causing them possibly to timeout if
> > + * the passthrough is a long executing command (like format or
> > + * secure erase). This variable allows us to do the right
> > + * thing while a SATL command is pending.
> > + */
> > + unsigned long ata_command_pending;
> >
> > };
> >
> > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > index b5c966e..830e2c1 100644
> > --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > @@ -3899,9 +3899,18 @@ _scsih_temp_threshold_events(struct MPT3SAS_ADAPTER *ioc,
> > }
> > }
> >
> > -static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
> > +static int _scsih_set_satl_pending(struct scsi_cmnd *scmd, bool pending)
> > {
> > - return (scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] == ATA_16);
> > + struct MPT3SAS_DEVICE *priv = scmd->device->hostdata;
> > +
> > + if (scmd->cmnd[0] != ATA_12 && scmd->cmnd[0] != ATA_16)
> > + return 0;
> > +
> > + if (pending)
> > + return test_and_set_bit(0, &priv->ata_command_pending);
> > +
> > + clear_bit(0, &priv->ata_command_pending);
> > + return 0;
> > }
> >
> > /**
> > @@ -3925,9 +3934,7 @@ _scsih_flush_running_cmds(struct MPT3SAS_ADAPTER *ioc)
> > if (!scmd)
> > continue;
> > count++;
> > - if (ata_12_16_cmd(scmd))
> > - scsi_internal_device_unblock(scmd->device,
> > - SDEV_RUNNING);
> > + _scsih_set_satl_pending(scmd, false);
> > mpt3sas_base_free_smid(ioc, smid);
> > scsi_dma_unmap(scmd);
> > if (ioc->pci_error_recovery)
> > @@ -4063,13 +4070,6 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
> > if (ioc->logging_level & MPT_DEBUG_SCSI)
> > scsi_print_command(scmd);
> >
> > - /*
> > - * Lock the device for any subsequent command until command is
> > - * done.
> > - */
> > - if (ata_12_16_cmd(scmd))
> > - scsi_internal_device_block(scmd->device);
> > -
> > sas_device_priv_data = scmd->device->hostdata;
> > if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
> > scmd->result = DID_NO_CONNECT << 16;
> > @@ -4083,6 +4083,19 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
> > return 0;
> > }
> >
> > + /*
> > + * Bug work around for firmware SATL handling. The loop
> > + * is based on atomic operations and ensures consistency
> > + * since we're lockless at this point
> > + */
> > + do {
> > + if (sas_device_priv_data->ata_command_pending) {
> > + scmd->result = SAM_STAT_BUSY;
> > + scmd->scsi_done(scmd);
> > + return 0;
> > + }
> > + } while (_scsih_set_satl_pending(scmd, true));
> > +
>
> [Sreekanth] Just for readability purposes, can we use
> "if (test_bit(0, &sas_device_priv_data->ata_command_pending))"
> instead of "if (sas_device_priv_data->ata_command_pending)",
> since we are using atomic bit operations when setting and clearing
> the bit? I don't see any issue functionality-wise.
It's taste I suppose. I like the idea of exposing a true or false
value that can be read but which uses bitops under the cover to ensure
atomicity.
The clincher for why not is probably that if you want another go
around, I'm just about to get on a 'plane, so I won't get to it for a
while and Linus will likely revert the other patch if we don't get a
fix in before -rc4.
James
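
For readers following the plain-read versus test_bit() point above, here
is a rough userspace sketch of the same check-then-claim loop. It is not
the driver code: it assumes C11 <stdatomic.h> in place of the kernel
bitops, and struct demo_device, demo_set_pending() and demo_queue() are
hypothetical names made up for illustration. The is_ata flag stands in
for the ATA_12/ATA_16 opcode test in the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-device private data in the patch. */
struct demo_device {
	atomic_ulong ata_command_pending;
};

/*
 * Rough analogue of _scsih_set_satl_pending(): only passthrough commands
 * touch the flag. When setting, the previous value of bit 0 is returned,
 * which is what the kernel's test_and_set_bit() reports.
 */
static bool demo_set_pending(struct demo_device *dev, bool is_ata, bool pending)
{
	if (!is_ata)
		return false;

	if (pending)
		return atomic_fetch_or(&dev->ata_command_pending, 1UL) & 1UL;

	atomic_fetch_and(&dev->ata_command_pending, ~1UL);
	return false;
}

/*
 * Rough analogue of the loop added to scsih_qcmd(). Returns true if the
 * command may be issued, false if it should be completed with BUSY.
 * If another thread wins the race between the read and the atomic set,
 * the set reports the bit as already taken, the loop re-reads the flag,
 * and the command is rejected.
 */
static bool demo_queue(struct demo_device *dev, bool is_ata)
{
	do {
		if (atomic_load(&dev->ata_command_pending) & 1UL)
			return false;
	} while (demo_set_pending(dev, is_ata, true));

	return true;
}
```

A completion path would call demo_set_pending(dev, is_ata, false) to
release the flag, mirroring what the patch does in
_scsih_flush_running_cmds(). Whether the flag is read plainly or through
a test_bit()-style helper, the decision that matters is made by the
atomic set in the loop condition.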