All of lore.kernel.org
 help / color / mirror / Atom feed
From: Niklas Cassel <Niklas.Cassel@wdc.com>
To: Hannes Reinecke <hare@suse.de>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Damien Le Moal <damien.lemoal@opensource.wdc.com>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"linux-ide@vger.kernel.org" <linux-ide@vger.kernel.org>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>
Subject: Re: [PATCH v2 03/18] scsi: core: allow libata to complete successful commands via EH
Date: Mon, 16 Jan 2023 12:43:38 +0000	[thread overview]
Message-ID: <Y8VGeFSKuIr0nwC3@x1-carbon> (raw)
In-Reply-To: <ab23c5dd-3a61-452c-52c9-43b6b18f2c8e@suse.de>

On Fri, Jan 13, 2023 at 08:57:37AM +0100, Hannes Reinecke wrote:
> On 1/12/23 15:03, Niklas Cassel wrote:
> > In SCSI, we get the sense data as part of the completion, for ATA
> > however, we need to fetch the sense data as an extra step. For an
> > aborted ATA command the sense data is fetched via libata's
> > ->eh_strategy_handler().
> > 
> > For Command Duration Limits policy 0xD:
> > The device shall complete the command without error with the additional
> > sense code set to DATA CURRENTLY UNAVAILABLE.
> > 
> > In order to handle this policy in libata, we intend to send a successful
> > command via SCSI EH, and let libata's ->eh_strategy_handler() fetch the
> > sense data for the good command. This is similar to how we handle an
> > aborted ATA command, just that we need to read the Successful NCQ
> > Commands log instead of the NCQ Command Error log.
> > 
> > When we get a SATA completion with successful commands, ATA_SENSE will
> > be set, indicating that some commands in the completion have sense data.
> > 
> > The sense_valid bitmask in the Sense Data for Successful NCQ Commands
> > log will inform exactly which commands that had sense data, which might
> > be a subset of all the commands that was completed in the same
> > completion. (Yet all will have ATA_SENSE set, since the status is per
> > completion.)
> > 
> > The successful commands that have e.g. a "DATA CURRENTLY UNAVAILABLE"
> > sense data will have a SCSI ML byte set, so scsi_eh_flush_done_q() will
> > not set the scmd->result to DID_TIME_OUT for these commands. However,
> > the successful commands that did not have sense data, must not get their
> > result marked as DID_TIME_OUT by SCSI EH.
> > 
> > Add a new flag SCMD_EH_SUCCESS_CMD, which tells SCSI EH to not mark a
> > command as DID_TIME_OUT, even if it has scmd->result == SAM_STAT_GOOD.
> > 
> > This will be used by libata in a follow-up patch.
> > 
> > Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
> > ---
> >   drivers/scsi/scsi_error.c | 3 ++-
> >   include/scsi/scsi_cmnd.h  | 5 +++++
> >   2 files changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> > index 2aa2c2aee6e7..51aa5c1e31b5 100644
> > --- a/drivers/scsi/scsi_error.c
> > +++ b/drivers/scsi/scsi_error.c
> > @@ -2165,7 +2165,8 @@ void scsi_eh_flush_done_q(struct list_head *done_q)
> >   			 * scsi_eh_get_sense), scmd->result is already
> >   			 * set, do not set DID_TIME_OUT.
> >   			 */
> > -			if (!scmd->result)
> > +			if (!scmd->result &&
> > +			    !(scmd->flags & SCMD_EH_SUCCESS_CMD))
> >   				scmd->result |= (DID_TIME_OUT << 16);
> >   			SCSI_LOG_ERROR_RECOVERY(3,
> >   				scmd_printk(KERN_INFO, scmd,
> Wouldn't it be better to use '!scmd->result && !scsi_sense_valid(scmd)'
> instead of a new flag?
> After all, if we have a valid sense code we _have_ been able to communicate
> with the device. And as we did so it's questionable whether it should count
> as a command time out ...

Hello Hannes,

Thanks a lot for helping out reviewing this series!

Unfortunately, your suggestion won't work.


Let me explain:

When you get a FIS, the ACT register will have a bit set for each
command that finished, however, all the commands will share a single
STATUS value (since there is just a shared STATUS field in the FIS).

So let's say that tags 0-3 got finished (i.e. bits 0-3 are set in the
ACT field) and the STATUS field has the "Sense Data Available" bit set.

This just tells us that at least one of tags 0-3 has sense data.


In order to know which of these tags that actually has sense data,
we need to read the "Sense Data for Successful NCQ Commands log",
which contains a sense_valid bitmask (which contains one bit for
each of the 32 tags).

So reading the "Sense Data for Successful NCQ Commands log" might
tell us that just tag 0-1 have sense data.

So, libata calls ata_qc_schedule_eh() on tags 0-3, wait until SCSI calls
libata .eh_strategy_handler(). libata .eh_strategy_handler() will read the
"Sense Data for Successful NCQ Commands log", which will see that there is
sense data for tags 0-1, and will add sense data for those commands, and
call scsi_check_sense() for tags 0-1.

ata_eh_finish() will finally be called, to determine what to do with the
commands that belonged to EH.

The code looks like this:
if (qc->flags & ATA_QCFLAG_SENSE_VALID ||
    qc->flags & ATA_QCFLAG_EH_SUCCESS_CMD) {
	ata_eh_qc_complete(qc);
}

So it will call complete for all 4 tags, regardless is they had sense data
or not.


scsi_eh_flush_done_q() will soon be called, and since ata_eh_qc_complete()
sets scmd->retries == scmd->allowed, none of the four commands will be retired.

if (!scmd->result &&
    !(scmd->flags & SCMD_EH_SUCCESS_CMD))
	scmd->result |= (DID_TIME_OUT << 16);

The 2 commands with sense data will not get DID_TIMEOUT,
because scmd->result has the SCSI ML byte set
(which is set by scsi_check_sense()).

The 2 commands without sense data will have scmd->result == 0,
so they will get DID_TIME_OUT set if we don't have the extra
!(scmd->flags & SCMD_EH_SUCCESS_CMD)) condition.


SCSI could add an additional check for:

if (!scmd->result && !scsi_sense_valid(scmd) &&
    !(scmd->flags & SCMD_EH_SUCCESS_CMD))
	scmd->result |= (DID_TIME_OUT << 16);

so that a command with does have sense data, but scsi_check_sense()
did not set any SCSI ML byte, does not get DID_TIME_OUT set.

However, for CDL policy 0xD, which this patch cares about,
we would still need the "&& !(scmd->flags & SCMD_EH_SUCCESS_CMD))",
so at least from a CDL perspective, I don't see how any benefit of
also adding a check for "&& !scsi_sense_valid(scmd)".


Kind regards,
Niklas

  reply	other threads:[~2023-01-16 12:43 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-12 14:03 [PATCH v2 00/18] Add Command Duration Limits support Niklas Cassel
2023-01-12 14:03 ` [PATCH v2 01/18] ata: libata: allow ata_scsi_set_sense() to not set CHECK_CONDITION Niklas Cassel
2023-01-13  7:49   ` Hannes Reinecke
2023-01-17  6:06   ` Christoph Hellwig
2023-01-17  7:10     ` Damien Le Moal
2023-01-17  7:12       ` Christoph Hellwig
2023-01-17  7:15         ` Damien Le Moal
2023-01-17  7:23           ` Christoph Hellwig
2023-01-12 14:03 ` [PATCH v2 02/18] ata: libata: allow ata_eh_request_sense() " Niklas Cassel
2023-01-13  7:51   ` Hannes Reinecke
2023-01-17  6:08   ` Christoph Hellwig
2023-01-12 14:03 ` [PATCH v2 03/18] scsi: core: allow libata to complete successful commands via EH Niklas Cassel
2023-01-13  7:57   ` Hannes Reinecke
2023-01-16 12:43     ` Niklas Cassel [this message]
2023-01-17 11:44       ` Hannes Reinecke
2023-01-17  7:24   ` Christoph Hellwig
2023-01-12 14:03 ` [PATCH v2 04/18] scsi: rename and move get_scsi_ml_byte() Niklas Cassel
2023-01-13  7:58   ` Hannes Reinecke
2023-01-17  6:12   ` Christoph Hellwig
2023-01-12 14:03 ` [PATCH v2 05/18] scsi: support retrieving sub-pages of mode pages Niklas Cassel
2023-01-13  7:59   ` Hannes Reinecke
2023-01-17  6:13   ` Christoph Hellwig
2023-01-12 14:03 ` [PATCH v2 06/18] scsi: support service action in scsi_report_opcode() Niklas Cassel
2023-01-13 11:48   ` Hannes Reinecke
2023-01-17  6:13   ` Christoph Hellwig
2023-01-12 14:03 ` [PATCH v2 07/18] block: introduce duration-limits priority class Niklas Cassel
2023-01-13 11:55   ` Hannes Reinecke
2023-01-13 12:44     ` Damien Le Moal
2023-01-17  7:25   ` Christoph Hellwig
2023-01-17  8:06     ` Damien Le Moal
2023-01-17  8:13       ` Christoph Hellwig
2023-01-17  8:14         ` Damien Le Moal
2023-01-12 14:03 ` [PATCH v2 08/18] block: introduce BLK_STS_DURATION_LIMIT Niklas Cassel
2023-01-13 11:56   ` Hannes Reinecke
2023-01-17  7:26   ` Christoph Hellwig
2023-01-12 14:03 ` [PATCH v2 09/18] ata: libata: detect support for command duration limits Niklas Cassel
2023-01-13 11:59   ` Hannes Reinecke
2023-01-12 14:03 ` [PATCH v2 10/18] ata: libata-scsi: handle CDL bits in ata_scsiop_maint_in() Niklas Cassel
2023-01-12 14:04 ` [PATCH v2 11/18] ata: libata-scsi: add support for CDL pages mode sense Niklas Cassel
2023-01-12 14:04 ` [PATCH v2 12/18] ata: libata: add ATA feature control sub-page translation Niklas Cassel
2023-01-12 14:04 ` [PATCH v2 13/18] ata: libata: set read/write commands CDL index Niklas Cassel
2023-01-12 14:04 ` [PATCH v2 14/18] scsi: sd: detect support for command duration limits Niklas Cassel
2023-01-17  7:37   ` Christoph Hellwig
2023-01-17  8:00     ` Damien Le Moal
2023-01-12 14:04 ` [PATCH v2 15/18] scsi: sd: set read/write commands CDL index Niklas Cassel
2023-01-12 14:04 ` [PATCH v2 16/18] scsi: sd: handle read/write CDL timeout failures Niklas Cassel
2023-01-12 14:04 ` [PATCH v2 17/18] ata: libata: handle completion of CDL commands using policy 0xD Niklas Cassel
2023-01-12 14:04 ` [PATCH v2 18/18] Documentation: sysfs-block-device: document command duration limits Niklas Cassel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y8VGeFSKuIr0nwC3@x1-carbon \
    --to=niklas.cassel@wdc.com \
    --cc=damien.lemoal@opensource.wdc.com \
    --cc=hare@suse.de \
    --cc=jejb@linux.ibm.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.