From: Jeff Garzik <jgarzik@pobox.com>
To: Tejun Heo <htejun@gmail.com>
Cc: "linux-ide@vger.kernel.org" <linux-ide@vger.kernel.org>,
Douglas Gilbert <dougg@torque.net>
Subject: Re: [PATCH] ATAPI error handling work
Date: Thu, 06 Oct 2005 06:09:04 -0400 [thread overview]
Message-ID: <4344F7C0.4030803@pobox.com> (raw)
In-Reply-To: <20051006053529.GA25976@htj.dyndns.org>
[-- Attachment #1: Type: text/plain, Size: 6449 bytes --]
Tejun Heo wrote:
> Hello, Jeff.
>
> On Wed, Oct 05, 2005 at 04:51:19PM -0400, Jeff Garzik wrote:
>
>>This is a work-in-progress, attempting to fix the death of libata
>>whenever there is a SCSI error thrown to us... IOW this fixes libata
>>EH to properly fake autosensing, rather than kicking over to the SCSI EH
>>thread for that purpose.
>>
>
>
> Great. I just have a few questions. Oh.. BTW, against which head is
> the patch generated? I couldn't get it applied cleanly to either
> upstream or ALL.
I applied fragments of the patch to 'upstream' branch, so you probably
caught me after I had done that, which invalidated the patch. I've
attached the latest patch.
>>@@ -3174,17 +3106,35 @@ out:
>> void ata_eng_timeout(struct ata_port *ap)
>> {
>> struct ata_queued_cmd *qc;
>>+ struct ata_host_set *host_set = ap->host_set;
>>+ unsigned long flags;
>>+ int printed_message = 0;
>>
>> DPRINTK("ENTER\n");
>>
>>+ spin_lock_irqsave(&host_set->lock, flags);
>>+ while (ap->flags & ATA_FLAG_IN_EH) {
>>+ spin_unlock_irqrestore(&host_set->lock, flags);
>>+
>>+ if (!printed_message++)
>>+ printk(KERN_INFO "ata%u: waiting in timeout handler for EH\n",
>>+ ap->id);
>>+
>>+ msleep(250);
>>+
>>+ spin_lock_irqsave(&host_set->lock, flags);
>>+ }
>>+ spin_unlock_irqrestore(&host_set->lock, flags);
>>+
>
>
>
> I'm not really sure if synchronization around reqsense with
> ATA_FLAG_IN_EH is such a good idea for the following reasons.
>
> * IMHO, tugging in request sense into the timeout window of the
> original command is okay (or even better).
Not really: we have to assume that time passed, so by relying on the
SCSI EH timeout rather than a timer, the timeout is suddenly effectively
random. I want a definitive time length, since its likely REQUEST SENSE
may need a longer timeout for some devices.
Second, the timeout for READ/WRITE commands is often smaller than it is
for non-READ/WRITE commands. We want much to ensure that we have
sufficient time to handle REQUEST SENSE.
> * As long as we grab host_set lock while submitting request sense,
> which we already does, the synchronization can be done safely in EH
> proper. libata EH isn't currently ready for this but other commands
> are dealt the same way and fixing EH is the correct way, IMHO.
We grab the lock submitting request sense, but then we release it.
ata_eng_timeout() could get called before the entire REQUEST SENSE
process is completed.
> Note that we really don't have to do anything special for timing out
> during request sense. Once we fix EH/interrupt handler race, it can
> be handled the same way without any speical casing.
Yes, this is feasible. We can simply wait for the timer (which doesn't
exist yet) or atapi_qc_complete_sense() to clear the ATA_FLAG_IN_EH bit,
at which point the EH timer takes over, or will do so shortly.
>> qc = ata_qc_from_tag(ap, ap->active_tag);
>>- if (!qc) {
>>+ if (qc)
>>+ ata_qc_timeout(qc);
>>+ else {
>> printk(KERN_ERR "ata%u: BUG: timeout without command\n",
>> ap->id);
>> goto out;
>> }
>>
>>- ata_qc_timeout(qc);
>>
>> out:
>> DPRINTK("EXIT\n");
>
>
>
> I think you were trying to kill goto out and the label by moving
> ata_qc_timeout() into if condition, no? Currently, the goto out seem
> a bit funny.
The goto is gone in the most recent version :)
>>diff --git a/drivers/scsi/libata-scsi.c b/drivers/scsi/libata-scsi.c
>>--- a/drivers/scsi/libata-scsi.c
>>+++ b/drivers/scsi/libata-scsi.c
>>@@ -225,7 +225,7 @@ void ata_to_sense_error(struct ata_queue
>> };
>> int i = 0;
>>
>>- cmd->result = SAM_STAT_CHECK_CONDITION;
>>+ cmd->result = (DRIVER_SENSE << 24) | SAM_STAT_CHECK_CONDITION;
>>
>> /*
>> * Is this an error we can process/parse
>>@@ -1468,7 +1468,7 @@ unsigned int ata_scsiop_report_luns(stru
>> void ata_scsi_badcmd(struct scsi_cmnd *cmd, void (*done)(struct scsi_cmnd *), u8 asc, u8 ascq)
>> {
>> DPRINTK("ENTER\n");
>>- cmd->result = SAM_STAT_CHECK_CONDITION;
>>+ cmd->result = (DRIVER_SENSE << 24) | SAM_STAT_CHECK_CONDITION;
>>
>> cmd->sense_buffer[0] = 0x70;
>> cmd->sense_buffer[2] = ILLEGAL_REQUEST;
>>@@ -1479,28 +1479,81 @@ void ata_scsi_badcmd(struct scsi_cmnd *c
>> done(cmd);
>> }
>>
>>+static int atapi_qc_complete_sense(struct ata_queued_cmd *qc, u8 drv_stat)
>>+{
>>+ struct scsi_cmnd *cmd = qc->scsicmd;
>>+
>>+ if (ata_ok(drv_stat))
>>+ cmd->result = (DRIVER_SENSE << 24) | SAM_STAT_CHECK_CONDITION;
>>+ else
>>+ ata_to_sense_error(qc, drv_stat);
>>+
>>+ qc->ap->flags &= ~ATA_FLAG_IN_EH;
>>+ qc->scsidone(cmd);
>>+ return 0;
>>+}
>>+
>>+static void atapi_request_sense(struct ata_queued_cmd *qc)
>>+{
>>+ struct ata_port *ap = qc->ap;
>>+ struct scsi_cmnd *cmd = qc->scsicmd;
>>+ int rc;
>>+
>>+ DPRINTK("ATAPI request sense\n");
>>+
>>+ /* FIXME: is this needed? */
>>+ memset(cmd->sense_buffer, 0, sizeof(cmd->sense_buffer));
>
>
>
> This is just a side note as this part seems to be just copied from
> the original request_sense function. Anyways, I think this isn't
> needed anymore. SCSI ML seems to clear things pretty thoroughly these
> days.
Well, until I have the line of code pointed to me, its not important.
Its not the hot path, so we can afford to do possibly-redundant work.
>>+ ata_sg_init_one(qc, cmd->sense_buffer, sizeof(cmd->sense_buffer));
>>+ qc->dma_dir = DMA_FROM_DEVICE;
>>+
>>+ memset(&qc->cdb, 0, ap->cdb_len);
>>+ qc->cdb[0] = REQUEST_SENSE;
>>+ qc->cdb[4] = SCSI_SENSE_BUFFERSIZE;
>>+
>>+ qc->tf.flags |= ATA_TFLAG_ISADDR | ATA_TFLAG_DEVICE;
>>+ qc->tf.command = ATA_CMD_PACKET;
>>+
>>+ qc->tf.protocol = ATA_PROT_ATAPI;
>>+ qc->tf.lbam = (8 * 1024) & 0xff;
>>+ qc->tf.lbah = (8 * 1024) >> 8;
>>+ qc->nbytes = SCSI_SENSE_BUFFERSIZE;
>>+
>>+ qc->complete_fn = atapi_qc_complete_sense;
>>+
>>+ rc = ata_qc_issue(qc);
>>+
>>+ if (rc) {
>>+ /* FIXME: complete failing command with an error */
>>+ ata_port_disable(ap);
>>+ }
>>+
>>+ /* FIXME: add timer for timeout of this command */
>>+ /* control flow continues in atapi_qc_complete_sense() */
>>+
>>+ DPRINTK("EXIT\n");
>>+}
>>+
>> static int atapi_qc_complete(struct ata_queued_cmd *qc, u8 drv_stat)
>> {
>>+ struct ata_port *ap = qc->ap;
>> struct scsi_cmnd *cmd = qc->scsicmd;
>>
>>+ rmb();
>
>
>
> I'm having difficult time understanding this rmb(). What is it
> protecting against?
Just paranoia, before reading ap->host->eh_active, which is tweaked by
scsi_error.c.
Jeff
[-- Attachment #2: patch.atapi-autosense --]
[-- Type: text/plain, Size: 6775 bytes --]
diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
--- a/drivers/scsi/libata-core.c
+++ b/drivers/scsi/libata-core.c
@@ -71,6 +71,7 @@ static int fgb(u32 bitmap);
static int ata_choose_xfer_mode(struct ata_port *ap,
u8 *xfer_mode_out,
unsigned int *xfer_shift_out);
+static int ata_qc_complete_noop(struct ata_queued_cmd *qc, u8 drv_stat);
static void __ata_qc_complete(struct ata_queued_cmd *qc);
static unsigned int ata_unique_id = 1;
@@ -3037,32 +3038,11 @@ static void ata_qc_timeout(struct ata_qu
{
struct ata_port *ap = qc->ap;
struct ata_host_set *host_set = ap->host_set;
- struct ata_device *dev = qc->dev;
u8 host_stat = 0, drv_stat;
unsigned long flags;
DPRINTK("ENTER\n");
- /* FIXME: doesn't this conflict with timeout handling? */
- if (qc->dev->class == ATA_DEV_ATAPI && qc->scsicmd) {
- struct scsi_cmnd *cmd = qc->scsicmd;
-
- if (!(cmd->eh_eflags & SCSI_EH_CANCEL_CMD)) {
-
- /* finish completing original command */
- spin_lock_irqsave(&host_set->lock, flags);
- __ata_qc_complete(qc);
- spin_unlock_irqrestore(&host_set->lock, flags);
-
- atapi_request_sense(ap, dev, cmd);
-
- cmd->result = (CHECK_CONDITION << 1) | (DID_OK << 16);
- scsi_finish_command(cmd);
-
- goto out;
- }
- }
-
spin_lock_irqsave(&host_set->lock, flags);
/* hack alert! We cannot use the supplied completion
@@ -3101,7 +3081,6 @@ static void ata_qc_timeout(struct ata_qu
spin_unlock_irqrestore(&host_set->lock, flags);
-out:
DPRINTK("EXIT\n");
}
@@ -3127,19 +3106,33 @@ out:
void ata_eng_timeout(struct ata_port *ap)
{
struct ata_queued_cmd *qc;
+ struct ata_host_set *host_set = ap->host_set;
+ unsigned long flags;
+ int printed_message = 0;
DPRINTK("ENTER\n");
+ spin_lock_irqsave(&host_set->lock, flags);
+ while (ap->flags & ATA_FLAG_IN_EH) {
+ spin_unlock_irqrestore(&host_set->lock, flags);
+
+ if (!printed_message++)
+ printk(KERN_INFO "ata%u: waiting in timeout handler for EH\n",
+ ap->id);
+
+ msleep(250);
+
+ spin_lock_irqsave(&host_set->lock, flags);
+ }
+ spin_unlock_irqrestore(&host_set->lock, flags);
+
qc = ata_qc_from_tag(ap, ap->active_tag);
if (qc)
ata_qc_timeout(qc);
- else {
+ else
printk(KERN_ERR "ata%u: BUG: timeout without command\n",
ap->id);
- goto out;
- }
-out:
DPRINTK("EXIT\n");
}
@@ -3207,7 +3200,7 @@ struct ata_queued_cmd *ata_qc_new_init(s
return qc;
}
-int ata_qc_complete_noop(struct ata_queued_cmd *qc, u8 drv_stat)
+static int ata_qc_complete_noop(struct ata_queued_cmd *qc, u8 drv_stat)
{
return 0;
}
diff --git a/drivers/scsi/libata-scsi.c b/drivers/scsi/libata-scsi.c
--- a/drivers/scsi/libata-scsi.c
+++ b/drivers/scsi/libata-scsi.c
@@ -1479,19 +1479,33 @@ void ata_scsi_badcmd(struct scsi_cmnd *c
done(cmd);
}
-void atapi_request_sense(struct ata_port *ap, struct ata_device *dev,
- struct scsi_cmnd *cmd)
+static int atapi_qc_complete_sense(struct ata_queued_cmd *qc, u8 drv_stat)
{
- DECLARE_COMPLETION(wait);
- struct ata_queued_cmd *qc;
- unsigned long flags;
+ struct scsi_cmnd *cmd = qc->scsicmd;
+ int ok;
+
+ ok = ata_ok(drv_stat);
+ if (ok)
+ cmd->result = (DRIVER_SENSE << 24) | SAM_STAT_CHECK_CONDITION;
+ else
+ ata_to_sense_error(qc, drv_stat);
+
+ DPRINTK("ATAPI request sense completion (sense %sretrieved)\n",
+ ok ? "" : "not ");
+
+ qc->ap->flags &= ~ATA_FLAG_IN_EH;
+ qc->scsidone(cmd);
+ return 0;
+}
+
+static void atapi_request_sense(struct ata_queued_cmd *qc)
+{
+ struct ata_port *ap = qc->ap;
+ struct scsi_cmnd *cmd = qc->scsicmd;
int rc;
DPRINTK("ATAPI request sense\n");
- qc = ata_qc_new_init(ap, dev);
- BUG_ON(qc == NULL);
-
/* FIXME: is this needed? */
memset(cmd->sense_buffer, 0, sizeof(cmd->sense_buffer));
@@ -1510,44 +1524,39 @@ void atapi_request_sense(struct ata_port
qc->tf.lbah = (8 * 1024) >> 8;
qc->nbytes = SCSI_SENSE_BUFFERSIZE;
- qc->waiting = &wait;
- qc->complete_fn = ata_qc_complete_noop;
+ qc->complete_fn = atapi_qc_complete_sense;
- spin_lock_irqsave(&ap->host_set->lock, flags);
rc = ata_qc_issue(qc);
- spin_unlock_irqrestore(&ap->host_set->lock, flags);
- if (rc)
+ if (rc) {
+ /* FIXME: complete failing command with an error */
ata_port_disable(ap);
- else
- wait_for_completion(&wait);
+ }
+
+ /* FIXME: add timer for timeout of this command */
+ /* control flow continues in atapi_qc_complete_sense() */
DPRINTK("EXIT\n");
}
static int atapi_qc_complete(struct ata_queued_cmd *qc, u8 drv_stat)
{
+ struct ata_port *ap = qc->ap;
struct scsi_cmnd *cmd = qc->scsicmd;
VPRINTK("ENTER, drv_stat == 0x%x\n", drv_stat);
+ rmb();
+ if (ap->host->eh_active)
+ return 1;
+
if (unlikely(drv_stat & (ATA_BUSY | ATA_DRQ)))
ata_to_sense_error(qc, drv_stat);
else if (unlikely(drv_stat & ATA_ERR)) {
DPRINTK("request check condition\n");
-
- /* FIXME: command completion with check condition
- * but no sense causes the error handler to run,
- * which then issues REQUEST SENSE, fills in the sense
- * buffer, and completes the command (for the second
- * time). We need to issue REQUEST SENSE some other
- * way, to avoid completing the command twice.
- */
- cmd->result = SAM_STAT_CHECK_CONDITION;
-
- qc->scsidone(cmd);
-
+ ap->flags |= ATA_FLAG_IN_EH;
+ atapi_request_sense(qc);
return 1;
}
@@ -1585,6 +1594,7 @@ static int atapi_qc_complete(struct ata_
qc->scsidone(cmd);
return 0;
}
+
/**
* atapi_xlat - Initialize PACKET taskfile
* @qc: command structure to be initialized
diff --git a/drivers/scsi/libata.h b/drivers/scsi/libata.h
--- a/drivers/scsi/libata.h
+++ b/drivers/scsi/libata.h
@@ -39,7 +39,6 @@ struct ata_scsi_args {
/* libata-core.c */
extern int atapi_enabled;
-extern int ata_qc_complete_noop(struct ata_queued_cmd *qc, u8 drv_stat);
extern struct ata_queued_cmd *ata_qc_new_init(struct ata_port *ap,
struct ata_device *dev);
extern void ata_qc_free(struct ata_queued_cmd *qc);
@@ -52,8 +51,6 @@ extern void swap_buf_le16(u16 *buf, unsi
/* libata-scsi.c */
-extern void atapi_request_sense(struct ata_port *ap, struct ata_device *dev,
- struct scsi_cmnd *cmd);
extern void ata_scsi_scan_host(struct ata_port *ap);
extern void ata_to_sense_error(struct ata_queued_cmd *qc, u8 drv_stat);
extern int ata_scsi_error(struct Scsi_Host *host);
diff --git a/include/linux/libata.h b/include/linux/libata.h
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -118,6 +118,7 @@ enum {
ATA_FLAG_PIO_DMA = (1 << 8), /* PIO cmds via DMA */
ATA_FLAG_NOINTR = (1 << 9), /* FIXME: Remove this once
* proper HSM is in place. */
+ ATA_FLAG_IN_EH = (1 << 10),
ATA_QCFLAG_ACTIVE = (1 << 1), /* cmd not yet ack'd to scsi lyer */
ATA_QCFLAG_SG = (1 << 3), /* have s/g table? */
next prev parent reply other threads:[~2005-10-06 10:09 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-10-05 20:51 [PATCH] ATAPI error handling work Jeff Garzik
2005-10-06 5:35 ` Tejun Heo
2005-10-06 10:09 ` Jeff Garzik [this message]
2005-10-06 14:56 ` Tejun Heo
2005-10-29 18:33 ` Jeff Garzik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4344F7C0.4030803@pobox.com \
--to=jgarzik@pobox.com \
--cc=dougg@torque.net \
--cc=htejun@gmail.com \
--cc=linux-ide@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).