From: Jeff Garzik <jgarzik@pobox.com>
To: Tejun Heo <htejun@gmail.com>
Cc: "linux-ide@vger.kernel.org" <linux-ide@vger.kernel.org>,
Douglas Gilbert <dougg@torque.net>
Subject: Re: [PATCH] ATAPI error handling work
Date: Thu, 06 Oct 2005 06:09:04 -0400 [thread overview]
Message-ID: <4344F7C0.4030803@pobox.com> (raw)
In-Reply-To: <20051006053529.GA25976@htj.dyndns.org>
[-- Attachment #1: Type: text/plain, Size: 6449 bytes --]
Tejun Heo wrote:
> Hello, Jeff.
>
> On Wed, Oct 05, 2005 at 04:51:19PM -0400, Jeff Garzik wrote:
>
>>This is a work-in-progress, attempting to fix the death of libata
>>whenever there is a SCSI error thrown to us... IOW this fixes libata
>>EH to properly fake autosensing, rather than kicking over to the SCSI EH
>>thread for that purpose.
>>
>
>
> Great. I just have a few questions. Oh.. BTW, against which head is
> the patch generated? I couldn't get it applied cleanly to either
> upstream or ALL.
I applied fragments of the patch to 'upstream' branch, so you probably
caught me after I had done that, which invalidated the patch. I've
attached the latest patch.
>>@@ -3174,17 +3106,35 @@ out:
>> void ata_eng_timeout(struct ata_port *ap)
>> {
>> struct ata_queued_cmd *qc;
>>+ struct ata_host_set *host_set = ap->host_set;
>>+ unsigned long flags;
>>+ int printed_message = 0;
>>
>> DPRINTK("ENTER\n");
>>
>>+ spin_lock_irqsave(&host_set->lock, flags);
>>+ while (ap->flags & ATA_FLAG_IN_EH) {
>>+ spin_unlock_irqrestore(&host_set->lock, flags);
>>+
>>+ if (!printed_message++)
>>+ printk(KERN_INFO "ata%u: waiting in timeout handler for EH\n",
>>+ ap->id);
>>+
>>+ msleep(250);
>>+
>>+ spin_lock_irqsave(&host_set->lock, flags);
>>+ }
>>+ spin_unlock_irqrestore(&host_set->lock, flags);
>>+
>
>
>
> I'm not really sure if synchronization around reqsense with
> ATA_FLAG_IN_EH is such a good idea for the following reasons.
>
> * IMHO, tugging in request sense into the timeout window of the
> original command is okay (or even better).
Not really: we have to assume that time passed, so by relying on the
SCSI EH timeout rather than a timer, the timeout is suddenly effectively
random. I want a definitive time length, since its likely REQUEST SENSE
may need a longer timeout for some devices.
Second, the timeout for READ/WRITE commands is often smaller than it is
for non-READ/WRITE commands. We want much to ensure that we have
sufficient time to handle REQUEST SENSE.
> * As long as we grab host_set lock while submitting request sense,
> which we already does, the synchronization can be done safely in EH
> proper. libata EH isn't currently ready for this but other commands
> are dealt the same way and fixing EH is the correct way, IMHO.
We grab the lock submitting request sense, but then we release it.
ata_eng_timeout() could get called before the entire REQUEST SENSE
process is completed.
> Note that we really don't have to do anything special for timing out
> during request sense. Once we fix EH/interrupt handler race, it can
> be handled the same way without any speical casing.
Yes, this is feasible. We can simply wait for the timer (which doesn't
exist yet) or atapi_qc_complete_sense() to clear the ATA_FLAG_IN_EH bit,
at which point the EH timer takes over, or will do so shortly.
>> qc = ata_qc_from_tag(ap, ap->active_tag);
>>- if (!qc) {
>>+ if (qc)
>>+ ata_qc_timeout(qc);
>>+ else {
>> printk(KERN_ERR "ata%u: BUG: timeout without command\n",
>> ap->id);
>> goto out;
>> }
>>
>>- ata_qc_timeout(qc);
>>
>> out:
>> DPRINTK("EXIT\n");
>
>
>
> I think you were trying to kill goto out and the label by moving
> ata_qc_timeout() into if condition, no? Currently, the goto out seem
> a bit funny.
The goto is gone in the most recent version :)
>>diff --git a/drivers/scsi/libata-scsi.c b/drivers/scsi/libata-scsi.c
>>--- a/drivers/scsi/libata-scsi.c
>>+++ b/drivers/scsi/libata-scsi.c
>>@@ -225,7 +225,7 @@ void ata_to_sense_error(struct ata_queue
>> };
>> int i = 0;
>>
>>- cmd->result = SAM_STAT_CHECK_CONDITION;
>>+ cmd->result = (DRIVER_SENSE << 24) | SAM_STAT_CHECK_CONDITION;
>>
>> /*
>> * Is this an error we can process/parse
>>@@ -1468,7 +1468,7 @@ unsigned int ata_scsiop_report_luns(stru
>> void ata_scsi_badcmd(struct scsi_cmnd *cmd, void (*done)(struct scsi_cmnd *), u8 asc, u8 ascq)
>> {
>> DPRINTK("ENTER\n");
>>- cmd->result = SAM_STAT_CHECK_CONDITION;
>>+ cmd->result = (DRIVER_SENSE << 24) | SAM_STAT_CHECK_CONDITION;
>>
>> cmd->sense_buffer[0] = 0x70;
>> cmd->sense_buffer[2] = ILLEGAL_REQUEST;
>>@@ -1479,28 +1479,81 @@ void ata_scsi_badcmd(struct scsi_cmnd *c
>> done(cmd);
>> }
>>
>>+static int atapi_qc_complete_sense(struct ata_queued_cmd *qc, u8 drv_stat)
>>+{
>>+ struct scsi_cmnd *cmd = qc->scsicmd;
>>+
>>+ if (ata_ok(drv_stat))
>>+ cmd->result = (DRIVER_SENSE << 24) | SAM_STAT_CHECK_CONDITION;
>>+ else
>>+ ata_to_sense_error(qc, drv_stat);
>>+
>>+ qc->ap->flags &= ~ATA_FLAG_IN_EH;
>>+ qc->scsidone(cmd);
>>+ return 0;
>>+}
>>+
>>+static void atapi_request_sense(struct ata_queued_cmd *qc)
>>+{
>>+ struct ata_port *ap = qc->ap;
>>+ struct scsi_cmnd *cmd = qc->scsicmd;
>>+ int rc;
>>+
>>+ DPRINTK("ATAPI request sense\n");
>>+
>>+ /* FIXME: is this needed? */
>>+ memset(cmd->sense_buffer, 0, sizeof(cmd->sense_buffer));
>
>
>
> This is just a side note as this part seems to be just copied from
> the original request_sense function. Anyways, I think this isn't
> needed anymore. SCSI ML seems to clear things pretty thoroughly these
> days.
Well, until I have the line of code pointed to me, its not important.
Its not the hot path, so we can afford to do possibly-redundant work.
>>+ ata_sg_init_one(qc, cmd->sense_buffer, sizeof(cmd->sense_buffer));
>>+ qc->dma_dir = DMA_FROM_DEVICE;
>>+
>>+ memset(&qc->cdb, 0, ap->cdb_len);
>>+ qc->cdb[0] = REQUEST_SENSE;
>>+ qc->cdb[4] = SCSI_SENSE_BUFFERSIZE;
>>+
>>+ qc->tf.flags |= ATA_TFLAG_ISADDR | ATA_TFLAG_DEVICE;
>>+ qc->tf.command = ATA_CMD_PACKET;
>>+
>>+ qc->tf.protocol = ATA_PROT_ATAPI;
>>+ qc->tf.lbam = (8 * 1024) & 0xff;
>>+ qc->tf.lbah = (8 * 1024) >> 8;
>>+ qc->nbytes = SCSI_SENSE_BUFFERSIZE;
>>+
>>+ qc->complete_fn = atapi_qc_complete_sense;
>>+
>>+ rc = ata_qc_issue(qc);
>>+
>>+ if (rc) {
>>+ /* FIXME: complete failing command with an error */
>>+ ata_port_disable(ap);
>>+ }
>>+
>>+ /* FIXME: add timer for timeout of this command */
>>+ /* control flow continues in atapi_qc_complete_sense() */
>>+
>>+ DPRINTK("EXIT\n");
>>+}
>>+
>> static int atapi_qc_complete(struct ata_queued_cmd *qc, u8 drv_stat)
>> {
>>+ struct ata_port *ap = qc->ap;
>> struct scsi_cmnd *cmd = qc->scsicmd;
>>
>>+ rmb();
>
>
>
> I'm having difficult time understanding this rmb(). What is it
> protecting against?
Just paranoia, before reading ap->host->eh_active, which is tweaked by
scsi_error.c.
Jeff
[-- Attachment #2: patch.atapi-autosense --]
[-- Type: text/plain, Size: 6775 bytes --]
diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
--- a/drivers/scsi/libata-core.c
+++ b/drivers/scsi/libata-core.c
@@ -71,6 +71,7 @@ static int fgb(u32 bitmap);
static int ata_choose_xfer_mode(struct ata_port *ap,
u8 *xfer_mode_out,
unsigned int *xfer_shift_out);
+static int ata_qc_complete_noop(struct ata_queued_cmd *qc, u8 drv_stat);
static void __ata_qc_complete(struct ata_queued_cmd *qc);
static unsigned int ata_unique_id = 1;
@@ -3037,32 +3038,11 @@ static void ata_qc_timeout(struct ata_qu
{
struct ata_port *ap = qc->ap;
struct ata_host_set *host_set = ap->host_set;
- struct ata_device *dev = qc->dev;
u8 host_stat = 0, drv_stat;
unsigned long flags;
DPRINTK("ENTER\n");
- /* FIXME: doesn't this conflict with timeout handling? */
- if (qc->dev->class == ATA_DEV_ATAPI && qc->scsicmd) {
- struct scsi_cmnd *cmd = qc->scsicmd;
-
- if (!(cmd->eh_eflags & SCSI_EH_CANCEL_CMD)) {
-
- /* finish completing original command */
- spin_lock_irqsave(&host_set->lock, flags);
- __ata_qc_complete(qc);
- spin_unlock_irqrestore(&host_set->lock, flags);
-
- atapi_request_sense(ap, dev, cmd);
-
- cmd->result = (CHECK_CONDITION << 1) | (DID_OK << 16);
- scsi_finish_command(cmd);
-
- goto out;
- }
- }
-
spin_lock_irqsave(&host_set->lock, flags);
/* hack alert! We cannot use the supplied completion
@@ -3101,7 +3081,6 @@ static void ata_qc_timeout(struct ata_qu
spin_unlock_irqrestore(&host_set->lock, flags);
-out:
DPRINTK("EXIT\n");
}
@@ -3127,19 +3106,33 @@ out:
void ata_eng_timeout(struct ata_port *ap)
{
struct ata_queued_cmd *qc;
+ struct ata_host_set *host_set = ap->host_set;
+ unsigned long flags;
+ int printed_message = 0;
DPRINTK("ENTER\n");
+ spin_lock_irqsave(&host_set->lock, flags);
+ while (ap->flags & ATA_FLAG_IN_EH) {
+ spin_unlock_irqrestore(&host_set->lock, flags);
+
+ if (!printed_message++)
+ printk(KERN_INFO "ata%u: waiting in timeout handler for EH\n",
+ ap->id);
+
+ msleep(250);
+
+ spin_lock_irqsave(&host_set->lock, flags);
+ }
+ spin_unlock_irqrestore(&host_set->lock, flags);
+
qc = ata_qc_from_tag(ap, ap->active_tag);
if (qc)
ata_qc_timeout(qc);
- else {
+ else
printk(KERN_ERR "ata%u: BUG: timeout without command\n",
ap->id);
- goto out;
- }
-out:
DPRINTK("EXIT\n");
}
@@ -3207,7 +3200,7 @@ struct ata_queued_cmd *ata_qc_new_init(s
return qc;
}
-int ata_qc_complete_noop(struct ata_queued_cmd *qc, u8 drv_stat)
+static int ata_qc_complete_noop(struct ata_queued_cmd *qc, u8 drv_stat)
{
return 0;
}
diff --git a/drivers/scsi/libata-scsi.c b/drivers/scsi/libata-scsi.c
--- a/drivers/scsi/libata-scsi.c
+++ b/drivers/scsi/libata-scsi.c
@@ -1479,19 +1479,33 @@ void ata_scsi_badcmd(struct scsi_cmnd *c
done(cmd);
}
-void atapi_request_sense(struct ata_port *ap, struct ata_device *dev,
- struct scsi_cmnd *cmd)
+static int atapi_qc_complete_sense(struct ata_queued_cmd *qc, u8 drv_stat)
{
- DECLARE_COMPLETION(wait);
- struct ata_queued_cmd *qc;
- unsigned long flags;
+ struct scsi_cmnd *cmd = qc->scsicmd;
+ int ok;
+
+ ok = ata_ok(drv_stat);
+ if (ok)
+ cmd->result = (DRIVER_SENSE << 24) | SAM_STAT_CHECK_CONDITION;
+ else
+ ata_to_sense_error(qc, drv_stat);
+
+ DPRINTK("ATAPI request sense completion (sense %sretrieved)\n",
+ ok ? "" : "not ");
+
+ qc->ap->flags &= ~ATA_FLAG_IN_EH;
+ qc->scsidone(cmd);
+ return 0;
+}
+
+static void atapi_request_sense(struct ata_queued_cmd *qc)
+{
+ struct ata_port *ap = qc->ap;
+ struct scsi_cmnd *cmd = qc->scsicmd;
int rc;
DPRINTK("ATAPI request sense\n");
- qc = ata_qc_new_init(ap, dev);
- BUG_ON(qc == NULL);
-
/* FIXME: is this needed? */
memset(cmd->sense_buffer, 0, sizeof(cmd->sense_buffer));
@@ -1510,44 +1524,39 @@ void atapi_request_sense(struct ata_port
qc->tf.lbah = (8 * 1024) >> 8;
qc->nbytes = SCSI_SENSE_BUFFERSIZE;
- qc->waiting = &wait;
- qc->complete_fn = ata_qc_complete_noop;
+ qc->complete_fn = atapi_qc_complete_sense;
- spin_lock_irqsave(&ap->host_set->lock, flags);
rc = ata_qc_issue(qc);
- spin_unlock_irqrestore(&ap->host_set->lock, flags);
- if (rc)
+ if (rc) {
+ /* FIXME: complete failing command with an error */
ata_port_disable(ap);
- else
- wait_for_completion(&wait);
+ }
+
+ /* FIXME: add timer for timeout of this command */
+ /* control flow continues in atapi_qc_complete_sense() */
DPRINTK("EXIT\n");
}
static int atapi_qc_complete(struct ata_queued_cmd *qc, u8 drv_stat)
{
+ struct ata_port *ap = qc->ap;
struct scsi_cmnd *cmd = qc->scsicmd;
VPRINTK("ENTER, drv_stat == 0x%x\n", drv_stat);
+ rmb();
+ if (ap->host->eh_active)
+ return 1;
+
if (unlikely(drv_stat & (ATA_BUSY | ATA_DRQ)))
ata_to_sense_error(qc, drv_stat);
else if (unlikely(drv_stat & ATA_ERR)) {
DPRINTK("request check condition\n");
-
- /* FIXME: command completion with check condition
- * but no sense causes the error handler to run,
- * which then issues REQUEST SENSE, fills in the sense
- * buffer, and completes the command (for the second
- * time). We need to issue REQUEST SENSE some other
- * way, to avoid completing the command twice.
- */
- cmd->result = SAM_STAT_CHECK_CONDITION;
-
- qc->scsidone(cmd);
-
+ ap->flags |= ATA_FLAG_IN_EH;
+ atapi_request_sense(qc);
return 1;
}
@@ -1585,6 +1594,7 @@ static int atapi_qc_complete(struct ata_
qc->scsidone(cmd);
return 0;
}
+
/**
* atapi_xlat - Initialize PACKET taskfile
* @qc: command structure to be initialized
diff --git a/drivers/scsi/libata.h b/drivers/scsi/libata.h
--- a/drivers/scsi/libata.h
+++ b/drivers/scsi/libata.h
@@ -39,7 +39,6 @@ struct ata_scsi_args {
/* libata-core.c */
extern int atapi_enabled;
-extern int ata_qc_complete_noop(struct ata_queued_cmd *qc, u8 drv_stat);
extern struct ata_queued_cmd *ata_qc_new_init(struct ata_port *ap,
struct ata_device *dev);
extern void ata_qc_free(struct ata_queued_cmd *qc);
@@ -52,8 +51,6 @@ extern void swap_buf_le16(u16 *buf, unsi
/* libata-scsi.c */
-extern void atapi_request_sense(struct ata_port *ap, struct ata_device *dev,
- struct scsi_cmnd *cmd);
extern void ata_scsi_scan_host(struct ata_port *ap);
extern void ata_to_sense_error(struct ata_queued_cmd *qc, u8 drv_stat);
extern int ata_scsi_error(struct Scsi_Host *host);
diff --git a/include/linux/libata.h b/include/linux/libata.h
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -118,6 +118,7 @@ enum {
ATA_FLAG_PIO_DMA = (1 << 8), /* PIO cmds via DMA */
ATA_FLAG_NOINTR = (1 << 9), /* FIXME: Remove this once
* proper HSM is in place. */
+ ATA_FLAG_IN_EH = (1 << 10),
ATA_QCFLAG_ACTIVE = (1 << 1), /* cmd not yet ack'd to scsi lyer */
ATA_QCFLAG_SG = (1 << 3), /* have s/g table? */
next prev parent reply other threads:[~2005-10-06 10:09 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-10-05 20:51 [PATCH] ATAPI error handling work Jeff Garzik
2005-10-06 5:35 ` Tejun Heo
2005-10-06 10:09 ` Jeff Garzik [this message]
2005-10-06 14:56 ` Tejun Heo
2005-10-29 18:33 ` Jeff Garzik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4344F7C0.4030803@pobox.com \
--to=jgarzik@pobox.com \
--cc=dougg@torque.net \
--cc=htejun@gmail.com \
--cc=linux-ide@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.