* [PATCH/RFC] libata-dev: handle ERR=1 DRQ=1
@ 2006-03-21 11:39 Albert Lee
2006-03-21 17:35 ` Jeff Garzik
0 siblings, 1 reply; 11+ messages in thread
From: Albert Lee @ 2006-03-21 11:39 UTC (permalink / raw)
To: Jeff Garzik; +Cc: IDE Linux, Doug Maxey
handle ERR=1 DRQ=1 for PIO protocol
Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
---
Some PIO read commands might set DRQ=1 alone with ERR=1.
Under such situation, the data block should be transferred even if err=1.
This patch adds extra logic to HSM_ST_ERR to handle the required
data transfer.
BTW, also found the following work in progress:
"Disable Data Transfer after Error Detection"
http://t13.org/docs2006/dt1825Lr3-DDT_TR.pdf
If this feature is turned on, then DRQ will be guaranteed 0 when ERR=1
which would simplify the PIO error handling.
Patch against the irq-pio branch.
Need your review and advice, thanks.
Albert
--- irq-pio-printk/drivers/scsi/libata-core.c 2006-03-21 17:55:35.000000000 +0800
+++ irq-pio-err/drivers/scsi/libata-core.c 2006-03-21 19:09:38.000000000 +0800
@@ -3684,6 +3684,46 @@ fsm_start:
printk(KERN_ERR "ata%u: command error, drv_stat 0x%x\n",
ap->id, status);
+ /* handle ERR=1 DRQ=1 */
+ if ((status & ATA_ERR) && (status & ATA_DRQ)) {
+ /* ERR=1 DRQ=1 is only possible for reads (READ PIO
+ * and READ MULTI, etc). For writes, DRQ=1 doesn't
+ * make sense since the data block has been
+ * transferred to the device.
+ */
+ if ((qc->tf.protocol == ATA_PROT_PIO) &&
+ !(qc->tf.flags & ATA_TFLAG_WRITE)) {
+ /* Transfer the corrupted data */
+ ata_pio_sectors(qc);
+
+ /* FIXME: contents of taskfile registers
+ * undefined after the data xfer.
+ * Maybe we should call ->tf_read()
+ * and backup the taskfile somewhere?
+ * Otherwise ata_gen_fixed_sense() won't
+ * get needed information.
+ */
+
+ /* FIXME: The spec said if "correctable"
+ * data error, subsequent blocks will
+ * be transferred. Isn't the command
+ * stopped if ERR is set?
+ */
+
+ ata_altstatus(ap); /* flush */
+ status = ata_chk_status(ap);
+
+ printk(KERN_ERR "ata%u: ERR=1 DRQ=1 error"
+ " block transferred, drv_stat 0x%x\n",
+ ap->id, status);
+ } else
+ /* Low possibility for other protocols
+ * being ERR=1 DRQ=1. Maybe device error?
+ */
+ printk(KERN_ERR "ata%u: ERR=1 DRQ=1\n",
+ ap->id);
+ }
+
/* make sure qc->err_mask is available to
* know what's wrong and recover
*/
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH/RFC] libata-dev: handle ERR=1 DRQ=1 2006-03-21 11:39 [PATCH/RFC] libata-dev: handle ERR=1 DRQ=1 Albert Lee @ 2006-03-21 17:35 ` Jeff Garzik 2006-03-21 17:43 ` Eric D. Mudama 2006-03-23 8:26 ` [PATCH/RFC 2/2] libata-dev: fix the device err check sequence Albert Lee 0 siblings, 2 replies; 11+ messages in thread From: Jeff Garzik @ 2006-03-21 17:35 UTC (permalink / raw) To: albertl; +Cc: IDE Linux, Doug Maxey, edmudama Albert Lee wrote: > handle ERR=1 DRQ=1 for PIO protocol > > Signed-off-by: Albert Lee <albertcc@tw.ibm.com> > --- > > Some PIO read commands might set DRQ=1 alone with ERR=1. > Under such situation, the data block should be transferred even if err=1. > > This patch adds extra logic to HSM_ST_ERR to handle the required > data transfer. > > BTW, also found the following work in progress: > "Disable Data Transfer after Error Detection" > http://t13.org/docs2006/dt1825Lr3-DDT_TR.pdf > If this feature is turned on, then DRQ will be guaranteed 0 when ERR=1 > which would simplify the PIO error handling. > > Patch against the irq-pio branch. > Need your review and advice, thanks. If DRQ==1 then the error register and ATA_ERR should be considered irrelevant. I agree with Eric, that's more of a "don't touch it, other than to SRST it" situation. Jeff ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH/RFC] libata-dev: handle ERR=1 DRQ=1 2006-03-21 17:35 ` Jeff Garzik @ 2006-03-21 17:43 ` Eric D. Mudama 2006-03-23 8:04 ` [PATCH 1/2] libata-dev: irq-pio minor fixes Albert Lee 2006-03-23 8:26 ` [PATCH/RFC 2/2] libata-dev: fix the device err check sequence Albert Lee 1 sibling, 1 reply; 11+ messages in thread From: Eric D. Mudama @ 2006-03-21 17:43 UTC (permalink / raw) To: Jeff Garzik; +Cc: albertl, IDE Linux, Doug Maxey On 3/21/06, Jeff Garzik <jgarzik@pobox.com> wrote: > If DRQ==1 then the error register and ATA_ERR should be considered > irrelevant. I agree with Eric, that's more of a "don't touch it, other > than to SRST it" situation. > > Jeff Well, some applications want to know the drive's best-guess at the data, as there are some cases where the resulting data resembles the original and might be useful even in its partial-block form. Of course, many errors result in absolute garbage, so it's probably only useful for a very limited set of applications, that in most cases already know roughly what type of data is in the bad block. --eric ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/2] libata-dev: irq-pio minor fixes 2006-03-21 17:43 ` Eric D. Mudama @ 2006-03-23 8:04 ` Albert Lee 2006-03-23 8:18 ` Jeff Garzik 2006-03-24 17:28 ` Jeff Garzik 0 siblings, 2 replies; 11+ messages in thread From: Albert Lee @ 2006-03-23 8:04 UTC (permalink / raw) To: Jeff Garzik; +Cc: IDE Linux, Doug Maxey irq-pio minor fixes for printk() and comments. Signed-off-by: Albert Lee <albertcc@tw.ibm.com> --- Patch against previous irq-pio snapshot. The current irq-pio branch (949ec2c8e6b7b89179b85baf6309c009e1a1b951) looks strange. Maybe it is under refresh. --- irq-pio-00/drivers/scsi/libata-core.c 2006-03-23 11:10:58.000000000 +0800 +++ irq-pio-deverr0/drivers/scsi/libata-core.c 2006-03-23 14:56:03.000000000 +0800 @@ -1283,7 +1283,7 @@ static int ata_dev_configure(struct ata_ if (dev->id[59] & 0x100) { dev->multi_count = dev->id[59] & 0xff; DPRINTK("ata%u: dev %u multi count %u\n", - ap->id, device, dev->multi_count); + ap->id, dev->devno, dev->multi_count); } dev->cdb_len = 16; @@ -3562,6 +3562,9 @@ static int ata_hsm_move(struct ata_port } fsm_start: + DPRINTK("ata%u: protocol %d task_state %d (dev_stat 0x%X)\n", + ap->id, qc->tf.protocol, ap->hsm_task_state, status); + switch (ap->hsm_task_state) { case HSM_ST_FIRST: /* Send first data block or PACKET CDB */ @@ -3688,6 +3691,7 @@ fsm_start: ap->hsm_task_state = HSM_ST_IDLE; + /* complete taskfile transaction */ if (in_wq) ata_poll_qc_complete(qc); else @@ -4441,9 +4445,6 @@ inline unsigned int ata_host_intr (struc if (unlikely(status & ATA_BUSY)) goto idle_irq; - DPRINTK("ata%u: protocol %d task_state %d (dev_stat 0x%X)\n", - ap->id, qc->tf.protocol, ap->hsm_task_state, status); - /* ack bmdma irq events */ ap->ops->irq_clear(ap); ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] libata-dev: irq-pio minor fixes 2006-03-23 8:04 ` [PATCH 1/2] libata-dev: irq-pio minor fixes Albert Lee @ 2006-03-23 8:18 ` Jeff Garzik 2006-03-24 17:28 ` Jeff Garzik 1 sibling, 0 replies; 11+ messages in thread From: Jeff Garzik @ 2006-03-23 8:18 UTC (permalink / raw) To: albertl; +Cc: IDE Linux, Doug Maxey Albert Lee wrote: > irq-pio minor fixes for printk() and comments. > > Signed-off-by: Albert Lee <albertcc@tw.ibm.com> > --- > > Patch against previous irq-pio snapshot. > The current irq-pio branch (949ec2c8e6b7b89179b85baf6309c009e1a1b951) > looks strange. Maybe it is under refresh. It was rebased, which means you need to do a fresh pull. I apologize for not warning you and linux-ide. Something like git reset --hard 949ec2c8e6b7b89179b85baf6309c009e1a1b951 may work. Or simply git checkout -f irq-pio Or, as you say, it may be mirroring from master.kernel.org to the two public-facing kernel.org machines. Jeff ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] libata-dev: irq-pio minor fixes 2006-03-23 8:04 ` [PATCH 1/2] libata-dev: irq-pio minor fixes Albert Lee 2006-03-23 8:18 ` Jeff Garzik @ 2006-03-24 17:28 ` Jeff Garzik 2006-03-25 10:07 ` [PATCH 1/3] libata-dev: irq-pio minor fixes (respin) Albert Lee ` (2 more replies) 1 sibling, 3 replies; 11+ messages in thread From: Jeff Garzik @ 2006-03-24 17:28 UTC (permalink / raw) To: albertl; +Cc: IDE Linux, Doug Maxey Albert Lee wrote: > irq-pio minor fixes for printk() and comments. > > Signed-off-by: Albert Lee <albertcc@tw.ibm.com> ACK patches 1-2, but patch #1 didn't apply. I'll delete these, and wait for you to resend. Jeff ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/3] libata-dev: irq-pio minor fixes (respin) 2006-03-24 17:28 ` Jeff Garzik @ 2006-03-25 10:07 ` Albert Lee 2006-03-29 22:22 ` Jeff Garzik 2006-03-25 10:11 ` [PATCH 2/3] libata-dev: fix the device err check sequence (respin) Albert Lee 2006-03-25 10:18 ` [PATCH 3/3] libata-dev: wait idle after reading the last data block Albert Lee 2 siblings, 1 reply; 11+ messages in thread From: Albert Lee @ 2006-03-25 10:07 UTC (permalink / raw) To: Jeff Garzik; +Cc: IDE Linux, Doug Maxey irq-pio minor fixes for printk() and comments. Signed-off-by: Albert Lee <albertcc@tw.ibm.com> --- Patch against the irq-pio branch. --- b7/drivers/scsi/libata-core.c 2006-03-25 16:51:06.000000000 +0800 +++ c1/drivers/scsi/libata-core.c 2006-03-25 16:58:19.000000000 +0800 @@ -1295,7 +1295,7 @@ static int ata_dev_configure(struct ata_ if (dev->id[59] & 0x100) { dev->multi_count = dev->id[59] & 0xff; DPRINTK("ata%u: dev %u multi count %u\n", - ap->id, device, dev->multi_count); + ap->id, dev->devno, dev->multi_count); } dev->cdb_len = 16; @@ -3594,6 +3594,9 @@ static int ata_hsm_move(struct ata_port } fsm_start: + DPRINTK("ata%u: protocol %d task_state %d (dev_stat 0x%X)\n", + ap->id, qc->tf.protocol, ap->hsm_task_state, status); + switch (ap->hsm_task_state) { case HSM_ST_FIRST: /* Send first data block or PACKET CDB */ @@ -3720,6 +3723,7 @@ fsm_start: ap->hsm_task_state = HSM_ST_IDLE; + /* complete taskfile transaction */ if (in_wq) ata_poll_qc_complete(qc); else @@ -4241,9 +4245,6 @@ inline unsigned int ata_host_intr (struc if (unlikely(status & ATA_BUSY)) goto idle_irq; - DPRINTK("ata%u: protocol %d task_state %d (dev_stat 0x%X)\n", - ap->id, qc->tf.protocol, ap->hsm_task_state, status); - /* ack bmdma irq events */ ap->ops->irq_clear(ap); ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/3] libata-dev: irq-pio minor fixes (respin) 2006-03-25 10:07 ` [PATCH 1/3] libata-dev: irq-pio minor fixes (respin) Albert Lee @ 2006-03-29 22:22 ` Jeff Garzik 0 siblings, 0 replies; 11+ messages in thread From: Jeff Garzik @ 2006-03-29 22:22 UTC (permalink / raw) To: albertl; +Cc: IDE Linux, Doug Maxey Albert Lee wrote: > irq-pio minor fixes for printk() and comments. > > Signed-off-by: Albert Lee <albertcc@tw.ibm.com> applied 1-3, and patches 1-10 of previous series, to irq-pio branch ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 2/3] libata-dev: fix the device err check sequence (respin) 2006-03-24 17:28 ` Jeff Garzik 2006-03-25 10:07 ` [PATCH 1/3] libata-dev: irq-pio minor fixes (respin) Albert Lee @ 2006-03-25 10:11 ` Albert Lee 2006-03-25 10:18 ` [PATCH 3/3] libata-dev: wait idle after reading the last data block Albert Lee 2 siblings, 0 replies; 11+ messages in thread From: Albert Lee @ 2006-03-25 10:11 UTC (permalink / raw) To: Jeff Garzik; +Cc: IDE Linux, Doug Maxey Current irq-pio checks ERR bit and stops on ERR before it does anything else. This behavior doesn't look right. The DRQ bit should take higher precedence than the ERR bit. Changes: - Let the HSM do the data transfer whenever the device asks for DRQ bit, even if the ERR bit is set. - For DRQ=1 ERR=1, don't trust the data Signed-off-by: Albert Lee <albertcc@tw.ibm.com> --- Revised per previous comments of Jeff and Eric. The device error check in the entrance is removed. Given the data returned on ERR=1 DRQ=1 might be corrupted or partial, the device err check is added after DRQ check and before the data transfer. AC_ERR_DEV is set to err_mask and tell the upper layer that the data does not look good. --- c1/drivers/scsi/libata-core.c 2006-03-25 16:58:19.000000000 +0800 +++ c2/drivers/scsi/libata-core.c 2006-03-25 16:59:26.000000000 +0800 @@ -3587,12 +3587,6 @@ static int ata_hsm_move(struct ata_port */ WARN_ON(in_wq != ata_hsm_ok_in_wq(ap, qc)); - /* check error */ - if (unlikely(status & (ATA_ERR | ATA_DF))) { - qc->err_mask |= AC_ERR_DEV; - ap->hsm_task_state = HSM_ST_ERR; - } - fsm_start: DPRINTK("ata%u: protocol %d task_state %d (dev_stat 0x%X)\n", ap->id, qc->tf.protocol, ap->hsm_task_state, status); @@ -3615,6 +3609,17 @@ fsm_start: goto fsm_start; } + /* Device should not ask for data transfer (DRQ=1) + * when it finds something wrong. + * Anyway, we respect DRQ here and let HSM go on + * without changing hsm_task_state to HSM_ST_ERR. + */ + if (unlikely(status & (ATA_ERR | ATA_DF))) { + printk(KERN_WARNING "ata%d: DRQ=1 with device error, dev_stat 0x%X\n", + ap->id, status); + qc->err_mask |= AC_ERR_DEV; + } + /* Send the CDB (atapi) or the first data block (ata pio out). * During the state transition, interrupt handler shouldn't * be invoked before the data transfer is complete and @@ -3657,6 +3662,17 @@ fsm_start: goto fsm_start; } + /* Device should not ask for data transfer (DRQ=1) + * when it finds something wrong. + * Anyway, we respect DRQ here and let HSM go on + * without changing hsm_task_state to HSM_ST_ERR. + */ + if (unlikely(status & (ATA_ERR | ATA_DF))) { + printk(KERN_WARNING "ata%d: DRQ=1 with device error, dev_stat 0x%X\n", + ap->id, status); + qc->err_mask |= AC_ERR_DEV; + } + atapi_pio_bytes(qc); if (unlikely(ap->hsm_task_state == HSM_ST_ERR)) @@ -3672,6 +3688,22 @@ fsm_start: goto fsm_start; } + /* Some devices may ask for data transfer (DRQ=1) + * alone with ERR=1 for PIO reads. + * We respect DRQ here and let HSM go on without + * changing hsm_task_state to HSM_ST_ERR. + */ + if (unlikely(status & (ATA_ERR | ATA_DF))) { + /* For writes, ERR=1 DRQ=1 doesn't make + * sense since the data block has been + * transferred to the device. + */ + WARN_ON(qc->tf.flags & ATA_TFLAG_WRITE); + + /* data might be corrputed */ + qc->err_mask |= AC_ERR_DEV; + } + ata_pio_sectors(qc); if (ap->hsm_task_state == HSM_ST_LAST && ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 3/3] libata-dev: wait idle after reading the last data block 2006-03-24 17:28 ` Jeff Garzik 2006-03-25 10:07 ` [PATCH 1/3] libata-dev: irq-pio minor fixes (respin) Albert Lee 2006-03-25 10:11 ` [PATCH 2/3] libata-dev: fix the device err check sequence (respin) Albert Lee @ 2006-03-25 10:18 ` Albert Lee 2 siblings, 0 replies; 11+ messages in thread From: Albert Lee @ 2006-03-25 10:18 UTC (permalink / raw) To: Jeff Garzik; +Cc: IDE Linux, Doug Maxey Some CD-ROM drives are slow to clear DRQ, after the last data block is read by PIO. Use ata_wait_idle() after reading the last data block. Signed-off-by: Albert Lee <albertcc@tw.ibm.com> --- --- c2/drivers/scsi/libata-core.c 2006-03-25 16:59:26.000000000 +0800 +++ c3/drivers/scsi/libata-core.c 2006-03-25 17:01:29.000000000 +0800 @@ -3710,7 +3710,7 @@ fsm_start: (!(qc->tf.flags & ATA_TFLAG_WRITE))) { /* all data read */ ata_altstatus(ap); - status = ata_chk_status(ap); + status = ata_wait_idle(ap); goto fsm_start; } } ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH/RFC 2/2] libata-dev: fix the device err check sequence 2006-03-21 17:35 ` Jeff Garzik 2006-03-21 17:43 ` Eric D. Mudama @ 2006-03-23 8:26 ` Albert Lee 1 sibling, 0 replies; 11+ messages in thread From: Albert Lee @ 2006-03-23 8:26 UTC (permalink / raw) To: Jeff Garzik; +Cc: IDE Linux, Doug Maxey, edmudama Current irq-pio checks ERR bit and stops on ERR before it does anything else. This behavior doesn't look right. The DRQ bit should take higher precedence than the ERR bit. Changes: - Let the HSM do the data transfer whenever the device asks for DRQ bit, even if the ERR bit is set. - For DRQ=1 ERR=1, don't trust the data Signed-off-by: Albert Lee <albertcc@tw.ibm.com> --- Revised per previous comments of Jeff and Eric. The device error check in the entrance is removed. Given the data returned on ERR=1 DRQ=1 might be corrupted or partial, the device err check is added after DRQ check and before the data transfer. We set AC_ERR_DEV to err_mask and tell the upper layer that the data does not look good. For your review, thanks. --- irq-pio-deverr0/drivers/scsi/libata-core.c 2006-03-23 14:56:03.000000000 +0800 +++ irq-pio-deverr/drivers/scsi/libata-core.c 2006-03-23 15:26:03.000000000 +0800 @@ -3555,12 +3555,6 @@ static int ata_hsm_move(struct ata_port */ WARN_ON(in_wq != ata_hsm_ok_in_wq(ap, qc)); - /* check error */ - if (unlikely(status & (ATA_ERR | ATA_DF))) { - qc->err_mask |= AC_ERR_DEV; - ap->hsm_task_state = HSM_ST_ERR; - } - fsm_start: DPRINTK("ata%u: protocol %d task_state %d (dev_stat 0x%X)\n", ap->id, qc->tf.protocol, ap->hsm_task_state, status); @@ -3583,6 +3577,17 @@ fsm_start: goto fsm_start; } + /* Device should not ask for data transfer (DRQ=1) + * when it finds something wrong. + * Anyway, we respect DRQ here and let HSM go on + * without changing hsm_task_state to HSM_ST_ERR. + */ + if (unlikely(status & (ATA_ERR | ATA_DF))) { + printk(KERN_WARNING "ata%d: DRQ=1 with device error, dev_stat 0x%X\n", + ap->id, status); + qc->err_mask |= AC_ERR_DEV; + } + /* Send the CDB (atapi) or the first data block (ata pio out). * During the state transition, interrupt handler shouldn't * be invoked before the data transfer is complete and @@ -3625,6 +3630,17 @@ fsm_start: goto fsm_start; } + /* Device should not ask for data transfer (DRQ=1) + * when it finds something wrong. + * Anyway, we respect DRQ here and let HSM go on + * without changing hsm_task_state to HSM_ST_ERR. + */ + if (unlikely(status & (ATA_ERR | ATA_DF))) { + printk(KERN_WARNING "ata%d: DRQ=1 with device error, dev_stat 0x%X\n", + ap->id, status); + qc->err_mask |= AC_ERR_DEV; + } + atapi_pio_bytes(qc); if (unlikely(ap->hsm_task_state == HSM_ST_ERR)) @@ -3640,6 +3656,22 @@ fsm_start: goto fsm_start; } + /* Some devices may ask for data transfer (DRQ=1) + * alone with ERR=1 for PIO reads. + * We respect DRQ here and let HSM go on without + * changing hsm_task_state to HSM_ST_ERR. + */ + if (unlikely(status & (ATA_ERR | ATA_DF))) { + /* For writes, ERR=1 DRQ=1 doesn't make + * sense since the data block has been + * transferred to the device. + */ + WARN_ON(qc->tf.flags & ATA_TFLAG_WRITE); + + /* data might be corrputed */ + qc->err_mask |= AC_ERR_DEV; + } + ata_pio_sectors(qc); if (ap->hsm_task_state == HSM_ST_LAST && ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2006-03-29 22:23 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-03-21 11:39 [PATCH/RFC] libata-dev: handle ERR=1 DRQ=1 Albert Lee 2006-03-21 17:35 ` Jeff Garzik 2006-03-21 17:43 ` Eric D. Mudama 2006-03-23 8:04 ` [PATCH 1/2] libata-dev: irq-pio minor fixes Albert Lee 2006-03-23 8:18 ` Jeff Garzik 2006-03-24 17:28 ` Jeff Garzik 2006-03-25 10:07 ` [PATCH 1/3] libata-dev: irq-pio minor fixes (respin) Albert Lee 2006-03-29 22:22 ` Jeff Garzik 2006-03-25 10:11 ` [PATCH 2/3] libata-dev: fix the device err check sequence (respin) Albert Lee 2006-03-25 10:18 ` [PATCH 3/3] libata-dev: wait idle after reading the last data block Albert Lee 2006-03-23 8:26 ` [PATCH/RFC 2/2] libata-dev: fix the device err check sequence Albert Lee
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).