linux-ide.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] libata-2.6: Fix races caused by the interrupt handler
@ 2005-04-29  9:27 Albert Lee
  2005-04-29  9:34 ` [PATCH 1/2] libata-2.6: Prevent the interrupt handler from completing a command twice Albert Lee
  2005-04-29  9:39 ` [PATCH 2/2] libata-2.6: Ignore interrupt before the ATAPI CDB is sent to the device Albert Lee
  0 siblings, 2 replies; 12+ messages in thread
From: Albert Lee @ 2005-04-29  9:27 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Bartlomiej Zolnierkiewicz, Doug Maxey, Jens Axboe, Linux IDE

Hi Jeff:

Two patches on the libata ATAPI interrupt handling for your review:
- patch 1/2: fix race between the interrupt handler and the scsi error handler.
- patch 2/2: fix race between the interrupt handler and atapi_packet_task().

Thanks,

Albert



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/2] libata-2.6: Prevent the interrupt handler from completing a command twice
  2005-04-29  9:27 [PATCH 0/2] libata-2.6: Fix races caused by the interrupt handler Albert Lee
@ 2005-04-29  9:34 ` Albert Lee
  2005-05-15 22:47   ` Jeff Garzik
  2005-04-29  9:39 ` [PATCH 2/2] libata-2.6: Ignore interrupt before the ATAPI CDB is sent to the device Albert Lee
  1 sibling, 1 reply; 12+ messages in thread
From: Albert Lee @ 2005-04-29  9:34 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Bartlomiej Zolnierkiewicz, Doug Maxey, Jens Axboe, Linux IDE

[-- Attachment #1: Type: text/plain, Size: 782 bytes --]

Hi Jeff,

Problem:
   During the libata CD-ROM stress test, sometimes the "BUG: timeout without command" error is seen.

Root cause:
  Unexpected interrupt occurs after the ata_qc_complete() is called, but before the SCSI error handler.
The interrupt handler is invoked before the SCSI error handler, and it clears the command by calling ata_qc_complete() again.
Later when the SCSI error handler is run, the ata_queued_cmd is already gone, causing the "BUG: timeout without command" error.

Changes:
  - Use the ATA_QCFLAG_ACTIVE flag to prevent the interrupt handler from completing the command twice,
    before the scsi_error_handler.

Attached please find the patch against the libata-2.6 tree
for your review. Thanks.


Albert

Signed-off-by: Albert Lee <albertcc@tw.ibm.com>

[-- Attachment #2: timeout_without_command.diff --]
[-- Type: text/plain, Size: 939 bytes --]

--- linux-2.6.11.7/drivers/scsi/libata-core.c	2005-04-08 02:57:08.000000000 +0800
+++ linux-2.6.11.7-twe/drivers/scsi/libata-core.c	2005-04-25 16:19:05.000000000 +0800
@@ -2539,7 +2539,7 @@
 	ata_sg_init_one(qc, cmd->sense_buffer, sizeof(cmd->sense_buffer));
 	qc->dma_dir = DMA_FROM_DEVICE;
 
-	memset(&qc->cdb, 0, sizeof(ap->cdb_len));
+	memset(&qc->cdb, 0, ap->cdb_len);
 	qc->cdb[0] = REQUEST_SENSE;
 	qc->cdb[4] = SCSI_SENSE_BUFFERSIZE;
 
@@ -2811,6 +2811,7 @@
 
 	/* call completion callback */
 	rc = qc->complete_fn(qc, drv_stat);
+	qc->flags &= ~ATA_QCFLAG_ACTIVE;
 
 	/* if callback indicates not to complete command (non-zero),
 	 * return immediately
@@ -3229,7 +3230,8 @@
 			struct ata_queued_cmd *qc;
 
 			qc = ata_qc_from_tag(ap, ap->active_tag);
-			if (qc && (!(qc->tf.ctl & ATA_NIEN)))
+			if (qc && (!(qc->tf.ctl & ATA_NIEN)) &&
+			    (qc->flags & ATA_QCFLAG_ACTIVE))
 				handled |= ata_host_intr(ap, qc);
 		}
 	}

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 2/2] libata-2.6: Ignore interrupt before the ATAPI CDB is sent to the device
  2005-04-29  9:27 [PATCH 0/2] libata-2.6: Fix races caused by the interrupt handler Albert Lee
  2005-04-29  9:34 ` [PATCH 1/2] libata-2.6: Prevent the interrupt handler from completing a command twice Albert Lee
@ 2005-04-29  9:39 ` Albert Lee
  2005-05-15 22:49   ` Jeff Garzik
  1 sibling, 1 reply; 12+ messages in thread
From: Albert Lee @ 2005-04-29  9:39 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Bartlomiej Zolnierkiewicz, Doug Maxey, Jens Axboe, Linux IDE

[-- Attachment #1: Type: text/plain, Size: 525 bytes --]

Hi Jeff,

Problem:
   Sometimes the interrupt handler is invoked before atapi_packet_task() sending CDB to the device.
The interrupt handler completes the command and later when atapi_packet_task() runs, the command is already gone.

Changes:
  - Add 2 PIO states to libata.h.
  - Prevent the interrupt handler from completing the command before the ATAPI CDB is sent to the device.

Attached please find the patch against the libata-2.6 tree for your review. Thanks.

Albert

Signed-off-by: Albert Lee <albertcc@tw.ibm.com>

[-- Attachment #2: atapi_state_fix.diff --]
[-- Type: text/plain, Size: 2520 bytes --]

--- linux-2.6.11.7-twe/include/linux/libata.h	2005-04-25 14:59:15.000000000 +0800
+++ linux-2.6.11.7-mod/include/linux/libata.h	2005-04-25 15:06:25.000000000 +0800
@@ -161,6 +161,8 @@
 	PIO_ST_LAST,
 	PIO_ST_LAST_POLL,
 	PIO_ST_ERR,
+	PIO_ST_PKT,
+	PIO_ST_CDB_SENT,
 };
 
 /* forward declarations */
--- linux-2.6.11.7-twe/drivers/scsi/libata-core.c	2005-04-25 16:19:05.000000000 +0800
+++ linux-2.6.11.7-mod/drivers/scsi/libata-core.c	2005-04-29 16:20:32.000000000 +0800
@@ -2632,6 +2632,7 @@
 	default:
 		ata_altstatus(ap);
 		drv_stat = ata_chk_status(ap);
+		ap->pio_task_state = PIO_ST_IDLE;
 
 		/* ack bmdma irq events */
 		ap->ops->irq_clear(ap);
@@ -2934,17 +2935,20 @@
 	case ATA_PROT_ATAPI:
 		ata_qc_set_polling(qc);
 		ata_tf_to_host_nolock(ap, &qc->tf);
+		ap->pio_task_state = PIO_ST_PKT;
 		queue_work(ata_wq, &ap->packet_task);
 		break;
 
 	case ATA_PROT_ATAPI_NODATA:
 		ata_tf_to_host_nolock(ap, &qc->tf);
+		ap->pio_task_state = PIO_ST_PKT;
 		queue_work(ata_wq, &ap->packet_task);
 		break;
 
 	case ATA_PROT_ATAPI_DMA:
 		ap->ops->tf_load(ap, &qc->tf);	 /* load tf registers */
 		ap->ops->bmdma_setup(qc);	    /* set up bmdma */
+		ap->pio_task_state = PIO_ST_PKT;
 		queue_work(ata_wq, &ap->packet_task);
 		break;
 
@@ -3142,6 +3146,13 @@
 {
 	u8 status, host_stat;
 
+	/* ATAPI: Ignore interrupt if CDB is not sent yet  */
+	if (is_atapi_taskfile(&qc->tf) &&
+	    ap->pio_task_state != PIO_ST_CDB_SENT) {
+		DPRINTK("ata%u: not my atapi interrupt\n", ap->id);
+		goto idle_irq;
+	}
+
 	switch (qc->tf.protocol) {
 
 	case ATA_PROT_DMA:
@@ -3174,6 +3185,8 @@
 		DPRINTK("ata%u: protocol %d (dev_stat 0x%X)\n",
 			ap->id, qc->tf.protocol, status);
 
+		ap->pio_task_state = PIO_ST_IDLE;
+
 		/* ack bmdma irq events */
 		ap->ops->irq_clear(ap);
 
@@ -3260,6 +3273,7 @@
 	struct ata_port *ap = _data;
 	struct ata_queued_cmd *qc;
 	u8 status;
+	unsigned long flags;
 
 	qc = ata_qc_from_tag(ap, ap->active_tag);
 	assert(qc != NULL);
@@ -3275,10 +3289,13 @@
 	if ((status & (ATA_BUSY | ATA_DRQ)) != ATA_DRQ)
 		goto err_out;
 
+	spin_lock_irqsave(&ap->host_set->lock, flags);
+
 	/* send SCSI cdb */
 	DPRINTK("send cdb\n");
 	assert(ap->cdb_len >= 12);
 	ata_data_xfer(ap, qc->cdb, ap->cdb_len, 1);
+	ap->pio_task_state = PIO_ST_CDB_SENT;
 
 	/* if we are DMA'ing, irq handler takes over from here */
 	if (qc->tf.protocol == ATA_PROT_ATAPI_DMA)
@@ -3295,6 +3312,8 @@
 		queue_work(ata_wq, &ap->pio_task);
 	}
 
+	spin_unlock_irqrestore(&ap->host_set->lock, flags);
+
 	return;
 
 err_out:

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] libata-2.6: Prevent the interrupt handler from completing a command twice
  2005-04-29  9:34 ` [PATCH 1/2] libata-2.6: Prevent the interrupt handler from completing a command twice Albert Lee
@ 2005-05-15 22:47   ` Jeff Garzik
  0 siblings, 0 replies; 12+ messages in thread
From: Jeff Garzik @ 2005-05-15 22:47 UTC (permalink / raw)
  To: Albert Lee; +Cc: Bartlomiej Zolnierkiewicz, Doug Maxey, Jens Axboe, Linux IDE

Albert Lee wrote:
> Hi Jeff,
> 
> Problem:
>   During the libata CD-ROM stress test, sometimes the "BUG: timeout 
> without command" error is seen.
> 
> Root cause:
>  Unexpected interrupt occurs after the ata_qc_complete() is called, but 
> before the SCSI error handler.
> The interrupt handler is invoked before the SCSI error handler, and it 
> clears the command by calling ata_qc_complete() again.
> Later when the SCSI error handler is run, the ata_queued_cmd is already 
> gone, causing the "BUG: timeout without command" error.
> 
> Changes:
>  - Use the ATA_QCFLAG_ACTIVE flag to prevent the interrupt handler from 
> completing the command twice,
>    before the scsi_error_handler.

patch applied



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] libata-2.6: Ignore interrupt before the ATAPI CDB is sent to the device
  2005-04-29  9:39 ` [PATCH 2/2] libata-2.6: Ignore interrupt before the ATAPI CDB is sent to the device Albert Lee
@ 2005-05-15 22:49   ` Jeff Garzik
  2005-05-16 10:37     ` Albert Lee
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff Garzik @ 2005-05-15 22:49 UTC (permalink / raw)
  To: Albert Lee; +Cc: Bartlomiej Zolnierkiewicz, Doug Maxey, Jens Axboe, Linux IDE

Albert Lee wrote:
> Hi Jeff,
> 
> Problem:
>   Sometimes the interrupt handler is invoked before atapi_packet_task() 
> sending CDB to the device.
> The interrupt handler completes the command and later when 
> atapi_packet_task() runs, the command is already gone.
> 
> Changes:
>  - Add 2 PIO states to libata.h.
>  - Prevent the interrupt handler from completing the command before the 
> ATAPI CDB is sent to the device.
> 
> Attached please find the patch against the libata-2.6 tree for your 
> review. Thanks.
> 
> Albert
> 
> Signed-off-by: Albert Lee <albertcc@tw.ibm.com>

There is an additional complication:

As specified in IDENTIFY PACKET DEVICE Word 0, in some older ATAPI 
devices, the device will assert an interrupt after the CDB has been 
submitted to the device.

Your patch is moving in the right direction...   but I think we should 
check Word 0 and perform actions accordingly, as some ATAPI devices 
appear to expect it.

I wonder if your ATAPI device is one of these?

	Jeff




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] libata-2.6: Ignore interrupt before the ATAPI CDB is sent to the device
  2005-05-15 22:49   ` Jeff Garzik
@ 2005-05-16 10:37     ` Albert Lee
  2005-06-08  8:02       ` [PATCH 1/1] libata: Handle ATAPI interrupt before CDB is sent Albert Lee
  2005-06-28  3:06       ` [PATCH 2/2] libata-2.6: Ignore interrupt before the ATAPI CDB is sent to the device Jeff Garzik
  0 siblings, 2 replies; 12+ messages in thread
From: Albert Lee @ 2005-05-16 10:37 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Bartlomiej Zolnierkiewicz, Doug Maxey, Jens Axboe, Linux IDE

[-- Attachment #1: Type: text/plain, Size: 1573 bytes --]


Jeff:

> 
> There is an additional complication:
> 
> As specified in IDENTIFY PACKET DEVICE Word 0, in some older ATAPI 
> devices, the device will assert an interrupt after the CDB has been 
> submitted to the device.
> 
> Your patch is moving in the right direction...   but I think we should 
> check Word 0 and perform actions accordingly, as some ATAPI devices 
> appear to expect it.

I didn't consider this case.
Just take a look at the ATA-4 draft spec:

http://www.t13.org/project/d1153r18-ATA-ATAPI-4.pdf
(Fig. 16, page. 236)
"Some devices prior to ATA/ATAPI-4 assert INTRQ if enabled at this point. See
IDENTIFY PACKET DEVICE, word 0, bits 5-6 to determine if an interrupt will occur."

The interrupt seems to be after PACKET but before CDB is sent.
It looks like a good news, since the previous patch can almost handle such interrupt.
We can add IDENTIFY PACKET DEVICE Word 0 checking, if bit 6-5 is 01, then we return irq_handled
instead of ignoring the interrupt. Attached please find the revised patch for your review.


> 
> I wonder if your ATAPI device is one of these?
> 

My ATAPI devices at hand do not generate the interrupt before CDB is sent.
(I wish I have one to test this patch.)
The unexpected interrupt is caused by a pdc2027x hardware problem:
When both primary/secondary channels of the pdc20275 are stressed,
sometimes the interrupt in the 2nd channel will cause the ATA_DMA_INTR bit of the 1st
channel to be set. If only 1 channel is used, the unexpected interrupt is never seen.


Albert

Signed-off-by: Albert Lee <albertcc@tw.ibm.com>

[-- Attachment #2: atapi_state_fix.diff --]
[-- Type: text/plain, Size: 2671 bytes --]

--- linux-2.6.11.7-twe/include/linux/libata.h	2005-04-25 14:59:15.000000000 +0800
+++ linux-2.6.11.7-mod/include/linux/libata.h	2005-04-25 15:06:25.000000000 +0800
@@ -161,6 +161,8 @@
 	PIO_ST_LAST,
 	PIO_ST_LAST_POLL,
 	PIO_ST_ERR,
+	PIO_ST_PKT,
+	PIO_ST_CDB_SENT,
 };
 
 /* forward declarations */
--- linux-2.6.11.7-twe/drivers/scsi/libata-core.c	2005-04-25 16:19:05.000000000 +0800
+++ linux-2.6.11.7-mod/drivers/scsi/libata-core.c	2005-05-16 18:15:55.000000000 +0800
@@ -2632,6 +2632,7 @@
 	default:
 		ata_altstatus(ap);
 		drv_stat = ata_chk_status(ap);
+		ap->pio_task_state = PIO_ST_IDLE;
 
 		/* ack bmdma irq events */
 		ap->ops->irq_clear(ap);
@@ -2934,17 +2935,20 @@
 	case ATA_PROT_ATAPI:
 		ata_qc_set_polling(qc);
 		ata_tf_to_host_nolock(ap, &qc->tf);
+		ap->pio_task_state = PIO_ST_PKT;
 		queue_work(ata_wq, &ap->packet_task);
 		break;
 
 	case ATA_PROT_ATAPI_NODATA:
 		ata_tf_to_host_nolock(ap, &qc->tf);
+		ap->pio_task_state = PIO_ST_PKT;
 		queue_work(ata_wq, &ap->packet_task);
 		break;
 
 	case ATA_PROT_ATAPI_DMA:
 		ap->ops->tf_load(ap, &qc->tf);	 /* load tf registers */
 		ap->ops->bmdma_setup(qc);	    /* set up bmdma */
+		ap->pio_task_state = PIO_ST_PKT;
 		queue_work(ata_wq, &ap->packet_task);
 		break;
 
@@ -3142,6 +3146,18 @@
 {
 	u8 status, host_stat;
 
+	/* ATAPI: Ignore interrupt if CDB is not sent yet  */
+	if (is_atapi_taskfile(&qc->tf) &&
+	    ap->pio_task_state != PIO_ST_CDB_SENT) {
+		if ((qc->dev->id[0] & 0x60) == 0x20) {
+			DPRINTK("ata%u: atapi interrupt handled\n", ap->id);
+			return 1;	/* irq handled */
+		} else {
+			DPRINTK("ata%u: atapi interrupt ignored\n", ap->id);
+			goto idle_irq;
+		}
+	}
+
 	switch (qc->tf.protocol) {
 
 	case ATA_PROT_DMA:
@@ -3174,6 +3190,8 @@
 		DPRINTK("ata%u: protocol %d (dev_stat 0x%X)\n",
 			ap->id, qc->tf.protocol, status);
 
+		ap->pio_task_state = PIO_ST_IDLE;
+
 		/* ack bmdma irq events */
 		ap->ops->irq_clear(ap);
 
@@ -3260,6 +3278,7 @@
 	struct ata_port *ap = _data;
 	struct ata_queued_cmd *qc;
 	u8 status;
+	unsigned long flags;
 
 	qc = ata_qc_from_tag(ap, ap->active_tag);
 	assert(qc != NULL);
@@ -3275,10 +3294,13 @@
 	if ((status & (ATA_BUSY | ATA_DRQ)) != ATA_DRQ)
 		goto err_out;
 
+	spin_lock_irqsave(&ap->host_set->lock, flags);
+
 	/* send SCSI cdb */
 	DPRINTK("send cdb\n");
 	assert(ap->cdb_len >= 12);
 	ata_data_xfer(ap, qc->cdb, ap->cdb_len, 1);
+	ap->pio_task_state = PIO_ST_CDB_SENT;
 
 	/* if we are DMA'ing, irq handler takes over from here */
 	if (qc->tf.protocol == ATA_PROT_ATAPI_DMA)
@@ -3295,6 +3317,8 @@
 		queue_work(ata_wq, &ap->pio_task);
 	}
 
+	spin_unlock_irqrestore(&ap->host_set->lock, flags);
+
 	return;
 
 err_out:

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/1] libata: Handle ATAPI interrupt before CDB is sent
  2005-05-16 10:37     ` Albert Lee
@ 2005-06-08  8:02       ` Albert Lee
  2005-06-08 10:01         ` Bartlomiej Zolnierkiewicz
  2005-06-28  3:06       ` [PATCH 2/2] libata-2.6: Ignore interrupt before the ATAPI CDB is sent to the device Jeff Garzik
  1 sibling, 1 reply; 12+ messages in thread
From: Albert Lee @ 2005-06-08  8:02 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Bartlomiej Zolnierkiewicz, Doug Maxey, Linux IDE

[-- Attachment #1: Type: text/plain, Size: 1405 bytes --]

Hi Jeff,

Resubmit the revised patch.

Problem:
  atapi_packet_task(): assert(qc != NULL) failed when both primary/secondary IDE channels are stressed.

Root cause:
   Sometimes the interrupt handler is invoked before atapi_packet_task() sends the CDB to the device.
The interrupt handler completes the command. Later when atapi_packet_task() is executed, the command is already gone.

The following sequence causes the problem:
1. A packet command with ATA_PROT_ATAPI_NODATA is written to the CD-ROM drive.
2. atapi_packet_task() is queued.
3. The CD_ROM drive executes and finishes the packet command. It is not busy now.
4. The other IDE channel asserts IRQ.
5. The interrupt handler checks alt_status, seeing device not busy and completes the command.
6. atapi_packet_task() is executed, causing assert() fail.

Another possible cause is the pre-ATA4 devices which generate INTRQ before CDB is sent,
as you pointed out in the previous mail. However, I don't have such drive to test with.

Fix:
  - Add PIO_ST_PKT and PIO_ST_CDB_SENT states.
  - Prevent the interrupt handler from completing the command if the ATAPI CDB is not sent to the device yet.


BTW, it is not the problem of pdc20275 hardware as previously said; pdc20275 is innocent.

Attached please find the patch against the linux-2.6.git tree (2.6.12-rc6) for your review. Thanks.

Albert

Signed-off-by: Albert Lee <albertcc@tw.ibm.com>

[-- Attachment #2: atapi_state_fix.diff --]
[-- Type: text/plain, Size: 3025 bytes --]

--- 10_atapi_pio_fix/include/linux/libata.h	2005-06-06 13:38:51.000000000 +0800
+++ 20_atapi_intr_race_fix/include/linux/libata.h	2005-06-07 11:08:26.000000000 +0800
@@ -161,6 +161,8 @@
 	PIO_ST_LAST,
 	PIO_ST_LAST_POLL,
 	PIO_ST_ERR,
+	PIO_ST_PKT,
+	PIO_ST_CDB_SENT,
 };
 
 /* forward declarations */
--- 10_atapi_pio_fix/drivers/scsi/libata-core.c	2005-06-06 13:37:50.000000000 +0800
+++ 20_atapi_intr_race_fix/drivers/scsi/libata-core.c	2005-06-08 13:41:48.000000000 +0800
@@ -2872,6 +2872,7 @@
 	default:
 		ata_altstatus(ap);
 		drv_stat = ata_chk_status(ap);
+		ap->pio_task_state = PIO_ST_IDLE;
 
 		/* ack bmdma irq events */
 		ap->ops->irq_clear(ap);
@@ -3184,17 +3185,20 @@
 	case ATA_PROT_ATAPI:
 		ata_qc_set_polling(qc);
 		ata_tf_to_host_nolock(ap, &qc->tf);
+		ap->pio_task_state = PIO_ST_PKT;
 		queue_work(ata_wq, &ap->packet_task);
 		break;
 
 	case ATA_PROT_ATAPI_NODATA:
 		ata_tf_to_host_nolock(ap, &qc->tf);
+		ap->pio_task_state = PIO_ST_PKT;
 		queue_work(ata_wq, &ap->packet_task);
 		break;
 
 	case ATA_PROT_ATAPI_DMA:
 		ap->ops->tf_load(ap, &qc->tf);	 /* load tf registers */
 		ap->ops->bmdma_setup(qc);	    /* set up bmdma */
+		ap->pio_task_state = PIO_ST_PKT;
 		queue_work(ata_wq, &ap->packet_task);
 		break;
 
@@ -3457,6 +3461,32 @@
 {
 	u8 status, host_stat;
 
+	/* ATAPI: Handle the interrupt before CDB is sent */
+	if (is_atapi_taskfile(&qc->tf) &&
+	    ap->pio_task_state != PIO_ST_CDB_SENT) {
+		/* check whether it's our irq */
+		host_stat = ap->ops->bmdma_status(ap);
+
+		if (host_stat & ATA_DMA_INTR) {
+			/* Some pre-ATAPI-4 devices assert INTRQ here 
+			 * if nIEN is zero. qc->dev->id[0] bits 5-6 can 
+			 * be used to identify such devices.
+			 */
+			DPRINTK("ata%u: atapi interrupt handled\n", ap->id);
+
+			/* clear INTRQ */
+			status = ata_chk_status(ap);
+
+			/* clear bmdma status irq bit */
+			ap->ops->irq_clear(ap);
+
+			return 1;	/* irq handled */
+		} else {
+			DPRINTK("ata%u: not my atapi interrupt\n", ap->id);
+			goto idle_irq;
+		}
+	}
+
 	switch (qc->tf.protocol) {
 
 	case ATA_PROT_DMA:
@@ -3489,6 +3519,8 @@
 		DPRINTK("ata%u: protocol %d (dev_stat 0x%X)\n",
 			ap->id, qc->tf.protocol, status);
 
+		ap->pio_task_state = PIO_ST_IDLE;
+
 		/* ack bmdma irq events */
 		ap->ops->irq_clear(ap);
 
@@ -3580,6 +3612,7 @@
 	struct ata_port *ap = _data;
 	struct ata_queued_cmd *qc;
 	u8 status;
+	unsigned long flags;
 
 	qc = ata_qc_from_tag(ap, ap->active_tag);
 	assert(qc != NULL);
@@ -3595,10 +3628,13 @@
 	if ((status & (ATA_BUSY | ATA_DRQ)) != ATA_DRQ)
 		goto err_out;
 
+	spin_lock_irqsave(&ap->host_set->lock, flags);
+
 	/* send SCSI cdb */
 	DPRINTK("send cdb\n");
 	assert(ap->cdb_len >= 12);
 	ata_data_xfer(ap, qc->cdb, ap->cdb_len, 1);
+	ap->pio_task_state = PIO_ST_CDB_SENT;
 
 	/* if we are DMA'ing, irq handler takes over from here */
 	if (qc->tf.protocol == ATA_PROT_ATAPI_DMA)
@@ -3615,6 +3651,8 @@
 		queue_work(ata_wq, &ap->pio_task);
 	}
 
+	spin_unlock_irqrestore(&ap->host_set->lock, flags);
+
 	return;
 
 err_out:

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] libata: Handle ATAPI interrupt before CDB is sent
  2005-06-08  8:02       ` [PATCH 1/1] libata: Handle ATAPI interrupt before CDB is sent Albert Lee
@ 2005-06-08 10:01         ` Bartlomiej Zolnierkiewicz
  2005-06-08 12:17           ` Albert Lee
  2005-06-09  7:13           ` Albert Lee
  0 siblings, 2 replies; 12+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2005-06-08 10:01 UTC (permalink / raw)
  To: Albert Lee; +Cc: Jeff Garzik, Doug Maxey, Linux IDE

Hi Albert,
Hi Jeff,

On 6/8/05, Albert Lee <albertcc@tw.ibm.com> wrote:

> @@ -3457,6 +3461,32 @@
>  {
>         u8 status, host_stat;
> 
> +       /* ATAPI: Handle the interrupt before CDB is sent */
> +       if (is_atapi_taskfile(&qc->tf) &&
> +           ap->pio_task_state != PIO_ST_CDB_SENT) {
> +               /* check whether it's our irq */
> +               host_stat = ap->ops->bmdma_status(ap);
> +
> +               if (host_stat & ATA_DMA_INTR) {
> +                       /* Some pre-ATAPI-4 devices assert INTRQ here
> +                        * if nIEN is zero. qc->dev->id[0] bits 5-6 can
> +                        * be used to identify such devices.
> +                        */
> +                       DPRINTK("ata%u: atapi interrupt handled\n", ap->id);

This looks wrong for ATA_PROT_ATAPI_[NODATA],
we can't expect host to support DMA.

> +
> +                       /* clear INTRQ */
> +                       status = ata_chk_status(ap);
> +
> +                       /* clear bmdma status irq bit */
> +                       ap->ops->irq_clear(ap);
> +
> +                       return 1;       /* irq handled */
> +               } else {
> +                       DPRINTK("ata%u: not my atapi interrupt\n", ap->id);
> +                       goto idle_irq;
> +               }
> +       }
> +
>         switch (qc->tf.protocol) {
> 
>         case ATA_PROT_DMA:

	case ATA_PROT_DMA:
	case ATA_PROT_ATAPI_DMA:
	case ATA_PROT_ATAPI:
		/* check status of DMA engine */
		host_stat = ap->ops->bmdma_status(ap);
		VPRINTK("ata%u: host_stat 0x%X\n", ap->id, host_stat);

		/* if it's not our irq... */
		if (!(host_stat & ATA_DMA_INTR))
			goto idle_irq;

		/* before we do anything else, clear DMA-Start bit */
		ap->ops->bmdma_stop(ap);

		/* fall through */

ditto (vanilla libata-core.c)

Bartlomiej

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] libata: Handle ATAPI interrupt before CDB is sent
  2005-06-08 10:01         ` Bartlomiej Zolnierkiewicz
@ 2005-06-08 12:17           ` Albert Lee
  2005-06-09  7:13           ` Albert Lee
  1 sibling, 0 replies; 12+ messages in thread
From: Albert Lee @ 2005-06-08 12:17 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz; +Cc: Jeff Garzik, Doug Maxey, Linux IDE


Hi Bart:

> 
> 
>>@@ -3457,6 +3461,32 @@
>> {
>>        u8 status, host_stat;
>>
>>+       /* ATAPI: Handle the interrupt before CDB is sent */
>>+       if (is_atapi_taskfile(&qc->tf) &&
>>+           ap->pio_task_state != PIO_ST_CDB_SENT) {
>>+               /* check whether it's our irq */
>>+               host_stat = ap->ops->bmdma_status(ap);
>>+
>>+               if (host_stat & ATA_DMA_INTR) {
>>+                       /* Some pre-ATAPI-4 devices assert INTRQ here
>>+                        * if nIEN is zero. qc->dev->id[0] bits 5-6 can
>>+                        * be used to identify such devices.
>>+                        */
>>+                       DPRINTK("ata%u: atapi interrupt handled\n", ap->id);
> 
> 
> This looks wrong for ATA_PROT_ATAPI_[NODATA],
> we can't expect host to support DMA.
> 
> 
Yes, it looks strange.

If the IRQ is shared, the interrupt bit in the BM-DMA status register
seems to be our only hope to distinguish our interrupt from others.
Is there any other better method to make sure it is our interrupt?

Or, maybe no need to make sure it is our interrupt?
We can just assume it is our interrupt and read status register
to clear INTRQ (if asserted) blindly here. It should not hurt anyway.

> 
> 
> 	case ATA_PROT_DMA:
> 	case ATA_PROT_ATAPI_DMA:
> 	case ATA_PROT_ATAPI:
> 		/* check status of DMA engine */
> 		host_stat = ap->ops->bmdma_status(ap);
> 		VPRINTK("ata%u: host_stat 0x%X\n", ap->id, host_stat);
> 
> 		/* if it's not our irq... */
> 		if (!(host_stat & ATA_DMA_INTR))
> 			goto idle_irq;
> 
> 		/* before we do anything else, clear DMA-Start bit */
> 		ap->ops->bmdma_stop(ap);
> 
> 		/* fall through */
> 
> ditto (vanilla libata-core.c)

ATA_PROT_ATAPI is done via PIO polling. So, maybe we can safely change the
'case ATA_PROT_ATAPI' here to return error?


BTW, I'm not sure whether the interrupt bit in the BM-DMA status register
is correct if BM-DMA is not started, so I added a debug patch and
did experiment on my pdc2027x adapter.
It seems the interrupt bit of BM-DMA status register is correct
even the BM-DMA is not started on the pdc2027x. Please see the attached log.
(I am not sure whether other adapters have the same behavior.)
Just for your reference.

Albert
======================
Jun  8 14:14:14 p4ht-s kernel: NODATA: ata2: host_stat 0x4
Jun  8 14:14:14 p4ht-s kernel: NODATA: alt status 0x50
Jun  8 14:14:15 p4ht-s kernel: NODATA: ata2: host_stat 0x4
Jun  8 14:14:15 p4ht-s kernel: NODATA: alt status 0x50
Jun  8 14:14:15 p4ht-s kernel: NODATA: ata2: host_stat 0x0
Jun  8 14:14:15 p4ht-s kernel: NODATA: alt status 0xD0     (device busy)
Jun  8 14:14:15 p4ht-s kernel: NODATA: ata2: host_stat 0x0
Jun  8 14:14:15 p4ht-s kernel: NODATA: alt status 0xD0     (device busy)
Jun  8 14:14:15 p4ht-s kernel: NODATA: ata2: host_stat 0x0
Jun  8 14:14:15 p4ht-s kernel: NODATA: alt status 0xD0     (device busy)
Jun  8 14:14:15 p4ht-s kernel: NODATA: ata2: host_stat 0x4
Jun  8 14:14:15 p4ht-s kernel: NODATA: alt status 0x50
Jun  8 14:14:16 p4ht-s kernel: NODATA: ata2: host_stat 0x4
Jun  8 14:14:16 p4ht-s kernel: NODATA: alt status 0x50





^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] libata: Handle ATAPI interrupt before CDB is sent
  2005-06-08 10:01         ` Bartlomiej Zolnierkiewicz
  2005-06-08 12:17           ` Albert Lee
@ 2005-06-09  7:13           ` Albert Lee
  1 sibling, 0 replies; 12+ messages in thread
From: Albert Lee @ 2005-06-09  7:13 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz; +Cc: Jeff Garzik, Doug Maxey, Linux IDE

[-- Attachment #1: Type: text/plain, Size: 1525 bytes --]



Hi Bart,


> 
>>@@ -3457,6 +3461,32 @@
>> {
>>        u8 status, host_stat;
>>
>>+       /* ATAPI: Handle the interrupt before CDB is sent */
>>+       if (is_atapi_taskfile(&qc->tf) &&
>>+           ap->pio_task_state != PIO_ST_CDB_SENT) {
>>+               /* check whether it's our irq */
>>+               host_stat = ap->ops->bmdma_status(ap);
>>+
>>+               if (host_stat & ATA_DMA_INTR) {
>>+                       /* Some pre-ATAPI-4 devices assert INTRQ here
>>+                        * if nIEN is zero. qc->dev->id[0] bits 5-6 can
>>+                        * be used to identify such devices.
>>+                        */
>>+                       DPRINTK("ata%u: atapi interrupt handled\n", ap->id);
> 
> 
> This looks wrong for ATA_PROT_ATAPI_[NODATA],
> we can't expect host to support DMA.
> 
> 

Revised in this patch by checking qc->dev->id[0] and alt_status.


> 
> 
> 	case ATA_PROT_DMA:
> 	case ATA_PROT_ATAPI_DMA:
> 	case ATA_PROT_ATAPI:
> 		/* check status of DMA engine */
> 		host_stat = ap->ops->bmdma_status(ap);
> 		VPRINTK("ata%u: host_stat 0x%X\n", ap->id, host_stat);
> 
> 		/* if it's not our irq... */
> 		if (!(host_stat & ATA_DMA_INTR))
> 			goto idle_irq;
> 
> 		/* before we do anything else, clear DMA-Start bit */
> 		ap->ops->bmdma_stop(ap);
> 
> 		/* fall through */
> 
> ditto (vanilla libata-core.c)

'case ATA_PROT_ATAPI' moved to 'default'.

Attached please find the revised patch for your review.
Hopefully it looks better this time.
Thanks for the comment. :)

Albert

[-- Attachment #2: atapi_state_fix.diff --]
[-- Type: text/plain, Size: 3441 bytes --]

--- 10_atapi_pio_fix/include/linux/libata.h	2005-06-06 13:38:51.000000000 +0800
+++ 20_atapi_intr_race_fix/include/linux/libata.h	2005-06-07 11:08:26.000000000 +0800
@@ -161,6 +161,8 @@
 	PIO_ST_LAST,
 	PIO_ST_LAST_POLL,
 	PIO_ST_ERR,
+	PIO_ST_PKT,
+	PIO_ST_CDB_SENT,
 };
 
 /* forward declarations */
--- 10_atapi_pio_fix/drivers/scsi/libata-core.c	2005-06-06 13:37:50.000000000 +0800
+++ 20_atapi_intr_race_fix/drivers/scsi/libata-core.c	2005-06-09 14:40:51.000000000 +0800
@@ -2872,6 +2872,7 @@
 	default:
 		ata_altstatus(ap);
 		drv_stat = ata_chk_status(ap);
+		ap->pio_task_state = PIO_ST_IDLE;
 
 		/* ack bmdma irq events */
 		ap->ops->irq_clear(ap);
@@ -3184,17 +3185,20 @@
 	case ATA_PROT_ATAPI:
 		ata_qc_set_polling(qc);
 		ata_tf_to_host_nolock(ap, &qc->tf);
+		ap->pio_task_state = PIO_ST_PKT;
 		queue_work(ata_wq, &ap->packet_task);
 		break;
 
 	case ATA_PROT_ATAPI_NODATA:
 		ata_tf_to_host_nolock(ap, &qc->tf);
+		ap->pio_task_state = PIO_ST_PKT;
 		queue_work(ata_wq, &ap->packet_task);
 		break;
 
 	case ATA_PROT_ATAPI_DMA:
 		ap->ops->tf_load(ap, &qc->tf);	 /* load tf registers */
 		ap->ops->bmdma_setup(qc);	    /* set up bmdma */
+		ap->pio_task_state = PIO_ST_PKT;
 		queue_work(ata_wq, &ap->packet_task);
 		break;
 
@@ -3457,11 +3461,39 @@
 {
 	u8 status, host_stat;
 
+	/* ATAPI: Handle the interrupt before CDB is sent */
+	if (unlikely(is_atapi_taskfile(&qc->tf) &&
+	    ap->pio_task_state != PIO_ST_CDB_SENT)) {
+		/* check whether it's our irq */
+		if ((qc->dev->id[0] & 0x60) == 0x20) {
+			/* Some pre-ATAPI-4 devices assert INTRQ here 
+			 * if nIEN is zero. qc->dev->id[0] bits 5-6 can 
+			 * be used to identify such devices.
+			 */
+			status = ata_altstatus(ap);
+			if (status & ATA_BUSY)
+				goto idle_irq;
+
+			/* clear INTRQ */
+			status = ata_chk_status(ap);
+			if (unlikely(status & ATA_BUSY))
+			goto idle_irq;
+
+			DPRINTK("ata%u: atapi interrupt handled\n", ap->id);
+
+			/* clear bmdma interrupt bit */
+			ap->ops->irq_clear(ap);
+			return 1;	/* irq handled */
+		} else {
+			DPRINTK("ata%u: not my atapi interrupt\n", ap->id);
+			goto idle_irq;
+		}
+	}
+
 	switch (qc->tf.protocol) {
 
 	case ATA_PROT_DMA:
 	case ATA_PROT_ATAPI_DMA:
-	case ATA_PROT_ATAPI:
 		/* check status of DMA engine */
 		host_stat = ap->ops->bmdma_status(ap);
 		VPRINTK("ata%u: host_stat 0x%X\n", ap->id, host_stat);
@@ -3489,6 +3521,8 @@
 		DPRINTK("ata%u: protocol %d (dev_stat 0x%X)\n",
 			ap->id, qc->tf.protocol, status);
 
+		ap->pio_task_state = PIO_ST_IDLE;
+
 		/* ack bmdma irq events */
 		ap->ops->irq_clear(ap);
 
@@ -3496,6 +3530,7 @@
 		ata_qc_complete(qc, status);
 		break;
 
+	case ATA_PROT_ATAPI:
 	default:
 		goto idle_irq;
 	}
@@ -3580,6 +3615,7 @@
 	struct ata_port *ap = _data;
 	struct ata_queued_cmd *qc;
 	u8 status;
+	unsigned long flags;
 
 	qc = ata_qc_from_tag(ap, ap->active_tag);
 	assert(qc != NULL);
@@ -3595,10 +3631,13 @@
 	if ((status & (ATA_BUSY | ATA_DRQ)) != ATA_DRQ)
 		goto err_out;
 
+	spin_lock_irqsave(&ap->host_set->lock, flags);
+
 	/* send SCSI cdb */
 	DPRINTK("send cdb\n");
 	assert(ap->cdb_len >= 12);
 	ata_data_xfer(ap, qc->cdb, ap->cdb_len, 1);
+	ap->pio_task_state = PIO_ST_CDB_SENT;
 
 	/* if we are DMA'ing, irq handler takes over from here */
 	if (qc->tf.protocol == ATA_PROT_ATAPI_DMA)
@@ -3615,6 +3654,8 @@
 		queue_work(ata_wq, &ap->pio_task);
 	}
 
+	spin_unlock_irqrestore(&ap->host_set->lock, flags);
+
 	return;
 
 err_out:

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] libata-2.6: Ignore interrupt before the ATAPI CDB is sent to the device
  2005-05-16 10:37     ` Albert Lee
  2005-06-08  8:02       ` [PATCH 1/1] libata: Handle ATAPI interrupt before CDB is sent Albert Lee
@ 2005-06-28  3:06       ` Jeff Garzik
  2005-06-28 10:05         ` Albert Lee
  1 sibling, 1 reply; 12+ messages in thread
From: Jeff Garzik @ 2005-06-28  3:06 UTC (permalink / raw)
  To: Albert Lee; +Cc: Bartlomiej Zolnierkiewicz, Doug Maxey, Jens Axboe, Linux IDE

Albert Lee wrote:
> 
> Jeff:
> 
>>
>> There is an additional complication:
>>
>> As specified in IDENTIFY PACKET DEVICE Word 0, in some older ATAPI 
>> devices, the device will assert an interrupt after the CDB has been 
>> submitted to the device.
>>
>> Your patch is moving in the right direction...   but I think we should 
>> check Word 0 and perform actions accordingly, as some ATAPI devices 
>> appear to expect it.
> 
> 
> I didn't consider this case.
> Just take a look at the ATA-4 draft spec:
> 
> http://www.t13.org/project/d1153r18-ATA-ATAPI-4.pdf
> (Fig. 16, page. 236)
> "Some devices prior to ATA/ATAPI-4 assert INTRQ if enabled at this 
> point. See
> IDENTIFY PACKET DEVICE, word 0, bits 5-6 to determine if an interrupt 
> will occur."
> 
> The interrupt seems to be after PACKET but before CDB is sent.
> It looks like a good news, since the previous patch can almost handle 
> such interrupt.
> We can add IDENTIFY PACKET DEVICE Word 0 checking, if bit 6-5 is 01, 
> then we return irq_handled
> instead of ignoring the interrupt. Attached please find the revised 
> patch for your review.
> 
> 
>>
>> I wonder if your ATAPI device is one of these?
>>
> 
> My ATAPI devices at hand do not generate the interrupt before CDB is sent.
> (I wish I have one to test this patch.)
> The unexpected interrupt is caused by a pdc2027x hardware problem:
> When both primary/secondary channels of the pdc20275 are stressed,
> sometimes the interrupt in the 2nd channel will cause the ATA_DMA_INTR 
> bit of the 1st
> channel to be set. If only 1 channel is used, the unexpected interrupt 
> is never seen.
> 
> 
> Albert
> 
> Signed-off-by: Albert Lee <albertcc@tw.ibm.com>

You don't need to ignore the interrupt.  If you follow the host state 
machine, you need a cycle through INTRQ->check-status states, before 
proceeding with the rest of the transaction.

	Jeff




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] libata-2.6: Ignore interrupt before the ATAPI CDB is sent to the device
  2005-06-28  3:06       ` [PATCH 2/2] libata-2.6: Ignore interrupt before the ATAPI CDB is sent to the device Jeff Garzik
@ 2005-06-28 10:05         ` Albert Lee
  0 siblings, 0 replies; 12+ messages in thread
From: Albert Lee @ 2005-06-28 10:05 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Bartlomiej Zolnierkiewicz, Doug Maxey, Jens Axboe, Linux IDE


> 
> 
> You don't need to ignore the interrupt.  If you follow the host state 
> machine, you need a cycle through INTRQ->check-status states, before 
> proceeding with the rest of the transaction.
> 
>     Jeff
> 

Hi Jeff,

The revised patch is in another thread:
http://marc.theaimsgroup.com/?l=linux-ide&m=111830130223323&w=2

Sorry I messed things up.
I will send you a better organized summary of the pacthes.


Albert



^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2005-06-28 10:06 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-04-29  9:27 [PATCH 0/2] libata-2.6: Fix races caused by the interrupt handler Albert Lee
2005-04-29  9:34 ` [PATCH 1/2] libata-2.6: Prevent the interrupt handler from completing a command twice Albert Lee
2005-05-15 22:47   ` Jeff Garzik
2005-04-29  9:39 ` [PATCH 2/2] libata-2.6: Ignore interrupt before the ATAPI CDB is sent to the device Albert Lee
2005-05-15 22:49   ` Jeff Garzik
2005-05-16 10:37     ` Albert Lee
2005-06-08  8:02       ` [PATCH 1/1] libata: Handle ATAPI interrupt before CDB is sent Albert Lee
2005-06-08 10:01         ` Bartlomiej Zolnierkiewicz
2005-06-08 12:17           ` Albert Lee
2005-06-09  7:13           ` Albert Lee
2005-06-28  3:06       ` [PATCH 2/2] libata-2.6: Ignore interrupt before the ATAPI CDB is sent to the device Jeff Garzik
2005-06-28 10:05         ` Albert Lee

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).