public inbox for linux-kernel@vger.kernel.org
* [patch]: ide dma timeout retry in pio
@ 2001-05-28 18:34 Jens Axboe
  2001-05-28 19:39 ` Mark Hahn
  0 siblings, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2001-05-28 18:34 UTC (permalink / raw)
  To: Linux Kernel; +Cc: Andre M. Hedrick, Alan Cox

[-- Attachment #1: Type: text/plain, Size: 618 bytes --]

Hi,

Currently, when we hit a dma timeout, ide dma can end up tossing out the
complete request. What we really want to do in this case is retry the
request in pio mode, and revert to normal dma operation again later.

This patch catches the dma timeout. It clears the dma engine, turns dma
off, sanity checks the request, and makes sure that the ide request
handler restarts the request (now in pio mode). When the first chunk of
the request is finished, we return to dma mode. If the dma timeouts keep
happening (more than 3 of them), we stay in pio mode.
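
As a rough illustration of the retry logic described above, here is a
minimal user-space sketch. It is not kernel code: struct sim_drive,
sim_dma_timeout() and sim_chunk_done() are hypothetical stand-ins for
the drive fields and the two code paths the patch touches, and
MAX_DMA_RETRIES is just a name for the magic 3 in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the fields the patch adds; not kernel code. */
#define DMA_PIO_RETRY   1   /* same value as the patch's state flag */
#define MAX_DMA_RETRIES 3   /* the "random magic" threshold from the patch */

struct sim_drive {
	bool using_dma;          /* models drive->using_dma */
	unsigned char retry_pio; /* number of DMA timeouts seen so far */
	unsigned char state;     /* 0 or DMA_PIO_RETRY */
};

/* Models the timeout path: count the timeout and drop to PIO. */
static void sim_dma_timeout(struct sim_drive *d)
{
	assert(d != NULL);
	d->retry_pio++;
	d->state = DMA_PIO_RETRY;
	d->using_dma = false;    /* stands in for dmaproc(ide_dma_off_quietly) */
}

/*
 * Models the request-completion path: once the first chunk has gone
 * through in PIO, switch DMA back on unless we timed out too often.
 */
static void sim_chunk_done(struct sim_drive *d)
{
	assert(d != NULL);
	if (d->state == DMA_PIO_RETRY && d->retry_pio <= MAX_DMA_RETRIES) {
		d->state = 0;
		d->using_dma = true; /* stands in for dmaproc(ide_dma_on) */
	}
}
```

With this model, the first three timeouts each fall back to pio for one
chunk and then return to dma; the fourth leaves the drive in pio.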

Patch is untested for obvious reasons, and is against 2.4.5-ac3.

-- 
Jens Axboe


[-- Attachment #2: ide-dma-timeout-1 --]
[-- Type: text/plain, Size: 3527 bytes --]

--- ../linux-2.4.5-ac3-clean/drivers/ide/ide.c	Mon May 28 20:28:05 2001
+++ drivers/ide/ide.c	Mon May 28 20:21:48 2001
@@ -543,10 +543,20 @@
 {
 	struct request *rq;
 	unsigned long flags;
+	ide_drive_t *drive = hwgroup->drive;
 
 	spin_lock_irqsave(&io_request_lock, flags);
 	rq = hwgroup->rq;
 
+	/*
+	 * decide whether to reenable DMA -- 3 is a random magic for now,
+	 * if we DMA timeout more than 3 times, just stay in PIO
+	 */
+	if (drive->state == DMA_PIO_RETRY && drive->retry_pio <= 3) {
+		drive->state = 0;
+		hwgroup->hwif->dmaproc(ide_dma_on, drive);
+	}
+
 	if (!end_that_request_first(rq, uptodate, hwgroup->drive->name)) {
 		add_blkdev_randomness(MAJOR(rq->rq_dev));
 		blkdev_dequeue_request(rq);
@@ -1419,6 +1429,49 @@
 }
 
 /*
+ * un-busy the hwgroup etc, and clear any pending DMA status. we want to
+ * retry the current request in pio mode instead of risking tossing it
+ * all away
+ */
+void ide_dma_timeout_retry(ide_drive_t *drive)
+{
+	ide_hwif_t *hwif = HWIF(drive);
+	struct request *rq;
+
+	/*
+	 * end current dma transaction
+	 */
+	(void) hwif->dmaproc(ide_dma_end, drive);
+
+	/*
+	 * complain a little, later we might remove some of this verbosity
+	 */
+	printk("%s: timeout waiting for DMA\n", drive->name);
+	(void) hwif->dmaproc(ide_dma_timeout, drive);
+
+	/*
+	 * disable dma for now, but remember that we did so because of
+	 * a timeout -- we'll reenable after we finish this next request
+	 * (or rather the first chunk of it) in pio.
+	 */
+	drive->retry_pio++;
+	drive->state = DMA_PIO_RETRY;
+	(void) hwif->dmaproc(ide_dma_off_quietly, drive);
+
+	/*
+	 * un-busy drive etc (hwgroup->busy is cleared on return) and
+	 * make sure request is sane
+	 */
+	rq = HWGROUP(drive)->rq;
+	HWGROUP(drive)->rq = NULL;
+
+	rq->errors = 0;
+	rq->sector = rq->bh->b_rsector;
+	rq->current_nr_sectors = rq->bh->b_size >> 9;
+	rq->buffer = rq->bh->b_data;
+}
+
+/*
  * ide_timer_expiry() is our timeout function for all drive operations.
  * But note that it can also be invoked as a result of a "sleep" operation
  * triggered by the mod_timer() call in ide_do_request.
@@ -1491,11 +1544,10 @@
 				startstop = handler(drive);
 			} else {
 				if (drive->waiting_for_dma) {
-					(void) hwgroup->hwif->dmaproc(ide_dma_end, drive);
-					printk("%s: timeout waiting for DMA\n", drive->name);
-					(void) hwgroup->hwif->dmaproc(ide_dma_timeout, drive);
-				}
-				startstop = ide_error(drive, "irq timeout", GET_STAT());
+					startstop = ide_stopped;
+					ide_dma_timeout_retry(drive);
+				} else
+					startstop = ide_error(drive, "irq timeout", GET_STAT());
 			}
 			set_recovery_timer(hwif);
 			drive->service_time = jiffies - drive->service_start;
--- ../linux-2.4.5-ac3-clean/include/linux/ide.h	Mon May 28 20:28:13 2001
+++ include/linux/ide.h	Mon May 28 20:21:18 2001
@@ -87,6 +87,11 @@
 #define ERROR_RECAL	1	/* Recalibrate every 2nd retry */
 
 /*
+ * state flags
+ */
+#define DMA_PIO_RETRY	1	/* retrying in PIO */
+
+/*
  * Ensure that various configuration flags have compatible settings
  */
 #ifdef REALLY_SLOW_IO
@@ -299,6 +304,8 @@
 	special_t	special;	/* special action flags */
 	byte     keep_settings;		/* restore settings after drive reset */
 	byte     using_dma;		/* disk is using dma for read/write */
+	byte	 retry_pio;		/* retrying dma capable host in pio */
+	byte	 state;			/* retry state */
 	byte     waiting_for_dma;	/* dma currently in progress */
 	byte     unmask;		/* flag: okay to unmask other irqs */
 	byte     slow;			/* flag: slow data port */

* RE: [patch]: ide dma timeout retry in pio
@ 2001-05-30 21:09 Diefenbaugh, Paul S
  0 siblings, 0 replies; 15+ messages in thread
From: Diefenbaugh, Paul S @ 2001-05-30 21:09 UTC (permalink / raw)
  To: 'Christopher B. Liebman', Jens Axboe, Mark Hahn
  Cc: Acpi@Phobos. Fachschaften. Tu-Muenchen. De, linux-kernel, alan,
	andre

Chris/All:

I think your assumptions are correct.  I'm guessing that IDE DMA activity is
not being properly handled when the CPU is in C3, resulting in memory (and
therefore file system) corruption.  We haven't seen corruption on our
development systems, but this is probably due to the fact that we don't
explicitly enable IDE DMA transfers (?).

I'm concerned that the CPU is being put into C3 during what appears to be
times of high bus mastering activity.  The default policy (prpolicy.c) is
configured to only use C3 when bus mastering (BM_STS) is silent for 4 or
more 'quantums'.  You can see if this is working by causing disk activity
while cat'ing the file '/proc/acpi/processor/0/status': the C3 counter
should not be incrementing (or not by much, anyway).
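
To make the intended policy concrete, here is a small stand-alone
sketch of the "quiet for 4 quantums" rule. This is only an assumption
about the shape of the logic, not the code in prpolicy.c; the names
sim_policy, sim_pick_idle_state() and BM_QUIET_QUANTA are made up for
the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BM_QUIET_QUANTA 4 /* threshold quoted above; name is hypothetical */

enum sim_cstate { SIM_C2, SIM_C3 };

struct sim_policy {
	unsigned quiet_quanta; /* consecutive quanta with BM_STS clear */
};

/*
 * Called once per quantum with the sampled BM_STS value.
 * Any bus master activity resets the count, so C3 is only chosen
 * after BM_QUIET_QUANTA quiet quanta in a row.
 */
static enum sim_cstate sim_pick_idle_state(struct sim_policy *p, bool bm_sts)
{
	assert(p != NULL);
	if (bm_sts)
		p->quiet_quanta = 0;
	else if (p->quiet_quanta < BM_QUIET_QUANTA)
		p->quiet_quanta++;
	return p->quiet_quanta >= BM_QUIET_QUANTA ? SIM_C3 : SIM_C2;
}
```

Under sustained disk activity BM_STS should be set in most quanta, so
a policy of this shape would rarely, if ever, reach C3.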

The C3 handler should block bus master activity while the CPU is in C3.  DMA
activity (writes) during C3 would result in cache-incoherency (since the CPU
is not snooping) and thus memory corruption.  The idea is to block bus
mastering activity while in C3 (ARB_DIS), but allow the CPU to wakeup
whenever bus mastering is requested (BM_RLD).  I'm betting that DMA is
happening during C3 resulting in fs corruption.
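
A toy model of the ARB_DIS/BM_RLD interplay described above may help
when reasoning about the failure modes. It is a sketch under my own
assumptions, not a description of any real chipset or of the ACPI
driver code: struct sim_cpu and the sim_*() functions are hypothetical,
and only the ARB_DIS and BM_RLD names come from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sim_cpu {
	bool in_c3;    /* CPU caches are not snooping while true */
	bool arb_dis;  /* models ARB_DIS: bus master arbitration disabled */
	bool bm_rld;   /* models BM_RLD: bus master request wakes the CPU */
	unsigned unsnooped_writes; /* DMA writes that landed while in C3 */
};

/* Intended C3 entry: allow wakeup on request, block the arbiter, sleep. */
static void sim_enter_c3(struct sim_cpu *c)
{
	assert(c != NULL);
	c->bm_rld = true;
	c->arb_dis = true;
	c->in_c3 = true;
}

/* Leaving C3 re-enables arbitration. */
static void sim_wake(struct sim_cpu *c)
{
	c->in_c3 = false;
	c->arb_dis = false;
}

/* A device asks to do one DMA write. Returns true if the write happened. */
static bool sim_dma_write(struct sim_cpu *c)
{
	assert(c != NULL);
	if (c->in_c3 && c->bm_rld)
		sim_wake(c);           /* the request breaks the CPU out of C3 */
	if (c->in_c3 && c->arb_dis)
		return false;          /* arbiter refuses: the device stalls */
	if (c->in_c3)
		c->unsnooped_writes++; /* the corruption case suspected above */
	return true;
}
```

In this model a correct C3 entry never produces an unsnooped write. A
stalled device (ARB_DIS without BM_RLD) would look like a timeout, and
a write with neither bit set is the corruption case.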

To verify if C3 is really the culprit we should try disabling its use on a
vulnerable system.  I'd recommend mapping the C3 handler to use C2 instead,
which could be done by modifying the switch statement in pr_power_idle()
within prpower.c (see below).  Note that we'll still be setting BM_RLD for
C3's during pr_power_activate_state(), but this shouldn't be an issue.

	case PR_C2:
	case PR_C3:
		/* Interrupts must be disabled during C2 transitions */
		disable();
		/* See how long we're asleep for */
		acpi_get_timer(&start_ticks);
		/* Invoke C2 */
		acpi_os_in8(processor->power.p_lvl2);
		/* Dummy op - must do something useless after P_LVL2 read */
		acpi_hw_register_bit_access(ACPI_READ, ACPI_MTX_DO_NOT_LOCK,
			BM_STS);
		/* Compute time elapsed */
		acpi_get_timer(&end_ticks);
		/* Re-enable interrupts */
		enable();
		break;
	
	<remove previous 'case PR_C3'>

Could somebody give this a try and let me know?

Thanks,

-- Paul Diefenbaugh
   Intel Corporation


-----Original Message-----
From: Christopher B. Liebman [mailto:liebman@sponsera.com]
Sent: Monday, May 28, 2001 1:13 PM
To: Jens Axboe; Mark Hahn
Cc: Acpi@Phobos. Fachschaften. Tu-Muenchen. De;
linux-kernel@vger.kernel.org; alan@lxorguk.ukuu.org.uk;
andre@linux-ide.org
Subject: RE: [patch]: ide dma timeout retry in pio


I think that this may be an issue with ACPI processor power saving.  I
have documented issues with ide DMA timeouts when the processor is put into
the C3 power state.  One of the things that happens in this state is that
bus master arbitration is *disabled*; bus master activity is
*supposed* to transition the system back to a C0 power state.  I'll bet
there are some issues with the Linux IDE dma and disabling bus master
arbitration.  ideas?  thoughts?  patches? ;-)

	-- Chris

> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org
> [mailto:linux-kernel-owner@vger.kernel.org]On Behalf Of Mark Hahn
>
> I seem to recall Andre saying that the problem arises when the
> ide DMA engine loses PCI arbitration during a burst.  shorter
> bursts would seem like the best workaround if this is the problem...
>


end of thread, other threads:[~2001-05-30 21:11 UTC | newest]

Thread overview: 15+ messages
2001-05-28 18:34 [patch]: ide dma timeout retry in pio Jens Axboe
2001-05-28 19:39 ` Mark Hahn
2001-05-28 20:13   ` Christopher B. Liebman
2001-05-28 20:37   ` Jens Axboe
2001-05-28 22:15     ` Andre Hedrick
2001-05-28 22:26       ` Jens Axboe
2001-05-29  0:09         ` Andre Hedrick
2001-05-29  0:30           ` Jens Axboe
2001-05-28 21:12   ` Alan Cox
2001-05-28 22:11     ` James Turinsky
2001-05-29  6:18     ` Larry McVoy
2001-05-28 22:20       ` Alan Cox
2001-05-28 22:56       ` Meelis Roos
2001-05-29  7:11         ` Larry McVoy
  -- strict thread matches above, loose matches on Subject: below --
2001-05-30 21:09 Diefenbaugh, Paul S
