public inbox for linux-kernel@vger.kernel.org
From: "Justin T. Gibbs" <gibbs@scsiguy.com>
To: James Bottomley <James.Bottomley@steeleye.com>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>,
	linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org
Subject: Re: aic7xxx sets CDR offline, how to reset?
Date: Wed, 04 Sep 2002 10:50:09 -0600
Message-ID: <12750000.1031158209@aslan.btc.adaptec.com>
In-Reply-To: <200209041613.g84GDtv02639@localhost.localdomain>

> dledford@redhat.com said:
>> Now, granted, that is more complex than going straight to a BDR, but I
>>  have to argue that it *isn't* that complex.  It certainly isn't the
>> nightmare you make it sound like ;-) 
> 
> It's three times longer even in pseudocode...

To make this work, you really need to use the QErr bit in the
control mode page and/or ECA or ACA.  I believe QErr is well
supported in devices, but ECA (pre SCSI-3) and ACA most likely
receive very little testing.

I will also voice my opinion (again) that watchdog timer recovery
is in the wrong place in Linux.  It belongs in the controller drivers:

1) Only the controller driver knows when to start the timeout
2) Only the controller driver knows the current status of the bus/transport
3) Only the controller driver can close timeout/just completed races
4) Only the controller driver knows the true transport type
   (SPI/FC/ATA/USB/1394/IP) and what recovery actions are appropriate
   for that transport type given the capabilities of the controller.
5) The algorithm for recovery and maintaining order becomes quite simple:
	1) Freeze the input queue for the controller
	2) Return all transactions unqueued to a device to the mid-layer
	3) Perform the recovery actions required
	4) Unfreeze the controller's queue
	5) Device type driver (sd, cd, tape, etc) decides what errors
	   at what rates should cause the failure of a device.  The
	   controller driver just needs to have the error codes so
	   it can honestly and fully report to the type driver what
	   really happens so it can make good decisions

   This of course assumes that all transactions have a serial number and
   that requeuing transactions orders them by serial number.  With QErr
   set, the device closes the rest of the barrier race for you, but even
   without barriers, full transaction ordering is required if you don't
   want a read to inadvertently pass a write to the same location during
   recovery.

   For prior art, take a look at FreeBSD.  In the worst case, where
   escalation to a bus reset is required, recovery takes 5 seconds.
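
   As a rough illustration of steps 1-4 and the serial-number
   requeue described above, here is a minimal user-space sketch.
   All of the names (struct xact, struct ctrl, ctrl_recover,
   queue_insert_sorted) are hypothetical and exist only for this
   example; this is not code from aic7xxx, the Linux mid-layer, or
   FreeBSD, and it assumes a singly linked pending list and a
   caller-supplied recovery callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transaction record; names are illustrative only. */
struct xact {
    uint64_t serial;       /* assigned once, in submission order */
    bool sent_to_device;   /* true once handed to the device */
    struct xact *next;
};

/* Hypothetical per-controller state. */
struct ctrl {
    bool frozen;           /* while set, no new input is accepted */
    struct xact *pending;  /* transactions accepted by the driver */
};

/*
 * Insert in ascending serial order, so requeued work keeps its
 * original position relative to everything else in the queue.
 */
static void queue_insert_sorted(struct xact **head, struct xact *x)
{
    while (*head && (*head)->serial < x->serial)
        head = &(*head)->next;
    x->next = *head;
    *head = x;
}

/*
 * Steps 1-4 of the list above.  Returns how many transactions were
 * handed back to the (assumed) mid-layer queue.
 */
static size_t ctrl_recover(struct ctrl *c, struct xact **midlayer,
                           void (*do_recovery)(struct ctrl *))
{
    size_t returned = 0;
    struct xact **pp;

    assert(c && midlayer);

    c->frozen = true;                      /* 1) freeze input queue */

    pp = &c->pending;
    while (*pp) {                          /* 2) return unqueued work */
        struct xact *x = *pp;

        if (!x->sent_to_device) {
            *pp = x->next;                 /* unlink from pending */
            queue_insert_sorted(midlayer, x);
            returned++;
        } else {
            pp = &x->next;                 /* still owned by device */
        }
    }

    if (do_recovery)
        do_recovery(c);                    /* 3) abort, BDR, reset... */

    c->frozen = false;                     /* 4) unfreeze the queue */
    return returned;
}
```

   Because the requeue path sorts by serial number, a read that was
   handed back cannot end up ahead of an earlier write to the same
   location, whichever order the two were returned in.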

--
Justin
