From: Doug Ledford <dledford@redhat.com>
To: James Bottomley <James.Bottomley@SteelEye.com>
Cc: "Justin T. Gibbs" <gibbs@scsiguy.com>,
linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org
Subject: Re: aic7xxx sets CDR offline, how to reset?
Date: Tue, 3 Sep 2002 18:42:16 -0400
Message-ID: <20020903184216.F12201@redhat.com>
In-Reply-To: <200209032148.g83LmeP09177@localhost.localdomain>; from James.Bottomley@SteelEye.com on Tue, Sep 03, 2002 at 04:48:39PM -0500
On Tue, Sep 03, 2002 at 04:48:39PM -0500, James Bottomley wrote:
> dledford@redhat.com said:
> You are correct. However, as soon as you abort the problem command (assuming
> the device recovers from this), it will go on its merry way processing the
> remaining commands in the queue. Assuming one of these is the barrier, you've
> no way now of re-queueing the aborted command so that it comes before the
> ordered tag barrier. You can try using a head of queue tag, but it's still a
> nasty race.
(Solution to this race was in my next paragraph as you found ;-)
> > On direct access devices you are only concerned about ordering around
> > the barrier, not ordering of the actual tagged commands, so for abort
> > you can actually call abort on all the commands past the REQ_BARRIER
> > command first, then the REQ_BARRIER command, then the hung command.
> > That would do the job and preserve REQ_BARRIER ordering while still
> > using aborts.
>
> I agree, but the most likely scenario is that now you're trying to abort
> almost every tag for that device in the system. Isn't reset a simpler
> alternative to this?
Not really. It hasn't been done yet, but one of my goals is to change the
scsi commands over to reasonable list usage (finally) so that we can avoid
all these horrible linear scans the mid layer does now looking for an
available command (it also means things like SCSI_OWNER_MID_LAYER can go
away because ownership is defined implicitly by list membership). So,
basically, you have a list item struct on each command. When you build
the commands, you add them to SDpnt->free_list. When you need a command,
instead of searching for a free one, you just grab the head of
SDpnt->free_list and use it. Once you've built the command and are ready
to hand it off to the lldd, you put the command on the tail of
SDpnt->active_list. When a command completes, you list_del() it from
SDpnt->active_list and put it on SDpnt->complete_list to be handled by the
tasklet. When the tasklet actually completes the command, it frees the
scsi command struct by simply putting it back on SDpnt->free_list. Now, if
you do things that way, your reset vs. abort code is actually pretty
trivial.
Case 1: you want to throw a BDR. Sample code might end up looking like
this,
    [ oops we timed out ]
    hostt->bus_device_reset(cmd);
    if (!list_empty(&cmd->device->active_list)) {
        [ our commands haven't all been returned, spew chunks! ]
    }
    [ do post reset processing ]
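A self-contained userspace illustration of the Case 1 check might look like the following. It is a sketch only: fake_bus_device_reset() is a hypothetical stand-in for hostt->bus_device_reset(), the "wedged" flag merely simulates a driver that fails to hand a command back, and the singly linked active list is a simplification chosen to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified command and device types. */
struct cmd { struct cmd *next; int wedged; };
struct dev { struct cmd *active; };   /* NULL means the active list is empty */

static void issue(struct dev *d, struct cmd *c)
{
    c->next = d->active;
    d->active = c;
}

/* Stand-in for the driver's reset hook: returns every active command
 * except those marked 'wedged', which simulate commands the driver
 * never gave back. */
static void fake_bus_device_reset(struct dev *d)
{
    struct cmd **pp = &d->active;
    while (*pp) {
        struct cmd *c = *pp;
        if (c->wedged) {
            pp = &c->next;          /* leave it on the list */
        } else {
            *pp = c->next;          /* unlink: command was returned */
            c->next = NULL;
        }
    }
}

enum recovery { RECOVERED, ESCALATE };

/* After the reset, anything still on the active list was not returned,
 * so the caller has to move on to more drastic action. */
static enum recovery recover_with_bdr(struct dev *d)
{
    fake_bus_device_reset(d);
    if (d->active != NULL)
        return ESCALATE;
    return RECOVERED;
}
```

The point of the sketch is that the post-reset sanity check reduces to a single emptiness test once active commands live on one list.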
Case 2: you want to do an abort, but you need to preserve ordering around
any possible REQ_BARRIERs on the bus. This requires that we keep a
REQ_BARRIER count for the device, since it is after all possible to have
multiple barriers active at once. As each command is put on the
active_list, if it is a barrier, we increment SDpnt->barrier_count, and as
we complete commands (at the interrupt context completion, not the final
completion), if it is a barrier command we decrement the count.
    [ oops we timed out ]
    while (SDpnt->barrier_count && cmd) {
        // when the aborted command is returned via the done()
        // it will remove it from the active_list, so don't remove
        // it here
        abort_cmd = list_get_tail(SDpnt->active_list);
        if (hostt->abort(abort_cmd) != SUCCESS) {
            [ oops, go on to more drastic action ]
        } else {
            if (abort_cmd->type == BARRIER)
                SDpnt->barrier_count--;
            if (abort_cmd == cmd)
                cmd = NULL;
        }
    }
    if (cmd) {
        if (hostt->abort(cmd) != SUCCESS)
            [ oops, go on to more drastic action ]
    }
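To make the abort ordering easier to follow, here is a small userspace simulation of the same idea. It is a sketch under several assumptions: the active list is modelled as an array in issue order, fake_abort() is a hypothetical stand-in for hostt->abort() that always succeeds and returns the command through a completion routine, and barrier_count is decremented only in that completion routine (on the assumption that an aborted command comes back through the normal done() path) rather than inline in the loop as in the pseudocode above.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_ACTIVE = 16 };

struct cmd { int tag; int is_barrier; };

struct dev {
    struct cmd *active[MAX_ACTIVE];   /* issue order, oldest first */
    int n_active;
    int barrier_count;                /* barriers currently on the active list */
};

static void issue(struct dev *d, struct cmd *c)
{
    assert(d->n_active < MAX_ACTIVE);
    d->active[d->n_active++] = c;
    if (c->is_barrier)
        d->barrier_count++;
}

/* Hypothetical completion path: drops the command from the active list
 * and adjusts the barrier count. */
static void complete(struct dev *d, struct cmd *c)
{
    for (int i = 0; i < d->n_active; i++) {
        if (d->active[i] != c)
            continue;
        for (int j = i; j + 1 < d->n_active; j++)
            d->active[j] = d->active[j + 1];
        d->n_active--;
        if (c->is_barrier)
            d->barrier_count--;
        return;
    }
}

/* Stand-in for the driver's abort hook: records the tag, completes the
 * command, and reports success (0). */
static int fake_abort(struct dev *d, struct cmd *c, int *log, int *n_log)
{
    log[(*n_log)++] = c->tag;
    complete(d, c);
    return 0;
}

/* Abort from the tail of the active list while any barrier is still
 * outstanding, then abort the hung command itself if it survived.
 * Returns the number of aborts issued, or -1 if escalation is needed. */
static int abort_preserving_barriers(struct dev *d, struct cmd *hung, int *log)
{
    int n_log = 0;

    while (d->barrier_count > 0 && hung != NULL && d->n_active > 0) {
        struct cmd *victim = d->active[d->n_active - 1];
        if (fake_abort(d, victim, log, &n_log) != 0)
            return -1;
        if (victim == hung)
            hung = NULL;
    }
    if (hung != NULL && fake_abort(d, hung, log, &n_log) != 0)
        return -1;
    return n_log;
}
```

With five commands issued as tags 0 to 4, tag 3 being the barrier and tag 1 the hung command, this sketch aborts 4, then 3, then 1, and leaves 0 and 2 untouched on the active list.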
Now, granted, that is more complex than going straight to a BDR, but I
have to argue that it *isn't* that complex. It certainly isn't the
nightmare you make it sound like ;-)
> > > At best, abort probably causes a command to overtake a barrier it shouldn't,
> > > at worst we abort the ordered tag that is the barrier and transactional
> > > integrity is lost.
> > >
> > > When error correction is needed, we have to return all the commands for that
> > > device to the block layer so that ordering and barrier issues can be taken
> > > care of in the reissue.
>
> > Not really, this would be easily enough done in the ML_QUEUE area of
> > the scsi layer, but it matters not to me. However, if you throw a
> > BDR, then you have cancelled all outstanding commands and (typically)
> > initiated a hard reset of the device which then requires a device
> > settle time. All of this is more drastic and typically takes longer
> > than the individual aborts which are completed in a single connect->
> > disconnect cycle without ever hitting a data phase and without
> > triggering a full device reset and requiring a settle time.
>
> I agree. I certainly could do it. I'm just a lazy so-and-so. However, think
> what it does. Apart from me having to do more work, the code becomes longer
> and the error recovery path more convoluted and difficult to follow. The
> benefit? well, error recovery might be faster in certain circumstances.
Well, as I've laid it out above, I don't really think it's all that much
to implement ;-) At least not in the mid layer. The low level device
drivers are doing *far* more work to support aborts than the mid layer has
to do.
> I
> just don't see that it's a cost effective change.
Matter of some question, I'm sure. I don't see it as all that much work,
so it seems reasonably cost effective to me ;-)
> If you're hitting error
> recovery so often that whether it recovers in half a second or several
> seconds makes a difference, I'd say there's something else wrong.
Hehehe, if you are hitting error recovery at all then something else is
wrong by definition, the only difference is in how you handle it :-P
--
Doug Ledford <dledford@redhat.com> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606