Re: [PATCH] scsi_error fix

public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed

From: Luben Tuikov <luben@splentec.com>
To: Mike Anderson <andmike@us.ibm.com>
Cc: Andries.Brouwer@cwi.nl, linux-scsi@vger.kernel.org
Subject: Re: [PATCH] scsi_error fix
Date: Fri, 07 Mar 2003 18:22:40 -0500	[thread overview]
Message-ID: <3E6929C0.6040703@splentec.com> (raw)
In-Reply-To: 20030307211732.GA1148@beaverton.ibm.com

Mike Anderson wrote:
> Andries.Brouwer@cwi.nl [Andries.Brouwer@cwi.nl] wrote:
> 
>>
>>    Your change effectively disables that support - we never hit the code in
>>    scsi_eh_get_sense() to request sense. It would be very nice if we could
>>    fix (or audit) all the scsi drivers, apply your change and remove
>>    scsi_eh_get_sense, but AFAIK that has not and is not happening.
>>
>>No. What happened before was that we got into an infinite loop.
>>The right action is to read the code, understand why it gets
>>into a loop, and fix it. Once that has happened we may decide
>>to undo my change. Or we may decide to ask for sense at that very spot.

``at that very spot'' or at a lower level (but not LLDD).

Yes, I agree with Andries here, and my reasoning follows:

> The loop problem is related to scsi error handling using the common code
> of scsi_queue_insert / scsi_requesT_fn . When a command gets started the
> scsi_init_cmd_errh function is called which sets retries to 0.

Ultimately the solution is fixing the LLDD.  Drawing from my own experience:
a quick and dirty way of simulating Autosense is to do it
*right in the scsi_done() fn* .

Add a flag sense_reqd:1 :: struct scsi_cmnd (stands for ``sense requested''),
which is normally 0, and at scsi_done() if we got CHECK CONDITION without
Sense Data, we (for older 2.4 kernels, NOT REAL C CODE):

    if we got CHECK COND and NO SENSE and SCpnt->sense_reqd == 0 then
	/* the original command is in the backup fields already */
	memcpy(SCpnt->cmnd, generic_sense, sizeof(generic_sense));
	memset(SCpnt->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);
	SCpnt->sc_data_direction = SCSI_DATA_READ;
	SCpnt->request_buffer = &SCpnt->sense_buffer;
	SCpnt->request_bufflen = SCSI_SENSE_BUFFERSIZE;
	SCpnt->use_sg = 0;
	SCpnt->cmd_len = COMMAND_SIZE(SCpnt->cmnd[0]);
	SCpnt->result = 0;
	SCpnt->sense_reqd = 1;

	enqueue the command to be sent back to the LLDD.
  	(i.e. from pending_q to incoming_q at front -- the only instance of
	backwardness which I spoke of in an older email...)
   else
	/* normal process */
	enqueue the command into the done_q;
	schedule scsi soft irq;
   fi

The only problem with this solution is that if the driver does it's own
queuing and NACA = 0 (normally) in the CONTROL BYTE of the CDB, then
the Sense Data gets wiped out* at the recept of the next command by the Device
Server (unless it's REQUEST SENSE).  Thus, the only true solution is to fix
it at LLDD level, but with the above solution we get as close to it,
but no closer. :-)

* CA (Contingent Allegiance), now obsolete as of SAM-3r05, since Autosense
is mandatory.

> There maybe another issue with the scsi_eh_get_sense function, but I am
> still looking at it.

I have a problem with fixing Autosense at such higher level in SCSI Core.
I.e. having its own function and going through the normal path of SCSI Commands.
It should be a hack, since this is what Non-Autosense is.

I.e. SCSI Core itself should *not* know about Autosense and Non-Autosense
complient drivers.  Thus, this is best fixed at scsi_done() fn.

I.e. the workings of SCSI Core should assume Autosense -- first because
it makes sense, and secondly because it's mandatory as of SAM-3r05.

And if the Sense Data got back nil, i.e. it got wiped out, then we just bubble up
the command (since sense_reqd would == 1).

At this (low) level there's no need to count retries.  At higher level in
SCSI Core, it's ok, but for Autosense we just have to be as quiet as possible.

Note: there are devices which will report CHECK CONDITION but will
never give a valid Sense Data -- those are the ones which could
really break a shaky recovery logic (normally infinite loop).

I've also gotten into an infinite loop in simulating Autosense in my own
mini-scsi-core, but this was resolved by following more or less the
aforementioned framework.

-- 
Luben

next prev parent reply	other threads:[~2003-03-07 23:22 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-03-07 20:19 [PATCH] scsi_error fix Andries.Brouwer
2003-03-07 21:17 ` Mike Anderson
2003-03-07 22:20   ` James Bottomley
2003-03-07 22:43     ` Mike Anderson
2003-03-07 22:56       ` James Bottomley
2003-03-09  1:41     ` Osamu Tomita
2003-03-07 23:22   ` Luben Tuikov [this message]
  -- strict thread matches above, loose matches on Subject: below --
2003-03-06 23:37 Andries.Brouwer
2003-03-07  1:09 ` Rob Radez
2003-03-07 19:41 ` Patrick Mansfield

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3E6929C0.6040703@splentec.com \
    --to=luben@splentec.com \
    --cc=Andries.Brouwer@cwi.nl \
    --cc=andmike@us.ibm.com \
    --cc=linux-scsi@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox