public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX
@ 2002-04-15 20:40 Justin T. Gibbs
  2002-04-15 20:51 ` James Bottomley
  2002-04-16  5:58 ` Jens Axboe
  0 siblings, 2 replies; 12+ messages in thread
From: Justin T. Gibbs @ 2002-04-15 20:40 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: linux-scsi, Alan Cox

Marcelo,

Looking at the latest version of your tree via bitkeeper, it
seems that the wrong patch has been applied to sd.c.  In its
current state, the disk driver will fail to act upon any sense
data returned by the low level controller.  The following patch
restores correct behavior.

--
Justin

===== sd.c 1.21 vs edited =====
--- 1.21/drivers/scsi/sd.c	Fri Apr  5 09:19:03 2002
+++ edited/sd.c	Mon Apr 15 14:32:51 2002
@@ -621,7 +621,7 @@
 
 	/* An error occurred */
 	if (driver_byte(result) != 0 && 	/* An error occured */
-	    SCpnt->sense_buffer[0] != 0xF0) {	/* Sense data is valid */
+	    SCpnt->sense_buffer[0] == 0xF0) {	/* Sense data is valid */
 		switch (SCpnt->sense_buffer[2]) {
 		case MEDIUM_ERROR:
 			error_sector = (SCpnt->sense_buffer[3] << 24) |

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX
  2002-04-15 20:40 [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX Justin T. Gibbs
@ 2002-04-15 20:51 ` James Bottomley
  2002-04-15 21:13   ` Justin T. Gibbs
  2002-04-16  5:58 ` Jens Axboe
  1 sibling, 1 reply; 12+ messages in thread
From: James Bottomley @ 2002-04-15 20:51 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: Marcelo Tosatti, linux-scsi, Alan Cox

gibbs@scsiguy.com said:
> Looking at the latest version of your tree via bitkeeper, it seems
> that the wrong patch has been applied to sd.c.  In its current state,
> the disk driver will fail to act upon any sense data returned by the
> low level controller.  The following patch restores correct behavior. 

Actually, this should be using the scsi_sense_valid() function call in 
scsi_error.c, shouldn't it?  Especially since the criterion for valid sense is 
merely that bit 7 be set with the error code (which can be either 0x70 or 
0x71) in bits 0-6.

James



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX
  2002-04-15 20:51 ` James Bottomley
@ 2002-04-15 21:13   ` Justin T. Gibbs
  2002-04-16 14:03     ` James Bottomley
  0 siblings, 1 reply; 12+ messages in thread
From: Justin T. Gibbs @ 2002-04-15 21:13 UTC (permalink / raw)
  To: James Bottomley; +Cc: Marcelo Tosatti, linux-scsi, Alan Cox

>gibbs@scsiguy.com said:
>> Looking at the latest version of your tree via bitkeeper, it seems
>> that the wrong patch has been applied to sd.c.  In its current state,
>> the disk driver will fail to act upon any sense data returned by the
>> low level controller.  The following patch restores correct behavior. 
>
>Actually, this should be using the scsi_sense_valid() function call in 
>scsi_error.c, shouldn't it?  Especially since the criterion for valid sense is
> 
>merely that bit 7 be set with the error code (which can be either 0x70 or 
>0x71) in bits 0-6.

Well, if we really wanted to fix it correctly:

1) All of the HBA drivers would set DRIVER_SENSE when sense data
   is available.  Right now many do set it, but because some drivers
   do not, the mid-layer doesn't trust it.

2) The mid-layer would look at DRIVER_SENSE before ever looking
   in the sense buffer.  Currently, and HBA driver has to zero out
   the first byte of the sense buffer to ensure the peripheral drivers
   don't believe that sense is available when it really isn't (stale data
   from previous sense retrieval).

3) The peripheral drivers would understand how to deal with deferred errors.

The check against 0xF0 is currently "correct" because the code in sd.c's
rw_intr doesn't know how to deal with anything other than a current
error (0x70 + valid bit).

--
Justin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX
  2002-04-15 20:40 [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX Justin T. Gibbs
  2002-04-15 20:51 ` James Bottomley
@ 2002-04-16  5:58 ` Jens Axboe
  2002-04-16 14:08   ` Justin T. Gibbs
  1 sibling, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2002-04-16  5:58 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: Marcelo Tosatti, linux-scsi, Alan Cox

On Mon, Apr 15 2002, Justin T. Gibbs wrote:
> Marcelo,
> 
> Looking at the latest version of your tree via bitkeeper, it
> seems that the wrong patch has been applied to sd.c.  In its
> current state, the disk driver will fail to act upon any sense
> data returned by the low level controller.  The following patch
> restores correct behavior.

Justin,

Noticed this myself the other day... I don't think they were applied
incorrectly, your 2.4.18 patch is buggy while your 2.4.17 version is
quite ok.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX
  2002-04-15 21:13   ` Justin T. Gibbs
@ 2002-04-16 14:03     ` James Bottomley
  2002-04-16 14:43       ` Justin T. Gibbs
  0 siblings, 1 reply; 12+ messages in thread
From: James Bottomley @ 2002-04-16 14:03 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: James Bottomley, Marcelo Tosatti, linux-scsi, Alan Cox

gibbs@scsiguy.com said:
> Well, if we really wanted to fix it correctly:
[...]
> The check against 0xF0 is currently "correct" because the code in
> sd.c's rw_intr doesn't know how to deal with anything other than a
> current error (0x70 + valid bit). 

I'm not advocating that you fix it correctly.  However, if you use the 
provided scsi_sense_valid() function it will show up in an easy to recognise 
form for someone who later does all of this fixing.  We can't go on adding 
these band aid fixes to the scsi subsystem with the justification that it's 
currently broken in a lot of cases so we don't need to consider them.  All 
that does is cause the subsystem to diverge further and make the eventual fix 
even more difficult and daunting.

What is wrong with:

	    if (scsi_sense_valid(SCpnt)) {	/* Sense data is valid */
		/* FIXME: this probably won't work for deferred errors

That way you've used the supplied function and noted the potential problem 
case.

James




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX
  2002-04-16  5:58 ` Jens Axboe
@ 2002-04-16 14:08   ` Justin T. Gibbs
  0 siblings, 0 replies; 12+ messages in thread
From: Justin T. Gibbs @ 2002-04-16 14:08 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Marcelo Tosatti, linux-scsi, Alan Cox

>On Mon, Apr 15 2002, Justin T. Gibbs wrote:
>> Marcelo,
>> 
>> Looking at the latest version of your tree via bitkeeper, it
>> seems that the wrong patch has been applied to sd.c.  In its
>> current state, the disk driver will fail to act upon any sense
>> data returned by the low level controller.  The following patch
>> restores correct behavior.
>
>Justin,
>
>Noticed this myself the other day... I don't think they were applied
>incorrectly, your 2.4.18 patch is buggy while your 2.4.17 version is
>quite ok.

There were two patches posted to the list about this issue.  The first
had this bug.  The second did not.  It does look like the merge of this
correction didn't make it into my 2.4.18 tree at the time I generated
the 2.4.18 patch set for the 6.2.5 release.  I have corrected that
patch set.

--
Justin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX
  2002-04-16 14:03     ` James Bottomley
@ 2002-04-16 14:43       ` Justin T. Gibbs
  2002-04-16 14:56         ` James Bottomley
  0 siblings, 1 reply; 12+ messages in thread
From: Justin T. Gibbs @ 2002-04-16 14:43 UTC (permalink / raw)
  To: James Bottomley; +Cc: Marcelo Tosatti, linux-scsi, Alan Cox

>We can't go on adding these band aid fixes to the scsi subsystem with
>the justification that it's currently broken in a lot of cases so we
>don't need to consider them.  All that does is cause the subsystem to
>diverge further and make the eventual fix even more difficult and daunting.

In my opinion, the only way to really fix the Linux SCSI layer is
to rip it out and start over using prior art from systems with functional
I/O subsystems as a guide to building something enterprise worthy.  As
soon as the community becomes serious about such an endevor, so will I. 8-)

>What is wrong with:
>
>	    if (scsi_sense_valid(SCpnt)) {	/* Sense data is valid */
>		/* FIXME: this probably won't work for deferred errors
>
>That way you've used the supplied function and noted the potential problem 
>case.

If you insist on doing this, lets at least preserve the current behavior:

	/*
	 * FIXME: this driver treats deferred errors are fatal for
	 * 	  the command that reports the error.  The error is
	 *	  actually for a previous command, so the current
	 *	  command should be retried.  In general, deferred
	 *	  errors are hard to cope with as the data written
	 *	  may not be available to be written again.  This
	 *	  is one of the many reasons that write-back caching
	 *	  should not be enabled for general purpose applications.
	 *	  The user should at lease enable "auto write reallocation"
	 *	  and verify that the spare table for the device is not
	 *	  full.
	 */
	if (scsi_sense_valid(SCpnt)
	 && scsi_sense_key(SCpnt) == SKEY_CURRENT_ERROR) {

	}

I don't believe there are definitions to support the second part of
the if clause, but there could be.

--
Justin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX
  2002-04-16 14:43       ` Justin T. Gibbs
@ 2002-04-16 14:56         ` James Bottomley
  2002-04-16 15:25           ` Tony Battersby
  2002-04-16 16:18           ` Oliver Xymoron
  0 siblings, 2 replies; 12+ messages in thread
From: James Bottomley @ 2002-04-16 14:56 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: James Bottomley, Marcelo Tosatti, linux-scsi, Alan Cox

gibbs@scsiguy.com said:
> In my opinion, the only way to really fix the Linux SCSI layer is to
> rip it out and start over using prior art from systems with functional
> I/O subsystems as a guide to building something enterprise worthy.  As
> soon as the community becomes serious about such an endevor, so will
> I. 8-) 

Not perhaps for ripping it up and starting again.  However, there are more 
ways than one to skin a cat.  If you can outline a set of principles, it may 
be possible to move current development towards them.  How about just the five 
worst things about the current subsystem and how we could address them?

> If you insist on doing this, lets at least preserve the current
> behavior:

> 	/*
> 	 * FIXME: this driver treats deferred errors are fatal for
[...]
> 	if (scsi_sense_valid(SCpnt)
> 	 && scsi_sense_key(SCpnt) == SKEY_CURRENT_ERROR) {

Excellent!  It's a lot more readable for people not versed in sense buffer 
layout, thanks!

James


> I don't believe there are definitions to support the second part of
> the if clause, but there could be. 




^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX
  2002-04-16 14:56         ` James Bottomley
@ 2002-04-16 15:25           ` Tony Battersby
  2002-04-16 17:36             ` Justin T. Gibbs
  2002-04-16 16:18           ` Oliver Xymoron
  1 sibling, 1 reply; 12+ messages in thread
From: Tony Battersby @ 2002-04-16 15:25 UTC (permalink / raw)
  To: 'James Bottomley', 'Justin T. Gibbs'; +Cc: linux-scsi

> > If you insist on doing this, lets at least preserve the current
> > behavior:
>
> > 	/*
> > 	 * FIXME: this driver treats deferred errors are fatal for
> [...]
> > 	if (scsi_sense_valid(SCpnt)
> > 	 && scsi_sense_key(SCpnt) == SKEY_CURRENT_ERROR) {
>
> Excellent!  It's a lot more readable for people not versed in
> sense buffer
> layout, thanks!

Why are you calling it the sense key?  Current vs. deferred error is
indicated by the error code (sense_data[0] & 0x7f), not the sense key
(sense_data[2] & 0x0f).  Calling it the sense key may lead to confusion for
those who are versed in sense buffer layout.

Tony


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX
  2002-04-16 14:56         ` James Bottomley
  2002-04-16 15:25           ` Tony Battersby
@ 2002-04-16 16:18           ` Oliver Xymoron
  2002-04-16 17:38             ` Justin T. Gibbs
  1 sibling, 1 reply; 12+ messages in thread
From: Oliver Xymoron @ 2002-04-16 16:18 UTC (permalink / raw)
  To: James Bottomley; +Cc: Justin T. Gibbs, Marcelo Tosatti, linux-scsi, Alan Cox

On Tue, 16 Apr 2002, James Bottomley wrote:

> gibbs@scsiguy.com said:
> > In my opinion, the only way to really fix the Linux SCSI layer is to
> > rip it out and start over using prior art from systems with functional
> > I/O subsystems as a guide to building something enterprise worthy.  As
> > soon as the community becomes serious about such an endevor, so will
> > I. 8-)
>
> Not perhaps for ripping it up and starting again.  However, there are
> more ways than one to skin a cat.  If you can outline a set of
> principles, it may be possible to move current development towards them.
> How about just the five worst things about the current subsystem and how
> we could address them?

>From my recent inspection:

1) sense handling
2) timeouts and error recovery
3) initialization and lun probing
4) scalability
5) readability

In short, all the corner cases are handled with layer on layer of bandaids
and while it works fine for most people with desktop systems, it's really
showing its age when you start messing with storage networking and
hotplugging.

As it stands, SCSI is one of the few subsystems that hasn't gotten a major
overhaul in recent years. As 2.5 is shaping up to be a revamp of the whole
VM/paging/FS/IO/IDE chain and the BIO stuff seems to have settled down,
now is a really good time to start thinking about plugging a cleaner SCSI
layer in.

-- 
 "Love the dolphins," she advised him. "Write by W.A.S.T.E.."


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX
  2002-04-16 15:25           ` Tony Battersby
@ 2002-04-16 17:36             ` Justin T. Gibbs
  0 siblings, 0 replies; 12+ messages in thread
From: Justin T. Gibbs @ 2002-04-16 17:36 UTC (permalink / raw)
  To: tonyb; +Cc: 'James Bottomley', linux-scsi

>> > If you insist on doing this, lets at least preserve the current
>> > behavior:
>>
>> > 	/*
>> > 	 * FIXME: this driver treats deferred errors are fatal for
>> [...]
>> > 	if (scsi_sense_valid(SCpnt)
>> > 	 && scsi_sense_key(SCpnt) == SKEY_CURRENT_ERROR) {
>>
>> Excellent!  It's a lot more readable for people not versed in
>> sense buffer
>> layout, thanks!
>
>Why are you calling it the sense key?  Current vs. deferred error is
>indicated by the error code (sense_data[0] & 0x7f), not the sense key
>(sense_data[2] & 0x0f).  Calling it the sense key may lead to confusion for
>those who are versed in sense buffer layout.

Yes, my fault for typing something quickly.  If I were to propose anything
at all, it would be to follow the definitions I did for FreeBSD:

struct scsi_sense_data
{
	uint8_t error_code;
#define	SSD_ERRCODE			0x7F
#define		SSD_CURRENT_ERROR	0x70
#define		SSD_DEFERRED_ERROR	0x71
#define	SSD_ERRCODE_VALID	0x80	
	uint8_t segment;
	uint8_t flags;
#define	SSD_KEY				0x0F
#define		SSD_KEY_NO_SENSE	0x00
#define		SSD_KEY_RECOVERED_ERROR	0x01
#define		SSD_KEY_NOT_READY	0x02
#define		SSD_KEY_MEDIUM_ERROR	0x03
#define		SSD_KEY_HARDWARE_ERROR	0x04
#define		SSD_KEY_ILLEGAL_REQUEST	0x05
#define		SSD_KEY_UNIT_ATTENTION	0x06
#define		SSD_KEY_DATA_PROTECT	0x07
#define		SSD_KEY_BLANK_CHECK	0x08
#define		SSD_KEY_Vendor_Specific	0x09
#define		SSD_KEY_COPY_ABORTED	0x0a
#define		SSD_KEY_ABORTED_COMMAND	0x0b		
#define		SSD_KEY_EQUAL		0x0c
#define		SSD_KEY_VOLUME_OVERFLOW	0x0d
#define		SSD_KEY_MISCOMPARE	0x0e
#define		SSD_KEY_RESERVED	0x0f			
#define	SSD_ILI		0x20
#define	SSD_EOM		0x40
#define	SSD_FILEMARK	0x80
	uint8_t info[4];
	uint8_t extra_len;
	uint8_t cmd_spec_info[4];
	uint8_t add_sense_code;
	uint8_t add_sense_code_qual;
	uint8_t fru;
	uint8_t sense_key_spec[3];
#define	SSD_SCS_VALID		0x80
#define SSD_FIELDPTR_CMD	0x40
#define SSD_BITPTR_VALID	0x08
#define SSD_BITPTR_VALUE	0x07
#define SSD_MIN_SIZE 18
	uint8_t extra_bytes[14];
#define SSD_FULL_SIZE sizeof(struct scsi_sense_data)
};


static __inline void scsi_extract_sense(struct scsi_sense_data *sense,
					int *error_code, int *sense_key,
					int *asc, int *ascq);
static __inline void scsi_ulto2b(u_int32_t val, u_int8_t *bytes);
static __inline void scsi_ulto3b(u_int32_t val, u_int8_t *bytes);
static __inline void scsi_ulto4b(u_int32_t val, u_int8_t *bytes);
static __inline u_int32_t scsi_2btoul(u_int8_t *bytes);
static __inline u_int32_t scsi_3btoul(u_int8_t *bytes);
static __inline int32_t scsi_3btol(u_int8_t *bytes);
static __inline u_int32_t scsi_4btoul(u_int8_t *bytes);

static __inline void
scsi_ulto2b(u_int32_t val, u_int8_t *bytes)
{

	bytes[0] = (val >> 8) & 0xff;
	bytes[1] = val & 0xff;
}

static __inline void
scsi_ulto3b(u_int32_t val, u_int8_t *bytes)
{

	bytes[0] = (val >> 16) & 0xff;
	bytes[1] = (val >> 8) & 0xff;
	bytes[2] = val & 0xff;
}

static __inline void
scsi_ulto4b(u_int32_t val, u_int8_t *bytes)
{

	bytes[0] = (val >> 24) & 0xff;
	bytes[1] = (val >> 16) & 0xff;
	bytes[2] = (val >> 8) & 0xff;
	bytes[3] = val & 0xff;
}

static __inline u_int32_t
scsi_2btoul(u_int8_t *bytes)
{
	u_int32_t rv;

	rv = (bytes[0] << 8) |
	     bytes[1];
	return (rv);
}

static __inline u_int32_t
scsi_3btoul(u_int8_t *bytes)
{
	u_int32_t rv;

	rv = (bytes[0] << 16) |
	     (bytes[1] << 8) |
	     bytes[2];
	return (rv);
}

static __inline int32_t 
scsi_3btol(u_int8_t *bytes)
{
	u_int32_t rc = scsi_3btoul(bytes);
 
	if (rc & 0x00800000)
		rc |= 0xff000000;

	return (int32_t) rc;
}

static __inline u_int32_t
scsi_4btoul(u_int8_t *bytes)
{
	u_int32_t rv;

	rv = (bytes[0] << 24) |
	     (bytes[1] << 16) |
	     (bytes[2] << 8) |
	     bytes[3];
	return (rv);
}

static __inline void scsi_extract_sense(struct scsi_sense_data *sense,
					int *error_code, int *sense_key,
					int *asc, int *ascq)
{
	*error_code = sense->error_code & SSD_ERRCODE;
	*sense_key = sense->flags & SSD_KEY;
	*asc = (sense->extra_len >= 5) ? sense->add_sense_code : 0;
	*ascq = (sense->extra_len >= 6) ? sense->add_sense_code_qual : 0;
}

And corrected code in sd.c might look like this:

	if (driver_byte(result) != 0 &&
            scsi_sense_valid(SCpnt) &&
	    SCpnt->sense_buffer.error_code & SSD_ERRCODE_VALID) {
		int *error_code;
		int *sense_key;
		int *asc;
		int *ascq;
		
		scsi_extract_sense(&SCpnt->sense_buffer, &error_code,
				   &sense_key, &asc, &ascq);
		if (error_code == SSD_DEFERRED_ERROR)
			goto report_error_and_retry_command;

		if (error_code != SSD_CURRENT_ERROR)
			goto report_error_and_fail_command;

		switch (sense_key) {
		case MEDIUM_ERROR:
			error_sector = scsi_4btoul(SCpnt->sense_buffer.info);
			
etc.

This would make the code much more explicit.

--
Justin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX
  2002-04-16 16:18           ` Oliver Xymoron
@ 2002-04-16 17:38             ` Justin T. Gibbs
  0 siblings, 0 replies; 12+ messages in thread
From: Justin T. Gibbs @ 2002-04-16 17:38 UTC (permalink / raw)
  To: Oliver Xymoron; +Cc: James Bottomley, Marcelo Tosatti, linux-scsi, Alan Cox

>From my recent inspection:
>
>1) sense handling
>2) timeouts and error recovery
>3) initialization and lun probing
>4) scalability
>5) readability

That about covers it! 8-)

--
Justin

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2002-04-16 17:38 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-04-15 20:40 [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX Justin T. Gibbs
2002-04-15 20:51 ` James Bottomley
2002-04-15 21:13   ` Justin T. Gibbs
2002-04-16 14:03     ` James Bottomley
2002-04-16 14:43       ` Justin T. Gibbs
2002-04-16 14:56         ` James Bottomley
2002-04-16 15:25           ` Tony Battersby
2002-04-16 17:36             ` Justin T. Gibbs
2002-04-16 16:18           ` Oliver Xymoron
2002-04-16 17:38             ` Justin T. Gibbs
2002-04-16  5:58 ` Jens Axboe
2002-04-16 14:08   ` Justin T. Gibbs

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox