* [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX
@ 2002-04-15 20:40 Justin T. Gibbs
2002-04-15 20:51 ` James Bottomley
2002-04-16 5:58 ` Jens Axboe
0 siblings, 2 replies; 12+ messages in thread
From: Justin T. Gibbs @ 2002-04-15 20:40 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: linux-scsi, Alan Cox
Marcelo,
Looking at the latest version of your tree via bitkeeper, it
seems that the wrong patch has been applied to sd.c. In its
current state, the disk driver will fail to act upon any sense
data returned by the low level controller. The following patch
restores correct behavior.
--
Justin
===== sd.c 1.21 vs edited =====
--- 1.21/drivers/scsi/sd.c Fri Apr 5 09:19:03 2002
+++ edited/sd.c Mon Apr 15 14:32:51 2002
@@ -621,7 +621,7 @@
/* An error occurred */
if (driver_byte(result) != 0 && /* An error occured */
- SCpnt->sense_buffer[0] != 0xF0) { /* Sense data is valid */
+ SCpnt->sense_buffer[0] == 0xF0) { /* Sense data is valid */
switch (SCpnt->sense_buffer[2]) {
case MEDIUM_ERROR:
error_sector = (SCpnt->sense_buffer[3] << 24) |
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX 2002-04-15 20:40 [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX Justin T. Gibbs @ 2002-04-15 20:51 ` James Bottomley 2002-04-15 21:13 ` Justin T. Gibbs 2002-04-16 5:58 ` Jens Axboe 1 sibling, 1 reply; 12+ messages in thread From: James Bottomley @ 2002-04-15 20:51 UTC (permalink / raw) To: Justin T. Gibbs; +Cc: Marcelo Tosatti, linux-scsi, Alan Cox gibbs@scsiguy.com said: > Looking at the latest version of your tree via bitkeeper, it seems > that the wrong patch has been applied to sd.c. In its current state, > the disk driver will fail to act upon any sense data returned by the > low level controller. The following patch restores correct behavior. Actually, this should be using the scsi_sense_valid() function call in scsi_error.c, shouldn't it? Especially since the criterion for valid sense is merely that bit 7 be set with the error code (which can be either 0x70 or 0x71) in bits 0-6. James ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX 2002-04-15 20:51 ` James Bottomley @ 2002-04-15 21:13 ` Justin T. Gibbs 2002-04-16 14:03 ` James Bottomley 0 siblings, 1 reply; 12+ messages in thread From: Justin T. Gibbs @ 2002-04-15 21:13 UTC (permalink / raw) To: James Bottomley; +Cc: Marcelo Tosatti, linux-scsi, Alan Cox >gibbs@scsiguy.com said: >> Looking at the latest version of your tree via bitkeeper, it seems >> that the wrong patch has been applied to sd.c. In its current state, >> the disk driver will fail to act upon any sense data returned by the >> low level controller. The following patch restores correct behavior. > >Actually, this should be using the scsi_sense_valid() function call in >scsi_error.c, shouldn't it? Especially since the criterion for valid sense is > >merely that bit 7 be set with the error code (which can be either 0x70 or >0x71) in bits 0-6. Well, if we really wanted to fix it correctly: 1) All of the HBA drivers would set DRIVER_SENSE when sense data is available. Right now many do set it, but because some drivers do not, the mid-layer doesn't trust it. 2) The mid-layer would look at DRIVER_SENSE before ever looking in the sense buffer. Currently, and HBA driver has to zero out the first byte of the sense buffer to ensure the peripheral drivers don't believe that sense is available when it really isn't (stale data from previous sense retrieval). 3) The peripheral drivers would understand how to deal with deferred errors. The check against 0xF0 is currently "correct" because the code in sd.c's rw_intr doesn't know how to deal with anything other than a current error (0x70 + valid bit). -- Justin ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX 2002-04-15 21:13 ` Justin T. Gibbs @ 2002-04-16 14:03 ` James Bottomley 2002-04-16 14:43 ` Justin T. Gibbs 0 siblings, 1 reply; 12+ messages in thread From: James Bottomley @ 2002-04-16 14:03 UTC (permalink / raw) To: Justin T. Gibbs; +Cc: James Bottomley, Marcelo Tosatti, linux-scsi, Alan Cox gibbs@scsiguy.com said: > Well, if we really wanted to fix it correctly: [...] > The check against 0xF0 is currently "correct" because the code in > sd.c's rw_intr doesn't know how to deal with anything other than a > current error (0x70 + valid bit). I'm not advocating that you fix it correctly. However, if you use the provided scsi_sense_valid() function it will show up in an easy to recognise form for someone who later does all of this fixing. We can't go on adding these band aid fixes to the scsi subsystem with the justification that it's currently broken in a lot of cases so we don't need to consider them. All that does is cause the subsystem to diverge further and make the eventual fix even more difficult and daunting. What is wrong with: if (scsi_sense_valid(SCpnt)) { /* Sense data is valid */ /* FIXME: this probably won't work for deferred errors That way you've used the supplied function and noted the potential problem case. James ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX 2002-04-16 14:03 ` James Bottomley @ 2002-04-16 14:43 ` Justin T. Gibbs 2002-04-16 14:56 ` James Bottomley 0 siblings, 1 reply; 12+ messages in thread From: Justin T. Gibbs @ 2002-04-16 14:43 UTC (permalink / raw) To: James Bottomley; +Cc: Marcelo Tosatti, linux-scsi, Alan Cox >We can't go on adding these band aid fixes to the scsi subsystem with >the justification that it's currently broken in a lot of cases so we >don't need to consider them. All that does is cause the subsystem to >diverge further and make the eventual fix even more difficult and daunting. In my opinion, the only way to really fix the Linux SCSI layer is to rip it out and start over using prior art from systems with functional I/O subsystems as a guide to building something enterprise worthy. As soon as the community becomes serious about such an endevor, so will I. 8-) >What is wrong with: > > if (scsi_sense_valid(SCpnt)) { /* Sense data is valid */ > /* FIXME: this probably won't work for deferred errors > >That way you've used the supplied function and noted the potential problem >case. If you insist on doing this, lets at least preserve the current behavior: /* * FIXME: this driver treats deferred errors are fatal for * the command that reports the error. The error is * actually for a previous command, so the current * command should be retried. In general, deferred * errors are hard to cope with as the data written * may not be available to be written again. This * is one of the many reasons that write-back caching * should not be enabled for general purpose applications. * The user should at lease enable "auto write reallocation" * and verify that the spare table for the device is not * full. */ if (scsi_sense_valid(SCpnt) && scsi_sense_key(SCpnt) == SKEY_CURRENT_ERROR) { } I don't believe there are definitions to support the second part of the if clause, but there could be. -- Justin ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX 2002-04-16 14:43 ` Justin T. Gibbs @ 2002-04-16 14:56 ` James Bottomley 2002-04-16 15:25 ` Tony Battersby 2002-04-16 16:18 ` Oliver Xymoron 0 siblings, 2 replies; 12+ messages in thread From: James Bottomley @ 2002-04-16 14:56 UTC (permalink / raw) To: Justin T. Gibbs; +Cc: James Bottomley, Marcelo Tosatti, linux-scsi, Alan Cox gibbs@scsiguy.com said: > In my opinion, the only way to really fix the Linux SCSI layer is to > rip it out and start over using prior art from systems with functional > I/O subsystems as a guide to building something enterprise worthy. As > soon as the community becomes serious about such an endevor, so will > I. 8-) Not perhaps for ripping it up and starting again. However, there are more ways than one to skin a cat. If you can outline a set of principles, it may be possible to move current development towards them. How about just the five worst things about the current subsystem and how we could address them? > If you insist on doing this, lets at least preserve the current > behavior: > /* > * FIXME: this driver treats deferred errors are fatal for [...] > if (scsi_sense_valid(SCpnt) > && scsi_sense_key(SCpnt) == SKEY_CURRENT_ERROR) { Excellent! It's a lot more readable for people not versed in sense buffer layout, thanks! James > I don't believe there are definitions to support the second part of > the if clause, but there could be. ^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX 2002-04-16 14:56 ` James Bottomley @ 2002-04-16 15:25 ` Tony Battersby 2002-04-16 17:36 ` Justin T. Gibbs 2002-04-16 16:18 ` Oliver Xymoron 1 sibling, 1 reply; 12+ messages in thread From: Tony Battersby @ 2002-04-16 15:25 UTC (permalink / raw) To: 'James Bottomley', 'Justin T. Gibbs'; +Cc: linux-scsi > > If you insist on doing this, lets at least preserve the current > > behavior: > > > /* > > * FIXME: this driver treats deferred errors are fatal for > [...] > > if (scsi_sense_valid(SCpnt) > > && scsi_sense_key(SCpnt) == SKEY_CURRENT_ERROR) { > > Excellent! It's a lot more readable for people not versed in > sense buffer > layout, thanks! Why are you calling it the sense key? Current vs. deferred error is indicated by the error code (sense_data[0] & 0x7f), not the sense key (sense_data[2] & 0x0f). Calling it the sense key may lead to confusion for those who are versed in sense buffer layout. Tony ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX 2002-04-16 15:25 ` Tony Battersby @ 2002-04-16 17:36 ` Justin T. Gibbs 0 siblings, 0 replies; 12+ messages in thread From: Justin T. Gibbs @ 2002-04-16 17:36 UTC (permalink / raw) To: tonyb; +Cc: 'James Bottomley', linux-scsi >> > If you insist on doing this, lets at least preserve the current >> > behavior: >> >> > /* >> > * FIXME: this driver treats deferred errors are fatal for >> [...] >> > if (scsi_sense_valid(SCpnt) >> > && scsi_sense_key(SCpnt) == SKEY_CURRENT_ERROR) { >> >> Excellent! It's a lot more readable for people not versed in >> sense buffer >> layout, thanks! > >Why are you calling it the sense key? Current vs. deferred error is >indicated by the error code (sense_data[0] & 0x7f), not the sense key >(sense_data[2] & 0x0f). Calling it the sense key may lead to confusion for >those who are versed in sense buffer layout. Yes, my fault for typing something quickly. If I were to propose anything at all, it would be to follow the definitions I did for FreeBSD: struct scsi_sense_data { uint8_t error_code; #define SSD_ERRCODE 0x7F #define SSD_CURRENT_ERROR 0x70 #define SSD_DEFERRED_ERROR 0x71 #define SSD_ERRCODE_VALID 0x80 uint8_t segment; uint8_t flags; #define SSD_KEY 0x0F #define SSD_KEY_NO_SENSE 0x00 #define SSD_KEY_RECOVERED_ERROR 0x01 #define SSD_KEY_NOT_READY 0x02 #define SSD_KEY_MEDIUM_ERROR 0x03 #define SSD_KEY_HARDWARE_ERROR 0x04 #define SSD_KEY_ILLEGAL_REQUEST 0x05 #define SSD_KEY_UNIT_ATTENTION 0x06 #define SSD_KEY_DATA_PROTECT 0x07 #define SSD_KEY_BLANK_CHECK 0x08 #define SSD_KEY_Vendor_Specific 0x09 #define SSD_KEY_COPY_ABORTED 0x0a #define SSD_KEY_ABORTED_COMMAND 0x0b #define SSD_KEY_EQUAL 0x0c #define SSD_KEY_VOLUME_OVERFLOW 0x0d #define SSD_KEY_MISCOMPARE 0x0e #define SSD_KEY_RESERVED 0x0f #define SSD_ILI 0x20 #define SSD_EOM 0x40 #define SSD_FILEMARK 0x80 uint8_t info[4]; uint8_t extra_len; uint8_t cmd_spec_info[4]; uint8_t add_sense_code; uint8_t add_sense_code_qual; uint8_t fru; uint8_t sense_key_spec[3]; #define SSD_SCS_VALID 0x80 #define SSD_FIELDPTR_CMD 0x40 #define SSD_BITPTR_VALID 0x08 #define SSD_BITPTR_VALUE 0x07 #define SSD_MIN_SIZE 18 uint8_t extra_bytes[14]; #define SSD_FULL_SIZE sizeof(struct scsi_sense_data) }; static __inline void scsi_extract_sense(struct scsi_sense_data *sense, int *error_code, int *sense_key, int *asc, int *ascq); static __inline void scsi_ulto2b(u_int32_t val, u_int8_t *bytes); static __inline void scsi_ulto3b(u_int32_t val, u_int8_t *bytes); static __inline void scsi_ulto4b(u_int32_t val, u_int8_t *bytes); static __inline u_int32_t scsi_2btoul(u_int8_t *bytes); static __inline u_int32_t scsi_3btoul(u_int8_t *bytes); static __inline int32_t scsi_3btol(u_int8_t *bytes); static __inline u_int32_t scsi_4btoul(u_int8_t *bytes); static __inline void scsi_ulto2b(u_int32_t val, u_int8_t *bytes) { bytes[0] = (val >> 8) & 0xff; bytes[1] = val & 0xff; } static __inline void scsi_ulto3b(u_int32_t val, u_int8_t *bytes) { bytes[0] = (val >> 16) & 0xff; bytes[1] = (val >> 8) & 0xff; bytes[2] = val & 0xff; } static __inline void scsi_ulto4b(u_int32_t val, u_int8_t *bytes) { bytes[0] = (val >> 24) & 0xff; bytes[1] = (val >> 16) & 0xff; bytes[2] = (val >> 8) & 0xff; bytes[3] = val & 0xff; } static __inline u_int32_t scsi_2btoul(u_int8_t *bytes) { u_int32_t rv; rv = (bytes[0] << 8) | bytes[1]; return (rv); } static __inline u_int32_t scsi_3btoul(u_int8_t *bytes) { u_int32_t rv; rv = (bytes[0] << 16) | (bytes[1] << 8) | bytes[2]; return (rv); } static __inline int32_t scsi_3btol(u_int8_t *bytes) { u_int32_t rc = scsi_3btoul(bytes); if (rc & 0x00800000) rc |= 0xff000000; return (int32_t) rc; } static __inline u_int32_t scsi_4btoul(u_int8_t *bytes) { u_int32_t rv; rv = (bytes[0] << 24) | (bytes[1] << 16) | (bytes[2] << 8) | bytes[3]; return (rv); } static __inline void scsi_extract_sense(struct scsi_sense_data *sense, int *error_code, int *sense_key, int *asc, int *ascq) { *error_code = sense->error_code & SSD_ERRCODE; *sense_key = sense->flags & SSD_KEY; *asc = (sense->extra_len >= 5) ? sense->add_sense_code : 0; *ascq = (sense->extra_len >= 6) ? sense->add_sense_code_qual : 0; } And corrected code in sd.c might look like this: if (driver_byte(result) != 0 && scsi_sense_valid(SCpnt) && SCpnt->sense_buffer.error_code & SSD_ERRCODE_VALID) { int *error_code; int *sense_key; int *asc; int *ascq; scsi_extract_sense(&SCpnt->sense_buffer, &error_code, &sense_key, &asc, &ascq); if (error_code == SSD_DEFERRED_ERROR) goto report_error_and_retry_command; if (error_code != SSD_CURRENT_ERROR) goto report_error_and_fail_command; switch (sense_key) { case MEDIUM_ERROR: error_sector = scsi_4btoul(SCpnt->sense_buffer.info); etc. This would make the code much more explicit. -- Justin ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX 2002-04-16 14:56 ` James Bottomley 2002-04-16 15:25 ` Tony Battersby @ 2002-04-16 16:18 ` Oliver Xymoron 2002-04-16 17:38 ` Justin T. Gibbs 1 sibling, 1 reply; 12+ messages in thread From: Oliver Xymoron @ 2002-04-16 16:18 UTC (permalink / raw) To: James Bottomley; +Cc: Justin T. Gibbs, Marcelo Tosatti, linux-scsi, Alan Cox On Tue, 16 Apr 2002, James Bottomley wrote: > gibbs@scsiguy.com said: > > In my opinion, the only way to really fix the Linux SCSI layer is to > > rip it out and start over using prior art from systems with functional > > I/O subsystems as a guide to building something enterprise worthy. As > > soon as the community becomes serious about such an endevor, so will > > I. 8-) > > Not perhaps for ripping it up and starting again. However, there are > more ways than one to skin a cat. If you can outline a set of > principles, it may be possible to move current development towards them. > How about just the five worst things about the current subsystem and how > we could address them? >From my recent inspection: 1) sense handling 2) timeouts and error recovery 3) initialization and lun probing 4) scalability 5) readability In short, all the corner cases are handled with layer on layer of bandaids and while it works fine for most people with desktop systems, it's really showing its age when you start messing with storage networking and hotplugging. As it stands, SCSI is one of the few subsystems that hasn't gotten a major overhaul in recent years. As 2.5 is shaping up to be a revamp of the whole VM/paging/FS/IO/IDE chain and the BIO stuff seems to have settled down, now is a really good time to start thinking about plugging a cleaner SCSI layer in. -- "Love the dolphins," she advised him. "Write by W.A.S.T.E.." ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX 2002-04-16 16:18 ` Oliver Xymoron @ 2002-04-16 17:38 ` Justin T. Gibbs 0 siblings, 0 replies; 12+ messages in thread From: Justin T. Gibbs @ 2002-04-16 17:38 UTC (permalink / raw) To: Oliver Xymoron; +Cc: James Bottomley, Marcelo Tosatti, linux-scsi, Alan Cox >From my recent inspection: > >1) sense handling >2) timeouts and error recovery >3) initialization and lun probing >4) scalability >5) readability That about covers it! 8-) -- Justin ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX 2002-04-15 20:40 [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX Justin T. Gibbs 2002-04-15 20:51 ` James Bottomley @ 2002-04-16 5:58 ` Jens Axboe 2002-04-16 14:08 ` Justin T. Gibbs 1 sibling, 1 reply; 12+ messages in thread From: Jens Axboe @ 2002-04-16 5:58 UTC (permalink / raw) To: Justin T. Gibbs; +Cc: Marcelo Tosatti, linux-scsi, Alan Cox On Mon, Apr 15 2002, Justin T. Gibbs wrote: > Marcelo, > > Looking at the latest version of your tree via bitkeeper, it > seems that the wrong patch has been applied to sd.c. In its > current state, the disk driver will fail to act upon any sense > data returned by the low level controller. The following patch > restores correct behavior. Justin, Noticed this myself the other day... I don't think they were applied incorrectly, your 2.4.18 patch is buggy while your 2.4.17 version is quite ok. -- Jens Axboe ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX 2002-04-16 5:58 ` Jens Axboe @ 2002-04-16 14:08 ` Justin T. Gibbs 0 siblings, 0 replies; 12+ messages in thread From: Justin T. Gibbs @ 2002-04-16 14:08 UTC (permalink / raw) To: Jens Axboe; +Cc: Marcelo Tosatti, linux-scsi, Alan Cox >On Mon, Apr 15 2002, Justin T. Gibbs wrote: >> Marcelo, >> >> Looking at the latest version of your tree via bitkeeper, it >> seems that the wrong patch has been applied to sd.c. In its >> current state, the disk driver will fail to act upon any sense >> data returned by the low level controller. The following patch >> restores correct behavior. > >Justin, > >Noticed this myself the other day... I don't think they were applied >incorrectly, your 2.4.18 patch is buggy while your 2.4.17 version is >quite ok. There were two patches posted to the list about this issue. The first had this bug. The second did not. It does look like the merge of this correction didn't make it into my 2.4.18 tree at the time I generated the 2.4.18 patch set for the 6.2.5 release. I have corrected that patch set. -- Justin ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2002-04-16 17:38 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2002-04-15 20:40 [PATCH] sd.c fixes applied incorectly to 2.4.19-preXX Justin T. Gibbs 2002-04-15 20:51 ` James Bottomley 2002-04-15 21:13 ` Justin T. Gibbs 2002-04-16 14:03 ` James Bottomley 2002-04-16 14:43 ` Justin T. Gibbs 2002-04-16 14:56 ` James Bottomley 2002-04-16 15:25 ` Tony Battersby 2002-04-16 17:36 ` Justin T. Gibbs 2002-04-16 16:18 ` Oliver Xymoron 2002-04-16 17:38 ` Justin T. Gibbs 2002-04-16 5:58 ` Jens Axboe 2002-04-16 14:08 ` Justin T. Gibbs
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox