* New 2.6.24.2 SG_IO SCSI problems
@ 2008-02-21 15:15 Mark Hounschell
2008-02-21 15:41 ` James Bottomley
2008-02-22 16:50 ` Mike Christie
0 siblings, 2 replies; 15+ messages in thread
From: Mark Hounschell @ 2008-02-21 15:15 UTC (permalink / raw)
To: linux-scsi; +Cc: linux-kernel

I seem to have run into some sort of regression in the SG_IO interface of 2.6.24.2.
I have an application that up until 2.6.24 worked fine. The 2.6.23.16 kernel works fine.

During reads I get these kernel messages. Writes and other functions _seem_ OK. Actually basic
reads are working. It's with large BC reads using an io_vec list that the problem shows up.

Feb 21 09:27:51 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1.
Feb 21 09:27:51 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1.
Feb 21 09:27:51 harley kernel: sg[0] - Addr 0x06256100 : Length 256
Feb 21 09:27:51 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1.
Feb 21 09:27:51 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1.
Feb 21 09:27:51 harley kernel: sg[0] - Addr 0x06256100 : Length 256
Feb 21 09:27:51 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1.
Feb 21 09:27:51 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1.
Feb 21 09:27:51 harley kernel: sg[0] - Addr 0x06256100 : Length 256
Feb 21 09:27:51 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1.
Feb 21 09:27:51 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1.
Feb 21 09:27:51 harley kernel: sg[0] - Addr 0x06256100 : Length 256
. . . . . .

The status elements of the sg_io_hdr_t structure used in the application return

status = 0x0
msg_status = 0x0
host_status = 0x7
driver_status = 0x0

The hardware in use on this particular machine is a simple Adaptec AHA-2930CU talking to an old
IMPRIMIS 94601-15 1.2GB disk drive.

Again, all this works fine with the 2.6.23.11 kernel

Any help would be appreciated

Regards
Mark Hounschell

^ permalink raw reply [flat|nested] 15+ messages in thread
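For reference, a host_status of 0x7 corresponds to DID_ERROR in the Linux SCSI host-byte codes, which generally indicates an error detected by the host adapter or its driver rather than a status returned by the target. The following is only a minimal sketch of decoding the four status fields quoted in the report. The helper name describe_sg_status and the table name HOST_STATUS_NAMES are hypothetical, chosen for illustration, and the table lists only the basic DID_* codes.

```python
# Basic host-byte (DID_*) codes from the Linux SCSI headers.
# Only the first ten values are listed; this table is illustrative.
HOST_STATUS_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
    0x09: "DID_BAD_INTR",
}


def describe_sg_status(status, msg_status, host_status, driver_status):
    """Hypothetical helper: summarize the sg_io_hdr_t status fields as text."""
    host_name = HOST_STATUS_NAMES.get(host_status, "unknown")
    return (
        f"status=0x{status:x} msg_status=0x{msg_status:x} "
        f"host_status=0x{host_status:x} ({host_name}) "
        f"driver_status=0x{driver_status:x}"
    )


# Values taken from the report above.
summary = describe_sg_status(0x0, 0x0, 0x7, 0x0)
print(summary)
```

With the values from the report, the SCSI status and driver status are both clean and only the host byte is set, which is consistent with the adapter driver (rather than the disk) flagging the data overrun.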
* Re: New 2.6.24.2 SG_IO SCSI problems
2008-02-21 15:15 New 2.6.24.2 SG_IO SCSI problems Mark Hounschell
@ 2008-02-21 15:41 ` James Bottomley
2008-02-21 16:21 ` Mark Hounschell
2008-02-22 16:50 ` Mike Christie
1 sibling, 1 reply; 15+ messages in thread
From: James Bottomley @ 2008-02-21 15:41 UTC (permalink / raw)
To: markh; +Cc: linux-scsi, linux-kernel

On Thu, 2008-02-21 at 10:15 -0500, Mark Hounschell wrote:
> I seem to have run into some sort of regression in the SG_IO interface of 2.6.24.2.
> I have an application that up until 2.6.24 worked fine. The 2.6.23.16 kernel works fine.
>
> During reads I get these kernel messages. Writes and other functions _seem_ OK. Actually basic
> reads are working. It's with large BC reads using an io_vec list that the problem shows up.
>
> Feb 21 09:27:51 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1.
> Feb 21 09:27:51 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1.
> Feb 21 09:27:51 harley kernel: sg[0] - Addr 0x06256100 : Length 256

Help me a little here. What was the io_vec and command you sent in to
produce this? The aic debugging information implies a single element sg
list for a 256 byte read.

James
* Re: New 2.6.24.2 SG_IO SCSI problems
2008-02-21 15:41 ` James Bottomley
@ 2008-02-21 16:21 ` Mark Hounschell
2008-02-22 10:03 ` Mark Hounschell
0 siblings, 1 reply; 15+ messages in thread
From: Mark Hounschell @ 2008-02-21 16:21 UTC (permalink / raw)
To: James Bottomley; +Cc: linux-scsi, linux-kernel

James Bottomley wrote:
> On Thu, 2008-02-21 at 10:15 -0500, Mark Hounschell wrote:
>> I seem to have run into some sort of regression in the SG_IO interface of 2.6.24.2.
>> I have an application that up until 2.6.24 worked fine. The 2.6.23.16 kernel works fine.
>>
>> During reads I get these kernel messages. Writes and other functions _seem_ OK. Actually basic
>> reads are working. It's with large BC reads using an io_vec list that the problem shows up.
>>
>> Feb 21 09:27:51 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1.
>> Feb 21 09:27:51 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1.
>> Feb 21 09:27:51 harley kernel: sg[0] - Addr 0x06256100 : Length 256
>
> Help me a little here. What was the io_vec and command you sent in to
> produce this? The aic debugging information implies a single element sg
> list for a 256 byte read.
>
> James
>
>
>

Well, I did no 256 byte xfer at all. My failing io_vec list has 6 elements.
The first 5 are for byte counts of 0xfffc and the last 0x6114. See below.

This is some debug info from within my app of the commands leading up to the failure:
If you need actual values of the io_vec lists I will need to add some additional debug
info into the app. I will do so if needed.

The disk BTW is formatted at 768 byte sector size.

The first read has a 2 element io_vec list and reports no error:

ScsiDev_thread_7e00: Read CBD = 0x08 0x00 0x00 0x00 0x01 0x00
ScsiDev_thread_7e00: Read1(1) bc = 0x0078 addr = 0x000000 Skip 0
ScsiDev_thread_7e00: Read(2) bc = 0x0288 addr = 0xb6cea368 Skip 1
ScsiDev_thread_7e00: SRead DC ops = 2 short_bc = 288

The second read has a 4 element io_vec list and reports no error:

ScsiDev_thread_7e00: Read CBD = 0x08 0x00 0x00 0x00 0x05 0x00
ScsiDev_thread_7e00: Read1(1) bc = 0x0780 addr = 0x000000 Skip 0
ScsiDev_thread_7e00: Read1(2) bc = 0x0670 addr = 0x000000 Skip 1
ScsiDev_thread_7e00: Read1(3) bc = 0x003c addr = 0x000200 Skip 0
ScsiDev_thread_7e00: SRead(4) bc = 0x022c addr = 0xb6cea368 Skip 1
ScsiDev_thread_7e00: Read DC ops = 4 short_bc = 22c

There is a seek here that reports no error:

ScsiDev_thread_7e00: Seek address 2752 BPS 768

This read has a 6 element io_vec list and reports the error to the app:

ScsiDev_thread_7e00: ReadX CBD = 0x28 0x00 0x00 0x00 0x27 0x52 0x00 0x01 0xcb 0x00
ScsiDev_thread_7e00: Read1(1) bc = 0xfffc addr = 0x000784 Skip 0
ScsiDev_thread_7e00: Read1(2) bc = 0xfffc addr = 0x010780 Skip 0
ScsiDev_thread_7e00: Read1(3) bc = 0xfffc addr = 0x02077c Skip 0
ScsiDev_thread_7e00: Read1(4) bc = 0xfffc addr = 0x030778 Skip 0
ScsiDev_thread_7e00: Read1(5) bc = 0xfffc addr = 0x040774 Skip 0
ScsiDev_thread_7e00: Read1(6) bc = 0x6114 addr = 0x050770 Skip 0
ScsiDev_thread_7e00: Read DC ops = 6 short_bc = 0
ScsiDev_thread_7e00: scsi = 0x0 msg 0x0 host = 0x00000007 driver = 0x00000000
ScsiDev_thread_7e00: Read error: sns = 0x00 residual = 0x0000
ScsiDev_thread_7e00: Posting IPL status 0x00000090 0x000e0000 for Suba 0000 to loc 0x000000
ScsiDev_thread_7e00: Sleeping!!

Below is the complete dump of kernel messages for the above. I assume they are a result of the
last failing read, but there is an awful lot there just for that 6 element io_vec list.
Sorry to put all this in there but I wanted you to get the idea.

Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. 
Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. 
Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. 
Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. 
Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. 
Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. 
Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. 
Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. 
Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. 
Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. 
Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. 
Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. 
Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. 
Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 Again, sorry for all that. regards Mark ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: New 2.6.24.2 SG_IO SCSI problems 2008-02-21 16:21 ` Mark Hounschell @ 2008-02-22 10:03 ` Mark Hounschell 0 siblings, 0 replies; 15+ messages in thread From: Mark Hounschell @ 2008-02-22 10:03 UTC (permalink / raw) To: markh; +Cc: James Bottomley, linux-scsi, linux-kernel Mark Hounschell wrote: > James Bottomley wrote: >> On Thu, 2008-02-21 at 10:15 -0500, Mark Hounschell wrote: >>> I seem to have run into some sort of regression in the SG_IO interface of 2.6.24.2. >>> I have an application that up until 2.6.24 worked fine. The 2.6.23.16 kernel works fine. >>> >>> During reads I get these kernel messages. Writes and other functions _seem_ OK. Actually basic >>> reads are working. Its with large BC reads using an io_vec list that the problem shows up. >>> >>> Feb 21 09:27:51 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. >>> Feb 21 09:27:51 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. >>> Feb 21 09:27:51 harley kernel: sg[0] - Addr 0x06256100 : Length 256 >> Help me a little here. What was the io_vec and command you sent in to >> produce this? The aic debugging information implies a single element sg >> list for a 256 byte read. >> >> James >> >> >> > > Well, I did no 256 byte xfer at all. My failing io_vec list has 6 elements. > The first 5 are for byte counts of 0xfffc and the last 0x6114. See below. > > This is some debug info from within my app of the commands leading up the failure: > If you need actual values of the io_vec lists I will need to add some additional debug > info into the app. I will do if needed. > > The disk BTW is formated at 768 byte sector size. 
> > The first read has a 2 element io_vec list and reports no error: > ScsiDev_thread_7e00: Read CBD = 0x08 0x00 0x00 0x00 0x01 0x00 > ScsiDev_thread_7e00: Read1(1) bc = 0x0078 addr = 0x000000 Skip 0 > ScsiDev_thread_7e00: Read(2) bc = 0x0288 addr = 0xb6cea368 Skip 1 > ScsiDev_thread_7e00: SRead DC ops = 2 short_bc = 288 > > The second read has a 4 element io_vec list and reports no error: > ScsiDev_thread_7e00: Read CBD = 0x08 0x00 0x00 0x00 0x05 0x00 > ScsiDev_thread_7e00: Read1(1) bc = 0x0780 addr = 0x000000 Skip 0 > ScsiDev_thread_7e00: Read1(2) bc = 0x0670 addr = 0x000000 Skip 1 > ScsiDev_thread_7e00: Read1(3) bc = 0x003c addr = 0x000200 Skip 0 > ScsiDev_thread_7e00: SRead(4) bc = 0x022c addr = 0xb6cea368 Skip 1 > ScsiDev_thread_7e00: Read DC ops = 4 short_bc = 22c > > There is a seek here that reports no error: > ScsiDev_thread_7e00: Seek address 2752 BPS 768 > > > This read has a 6 element io_vec list and reports the error to the app: > ScsiDev_thread_7e00: ReadX CBD = 0x28 0x00 0x00 0x00 0x27 0x52 0x00 0x01 0xcb 0x00 > ScsiDev_thread_7e00: Read1(1) bc = 0xfffc addr = 0x000784 Skip 0 > ScsiDev_thread_7e00: Read1(2) bc = 0xfffc addr = 0x010780 Skip 0 > ScsiDev_thread_7e00: Read1(3) bc = 0xfffc addr = 0x02077c Skip 0 > ScsiDev_thread_7e00: Read1(4) bc = 0xfffc addr = 0x030778 Skip 0 > ScsiDev_thread_7e00: Read1(5) bc = 0xfffc addr = 0x040774 Skip 0 > ScsiDev_thread_7e00: Read1(6) bc = 0x6114 addr = 0x050770 Skip 0 > ScsiDev_thread_7e00: Read DC ops = 6 short_bc = 0 > > ScsiDev_thread_7e00: scsi = 0x0 msg 0x0 host = 0x00000007 driver = 0x00000000 > ScsiDev_thread_7e00: Read error: sns = 0x00 residual = 0x0000 > ScsiDev_thread_7e00: Posting IPL status 0x00000090 0x000e0000 for Suba 0000 to loc 0x000000 > ScsiDev_thread_7e00: Sleeping!! > > > Below is the complete dump of kernel messages for the above. I assume they are > a result of the last failing read but there is an awfull lot there just for that 6 element > io_vec list. 
Sorry to put all this in there but wanted to you to get the idea. > > > Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. > Feb 21 10:51:03 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. > Feb 21 10:51:03 harley kernel: sg[0] - Addr 0x03252e100 : Length 256 . . Snip . > Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): data overrun detected in Data-in phase. Tag == 0x1. > Feb 21 10:51:49 harley kernel: (scsi1:A:2:0): Have seen Data Phase. Length = 256. NumSGs = 1. > Feb 21 10:51:49 harley kernel: sg[0] - Addr 0x0f156100 : Length 256 > > Again, sorry for all that. > FWIW, on a different machine doing the same thing I get only a single kernel message and NOT the tons shown above. Regards Mark ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: New 2.6.24.2 SG_IO SCSI problems 2008-02-21 15:15 New 2.6.24.2 SG_IO SCSI problems Mark Hounschell 2008-02-21 15:41 ` James Bottomley @ 2008-02-22 16:50 ` Mike Christie 2008-02-22 16:59 ` Mike Christie 1 sibling, 1 reply; 15+ messages in thread From: Mike Christie @ 2008-02-22 16:50 UTC (permalink / raw) To: markh; +Cc: linux-scsi, linux-kernel Mark Hounschell wrote: > I seem to have run into some sort of regression in the SG_IO interface of 2.6.24.2. > I have an application that up until 2.6.24 worked fine. The 2.6.23.16 kernel works fine. > > During reads I get these kernel messages. Writes and other functions _seem_ OK. Actually basic > reads are working. Its with large BC reads using an io_vec list that the problem shows up. > Are you doing SG_IO to the sg device (/dev/sg*) or to the block device (/dev/sdX)? ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: New 2.6.24.2 SG_IO SCSI problems 2008-02-22 16:50 ` Mike Christie @ 2008-02-22 16:59 ` Mike Christie 2008-02-22 17:56 ` Mark Hounschell 0 siblings, 1 reply; 15+ messages in thread From: Mike Christie @ 2008-02-22 16:59 UTC (permalink / raw) To: markh; +Cc: linux-scsi, linux-kernel Mike Christie wrote: > Mark Hounschell wrote: >> I seem to have run into some sort of regression in the SG_IO interface >> of 2.6.24.2. I have an application that up until 2.6.24 worked fine. >> The 2.6.23.16 kernel works fine. >> >> During reads I get these kernel messages. Writes and other functions >> _seem_ OK. Actually basic >> reads are working. Its with large BC reads using an io_vec list that >> the problem shows up. >> > > Are you doing SG_IO to the sg device (/dev/sg*) or to the block device > (/dev/sdX)? If you are doing SG_IO to the sg device, then I know of one regression (well not regression exactly, but I fixed a bug but the patch got partially overwritten by another patch and that caused a new bug). Both bugs are fixed in 2.6.25-rc2. Could you try that out if you are doing SG_IO to the sg device. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: New 2.6.24.2 SG_IO SCSI problems 2008-02-22 16:59 ` Mike Christie @ 2008-02-22 17:56 ` Mark Hounschell 2008-02-22 21:38 ` Mark Hounschell 0 siblings, 1 reply; 15+ messages in thread From: Mark Hounschell @ 2008-02-22 17:56 UTC (permalink / raw) To: Mike Christie; +Cc: linux-scsi, linux-kernel Mike Christie wrote: > Mike Christie wrote: >> Mark Hounschell wrote: >>> I seem to have run into some sort of regression in the SG_IO >>> interface of 2.6.24.2. I have an application that up until 2.6.24 >>> worked fine. The 2.6.23.16 kernel works fine. >>> >>> During reads I get these kernel messages. Writes and other functions >>> _seem_ OK. Actually basic >>> reads are working. Its with large BC reads using an io_vec list that >>> the problem shows up. >>> >> >> Are you doing SG_IO to the sg device (/dev/sg*) or to the block device >> (/dev/sdX)? > > If you are doing SG_IO to the sg device, then I know of one regression > (well not regression exactly, but I fixed a bug but the patch got > partially overwritten by another patch and that caused a new bug). Both > bugs are fixed in 2.6.25-rc2. Could you try that out if you are doing > SG_IO to the sg device. > Yes, I'm using /dev/sg*. And yes again I'll checkout 2.6.25-rc2 ASIC. Thanks Mark ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: New 2.6.24.2 SG_IO SCSI problems 2008-02-22 17:56 ` Mark Hounschell @ 2008-02-22 21:38 ` Mark Hounschell 2008-02-22 22:25 ` Mike Christie 0 siblings, 1 reply; 15+ messages in thread From: Mark Hounschell @ 2008-02-22 21:38 UTC (permalink / raw) To: markh; +Cc: Mike Christie, linux-scsi, linux-kernel Mark Hounschell wrote: > Mike Christie wrote: >> Mike Christie wrote: >>> Mark Hounschell wrote: >>>> I seem to have run into some sort of regression in the SG_IO >>>> interface of 2.6.24.2. I have an application that up until 2.6.24 >>>> worked fine. The 2.6.23.16 kernel works fine. >>>> >>>> During reads I get these kernel messages. Writes and other functions >>>> _seem_ OK. Actually basic >>>> reads are working. Its with large BC reads using an io_vec list that >>>> the problem shows up. >>>> >>> Are you doing SG_IO to the sg device (/dev/sg*) or to the block device >>> (/dev/sdX)? >> If you are doing SG_IO to the sg device, then I know of one regression >> (well not regression exactly, but I fixed a bug but the patch got >> partially overwritten by another patch and that caused a new bug). Both >> bugs are fixed in 2.6.25-rc2. Could you try that out if you are doing >> SG_IO to the sg device. >> > > Yes, I'm using /dev/sg*. And yes again I'll checkout 2.6.25-rc2 ASIC. > > Thanks > Mark > - 2.6.25-rc2 does fix the problem I'm having. I don't suppose there is a patch lying around for 2.6.24.2?? Thanks Mark ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: New 2.6.24.2 SG_IO SCSI problems 2008-02-22 21:38 ` Mark Hounschell @ 2008-02-22 22:25 ` Mike Christie 2008-02-22 22:48 ` Tony Battersby ` (2 more replies) 0 siblings, 3 replies; 15+ messages in thread From: Mike Christie @ 2008-02-22 22:25 UTC (permalink / raw) To: markh; +Cc: linux-scsi, linux-kernel, Tony Battersby [-- Attachment #1: Type: text/plain, Size: 1365 bytes --] Mark Hounschell wrote: > Mark Hounschell wrote: >> Mike Christie wrote: >>> Mike Christie wrote: >>>> Mark Hounschell wrote: >>>>> I seem to have run into some sort of regression in the SG_IO >>>>> interface of 2.6.24.2. I have an application that up until 2.6.24 >>>>> worked fine. The 2.6.23.16 kernel works fine. >>>>> >>>>> During reads I get these kernel messages. Writes and other functions >>>>> _seem_ OK. Actually basic >>>>> reads are working. Its with large BC reads using an io_vec list that >>>>> the problem shows up. >>>>> >>>> Are you doing SG_IO to the sg device (/dev/sg*) or to the block device >>>> (/dev/sdX)? >>> If you are doing SG_IO to the sg device, then I know of one regression >>> (well not regression exactly, but I fixed a bug but the patch got >>> partially overwritten by another patch and that caused a new bug). Both >>> bugs are fixed in 2.6.25-rc2. Could you try that out if you are doing >>> SG_IO to the sg device. >>> >> Yes, I'm using /dev/sg*. And yes again I'll checkout 2.6.25-rc2 ASIC. >> >> Thanks >> Mark >> - > > 2.6.25-rc2 does fix the problem I'm having. I don't suppose there is a patch > lying around for 2.6.24.2?? > I attached a backport of the patch from Tony (added as cc) that is in 2.6.25-rc2. Could you try it out against 2.6.24.2 just to make sure it was this patch, then we can send it to stable. 
[-- Attachment #2: fix-passthrough-bufflen.patch --] [-- Type: text/x-patch, Size: 1443 bytes --] Backport 76d78300a6eb8b7f08e47703b7e68a659ffc2053 to 2.6.24 >From Tony Battersby: When sending a SCSI command to a tape drive via the SCSI Generic (sg) driver, if the command has a data transfer length more than scatter_elem_sz (32 KB default) and not a multiple of 512, then I either hit BUG_ON(!valid_dma_direction(direction)) in dma_unmap_sg() or else the command never completes (depending on the LLDD). When constructing scatterlists, the sg driver rounds up the scatterlist element sizes to be a multiple of 512. This can result in sum(scatterlist lengths) > bufflen. In this case, scsi_req_map_sg() incorrectly sets bio->bi_size to sum(scatterlist lengths) rather than to bufflen. When the command completes, req_bio_endio() detects that bio->bi_size != 0, and so it doesn't call bio_endio(). This causes the command to be resubmitted, resulting in BUG_ON or the command never completing. This patch makes scsi_req_map_sg() set bio->bi_size to bufflen rather than to sum(scatterlist lengths), which fixes the problem. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> --- linux-2.6.24.2/drivers/scsi/scsi_lib.c 2008-02-10 23:51:11.000000000 -0600 +++ linux-2.6.24.2.work/drivers/scsi/scsi_lib.c 2008-02-22 16:20:09.000000000 -0600 @@ -298,7 +298,6 @@ static int scsi_req_map_sg(struct reques page = sg_page(sg); off = sg->offset; len = sg->length; - data_len += len; while (len > 0 && data_len > 0) { /* ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: New 2.6.24.2 SG_IO SCSI problems 2008-02-22 22:25 ` Mike Christie @ 2008-02-22 22:48 ` Tony Battersby 2008-02-23 11:16 ` Mark Hounschell 2008-03-05 11:58 ` Mark Hounschell 2 siblings, 0 replies; 15+ messages in thread From: Tony Battersby @ 2008-02-22 22:48 UTC (permalink / raw) To: Mike Christie; +Cc: markh, linux-scsi, linux-kernel > I attached a backport of the patch from Tony (added as cc) that is in > 2.6.25-rc2. Could you try it out against 2.6.24.2 just to make sure it > was this patch, then we can send it to stable. > Yes, I had wanted to send this patch to -stable, but got distracted with other bugs. So please do so, by all means. Tony ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: New 2.6.24.2 SG_IO SCSI problems 2008-02-22 22:25 ` Mike Christie 2008-02-22 22:48 ` Tony Battersby @ 2008-02-23 11:16 ` Mark Hounschell 2008-03-05 11:58 ` Mark Hounschell 2 siblings, 0 replies; 15+ messages in thread From: Mark Hounschell @ 2008-02-23 11:16 UTC (permalink / raw) To: Mike Christie; +Cc: markh, linux-scsi, linux-kernel, Tony Battersby Mike Christie wrote: > Mark Hounschell wrote: >> Mark Hounschell wrote: >>> Mike Christie wrote: >>>> Mike Christie wrote: >>>>> Mark Hounschell wrote: >>>>>> I seem to have run into some sort of regression in the SG_IO >>>>>> interface of 2.6.24.2. I have an application that up until 2.6.24 >>>>>> worked fine. The 2.6.23.16 kernel works fine. >>>>>> >>>>>> During reads I get these kernel messages. Writes and other functions >>>>>> _seem_ OK. Actually basic >>>>>> reads are working. Its with large BC reads using an io_vec list that >>>>>> the problem shows up. >>>>>> >>>>> Are you doing SG_IO to the sg device (/dev/sg*) or to the block device >>>>> (/dev/sdX)? >>>> If you are doing SG_IO to the sg device, then I know of one regression >>>> (well not regression exactly, but I fixed a bug but the patch got >>>> partially overwritten by another patch and that caused a new bug). Both >>>> bugs are fixed in 2.6.25-rc2. Could you try that out if you are doing >>>> SG_IO to the sg device. >>>> >>> Yes, I'm using /dev/sg*. And yes again I'll checkout 2.6.25-rc2 ASIC. >>> >>> Thanks >>> Mark >>> - >> >> 2.6.25-rc2 does fix the problem I'm having. I don't suppose there is a >> patch >> lying around for 2.6.24.2?? >> > > I attached a backport of the patch from Tony (added as cc) that is in > 2.6.25-rc2. Could you try it out against 2.6.24.2 just to make sure it > was this patch, then we can send it to stable. > Sorry it took so long. This does fix my problem. I hope it's not too late for 2.6.24.3. Regards Mark ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: New 2.6.24.2 SG_IO SCSI problems 2008-02-22 22:25 ` Mike Christie 2008-02-22 22:48 ` Tony Battersby 2008-02-23 11:16 ` Mark Hounschell @ 2008-03-05 11:58 ` Mark Hounschell 2008-03-05 15:44 ` James Bottomley 2 siblings, 1 reply; 15+ messages in thread From: Mark Hounschell @ 2008-03-05 11:58 UTC (permalink / raw) To: Mike Christie; +Cc: markh, linux-scsi, linux-kernel, Tony Battersby Mike Christie wrote: > Mark Hounschell wrote: >> Mark Hounschell wrote: >>> Mike Christie wrote: >>>> Mike Christie wrote: >>>>> Mark Hounschell wrote: >>>>>> I seem to have run into some sort of regression in the SG_IO >>>>>> interface of 2.6.24.2. I have an application that up until 2.6.24 >>>>>> worked fine. The 2.6.23.16 kernel works fine. >>>>>> >>>>>> During reads I get these kernel messages. Writes and other functions >>>>>> _seem_ OK. Actually basic >>>>>> reads are working. Its with large BC reads using an io_vec list that >>>>>> the problem shows up. >>>>>> >>>>> Are you doing SG_IO to the sg device (/dev/sg*) or to the block device >>>>> (/dev/sdX)? >>>> If you are doing SG_IO to the sg device, then I know of one regression >>>> (well not regression exactly, but I fixed a bug but the patch got >>>> partially overwritten by another patch and that caused a new bug). Both >>>> bugs are fixed in 2.6.25-rc2. Could you try that out if you are doing >>>> SG_IO to the sg device. >>>> >>> Yes, I'm using /dev/sg*. And yes again I'll checkout 2.6.25-rc2 ASIC. >>> >>> Thanks >>> Mark >>> - >> >> 2.6.25-rc2 does fix the problem I'm having. I don't suppose there is a >> patch >> lying around for 2.6.24.2?? >> > > I attached a backport of the patch from Tony (added as cc) that is in > 2.6.25-rc2. Could you try it out against 2.6.24.2 just to make sure it > was this patch, then we can send it to stable. > >Mark Hounschell wrote: > >Sorry it took so long. This does fix my problem. 
I hope it's not to >late for 2.6.24.3 > Backport 76d78300a6eb8b7f08e47703b7e68a659ffc2053 to 2.6.24 >From Tony Battersby: When sending a SCSI command to a tape drive via the SCSI Generic (sg) driver, if the command has a data transfer length more than scatter_elem_sz (32 KB default) and not a multiple of 512, then I either hit BUG_ON(!valid_dma_direction(direction)) in dma_unmap_sg() or else the command never completes (depending on the LLDD). When constructing scatterlists, the sg driver rounds up the scatterlist element sizes to be a multiple of 512. This can result in sum(scatterlist lengths) > bufflen. In this case, scsi_req_map_sg() incorrectly sets bio->bi_size to sum(scatterlist lengths) rather than to bufflen. When the command completes, req_bio_endio() detects that bio->bi_size != 0, and so it doesn't call bio_endio(). This causes the command to be resubmitted, resulting in BUG_ON or the command never completing. This patch makes scsi_req_map_sg() set bio->bi_size to bufflen rather than to sum(scatterlist lengths), which fixes the problem. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> --- linux-2.6.24.2/drivers/scsi/scsi_lib.c 2008-02-10 23:51:11.000000000 -0600 +++ linux-2.6.24.2.work/drivers/scsi/scsi_lib.c 2008-02-22 16:20:09.000000000 -0600 @@ -298,7 +298,6 @@ static int scsi_req_map_sg(struct reques page = sg_page(sg); off = sg->offset; len = sg->length; - data_len += len; while (len > 0 && data_len > 0) { /* Did this ever get sent to the stable team? Regards Mark ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: New 2.6.24.2 SG_IO SCSI problems 2008-03-05 11:58 ` Mark Hounschell @ 2008-03-05 15:44 ` James Bottomley 2008-03-05 16:28 ` Mark Hounschell 2008-03-05 17:13 ` Mike Christie 0 siblings, 2 replies; 15+ messages in thread From: James Bottomley @ 2008-03-05 15:44 UTC (permalink / raw) To: Mark Hounschell Cc: Mike Christie, markh, linux-scsi, linux-kernel, Tony Battersby On Wed, 2008-03-05 at 06:58 -0500, Mark Hounschell wrote: > Mike Christie wrote: > > Mark Hounschell wrote: > >> Mark Hounschell wrote: > >>> Mike Christie wrote: > >>>> Mike Christie wrote: > >>>>> Mark Hounschell wrote: > >>>>>> I seem to have run into some sort of regression in the SG_IO > >>>>>> interface of 2.6.24.2. I have an application that up until 2.6.24 > >>>>>> worked fine. The 2.6.23.16 kernel works fine. > >>>>>> > >>>>>> During reads I get these kernel messages. Writes and other functions > >>>>>> _seem_ OK. Actually basic > >>>>>> reads are working. Its with large BC reads using an io_vec list that > >>>>>> the problem shows up. > >>>>>> > >>>>> Are you doing SG_IO to the sg device (/dev/sg*) or to the block device > >>>>> (/dev/sdX)? > >>>> If you are doing SG_IO to the sg device, then I know of one regression > >>>> (well not regression exactly, but I fixed a bug but the patch got > >>>> partially overwritten by another patch and that caused a new bug). Both > >>>> bugs are fixed in 2.6.25-rc2. Could you try that out if you are doing > >>>> SG_IO to the sg device. > >>>> > >>> Yes, I'm using /dev/sg*. And yes again I'll checkout 2.6.25-rc2 ASIC. > >>> > >>> Thanks > >>> Mark > >>> - > >> > >> 2.6.25-rc2 does fix the problem I'm having. I don't suppose there is a > >> patch > >> lying around for 2.6.24.2?? > >> > > > > I attached a backport of the patch from Tony (added as cc) that is in > > 2.6.25-rc2. Could you try it out against 2.6.24.2 just to make sure it > > was this patch, then we can send it to stable. > > > > >Mark Hounschell wrote: > > > >Sorry it took so long. 
This does fix my problem. I hope it's not to > >late for 2.6.24.3 > > > > Backport > 76d78300a6eb8b7f08e47703b7e68a659ffc2053 > to 2.6.24 Erm, I think you mean: commit 4d2de3a50ce19af2008a90636436a1bf5b3b697b Author: Tony Battersby <tonyb@cybernetics.com> Date: Tue Feb 5 10:36:10 2008 -0500 [SCSI] fix BUG when sum(scatterlist) > bufflen I can send it ... I thought the error was introduced post 2.6.24, but it was actually in 2.6.24-rc1 James > >From Tony Battersby: > > When sending a SCSI command to a tape drive via the SCSI Generic (sg) > driver, if the command has a data transfer length more than > scatter_elem_sz (32 KB default) and not a multiple of 512, then I either > hit BUG_ON(!valid_dma_direction(direction)) in dma_unmap_sg() or else > the command never completes (depending on the LLDD). > > When constructing scatterlists, the sg driver rounds up the scatterlist > element sizes to be a multiple of 512. This can result in > sum(scatterlist lengths) > bufflen. In this case, scsi_req_map_sg() > incorrectly sets bio->bi_size to sum(scatterlist lengths) rather than to > bufflen. When the command completes, req_bio_endio() detects that > bio->bi_size != 0, and so it doesn't call bio_endio(). This causes the > command to be resubmitted, resulting in BUG_ON or the command never > completing. > > This patch makes scsi_req_map_sg() set bio->bi_size to bufflen rather > than to sum(scatterlist lengths), which fixes the problem. > > Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> > > --- linux-2.6.24.2/drivers/scsi/scsi_lib.c 2008-02-10 23:51:11.000000000 > -0600 > +++ linux-2.6.24.2.work/drivers/scsi/scsi_lib.c 2008-02-22 > 16:20:09.000000000 -0600 > @@ -298,7 +298,6 @@ static int scsi_req_map_sg(struct reques > page = sg_page(sg); > off = sg->offset; > len = sg->length; > - data_len += len; > > while (len > 0 && data_len > 0) { > /* > > > Did this ever get sent to the stable team? 
> > Regards > Mark > -- > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: New 2.6.24.2 SG_IO SCSI problems 2008-03-05 15:44 ` James Bottomley @ 2008-03-05 16:28 ` Mark Hounschell 2008-03-05 17:13 ` Mike Christie 1 sibling, 0 replies; 15+ messages in thread From: Mark Hounschell @ 2008-03-05 16:28 UTC (permalink / raw) To: James Bottomley Cc: Mark Hounschell, Mike Christie, linux-scsi, linux-kernel, Tony Battersby James Bottomley wrote: > On Wed, 2008-03-05 at 06:58 -0500, Mark Hounschell wrote: >> Mike Christie wrote: >>> Mark Hounschell wrote: >>>> Mark Hounschell wrote: >>>>> Mike Christie wrote: >>>>>> Mike Christie wrote: >>>>>>> Mark Hounschell wrote: >>>>>>>> I seem to have run into some sort of regression in the SG_IO >>>>>>>> interface of 2.6.24.2. I have an application that up until 2.6.24 >>>>>>>> worked fine. The 2.6.23.16 kernel works fine. >>>>>>>> >>>>>>>> During reads I get these kernel messages. Writes and other functions >>>>>>>> _seem_ OK. Actually basic >>>>>>>> reads are working. Its with large BC reads using an io_vec list that >>>>>>>> the problem shows up. >>>>>>>> >>>>>>> Are you doing SG_IO to the sg device (/dev/sg*) or to the block device >>>>>>> (/dev/sdX)? >>>>>> If you are doing SG_IO to the sg device, then I know of one regression >>>>>> (well not regression exactly, but I fixed a bug but the patch got >>>>>> partially overwritten by another patch and that caused a new bug). Both >>>>>> bugs are fixed in 2.6.25-rc2. Could you try that out if you are doing >>>>>> SG_IO to the sg device. >>>>>> >>>>> Yes, I'm using /dev/sg*. And yes again I'll checkout 2.6.25-rc2 ASIC. >>>>> >>>>> Thanks >>>>> Mark >>>>> - >>>> >>>> 2.6.25-rc2 does fix the problem I'm having. I don't suppose there is a >>>> patch >>>> lying around for 2.6.24.2?? >>>> >>> I attached a backport of the patch from Tony (added as cc) that is in >>> 2.6.25-rc2. Could you try it out against 2.6.24.2 just to make sure it >>> was this patch, then we can send it to stable. >>> >>> Mark Hounschell wrote: >>> >>> Sorry it took so long. 
This does fix my problem. I hope it's not to >>> late for 2.6.24.3 >>> >> Backport >> 76d78300a6eb8b7f08e47703b7e68a659ffc2053 >> to 2.6.24 > > Erm, I think you mean: > > commit 4d2de3a50ce19af2008a90636436a1bf5b3b697b > Author: Tony Battersby <tonyb@cybernetics.com> > Date: Tue Feb 5 10:36:10 2008 -0500 > > [SCSI] fix BUG when sum(scatterlist) > bufflen > > I can send it ... I thought the error was introduced post 2.6.24, but it > was actually in 2.6.24-rc1 > > James > I just cut and pasted from Mike's previous email. It would be great if this could get into the 2.6.24-stable tree. Thanks Mark > >> >From Tony Battersby: >> >> When sending a SCSI command to a tape drive via the SCSI Generic (sg) >> driver, if the command has a data transfer length more than >> scatter_elem_sz (32 KB default) and not a multiple of 512, then I either >> hit BUG_ON(!valid_dma_direction(direction)) in dma_unmap_sg() or else >> the command never completes (depending on the LLDD). >> >> When constructing scatterlists, the sg driver rounds up the scatterlist >> element sizes to be a multiple of 512. This can result in >> sum(scatterlist lengths) > bufflen. In this case, scsi_req_map_sg() >> incorrectly sets bio->bi_size to sum(scatterlist lengths) rather than to >> bufflen. When the command completes, req_bio_endio() detects that >> bio->bi_size != 0, and so it doesn't call bio_endio(). This causes the >> command to be resubmitted, resulting in BUG_ON or the command never >> completing. >> >> This patch makes scsi_req_map_sg() set bio->bi_size to bufflen rather >> than to sum(scatterlist lengths), which fixes the problem. 
>> >> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> >> >> --- linux-2.6.24.2/drivers/scsi/scsi_lib.c 2008-02-10 23:51:11.000000000 >> -0600 >> +++ linux-2.6.24.2.work/drivers/scsi/scsi_lib.c 2008-02-22 >> 16:20:09.000000000 -0600 >> @@ -298,7 +298,6 @@ static int scsi_req_map_sg(struct reques >> page = sg_page(sg); >> off = sg->offset; >> len = sg->length; >> - data_len += len; >> >> while (len > 0 && data_len > 0) { >> /* >> >> >> Did this ever get sent to the stable team? >> >> Regards >> Mark >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: New 2.6.24.2 SG_IO SCSI problems 2008-03-05 15:44 ` James Bottomley 2008-03-05 16:28 ` Mark Hounschell @ 2008-03-05 17:13 ` Mike Christie 1 sibling, 0 replies; 15+ messages in thread From: Mike Christie @ 2008-03-05 17:13 UTC (permalink / raw) To: James Bottomley Cc: Mark Hounschell, markh, linux-scsi, linux-kernel, Tony Battersby James Bottomley wrote: > On Wed, 2008-03-05 at 06:58 -0500, Mark Hounschell wrote: >> Mike Christie wrote: >>> Mark Hounschell wrote: >>>> Mark Hounschell wrote: >>>>> Mike Christie wrote: >>>>>> Mike Christie wrote: >>>>>>> Mark Hounschell wrote: >>>>>>>> I seem to have run into some sort of regression in the SG_IO >>>>>>>> interface of 2.6.24.2. I have an application that up until 2.6.24 >>>>>>>> worked fine. The 2.6.23.16 kernel works fine. >>>>>>>> >>>>>>>> During reads I get these kernel messages. Writes and other functions >>>>>>>> _seem_ OK. Actually basic >>>>>>>> reads are working. Its with large BC reads using an io_vec list that >>>>>>>> the problem shows up. >>>>>>>> >>>>>>> Are you doing SG_IO to the sg device (/dev/sg*) or to the block device >>>>>>> (/dev/sdX)? >>>>>> If you are doing SG_IO to the sg device, then I know of one regression >>>>>> (well not regression exactly, but I fixed a bug but the patch got >>>>>> partially overwritten by another patch and that caused a new bug). Both >>>>>> bugs are fixed in 2.6.25-rc2. Could you try that out if you are doing >>>>>> SG_IO to the sg device. >>>>>> >>>>> Yes, I'm using /dev/sg*. And yes again I'll checkout 2.6.25-rc2 ASIC. >>>>> >>>>> Thanks >>>>> Mark >>>>> - >>>> >>>> 2.6.25-rc2 does fix the problem I'm having. I don't suppose there is a >>>> patch >>>> lying around for 2.6.24.2?? >>>> >>> I attached a backport of the patch from Tony (added as cc) that is in >>> 2.6.25-rc2. Could you try it out against 2.6.24.2 just to make sure it >>> was this patch, then we can send it to stable. >>> >>> Mark Hounschell wrote: >>> >>> Sorry it took so long. 
This does fix my problem. I hope it's not to >>> late for 2.6.24.3 >>> >> Backport >> 76d78300a6eb8b7f08e47703b7e68a659ffc2053 >> to 2.6.24 > > Erm, I think you mean: You are right. > > commit 4d2de3a50ce19af2008a90636436a1bf5b3b697b > Author: Tony Battersby <tonyb@cybernetics.com> > Date: Tue Feb 5 10:36:10 2008 -0500 > > [SCSI] fix BUG when sum(scatterlist) > bufflen > > I can send it ... I thought the error was introduced post 2.6.24, but it > was actually in 2.6.24-rc1 > Ok thanks. ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2008-03-05 17:13 UTC | newest]

Thread overview: 15+ messages
2008-02-21 15:15 New 2.6.24.2 SG_IO SCSI problems Mark Hounschell
2008-02-21 15:41 ` James Bottomley
2008-02-21 16:21 ` Mark Hounschell
2008-02-22 10:03 ` Mark Hounschell
2008-02-22 16:50 ` Mike Christie
2008-02-22 16:59 ` Mike Christie
2008-02-22 17:56 ` Mark Hounschell
2008-02-22 21:38 ` Mark Hounschell
2008-02-22 22:25 ` Mike Christie
2008-02-22 22:48 ` Tony Battersby
2008-02-23 11:16 ` Mark Hounschell
2008-03-05 11:58 ` Mark Hounschell
2008-03-05 15:44 ` James Bottomley
2008-03-05 16:28 ` Mark Hounschell
2008-03-05 17:13 ` Mike Christie