linux-scsi.vger.kernel.org archive mirror
* Re: 2.6.23-rc4-mm1
       [not found] ` <20070910174926.GC30335@shadowen.org>
@ 2007-09-10 18:19   ` Andrew Morton
  2007-09-10 18:59     ` 2.6.23-rc4-mm1 Torsten Kaiser
                       ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Andrew Morton @ 2007-09-10 18:19 UTC
  To: Andy Whitcroft; +Cc: linux-kernel, mel, Jens Axboe, linux-scsi

On Mon, 10 Sep 2007 18:49:26 +0100 Andy Whitcroft <apw@shadowen.org> wrote:

> I have a couple of old NUMA-Q systems which are unable to read their
> boot disks with 2.6.23-rc4-mm1.  The disks appear to be recognised and
> even the partition tables read correctly, and then they go pop:
> 
>   qla1280: QLA1040 found on PCI bus 0, dev 10

cc's added.

>   Clocksource tsc unstable (delta = 99922590 ns)
>   Time: jiffies clocksource has been installed.
>   scsi(0:0): Resetting SCSI BUS
>   scsi0 : QLogic QLA1040 PCI to SCSI Host Adapter
>          Firmware version:  7.65.06, Driver version 3.26
>   scsi 0:0:0:0: Direct-Access     IBM      DGHS18X          0360 PQ: 0 ANSI: 3
>   scsi(0:0:0:0): Sync: period 10, offset 12, Wide
>   scsi 0:0:1:0: Direct-Access     IBM OEM  DCHS09X          5454 PQ: 0 ANSI: 2
>   scsi(0:0:1:0): Sync: period 10, offset 12, Wide
>   scsi 0:0:2:0: Direct-Access     IBM OEM  DCHS09X          5454 PQ: 0 ANSI: 2
>   scsi(0:0:2:0): Sync: period 10, offset 12, Wide
>   scsi 0:0:3:0: Direct-Access     IBM OEM  DCHS09X          5454 PQ: 0 ANSI: 2
>   scsi(0:0:3:0): Sync: period 10, offset 12, Wide
>   scsi 0:0:4:0: Direct-Access     IBM OEM  DCHS09X          5454 PQ: 0 ANSI: 2
>   scsi(0:0:4:0): Sync: period 10, offset 12, Wide
>   st: Version 20070203, fixed bufsize 32768, s/g segs 256
>   sd 0:0:0:0: [sda] 35843670 512-byte hardware sectors (18352 MB)
>   sd 0:0:0:0: [sda] Write Protect is off
>   sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA
>   sd 0:0:0:0: [sda] 35843670 512-byte hardware sectors (18352 MB)
>   sd 0:0:0:0: [sda] Write Protect is off
>   sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA
>    sda: sda1
>   sd 0:0:0:0: [sda] Attached SCSI disk
>   sd 0:0:1:0: [sdb] 17796077 512-byte hardware sectors (9112 MB)
>   sd 0:0:1:0: [sdb] Write Protect is off
>   sd 0:0:1:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA
>   sd 0:0:1:0: [sdb] 17796077 512-byte hardware sectors (9112 MB)
>   sd 0:0:1:0: [sdb] Write Protect is off
>   sd 0:0:1:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA
>    sdb: unknown partition table
>   sd 0:0:1:0: [sdb] Attached SCSI disk
>   sd 0:0:2:0: [sdc] 17796077 512-byte hardware sectors (9112 MB)
>   sd 0:0:2:0: [sdc] Write Protect is off
>   sd 0:0:2:0: [sdc] Write cache: disabled, read cache: enabled, supports DPO and FUA
>   sd 0:0:2:0: [sdc] 17796077 512-byte hardware sectors (9112 MB)
>   sd 0:0:2:0: [sdc] Write Protect is off
>   sd 0:0:2:0: [sdc] Write cache: disabled, read cache: enabled, supports DPO and FUA
>    sdc: sdc1
>   sd 0:0:2:0: [sdc] Attached SCSI disk
>   sd 0:0:3:0: [sdd] 17796077 512-byte hardware sectors (9112 MB)
>   sd 0:0:3:0: [sdd] Write Protect is off
>   sd 0:0:3:0: [sdd] Write cache: disabled, read cache: enabled, supports DPO and FUA
>   sd 0:0:3:0: [sdd] 17796077 512-byte hardware sectors (9112 MB)
>   sd 0:0:3:0: [sdd] Write Protect is off
>   sd 0:0:3:0: [sdd] Write cache: disabled, read cache: enabled, supports DPO and FUA
>    sdd: sdd1
>   sd 0:0:3:0: [sdd] Attached SCSI disk
>   sd 0:0:4:0: [sde] 17796077 512-byte hardware sectors (9112 MB)
>   sd 0:0:4:0: [sde] Write Protect is off
>   sd 0:0:4:0: [sde] Write cache: disabled, read cache: enabled, supports DPO and FUA
>   sd 0:0:4:0: [sde] 17796077 512-byte hardware sectors (9112 MB)
>   sd 0:0:4:0: [sde] Write Protect is off
>   sd 0:0:4:0: [sde] Write cache: disabled, read cache: enabled, supports DPO and FUA
>    sde: unknown partition table
>   sd 0:0:4:0: [sde] Attached SCSI disk
>   sd 0:0:0:0: Attached scsi generic sg0 type 0
>   sd 0:0:1:0: Attached scsi generic sg1 type 0
>   sd 0:0:2:0: Attached scsi generic sg2 type 0
>   sd 0:0:3:0: Attached scsi generic sg3 type 0
>   sd 0:0:4:0: Attached scsi generic sg4 type 0
>   serio: i8042 KBD port at 0x60,0x64 irq 1
>   serio: i8042 AUX port at 0x60,0x64 irq 12
>   mice: PS/2 mouse device common for all mice
>   input: AT Translated Set 2 keyboard as /class/input/input0
>   oprofile: using NMI interrupt.
>   TCP cubic registered
>   NET: Registered protocol family 1
>   NET: Registered protocol family 17
>   Using IPI Shortcut mode
>   input: PS/2 Logitech Mouse as /class/input/input1
>   RAMDISK: cramfs filesystem found at block 0
>   RAMDISK: Loading 1244KiB [1 disk] into ram disk... |/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/done.
>   VFS: Mounted root (cramfs filesystem) readonly.
>   Freeing unused kernel memory: 220k freed
>   initrd-tools: 0.1.81.1
>   mount: fs type devfs not supported by kernel
>   FATAL: Module sd_mod not found.
>   umount: devfs: not mounted
>   ext3: No journal on filesystem on sda1
>   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
>   end_request: I/O error, dev sda, sector 63
>   Buffer I/O error on device sda1, logical block 0
>   Buffer I/O error on device sda1, logical block 1
>   Buffer I/O error on device sda1, logical block 2
>   Buffer I/O error on device sda1, logical block 3
>   mount: fs type devfs not supported by kernel
>   ext3: No journal on filesystem on sda1
>   umount: devfs: not mounted
>   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
>   end_request: I/O error, dev sda, sector 28010831
>   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
>   end_request: I/O error, dev sda, sector 31080815
>   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
>   end_request: I/O error, dev sda, sector 31080855
>   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
>   end_request: I/O error, dev sda, sector 31080919
>   Buffer I/O error on device sda1, logical block 3885107
>   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
>   end_request: I/O error, dev sda, sector 28411047
>   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
>   end_request: I/O error, dev sda, sector 31135687
>   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
>   end_request: I/O error, dev sda, sector 31138007
>   sd 0:0:0:0: [sda] <6>sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> 

The only patch which touches qla1280 is git-block.patch.  From a quick
squizz the change looks OK, although it's tricky and something might have
broken.

(the dprintk at line 2929 needs to print remseg, not seg_cnt).

Can you retest with that change reverted (below)?  If it's not that then
perhaps something in scsi core broke, dunno.


diff -puN drivers/scsi/qla1280.c~revert-1 drivers/scsi/qla1280.c
--- a/drivers/scsi/qla1280.c~revert-1
+++ a/drivers/scsi/qla1280.c
@@ -2775,7 +2775,7 @@ qla1280_64bit_start_scsi(struct scsi_qla
 	struct device_reg __iomem *reg = ha->iobase;
 	struct scsi_cmnd *cmd = sp->cmd;
 	cmd_a64_entry_t *pkt;
-	struct scatterlist *sg = NULL, *s;
+	struct scatterlist *sg = NULL;
 	__le32 *dword_ptr;
 	dma_addr_t dma_handle;
 	int status = 0;
@@ -2889,16 +2889,13 @@ qla1280_64bit_start_scsi(struct scsi_qla
 	 * Load data segments.
 	 */
 	if (seg_cnt) {	/* If data transfer. */
-		int remseg = seg_cnt;
 		/* Setup packet address segment pointer. */
 		dword_ptr = (u32 *)&pkt->dseg_0_address;
 
 		if (cmd->use_sg) {	/* If scatter gather */
 			/* Load command entry data segments. */
-			for_each_sg(sg, s, seg_cnt, cnt) {
-				if (cnt == 2)
-					break;
-				dma_handle = sg_dma_address(s);
+			for (cnt = 0; cnt < 2 && seg_cnt; cnt++, seg_cnt--) {
+				dma_handle = sg_dma_address(sg);
 #if defined(CONFIG_IA64_GENERIC) || defined(CONFIG_IA64_SGI_SN2)
 				if (ha->flags.use_pci_vchannel)
 					sn_pci_set_vchan(ha->pdev,
@@ -2909,12 +2906,12 @@ qla1280_64bit_start_scsi(struct scsi_qla
 					cpu_to_le32(pci_dma_lo32(dma_handle));
 				*dword_ptr++ =
 					cpu_to_le32(pci_dma_hi32(dma_handle));
-				*dword_ptr++ = cpu_to_le32(sg_dma_len(s));
+				*dword_ptr++ = cpu_to_le32(sg_dma_len(sg));
+				sg++;
 				dprintk(3, "S/G Segment phys_addr=%x %x, len=0x%x\n",
 					cpu_to_le32(pci_dma_hi32(dma_handle)),
 					cpu_to_le32(pci_dma_lo32(dma_handle)),
-					cpu_to_le32(sg_dma_len(sg_next(s))));
-				remseg--;
+					cpu_to_le32(sg_dma_len(sg)));
 			}
 			dprintk(5, "qla1280_64bit_start_scsi: Scatter/gather "
 				"command packet data - b %i, t %i, l %i \n",
@@ -2929,9 +2926,7 @@ qla1280_64bit_start_scsi(struct scsi_qla
 			dprintk(3, "S/G Building Continuation...seg_cnt=0x%x "
 				"remains\n", seg_cnt);
 
-			while (remseg > 0) {
-				/* Update sg start */
-				sg = s;
+			while (seg_cnt > 0) {
 				/* Adjust ring index. */
 				ha->req_ring_index++;
 				if (ha->req_ring_index == REQUEST_ENTRY_CNT) {
@@ -2957,10 +2952,9 @@ qla1280_64bit_start_scsi(struct scsi_qla
 					(u32 *)&((struct cont_a64_entry *) pkt)->dseg_0_address;
 
 				/* Load continuation entry data segments. */
-				for_each_sg(sg, s, remseg, cnt) {
-					if (cnt == 5)
-						break;
-					dma_handle = sg_dma_address(s);
+				for (cnt = 0; cnt < 5 && seg_cnt;
+				     cnt++, seg_cnt--) {
+					dma_handle = sg_dma_address(sg);
 #if defined(CONFIG_IA64_GENERIC) || defined(CONFIG_IA64_SGI_SN2)
 				if (ha->flags.use_pci_vchannel)
 					sn_pci_set_vchan(ha->pdev, 
@@ -2972,12 +2966,12 @@ qla1280_64bit_start_scsi(struct scsi_qla
 					*dword_ptr++ =
 						cpu_to_le32(pci_dma_hi32(dma_handle));
 					*dword_ptr++ =
-						cpu_to_le32(sg_dma_len(s));
+						cpu_to_le32(sg_dma_len(sg));
 					dprintk(3, "S/G Segment Cont. phys_addr=%x %x, len=0x%x\n",
 						cpu_to_le32(pci_dma_hi32(dma_handle)),
 						cpu_to_le32(pci_dma_lo32(dma_handle)),
-						cpu_to_le32(sg_dma_len(s)));
-					remseg--;
+						cpu_to_le32(sg_dma_len(sg)));
+					sg++;
 				}
 				dprintk(5, "qla1280_64bit_start_scsi: "
 					"continuation packet data - b %i, t "
@@ -3068,7 +3062,7 @@ qla1280_32bit_start_scsi(struct scsi_qla
 	struct device_reg __iomem *reg = ha->iobase;
 	struct scsi_cmnd *cmd = sp->cmd;
 	struct cmd_entry *pkt;
-	struct scatterlist *sg = NULL, *s;
+	struct scatterlist *sg = NULL;
 	__le32 *dword_ptr;
 	int status = 0;
 	int cnt;
@@ -3194,7 +3188,6 @@ qla1280_32bit_start_scsi(struct scsi_qla
 	 * Load data segments.
 	 */
 	if (seg_cnt) {
-		int remseg = seg_cnt;
 		/* Setup packet address segment pointer. */
 		dword_ptr = &pkt->dseg_0_address;
 
@@ -3203,25 +3196,22 @@ qla1280_32bit_start_scsi(struct scsi_qla
 			qla1280_dump_buffer(1, (char *)sg, 4 * 16);
 
 			/* Load command entry data segments. */
-			for_each_sg(sg, s, seg_cnt, cnt) {
-				if (cnt == 4)
-					break;
+			for (cnt = 0; cnt < 4 && seg_cnt; cnt++, seg_cnt--) {
 				*dword_ptr++ =
-					cpu_to_le32(pci_dma_lo32(sg_dma_address(s)));
-				*dword_ptr++ = cpu_to_le32(sg_dma_len(s));
+					cpu_to_le32(pci_dma_lo32(sg_dma_address(sg)));
+				*dword_ptr++ =
+					cpu_to_le32(sg_dma_len(sg));
 				dprintk(3, "S/G Segment phys_addr=0x%lx, len=0x%x\n",
-					(pci_dma_lo32(sg_dma_address(s))),
-					(sg_dma_len(s)));
-				remseg--;
+					(pci_dma_lo32(sg_dma_address(sg))),
+					(sg_dma_len(sg)));
+				sg++;
 			}
 			/*
 			 * Build continuation packets.
 			 */
 			dprintk(3, "S/G Building Continuation"
 				"...seg_cnt=0x%x remains\n", seg_cnt);
-			while (remseg > 0) {
-				/* Continue from end point */
-				sg = s;
+			while (seg_cnt > 0) {
 				/* Adjust ring index. */
 				ha->req_ring_index++;
 				if (ha->req_ring_index == REQUEST_ENTRY_CNT) {
@@ -3249,16 +3239,18 @@ qla1280_32bit_start_scsi(struct scsi_qla
 					&((struct cont_entry *) pkt)->dseg_0_address;
 
 				/* Load continuation entry data segments. */
-				for_each_sg(sg, s, remseg, cnt) {
+				for (cnt = 0; cnt < 7 && seg_cnt;
+				     cnt++, seg_cnt--) {
 					*dword_ptr++ =
-						cpu_to_le32(pci_dma_lo32(sg_dma_address(s)));
+						cpu_to_le32(pci_dma_lo32(sg_dma_address(sg)));
 					*dword_ptr++ =
-						cpu_to_le32(sg_dma_len(s));
+						cpu_to_le32(sg_dma_len(sg));
 					dprintk(1,
 						"S/G Segment Cont. phys_addr=0x%x, "
 						"len=0x%x\n",
-						cpu_to_le32(pci_dma_lo32(sg_dma_address(s))),
-						cpu_to_le32(sg_dma_len(s)));
+						cpu_to_le32(pci_dma_lo32(sg_dma_address(sg))),
+						cpu_to_le32(sg_dma_len(sg)));
+					sg++;
 				}
 				dprintk(5, "qla1280_32bit_start_scsi: "
 					"continuation packet data - "
_
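
The loop bounds in the reverted code above give the packing limits. In the
64-bit path the command entry carries up to 2 data segments and each
continuation entry up to 5. In the 32-bit path the limits are 4 and 7. A
minimal sketch of the resulting arithmetic, assuming nothing beyond those
four constants (the helper name cont_entries_needed is hypothetical and is
not part of the driver):

```c
#include <assert.h>

/*
 * Hypothetical helper, not from qla1280.c: number of continuation
 * entries needed for seg_cnt scatter/gather segments when the command
 * entry holds in_cmd segments and each continuation entry holds
 * per_cont segments.
 */
static int cont_entries_needed(int seg_cnt, int in_cmd, int per_cont)
{
	assert(seg_cnt >= 0 && in_cmd > 0 && per_cont > 0);

	if (seg_cnt <= in_cmd)
		return 0;	/* everything fits in the command entry */

	/* Round up: a partly filled continuation entry still costs one. */
	return (seg_cnt - in_cmd + per_cont - 1) / per_cont;
}
```

For example, 12 segments on the 32-bit path come out as 2 continuation
entries: 4 segments in the command entry, 7 in the first continuation and 1
in the second.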



* Re: 2.6.23-rc4-mm1
  2007-09-10 18:19   ` 2.6.23-rc4-mm1 Andrew Morton
@ 2007-09-10 18:59     ` Torsten Kaiser
  2007-09-10 19:20       ` 2.6.23-rc4-mm1 Andrew Morton
  2007-09-10 19:10     ` 2.6.23-rc4-mm1 FUJITA Tomonori
  2007-09-10 19:31     ` 2.6.23-rc4-mm1 FUJITA Tomonori
  2 siblings, 1 reply; 20+ messages in thread
From: Torsten Kaiser @ 2007-09-10 18:59 UTC
  To: Andrew Morton; +Cc: Andy Whitcroft, linux-kernel, mel, Jens Axboe, linux-scsi

On 9/10/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Mon, 10 Sep 2007 18:49:26 +0100 Andy Whitcroft <apw@shadowen.org> wrote:
>
> > I have a couple of old NUMA-Q systems which are unable to read their
> > boot disks with 2.6.23-rc4-mm1.  The disks appear to be recognised and
> > even the partition tables read correctly, and then they go pop:

I reported a similar problem on Sep 1, but have had no response so far.
The system boots, reads the partition tables, starts the RAID and then
kicks one drive out because of errors.

> >   qla1280: QLA1040 found on PCI bus 0, dev 10
> >   Clocksource tsc unstable (delta = 99922590 ns)
> >   Time: jiffies clocksource has been installed.
> >   scsi(0:0): Resetting SCSI BUS
> >   scsi0 : QLogic QLA1040 PCI to SCSI Host Adapter
> >          Firmware version:  7.65.06, Driver version 3.26
> >   scsi 0:0:0:0: Direct-Access     IBM      DGHS18X          0360 PQ: 0 ANSI: 3
> >   scsi(0:0:0:0): Sync: period 10, offset 12, Wide
> >   scsi 0:0:1:0: Direct-Access     IBM OEM  DCHS09X          5454 PQ: 0 ANSI: 2
> >   scsi(0:0:1:0): Sync: period 10, offset 12, Wide
> >   scsi 0:0:2:0: Direct-Access     IBM OEM  DCHS09X          5454 PQ: 0 ANSI: 2
> >   scsi(0:0:2:0): Sync: period 10, offset 12, Wide
> >   scsi 0:0:3:0: Direct-Access     IBM OEM  DCHS09X          5454 PQ: 0 ANSI: 2
> >   scsi(0:0:3:0): Sync: period 10, offset 12, Wide
> >   scsi 0:0:4:0: Direct-Access     IBM OEM  DCHS09X          5454 PQ: 0 ANSI: 2
> >   scsi(0:0:4:0): Sync: period 10, offset 12, Wide
> >   st: Version 20070203, fixed bufsize 32768, s/g segs 256
> >   sd 0:0:0:0: [sda] 35843670 512-byte hardware sectors (18352 MB)
> >   sd 0:0:0:0: [sda] Write Protect is off
> >   sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >   sd 0:0:0:0: [sda] 35843670 512-byte hardware sectors (18352 MB)
> >   sd 0:0:0:0: [sda] Write Protect is off
> >   sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >    sda: sda1
[snip]
> >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> >   end_request: I/O error, dev sda, sector 63
> >   Buffer I/O error on device sda1, logical block 0
> >   Buffer I/O error on device sda1, logical block 1
> >   Buffer I/O error on device sda1, logical block 2
> >   Buffer I/O error on device sda1, logical block 3
> >   mount: fs type devfs not supported by kernel
> >   ext3: No journal on filesystem on sda1
> >   umount: devfs: not mounted
> >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> >   end_request: I/O error, dev sda, sector 28010831
> >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> >   end_request: I/O error, dev sda, sector 31080815

From my log:
[    3.890000] scsi0 : sata_sil24
[    3.900000] scsi1 : sata_sil24
[    3.900000] ata1: SATA max UDMA/100 host m128@0xefeffc00 port 0xefef8000 irq 16
[    3.920000] ata2: SATA max UDMA/100 host m128@0xefeffc00 port 0xefefa000 irq 16
[    4.300000] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    4.360000] ata1.00: ATA-7: MAXTOR STM3320820AS, 3.AAE, max UDMA/133
[    4.370000] ata1.00: 625142448 sectors, multi 0: LBA48 NCQ (depth 31/32)
[    4.430000] ata1.00: configured for UDMA/100
[    4.500000] ieee1394: Node added: ID:BUS[0-00:1023]  GUID[0010dc00005cc354]
[    4.500000] ieee1394: Host added: ID:BUS[0-01:1023]  GUID[0011d80000c4c261]
[    4.790000] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    4.850000] ata2.00: ATA-7: MAXTOR STM3320820AS, 3.AAE, max UDMA/133
[    4.860000] ata2.00: 625142448 sectors, multi 0: LBA48 NCQ (depth 31/32)
[    4.920000] ata2.00: configured for UDMA/100
[    4.930000] scsi 0:0:0:0: Direct-Access     ATA      MAXTOR STM332082 3.AA PQ: 0 ANSI: 5
[    4.960000] sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
[    4.980000] sd 0:0:0:0: [sda] Write Protect is off
[    4.990000] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    4.990000] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    5.020000] sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
[    5.040000] sd 0:0:0:0: [sda] Write Protect is off
[    5.050000] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    5.050000] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    5.080000]  sda: sda1 sda2
[    5.110000] sd 0:0:0:0: [sda] Attached SCSI disk
[    5.120000] scsi 1:0:0:0: Direct-Access     ATA      MAXTOR STM332082 3.AA PQ: 0 ANSI: 5
[    5.140000] sd 1:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
[    5.170000] sd 1:0:0:0: [sdb] Write Protect is off
[    5.180000] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[    5.180000] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    5.210000] sd 1:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
[    5.230000] sd 1:0:0:0: [sdb] Write Protect is off
[    5.240000] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[    5.240000] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    5.270000]  sdb: sdb1 sdb2
[    5.300000] sd 1:0:0:0: [sdb] Attached SCSI disk
[more normal boot messages, 3-disk RAID5 starts]
[   63.420000] ata2.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[   63.420000] ata2.00: cmd 61/08:00:09:d6:42/00:00:25:00:00/40 tag 0 cdb 0x0 data 4096 out
[   63.420000]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[   63.420000] ata2.00: status: {DRDY }
[   63.420000] ata2: hard resetting link
[   65.720000] ata2: softreset failed (port not ready)
[   65.720000] ata2: reset failed (errno=-5), retrying in 8 secs
[   73.420000] ata2: hard resetting link
[   75.720000] ata2: softreset failed (port not ready)
[   75.720000] ata2: reset failed (errno=-5), retrying in 8 secs
[   83.420000] ata2: hard resetting link
[   85.720000] ata2: softreset failed (port not ready)
[   85.720000] ata2: reset failed (errno=-5), retrying in 33 secs
[snip, disk gets kicked]
[  120.780000] sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
[  120.780000] end_request: I/O error, dev sdb, sector 19550927
[  120.780000] sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
[  120.780000] end_request: I/O error, dev sdb, sector 19550935
[  120.780000] sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
[  120.780000] end_request: I/O error, dev sdb, sector 19550943
[  120.780000] sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK

More error messages of the same kind are in my earlier LKML mail.

After sdb was removed from the array, the system worked normally with
only two drives.
But on the next boot it kicked the second sata_sil24 disk from the
array, killing it.

Torsten


* Re: 2.6.23-rc4-mm1
  2007-09-10 18:19   ` 2.6.23-rc4-mm1 Andrew Morton
  2007-09-10 18:59     ` 2.6.23-rc4-mm1 Torsten Kaiser
@ 2007-09-10 19:10     ` FUJITA Tomonori
  2007-09-13 17:34       ` 2.6.23-rc4-mm1 Andy Whitcroft
  2007-09-15  4:16       ` 2.6.23-rc4-mm1 Paul Jackson
  2007-09-10 19:31     ` 2.6.23-rc4-mm1 FUJITA Tomonori
  2 siblings, 2 replies; 20+ messages in thread
From: FUJITA Tomonori @ 2007-09-10 19:10 UTC
  To: apw, akpm; +Cc: linux-kernel, mel, jens.axboe, linux-scsi, fujita.tomonori

On Mon, 10 Sep 2007 11:19:26 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Mon, 10 Sep 2007 18:49:26 +0100 Andy Whitcroft <apw@shadowen.org> wrote:
> 
> > I have a couple of old NUMA-Q systems which are unable to read their
> > boot disks with 2.6.23-rc4-mm1.  The disks appear to be recognised and
> > even the partition tables read correctly, and then they go pop:
> > 
> >   qla1280: QLA1040 found on PCI bus 0, dev 10
> 
> cc's added.
> 
> >   Clocksource tsc unstable (delta = 99922590 ns)
> >   Time: jiffies clocksource has been installed.
> >   scsi(0:0): Resetting SCSI BUS
> >   scsi0 : QLogic QLA1040 PCI to SCSI Host Adapter
> >          Firmware version:  7.65.06, Driver version 3.26
> >   scsi 0:0:0:0: Direct-Access     IBM      DGHS18X          0360 PQ: 0 ANSI: 3
> >   scsi(0:0:0:0): Sync: period 10, offset 12, Wide
> >   scsi 0:0:1:0: Direct-Access     IBM OEM  DCHS09X          5454 PQ: 0 ANSI: 2
> >   scsi(0:0:1:0): Sync: period 10, offset 12, Wide
> >   scsi 0:0:2:0: Direct-Access     IBM OEM  DCHS09X          5454 PQ: 0 ANSI: 2
> >   scsi(0:0:2:0): Sync: period 10, offset 12, Wide
> >   scsi 0:0:3:0: Direct-Access     IBM OEM  DCHS09X          5454 PQ: 0 ANSI: 2
> >   scsi(0:0:3:0): Sync: period 10, offset 12, Wide
> >   scsi 0:0:4:0: Direct-Access     IBM OEM  DCHS09X          5454 PQ: 0 ANSI: 2
> >   scsi(0:0:4:0): Sync: period 10, offset 12, Wide
> >   st: Version 20070203, fixed bufsize 32768, s/g segs 256
> >   sd 0:0:0:0: [sda] 35843670 512-byte hardware sectors (18352 MB)
> >   sd 0:0:0:0: [sda] Write Protect is off
> >   sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >   sd 0:0:0:0: [sda] 35843670 512-byte hardware sectors (18352 MB)
> >   sd 0:0:0:0: [sda] Write Protect is off
> >   sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >    sda: sda1
> >   sd 0:0:0:0: [sda] Attached SCSI disk
> >   sd 0:0:1:0: [sdb] 17796077 512-byte hardware sectors (9112 MB)
> >   sd 0:0:1:0: [sdb] Write Protect is off
> >   sd 0:0:1:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >   sd 0:0:1:0: [sdb] 17796077 512-byte hardware sectors (9112 MB)
> >   sd 0:0:1:0: [sdb] Write Protect is off
> >   sd 0:0:1:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >    sdb: unknown partition table
> >   sd 0:0:1:0: [sdb] Attached SCSI disk
> >   sd 0:0:2:0: [sdc] 17796077 512-byte hardware sectors (9112 MB)
> >   sd 0:0:2:0: [sdc] Write Protect is off
> >   sd 0:0:2:0: [sdc] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >   sd 0:0:2:0: [sdc] 17796077 512-byte hardware sectors (9112 MB)
> >   sd 0:0:2:0: [sdc] Write Protect is off
> >   sd 0:0:2:0: [sdc] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >    sdc: sdc1
> >   sd 0:0:2:0: [sdc] Attached SCSI disk
> >   sd 0:0:3:0: [sdd] 17796077 512-byte hardware sectors (9112 MB)
> >   sd 0:0:3:0: [sdd] Write Protect is off
> >   sd 0:0:3:0: [sdd] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >   sd 0:0:3:0: [sdd] 17796077 512-byte hardware sectors (9112 MB)
> >   sd 0:0:3:0: [sdd] Write Protect is off
> >   sd 0:0:3:0: [sdd] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >    sdd: sdd1
> >   sd 0:0:3:0: [sdd] Attached SCSI disk
> >   sd 0:0:4:0: [sde] 17796077 512-byte hardware sectors (9112 MB)
> >   sd 0:0:4:0: [sde] Write Protect is off
> >   sd 0:0:4:0: [sde] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >   sd 0:0:4:0: [sde] 17796077 512-byte hardware sectors (9112 MB)
> >   sd 0:0:4:0: [sde] Write Protect is off
> >   sd 0:0:4:0: [sde] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >    sde: unknown partition table
> >   sd 0:0:4:0: [sde] Attached SCSI disk
> >   sd 0:0:0:0: Attached scsi generic sg0 type 0
> >   sd 0:0:1:0: Attached scsi generic sg1 type 0
> >   sd 0:0:2:0: Attached scsi generic sg2 type 0
> >   sd 0:0:3:0: Attached scsi generic sg3 type 0
> >   sd 0:0:4:0: Attached scsi generic sg4 type 0
> >   serio: i8042 KBD port at 0x60,0x64 irq 1
> >   serio: i8042 AUX port at 0x60,0x64 irq 12
> >   mice: PS/2 mouse device common for all mice
> >   input: AT Translated Set 2 keyboard as /class/input/input0
> >   oprofile: using NMI interrupt.
> >   TCP cubic registered
> >   NET: Registered protocol family 1
> >   NET: Registered protocol family 17
> >   Using IPI Shortcut mode
> >   input: PS/2 Logitech Mouse as /class/input/input1
> >   RAMDISK: cramfs filesystem found at block 0
> >   RAMDISK: Loading 1244KiB [1 disk] into ram disk... |/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/done.
> >   VFS: Mounted root (cramfs filesystem) readonly.
> >   Freeing unused kernel memory: 220k freed
> >   initrd-tools: 0.1.81.1
> >   mount: fs type devfs not supported by kernel
> >   FATAL: Module sd_mod not found.
> >   umount: devfs: not mounted
> >   ext3: No journal on filesystem on sda1
> >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> >   end_request: I/O error, dev sda, sector 63
> >   Buffer I/O error on device sda1, logical block 0
> >   Buffer I/O error on device sda1, logical block 1
> >   Buffer I/O error on device sda1, logical block 2
> >   Buffer I/O error on device sda1, logical block 3
> >   mount: fs type devfs not supported by kernel
> >   ext3: No journal on filesystem on sda1
> >   umount: devfs: not mounted
> >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> >   end_request: I/O error, dev sda, sector 28010831
> >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> >   end_request: I/O error, dev sda, sector 31080815
> >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> >   end_request: I/O error, dev sda, sector 31080855
> >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> >   end_request: I/O error, dev sda, sector 31080919
> >   Buffer I/O error on device sda1, logical block 3885107
> >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> >   end_request: I/O error, dev sda, sector 28411047
> >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> >   end_request: I/O error, dev sda, sector 31135687
> >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> >   end_request: I/O error, dev sda, sector 31138007
> >   sd 0:0:0:0: [sda] <6>sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> > 
> 
> The only patch which touches qla1280 is git-block.patch.  From a quick
> squizz the change looks OK, although it's tricky and something might have
> broken.

Can you try this patch (against 2.6.23-rc4-mm1)?

From 592bd2049cb3e6e1f1dde7cf631879f26ddffeaa Mon Sep 17 00:00:00 2001
From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Date: Mon, 10 Sep 2007 04:17:13 +0100
Subject: [PATCH] qla1280: sg chaining fixes

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
 drivers/scsi/qla1280.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/qla1280.c b/drivers/scsi/qla1280.c
index bd805ec..7c1eaec 100644
--- a/drivers/scsi/qla1280.c
+++ b/drivers/scsi/qla1280.c
@@ -2977,8 +2977,8 @@ qla1280_64bit_start_scsi(struct scsi_qla_host *ha, struct srb * sp)
 						cpu_to_le32(pci_dma_hi32(dma_handle)),
 						cpu_to_le32(pci_dma_lo32(dma_handle)),
 						cpu_to_le32(sg_dma_len(s)));
-					remseg--;
 				}
+				remseg -= cnt;
 				dprintk(5, "qla1280_64bit_start_scsi: "
 					"continuation packet data - b %i, t "
 					"%i, l %i \n", SCSI_BUS_32(cmd),
@@ -3250,6 +3250,8 @@ qla1280_32bit_start_scsi(struct scsi_qla_host *ha, struct srb * sp)
 
 				/* Load continuation entry data segments. */
 				for_each_sg(sg, s, remseg, cnt) {
+					if (cnt == 7)
+						break;
 					*dword_ptr++ =
 						cpu_to_le32(pci_dma_lo32(sg_dma_address(s)));
 					*dword_ptr++ =
@@ -3260,6 +3262,7 @@ qla1280_32bit_start_scsi(struct scsi_qla_host *ha, struct srb * sp)
 						cpu_to_le32(pci_dma_lo32(sg_dma_address(s))),
 						cpu_to_le32(sg_dma_len(s)));
 				}
+				remseg -= cnt;
 				dprintk(5, "qla1280_32bit_start_scsi: "
 					"continuation packet data - "
 					"scsi(%i:%i:%i)\n", SCSI_BUS_32(cmd),
-- 
1.5.2.4
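
One reading of why the patch moves the decrement out of the loop body:
for_each_sg() re-evaluates its third argument as the loop bound on every
iteration, so decrementing remseg inside the body shrinks the bound while
cnt grows, and the loop stops early. The sketch below models only that
counting behaviour with plain integers. fill_buggy and fill_fixed are
hypothetical names, and the for loops imitate the index part of
for_each_sg(), not the scatterlist walk.

```c
#include <assert.h>

/*
 * Models the pre-patch 64-bit continuation loop: the loop bound
 * (*remseg) is decremented inside the body.
 * Returns the number of segments loaded into one entry.
 */
static int fill_buggy(int *remseg, int per_entry)
{
	int cnt, loaded = 0;

	assert(remseg && *remseg >= 0 && per_entry > 0);
	for (cnt = 0; cnt < *remseg; cnt++) {
		if (cnt == per_entry)
			break;
		loaded++;
		(*remseg)--;	/* shrinks the bound while cnt grows */
	}
	return loaded;
}

/*
 * Models the patched loop: the bound stays fixed during the loop and
 * is reduced once afterwards, as "remseg -= cnt" does in the patch.
 */
static int fill_fixed(int *remseg, int per_entry)
{
	int cnt;

	assert(remseg && *remseg >= 0 && per_entry > 0);
	for (cnt = 0; cnt < *remseg; cnt++) {
		if (cnt == per_entry)
			break;
	}
	*remseg -= cnt;
	return cnt;
}
```

With remseg = 5 and a limit of 5 per entry, the first version loads only 3
segments and leaves remseg at 2, while the second loads all 5 and leaves
remseg at 0. Whether this under-filling alone explains the I/O errors is
not established here.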




* Re: 2.6.23-rc4-mm1
  2007-09-10 18:59     ` 2.6.23-rc4-mm1 Torsten Kaiser
@ 2007-09-10 19:20       ` Andrew Morton
  2007-09-10 19:38         ` 2.6.23-rc4-mm1 Torsten Kaiser
  2007-09-10 19:42         ` 2.6.23-rc4-mm1 FUJITA Tomonori
  0 siblings, 2 replies; 20+ messages in thread
From: Andrew Morton @ 2007-09-10 19:20 UTC
  To: Torsten Kaiser
  Cc: Andy Whitcroft, linux-kernel, mel, Jens Axboe, linux-scsi,
	linux-ide

On Mon, 10 Sep 2007 20:59:49 +0200 "Torsten Kaiser" <just.for.lkml@googlemail.com> wrote:

> On 9/10/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> > On Mon, 10 Sep 2007 18:49:26 +0100 Andy Whitcroft <apw@shadowen.org> wrote:
> >
> > > I have a couple of old NUMA-Q systems which are unable to read their
> > > boot disks with 2.6.23-rc4-mm1.  The disks appear to be recognised and
> > > even the partition tables read correctly, and then they go pop:
> 
> I reported a similar problem on Sep 1, but until now got no response.

You still haven't had a response ;)  Let's add a cc.

Oh, you reported it against 2.6.23-rc4-mm1
(http://lkml.org/lkml/2007/9/1/92) and I did cc linux-ide in my response.

I'll continue to point out where this sort of thing occurs because last
week I was told that one reason so many bug reports are ignored is that
"linux-kernel has too much traffic".

> The system boots, reads the partition tables, starts the RAID and then
> kicks one drive out because of errors.

Andy is using qla1280.  You're using sata.  So it's probably a different
bug, with the same symptoms.
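
The two logs do report different host bytes: the qla1280 log shows
hostbyte=0x07 and the SATA log shows DID_BAD_TARGET. A small sketch for
decoding the numeric form, assuming the host-byte values listed in
include/scsi/scsi.h for kernels of this era and assuming the host byte
occupies bits 16-23 of the SCSI result word. host_byte_of and
host_byte_name are hypothetical helper names, not kernel functions.

```c
#include <stddef.h>

/* Host-byte names indexed by value (assumed from include/scsi/scsi.h). */
static const char *const host_byte_names[] = {
	"DID_OK",		/* 0x00 */
	"DID_NO_CONNECT",	/* 0x01 */
	"DID_BUS_BUSY",		/* 0x02 */
	"DID_TIME_OUT",		/* 0x03 */
	"DID_BAD_TARGET",	/* 0x04 */
	"DID_ABORT",		/* 0x05 */
	"DID_PARITY",		/* 0x06 */
	"DID_ERROR",		/* 0x07 */
	"DID_RESET",		/* 0x08 */
	"DID_BAD_INTR",		/* 0x09 */
	"DID_PASSTHROUGH",	/* 0x0a */
	"DID_SOFT_ERROR",	/* 0x0b */
};

/* Extract the host byte from a 32-bit result word (bits 16-23). */
static unsigned int host_byte_of(unsigned int result)
{
	return (result >> 16) & 0xff;
}

/* Map a host byte to its symbolic name, or "unknown" if out of range. */
static const char *host_byte_name(unsigned int hb)
{
	size_t n = sizeof(host_byte_names) / sizeof(host_byte_names[0]);

	return hb < n ? host_byte_names[hb] : "unknown";
}
```

Under these assumptions 0x07 decodes to DID_ERROR, which differs from the
DID_BAD_TARGET (0x04) reported for the sata_sil24 disk.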

> > >   qla1280: QLA1040 found on PCI bus 0, dev 10
> > >   Clocksource tsc unstable (delta = 99922590 ns)
> > >   Time: jiffies clocksource has been installed.
> > >   scsi(0:0): Resetting SCSI BUS
> > >   scsi0 : QLogic QLA1040 PCI to SCSI Host Adapter
> > >          Firmware version:  7.65.06, Driver version 3.26
> > >   scsi 0:0:0:0: Direct-Access     IBM      DGHS18X          0360 PQ: 0 ANSI: 3
> > >   scsi(0:0:0:0): Sync: period 10, offset 12, Wide
> > >   scsi 0:0:1:0: Direct-Access     IBM OEM  DCHS09X          5454 PQ: 0 ANSI: 2
> > >   scsi(0:0:1:0): Sync: period 10, offset 12, Wide
> > >   scsi 0:0:2:0: Direct-Access     IBM OEM  DCHS09X          5454 PQ: 0 ANSI: 2
> > >   scsi(0:0:2:0): Sync: period 10, offset 12, Wide
> > >   scsi 0:0:3:0: Direct-Access     IBM OEM  DCHS09X          5454 PQ: 0 ANSI: 2
> > >   scsi(0:0:3:0): Sync: period 10, offset 12, Wide
> > >   scsi 0:0:4:0: Direct-Access     IBM OEM  DCHS09X          5454 PQ: 0 ANSI: 2
> > >   scsi(0:0:4:0): Sync: period 10, offset 12, Wide
> > >   st: Version 20070203, fixed bufsize 32768, s/g segs 256
> > >   sd 0:0:0:0: [sda] 35843670 512-byte hardware sectors (18352 MB)
> > >   sd 0:0:0:0: [sda] Write Protect is off
> > >   sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA
> > >   sd 0:0:0:0: [sda] 35843670 512-byte hardware sectors (18352 MB)
> > >   sd 0:0:0:0: [sda] Write Protect is off
> > >   sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA
> > >    sda: sda1
> [snip]
> > >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> > >   end_request: I/O error, dev sda, sector 63
> > >   Buffer I/O error on device sda1, logical block 0
> > >   Buffer I/O error on device sda1, logical block 1
> > >   Buffer I/O error on device sda1, logical block 2
> > >   Buffer I/O error on device sda1, logical block 3
> > >   mount: fs type devfs not supported by kernel
> > >   ext3: No journal on filesystem on sda1
> > >   umount: devfs: not mounted
> > >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> > >   end_request: I/O error, dev sda, sector 28010831
> > >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> > >   end_request: I/O error, dev sda, sector 31080815
> 
> From my log:
> [    3.890000] scsi0 : sata_sil24
> [    3.900000] scsi1 : sata_sil24
> [    3.900000] ata1: SATA max UDMA/100 host m128@0xefeffc00 port
> 0xefef8000 irq 16
> [    3.920000] ata2: SATA max UDMA/100 host m128@0xefeffc00 port
> 0xefefa000 irq 16
> [    4.300000] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [    4.360000] ata1.00: ATA-7: MAXTOR STM3320820AS, 3.AAE, max UDMA/133
> [    4.370000] ata1.00: 625142448 sectors, multi 0: LBA48 NCQ (depth 31/32)
> [    4.430000] ata1.00: configured for UDMA/100
> [    4.500000] ieee1394: Node added: ID:BUS[0-00:1023]  GUID[0010dc00005cc354]
> [    4.500000] ieee1394: Host added: ID:BUS[0-01:1023]  GUID[0011d80000c4c261]
> [    4.790000] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [    4.850000] ata2.00: ATA-7: MAXTOR STM3320820AS, 3.AAE, max UDMA/133
> [    4.860000] ata2.00: 625142448 sectors, multi 0: LBA48 NCQ (depth 31/32)
> [    4.920000] ata2.00: configured for UDMA/100
> [    4.930000] scsi 0:0:0:0: Direct-Access     ATA      MAXTOR
> STM332082 3.AA PQ: 0 ANSI: 5
> [    4.960000] sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
> [    4.980000] sd 0:0:0:0: [sda] Write Protect is off
> [    4.990000] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> [    4.990000] sd 0:0:0:0: [sda] Write cache: enabled, read cache:
> enabled, doesn't support DPO or FUA
> [    5.020000] sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
> [    5.040000] sd 0:0:0:0: [sda] Write Protect is off
> [    5.050000] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> [    5.050000] sd 0:0:0:0: [sda] Write cache: enabled, read cache:
> enabled, doesn't support DPO or FUA
> [    5.080000]  sda: sda1 sda2
> [    5.110000] sd 0:0:0:0: [sda] Attached SCSI disk
> [    5.120000] scsi 1:0:0:0: Direct-Access     ATA      MAXTOR
> STM332082 3.AA PQ: 0 ANSI: 5
> [    5.140000] sd 1:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
> [    5.170000] sd 1:0:0:0: [sdb] Write Protect is off
> [    5.180000] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> [    5.180000] sd 1:0:0:0: [sdb] Write cache: enabled, read cache:
> enabled, doesn't support DPO or FUA
> [    5.210000] sd 1:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
> [    5.230000] sd 1:0:0:0: [sdb] Write Protect is off
> [    5.240000] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> [    5.240000] sd 1:0:0:0: [sdb] Write cache: enabled, read cache:
> enabled, doesn't support DPO or FUA
> [    5.270000]  sdb: sdb1 sdb2
> [    5.300000] sd 1:0:0:0: [sdb] Attached SCSI disk
> [more normal boot messages, 3-disk RAID5 starts]
> [   63.420000] ata2.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
> [   63.420000] ata2.00: cmd 61/08:00:09:d6:42/00:00:25:00:00/40 tag 0
> cdb 0x0 data 4096 out
> [   63.420000]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask
> 0x4 (timeout)
> [   63.420000] ata2.00: status: {DRDY }
> [   63.420000] ata2: hard resetting link
> [   65.720000] ata2: softreset failed (port not ready)
> [   65.720000] ata2: reset failed (errno=-5), retrying in 8 secs
> [   73.420000] ata2: hard resetting link
> [   75.720000] ata2: softreset failed (port not ready)
> [   75.720000] ata2: reset failed (errno=-5), retrying in 8 secs
> [   83.420000] ata2: hard resetting link
> [   85.720000] ata2: softreset failed (port not ready)
> [   85.720000] ata2: reset failed (errno=-5), retrying in 33 secs
> [snip, disk gets kicked]
> [  120.780000] sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET
> driverbyte=DRIVER_OK,SUGGEST_OK
> [  120.780000] end_request: I/O error, dev sdb, sector 19550927
> [  120.780000] sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET
> driverbyte=DRIVER_OK,SUGGEST_OK
> [  120.780000] end_request: I/O error, dev sdb, sector 19550935
> [  120.780000] sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET
> driverbyte=DRIVER_OK,SUGGEST_OK
> [  120.780000] end_request: I/O error, dev sdb, sector 19550943
> [  120.780000] sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET
> driverbyte=DRIVER_OK,SUGGEST_OK
> 
> More similar error messages are in my old LKML mail.
> 
> After sdb was removed from the array the system worked normally with
> only two drives.
> But on the next boot it kicked the second sata_sil24 disk from the
> array killing it.

Can you please confirm that this bug is present in -mm and not present in
mainline (yet)?

Thanks.


* Re: 2.6.23-rc4-mm1
  2007-09-10 18:19   ` 2.6.23-rc4-mm1 Andrew Morton
  2007-09-10 18:59     ` 2.6.23-rc4-mm1 Torsten Kaiser
  2007-09-10 19:10     ` 2.6.23-rc4-mm1 FUJITA Tomonori
@ 2007-09-10 19:31     ` FUJITA Tomonori
  2007-09-14  8:10       ` 2.6.23-rc4-mm1 Andy Whitcroft
  2 siblings, 1 reply; 20+ messages in thread
From: FUJITA Tomonori @ 2007-09-10 19:31 UTC (permalink / raw)
  To: akpm; +Cc: apw, linux-kernel, mel, jens.axboe, linux-scsi, fujita.tomonori

On Mon, 10 Sep 2007 11:19:26 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Mon, 10 Sep 2007 18:49:26 +0100 Andy Whitcroft <apw@shadowen.org> wrote:
> 
> > I have a couple of old NUMA-Q systems which are unable to read their
> > boot disks with 2.6.23-rc4-mm1.  The disks appear to be recognised and
> > even the partition tables read correctly, and then they go pop:
> > 
> >   qla1280: QLA1040 found on PCI bus 0, dev 10
> 
> cc's added.
> 
> >   Clocksource tsc unstable (delta = 99922590 ns)
> >   Time: jiffies clocksource has been installed.
> >   scsi(0:0): Resetting SCSI BUS
> >   scsi0 : QLogic QLA1040 PCI to SCSI Host Adapter
> >          Firmware version:  7.65.06, Driver version 3.26
> >   scsi 0:0:0:0: Direct-Access     IBM      DGHS18X          0360 PQ: 0 ANSI: 3
> >   scsi(0:0:0:0): Sync: period 10, offset 12, Wide
> >   scsi 0:0:1:0: Direct-Access     IBM OEM  DCHS09X          5454 PQ: 0 ANSI: 2
> >   scsi(0:0:1:0): Sync: period 10, offset 12, Wide
> >   scsi 0:0:2:0: Direct-Access     IBM OEM  DCHS09X          5454 PQ: 0 ANSI: 2
> >   scsi(0:0:2:0): Sync: period 10, offset 12, Wide
> >   scsi 0:0:3:0: Direct-Access     IBM OEM  DCHS09X          5454 PQ: 0 ANSI: 2
> >   scsi(0:0:3:0): Sync: period 10, offset 12, Wide
> >   scsi 0:0:4:0: Direct-Access     IBM OEM  DCHS09X          5454 PQ: 0 ANSI: 2
> >   scsi(0:0:4:0): Sync: period 10, offset 12, Wide
> >   st: Version 20070203, fixed bufsize 32768, s/g segs 256
> >   sd 0:0:0:0: [sda] 35843670 512-byte hardware sectors (18352 MB)
> >   sd 0:0:0:0: [sda] Write Protect is off
> >   sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >   sd 0:0:0:0: [sda] 35843670 512-byte hardware sectors (18352 MB)
> >   sd 0:0:0:0: [sda] Write Protect is off
> >   sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >    sda: sda1
> >   sd 0:0:0:0: [sda] Attached SCSI disk
> >   sd 0:0:1:0: [sdb] 17796077 512-byte hardware sectors (9112 MB)
> >   sd 0:0:1:0: [sdb] Write Protect is off
> >   sd 0:0:1:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >   sd 0:0:1:0: [sdb] 17796077 512-byte hardware sectors (9112 MB)
> >   sd 0:0:1:0: [sdb] Write Protect is off
> >   sd 0:0:1:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >    sdb: unknown partition table
> >   sd 0:0:1:0: [sdb] Attached SCSI disk
> >   sd 0:0:2:0: [sdc] 17796077 512-byte hardware sectors (9112 MB)
> >   sd 0:0:2:0: [sdc] Write Protect is off
> >   sd 0:0:2:0: [sdc] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >   sd 0:0:2:0: [sdc] 17796077 512-byte hardware sectors (9112 MB)
> >   sd 0:0:2:0: [sdc] Write Protect is off
> >   sd 0:0:2:0: [sdc] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >    sdc: sdc1
> >   sd 0:0:2:0: [sdc] Attached SCSI disk
> >   sd 0:0:3:0: [sdd] 17796077 512-byte hardware sectors (9112 MB)
> >   sd 0:0:3:0: [sdd] Write Protect is off
> >   sd 0:0:3:0: [sdd] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >   sd 0:0:3:0: [sdd] 17796077 512-byte hardware sectors (9112 MB)
> >   sd 0:0:3:0: [sdd] Write Protect is off
> >   sd 0:0:3:0: [sdd] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >    sdd: sdd1
> >   sd 0:0:3:0: [sdd] Attached SCSI disk
> >   sd 0:0:4:0: [sde] 17796077 512-byte hardware sectors (9112 MB)
> >   sd 0:0:4:0: [sde] Write Protect is off
> >   sd 0:0:4:0: [sde] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >   sd 0:0:4:0: [sde] 17796077 512-byte hardware sectors (9112 MB)
> >   sd 0:0:4:0: [sde] Write Protect is off
> >   sd 0:0:4:0: [sde] Write cache: disabled, read cache: enabled, supports DPO and FUA
> >    sde: unknown partition table
> >   sd 0:0:4:0: [sde] Attached SCSI disk
> >   sd 0:0:0:0: Attached scsi generic sg0 type 0
> >   sd 0:0:1:0: Attached scsi generic sg1 type 0
> >   sd 0:0:2:0: Attached scsi generic sg2 type 0
> >   sd 0:0:3:0: Attached scsi generic sg3 type 0
> >   sd 0:0:4:0: Attached scsi generic sg4 type 0
> >   serio: i8042 KBD port at 0x60,0x64 irq 1
> >   serio: i8042 AUX port at 0x60,0x64 irq 12
> >   mice: PS/2 mouse device common for all mice
> >   input: AT Translated Set 2 keyboard as /class/input/input0
> >   oprofile: using NMI interrupt.
> >   TCP cubic registered
> >   NET: Registered protocol family 1
> >   NET: Registered protocol family 17
> >   Using IPI Shortcut mode
> >   input: PS/2 Logitech Mouse as /class/input/input1
> >   RAMDISK: cramfs filesystem found at block 0
> >   RAMDISK: Loading 1244KiB [1 disk] into ram disk... |/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/done.
> >   VFS: Mounted root (cramfs filesystem) readonly.
> >   Freeing unused kernel memory: 220k freed
> >   initrd-tools: 0.1.81.1
> >   mount: fs type devfs not supported by kernel
> >   FATAL: Module sd_mod not found.
> >   umount: devfs: not mounted
> >   ext3: No journal on filesystem on sda1
> >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> >   end_request: I/O error, dev sda, sector 63
> >   Buffer I/O error on device sda1, logical block 0
> >   Buffer I/O error on device sda1, logical block 1
> >   Buffer I/O error on device sda1, logical block 2
> >   Buffer I/O error on device sda1, logical block 3
> >   mount: fs type devfs not supported by kernel
> >   ext3: No journal on filesystem on sda1
> >   umount: devfs: not mounted
> >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> >   end_request: I/O error, dev sda, sector 28010831
> >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> >   end_request: I/O error, dev sda, sector 31080815
> >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> >   end_request: I/O error, dev sda, sector 31080855
> >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> >   end_request: I/O error, dev sda, sector 31080919
> >   Buffer I/O error on device sda1, logical block 3885107
> >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> >   end_request: I/O error, dev sda, sector 28411047
> >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> >   end_request: I/O error, dev sda, sector 31135687
> >   sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> >   end_request: I/O error, dev sda, sector 31138007
> >   sd 0:0:0:0: [sda] <6>sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
> > 
> 
> The only patch which touches qla1280 is git-block.patch.  From a quick
> squizz the change looks OK, although it's tricky and something might have
> broken.
> 
> (the dprintk at line 2929 needs to print remseg, not seg_cnt).
> 
> Can you retest with that change reverted (below)?  If it's not that then
> perhaps something in scsi core broke, dunno.

Even if we revert the qla1280 patch, scsi-ml still sends chained sg
lists, so it doesn't work.

The following patch disables sg list chaining for qla1280. If the fix
that I've just sent doesn't work, please try this.

-
From: FUJITA Tomonori <tomof@acm.org>
Subject: [PATCH] add use_sg_chaining option to scsi_host_template

This option is true if a low-level driver can support sg
chaining. This will be removed eventually when all the drivers are
converted to support sg chaining. q->max_phys_segments is set to
SCSI_MAX_SG_SEGMENTS if false.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
 arch/ia64/hp/sim/simscsi.c            |    1 +
 drivers/scsi/3w-9xxx.c                |    1 +
 drivers/scsi/3w-xxxx.c                |    1 +
 drivers/scsi/BusLogic.c               |    1 +
 drivers/scsi/NCR53c406a.c             |    3 ++-
 drivers/scsi/a100u2w.c                |    1 +
 drivers/scsi/aacraid/linit.c          |    1 +
 drivers/scsi/aha1740.c                |    1 +
 drivers/scsi/aic7xxx/aic79xx_osm.c    |    1 +
 drivers/scsi/aic7xxx/aic7xxx_osm.c    |    1 +
 drivers/scsi/aic7xxx_old.c            |    1 +
 drivers/scsi/arcmsr/arcmsr_hba.c      |    1 +
 drivers/scsi/dc395x.c                 |    1 +
 drivers/scsi/dpt_i2o.c                |    1 +
 drivers/scsi/eata.c                   |    3 ++-
 drivers/scsi/hosts.c                  |    1 +
 drivers/scsi/hptiop.c                 |    1 +
 drivers/scsi/ibmmca.c                 |    1 +
 drivers/scsi/ibmvscsi/ibmvscsi.c      |    1 +
 drivers/scsi/initio.c                 |    1 +
 drivers/scsi/ipr.c                    |    1 +
 drivers/scsi/lpfc/lpfc_scsi.c         |    2 ++
 drivers/scsi/mac53c94.c               |    1 +
 drivers/scsi/megaraid.c               |    1 +
 drivers/scsi/megaraid/megaraid_mbox.c |    1 +
 drivers/scsi/megaraid/megaraid_sas.c  |    1 +
 drivers/scsi/mesh.c                   |    1 +
 drivers/scsi/nsp32.c                  |    1 +
 drivers/scsi/pcmcia/sym53c500_cs.c    |    1 +
 drivers/scsi/qla2xxx/qla_os.c         |    2 ++
 drivers/scsi/qla4xxx/ql4_os.c         |    1 +
 drivers/scsi/qlogicfas.c              |    1 +
 drivers/scsi/scsi_lib.c               |    5 ++++-
 drivers/scsi/stex.c                   |    1 +
 drivers/scsi/sym53c416.c              |    1 +
 drivers/scsi/sym53c8xx_2/sym_glue.c   |    1 +
 drivers/scsi/u14-34f.c                |    1 +
 drivers/scsi/ultrastor.c              |    1 +
 drivers/scsi/wd7000.c                 |    1 +
 include/scsi/scsi_host.h              |   13 +++++++++++++
 40 files changed, 59 insertions(+), 3 deletions(-)

diff --git a/arch/ia64/hp/sim/simscsi.c b/arch/ia64/hp/sim/simscsi.c
index 4552a1c..e711657 100644
--- a/arch/ia64/hp/sim/simscsi.c
+++ b/arch/ia64/hp/sim/simscsi.c
@@ -360,6 +360,7 @@ static struct scsi_host_template driver_template = {
 	.max_sectors		= 1024,
 	.cmd_per_lun		= SIMSCSI_REQ_QUEUE_LEN,
 	.use_clustering		= DISABLE_CLUSTERING,
+	.use_sg_chaining	= ENABLE_SG_CHAINING,
 };
 
 static int __init
diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c
index efd9d8d..fb14014 100644
--- a/drivers/scsi/3w-9xxx.c
+++ b/drivers/scsi/3w-9xxx.c
@@ -1990,6 +1990,7 @@ static struct scsi_host_template driver_template = {
 	.max_sectors		= TW_MAX_SECTORS,
 	.cmd_per_lun		= TW_MAX_CMDS_PER_LUN,
 	.use_clustering		= ENABLE_CLUSTERING,
+	.use_sg_chaining	= ENABLE_SG_CHAINING,
 	.shost_attrs		= twa_host_attrs,
 	.emulated		= 1
 };
diff --git a/drivers/scsi/3w-xxxx.c b/drivers/scsi/3w-xxxx.c
index c7995fc..a64153b 100644
--- a/drivers/scsi/3w-xxxx.c
+++ b/drivers/scsi/3w-xxxx.c
@@ -2261,6 +2261,7 @@ static struct scsi_host_template driver_template = {
 	.max_sectors		= TW_MAX_SECTORS,
 	.cmd_per_lun		= TW_MAX_CMDS_PER_LUN,	
 	.use_clustering		= ENABLE_CLUSTERING,
+	.use_sg_chaining	= ENABLE_SG_CHAINING,
 	.shost_attrs		= tw_host_attrs,
 	.emulated		= 1
 };
diff --git a/drivers/scsi/BusLogic.c b/drivers/scsi/BusLogic.c
index 9b20617..49e1ffa 100644
--- a/drivers/scsi/BusLogic.c
+++ b/drivers/scsi/BusLogic.c
@@ -3575,6 +3575,7 @@ static struct scsi_host_template Bus_Logic_template = {
 	.unchecked_isa_dma = 1,
 	.max_sectors = 128,
 	.use_clustering = ENABLE_CLUSTERING,
+	.use_sg_chaining = ENABLE_SG_CHAINING,
 };
 
 /*
diff --git a/drivers/scsi/NCR53c406a.c b/drivers/scsi/NCR53c406a.c
index eda8c48..3168a17 100644
--- a/drivers/scsi/NCR53c406a.c
+++ b/drivers/scsi/NCR53c406a.c
@@ -1066,7 +1066,8 @@ static struct scsi_host_template driver_template =
      .sg_tablesize      	= 32			/*SG_ALL*/ /*SG_NONE*/, 
      .cmd_per_lun       	= 1			/* commands per lun */, 
      .unchecked_isa_dma 	= 1			/* unchecked_isa_dma */,
-     .use_clustering    	= ENABLE_CLUSTERING                               
+     .use_clustering    	= ENABLE_CLUSTERING,
+     .use_sg_chaining           = ENABLE_SG_CHAINING,
 };
 
 #include "scsi_module.c"
diff --git a/drivers/scsi/a100u2w.c b/drivers/scsi/a100u2w.c
index f608d4a..d3a6d15 100644
--- a/drivers/scsi/a100u2w.c
+++ b/drivers/scsi/a100u2w.c
@@ -1071,6 +1071,7 @@ static struct scsi_host_template inia100_template = {
 	.sg_tablesize		= SG_ALL,
 	.cmd_per_lun 		= 1,
 	.use_clustering		= ENABLE_CLUSTERING,
+	.use_sg_chaining	= ENABLE_SG_CHAINING,
 };
 
 static int __devinit inia100_probe_one(struct pci_dev *pdev,
diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c
index a7f42a1..038980b 100644
--- a/drivers/scsi/aacraid/linit.c
+++ b/drivers/scsi/aacraid/linit.c
@@ -944,6 +944,7 @@ static struct scsi_host_template aac_driver_template = {
 	.cmd_per_lun    		= AAC_NUM_IO_FIB, 
 #endif	
 	.use_clustering			= ENABLE_CLUSTERING,
+	.use_sg_chaining		= ENABLE_SG_CHAINING,
 	.emulated                       = 1,
 };
 
diff --git a/drivers/scsi/aha1740.c b/drivers/scsi/aha1740.c
index e4a4f3a..f6722fd 100644
--- a/drivers/scsi/aha1740.c
+++ b/drivers/scsi/aha1740.c
@@ -563,6 +563,7 @@ static struct scsi_host_template aha1740_template = {
 	.sg_tablesize     = AHA1740_SCATTER,
 	.cmd_per_lun      = AHA1740_CMDLUN,
 	.use_clustering   = ENABLE_CLUSTERING,
+	.use_sg_chaining  = ENABLE_SG_CHAINING,
 	.eh_abort_handler = aha1740_eh_abort_handler,
 };
 
diff --git a/drivers/scsi/aic7xxx/aic79xx_osm.c b/drivers/scsi/aic7xxx/aic79xx_osm.c
index a055a96..42c0f14 100644
--- a/drivers/scsi/aic7xxx/aic79xx_osm.c
+++ b/drivers/scsi/aic7xxx/aic79xx_osm.c
@@ -766,6 +766,7 @@ struct scsi_host_template aic79xx_driver_template = {
 	.max_sectors		= 8192,
 	.cmd_per_lun		= 2,
 	.use_clustering		= ENABLE_CLUSTERING,
+	.use_sg_chaining	= ENABLE_SG_CHAINING,
 	.slave_alloc		= ahd_linux_slave_alloc,
 	.slave_configure	= ahd_linux_slave_configure,
 	.target_alloc		= ahd_linux_target_alloc,
diff --git a/drivers/scsi/aic7xxx/aic7xxx_osm.c b/drivers/scsi/aic7xxx/aic7xxx_osm.c
index 2e9c38f..7770bef 100644
--- a/drivers/scsi/aic7xxx/aic7xxx_osm.c
+++ b/drivers/scsi/aic7xxx/aic7xxx_osm.c
@@ -747,6 +747,7 @@ struct scsi_host_template aic7xxx_driver_template = {
 	.max_sectors		= 8192,
 	.cmd_per_lun		= 2,
 	.use_clustering		= ENABLE_CLUSTERING,
+	.use_sg_chaining	= ENABLE_SG_CHAINING,
 	.slave_alloc		= ahc_linux_slave_alloc,
 	.slave_configure	= ahc_linux_slave_configure,
 	.target_alloc		= ahc_linux_target_alloc,
diff --git a/drivers/scsi/aic7xxx_old.c b/drivers/scsi/aic7xxx_old.c
index 1a71b02..4025608 100644
--- a/drivers/scsi/aic7xxx_old.c
+++ b/drivers/scsi/aic7xxx_old.c
@@ -11142,6 +11142,7 @@ static struct scsi_host_template driver_template = {
 	.max_sectors		= 2048,
 	.cmd_per_lun		= 3,
 	.use_clustering		= ENABLE_CLUSTERING,
+	.use_sg_chaining	= ENABLE_SG_CHAINING,
 };
 
 #include "scsi_module.c"
diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
index 0ddfc21..d5039f3 100644
--- a/drivers/scsi/arcmsr/arcmsr_hba.c
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c
@@ -121,6 +121,7 @@ static struct scsi_host_template arcmsr_scsi_host_template = {
 	.max_sectors    	= ARCMSR_MAX_XFER_SECTORS,
 	.cmd_per_lun		= ARCMSR_MAX_CMD_PERLUN,
 	.use_clustering		= ENABLE_CLUSTERING,
+	.use_sg_chaining	= ENABLE_SG_CHAINING,
 	.shost_attrs		= arcmsr_host_attrs,
 };
 static struct pci_error_handlers arcmsr_pci_error_handlers = {
diff --git a/drivers/scsi/dc395x.c b/drivers/scsi/dc395x.c
index 7b8a345..d2a2026 100644
--- a/drivers/scsi/dc395x.c
+++ b/drivers/scsi/dc395x.c
@@ -4765,6 +4765,7 @@ static struct scsi_host_template dc395x_driver_template = {
 	.eh_bus_reset_handler   = dc395x_eh_bus_reset,
 	.unchecked_isa_dma      = 0,
 	.use_clustering         = DISABLE_CLUSTERING,
+	.use_sg_chaining	= ENABLE_SG_CHAINING,
 };
 
 
diff --git a/drivers/scsi/dpt_i2o.c b/drivers/scsi/dpt_i2o.c
index bea9d65..8258506 100644
--- a/drivers/scsi/dpt_i2o.c
+++ b/drivers/scsi/dpt_i2o.c
@@ -3295,6 +3295,7 @@ static struct scsi_host_template adpt_template = {
 	.this_id		= 7,
 	.cmd_per_lun		= 1,
 	.use_clustering		= ENABLE_CLUSTERING,
+	.use_sg_chaining	= ENABLE_SG_CHAINING,
 };
 
 static s32 adpt_scsi_register(adpt_hba* pHba)
diff --git a/drivers/scsi/eata.c b/drivers/scsi/eata.c
index a83e9f1..2f685cf 100644
--- a/drivers/scsi/eata.c
+++ b/drivers/scsi/eata.c
@@ -523,7 +523,8 @@ static struct scsi_host_template driver_template = {
 	.slave_configure = eata2x_slave_configure,
 	.this_id = 7,
 	.unchecked_isa_dma = 1,
-	.use_clustering = ENABLE_CLUSTERING
+	.use_clustering = ENABLE_CLUSTERING,
+	.use_sg_chaining = ENABLE_SG_CHAINING,
 };
 
 #if !defined(__BIG_ENDIAN_BITFIELD) && !defined(__LITTLE_ENDIAN_BITFIELD)
diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 96bc312..8c42539 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -342,6 +342,7 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 	shost->unchecked_isa_dma = sht->unchecked_isa_dma;
 	shost->use_clustering = sht->use_clustering;
 	shost->ordered_tag = sht->ordered_tag;
+	shost->use_sg_chaining = sht->use_sg_chaining;
 
 	if (sht->max_host_blocked)
 		shost->max_host_blocked = sht->max_host_blocked;
diff --git a/drivers/scsi/hptiop.c b/drivers/scsi/hptiop.c
index 8b384fa..8515054 100644
--- a/drivers/scsi/hptiop.c
+++ b/drivers/scsi/hptiop.c
@@ -655,6 +655,7 @@ static struct scsi_host_template driver_template = {
 	.unchecked_isa_dma          = 0,
 	.emulated                   = 0,
 	.use_clustering             = ENABLE_CLUSTERING,
+	.use_sg_chaining            = ENABLE_SG_CHAINING,
 	.proc_name                  = driver_name,
 	.shost_attrs                = hptiop_attrs,
 	.this_id                    = -1,
diff --git a/drivers/scsi/ibmmca.c b/drivers/scsi/ibmmca.c
index bff8252..695941a 100644
--- a/drivers/scsi/ibmmca.c
+++ b/drivers/scsi/ibmmca.c
@@ -1501,6 +1501,7 @@ static struct scsi_host_template ibmmca_driver_template = {
           .sg_tablesize   = 16,
           .cmd_per_lun    = 1,
           .use_clustering = ENABLE_CLUSTERING,
+          .use_sg_chaining = ENABLE_SG_CHAINING,
 };
 
 static int ibmmca_probe(struct device *dev)
diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index 93bd01b..084488c 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -1545,6 +1545,7 @@ static struct scsi_host_template driver_template = {
 	.this_id = -1,
 	.sg_tablesize = SG_ALL,
 	.use_clustering = ENABLE_CLUSTERING,
+	.use_sg_chaining = ENABLE_SG_CHAINING,
 	.shost_attrs = ibmvscsi_attrs,
 };
 
diff --git a/drivers/scsi/initio.c b/drivers/scsi/initio.c
index d9dfb69..22d40fd 100644
--- a/drivers/scsi/initio.c
+++ b/drivers/scsi/initio.c
@@ -2831,6 +2831,7 @@ static struct scsi_host_template initio_template = {
 	.sg_tablesize		= SG_ALL,
 	.cmd_per_lun		= 1,
 	.use_clustering		= ENABLE_CLUSTERING,
+	.use_sg_chaining	= ENABLE_SG_CHAINING,
 };
 
 static int initio_probe_one(struct pci_dev *pdev,
diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c
index b41dfb5..ba7b567 100644
--- a/drivers/scsi/ipr.c
+++ b/drivers/scsi/ipr.c
@@ -4949,6 +4949,7 @@ static struct scsi_host_template driver_template = {
 	.max_sectors = IPR_IOA_MAX_SECTORS,
 	.cmd_per_lun = IPR_MAX_CMD_PER_LUN,
 	.use_clustering = ENABLE_CLUSTERING,
+	.use_sg_chaining = ENABLE_SG_CHAINING,
 	.shost_attrs = ipr_ioa_attrs,
 	.sdev_attrs = ipr_dev_attrs,
 	.proc_name = IPR_NAME
diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index cd67493..c075556 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -1438,6 +1438,7 @@ struct scsi_host_template lpfc_template = {
 	.scan_finished		= lpfc_scan_finished,
 	.this_id		= -1,
 	.sg_tablesize		= LPFC_SG_SEG_CNT,
+	.use_sg_chaining	= ENABLE_SG_CHAINING,
 	.cmd_per_lun		= LPFC_CMD_PER_LUN,
 	.use_clustering		= ENABLE_CLUSTERING,
 	.shost_attrs		= lpfc_hba_attrs,
@@ -1460,6 +1461,7 @@ struct scsi_host_template lpfc_vport_template = {
 	.sg_tablesize		= LPFC_SG_SEG_CNT,
 	.cmd_per_lun		= LPFC_CMD_PER_LUN,
 	.use_clustering		= ENABLE_CLUSTERING,
+	.use_sg_chaining	= ENABLE_SG_CHAINING,
 	.shost_attrs		= lpfc_vport_attrs,
 	.max_sectors		= 0xFFFF,
 };
diff --git a/drivers/scsi/mac53c94.c b/drivers/scsi/mac53c94.c
index b12ad7c..a035001 100644
--- a/drivers/scsi/mac53c94.c
+++ b/drivers/scsi/mac53c94.c
@@ -402,6 +402,7 @@ static struct scsi_host_template mac53c94_template = {
 	.sg_tablesize	= SG_ALL,
 	.cmd_per_lun	= 1,
 	.use_clustering	= DISABLE_CLUSTERING,
+	.use_sg_chaining = ENABLE_SG_CHAINING,
 };
 
 static int mac53c94_probe(struct macio_dev *mdev, const struct of_device_id *match)
diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
index 9023ec6..a0133b5 100644
--- a/drivers/scsi/megaraid.c
+++ b/drivers/scsi/megaraid.c
@@ -4484,6 +4484,7 @@ static struct scsi_host_template megaraid_template = {
 	.sg_tablesize			= MAX_SGLIST,
 	.cmd_per_lun			= DEF_CMD_PER_LUN,
 	.use_clustering			= ENABLE_CLUSTERING,
+	.use_sg_chaining		= ENABLE_SG_CHAINING,
 	.eh_abort_handler		= megaraid_abort,
 	.eh_device_reset_handler	= megaraid_reset,
 	.eh_bus_reset_handler		= megaraid_reset,
diff --git a/drivers/scsi/megaraid/megaraid_mbox.c b/drivers/scsi/megaraid/megaraid_mbox.c
index c6a53dc..e4e4c6a 100644
--- a/drivers/scsi/megaraid/megaraid_mbox.c
+++ b/drivers/scsi/megaraid/megaraid_mbox.c
@@ -361,6 +361,7 @@ static struct scsi_host_template megaraid_template_g = {
 	.eh_host_reset_handler		= megaraid_reset_handler,
 	.change_queue_depth		= megaraid_change_queue_depth,
 	.use_clustering			= ENABLE_CLUSTERING,
+	.use_sg_chaining		= ENABLE_SG_CHAINING,
 	.sdev_attrs			= megaraid_sdev_attrs,
 	.shost_attrs			= megaraid_shost_attrs,
 };
diff --git a/drivers/scsi/megaraid/megaraid_sas.c b/drivers/scsi/megaraid/megaraid_sas.c
index ebb948c..e3c5c52 100644
--- a/drivers/scsi/megaraid/megaraid_sas.c
+++ b/drivers/scsi/megaraid/megaraid_sas.c
@@ -1110,6 +1110,7 @@ static struct scsi_host_template megasas_template = {
 	.eh_timed_out = megasas_reset_timer,
 	.bios_param = megasas_bios_param,
 	.use_clustering = ENABLE_CLUSTERING,
+	.use_sg_chaining = ENABLE_SG_CHAINING,
 };
 
 /**
diff --git a/drivers/scsi/mesh.c b/drivers/scsi/mesh.c
index 651d09b..7470ff3 100644
--- a/drivers/scsi/mesh.c
+++ b/drivers/scsi/mesh.c
@@ -1843,6 +1843,7 @@ static struct scsi_host_template mesh_template = {
 	.sg_tablesize			= SG_ALL,
 	.cmd_per_lun			= 2,
 	.use_clustering			= DISABLE_CLUSTERING,
+	.use_sg_chaining		= ENABLE_SG_CHAINING,
 };
 
 static int mesh_probe(struct macio_dev *mdev, const struct of_device_id *match)
diff --git a/drivers/scsi/nsp32.c b/drivers/scsi/nsp32.c
index 4215f3b..6da1504 100644
--- a/drivers/scsi/nsp32.c
+++ b/drivers/scsi/nsp32.c
@@ -281,6 +281,7 @@ static struct scsi_host_template nsp32_template = {
 	.cmd_per_lun			= 1,
 	.this_id			= NSP32_HOST_SCSIID,
 	.use_clustering			= DISABLE_CLUSTERING,
+	.use_sg_chaining		= ENABLE_SG_CHAINING,
 	.eh_abort_handler       	= nsp32_eh_abort,
 	.eh_bus_reset_handler		= nsp32_eh_bus_reset,
 	.eh_host_reset_handler		= nsp32_eh_host_reset,
diff --git a/drivers/scsi/pcmcia/sym53c500_cs.c b/drivers/scsi/pcmcia/sym53c500_cs.c
index 961839e..190e2a7 100644
--- a/drivers/scsi/pcmcia/sym53c500_cs.c
+++ b/drivers/scsi/pcmcia/sym53c500_cs.c
@@ -694,6 +694,7 @@ static struct scsi_host_template sym53c500_driver_template = {
      .sg_tablesize		= 32,
      .cmd_per_lun		= 1,
      .use_clustering		= ENABLE_CLUSTERING,
+     .use_sg_chaining		= ENABLE_SG_CHAINING,
      .shost_attrs		= SYM53C500_shost_attrs
 };
 
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index acca898..3abbbc0 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -132,6 +132,7 @@ struct scsi_host_template qla2x00_driver_template = {
 	.this_id		= -1,
 	.cmd_per_lun		= 3,
 	.use_clustering		= ENABLE_CLUSTERING,
+	.use_sg_chaining	= ENABLE_SG_CHAINING,
 	.sg_tablesize		= SG_ALL,
 
 	/*
@@ -163,6 +164,7 @@ struct scsi_host_template qla24xx_driver_template = {
 	.this_id		= -1,
 	.cmd_per_lun		= 3,
 	.use_clustering		= ENABLE_CLUSTERING,
+	.use_sg_chaining	= ENABLE_SG_CHAINING,
 	.sg_tablesize		= SG_ALL,
 
 	.max_sectors		= 0xFFFF,
diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c
index 8fa5aea..89460d2 100644
--- a/drivers/scsi/qla4xxx/ql4_os.c
+++ b/drivers/scsi/qla4xxx/ql4_os.c
@@ -94,6 +94,7 @@ static struct scsi_host_template qla4xxx_driver_template = {
 	.this_id		= -1,
 	.cmd_per_lun		= 3,
 	.use_clustering		= ENABLE_CLUSTERING,
+	.use_sg_chaining	= ENABLE_SG_CHAINING,
 	.sg_tablesize		= SG_ALL,
 
 	.max_sectors		= 0xFFFF,
diff --git a/drivers/scsi/qlogicfas.c b/drivers/scsi/qlogicfas.c
index 94baca8..2268ca1 100644
--- a/drivers/scsi/qlogicfas.c
+++ b/drivers/scsi/qlogicfas.c
@@ -197,6 +197,7 @@ static struct scsi_host_template qlogicfas_driver_template = {
 	.sg_tablesize		= SG_ALL,
 	.cmd_per_lun		= 1,
 	.use_clustering		= DISABLE_CLUSTERING,
+	.use_sg_chaining	= ENABLE_SG_CHAINING,
 };
 
 static __init int qlogicfas_init(void)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index d0a1028..38eec00 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1685,7 +1685,10 @@ struct request_queue *__scsi_alloc_queue(struct Scsi_Host *shost,
 	 * converted, so better keep it safe.
 	 */
 #ifdef ARCH_HAS_SG_CHAIN
-	blk_queue_max_phys_segments(q, SCSI_MAX_SG_CHAIN_SEGMENTS);
+	if (shost->use_sg_chaining)
+		blk_queue_max_phys_segments(q, SCSI_MAX_SG_CHAIN_SEGMENTS);
+	else
+		blk_queue_max_phys_segments(q, SCSI_MAX_SG_SEGMENTS);
 #else
 	blk_queue_max_phys_segments(q, SCSI_MAX_SG_SEGMENTS);
 #endif
diff --git a/drivers/scsi/stex.c b/drivers/scsi/stex.c
index 72f6d80..e3fab3a 100644
--- a/drivers/scsi/stex.c
+++ b/drivers/scsi/stex.c
@@ -1123,6 +1123,7 @@ static struct scsi_host_template driver_template = {
 	.this_id			= -1,
 	.sg_tablesize			= ST_MAX_SG,
 	.cmd_per_lun			= ST_CMD_PER_LUN,
+	.use_sg_chaining		= ENABLE_SG_CHAINING,
 };
 
 static int stex_set_dma_mask(struct pci_dev * pdev)
diff --git a/drivers/scsi/sym53c416.c b/drivers/scsi/sym53c416.c
index 92bfaea..8befab7 100644
--- a/drivers/scsi/sym53c416.c
+++ b/drivers/scsi/sym53c416.c
@@ -854,5 +854,6 @@ static struct scsi_host_template driver_template = {
 	.cmd_per_lun =		1,
 	.unchecked_isa_dma =	1,
 	.use_clustering =	ENABLE_CLUSTERING,
+	.use_sg_chaining =	ENABLE_SG_CHAINING,
 };
 #include "scsi_module.c"
diff --git a/drivers/scsi/sym53c8xx_2/sym_glue.c b/drivers/scsi/sym53c8xx_2/sym_glue.c
index 764490e..7576c99 100644
--- a/drivers/scsi/sym53c8xx_2/sym_glue.c
+++ b/drivers/scsi/sym53c8xx_2/sym_glue.c
@@ -1827,6 +1827,7 @@ static struct scsi_host_template sym2_template = {
 	.eh_host_reset_handler	= sym53c8xx_eh_host_reset_handler,
 	.this_id		= 7,
 	.use_clustering		= ENABLE_CLUSTERING,
+	.use_sg_chaining	= ENABLE_SG_CHAINING,
 	.max_sectors		= 0xFFFF,
 #ifdef SYM_LINUX_PROC_INFO_SUPPORT
 	.proc_info		= sym53c8xx_proc_info,
diff --git a/drivers/scsi/u14-34f.c b/drivers/scsi/u14-34f.c
index 9e8232a..a0d9ef4 100644
--- a/drivers/scsi/u14-34f.c
+++ b/drivers/scsi/u14-34f.c
@@ -451,6 +451,7 @@ static struct scsi_host_template driver_template = {
                 .this_id                 = 7,
                 .unchecked_isa_dma       = 1,
-                .use_clustering          = ENABLE_CLUSTERING
+                .use_clustering          = ENABLE_CLUSTERING,
+                .use_sg_chaining         = ENABLE_SG_CHAINING,
                 };
 
 #if !defined(__BIG_ENDIAN_BITFIELD) && !defined(__LITTLE_ENDIAN_BITFIELD)
diff --git a/drivers/scsi/ultrastor.c b/drivers/scsi/ultrastor.c
index c08235d..ea72bbe 100644
--- a/drivers/scsi/ultrastor.c
+++ b/drivers/scsi/ultrastor.c
@@ -1197,5 +1197,6 @@ static struct scsi_host_template driver_template = {
 	.cmd_per_lun       = ULTRASTOR_MAX_CMDS_PER_LUN,
 	.unchecked_isa_dma = 1,
 	.use_clustering    = ENABLE_CLUSTERING,
+	.use_sg_chaining   = ENABLE_SG_CHAINING,
 };
 #include "scsi_module.c"
diff --git a/drivers/scsi/wd7000.c b/drivers/scsi/wd7000.c
index d6fd425..255c611 100644
--- a/drivers/scsi/wd7000.c
+++ b/drivers/scsi/wd7000.c
@@ -1671,6 +1671,7 @@ static struct scsi_host_template driver_template = {
 	.cmd_per_lun		= 1,
 	.unchecked_isa_dma	= 1,
 	.use_clustering		= ENABLE_CLUSTERING,
+	.use_sg_chaining	= ENABLE_SG_CHAINING,
 };
 
 #include "scsi_module.c"
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 88f6871..3ee3805 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -36,6 +36,9 @@ struct blk_queue_tags;
 #define DISABLE_CLUSTERING 0
 #define ENABLE_CLUSTERING 1
 
+#define DISABLE_SG_CHAINING 0
+#define ENABLE_SG_CHAINING 1
+
 enum scsi_eh_timer_return {
 	EH_NOT_HANDLED,
 	EH_HANDLED,
@@ -435,6 +438,15 @@ struct scsi_host_template {
 	unsigned ordered_tag:1;
 
 	/*
+	 * true if the low-level driver can support sg chaining. this
+	 * will be removed eventually when all the drivers are
+	 * converted to support sg chaining.
+	 *
+	 * Status: OBSOLETE
+	 */
+	unsigned use_sg_chaining:1;
+
+	/*
 	 * Countdown for host blocking with no commands outstanding
 	 */
 	unsigned int max_host_blocked;
@@ -577,6 +589,7 @@ struct Scsi_Host {
 	unsigned unchecked_isa_dma:1;
 	unsigned use_clustering:1;
 	unsigned use_blk_tcq:1;
+	unsigned use_sg_chaining:1;
 
 	/*
 	 * Host has requested that no further requests come through for the
-- 
1.5.2.4


* Re: 2.6.23-rc4-mm1
  2007-09-10 19:20       ` 2.6.23-rc4-mm1 Andrew Morton
@ 2007-09-10 19:38         ` Torsten Kaiser
  2007-09-10 19:42         ` 2.6.23-rc4-mm1 FUJITA Tomonori
  1 sibling, 0 replies; 20+ messages in thread
From: Torsten Kaiser @ 2007-09-10 19:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andy Whitcroft, linux-kernel, mel, Jens Axboe, linux-scsi,
	linux-ide

On 9/10/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Mon, 10 Sep 2007 20:59:49 +0200 "Torsten Kaiser" <just.for.lkml@googlemail.com> wrote:
>
> > On 9/10/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> > > On Mon, 10 Sep 2007 18:49:26 +0100 Andy Whitcroft <apw@shadowen.org> wrote:
> > I reported a similar problem on Sep 1, but until now got no response.
>
> You still haven't had a response ;)  Let's add a cc.

But the mail from Andy was a good occasion to try another cc, i.e.
linux-scsi, which you added. :)

> Oh, you reported it against 2.6.23-rc4-mm1
> (http://lkml.org/lkml/2007/9/1/92) and I did cc linux-ide in my response.
>
[snip]
> Andy is using qla1280.  You're using sata.  So it's probably a different
> bug, with the same symptoms.

Yes, but you (Andrew) also said in response to Andy: "If it's not that then
perhaps something in scsi core broke, dunno." So I wanted to add that
my problem might point this bug in the direction of the core.

> Can you please confirm that this bug is present in -mm and not present in
> mainline (yet)?

Currently using 2.6.23-rc3-mm1, that works for me.
Now downloading 2.6.23-rc5-git1...

Torsten


* Re: 2.6.23-rc4-mm1
  2007-09-10 19:20       ` 2.6.23-rc4-mm1 Andrew Morton
  2007-09-10 19:38         ` 2.6.23-rc4-mm1 Torsten Kaiser
@ 2007-09-10 19:42         ` FUJITA Tomonori
  2007-09-10 20:43           ` 2.6.23-rc4-mm1 Torsten Kaiser
  1 sibling, 1 reply; 20+ messages in thread
From: FUJITA Tomonori @ 2007-09-10 19:42 UTC (permalink / raw)
  To: akpm
  Cc: just.for.lkml, apw, linux-kernel, mel, jens.axboe, linux-scsi,
	linux-ide, fujita.tomonori

On Mon, 10 Sep 2007 12:20:38 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Mon, 10 Sep 2007 20:59:49 +0200 "Torsten Kaiser" <just.for.lkml@googlemail.com> wrote:
> 
> > On 9/10/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> > > On Mon, 10 Sep 2007 18:49:26 +0100 Andy Whitcroft <apw@shadowen.org> wrote:
> > >
> > > > I have a couple of old NUMA-Q systems which are unable to read their
> > > > boot disks with 2.6.23-rc4-mm1.  The disks appear to be recognised and
> > > > even the partition tables read correctly, and then they go pop:
> > 
> > I reported a similar problem on Sep 1, but until now got no response.
> 
> You still haven't had a response ;)  Let's add a cc.
> 
> Oh, you reported it against 2.6.23-rc4-mm1
> (http://lkml.org/lkml/2007/9/1/92) and I did cc linux-ide in my response.
> 
> I'll continue to point out where this sort of thing occurs because last
> week I was told that a reason why so many bug reports are ignored is because
> "linux-kernel has too much traffic".

Many SCSI people don't subscribe to linux-kernel, I think.


> > The system boots, reads the partition tables, starts the RAID and then
> > kicks one drive out because of errors.
> 
> Andy is using qla1280.  You're using sata.  So it's probably a different
> bug, with the same symptoms.

This might be an sg chaining bug too (probably the sg chaining libata
patch).

Can you try the following patch that I've just sent:

http://lkml.org/lkml/2007/9/10/251

The patch also disables sg list chaining for libata.


* Re: 2.6.23-rc4-mm1
  2007-09-10 19:42         ` 2.6.23-rc4-mm1 FUJITA Tomonori
@ 2007-09-10 20:43           ` Torsten Kaiser
  2007-09-11  8:32             ` 2.6.23-rc4-mm1 Jens Axboe
  0 siblings, 1 reply; 20+ messages in thread
From: Torsten Kaiser @ 2007-09-10 20:43 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: akpm, apw, linux-kernel, mel, jens.axboe, linux-scsi, linux-ide,
	fujita.tomonori

On 9/10/07, FUJITA Tomonori <tomof@acm.org> wrote:
> On Mon, 10 Sep 2007 12:20:38 -0700
> Andrew Morton <akpm@linux-foundation.org> wrote:
>
> > On Mon, 10 Sep 2007 20:59:49 +0200 "Torsten Kaiser" <just.for.lkml@googlemail.com> wrote:
> > > The system boots, reads the partition tables, starts the RAID and then
> > > kicks one drive out because of errors.
> >
> > Andy is using qla1280.  You're using sata.  So it's probably a different
> > bug, with the same symptoms.
>
> This might be an sg chaining bug too (probably the sg chaining libata
> patch).
>
> Can you try the following patch that I've just sent:
>
> http://lkml.org/lkml/2007/9/10/251
>
> The patch also disables sg list chaining for libata.
>
With this patch 2.6.23-rc4-mm1 works for me.
Mainline 2.6.23-rc5-git1 also works without needing any patches.

Torsten


* Re: 2.6.23-rc4-mm1
  2007-09-10 20:43           ` 2.6.23-rc4-mm1 Torsten Kaiser
@ 2007-09-11  8:32             ` Jens Axboe
  0 siblings, 0 replies; 20+ messages in thread
From: Jens Axboe @ 2007-09-11  8:32 UTC (permalink / raw)
  To: Torsten Kaiser
  Cc: FUJITA Tomonori, akpm, apw, linux-kernel, mel, linux-scsi,
	linux-ide, fujita.tomonori

On Mon, Sep 10 2007, Torsten Kaiser wrote:
> On 9/10/07, FUJITA Tomonori <tomof@acm.org> wrote:
> > On Mon, 10 Sep 2007 12:20:38 -0700
> > Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > > On Mon, 10 Sep 2007 20:59:49 +0200 "Torsten Kaiser" <just.for.lkml@googlemail.com> wrote:
> > > > The system boots, reads the partition tables, starts the RAID and then
> > > > kicks one drive out because of errors.
> > >
> > > Andy is using qla1280.  You're using sata.  So it's probably a different
> > > bug, with the same symptoms.
> >
> > This might be an sg chaining bug too (probably the sg chaining libata
> > patch).
> >
> > Can you try the following patch that I've just sent:
> >
> > http://lkml.org/lkml/2007/9/10/251
> >
> > The patch also disables sg list chaining for libata.
> >
> With this patch 2.6.23-rc4-mm1 works for me.
> Mainline 2.6.23-rc5-git1 also works without needing any patches.

OK, thanks for testing that. I'll merge Tomo's patch so that we can
selectively enable drivers when we KNOW they work, instead of trying to
do this (massive) operation wholesale.

-- 
Jens Axboe



* Re: 2.6.23-rc4-mm1
  2007-09-10 19:10     ` 2.6.23-rc4-mm1 FUJITA Tomonori
@ 2007-09-13 17:34       ` Andy Whitcroft
  2007-09-15  4:16       ` 2.6.23-rc4-mm1 Paul Jackson
  1 sibling, 0 replies; 20+ messages in thread
From: Andy Whitcroft @ 2007-09-13 17:34 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: akpm, linux-kernel, mel, jens.axboe, linux-scsi, fujita.tomonori

On Tue, Sep 11, 2007 at 04:10:47AM +0900, FUJITA Tomonori wrote:

> > The only patch which touches qla1280 is git-block.patch.  From a quick
> > squizz the change looks OK, although it's tricky and something might have
> > broken.
> 
> Can you try this patch (against 2.6.23-rc4-mm1)?

Yep, this patch seems to sort out booting on these boxes.  The other one
is also being tested.  Results later.

> From 592bd2049cb3e6e1f1dde7cf631879f26ddffeaa Mon Sep 17 00:00:00 2001
> From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> Date: Mon, 10 Sep 2007 04:17:13 +0100
> Subject: [PATCH] qla1280: sg chaining fixes
> 
> Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> ---
>  drivers/scsi/qla1280.c |    5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/scsi/qla1280.c b/drivers/scsi/qla1280.c
> index bd805ec..7c1eaec 100644
> --- a/drivers/scsi/qla1280.c
> +++ b/drivers/scsi/qla1280.c
> @@ -2977,8 +2977,8 @@ qla1280_64bit_start_scsi(struct scsi_qla_host *ha, struct srb * sp)
>  						cpu_to_le32(pci_dma_hi32(dma_handle)),
>  						cpu_to_le32(pci_dma_lo32(dma_handle)),
>  						cpu_to_le32(sg_dma_len(s)));
> -					remseg--;
>  				}
> +				remseg -= cnt;
>  				dprintk(5, "qla1280_64bit_start_scsi: "
>  					"continuation packet data - b %i, t "
>  					"%i, l %i \n", SCSI_BUS_32(cmd),
> @@ -3250,6 +3250,8 @@ qla1280_32bit_start_scsi(struct scsi_qla_host *ha, struct srb * sp)
>  
>  				/* Load continuation entry data segments. */
>  				for_each_sg(sg, s, remseg, cnt) {
> +					if (cnt == 7)
> +						break;
>  					*dword_ptr++ =
>  						cpu_to_le32(pci_dma_lo32(sg_dma_address(s)));
>  					*dword_ptr++ =
> @@ -3260,6 +3262,7 @@ qla1280_32bit_start_scsi(struct scsi_qla_host *ha, struct srb * sp)
>  						cpu_to_le32(pci_dma_lo32(sg_dma_address(s))),
>  						cpu_to_le32(sg_dma_len(s)));
>  				}
> +				remseg -= cnt;
>  				dprintk(5, "qla1280_32bit_start_scsi: "
>  					"continuation packet data - "
>  					"scsi(%i:%i:%i)\n", SCSI_BUS_32(cmd),
> -- 
> 1.5.2.4
> 
> 

-apw
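
[Editorial sketch, not part of the thread.] The accounting that the quoted qla1280 fix restores can be shown in isolation: each continuation packet takes a bounded number of segments, and remseg must drop by exactly the number consumed in that pass. The snippet below is a minimal standalone model, assuming the limit of 7 seen in the `cnt == 7` check of the 32-bit path; the function name and constant are hypothetical, and the inner loop only imitates the shape of for_each_sg() without touching real scatterlist entries.

```c
#include <assert.h>

/* Taken from the `cnt == 7` check in the quoted 32-bit hunk. */
#define SKETCH_SEGS_PER_CONT 7

/*
 * Count the continuation packets needed for `remseg` segments that did
 * not fit in the command packet. Each outer pass models one packet.
 */
static int sketch_count_continuations(int remseg)
{
	int packets = 0;

	assert(remseg >= 0);

	while (remseg > 0) {
		int cnt;

		/* Same shape as for_each_sg(sg, s, remseg, cnt). */
		for (cnt = 0; cnt < remseg; cnt++) {
			if (cnt == SKETCH_SEGS_PER_CONT)
				break;
			/* A driver would load one address/length pair here. */
		}

		/* Subtract what this packet consumed, not one per entry. */
		remseg -= cnt;
		packets++;
	}
	return packets;
}
```

Because cnt is at least 1 whenever remseg is positive, the outer loop always makes progress; without the early break a single pass would try to put every remaining segment into one packet.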


* Re: 2.6.23-rc4-mm1
  2007-09-10 19:31     ` 2.6.23-rc4-mm1 FUJITA Tomonori
@ 2007-09-14  8:10       ` Andy Whitcroft
  2007-09-14 13:01         ` 2.6.23-rc4-mm1 Torsten Kaiser
  0 siblings, 1 reply; 20+ messages in thread
From: Andy Whitcroft @ 2007-09-14  8:10 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: akpm, linux-kernel, mel, jens.axboe, linux-scsi, fujita.tomonori

On Tue, Sep 11, 2007 at 04:31:12AM +0900, FUJITA Tomonori wrote:
[...]
> > The only patch which touches qla1280 is git-block.patch.  From a quick
> > squizz the change looks OK, although it's tricky and something might have
> > broken.
> > 
> > (the dprintk at line 2929 needs to print remseg, not seg_cnt).
> > 
> > Can you retest with that change reverted (below)?  If it's not that then
> > perhaps something in scsi core broke, dunno.
> 
> Even if we revert the qla1280 patch, scsi-ml still sends chained sg
> lists, so it doesn't work.
> 
> The following patch disables sg list chaining for qla1280. If the fix
> that I've just sent doesn't work, please try this.

Ok, the other patch _did_ work, but this got tested anyhow and it did
_not_ fix things.

> -
> From: FUJITA Tomonori <tomof@acm.org>
> Subject: [PATCH] add use_sg_chaining option to scsi_host_template
> 
> This option is true if a low-level driver can support sg
> chaining. This will be removed eventually when all the drivers are
> converted to support sg chaining. q->max_phys_segments is set to
> SCSI_MAX_SG_SEGMENTS if false.
> 
> Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> ---
>  arch/ia64/hp/sim/simscsi.c            |    1 +
>  drivers/scsi/3w-9xxx.c                |    1 +
>  drivers/scsi/3w-xxxx.c                |    1 +
>  drivers/scsi/BusLogic.c               |    1 +
>  drivers/scsi/NCR53c406a.c             |    3 ++-
>  drivers/scsi/a100u2w.c                |    1 +
>  drivers/scsi/aacraid/linit.c          |    1 +
>  drivers/scsi/aha1740.c                |    1 +
>  drivers/scsi/aic7xxx/aic79xx_osm.c    |    1 +
>  drivers/scsi/aic7xxx/aic7xxx_osm.c    |    1 +
>  drivers/scsi/aic7xxx_old.c            |    1 +
>  drivers/scsi/arcmsr/arcmsr_hba.c      |    1 +
>  drivers/scsi/dc395x.c                 |    1 +
>  drivers/scsi/dpt_i2o.c                |    1 +
>  drivers/scsi/eata.c                   |    3 ++-
>  drivers/scsi/hosts.c                  |    1 +
>  drivers/scsi/hptiop.c                 |    1 +
>  drivers/scsi/ibmmca.c                 |    1 +
>  drivers/scsi/ibmvscsi/ibmvscsi.c      |    1 +
>  drivers/scsi/initio.c                 |    1 +
>  drivers/scsi/ipr.c                    |    1 +
>  drivers/scsi/lpfc/lpfc_scsi.c         |    2 ++
>  drivers/scsi/mac53c94.c               |    1 +
>  drivers/scsi/megaraid.c               |    1 +
>  drivers/scsi/megaraid/megaraid_mbox.c |    1 +
>  drivers/scsi/megaraid/megaraid_sas.c  |    1 +
>  drivers/scsi/mesh.c                   |    1 +
>  drivers/scsi/nsp32.c                  |    1 +
>  drivers/scsi/pcmcia/sym53c500_cs.c    |    1 +
>  drivers/scsi/qla2xxx/qla_os.c         |    2 ++
>  drivers/scsi/qla4xxx/ql4_os.c         |    1 +
>  drivers/scsi/qlogicfas.c              |    1 +
>  drivers/scsi/scsi_lib.c               |    5 ++++-
>  drivers/scsi/stex.c                   |    1 +
>  drivers/scsi/sym53c416.c              |    1 +
>  drivers/scsi/sym53c8xx_2/sym_glue.c   |    1 +
>  drivers/scsi/u14-34f.c                |    1 +
>  drivers/scsi/ultrastor.c              |    1 +
>  drivers/scsi/wd7000.c                 |    1 +
>  include/scsi/scsi_host.h              |   13 +++++++++++++
>  40 files changed, 59 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/ia64/hp/sim/simscsi.c b/arch/ia64/hp/sim/simscsi.c
> index 4552a1c..e711657 100644
> --- a/arch/ia64/hp/sim/simscsi.c
> +++ b/arch/ia64/hp/sim/simscsi.c
> @@ -360,6 +360,7 @@ static struct scsi_host_template driver_template = {
>  	.max_sectors		= 1024,
>  	.cmd_per_lun		= SIMSCSI_REQ_QUEUE_LEN,
>  	.use_clustering		= DISABLE_CLUSTERING,
> +	.use_sg_chaining	= ENABLE_SG_CHAINING,
>  };
>  
>  static int __init
> diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c
> index efd9d8d..fb14014 100644
> --- a/drivers/scsi/3w-9xxx.c
> +++ b/drivers/scsi/3w-9xxx.c
> @@ -1990,6 +1990,7 @@ static struct scsi_host_template driver_template = {
>  	.max_sectors		= TW_MAX_SECTORS,
>  	.cmd_per_lun		= TW_MAX_CMDS_PER_LUN,
>  	.use_clustering		= ENABLE_CLUSTERING,
> +	.use_sg_chaining	= ENABLE_SG_CHAINING,
>  	.shost_attrs		= twa_host_attrs,
>  	.emulated		= 1
>  };
> diff --git a/drivers/scsi/3w-xxxx.c b/drivers/scsi/3w-xxxx.c
> index c7995fc..a64153b 100644
> --- a/drivers/scsi/3w-xxxx.c
> +++ b/drivers/scsi/3w-xxxx.c
> @@ -2261,6 +2261,7 @@ static struct scsi_host_template driver_template = {
>  	.max_sectors		= TW_MAX_SECTORS,
>  	.cmd_per_lun		= TW_MAX_CMDS_PER_LUN,	
>  	.use_clustering		= ENABLE_CLUSTERING,
> +	.use_sg_chaining	= ENABLE_SG_CHAINING,
>  	.shost_attrs		= tw_host_attrs,
>  	.emulated		= 1
>  };
> diff --git a/drivers/scsi/BusLogic.c b/drivers/scsi/BusLogic.c
> index 9b20617..49e1ffa 100644
> --- a/drivers/scsi/BusLogic.c
> +++ b/drivers/scsi/BusLogic.c
> @@ -3575,6 +3575,7 @@ static struct scsi_host_template Bus_Logic_template = {
>  	.unchecked_isa_dma = 1,
>  	.max_sectors = 128,
>  	.use_clustering = ENABLE_CLUSTERING,
> +	.use_sg_chaining = ENABLE_SG_CHAINING,
>  };
>  
>  /*
> diff --git a/drivers/scsi/NCR53c406a.c b/drivers/scsi/NCR53c406a.c
> index eda8c48..3168a17 100644
> --- a/drivers/scsi/NCR53c406a.c
> +++ b/drivers/scsi/NCR53c406a.c
> @@ -1066,7 +1066,8 @@ static struct scsi_host_template driver_template =
>       .sg_tablesize      	= 32			/*SG_ALL*/ /*SG_NONE*/, 
>       .cmd_per_lun       	= 1			/* commands per lun */, 
>       .unchecked_isa_dma 	= 1			/* unchecked_isa_dma */,
> -     .use_clustering    	= ENABLE_CLUSTERING                               
> +     .use_clustering    	= ENABLE_CLUSTERING,
> +     .use_sg_chaining           = ENABLE_SG_CHAINING,
>  };
>  
>  #include "scsi_module.c"
> diff --git a/drivers/scsi/a100u2w.c b/drivers/scsi/a100u2w.c
> index f608d4a..d3a6d15 100644
> --- a/drivers/scsi/a100u2w.c
> +++ b/drivers/scsi/a100u2w.c
> @@ -1071,6 +1071,7 @@ static struct scsi_host_template inia100_template = {
>  	.sg_tablesize		= SG_ALL,
>  	.cmd_per_lun 		= 1,
>  	.use_clustering		= ENABLE_CLUSTERING,
> +	.use_sg_chaining	= ENABLE_SG_CHAINING,
>  };
>  
>  static int __devinit inia100_probe_one(struct pci_dev *pdev,
> diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c
> index a7f42a1..038980b 100644
> --- a/drivers/scsi/aacraid/linit.c
> +++ b/drivers/scsi/aacraid/linit.c
> @@ -944,6 +944,7 @@ static struct scsi_host_template aac_driver_template = {
>  	.cmd_per_lun    		= AAC_NUM_IO_FIB, 
>  #endif	
>  	.use_clustering			= ENABLE_CLUSTERING,
> +	.use_sg_chaining		= ENABLE_SG_CHAINING,
>  	.emulated                       = 1,
>  };
>  
> diff --git a/drivers/scsi/aha1740.c b/drivers/scsi/aha1740.c
> index e4a4f3a..f6722fd 100644
> --- a/drivers/scsi/aha1740.c
> +++ b/drivers/scsi/aha1740.c
> @@ -563,6 +563,7 @@ static struct scsi_host_template aha1740_template = {
>  	.sg_tablesize     = AHA1740_SCATTER,
>  	.cmd_per_lun      = AHA1740_CMDLUN,
>  	.use_clustering   = ENABLE_CLUSTERING,
> +	.use_sg_chaining  = ENABLE_SG_CHAINING,
>  	.eh_abort_handler = aha1740_eh_abort_handler,
>  };
>  
> diff --git a/drivers/scsi/aic7xxx/aic79xx_osm.c b/drivers/scsi/aic7xxx/aic79xx_osm.c
> index a055a96..42c0f14 100644
> --- a/drivers/scsi/aic7xxx/aic79xx_osm.c
> +++ b/drivers/scsi/aic7xxx/aic79xx_osm.c
> @@ -766,6 +766,7 @@ struct scsi_host_template aic79xx_driver_template = {
>  	.max_sectors		= 8192,
>  	.cmd_per_lun		= 2,
>  	.use_clustering		= ENABLE_CLUSTERING,
> +	.use_sg_chaining	= ENABLE_SG_CHAINING,
>  	.slave_alloc		= ahd_linux_slave_alloc,
>  	.slave_configure	= ahd_linux_slave_configure,
>  	.target_alloc		= ahd_linux_target_alloc,
> diff --git a/drivers/scsi/aic7xxx/aic7xxx_osm.c b/drivers/scsi/aic7xxx/aic7xxx_osm.c
> index 2e9c38f..7770bef 100644
> --- a/drivers/scsi/aic7xxx/aic7xxx_osm.c
> +++ b/drivers/scsi/aic7xxx/aic7xxx_osm.c
> @@ -747,6 +747,7 @@ struct scsi_host_template aic7xxx_driver_template = {
>  	.max_sectors		= 8192,
>  	.cmd_per_lun		= 2,
>  	.use_clustering		= ENABLE_CLUSTERING,
> +	.use_sg_chaining	= ENABLE_SG_CHAINING,
>  	.slave_alloc		= ahc_linux_slave_alloc,
>  	.slave_configure	= ahc_linux_slave_configure,
>  	.target_alloc		= ahc_linux_target_alloc,
> diff --git a/drivers/scsi/aic7xxx_old.c b/drivers/scsi/aic7xxx_old.c
> index 1a71b02..4025608 100644
> --- a/drivers/scsi/aic7xxx_old.c
> +++ b/drivers/scsi/aic7xxx_old.c
> @@ -11142,6 +11142,7 @@ static struct scsi_host_template driver_template = {
>  	.max_sectors		= 2048,
>  	.cmd_per_lun		= 3,
>  	.use_clustering		= ENABLE_CLUSTERING,
> +	.use_sg_chaining	= ENABLE_SG_CHAINING,
>  };
>  
>  #include "scsi_module.c"
> diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
> index 0ddfc21..d5039f3 100644
> --- a/drivers/scsi/arcmsr/arcmsr_hba.c
> +++ b/drivers/scsi/arcmsr/arcmsr_hba.c
> @@ -121,6 +121,7 @@ static struct scsi_host_template arcmsr_scsi_host_template = {
>  	.max_sectors    	= ARCMSR_MAX_XFER_SECTORS,
>  	.cmd_per_lun		= ARCMSR_MAX_CMD_PERLUN,
>  	.use_clustering		= ENABLE_CLUSTERING,
> +	.use_sg_chaining	= ENABLE_SG_CHAINING,
>  	.shost_attrs		= arcmsr_host_attrs,
>  };
>  static struct pci_error_handlers arcmsr_pci_error_handlers = {
> diff --git a/drivers/scsi/dc395x.c b/drivers/scsi/dc395x.c
> index 7b8a345..d2a2026 100644
> --- a/drivers/scsi/dc395x.c
> +++ b/drivers/scsi/dc395x.c
> @@ -4765,6 +4765,7 @@ static struct scsi_host_template dc395x_driver_template = {
>  	.eh_bus_reset_handler   = dc395x_eh_bus_reset,
>  	.unchecked_isa_dma      = 0,
>  	.use_clustering         = DISABLE_CLUSTERING,
> +	.use_sg_chaining	= ENABLE_SG_CHAINING,
>  };
>  
>  
> diff --git a/drivers/scsi/dpt_i2o.c b/drivers/scsi/dpt_i2o.c
> index bea9d65..8258506 100644
> --- a/drivers/scsi/dpt_i2o.c
> +++ b/drivers/scsi/dpt_i2o.c
> @@ -3295,6 +3295,7 @@ static struct scsi_host_template adpt_template = {
>  	.this_id		= 7,
>  	.cmd_per_lun		= 1,
>  	.use_clustering		= ENABLE_CLUSTERING,
> +	.use_sg_chaining	= ENABLE_SG_CHAINING,
>  };
>  
>  static s32 adpt_scsi_register(adpt_hba* pHba)
> diff --git a/drivers/scsi/eata.c b/drivers/scsi/eata.c
> index a83e9f1..2f685cf 100644
> --- a/drivers/scsi/eata.c
> +++ b/drivers/scsi/eata.c
> @@ -523,7 +523,8 @@ static struct scsi_host_template driver_template = {
>  	.slave_configure = eata2x_slave_configure,
>  	.this_id = 7,
>  	.unchecked_isa_dma = 1,
> -	.use_clustering = ENABLE_CLUSTERING
> +	.use_clustering = ENABLE_CLUSTERING,
> +	.use_sg_chaining = ENABLE_SG_CHAINING,
>  };
>  
>  #if !defined(__BIG_ENDIAN_BITFIELD) && !defined(__LITTLE_ENDIAN_BITFIELD)
> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
> index 96bc312..8c42539 100644
> --- a/drivers/scsi/hosts.c
> +++ b/drivers/scsi/hosts.c
> @@ -342,6 +342,7 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
>  	shost->unchecked_isa_dma = sht->unchecked_isa_dma;
>  	shost->use_clustering = sht->use_clustering;
>  	shost->ordered_tag = sht->ordered_tag;
> +	shost->use_sg_chaining = sht->use_sg_chaining;
>  
>  	if (sht->max_host_blocked)
>  		shost->max_host_blocked = sht->max_host_blocked;
> diff --git a/drivers/scsi/hptiop.c b/drivers/scsi/hptiop.c
> index 8b384fa..8515054 100644
> --- a/drivers/scsi/hptiop.c
> +++ b/drivers/scsi/hptiop.c
> @@ -655,6 +655,7 @@ static struct scsi_host_template driver_template = {
>  	.unchecked_isa_dma          = 0,
>  	.emulated                   = 0,
>  	.use_clustering             = ENABLE_CLUSTERING,
> +	.use_sg_chaining            = ENABLE_SG_CHAINING,
>  	.proc_name                  = driver_name,
>  	.shost_attrs                = hptiop_attrs,
>  	.this_id                    = -1,
> diff --git a/drivers/scsi/ibmmca.c b/drivers/scsi/ibmmca.c
> index bff8252..695941a 100644
> --- a/drivers/scsi/ibmmca.c
> +++ b/drivers/scsi/ibmmca.c
> @@ -1501,6 +1501,7 @@ static struct scsi_host_template ibmmca_driver_template = {
>            .sg_tablesize   = 16,
>            .cmd_per_lun    = 1,
>            .use_clustering = ENABLE_CLUSTERING,
> +          .use_sg_chaining = ENABLE_SG_CHAINING,
>  };
>  
>  static int ibmmca_probe(struct device *dev)
> diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
> index 93bd01b..084488c 100644
> --- a/drivers/scsi/ibmvscsi/ibmvscsi.c
> +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
> @@ -1545,6 +1545,7 @@ static struct scsi_host_template driver_template = {
>  	.this_id = -1,
>  	.sg_tablesize = SG_ALL,
>  	.use_clustering = ENABLE_CLUSTERING,
> +	.use_sg_chaining = ENABLE_SG_CHAINING,
>  	.shost_attrs = ibmvscsi_attrs,
>  };
>  
> diff --git a/drivers/scsi/initio.c b/drivers/scsi/initio.c
> index d9dfb69..22d40fd 100644
> --- a/drivers/scsi/initio.c
> +++ b/drivers/scsi/initio.c
> @@ -2831,6 +2831,7 @@ static struct scsi_host_template initio_template = {
>  	.sg_tablesize		= SG_ALL,
>  	.cmd_per_lun		= 1,
>  	.use_clustering		= ENABLE_CLUSTERING,
> +	.use_sg_chaining	= ENABLE_SG_CHAINING,
>  };
>  
>  static int initio_probe_one(struct pci_dev *pdev,
> diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c
> index b41dfb5..ba7b567 100644
> --- a/drivers/scsi/ipr.c
> +++ b/drivers/scsi/ipr.c
> @@ -4949,6 +4949,7 @@ static struct scsi_host_template driver_template = {
>  	.max_sectors = IPR_IOA_MAX_SECTORS,
>  	.cmd_per_lun = IPR_MAX_CMD_PER_LUN,
>  	.use_clustering = ENABLE_CLUSTERING,
> +	.use_sg_chaining = ENABLE_SG_CHAINING,
>  	.shost_attrs = ipr_ioa_attrs,
>  	.sdev_attrs = ipr_dev_attrs,
>  	.proc_name = IPR_NAME
> diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
> index cd67493..c075556 100644
> --- a/drivers/scsi/lpfc/lpfc_scsi.c
> +++ b/drivers/scsi/lpfc/lpfc_scsi.c
> @@ -1438,6 +1438,7 @@ struct scsi_host_template lpfc_template = {
>  	.scan_finished		= lpfc_scan_finished,
>  	.this_id		= -1,
>  	.sg_tablesize		= LPFC_SG_SEG_CNT,
> +	.use_sg_chaining	= ENABLE_SG_CHAINING,
>  	.cmd_per_lun		= LPFC_CMD_PER_LUN,
>  	.use_clustering		= ENABLE_CLUSTERING,
>  	.shost_attrs		= lpfc_hba_attrs,
> @@ -1460,6 +1461,7 @@ struct scsi_host_template lpfc_vport_template = {
>  	.sg_tablesize		= LPFC_SG_SEG_CNT,
>  	.cmd_per_lun		= LPFC_CMD_PER_LUN,
>  	.use_clustering		= ENABLE_CLUSTERING,
> +	.use_sg_chaining	= ENABLE_SG_CHAINING,
>  	.shost_attrs		= lpfc_vport_attrs,
>  	.max_sectors		= 0xFFFF,
>  };
> diff --git a/drivers/scsi/mac53c94.c b/drivers/scsi/mac53c94.c
> index b12ad7c..a035001 100644
> --- a/drivers/scsi/mac53c94.c
> +++ b/drivers/scsi/mac53c94.c
> @@ -402,6 +402,7 @@ static struct scsi_host_template mac53c94_template = {
>  	.sg_tablesize	= SG_ALL,
>  	.cmd_per_lun	= 1,
>  	.use_clustering	= DISABLE_CLUSTERING,
> +	.use_sg_chaining = ENABLE_SG_CHAINING,
>  };
>  
>  static int mac53c94_probe(struct macio_dev *mdev, const struct of_device_id *match)
> diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
> index 9023ec6..a0133b5 100644
> --- a/drivers/scsi/megaraid.c
> +++ b/drivers/scsi/megaraid.c
> @@ -4484,6 +4484,7 @@ static struct scsi_host_template megaraid_template = {
>  	.sg_tablesize			= MAX_SGLIST,
>  	.cmd_per_lun			= DEF_CMD_PER_LUN,
>  	.use_clustering			= ENABLE_CLUSTERING,
> +	.use_sg_chaining		= ENABLE_SG_CHAINING,
>  	.eh_abort_handler		= megaraid_abort,
>  	.eh_device_reset_handler	= megaraid_reset,
>  	.eh_bus_reset_handler		= megaraid_reset,
> diff --git a/drivers/scsi/megaraid/megaraid_mbox.c b/drivers/scsi/megaraid/megaraid_mbox.c
> index c6a53dc..e4e4c6a 100644
> --- a/drivers/scsi/megaraid/megaraid_mbox.c
> +++ b/drivers/scsi/megaraid/megaraid_mbox.c
> @@ -361,6 +361,7 @@ static struct scsi_host_template megaraid_template_g = {
>  	.eh_host_reset_handler		= megaraid_reset_handler,
>  	.change_queue_depth		= megaraid_change_queue_depth,
>  	.use_clustering			= ENABLE_CLUSTERING,
> +	.use_sg_chaining		= ENABLE_SG_CHAINING,
>  	.sdev_attrs			= megaraid_sdev_attrs,
>  	.shost_attrs			= megaraid_shost_attrs,
>  };
> diff --git a/drivers/scsi/megaraid/megaraid_sas.c b/drivers/scsi/megaraid/megaraid_sas.c
> index ebb948c..e3c5c52 100644
> --- a/drivers/scsi/megaraid/megaraid_sas.c
> +++ b/drivers/scsi/megaraid/megaraid_sas.c
> @@ -1110,6 +1110,7 @@ static struct scsi_host_template megasas_template = {
>  	.eh_timed_out = megasas_reset_timer,
>  	.bios_param = megasas_bios_param,
>  	.use_clustering = ENABLE_CLUSTERING,
> +	.use_sg_chaining = ENABLE_SG_CHAINING,
>  };
>  
>  /**
> diff --git a/drivers/scsi/mesh.c b/drivers/scsi/mesh.c
> index 651d09b..7470ff3 100644
> --- a/drivers/scsi/mesh.c
> +++ b/drivers/scsi/mesh.c
> @@ -1843,6 +1843,7 @@ static struct scsi_host_template mesh_template = {
>  	.sg_tablesize			= SG_ALL,
>  	.cmd_per_lun			= 2,
>  	.use_clustering			= DISABLE_CLUSTERING,
> +	.use_sg_chaining		= ENABLE_SG_CHAINING,
>  };
>  
>  static int mesh_probe(struct macio_dev *mdev, const struct of_device_id *match)
> diff --git a/drivers/scsi/nsp32.c b/drivers/scsi/nsp32.c
> index 4215f3b..6da1504 100644
> --- a/drivers/scsi/nsp32.c
> +++ b/drivers/scsi/nsp32.c
> @@ -281,6 +281,7 @@ static struct scsi_host_template nsp32_template = {
>  	.cmd_per_lun			= 1,
>  	.this_id			= NSP32_HOST_SCSIID,
>  	.use_clustering			= DISABLE_CLUSTERING,
> +	.use_sg_chaining		= ENABLE_SG_CHAINING,
>  	.eh_abort_handler       	= nsp32_eh_abort,
>  	.eh_bus_reset_handler		= nsp32_eh_bus_reset,
>  	.eh_host_reset_handler		= nsp32_eh_host_reset,
> diff --git a/drivers/scsi/pcmcia/sym53c500_cs.c b/drivers/scsi/pcmcia/sym53c500_cs.c
> index 961839e..190e2a7 100644
> --- a/drivers/scsi/pcmcia/sym53c500_cs.c
> +++ b/drivers/scsi/pcmcia/sym53c500_cs.c
> @@ -694,6 +694,7 @@ static struct scsi_host_template sym53c500_driver_template = {
>       .sg_tablesize		= 32,
>       .cmd_per_lun		= 1,
>       .use_clustering		= ENABLE_CLUSTERING,
> +     .use_sg_chaining		= ENABLE_SG_CHAINING,
>       .shost_attrs		= SYM53C500_shost_attrs
>  };
>  
> diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
> index acca898..3abbbc0 100644
> --- a/drivers/scsi/qla2xxx/qla_os.c
> +++ b/drivers/scsi/qla2xxx/qla_os.c
> @@ -132,6 +132,7 @@ struct scsi_host_template qla2x00_driver_template = {
>  	.this_id		= -1,
>  	.cmd_per_lun		= 3,
>  	.use_clustering		= ENABLE_CLUSTERING,
> +	.use_sg_chaining	= ENABLE_SG_CHAINING,
>  	.sg_tablesize		= SG_ALL,
>  
>  	/*
> @@ -163,6 +164,7 @@ struct scsi_host_template qla24xx_driver_template = {
>  	.this_id		= -1,
>  	.cmd_per_lun		= 3,
>  	.use_clustering		= ENABLE_CLUSTERING,
> +	.use_sg_chaining	= ENABLE_SG_CHAINING,
>  	.sg_tablesize		= SG_ALL,
>  
>  	.max_sectors		= 0xFFFF,
> diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c
> index 8fa5aea..89460d2 100644
> --- a/drivers/scsi/qla4xxx/ql4_os.c
> +++ b/drivers/scsi/qla4xxx/ql4_os.c
> @@ -94,6 +94,7 @@ static struct scsi_host_template qla4xxx_driver_template = {
>  	.this_id		= -1,
>  	.cmd_per_lun		= 3,
>  	.use_clustering		= ENABLE_CLUSTERING,
> +	.use_sg_chaining	= ENABLE_SG_CHAINING,
>  	.sg_tablesize		= SG_ALL,
>  
>  	.max_sectors		= 0xFFFF,
> diff --git a/drivers/scsi/qlogicfas.c b/drivers/scsi/qlogicfas.c
> index 94baca8..2268ca1 100644
> --- a/drivers/scsi/qlogicfas.c
> +++ b/drivers/scsi/qlogicfas.c
> @@ -197,6 +197,7 @@ static struct scsi_host_template qlogicfas_driver_template = {
>  	.sg_tablesize		= SG_ALL,
>  	.cmd_per_lun		= 1,
>  	.use_clustering		= DISABLE_CLUSTERING,
> +	.use_sg_chaining	= ENABLE_SG_CHAINING,
>  };
>  
>  static __init int qlogicfas_init(void)
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index d0a1028..38eec00 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1685,7 +1685,10 @@ struct request_queue *__scsi_alloc_queue(struct Scsi_Host *shost,
>  	 * converted, so better keep it safe.
>  	 */
>  #ifdef ARCH_HAS_SG_CHAIN
> -	blk_queue_max_phys_segments(q, SCSI_MAX_SG_CHAIN_SEGMENTS);
> +	if (shost->use_sg_chaining)
> +		blk_queue_max_phys_segments(q, SCSI_MAX_SG_CHAIN_SEGMENTS);
> +	else
> +		blk_queue_max_phys_segments(q, SCSI_MAX_SG_SEGMENTS);
>  #else
>  	blk_queue_max_phys_segments(q, SCSI_MAX_SG_SEGMENTS);
>  #endif
> diff --git a/drivers/scsi/stex.c b/drivers/scsi/stex.c
> index 72f6d80..e3fab3a 100644
> --- a/drivers/scsi/stex.c
> +++ b/drivers/scsi/stex.c
> @@ -1123,6 +1123,7 @@ static struct scsi_host_template driver_template = {
>  	.this_id			= -1,
>  	.sg_tablesize			= ST_MAX_SG,
>  	.cmd_per_lun			= ST_CMD_PER_LUN,
> +	.use_sg_chaining		= ENABLE_SG_CHAINING,
>  };
>  
>  static int stex_set_dma_mask(struct pci_dev * pdev)
> diff --git a/drivers/scsi/sym53c416.c b/drivers/scsi/sym53c416.c
> index 92bfaea..8befab7 100644
> --- a/drivers/scsi/sym53c416.c
> +++ b/drivers/scsi/sym53c416.c
> @@ -854,5 +854,6 @@ static struct scsi_host_template driver_template = {
>  	.cmd_per_lun =		1,
>  	.unchecked_isa_dma =	1,
>  	.use_clustering =	ENABLE_CLUSTERING,
> +	.use_sg_chaining =	ENABLE_SG_CHAINING,
>  };
>  #include "scsi_module.c"
> diff --git a/drivers/scsi/sym53c8xx_2/sym_glue.c b/drivers/scsi/sym53c8xx_2/sym_glue.c
> index 764490e..7576c99 100644
> --- a/drivers/scsi/sym53c8xx_2/sym_glue.c
> +++ b/drivers/scsi/sym53c8xx_2/sym_glue.c
> @@ -1827,6 +1827,7 @@ static struct scsi_host_template sym2_template = {
>  	.eh_host_reset_handler	= sym53c8xx_eh_host_reset_handler,
>  	.this_id		= 7,
>  	.use_clustering		= ENABLE_CLUSTERING,
> +	.use_sg_chaining	= ENABLE_SG_CHAINING,
>  	.max_sectors		= 0xFFFF,
>  #ifdef SYM_LINUX_PROC_INFO_SUPPORT
>  	.proc_info		= sym53c8xx_proc_info,
> diff --git a/drivers/scsi/u14-34f.c b/drivers/scsi/u14-34f.c
> index 9e8232a..a0d9ef4 100644
> --- a/drivers/scsi/u14-34f.c
> +++ b/drivers/scsi/u14-34f.c
> @@ -451,6 +451,7 @@ static struct scsi_host_template driver_template = {
>                  .this_id                 = 7,
>                  .unchecked_isa_dma       = 1,
> -                .use_clustering          = ENABLE_CLUSTERING
> +                .use_clustering          = ENABLE_CLUSTERING,
> +                .use_sg_chaining         = ENABLE_SG_CHAINING,
>                  };
>  
>  #if !defined(__BIG_ENDIAN_BITFIELD) && !defined(__LITTLE_ENDIAN_BITFIELD)
> diff --git a/drivers/scsi/ultrastor.c b/drivers/scsi/ultrastor.c
> index c08235d..ea72bbe 100644
> --- a/drivers/scsi/ultrastor.c
> +++ b/drivers/scsi/ultrastor.c
> @@ -1197,5 +1197,6 @@ static struct scsi_host_template driver_template = {
>  	.cmd_per_lun       = ULTRASTOR_MAX_CMDS_PER_LUN,
>  	.unchecked_isa_dma = 1,
>  	.use_clustering    = ENABLE_CLUSTERING,
> +	.use_sg_chaining   = ENABLE_SG_CHAINING,
>  };
>  #include "scsi_module.c"
> diff --git a/drivers/scsi/wd7000.c b/drivers/scsi/wd7000.c
> index d6fd425..255c611 100644
> --- a/drivers/scsi/wd7000.c
> +++ b/drivers/scsi/wd7000.c
> @@ -1671,6 +1671,7 @@ static struct scsi_host_template driver_template = {
>  	.cmd_per_lun		= 1,
>  	.unchecked_isa_dma	= 1,
>  	.use_clustering		= ENABLE_CLUSTERING,
> +	.use_sg_chaining	= ENABLE_SG_CHAINING,
>  };
>  
>  #include "scsi_module.c"
> diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
> index 88f6871..3ee3805 100644
> --- a/include/scsi/scsi_host.h
> +++ b/include/scsi/scsi_host.h
> @@ -36,6 +36,9 @@ struct blk_queue_tags;
>  #define DISABLE_CLUSTERING 0
>  #define ENABLE_CLUSTERING 1
>  
> +#define DISABLE_SG_CHAINING 0
> +#define ENABLE_SG_CHAINING 1
> +
>  enum scsi_eh_timer_return {
>  	EH_NOT_HANDLED,
>  	EH_HANDLED,
> @@ -435,6 +438,15 @@ struct scsi_host_template {
>  	unsigned ordered_tag:1;
>  
>  	/*
> +	 * True if the low-level driver can support sg chaining. This
> +	 * will be removed eventually when all the drivers are
> +	 * converted to support sg chaining.
> +	 *
> +	 * Status: OBSOLETE
> +	 */
> +	unsigned use_sg_chaining:1;
> +
> +	/*
>  	 * Countdown for host blocking with no commands outstanding
>  	 */
>  	unsigned int max_host_blocked;
> @@ -577,6 +589,7 @@ struct Scsi_Host {
>  	unsigned unchecked_isa_dma:1;
>  	unsigned use_clustering:1;
>  	unsigned use_blk_tcq:1;
> +	unsigned use_sg_chaining:1;
>  
>  	/*
>  	 * Host has requested that no further requests come through for the
> -- 
> 1.5.2.4

-apw
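
For readers following the archive, the scsi_lib.c hunk in the patch quoted above boils down to one decision: a queue only gets the larger chained segment limit when the architecture supports chaining and the host has opted in. A minimal userspace sketch of that decision follows. The names prefixed with sketch_ and the two numeric limits are assumptions made for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative values only (assumed); the real limits live in the SCSI headers. */
enum {
    SKETCH_MAX_SG_SEGMENTS       = 128,  /* limit for a single, unchained sg table */
    SKETCH_MAX_SG_CHAIN_SEGMENTS = 2048  /* limit once sg tables may be chained */
};

/* Hypothetical stand-in for the one Scsi_Host bit that matters here. */
struct sketch_host {
    bool use_sg_chaining;
};

/*
 * Mirrors the shape of the quoted hunk: the chained limit is used only when
 * both the architecture (ARCH_HAS_SG_CHAIN in the patch, a plain flag here)
 * and the host agree. Everything else falls back to the unchained limit.
 */
static unsigned int sketch_max_phys_segments(const struct sketch_host *shost,
                                             bool arch_has_sg_chain)
{
    assert(shost != NULL);

    if (arch_has_sg_chain && shost->use_sg_chaining)
        return SKETCH_MAX_SG_CHAIN_SEGMENTS;

    return SKETCH_MAX_SG_SEGMENTS;
}
```

With this shape, a driver that has not set the flag keeps the old, smaller limit and so should never be handed a list long enough to need chaining.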


* Re: 2.6.23-rc4-mm1
  2007-09-14  8:10       ` 2.6.23-rc4-mm1 Andy Whitcroft
@ 2007-09-14 13:01         ` Torsten Kaiser
  2007-09-14 20:15           ` 2.6.23-rc4-mm1 Andrew Morton
  0 siblings, 1 reply; 20+ messages in thread
From: Torsten Kaiser @ 2007-09-14 13:01 UTC (permalink / raw)
  To: Andy Whitcroft
  Cc: FUJITA Tomonori, akpm, linux-kernel, mel, jens.axboe, linux-scsi,
	fujita.tomonori

On 9/14/07, Andy Whitcroft <apw@shadowen.org> wrote:
> On Tue, Sep 11, 2007 at 04:31:12AM +0900, FUJITA Tomonori wrote:
> [...]
> >
> > Even if we revert the qla1280 patch, scsi-ml still sends chaining sg
> > list. So it doesn't work.
> >
> > The following patch disables chaining sg list for qla1280. If the fix
> > that I've just sent doesn't work, please try this.
>
> Ok, the other patch _did_ work, but this got tested anyhow and it did
> _not_ fix things.
>

Sorry to confirm this. My RAID5 got destroyed a second time.
To summarize what worked / not worked / and seems to work for me:

First 2 tries with unpatched rc4-mm1: Both times one sata_sil24-drive got kicked
Then I switched back to rc3-mm1, 18 boots with that kernel worked.
Then I tried the patched rc4-mm1 and it worked too.
The next boot also worked, but the third time kicked a drive out again.
But as nobody reads logs, I did not notice that and kept using the
patched rc4-mm1.
The next 5 times the system worked normally with the two remaining drives.
The sixth boot kicked the second sata_sil24 drive. That I did notice...
After reassembling the RAID, I'm now back to the patched rc4-mm1 that
did boot correctly this time.
So the patch just makes it less likely to hit the bug. Instead of
failing 2 out of 2 times, it only failed 2 out of 8 times.
I compared the rc4-mm1 boot from a working case and the case where it
kicked the first drive. Nothing seems to stand out...

< == good rc4-mm1 boot
> == bad rc4-mm1 boot that kicked the drive

145c145
< CPU 0: aperture @ 4000000 size 32 MB
---
> CPU 0: aperture @ b7f0000000 size 32 MB
154c154
< Calibrating delay using timer specific routine.. 5203.23 BogoMIPS (lpj=26016160)
---
> Calibrating delay using timer specific routine.. 5203.22 BogoMIPS (lpj=26016138)
169c169
< APIC timer calibration result 12499998
---
> APIC timer calibration result 12499994
173c173
< Calibrating delay using timer specific routine.. 5222.40 BogoMIPS (lpj=26112010)
---
> Calibrating delay using timer specific routine.. 5200.01 BogoMIPS (lpj=26000052)
182c182
< Calibrating delay using timer specific routine.. 5222.73 BogoMIPS (lpj=26113694)
---
> Calibrating delay using timer specific routine.. 5200.01 BogoMIPS (lpj=26000081)
191c191
< Calibrating delay using timer specific routine.. 5223.07 BogoMIPS (lpj=26115369)
---
> Calibrating delay using timer specific routine.. 5200.03 BogoMIPS (lpj=26000164)
269d268
< Switched to high resolution mode on CPU 3
270a270
> Switched to high resolution mode on CPU 3
502,509c502,509
< raid6: int64x1   2634 MB/s
< raid6: int64x2   3244 MB/s
< raid6: int64x4   3405 MB/s
< raid6: int64x8   2614 MB/s
< raid6: sse2x1    3607 MB/s
< raid6: sse2x2    4834 MB/s
< raid6: sse2x4    4946 MB/s
< raid6: using algorithm sse2x4 (4946 MB/s)
---
> raid6: int64x1   2680 MB/s
> raid6: int64x2   3232 MB/s
> raid6: int64x4   3411 MB/s
> raid6: int64x8   2620 MB/s
> raid6: sse2x1    3606 MB/s
> raid6: sse2x2    4810 MB/s
> raid6: sse2x4    4910 MB/s
> raid6: using algorithm sse2x4 (4910 MB/s)
567c567
< md1: bitmap initialized from disk: read 10/10 pages, set 96 bits
---
> md1: bitmap initialized from disk: read 10/10 pages, set 104 bits
568a569,655
> ata1.00: exception Emask 0x20 SAct 0x1 SErr 0x0 action 0x2
> ata1.00: irq_stat 0x00020002, PCI master abort while fetching SGT
> ata1.00: cmd 61/08:00:09:d6:42/00:00:25:00:00/40 tag 0 cdb 0x0 data 4096 out
>          res 50/00:00:af:ea:42/00:00:25:00:00/e0 Emask 0x20 (host bus error)
> ata1.00: status: {DRDY }
> ata1: soft resetting link
> ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> ata1.00: configured for UDMA/100
> ata1: EH complete
> sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
> sd 0:0:0:0: [sda] Write Protect is off
> sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> ata1.00: exception Emask 0x20 SAct 0x1 SErr 0x0 action 0x2
> ata1.00: irq_stat 0x00020002, PCI master abort while fetching SGT
> ata1.00: cmd 61/08:00:09:d6:42/00:00:25:00:00/40 tag 0 cdb 0x0 data 4096 out
>          res 50/00:00:af:ea:42/00:00:25:00:00/e0 Emask 0x20 (host bus error)
> ata1.00: status: {DRDY }
> ata1: soft resetting link
> ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> ata1.00: configured for UDMA/100
> ata1: EH complete
> sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
> sd 0:0:0:0: [sda] Write Protect is off
> sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> ata1.00: exception Emask 0x20 SAct 0x1 SErr 0x0 action 0x2
> ata1.00: irq_stat 0x00020002, PCI master abort while fetching SGT
> ata1.00: cmd 61/08:00:09:d6:42/00:00:25:00:00/40 tag 0 cdb 0x0 data 4096 out
>          res 50/00:00:af:ea:42/00:00:25:00:00/e0 Emask 0x20 (host bus error)
> ata1.00: status: {DRDY }
> ata1: soft resetting link
> ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> ata1.00: configured for UDMA/100
> ata1: EH complete
> sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
> sd 0:0:0:0: [sda] Write Protect is off
> sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> ata1.00: exception Emask 0x20 SAct 0x1 SErr 0x0 action 0x2
> ata1.00: irq_stat 0x00020002, PCI master abort while fetching SGT
> ata1.00: cmd 61/08:00:09:d6:42/00:00:25:00:00/40 tag 0 cdb 0x0 data 4096 out
>          res 50/00:00:af:ea:42/00:00:25:00:00/e0 Emask 0x20 (host bus error)
> ata1.00: status: {DRDY }
> ata1: soft resetting link
> ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> ata1.00: configured for UDMA/100
> ata1: EH complete
> sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
> sd 0:0:0:0: [sda] Write Protect is off
> sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> ata1.00: exception Emask 0x20 SAct 0x1 SErr 0x0 action 0x2
> ata1.00: irq_stat 0x00020002, PCI master abort while fetching SGT
> ata1.00: cmd 61/08:00:09:d6:42/00:00:25:00:00/40 tag 0 cdb 0x0 data 4096 out
>          res 50/00:00:af:ea:42/00:00:25:00:00/e0 Emask 0x20 (host bus error)
> ata1.00: status: {DRDY }
> ata1: soft resetting link
> ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> ata1.00: configured for UDMA/100
> ata1: EH complete
> sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
> sd 0:0:0:0: [sda] Write Protect is off
> sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> ata1.00: exception Emask 0x20 SAct 0x1 SErr 0x0 action 0x2
> ata1.00: irq_stat 0x00020002, PCI master abort while fetching SGT
> ata1.00: cmd 61/08:00:09:d6:42/00:00:25:00:00/40 tag 0 cdb 0x0 data 4096 out
>          res 50/00:00:af:ea:42/00:00:25:00:00/e0 Emask 0x20 (host bus error)
> ata1.00: status: {DRDY }
> ata1: soft resetting link
> ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> ata1.00: configured for UDMA/100
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
> sd 0:0:0:0: [sda] Sense Key : Aborted Command [current] [descriptor]
> Descriptor sense data with sense descriptors (in hex):
>         72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
>         00 00 00 af
> sd 0:0:0:0: [sda] Add. Sense: No additional sense information
> end_request: I/O error, dev sda, sector 625137161
> ata1: EH complete
> sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
> sd 0:0:0:0: [sda] Write Protect is off
> sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> md: super_written gets error=-5, uptodate=0
> raid5: Disk failure on sda2, disabling device. Operation continuing on 2 devices
571a659,663
> RAID5 conf printout:
>  --- rd:3 wd:2
>  disk 0, o:0, dev:sda2
>  disk 1, o:1, dev:sdb2
>  disk 2, o:1, dev:sdc2
576a669,672
> RAID5 conf printout:
>  --- rd:3 wd:2
>  disk 1, o:1, dev:sdb2
>  disk 2, o:1, dev:sdc2

Another good boot also showed the aperture at a similar high address:
CPU 0: aperture @ b7f2000000 size 32 MB
And that good boot also showed the "correct" BogoMIPS:
Calibrating delay using timer specific routine.. 5205.43 BogoMIPS (lpj=26027183)
Calibrating delay using timer specific routine.. 5200.01 BogoMIPS (lpj=26000052)
Calibrating delay using timer specific routine.. 5200.01 BogoMIPS (lpj=26000082)
Calibrating delay using timer specific routine.. 5200.03 BogoMIPS (lpj=26000166)

Anything more I can provide to help debugging this?

Torsten


* Re: 2.6.23-rc4-mm1
  2007-09-14 13:01         ` 2.6.23-rc4-mm1 Torsten Kaiser
@ 2007-09-14 20:15           ` Andrew Morton
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2007-09-14 20:15 UTC (permalink / raw)
  To: Torsten Kaiser
  Cc: Andy Whitcroft, FUJITA Tomonori, linux-kernel, mel, jens.axboe,
	linux-scsi, fujita.tomonori, linux-ide

On Fri, 14 Sep 2007 15:01:03 +0200 "Torsten Kaiser" <just.for.lkml@googlemail.com> wrote:

> On 9/14/07, Andy Whitcroft <apw@shadowen.org> wrote:
> > On Tue, Sep 11, 2007 at 04:31:12AM +0900, FUJITA Tomonori wrote:
> > [...]
> > >
> > > Even if we revert the qla1280 patch, scsi-ml still sends chaining sg
> > > list. So it doesn't work.
> > >
> > > The following patch disables chaining sg list for qla1280. If the fix
> > > that I've just sent doesn't work, please try this.
> >
> > Ok, the other patch _did_ work, but this got tested anyhow and it did
> > _not_ fix things.
> >
> 
> Sorry to confirm this. My RAID5 got destroyed a second time.
> To summarize what worked / not worked / and seems to work for me:
> 
> First 2 tries with unpatched rc4-mm1: Both times one sata_sil24-drive got kicked
> Then I switched back to rc3-mm1, 18 boots with that kernel worked.
> Then I tried the patched rc4-mm1 and it worked too.
> The next boot also worked, but the third time kicked a drive out again.
> But as nobody reads logs, I did not notice that and kept using the
> patched rc4-mm1.
> The next 5 times the system worked normally with the two remaining drives.
> The sixth boot kicked the second sata_sil24 drive. That I did notice...
> After reassembling the RAID, I'm now back to the patched rc4-mm1 that
> did boot correctly this time.
> So the patch just makes it less likely to hit the bug. Instead of
> failing 2 out of 2 times, it only failed 2 out of 8 times.
> I compared the rc4-mm1 boot from a working case and the case where it
> kicked the first drive. Nothing seems to stand out...
> 
> < == good rc4-mm1 boot
> > == bad rc4-mm1 boot that kicked the drive
> 
> 145c145
> < CPU 0: aperture @ 4000000 size 32 MB
> ---
> > CPU 0: aperture @ b7f0000000 size 32 MB
> 154c154
> < Calibrating delay using timer specific routine.. 5203.23 BogoMIPS (lpj=26016160)
> ---
> > Calibrating delay using timer specific routine.. 5203.22 BogoMIPS (lpj=26016138)
> 169c169
> < APIC timer calibration result 12499998
> ---
> > APIC timer calibration result 12499994
> 173c173
> < Calibrating delay using timer specific routine.. 5222.40 BogoMIPS (lpj=26112010)
> ---
> > Calibrating delay using timer specific routine.. 5200.01 BogoMIPS (lpj=26000052)
> 182c182
> < Calibrating delay using timer specific routine.. 5222.73 BogoMIPS (lpj=26113694)
> ---
> > Calibrating delay using timer specific routine.. 5200.01 BogoMIPS (lpj=26000081)
> 191c191
> < Calibrating delay using timer specific routine.. 5223.07 BogoMIPS (lpj=26115369)
> ---
> > Calibrating delay using timer specific routine.. 5200.03 BogoMIPS (lpj=26000164)
> 269d268
> < Switched to high resolution mode on CPU 3
> 270a270
> > Switched to high resolution mode on CPU 3
> 502,509c502,509
> < raid6: int64x1   2634 MB/s
> < raid6: int64x2   3244 MB/s
> < raid6: int64x4   3405 MB/s
> < raid6: int64x8   2614 MB/s
> < raid6: sse2x1    3607 MB/s
> < raid6: sse2x2    4834 MB/s
> < raid6: sse2x4    4946 MB/s
> < raid6: using algorithm sse2x4 (4946 MB/s)
> ---
> > raid6: int64x1   2680 MB/s
> > raid6: int64x2   3232 MB/s
> > raid6: int64x4   3411 MB/s
> > raid6: int64x8   2620 MB/s
> > raid6: sse2x1    3606 MB/s
> > raid6: sse2x2    4810 MB/s
> > raid6: sse2x4    4910 MB/s
> > raid6: using algorithm sse2x4 (4910 MB/s)
> 567c567
> < md1: bitmap initialized from disk: read 10/10 pages, set 96 bits
> ---
> > md1: bitmap initialized from disk: read 10/10 pages, set 104 bits
> 568a569,655
> > ata1.00: exception Emask 0x20 SAct 0x1 SErr 0x0 action 0x2
> > ata1.00: irq_stat 0x00020002, PCI master abort while fetching SGT
> > ata1.00: cmd 61/08:00:09:d6:42/00:00:25:00:00/40 tag 0 cdb 0x0 data 4096 out
> >          res 50/00:00:af:ea:42/00:00:25:00:00/e0 Emask 0x20 (host bus error)
> > ata1.00: status: {DRDY }
> > ata1: soft resetting link
> > ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> > ata1.00: configured for UDMA/100
> > ata1: EH complete
> > sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
> > sd 0:0:0:0: [sda] Write Protect is off
> > sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> > sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> > ata1.00: exception Emask 0x20 SAct 0x1 SErr 0x0 action 0x2
> > ata1.00: irq_stat 0x00020002, PCI master abort while fetching SGT
> > ata1.00: cmd 61/08:00:09:d6:42/00:00:25:00:00/40 tag 0 cdb 0x0 data 4096 out
> >          res 50/00:00:af:ea:42/00:00:25:00:00/e0 Emask 0x20 (host bus error)
> > ata1.00: status: {DRDY }
> > ata1: soft resetting link
> > ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> > ata1.00: configured for UDMA/100
> > ata1: EH complete
> > sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
> > sd 0:0:0:0: [sda] Write Protect is off
> > sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> > sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> > ata1.00: exception Emask 0x20 SAct 0x1 SErr 0x0 action 0x2
> > ata1.00: irq_stat 0x00020002, PCI master abort while fetching SGT
> > ata1.00: cmd 61/08:00:09:d6:42/00:00:25:00:00/40 tag 0 cdb 0x0 data 4096 out
> >          res 50/00:00:af:ea:42/00:00:25:00:00/e0 Emask 0x20 (host bus error)
> > ata1.00: status: {DRDY }
> > ata1: soft resetting link
> > ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> > ata1.00: configured for UDMA/100
> > ata1: EH complete
> > sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
> > sd 0:0:0:0: [sda] Write Protect is off
> > sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> > sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> > ata1.00: exception Emask 0x20 SAct 0x1 SErr 0x0 action 0x2
> > ata1.00: irq_stat 0x00020002, PCI master abort while fetching SGT
> > ata1.00: cmd 61/08:00:09:d6:42/00:00:25:00:00/40 tag 0 cdb 0x0 data 4096 out
> >          res 50/00:00:af:ea:42/00:00:25:00:00/e0 Emask 0x20 (host bus error)
> > ata1.00: status: {DRDY }
> > ata1: soft resetting link
> > ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> > ata1.00: configured for UDMA/100
> > ata1: EH complete
> > sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
> > sd 0:0:0:0: [sda] Write Protect is off
> > sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> > sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> > ata1.00: exception Emask 0x20 SAct 0x1 SErr 0x0 action 0x2
> > ata1.00: irq_stat 0x00020002, PCI master abort while fetching SGT
> > ata1.00: cmd 61/08:00:09:d6:42/00:00:25:00:00/40 tag 0 cdb 0x0 data 4096 out
> >          res 50/00:00:af:ea:42/00:00:25:00:00/e0 Emask 0x20 (host bus error)
> > ata1.00: status: {DRDY }
> > ata1: soft resetting link
> > ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> > ata1.00: configured for UDMA/100
> > ata1: EH complete
> > sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
> > sd 0:0:0:0: [sda] Write Protect is off
> > sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> > sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> > ata1.00: exception Emask 0x20 SAct 0x1 SErr 0x0 action 0x2
> > ata1.00: irq_stat 0x00020002, PCI master abort while fetching SGT
> > ata1.00: cmd 61/08:00:09:d6:42/00:00:25:00:00/40 tag 0 cdb 0x0 data 4096 out
> >          res 50/00:00:af:ea:42/00:00:25:00:00/e0 Emask 0x20 (host bus error)
> > ata1.00: status: {DRDY }
> > ata1: soft resetting link
> > ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> > ata1.00: configured for UDMA/100
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
> > sd 0:0:0:0: [sda] Sense Key : Aborted Command [current] [descriptor]
> > Descriptor sense data with sense descriptors (in hex):
> >         72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
> >         00 00 00 af
> > sd 0:0:0:0: [sda] Add. Sense: No additional sense information
> > end_request: I/O error, dev sda, sector 625137161

So do we think it's a sata regression?

> > ata1: EH complete
> > sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
> > sd 0:0:0:0: [sda] Write Protect is off
> > sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> > sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> > md: super_written gets error=-5, uptodate=0
> > raid5: Disk failure on sda2, disabling device. Operation continuing on 2 devices
> 571a659,663
> > RAID5 conf printout:
> >  --- rd:3 wd:2
> >  disk 0, o:0, dev:sda2
> >  disk 1, o:1, dev:sdb2
> >  disk 2, o:1, dev:sdc2
> 576a669,672
> > RAID5 conf printout:
> >  --- rd:3 wd:2
> >  disk 1, o:1, dev:sdb2
> >  disk 2, o:1, dev:sdc2
> 
> Another good boot also showed the aperture at a similar high address:
> CPU 0: aperture @ b7f2000000 size 32 MB
> And that good boot also showed the "correct" BogoMIPS:
> Calibrating delay using timer specific routine.. 5205.43 BogoMIPS (lpj=26027183)
> Calibrating delay using timer specific routine.. 5200.01 BogoMIPS (lpj=26000052)
> Calibrating delay using timer specific routine.. 5200.01 BogoMIPS (lpj=26000082)
> Calibrating delay using timer specific routine.. 5200.03 BogoMIPS (lpj=26000166)
> 
> Anything more I can provide to help debugging this?
> 

Let's keep linux-ide cc'ed, please.


* Re: 2.6.23-rc4-mm1
  2007-09-10 19:10     ` 2.6.23-rc4-mm1 FUJITA Tomonori
  2007-09-13 17:34       ` 2.6.23-rc4-mm1 Andy Whitcroft
@ 2007-09-15  4:16       ` Paul Jackson
  2007-09-15 10:52         ` 2.6.23-rc4-mm1 FUJITA Tomonori
  1 sibling, 1 reply; 20+ messages in thread
From: Paul Jackson @ 2007-09-15  4:16 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: apw, akpm, linux-kernel, mel, jens.axboe, linux-scsi,
	fujita.tomonori

FUJITA Tomonori wrote:
> Can you try this patch (against 2.6.23-rc4-mm1)?
> 
> From 592bd2049cb3e6e1f1dde7cf631879f26ddffeaa Mon Sep 17 00:00:00 2001
> From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> Date: Mon, 10 Sep 2007 04:17:13 +0100
> Subject: [PATCH] qla1280: sg chaining fixes
> 
> Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> ---
>  drivers/scsi/qla1280.c |    5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)

This patch works for me.

I was getting the scsi errors reported earlier in
this thread, running 2.6.23-rc4-mm1 on one of our
big SGI Altix systems.

Applying this patch fixed it, so far as I can tell,
which is to say my system boots cleanly once again.

Thanks.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401


* Re: 2.6.23-rc4-mm1
  2007-09-15  4:16       ` 2.6.23-rc4-mm1 Paul Jackson
@ 2007-09-15 10:52         ` FUJITA Tomonori
  2007-09-17 13:28           ` 2.6.23-rc4-mm1 Jens Axboe
  0 siblings, 1 reply; 20+ messages in thread
From: FUJITA Tomonori @ 2007-09-15 10:52 UTC (permalink / raw)
  To: pj, jens.axboe
  Cc: tomof, apw, akpm, linux-kernel, mel, linux-scsi, fujita.tomonori

On Fri, 14 Sep 2007 21:16:35 -0700
Paul Jackson <pj@sgi.com> wrote:

> FUJITA Tomonori wrote:
> > Can you try this patch (against 2.6.23-rc4-mm1)?
> > 
> > From 592bd2049cb3e6e1f1dde7cf631879f26ddffeaa Mon Sep 17 00:00:00 2001
> > From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> > Date: Mon, 10 Sep 2007 04:17:13 +0100
> > Subject: [PATCH] qla1280: sg chaining fixes
> > 
> > Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> > ---
> >  drivers/scsi/qla1280.c |    5 ++++-
> >  1 files changed, 4 insertions(+), 1 deletions(-)
> 
> This patch works for me.
> 
> I was getting the scsi errors reported earlier in
> this thread, running 2.6.23-rc4-mm1 on one of our
> big SGI Altix systems.
> 
> Applying this patch fixed it, so far as I can tell,
> which is to say my system boots cleanly once again.

Thanks for testing!

Jens, we could enable the use_sg_chaining option for qla1280.


From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Subject: [PATCH] qla1280: enable use_sg_chaining option

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
 drivers/scsi/qla1280.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/qla1280.c b/drivers/scsi/qla1280.c
index 7c1eaec..83249af 100644
--- a/drivers/scsi/qla1280.c
+++ b/drivers/scsi/qla1280.c
@@ -4259,6 +4259,7 @@ static struct scsi_host_template qla1280_driver_template = {
 	.sg_tablesize		= SG_ALL,
 	.cmd_per_lun		= 1,
 	.use_clustering		= ENABLE_CLUSTERING,
+	.use_sg_chaining	= ENABLE_SG_CHAINING,
 };
 
 
-- 
1.5.2.4
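
For readers following the thread, here is a minimal userspace sketch of why a driver has to be converted before it sets use_sg_chaining. Once a list may be chained, the slot after the last data entry of one array can be a link to another array, so stepping through the list with plain pointer arithmetic no longer visits the right entries. The struct and function names below are hypothetical and far simpler than the kernel's scatterlist. This only illustrates the idea and is not the qla1280 change itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified entry; a real entry also carries page and DMA info. */
struct sketch_sg {
    unsigned int length;      /* bytes covered by this entry; unused on a link */
    struct sketch_sg *chain;  /* non-NULL marks a link entry to the next array */
};

/* Step to the next data entry, following a link entry if one is in the way. */
static struct sketch_sg *sketch_sg_next(struct sketch_sg *sg)
{
    sg++;
    if (sg->chain)
        sg = sg->chain;
    return sg;
}

/*
 * Sum the lengths of nents data entries starting at sgl. The walk only
 * advances while more entries are expected, so it never steps past the
 * final data entry.
 */
static unsigned int sketch_total_len(struct sketch_sg *sgl, unsigned int nents)
{
    unsigned int total = 0;
    struct sketch_sg *sg = sgl;

    for (unsigned int i = 0; i < nents; i++) {
        total += sg->length;
        if (i + 1 < nents)
            sg = sketch_sg_next(sg);
    }
    return total;
}
```

A loop that used sg++ directly would treat the link slot as data and then run off the end of the first array, which is the kind of mistake a chaining conversion has to remove.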



* Re: 2.6.23-rc4-mm1
  2007-09-15 10:52         ` 2.6.23-rc4-mm1 FUJITA Tomonori
@ 2007-09-17 13:28           ` Jens Axboe
  2007-09-17 14:32             ` 2.6.23-rc4-mm1 FUJITA Tomonori
  0 siblings, 1 reply; 20+ messages in thread
From: Jens Axboe @ 2007-09-17 13:28 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: pj, apw, akpm, linux-kernel, mel, linux-scsi, fujita.tomonori

On Sat, Sep 15 2007, FUJITA Tomonori wrote:
> On Fri, 14 Sep 2007 21:16:35 -0700
> Paul Jackson <pj@sgi.com> wrote:
> 
> > FUJITA Tomonori wrote:
> > > Can you try this patch (against 2.6.23-rc4-mm1)?
> > > 
> > > From 592bd2049cb3e6e1f1dde7cf631879f26ddffeaa Mon Sep 17 00:00:00 2001
> > > From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> > > Date: Mon, 10 Sep 2007 04:17:13 +0100
> > > Subject: [PATCH] qla1280: sg chaining fixes
> > > 
> > > Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> > > ---
> > >  drivers/scsi/qla1280.c |    5 ++++-
> > >  1 files changed, 4 insertions(+), 1 deletions(-)
> > 
> > This patch works for me.
> > 
> > I was getting the scsi errors reported earlier in
> > this thread, running 2.6.23-rc4-mm1 on one of our
> > big SGI Altix systems.
> > 
> > Applying this patch fixed it, so far as I can tell,
> > which is to say my system boots cleanly once again.
> 
> Thanks for testing!
> 
> Jens, we could enable use_sg_chaining option for qla1280.

Added, thanks!

-- 
Jens Axboe



* Re: 2.6.23-rc4-mm1
  2007-09-17 13:28           ` 2.6.23-rc4-mm1 Jens Axboe
@ 2007-09-17 14:32             ` FUJITA Tomonori
  2007-09-18 10:18               ` 2.6.23-rc4-mm1 Jens Axboe
  0 siblings, 1 reply; 20+ messages in thread
From: FUJITA Tomonori @ 2007-09-17 14:32 UTC (permalink / raw)
  To: jens.axboe
  Cc: tomof, pj, apw, akpm, linux-kernel, mel, linux-scsi,
	fujita.tomonori

On Mon, 17 Sep 2007 15:28:19 +0200
Jens Axboe <jens.axboe@oracle.com> wrote:

> On Sat, Sep 15 2007, FUJITA Tomonori wrote:
> > On Fri, 14 Sep 2007 21:16:35 -0700
> > Paul Jackson <pj@sgi.com> wrote:
> > 
> > > FUJITA Tomonori wrote:
> > > > Can you try this patch (against 2.6.23-rc4-mm1)?
> > > > 
> > > > From 592bd2049cb3e6e1f1dde7cf631879f26ddffeaa Mon Sep 17 00:00:00 2001
> > > > From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> > > > Date: Mon, 10 Sep 2007 04:17:13 +0100
> > > > Subject: [PATCH] qla1280: sg chaining fixes
> > > > 
> > > > Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> > > > ---
> > > >  drivers/scsi/qla1280.c |    5 ++++-
> > > >  1 files changed, 4 insertions(+), 1 deletions(-)
> > > 
> > > This patch works for me.
> > > 
> > > I was getting the scsi errors reported earlier in
> > > this thread, running 2.6.23-rc4-mm1 on one of our
> > > big SGI Altix systems.
> > > 
> > > Applying this patch fixed it, so far as I can tell,
> > > which is to say my system boots cleanly once again.
> > 
> > Thanks for testing!
> > 
> > Jens, we could enable use_sg_chaining option for qla1280.
> 
> Added, thanks!

Thanks.

BTW, please don't forget to integrate the following patches:


- revert sg segment size ifdefs

http://marc.info/?l=linux-scsi&m=118881264013097&w=2

- remove sglist_len

http://marc.info/?l=linux-scsi&m=118907920405100&w=2


* Re: 2.6.23-rc4-mm1
  2007-09-17 14:32             ` 2.6.23-rc4-mm1 FUJITA Tomonori
@ 2007-09-18 10:18               ` Jens Axboe
  2007-09-18 12:25                 ` 2.6.23-rc4-mm1 FUJITA Tomonori
  0 siblings, 1 reply; 20+ messages in thread
From: Jens Axboe @ 2007-09-18 10:18 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: pj, apw, akpm, linux-kernel, mel, linux-scsi, fujita.tomonori

On Mon, Sep 17 2007, FUJITA Tomonori wrote:
> On Mon, 17 Sep 2007 15:28:19 +0200
> Jens Axboe <jens.axboe@oracle.com> wrote:
> 
> > On Sat, Sep 15 2007, FUJITA Tomonori wrote:
> > > On Fri, 14 Sep 2007 21:16:35 -0700
> > > Paul Jackson <pj@sgi.com> wrote:
> > > 
> > > > FUJITA Tomonori wrote:
> > > > > Can you try this patch (against 2.6.23-rc4-mm1)?
> > > > > 
> > > > > From 592bd2049cb3e6e1f1dde7cf631879f26ddffeaa Mon Sep 17 00:00:00 2001
> > > > > From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> > > > > Date: Mon, 10 Sep 2007 04:17:13 +0100
> > > > > Subject: [PATCH] qla1280: sg chaining fixes
> > > > > 
> > > > > Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> > > > > ---
> > > > >  drivers/scsi/qla1280.c |    5 ++++-
> > > > >  1 files changed, 4 insertions(+), 1 deletions(-)
> > > > 
> > > > This patch works for me.
> > > > 
> > > > I was getting the scsi errors reported earlier in
> > > > this thread, running 2.6.23-rc4-mm1 on one of our
> > > > big SGI Altix systems.
> > > > 
> > > > Applying this patch fixed it, so far as I can tell,
> > > > which is to say my system boots cleanly once again.
> > > 
> > > Thanks for testing!
> > > 
> > > Jens, we could enable use_sg_chaining option for qla1280.
> > 
> > Added, thanks!
> 
> Thanks.
> 
> BTW, please don't forget to integrate the following patches:
> 
> 
> - revert sg segment size ifdefs
> 
> http://marc.info/?l=linux-scsi&m=118881264013097&w=2
> 
> - remove sglist_len
> 
> http://marc.info/?l=linux-scsi&m=118907920405100&w=2

Added, and I rebased the sglist-* branches to current again. So
everything should be fully up to date once more.

-- 
Jens Axboe



* Re: 2.6.23-rc4-mm1
  2007-09-18 10:18               ` 2.6.23-rc4-mm1 Jens Axboe
@ 2007-09-18 12:25                 ` FUJITA Tomonori
  2007-09-18 12:51                   ` 2.6.23-rc4-mm1 Jens Axboe
  0 siblings, 1 reply; 20+ messages in thread
From: FUJITA Tomonori @ 2007-09-18 12:25 UTC (permalink / raw)
  To: jens.axboe, michaelc
  Cc: tomof, pj, apw, akpm, linux-kernel, mel, linux-scsi,
	fujita.tomonori

On Tue, 18 Sep 2007 12:18:40 +0200
Jens Axboe <jens.axboe@oracle.com> wrote:

> On Mon, Sep 17 2007, FUJITA Tomonori wrote:
> > On Mon, 17 Sep 2007 15:28:19 +0200
> > Jens Axboe <jens.axboe@oracle.com> wrote:
> > 
> > > On Sat, Sep 15 2007, FUJITA Tomonori wrote:
> > > > On Fri, 14 Sep 2007 21:16:35 -0700
> > > > Paul Jackson <pj@sgi.com> wrote:
> > > > 
> > > > > FUJITA Tomonori wrote:
> > > > > > Can you try this patch (against 2.6.23-rc4-mm1)?
> > > > > > 
> > > > > > From 592bd2049cb3e6e1f1dde7cf631879f26ddffeaa Mon Sep 17 00:00:00 2001
> > > > > > From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> > > > > > Date: Mon, 10 Sep 2007 04:17:13 +0100
> > > > > > Subject: [PATCH] qla1280: sg chaining fixes
> > > > > > 
> > > > > > Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> > > > > > ---
> > > > > >  drivers/scsi/qla1280.c |    5 ++++-
> > > > > >  1 files changed, 4 insertions(+), 1 deletions(-)
> > > > > 
> > > > > This patch works for me.
> > > > > 
> > > > > I was getting the scsi errors reported earlier in
> > > > > this thread, running 2.6.23-rc4-mm1 on one of our
> > > > > big SGI Altix systems.
> > > > > 
> > > > > Applying this patch fixed it, so far as I can tell,
> > > > > which is to say my system boots cleanly once again.
> > > > 
> > > > Thanks for testing!
> > > > 
> > > > Jens, we could enable the use_sg_chaining option for qla1280.
> > > 
> > > Added, thanks!
> > 
> > Thanks.
> > 
> > BTW, please don't forget to integrate the following patches:
> > 
> > 
> > - revert sg segment size ifdefs
> > 
> > http://marc.info/?l=linux-scsi&m=118881264013097&w=2
> > 
> > - remove sglist_len
> > 
> > http://marc.info/?l=linux-scsi&m=118907920405100&w=2
> 
> Added, and I rebased the sglist-* branches to current again. So
> everything should be fully up to date once more.

Thanks, here are a few more things.

- please drop the iscsi patch since Mike has major changes to the iscsi
I/O path.

- ipr sg chaining needs to be disabled since libata is not ready.

- you can add Doug's ACK to the scsi_debug patch:

http://marc.info/?l=linux-scsi&m=118926325931801&w=2


* Re: 2.6.23-rc4-mm1
  2007-09-18 12:25                 ` 2.6.23-rc4-mm1 FUJITA Tomonori
@ 2007-09-18 12:51                   ` Jens Axboe
  0 siblings, 0 replies; 20+ messages in thread
From: Jens Axboe @ 2007-09-18 12:51 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: michaelc, pj, apw, akpm, linux-kernel, mel, linux-scsi,
	fujita.tomonori

On Tue, Sep 18 2007, FUJITA Tomonori wrote:
> On Tue, 18 Sep 2007 12:18:40 +0200
> Jens Axboe <jens.axboe@oracle.com> wrote:
> 
> > On Mon, Sep 17 2007, FUJITA Tomonori wrote:
> > > On Mon, 17 Sep 2007 15:28:19 +0200
> > > Jens Axboe <jens.axboe@oracle.com> wrote:
> > > 
> > > > On Sat, Sep 15 2007, FUJITA Tomonori wrote:
> > > > > On Fri, 14 Sep 2007 21:16:35 -0700
> > > > > Paul Jackson <pj@sgi.com> wrote:
> > > > > 
> > > > > > FUJITA Tomonori wrote:
> > > > > > > Can you try this patch (against 2.6.23-rc4-mm1)?
> > > > > > > 
> > > > > > > From 592bd2049cb3e6e1f1dde7cf631879f26ddffeaa Mon Sep 17 00:00:00 2001
> > > > > > > From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> > > > > > > Date: Mon, 10 Sep 2007 04:17:13 +0100
> > > > > > > Subject: [PATCH] qla1280: sg chaining fixes
> > > > > > > 
> > > > > > > Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> > > > > > > ---
> > > > > > >  drivers/scsi/qla1280.c |    5 ++++-
> > > > > > >  1 files changed, 4 insertions(+), 1 deletions(-)
> > > > > > 
> > > > > > This patch works for me.
> > > > > > 
> > > > > > I was getting the scsi errors reported earlier in
> > > > > > this thread, running 2.6.23-rc4-mm1 on one of our
> > > > > > big SGI Altix systems.
> > > > > > 
> > > > > > Applying this patch fixed it, so far as I can tell,
> > > > > > which is to say my system boots cleanly once again.
> > > > > 
> > > > > Thanks for testing!
> > > > > 
> > > > > Jens, we could enable the use_sg_chaining option for qla1280.
> > > > 
> > > > Added, thanks!
> > > 
> > > Thanks.
> > > 
> > > BTW, please don't forget to integrate the following patches:
> > > 
> > > 
> > > - revert sg segment size ifdefs
> > > 
> > > http://marc.info/?l=linux-scsi&m=118881264013097&w=2
> > > 
> > > - remove sglist_len
> > > 
> > > http://marc.info/?l=linux-scsi&m=118907920405100&w=2
> > 
> > Added, and I rebased the sglist-* branches to current again. So
> > everything should be fully up to date once more.
> 
> Thanks, here are a few more things.
> 
> - please drop the iscsi patch since Mike has major changes to the iscsi
> I/O path.
> 
> - ipr sg chaining needs to be disabled since libata is not ready.
> 
> - you can add Doug's ACK to the scsi_debug patch:
> 
> http://marc.info/?l=linux-scsi&m=118926325931801&w=2

All done.

-- 
Jens Axboe



Thread overview: 20+ messages
     [not found] <20070831215822.26e1432b.akpm@linux-foundation.org>
     [not found] ` <20070910174926.GC30335@shadowen.org>
2007-09-10 18:19   ` 2.6.23-rc4-mm1 Andrew Morton
2007-09-10 18:59     ` 2.6.23-rc4-mm1 Torsten Kaiser
2007-09-10 19:20       ` 2.6.23-rc4-mm1 Andrew Morton
2007-09-10 19:38         ` 2.6.23-rc4-mm1 Torsten Kaiser
2007-09-10 19:42         ` 2.6.23-rc4-mm1 FUJITA Tomonori
2007-09-10 20:43           ` 2.6.23-rc4-mm1 Torsten Kaiser
2007-09-11  8:32             ` 2.6.23-rc4-mm1 Jens Axboe
2007-09-10 19:10     ` 2.6.23-rc4-mm1 FUJITA Tomonori
2007-09-13 17:34       ` 2.6.23-rc4-mm1 Andy Whitcroft
2007-09-15  4:16       ` 2.6.23-rc4-mm1 Paul Jackson
2007-09-15 10:52         ` 2.6.23-rc4-mm1 FUJITA Tomonori
2007-09-17 13:28           ` 2.6.23-rc4-mm1 Jens Axboe
2007-09-17 14:32             ` 2.6.23-rc4-mm1 FUJITA Tomonori
2007-09-18 10:18               ` 2.6.23-rc4-mm1 Jens Axboe
2007-09-18 12:25                 ` 2.6.23-rc4-mm1 FUJITA Tomonori
2007-09-18 12:51                   ` 2.6.23-rc4-mm1 Jens Axboe
2007-09-10 19:31     ` 2.6.23-rc4-mm1 FUJITA Tomonori
2007-09-14  8:10       ` 2.6.23-rc4-mm1 Andy Whitcroft
2007-09-14 13:01         ` 2.6.23-rc4-mm1 Torsten Kaiser
2007-09-14 20:15           ` 2.6.23-rc4-mm1 Andrew Morton
