public inbox for linux-kernel@vger.kernel.org
* 2.6.23-rc9 boot failure (megaraid?)
@ 2007-10-02 16:48 Burton Windle
  2007-10-02 18:15 ` Adrian Bunk
  0 siblings, 1 reply; 16+ messages in thread
From: Burton Windle @ 2007-10-02 16:48 UTC
  To: linux-kernel

2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.

System is a Dell Poweredge with PERC 2/DC with RAID1 volume.

From 2.6.23-rc9:

Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
PIIX4: IDE controller at PCI slot 0000:00:07.1
eth1: Optical link UP (Full Duplex, Flow Control: )
PIIX4: chipset revision 1
PIIX4: not 100% native mode: will probe irqs later
     ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:DMA, hdd:pio
hdc: SAMSUNG SC-140B, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
hdc: ATAPI 40X CD-ROM drive, 128kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
ide-floppy driver 0.99.newide
ACPI: PCI Interrupt 0000:00:0d.1[A] -> GSI 17 (level, low) -> IRQ 18
megaraid: found 0x8086:0x1960:bus 0:slot 13:func 1
scsi0:Found MegaRAID controller at 0xf8812000, IRQ:18
megaraid: [1.06:1p00] detected 1 logical drives.
megaraid: channel[0] is raid.
megaraid: channel[1] is raid.
scsi0 : LSI Logic MegaRAID 1.06 254 commands 16 targs 5 chans 7 luns
scsi0: scanning scsi channel 0 for logical drives.
scsi 0:0:0:0: Direct-Access     MegaRAID LD0 RAID1  8568R 1.06 PQ: 0 ANSI: 2
scsi0: scanning scsi channel 4 [P0] for physical devices.
scsi0: scanning scsi channel 5 [P1] for physical devices.
st: Version 20070203, fixed bufsize 32768, s/g segs 256
sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Asking for cache data failed
sd 0:0:0:0: [sda] Assuming drive cache: write through
sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Asking for cache data failed
sd 0:0:0:0: [sda] Assuming drive cache: write through
  sda: sda1
  sda: p1 exceeds device capacity
sd 0:0:0:0: [sda] Attached SCSI disk
PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
input: PC Speaker as /class/input/input1
input: AT Translated Set 2 keyboard as /class/input/input2
i2c /dev entries driver
piix4_smbus 0000:00:07.3: Found 0000:00:07.3 device
piix4_smbus 0000:00:07.3: Host SMBus controller not enabled!
NET: Registered protocol family 26
TCP cubic registered
Initializing XFRM netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
NET: Registered protocol family 15
Starting balanced_irq
Using IPI Shortcut mode
attempt to access beyond end of device
sda: rw=0, want=67, limit=1
EXT3-fs: unable to read superblock
attempt to access beyond end of device
sda: rw=0, want=67, limit=1
EXT2-fs: unable to read superblock
attempt to access beyond end of device
sda: rw=0, want=129, limit=1
isofs_fill_super: bread failed, dev=sda1, iso_blknum=16, block=32
attempt to access beyond end of device
sda: rw=0, want=131, limit=1
attempt to access beyond end of device
sda: rw=0, want=17542979, limit=1
attempt to access beyond end of device
sda: rw=0, want=17541955, limit=1
attempt to access beyond end of device
sda: rw=0, want=17541731, limit=1
attempt to access beyond end of device
sda: rw=0, want=17542971, limit=1
attempt to access beyond end of device
sda: rw=0, want=17541947, limit=1
attempt to access beyond end of device
sda: rw=0, want=17541723, limit=1
attempt to access beyond end of device
sda: rw=0, want=17542379, limit=1
attempt to access beyond end of device
sda: rw=0, want=17541355, limit=1
attempt to access beyond end of device
sda: rw=0, want=17541131, limit=1
attempt to access beyond end of device
sda: rw=0, want=17542371, limit=1
attempt to access beyond end of device
sda: rw=0, want=17541347, limit=1
attempt to access beyond end of device
sda: rw=0, want=17541123, limit=1
attempt to access beyond end of device
sda: rw=0, want=14394267, limit=1
attempt to access beyond end of device
sda: rw=0, want=14393243, limit=1
attempt to access beyond end of device
sda: rw=0, want=14393019, limit=1
attempt to access beyond end of device
sda: rw=0, want=14394259, limit=1
attempt to access beyond end of device
sda: rw=0, want=14393235, limit=1
attempt to access beyond end of device
sda: rw=0, want=14393011, limit=1
attempt to access beyond end of device
sda: rw=0, want=14393667, limit=1
attempt to access beyond end of device
sda: rw=0, want=14392643, limit=1
attempt to access beyond end of device
sda: rw=0, want=14392419, limit=1
attempt to access beyond end of device
sda: rw=0, want=14393659, limit=1
attempt to access beyond end of device
sda: rw=0, want=14392635, limit=1
attempt to access beyond end of device
sda: rw=0, want=14392411, limit=1
attempt to access beyond end of device
sda: rw=0, want=1315, limit=1
attempt to access beyond end of device
sda: rw=0, want=1091, limit=1
UDF-fs: No partition found (1)
List of all partitions:
1600    4194302 hdc driver: ide-cdrom
0800          0 sda driver: sd
   0801    8771458 sda1
No filesystem could mount root, tried:  ext3 ext2 iso9660 udf
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,1)
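
The "want=67, limit=1" lines above can be reproduced with a little arithmetic. The following is only a sketch, not the kernel's block-layer code: it assumes sda1 starts at sector 63 (the partition offset is not stated in this report) and that the ext2/ext3 superblock read covers 1024 bytes at byte offset 1024 of the partition. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Model of the bounds check behind "attempt to access beyond end of
 * device": a request is rejected when its end sector exceeds the device
 * size in 512-byte sectors. *want receives the end sector, which is the
 * value printed as "want=" in the log.
 */
static int beyond_end_of_device(uint64_t first_sector, uint32_t nr_sectors,
				uint64_t limit, uint64_t *want)
{
	assert(want != NULL);
	*want = first_sector + nr_sectors;
	return *want > limit;
}

/*
 * Translate a byte offset inside a partition into an absolute 512-byte
 * sector on the whole disk. part_start is the partition's first sector
 * (assumed to be 63 for sda1 in the note above).
 */
static uint64_t absolute_sector(uint64_t part_start, uint64_t byte_offset)
{
	return part_start + byte_offset / 512;
}
```

Under those assumptions the superblock starts at absolute sector 65, a two-sector read ends at 67, and with the disk reporting a single sector (limit=1) the check fails, which would match the first EXT3-fs and EXT2-fs errors.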


From 2.6.22.9:

megaraid: found 0x8086:0x1960:bus 0:slot 13:func 1
scsi0:Found MegaRAID controller at 0xf8812000, IRQ:18
megaraid: [1.06:1p00] detected 1 logical drives.
megaraid: channel[0] is raid.
megaraid: channel[1] is raid.
scsi0 : LSI Logic MegaRAID 1.06 254 commands 16 targs 5 chans 7 luns
scsi0: scanning scsi channel 0 for logical drives.
scsi 0:0:0:0: Direct-Access     MegaRAID LD0 RAID1  8568R 1.06 PQ: 0 ANSI: 2
scsi0: scanning scsi channel 4 [P0] for physical devices.
scsi0: scanning scsi channel 5 [P1] for physical devices.
st: Version 20070203, fixed bufsize 32768, s/g segs 256
sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Asking for cache data failed
sd 0:0:0:0: [sda] Assuming drive cache: write through
sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Asking for cache data failed
sd 0:0:0:0: [sda] Assuming drive cache: write through
  sda: sda1
sd 0:0:0:0: [sda] Attached SCSI disk
PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
input: PC Speaker as /class/input/input1
input: AT Translated Set 2 keyboard as /class/input/input2
i2c /dev entries driver
piix4_smbus 0000:00:07.3: Found 0000:00:07.3 device
piix4_smbus 0000:00:07.3: Host SMBus controller not enabled!
NET: Registered protocol family 26
TCP cubic registered
Initializing XFRM netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
NET: Registered protocol family 15
Starting balanced_irq
Using IPI Shortcut mode
kjournald starting.  Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 260k freed
EXT3 FS on sda1, internal journal
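
The difference between the two logs ("1 512-byte hardware sectors" versus "17547264 512-byte hardware sectors") is what one would expect if the READ CAPACITY reply buffer came back all zeros on 2.6.23-rc9. That is an assumption, not something the logs prove. The sketch below decodes the 8-byte READ CAPACITY(10) reply (big-endian last LBA, then big-endian block length); decode_read_capacity10() is a hypothetical helper, not the sd driver's function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Result of decoding an 8-byte READ CAPACITY(10) reply. */
struct capacity {
	uint64_t sectors;      /* last LBA + 1 */
	uint32_t sector_size;  /* bytes per sector, after the fallback */
	int      size_assumed; /* 1 if the device reported 0 and 512 was assumed */
};

/* Read a 32-bit big-endian value, as used in SCSI parameter data. */
static uint32_t get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Bytes 0-3 hold the last logical block address, bytes 4-7 the block
 * length. A block length of 0 is replaced by 512, mirroring the
 * "Sector size 0 reported, assuming 512." message in the log.
 */
static struct capacity decode_read_capacity10(const uint8_t *buf)
{
	struct capacity c;

	assert(buf != NULL);
	c.sectors = (uint64_t)get_be32(buf) + 1;
	c.sector_size = get_be32(buf + 4);
	c.size_assumed = 0;
	if (c.sector_size == 0) {
		c.sector_size = 512;
		c.size_assumed = 1;
	}
	return c;
}
```

A zeroed buffer decodes to one 512-byte sector with the size assumed, as in the failing boot. A reply of 01 0B BF FF 00 00 02 00 decodes to 17547264 sectors of 512 bytes, as in the working one.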




00:0d.1 I2O: Intel Corporation 80960RP [i960RP Microprocessor] (rev 02) (prog-if 01)
         Subsystem: Dell PowerEdge Expandable RAID Controller 2/DC
         Flags: bus master, medium devsel, latency 64, IRQ 18
         Memory at f7000000 (32-bit, prefetchable) [size=4M]
         [virtual] Expansion ROM at 50000000 [disabled] [size=32K]
         Capabilities: <access denied>




CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
CONFIG_SCSI_NETLINK=y
CONFIG_SCSI_PROC_FS=y
CONFIG_SCSI_LOGGING=y
CONFIG_SCSI_FC_ATTRS=y
CONFIG_SCSI_LOWLEVEL=y
CONFIG_MEGARAID_LEGACY=y


-- 
Burton Windle                           bwindle@fint.org



* Re: 2.6.23-rc9 boot failure (megaraid?)
  2007-10-02 16:48 2.6.23-rc9 boot failure (megaraid?) Burton Windle
@ 2007-10-02 18:15 ` Adrian Bunk
  2007-10-02 18:46   ` Burton Windle
  2007-10-02 20:38   ` James Bottomley
  0 siblings, 2 replies; 16+ messages in thread
From: Adrian Bunk @ 2007-10-02 18:15 UTC
  To: Burton Windle
  Cc: linux-kernel, Jens Axboe, FUJITA Tomonori, Sumant Patro,
	James Bottomley, megaraidlinux, linux-scsi

Cc's added, the complete bug report is at
  http://lkml.org/lkml/2007/10/2/243

On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
> 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
>
> System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
>...

Thanks for your report.

Diff'ing the dmesg's shows:

<--  snip  -->

 scsi0: scanning scsi channel 4 [P0] for physical devices.
 scsi0: scanning scsi channel 5 [P1] for physical devices.
 st: Version 20070203, fixed bufsize 32768, s/g segs 256
-sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
+sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
+sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
 sd 0:0:0:0: [sda] Write Protect is off
 sd 0:0:0:0: [sda] Asking for cache data failed
 sd 0:0:0:0: [sda] Assuming drive cache: write through
-sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
+sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
+sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
 sd 0:0:0:0: [sda] Write Protect is off
 sd 0:0:0:0: [sda] Asking for cache data failed
 sd 0:0:0:0: [sda] Assuming drive cache: write through
  sda: sda1
+ sda: p1 exceeds device capacity

<--  snip  -->

Does reverting the commit below fix the problem?

cu
Adrian


commit 3f6270ef76f2ce5c134615a470685d6c2a66c07e
Author: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Date:   Mon May 14 20:17:27 2007 +0900

    [SCSI] megaraid_old: convert to use the data buffer accessors
    
    - remove the unnecessary map_single path.
    
    - convert to use the new accessors for the sg lists and the
    parameters.
    
    Jens Axboe <jens.axboe@oracle.com> did the for_each_sg cleanup.
    
    Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
    Acked-by: Sumant Patro <sumant.patro@lsi.com>
    Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>

diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
index 40ee07d..3907f67 100644
--- a/drivers/scsi/megaraid.c
+++ b/drivers/scsi/megaraid.c
@@ -523,10 +523,8 @@ mega_build_cmd(adapter_t *adapter, Scsi_Cmnd *cmd, int *busy)
 	/*
 	 * filter the internal and ioctl commands
 	 */
-	if((cmd->cmnd[0] == MEGA_INTERNAL_CMD)) {
-		return cmd->request_buffer;
-	}
-
+	if((cmd->cmnd[0] == MEGA_INTERNAL_CMD))
+		return (scb_t *)cmd->host_scribble;
 
 	/*
 	 * We know what channels our logical drives are on - mega_find_card()
@@ -657,22 +655,14 @@ mega_build_cmd(adapter_t *adapter, Scsi_Cmnd *cmd, int *busy)
 
 		case MODE_SENSE: {
 			char *buf;
+			struct scatterlist *sg;
 
-			if (cmd->use_sg) {
-				struct scatterlist *sg;
+			sg = scsi_sglist(cmd);
+			buf = kmap_atomic(sg->page, KM_IRQ0) + sg->offset;
 
-				sg = (struct scatterlist *)cmd->request_buffer;
-				buf = kmap_atomic(sg->page, KM_IRQ0) +
-					sg->offset;
-			} else
-				buf = cmd->request_buffer;
 			memset(buf, 0, cmd->cmnd[4]);
-			if (cmd->use_sg) {
-				struct scatterlist *sg;
+			kunmap_atomic(buf - sg->offset, KM_IRQ0);
 
-				sg = (struct scatterlist *)cmd->request_buffer;
-				kunmap_atomic(buf - sg->offset, KM_IRQ0);
-			}
 			cmd->result = (DID_OK << 16);
 			cmd->scsi_done(cmd);
 			return NULL;
@@ -1551,23 +1541,15 @@ mega_cmd_done(adapter_t *adapter, u8 completed[], int nstatus, int status)
 		islogical = adapter->logdrv_chan[cmd->device->channel];
 		if( cmd->cmnd[0] == INQUIRY && !islogical ) {
 
-			if( cmd->use_sg ) {
-				sgl = (struct scatterlist *)
-					cmd->request_buffer;
-
-				if( sgl->page ) {
-					c = *(unsigned char *)
+			sgl = scsi_sglist(cmd);
+			if( sgl->page ) {
+				c = *(unsigned char *)
 					page_address((&sgl[0])->page) +
 					(&sgl[0])->offset; 
-				}
-				else {
-					printk(KERN_WARNING
-						"megaraid: invalid sg.\n");
-					c = 0;
-				}
-			}
-			else {
-				c = *(u8 *)cmd->request_buffer;
+			} else {
+				printk(KERN_WARNING
+				       "megaraid: invalid sg.\n");
+				c = 0;
 			}
 
 			if(IS_RAID_CH(adapter, cmd->device->channel) &&
@@ -1704,30 +1686,14 @@ mega_rundoneq (adapter_t *adapter)
 static void
 mega_free_scb(adapter_t *adapter, scb_t *scb)
 {
-	unsigned long length;
-
 	switch( scb->dma_type ) {
 
 	case MEGA_DMA_TYPE_NONE:
 		break;
 
-	case MEGA_BULK_DATA:
-		if (scb->cmd->use_sg == 0)
-			length = scb->cmd->request_bufflen;
-		else {
-			struct scatterlist *sgl =
-				(struct scatterlist *)scb->cmd->request_buffer;
-			length = sgl->length;
-		}
-		pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
-			       length, scb->dma_direction);
-		break;
-
 	case MEGA_SGLIST:
-		pci_unmap_sg(adapter->dev, scb->cmd->request_buffer,
-			scb->cmd->use_sg, scb->dma_direction);
+		scsi_dma_unmap(scb->cmd);
 		break;
-
 	default:
 		break;
 	}
@@ -1767,80 +1733,33 @@ __mega_busywait_mbox (adapter_t *adapter)
 static int
 mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)
 {
-	struct scatterlist	*sgl;
-	struct page	*page;
-	unsigned long	offset;
-	unsigned int	length;
+	struct scatterlist *sg;
 	Scsi_Cmnd	*cmd;
 	int	sgcnt;
 	int	idx;
 
 	cmd = scb->cmd;
 
-	/* Scatter-gather not used */
-	if( cmd->use_sg == 0 || (cmd->use_sg == 1 && 
-				 !adapter->has_64bit_addr)) {
-
-		if (cmd->use_sg == 0) {
-			page = virt_to_page(cmd->request_buffer);
-			offset = offset_in_page(cmd->request_buffer);
-			length = cmd->request_bufflen;
-		} else {
-			sgl = (struct scatterlist *)cmd->request_buffer;
-			page = sgl->page;
-			offset = sgl->offset;
-			length = sgl->length;
-		}
-
-		scb->dma_h_bulkdata = pci_map_page(adapter->dev,
-						  page, offset,
-						  length,
-						  scb->dma_direction);
-		scb->dma_type = MEGA_BULK_DATA;
-
-		/*
-		 * We need to handle special 64-bit commands that need a
-		 * minimum of 1 SG
-		 */
-		if( adapter->has_64bit_addr ) {
-			scb->sgl64[0].address = scb->dma_h_bulkdata;
-			scb->sgl64[0].length = length;
-			*buf = (u32)scb->sgl_dma_addr;
-			*len = (u32)length;
-			return 1;
-		}
-		else {
-			*buf = (u32)scb->dma_h_bulkdata;
-			*len = (u32)length;
-		}
-		return 0;
-	}
-
-	sgl = (struct scatterlist *)cmd->request_buffer;
-
 	/*
 	 * Copy Scatter-Gather list info into controller structure.
 	 *
 	 * The number of sg elements returned must not exceed our limit
 	 */
-	sgcnt = pci_map_sg(adapter->dev, sgl, cmd->use_sg,
-			scb->dma_direction);
+	sgcnt = scsi_dma_map(cmd);
 
 	scb->dma_type = MEGA_SGLIST;
 
-	BUG_ON(sgcnt > adapter->sglen);
+	BUG_ON(sgcnt > adapter->sglen || sgcnt < 0);
 
 	*len = 0;
 
-	for( idx = 0; idx < sgcnt; idx++, sgl++ ) {
-
-		if( adapter->has_64bit_addr ) {
-			scb->sgl64[idx].address = sg_dma_address(sgl);
-			*len += scb->sgl64[idx].length = sg_dma_len(sgl);
-		}
-		else {
-			scb->sgl[idx].address = sg_dma_address(sgl);
-			*len += scb->sgl[idx].length = sg_dma_len(sgl);
+	scsi_for_each_sg(cmd, sg, sgcnt, idx) {
+		if (adapter->has_64bit_addr) {
+			scb->sgl64[idx].address = sg_dma_address(sg);
+			*len += scb->sgl64[idx].length = sg_dma_len(sg);
+		} else {
+			scb->sgl[idx].address = sg_dma_address(sg);
+			*len += scb->sgl[idx].length = sg_dma_len(sg);
 		}
 	}
 
@@ -4494,7 +4413,7 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
 	scmd->device = sdev;
 
 	scmd->device->host = adapter->host;
-	scmd->request_buffer = (void *)scb;
+	scmd->host_scribble = (void *)scb;
 	scmd->cmnd[0] = MEGA_INTERNAL_CMD;
 
 	scb->state |= SCB_ACTIVE;


* Re: 2.6.23-rc9 boot failure (megaraid?)
  2007-10-02 18:15 ` Adrian Bunk
@ 2007-10-02 18:46   ` Burton Windle
  2007-10-02 19:55     ` Rafael J. Wysocki
  2007-10-02 20:38   ` James Bottomley
  1 sibling, 1 reply; 16+ messages in thread
From: Burton Windle @ 2007-10-02 18:46 UTC
  To: Adrian Bunk
  Cc: linux-kernel, Jens Axboe, FUJITA Tomonori, Sumant Patro,
	James Bottomley, megaraidlinux, linux-scsi

On Tue, 2 Oct 2007, Adrian Bunk wrote:

> Cc's added, the complete bug report is at
>  http://lkml.org/lkml/2007/10/2/243
>
> On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
>> 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
>>
>> System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
>> ...
>
> Thanks for your report.
>
> Does reverting the commit below fix the problem?
>
> cu
> Adrian
>
>
> commit 3f6270ef76f2ce5c134615a470685d6c2a66c07e
> Author: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> Date:   Mon May 14 20:17:27 2007 +0900
>

Confirmed; reverting the above (snipped) patch does fix the issue.

-- 
Burton Windle                           bwindle@fint.org



* Re: 2.6.23-rc9 boot failure (megaraid?)
  2007-10-02 18:46   ` Burton Windle
@ 2007-10-02 19:55     ` Rafael J. Wysocki
  0 siblings, 0 replies; 16+ messages in thread
From: Rafael J. Wysocki @ 2007-10-02 19:55 UTC
  To: Burton Windle
  Cc: Adrian Bunk, linux-kernel, Jens Axboe, FUJITA Tomonori,
	Sumant Patro, James Bottomley, megaraidlinux, linux-scsi

On Tuesday, 2 October 2007 20:46, Burton Windle wrote:
> On Tue, 2 Oct 2007, Adrian Bunk wrote:
> 
> > Cc's added, the complete bug report is at
> >  http://lkml.org/lkml/2007/10/2/243
> >
> > On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
> >> 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
> >>
> >> System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
> >> ...
> >
> > Thanks for your report.
> >
> > Does reverting the commit below fix the problem?
> >
> > cu
> > Adrian
> >
> >
> > commit 3f6270ef76f2ce5c134615a470685d6c2a66c07e
> > Author: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> > Date:   Mon May 14 20:17:27 2007 +0900
> >
> 
> Confirmed; reverting the above (snipped) patch does fix the issue.

I've created a bugzilla entry for your report at:

http://bugzilla.kernel.org/show_bug.cgi?id=9113

Please add a summary of your observations in there.

Greetings,
Rafael


* Re: 2.6.23-rc9 boot failure (megaraid?)
  2007-10-02 18:15 ` Adrian Bunk
  2007-10-02 18:46   ` Burton Windle
@ 2007-10-02 20:38   ` James Bottomley
  2007-10-03  0:00     ` FUJITA Tomonori
  2007-10-03  0:09     ` FUJITA Tomonori
  1 sibling, 2 replies; 16+ messages in thread
From: James Bottomley @ 2007-10-02 20:38 UTC
  To: Adrian Bunk
  Cc: Burton Windle, linux-kernel, Jens Axboe, FUJITA Tomonori,
	Sumant Patro, megaraidlinux, linux-scsi

On Tue, 2007-10-02 at 20:15 +0200, Adrian Bunk wrote:
> Cc's added, the complete bug report is at
>   http://lkml.org/lkml/2007/10/2/243
> 
> On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
> > 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
> >
> > System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
> >...
> 
> Thanks for your report.
> 
> Diff'ing the dmesg's shows:
> 
> <--  snip  -->
> 
>  scsi0: scanning scsi channel 4 [P0] for physical devices.
>  scsi0: scanning scsi channel 5 [P1] for physical devices.
>  st: Version 20070203, fixed bufsize 32768, s/g segs 256
> -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
> +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
>  sd 0:0:0:0: [sda] Write Protect is off
>  sd 0:0:0:0: [sda] Asking for cache data failed
>  sd 0:0:0:0: [sda] Assuming drive cache: write through
> -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
> +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
>  sd 0:0:0:0: [sda] Write Protect is off
>  sd 0:0:0:0: [sda] Asking for cache data failed
>  sd 0:0:0:0: [sda] Assuming drive cache: write through
>   sda: sda1
> + sda: p1 exceeds device capacity
> 
> <--  snip  -->
> 
> -	case MEGA_BULK_DATA:
> -		if (scb->cmd->use_sg == 0)
> -			length = scb->cmd->request_bufflen;
> -		else {
> -			struct scatterlist *sgl =
> -				(struct scatterlist *)scb->cmd->request_buffer;
> -			length = sgl->length;
> -		}
> -		pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
> -			       length, scb->dma_direction);
> -		break;
> -

This is the problem piece I think.  We've reintroduced a very old bug:

commit 51c928c34fa7cff38df584ad01de988805877dba
Author: James Bottomley <James.Bottomley@SteelEye.com>
Date:   Sat Oct 1 09:38:05 2005 -0500

    [SCSI] Legacy MegaRAID: Fix READ CAPACITY
    
    Some Legacy megaraid cards can't actually cope with the scatter/gather
    version of the READ CAPACITY command (which is what we now send them
    since altering all SCSI internal I/O to go via the block layer).  Fix
    this (and a few other broken megaraid driver assumptions) by sending
    the non-sg version of the command if the sg list only has a single
    element.
    
    Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>

So what we have to do is put back the check for use_sg == 1 and send
that as a bulk transfer command.
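
As a rough illustration of that check (a simplified model with hypothetical types, not the driver's code): a single-element sg list on an adapter without 64-bit addressing is sent as a plain bulk transfer, and everything else goes out as a scatter/gather list. The condition mirrors the "use_sg == 1 && !adapter->has_64bit_addr" test that the conversion patch removed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for one DMA-mapped scatter/gather element. */
struct sg_entry {
	uint32_t dma_addr;
	uint32_t dma_len;
};

enum xfer_kind { XFER_BULK, XFER_SGLIST };

struct xfer {
	enum xfer_kind kind;
	uint32_t buf;  /* bulk: data address; sglist: address of the sg table */
	uint32_t len;  /* total transfer length in bytes */
	int nseg;      /* sg elements handed to the controller (0 for bulk) */
};

/*
 * Decide how to describe the data buffer to the controller.
 * sg_table_addr is the (hypothetical) bus address of the sg table that
 * would be passed along in the scatter/gather case.
 */
static struct xfer build_xfer(const struct sg_entry *sg, int count,
			      int has_64bit_addr, uint32_t sg_table_addr)
{
	struct xfer x = { XFER_SGLIST, sg_table_addr, 0, count };
	int i;

	assert(sg != NULL && count > 0);

	/* One segment, 32-bit adapter: skip the sg list entirely. */
	if (count == 1 && !has_64bit_addr) {
		x.kind = XFER_BULK;
		x.buf = sg[0].dma_addr;
		x.len = sg[0].dma_len;
		x.nseg = 0;
		return x;
	}

	/* Otherwise sum the segment lengths and pass the sg table. */
	for (i = 0; i < count; i++)
		x.len += sg[i].dma_len;
	return x;
}
```

With one 8-byte segment and has_64bit_addr == 0, which is the shape of a READ CAPACITY request, the model picks the bulk path, the form the legacy cards are said to handle.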

James




* Re: 2.6.23-rc9 boot failure (megaraid?)
  2007-10-02 20:38   ` James Bottomley
@ 2007-10-03  0:00     ` FUJITA Tomonori
  2007-10-03 23:32       ` Patro, Sumant
  2007-10-03  0:09     ` FUJITA Tomonori
  1 sibling, 1 reply; 16+ messages in thread
From: FUJITA Tomonori @ 2007-10-03  0:00 UTC
  To: James.Bottomley
  Cc: bunk, bwindle, linux-kernel, jens.axboe, fujita.tomonori,
	sumant.patro, megaraidlinux, linux-scsi

On Tue, 02 Oct 2007 15:38:13 -0500
James Bottomley <James.Bottomley@SteelEye.com> wrote:

> On Tue, 2007-10-02 at 20:15 +0200, Adrian Bunk wrote:
> > Cc's added, the complete bug report is at
> >   http://lkml.org/lkml/2007/10/2/243
> > 
> > On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
> > > 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
> > >
> > > System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
> > >...
> > 
> > Thanks for your report.
> > 
> > Diff'ing the dmesg's shows:
> > 
> > <--  snip  -->
> > 
> >  scsi0: scanning scsi channel 4 [P0] for physical devices.
> >  scsi0: scanning scsi channel 5 [P1] for physical devices.
> >  st: Version 20070203, fixed bufsize 32768, s/g segs 256
> > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
> > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
> >  sd 0:0:0:0: [sda] Write Protect is off
> >  sd 0:0:0:0: [sda] Asking for cache data failed
> >  sd 0:0:0:0: [sda] Assuming drive cache: write through
> > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
> > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
> >  sd 0:0:0:0: [sda] Write Protect is off
> >  sd 0:0:0:0: [sda] Asking for cache data failed
> >  sd 0:0:0:0: [sda] Assuming drive cache: write through
> >   sda: sda1
> > + sda: p1 exceeds device capacity
> > 
> > <--  snip  -->
> > 
> > -	case MEGA_BULK_DATA:
> > -		if (scb->cmd->use_sg == 0)
> > -			length = scb->cmd->request_bufflen;
> > -		else {
> > -			struct scatterlist *sgl =
> > -				(struct scatterlist *)scb->cmd->request_buffer;
> > -			length = sgl->length;
> > -		}
> > -		pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
> > -			       length, scb->dma_direction);
> > -		break;
> > -
> 
> This is the problem piece I think.  We've reintroduced a very old bug:
> 
> commit 51c928c34fa7cff38df584ad01de988805877dba
> Author: James Bottomley <James.Bottomley@SteelEye.com>
> Date:   Sat Oct 1 09:38:05 2005 -0500
> 
>     [SCSI] Legacy MegaRAID: Fix READ CAPACITY
>     
>     Some Legacy megaraid cards can't actually cope with the scatter/gather
>     version of the READ CAPACITY command (which is what we now send them
>     since altering all SCSI internal I/O to go via the block layer).  Fix
>     this (and a few other broken megaraid driver assumptions) by sending
>     the non-sg version of the command if the sg list only has a single
>     element.
>     
>     Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
> 
> So what we have to do is put back the check for use_sg == 1 and send
> that as a bulk transfer command.

Sorry about this. Can this fix the problem?

Thanks,


diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
index 3907f67..da56163 100644
--- a/drivers/scsi/megaraid.c
+++ b/drivers/scsi/megaraid.c
@@ -1753,6 +1753,14 @@ mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)
 
 	*len = 0;
 
+	if (scsi_sg_count(cmd) == 1 && !adapter->has_64bit_addr) {
+		sg = scsi_sglist(cmd);
+		scb->dma_h_bulkdata = sg_dma_address(sg);
+		*buf = (u32)scb->dma_h_bulkdata;
+		*len = sg_dma_len(sg);
+		return 0;
+	}
+
 	scsi_for_each_sg(cmd, sg, sgcnt, idx) {
 		if (adapter->has_64bit_addr) {
 			scb->sgl64[idx].address = sg_dma_address(sg);


* Re: 2.6.23-rc9 boot failure (megaraid?)
  2007-10-02 20:38   ` James Bottomley
  2007-10-03  0:00     ` FUJITA Tomonori
@ 2007-10-03  0:09     ` FUJITA Tomonori
  1 sibling, 0 replies; 16+ messages in thread
From: FUJITA Tomonori @ 2007-10-03  0:09 UTC
  To: James.Bottomley
  Cc: bunk, bwindle, linux-kernel, jens.axboe, fujita.tomonori,
	sumant.patro, megaraidlinux, linux-scsi

On Tue, 02 Oct 2007 15:38:13 -0500
James Bottomley <James.Bottomley@SteelEye.com> wrote:

> On Tue, 2007-10-02 at 20:15 +0200, Adrian Bunk wrote:
> > Cc's added, the complete bug report is at
> >   http://lkml.org/lkml/2007/10/2/243
> > 
> > On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
> > > 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
> > >
> > > System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
> > >...
> > 
> > Thanks for your report.
> > 
> > Diff'ing the dmesg's shows:
> > 
> > <--  snip  -->
> > 
> >  scsi0: scanning scsi channel 4 [P0] for physical devices.
> >  scsi0: scanning scsi channel 5 [P1] for physical devices.
> >  st: Version 20070203, fixed bufsize 32768, s/g segs 256
> > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
> > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
> >  sd 0:0:0:0: [sda] Write Protect is off
> >  sd 0:0:0:0: [sda] Asking for cache data failed
> >  sd 0:0:0:0: [sda] Assuming drive cache: write through
> > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
> > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
> >  sd 0:0:0:0: [sda] Write Protect is off
> >  sd 0:0:0:0: [sda] Asking for cache data failed
> >  sd 0:0:0:0: [sda] Assuming drive cache: write through
> >   sda: sda1
> > + sda: p1 exceeds device capacity
> > 
> > <--  snip  -->
> > 
> > -	case MEGA_BULK_DATA:
> > -		if (scb->cmd->use_sg == 0)
> > -			length = scb->cmd->request_bufflen;
> > -		else {
> > -			struct scatterlist *sgl =
> > -				(struct scatterlist *)scb->cmd->request_buffer;
> > -			length = sgl->length;
> > -		}
> > -		pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
> > -			       length, scb->dma_direction);
> > -		break;
> > -
> 
> This is the problem piece I think.  We've reintroduced a very old bug:
> 
> commit 51c928c34fa7cff38df584ad01de988805877dba
> Author: James Bottomley <James.Bottomley@SteelEye.com>
> Date:   Sat Oct 1 09:38:05 2005 -0500
> 
>     [SCSI] Legacy MegaRAID: Fix READ CAPACITY
>     
>     Some Legacy megaraid cards can't actually cope with the scatter/gather
>     version of the READ CAPACITY command (which is what we now send them
>     since altering all SCSI internal I/O to go via the block layer).  Fix
>     this (and a few other broken megaraid driver assumptions) by sending
>     the non-sg version of the command if the sg list only has a single
>     element.
>     
>     Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
> 
> So what we have to do is put back the check for use_sg == 1 and send
> that as a bulk transfer command.

Sorry again. The sg count needs to be checked before the DMA mapping is done.


diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
index 3907f67..ae0b220 100644
--- a/drivers/scsi/megaraid.c
+++ b/drivers/scsi/megaraid.c
@@ -1737,9 +1737,12 @@ mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)
 	Scsi_Cmnd	*cmd;
 	int	sgcnt;
 	int	idx;
+	int bulkdata;
 
 	cmd = scb->cmd;
 
+	bulkdata = (scsi_sg_count(cmd) == 1) ? 1 : 0;
+
 	/*
 	 * Copy Scatter-Gather list info into controller structure.
 	 *
@@ -1753,6 +1756,14 @@ mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)
 
 	*len = 0;
 
+	if (bulkdata && !adapter->has_64bit_addr) {
+		sg = scsi_sglist(cmd);
+		scb->dma_h_bulkdata = sg_dma_address(sg);
+		*buf = (u32)scb->dma_h_bulkdata;
+		*len = sg_dma_len(sg);
+		return 0;
+	}
+
 	scsi_for_each_sg(cmd, sg, sgcnt, idx) {
 		if (adapter->has_64bit_addr) {
 			scb->sgl64[idx].address = sg_dma_address(sg);


* RE: 2.6.23-rc9 boot failure (megaraid?)
  2007-10-03  0:00     ` FUJITA Tomonori
@ 2007-10-03 23:32       ` Patro, Sumant
  2007-10-03 23:46         ` FUJITA Tomonori
  0 siblings, 1 reply; 16+ messages in thread
From: Patro, Sumant @ 2007-10-03 23:32 UTC
  To: FUJITA Tomonori, James.Bottomley
  Cc: bunk, bwindle, linux-kernel, jens.axboe, DL-MegaRAID Linux,
	linux-scsi

 

> -----Original Message-----
> From: FUJITA Tomonori [mailto:fujita.tomonori@lab.ntt.co.jp] 
> Sent: Tuesday, October 02, 2007 5:01 PM
> To: James.Bottomley@SteelEye.com
> Cc: bunk@kernel.org; bwindle@fint.org; 
> linux-kernel@vger.kernel.org; jens.axboe@oracle.com; 
> fujita.tomonori@lab.ntt.co.jp; Patro, Sumant; DL-MegaRAID 
> Linux; linux-scsi@vger.kernel.org
> Subject: Re: 2.6.23-rc9 boot failure (megaraid?)
> 
> On Tue, 02 Oct 2007 15:38:13 -0500
> James Bottomley <James.Bottomley@SteelEye.com> wrote:
> 
> > On Tue, 2007-10-02 at 20:15 +0200, Adrian Bunk wrote:
> > > Cc's added, the complete bug report is at
> > >   http://lkml.org/lkml/2007/10/2/243
> > > 
> > > On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
> > > > 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
> > > >
> > > > System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
> > > >...
> > > 
> > > Thanks for your report.
> > > 
> > > Diff'ing the dmesg's shows:
> > > 
> > > <--  snip  -->
> > > 
> > >  scsi0: scanning scsi channel 4 [P0] for physical devices.
> > >  scsi0: scanning scsi channel 5 [P1] for physical devices.
&gt; &gt; &gt; &gt;  st: Version 20070203, fixed bufsize 32768, s/g segs 256
&gt; &gt; &gt; &gt; -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
&gt; &gt; &gt; &gt; +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
&gt; &gt; &gt; &gt; +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
&gt; &gt; &gt; &gt;  sd 0:0:0:0: [sda] Write Protect is off
&gt; &gt; &gt; &gt;  sd 0:0:0:0: [sda] Asking for cache data failed
&gt; &gt; &gt; &gt;  sd 0:0:0:0: [sda] Assuming drive cache: write through
&gt; &gt; &gt; &gt; -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
&gt; &gt; &gt; &gt; +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
&gt; &gt; &gt; &gt; +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
&gt; &gt; &gt; &gt;  sd 0:0:0:0: [sda] Write Protect is off
&gt; &gt; &gt; &gt;  sd 0:0:0:0: [sda] Asking for cache data failed
&gt; &gt; &gt; &gt;  sd 0:0:0:0: [sda] Assuming drive cache: write through
> > >   sda: sda1
> > > + sda: p1 exceeds device capacity
> > > 
> > > <--  snip  -->
> > > 
> > > -	case MEGA_BULK_DATA:
> > > -		if (scb->cmd->use_sg == 0)
> > > -			length = scb->cmd->request_bufflen;
> > > -		else {
> > > -			struct scatterlist *sgl =
> > > -				(struct scatterlist *)scb->cmd->request_buffer;
> > > -			length = sgl->length;
> > > -		}
> > > -		pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
> > > -			       length, scb->dma_direction);
> > > -		break;
> > > -
> > 
> > This is the problem piece I think.  We've reintroduced a very old bug:
> > 
> > commit 51c928c34fa7cff38df584ad01de988805877dba
> > Author: James Bottomley <James.Bottomley@SteelEye.com>
> > Date:   Sat Oct 1 09:38:05 2005 -0500
> > 
> >     [SCSI] Legacy MegaRAID: Fix READ CAPACITY
> >     
> >     Some Legacy megaraid cards can't actually cope with the scatter/gather
> >     version of the READ CAPACITY command (which is what we now send them
> >     since altering all SCSI internal I/O to go via the block layer).  Fix
> >     this (and a few other broken megaraid driver assumptions) by sending
> >     the non-sg version of the command if the sg list only has a single
> >     element.
> >     
> >     Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
> > 
> > So what we have to do is put back the check for use_sg == 1 and send
> > that as a bulk transfer command.
> 
> Sorry about this. Can this fix the problem?
> 
> Thanks,
> 
> 
> diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
> index 3907f67..da56163 100644
> --- a/drivers/scsi/megaraid.c
> +++ b/drivers/scsi/megaraid.c
> @@ -1753,6 +1753,14 @@ mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)
>  
>  	*len = 0;
>  
> +	if (scsi_sg_count(cmd) == 1 && !adapter->has_64bit_addr) {
> +		sg = scsi_sglist(cmd);
> +		scb->dma_h_bulkdata = sg_dma_address(sg);
> +		*buf = (u32)scb->dma_h_bulkdata;
> +		*len = sg_dma_len(sg);
> +		return 0;
> +	}
> +
>  	scsi_for_each_sg(cmd, sg, sgcnt, idx) {
>  		if (adapter->has_64bit_addr) {
>  			scb->sgl64[idx].address = sg_dma_address(sg);
> 


With this patch I see the correct logical disk size reported.
Thanks.

Sumant
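
For readers following the dmesg diff quoted above, the bogus "1 512-byte
hardware sectors" line is what one would expect if the READ CAPACITY(10)
response buffer was never filled in and was decoded as all zeros. That
reading is an assumption, not something stated in the thread. The sketch
below is a standalone illustration, not the kernel's sd code; the names
`struct capacity` and `decode_read_capacity10` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the decoded result; not a kernel structure. */
struct capacity {
	uint64_t sectors;     /* number of logical blocks */
	uint32_t sector_size; /* bytes per logical block */
};

/* Read a 32-bit big-endian value, as used in SCSI response data. */
static uint32_t be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Decode an 8-byte READ CAPACITY(10) response:
 *   bytes 0-3: address of the last logical block (big-endian)
 *   bytes 4-7: logical block length in bytes (big-endian)
 * A block length of 0 falls back to 512, mirroring the
 * "Sector size 0 reported, assuming 512" message in the log.
 */
static struct capacity decode_read_capacity10(const uint8_t *buf)
{
	struct capacity c;

	assert(buf != NULL);
	c.sectors = (uint64_t)be32(buf) + 1; /* last LBA + 1 */
	c.sector_size = be32(buf + 4);
	if (c.sector_size == 0)
		c.sector_size = 512;
	return c;
}
```

Under that assumption, an all-zero buffer decodes to one 512-byte sector,
while a last-block address of 0x010BBFFF with a 512-byte block length gives
the 17547264 sectors (8984 MB) that 2.6.22.9 reported.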

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: 2.6.23-rc9 boot failure (megaraid?)
  2007-10-03 23:32       ` Patro, Sumant
@ 2007-10-03 23:46         ` FUJITA Tomonori
  2007-10-04  7:28           ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread
From: FUJITA Tomonori @ 2007-10-03 23:46 UTC (permalink / raw)
  To: Sumant.Patro
  Cc: fujita.tomonori, James.Bottomley, bunk, bwindle, linux-kernel,
	jens.axboe, megaraidlinux, linux-scsi

On Wed, 3 Oct 2007 17:32:55 -0600
"Patro, Sumant" <Sumant.Patro@lsi.com> wrote:

> With this patch I see the correct logical disk size reported.
> Thanks.

Great, thanks for testing!

Can you try the following patch instead of the above patch?

http://marc.info/?l=linux-scsi&m=119137033016550&w=2


I know the changes are pretty trivial and it should work...

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.6.23-rc9 boot failure (megaraid?)
  2007-10-03 23:46         ` FUJITA Tomonori
@ 2007-10-04  7:28           ` Jens Axboe
  2007-10-04 10:20             ` FUJITA Tomonori
  2007-10-04 10:48             ` Adrian Bunk
  0 siblings, 2 replies; 16+ messages in thread
From: Jens Axboe @ 2007-10-04  7:28 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: Sumant.Patro, James.Bottomley, bunk, bwindle, linux-kernel,
	megaraidlinux, linux-scsi

On Thu, Oct 04 2007, FUJITA Tomonori wrote:
> On Wed, 3 Oct 2007 17:32:55 -0600
> "Patro, Sumant" <Sumant.Patro@lsi.com> wrote:
> 
> > With this patch I see the correct logical disk size reported.
> > Thanks.
> 
> Great, thanks for testing!
> 
> Can you try the following patch instead of the above patch?
> 
> http://marc.info/?l=linux-scsi&m=119137033016550&w=2
> 
> 
> I know the changes are pretty trivial and it should work...

Tomo, this is the patch I added.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.6.23-rc9 boot failure (megaraid?)
  2007-10-04  7:28           ` Jens Axboe
@ 2007-10-04 10:20             ` FUJITA Tomonori
  2007-10-04 10:36               ` Jens Axboe
  2007-10-04 10:48             ` Adrian Bunk
  1 sibling, 1 reply; 16+ messages in thread
From: FUJITA Tomonori @ 2007-10-04 10:20 UTC (permalink / raw)
  To: jens.axboe
  Cc: fujita.tomonori, Sumant.Patro, James.Bottomley, bunk, bwindle,
	linux-kernel, megaraidlinux, linux-scsi

On Thu, 4 Oct 2007 09:28:34 +0200
Jens Axboe <jens.axboe@oracle.com> wrote:

> On Thu, Oct 04 2007, FUJITA Tomonori wrote:
> > On Wed, 3 Oct 2007 17:32:55 -0600
> > "Patro, Sumant" <Sumant.Patro@lsi.com> wrote:
> > 
> > > With this patch I see the correct logical disk size reported.
> > > Thanks.
> > 
> > Great, thanks for testing!
> > 
> > Can you try the following patch instead of the above patch?
> > 
> > http://marc.info/?l=linux-scsi&m=119137033016550&w=2
> > 
> > 
> > I know the changes are pretty trivial and it should work...
> 
> Tomo, this is the patch I added.

Thanks. I thought that it would be sent via scsi-misc because the scsi
accessor patch introduced this bug. But either is OK with me.

BTW, please add my sign-off.

-
[SCSI] megaraid_old: fix scatter/gather for legacy megaraid cards

Some legacy megaraid cards (!has_64bit_addr case) can't cope with the
scatter/gather version of the READ CAPACITY command. We need to send
the non-sg version of the command if the sg list only has a single
element.

commit 3f6270ef76f2ce5c134615a470685d6c2a66c07e reintroduced this bug,
which was fixed long ago (commit 51c928c34fa7cff38df584ad01de988805877dba).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
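
The decision described in this commit message can be modelled outside the
kernel as a rough sketch. Everything below is a hypothetical stand-in:
`struct seg`, `struct xfer`, `MAX_SEGS` and `build_transfer` are not driver
names, and the scatter/gather entries are simplified to 32-bit fields. It
only illustrates the rule "one segment on a card without 64-bit addressing
is sent as a direct buffer, anything else goes through the sg list".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SEGS 8 /* arbitrary limit for this sketch */

/* Hypothetical stand-in for one DMA-mapped scatterlist segment. */
struct seg {
	uint32_t dma_addr;
	uint32_t dma_len;
};

/* Hypothetical stand-in for what gets handed to the firmware. */
struct xfer {
	uint32_t buf;              /* data buffer, or address of the sg list */
	uint32_t len;              /* total transfer length in bytes */
	struct seg sgl[MAX_SEGS];  /* sg list entries (simplified to 32-bit) */
};

/*
 * Returns the number of sg entries the firmware should walk.
 * A return value of 0 means out->buf is the data buffer itself.
 */
static size_t build_transfer(const struct seg *segs, size_t nsegs,
			     int has_64bit_addr, uint32_t sgl_dma_addr,
			     struct xfer *out)
{
	size_t i;

	assert(segs != NULL && out != NULL && nsegs <= MAX_SEGS);
	out->len = 0;

	/* One segment on a legacy (32-bit only) card: skip the sg list. */
	if (nsegs == 1 && !has_64bit_addr) {
		out->buf = segs[0].dma_addr;
		out->len = segs[0].dma_len;
		return 0;
	}

	/* Otherwise fill the sg list and point the firmware at it. */
	for (i = 0; i < nsegs; i++) {
		out->sgl[i] = segs[i];
		out->len += segs[i].dma_len;
	}
	out->buf = sgl_dma_addr;
	return nsegs;
}
```

In this model a single 8-byte segment (the size of a READ CAPACITY(10)
reply) on a legacy card yields a return value of 0 with `buf` pointing
straight at the data, which corresponds to the early `return 0` in the
patch quoted earlier in the thread.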

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.6.23-rc9 boot failure (megaraid?)
  2007-10-04 10:20             ` FUJITA Tomonori
@ 2007-10-04 10:36               ` Jens Axboe
  2007-10-04 12:50                 ` James Bottomley
  0 siblings, 1 reply; 16+ messages in thread
From: Jens Axboe @ 2007-10-04 10:36 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: Sumant.Patro, James.Bottomley, bunk, bwindle, linux-kernel,
	megaraidlinux, linux-scsi

On Thu, Oct 04 2007, FUJITA Tomonori wrote:
> On Thu, 4 Oct 2007 09:28:34 +0200
> Jens Axboe <jens.axboe@oracle.com> wrote:
> 
> > On Thu, Oct 04 2007, FUJITA Tomonori wrote:
> > > On Wed, 3 Oct 2007 17:32:55 -0600
> > > "Patro, Sumant" <Sumant.Patro@lsi.com> wrote:
> > > 
> > > > With this patch I see the correct logical disk size reported.
> > > > Thanks.
> > > 
> > > Great, thanks for testing!
> > > 
> > > Can you try the following patch instead of the above patch?
> > > 
> > > http://marc.info/?l=linux-scsi&m=119137033016550&w=2
> > > 
> > > 
> > > I know the changes are pretty trivial and it should work...
> > 
> > Tomo, this is the patch I added.
> 
> Thanks. I thought that it will be sent via scsi-misc because the scsi
> accessor patch introduced this bug. But either is ok with me.

If it only affects the driver _after_ the scsi accessor patch and as
such doesn't screw over git-block, then I'll drop it for sure.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.6.23-rc9 boot failure (megaraid?)
  2007-10-04  7:28           ` Jens Axboe
  2007-10-04 10:20             ` FUJITA Tomonori
@ 2007-10-04 10:48             ` Adrian Bunk
  2007-10-04 10:55               ` FUJITA Tomonori
  1 sibling, 1 reply; 16+ messages in thread
From: Adrian Bunk @ 2007-10-04 10:48 UTC (permalink / raw)
  To: Jens Axboe
  Cc: FUJITA Tomonori, Sumant.Patro, James.Bottomley, bwindle,
	linux-kernel, megaraidlinux, linux-scsi

On Thu, Oct 04, 2007 at 09:28:34AM +0200, Jens Axboe wrote:
>...
> Tomo, this is the patch I added.

Please excuse my comment in case this was already clear:

You are aware that this bug is a regression in 2.6.23-rc and the patch 
should therefore go to Linus ASAP and not after the release of 2.6.23?

> Jens Axboe

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.6.23-rc9 boot failure (megaraid?)
  2007-10-04 10:48             ` Adrian Bunk
@ 2007-10-04 10:55               ` FUJITA Tomonori
  2007-10-04 11:00                 ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread
From: FUJITA Tomonori @ 2007-10-04 10:55 UTC (permalink / raw)
  To: bunk
  Cc: jens.axboe, fujita.tomonori, Sumant.Patro, James.Bottomley,
	bwindle, linux-kernel, megaraidlinux, linux-scsi

On Thu, 4 Oct 2007 12:48:58 +0200
Adrian Bunk <bunk@kernel.org> wrote:

> On Thu, Oct 04, 2007 at 09:28:34AM +0200, Jens Axboe wrote:
> >...
> > Tomo, this is the patch I added.
> 
> Please excuse my comment in case this was already clear:
> 
> You are aware that this bug is a regression in 2.6.23-rc and the patch 
> should therefore go to Linus ASAP and not after the release of 2.6.23?

Oops, you are right. This should go via scsi-rc-fixes tree ASAP.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.6.23-rc9 boot failure (megaraid?)
  2007-10-04 10:55               ` FUJITA Tomonori
@ 2007-10-04 11:00                 ` Jens Axboe
  0 siblings, 0 replies; 16+ messages in thread
From: Jens Axboe @ 2007-10-04 11:00 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: bunk, Sumant.Patro, James.Bottomley, bwindle, linux-kernel,
	megaraidlinux, linux-scsi

On Thu, Oct 04 2007, FUJITA Tomonori wrote:
> On Thu, 4 Oct 2007 12:48:58 +0200
> Adrian Bunk <bunk@kernel.org> wrote:
> 
> > On Thu, Oct 04, 2007 at 09:28:34AM +0200, Jens Axboe wrote:
> > >...
> > > Tomo, this is the patch I added.
> > 
> > Please excuse my comment in case this was already clear:
> > 
> > You are aware that this bug is a regression in 2.6.23-rc and the patch 
> > should therefore go to Linus ASAP and not after the release of 2.6.23?
> 
> Oops, you are right. This should go via scsi-rc-fixes tree ASAP.

Irk, the scsi accessor stuff is already in, I forgot and thought it was
pending for 2.6.24. So rush the patch upstream please!

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.6.23-rc9 boot failure (megaraid?)
  2007-10-04 10:36               ` Jens Axboe
@ 2007-10-04 12:50                 ` James Bottomley
  0 siblings, 0 replies; 16+ messages in thread
From: James Bottomley @ 2007-10-04 12:50 UTC (permalink / raw)
  To: Jens Axboe
  Cc: FUJITA Tomonori, Sumant.Patro, bunk, bwindle, linux-kernel,
	megaraidlinux, linux-scsi

On Thu, 2007-10-04 at 12:36 +0200, Jens Axboe wrote:
> On Thu, Oct 04 2007, FUJITA Tomonori wrote:
> > On Thu, 4 Oct 2007 09:28:34 +0200
> > Jens Axboe <jens.axboe@oracle.com> wrote:
> > 
> > > On Thu, Oct 04 2007, FUJITA Tomonori wrote:
> > > > On Wed, 3 Oct 2007 17:32:55 -0600
> > > > "Patro, Sumant" <Sumant.Patro@lsi.com> wrote:
> > > > 
> > > > >  
> > > > > 
> > > > > > -----Original Message-----
> > > > > > From: FUJITA Tomonori [mailto:fujita.tomonori@lab.ntt.co.jp] 
> > > > > > Sent: Tuesday, October 02, 2007 5:01 PM
> > > > > > To: James.Bottomley@SteelEye.com
> > > > > > Cc: bunk@kernel.org; bwindle@fint.org; 
> > > > > > linux-kernel@vger.kernel.org; jens.axboe@oracle.com; 
> > > > > > fujita.tomonori@lab.ntt.co.jp; Patro, Sumant; DL-MegaRAID 
> > > > > > Linux; linux-scsi@vger.kernel.org
> > > > > > Subject: Re: 2.6.23-rc9 boot failure (megaraid?)
> > > > > > 
> > > > > > On Tue, 02 Oct 2007 15:38:13 -0500
> > > > > > James Bottomley <James.Bottomley@SteelEye.com> wrote:
> > > > > > 
> > > > > > > On Tue, 2007-10-02 at 20:15 +0200, Adrian Bunk wrote:
> > > > > > > > Cc's added, the complete bug report is at
> > > > > > > >   http://lkml.org/lkml/2007/10/2/243
> > > > > > > > 
> > > > > > > > On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
> > > > > > > > > 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
> > > > > > > > >
> > > > > > > > > System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
> > > > > > > > >...
> > > > > > > > 
> > > > > > > > Thanks for your report.
> > > > > > > > 
> > > > > > > > Diff'ing the dmesg's shows:
> > > > > > > > 
> > > > > > > > <--  snip  -->
> > > > > > > > 
> > > > > > > >  scsi0: scanning scsi channel 4 [P0] for physical devices.
> > > > > > > >  scsi0: scanning scsi channel 5 [P1] for physical devices.
> > > > > > > >  st: Version 20070203, fixed bufsize 32768, s/g segs 256
> > > > > > > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
> > > > > > > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> > > > > > > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
> > > > > > > >  sd 0:0:0:0: [sda] Write Protect is off
> > > > > > > >  sd 0:0:0:0: [sda] Asking for cache data failed
> > > > > > > >  sd 0:0:0:0: [sda] Assuming drive cache: write through
> > > > > > > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
> > > > > > > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> > > > > > > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
> > > > > > > >  sd 0:0:0:0: [sda] Write Protect is off
> > > > > > > >  sd 0:0:0:0: [sda] Asking for cache data failed
> > > > > > > >  sd 0:0:0:0: [sda] Assuming drive cache: write through
> > > > > > > >   sda: sda1
> > > > > > > > + sda: p1 exceeds device capacity
> > > > > > > > 
> > > > > > > > <--  snip  -->
> > > > > > > > 
> > > > > > > > -	case MEGA_BULK_DATA:
> > > > > > > > -		if (scb->cmd->use_sg == 0)
> > > > > > > > -			length = scb->cmd->request_bufflen;
> > > > > > > > -		else {
> > > > > > > > -			struct scatterlist *sgl =
> > > > > > > > -				(struct scatterlist *)scb->cmd->request_buffer;
> > > > > > > > -			length = sgl->length;
> > > > > > > > -		}
> > > > > > > > -		pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
> > > > > > > > -			       length, scb->dma_direction);
> > > > > > > > -		break;
> > > > > > > > -
> > > > > > > 
> > > > > > > This is the problem piece I think.  We've reintroduced a very old bug:
> > > > > > > 
> > > > > > > commit 51c928c34fa7cff38df584ad01de988805877dba
> > > > > > > Author: James Bottomley <James.Bottomley@SteelEye.com>
> > > > > > > Date:   Sat Oct 1 09:38:05 2005 -0500
> > > > > > > 
> > > > > > >     [SCSI] Legacy MegaRAID: Fix READ CAPACITY
> > > > > > >     
> > > > > > >     Some Legacy megaraid cards can't actually cope with the scatter/gather
> > > > > > >     version of the READ CAPACITY command (which is what we now send them
> > > > > > >     since altering all SCSI internal I/O to go via the block layer).  Fix
> > > > > > >     this (and a few other broken megaraid driver assumptions) by sending
> > > > > > >     the non-sg version of the command if the sg list only has a single
> > > > > > >     element.
> > > > > > >     
> > > > > > >     Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
> > > > > > > 
> > > > > > > So what we have to do is put back the check for use_sg == 1 and send
> > > > > > > that as a bulk transfer command.
> > > > > > 
> > > > > > Sorry about this. Can this fix the problem?
> > > > > > 
> > > > > > Thanks,
> > > > > > 
> > > > > > 
> > > > > > diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
> > > > > > index 3907f67..da56163 100644
> > > > > > --- a/drivers/scsi/megaraid.c
> > > > > > +++ b/drivers/scsi/megaraid.c
> > > > > > @@ -1753,6 +1753,14 @@ mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)
> > > > > >  
> > > > > >  	*len = 0;
> > > > > >  
> > > > > > +	if (scsi_sg_count(cmd) == 1 && !adapter->has_64bit_addr) {
> > > > > > +		sg = scsi_sglist(cmd);
> > > > > > +		scb->dma_h_bulkdata = sg_dma_address(sg);
> > > > > > +		*buf = (u32)scb->dma_h_bulkdata;
> > > > > > +		*len = sg_dma_len(sg);
> > > > > > +		return 0;
> > > > > > +	}
> > > > > > +
> > > > > >  	scsi_for_each_sg(cmd, sg, sgcnt, idx) {
> > > > > >  		if (adapter->has_64bit_addr) {
> > > > > >  			scb->sgl64[idx].address = sg_dma_address(sg);
> > > > > > 
> > > > > 
> > > > > 
> > > > > With this patch I see the correct logical disk size reported.
> > > > > Thanks.
> > > > 
> > > > Great, thanks for testing!
> > > > 
> > > > Can you try the following patch instead of the above patch?
> > > > 
> > > > http://marc.info/?l=linux-scsi&m=119137033016550&w=2
> > > > 
> > > > 
> > > > I know the changes are pretty trivial and it should work...
> > > 
> > > Tomo, this is the patch I added.
> > 
> > Thanks. I thought it would be sent via scsi-misc because the scsi
> > accessor patch introduced this bug. But either is ok with me.
> 
> If it only affects the driver _after_ the scsi accessor patch and as
> such doesn't screw over git-block, then I'll drop it for sure.

No, this is a release critical fix ... I'll roll it up and send it in
for 2.6.23.

James


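The decision that the quoted patch restores can be modelled outside the
kernel. The following is a minimal userspace sketch, not the driver code:
struct sg_entry, struct adapter, build_sglist() and the table_addr
parameter are hypothetical stand-ins invented for illustration. Only the
condition itself (exactly one sg element and no 64-bit addressing) is
taken from the patch above. A return value of 0 is assumed here to mean
"send as a non-sg bulk transfer".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatter/gather element. */
struct sg_entry {
	uint32_t dma_addr;
	uint32_t dma_len;
};

/* Hypothetical stand-in for the adapter flag tested in the patch. */
struct adapter {
	int has_64bit_addr;
};

/*
 * Model of the single-element fast path.
 *
 * Returns 0 and points *buf directly at the data when the list has
 * exactly one element and the adapter does not use 64-bit addressing,
 * so the command can go out in its non-sg (bulk) form. Otherwise
 * returns the element count and points *buf at the sg table
 * (table_addr is an assumed placeholder for that table's address).
 */
static unsigned int build_sglist(const struct adapter *adapter,
				 const struct sg_entry *sg,
				 unsigned int count, uint32_t table_addr,
				 uint32_t *buf, uint32_t *len)
{
	unsigned int i;

	assert(adapter && sg && buf && len);

	*len = 0;

	if (count == 1 && !adapter->has_64bit_addr) {
		*buf = sg[0].dma_addr;
		*len = sg[0].dma_len;
		return 0;
	}

	/* General case: total length is the sum over all elements. */
	for (i = 0; i < count; i++)
		*len += sg[i].dma_len;

	*buf = table_addr;
	return count;
}
```

With one element and has_64bit_addr == 0 the sketch returns 0 and hands
back the element's own address, which corresponds to the bulk-transfer
form James asks for. Any other combination falls through to the sg table.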


end of thread, other threads:[~2007-10-04 12:50 UTC | newest]

Thread overview: 16+ messages
2007-10-02 16:48 2.6.23-rc9 boot failure (megaraid?) Burton Windle
2007-10-02 18:15 ` Adrian Bunk
2007-10-02 18:46   ` Burton Windle
2007-10-02 19:55     ` Rafael J. Wysocki
2007-10-02 20:38   ` James Bottomley
2007-10-03  0:00     ` FUJITA Tomonori
2007-10-03 23:32       ` Patro, Sumant
2007-10-03 23:46         ` FUJITA Tomonori
2007-10-04  7:28           ` Jens Axboe
2007-10-04 10:20             ` FUJITA Tomonori
2007-10-04 10:36               ` Jens Axboe
2007-10-04 12:50                 ` James Bottomley
2007-10-04 10:48             ` Adrian Bunk
2007-10-04 10:55               ` FUJITA Tomonori
2007-10-04 11:00                 ` Jens Axboe
2007-10-03  0:09     ` FUJITA Tomonori
