Re: Crash during SATA reads

Linux SCSI subsystem development
 help / color / mirror / Atom feed

* Re: Crash during SATA reads
       [not found]     ` <bd36f99e0911111309ie8166a9h8f926948bf15d350@mail.gmail.com>
@ 2009-11-11 21:39       ` Jeff Garzik
  2009-11-11 22:19         ` Glenn Maynard
  2009-11-12  9:16         ` Jens Axboe
  0 siblings, 2 replies; 9+ messages in thread
From: Jeff Garzik @ 2009-11-11 21:39 UTC (permalink / raw)
  To: Glenn Maynard; +Cc: linux-kernel, linux-scsi, Jens Axboe

On 11/11/2009 04:09 PM, Glenn Maynard wrote:
> Another trace, this time a bit deeper:
>
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [<c1079e7b>] end_buffer_async_read+0x59/0xc5
> *pde = 00000000
> Oops: 0000 [#1] PREEMPT
> last sysfs file:
> Modules linked in: netconsole atl1c rtc
>
> Pid: 1300, comm: gzip Not tainted (2.6.31.6 #5) G31M-ES2L
> EIP: 0060:[<c1079e7b>] EFLAGS: 00010013 CPU: 0
> EIP is at end_buffer_async_read+0x59/0xc5
> EAX: 00000000 EBX: c1ae7b38 ECX: 00000000 EDX: c1079e22
> ESI: 00000202 EDI: c17b5040 EBP: 00000000 ESP: df1dde80
>   DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> Process gzip (pid: 1300, ti=df1dc000 task=df866420 task.ti=df1dc000)
> Stack:
>   ce632840 ce632840 df91de78 df1dded0 c107ccb9 c107cc89 c107e551 00000200
> <0>  c11587c4 00000000 00000001 df430298 00000000 00000000 0000be00 00004200
> <0>  00000000 df91de78 00000000 df91de78 df8ef168 c115893e df8a6c80 00000000
> Call Trace:
>   [<c107ccb9>] ? end_bio_bh_io_sync+0x30/0x38
>   [<c107cc89>] ? end_bio_bh_io_sync+0x0/0x38
>   [<c107e551>] ? bio_endio+0x24/0x26
>   [<c11587c4>] ? blk_update_request+0xdf/0x24e
>   [<c115893e>] ? blk_update_bidi_request+0xb/0x41
>   [<c11589f5>] ? blk_end_bidi_request+0x10/0x4f
>   [<c1158a64>] ? blk_end_request+0x7/0xc
>   [<c11abcb2>] ? scsi_end_request+0x17/0x69
>   [<c11abfc3>] ? scsi_io_completion+0x173/0x335
>   [<c11a8330>] ? scsi_finish_command+0x70/0x86
>   [<c11ac6a6>] ? scsi_softirq_done+0xd7/0xdc
>   [<c115b3f9>] ? blk_done_softirq+0x51/0x5d
>   [<c101bde0>] ? __do_softirq+0x5f/0xc8
>   [<c101be6b>] ? do_softirq+0x22/0x26
>   [<c101becd>] ? irq_exit+0x29/0x34
>   [<c1004097>] ? do_IRQ+0x53/0x63
>   [<c1002ea9>] ? common_interrupt+0x29/0x30
> Code: e8 d2 fd ff ff 80 0f 02 f6 47 01 08 75 04 0f 0b eb fe 9c 5e fa
> 89 e0 25 00 e0 ff ff ff 40 14 80 23 7f 89 d8 e8 58 fd ff ff 89 d9<8b>
> 11 b8 00 00 00 00 f6 c2 01 0f 44 e8 f6 c2 80 74 09 80 e2 04
> EIP: [<c1079e7b>] end_buffer_async_read+0x59/0xc5 SS:ESP 0068:df1dde80
> CR2: 0000000000000000
> ---[ end trace 2e93648aef49fa0e ]---
>
> It looks like buffer_tmp was NULL during end_buffer_async_read's
> buffer_uptodate(tmp).
>
>
> This one just happened when I interrupted the dd:
>
> BUG: unable to handle kernel NULL pointer dereference at 00000004
> IP: [<c107ae05>] block_invalidatepage+0x22/0x54
> *pde = 00000000
> Oops: 0000 [#1] PREEMPT
> last sysfs file:
> Modules linked in: netconsole atl1c rtc
>
> Pid: 1296, comm: dd Not tainted (2.6.31.6 #6) G31M-ES2L
> EIP: 0060:[<c107ae05>] EFLAGS: 00010217 CPU: 0
> EIP is at block_invalidatepage+0x22/0x54
> EAX: 00000000 EBX: c14f1540 ECX: 00000002 EDX: 00001000
> ESI: 00000000 EDI: 00001000 EBP: c8d4fc88 ESP: df1d7ed0
>   DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> Process dd (pid: 1296, ti=df1d6000 task=df87bc00 task.ti=df1d6000)
> Stack:
>   00000000 c14f1540 00000000 c14f1540 00064156 c1048c29 c14f1540 c1048cab
> <0>  00000001 c1048def ffffffff 00000000 df4cff54 0000000e 00000000 c13f0440
> <0>  c14f1540 c14f2ae0 c13f7a40 c13f7e80 c13fd220 c13efd40 c13efd60 c13f01c0
> Call Trace:
>   [<c1048c29>] ? do_invalidatepage+0x1d/0x1f
>   [<c1048cab>] ? truncate_complete_page+0x19/0x4b
>   [<c1048def>] ? truncate_inode_pages_range+0xd1/0x2af
>   [<c1048fd6>] ? truncate_inode_pages+0x9/0xc
>   [<c107fb83>] ? __blkdev_put+0x47/0xec
>   [<c1060fe7>] ? __fput+0xc4/0x177
>   [<c105fa8a>] ? filp_close+0x4c/0x52
>   [<c105faee>] ? sys_close+0x5e/0x99
>   [<c10027f0>] ? sysenter_do_call+0x12/0x22
> Code: bf 89 d8 5b e9 ee ed ff ff 55 57 56 53 89 c3 56 89 14 24 31 d2
> 8b 00 a8 01 75 04 0f 0b eb fe f6 c4 08 74 33 8b 6b 0c 89 e8 89 d7<8b>
> 70 04 03 78 14 39 14 24 77 05 e8 97 ff ff ff 39 ee 89 fa 89
> EIP: [<c107ae05>] block_invalidatepage+0x22/0x54 SS:ESP 0068:df1d7ed0
> CR2: 0000000000000004
> ---[ end trace 64b0df80bcbb02ce ]---
>
> Here, it looks like bh is NULL in block_invalidatepage's loop, dying
> at bh->b_size.
>
> I'm not sure where to go from here.

I would poke Jens Axboe (block maintainer) and linux-scsi, the issuer of 
these commands.

Also, providing filesystem info (ext3? ext4? XFS? btrfs?) and info on 
your workload would be helpful.

	Jeff




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Crash during SATA reads
  2009-11-11 21:39       ` Crash during SATA reads Jeff Garzik
@ 2009-11-11 22:19         ` Glenn Maynard
  2009-11-12  2:28           ` Dmitry Monakhov
  2009-11-12  9:16         ` Jens Axboe
  1 sibling, 1 reply; 9+ messages in thread
From: Glenn Maynard @ 2009-11-11 22:19 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-kernel, linux-scsi, Jens Axboe

On Wed, Nov 11, 2009 at 4:39 PM, Jeff Garzik <jeff@garzik.org> wrote:
> I would poke Jens Axboe (block maintainer) and linux-scsi, the issuer of
> these commands.

(Looks like you poked for me; thanks.)

> Also, providing filesystem info (ext3? ext4? XFS? btrfs?) and info on your
> workload would be helpful.

These crashes happen while running "dd if=/dev/sdb | gzip > /dev/null"
in 2.6.31.6.  I'm just making a raw image of another drive.  Removing
gzip masks the problem.  It happens anywhere between almost
immediately to after running for 15-30 minutes.

All filesystems are XFS, but none of sdb's partitions are mounted and
I'm reading the device directly (/dev/sdb, not a partition).  The
system that's making the image is mounted read-only and isn't running
anything else.

(Another backtrace at http://lkml.org/lkml/2009/11/11/81.)

Here's a curious one; EIP appears to be in data:

Pid: 1312, comm: gzip Not tainted (2.6.31.6 #6) G31M-ES2L
EIP: 0060:[<c1ae78e4>] EFLAGS: 00010286 CPU: 0
EIP is at 0xc1ae78e4
EAX: c1ae78c0 EBX: c107ccb1 ECX: c1ae78c0 EDX: 00000001
ESI: c1ae78c0 EDI: dfa39069 EBP: c7d29ed0 ESP: c7d29e94
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process gzip (pid: 1312, ti=c7d28000 task=df8b6080 task.ti=c7d28000)
Stack:
 c107ccc1 c107e559 00000200 c11587cc 00000000 df8a6c80 dfa07168 00000000
<0> 00000000 0000c000 00004000 00000000 dfa39068 00000000 dfa39068 dfa07168
<0> c1158946 df8a6c80 00000000 c11589fd 00000000 df8a6c80 00000000 dfa39068
Call Trace:
 [<c107ccc1>] ? end_bio_bh_io_sync+0x30/0x38
 [<c107e559>] ? bio_endio+0x24/0x26
 [<c11587cc>] ? blk_update_request+0xdf/0x24e
 [<c1158946>] ? blk_update_bidi_request+0xb/0x41
 [<c11589fd>] ? blk_end_bidi_request+0x10/0x4f
 [<c1158a6c>] ? blk_end_request+0x7/0xc
 [<c11abcc2>] ? scsi_end_request+0x17/0x69
 [<c11abfd3>] ? scsi_io_completion+0x173/0x335
 [<c11a8340>] ? scsi_finish_command+0x70/0x86
 [<c11ac6b6>] ? scsi_softirq_done+0xd7/0xdc
 [<c115b401>] ? blk_done_softirq+0x51/0x5d
 [<c101bde0>] ? __do_softirq+0x5f/0xc8
 [<c101be6b>] ? do_softirq+0x22/0x26
 [<c101becd>] ? irq_exit+0x29/0x34
 [<c1004097>] ? do_IRQ+0x53/0x63
 [<c1002ea9>] ? common_interrupt+0x29/0x30
Code: 00 00 00 00 00 00 00 c0 72 77 83 00 00 00 00 40 79 ae c1 40 8e
4c df 09 00 00 f0 00 00 00 00 01 00 00 00 01 00 00 00 00 00 00 00 <00>
02 00 00 00 02 00 00 04 00 00 00 ff ff ff ff 01 00 00 00 08
EIP: [<c1ae78e4>] 0xc1ae78e4 SS:ESP 0068:c7d29e94
CR2: 0000000000000001
---[ end trace fb168fcf160f6893 ]---

-- 
Glenn Maynard

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Crash during SATA reads
  2009-11-11 22:19         ` Glenn Maynard
@ 2009-11-12  2:28           ` Dmitry Monakhov
  2009-11-12  5:26             ` Glenn Maynard
  0 siblings, 1 reply; 9+ messages in thread
From: Dmitry Monakhov @ 2009-11-12  2:28 UTC (permalink / raw)
  To: Glenn Maynard; +Cc: Jeff Garzik, linux-kernel, linux-scsi, Jens Axboe

Glenn Maynard <glenn@zewt.org> writes:

> On Wed, Nov 11, 2009 at 4:39 PM, Jeff Garzik <jeff@garzik.org> wrote:
>> I would poke Jens Axboe (block maintainer) and linux-scsi, the issuer of
>> these commands.
>
> (Looks like you poked for me; thanks.)
>
>> Also, providing filesystem info (ext3? ext4? XFS? btrfs?) and info on your
>> workload would be helpful.
>
> These crashes happen while running "dd if=/dev/sdb | gzip > /dev/null"
> in 2.6.31.6.  I'm just making a raw image of another drive.  Removing
> gzip masks the problem.  It happens anywhere between almost
> immediately to after running for 15-30 minutes.
>
> All filesystems are XFS, but none of sdb's partitions are mounted and
> I'm reading the device directly (/dev/sdb, not a partition).  The
> system that's making the image is mounted read-only and isn't running
> anything else.
>
> (Another backtrace at http://lkml.org/lkml/2009/11/11/81.)
Seems what you have use-after-free here.
You probably have to add some debug info in to bio->end_io method
May be something like this.
diff --git a/fs/buffer.c b/fs/buffer.c
index 6fa5302..ce95791 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2930,6 +2930,10 @@ EXPORT_SYMBOL(generic_block_bmap);
 static void end_bio_bh_io_sync(struct bio *bio, int err)
 {
 	struct buffer_head *bh = bio->bi_private;
+	BIO_BUG_ON(!atomic_read(&bio->bi_cnt));
+	BUG_ON(!bh);
+	BUG_ON(!bh->b_page);
+	BUG_ON(!bh->b_this_page);
 
 	if (err == -EOPNOTSUPP) {
 		set_bit(BIO_EOPNOTSUPP, &bio->bi_flags);


>
> Here's a curious one; EIP appears to be in data:
>
> Pid: 1312, comm: gzip Not tainted (2.6.31.6 #6) G31M-ES2L
> EIP: 0060:[<c1ae78e4>] EFLAGS: 00010286 CPU: 0
> EIP is at 0xc1ae78e4
> EAX: c1ae78c0 EBX: c107ccb1 ECX: c1ae78c0 EDX: 00000001
> ESI: c1ae78c0 EDI: dfa39069 EBP: c7d29ed0 ESP: c7d29e94
>  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> Process gzip (pid: 1312, ti=c7d28000 task=df8b6080 task.ti=c7d28000)
> Stack:
>  c107ccc1 c107e559 00000200 c11587cc 00000000 df8a6c80 dfa07168 00000000
> <0> 00000000 0000c000 00004000 00000000 dfa39068 00000000 dfa39068 dfa07168
> <0> c1158946 df8a6c80 00000000 c11589fd 00000000 df8a6c80 00000000 dfa39068
> Call Trace:
>  [<c107ccc1>] ? end_bio_bh_io_sync+0x30/0x38
>  [<c107e559>] ? bio_endio+0x24/0x26
>  [<c11587cc>] ? blk_update_request+0xdf/0x24e
>  [<c1158946>] ? blk_update_bidi_request+0xb/0x41
>  [<c11589fd>] ? blk_end_bidi_request+0x10/0x4f
>  [<c1158a6c>] ? blk_end_request+0x7/0xc
>  [<c11abcc2>] ? scsi_end_request+0x17/0x69
>  [<c11abfd3>] ? scsi_io_completion+0x173/0x335
>  [<c11a8340>] ? scsi_finish_command+0x70/0x86
>  [<c11ac6b6>] ? scsi_softirq_done+0xd7/0xdc
>  [<c115b401>] ? blk_done_softirq+0x51/0x5d
>  [<c101bde0>] ? __do_softirq+0x5f/0xc8
>  [<c101be6b>] ? do_softirq+0x22/0x26
>  [<c101becd>] ? irq_exit+0x29/0x34
>  [<c1004097>] ? do_IRQ+0x53/0x63
>  [<c1002ea9>] ? common_interrupt+0x29/0x30
> Code: 00 00 00 00 00 00 00 c0 72 77 83 00 00 00 00 40 79 ae c1 40 8e
> 4c df 09 00 00 f0 00 00 00 00 01 00 00 00 01 00 00 00 00 00 00 00 <00>
> 02 00 00 00 02 00 00 04 00 00 00 ff ff ff ff 01 00 00 00 08
> EIP: [<c1ae78e4>] 0xc1ae78e4 SS:ESP 0068:c7d29e94
> CR2: 0000000000000001
> ---[ end trace fb168fcf160f6893 ]---

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: Crash during SATA reads
  2009-11-12  2:28           ` Dmitry Monakhov
@ 2009-11-12  5:26             ` Glenn Maynard
  0 siblings, 0 replies; 9+ messages in thread
From: Glenn Maynard @ 2009-11-12  5:26 UTC (permalink / raw)
  To: Jeff Garzik, linux-kernel, linux-scsi, Jens Axboe

On Wed, Nov 11, 2009 at 9:28 PM, Dmitry Monakhov <dmonakhov@openvz.org> wrote:
> Seems what you have use-after-free here.
> You probably have to add some debug info in to bio->end_io method
> May be something like this.

That's what it looks like, but won't these checks trigger on the use
(and give the same trace), when we need to know where the free is
happening?  The problem is manifesting in several places (bogus or
NULL bh->b_end_io in end_bio_bh_io_sync(); bh->b_this_page == NULL--I
think, havn't reproduced that again to confirm--in
block_invalidatepage()).  I'm not sure how to figure out where the
free might be happening; it's tricky enough in userspace with Valgrind
available.

I tried logging alloc_buffer_head and free_buffer_head, but it was too
much output (if it's not masking the problem entirely by changing the
timing too much, it would take ages to repro).

I just repro'd it (took several hours this time), on the
BUG_ON(!bh->b_page); assertion.  This one happened while doing a
partition copy (/dev/sdb2) rather than /dev/sdb.  Trace follows
(though I doubt it offers any new information).  Hopefully somebody
has an idea of where to look next...

kernel BUG at fs/buffer.c:2934!
invalid opcode: 0000 [#1] PREEMPT
last sysfs file:
/sys/devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/model
Modules linked in: netconsole atl1c rtc

Pid: 0, comm: swapper Not tainted (2.6.31.6 #16) G31M-ES2L
EIP: 0060:[<c107cce3>] EFLAGS: 00010282 CPU: 0
EIP is at end_bio_bh_io_sync+0x20/0x63
EAX: c1ae78c0 EBX: c107cce3 ECX: c1ae78c0 EDX: 00000000
ESI: c1ae78c0 EDI: d9e5c450 EBP: c1351eac ESP: c1351e74
 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
Process swapper (pid: 0, ti=c1350000 task=c1356260 task.ti=c1350000)
Stack:
 c107e5b5 00000400 c1158828 00000000 df8a62c0 df8ef5a4 00000000 00000000
<0> 00012c00 0000d400 00000000 d9e5c450 00000000 d9e5c450 df8ef168 c11589a2
<0> df8a6500 00000000 c1158a59 00000000 df8a6500 00000000 d9e5c450 c1158ac8
Call Trace:
 [<c107e5b5>] ? bio_endio+0x24/0x26
 [<c1158828>] ? blk_update_request+0xdf/0x24e
 [<c11589a2>] ? blk_update_bidi_request+0xb/0x41
 [<c1158a59>] ? blk_end_bidi_request+0x10/0x4f
 [<c1158ac8>] ? blk_end_request+0x7/0xc
 [<c11abd12>] ? scsi_end_request+0x17/0x69
 [<c11ac023>] ? scsi_io_completion+0x173/0x335
 [<c11a8390>] ? scsi_finish_command+0x70/0x86
 [<c11ac706>] ? scsi_softirq_done+0xd7/0xdc
 [<c115b45d>] ? blk_done_softirq+0x51/0x5d
 [<c101bde0>] ? __do_softirq+0x5f/0xc8
 [<c101be6b>] ? do_softirq+0x22/0x26
 [<c101becd>] ? irq_exit+0x29/0x34
 [<c1004097>] ? do_IRQ+0x53/0x63
 [<c1002ea9>] ? common_interrupt+0x29/0x30
 [<c100716e>] ? mwait_idle+0x3c/0x44
 [<c1001423>] ? cpu_idle+0x19/0x3a
 [<c13885aa>] ? start_kernel+0x1a4/0x1a6
Code: 54 24 18 83 c4 48 5b 5e 5f 5d c3 56 53 89 c3 8b 48 40 8b 40 34
85 c0 75 04 0f 0b eb fe 85 c9 75 04 0f 0b eb fe 83 79 08 00 75 04 <0f>
0b eb fe 83 79 04 00 75 04 0f 0b eb fe 83 fa a1 75 08 80 4b
EIP: [<c107cce3>] end_bio_bh_io_sync+0x20/0x63 SS:ESP 0068:c1351e74

-- 
Glenn Maynard

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Crash during SATA reads
  2009-11-11 21:39       ` Crash during SATA reads Jeff Garzik
  2009-11-11 22:19         ` Glenn Maynard
@ 2009-11-12  9:16         ` Jens Axboe
  2009-11-12  9:48           ` Glenn Maynard
  1 sibling, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2009-11-12  9:16 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Glenn Maynard, linux-kernel, linux-scsi

On Wed, Nov 11 2009, Jeff Garzik wrote:
> On 11/11/2009 04:09 PM, Glenn Maynard wrote:
>> Another trace, this time a bit deeper:
>>
>> BUG: unable to handle kernel NULL pointer dereference at (null)
>> IP: [<c1079e7b>] end_buffer_async_read+0x59/0xc5
>> *pde = 00000000
>> Oops: 0000 [#1] PREEMPT
>> last sysfs file:
>> Modules linked in: netconsole atl1c rtc
>>
>> Pid: 1300, comm: gzip Not tainted (2.6.31.6 #5) G31M-ES2L
>> EIP: 0060:[<c1079e7b>] EFLAGS: 00010013 CPU: 0
>> EIP is at end_buffer_async_read+0x59/0xc5
>> EAX: 00000000 EBX: c1ae7b38 ECX: 00000000 EDX: c1079e22
>> ESI: 00000202 EDI: c17b5040 EBP: 00000000 ESP: df1dde80
>>   DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
>> Process gzip (pid: 1300, ti=df1dc000 task=df866420 task.ti=df1dc000)
>> Stack:
>>   ce632840 ce632840 df91de78 df1dded0 c107ccb9 c107cc89 c107e551 00000200
>> <0>  c11587c4 00000000 00000001 df430298 00000000 00000000 0000be00 00004200
>> <0>  00000000 df91de78 00000000 df91de78 df8ef168 c115893e df8a6c80 00000000
>> Call Trace:
>>   [<c107ccb9>] ? end_bio_bh_io_sync+0x30/0x38
>>   [<c107cc89>] ? end_bio_bh_io_sync+0x0/0x38
>>   [<c107e551>] ? bio_endio+0x24/0x26
>>   [<c11587c4>] ? blk_update_request+0xdf/0x24e
>>   [<c115893e>] ? blk_update_bidi_request+0xb/0x41
>>   [<c11589f5>] ? blk_end_bidi_request+0x10/0x4f
>>   [<c1158a64>] ? blk_end_request+0x7/0xc
>>   [<c11abcb2>] ? scsi_end_request+0x17/0x69
>>   [<c11abfc3>] ? scsi_io_completion+0x173/0x335
>>   [<c11a8330>] ? scsi_finish_command+0x70/0x86
>>   [<c11ac6a6>] ? scsi_softirq_done+0xd7/0xdc
>>   [<c115b3f9>] ? blk_done_softirq+0x51/0x5d
>>   [<c101bde0>] ? __do_softirq+0x5f/0xc8
>>   [<c101be6b>] ? do_softirq+0x22/0x26
>>   [<c101becd>] ? irq_exit+0x29/0x34
>>   [<c1004097>] ? do_IRQ+0x53/0x63
>>   [<c1002ea9>] ? common_interrupt+0x29/0x30
>> Code: e8 d2 fd ff ff 80 0f 02 f6 47 01 08 75 04 0f 0b eb fe 9c 5e fa
>> 89 e0 25 00 e0 ff ff ff 40 14 80 23 7f 89 d8 e8 58 fd ff ff 89 d9<8b>
>> 11 b8 00 00 00 00 f6 c2 01 0f 44 e8 f6 c2 80 74 09 80 e2 04
>> EIP: [<c1079e7b>] end_buffer_async_read+0x59/0xc5 SS:ESP 0068:df1dde80
>> CR2: 0000000000000000
>> ---[ end trace 2e93648aef49fa0e ]---
>>
>> It looks like buffer_tmp was NULL during end_buffer_async_read's
>> buffer_uptodate(tmp).
>>
>>
>> This one just happened when I interrupted the dd:
>>
>> BUG: unable to handle kernel NULL pointer dereference at 00000004
>> IP: [<c107ae05>] block_invalidatepage+0x22/0x54
>> *pde = 00000000
>> Oops: 0000 [#1] PREEMPT
>> last sysfs file:
>> Modules linked in: netconsole atl1c rtc
>>
>> Pid: 1296, comm: dd Not tainted (2.6.31.6 #6) G31M-ES2L
>> EIP: 0060:[<c107ae05>] EFLAGS: 00010217 CPU: 0
>> EIP is at block_invalidatepage+0x22/0x54
>> EAX: 00000000 EBX: c14f1540 ECX: 00000002 EDX: 00001000
>> ESI: 00000000 EDI: 00001000 EBP: c8d4fc88 ESP: df1d7ed0
>>   DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
>> Process dd (pid: 1296, ti=df1d6000 task=df87bc00 task.ti=df1d6000)
>> Stack:
>>   00000000 c14f1540 00000000 c14f1540 00064156 c1048c29 c14f1540 c1048cab
>> <0>  00000001 c1048def ffffffff 00000000 df4cff54 0000000e 00000000 c13f0440
>> <0>  c14f1540 c14f2ae0 c13f7a40 c13f7e80 c13fd220 c13efd40 c13efd60 c13f01c0
>> Call Trace:
>>   [<c1048c29>] ? do_invalidatepage+0x1d/0x1f
>>   [<c1048cab>] ? truncate_complete_page+0x19/0x4b
>>   [<c1048def>] ? truncate_inode_pages_range+0xd1/0x2af
>>   [<c1048fd6>] ? truncate_inode_pages+0x9/0xc
>>   [<c107fb83>] ? __blkdev_put+0x47/0xec
>>   [<c1060fe7>] ? __fput+0xc4/0x177
>>   [<c105fa8a>] ? filp_close+0x4c/0x52
>>   [<c105faee>] ? sys_close+0x5e/0x99
>>   [<c10027f0>] ? sysenter_do_call+0x12/0x22
>> Code: bf 89 d8 5b e9 ee ed ff ff 55 57 56 53 89 c3 56 89 14 24 31 d2
>> 8b 00 a8 01 75 04 0f 0b eb fe f6 c4 08 74 33 8b 6b 0c 89 e8 89 d7<8b>
>> 70 04 03 78 14 39 14 24 77 05 e8 97 ff ff ff 39 ee 89 fa 89
>> EIP: [<c107ae05>] block_invalidatepage+0x22/0x54 SS:ESP 0068:df1d7ed0
>> CR2: 0000000000000004
>> ---[ end trace 64b0df80bcbb02ce ]---
>>
>> Here, it looks like bh is NULL in block_invalidatepage's loop, dying
>> at bh->b_size.
>>
>> I'm not sure where to go from here.
>
> I would poke Jens Axboe (block maintainer) and linux-scsi, the issuer of  
> these commands.

So the device isn't mounted, which means that we'll default to 1k
blocks. I think that's a big clue here, as you'll have 4 buffer_heads
pointing to the same page. And one of your oopses is indeed in the
->b_this_page unroll.

Can you trigger this if you mount/umount the device first?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Crash during SATA reads
  2009-11-12  9:16         ` Jens Axboe
@ 2009-11-12  9:48           ` Glenn Maynard
  2009-11-12  9:55             ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: Glenn Maynard @ 2009-11-12  9:48 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Jeff Garzik, linux-kernel, linux-scsi

On Thu, Nov 12, 2009 at 4:16 AM, Jens Axboe <jens.axboe@oracle.com> wrote:
> So the device isn't mounted, which means that we'll default to 1k
> blocks. I think that's a big clue here, as you'll have 4 buffer_heads
> pointing to the same page. And one of your oopses is indeed in the
> ->b_this_page unroll.
>
> Can you trigger this if you mount/umount the device first?

Yeah.  I've reproduced it reading from /dev/sda (the drive the root
partition is mounted on) with no second drive attached, too.

I'm currently reproducing it with "dd if=/dev/sda of=/dev/sdb", since
that reproduces faster and more reproducably than just reading.
Here's a trace from one of these:

BUG: unable to handle kernel NULL pointer dereference at 000000c0
IP: [<c1097067>] bdevname+0x8/0x13
*pde = 00000000
Oops: 0000 [#1] PREEMPT
last sysfs file:
/sys/devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/model
Modules linked in: netconsole atl1c rtc

Pid: 151, comm: pdflush Not tainted (2.6.31.6 #17) G31M-ES2L
EIP: 0060:[<c1097067>] EFLAGS: 00010246 CPU: 0
EIP is at bdevname+0x8/0x13
EAX: df4ddcf0 EBX: 00000000 ECX: df87fd54 EDX: 00000000
ESI: d5cfa0c0 EDI: 00000000 EBP: 00000001 ESP: df87fd38
 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
Process pdflush (pid: 151, ti=df87e000 task=df8caae0 task.ti=df87e000)
Stack:
 c11582ff 0006d250 00000000 fffffffb 00000002 00000246 c1ae7e80 00011200
<0> 00011210 df83a9e0 c1043b26 df87fd60 00000004 00000101 c101be1c 00000000
<0> 00000000 d5cfa0c0 00000001 c11583e4 00000001 df83aa00 df83aa00 c107d6db
Call Trace:
 [<c11582ff>] ? generic_make_request+0x1cc/0x209
 [<c1043b26>] ? mempool_alloc+0x1e/0xd6
 [<c101be1c>] ? __do_softirq+0x9b/0xc8
 [<c11583e4>] ? submit_bio+0xa8/0xb0
 [<c107d6db>] ? bio_alloc_bioset+0x3e/0xa4
 [<c107ce92>] ? submit_bh+0x127/0x152
 [<c107b117>] ? __block_write_full_page+0x1ff/0x2d8
 [<c107eafd>] ? blkdev_get_block+0x0/0x46
 [<c107cc5e>] ? block_write_full_page_endio+0xe2/0xec
 [<c1079ef7>] ? end_buffer_async_write+0x0/0xd4
 [<c107eafd>] ? blkdev_get_block+0x0/0x46
 [<c107cc72>] ? block_write_full_page+0xa/0xc
 [<c1079ef7>] ? end_buffer_async_write+0x0/0xd4
 [<c104750c>] ? __writepage+0x8/0x1d
 [<c1047407>] ? write_cache_pages+0x1bf/0x2bc
 [<c1047504>] ? __writepage+0x0/0x1d
 [<c1047521>] ? generic_writepages+0x0/0x1f
 [<c104753d>] ? generic_writepages+0x1c/0x1f
 [<c1047562>] ? do_writepages+0x22/0x32
 [<c1076444>] ? writeback_single_inode+0xad/0x1b2
 [<c1076672>] ? generic_sync_sb_inodes+0x129/0x26e
 [<c1047c80>] ? pdflush+0x0/0x2d
 [<c1076820>] ? writeback_inodes+0x64/0xb4
 [<c1047040>] ? background_writeout+0x5f/0x91
 [<c1047bab>] ? __pdflush+0xcf/0x1a4
 [<c1047ca9>] ? pdflush+0x29/0x2d
 [<c1046fe1>] ? background_writeout+0x0/0x91
 [<c1047c80>] ? pdflush+0x0/0x2d
 [<c1026d72>] ? kthread+0x6b/0x70
 [<c1026d07>] ? kthread+0x0/0x70
 [<c1002f93>] ? kernel_thread_helper+0x7/0x10
Code: 09 56 53 68 f7 09 2e c1 eb 07 56 53 68 90 0a 2e c1 6a 20 55 e8
f0 01 0d 00 83 c4 14 5b 89 e8 5e 5f 5d c3 89 d1 8b 50 48 8b 40 54 <8b>
92 c0 00 00 00 e9 92 ff ff ff 53 89 d3 89 c2 81 e2 ff ff 0f
EIP: [<c1097067>] bdevname+0x8/0x13 SS:ESP 0068:df87fd38
CR2: 00000000000000c0

-- 
Glenn Maynard

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Crash during SATA reads
  2009-11-12  9:48           ` Glenn Maynard
@ 2009-11-12  9:55             ` Jens Axboe
  2009-11-12 10:11               ` Glenn Maynard
  0 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2009-11-12  9:55 UTC (permalink / raw)
  To: Glenn Maynard; +Cc: Jeff Garzik, linux-kernel, linux-scsi

On Thu, Nov 12 2009, Glenn Maynard wrote:
> On Thu, Nov 12, 2009 at 4:16 AM, Jens Axboe <jens.axboe@oracle.com> wrote:
> > So the device isn't mounted, which means that we'll default to 1k
> > blocks. I think that's a big clue here, as you'll have 4 buffer_heads
> > pointing to the same page. And one of your oopses is indeed in the
> > ->b_this_page unroll.
> >
> > Can you trigger this if you mount/umount the device first?
> 
> Yeah.  I've reproduced it reading from /dev/sda (the drive the root
> partition is mounted on) with no second drive attached, too.
> 
> I'm currently reproducing it with "dd if=/dev/sda of=/dev/sdb", since
> that reproduces faster and more reproducably than just reading.
> Here's a trace from one of these:
> 
> BUG: unable to handle kernel NULL pointer dereference at 000000c0
> IP: [<c1097067>] bdevname+0x8/0x13
> *pde = 00000000
> Oops: 0000 [#1] PREEMPT
> last sysfs file:
> /sys/devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/model
> Modules linked in: netconsole atl1c rtc

To rule out any potential hardware problems, have you run any extensive
memory checks on this machine?

> Pid: 151, comm: pdflush Not tainted (2.6.31.6 #17) G31M-ES2L
> EIP: 0060:[<c1097067>] EFLAGS: 00010246 CPU: 0
> EIP is at bdevname+0x8/0x13
> EAX: df4ddcf0 EBX: 00000000 ECX: df87fd54 EDX: 00000000
> ESI: d5cfa0c0 EDI: 00000000 EBP: 00000001 ESP: df87fd38
>  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> Process pdflush (pid: 151, ti=df87e000 task=df8caae0 task.ti=df87e000)
> Stack:
>  c11582ff 0006d250 00000000 fffffffb 00000002 00000246 c1ae7e80 00011200
> <0> 00011210 df83a9e0 c1043b26 df87fd60 00000004 00000101 c101be1c 00000000
> <0> 00000000 d5cfa0c0 00000001 c11583e4 00000001 df83aa00 df83aa00 c107d6db
> Call Trace:
>  [<c11582ff>] ? generic_make_request+0x1cc/0x209
>  [<c1043b26>] ? mempool_alloc+0x1e/0xd6
>  [<c101be1c>] ? __do_softirq+0x9b/0xc8
>  [<c11583e4>] ? submit_bio+0xa8/0xb0
>  [<c107d6db>] ? bio_alloc_bioset+0x3e/0xa4
>  [<c107ce92>] ? submit_bh+0x127/0x152
>  [<c107b117>] ? __block_write_full_page+0x1ff/0x2d8
>  [<c107eafd>] ? blkdev_get_block+0x0/0x46
>  [<c107cc5e>] ? block_write_full_page_endio+0xe2/0xec
>  [<c1079ef7>] ? end_buffer_async_write+0x0/0xd4
>  [<c107eafd>] ? blkdev_get_block+0x0/0x46
>  [<c107cc72>] ? block_write_full_page+0xa/0xc
>  [<c1079ef7>] ? end_buffer_async_write+0x0/0xd4
>  [<c104750c>] ? __writepage+0x8/0x1d
>  [<c1047407>] ? write_cache_pages+0x1bf/0x2bc
>  [<c1047504>] ? __writepage+0x0/0x1d
>  [<c1047521>] ? generic_writepages+0x0/0x1f
>  [<c104753d>] ? generic_writepages+0x1c/0x1f
>  [<c1047562>] ? do_writepages+0x22/0x32
>  [<c1076444>] ? writeback_single_inode+0xad/0x1b2
>  [<c1076672>] ? generic_sync_sb_inodes+0x129/0x26e
>  [<c1047c80>] ? pdflush+0x0/0x2d
>  [<c1076820>] ? writeback_inodes+0x64/0xb4
>  [<c1047040>] ? background_writeout+0x5f/0x91
>  [<c1047bab>] ? __pdflush+0xcf/0x1a4
>  [<c1047ca9>] ? pdflush+0x29/0x2d
>  [<c1046fe1>] ? background_writeout+0x0/0x91
>  [<c1047c80>] ? pdflush+0x0/0x2d
>  [<c1026d72>] ? kthread+0x6b/0x70
>  [<c1026d07>] ? kthread+0x0/0x70
>  [<c1002f93>] ? kernel_thread_helper+0x7/0x10
> Code: 09 56 53 68 f7 09 2e c1 eb 07 56 53 68 90 0a 2e c1 6a 20 55 e8
> f0 01 0d 00 83 c4 14 5b 89 e8 5e 5f 5d c3 89 d1 8b 50 48 8b 40 54 <8b>
> 92 c0 00 00 00 e9 92 ff ff ff 53 89 d3 89 c2 81 e2 ff ff 0f
> EIP: [<c1097067>] bdevname+0x8/0x13 SS:ESP 0068:df87fd38
> CR2: 00000000000000c0

This one is different, it's on the submit path.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Crash during SATA reads
  2009-11-12  9:55             ` Jens Axboe
@ 2009-11-12 10:11               ` Glenn Maynard
  2009-11-12 10:12                 ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: Glenn Maynard @ 2009-11-12 10:11 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Jeff Garzik, linux-kernel, linux-scsi

On Thu, Nov 12, 2009 at 4:55 AM, Jens Axboe <jens.axboe@oracle.com> wrote:
> To rule out any potential hardware problems, have you run any extensive
> memory checks on this machine?

God damn it.  I ran memtest+ just a week or so ago and it was fine
through several passes; now it's failing badly.  I'm inclined to send
Kingston a bill for a day and a half of wasted time.

I didn't think back to memory since I'm used to memory problems in
userspace, where they cause wildly variable crashes.  Of course, the
kernel's far more likely to stick data in the same spot, so it makes
sense that it doesn't cause as much volatility, leading to a bunch of
related-looking crashes.

Thanks.

-- 
Glenn Maynard

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Crash during SATA reads
  2009-11-12 10:11               ` Glenn Maynard
@ 2009-11-12 10:12                 ` Jens Axboe
  0 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2009-11-12 10:12 UTC (permalink / raw)
  To: Glenn Maynard; +Cc: Jeff Garzik, linux-kernel, linux-scsi

On Thu, Nov 12 2009, Glenn Maynard wrote:
> On Thu, Nov 12, 2009 at 4:55 AM, Jens Axboe <jens.axboe@oracle.com> wrote:
> > To rule out any potential hardware problems, have you run any extensive
> > memory checks on this machine?
> 
> God damn it.  I ran memtest+ just a week or so ago and it was fine
> through several passes; now it's failing badly.  I'm inclined to send
> Kingston a bill for a day and a half of wasted time.

Auch, perhaps it's just seated badly? Power problems? Or yes, you just
got crap RAM :/

> I didn't think back to memory since I'm used to memory problems in
> userspace, where they cause wildly variable crashes.  Of course, the
> kernel's far more likely to stick data in the same spot, so it makes
> sense that it doesn't cause as much volatility, leading to a bunch of
> related-looking crashes.

Well, at least it's a relief that it isn't a kernel issue :-)

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2009-11-12 10:12 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <bd36f99e0911110118v6dc7f87ao98968ba234fad7fc@mail.gmail.com>
     [not found] ` <4AFA835B.9000904@garzik.org>
     [not found]   ` <bd36f99e0911110412hf2c7920gf06bd29ed43e240c@mail.gmail.com>
     [not found]     ` <bd36f99e0911111309ie8166a9h8f926948bf15d350@mail.gmail.com>
2009-11-11 21:39       ` Crash during SATA reads Jeff Garzik
2009-11-11 22:19         ` Glenn Maynard
2009-11-12  2:28           ` Dmitry Monakhov
2009-11-12  5:26             ` Glenn Maynard
2009-11-12  9:16         ` Jens Axboe
2009-11-12  9:48           ` Glenn Maynard
2009-11-12  9:55             ` Jens Axboe
2009-11-12 10:11               ` Glenn Maynard
2009-11-12 10:12                 ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox