* PROBLEM: kernel crashes on RAID1 drive error
From: Mark Rustad @ 2004-10-20 22:08 UTC
To: linux-raid, linux-scsi
Folks,
I have been having trouble with kernel crashes resulting from RAID1
component device failures. I have been testing the robustness of an
embedded system and have been using a drive that is known to fail after
a time under load. When this device returns a media error, I always
wind up with either a kernel hang or reboot. In this environment, each
drive has four partitions, each of which is part of a RAID1 with its
partner on the other device. Swap is on md2 so even it should be
robust.
I have gotten this result with the SuSE standard i386 smp kernels
2.6.5-7.97 and 2.6.5-7.108. I also get these failures with the
kernel.org kernels 2.6.8.1, 2.6.9-rc4 and 2.6.9.
The hardware setup is a two-CPU Nocona system with an Adaptec 7902 SCSI
controller with two Seagate drives on a SAF-TE bus. I run three or four
dd commands copying /dev/md0 to /dev/null to provide the activity that
stimulates the failure.
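(The read load itself is nothing exotic; a minimal C equivalent of
those dd invocations is sketched below, assuming /dev/md0 and a 64KB
read size. The actual test simply ran dd as described above.)
/* Sketch of one sequential reader, roughly "dd if=/dev/md0 of=/dev/null";
 * run three or four instances in parallel. The device path and buffer
 * size are assumptions, not taken from the original test. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
        const char *dev = argc > 1 ? argv[1] : "/dev/md0";
        static char buf[65536];
        ssize_t n;
        int fd = open(dev, O_RDONLY);

        if (fd < 0) {
                perror(dev);
                return 1;
        }
        while ((n = read(fd, buf, sizeof(buf))) > 0)
                ;       /* discard the data; only the I/O matters */
        if (n < 0)
                perror("read");
        close(fd);
        return 0;
}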
I suspect that something is going wrong in the retry of the failed I/O
operations, but I'm really not familiar with any of this area of the
kernel at all.
In one failure, I get the following messages from kernel 2.6.9:
raid1: Disk failure on sdb1, disabling device.
raid1: sdb1: rescheduling sector 176
raid1: sda1: redirecting sector 176 to another mirror
raid1: sdb1: rescheduling sector 184
raid1: sda1: redirecting sector 184 to another mirror
Incorrect number of segments after building list
counted 2, received 1
req nr_sec 0, cur_nr_sec 7
raid1: sda1: rescheduling sector 176
raid1: sda1: redirecting sector 176 to another mirror
Incorrect number of segments after building list
counted 2, received 1
req nr_sec 0, cur_nr_sec 7
raid1: sda1: rescheduling sector 184
raid1: sda1: redirecting sector 184 to another mirror
Incorrect number of segments after building list
counted 3, received 1
req nr_sec 0, cur_nr_sec 7
raid1: sda1: rescheduling sector 176
raid1: sda1: redirecting sector 176 to another mirror
Incorrect number of segments after building list
counted 2, received 1
---
The above messages go on essentially forever, or at least until this
activity itself causes something to wedge.
The other failure I get is an oops. Here is the output from ksymoops:
ksymoops 2.4.9 on i686 2.6.5-7.97-bigsmp. Options used
-v vmlinux (specified)
-K (specified)
-L (specified)
-O (specified)
-M (specified)
kernel BUG at /usr/src/linux-2.6.9/fs/buffer.c:614!
invalid operand: 0000 [#1]
CPU: 1
EIP: 0060:[<c014faf9>] Not tainted VLI
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246 (2.6.9-3d-1)
eax: 00000019 ebx: c0dc695c ecx: c0dc695c edx: 00000001
esi: 00000001 edi: 00000000 ebp: 00000000 esp: df9f7d30
ds: 007b es: 007b ss: 0068
Stack: dec21540 c0152128 00000000 00000000 c015214b dec21540 c0153338 c0152956
       c02f26b9 f7cf1d80 df8aea00 f7cf1dc0 f7cf1dc0 df8aea00 c02f2738 c013637e
       f7cf1dc0 00000001 df8aea00 00000000 c02f2815 00002002 d2a0ab00 df9f7d94
Call Trace:
[<c0152128>] end_bio_bh_io_sync+0x0/0x3b
[<c015214b>] end_bio_bh_io_sync+0x23/0x3b
[<c0153338>] bio_endio+0x3b/0x65
[<c0152956>] bio_put+0x21/0x2d
[<c02f26b9>] put_all_bios+0x3d/0x57
[<c02f2738>] raid_end_bio_io+0x22/0xb8
[<c013637e>] mempool_free+0x6c/0x73
[<c02f2815>] raid1_end_read_request+0x47/0xcb
[<c02a846d>] scsi_softirq+0xbf/0xcd
[<c0136257>] mempool_alloc+0x66/0x121
[<c02f27ce>] raid1_end_read_request+0x0/0xcb
[<c0153338>] bio_endio+0x3b/0x65
[<c0279dd4>] __end_that_request_first+0xe3/0x22d
[<c011e537>] prepare_to_wait_exclusive+0x15/0x4c
[<c02ac212>] scsi_end_request+0x1b/0xa6
[<c02ac56d>] scsi_io_completion+0x16a/0x4a3
[<c011d2d5>] __wake_up+0x32/0x43
[<c02a851e>] scsi_finish_command+0x7d/0xd1
[<c02a846d>] scsi_softirq+0xbf/0xcd
[<c0124342>] __do_softirq+0x62/0xcd
[<c01243da>] do_softirq+0x2d/0x35
[<c0108b38>] do_IRQ+0x112/0x129
[<c0106cc0>] common_interrupt+0x18/0x20
[<c027007b>] uart_block_til_ready+0x18e/0x193
[<c02f2b60>] unplug_slaves+0x95/0x97
[<c02f3b29>] raid1d+0x186/0x18e
[<c02f85ac>] md_thread+0x174/0x19a
[<c011e5b9>] autoremove_wake_function+0x0/0x37
[<c011e5b9>] autoremove_wake_function+0x0/0x37
[<c02f8438>] md_thread+0x0/0x19a
[<c01047fd>] kernel_thread_helper+0x5/0xb
Code: ff f0 0f ba 2f 01 eb a0 8b 02 a8 04 74 2a 5b 89 ea b8 f4 28 3e c0 5e 5f 5d
>>EIP; c014faf9 <end_buffer_async_read+a4/bb> <=====
>>ebx; c0dc695c <pg0+83995c/3fa71400>
>>ecx; c0dc695c <pg0+83995c/3fa71400>
>>esp; df9f7d30 <pg0+1f46ad30/3fa71400>
Trace; c0152128 <end_bio_bh_io_sync+0/3b>
Trace; c015214b <end_bio_bh_io_sync+23/3b>
Trace; c0153338 <bio_endio+3b/65>
Trace; c0152956 <bio_put+21/2d>
Trace; c02f26b9 <put_all_bios+3d/57>
Trace; c02f2738 <raid_end_bio_io+22/b8>
Trace; c013637e <mempool_free+6c/73>
Trace; c02f2815 <raid1_end_read_request+47/cb>
Trace; c02a846d <scsi_softirq+bf/cd>
Trace; c0136257 <mempool_alloc+66/121>
Trace; c02f27ce <raid1_end_read_request+0/cb>
Trace; c0153338 <bio_endio+3b/65>
Trace; c0279dd4 <__end_that_request_first+e3/22d>
Trace; c011e537 <prepare_to_wait_exclusive+15/4c>
Trace; c02ac212 <scsi_end_request+1b/a6>
Trace; c02ac56d <scsi_io_completion+16a/4a3>
Trace; c011d2d5 <__wake_up+32/43>
Trace; c02a851e <scsi_finish_command+7d/d1>
Trace; c02a846d <scsi_softirq+bf/cd>
Trace; c0124342 <__do_softirq+62/cd>
Trace; c01243da <do_softirq+2d/35>
Trace; c0108b38 <do_IRQ+112/129>
Trace; c0106cc0 <common_interrupt+18/20>
Trace; c027007b <uart_block_til_ready+18e/193>
Trace; c02f2b60 <unplug_slaves+95/97>
Trace; c02f3b29 <raid1d+186/18e>
Trace; c02f85ac <md_thread+174/19a>
Trace; c011e5b9 <autoremove_wake_function+0/37>
Trace; c011e5b9 <autoremove_wake_function+0/37>
Trace; c02f8438 <md_thread+0/19a>
Trace; c01047fd <kernel_thread_helper+5/b>
Code; c014faf9 <end_buffer_async_read+a4/bb>
00000000 <_EIP>:
Code; c014faf9 <end_buffer_async_read+a4/bb> <=====
0: ff f0 push %eax <=====
Code; c014fafb <end_buffer_async_read+a6/bb>
2: 0f ba 2f 01 btsl $0x1,(%edi)
Code; c014faff <end_buffer_async_read+aa/bb>
6: eb a0 jmp ffffffa8 <_EIP+0xffffffa8>
Code; c014fb01 <end_buffer_async_read+ac/bb>
8: 8b 02 mov (%edx),%eax
Code; c014fb03 <end_buffer_async_read+ae/bb>
a: a8 04 test $0x4,%al
Code; c014fb05 <end_buffer_async_read+b0/bb>
c: 74 2a je 38 <_EIP+0x38>
Code; c014fb07 <end_buffer_async_read+b2/bb>
e: 5b pop %ebx
Code; c014fb08 <end_buffer_async_read+b3/bb>
f: 89 ea mov %ebp,%edx
Code; c014fb0a <end_buffer_async_read+b5/bb>
11: b8 f4 28 3e c0 mov $0xc03e28f4,%eax
Code; c014fb0f <end_buffer_async_read+ba/bb>
16: 5e pop %esi
Code; c014fb10 <end_buffer_async_write+0/de>
17: 5f pop %edi
Code; c014fb11 <end_buffer_async_write+1/de>
18: 5d pop %ebp
<0>Kernel panic - not syncing: Fatal exception in interrupt
---
In these cases, the kernel is a monolithic kernel - no modules at all.
Since the problem also happens with the standard SuSE smp kernel, which
does have modules, I don't believe that that is a factor. We just don't
need modules in our embedded system.
I don't know if the problem is in the raid1 code, in the general SCSI
code or in the Adaptec driver somewhere. Does anyone have a clue?
Note that using mdadm to fail a drive is utterly unlike this and seems
to work ok. It seems to take an honest-to-goodness broken drive to get
this failure. Of course, the whole point of RAID1 is to handle a
failing drive, so this is kind of a serious problem.
--
Mark Rustad, MRustad@aol.com
* Re: PROBLEM: kernel crashes on RAID1 drive error
From: Jens Axboe @ 2004-10-21 8:45 UTC
To: Mark Rustad; +Cc: linux-raid, linux-scsi
On Wed, Oct 20 2004, Mark Rustad wrote:
> <snip>
> In one failure, I get the following messages from kernel 2.6.9:
>
> raid1: Disk failure on sdb1, disabling device.
> raid1: sdb1: rescheduling sector 176
> raid1: sda1: redirecting sector 176 to another mirror
> raid1: sdb1: rescheduling sector 184
> raid1: sda1: redirecting sector 184 to another mirror
> Incorrect number of segments after building list
> counted 2, received 1
> req nr_sec 0, cur_nr_sec 7
This should be fixed by this patch, can you test it?
===== drivers/block/ll_rw_blk.c 1.273 vs edited =====
--- 1.273/drivers/block/ll_rw_blk.c 2004-10-19 11:40:18 +02:00
+++ edited/drivers/block/ll_rw_blk.c 2004-10-20 17:06:12 +02:00
@@ -2766,22 +2767,36 @@
 {
         struct bio *bio, *prevbio = NULL;
         int nr_phys_segs, nr_hw_segs;
+        unsigned int phys_size, hw_size;
+        request_queue_t *q = rq->q;
 
         if (!rq->bio)
                 return;
 
-        nr_phys_segs = nr_hw_segs = 0;
+        phys_size = hw_size = nr_phys_segs = nr_hw_segs = 0;
         rq_for_each_bio(bio, rq) {
                 /* Force bio hw/phys segs to be recalculated. */
                 bio->bi_flags &= ~(1 << BIO_SEG_VALID);
 
-                nr_phys_segs += bio_phys_segments(rq->q, bio);
-                nr_hw_segs += bio_hw_segments(rq->q, bio);
+                nr_phys_segs += bio_phys_segments(q, bio);
+                nr_hw_segs += bio_hw_segments(q, bio);
                 if (prevbio) {
-                        if (blk_phys_contig_segment(rq->q, prevbio, bio))
+                        int pseg = phys_size + prevbio->bi_size + bio->bi_size;
+                        int hseg = hw_size + prevbio->bi_size + bio->bi_size;
+
+                        if (blk_phys_contig_segment(q, prevbio, bio) &&
+                            pseg <= q->max_segment_size) {
                                 nr_phys_segs--;
-                        if (blk_hw_contig_segment(rq->q, prevbio, bio))
+                                phys_size += prevbio->bi_size + bio->bi_size;
+                        } else
+                                phys_size = 0;
+
+                        if (blk_hw_contig_segment(q, prevbio, bio) &&
+                            hseg <= q->max_segment_size) {
                                 nr_hw_segs--;
+                                hw_size += prevbio->bi_size + bio->bi_size;
+                        } else
+                                hw_size = 0;
                 }
                 prevbio = bio;
         }
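To illustrate the accounting issue this addresses, here is a
user-space toy with made-up sizes (it is not kernel code): each
adjacent pair of contiguous chunks can fit under the segment size
limit while a whole run of merges does not, so a recount that merges
on contiguity alone reports fewer segments than a size-limited build
produces later. That is the "counted 2, received 1" pattern above.
/* Toy model (made-up numbers, not kernel code): count segments for a
 * run of physically contiguous chunks, (a) merging on contiguity alone
 * and (b) merging only while the accumulated size stays within the
 * segment size limit, which is what the hardware-facing code does. */
#include <stdio.h>

#define MAX_SEGMENT_SIZE 65536

static int count_segments(const unsigned int *sizes, int n, int respect_limit)
{
        int segs = 0;
        unsigned int run = 0;

        for (int i = 0; i < n; i++) {
                /* all chunks are assumed physically contiguous here */
                if (run == 0 ||
                    (respect_limit && run + sizes[i] > MAX_SEGMENT_SIZE)) {
                        segs++;                 /* start a new segment */
                        run = sizes[i];
                } else {
                        run += sizes[i];        /* merge into current segment */
                }
        }
        return segs;
}

int main(void)
{
        /* three contiguous 28KB chunks: each pair fits in 64KB,
         * but all three together do not */
        unsigned int sizes[] = { 28672, 28672, 28672 };
        int n = sizeof(sizes) / sizeof(sizes[0]);

        printf("merged on contiguity alone : %d segment(s)\n",
               count_segments(sizes, n, 0));
        printf("merged with size limit     : %d segment(s)\n",
               count_segments(sizes, n, 1));
        return 0;
}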
--
Jens Axboe
* Re: PROBLEM: kernel crashes on RAID1 drive error
From: Paul Clements @ 2004-10-21 13:52 UTC
To: Mark Rustad; +Cc: Jens Axboe, linux-raid, linux-scsi
Jens Axboe wrote:
> On Wed, Oct 20 2004, Mark Rustad wrote:
>
>><snip>
>
>
> This should be fixed by this patch, can you test it?
There may well be two problems here, but the original problem you're
seeing (infinite read retries, and failures) is due to a bug in raid1.
Basically the bio handling on read error retry was not quite right. Neil
Brown just posted the patch to correct this a couple of days ago:
http://marc.theaimsgroup.com/?l=linux-raid&m=109824318202358&w=2
Please try that. (If you need a patch that applies to SUSE 2.6.5, I also
have a version of the patch which should apply to that).
Please be aware that there are several other bugs in the SUSE 2.6.5-7.97
kernel in md and raid1 (basically it's a matter of that kernel being
somewhat behind mainline, where most of these bugs are now fixed). I've
sent several patches to SUSE to fix these issues, that hopefully will
get into their SP1 release that should be forthcoming soon...
--
Paul
* Re: PROBLEM: kernel crashes on RAID1 drive error
From: Jens Axboe @ 2004-10-21 13:55 UTC
To: Paul Clements; +Cc: Mark Rustad, linux-raid, linux-scsi
On Thu, Oct 21 2004, Paul Clements wrote:
> Jens Axboe wrote:
> >On Wed, Oct 20 2004, Mark Rustad wrote:
> >><snip>
> >This should be fixed by this patch, can you test it?
>
> There may well be two problems here, but the original problem you're
> seeing (infinite read retries, and failures) is due to a bug in raid1.
> Basically the bio handling on read error retry was not quite right. Neil
> Brown just posted the patch to correct this a couple of days ago:
>
> http://marc.theaimsgroup.com/?l=linux-raid&m=109824318202358&w=2
>
> Please try that. (If you need a patch that applies to SUSE 2.6.5, I also
> have a version of the patch which should apply to that).
Is 2.6.9 not uptodate wrt those raid1 patches?!
> Please be aware that there are several other bugs in the SUSE 2.6.5-7.97
> kernel in md and raid1 (basically it's a matter of that kernel being
> somewhat behind mainline, where most of these bugs are now fixed). I've
> sent several patches to SUSE to fix these issues, that hopefully will
> get into their SP1 release that should be forthcoming soon...
-97 is the release kernel, -111 is the current update kernel. And it has
those raid1 issues fixed already, at least the ones that are known. The
scsi segment issue is not, however.
--
Jens Axboe
* Re: PROBLEM: kernel crashes on RAID1 drive error
From: Paul Clements @ 2004-10-21 14:01 UTC
To: Jens Axboe; +Cc: Mark Rustad, linux-raid, linux-scsi
Jens Axboe wrote:
> On Thu, Oct 21 2004, Paul Clements wrote:
>
>>Jens Axboe wrote:
>>
>>>On Wed, Oct 20 2004, Mark Rustad wrote:
>>>><snip>
>>>This should be fixed by this patch, can you test it?
>>
>>There may well be two problems here, but the original problem you're
>>seeing (infinite read retries, and failures) is due to a bug in raid1.
>>Basically the bio handling on read error retry was not quite right. Neil
>>Brown just posted the patch to correct this a couple of days ago:
>>
>>http://marc.theaimsgroup.com/?l=linux-raid&m=109824318202358&w=2
>>
>>Please try that. (If you need a patch that applies to SUSE 2.6.5, I also
>>have a version of the patch which should apply to that).
>
>
> Is 2.6.9 not uptodate wrt those raid1 patches?!
Unfortunately, no. This latest problem (the one he's reporting) is not
fixed in mainline. I discovered the problem a month or so ago while
testing with SLES 9. I posted a patch and Neil expanded on it (to
include raid10, which is now in mainline, and also suffers from the same
problem). Neil just posted the patch two days ago to linux-raid, so I
expect it's in -mm now.
>>Please be aware that there are several other bugs in the SUSE 2.6.5-7.97
>>kernel in md and raid1 (basically it's a matter of that kernel being
>>somewhat behind mainline, where most of these bugs are now fixed). I've
>>sent several patches to SUSE to fix these issues, that hopefully will
>>get into their SP1 release that should be forthcoming soon...
>
>
> -97 is the release kernel, -111 is the current update kernel. And it has
> those raid1 issues fixed already, at least the ones that are known. The
> scsi segment issue is not, however.
Thanks. Good to know that. -111 is currently available to customers? We
may recommend that our customers use that, rather than patching -97
ourselves.
--
Paul
* Re: PROBLEM: kernel crashes on RAID1 drive error
From: Jens Axboe @ 2004-10-21 14:02 UTC
To: Paul Clements; +Cc: Mark Rustad, linux-raid, linux-scsi
On Thu, Oct 21 2004, Paul Clements wrote:
> Jens Axboe wrote:
> >On Thu, Oct 21 2004, Paul Clements wrote:
> ><snip>
> >Is 2.6.9 not uptodate wrt those raid1 patches?!
>
> Unfortunately, no. This latest problem (the one he's reporting) is not
> fixed in mainline. I discovered the problem a month or so ago while
> testing with SLES 9. I posted a patch and Neil expanded on it (to
> include raid10, which is now in mainline, and also suffers from the same
> problem). Neil just posted the patch two days ago to linux-raid, so I
> expect it's in -mm now.
Irk, that's too bad. So we are now looking at probably a month before
mainline has a stable release with that fixed too :/
> >>Please be aware that there are several other bugs in the SUSE 2.6.5-7.97
> >>kernel in md and raid1 (basically it's a matter of that kernel being
> >>somewhat behind mainline, where most of these bugs are now fixed). I've
> >>sent several patches to SUSE to fix these issues, that hopefully will
> >>get into their SP1 release that should be forthcoming soon...
> >
> >
> >-97 is the release kernel, -111 is the current update kernel. And it has
> >those raid1 issues fixed already, at least the ones that are known. The
> >scsi segment issue is not, however.
>
> Thanks. Good to know that. -111 is currently available to customers? We
> may recommend that our customers use that, rather than patching -97
> ourselves.
Yes it is, it's generally available through the online updates.
--
Jens Axboe
* Re: PROBLEM: kernel crashes on RAID1 drive error
From: Mark Rustad @ 2004-10-21 16:31 UTC
To: Jens Axboe; +Cc: linux-raid
Jens,
On Oct 21, 2004, at 3:45 AM, Jens Axboe wrote:
> On Wed, Oct 20 2004, Mark Rustad wrote:
>> Folks,
>>
>> I have been having trouble with kernel crashes resulting from RAID1
>> component device failures. I have been testing the robustness of an
>> embedded system and have been using a drive that is known to fail
>> after
>> a time under load. When this device returns a media error, I always
>> wind up with either a kernel hang or reboot. In this environment, each
>> drive has four partitions, each of which is part of a RAID1 with its
>> partner on the other device. Swap is on md2 so even it should be
>> robust.
<snip>
> This should be fixed by this patch, can you test it?
>
> ===== drivers/block/ll_rw_blk.c 1.273 vs edited =====
> --- 1.273/drivers/block/ll_rw_blk.c 2004-10-19 11:40:18 +02:00
> +++ edited/drivers/block/ll_rw_blk.c 2004-10-20 17:06:12 +02:00
<snip>
I applied this patch and the raid1/raid10 patch referenced in another
message. I had to mess with this patch a bit to get it to apply, but
because there was such good context, I know that I got the correct end
result. The raid1/raid10 patch applied cleanly unchanged. Unfortunately
I still get the oops. As I was looking at this I realized that I am
running with elevator=cfq simply because that is how SuSE sets things
up (just in case that has some bearing on things).
Because of the differences in the patch compared to the 2.6.9 base I
was applying it to, I wonder if there are other changes required.
Anyway, here is the oops that I now get:
ksymoops 2.4.9 on i686 2.6.5-7.97-bigsmp. Options used
-v vmlinux (specified)
-K (specified)
-L (specified)
-O (specified)
-m System.map (specified)
kernel BUG at /usr/src/linux-2.6.9/fs/buffer.c:614!
invalid operand: 0000 [#1]
CPU: 1
EIP: 0060:[<c014faf9>] Not tainted VLI
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246 (2.6.9-3d-1)
eax: 00000019 ebx: c0b24adc ecx: c0b24adc edx: 00000001
esi: 00000001 edi: 00000000 ebp: 00000001 esp: df9f7cc8
ds: 007b es: 007b ss: 0068
Stack: ded502c0 c0152128 00000000 00000001 c015214b ded502c0 c0153338 00000000
       00000000 f7cb65b0 df9f7d14 f7cd1240 f7cd1240 df8ada00 c02f2738 df9b5300
       f7cd1240 00000001 df8ada00 00000001 c02f2815 c1814220 deced450 df5c7be4
Call Trace:
[<c0152128>] end_bio_bh_io_sync+0x0/0x3b
[<c015214b>] end_bio_bh_io_sync+0x23/0x3b
[<c0153338>] bio_endio+0x3b/0x65
[<c02f2738>] raid_end_bio_io+0x22/0xb8
[<c02f2815>] raid1_end_read_request+0x47/0xcb
[<c011bb08>] try_to_wake_up+0x1f4/0x273
[<c02f27ce>] raid1_end_read_request+0x0/0xcb
[<c0153338>] bio_endio+0x3b/0x65
[<c0279dd4>] __end_that_request_first+0xe3/0x22d
[<c011d280>] __wake_up_common+0x35/0x58
[<c02ac212>] scsi_end_request+0x1b/0xa6
[<c02ac56d>] scsi_io_completion+0x16a/0x4a3
[<c0136257>] mempool_alloc+0x66/0x121
[<c02a851e>] scsi_finish_command+0x7d/0xd1
[<c02a846d>] scsi_softirq+0xbf/0xcd
[<c0124342>] __do_softirq+0x62/0xcd
[<c01243da>] do_softirq+0x2d/0x35
[<c0108b38>] do_IRQ+0x112/0x129
[<c0106cc0>] common_interrupt+0x18/0x20
[<c027007b>] uart_block_til_ready+0x18e/0x193
[<c0279627>] __make_request+0x244/0x4ac
[<c027994e>] generic_make_request+0xbf/0x16c
[<c011d2d5>] __wake_up+0x32/0x43
[<c02f2ab5>] read_balance+0x16b/0x181
[<c0120c64>] __printk_ratelimit+0x8a/0xa5
[<c02f3ab6>] raid1d+0x113/0x18e
[<c02f85ac>] md_thread+0x174/0x19a
[<c011e5b9>] autoremove_wake_function+0x0/0x37
[<c011e5b9>] autoremove_wake_function+0x0/0x37
[<c02f8438>] md_thread+0x0/0x19a
[<c01047fd>] kernel_thread_helper+0x5/0xb
Code: ff f0 0f ba 2f 01 eb a0 8b 02 a8 04 74 2a 5b 89 ea b8 f4 28 3e c0 5e 5f 5d
>>EIP; c014faf9 <__find_get_block_slow+112/128> <=====
>>ebx; c0b24adc <pg0+593adc/3fa6d400>
>>ecx; c0b24adc <pg0+593adc/3fa6d400>
>>esp; df9f7cc8 <pg0+1f466cc8/3fa6d400>
Trace; c0152128 <block_write_full_page+8/fa>
Trace; c015214b <block_write_full_page+2b/fa>
Trace; c0153338 <bio_dirty_fn+35/4d>
Trace; c02f2738 <r1buf_pool_alloc+6b/11d>
Trace; c02f2815 <r1buf_pool_free+2b/72>
Trace; c011bb08 <try_to_wake_up+a4/273>
Trace; c02f27ce <r1buf_pool_alloc+101/11d>
Trace; c0153338 <bio_dirty_fn+35/4d>
Trace; c0279dd4 <blk_recalc_rq_segments+10b/154>
Trace; c011d280 <scheduler_tick+343/452>
Trace; c02ac212 <scsi_single_lun_run+35/ce>
Trace; c02ac56d <scsi_release_buffers+d/83>
Trace; c0136257 <mempool_resize+b7/158>
Trace; c02a851e <scsi_init_cmd_from_req+159/15e>
Trace; c02a846d <scsi_init_cmd_from_req+a8/15e>
Trace; c0124342 <sys_adjtimex+2/4e>
Trace; c01243da <getnstimeofday+b/22>
Trace; c0108b38 <do_IRQ+112/198>
Trace; c0106cc0 <common_interrupt+18/20>
Trace; c027007b <uart_block_til_ready+6e/193>
Trace; c0279627 <__make_request+124/4ac>
Trace; c027994e <__make_request+44b/4ac>
Trace; c011d2d5 <scheduler_tick+398/452>
Trace; c02f2ab5 <raid1_end_write_request+3c/b1>
Trace; c0120c64 <unregister_console+3/85>
Trace; c02f3ab6 <sync_request_write+17e/24b>
Trace; c02f85ac <md_open+3/5d>
Trace; c011e5b9 <add_wait_queue+27/30>
Trace; c011e5b9 <add_wait_queue+27/30>
Trace; c02f8438 <md_ioctl+558/6c9>
Trace; c01047fd <kernel_thread_helper+5/b>
Code; c014faf9 <__find_get_block_slow+112/128>
00000000 <_EIP>:
Code; c014faf9 <__find_get_block_slow+112/128> <=====
0: ff f0 push %eax <=====
Code; c014fafb <__find_get_block_slow+114/128>
2: 0f ba 2f 01 btsl $0x1,(%edi)
Code; c014faff <__find_get_block_slow+118/128>
6: eb a0 jmp ffffffa8 <_EIP+0xffffffa8>
Code; c014fb01 <__find_get_block_slow+11a/128>
8: 8b 02 mov (%edx),%eax
Code; c014fb03 <__find_get_block_slow+11c/128>
a: a8 04 test $0x4,%al
Code; c014fb05 <__find_get_block_slow+11e/128>
c: 74 2a je 38 <_EIP+0x38>
Code; c014fb07 <__find_get_block_slow+120/128>
e: 5b pop %ebx
Code; c014fb08 <__find_get_block_slow+121/128>
f: 89 ea mov %ebp,%edx
Code; c014fb0a <__find_get_block_slow+123/128>
11: b8 f4 28 3e c0 mov $0xc03e28f4,%eax
Code; c014fb0f <invalidate_bdev+0/17>
16: 5e pop %esi
Code; c014fb10 <invalidate_bdev+1/17>
17: 5f pop %edi
Code; c014fb11 <invalidate_bdev+2/17>
18: 5d pop %ebp
<0>Kernel panic - not syncing: Fatal exception in interrupt
--
Mark Rustad, MRustad@aol.com
* Re: PROBLEM: kernel crashes on RAID1 drive error
From: Mark Rustad @ 2004-10-22 16:00 UTC
To: Jens Axboe; +Cc: linux-raid, Paul Clements, linux-scsi
Jens,
On Oct 21, 2004, at 9:02 AM, Jens Axboe wrote:
>>> -97 is the release kernel, -111 is the current update kernel. And it
>>> has
>>> those raid1 issues fixed already, at least the ones that are known.
>>> The
>>> scsi segment issue is not, however.
>>
>> Thanks. Good to know that. -111 is currently available to customers?
>> We
>> may recommend that our customers use that, rather than patching -97
>> ourselves.
>
> Yes it is, it's generally available through the online updates.
FWIW, I tried the -111 kernel and got a crash with my failing drive.
The messages out of the kernel were:
raid1: Disk failure on sdb1, disabling device.
raid1: sdb1: rescheduling sector 176
raid1: sda1: redirecting sector 176 to another mirror
raid1: sdb1: rescheduling sector 184
raid1: sda1: redirecting sector 184 to another mirror
Oct 22 10:42:03 linux kernel: scsi0: ERROR on channel 0, id 5, lun 0, CDB: Read (10) 00 00 00 00 bf 00 01 00 00
Oct 22 10:42:03 linux kernel: Info fld=0xf3, Current sdb: sense key Medium Error
Oct 22 10:42:03 linux kernel: Additional sense: Unrecovered read error
Oct 22 10:42:03 linux kernel: end_request: I/O error, dev sdb, sector 240
Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
*pde = 00000000
Oops: 0000 [#1]
SMP
CPU: 0
EIP: 0060:[<c01559a4>] Tainted: G U
EFLAGS: 00010286 (2.6.5-7.111-smp)
EIP is at page_address+0x14/0xc0
eax: 00000000 ebx: 00000000 ecx: d0e50ac0 edx: f782a970
esi: f7d7cd00 edi: 00000001 ebp: 00000008 esp: f7e65e90
ds: 007b es: 007b ss: 0068
Process scsi_eh_0 (pid: 220, threadinfo=f7e64000 task=f7e1acb0)
Stack: 00000000 f7d7cd00 00000001 00000008 c0249501 c0127b7a 00000001 d0e50ac0
       00000000 00000e00 c0249bee c035b0f4 f7eb5e8c 000000ef 00000000 00000001
       fffffffb 00000e00 00000007 f7d7cd00 f7d7cd00 f71cce00 00000000 f7def200
Call Trace:
[<c0249501>] blk_recalc_rq_sectors+0xa1/0x110
[<c0127b7a>] printk+0x18a/0x1a0
[<c0249bee>] __end_that_request_first+0x1be/0x240
[<f883fb99>] scsi_end_request+0x29/0xe0 [scsi_mod]
[<f883ff74>] scsi_io_completion+0x324/0x4c0 [scsi_mod]
[<f883a3b2>] scsi_finish_command+0x82/0xf0 [scsi_mod]
[<c0127b7a>] printk+0x18a/0x1a0
[<f883e687>] scsi_error_handler+0x987/0xed0 [scsi_mod]
[<f883dd00>] scsi_error_handler+0x0/0xed0 [scsi_mod]
[<c0107005>] kernel_thread_helper+0x5/0x10
Code: 8b 00 f6 c4 01 75 26 a1 0c fb 47 c0 29 c3 c1 fb 05 c1 e3 0c
<1>Unable to handle kernel NULL pointer dereference at virtual address
00000000
printing eip:
f88584be
*pde = 00000000
Oops: 0002 [#2]
SMP
CPU: 0
EIP: 0060:[<f88584be>] Tainted: G U
EFLAGS: 00010046 (2.6.5-7.111-smp)
EIP is at dump_block_silence+0x1e/0xc0 [dump_blockdev]
eax: 00000000 ebx: f7d86c00 ecx: f8875810 edx: 00000000
esi: f8859740 edi: f7e65e5c ebp: 00000000 esp: f7e65d28
ds: 007b es: 007b ss: 0068
Process scsi_eh_0 (pid: 220, threadinfo=f7e64000 task=f7e1acb0)
Stack: 00000000 00000000 00000000 00000000 00000000 00000000 f8870ae9 00000000
       00000000 00000000 f8870c49 00000000 00000000 00000000 f8870d05 00000000
       c0358f00 00000202 f886f852 ffffffef c010aed3 00000000 c010af28 c03552c0
Call Trace:
[<f8870ae9>] dump_begin+0x59/0xd0 [dump]
[<f8870c49>] dump_execute_savedump+0x9/0x50 [dump]
[<f8870d05>] dump_generic_execute+0x75/0x80 [dump]
[<f886f852>] dump_execute+0x52/0xa0 [dump]
[<c010aed3>] die+0x133/0x1b0
[<c010af28>] die+0x188/0x1b0
[<c011dc40>] do_page_fault+0x0/0x54d
[<c011df81>] do_page_fault+0x341/0x54d
[<f88c9c20>] ahd_linux_queue_cmd_complete+0xe0/0x2a0 [aic79xx]
[<c011dc40>] do_page_fault+0x0/0x54d
[<c010a28d>] error_code+0x2d/0x40
[<c01559a4>] page_address+0x14/0xc0
[<c0249501>] blk_recalc_rq_sectors+0xa1/0x110
[<c0127b7a>] printk+0x18a/0x1a0
[<c0249bee>] __end_that_request_first+0x1be/0x240
[<f883fb99>] scsi_end_request+0x29/0xe0 [scsi_mod]
[<f883ff74>] scsi_io_completion+0x324/0x4c0 [scsi_mod]
[<f883a3b2>] scsi_finish_command+0x82/0xf0 [scsi_mod]
[<c0127b7a>] printk+0x18a/0x1a0
[<f883e687>] scsi_error_handler+0x987/0xed0 [scsi_mod]
[<f883dd00>] scsi_error_handler+0x0/0xed0 [scsi_mod]
[<c0107005>] kernel_thread_helper+0x5/0x10
Code: 86 02 84 c0 ba f0 ff ff ff 7f 0e 8b 5c 24 10 89 d0 8b 74 24
<6>LKCD dump already in progress
------------[ cut here ]------------
kernel BUG at kernel/exit.c:833!
invalid operand: 0000 [#3]
SMP
CPU: 0
EIP: 0060:[<c012a108>] Tainted: G U
EFLAGS: 00010282 (2.6.5-7.111-smp)
EIP is at do_exit+0x968/0xb60
eax: 00000001 ebx: 00000000 ecx: 00000000 edx: 00000001
esi: f7fa17c0 edi: f7e1acb0 ebp: f7fa17c0 esp: f7e65bd8
ds: 007b es: 007b ss: 0068
Process scsi_eh_0 (pid: 220, threadinfo=f7e64000 task=f7e1acb0)
Stack: 00017e5a 00000282 f7e65cf4 c0431a41 00000246 f7e1ad08 00000002 f7e1ad48
       f7e65c10 00000202 00000002 f7e1ad08 f7e64000 00000002 f7e65cf4 00000002
       c010af50 0000000b c034405a 00000002 00000002 f7e1acb0 c034405a 00000000
Call Trace:
[<c010af50>] do_simd_coprocessor_error+0x0/0xb0
[<c011dc40>] do_page_fault+0x0/0x54d
[<c011df81>] do_page_fault+0x341/0x54d
[<f886fdfe>] dump_lcrash_save_context+0x2e/0x60 [dump]
[<c0119fa1>] dump_send_ipi+0x11/0x20
[<f88710e4>] __dump_save_other_cpus+0xb4/0xe0 [dump]
[<f88700ce>] dump_lcrash_configure_header+0x29e/0x2c0 [dump]
[<c011dc40>] do_page_fault+0x0/0x54d
[<c010a28d>] error_code+0x2d/0x40
[<f88584be>] dump_block_silence+0x1e/0xc0 [dump_blockdev]
[<f8870ae9>] dump_begin+0x59/0xd0 [dump]
[<f8870c49>] dump_execute_savedump+0x9/0x50 [dump]
[<f8870d05>] dump_generic_execute+0x75/0x80 [dump]
[<f886f852>] dump_execute+0x52/0xa0 [dump]
[<c010aed3>] die+0x133/0x1b0
[<c010af28>] die+0x188/0x1b0
[<c011dc40>] do_page_fault+0x0/0x54d
[<c011df81>] do_page_fault+0x341/0x54d
[<f88c9c20>] ahd_linux_queue_cmd_complete+0xe0/0x2a0 [aic79xx]
[<c011dc40>] do_page_fault+0x0/0x54d
[<c010a28d>] error_code+0x2d/0x40
[<c01559a4>] page_address+0x14/0xc0
[<c0249501>] blk_recalc_rq_sectors+0xa1/0x110
[<c0127b7a>] printk+0x18a/0x1a0
[<c0249bee>] __end_that_request_first+0x1be/0x240
[<f883fb99>] scsi_end_request+0x29/0xe0 [scsi_mod]
[<f883ff74>] scsi_io_completion+0x324/0x4c0 [scsi_mod]
[<f883a3b2>] scsi_finish_command+0x82/0xf0 [scsi_mod]
[<c0127b7a>] printk+0x18a/0x1a0
[<f883e687>] scsi_error_handler+0x987/0xed0 [scsi_mod]
[<f883dd00>] scsi_error_handler+0x0/0xed0 [scsi_mod]
[<c0107005>] kernel_thread_helper+0x5/0x10
Code: 0f 0b 41 03 37 43 34 c0 eb fe 8b 6f 10 85 ed 74 ac eb 9b 8b
<6>LKCD dump already in progress
*** everything beyond removed, because cpu 0 continued to fault over
and over
--
Mark Rustad, MRustad@aol.com
* Re: PROBLEM: kernel crashes on RAID1 drive error
From: Mark Rustad @ 2004-10-28 19:35 UTC
To: linux-raid
I've still been trying to resolve this problem. I have reproduced the
kernel crash on a RAID1 drive error on kernels all the way up to
2.6.10-rc1-bk6. Seeing a patch to fix a problem in the sg device, I
turned off sgraidmon as well as mdadmd to have less going on in my
environment and took the SCSI generic driver out of the kernel. I also
turned on frame pointers in the kernel, just to aid in walking the
stack.
With those changes I still get a kernel crash on a drive error. In this
case I got the following:
raid1: Disk failure on sdb1, disabling device.
raid1: sdb1: rescheduling sector 176
raid1: sda1: redirecting sector 176 to another mirror
Oct 28 10:18:23 xio3d-x3C kernel: SCSI error : <0 0 5 0> return code = 0x8000002
Oct 28 10:18:23 xio3d-x3C kernel: Info fld=0xf3, Current sdb: sense key Medium Error
Oct 28 10:18:23 xio3d-x3C kernel: Additional sense: Unrecovered read error
Oct 28 10:18:23 xio3d-x3C kernel: end_request: I/O error, dev sdb, sector 240
------------[ cut here ]------------
kernel BUG at /usr/src/linux-2.6.10-rc1-bk6/drivers/scsi/scsi_lib.c:572!
invalid operand: 0000 [#1]
SMP
CPU: 0
EIP: 0060:[<c02ca8fb>] Not tainted VLI
EFLAGS: 00010046 (2.6.10-rc1-3d-x)
EIP is at scsi_alloc_sgtable+0xe0/0xed
eax: 00000000 ebx: df45899c ecx: 00000000 edx: ded49e00
esi: ded49e00 edi: ded49e00 ebp: f7cd3d98 esp: f7cd3d80
ds: 007b es: 007b ss: 0068
Process scsi_eh_0 (pid: 18, threadinfo=f7cd2000 task=f7ca2a20)
Stack: c0582b94 c05720c0 00000000 df45899c ded49e00 ded49e00 f7cd3db8 c02caf3a
       ded49e00 00000020 00010000 ded49e00 ded49e00 df45899c f7cd3de0 c02cb0ee
       ded49e00 00000001 00000000 00000000 f7cacc00 df45899c f7cacc00 f7c44030
Call Trace:
[<c0107a93>] show_stack+0xaf/0xb7
[<c0107c13>] show_registers+0x158/0x1cd
[<c0107e0f>] die+0xfa/0x182
[<c0108325>] do_invalid_op+0x108/0x10a
[<c01076ed>] error_code+0x2d/0x38
[<c02caf3a>] scsi_init_io+0x62/0x123
[<c02cb0ee>] scsi_prep_fn+0xae/0x205
[<c02932ae>] elv_next_request+0x65/0xf7
[<c02cb465>] scsi_request_fn+0x220/0x429
[<c02959d5>] blk_insert_request+0xaf/0xe3
[<c02ca6cd>] scsi_requeue_command+0x40/0x4d
[<c02ca7b0>] scsi_end_request+0x6c/0xd7
[<c02cac20>] scsi_io_completion+0x278/0x530
[<c02faf37>] sd_rw_intr+0x84/0x2ab
[<c02c5ebe>] scsi_finish_command+0x83/0xe4
[<c02c99b8>] scsi_eh_flush_done_q+0xb0/0x105
[<c02c9aa2>] scsi_unjam_host+0x95/0x1eb
[<c02c9cc8>] scsi_error_handler+0xd0/0x172
[<c0104ff5>] kernel_thread_helper+0x5/0xb
Code: a0 00 00 00 02 00 eb 8b 66 c7 82 a0 00 00 00 03 00 eb 80 66 c7 82 a0 00 00 00 00 00 e9 72 ff ff ff 31 c0 83 c4 0c 5b 5e 5f 5d c3 <0f> 0b 3c 02 f0 ab 3f c0 e9 2f ff ff ff 55 89 e5 8b 45 0c 8b 55
The BUG is caused by the nr_phys_segments field being zero, resulting
in an attempt to build an sg list with no elements. Since I have been
stimulating the drive failure with "dd of=/dev/null bs=512
if=/dev/md0", I got to wondering whether the single-sector reads
contribute to the problem, so I changed bs=512 to bs=4096. Then I got
the following crash:
raid1: Disk failure on sdb1, disabling device.
raid1: sdb1: rescheduling sector 176
raid1: sda1: redirecting sector 176 to another mirror
Oct 28 13:30:11 xio3d-x3C kernel: SCSI error : <0 0 5 0> return code = 0x8000002
Oct 28 13:30:11 xio3d-x3C kernel: Info fld=0xf3, Current sdb: sense key Medium Error
Oct 28 13:30:11 xio3d-x3C kernel: Additional sense: Unrecovered read error
Oct 28 13:30:11 xio3d-x3C kernel: end_request: I/O error, dev sdb, sector 240
Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
*pde = 00000000
Oops: 0000 [#1]
SMP
CPU: 0
EIP: 0060:[<c0144cfc>] Not tainted VLI
EFLAGS: 00010292 (2.6.10-rc1-3d-x)
EIP is at page_address+0xc/0x93
eax: 00000000 ebx: 00000000 ecx: c1936c10 edx: c1936c10
esi: c6f11100 edi: 00000000 ebp: c04f7e1c esp: c04f7e04
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=c04f6000 task=c040db40)
Stack: fffffffb 00000000 c04f7e2c ded86d70 c6f11100 00000000 c04f7e2c c0296c60
       00000000 00000e00 c04f7e64 c0296e36 ded86d70 00000007 fffffffb 00000000
       c6f11100 00000001 fffffffb 00000e00 00000007 ded86d70 decbee00 f7c448b8
Call Trace:
[<c0107a93>] show_stack+0xaf/0xb7
[<c0107c13>] show_registers+0x158/0x1cd
[<c0107e0f>] die+0xfa/0x182
[<c0118f05>] do_page_fault+0x432/0x5fe
[<c01076ed>] error_code+0x2d/0x38
[<c0296c60>] blk_recalc_rq_sectors+0x87/0xa0
[<c0296e36>] __end_that_request_first+0x1bd/0x23d
[<c02ca772>] scsi_end_request+0x2e/0xd7
[<c02cac20>] scsi_io_completion+0x278/0x530
[<c02faf37>] sd_rw_intr+0x84/0x2ab
[<c02c5ebe>] scsi_finish_command+0x83/0xe4
[<c02c5dee>] scsi_softirq+0xd3/0xe7
[<c01269f5>] __do_softirq+0x65/0xd3
[<c0126a94>] do_softirq+0x31/0x33
[<c0109090>] do_IRQ+0x2c/0x33
[<c01075d0>] common_interrupt+0x18/0x20
[<c0104e23>] cpu_idle+0x31/0x40
[<c04f8ada>] start_kernel+0x19b/0x20b
[<c0100211>] 0xc0100211
Code: da 01 74 ec 3c c0 eb d8 55 89 e5 69 45 08 01 00 37 9e 5d c1 e8 19 c1 e0 07 05 00 ed 58 c0 c3 55 89 e5 57 56 53 83 ec 0c 8b 5d 08 <8b> 03 f6 c4 01 75 1a 2b 1d 10 6d 59 c0 c1 fb 05 c1 e3 0c 8d 83
<0>Kernel panic - not syncing: Fatal exception in interrupt
<0>Rebooting in 5 seconds..
In this case, there was about 30 seconds between the time that the disk
error was reported and the kernel finally crashed.
Right now I am wondering if this comment in scsi_lib.c could have
something to do with this problem...
        /*
         * XXX: Following is probably broken since deferred errors
         * fall through [dpg 20040827]
         */
--
Mark Rustad, MRustad@aol.com
* Re: PROBLEM: kernel crashes on RAID1 drive error
From: Mark Rustad @ 2004-11-04 18:56 UTC
To: linux-raid, linux-scsi
I came up with a patch that seems to fix the kernel crashing on a RAID1
device failure problem. I do not have confidence that it really is the
correct fix, but I'm pretty sure that it is touching the right area.
I'm sure someone more familiar with this area of the kernel will
recognize from this what the correct fix is, if this is not it.
Submitted for your consideration:
--- a/drivers/block/ll_rw_blk.c	2004-11-01 14:28:48.000000000 -0600
+++ b/drivers/block/ll_rw_blk.c	2004-11-01 14:29:13.000000000 -0600
@@ -2865,10 +2865,7 @@
          * if the request wasn't completed, update state
          */
         if (bio_nbytes) {
                 bio_endio(bio, bio_nbytes, error);
-                bio->bi_idx += next_idx;
-                bio_iovec(bio)->bv_offset += nr_bytes;
-                bio_iovec(bio)->bv_len -= nr_bytes;
         }
 
         blk_recalc_rq_sectors(req, total_bytes >> 9);
With this applied, my kernel does not crash on media errors on one of
the devices and just keeps on running on the other device. In my case,
the code just above these lines was taking the path that was walking
through the bio_iovec array.
--
Mark Rustad, MRustad@mac.com
* Re: PROBLEM: kernel crashes on RAID1 drive error
From: Lars Marowsky-Bree @ 2004-11-16 15:51 UTC
To: Mark Rustad, linux-raid, linux-scsi
On 2004-11-04T12:56:01, Mark Rustad <mrustad@mac.com> wrote:
> I came up with a patch that seems to fix the kernel crashing on a RAID1
> device failure problem. I do not have confidence that it really is the
> correct fix, but I'm pretty sure that it is touching the right area.
> I'm sure someone more familiar with this area of the kernel will
> recognize from this what the correct fix is, if this is not it.
> Submitted for your consideration:
Hi Mark, just checking back: has any better fix been proposed since?
Sincerely,
Lars Marowsky-Brée <lmb@suse.de>
--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
* Re: PROBLEM: kernel crashes on RAID1 drive error
From: Mark Rustad @ 2004-11-16 16:40 UTC
To: Lars Marowsky-Bree; +Cc: linux-raid, linux-scsi
Lars,
On Nov 16, 2004, at 9:51 AM, Lars Marowsky-Bree wrote:
> On 2004-11-04T12:56:01, Mark Rustad <mrustad@mac.com> wrote:
>
>> I came up with a patch that seems to fix the kernel crashing on a
>> RAID1
>> device failure problem. I do not have confidence that it really is the
>> correct fix, but I'm pretty sure that it is touching the right area.
>> I'm sure someone more familiar with this area of the kernel will
>> recognize from this what the correct fix is, if this is not it.
>> Submitted for your consideration:
>
> Hi Mark, checking back whether any better fix has since been proposed?
My earlier attempt was clearly broken, though it did cure the kernel
crashes. Below is the patch I am currently applying against ll_rw_blk.c
in SuSE kernel 2.6.5-7.111. The last two hunks of the patch are
relevant to this problem (the remainder of the patch should look
familiar).
It looks to me like this problem was introduced in 2.6.2 by what was
thought to be a simple rearrangement of code. I believe that that
change did not take into account all of the ways that the loop could
exit.
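(Purely as an illustration of that kind of hazard, with made-up names
rather than the kernel code: a fixup placed after a loop is only
correct for the exit paths it was written with in mind.)
/* Illustration only (not ll_rw_blk.c): the post-loop fixup assumes the
 * loop stopped partway into items[idx], but the loop has a second exit
 * path at which that assumption no longer holds. */
#include <stdio.h>

struct item { unsigned int off, len; };

static void consume(struct item *it, int nitems, int max_items,
                    unsigned int nbytes)
{
        int idx = 0;

        while (idx < nitems) {
                if (idx == max_items)
                        break;          /* exit path B: some unrelated limit */
                if (nbytes < it[idx].len)
                        break;          /* exit path A: stopped mid-item */
                nbytes -= it[idx].len;  /* item fully consumed */
                idx++;
        }
        /*
         * Only valid for exit path A.  On exit path B, nbytes may still
         * cover whole later items, and applying it here corrupts it[idx].
         */
        if (idx < nitems) {
                it[idx].off += nbytes;
                it[idx].len -= nbytes;
        }
}

int main(void)
{
        struct item a[] = { { 0, 512 }, { 0, 512 }, { 0, 512 } };

        consume(a, 3, 1, 1536);         /* forces exit path B after one item */
        printf("item 1 now off=%u len=%u (len wrapped around)\n",
               a[1].off, a[1].len);
        return 0;
}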
Unfortunately, the drive I had that would power up and work for 20
minutes and then return media errors has now totally died, so I am not
currently able to convince myself that this fix is really right. I am
continuing to try to reproduce a similar event.
My current patch is below:
--- a/drivers/block/ll_rw_blk.c	2004-11-15 08:35:19.000000000 -0600
+++ a/drivers/block/ll_rw_blk.c	2004-11-15 08:41:37.000000000 -0600
@@ -2654,19 +2654,40 @@
 
 void blk_recalc_rq_segments(struct request *rq)
 {
-        struct bio *bio;
+        struct bio *bio, *prevbio = NULL;
         int nr_phys_segs, nr_hw_segs;
+        unsigned int phys_size, hw_size;
+        request_queue_t *q = rq->q;
 
         if (!rq->bio)
                 return;
 
-        nr_phys_segs = nr_hw_segs = 0;
+        phys_size = hw_size = nr_phys_segs = nr_hw_segs = 0;
         rq_for_each_bio(bio, rq) {
                 /* Force bio hw/phys segs to be recalculated. */
                 bio->bi_flags &= ~(1 << BIO_SEG_VALID);
 
-                nr_phys_segs += bio_phys_segments(rq->q, bio);
-                nr_hw_segs += bio_hw_segments(rq->q, bio);
+                nr_phys_segs += bio_phys_segments(q, bio);
+                nr_hw_segs += bio_hw_segments(q, bio);
+                if (prevbio) {
+                        int pseg = phys_size + prevbio->bi_size + bio->bi_size;
+                        int hseg = hw_size + prevbio->bi_size + bio->bi_size;
+
+                        if (blk_phys_contig_segment(q, prevbio, bio) &&
+                            pseg <= q->max_segment_size) {
+                                nr_phys_segs--;
+                                phys_size += prevbio->bi_size + bio->bi_size;
+                        } else
+                                phys_size = 0;
+
+                        if (blk_hw_contig_segment(q, prevbio, bio) &&
+                            hseg <= q->max_segment_size) {
+                                nr_hw_segs--;
+                                hw_size += prevbio->bi_size + bio->bi_size;
+                        } else
+                                hw_size = 0;
+                }
+                prevbio = bio;
         }
 
         rq->nr_phys_segments = nr_phys_segs;
@@ -2762,6 +2783,8 @@
                  * not a complete bvec done
                  */
                 if (unlikely(nbytes > nr_bytes)) {
+                        bio_iovec_idx(bio, idx)->bv_offset += nr_bytes;
+                        bio_iovec_idx(bio, idx)->bv_len -= nr_bytes;
                         bio_nbytes += nr_bytes;
                         total_bytes += nr_bytes;
                         break;
@@ -2798,8 +2821,6 @@
         if (bio_nbytes) {
                 bio_endio(bio, bio_nbytes, error);
                 bio->bi_idx += next_idx;
-                bio_iovec(bio)->bv_offset += nr_bytes;
-                bio_iovec(bio)->bv_len -= nr_bytes;
         }
 
         blk_recalc_rq_sectors(req, total_bytes >> 9);
--
Mark Rustad, MRustad@mac.com
* Problem: kernel crashes on RAID1 drive error
From: bernd @ 2004-12-28 12:00 UTC
To: linux-raid
Hi raid gurus,
Can anybody tell me if there is a reliable fix for this problem, and if
not, when one will be available for SuSE 9.2?
We would appreciate it if the fix were bundled with an online update for
SuSE 9.2 systems. Maybe someone from SuSE Labs can give us more
information, too.
We just installed the latest online update from SuSE on our 9.2 machines,
which brought in kernel 2.6.8-24.10, but the problem still exists.
With this serious bug, RAID1 is pretty meaningless ...
Greetings Bernd Rieke