* [PANIC] : kernel BUG at drivers/md/raid5.c:2756!
From: Manish Katiyar @ 2011-10-31 21:29 UTC
To: neilb, linux-raid; +Cc: Manish Katiyar
I was running the following script (trying to reproduce an ext4 error
reported in another thread) and the kernel dies with the error below.
The place where it crashes is:
2746 static void handle_parity_checks6(raid5_conf_t *conf, struct stripe_head *sh,
2747                                   struct stripe_head_state *s,
2748                                   int disks)
2749 {
.....
2754         set_bit(STRIPE_HANDLE, &sh->state);
2755
2756         BUG_ON(s->failed > 2);   <============== !!!!
[ 9663.343974] md/raid:md11: Disk failure on loop3, disabling device.
[ 9663.343976] md/raid:md11: Operation continuing on 4 devices.
[ 9668.547289] ------------[ cut here ]------------
[ 9668.547327] kernel BUG at drivers/md/raid5.c:2756!
[ 9668.547356] invalid opcode: 0000 [#1] SMP
[ 9668.547388] Modules linked in: parport_pc ppdev snd_hda_codec_hdmi snd_hda_codec_conexant aesni_intel cryptd aes_i586 aes_generic nfsd exportfs btusb nfs bluetooth lockd fscache auth_rpcgss nfs_acl sunrpc binfmt_misc joydev snd_hda_intel snd_hda_codec fuse snd_hwdep thinkpad_acpi snd_pcm snd_seq_midi uvcvideo snd_rawmidi snd_seq_midi_event arc4 snd_seq videodev i915 iwlagn mxm_wmi drm_kms_helper drm snd_timer psmouse snd_seq_device serio_raw mac80211 snd tpm_tis tpm nvram tpm_bios intel_ips cfg80211 soundcore i2c_algo_bit snd_page_alloc video lp parport usbhid hid raid10 raid456 async_raid6_recov async_pq ahci libahci firewire_ohci firewire_core crc_itu_t sdhci_pci sdhci e1000e raid6_pq async_xor xor async_memcpy async_tx raid1 raid0 multipath linear
[ 9668.547951]
[ 9668.547964] Pid: 6067, comm: md11_raid6 Tainted: G W 3.1.0-rc3+ #0 LENOVO 2537GH6/2537GH6
[ 9668.548021] EIP: 0060:[<f878d590>] EFLAGS: 00010202 CPU: 3
[ 9668.548056] EIP is at handle_stripe+0x1e60/0x1e70 [raid456]
[ 9668.548087] EAX: 00000005 EBX: ea589e00 ECX: 00000000 EDX: 00000003
[ 9668.548121] ESI: 00000006 EDI: df059590 EBP: ded39f00 ESP: ded39e30
[ 9668.548155] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 9668.548186] Process md11_raid6 (pid: 6067, ti=ded38000 task=e364b2c0 task.ti=ded38000)
[ 9668.548228] Stack:
[ 9668.548241]  ded39e38 c10167e8 00000002 c107ce85 00000001 ded39e4c 00009258 00000000
[ 9668.548303]  df0595b8 ded39e60 ea589e00 fffffffc 00000007 ea589f28 ea589e00 df059590
[ 9668.548364]  00000000 e36b1d50 ded39e7c 00000000 00000000 00000000 00000000 00000007
[ 9668.548424] Call Trace:
[ 9668.548447]  [<c10167e8>] ? sched_clock+0x8/0x10
[ 9668.548477]  [<c107ce85>] ? sched_clock_cpu+0xe5/0x150
[ 9668.548509]  [<f8787f39>] ? __release_stripe+0x109/0x160 [raid456]
[ 9668.548545]  [<f8787fce>] ? release_stripe+0x3e/0x50 [raid456]
[ 9668.548580]  [<f878f47a>] raid5d+0x3aa/0x510 [raid456]
[ 9668.548611]  [<c107698d>] ? finish_wait+0x4d/0x70
[ 9668.548641]  [<c13fc3fd>] md_thread+0xed/0x120
[ 9668.548669]  [<c1076890>] ? add_wait_queue+0x50/0x50
[ 9668.548697]  [<c13fc310>] ? md_rdev_init+0x120/0x120
[ 9668.548725]  [<c107608d>] kthread+0x6d/0x80
[ 9668.548750]  [<c1076020>] ? flush_kthread_worker+0x80/0x80
[ 9668.548784]  [<c15419be>] kernel_thread_helper+0x6/0x10
[ 9668.548814] Code: 44 01 40 f0 80 88 80 00 00 00 02 f0 80 88 80 00 00 00 20 8b 45 98 e9 7a f3 ff ff 0f 0b c7 40 38 03 00 00 00 b8 03 00 00 00 eb b4 <0f> 0b 0f 0b 0f 0b 0f 0b
[ 9668.549063] md: md11: resync done.
[ 9668.549087] 90 8d b4 26 00 00 00 00 55 89 e5 57 56
[ 9668.549159] EIP: [<f878d590>] handle_stripe+0x1e60/0x1e70 [raid456] SS:ESP 0068:ded39e30
[ 9668.935138] ---[ end trace e71016c3ebaeb3bd ]---
The script to reproduce is:
/home/mkatiyar> cat a.ksh
#!/bin/ksh

SUDO=sudo

# run a command with elevated privileges
cmd() {
	$SUDO "$@"
}

device=/dev/md11
cd

# tear down any leftovers from a previous run
cmd mdadm --stop $device
cmd mdadm --remove $device
cmd umount /tmp/b

for i in `seq 1 7`
do
	cmd losetup -d /dev/loop$i
done

mkdir -p /tmp/a
mkdir -p /tmp/b

cd /tmp/a

# back each loop device with a ~100MB file
for i in `seq 1 7`
do
	cmd rm /tmp/a/raid-$i
	cmd dd if=/dev/zero of=/tmp/a/raid-$i bs=4k count=25000
	cmd losetup /dev/loop$i /tmp/a/raid-$i
done

# a 7-device RAID6 tolerates at most two failed members
cmd mdadm --create $device --level=6 --raid-devices=7 /dev/loop[1-7]
cmd cat /proc/mdstat

cmd mkfs.ext4 -b 4096 -i 4096 -m 0 $device
cmd mount $device /tmp/b

# fail two devices -- the array stays up, fully degraded
cmd mdadm --manage $device --fail /dev/loop1
cmd mdadm --manage $device --fail /dev/loop2

cmd dmesg -c > /dev/null 2>&1
# fail a third device while a write is in flight
cmd dd if=/dev/zero of=/tmp/b/testfile bs=1k count=1000 &
cmd mdadm --manage $device --fail /dev/loop3
PS: I'm not part of the list, so please keep me in cc on the response.
--
Thanks -
Manish
* Re: [PANIC] : kernel BUG at drivers/md/raid5.c:2756!
From: NeilBrown @ 2011-11-01 5:39 UTC
To: Manish Katiyar; +Cc: linux-raid, Dan Williams
On Mon, 31 Oct 2011 14:29:38 -0700 Manish Katiyar <mkatiyar@gmail.com> wrote:
> I was running the following script (trying to reproduce an ext4 error
> reported in another thread) and the kernel dies with the error below.
>
> [ oops log and reproduction script trimmed -- see the original message above ]
>
Thanks for the report.
I think you were quite unlucky to hit this and that you will find it hard to
reproduce. :-(
It will only happen if a device fails while a parity calculation is happening
on a stripe (and normally the stripe will be reading or writing, not
calculating).
i.e. in handle_stripe you need sh->check_state to be non-zero while
s.failed > 2. And sh->check_state won't be set non-zero when s.failed > 2
is already true, and doesn't stay non-zero for long.
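To make the window concrete, the path in question looks roughly like this
(paraphrased from memory, so treat it as a sketch rather than a verbatim
copy of the current handle_stripe()):

	if (sh->check_state ||
	    (s.syncing && s.locked == 0 &&
	     !test_bit(STRIPE_COMPUTE_RUN, &sh->state) &&
	     !test_bit(STRIPE_INSYNC, &sh->state))) {
		if (conf->level == 6)
			/* trips BUG_ON(s->failed > 2) near the top */
			handle_parity_checks6(conf, sh, &s, disks);
		else
			handle_parity_checks5(conf, sh, &s, disks);
	}

So a stripe whose check was already in flight when the third device failed
still reaches handle_parity_checks6() with s.failed > 2.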
I think we probably just want to make sure we abort any parity calculation
when the array fails.
This patch might do that.
Dan: could you have a look and see if this looks OK? i.e. is this sufficient
to abort the parity stuff, or is something else needed?
Thanks,
NeilBrown
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index dbae459..9eb97b3 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3165,10 +3165,14 @@ static void handle_stripe(struct stripe_head *sh)
 	/* check if the array has lost more than max_degraded devices and,
 	 * if so, some requests might need to be failed.
 	 */
-	if (s.failed > conf->max_degraded && s.to_read+s.to_write+s.written)
-		handle_failed_stripe(conf, sh, &s, disks, &s.return_bi);
-	if (s.failed > conf->max_degraded && s.syncing)
-		handle_failed_sync(conf, sh, &s);
+	if (s.failed > conf->max_degraded) {
+		sh->check_state = 0;
+		sh->reconstruct_state = 0;
+		if (s.to_read+s.to_write+s.written)
+			handle_failed_stripe(conf, sh, &s, disks, &s.return_bi);
+		if (s.syncing)
+			handle_failed_sync(conf, sh, &s);
+	}
 
 	/*
 	 * might be able to return some write requests if the parity blocks
* Re: [PANIC] : kernel BUG at drivers/md/raid5.c:2756!
From: Manish Katiyar @ 2011-11-01 6:09 UTC
To: NeilBrown; +Cc: linux-raid, Dan Williams
> I think you were quite unlucky to hit this and that you will find it hard to
> reproduce. :-(
Well, it's always reproducible on my Ubuntu machine when I run that script.
--
Thanks -
Manish
* Re: [PANIC] : kernel BUG at drivers/md/raid5.c:2756!
From: NeilBrown @ 2011-11-01 6:29 UTC
To: Manish Katiyar; +Cc: linux-raid, Dan Williams
On Mon, 31 Oct 2011 23:09:04 -0700 Manish Katiyar <mkatiyar@gmail.com> wrote:
> > I think you were quite unlucky to hit this and that you will find it hard to
> > reproduce. :-(
>
> Well, it's always reproducible on my Ubuntu machine when I run that script.
>
Excellent!! That means you can reliably see if the patch fixes it. :-)
I guess I was thinking of 'real' devices, which are slower than XOR. You were
using loopback devices on files in /tmp, which are presumably always in
memory. I guess it would be a lot easier to hit in that case.
Thanks,
NeilBrown
* Re: [PANIC] : kernel BUG at drivers/md/raid5.c:2756!
From: Manish Katiyar @ 2011-11-04 18:25 UTC
To: NeilBrown; +Cc: linux-raid, Dan Williams
On Mon, Oct 31, 2011 at 11:29 PM, NeilBrown <neilb@suse.de> wrote:
> On Mon, 31 Oct 2011 23:09:04 -0700 Manish Katiyar <mkatiyar@gmail.com> wrote:
>
>> > I think you were quite unlucky to hit this and that you will find it hard to
>> > reproduce. :-(
>>
>> Well, it's always reproducible on my Ubuntu machine when I run that script.
>>
>
> Excellent!! That means you can reliably see if the patch fixes it. :-)
Sorry for the late reply. Yes, it fixes it for me.
--
Thanks -
Manish
* Re: [PANIC] : kernel BUG at drivers/md/raid5.c:2756!
From: Williams, Dan J @ 2011-11-04 19:03 UTC
To: NeilBrown; +Cc: Manish Katiyar, linux-raid
On Mon, Oct 31, 2011 at 10:39 PM, NeilBrown <neilb@suse.de> wrote:
> On Mon, 31 Oct 2011 14:29:38 -0700 Manish Katiyar <mkatiyar@gmail.com> wrote:
>
>> [ original report, oops log, and reproduction script trimmed ]
>
> Thanks for the report.
>
> I think you were quite unlucky to hit this and that you will find it hard to
> reproduce. :-(
>
> It will only happen if a device fails while a parity calculation is happening
> on a stripe (and normally the stripe will be reading or writing, not
> calculating).
>
> i.e. in handle_stripe you need sh->check_state to be non-zero while
> s.failed > 2. And sh->check_state won't be set non-zero when s.failed > 2
> is already true, and doesn't stay non-zero for long.
>
> I think we probably just want to make sure we abort any parity calculation
> when the array fails.
> This patch might do that.
>
> Dan: could you have a look and see if this looks OK? i.e. is this sufficient
> to abort the parity stuff, or is something else needed?
>
> Thanks,
> NeilBrown
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index dbae459..9eb97b3 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -3165,10 +3165,14 @@ static void handle_stripe(struct stripe_head *sh)
>  	/* check if the array has lost more than max_degraded devices and,
>  	 * if so, some requests might need to be failed.
>  	 */
> -	if (s.failed > conf->max_degraded && s.to_read+s.to_write+s.written)
> -		handle_failed_stripe(conf, sh, &s, disks, &s.return_bi);
> -	if (s.failed > conf->max_degraded && s.syncing)
> -		handle_failed_sync(conf, sh, &s);
> +	if (s.failed > conf->max_degraded) {
> +		sh->check_state = 0;
> +		sh->reconstruct_state = 0;
> +		if (s.to_read+s.to_write+s.written)
> +			handle_failed_stripe(conf, sh, &s, disks, &s.return_bi);
> +		if (s.syncing)
> +			handle_failed_sync(conf, sh, &s);
> +	}
Hmm... this is sufficient to abort the operations, but it may short-circuit
writeback of blocks that we successfully computed while the failure was
happening. I think there is a small benefit in continuing with the writeback
even though the array is failed. Maybe it prevents a few out-of-sync stripes
for the subsequent forced reassembly?
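Something like the following, perhaps (completely untested, just to sketch
the idea -- the *_result values are the "computation finished" members of
enum reconstruct_states):

	/* untested sketch: abort a pending parity check, but let a
	 * reconstruct that has already produced its result drain the
	 * computed blocks to disk instead of being zeroed */
	if (s.failed > conf->max_degraded) {
		sh->check_state = 0;
		if (sh->reconstruct_state != reconstruct_state_result &&
		    sh->reconstruct_state != reconstruct_state_drain_result &&
		    sh->reconstruct_state != reconstruct_state_prexor_drain_result)
			sh->reconstruct_state = 0;
		if (s.to_read+s.to_write+s.written)
			handle_failed_stripe(conf, sh, &s, disks, &s.return_bi);
		if (s.syncing)
			handle_failed_sync(conf, sh, &s);
	}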
--
Dan
* Re: [PANIC] : kernel BUG at drivers/md/raid5.c:2756!
From: NeilBrown @ 2011-11-06 22:00 UTC
To: Manish Katiyar; +Cc: linux-raid, Dan Williams
On Fri, 4 Nov 2011 11:25:41 -0700 Manish Katiyar <mkatiyar@gmail.com> wrote:
> On Mon, Oct 31, 2011 at 11:29 PM, NeilBrown <neilb@suse.de> wrote:
> > On Mon, 31 Oct 2011 23:09:04 -0700 Manish Katiyar <mkatiyar@gmail.com> wrote:
> >
> >> > I think you were quite unlucky to hit this and that you will find it hard to
> >> > reproduce. :-(
> >>
> >> Well, it's always reproducible on my Ubuntu machine when I run that script.
> >>
> >
> > Excellent!! That means you can reliably see if the patch fixes it. :-)
>
> Sorry for the late reply. Yes, it fixes it for me.
Thanks a lot for the confirmation!
NeilBrown
* Re: [PANIC] : kernel BUG at drivers/md/raid5.c:2756!
From: NeilBrown @ 2011-11-06 22:12 UTC
To: Williams, Dan J; +Cc: Manish Katiyar, linux-raid
On Fri, 4 Nov 2011 12:03:08 -0700 "Williams, Dan J"
<dan.j.williams@intel.com> wrote:
> On Mon, Oct 31, 2011 at 10:39 PM, NeilBrown <neilb@suse.de> wrote:
> > On Mon, 31 Oct 2011 14:29:38 -0700 Manish Katiyar <mkatiyar@gmail.com> wrote:
> >
> >> [ original report, oops log, and reproduction script trimmed ]
> >
> > [ explanation and patch trimmed -- quoted in full earlier in the thread ]
>
> Hmm... this is sufficient to abort the operations, but it may short-circuit
> writeback of blocks that we successfully computed while the failure was
> happening. I think there is a small benefit in continuing with the writeback
> even though the array is failed. Maybe it prevents a few out-of-sync stripes
> for the subsequent forced reassembly?
My first attempt at a patch followed this line (if I remember and understand
correctly) but it seemed to get a bit complicated: if a drive has failed but
the stripe cache for that device is up-to-date, we want to consider it
'failed' in some contexts, and not failed in others.
So it became quite unclear what to store in the 'failed_num' array of
'struct stripe_head_state'.
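For reference, the fields in question look roughly like this (trimmed, and
quoted from memory, so only approximately):

	struct stripe_head_state {
		/* ... per-stripe counters ... */
		int failed;		/* how many devices in this stripe have failed */
		int failed_num[2];	/* which slots failed; only two entries,
					 * since RAID6 tolerates at most two --
					 * a "failed but cache up-to-date" device
					 * has no obvious representation here */
		/* ... */
	};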
It almost certainly could be made to work but it didn't seem like it would be
worth the trouble. As soon as we have too many failures we really must fail
the writes, so not actually writing any of it out is completely defensible.
Given that, and as you have confirmed that it will be effective in aborting
the operations, I think I'll stick with the original patch.
Thanks,
NeilBrown