From: "BERTRAND Joël" <joel.bertrand@systella.fr>
To: linux-raid@vger.kernel.org, sparclinux@vger.kernel.org
Subject: Re: [BUG] Raid5 trouble
Date: Wed, 17 Oct 2007 16:32:03 +0200
Message-ID: <47161CE3.80909@systella.fr>
In-Reply-To: <4714BB92.7040701@systella.fr>

BERTRAND Joël wrote:
>     Hello,
> 
>     I run 2.6.23 linux kernel on two T1000 (sparc64) servers. Each 
> server has a partitionable raid5 array (/dev/md/d0) and I have to 
> synchronize both raid5 volumes by raid1. Thus, I have tried to build a 
> raid1 volume between /dev/md/d0p1 and /dev/sdi1 (exported by iscsi from 
> the second server) and I obtain a BUG :
> 
> Root gershwin:[/usr/scripts] > mdadm -C /dev/md7 -l1 -n2 /dev/md/d0p1 /dev/sdi1
> ...

	Hello,

	I have fixed iscsi-target and tested it; it now works without any 
trouble. The patches were posted to the iscsi-target mailing list. When I 
use iSCSI to access the foreign raid5 volume, it works fine: I can format 
the foreign volume, copy large files to it, and so on. But when I try to 
create a new raid1 volume from a local raid5 volume and a foreign raid5 
volume, I get my well-known Oops. Here is my dmesg after the Oops:

md: md_d0 stopped.
md: bind<sdd1>
md: bind<sde1>
md: bind<sdf1>
md: bind<sdg1>
md: bind<sdh1>

md: bind<sdc1>
raid5: device sdc1 operational as raid disk 0
raid5: device sdh1 operational as raid disk 5
raid5: device sdg1 operational as raid disk 4
raid5: device sdf1 operational as raid disk 3
raid5: device sde1 operational as raid disk 2
raid5: device sdd1 operational as raid disk 1
raid5: allocated 12518kB for md_d0
raid5: raid level 5 set md_d0 active with 6 out of 6 devices, algorithm 2
RAID5 conf printout:
  --- rd:6 wd:6
  disk 0, o:1, dev:sdc1
  disk 1, o:1, dev:sdd1
  disk 2, o:1, dev:sde1
  disk 3, o:1, dev:sdf1
  disk 4, o:1, dev:sdg1
  disk 5, o:1, dev:sdh1
  md_d0: p1
scsi3 : iSCSI Initiator over TCP/IP
scsi 3:0:0:0: Direct-Access     IET      VIRTUAL-DISK     0    PQ: 0 ANSI: 4
sd 3:0:0:0: [sdi] 2929451520 512-byte hardware sectors (1499879 MB)
sd 3:0:0:0: [sdi] Write Protect is off
sd 3:0:0:0: [sdi] Mode Sense: 77 00 00 08
sd 3:0:0:0: [sdi] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
sd 3:0:0:0: [sdi] 2929451520 512-byte hardware sectors (1499879 MB)
sd 3:0:0:0: [sdi] Write Protect is off
sd 3:0:0:0: [sdi] Mode Sense: 77 00 00 08
sd 3:0:0:0: [sdi] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
  sdi: sdi1
sd 3:0:0:0: [sdi] Attached SCSI disk
md: bind<md_d0p1>
md: bind<sdi1>
md: md7: raid array is not clean -- starting background reconstruction
raid1: raid set md7 active with 2 out of 2 mirrors
md: resync of RAID array md7
md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
md: using 256k window, over a total of 1464725632 blocks.
kernel BUG at drivers/md/raid5.c:380!
               \|/ ____ \|/
               "@'/ .. \`@"
               /_| \__/ |_\
                  \__U_/
md7_resync(4929): Kernel bad sw trap 5 [#1]
TSTATE: 0000000080001606 TPC: 00000000005ed50c TNPC: 00000000005ed510 Y: 00000000    Not tainted
TPC: <get_stripe_work+0x1f4/0x200>
g0: 0000000000000005 g1: 00000000007c0400 g2: 0000000000000001 g3: 0000000000748400
g4: fffff800feeb6880 g5: fffff80002080000 g6: fffff800e7598000 g7: 0000000000748528
o0: 0000000000000029 o1: 0000000000715798 o2: 000000000000017c o3: 0000000000000005
o4: 0000000000000006 o5: fffff800e8f0a060 sp: fffff800e759ad81 ret_pc: 00000000005ed504
RPC: <get_stripe_work+0x1ec/0x200>
l0: 0000000000000002 l1: ffffffffffffffff l2: fffff800e8f0a0a0 l3: fffff800e8f09fe8
l4: fffff800e8f0a088 l5: fffffffffffffff8 l6: 0000000000000005 l7: fffff800e8374000
i0: fffff800e8f0a028 i1: 0000000000000000 i2: 0000000000000004 i3: fffff800e759b720
i4: 0000000000000080 i5: 0000000000000080 i6: fffff800e759ae51 i7: 00000000005f0274
I7: <handle_stripe5+0x4fc/0x1340>
Caller[00000000005f0274]: handle_stripe5+0x4fc/0x1340
Caller[00000000005f211c]: handle_stripe+0x24/0x13e0
Caller[00000000005f4450]: make_request+0x358/0x600
Caller[0000000000542890]: generic_make_request+0x198/0x220
Caller[00000000005eb240]: sync_request+0x608/0x640
Caller[00000000005fef7c]: md_do_sync+0x384/0x920
Caller[00000000005ff8f0]: md_thread+0x38/0x140
Caller[0000000000478b40]: kthread+0x48/0x80
Caller[00000000004273d0]: kernel_thread+0x38/0x60
Caller[0000000000478de0]: kthreadd+0x148/0x1c0
Instruction DUMP: 9210217c  7ff8f57f  90122398 <91d02005> 30680004 
01000000  01000000  01000000  9de3bf00
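
	For anyone reading the trace, the "Caller[...]" lines can be turned into a 
call chain mechanically. What follows is only a minimal sketch in Python, 
not part of the original report: the names TRACE, CALLER_RE and 
parse_callers are made up for illustration, and the input is a subset of 
the lines copied from the oops above. It assumes the sparc64 
"Caller[addr]: symbol+0xoffset/0xsize" layout seen in this dump.

```python
import re

# Call-trace lines copied from the oops above (innermost frame first).
TRACE = """\
Caller[00000000005f0274]: handle_stripe5+0x4fc/0x1340
Caller[00000000005f211c]: handle_stripe+0x24/0x13e0
Caller[00000000005f4450]: make_request+0x358/0x600
Caller[0000000000542890]: generic_make_request+0x198/0x220
Caller[00000000005eb240]: sync_request+0x608/0x640
Caller[00000000005fef7c]: md_do_sync+0x384/0x920
"""

# Hypothetical helper pattern: address, symbol, offset into symbol, symbol size.
CALLER_RE = re.compile(
    r"^Caller\[([0-9a-f]+)\]: (\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)$"
)


def parse_callers(text):
    """Return (address, symbol, offset, size) tuples in the order given."""
    frames = []
    for line in text.splitlines():
        match = CALLER_RE.match(line.strip())
        if match:
            addr, sym, off, size = match.groups()
            frames.append((int(addr, 16), sym, int(off, 16), int(size, 16)))
    return frames


frames = parse_callers(TRACE)

# Print the outermost caller first so the path reads top-down.
for addr, sym, off, size in reversed(frames):
    print(f"{sym} (+{off:#x} of {size:#x})")
```

	Read top-down, the path goes from md_do_sync through sync_request and 
generic_make_request into make_request, handle_stripe and handle_stripe5, 
ending in get_stripe_work where the BUG fires. That ordering is consistent 
with the md7 resync thread issuing I/O that lands in the raid5 request 
path, although the trace alone does not say which personality each static 
function name belongs to.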

	I suspect a major bug in the raid5 code, but I don't know how to debug it...

	md7 was created by mdadm -C /dev/md7 -l1 -n2 /dev/md/d0 /dev/sdi1. 
/dev/md/d0 is a raid5 volume, and sdi is an iSCSI disk.
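
	The sizes printed in the dmesg output above can be cross-checked with a 
little arithmetic. This is a rough sketch, not something from the original 
report; the variable names are invented for illustration. It assumes that 
the "blocks" in the md resync message are 1 KiB units and that the MB 
figure for sdi is in decimal megabytes. It makes no claim about how the 
kernel itself rounds that figure.

```python
SECTOR_BYTES = 512

# "[sdi] 2929451520 512-byte hardware sectors (1499879 MB)"
sdi_sectors = 2929451520
# "md: using 256k window, over a total of 1464725632 blocks."
md7_blocks_kib = 1464725632

# Whole-disk size of sdi in bytes and in decimal megabytes.
sdi_bytes = sdi_sectors * SECTOR_BYTES
sdi_mb = sdi_bytes // 10**6

# Size of the region md7 resyncs, converted to 512-byte sectors.
md7_sectors = md7_blocks_kib * 1024 // SECTOR_BYTES

print("sdi size in MB:", sdi_mb)
print("md7 resync size in sectors:", md7_sectors)
print("sectors of sdi not covered by md7:", sdi_sectors - md7_sectors)
```

	Under those assumptions the region resynced by md7 comes out slightly 
smaller than the whole sdi disk, which is what one would expect when the 
mirror is built on a partition (sdi1) and md reserves room for its own 
metadata.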

	Regards,

	JKB
