* 16 HDDs too much for RAID6?
From: Lars Täuber @ 2008-03-06 9:01 UTC (permalink / raw)
To: linux-raid
Hello!
Here we have another problem with our RAID6 and 16 HDDs:
monosan:~ # mdadm -V
mdadm - v2.6.2 - 21st May 2007
monosan:~ # mdadm -C /dev/md4 -l6 -n 16 -x 0 /dev/dm-*
mdadm: /dev/dm-0 appears to be part of a raid array:
level=raid6 devices=15 ctime=Wed Feb 13 10:38:52 2008
mdadm: /dev/dm-1 appears to be part of a raid array:
level=raid6 devices=15 ctime=Wed Feb 13 10:38:52 2008
mdadm: /dev/dm-10 appears to be part of a raid array:
level=raid6 devices=15 ctime=Wed Feb 13 10:38:52 2008
mdadm: /dev/dm-11 appears to contain an ext2fs file system
size=-2147483648K mtime=Thu Jan 1 01:00:00 1970
mdadm: /dev/dm-11 appears to be part of a raid array:
level=raid6 devices=15 ctime=Wed Feb 13 10:38:52 2008
mdadm: /dev/dm-12 appears to be part of a raid array:
level=raid6 devices=15 ctime=Wed Feb 13 10:38:52 2008
mdadm: /dev/dm-13 appears to be part of a raid array:
level=raid6 devices=15 ctime=Wed Feb 13 10:38:52 2008
mdadm: /dev/dm-14 appears to be part of a raid array:
level=raid6 devices=15 ctime=Wed Feb 13 10:38:52 2008
mdadm: /dev/dm-2 appears to be part of a raid array:
level=raid6 devices=15 ctime=Wed Feb 13 10:38:52 2008
mdadm: /dev/dm-3 appears to be part of a raid array:
level=raid6 devices=15 ctime=Wed Feb 13 10:38:52 2008
mdadm: /dev/dm-4 appears to be part of a raid array:
level=raid6 devices=15 ctime=Wed Feb 13 10:38:52 2008
mdadm: /dev/dm-5 appears to be part of a raid array:
level=raid6 devices=15 ctime=Wed Feb 13 10:38:52 2008
mdadm: /dev/dm-6 appears to be part of a raid array:
level=raid6 devices=15 ctime=Wed Feb 13 10:38:52 2008
mdadm: /dev/dm-7 appears to be part of a raid array:
level=raid6 devices=15 ctime=Wed Feb 13 10:38:52 2008
Continue creating array? y
mdadm: array /dev/md4 started.
monosan:~ # mdadm --detail --scan| fgrep md4 >> /etc/mdadm.conf
monosan:~ # mdadm -S /dev/md4
mdadm: stopped /dev/md4
monosan:~ # mdadm -A /dev/md4
mdadm: WARNING /dev/dm-9 and /dev/dm-8 appear to have very similar superblocks.
If they are really different, please --zero the superblock on one
If they are the same or overlap, please remove one from the
DEVICE list in mdadm.conf.
This happens _always_ when the array is reassembled. The actual devices with the duplicated superblocks sometimes differ.
Are 16 drives too much for RAID6?
Thanks
Lars
--
Informationstechnologie
Berlin-Brandenburgische Akademie der Wissenschaften
Jägerstrasse 22-23 10117 Berlin
Tel.: +49 30 20370-352 http://www.bbaw.de
* Re: 16 HDDs too much for RAID6?
From: Andre Noll @ 2008-03-06 9:45 UTC (permalink / raw)
To: Lars Täuber; +Cc: linux-raid

On 10:01, Lars Täuber wrote:
> Here we have another problem with our RAID6 and 16 HDDs:
>
> monosan:~ # mdadm -V
> mdadm - v2.6.2 - 21st May 2007
> monosan:~ # mdadm -C /dev/md4 -l6 -n 16 -x 0 /dev/dm-*
> mdadm: /dev/dm-0 appears to be part of a raid array:

Run

    mdadm --zero-superblock /dev/dm-0

before creating the array to get rid of these.

> monosan:~ # mdadm --detail --scan| fgrep md4 >> /etc/mdadm.conf

Please post your /etc/mdadm.conf

> monosan:~ # mdadm -A /dev/md4
> mdadm: WARNING /dev/dm-9 and /dev/dm-8 appear to have very similar superblocks.
>       If they are really different, please --zero the superblock on one
>       If they are the same or overlap, please remove one from the
>       DEVICE list in mdadm.conf.

Are you sure dm-8 and dm-9 are different devices?

    pvdisplay /dev/dm-9; pvdisplay /dev/dm-8

should tell you.

> This happens _always_ when the array is reassembled. The actual
> devices with the duplicated superblocks sometimes differ.

If I read the code correctly it means that the two devices have
identical superblocks, the same event count and the same minor number,
so mdadm thinks dm-8 and dm-9 are overlapping partitions.

> Are 16 drives too much for RAID6?

No, raid6 supports up to 254 devices (there are other reasons for
not using too many devices for raid6 though).

Andre
--
The only person who always got his work done by Friday was Robinson Crusoe
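A minimal sketch of the clean-up Andre suggests, assuming the sixteen multipath nodes /dev/dm-* from this thread are the intended members and hold no data worth keeping (the create command is the one from the original report):

    # Wipe any stale md superblocks from every multipath node, then
    # re-create the RAID6 exactly as in the original report.
    for DEV in /dev/dm-*; do
        mdadm --zero-superblock "$DEV"
    done
    mdadm -C /dev/md4 -l6 -n 16 -x 0 /dev/dm-*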
* Re: 16 HDDs too much for RAID6?
From: Lars Täuber @ 2008-03-06 10:55 UTC (permalink / raw)
To: Andre Noll; +Cc: linux-raid

Hi Andre,

> Run
>
>     mdadm --zero-superblock /dev/dm-0
>
> before creating the array to get rid of these.

ok, thanks.

> > monosan:~ # mdadm --detail --scan| fgrep md4 >> /etc/mdadm.conf
>
> Please post your /etc/mdadm.conf

monosan:~ # cat /etc/mdadm.conf
DEVICE partitions
ARRAY /dev/md2 level=raid1 UUID=d9d31de2:e6dbd3c3:37c7ea09:882a64e5
ARRAY /dev/md3 level=raid1 UUID=a8687183:a79e514c:ca492c4b:ffd4384f
ARRAY /dev/md4 level=raid6 num-devices=16 UUID=8d596319:4d21dba3:3871bccf:5b90a66d

> > monosan:~ # mdadm -A /dev/md4
> > mdadm: WARNING /dev/dm-9 and /dev/dm-8 appear to have very similar superblocks.
> >       If they are really different, please --zero the superblock on one
> >       If they are the same or overlap, please remove one from the
> >       DEVICE list in mdadm.conf.
>
> Are you sure dm-8 and dm-9 are different devices?
>
>     pvdisplay /dev/dm-9; pvdisplay /dev/dm-8
>
> should tell you.

Yes, I'm sure:

monosan:~ # pvdisplay /dev/dm-9; pvdisplay /dev/dm-8
  Failed to read physical volume "/dev/dm-9"
  Failed to read physical volume "/dev/dm-8"

but:

monosan:~ # multipathd -k
multipathd> list topology
mpath0 (SATA_ST31000340NS_5QJ02QRQ) dm-0 ATA ,ST31000340NS
[size=932G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 6:0:32:0 sdah 66:16  [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 6:0:15:0 sdr  65:16  [active][ready]
mpath1 (SATA_ST31000340NS_5QJ0204G) dm-1 ATA ,ST31000340NS
[size=932G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 6:0:31:0 sdag 66:0   [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 6:0:14:0 sdq  65:0   [active][ready]
mpath2 (SATA_ST31000340NS_5QJ02TVQ) dm-2 ATA ,ST31000340NS
[size=932G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 6:0:30:0 sdaf 65:240 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 6:0:13:0 sdp  8:240  [active][ready]
mpath3 (SATA_ST31000340NS_5QJ012AL) dm-3 ATA ,ST31000340NS
[size=932G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 6:0:29:0 sdae 65:224 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 6:0:12:0 sdo  8:224  [active][ready]
mpath4 (SATA_ST31000340NS_5QJ00PHN) dm-4 ATA ,ST31000340NS
[size=932G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 6:0:28:0 sdad 65:208 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 6:0:11:0 sdn  8:208  [active][ready]
mpath5 (SATA_ST31000340NS_5QJ01BYF) dm-5 ATA ,ST31000340NS
[size=932G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 6:0:27:0 sdac 65:192 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 6:0:10:0 sdm  8:192  [active][ready]
mpath6 (SATA_ST31000340NS_5QJ026J1) dm-6 ATA ,ST31000340NS
[size=932G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 6:0:26:0 sdab 65:176 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 6:0:9:0  sdl  8:176  [active][ready]
mpath7 (SATA_ST31000340NS_5QJ01G09) dm-7 ATA ,ST31000340NS
[size=932G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 6:0:25:0 sdaa 65:160 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 6:0:8:0  sdk  8:160  [active][ready]
rename: mpath9 (SATA_ST31000340NS_5QJ02461) dm-8 ATA ,ST31000340NS
[size=932G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 6:0:24:0 sdz  65:144 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 6:0:7:0  sdj  8:144  [active][ready]
reload: mpath10 (SATA_ST31000340NS_5QJ013GW) dm-9 ATA ,ST31000340NS
[size=932G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 6:0:23:0 sdy  65:128 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 6:0:6:0  sdi  8:128  [active][ready]
mpath11 (SATA_ST31000340NS_5QJ01835) dm-10 ATA ,ST31000340NS
[size=932G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 6:0:22:0 sdx  65:112 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 6:0:5:0  sdh  8:112  [active][ready]
mpath12 (SATA_ST31000340NS_5QJ01C49) dm-11 ATA ,ST31000340NS
[size=932G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 6:0:21:0 sdw  65:96  [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 6:0:4:0  sdg  8:96   [active][ready]
mpath13 (SATA_ST31000340NS_5QJ02TBZ) dm-12 ATA ,ST31000340NS
[size=932G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 6:0:20:0 sdv  65:80  [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 6:0:3:0  sdf  8:80   [active][ready]
mpath14 (SATA_ST31000340NS_5QJ01JSF) dm-13 ATA ,ST31000340NS
[size=932G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 6:0:19:0 sdu  65:64  [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 6:0:2:0  sde  8:64   [active][ready]
mpath15 (SATA_ST31000340NS_5QJ02TBK) dm-14 ATA ,ST31000340NS
[size=932G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 6:0:18:0 sdt  65:48  [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 6:0:1:0  sdd  8:48   [active][ready]
mpath16 (SATA_ST31000340NS_5QJ0185Y) dm-15 ATA ,ST31000340NS
[size=932G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 6:0:17:0 sds  65:32  [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 6:0:0:0  sdc  8:32   [active][ready]
multipathd>

monosan:~ # for DEV in /dev/sd[c-z] /dev/sda[a-h] ; do echo Serial $DEV ; smartctl -id sat $DEV ; done | fgrep Serial
Serial /dev/sdc  Serial Number: 5QJ0185Y
Serial /dev/sdd  Serial Number: 5QJ02TBK
Serial /dev/sde  Serial Number: 5QJ01JSF
Serial /dev/sdf  Serial Number: 5QJ02TBZ
Serial /dev/sdg  Serial Number: 5QJ01C49
Serial /dev/sdh  Serial Number: 5QJ01835
Serial /dev/sdi  Serial Number: 5QJ013GW
Serial /dev/sdj  Serial Number: 5QJ02461
Serial /dev/sdk  Serial Number: 5QJ01G09
Serial /dev/sdl  Serial Number: 5QJ026J1
Serial /dev/sdm  Serial Number: 5QJ01BYF
Serial /dev/sdn  Serial Number: 5QJ00PHN
Serial /dev/sdo  Serial Number: 5QJ012AL
Serial /dev/sdp  Serial Number: 5QJ02TVQ
Serial /dev/sdq  Serial Number: 5QJ0204G
Serial /dev/sdr  Serial Number: 5QJ02QRQ
Serial /dev/sds  Serial Number: 5QJ0185Y
Serial /dev/sdt  Serial Number: 5QJ02TBK
Serial /dev/sdu  Serial Number: 5QJ01JSF
Serial /dev/sdv  Serial Number: 5QJ02TBZ
Serial /dev/sdw  Serial Number: 5QJ01C49
Serial /dev/sdx  Serial Number: 5QJ01835
Serial /dev/sdy  Serial Number: 5QJ013GW
Serial /dev/sdz  Serial Number: 5QJ02461
Serial /dev/sdaa Serial Number: 5QJ01G09
Serial /dev/sdab Serial Number: 5QJ026J1
Serial /dev/sdac Serial Number: 5QJ01BYF
Serial /dev/sdad Serial Number: 5QJ00PHN
Serial /dev/sdae Serial Number: 5QJ012AL
Serial /dev/sdaf Serial Number: 5QJ02TVQ
Serial /dev/sdag Serial Number: 5QJ0204G
Serial /dev/sdah Serial Number: 5QJ02QRQ

> > This happens _always_ when the array is reassembled. The actual
> > devices with the duplicated superblocks sometimes differ.
>
> If I read the code correctly it means that the two devices have
> identical superblocks, the same event count and the same minor number,
> so mdadm thinks dm-8 and dm-9 are overlapping partitions.

The multipathed drives are used as whole disks, without a partition
table. They don't contain any partitions.

sdz and sdj are the same physical device with serial 5QJ02461, called /dev/dm-8:

rename: mpath9 (SATA_ST31000340NS_5QJ02461) dm-8 ATA ,ST31000340NS
[size=932G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 6:0:24:0 sdz  65:144 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 6:0:7:0  sdj  8:144  [active][ready]

similar to sdy and sdi with serial 5QJ013GW, called /dev/dm-9:

reload: mpath10 (SATA_ST31000340NS_5QJ013GW) dm-9 ATA ,ST31000340NS
[size=932G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 6:0:23:0 sdy  65:128 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 6:0:6:0  sdi  8:128  [active][ready]

> > Are 16 drives too much for RAID6?
>
> No, raid6 supports up to 254 devices (there are other reasons for
> not using too many devices for raid6 though).

Is there a way to get more verbose info or to debug this somehow?

Thanks
Lars
* Re: 16 HDDs too much for RAID6?
From: Andre Noll @ 2008-03-06 16:16 UTC (permalink / raw)
To: Lars Täuber; +Cc: linux-raid

On 11:55, Lars Täuber wrote:
> monosan:~ # cat /etc/mdadm.conf
> DEVICE partitions
> ARRAY /dev/md2 level=raid1 UUID=d9d31de2:e6dbd3c3:37c7ea09:882a64e5
> ARRAY /dev/md3 level=raid1 UUID=a8687183:a79e514c:ca492c4b:ffd4384f
> ARRAY /dev/md4 level=raid6 num-devices=16 UUID=8d596319:4d21dba3:3871bccf:5b90a66d

Does it help to list only the 16 devices that are used for the array,
i.e. something like

    DEVICE /dev/sd[a-p]

> sdz and sdj are the same physical device with serial 5QJ02461, called /dev/dm-8:
> rename: mpath9 (SATA_ST31000340NS_5QJ02461) dm-8 ATA ,ST31000340NS
> [size=932G][features=0][hwhandler=0]
> \_ round-robin 0 [prio=0][active]
>  \_ 6:0:24:0 sdz  65:144 [active][ready]
> \_ round-robin 0 [prio=0][enabled]
>  \_ 6:0:7:0  sdj  8:144  [active][ready]
>
> similar to sdy and sdi with serial 5QJ013GW, called /dev/dm-9:
> reload: mpath10 (SATA_ST31000340NS_5QJ013GW) dm-9 ATA ,ST31000340NS
> [size=932G][features=0][hwhandler=0]
> \_ round-robin 0 [prio=0][active]
>  \_ 6:0:23:0 sdy  65:128 [active][ready]
> \_ round-robin 0 [prio=0][enabled]
>  \_ 6:0:6:0  sdi  8:128  [active][ready]

I think this is what is confusing mdadm. Your "DEVICE partitions"
line instructs mdadm to consider all devices in /proc/partitions,
so it finds both sdy and sdi.

> Is there a way to get more verbose info or to debug this somehow?

There are the --detail and --verbose command line options to mdadm.

Andre
--
The only person who always got his work done by Friday was Robinson Crusoe
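A quick way to confirm Andre's point is to look for both legs of a multipath pair in the kernel's partition list and then retry the assembly verbosely; a small sketch, using the sdy/sdi pair mentioned above:

    # Both underlying paths of dm-9 show up in /proc/partitions, so a
    # "DEVICE partitions" line lets mdadm see the same disk twice.
    grep -wE 'sdy|sdi' /proc/partitions

    # Re-run the assembly verbosely to see which device nodes mdadm
    # actually picked up for each slot.
    mdadm -A --verbose /dev/md4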
* Re: 16 HDDs too much for RAID6?
From: Luca Berra @ 2008-03-07 8:41 UTC (permalink / raw)
To: linux-raid

On Thu, Mar 06, 2008 at 05:16:21PM +0100, Andre Noll wrote:
>On 11:55, Lars Täuber wrote:
>> monosan:~ # cat /etc/mdadm.conf
>> DEVICE partitions
>> ARRAY /dev/md2 level=raid1 UUID=d9d31de2:e6dbd3c3:37c7ea09:882a64e5
>> ARRAY /dev/md3 level=raid1 UUID=a8687183:a79e514c:ca492c4b:ffd4384f
>> ARRAY /dev/md4 level=raid6 num-devices=16 UUID=8d596319:4d21dba3:3871bccf:5b90a66d
>
>Does it help to list only the 16 devices that are used for the array,
>i.e. something like
>
>    DEVICE /dev/sd[a-p]

I hope the kernel will prevent you from doing something this stupid,
but I am not that sure.
If you want to check whether the problem is device selection, a more
appropriate line would be

    DEVICE /dev/mapper/mpath*
or
    DEVICE /dev/dm-[0-9] /dev/dm-1[0-5]

>I think this is what is confusing mdadm. Your "DEVICE partitions"
>line instructs mdadm to consider all devices in /proc/partitions,
>so it finds both sdy and sdi.

In this case I believe the error message would be different.

>> Is there a way to get more verbose info or to debug this somehow?

You could try with the --verbose option and post the results here.

Also, could you check whether the minor numbers of /dev/dm-* are really
unique?

In case this yields no result we will have to add some more printfs to
Assemble.c.

Regards,
L.

--
Luca Berra -- bluca@comedia.it
Communication Media & Services S.r.l.
 /"\
 \ /   ASCII RIBBON CAMPAIGN
  X    AGAINST HTML MAIL
 / \
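Combining Luca's DEVICE suggestion with the ARRAY lines Lars posted earlier gives a restricted configuration roughly like the following sketch; the exact globs are assumptions taken from the proposals in this thread, not from the actual machine:

    # /etc/mdadm.conf -- scan the system disks for md2/md3 and only the
    # multipath nodes for md4, so each data disk is seen exactly once.
    DEVICE /dev/sd[ab][0-9] /dev/mapper/mpath*
    ARRAY /dev/md2 level=raid1 UUID=d9d31de2:e6dbd3c3:37c7ea09:882a64e5
    ARRAY /dev/md3 level=raid1 UUID=a8687183:a79e514c:ca492c4b:ffd4384f
    ARRAY /dev/md4 level=raid6 num-devices=16 UUID=8d596319:4d21dba3:3871bccf:5b90a66d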
* Re: 16 HDDs too much for RAID6?
From: Andre Noll @ 2008-03-07 10:33 UTC (permalink / raw)
To: linux-raid

On 09:41, Luca Berra wrote:
> >I think this is what is confusing mdadm. Your "DEVICE partitions"
> >line instructs mdadm to consider all devices in /proc/partitions,
> >so it finds both sdy and sdi.
> In this case I believe the error message would be different.

Well, the code that causes the warning Lars is seeing is

    if (best[i] >=0 &&
        devices[best[i]].i.events == devices[devcnt].i.events &&
        (devices[best[i]].i.disk.minor != devices[devcnt].i.disk.minor) &&
        st->ss->major == 0 &&
        info.array.level != -4) {
            /* two different devices with identical superblock.
             * Could be a mis-detection caused by overlapping
             * partitions. fail-safe.
             */
            fprintf(stderr, Name ": WARNING %s and %s appear"
                    " to have very similar superblocks.\n"
                    "      If they are really different, "
                    "please --zero the superblock on one\n"
                    "      If they are the same or overlap,"
                    " please remove one from %s.\n",
                    devices[best[i]].devname, devname,
                    inargv ? "the list" :
                        "the\n      DEVICE list in mdadm.conf"
                    );
            if (must_close) close(mdfd);
            return 1;
    }

IMHO this can be triggered by having two device nodes (sdy and sdi,
whatever) that correspond to the same physical device, no?

Andre
--
The only person who always got his work done by Friday was Robinson Crusoe
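One way to follow up on Luca's question about unique minor numbers is simply to list them for every device-mapper node; a small sketch (stat prints major and minor in hex for block devices):

    # List major:minor of every device-mapper node; duplicates here would
    # point at a device-mapper problem rather than an md one.
    stat -c '%n %t:%T' /dev/dm-*

    # The minor number recorded in a member's 0.90 superblock is also
    # visible in the examine output ("Preferred Minor").
    mdadm --examine /dev/dm-8 | grep -i minor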
* Re: 16 HDDs too much for RAID6?
From: Lars Täuber @ 2008-03-07 10:45 UTC (permalink / raw)
To: linux-raid

Hi guys,

Luca Berra <bluca@comedia.it> wrote:
> On Thu, Mar 06, 2008 at 05:16:21PM +0100, Andre Noll wrote:
> >On 11:55, Lars Täuber wrote:
> >> monosan:~ # cat /etc/mdadm.conf
> >> DEVICE partitions
> >> ARRAY /dev/md2 level=raid1 UUID=d9d31de2:e6dbd3c3:37c7ea09:882a64e5
> >> ARRAY /dev/md3 level=raid1 UUID=a8687183:a79e514c:ca492c4b:ffd4384f
> >> ARRAY /dev/md4 level=raid6 num-devices=16 UUID=8d596319:4d21dba3:3871bccf:5b90a66d
> >
> >Does it help to list only the 16 devices that are used for the array,
> >i.e. something like
> >
> >    DEVICE /dev/sd[a-p]

Because the devices sd[c-z] and sda[a-h] are used by multipathd, they
are only accessible read-only. For writing, the /dev/dm-* devices are
available.

> I hope the kernel will prevent you from doing something this stupid,
> but I am not that sure.
> If you want to check whether the problem is device selection, a more
> appropriate line would be
>    DEVICE /dev/mapper/mpath*
> or
>    DEVICE /dev/dm-[0-9] /dev/dm-1[0-5]

Correct. My mdadm.conf now has this line for safety:

DEVICE /dev/sd[ab][0-9] /dev/dm-*

But this didn't really change anything.

> >I think this is what is confusing mdadm. Your "DEVICE partitions"
> >line instructs mdadm to consider all devices in /proc/partitions,
> >so it finds both sdy and sdi.
> In this case I believe the error message would be different.
>
> >> Is there a way to get more verbose info or to debug this somehow?
>
> You could try with the --verbose option and post the results here.
>
> Also, could you check whether the minor numbers of /dev/dm-* are really
> unique?
>
> In case this yields no result we will have to add some more printfs to
> Assemble.c.

I zeroed out all physical devices completely:

# for DEV in /dev/sd[c-r]; do dd if=/dev/zero of=$DEV; done

Now the problem is gone. I don't know what really caused the problem.
Many thanks for your suggestions.

Lars
* Reopen: 16 HDDs too much for RAID6?
From: Lars Täuber @ 2008-03-28 9:20 UTC (permalink / raw)
To: linux-raid

Hello!

Lars Täuber <taeuber@bbaw.de> wrote:
> I zeroed out all physical devices completely:
> # for DEV in /dev/sd[c-r]; do dd if=/dev/zero of=$DEV; done
>
> Now the problem is gone. I don't know what really caused the problem.
> Many thanks for your suggestions.

The problem has occurred again.
I'm not sure what the cause was, but the duplicated superblock is there
again. The RAID had fallen apart beforehand, so I suspect this only
happens after the array has been degraded. The discs are not defective,
so I tried to reassemble the array with the original discs again:

monosan:~ # mdadm -A /dev/md4
mdadm: WARNING /dev/dm-9 and /dev/dm-8 appear to have very similar superblocks.
      If they are really different, please --zero the superblock on one
      If they are the same or overlap, please remove one from the
      DEVICE list in mdadm.conf.

How can I extract the superblocks to check whether they are really identical?

Thanks
Lars
* Re: Reopen: 16 HDDs too much for RAID6?
From: Bernd Schubert @ 2008-03-28 10:14 UTC (permalink / raw)
To: Lars Täuber; +Cc: linux-raid

Hello Lars,

On Friday 28 March 2008 10:20:02 Lars Täuber wrote:
> Hello!
>
> Lars Täuber <taeuber@bbaw.de> wrote:
> > I zeroed out all physical devices completely:
> > # for DEV in /dev/sd[c-r]; do dd if=/dev/zero of=$DEV; done

Why not simply "mdadm --zero-superblock $DEV"?

> > Now the problem is gone. I don't know what really caused the problem.
> > Many thanks for your suggestions.
>
> The problem has occurred again.
> I'm not sure what the cause was, but the duplicated superblock is there
> again. The RAID had fallen apart beforehand, so I suspect this only
> happens after the array has been degraded. The discs are not defective,
> so I tried to reassemble the array with the original discs again:
>
> monosan:~ # mdadm -A /dev/md4
> mdadm: WARNING /dev/dm-9 and /dev/dm-8 appear to have very similar superblocks.
>       If they are really different, please --zero the superblock on one
>       If they are the same or overlap, please remove one from the
>       DEVICE list in mdadm.conf.
>
> How can I extract the superblocks to check whether they are really identical?

mdadm --examine /dev/dm-9
mdadm --examine /dev/dm-8

Do you have some kind of multipathing which really could cause identical
superblocks on dm-9 and dm-8? Did you specify dm-9 and dm-8 in your
mdadm.conf / assemble script, or the real human-readable lvm / multipath
names?

Cheers,
Bernd

--
Bernd Schubert
Q-Leap Networks GmbH
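To make the comparison Bernd suggests easier to read, the two --examine outputs can simply be diffed; a small sketch, with arbitrary temporary file names:

    # Dump both superblocks and compare them field by field; identical
    # UUID, event count and checksum would confirm mdadm's suspicion.
    mdadm --examine /dev/dm-8 > /tmp/dm-8.examine
    mdadm --examine /dev/dm-9 > /tmp/dm-9.examine
    diff -u /tmp/dm-8.examine /tmp/dm-9.examine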
* Re: Reopen: 16 HDDs too much for RAID6?
From: Lars Täuber @ 2008-03-28 10:27 UTC (permalink / raw)
To: linux-raid

Hello Bernd,

Bernd Schubert <bs@q-leap.de> wrote:
> Hello Lars,
>
> On Friday 28 March 2008 10:20:02 Lars Täuber wrote:
> > Hello!
> >
> > Lars Täuber <taeuber@bbaw.de> wrote:
> > > I zeroed out all physical devices completely:
> > > # for DEV in /dev/sd[c-r]; do dd if=/dev/zero of=$DEV; done
>
> Why not simply "mdadm --zero-superblock $DEV"?

I just wanted to make sure that there couldn't be anything left anywhere
on the disk. I have since learned about this option of mdadm.

> > > Now the problem is gone. I don't know what really caused the problem.
> > > Many thanks for your suggestions.
> >
> > The problem has occurred again.
> > I'm not sure what the cause was, but the duplicated superblock is there
> > again. The RAID had fallen apart beforehand, so I suspect this only
> > happens after the array has been degraded. The discs are not defective,
> > so I tried to reassemble the array with the original discs again:
> >
> > monosan:~ # mdadm -A /dev/md4
> > mdadm: WARNING /dev/dm-9 and /dev/dm-8 appear to have very similar superblocks.
> >       If they are really different, please --zero the superblock on one
> >       If they are the same or overlap, please remove one from the
> >       DEVICE list in mdadm.conf.
> >
> > How can I extract the superblocks to check whether they are really identical?
>
> mdadm --examine /dev/dm-9
> mdadm --examine /dev/dm-8

I just reassembled the array for another test. Next time I'll have a
deeper look with this.

> Do you have some kind of multipathing which really could cause identical
> superblocks on dm-9 and dm-8? Did you specify dm-9 and dm-8 in your
> mdadm.conf / assemble script, or the real human-readable lvm / multipath
> names?

Here is the conf file:

monosan:~ # cat /etc/mdadm.conf
DEVICE /dev/sd[ab][0-9] /dev/dm-*
ARRAY /dev/md2 level=raid1 UUID=d9d31de2:e6dbd3c3:37c7ea09:882a64e5
ARRAY /dev/md3 level=raid1 num-devices=2 UUID=a8687183:a79e514c:ca492c4b:ffd4384f
ARRAY /dev/md4 level=raid6 num-devices=15 spares=1 UUID=cfcbe071:f6766d8f:0f1ffefa:892d09c3
ARRAY /dev/md9 level=raid1 num-devices=2 name=9 UUID=db687150:614e76fd:28feefc0:b1aae572

All dm-* devices are really distinct. I could post the
/etc/multipath.conf too if you want.

Thanks
Lars
* Re: Reopen: 16 HDDs too much for RAID6?
From: Bernd Schubert @ 2008-03-28 10:35 UTC (permalink / raw)
To: Lars Täuber; +Cc: linux-raid

> Here is the conf file:
>
> monosan:~ # cat /etc/mdadm.conf
> DEVICE /dev/sd[ab][0-9] /dev/dm-*
> ARRAY /dev/md2 level=raid1 UUID=d9d31de2:e6dbd3c3:37c7ea09:882a64e5
> ARRAY /dev/md3 level=raid1 num-devices=2 UUID=a8687183:a79e514c:ca492c4b:ffd4384f
> ARRAY /dev/md4 level=raid6 num-devices=15 spares=1 UUID=cfcbe071:f6766d8f:0f1ffefa:892d09c3
> ARRAY /dev/md9 level=raid1 num-devices=2 name=9 UUID=db687150:614e76fd:28feefc0:b1aae572
>
> All dm-* devices are really distinct. I could post the
> /etc/multipath.conf too if you want.

I only have very little experience with multipathing, but please send
your config. The fact that you really do use multipathing only confirms
my initial guess that it is a multipath and not an md problem.

Cheers,
Bernd

--
Bernd Schubert
Q-Leap Networks GmbH
* Re: Reopen: 16 HDDs too much for RAID6?
From: Lars Täuber @ 2008-03-28 10:55 UTC (permalink / raw)
To: linux-raid

Hi Bernd,

Bernd Schubert <bs@q-leap.de> wrote:
> > Here is the conf file:
> > monosan:~ # cat /etc/mdadm.conf
> > DEVICE /dev/sd[ab][0-9] /dev/dm-*
> > ARRAY /dev/md2 level=raid1 UUID=d9d31de2:e6dbd3c3:37c7ea09:882a64e5
> > ARRAY /dev/md3 level=raid1 num-devices=2 UUID=a8687183:a79e514c:ca492c4b:ffd4384f
> > ARRAY /dev/md4 level=raid6 num-devices=15 spares=1 UUID=cfcbe071:f6766d8f:0f1ffefa:892d09c3
> > ARRAY /dev/md9 level=raid1 num-devices=2 name=9 UUID=db687150:614e76fd:28feefc0:b1aae572
> >
> > All dm-* devices are really distinct. I could post the
> > /etc/multipath.conf too if you want.

Here is the file:

monosan:~ # cat /etc/multipath.conf
#
# This configuration file is generated by Yast, do not modify it
# manually please.
#
defaults {
    polling_interval        "0"
    user_friendly_names     "yes"
#   path_grouping_policy    "multibus"
}

blacklist {
#   devnode "*"
    wwid "SATA_WDC_WD1600YS-01_WD-WCAP02964085"
    wwid "SATA_WDC_WD1600YS-01_WD-WCAP02965435"
}

blacklist_exceptions {
}

multipaths {
    mutlipath {
        wwid "SATA_ST31000340NS_5QJ02TBK"
    }
    mutlipath {
        wwid "SATA_ST31000340NS_5QJ0185Y"
    }
    mutlipath {
        wwid "SATA_ST31000340NS_5QJ02QRQ"
    }
    mutlipath {
        wwid "SATA_ST31000340NS_5QJ0204G"
    }
    mutlipath {
        wwid "SATA_ST31000340NS_5QJ02TVQ"
    }
    mutlipath {
        wwid "SATA_ST31000340NS_5QJ012AL"
    }
    mutlipath {
        wwid "SATA_ST31000340NS_5QJ00PHN"
    }
    mutlipath {
        wwid "SATA_ST31000340NS_5QJ01BYF"
    }
    mutlipath {
        wwid "SATA_ST31000340NS_5QJ026J1"
    }
    mutlipath {
        wwid "SATA_ST31000340NS_5QJ01G09"
    }
    mutlipath {
        wwid "SATA_ST31000340NS_5QJ02461"
    }
    mutlipath {
        wwid "SATA_ST31000340NS_5QJ013GW"
    }
    mutlipath {
        wwid "SATA_ST31000340NS_5QJ01835"
    }
    mutlipath {
        wwid "SATA_ST31000340NS_5QJ01C49"
    }
    mutlipath {
        wwid "SATA_ST31000340NS_5QJ02TBZ"
    }
    mutlipath {
        wwid "SATA_ST31000340NS_5QJ01JSF"
    }
}

> I only have very little experience with multipathing, but please send
> your config. The fact that you really do use multipathing only confirms
> my initial guess that it is a multipath and not an md problem.

But when I assemble the array after a clean shutdown, after it has been
initially synced, there is no problem. The duplicated superblocks only
show up after the array has been degraded. How come?

Lars
* Re: Reopen: 16 HDDs too much for RAID6?
From: Bernd Schubert @ 2008-03-28 11:20 UTC (permalink / raw)
To: Lars Täuber; +Cc: linux-raid

On Friday 28 March 2008 11:55:37 Lars Täuber wrote:
> Hi Bernd,
>
> Bernd Schubert <bs@q-leap.de> wrote:
> > > Here is the conf file:
> > > monosan:~ # cat /etc/mdadm.conf
> > > DEVICE /dev/sd[ab][0-9] /dev/dm-*
> > > ARRAY /dev/md2 level=raid1 UUID=d9d31de2:e6dbd3c3:37c7ea09:882a64e5
> > > ARRAY /dev/md3 level=raid1 num-devices=2 UUID=a8687183:a79e514c:ca492c4b:ffd4384f
> > > ARRAY /dev/md4 level=raid6 num-devices=15 spares=1 UUID=cfcbe071:f6766d8f:0f1ffefa:892d09c3
> > > ARRAY /dev/md9 level=raid1 num-devices=2 name=9 UUID=db687150:614e76fd:28feefc0:b1aae572
> > >
> > > All dm-* devices are really distinct. I could post the
> > > /etc/multipath.conf too if you want.
>
> Here is the file:
> monosan:~ # cat /etc/multipath.conf
> #
> # This configuration file is generated by Yast, do not modify it
> # manually please.
> #
> defaults {
>     polling_interval        "0"
>     user_friendly_names     "yes"
> #   path_grouping_policy    "multibus"
> }
>
> blacklist {
> #   devnode "*"
>     wwid "SATA_WDC_WD1600YS-01_WD-WCAP02964085"
>     wwid "SATA_WDC_WD1600YS-01_WD-WCAP02965435"
> }

Hmm, maybe you are creating a multipath of a multipath? Here is
something from a config of our systems:

devnode_blacklist {
    devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
}

> blacklist_exceptions {
> }
>
> multipaths {
>     mutlipath {
>         wwid "SATA_ST31000340NS_5QJ02TBK"
>     }

I would set user-friendly names using the alias parameter, something
like this:

multipaths {
    multipath {
        wwid  360050cc000203ffc0000000000000019
        alias raid1a-ost
    }

> > I only have very little experience with multipathing, but please send
> > your config. The fact that you really do use multipathing only confirms
> > my initial guess that it is a multipath and not an md problem.
>
> But when I assemble the array after a clean shutdown, after it has been
> initially synced, there is no problem. The duplicated superblocks only
> show up after the array has been degraded. How come?

No idea so far, but please do some blacklisting. And if you set more
readable names like "/dev/disk8" instead of "dm-8", it might get much
easier to figure out what is wrong.

Cheers,
Bernd

--
Bernd Schubert
Q-Leap Networks GmbH
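A rough sketch of how Bernd's two suggestions could be folded into Lars's existing /etc/multipath.conf; the alias names (disk08, disk09) are made up for illustration, and the devnode regex is the one Bernd quoted from his own systems:

    blacklist {
        # never treat md/dm/loop/etc. nodes as multipath path members
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
        wwid "SATA_WDC_WD1600YS-01_WD-WCAP02964085"
        wwid "SATA_WDC_WD1600YS-01_WD-WCAP02965435"
    }

    multipaths {
        multipath {
            wwid  "SATA_ST31000340NS_5QJ02461"
            alias disk08
        }
        multipath {
            wwid  "SATA_ST31000340NS_5QJ013GW"
            alias disk09
        }
        # ... one stanza per remaining drive ...
    }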
end of thread, other threads: [~2008-03-28 11:20 UTC | newest]

Thread overview: 13+ messages
2008-03-06  9:01 16 HDDs too much for RAID6? Lars Täuber
2008-03-06  9:45 ` Andre Noll
2008-03-06 10:55 ` Lars Täuber
2008-03-06 16:16 ` Andre Noll
2008-03-07  8:41 ` Luca Berra
2008-03-07 10:33 ` Andre Noll
2008-03-07 10:45 ` Lars Täuber
2008-03-28  9:20 ` Reopen: 16 HDDs too much for RAID6? Lars Täuber
2008-03-28 10:14 ` Bernd Schubert
2008-03-28 10:27 ` Lars Täuber
2008-03-28 10:35 ` Bernd Schubert
2008-03-28 10:55 ` Lars Täuber
2008-03-28 11:20 ` Bernd Schubert