* Deleting mdadm RAID arrays
@ 2008-02-05 10:42 Marcin Krol
From: Marcin Krol @ 2008-02-05 10:42 UTC (permalink / raw)
To: linux-raid
Hello everyone,
I have had a problem with a RAID array (udev messed up the disk names; I had the RAID on
whole disks, without RAID partitions) on a Debian Etch server with 6 disks, so I decided
to rearrange things.
I deleted the disks from the (two RAID-5) arrays, deleted the md* devices from /dev,
created /dev/sd[a-f]1 "Linux raid autodetect" partitions and rebooted the host.

Now the mdadm startup script keeps printing, in a loop, a message like "mdadm: warning: /dev/sda1 and
/dev/sdb1 have similar superblocks. If they are not identical, --zero the superblock ...",
and the host can't boot because of this.
If I boot the server with only some of the disks, I can't even zero that superblock:
% mdadm --zero-superblock /dev/sdb1
mdadm: Couldn't open /dev/sdb1 for write - not zeroing
It's the same even after:
% mdadm --manage /dev/md2 --fail /dev/sdb1
mdadm: set /dev/sdb1 faulty in /dev/md2
Now, I have NEVER created a /dev/md2 array, yet it shows up automatically!
% cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid1]
md2 : active(auto-read-only) raid1 sdb1[1]
390708736 blocks [3/1] [_U_]
md1 : inactive sda1[2]
390708736 blocks
unused devices: <none>
Questions:
1. Where does this information about the array reside?! I have deleted /etc/mdadm/mdadm.conf
and the /dev/md devices, and yet it comes seemingly out of nowhere.
2. How can I delete that damn array so it doesn't hang my server in a loop?
--
Marcin Krol

* Re: Deleting mdadm RAID arrays
From: Moshe Yudkowsky @ 2008-02-05 11:43 UTC (permalink / raw)
To: Marcin Krol; +Cc: linux-raid

> 1. Where does this information about the array reside?! I have deleted /etc/mdadm/mdadm.conf
> and the /dev/md devices, and yet it comes seemingly out of nowhere.

The initramfs image in /boot contains a copy of mdadm.conf so that / and the other
arrays can be started and then mounted. update-initramfs will update that copy.

--
Moshe Yudkowsky * moshe@pobox.com * www.pobox.com/~moshe

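A minimal sketch of refreshing that embedded copy on a Debian system (assuming the stock
mdadm and initramfs-tools packages; the mkconf helper path is the Debian default and may
differ elsewhere):

  # regenerate /etc/mdadm/mdadm.conf from the superblocks mdadm finds on disk
  /usr/share/mdadm/mkconf > /etc/mdadm/mdadm.conf

  # rebuild the initramfs so the copy embedded under /boot matches
  update-initramfs -u
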
* Re: Deleting mdadm RAID arrays
From: Marcin Krol @ 2008-02-06 9:35 UTC (permalink / raw)
To: linux-raid

On Tuesday, 05 February 2008 12:43:31, Moshe Yudkowsky wrote:

> > 1. Where does this information about the array reside?! I have deleted /etc/mdadm/mdadm.conf
> > and the /dev/md devices, and yet it comes seemingly out of nowhere.
>
> The initramfs image in /boot contains a copy of mdadm.conf so that / and the other
> arrays can be started and then mounted. update-initramfs will update that copy.

Yeah, I found that out while removing the mdadm package... Thanks for the answers, everyone.

Regards,
Marcin Krol

* Re: Deleting mdadm RAID arrays
From: Janek Kozicki @ 2008-02-05 12:27 UTC (permalink / raw)
To: linux-raid

Marcin Krol said: (by the date of Tue, 5 Feb 2008 11:42:19 +0100)

> 2. How can I delete that damn array so it doesn't hang my server in a loop?

dd if=/dev/zero of=/dev/sdb1 bs=1M count=10

I'm not using mdadm.conf at all. Everything is stored in the superblock of the
device, so if you don't erase it, the information about the RAID array will
still be found automatically.

--
Janek Kozicki

* Re: Deleting mdadm RAID arrays
From: Michael Tokarev @ 2008-02-05 13:52 UTC (permalink / raw)
To: Janek Kozicki; +Cc: linux-raid

Janek Kozicki wrote:
> Marcin Krol said: (by the date of Tue, 5 Feb 2008 11:42:19 +0100)
>
>> 2. How can I delete that damn array so it doesn't hang my server in a loop?
>
> dd if=/dev/zero of=/dev/sdb1 bs=1M count=10

This works provided the superblocks are at the beginning of the component
devices.  Which is not the case by default (0.90 superblocks are at the end
of the components), nor with 1.0 superblocks.

  mdadm --zero-superblock /dev/sdb1

is the way to go here.

> I'm not using mdadm.conf at all. Everything is stored in the superblock of
> the device, so if you don't erase it, the information about the RAID array
> will still be found automatically.

That's wrong, as you need at least something to identify the array
components.  The UUID is the most reliable and commonly used.  You assemble
the arrays as

  mdadm --assemble /dev/md1 --uuid=123456789

or something like that anyway.  If not, your arrays may not start properly in
case you shuffled disks (e.g. replaced a bad one), or your disks were
renumbered after a kernel or other hardware change, and so on.

The most convenient place to store that info is mdadm.conf.  Here, it looks
just like:

  DEVICE partitions
  ARRAY /dev/md1 UUID=4ee58096:e5bc04ac:b02137be:3792981a
  ARRAY /dev/md2 UUID=b4dec03f:24ec8947:1742227c:761aa4cb

By default mdadm offers additional information which helps to diagnose
possible problems, namely:

  ARRAY /dev/md5 level=raid5 num-devices=4 UUID=6dc4e503:85540e55:d935dea5:d63df51b

This extra info isn't necessary for mdadm to work (but the UUID is), yet it
comes in handy sometimes.

/mjt

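If it helps, a quick way to pull those UUIDs off the components themselves (a sketch;
the device names are simply the ones used earlier in the thread):

  # print the array UUID stored in a component's superblock
  mdadm --examine /dev/sdb1 | grep UUID

  # or, for an already-assembled array
  mdadm --detail /dev/md1 | grep UUID

  # or let mdadm emit ready-made ARRAY lines
  mdadm --examine --scan
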
* Re: Deleting mdadm RAID arrays
From: Moshe Yudkowsky @ 2008-02-05 14:33 UTC (permalink / raw)
To: Michael Tokarev; +Cc: Janek Kozicki, linux-raid

Michael Tokarev wrote:

>> dd if=/dev/zero of=/dev/sdb1 bs=1M count=10
>
> This works provided the superblocks are at the beginning of the component
> devices.  Which is not the case by default (0.90 superblocks are at the end
> of the components), nor with 1.0 superblocks.
>
>   mdadm --zero-superblock /dev/sdb1

Would that work even if he doesn't update his mdadm.conf inside the /boot
image?  Or would mdadm attempt to build the array according to the
instructions in mdadm.conf?  I expect that it might depend on whether the
instructions are given in terms of UUIDs or in terms of devices.

--
Moshe Yudkowsky * moshe@pobox.com * www.pobox.com/~moshe
"I think it a greater honour to have my head standing on the ports of this
town for this quarrel, than to have my portrait in the King's bedchamber."
        -- Montrose, 20 May 1650

* Re: Deleting mdadm RAID arrays
From: Michael Tokarev @ 2008-02-05 15:16 UTC (permalink / raw)
To: Moshe Yudkowsky; +Cc: Janek Kozicki, linux-raid

Moshe Yudkowsky wrote:

>>   mdadm --zero-superblock /dev/sdb1
>
> Would that work even if he doesn't update his mdadm.conf inside the /boot
> image?  Or would mdadm attempt to build the array according to the
> instructions in mdadm.conf?  I expect that it might depend on whether the
> instructions are given in terms of UUIDs or in terms of devices.

After zeroing the superblocks, mdadm will NOT assemble the array, regardless
of whether it is using UUIDs or devices or whatever.  In order to assemble
the array, all component devices MUST have valid superblocks, and the
superblocks must match each other.  mdadm --assemble in the initramfs will
simply fail to do its work.

/mjt

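A small sketch of wiping and then verifying each former component (assuming the six
partitions named earlier in the thread; the exact wording of mdadm's "no superblock"
message varies between versions):

  for d in /dev/sd[a-f]1; do
      mdadm --zero-superblock $d
      mdadm --examine $d     # should now report that no md superblock was found
  done
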
* Re: Auto generation of mdadm.conf (was: Deleting mdadm RAID arrays)
From: Janek Kozicki @ 2008-02-05 14:47 UTC (permalink / raw)
To: linux-raid

Michael Tokarev said: (by the date of Tue, 05 Feb 2008 16:52:18 +0300)

> Janek Kozicki wrote:
>> I'm not using mdadm.conf at all.
>
> That's wrong, as you need at least something to identify the array
> components.

I was afraid of that ;-)  So, is this a correct way to automatically generate
a correct mdadm.conf?  I did it after some digging in the man pages:

  echo 'DEVICE partitions' > mdadm.conf
  mdadm --examine --scan --config=mdadm.conf >> ./mdadm.conf

Now, when I do 'cat mdadm.conf' I get:

  DEVICE partitions
  ARRAY /dev/md/0 level=raid1 metadata=1 num-devices=3 UUID=75b0f87879:539d6cee:f22092f4:7a6e6f name='backup':0
  ARRAY /dev/md/2 level=raid1 metadata=1 num-devices=3 UUID=4fd340a6c4:db01d6f7:1e03da2d:bdd574 name=backup:2
  ARRAY /dev/md/1 level=raid5 metadata=1 num-devices=3 UUID=22f22c3599:613d5231:d407a655:bdeb84 name=backup:1

Looks quite reasonable.  Should I append it to /etc/mdadm/mdadm.conf?
This file currently contains (commented lines left out):

  DEVICE partitions
  CREATE owner=root group=disk mode=0660 auto=yes
  HOMEHOST <system>
  MAILADDR root

This is the default content of /etc/mdadm/mdadm.conf on a fresh Debian Etch
install.

best regards
--
Janek Kozicki

* Re: Auto generation of mdadm.conf
From: Michael Tokarev @ 2008-02-05 15:34 UTC (permalink / raw)
To: Janek Kozicki; +Cc: linux-raid

Janek Kozicki wrote:
> I was afraid of that ;-)  So, is this a correct way to automatically generate
> a correct mdadm.conf?  I did it after some digging in the man pages:
>
>   echo 'DEVICE partitions' > mdadm.conf
>   mdadm --examine --scan --config=mdadm.conf >> ./mdadm.conf
>
> Now, when I do 'cat mdadm.conf' I get:
>
>   DEVICE partitions
>   ARRAY /dev/md/0 level=raid1 metadata=1 num-devices=3 UUID=75b0f87879:539d6cee:f22092f4:7a6e6f name='backup':0
>   ARRAY /dev/md/2 level=raid1 metadata=1 num-devices=3 UUID=4fd340a6c4:db01d6f7:1e03da2d:bdd574 name=backup:2
>   ARRAY /dev/md/1 level=raid5 metadata=1 num-devices=3 UUID=22f22c3599:613d5231:d407a655:bdeb84 name=backup:1

Hmm.  I wonder why the name for md/0 is in quotes, while the others are not.

> Looks quite reasonable.  Should I append it to /etc/mdadm/mdadm.conf?

Probably... see below.

> This file currently contains (commented lines left out):
>
>   DEVICE partitions
>   CREATE owner=root group=disk mode=0660 auto=yes
>   HOMEHOST <system>
>   MAILADDR root
>
> This is the default content of /etc/mdadm/mdadm.conf on a fresh Debian Etch
> install.

But now I wonder HOW your arrays get assembled in the first place.  Let me
guess... mdrun?  Or maybe in-kernel auto-detection?

The thing is that mdadm will NOT assemble your arrays given this config.

If you have your disk/controller and md drivers built into the kernel, AND
marked the partitions as "Linux raid autodetect", the kernel may assemble
them right at boot.  But I don't remember if the kernel will even consider
v1 superblocks for its auto-assembly.  In any case, don't rely on the kernel
to do this work: the in-kernel assembly code is very simplistic and works
only up to the moment when anything changes or breaks.  It's almost the same
code as was in the old raidtools...

Another possibility is the mdrun utility (a shell script) shipped with
Debian's mdadm package.  It's deprecated now, but still provided for
compatibility.  mdrun is even worse: it will try to assemble ALL arrays
found, giving them random names and numbers, not handling failures
correctly, and failing badly if, for example, a "foreign" disk is found
which happens to contain a valid RAID superblock somewhere...

Well, there's a third possibility: mdadm can assemble all arrays
automatically (even if not listed explicitly in mdadm.conf) using the
homehost (only available with v1 superblocks).  I haven't tried this option
yet, so I don't remember how it works.  From the mdadm(8) manpage:

   Auto Assembly
      When --assemble is used with --scan and no devices are listed, mdadm
      will first attempt to assemble all the arrays listed in the config
      file.

      If a homehost has been specified (either in the config file or on the
      command line), mdadm will look further for possible arrays and will
      try to assemble anything that it finds which is tagged as belonging
      to the given homehost.  This is the only situation where mdadm will
      assemble arrays without being given specific device name or identity
      information for the array.

      If mdadm finds a consistent set of devices that look like they should
      comprise an array, and if the superblock is tagged as belonging to
      the given home host, it will automatically choose a device name and
      try to assemble the array.  If the array uses version-0.90 metadata,
      then the minor number as recorded in the superblock is used to create
      a name in /dev/md/, so for example /dev/md/3.  If the array uses
      version-1 metadata, then the name from the superblock is used to
      similarly create a name in /dev/md (the name will have any 'host'
      prefix stripped first).

So... probably this is the way your arrays are being assembled, since you do
have HOMEHOST in your mdadm.conf.  Looks like it should work after all... ;)
And in that case there's no need to specify additional array information in
the config file.

/mjt

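A short sketch of what that homehost-based auto-assembly looks like in practice (hedged:
the option names come from the manpage quoted above, but the exact behaviour depends on
the mdadm version, and the device names and "backup" homehost are only examples):

  # tag the array with a homehost at creation time (version-1 metadata)
  mdadm --create /dev/md/1 --level=5 --raid-devices=3 --metadata=1.0 \
        --homehost=backup /dev/sda1 /dev/sdb1 /dev/sdc1

  # later, with only "HOMEHOST <system>" (or the matching name) in mdadm.conf,
  # this picks up and assembles every array tagged as belonging to this host
  mdadm --assemble --scan
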
* Re: Auto generation of mdadm.conf
From: Janek Kozicki @ 2008-02-05 18:39 UTC (permalink / raw)
Cc: linux-raid

Michael Tokarev said: (by the date of Tue, 05 Feb 2008 18:34:47 +0300)

<...>
> So... probably this is the way your arrays are being assembled, since you do
> have HOMEHOST in your mdadm.conf.  Looks like it should work after all... ;)
> And in that case there's no need to specify additional array information in
> the config file.

Whew, that was a long read.  Thanks for the detailed analysis.  I hope that
your conclusion is correct, since I have no way to decide this by myself --
my knowledge is not enough here :)

best regards
--
Janek Kozicki

* Re: Deleting mdadm RAID arrays
From: Neil Brown @ 2008-02-05 20:12 UTC (permalink / raw)
To: Marcin Krol; +Cc: linux-raid

On Tuesday February 5, admin@domeny.pl wrote:
>
> % mdadm --zero-superblock /dev/sdb1
> mdadm: Couldn't open /dev/sdb1 for write - not zeroing

That's weird.  Why can't it open it?

Maybe you aren't running as root (the '%' prompt is suspicious).

Maybe the kernel has been told to forget about the partitions of /dev/sdb.
mdadm will sometimes tell it to do that, but only if you try to assemble
arrays out of whole components.

If that is the problem, then

  blockdev --rereadpt /dev/sdb

will fix it.

NeilBrown

* Re: Deleting mdadm RAID arrays
From: Marcin Krol @ 2008-02-06 9:55 UTC (permalink / raw)
To: linux-raid

On Tuesday, 05 February 2008 21:12:32, Neil Brown wrote:

> > % mdadm --zero-superblock /dev/sdb1
> > mdadm: Couldn't open /dev/sdb1 for write - not zeroing
>
> That's weird.  Why can't it open it?

Hell if I know.  First time I've seen such a thing.

> Maybe you aren't running as root (the '%' prompt is suspicious).

I am running as root; the "%" prompt is the obfuscation part (I have
configured bash to display the IP as part of the prompt).

> Maybe the kernel has been told to forget about the partitions of /dev/sdb.

But fdisk/cfdisk has no problem whatsoever finding the partitions.

> mdadm will sometimes tell it to do that, but only if you try to assemble
> arrays out of whole components.
>
> If that is the problem, then
>
>   blockdev --rereadpt /dev/sdb

I deleted the LVM devices that were sitting on top of the RAID and
reinstalled mdadm.

% blockdev --rereadpt /dev/sdf
BLKRRPART: Device or resource busy

% mdadm /dev/md2 --fail /dev/sdf1
mdadm: set /dev/sdf1 faulty in /dev/md2

% blockdev --rereadpt /dev/sdf
BLKRRPART: Device or resource busy

% mdadm /dev/md2 --remove /dev/sdf1
mdadm: hot remove failed for /dev/sdf1: Device or resource busy

lsof /dev/sdf1 gives ZERO results.

arrrRRRGH

Regards,
Marcin Krol

* Re: Deleting mdadm RAID arrays
From: Peter Rabbitson @ 2008-02-06 10:11 UTC (permalink / raw)
To: Marcin Krol; +Cc: linux-raid

Marcin Krol wrote:

> % mdadm /dev/md2 --remove /dev/sdf1
> mdadm: hot remove failed for /dev/sdf1: Device or resource busy
>
> lsof /dev/sdf1 gives ZERO results.

What does this say:

  dmsetup table

* Re: Deleting mdadm RAID arrays
From: Marcin Krol @ 2008-02-06 10:32 UTC (permalink / raw)
To: linux-raid

On Wednesday, 06 February 2008 11:11:51, Peter Rabbitson wrote:

> > lsof /dev/sdf1 gives ZERO results.
>
> What does this say:
>
>   dmsetup table

% dmsetup table
vg-home: 0 614400000 linear 9:2 384

Regards,
Marcin Krol

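That dmsetup line is the clue: "9:2" is major 9, minor 2, i.e. /dev/md2, so the vg-home
logical volume is still mapped on top of the array and keeps it busy.  A hedged sketch
of chasing and releasing such a holder (the volume group name "vg" is only a guess from
the "vg-home" mapping name; adjust to the real names):

  # what is stacked on top of the md device?
  ls /sys/block/md2/holders/
  dmsetup table | grep ' 9:'

  # deactivate the volume group that uses it, removing any leftover mapping
  vgchange -an vg
  dmsetup remove vg-home

  # now the array can be stopped and its member disks released
  mdadm --stop /dev/md2
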
* Re: Deleting mdadm RAID arrays
From: Neil Brown @ 2008-02-06 10:43 UTC (permalink / raw)
To: Marcin Krol; +Cc: linux-raid

On Wednesday February 6, admin@domeny.pl wrote:

> > Maybe the kernel has been told to forget about the partitions of /dev/sdb.
>
> But fdisk/cfdisk has no problem whatsoever finding the partitions.

It is looking at the partition table on disk, not at the kernel's idea of
the partitions, which is initialised from that table...

What does

  cat /proc/partitions

say?

> I deleted the LVM devices that were sitting on top of the RAID and
> reinstalled mdadm.
>
> % blockdev --rereadpt /dev/sdf
> BLKRRPART: Device or resource busy

This implies that some partition is in use.

> % mdadm /dev/md2 --fail /dev/sdf1
> mdadm: set /dev/sdf1 faulty in /dev/md2
>
> % blockdev --rereadpt /dev/sdf
> BLKRRPART: Device or resource busy
>
> % mdadm /dev/md2 --remove /dev/sdf1
> mdadm: hot remove failed for /dev/sdf1: Device or resource busy

OK, that's weird.  If sdf1 is faulty, then you should be able to remove it.

What do

  cat /proc/mdstat
  dmesg | tail

say at this point?

NeilBrown

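A sketch of checking the two views Neil distinguishes here, on-disk versus in-kernel
(plain standard tools, nothing specific to this setup):

  # the table as written on disk
  fdisk -l /dev/sdb

  # the kernel's current idea of the partitions
  grep sdb /proc/partitions

  # if they disagree and nothing holds the disk open, ask the kernel to re-read it
  blockdev --rereadpt /dev/sdb
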
* Re: Deleting mdadm RAID arrays
From: Marcin Krol @ 2008-02-06 12:03 UTC (permalink / raw)
To: linux-raid

On Wednesday, 06 February 2008 11:43:12, Neil Brown wrote:

> > But fdisk/cfdisk has no problem whatsoever finding the partitions.
>
> It is looking at the partition table on disk, not at the kernel's idea of
> the partitions, which is initialised from that table...

Aha!  Thanks for this bit.  I get it now.

> What does
>   cat /proc/partitions
> say?

Note: I have reconfigured udev now to associate device names with serial
numbers (below).

% cat /proc/partitions
major minor  #blocks  name

   8     0  390711384 sda
   8     1  390708801 sda1
   8    16  390711384 sdb
   8    17  390708801 sdb1
   8    32  390711384 sdc
   8    33  390708801 sdc1
   8    48  390710327 sdd
   8    49  390708801 sdd1
   8    64  390711384 sde
   8    65  390708801 sde1
   8    80  390711384 sdf
   8    81  390708801 sdf1
   3    64   78150744 hdb
   3    65    1951866 hdb1
   3    66    7815622 hdb2
   3    67    4883760 hdb3
   3    68          1 hdb4
   3    69     979933 hdb5
   3    70     979933 hdb6
   3    71   61536951 hdb7
   9     1  781417472 md1
   9     0  781417472 md0

/dev/disk/by-id % ls -l
total 0
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-ST380023A_3KB0MV22 -> ../../hdb
lrwxrwxrwx 1 root root 10 2008-02-06 13:34 ata-ST380023A_3KB0MV22-part1 -> ../../hdb1
lrwxrwxrwx 1 root root 10 2008-02-06 13:34 ata-ST380023A_3KB0MV22-part2 -> ../../hdb2
lrwxrwxrwx 1 root root 10 2008-02-06 13:34 ata-ST380023A_3KB0MV22-part3 -> ../../hdb3
lrwxrwxrwx 1 root root 10 2008-02-06 13:34 ata-ST380023A_3KB0MV22-part4 -> ../../hdb4
lrwxrwxrwx 1 root root 10 2008-02-06 13:34 ata-ST380023A_3KB0MV22-part5 -> ../../hdb5
lrwxrwxrwx 1 root root 10 2008-02-06 13:34 ata-ST380023A_3KB0MV22-part6 -> ../../hdb6
lrwxrwxrwx 1 root root 10 2008-02-06 13:34 ata-ST380023A_3KB0MV22-part7 -> ../../hdb7
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1696130 -> ../../d_6
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1696130-part1 -> ../../d_6
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1707974 -> ../../d_5
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1707974-part1 -> ../../d_5
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1795228 -> ../../d_1
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1795228-part1 -> ../../d_1
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1795364 -> ../../d_3
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1795364-part1 -> ../../d_3
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1798692 -> ../../d_2
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1798692-part1 -> ../../d_2
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1800255 -> ../../d_4
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1800255-part1 -> ../../d_4
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1696130 -> ../../d_6
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1696130-part1 -> ../../d_6
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1707974 -> ../../d_5
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1707974-part1 -> ../../d_5
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1795228 -> ../../d_1
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1795228-part1 -> ../../d_1
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1795364 -> ../../d_3
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1795364-part1 -> ../../d_3
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1798692 -> ../../d_2
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1798692-part1 -> ../../d_2
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1800255 -> ../../d_4
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1800255-part1 -> ../../d_4

I have no idea why udev can't allocate /dev/d_1p1 to partition 1 on disk d_1.
I have explicitly asked it to do that:

/etc/udev/rules.d % cat z24_disks_domeny.rules
KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1795228", NAME="d_1"
KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1795228-part1", NAME="d_1p1"
KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1798692", NAME="d_2"
KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1798692-part1", NAME="d_2p1"
KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1795364", NAME="d_3"
KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1795364-part1", NAME="d_3p1"
KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1800255", NAME="d_4"
KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1800255-part1", NAME="d_4p1"
KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1707974", NAME="d_5"
KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1707974-part1", NAME="d_5p1"
KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1696130", NAME="d_6"
KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1696130-part1", NAME="d_6p1"

/etc/udev/rules.d % cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active(auto-read-only) raid5 sdc1[0] sde1[3](S) sdd1[1]
      781417472 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]

md1 : active(auto-read-only) raid5 sdf1[0] sdb1[3](S) sda1[1]
      781417472 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]

md0 consists of sdc1, sde1 and sdd1 even though, when creating it, I asked it
to use d_1, d_2 and d_3 (this is probably written on the particular
disk/partition itself, but I have no idea how to clean this up --
mdadm --zero-superblock /dev/d_1 again produces "mdadm: Couldn't open
/dev/d_1 for write - not zeroing").

/etc/mdadm % mdadm -Q --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Wed Feb  6 12:24:49 2008
     Raid Level : raid5
     Array Size : 781417472 (745.22 GiB 800.17 GB)
  Used Dev Size : 390708736 (372.61 GiB 400.09 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Wed Feb  6 12:34:00 2008
          State : clean, degraded
 Active Devices : 2
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : f83e3541:b5b63f10:a6d4720f:52a5051f
         Events : 0.14

    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/d_1
       1       8       49        1      active sync   /dev/d_2
       2       0        0        2      removed

       3       8       65        -      spare   /dev/d_3

--
Marcin Krol

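The partition rules above most likely never match, because ID_SERIAL_SHORT is the bare
drive serial: udev does not append a "-partN" suffix to that property (the suffix only
appears in the by-id link names).  A possible rewrite -- untested, and assuming the
Etch-era udev still honours NAME= assignments and the %n substitution -- matches the
partitions separately and takes the number from the kernel device name:

  # the whole disk: kernel names without a trailing digit
  KERNEL=="sd*[!0-9]", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1795228", NAME="d_1"
  # its partitions: same serial, partition number taken from the kernel device (%n)
  KERNEL=="sd*[0-9]",  SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1795228", NAME="d_1p%n"
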
* Re: Deleting mdadm RAID arrays
From: Neil Brown @ 2008-02-07 2:36 UTC (permalink / raw)
To: Marcin Krol; +Cc: linux-raid

On Wednesday February 6, admin@domeny.pl wrote:
>
> % cat /proc/partitions
> major minor  #blocks  name
>
>    8     0  390711384 sda
>    8     1  390708801 sda1
> [...]
>    9     1  781417472 md1
>    9     0  781417472 md0

So all the expected partitions are known to the kernel -- good.

> /etc/udev/rules.d % cat /proc/mdstat
> Personalities : [raid1] [raid6] [raid5] [raid4]
> md0 : active(auto-read-only) raid5 sdc1[0] sde1[3](S) sdd1[1]
>       781417472 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]
>
> md1 : active(auto-read-only) raid5 sdf1[0] sdb1[3](S) sda1[1]
>       781417472 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]
>
> md0 consists of sdc1, sde1 and sdd1 even though, when creating it, I asked it
> to use d_1, d_2 and d_3 (this is probably written on the particular
> disk/partition itself, but I have no idea how to clean this up --
> mdadm --zero-superblock /dev/d_1 again produces "mdadm: Couldn't open
> /dev/d_1 for write - not zeroing").

I suspect it is related to the "(auto-read-only)".  The array is degraded and
has a spare, so it wants to do a recovery to the spare.  But it won't start
the recovery until the array is not read-only.  But the recovery process has
partly started (you'll see an md1_resync thread), so it won't let go of any
failed devices at the moment.

If you run

  mdadm -w /dev/md0

the recovery will start.  Then

  mdadm /dev/md0 -f /dev/d_1

will fail d_1, abort the recovery, and release d_1.  Then

  mdadm --zero-superblock /dev/d_1

should work.

It is currently failing with EBUSY: --zero-superblock opens the device with
O_EXCL to ensure that it isn't currently in use, and as long as it is part of
an md array, O_EXCL will fail.  I should make that more explicit in the error
message.

NeilBrown

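Putting Neil's steps together, the whole release-and-wipe sequence looks roughly like
this (a sketch only; /dev/d_1 stands for whichever member is being pulled out):

  mdadm -w /dev/md0                    # drop auto-read-only so the pending recovery starts
  mdadm /dev/md0 --fail /dev/d_1       # failing the member aborts the recovery and releases it
  mdadm /dev/md0 --remove /dev/d_1     # hot-remove it from the array
  mdadm --zero-superblock /dev/d_1     # now the O_EXCL open succeeds and the superblock is wiped
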
* Re: Deleting mdadm RAID arrays
From: Marcin Krol @ 2008-02-07 9:56 UTC (permalink / raw)
To: linux-raid

On Thursday, 07 February 2008 03:36:31, Neil Brown wrote:

> >    8     0  390711384 sda
> >    8     1  390708801 sda1
> > [...]
> >    9     1  781417472 md1
> >    9     0  781417472 md0
>
> So all the expected partitions are known to the kernel -- good.

It's not good, really!!

I can't trust the /dev/sd* devices -- they get swapped randomly depending on
the sequence of module loading!!  I have two drivers, ahci for the onboard
SATA controllers and sata_sil for the additional controller.

Sometimes the system loads ahci first and sata_sil later, sometimes in the
reverse sequence.  Then sda becomes sdc, sdb becomes sdd, etc.

This is exactly the problem: I cannot rely on the kernel's information about
which physical drive is which logical drive!

> Then
>
>   mdadm /dev/md0 -f /dev/d_1
>
> will fail d_1, abort the recovery, and release d_1.  Then
>
>   mdadm --zero-superblock /dev/d_1
>
> should work.

Thanks, though I managed to fail the drives, remove them, zero the
superblocks and reassemble the arrays anyway.

The problem I have now is that mdadm seems to be of 'two minds' when it comes
to where it gets the info on which disk is what part of the array.

As you may remember, I have configured udev to associate the /dev/d_* devices
with serial numbers (to keep them from changing depending on the boot module
loading sequence).

Now, when I swap two (random) drives in order to test whether it keeps the
device names associated with the serial numbers, I get the following effect:

1. mdadm -Q --detail /dev/md* gives correct results before *and* after the
swapping:

% mdadm -Q --detail /dev/md0
/dev/md0:
[...]
    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/d_1
       1       8       17        1      active sync   /dev/d_2
       2       8       81        2      active sync   /dev/d_3

% mdadm -Q --detail /dev/md1
/dev/md1:
[...]
    Number   Major   Minor   RaidDevice State
       0       8       49        0      active sync   /dev/d_4
       1       8       65        1      active sync   /dev/d_5
       2       8       33        2      active sync   /dev/d_6

2. However, cat /proc/mdstat shows a different layout of the arrays!

BEFORE the swap:

% cat mdstat-16_51
Personalities : [raid6] [raid5] [raid4]
md1 : active raid5 sdb1[2] sdf1[0] sda1[1]
      781417472 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

md0 : active raid5 sde1[2] sdc1[0] sdd1[1]
      781417472 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

unused devices: <none>

AFTER the swap:

% cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md1 : active(auto-read-only) raid5 sdd1[0] sdc1[2] sde1[1]
      781417472 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

md0 : active(auto-read-only) raid5 sda1[0] sdf1[2] sdb1[1]
      781417472 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

unused devices: <none>

I have no idea now if the array is functioning (it keeps the drives according
to the /dev/d_* devices and the superblock info is unimportant) or if my
arrays fell apart because of that swapping.

And I made *damn* sure I zeroed all the superblocks before reassembling the
arrays.  Yet it still shows the old partitions on those arrays!

Here's the current mdadm -E information, if it might help with diagnosing
this:

% mdadm -E /dev/d_1
/dev/d_1:
  Magic : a92b4efc
  Version : 00.90.00
  UUID : dc150d95:d1aea7bc:a6d4720f:52a5051f
  Creation Time : Wed Feb  6 13:44:00 2008
  Raid Level : raid5
  Used Dev Size : 390708736 (372.61 GiB 400.09 GB)
  Array Size : 781417472 (745.22 GiB 800.17 GB)
  Raid Devices : 3
  Total Devices : 3
  Preferred Minor : 0
  Update Time : Wed Feb  6 20:23:33 2008
  State : clean
  Active Devices : 3
  Working Devices : 3
  Failed Devices : 0
  Spare Devices : 0
  Checksum : f706efc3 - correct
  Events : 0.16
  Layout : left-symmetric
  Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       33        0      active sync   /dev/d_6
   0     0       8       33        0      active sync   /dev/d_6
   1     1       8       49        1      active sync   /dev/d_4
   2     2       8       65        2      active sync   /dev/d_5

b1 (192.168.1.235) ~ % mdadm -E /dev/d_2
/dev/d_2:
  Magic : a92b4efc
  Version : 00.90.00
  UUID : dc150d95:d1aea7bc:a6d4720f:52a5051f
  Creation Time : Wed Feb  6 13:44:00 2008
  Raid Level : raid5
  Used Dev Size : 390708736 (372.61 GiB 400.09 GB)
  Array Size : 781417472 (745.22 GiB 800.17 GB)
  Raid Devices : 3
  Total Devices : 3
  Preferred Minor : 0
  Update Time : Wed Feb  6 20:23:33 2008
  State : clean
  Active Devices : 3
  Working Devices : 3
  Failed Devices : 0
  Spare Devices : 0
  Checksum : f706efd5 - correct
  Events : 0.16
  Layout : left-symmetric
  Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       49        1      active sync   /dev/d_4
   0     0       8       33        0      active sync   /dev/d_6
   1     1       8       49        1      active sync   /dev/d_4
   2     2       8       65        2      active sync   /dev/d_5

b1 (192.168.1.235) ~ % mdadm -E /dev/d_3
/dev/d_3:
  Magic : a92b4efc
  Version : 00.90.00
  UUID : dc150d95:d1aea7bc:a6d4720f:52a5051f
  Creation Time : Wed Feb  6 13:44:00 2008
  Raid Level : raid5
  Used Dev Size : 390708736 (372.61 GiB 400.09 GB)
  Array Size : 781417472 (745.22 GiB 800.17 GB)
  Raid Devices : 3
  Total Devices : 3
  Preferred Minor : 0
  Update Time : Wed Feb  6 20:23:33 2008
  State : clean
  Active Devices : 3
  Working Devices : 3
  Failed Devices : 0
  Spare Devices : 0
  Checksum : f706efe7 - correct
  Events : 0.16
  Layout : left-symmetric
  Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8       65        2      active sync   /dev/d_5
   0     0       8       33        0      active sync   /dev/d_6
   1     1       8       49        1      active sync   /dev/d_4
   2     2       8       65        2      active sync   /dev/d_5

b1 (192.168.1.235) ~ % mdadm -E /dev/d_4
/dev/d_4:
  Magic : a92b4efc
  Version : 00.90.00
  UUID : 0ccf5692:82985f35:a6d4720f:52a5051f
  Creation Time : Wed Feb  6 13:43:24 2008
  Raid Level : raid5
  Used Dev Size : 390708736 (372.61 GiB 400.09 GB)
  Array Size : 781417472 (745.22 GiB 800.17 GB)
  Raid Devices : 3
  Total Devices : 3
  Preferred Minor : 1
  Update Time : Wed Feb  6 20:23:40 2008
  State : clean
  Active Devices : 3
  Working Devices : 3
  Failed Devices : 0
  Spare Devices : 0
  Checksum : d8aaf014 - correct
  Events : 0.12
  Layout : left-symmetric
  Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       81        0      active sync   /dev/d_3
   0     0       8       81        0      active sync   /dev/d_3
   1     1       8        1        1      active sync   /dev/d_1
   2     2       8       17        2      active sync   /dev/d_2

b1 (192.168.1.235) ~ % mdadm -E /dev/d_5
/dev/d_5:
  Magic : a92b4efc
  Version : 00.90.00
  UUID : 0ccf5692:82985f35:a6d4720f:52a5051f
  Creation Time : Wed Feb  6 13:43:24 2008
  Raid Level : raid5
  Used Dev Size : 390708736 (372.61 GiB 400.09 GB)
  Array Size : 781417472 (745.22 GiB 800.17 GB)
  Raid Devices : 3
  Total Devices : 3
  Preferred Minor : 1
  Update Time : Wed Feb  6 20:23:40 2008
  State : clean
  Active Devices : 3
  Working Devices : 3
  Failed Devices : 0
  Spare Devices : 0
  Checksum : d8aaefc6 - correct
  Events : 0.12
  Layout : left-symmetric
  Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8        1        1      active sync   /dev/d_1
   0     0       8       81        0      active sync   /dev/d_3
   1     1       8        1        1      active sync   /dev/d_1
   2     2       8       17        2      active sync   /dev/d_2

b1 (192.168.1.235) ~ % mdadm -E /dev/d_6
/dev/d_6:
  Magic : a92b4efc
  Version : 00.90.00
  UUID : 0ccf5692:82985f35:a6d4720f:52a5051f
  Creation Time : Wed Feb  6 13:43:24 2008
  Raid Level : raid5
  Used Dev Size : 390708736 (372.61 GiB 400.09 GB)
  Array Size : 781417472 (745.22 GiB 800.17 GB)
  Raid Devices : 3
  Total Devices : 3
  Preferred Minor : 1
  Update Time : Wed Feb  6 20:23:40 2008
  State : clean
  Active Devices : 3
  Working Devices : 3
  Failed Devices : 0
  Spare Devices : 0
  Checksum : d8aaefd8 - correct
  Events : 0.12
  Layout : left-symmetric
  Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8       17        2      active sync   /dev/d_2
   0     0       8       81        0      active sync   /dev/d_3
   1     1       8        1        1      active sync   /dev/d_1
   2     2       8       17        2      active sync   /dev/d_2

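One way to reassure yourself that the arrays survived the disk swap, independent of the
shifting /dev/sd* names, is to cross-check the UUIDs already shown above (a sketch;
device names as used in this thread):

  # every member of an array carries the same array UUID in its superblock
  mdadm --detail /dev/md0 | grep UUID
  for d in /dev/d_1 /dev/d_2 /dev/d_3; do
      mdadm --examine $d | grep UUID
  done
  # the device names printed by --examine are resolved from major:minor at display
  # time and are informational only; assembly matches on the UUID, not on names
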
* Re: Deleting mdadm RAID arrays
From: Bill Davidsen @ 2008-02-07 21:35 UTC (permalink / raw)
To: Marcin Krol; +Cc: linux-raid

Marcin Krol wrote:

> As you may remember, I have configured udev to associate the /dev/d_* devices
> with serial numbers (to keep them from changing depending on the boot module
> loading sequence).

Why do you care?  If you are using UUIDs for all the arrays and mounts, does
this buy you anything?  And more to the point, the first time a drive fails
and you replace it, will it cause you a problem?  Require maintaining the
serial-to-name data manually?

I miss the benefit of forcing this instead of just building the information
at boot time and dropping it in a file.

> And I made *damn* sure I zeroed all the superblocks before reassembling the
> arrays.  Yet it still shows the old partitions on those arrays!

As I noted before, you said you had these on whole devices before; did you
zero the superblocks on the whole devices or on the partitions?  From what I
read, it was the partitions.

--
Bill Davidsen <davidsen@tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismark

* Re: Deleting mdadm RAID arrays
From: Marcin Krol @ 2008-02-08 9:35 UTC (permalink / raw)
To: linux-raid

On Thursday, 07 February 2008 22:35:45, Bill Davidsen wrote:

> > As you may remember, I have configured udev to associate the /dev/d_* devices
> > with serial numbers (to keep them from changing depending on the boot module
> > loading sequence).
>
> Why do you care?

Because the /dev/sd* devices get swapped randomly depending on the boot
module insertion sequence, as I explained earlier.

> If you are using UUIDs for all the arrays and mounts, does this buy you
> anything?

This is exactly what is not clear to me: what is it that identifies a
drive/partition as part of the array?  The /dev/sd* name?  The UUID in the
superblock?  /dev/d_n?

If it's the UUID, I should be safe regardless of the /dev/sd* designation?
Yes or no?

> And more to the point, the first time a drive fails and you replace it,
> will it cause you a problem?  Require maintaining the serial-to-name data
> manually?

That's not the problem.  I just want my array to be intact.

> I miss the benefit of forcing this instead of just building the information
> at boot time and dropping it in a file.

I would prefer that too -- if it worked.  I was getting both arrays messed up
randomly on boot -- "messed up" in the sense of the arrays being composed of
different /dev/sd* devices.

> > And I made *damn* sure I zeroed all the superblocks before reassembling the
> > arrays.  Yet it still shows the old partitions on those arrays!
>
> As I noted before, you said you had these on whole devices before; did you
> zero the superblocks on the whole devices or on the partitions?  From what I
> read, it was the partitions.

I tried it both ways, actually (I rebuilt the arrays a few times; udev just
didn't want to associate WD-serialnumber-part1 with /dev/d_1p1 as it was
told -- it still claimed it was /dev/d_1).

Regards,
Marcin Krol

* Re: Deleting mdadm RAID arrays
From: Bill Davidsen @ 2008-02-08 12:44 UTC (permalink / raw)
To: Marcin Krol; +Cc: linux-raid

Marcin Krol wrote:

>> Why do you care?
>
> Because the /dev/sd* devices get swapped randomly depending on the boot
> module insertion sequence, as I explained earlier.

So there's no functional problem, just a cosmetic one?

>> If you are using UUIDs for all the arrays and mounts, does this buy you
>> anything?
>
> This is exactly what is not clear to me: what is it that identifies a
> drive/partition as part of the array?  The /dev/sd* name?  The UUID in the
> superblock?  /dev/d_n?
>
> If it's the UUID, I should be safe regardless of the /dev/sd* designation?
> Yes or no?

Yes, absolutely.

> I would prefer that too -- if it worked.  I was getting both arrays messed up
> randomly on boot -- "messed up" in the sense of the arrays being composed of
> different /dev/sd* devices.

Different devices?  Or just different names for the same devices?  I assume
just the names change, and I still don't see why you care... subtle beyond my
understanding.

> I tried it both ways, actually (I rebuilt the arrays a few times; udev just
> didn't want to associate WD-serialnumber-part1 with /dev/d_1p1 as it was
> told -- it still claimed it was /dev/d_1).

I'm not talking about building the array, but about zeroing the superblocks.
Did you use the partition name, /dev/sdb1, when you ran mdadm with
"--zero-superblock", or did you zero the whole device, /dev/sdb, which is
what you were using when you first built the array with whole devices?  If
you didn't zero the superblock for the whole device, that may explain why a
superblock is still found.

--
Bill Davidsen <davidsen@tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismark

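A sketch of what "using UUIDs for all the arrays and mounts" can look like, so nothing
depends on /dev/sd* ordering (the array UUID is the one quoted earlier in the thread;
the fstab line is an illustrative assumption with a placeholder filesystem UUID):

  # /etc/mdadm/mdadm.conf -- identify the array by UUID, not by member names
  DEVICE partitions
  ARRAY /dev/md0 UUID=f83e3541:b5b63f10:a6d4720f:52a5051f

  # /etc/fstab -- mount the filesystem by its own UUID, as reported by blkid(8)
  UUID=<filesystem-uuid-from-blkid>  /home  ext3  defaults  0  2
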
* Re: Deleting mdadm RAID arrays
From: Marcin Krol @ 2008-02-08 12:52 UTC (permalink / raw)
To: linux-raid

On Friday, 08 February 2008 13:44:18, Bill Davidsen wrote:

> > If it's the UUID, I should be safe regardless of the /dev/sd* designation?
> > Yes or no?
>
> Yes, absolutely.

OK, that's what I needed to know.

Regards,
Marcin Krol

* Re: Deleting mdadm RAID arrays
From: Bill Davidsen @ 2008-02-06 19:03 UTC (permalink / raw)
To: Neil Brown; +Cc: Marcin Krol, linux-raid

Neil Brown wrote:
> On Tuesday February 5, admin@domeny.pl wrote:
>
>> % mdadm --zero-superblock /dev/sdb1
>> mdadm: Couldn't open /dev/sdb1 for write - not zeroing
>
> That's weird.  Why can't it open it?

I suspect that (a) he's not root and has read-only access to the device
(I have group read for certain groups, too).

And since he had the arrays on raw devices, shouldn't he zero the superblocks
using the whole device as well?  Depending on the superblock type, it might
not be found otherwise.  It sure can't hurt to zero all the superblocks on
the whole devices, then check the partitions to see if any are still present,
then create the array again with --force and be really sure the superblock is
present and sane.

> Maybe you aren't running as root (the '%' prompt is suspicious).
> Maybe the kernel has been told to forget about the partitions of /dev/sdb.
> mdadm will sometimes tell it to do that, but only if you try to assemble
> arrays out of whole components.
>
> If that is the problem, then
>
>   blockdev --rereadpt /dev/sdb
>
> will fix it.

--
Bill Davidsen <davidsen@tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismark

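A sketch of the belt-and-braces wipe Bill is suggesting, covering both the old
whole-disk superblocks and the new partition ones (device names as used earlier in the
thread; destructive, so only run it on members you really mean to scrap):

  for disk in /dev/sd[a-f]; do
      mdadm --zero-superblock $disk      # old arrays were built on the whole disks
      mdadm --zero-superblock ${disk}1   # new arrays live on the first partition
  done
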
* Re: Deleting mdadm RAID arrays
From: David Greaves @ 2008-02-06 11:22 UTC (permalink / raw)
To: Marcin Krol; +Cc: linux-raid, Neil Brown

Marcin Krol wrote:
> Hello everyone,
>
> I have had a problem with a RAID array (udev messed up the disk names; I had
> the RAID on whole disks, without RAID partitions)

Do you mean that you originally used /dev/sdb for the RAID array, and now you
are using /dev/sdb1?

Given that the system seems confused, I wonder if this may be relevant.

David

* Re: Deleting mdadm RAID arrays
From: Marcin Krol @ 2008-02-06 11:56 UTC (permalink / raw)
To: linux-raid

On Wednesday, 06 February 2008 12:22:00, David Greaves wrote:

> > I have had a problem with a RAID array (udev messed up the disk names; I had
> > the RAID on whole disks, without RAID partitions)
>
> Do you mean that you originally used /dev/sdb for the RAID array, and now you
> are using /dev/sdb1?

That's reconfigured now, so it doesn't matter (I started the host in
single-user mode and created partitions, as opposed to running RAID on whole
disks as before).

> Given that the system seems confused, I wonder if this may be relevant.

I don't think so.  I tried most mdadm operations (fail, remove, etc.) on
disks (like sdb) and on partitions (like sdb1) and get identical messages for
either.

--
Marcin Krol