* Some md/mdadm bugs
@ 2012-02-02 19:08 Asdo
  2012-02-02 21:17 ` NeilBrown
  0 siblings, 1 reply; 13+ messages in thread
From: Asdo @ 2012-02-02 19:08 UTC (permalink / raw)
  To: linux-raid

Hello list

I removed sda from the system and I confirmed /dev/sda did not exist any
more.
After some time an I/O was issued to the array and sda6 was failed by MD
in /dev/md5:

md5 : active raid1 sdb6[2] sda6[0](F)
      10485688 blocks super 1.0 [2/1] [_U]
      bitmap: 1/160 pages [4KB], 32KB chunk

At this point I tried:

mdadm /dev/md5 --remove detached
--> no effect !
mdadm /dev/md5 --remove failed
--> no effect !
mdadm /dev/md5 --remove /dev/sda6
--> mdadm: cannot find /dev/sda6: No such file or directory (!!!)
mdadm /dev/md5 --remove sda6
--> finally worked ! (I don't know how I had the idea to actually try
this...)


Then here is another array:

md1 : active raid1 sda2[0] sdb2[2]
      10485688 blocks super 1.0 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

This one did not even realize that sda was removed from the system long ago.
Apparently only when an I/O is issued, mdadm realizes the drive is not
there anymore.
I am wondering (and this would be very serious) what happens if a new
drive is inserted and it takes the /dev/sda identifier!? Would MD start
writing or do any operation THERE!?

There is another problem...
I tried to make MD realize that the drive is detached:

mdadm /dev/md1 --fail detached
--> no effect !
however:
ls /dev/sda2
--> ls: cannot access /dev/sda2: No such file or directory
so "detached" also seems broken...


And here goes also a feature request:

if a device is detached from the system, (echo 1 > device/delete or
removing via hardware hot-swap + AHCI) MD should detect this situation
and mark the device (and all its partitions) as failed in all arrays, or
even remove the device completely from the RAID.
In my case I have verified that MD did not realize the device was
removed from the system, and only much later when an I/O was issued to
the disk, it would mark the device as failed in the RAID.

After the above is implemented, it could be an idea to actually allow a
new disk to take the place of a failed disk automatically if that would
be a "re-add" (probably the same failed disk is being reinserted by the
operator) and this even if the array is running, and especially if there
is a bitmap.
Now it doesn't happen:
When I reinserted the disk, udev triggered the --incremental, to
reinsert the device, but mdadm refused to do anything because the old
slot was still occupied with a failed+detached device. I manually
removed the device from the raid then I ran --incremental, but mdadm
still refused to re-add the device to the RAID because the array was
running. I think that if it is a re-add, and especially if the bitmap is
active, I can't think of a situation in which the user would *not* want
to do an incremental re-add even if the array is running.

Thank you
Asdo

^ permalink raw reply [flat|nested] 13+ messages in thread
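A member that has already vanished from /dev can also be failed and removed
through md's sysfs interface, which addresses members by kernel name rather
than by device node. The following is only a sketch built from the names in
the report above, and it assumes the usual per-device state attribute under
/sys/block/<md>/md/dev-<name>/:

  echo faulty > /sys/block/md1/md/dev-sda2/state    # mark the vanished member failed
  echo remove > /sys/block/md1/md/dev-sda2/state    # then drop it from the array

For md5, where sda6 is already marked (F), only the "remove" write should be
needed; the end result is the same as "mdadm /dev/md5 --remove sda6".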
* Re: Some md/mdadm bugs
  2012-02-02 19:08 Some md/mdadm bugs Asdo
@ 2012-02-02 21:17 ` NeilBrown
  2012-02-02 22:58   ` Asdo
  0 siblings, 1 reply; 13+ messages in thread
From: NeilBrown @ 2012-02-02 21:17 UTC (permalink / raw)
  To: Asdo; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 4492 bytes --]

On Thu, 02 Feb 2012 20:08:53 +0100 Asdo <asdo@shiftmail.org> wrote:

> Hello list
>
> I removed sda from the system and I confirmed /dev/sda did not exist any
> more.
> After some time an I/O was issued to the array and sda6 was failed by MD
> in /dev/md5:
>
> md5 : active raid1 sdb6[2] sda6[0](F)
>       10485688 blocks super 1.0 [2/1] [_U]
>       bitmap: 1/160 pages [4KB], 32KB chunk
>
> At this point I tried:
>
> mdadm /dev/md5 --remove detached
> --> no effect !
> mdadm /dev/md5 --remove failed
> --> no effect !

What version of mdadm?  (mdadm --version).
These stopped working at one stage and were fixed in 3.1.5.

> mdadm /dev/md5 --remove /dev/sda6
> --> mdadm: cannot find /dev/sda6: No such file or directory (!!!)
> mdadm /dev/md5 --remove sda6
> --> finally worked ! (I don't know how I had the idea to actually try
> this...)

Well done.

>
> Then here is another array:
>
> md1 : active raid1 sda2[0] sdb2[2]
>       10485688 blocks super 1.0 [2/2] [UU]
>       bitmap: 0/1 pages [0KB], 65536KB chunk
>
> This one did not even realize that sda was removed from the system long ago.

Nobody told it.

> Apparently only when an I/O is issued, mdadm realizes the drive is not
> there anymore.

Only when there is IO, or someone tells it.

> I am wondering (and this would be very serious) what happens if a new
> drive is inserted and it takes the /dev/sda identifier!? Would MD start
> writing or do any operation THERE!?

Wouldn't happen.  As long as md holds onto the shell of the old sda nothing
else will get the name 'sda'.

>
> There is another problem...
> I tried to make MD realize that the drive is detached:
>
> mdadm /dev/md1 --fail detached
> --> no effect !
> however:
> ls /dev/sda2
> --> ls: cannot access /dev/sda2: No such file or directory
> so "detached" also seems broken...

Before 3.1.5 it was.  If you are using a newer mdadm I'll need to look
into it.

>
> And here goes also a feature request:
>
> if a device is detached from the system, (echo 1 > device/delete or
> removing via hardware hot-swap + AHCI) MD should detect this situation
> and mark the device (and all its partitions) as failed in all arrays, or
> even remove the device completely from the RAID.

This needs to be done via a udev rule.
That is why --remove understands names like "sda6" (no /dev).

When a device is removed, udev processes the remove notification.
The rule

   ACTION=="remove", RUN+="/sbin/mdadm -If $name"

in /etc/udev/rules.d/something.rules

will make that happen.

> In my case I have verified that MD did not realize the device was
> removed from the system, and only much later when an I/O was issued to
> the disk, it would mark the device as failed in the RAID.
>
> After the above is implemented, it could be an idea to actually allow a
> new disk to take the place of a failed disk automatically if that would
> be a "re-add" (probably the same failed disk is being reinserted by the
> operator) and this even if the array is running, and especially if there
> is a bitmap.

It should do that, providing you have a udev rule like:

   ACTION=="add", RUN+="/sbin/mdadm -I $tempnode"

You can even get it to add other devices as spares with e.g.

   policy action=force-spare

though you almost certainly don't want that general a policy.  You would
want to restrict that to certain ports (device paths).

> Now it doesn't happen:
> When I reinserted the disk, udev triggered the --incremental, to
> reinsert the device, but mdadm refused to do anything because the old
> slot was still occupied with a failed+detached device. I manually
> removed the device from the raid then I ran --incremental, but mdadm
> still refused to re-add the device to the RAID because the array was
> running. I think that if it is a re-add, and especially if the bitmap is
> active, I can't think of a situation in which the user would *not* want
> to do an incremental re-add even if the array is running.

Hmmm.. that doesn't seem right.  What version of mdadm are you running?
Maybe a newer one would get this right.

Thanks for the reports.

NeilBrown

>
> Thank you
> Asdo
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply [flat|nested] 13+ messages in thread
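Taken together, the two rules described above could live in one rules file;
the file name here is arbitrary and the POLICY path= value is only a
placeholder that would have to match the real controller ports:

  # /etc/udev/rules.d/65-md-hotplug.rules  (illustrative sketch)
  ACTION=="remove", RUN+="/sbin/mdadm -If $name"
  ACTION=="add",    RUN+="/sbin/mdadm -I $tempnode"

and, only if automatic spare assignment on trusted ports is wanted, a policy
entry in mdadm.conf along these lines:

  POLICY domain=hotplugbay path=pci-0000:00:1f.2-ata-* action=force-spare

so that force-spare is applied only to devices that appear on those device
paths.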
* Re: Some md/mdadm bugs
  2012-02-02 21:17 ` NeilBrown
@ 2012-02-02 22:58   ` Asdo
  2012-02-06 16:59     ` Joel
  2012-02-06 17:07     ` Asdo
  0 siblings, 2 replies; 13+ messages in thread
From: Asdo @ 2012-02-02 22:58 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

Hello Neil
thanks for the reply

version is:
mdadm - v3.1.4 - 31st August 2010
so it's indeed before 3.1.5
That's what is in Ubuntu's latest stable, 11.10; they are lagging behind

I'll break the quotes to add a few comments --->

On 02/02/12 22:17, NeilBrown wrote:
> .....
>> I am wondering (and this would be very serious) what happens if a new
>> drive is inserted and it takes the /dev/sda identifier!? Would MD start
>> writing or do any operation THERE!?
> Wouldn't happen.  As long as md holds onto the shell of the old sda nothing
> else will get the name 'sda'.

Great! Indeed this was what I *suspected*, based on the fact that newly
added drives got higher identifiers.
It's good to hear it from a safe source though.

>> And here goes also a feature request:
>>
>> if a device is detached from the system, (echo 1 > device/delete or
>> removing via hardware hot-swap + AHCI) MD should detect this situation
>> and mark the device (and all its partitions) as failed in all arrays, or
>> even remove the device completely from the RAID.
> This needs to be done via a udev rule.
> That is why --remove understands names like "sda6" (no /dev).
>
> When a device is removed, udev processes the remove notification.
> The rule
>
>    ACTION=="remove", RUN+="/sbin/mdadm -If $name"
>
> in /etc/udev/rules.d/something.rules
>
> will make that happen.

Oh great!
Will use that.
--incremental --fail !  I would never have thought of combining those.

>
>> In my case I have verified that MD did not realize the device was
>> removed from the system, and only much later when an I/O was issued to
>> the disk, it would mark the device as failed in the RAID.
>>
>> After the above is implemented, it could be an idea to actually allow a
>> new disk to take the place of a failed disk automatically if that would
>> be a "re-add" (probably the same failed disk is being reinserted by the
>> operator) and this even if the array is running, and especially if there
>> is a bitmap.
> It should do that, providing you have a udev rule like:
>    ACTION=="add", RUN+="/sbin/mdadm -I $tempnode"

I think I have this rule.
But it doesn't work even via the command line if the array is running, as I
wrote below --->

> You can even get it to add other devices as spares with e.g.
>    policy action=force-spare
>
> though you almost certainly don't want that general a policy.  You would
> want to restrict that to certain ports (device paths).

sure, I understand

>> Now it doesn't happen:
>> When I reinserted the disk, udev triggered the --incremental, to
>> reinsert the device, but mdadm refused to do anything because the old
>> slot was still occupied with a failed+detached device. I manually
>> removed the device from the raid then I ran --incremental, but mdadm
>> still refused to re-add the device to the RAID because the array was
>> running. I think that if it is a re-add, and especially if the bitmap is
>> active, I can't think of a situation in which the user would *not* want
>> to do an incremental re-add even if the array is running.
> Hmmm.. that doesn't seem right.  What version of mdadm are you running?

3.1.4

> Maybe a newer one would get this right.

I need to try...
I think I need that.

> Thanks for the reports.

thank you for your reply.

Asdo

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Some md/mdadm bugs
  2012-02-02 22:58 ` Asdo
@ 2012-02-06 16:59   ` Joel
  2012-02-06 18:47     ` Asdo
  0 siblings, 1 reply; 13+ messages in thread
From: Joel @ 2012-02-06 16:59 UTC (permalink / raw)
  To: linux-raid

Asdo <asdo <at> shiftmail.org> writes:
> Neil said:
> > ACTION=="remove", RUN+="/sbin/mdadm -If $name"
> >
> > in /etc/udev/rules.d/something.rules
> >
> > will make that happen.
>
> Oh great!
>
> Will use that.
>
> --incremental --fail !  I would never have thought of combining those.

I don't think the -If is --incremental --fail.  It is --incremental --force.
Doesn't incremental automagically add a device if it is new and remove a device
if it is old?

Joel

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Some md/mdadm bugs
  2012-02-06 16:59 ` Joel
@ 2012-02-06 18:47   ` Asdo
  2012-02-06 18:50     ` Joel
  0 siblings, 1 reply; 13+ messages in thread
From: Asdo @ 2012-02-06 18:47 UTC (permalink / raw)
  To: Joel; +Cc: linux-raid

On 02/06/12 17:59, Joel wrote:
> Asdo <asdo <at> shiftmail.org> writes:
>
>> Neil said:
>>> ACTION=="remove", RUN+="/sbin/mdadm -If $name"
>>>
>>> in /etc/udev/rules.d/something.rules
>>>
>>> will make that happen.
>> Oh great!
>>
>> Will use that.
>>
>> --incremental --fail !  I would never have thought of combining those.
> I don't think the -If is --incremental --fail.  It is --incremental --force.
> Doesn't incremental automagically add a device if it is new and remove a device
> if it is old?
>
No, it is really --incremental --fail :
it behaves like --incremental --fail, while --incremental --force is an
illegal combination for mdadm (I just tried)

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Some md/mdadm bugs
  2012-02-06 18:47 ` Asdo
@ 2012-02-06 18:50   ` Joel
  0 siblings, 0 replies; 13+ messages in thread
From: Joel @ 2012-02-06 18:50 UTC (permalink / raw)
  To: linux-raid

Asdo <asdo <at> shiftmail.org> writes:
> No, it is really --incremental --fail :
> it behaves like --incremental --fail, while --incremental --force is an
> illegal combination for mdadm (I just tried)

Shoot!  You are absolutely right.  My manpage reading skills must need
serious refreshment!

^ permalink raw reply [flat|nested] 13+ messages in thread
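For reference, the short option spells out as the incremental-mode fail
documented in the man page; with the same kernel-name convention used
earlier in the thread:

  mdadm --incremental --fail sda6

is the long form of "mdadm -If sda6", while combining --incremental with
--force is rejected, as tested above.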
* Re: Some md/mdadm bugs
  2012-02-02 22:58 ` Asdo
  2012-02-06 16:59   ` Joel
@ 2012-02-06 17:07   ` Asdo
  2012-02-06 18:47     ` Asdo
  2012-02-06 22:20     ` NeilBrown
  1 sibling, 2 replies; 13+ messages in thread
From: Asdo @ 2012-02-06 17:07 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

On 02/02/12 23:58, Asdo wrote:
>
>>> Now it doesn't happen:
>>> When I reinserted the disk, udev triggered the --incremental, to
>>> reinsert the device, but mdadm refused to do anything because the old
>>> slot was still occupied with a failed+detached device. I manually
>>> removed the device from the raid then I ran --incremental, but mdadm
>>> still refused to re-add the device to the RAID because the array was
>>> running. I think that if it is a re-add, and especially if the
>>> bitmap is
>>> active, I can't think of a situation in which the user would *not* want
>>> to do an incremental re-add even if the array is running.
>> Hmmm.. that doesn't seem right.  What version of mdadm are you running?
>
> 3.1.4
>
>> Maybe a newer one would get this right.
> I need to try...
> I think I need that.

Hi Neil,

Still some problems on mdadm 3.2.2 (from Ubuntu Precise) apparently:

Problem #1:

# mdadm -If /dev/sda4
mdadm: incremental removal requires a kernel device name, not a file:
/dev/sda4

however this works:

# mdadm -If sda4
mdadm: set sda4 faulty in md3
mdadm: hot removed sda4 from md3

Is this by design?

Would your udev rule
ACTION=="remove", RUN+="/sbin/mdadm -If $name"
trigger the first or the second kind of invocation?


Problem #2:

by reinserting sda, it became sdax, and the array is still running like
this:

md3 : active raid1 sdb4[2]
      10485688 blocks super 1.0 [2/1] [_U]
      bitmap: 0/160 pages [0KB], 32KB chunk

please note the bitmap is active

so now I'm trying auto hot-add:

# mdadm -I /dev/sdax4
mdadm: not adding /dev/sdax4 to active array (without --run) /dev/md3

still the old problem I mentioned with 3.1.4.

Trying more ways: (even with the "--run" which is suggested)

# mdadm --run -I /dev/sdax4
mdadm: -I would set mdadm mode to "incremental", but it is already set
to "misc".

# mdadm -I --run /dev/sdax4
mdadm: failed to add /dev/sdax4 to /dev/md3: Invalid argument.

# mdadm -I --run sdax4
mdadm: stat failed for sdax4: No such file or directory.

# mdadm -I sdax4
mdadm: stat failed for sdax4: No such file or directory.

This feature not working is a problem because if one extracts one disk by
mistake, and then reinserts it, even with bitmaps active, he needs to do a
lot of manual work to re-add it to the arrays (potentially even error-prone,
if he mistakes the partition numbers)...

Thank you
A.

^ permalink raw reply [flat|nested] 13+ messages in thread
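Until the incremental path handles this case, the manual recovery for a
mistakenly pulled and reinserted disk is an explicit re-add per array; a
sketch using the device names from the report above (sdax4 being the
reinserted partition, and assuming the failed slot has already been cleared
as in Problem #1):

  # mdadm /dev/md3 --re-add /dev/sdax4

With a write-intent bitmap present the re-add should only resync the chunks
recorded as dirty since the member disappeared.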
* Re: Some md/mdadm bugs 2012-02-06 17:07 ` Asdo @ 2012-02-06 18:47 ` Asdo 2012-02-06 22:31 ` NeilBrown 2012-02-06 22:20 ` NeilBrown 1 sibling, 1 reply; 13+ messages in thread From: Asdo @ 2012-02-06 18:47 UTC (permalink / raw) To: NeilBrown; +Cc: linux-raid One or two more bug(s) in 3.2.2 (note: my latest mail I am replying to is still valid) AUTO line in mdadm.conf does not appear to work any longer in 3.2.2 compared to mdadm 3.1.4 Now this line "AUTO -all" still autoassembles every array. There are many arrays not declared in my mdadm.conf, and which are not for this host (hostname is different) but mdadm still autoassembles everything, e.g.: # mdadm -I /dev/sdr8 mdadm: /dev/sdr8 attached to /dev/md/perftest:r0d24, not enough to start (1). (note: "perftest" is even not the hostname) I have just regressed to mdadm 3.1.4 to confirm that it worked back then, and yes, I confirm that 3.1.4 was not doing any action upon: # mdadm -I /dev/sdr8 --> nothing done when the line in config was: "AUTO -all" or even "AUTO +homehost -all" which is the line I am normally using. This is a problem in our fairly large system with 80+ HDDs and many partitions which I am testing now which is full of every kind of arrays.... I am normally using : "AUTO +homehost -all" to prevent assembling a bagzillion of arrays at boot, also because doing that gives race conditions at boot and drops me to initramfs shell (see below next bug). Another problem with 3.2.2: At boot, this is from a serial dump: udevd[218]: symlink '../../sdx13' '/dev/disk/by-partlabel/Linux\x20RAID.udev-tmp' failed: File exists udevd[189]: symlink '../../sdb1' '/dev/disk/by-partlabel/Linux\x20RAID.udev-tmp' failed: File exists And sdb1 is not correctly inserted into array /dev/md0 which hence starts degraded and so I am dropped into an initramfs shell. This looks like a race condition... I don't know if this is fault of udev, udev rules or mdadm... This is with mdadm 3.2.2 and kernel 3.0.13 (called 3.0.0-15-server by Ubuntu) on Ubuntu oneiric 11.10 Having also the above bug of nonworking AUTO line, this problem happens a lot with 80+ disks and lots of partitions. If the auto line worked, I would have postponed most of the assembly's at a very late stage in the boot process, maybe after a significant "sleep". Actually this race condition could be an ubuntu udev script bug : Here are the ubuntu udev rules files I could find, related to mdadm or containing "by-partlabel": ------------------------------------------------ 65-mdadm-blkid.rules: # This file causes Linux RAID (md) block devices to be checked for further # filesystems if the array is active. See udev(8) for syntax. # # Based on Suse's udev rule file for md SUBSYSTEM!="block", GOTO="mdadm_end" KERNEL!="md[0-9]*", GOTO="mdadm_end" ACTION!="add|change", GOTO="mdadm_end" # container devices have a metadata version of e.g. 
'external:ddf' and # never leave state 'inactive' ATTR{md/metadata_version}=="external:[A-Za-z]*", ATTR{md/array_state}=="inactive", GOTO="md_ignore_state" ENV{DEVTYPE}=="partition", GOTO="md_ignore_state" TEST!="md/array_state", GOTO="mdadm_end" ATTR{md/array_state}=="|clear|inactive", GOTO="mdadm_end" LABEL="md_ignore_state" # Obtain array information IMPORT{program}="/sbin/mdadm --detail --export $tempnode" ENV{DEVTYPE}=="disk", ENV{MD_NAME}=="?*", SYMLINK+="disk/by-id/md-name-$env{MD_NAME}", OPTIONS+="string_escape=replace" ENV{DEVTYPE}=="disk", ENV{MD_UUID}=="?*", SYMLINK+="disk/by-id/md-uuid-$env{MD_UUID}" ENV{DEVTYPE}=="disk", ENV{MD_DEVNAME}=="?*", SYMLINK+="md/$env{MD_DEVNAME}" ENV{DEVTYPE}=="partition", ENV{MD_NAME}=="?*", SYMLINK+="disk/by-id/md-name-$env{MD_NAME}-part%n", OPTIONS+="string_escape=replace" ENV{DEVTYPE}=="partition", ENV{MD_UUID}=="?*", SYMLINK+="disk/by-id/md-uuid-$env{MD_UUID}-part%n" ENV{DEVTYPE}=="partition", ENV{MD_DEVNAME}=="*[^0-9]", SYMLINK+="md/$env{MD_DEVNAME}%n" ENV{DEVTYPE}=="partition", ENV{MD_DEVNAME}=="*[0-9]", SYMLINK+="md/$env{MD_DEVNAME}p%n" # by-uuid and by-label symlinks IMPORT{program}="/sbin/blkid -o udev -p $tempnode" OPTIONS+="link_priority=100" OPTIONS+="watch" ENV{ID_FS_USAGE}=="filesystem|other|crypto", ENV{ID_FS_UUID_ENC}=="?*", \ SYMLINK+="disk/by-uuid/$env{ID_FS_UUID_ENC}" ENV{ID_FS_USAGE}=="filesystem|other", ENV{ID_FS_LABEL_ENC}=="?*", \ SYMLINK+="disk/by-label/$env{ID_FS_LABEL_ENC}" LABEL="mdadm_end" ------------------------------------------------ 85-mdadm.rules: # This file causes block devices with Linux RAID (mdadm) signatures to # automatically cause mdadm to be run. # See udev(8) for syntax SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", \ RUN+="/sbin/mdadm --incremental $env{DEVNAME}" ------------------------------------------------ part of 60-persistent-storage.rules: # do not edit this file, it will be overwritten on update # persistent storage links: /dev/disk/{by-id,by-uuid,by-label,by-path} # scheme based on "Linux persistent device names", 2004, Hannes Reinecke <hare@suse.de> # forward scsi device event to corresponding block device ACTION=="change", SUBSYSTEM=="scsi", ENV{DEVTYPE}=="scsi_device", TEST=="block", ATTR{block/*/uevent}="change" ACTION=="remove", GOTO="persistent_storage_end" # enable in-kernel media-presence polling ACTION=="add", SUBSYSTEM=="module", KERNEL=="block", ATTR{parameters/events_dfl_poll_msecs}=="0", ATTR{parameters/events_dfl_poll_msecs}="2000" SUBSYSTEM!="block", GOTO="persistent_storage_end" # skip rules for inappropriate block devices KERNEL=="fd*|mtd*|nbd*|gnbd*|btibm*|dm-*|md*", GOTO="persistent_storage_end" # ignore partitions that span the entire disk TEST=="whole_disk", GOTO="persistent_storage_end" # for partitions import parent information ENV{DEVTYPE}=="partition", IMPORT{parent}="ID_*" # virtio-blk KERNEL=="vd*[!0-9]", ATTRS{serial}=="?*", ENV{ID_SERIAL}="$attr{serial}", SYMLINK+="disk/by-id/virtio-$env{ID_SERIAL}" KERNEL=="vd*[0-9]", ATTRS{serial}=="?*", ENV{ID_SERIAL}="$attr{serial}", SYMLINK+="disk/by-id/virtio-$env{ID_SERIAL}-part%n" # ATA devices with their own "ata" kernel subsystem KERNEL=="sd*[!0-9]|sr*", ENV{ID_SERIAL}!="?*", SUBSYSTEMS=="ata", IMPORT{program}="ata_id --export $tempnode" # ATA devices using the "scsi" subsystem KERNEL=="sd*[!0-9]|sr*", ENV{ID_SERIAL}!="?*", SUBSYSTEMS=="scsi", ATTRS{vendor}=="ATA", IMPORT{program}="ata_id --export $tempnode" # ATA/ATAPI devices (SPC-3 or later) using the "scsi" subsystem KERNEL=="sd*[!0-9]|sr*", 
ENV{ID_SERIAL}!="?*", SUBSYSTEMS=="scsi", ATTRS{type}=="5", ATTRS{scsi_level}=="[6-9]*", IMPORT{program}="ata_id --export $tempnode" # Run ata_id on non-removable USB Mass Storage (SATA/PATA disks in enclosures) KERNEL=="sd*[!0-9]|sr*", ENV{ID_SERIAL}!="?*", ATTR{removable}=="0", SUBSYSTEMS=="usb", IMPORT{program}="ata_id --export $tempnode" # Otherwise fall back to using usb_id for USB devices KERNEL=="sd*[!0-9]|sr*", ENV{ID_SERIAL}!="?*", SUBSYSTEMS=="usb", IMPORT{program}="usb_id --export %p" # scsi devices KERNEL=="sd*[!0-9]|sr*", ENV{ID_SERIAL}!="?*", IMPORT{program}="scsi_id --export --whitelisted -d $tempnode", ENV{ID_BUS}="scsi" KERNEL=="cciss*", ENV{DEVTYPE}=="disk", ENV{ID_SERIAL}!="?*", IMPORT{program}="scsi_id --export --whitelisted -d $tempnode", ENV{ID_BUS}="cciss" KERNEL=="sd*|sr*|cciss*", ENV{DEVTYPE}=="disk", ENV{ID_SERIAL}=="?*", SYMLINK+="disk/by-id/$env{ID_BUS}-$env{ID_SERIAL}" KERNEL=="sd*|cciss*", ENV{DEVTYPE}=="partition", ENV{ID_SERIAL}=="?*", SYMLINK+="disk/by-id/$env{ID_BUS}-$env{ID_SERIAL}-part%n" # firewire KERNEL=="sd*[!0-9]|sr*", ATTRS{ieee1394_id}=="?*", SYMLINK+="disk/by-id/ieee1394-$attr{ieee1394_id}" KERNEL=="sd*[0-9]", ATTRS{ieee1394_id}=="?*", SYMLINK+="disk/by-id/ieee1394-$attr{ieee1394_id}-part%n" # scsi compat links for ATA devices KERNEL=="sd*[!0-9]", ENV{ID_BUS}=="ata", PROGRAM="scsi_id --whitelisted --replace-whitespace -p0x80 -d$tempnode", RESULT=="?*", ENV{ID_SCSI_COMPAT}="$result", SYMLINK+="disk/by-id/scsi-$env{ID_SCSI_COMPAT}" KERNEL=="sd*[0-9]", ENV{ID_SCSI_COMPAT}=="?*", SYMLINK+="disk/by-id/scsi-$env{ID_SCSI_COMPAT}-part%n" KERNEL=="mmcblk[0-9]", SUBSYSTEMS=="mmc", ATTRS{name}=="?*", ATTRS{serial}=="?*", ENV{ID_NAME}="$attr{name}", ENV{ID_SERIAL}="$attr{serial}", SYMLINK+="disk/by-id/mmc-$env{ID_NAME}_$env{ID_SERIAL}" KERNEL=="mmcblk[0-9]p[0-9]", ENV{ID_NAME}=="?*", ENV{ID_SERIAL}=="?*", SYMLINK+="disk/by-id/mmc-$env{ID_NAME}_$env{ID_SERIAL}-part%n" KERNEL=="mspblk[0-9]", SUBSYSTEMS=="memstick", ATTRS{name}=="?*", ATTRS{serial}=="?*", ENV{ID_NAME}="$attr{name}", ENV{ID_SERIAL}="$attr{serial}", SYMLINK+="disk/by-id/memstick-$env{ID_NAME}_$env{ID_SERIAL}" KERNEL=="mspblk[0-9]p[0-9]", ENV{ID_NAME}=="?*", ENV{ID_SERIAL}=="?*", SYMLINK+="disk/by-id/memstick-$env{ID_NAME}_$env{ID_SERIAL}-part%n" # by-path (parent device path) ENV{DEVTYPE}=="disk", ENV{ID_PATH}=="", DEVPATH!="*/virtual/*", IMPORT{program}="path_id %p" ENV{DEVTYPE}=="disk", ENV{ID_PATH}=="?*", SYMLINK+="disk/by-path/$env{ID_PATH}" ENV{DEVTYPE}=="partition", ENV{ID_PATH}=="?*", SYMLINK+="disk/by-path/$env{ID_PATH}-part%n" # skip unpartitioned removable media devices from drivers which do not send "change" events ENV{DEVTYPE}=="disk", KERNEL!="sd*|sr*", ATTR{removable}=="1", GOTO="persistent_storage_end" # probe filesystem metadata of optical drives which have a media inserted KERNEL=="sr*", ENV{DISK_EJECT_REQUEST}!="?*", ENV{ID_CDROM_MEDIA_TRACK_COUNT_DATA}=="?*", ENV{ID_CDROM_MEDIA_SESSION_LAST_OFFSET}=="?*", IMPORT{program}="/sbin/blkid -o udev -p -u noraid -O $env{ID_CDROM_MEDIA_SESSION_LAST_OFFSET} $tempnode" # single-session CDs do not have ID_CDROM_MEDIA_SESSION_LAST_OFFSET KERNEL=="sr*", ENV{DISK_EJECT_REQUEST}!="?*", ENV{ID_CDROM_MEDIA_TRACK_COUNT_DATA}=="?*", ENV{ID_CDROM_MEDIA_SESSION_LAST_OFFSET}=="", IMPORT{program}="/sbin/blkid -o udev -p -u noraid $tempnode" # probe filesystem metadata of disks KERNEL!="sr*", IMPORT{program}="/sbin/blkid -o udev -p $tempnode" # watch metadata changes by tools closing the device after writing KERNEL!="sr*", OPTIONS+="watch" # 
by-label/by-uuid links (filesystem metadata) ENV{ID_FS_USAGE}=="filesystem|other|crypto", ENV{ID_FS_UUID_ENC}=="?*", SYMLINK+="disk/by-uuid/$env{ID_FS_UUID_ENC}" ENV{ID_FS_USAGE}=="filesystem|other", ENV{ID_FS_LABEL_ENC}=="?*", SYMLINK+="disk/by-label/$env{ID_FS_LABEL_ENC}" # by-id (World Wide Name) ENV{DEVTYPE}=="disk", ENV{ID_WWN_WITH_EXTENSION}=="?*", SYMLINK+="disk/by-id/wwn-$env{ID_WWN_WITH_EXTENSION}" ENV{DEVTYPE}=="partition", ENV{ID_WWN_WITH_EXTENSION}=="?*", SYMLINK+="disk/by-id/wwn-$env{ID_WWN_WITH_EXTENSION}-part%n" # by-partlabel/by-partuuid links (partition metadata) ENV{ID_PART_ENTRY_SCHEME}=="gpt", ENV{ID_PART_ENTRY_UUID}=="?*", SYMLINK+="disk/by-partuuid/$env{ID_PART_ENTRY_UUID}" ENV{ID_PART_ENTRY_SCHEME}=="gpt", ENV{ID_PART_ENTRY_NAME}=="?*", SYMLINK+="disk/by-partlabel/$env{ID_PART_ENTRY_NAME}" LABEL="persistent_storage_end" ------------------------------------------------ Do you think this is an ubuntu udev rule bug or a mdadm bug? Thank you A. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Some md/mdadm bugs 2012-02-06 18:47 ` Asdo @ 2012-02-06 22:31 ` NeilBrown 2012-02-07 17:13 ` Asdo 0 siblings, 1 reply; 13+ messages in thread From: NeilBrown @ 2012-02-06 22:31 UTC (permalink / raw) To: Asdo; +Cc: linux-raid [-- Attachment #1: Type: text/plain, Size: 2989 bytes --] On Mon, 06 Feb 2012 19:47:38 +0100 Asdo <asdo@shiftmail.org> wrote: > One or two more bug(s) in 3.2.2 > (note: my latest mail I am replying to is still valid) > > AUTO line in mdadm.conf does not appear to work any longer in 3.2.2 > compared to mdadm 3.1.4 > Now this line > > "AUTO -all" > > still autoassembles every array. > There are many arrays not declared in my mdadm.conf, and which are not > for this host (hostname is different) > but mdadm still autoassembles everything, e.g.: > > # mdadm -I /dev/sdr8 > mdadm: /dev/sdr8 attached to /dev/md/perftest:r0d24, not enough to start > (1). > > (note: "perftest" is even not the hostname) Odd.. it works for me: # cat /etc/mdadm.conf AUTO -all # mdadm -Iv /dev/sda mdadm: /dev/sda has metadata type 1.x for which auto-assembly is disabled # mdadm -V mdadm - v3.2.2 - 17th June 2011 # Can you show the complete output of the same commands (with sdr8 in place of sda of course :-) > > I have just regressed to mdadm 3.1.4 to confirm that it worked back > then, and yes, I confirm that 3.1.4 was not doing any action upon: > # mdadm -I /dev/sdr8 > --> nothing done > when the line in config was: > "AUTO -all" > or even > "AUTO +homehost -all" > which is the line I am normally using. > > > This is a problem in our fairly large system with 80+ HDDs and many > partitions which I am testing now which is full of every kind of arrays.... > I am normally using : "AUTO +homehost -all" to prevent assembling a > bagzillion of arrays at boot, also because doing that gives race > conditions at boot and drops me to initramfs shell (see below next bug). > > > > > > Another problem with 3.2.2: > > At boot, this is from a serial dump: > > udevd[218]: symlink '../../sdx13' > '/dev/disk/by-partlabel/Linux\x20RAID.udev-tmp' failed: File exists > udevd[189]: symlink '../../sdb1' > '/dev/disk/by-partlabel/Linux\x20RAID.udev-tmp' failed: File exists > > And sdb1 is not correctly inserted into array /dev/md0 which hence > starts degraded and so I am dropped into an initramfs shell. > This looks like a race condition... I don't know if this is fault of > udev, udev rules or mdadm... > This is with mdadm 3.2.2 and kernel 3.0.13 (called 3.0.0-15-server by > Ubuntu) on Ubuntu oneiric 11.10 > Having also the above bug of nonworking AUTO line, this problem happens > a lot with 80+ disks and lots of partitions. If the auto line worked, I > would have postponed most of the assembly's at a very late stage in the > boot process, maybe after a significant "sleep". > > > Actually this race condition could be an ubuntu udev script bug : > > Here are the ubuntu udev rules files I could find, related to mdadm or > containing "by-partlabel": It does look like a udev thing more than an mdadm thing. What do /dev/blkid -o udev -p /dev/sdb1 and /dev/blkid -o udev -p /dev/sdx12 report? NeilBrown [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 828 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Some md/mdadm bugs 2012-02-06 22:31 ` NeilBrown @ 2012-02-07 17:13 ` Asdo 2012-02-09 0:55 ` NeilBrown 0 siblings, 1 reply; 13+ messages in thread From: Asdo @ 2012-02-07 17:13 UTC (permalink / raw) To: NeilBrown; +Cc: linux-raid On 02/06/12 23:31, NeilBrown wrote: > On Mon, 06 Feb 2012 19:47:38 +0100 Asdo<asdo@shiftmail.org> wrote: > >> One or two more bug(s) in 3.2.2 >> (note: my latest mail I am replying to is still valid) >> >> AUTO line in mdadm.conf does not appear to work any longer in 3.2.2 >> compared to mdadm 3.1.4 >> Now this line >> >> "AUTO -all" >> >> still autoassembles every array. >> There are many arrays not declared in my mdadm.conf, and which are not >> for this host (hostname is different) >> but mdadm still autoassembles everything, e.g.: >> >> # mdadm -I /dev/sdr8 >> mdadm: /dev/sdr8 attached to /dev/md/perftest:r0d24, not enough to start >> (1). >> >> (note: "perftest" is even not the hostname) > Odd.. it works for me: > > # cat /etc/mdadm.conf > AUTO -all > # mdadm -Iv /dev/sda > mdadm: /dev/sda has metadata type 1.x for which auto-assembly is disabled > # mdadm -V > mdadm - v3.2.2 - 17th June 2011 > # > > Can you show the complete output of the same commands (with sdr8 in place of sda of course :-) I confirm the bug exists in 3.2.2 I compiled from source 3.2.2 from your git to make sure ("git checkout mdadm-3.2.2" and then "make") # ./mdadm -Iv /dev/sdat1 mdadm: /dev/sdat1 attached to /dev/md/perftest:sr50d12p1n1, not enough to start (1). # ./mdadm --version mdadm - v3.2.2 - 17th June 2011 # cat /etc/mdadm/mdadm.conf AUTO -all however the good news is that the bug is gone in 3.2.3 (still from your git) # ./mdadm -Iv /dev/sdat1 mdadm: /dev/sdat1 has metadata type 1.x for which auto-assembly is disabled # ./mdadm --version mdadm - v3.2.3 - 23rd December 2011 # cat /etc/mdadm/mdadm.conf AUTO -all However in 3.2.3 there is another bug, or else I don't understand how AUTO works anymore: # hostname perftest # hostname perftest # cat /etc/mdadm/mdadm.conf HOMEHOST <system> AUTO +homehost -all # ./mdadm -Iv /dev/sdat1 mdadm: /dev/sdat1 has metadata type 1.x for which auto-assembly is disabled # ./mdadm --version mdadm - v3.2.3 - 23rd December 2011 ?? Admittedly perftest is not the original hostname for this machine but it shouldn't matter (does it go reading /etc/hostname directly?)... Same result is if I make the mdadm.conf file like this HOMEHOST perftest AUTO +homehost -all Else, If I create the file like this: # cat /etc/mdadm/mdadm.conf HOMEHOST <system> AUTO +1.x homehost -all # hostname perftest # ./mdadm -Iv /dev/sdat1 mdadm: /dev/sdat1 attached to /dev/md/sr50d12p1n1, not enough to start (1). # ./mdadm --version mdadm - v3.2.3 - 23rd December 2011 Now it works, BUT it works *too much*, look: # hostname foo # hostname foo # ./mdadm -Iv /dev/sdat1 mdadm: /dev/sdat1 attached to /dev/md/perftest:sr50d12p1n1, not enough to start (1). # cat /etc/mdadm/mdadm.conf HOMEHOST <system> AUTO +1.x homehost -all # ./mdadm --version mdadm - v3.2.3 - 23rd December 2011 Same behaviour is if I make the mdadm.conf file with an explicit HOMEHOST name: # hostname foo # cat /etc/mdadm/mdadm.conf HOMEHOST foo AUTO +1.x homehost -all # ./mdadm -Iv /dev/sdat1 mdadm: /dev/sdat1 attached to /dev/md/perftest:sr50d12p1n1, not enough to start (1). # ./mdadm --version mdadm - v3.2.3 - 23rd December 2011 It does not seem correct behaviour to me. 
If it is, could you explain how I should create the mdadm.conf file in order for mdadm to autoassemble *all* arrays for this host (matching `hostname` == array-hostname in 1.x) and never autoassemble arrays with different hostname? Note I'm *not* using 0.90 metadata anywhere, so no special case is needed for that metadata version I'm not sure if 3.1.4 had the "correct" behaviour... Yesterday it seemed to me it had, but today I can't seem to make it work anymore like I intended. > >> I have just regressed to mdadm 3.1.4 to confirm that it worked back >> then, and yes, I confirm that 3.1.4 was not doing any action upon: >> # mdadm -I /dev/sdr8 >> --> nothing done >> when the line in config was: >> "AUTO -all" >> or even >> "AUTO +homehost -all" >> which is the line I am normally using. >> >> >> This is a problem in our fairly large system with 80+ HDDs and many >> partitions which I am testing now which is full of every kind of arrays.... >> I am normally using : "AUTO +homehost -all" to prevent assembling a >> bagzillion of arrays at boot, also because doing that gives race >> conditions at boot and drops me to initramfs shell (see below next bug). >> >> >> >> >> >> Another problem with 3.2.2: >> >> At boot, this is from a serial dump: >> >> udevd[218]: symlink '../../sdx13' >> '/dev/disk/by-partlabel/Linux\x20RAID.udev-tmp' failed: File exists >> udevd[189]: symlink '../../sdb1' >> '/dev/disk/by-partlabel/Linux\x20RAID.udev-tmp' failed: File exists >> >> And sdb1 is not correctly inserted into array /dev/md0 which hence >> starts degraded and so I am dropped into an initramfs shell. >> This looks like a race condition... I don't know if this is fault of >> udev, udev rules or mdadm... >> This is with mdadm 3.2.2 and kernel 3.0.13 (called 3.0.0-15-server by >> Ubuntu) on Ubuntu oneiric 11.10 >> Having also the above bug of nonworking AUTO line, this problem happens >> a lot with 80+ disks and lots of partitions. If the auto line worked, I >> would have postponed most of the assembly's at a very late stage in the >> boot process, maybe after a significant "sleep". >> >> >> Actually this race condition could be an ubuntu udev script bug : >> >> Here are the ubuntu udev rules files I could find, related to mdadm or >> containing "by-partlabel": > It does look like a udev thing more than an mdadm thing. > > What do > /dev/blkid -o udev -p /dev/sdb1 > and > /dev/blkid -o udev -p /dev/sdx12 > > report? Unfortunately I rebooted in the meanwhile. Now sdb1 is assembled. I am pretty sure sdb1 is really the same device of the old boot so here it goes: # blkid -o udev -p /dev/sdb1 ID_FS_UUID=d6557fd5-0233-0ca1-8882-200cec91b3a3 ID_FS_UUID_ENC=d6557fd5-0233-0ca1-8882-200cec91b3a3 ID_FS_UUID_SUB=0ffdf74a-36f9-7a7a-9dbe-653bb37bdc8a ID_FS_UUID_SUB_ENC=0ffdf74a-36f9-7a7a-9dbe-653bb37bdc8a ID_FS_LABEL=hardstorage1:grubarr ID_FS_LABEL_ENC=hardstorage1:grubarr ID_FS_VERSION=1.0 ID_FS_TYPE=linux_raid_member ID_FS_USAGE=raid ID_PART_ENTRY_SCHEME=gpt ID_PART_ENTRY_NAME=Linux\x20RAID ID_PART_ENTRY_UUID=31c747e8-826f-48a3-ace0-c8063d489810 ID_PART_ENTRY_TYPE=a19d880f-05fc-4d3b-a006-743f0f84911e ID_PART_ENTRY_NUMBER=1 regarding sdx13 (I suppose sdx12 was a typo) I don't guarantee it's the same device as in the previous boot, because it's in the SAS-expanders path... 
However it will be something similar anyway # blkid -o udev -p /dev/sdx13 ID_FS_UUID=527dd3b2-decf-4278-cb92-e47bcea21a39 ID_FS_UUID_ENC=527dd3b2-decf-4278-cb92-e47bcea21a39 ID_FS_UUID_SUB=c1751a32-0ef6-ff30-04ad-16322edfe9b1 ID_FS_UUID_SUB_ENC=c1751a32-0ef6-ff30-04ad-16322edfe9b1 ID_FS_LABEL=perftest:sr50d12p7n6 ID_FS_LABEL_ENC=perftest:sr50d12p7n6 ID_FS_VERSION=1.0 ID_FS_TYPE=linux_raid_member ID_FS_USAGE=raid ID_PART_ENTRY_SCHEME=gpt ID_PART_ENTRY_NAME=Linux\x20RAID ID_PART_ENTRY_UUID=7a355609-793e-442f-b668-4168d2474f89 ID_PART_ENTRY_TYPE=a19d880f-05fc-4d3b-a006-743f0f84911e ID_PART_ENTRY_NUMBER=13 Ok now I understand that I have hundreds of partitions, all with the same ID_PART_ENTRY_NAME=Linux\x20RAID and I am actually surprised to see only 2 clashes reported in the serial console dump. I confirm that once the system boots, only the last identically-named symlink survives (obviously) --------- # ll /dev/disk/by-partlabel/ total 0 drwxr-xr-x 2 root root 60 Feb 7 16:54 ./ drwxr-xr-x 8 root root 160 Feb 7 10:59 ../ lrwxrwxrwx 1 root root 12 Feb 7 16:54 Linux\x20RAID -> ../../sdas16 --------- But strangely there were only 2 clashes reported by udev It it also interesting that sdb1 was the only partition which failed to assemble among the 8 basic raid1 arrays I have at boot (which I know really well and I checked at last boot and confirmed all other 15 partitions sd[ab][12345678] were present and correctly assembled in couples making /dev/md[01234567]) only sdb1 was missing, the same partition that reported the clash... that's a bit too much for a coincidence. What do you think? Thank you A. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Some md/mdadm bugs 2012-02-07 17:13 ` Asdo @ 2012-02-09 0:55 ` NeilBrown 0 siblings, 0 replies; 13+ messages in thread From: NeilBrown @ 2012-02-09 0:55 UTC (permalink / raw) To: Asdo; +Cc: linux-raid [-- Attachment #1: Type: text/plain, Size: 10050 bytes --] On Tue, 07 Feb 2012 18:13:05 +0100 Asdo <asdo@shiftmail.org> wrote: > On 02/06/12 23:31, NeilBrown wrote: > > On Mon, 06 Feb 2012 19:47:38 +0100 Asdo<asdo@shiftmail.org> wrote: > > > >> One or two more bug(s) in 3.2.2 > >> (note: my latest mail I am replying to is still valid) > >> > >> AUTO line in mdadm.conf does not appear to work any longer in 3.2.2 > >> compared to mdadm 3.1.4 > >> Now this line > >> > >> "AUTO -all" > >> > >> still autoassembles every array. > >> There are many arrays not declared in my mdadm.conf, and which are not > >> for this host (hostname is different) > >> but mdadm still autoassembles everything, e.g.: > >> > >> # mdadm -I /dev/sdr8 > >> mdadm: /dev/sdr8 attached to /dev/md/perftest:r0d24, not enough to start > >> (1). > >> > >> (note: "perftest" is even not the hostname) > > Odd.. it works for me: > > > > # cat /etc/mdadm.conf > > AUTO -all > > # mdadm -Iv /dev/sda > > mdadm: /dev/sda has metadata type 1.x for which auto-assembly is disabled > > # mdadm -V > > mdadm - v3.2.2 - 17th June 2011 > > # > > > > Can you show the complete output of the same commands (with sdr8 in place of sda of course :-) > > I confirm the bug exists in 3.2.2 > I compiled from source 3.2.2 from your git to make sure > > ("git checkout mdadm-3.2.2" and then "make") Hmm - you are right. I must have been testing a half-baked intermediate. > > # ./mdadm -Iv /dev/sdat1 > mdadm: /dev/sdat1 attached to /dev/md/perftest:sr50d12p1n1, not enough > to start (1). > # ./mdadm --version > mdadm - v3.2.2 - 17th June 2011 > # cat /etc/mdadm/mdadm.conf > AUTO -all > > > however the good news is that the bug is gone in 3.2.3 (still from your git) > > # ./mdadm -Iv /dev/sdat1 > mdadm: /dev/sdat1 has metadata type 1.x for which auto-assembly is disabled > # ./mdadm --version > mdadm - v3.2.3 - 23rd December 2011 > # cat /etc/mdadm/mdadm.conf > AUTO -all > Oh good, I must have fixed it. > > > > > > However in 3.2.3 there is another bug, or else I don't understand how > AUTO works anymore: > > # hostname perftest > # hostname > perftest > # cat /etc/mdadm/mdadm.conf > HOMEHOST <system> > AUTO +homehost -all This should be AUTO homehost -all 'homehost' is not the name of a metadata type, it is a directive like 'yes' or 'no'. So no '+' is wanted. That said, there is a bug in there (fix just pushed out) but the above AUTO line works correctly. > # ./mdadm -Iv /dev/sdat1 > mdadm: /dev/sdat1 has metadata type 1.x for which auto-assembly is disabled > # ./mdadm --version > mdadm - v3.2.3 - 23rd December 2011 > > > ?? > Admittedly perftest is not the original hostname for this machine but it > shouldn't matter (does it go reading /etc/hostname directly?)... > Same result is if I make the mdadm.conf file like this > > HOMEHOST perftest > AUTO +homehost -all > > > Else, If I create the file like this: > > # cat /etc/mdadm/mdadm.conf > HOMEHOST <system> > AUTO +1.x homehost -all You removed the '+' from the homehost which is good, but added the "+1.x" which is not what you want - as I think you know. > # hostname > perftest > # ./mdadm -Iv /dev/sdat1 > mdadm: /dev/sdat1 attached to /dev/md/sr50d12p1n1, not enough to start (1). 
> # ./mdadm --version > mdadm - v3.2.3 - 23rd December 2011 > > > Now it works, BUT it works *too much*, look: > > # hostname foo > # hostname > foo > # ./mdadm -Iv /dev/sdat1 > mdadm: /dev/sdat1 attached to /dev/md/perftest:sr50d12p1n1, not enough > to start (1). > # cat /etc/mdadm/mdadm.conf > HOMEHOST <system> > AUTO +1.x homehost -all > # ./mdadm --version > mdadm - v3.2.3 - 23rd December 2011 > > > Same behaviour is if I make the mdadm.conf file with an explicit > HOMEHOST name: > # hostname > foo > # cat /etc/mdadm/mdadm.conf > HOMEHOST foo > AUTO +1.x homehost -all > # ./mdadm -Iv /dev/sdat1 > mdadm: /dev/sdat1 attached to /dev/md/perftest:sr50d12p1n1, not enough > to start (1). > # ./mdadm --version > mdadm - v3.2.3 - 23rd December 2011 > > > > It does not seem correct behaviour to me. > > If it is, could you explain how I should create the mdadm.conf file in > order for mdadm to autoassemble *all* arrays for this host (matching > `hostname` == array-hostname in 1.x) and never autoassemble arrays with > different hostname? > > Note I'm *not* using 0.90 metadata anywhere, so no special case is > needed for that metadata version > > > I'm not sure if 3.1.4 had the "correct" behaviour... Yesterday it seemed > to me it had, but today I can't seem to make it work anymore like I > intended. > > > > > > > > >> I have just regressed to mdadm 3.1.4 to confirm that it worked back > >> then, and yes, I confirm that 3.1.4 was not doing any action upon: > >> # mdadm -I /dev/sdr8 > >> --> nothing done > >> when the line in config was: > >> "AUTO -all" > >> or even > >> "AUTO +homehost -all" > >> which is the line I am normally using. > >> > >> > >> This is a problem in our fairly large system with 80+ HDDs and many > >> partitions which I am testing now which is full of every kind of arrays.... > >> I am normally using : "AUTO +homehost -all" to prevent assembling a > >> bagzillion of arrays at boot, also because doing that gives race > >> conditions at boot and drops me to initramfs shell (see below next bug). > >> > >> > >> > >> > >> > >> Another problem with 3.2.2: > >> > >> At boot, this is from a serial dump: > >> > >> udevd[218]: symlink '../../sdx13' > >> '/dev/disk/by-partlabel/Linux\x20RAID.udev-tmp' failed: File exists > >> udevd[189]: symlink '../../sdb1' > >> '/dev/disk/by-partlabel/Linux\x20RAID.udev-tmp' failed: File exists > >> > >> And sdb1 is not correctly inserted into array /dev/md0 which hence > >> starts degraded and so I am dropped into an initramfs shell. > >> This looks like a race condition... I don't know if this is fault of > >> udev, udev rules or mdadm... > >> This is with mdadm 3.2.2 and kernel 3.0.13 (called 3.0.0-15-server by > >> Ubuntu) on Ubuntu oneiric 11.10 > >> Having also the above bug of nonworking AUTO line, this problem happens > >> a lot with 80+ disks and lots of partitions. If the auto line worked, I > >> would have postponed most of the assembly's at a very late stage in the > >> boot process, maybe after a significant "sleep". > >> > >> > >> Actually this race condition could be an ubuntu udev script bug : > >> > >> Here are the ubuntu udev rules files I could find, related to mdadm or > >> containing "by-partlabel": > > It does look like a udev thing more than an mdadm thing. > > > > What do > > /dev/blkid -o udev -p /dev/sdb1 > > and > > /dev/blkid -o udev -p /dev/sdx12 > > > > report? > > Unfortunately I rebooted in the meanwhile. > Now sdb1 is assembled. 
> > I am pretty sure sdb1 is really the same device of the old boot so here > it goes: > > > # blkid -o udev -p /dev/sdb1 > ID_FS_UUID=d6557fd5-0233-0ca1-8882-200cec91b3a3 > ID_FS_UUID_ENC=d6557fd5-0233-0ca1-8882-200cec91b3a3 > ID_FS_UUID_SUB=0ffdf74a-36f9-7a7a-9dbe-653bb37bdc8a > ID_FS_UUID_SUB_ENC=0ffdf74a-36f9-7a7a-9dbe-653bb37bdc8a > ID_FS_LABEL=hardstorage1:grubarr > ID_FS_LABEL_ENC=hardstorage1:grubarr > ID_FS_VERSION=1.0 > ID_FS_TYPE=linux_raid_member > ID_FS_USAGE=raid > ID_PART_ENTRY_SCHEME=gpt > ID_PART_ENTRY_NAME=Linux\x20RAID > ID_PART_ENTRY_UUID=31c747e8-826f-48a3-ace0-c8063d489810 > ID_PART_ENTRY_TYPE=a19d880f-05fc-4d3b-a006-743f0f84911e > ID_PART_ENTRY_NUMBER=1 The "ID_PART_ENTRY_SCHEME=gpt" is causing the disk/by-partuuid link to be created and as you presumably have the same label on the other device (being the other half of a RAID1) the udev rules files will make the same symlink in both. So this is definitely a bug in the udev rules files. They should probably ignore ID_PART_ENTRY_SCHEME if ID_FS_USAGE=="raid". > > > regarding sdx13 (I suppose sdx12 was a typo) I don't guarantee it's the > same device as in the previous boot, because it's in the SAS-expanders > path... > However it will be something similar anyway > > # blkid -o udev -p /dev/sdx13 > ID_FS_UUID=527dd3b2-decf-4278-cb92-e47bcea21a39 > ID_FS_UUID_ENC=527dd3b2-decf-4278-cb92-e47bcea21a39 > ID_FS_UUID_SUB=c1751a32-0ef6-ff30-04ad-16322edfe9b1 > ID_FS_UUID_SUB_ENC=c1751a32-0ef6-ff30-04ad-16322edfe9b1 > ID_FS_LABEL=perftest:sr50d12p7n6 > ID_FS_LABEL_ENC=perftest:sr50d12p7n6 > ID_FS_VERSION=1.0 > ID_FS_TYPE=linux_raid_member > ID_FS_USAGE=raid > ID_PART_ENTRY_SCHEME=gpt > ID_PART_ENTRY_NAME=Linux\x20RAID > ID_PART_ENTRY_UUID=7a355609-793e-442f-b668-4168d2474f89 > ID_PART_ENTRY_TYPE=a19d880f-05fc-4d3b-a006-743f0f84911e > ID_PART_ENTRY_NUMBER=13 > > > Ok now I understand that I have hundreds of partitions, all with the same > ID_PART_ENTRY_NAME=Linux\x20RAID > and I am actually surprised to see only 2 clashes reported in the serial > console dump. > I confirm that once the system boots, only the last identically-named > symlink survives (obviously) > --------- > # ll /dev/disk/by-partlabel/ > total 0 > drwxr-xr-x 2 root root 60 Feb 7 16:54 ./ > drwxr-xr-x 8 root root 160 Feb 7 10:59 ../ > lrwxrwxrwx 1 root root 12 Feb 7 16:54 Linux\x20RAID -> ../../sdas16 > --------- > But strangely there were only 2 clashes reported by udev > > It it also interesting that sdb1 was the only partition which failed to > assemble among the 8 basic raid1 arrays I have at boot (which I know > really well and I checked at last boot and confirmed all other 15 > partitions sd[ab][12345678] were present and correctly assembled in > couples making /dev/md[01234567]) only sdb1 was missing, the same > partition that reported the clash... that's a bit too much for a > coincidence. > > What do you think? Do the other partitions have the ID_PART_ENTRY_SCHEME=gpt setting? NeilBrown > > Thank you > A. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 828 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
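Following that suggestion, a guard in the 60-persistent-storage.rules
fragment quoted earlier might look like the sketch below (not the actual
Ubuntu fix; it simply skips the name-based link for RAID members, whose
shared GPT name "Linux RAID" is what makes the by-partlabel symlinks
collide, while leaving the unique by-partuuid link alone):

  ENV{ID_PART_ENTRY_SCHEME}=="gpt", ENV{ID_PART_ENTRY_UUID}=="?*", SYMLINK+="disk/by-partuuid/$env{ID_PART_ENTRY_UUID}"
  ENV{ID_PART_ENTRY_SCHEME}=="gpt", ENV{ID_PART_ENTRY_NAME}=="?*", ENV{ID_FS_USAGE}!="raid", SYMLINK+="disk/by-partlabel/$env{ID_PART_ENTRY_NAME}"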
* Re: Some md/mdadm bugs 2012-02-06 17:07 ` Asdo 2012-02-06 18:47 ` Asdo @ 2012-02-06 22:20 ` NeilBrown 2012-02-07 17:47 ` Asdo 1 sibling, 1 reply; 13+ messages in thread From: NeilBrown @ 2012-02-06 22:20 UTC (permalink / raw) To: Asdo; +Cc: linux-raid [-- Attachment #1: Type: text/plain, Size: 3934 bytes --] On Mon, 06 Feb 2012 18:07:38 +0100 Asdo <asdo@shiftmail.org> wrote: > On 02/02/12 23:58, Asdo wrote: > > > >>> Now it doesn't happen: > >>> When I reinserted the disk, udev triggered the --incremental, to > >>> reinsert the device, but mdadm refused to do anything because the old > >>> slot was still occupied with a failed+detached device. I manually > >>> removed the device from the raid then I ran --incremental, but mdadm > >>> still refused to re-add the device to the RAID because the array was > >>> running. I think that if it is a re-add, and especially if the > >>> bitmap is > >>> active, I can't think of a situation in which the user would *not* want > >>> to do an incremental re-add even if the array is running. > >> Hmmm.. that doesn't seem right. What version of mdadm are you running? > > > > 3.1.4 > > > >> Maybe a newer one would get this right. > > I need to try... > > I think I need that. > > Hi Neil, > > Still some problems on mdadm 3.2.2 (from Ubuntu Precise) apparently: > > Problem #1: > > # mdadm -If /dev/sda4 > mdadm: incremental removal requires a kernel device name, not a file: > /dev/sda4 > > however this works: > > # mdadm -If sda4 > mdadm: set sda4 faulty in md3 > mdadm: hot removed sda4 from md3 > > Is this by design? Yes. > Would your udev rule > ACTION=="remove", RUN+="/sbin/mdadm -If $name" > trigger the first or the second kind of invocation? Yes. > > > Problem #2: > > by reinserting sda, it became sdax, and the array is still running like > this: > > md3 : active raid1 sdb4[2] > 10485688 blocks super 1.0 [2/1] [_U] > bitmap: 0/160 pages [0KB], 32KB chunk > > please note the bitmap is active True, but there is nothing in it (0 pages). That implies that no bits are set. I guess that is possible if nothing has been written to the array since the other device was removed. > > so now I'm trying auto hot-add: > > # mdadm -I /dev/sdax4 > mdadm: not adding /dev/sdax4 to active array (without --run) /dev/md3 > > still the old problem I mentioned with 3.1.4. I need to see -E and -X output on both drives to be able to see what is happening here. Also the content of /etc/mdadm.conf might be relevant. If you could supply that info I might be able to explain what is happening. > Trying more ways: (even with the "--run" which is suggested) > > # mdadm --run -I /dev/sdax4 > mdadm: -I would set mdadm mode to "incremental", but it is already set > to "misc". > > # mdadm -I --run /dev/sdax4 > mdadm: failed to add /dev/sdax4 to /dev/md3: Invalid argument. > Hmm... I'm able to reproduce something like this. Following patch seems to fix it, but I need to check the code more thoroughly to be sure. Note that this will *not* fix the "not adding ... not active array" problem. 
NeilBrown

diff --git a/Incremental.c b/Incremental.c
index 60175af..2be0d05 100644
--- a/Incremental.c
+++ b/Incremental.c
@@ -415,19 +415,19 @@ int Incremental(char *devname, int verbose, int runstop,
 			goto out_unlock;
 		}
 	}
-	info2.disk.major = major(stb.st_rdev);
-	info2.disk.minor = minor(stb.st_rdev);
+	info.disk.major = major(stb.st_rdev);
+	info.disk.minor = minor(stb.st_rdev);
 	/* add disk needs to know about containers */
 	if (st->ss->external)
 		sra->array.level = LEVEL_CONTAINER;
-	err = add_disk(mdfd, st, sra, &info2);
+	err = add_disk(mdfd, st, sra, &info);
 	if (err < 0 && errno == EBUSY) {
 		/* could be another device present with the same
 		 * disk.number. Find and reject any such
 		 */
 		find_reject(mdfd, st, sra, info.disk.number,
 			    info.events, verbose, chosen_name);
-		err = add_disk(mdfd, st, sra, &info2);
+		err = add_disk(mdfd, st, sra, &info);
 	}
 	if (err < 0) {
 		fprintf(stderr, Name ": failed to add %s to %s: %s.\n",

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply related [flat|nested] 13+ messages in thread
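For testing, the hunk above can be applied to a checkout of the mdadm source
tree and the binary rebuilt in place; the patch file name below is arbitrary:

  cd mdadm
  patch -p1 < incremental-add.patch
  make
  ./mdadm --version

which matches the locally built ./mdadm used in the test results that follow.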
* Re: Some md/mdadm bugs 2012-02-06 22:20 ` NeilBrown @ 2012-02-07 17:47 ` Asdo 0 siblings, 0 replies; 13+ messages in thread From: Asdo @ 2012-02-07 17:47 UTC (permalink / raw) To: NeilBrown; +Cc: linux-raid On 02/06/12 23:20, NeilBrown wrote: >> >> Problem #2: >> >> by reinserting sda, it became sdax, and the array is still running like >> this: >> >> md3 : active raid1 sdb4[2] >> 10485688 blocks super 1.0 [2/1] [_U] >> bitmap: 0/160 pages [0KB], 32KB chunk >> >> please note the bitmap is active > True, but there is nothing in it (0 pages). That implies that no bits are > set. I guess that is possible if nothing has been written to the array since > the other device was removed. Almost certain: the array is not really in use (no lvm, not mounted) even if running >> so now I'm trying auto hot-add: >> >> # mdadm -I /dev/sdax4 >> mdadm: not adding /dev/sdax4 to active array (without --run) /dev/md3 >> >> still the old problem I mentioned with 3.1.4. > I need to see -E and -X output on both drives to be able to see what is > happening here. Also the content of /etc/mdadm.conf might be relevant. > If you could supply that info I might be able to explain what is happening. Please note the names changed since yesterday, because of hot-swap tests and reboots: now it's sda4 and sdb4 md3 : active raid1 sdb4[2] 10485688 blocks super 1.0 [2/1] [_U] bitmap: 0/160 pages [0KB], 32KB chunk # ./mdadm -E /dev/sda4 /dev/sda4: Magic : a92b4efc Version : 1.0 Feature Map : 0x1 Array UUID : 8da28111:cdb69fa9:8d607b48:78fb102d Name : hardstorage1:sys2boot Creation Time : Mon Mar 21 16:13:46 2011 Raid Level : raid1 Raid Devices : 2 Avail Dev Size : 20971376 (10.00 GiB 10.74 GB) Array Size : 20971376 (10.00 GiB 10.74 GB) Super Offset : 20971504 sectors State : clean Device UUID : c470ba58:897d9cb5:4054c89a:d41608d3 Internal Bitmap : -81 sectors from superblock Update Time : Tue Feb 7 17:25:16 2012 Checksum : a4deb673 - correct Events : 106 Device Role : Active device 0 Array State : AA ('A' == active, '.' == missing) # ./mdadm -X /dev/sda4 Filename : /dev/sda4 Magic : 6d746962 Version : 4 UUID : 8da28111:cdb69fa9:8d607b48:78fb102d Events : 106 Events Cleared : 61 State : OK Chunksize : 32 KB Daemon : 5s flush period Write Mode : Normal Sync Size : 10485688 (10.00 GiB 10.74 GB) Bitmap : 327678 bits (chunks), 0 dirty (0.0%) # ./mdadm -E /dev/sdb4 /dev/sdb4: Magic : a92b4efc Version : 1.0 Feature Map : 0x1 Array UUID : 8da28111:cdb69fa9:8d607b48:78fb102d Name : hardstorage1:sys2boot Creation Time : Mon Mar 21 16:13:46 2011 Raid Level : raid1 Raid Devices : 2 Avail Dev Size : 20971376 (10.00 GiB 10.74 GB) Array Size : 20971376 (10.00 GiB 10.74 GB) Super Offset : 20971504 sectors State : clean Device UUID : 0c978768:dccaa84d:4cbe07ee:501f863e Internal Bitmap : -81 sectors from superblock Update Time : Tue Feb 7 17:29:06 2012 Checksum : b769d7e - correct Events : 108 Device Role : Active device 1 Array State : .A ('A' == active, '.' == missing) # ./mdadm -X /dev/sdb4 Filename : /dev/sdb4 Magic : 6d746962 Version : 4 UUID : 8da28111:cdb69fa9:8d607b48:78fb102d Events : 108 Events Cleared : 61 State : OK Chunksize : 32 KB Daemon : 5s flush period Write Mode : Normal Sync Size : 10485688 (10.00 GiB 10.74 GB) Bitmap : 327678 bits (chunks), 0 dirty (0.0%) # cat /etc/mdadm/mdadm.conf AUTO +1.x (I made it simple :-D ) >> Trying more ways: (even with the "--run" which is suggested) >> >> # mdadm --run -I /dev/sdax4 >> mdadm: -I would set mdadm mode to "incremental", but it is already set >> to "misc". 
>>
>> # mdadm -I --run /dev/sdax4
>> mdadm: failed to add /dev/sdax4 to /dev/md3: Invalid argument.
>>
> Hmm... I'm able to reproduce something like this.
>
> Following patch seems to fix it, but I need to check the code more
> thoroughly to be sure.

Congrats, it really seems to fix it at least for 3.2.3:

before (with 3.2.3 from your git):

# ./mdadm -I /dev/sda4
mdadm: not adding /dev/sda4 to active array (without --run) /dev/md3
# ./mdadm -I --run /dev/sda4
mdadm: failed to add /dev/sda4 to /dev/md3: Invalid argument.

3.2.3 + your patch:

# ./mdadm -I /dev/sda4
mdadm: not adding /dev/sda4 to active array (without --run) /dev/md3
# ./mdadm -I --run /dev/sda4
mdadm: /dev/sda4 attached to /dev/md3 which is already active.

> Note that this will *not* fix the "not adding ... not active array" problem.

it's not a:         "not adding ... to not active array..."
but instead it's a: "not adding ... to *active* array..."

However, yes, I think the behaviour without --run should be different
than it is now.

Thanks for your help
A.

^ permalink raw reply [flat|nested] 13+ messages in thread