From: "Tomáš Dulík" <dulik@unart.cz>
To: Doug Ledford <dledford@redhat.com>
Cc: Linux RAID Mailing List <linux-raid@vger.kernel.org>
Subject: Re: [Patch mdadm] Add hot-unplug support to mdadm
Date: Tue, 13 Apr 2010 11:28:24 +0200
Message-ID: <4BC43938.2020109@unart.cz>
In-Reply-To: <20100409103330.37d9dff5@notabene.brown>
Hi Doug,
First of all: thanks for your work on hot-unplug!
I am new to Linux RAID. I used HW RAID before, but after my LSI
controller burned to ashes I decided I don't want to see HW RAID ... ever.
The first thing I found weird about Linux RAID was the missing support for
removing dead devices.
I spent the last 3 weeks trying to write various scripts to handle udev
"remove" and mdadm "Fail" events, but in the end I found the same thing
you did: it is not possible to remove a dead device from an array,
because the events are issued too late. The only way to remove a dead
device is to reboot, which is not what I would expect as a solution in the
Linux world.
So I downloaded your code from Neil's git
(http://neil.brown.name/git?p=mdadm;a=shortlog;h=refs/heads/hotunplug)
and also applied the "Minor incremental fixup" mentioned in your message
below.
The compiled mdadm works OK for normal operations (--fail, --remove,
--add), but crashes with a segmentation fault on the "--incremental
--fail" operation when I use it on a disk that I have just disconnected.
Here is what I got:
# gdb --args ./mdadm -If sda3
GNU gdb 6.8-debian
This GDB was configured as "x86_64-linux-gnu"...
(gdb) run
Starting program: /root/mdadm-git/mdadm/mdadm -If sda3
Program received signal SIGSEGV, Segmentation fault.
0x000000000040a796 in mdstat_by_component (name=0x7fff0d0aee83 "sda3") at mdstat.c:351
351 if (ent->metadata_version &&
(gdb) where
#0 0x000000000040a796 in mdstat_by_component (name=0x7fff0d0aee83 "sda3") at mdstat.c:351
#1 0x000000000042411c in IncrementalRemove (devname=0x7fff0d0aee83 "sda3", verbose=0) at Incremental.c:867
#2 0x00000000004075a7 in main (argc=3, argv=0x7fff0d0ad698) at mdadm.c:1545
It does not matter whether I use sda3 or sda; the result is the same.
What am I doing wrong?
This is my environment:
# uname -a
Linux xeric 2.6.26-2-xen-amd64 #1 SMP Thu Nov 5 04:27:12 UTC 2009 x86_64 GNU/Linux
# modinfo md_mod
filename: /lib/modules/2.6.26-2-xen-amd64/kernel/drivers/md/md-mod.ko
alias: block-major-9-*
alias: md
license: GPL
depends:
vermagic: 2.6.26-2-xen-amd64 SMP mod_unload modversions Xen
parm: start_dirty_degraded:int
# cat /proc/mdstat
Personalities : [raid1]
md2 : active (auto-read-only) raid1 sda3[0] sdb3[1]
9767424 blocks [2/2] [UU]
bitmap: 0/150 pages [0KB], 32KB chunk
md1 : active raid1 sda2[2](F) sdb2[1]
468752512 blocks [2/1] [_U]
bitmap: 18/224 pages [72KB], 1024KB chunk
md0 : active raid1 sda1[0] sdb1[1]
497856 blocks [2/2] [UU]
bitmap: 0/61 pages [0KB], 4KB chunk
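For readers following the backtrace: judging by its name, the routine that crashes maps a component such as sda3 to the array that contains it by walking /proc/mdstat. Below is a minimal, hypothetical C sketch of that kind of lookup, run against text in the format shown above. It is not mdadm's mdstat_by_component(); the name find_array_for_component and its signature are assumptions made purely for illustration, and the sketch does not diagnose the segfault itself.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (NOT mdadm code): scan text in /proc/mdstat format
 * and copy the name of the array that lists `component` as a member
 * into `out`. Returns 1 if an array was found, 0 otherwise.
 */
int find_array_for_component(const char *mdstat_text, const char *component,
                             char *out, size_t out_size)
{
    /* Guard every pointer before dereferencing it. */
    if (!mdstat_text || !component || !out || out_size == 0)
        return 0;

    size_t clen = strlen(component);
    const char *line = mdstat_text;

    while (*line) {
        const char *eol = strchr(line, '\n');
        size_t len = eol ? (size_t)(eol - line) : strlen(line);
        const char *end = line + len;

        /* Array lines look like "md1 : active raid1 sda2[2](F) sdb2[1]". */
        if (len > 2 && line[0] == 'm' && line[1] == 'd') {
            const char *sep = memchr(line, ' ', len); /* end of array name */
            size_t name_len = sep ? (size_t)(sep - line) : 0;
            const char *p = sep;

            /* Walk the space-separated tokens that follow the name. */
            while (p && p < end) {
                while (p < end && *p == ' ')
                    p++;
                const char *tok = p;
                while (p < end && *p != ' ')
                    p++;
                size_t tlen = (size_t)(p - tok);

                /* A member token is "<component>[<slot>]" plus optional flags. */
                if (tlen > clen && strncmp(tok, component, clen) == 0 &&
                    tok[clen] == '[') {
                    if (name_len == 0 || name_len >= out_size)
                        return 0;
                    memcpy(out, line, name_len);
                    out[name_len] = '\0';
                    return 1;
                }
            }
        }
        line = eol ? eol + 1 : end;
    }
    return 0;
}
```

With the listing above as input, "sda3" resolves to md2 and "sda2" to md1. A whole-disk name such as "sda" finds nothing in this sketch, because only an exact member name followed by a slot number in brackets counts as a match.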
Thanks for your help!
Tomas Dulik,
FAI TBU Zlin,
Nad Stranemi 4511,
CZECH REPUBLIC
phone: +420 57 603 5187
On 04/05/2010 12:40 PM, Doug Ledford wrote:
> Minor incremental fixup: In the case of passing in faulty or
> disconnected as the device name, since we now use the value of tfd to
> determine if we should attempt ioctls or go straight to using sysfs
> entries, we now need to make sure we init tfd and then set it properly
> in both of the loops where we check for faulty and disconnected devices
> (although I'm now highly suspicious of the faulty check code as I
> suspect all the faulty devices will have the same problem that our hot
> unplug code ran into and the faulty devices will not be openable and
> that will mean that passing in faulty is probably just broken at this
> point in time...but that's another patch for another day).
>
> --
> Doug Ledford <dledford@xxxxxxxxxx>
> GPG KeyID: CFBFF194
> http://people.redhat.com/dledford
>
> Infiniband specific RPMs available at
> http://people.redhat.com/dledford/Infiniband