From: "Tomáš Dulík" <dulik@unart.cz>
To: Doug Ledford <dledford@redhat.com>
Cc: Linux RAID Mailing List <linux-raid@vger.kernel.org>
Subject: Re: [Patch mdadm] Add hot-unplug support to mdadm
Date: Tue, 13 Apr 2010 11:28:24 +0200
Message-ID: <4BC43938.2020109@unart.cz>
In-Reply-To: <20100409103330.37d9dff5@notabene.brown>
Hi Doug,
first of all: thanks for your work on hot-unplug!
I am new to Linux RAID. I had been using HW RAID before, but after my LSI
controller burned to ashes I decided I don't want to see HW RAID ... ever.
The first thing I found weird about Linux RAID was the missing support for
dead device removal.
I spent the last 3 weeks trying to write various scripts to handle UDEV
"remove" and mdadm "Fail" events, but in the end I found the same thing
you did: it is not possible to remove a dead device from an array,
because the events are issued too late. The only way to remove a dead
device is to reboot, which is not what I would expect as a solution in
the Linux world.
So I downloaded your code from Neil's git
(http://neil.brown.name/git?p=mdadm;a=shortlog;h=refs/heads/hotunplug)
and also applied the "Minor incremental fixup" mentioned in your message
below.
The compiled mdadm works OK for normal operations (--fail, --remove,
--add), but crashes with a segmentation fault on the "--incremental
--fail" operation if I use it on a disk that I have just disconnected.
Here is what I got:
# gdb --args ./mdadm -If sda3
GNU gdb 6.8-debian
This GDB was configured as "x86_64-linux-gnu"...
(gdb) run
Starting program: /root/mdadm-git/mdadm/mdadm -If sda3
Program received signal SIGSEGV, Segmentation fault.
0x000000000040a796 in mdstat_by_component (name=0x7fff0d0aee83 "sda3") at mdstat.c:351
351 if (ent->metadata_version &&
(gdb) where
#0 0x000000000040a796 in mdstat_by_component (name=0x7fff0d0aee83 "sda3") at mdstat.c:351
#1 0x000000000042411c in IncrementalRemove (devname=0x7fff0d0aee83 "sda3", verbose=0) at Incremental.c:867
#2 0x00000000004075a7 in main (argc=3, argv=0x7fff0d0ad698) at mdadm.c:1545
It does not matter whether I use sda3 or sda; the result is the same.
What am I doing wrong?
This is my environment:
# uname -a
Linux xeric 2.6.26-2-xen-amd64 #1 SMP Thu Nov 5 04:27:12 UTC 2009 x86_64 GNU/Linux
# modinfo md_mod
filename: /lib/modules/2.6.26-2-xen-amd64/kernel/drivers/md/md-mod.ko
alias: block-major-9-*
alias: md
license: GPL
depends:
vermagic: 2.6.26-2-xen-amd64 SMP mod_unload modversions Xen
parm: start_dirty_degraded:int
# cat /proc/mdstat
Personalities : [raid1]
md2 : active (auto-read-only) raid1 sda3[0] sdb3[1]
9767424 blocks [2/2] [UU]
bitmap: 0/150 pages [0KB], 32KB chunk
md1 : active raid1 sda2[2](F) sdb2[1]
468752512 blocks [2/1] [_U]
bitmap: 18/224 pages [72KB], 1024KB chunk
md0 : active raid1 sda1[0] sdb1[1]
497856 blocks [2/2] [UU]
bitmap: 0/61 pages [0KB], 4KB chunk
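For anyone scripting around this: in the listing above, the failed member
is the one tagged "(F)" on the md1 line (sda2). Below is a minimal sketch
of picking such members out of a single /proc/mdstat array line. The
helper name faulty_members and its buffer sizes are my own assumptions for
illustration; this is not how mdadm itself parses mdstat.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of mdadm): scan one array line from
 * /proc/mdstat, e.g. "md1 : active raid1 sda2[2](F) sdb2[1]", and copy
 * the names of members tagged "(F)" into out[]. Returns the number of
 * names stored, at most max.
 */
static int faulty_members(const char *line, char out[][32], int max)
{
    char buf[256];
    char *tok;
    char *br;
    size_t len;
    int n = 0;

    /* Work on a private copy because strtok() modifies its input. */
    strncpy(buf, line, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';

    for (tok = strtok(buf, " \t\n"); tok != NULL && n < max;
         tok = strtok(NULL, " \t\n")) {
        /* A member token looks like name[slot], optionally with flags. */
        br = strchr(tok, '[');
        if (br == NULL || strstr(br, "(F)") == NULL)
            continue;
        len = (size_t)(br - tok);
        if (len == 0 || len >= 32)
            continue;   /* no name before '[', or name too long */
        memcpy(out[n], tok, len);
        out[n][len] = '\0';
        n++;
    }
    return n;
}
```

Called on the md1 line above with a four-entry array, this should return 1
with the first entry holding "sda2"; on the md0 line it should return 0.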
Thanks for your help!
Tomas Dulik,
FAI TBU Zlin,
Nad Stranemi 4511,
CZECH REPUBLIC
phone: +420 57 603 5187
On 04/05/2010 12:40 PM, Doug Ledford wrote:
> Minor incremental fixup: In the case of passing in faulty or
> disconnected as the device name, since we now use the value of tfd to
> determine if we should attempt ioctls or go straight to using sysfs
> entries, we now need to make sure we init tfd and then set it properly
> in both of the loops where we check for faulty and disconnected devices
> (although I'm now highly suspicious of the faulty check code as I
> suspect all the faulty devices will have the same problem that our hot
> unplug code ran into and the faulty devices will not be openable and
> that will mean that passing in faulty is probably just broken at this
> point in time...but that's another patch for another day).
>
> --
> Doug Ledford <dledford@xxxxxxxxxx>
> GPG KeyID: CFBFF194
> http://people.redhat.com/dledford
>
> Infiniband specific RPMs available at
> http://people.redhat.com/dledford/Infiniband