From: "Tomáš Dulík" <dulik@unart.cz>
To: Doug Ledford <dledford@redhat.com>
Cc: Linux RAID Mailing List <linux-raid@vger.kernel.org>
Subject: Re: [Patch mdadm] Add hot-unplug support to mdadm
Date: Tue, 13 Apr 2010 11:28:24 +0200
Message-ID: <4BC43938.2020109@unart.cz>
In-Reply-To: <20100409103330.37d9dff5@notabene.brown>

Hi Doug,

first of all: thanks for your work on hot-unplug!
I am new to Linux RAID. I used HW RAID before, but after my LSI 
controller burned to ashes I decided I don't want to see HW RAID ... ever.

The first thing I found weird about Linux RAID was the missing support 
for removing dead devices.
I spent the last 3 weeks trying to write various scripts to handle udev 
"remove" and mdadm "Fail" events, but in the end I found the same thing 
you did: it is not possible to remove a dead device from an array, 
because the events are issued too late. The only way to remove a dead 
device is to reboot, which is not what I would expect as a solution in 
the Linux world.

So I downloaded your code from Neil's git 
(http://neil.brown.name/git?p=mdadm;a=shortlog;h=refs/heads/hotunplug)
and also applied the "Minor incremental fixup" mentioned in your message 
below.

The compiled mdadm works OK for normal operations (--fail, --remove, 
--add), but crashes with a segmentation fault on the "--incremental 
--fail" operation when I use it on a disk that I have just disconnected.
Here is what I got:

# gdb --args ./mdadm -If sda3
GNU gdb 6.8-debian
This GDB was configured as "x86_64-linux-gnu"...
(gdb) run
Starting program: /root/mdadm-git/mdadm/mdadm -If sda3
Program received signal SIGSEGV, Segmentation fault.
0x000000000040a796 in mdstat_by_component (name=0x7fff0d0aee83 "sda3") at mdstat.c:351
351                     if (ent->metadata_version &&
(gdb) where
#0  0x000000000040a796 in mdstat_by_component (name=0x7fff0d0aee83 "sda3") at mdstat.c:351
#1  0x000000000042411c in IncrementalRemove (devname=0x7fff0d0aee83 "sda3", verbose=0) at Incremental.c:867
#2  0x00000000004075a7 in main (argc=3, argv=0x7fff0d0ad698) at mdadm.c:1545

It does not matter if I use sda3 or sda, the result is the same.
What am I doing wrong?

This is my environment:
# uname -a
Linux xeric 2.6.26-2-xen-amd64 #1 SMP Thu Nov 5 04:27:12 UTC 2009 x86_64 GNU/Linux

# modinfo md_mod
filename:       /lib/modules/2.6.26-2-xen-amd64/kernel/drivers/md/md-mod.ko
alias:          block-major-9-*
alias:          md
license:        GPL
depends:
vermagic:       2.6.26-2-xen-amd64 SMP mod_unload modversions Xen
parm:           start_dirty_degraded:int

# cat /proc/mdstat
Personalities : [raid1]
md2 : active (auto-read-only) raid1 sda3[0] sdb3[1]
      9767424 blocks [2/2] [UU]
      bitmap: 0/150 pages [0KB], 32KB chunk

md1 : active raid1 sda2[2](F) sdb2[1]
      468752512 blocks [2/1] [_U]
      bitmap: 18/224 pages [72KB], 1024KB chunk

md0 : active raid1 sda1[0] sdb1[1]
      497856 blocks [2/2] [UU]
      bitmap: 0/61 pages [0KB], 4KB chunk
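
As an aside, the failed member in the listing above (sda2 in md1, marked 
with "(F)") can be picked out by a script. The following is only a minimal 
sketch: the function name list_faulty_members is made up for illustration 
and is not part of mdadm, and it assumes array lines have the form 
"mdX : active ... member[N](F) ..." as shown above.

```shell
# list_faulty_members (hypothetical helper, not part of mdadm):
# read /proc/mdstat-style text on stdin and print "array member"
# for every component device flagged with (F).
list_faulty_members() {
    awk '
        # Array lines look like: md1 : active raid1 sda2[2](F) sdb2[1]
        $1 ~ /^md/ && $2 == ":" {
            for (i = 3; i <= NF; i++) {
                if ($i ~ /\(F\)/) {
                    member = $i
                    sub(/\[.*/, "", member)   # drop the "[2](F)" suffix
                    print $1, member
                }
            }
        }
    '
}
```

Run as "list_faulty_members < /proc/mdstat"; for the listing above it 
would print "md1 sda2". It only reports what the kernel already marks as 
faulty and does not attempt the removal itself.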


Thanks for your help!

Tomas Dulik,
FAI TBU Zlin,
Nad Stranemi 4511,
CZECH REPUBLIC
phone: +420 57 603 5187

On 04/05/2010 12:40 PM, Doug Ledford wrote:
> Minor incremental fixup: In the case of passing in faulty or
> disconnected as the device name, since we now use the value of tfd to
> determine if we should attempt ioctls or go straight to using sysfs
> entries, we now need to make sure we init tfd and then set it properly
> in both of the loops where we check for faulty and disconnected devices
> (although I'm now highly suspicious of the faulty check code as I
> suspect all the faulty devices will have the same problem that our hot
> unplug code ran into and the faulty devices will not be openable and
> that will mean that passing in faulty is probably just broken at this
> point in time...but that's another patch for another day).
>
> -- 
> Doug Ledford <dledford@xxxxxxxxxx>
> GPG KeyID: CFBFF194
> http://people.redhat.com/dledford
>
> Infiniband specific RPMs available at
> http://people.redhat.com/dledford/Infiniband 

