Re: raidhotadd works, mdadm --add doesn't

From: Bill Davidsen <davidsen@tmr.com>
To: Leon Avery <leon@eatworms.swmed.edu>
Cc: linux-raid@vger.kernel.org
Subject: Re: raidhotadd works, mdadm --add doesn't
Date: Thu, 14 Sep 2006 18:30:35 -0400
Message-ID: <4509D80B.5020806@tmr.com>
In-Reply-To: <7.0.1.0.0.20060910150522.0360c060@eatworms.swmed.edu>

Leon Avery wrote:

> I've been using RAID for a long time, but have been using the old 
> raidtools.  Having just discovered mdadm, I want to switch, but I'm 
> having trouble.  I'm trying to figure out how to use mdadm to replace 
> a failed disk.  Here is my /proc/mdstat:
>
>     Personalities : [linear] [raid1]
>     read_ahead 1024 sectors
>     md5 : active linear md3[1] md4[0]
>           1024504832 blocks 64k rounding
>
>     md4 : active raid1 hdf5[0] hdh5[1]
>           731808832 blocks [2/2] [UU]
>
>     md3 : active raid1 hde5[0] hdg5[1]
>           292696128 blocks [2/2] [UU]
>
>     md2 : active raid1 hda5[0] hdc5[1]
>           48339456 blocks [2/2] [UU]
>
>     md0 : active raid1 hda3[0] hdc3[1]
>           9765376 blocks [2/2] [UU]
>
>     unused devices: <none>
>
> The relevant parts are md0 and md2.  Physical disk hda failed, which 
> left md0 and md2 running in degraded mode.  Having an old spare used 
> disk sitting on the shelf, I plugged it in, repartitioned it, and said
>
>     mdadm --add /dev/md0 /dev/hda3

Did you remove the failed hda3 from the array before adding the replacement?
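
If the old member is still listed as faulty in the array, the usual sequence is to fail and remove it before adding the replacement. What follows is a minimal sketch, not something tested on a 2.4 kernel. The helper names drop_member and add_member and the DRY_RUN switch are made up for illustration; the --fail, --remove and --add options are the ones mdadm documents for managing a running array.

```shell
#!/bin/sh
# Sketch only: detach a failed member from an md array, then hot-add a replacement.
# DRY_RUN defaults to 1, so commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# drop_member ARRAY PARTITION (hypothetical helper)
# Mark the old member faulty, then remove it from the array.
drop_member() {
    run mdadm "$1" --fail "$2"
    run mdadm "$1" --remove "$2"
}

# add_member ARRAY PARTITION (hypothetical helper)
# Hot-add the partition on the replacement disk.
add_member() {
    run mdadm "$1" --add "$2"
}
```

The intended order would be drop_member /dev/md0 /dev/hda3, then swap and partition the disk, then add_member /dev/md0 /dev/hda3. With DRY_RUN=1 the functions only print the commands, so the sequence can be reviewed before running it for real with DRY_RUN=0.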

>
> This appeared to work, but when I looked at mdstat, hda3 was marked as 
> failed, and md0 was still running degraded.  I then foolishly tried
>
>     mdadm --add /dev/md0 /dev/hda3 --run
>
> That caused a kernel panic and crashed my system.
>
> I rebooted and said
>
>     raidhotadd /dev/md0 /dev/hda3
>
> That worked perfectly, and reconstruction started immediately.  So, 
> although I don't actually have a problem at the moment, I still 
> haven't figured out how to make mdadm hot-add a replacement disk.
>
> Examination of the syslog was interesting if not exactly informative.  
> Here's the relevant extract from the attempt to use mdadm:
>
>     Sep 10 06:50:28 eatworms kernel: md: trying to hot-add hda3 to md0 ...
>     Sep 10 06:50:28 eatworms kernel: md: bind<hda3,2>
>     Sep 10 06:50:28 eatworms kernel: RAID1 conf printout:
>     Sep 10 06:50:28 eatworms kernel:  --- wd:1 rd:2 nd:1
>     Sep 10 06:50:28 eatworms kernel:  disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00]
>     Sep 10 06:50:28 eatworms kernel:  disk 1, s:0, o:1, n:1 rd:1 us:1 dev:hdc3
>         ...snip...
>     Sep 10 06:50:28 eatworms kernel: RAID1 conf printout:
>     Sep 10 06:50:28 eatworms kernel:  --- wd:1 rd:2 nd:2
>     Sep 10 06:50:28 eatworms kernel:  disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00]
>     Sep 10 06:50:28 eatworms kernel:  disk 1, s:0, o:1, n:1 rd:1 us:1 dev:hdc3
>     Sep 10 06:50:28 eatworms kernel:  disk 2, s:1, o:0, n:2 rd:2 us:1 dev:hda3
>         ...snip...
>     Sep 10 06:50:28 eatworms kernel: md: updating md0 RAID superblock on device
>     Sep 10 06:50:28 eatworms kernel: md: hda3 [events: 0000038c]<6>(write) hda3's sb offset: -64
>     Sep 10 06:50:28 eatworms kernel: attempt to access beyond end of device
>     Sep 10 06:50:28 eatworms kernel: 03:03: rw=1, want=2147483588, limit=1
>     Sep 10 06:50:28 eatworms kernel: md: write_disk_sb failed for device hda3
>         ...followed by several retries of this before giving up
>
> The problem seems to be the negative superblock offset.  In contrast, 
> the section after the raidhotadd looks like this:
>
>     Sep 10 07:12:29 eatworms kernel: md: trying to hot-add hda3 to md0 ...
>     Sep 10 07:12:29 eatworms kernel: md: bind<hda3,2>
>     Sep 10 07:12:29 eatworms kernel: RAID1 conf printout:
>     Sep 10 07:12:29 eatworms kernel:  --- wd:1 rd:2 nd:1
>     Sep 10 07:12:29 eatworms kernel:  disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00]
>     Sep 10 07:12:29 eatworms kernel:  disk 1, s:0, o:1, n:1 rd:1 us:1 dev:hdc3
>         ...snip...
>     Sep 10 07:12:29 eatworms kernel: RAID1 conf printout:
>     Sep 10 07:12:29 eatworms kernel:  --- wd:1 rd:2 nd:2
>     Sep 10 07:12:29 eatworms kernel:  disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00]
>     Sep 10 07:12:29 eatworms kernel:  disk 1, s:0, o:1, n:1 rd:1 us:1 dev:hdc3
>     Sep 10 07:12:29 eatworms kernel:  disk 2, s:1, o:0, n:2 rd:2 us:1 dev:hda3
>         ...snip...
>     Sep 10 07:12:29 eatworms kernel: md: updating md0 RAID superblock on device
>     Sep 10 07:12:29 eatworms kernel: md: hda3 [events: 00000459]<6>(write) hda3's sb offset: 9765440
>     Sep 10 07:12:29 eatworms kernel: md: hdc3 [events: 00000459]<6>(write) hdc3's sb offset: 9765440
>
> Here we have a reasonable offset of 9765440 and everything works fine.
>
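
Both offsets can be checked by hand. This is a sketch that assumes the 0.90 superblock layout (the superblock sits in the last 64 KiB-aligned 64 KiB block of the device) and 32-bit unsigned sector arithmetic in the 2.4 block layer. The helper names sb_offset and want_value are hypothetical, and 9765504 is just the smallest device size in KiB that yields the offset in the raidhotadd log, not a measured value.

```shell
#!/bin/sh
# sb_offset SIZE_KIB (hypothetical helper)
# Offset of a 0.90 superblock in 1 KiB blocks: round the device size
# down to a multiple of 64, then step back one 64 KiB block.
sb_offset() {
    echo $(( ($1 & ~63) - 64 ))
}

# want_value OFFSET_KIB (hypothetical helper)
# Assumed model of the "want=" figure: a 4 KiB write (8 sectors) at the
# given offset, with the end sector wrapped to 32 bits and halved to KiB.
want_value() {
    end_sector=$(( $1 * 2 + 8 ))
    echo $(( (end_sector & 4294967295) >> 1 ))
}

sb_offset 1          # a device size below 64 KiB
sb_offset 9765504    # assumed size matching the raidhotadd log
want_value -64       # write at the negative offset
```

Under those assumptions, any device size under 64 KiB gives an offset of -64, and a 4 KiB write at that offset gives want=2147483588, the value in the failing log. If limit=1 there is the size of hda3 in KiB as the kernel saw it, that would point at a stale or wrong partition size at the time of the mdadm --add rather than at the superblock calculation itself.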
> I suppose this could be an mdadm bug, but it seems more likely that 
> I'm doing something stupid.  Could someone enlighten me?
>
> My system config (uname -a):
>
>     Linux eatworms.swmed.edu 2.4.22e #1 Tue Feb 17 13:37:36 CST 2004 i686 unknown unknown GNU/Linux
>
>
> -- 
> Leon Avery                                        (214) 648-4931 (voice)
> Department of Molecular Biology                            -1488 (fax)
> University of Texas Southwestern Medical Center
> 6000 Harry Hines Blvd                            leon@eatworms.swmed.edu
> Dallas, TX  75390-9148                  http://eatworms.swmed.edu/~leon/
>


-- 
bill davidsen <davidsen@tmr.com>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979
