From: NeilBrown <neilb@suse.de>
To: Albert Pauw <albert.pauw@gmail.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: Version 3.2.5 and ddf issues (bugreport)
Date: Wed, 15 Aug 2012 09:31:05 +1000
Message-ID: <20120815093105.05402855@notabene.brown>
In-Reply-To: <50179B62.9020603@gmail.com>
On Tue, 31 Jul 2012 10:46:26 +0200 Albert Pauw <albert.pauw@gmail.com> wrote:
> On 07/31/2012 08:11 AM, NeilBrown wrote:
> > On Sat, 28 Jul 2012 13:46:06 +0200 Albert Pauw <albert.pauw@gmail.com> wrote:
> >
> >> Hi Neil,
> >>
> >> After a hiatus of 1.5 years (busy with all sorts) I am back and tried the
> >> ddf code to see how things improved.
> > Thanks!
> >
> >> I built a CentOS 6.3 VM with 6 extra 1GB disks for testing.
> >> I found several issues in the standard installed 3.2.3 version of mdadm
> >> relating to ddf, but installed the
> >> 3.2.5 version in order to work with recent code.
> >>
> >> However, while version 3.2.3 is able to create a ddf container with
> >> raidsets in it, I found a problem with the 3.2.5 version.
> >>
> >> After initially creating the container:
> >>
> >> mdadm -C /dev/md127 -e ddf -l container /dev/sd[b-g]
> >>
> >> which worked, I created a raid (1 or 5 it doesn't matter in this case)
> >> in it:
> >>
> >> mdadm -C /dev/md0 -l raid5 -n 3 /dev/md127
> >>
> >> However, it stays on resync=PENDING and readonly, and doesn't get built.
> >>
> >> So I tried to set it to readwrite:
> >>
> >> mdadm --readwrite /dev/md0
> >>
> >> Unfortunately, it stays on readonly and doesn't get built.
> >>
> >> As said before, this did work in 3.2.3.
> >>
> >> Are you already aware of this problem?
> > It sounds like a problem with 'mdmon'. mdmon needs to be running before the
> > array can become read-write. mdadm should start mdmon automatically but
> > maybe it isn't. Maybe it cannot find mdmon?
> >
> > Could you check if mdmon is running? If it isn't, run
> > mdmon /dev/md127 &
> > and see if it starts working.
> Hi Neil,
>
> thanks for your reply. Yes, mdmon wasn't running. I couldn't get it
> running with a recompiled 3.2.5; the standard one which came with CentOS
> (3.2.3) works fine, so I assume they made some changes to the code? Anyway,
> I moved to my own laptop, running Fedora 16, pulled mdadm from git and
> recompiled. That works. I also used loop devices as disks.
>
> Here is the first of my findings:
>
> I created a container with six disks: disks 1-2 form a raid 1 device, disks
> 3-6 form a raid 6 device.
>
> Here is the table shown at the end of the mdadm -E command for the
> container:
>
> Physical Disks : 6
> Number RefNo Size Device Type/State
> 0 06a5f547 479232K /dev/loop2 active/Online
> 1 47564acc 479232K /dev/loop3 active/Online
> 2 bf30692c 479232K /dev/loop5 active/Online
> 3 275d02f5 479232K /dev/loop4 active/Online
> 4 b0916b3f 479232K /dev/loop6 active/Online
> 5 65956a72 479232K /dev/loop1 active/Online
>
> I now fail a disk (disk 0) and I get:
>
> Physical Disks : 6
> Number RefNo Size Device Type/State
> 0 06a5f547 479232K /dev/loop2 active/Online
> 1 47564acc 479232K /dev/loop3 active/Online
> 2 bf30692c 479232K /dev/loop5 active/Online
> 3 275d02f5 479232K /dev/loop4 active/Online
> 4 b0916b3f 479232K /dev/loop6 active/Online
> 5 65956a72 479232K /dev/loop1 active/Offline, Failed
>
> Then I removed the disk from the container:
>
> Physical Disks : 6
> Number RefNo Size Device Type/State
> 0 06a5f547 479232K /dev/loop2 active/Online
> 1 47564acc 479232K /dev/loop3 active/Online
> 2 bf30692c 479232K /dev/loop5 active/Online
> 3 275d02f5 479232K /dev/loop4 active/Online
> 4 b0916b3f 479232K /dev/loop6 active/Online
> 5 65956a72 479232K active/Offline, Failed, Missing
>
> Notice the active/Offline status. Is this correct?
To be honest, I don't know. The DDF spec doesn't really go into that sort of
detail, or at least I didn't find it.
Given that the device is Missing, it hardly seems to matter whether it is
Active or Spare or Foreign or Legacy.
I guess if it re-appears we want to know what it was ... maybe.
>
> I added the disk back into the container, NO zero-superblock:
>
> Physical Disks : 6
> Number RefNo Size Device Type/State
> 0 06a5f547 479232K /dev/loop2 active/Online
> 1 47564acc 479232K /dev/loop3 active/Online
> 2 bf30692c 479232K /dev/loop5 active/Online
> 3 275d02f5 479232K /dev/loop4 active/Online
> 4 b0916b3f 479232K /dev/loop6 active/Online
> 5 65956a72 479232K /dev/loop1 active/Offline, Failed, Missing
>
> It stays active/Offline (this is now correct, I assume), Failed (again
> correct if it had failed before), but also still Missing.
I found why this happens. When I added code to support incremental assembly
of DDF arrays, I broke the ability to hot-add a device which happened to have
reasonably good looking metadata on it. The best approach for now is to
--zero the device first. I'll push out a patch which does just that.
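
As a rough sketch of that workaround done by hand (this is not the patch itself): wipe the stale metadata with --zero-superblock, then hot-add the device to the container. The helper name readd_clean and the DRY_RUN switch are made up for illustration; /dev/md127 and /dev/loop1 are simply the container and disk from the report.

```shell
# Sketch only: hot-add a disk that still carries old DDF metadata.
# readd_clean and DRY_RUN are hypothetical names, not part of mdadm.
readd_clean() {
    container=$1
    dev=$2
    # With DRY_RUN set, print the commands instead of running them.
    run=${DRY_RUN:+echo}
    # Erase the old superblock so the disk is treated as a new one ...
    $run mdadm --zero-superblock "$dev" || return 1
    # ... then add it to the container.
    $run mdadm "$container" --add "$dev"
}

# Example (prints the two mdadm commands without touching any device):
#   DRY_RUN=1; readd_clean /dev/md127 /dev/loop1
```

Without DRY_RUN the function needs root and a real container, so it is safest to try the dry run first and check the two commands it prints.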
>
> I remove the disk again, do a zero-superblock and add it again:
>
> Physical Disks : 6
> Number RefNo Size Device Type/State
> 0 06a5f547 479232K /dev/loop2 active/Online
> 1 47564acc 479232K /dev/loop3 active/Online
> 2 bf30692c 479232K /dev/loop5 active/Online
> 3 275d02f5 479232K /dev/loop4 active/Online
> 4 b0916b3f 479232K /dev/loop6 active/Online
> 5 ede51ba3 479232K /dev/loop1 active/Online, Rebuilding
>
> This is correct, the disk is seen as a new disk and rebuilding starts.
>
>
> Regards,
>
> Albert
diff --git a/Manage.c b/Manage.c
index f83af65..7f27f74 100644
--- a/Manage.c
+++ b/Manage.c
@@ -786,6 +786,7 @@ int Manage_add(int fd, int tfd, struct mddev_dev *dv,
 			return -1;
 		}
+		Kill(dv->devname, NULL, 0, -1, 0);
 		dfd = dev_open(dv->devname, O_RDWR | O_EXCL|O_DIRECT);
 		if (mdmon_running(tst->container_dev))
 			tst->update_tail = &tst->updates;
Thanks,
NeilBrown