* [PATCH 1/4] mdmon: always get layout from sysfs @ 2013-08-01 22:35 mwilck 2013-08-01 22:35 ` [PATCH 2/4] DDF: no need for GET_LAYOUT any more mwilck ` (2 more replies) 0 siblings, 3 replies; 6+ messages in thread From: mwilck @ 2013-08-01 22:35 UTC (permalink / raw) To: neilb, linux-raid; +Cc: mwilck commit 71d68ff62 uses the array layout. It needs to be initialized. Signed-off-by: Martin Wilck <mwilck@arcor.de> --- managemon.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/managemon.c b/managemon.c index 6cd93e5..21bf2bd 100644 --- a/managemon.c +++ b/managemon.c @@ -652,7 +652,7 @@ static void manage_new(struct mdstat_ent *mdstat, mdi = sysfs_read(-1, mdstat->devnm, GET_LEVEL|GET_CHUNK|GET_DISKS|GET_COMPONENT| GET_DEGRADED|GET_SAFEMODE| - GET_DEVS|GET_OFFSET|GET_SIZE|GET_STATE); + GET_DEVS|GET_OFFSET|GET_SIZE|GET_STATE|GET_LAYOUT); if (!mdi) return; -- 1.7.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/4] DDF: no need for GET_LAYOUT any more 2013-08-01 22:35 [PATCH 1/4] mdmon: always get layout from sysfs mwilck @ 2013-08-01 22:35 ` mwilck 2013-08-01 22:35 ` [PATCH 3/4] DDF: ddf_process_update: delete removed disks from dlist mwilck 2013-08-01 22:35 ` [PATCH 4/4] tests/10ddf-fail-twice: New unit test mwilck 2 siblings, 0 replies; 6+ messages in thread From: mwilck @ 2013-08-01 22:35 UTC (permalink / raw) To: neilb, linux-raid; +Cc: mwilck With the previous patch, mdmon will provide the layout property for us. Signed-off-by: Martin Wilck <mwilck@arcor.de> --- super-ddf.c | 8 +------- 1 files changed, 1 insertions(+), 7 deletions(-) diff --git a/super-ddf.c b/super-ddf.c index 65472a2..7a7f5fe 100644 --- a/super-ddf.c +++ b/super-ddf.c @@ -4697,16 +4697,10 @@ static int raid10_degraded(struct mdinfo *info) { int n_prim, n_bvds; int i; - struct mdinfo *d, *sra; + struct mdinfo *d; char *found; int ret = -1; - if (info->array.layout == 0) { - sra = sysfs_read(-1, info->sys_name, GET_LAYOUT); - info->array.layout = sra->array.layout; - free(sra); - } - n_prim = info->array.layout & ~0x100; n_bvds = info->array.raid_disks / n_prim; found = xmalloc(n_bvds); -- 1.7.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 3/4] DDF: ddf_process_update: delete removed disks from dlist 2013-08-01 22:35 [PATCH 1/4] mdmon: always get layout from sysfs mwilck 2013-08-01 22:35 ` [PATCH 2/4] DDF: no need for GET_LAYOUT any more mwilck @ 2013-08-01 22:35 ` mwilck 2013-08-02 19:32 ` Martin Wilck 2013-08-01 22:35 ` [PATCH 4/4] tests/10ddf-fail-twice: New unit test mwilck 2 siblings, 1 reply; 6+ messages in thread From: mwilck @ 2013-08-01 22:35 UTC (permalink / raw) To: neilb, linux-raid; +Cc: mwilck We currently remove Failed disks that aren't used by any VD. If we do that, we need to remove the disks from the dlist as well. Otherwise, the same pdnum may occur multiple times in the dlist. This fixes the problem reported by Albert Pauw. Signed-off-by: Martin Wilck <mwilck@arcor.de> --- super-ddf.c | 25 ++++++++++++++++++++++--- 1 files changed, 22 insertions(+), 3 deletions(-) diff --git a/super-ddf.c b/super-ddf.c index 7a7f5fe..e1d0509 100644 --- a/super-ddf.c +++ b/super-ddf.c @@ -4366,6 +4366,19 @@ static void copy_matching_bvd(struct ddf_super *ddf, conf->sec_elmnt_seq, guid_str(conf->guid)); } +static void _delete_dl_by_refnum(struct ddf_super *ddf, be32 refnum) +{ + struct dl **pdl = &ddf->dlist, *d; + while (*pdl) { + if (be32_eq((*pdl)->disk.refnum, refnum)) { + d = *pdl; + *pdl = d->next; + free(d); + } else + pdl = &(*pdl)->next; + } +} + static void ddf_process_update(struct supertype *st, struct metadata_update *update) { @@ -4638,9 +4651,15 @@ static void ddf_process_update(struct supertype *st, if (be16_and(ddf->phys->entries[pdnum].state, cpu_to_be16(DDF_Failed)) && be16_and(ddf->phys->entries[pdnum].state, - cpu_to_be16(DDF_Transition))) - /* skip this one */; - else if (pdnum == pd2) + cpu_to_be16(DDF_Transition))) { + /* skip this one and remove from dlist */ + dprintf("%s: %08x no longer used, removing it\n", + __func__, + be32_to_cpu(ddf->phys-> + entries[pdnum].refnum)); + _delete_dl_by_refnum( + ddf, ddf->phys->entries[pdnum].refnum); + } else if (pdnum == pd2) pd2++; else { ddf->phys->entries[pd2] = -- 1.7.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 3/4] DDF: ddf_process_update: delete removed disks from dlist 2013-08-01 22:35 ` [PATCH 3/4] DDF: ddf_process_update: delete removed disks from dlist mwilck @ 2013-08-02 19:32 ` Martin Wilck 2013-08-05 5:11 ` NeilBrown 0 siblings, 1 reply; 6+ messages in thread From: Martin Wilck @ 2013-08-02 19:32 UTC (permalink / raw) To: neilb; +Cc: linux-raid Neil, This calls free() from monitor context - I am not certain if that's allowed and if no, what alternative there might be. Martin On 08/02/2013 12:35 AM, mwilck@arcor.de wrote: > We currently remove Failed disks that aren't used by any VD. > If we do that, we need to remove the disks from the dlist as well. > Otherwise, the same pdnum may occur multiple times in the dlist. > > This fixes the problem reported by Albert Pauw. > > Signed-off-by: Martin Wilck <mwilck@arcor.de> > --- > super-ddf.c | 25 ++++++++++++++++++++++--- > 1 files changed, 22 insertions(+), 3 deletions(-) > > diff --git a/super-ddf.c b/super-ddf.c > index 7a7f5fe..e1d0509 100644 > --- a/super-ddf.c > +++ b/super-ddf.c > @@ -4366,6 +4366,19 @@ static void copy_matching_bvd(struct ddf_super *ddf, > conf->sec_elmnt_seq, guid_str(conf->guid)); > } > > +static void _delete_dl_by_refnum(struct ddf_super *ddf, be32 refnum) > +{ > + struct dl **pdl = &ddf->dlist, *d; > + while (*pdl) { > + if (be32_eq((*pdl)->disk.refnum, refnum)) { > + d = *pdl; > + *pdl = d->next; > + free(d); > + } else > + pdl = &(*pdl)->next; > + } > +} > + > static void ddf_process_update(struct supertype *st, > struct metadata_update *update) > { > @@ -4638,9 +4651,15 @@ static void ddf_process_update(struct supertype *st, > if (be16_and(ddf->phys->entries[pdnum].state, > cpu_to_be16(DDF_Failed)) > && be16_and(ddf->phys->entries[pdnum].state, > - cpu_to_be16(DDF_Transition))) > - /* skip this one */; > - else if (pdnum == pd2) > + cpu_to_be16(DDF_Transition))) { > + /* skip this one and remove from dlist */ > + dprintf("%s: %08x no longer used, removing it\n", > + __func__, > + be32_to_cpu(ddf->phys-> > + entries[pdnum].refnum)); > + _delete_dl_by_refnum( > + ddf, ddf->phys->entries[pdnum].refnum); > + } else if (pdnum == pd2) > pd2++; > else { > ddf->phys->entries[pd2] = ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 3/4] DDF: ddf_process_update: delete removed disks from dlist 2013-08-02 19:32 ` Martin Wilck @ 2013-08-05 5:11 ` NeilBrown 0 siblings, 0 replies; 6+ messages in thread From: NeilBrown @ 2013-08-05 5:11 UTC (permalink / raw) To: Martin Wilck; +Cc: linux-raid [-- Attachment #1: Type: text/plain, Size: 2992 bytes --] On Fri, 02 Aug 2013 21:32:59 +0200 Martin Wilck <mwilck@arcor.de> wrote: > Neil, > > This calls free() from monitor context - I am not certain if that's > allowed and if no, what alternative there might be. Yes, I don't really like that. And there is an 'fd' attached to the 'dl' which would need to be closed too. This relates to the "ddf: remove failed devices that are no longer in use ?!?" discussion we had earlier. That patch isn't really right ... as you noticed by trying to fix it. I think this fix is safer and more in the spirit of the original. It appears to fix the problem for me, and feels "right" as well:-) It only removed pd entries for devices that really have disappeared. If a device still exists in ->dlist, we don't remove the pde. I applied the other patches in the series - thank especially for the added test! Thanks, NeilBrown From 92939eb29175a0dc7c9c46ff70f95b76b693b796 Mon Sep 17 00:00:00 2001 From: NeilBrown <neilb@suse.de> Date: Mon, 5 Aug 2013 14:56:23 +1000 Subject: [PATCH] DDF: fix removal of failed devices. Commit c7079c84 arrange for DDF to forget about any device that is failed and not still marked as part of any array. However such devices could still be part of the container and this removal and updating of 'pdnum' can result in multiple devices having the same pdnum. This in turn easily leads to confusion and corruption. So only discard pd entries for devices which are failed, not listed in any virtual device, and for which we don't have a handle on the device. pd entries will not get removed until a new device is added after the device has been removed from the container, either by "mdadm --remove" or by assembling without the failed devices. Reported-by: Albert Pauw <albert.pauw@gmail.com> Analysed-by: Martin Wilck <mwilck@arcor.de> Signed-off-by: NeilBrown <neilb@suse.de> diff --git a/super-ddf.c b/super-ddf.c index 20f4f25..b352a52 100644 --- a/super-ddf.c +++ b/super-ddf.c @@ -4635,13 +4635,19 @@ static void ddf_process_update(struct supertype *st, */ pd2 = 0; for (pdnum = 0; pdnum < be16_to_cpu(ddf->phys->used_pdes); - pdnum++) + pdnum++) { if (be16_and(ddf->phys->entries[pdnum].state, cpu_to_be16(DDF_Failed)) && be16_and(ddf->phys->entries[pdnum].state, - cpu_to_be16(DDF_Transition))) - /* skip this one */; - else if (pdnum == pd2) + cpu_to_be16(DDF_Transition))) { + /* skip this one unless in dlist*/ + for (dl = ddf->dlist; dl; dl = dl->next) + if (dl->pdnum == (int)pdnum) + break; + if (!dl) + continue; + } + if (pdnum == pd2) pd2++; else { ddf->phys->entries[pd2] = @@ -4651,6 +4657,7 @@ static void ddf_process_update(struct supertype *st, dl->pdnum = pd2; pd2++; } + } ddf->phys->used_pdes = cpu_to_be16(pd2); while (pd2 < pdnum) { memset(ddf->phys->entries[pd2].guid, 0xff, [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 828 bytes --] ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 4/4] tests/10ddf-fail-twice: New unit test 2013-08-01 22:35 [PATCH 1/4] mdmon: always get layout from sysfs mwilck 2013-08-01 22:35 ` [PATCH 2/4] DDF: no need for GET_LAYOUT any more mwilck 2013-08-01 22:35 ` [PATCH 3/4] DDF: ddf_process_update: delete removed disks from dlist mwilck @ 2013-08-01 22:35 ` mwilck 2 siblings, 0 replies; 6+ messages in thread From: mwilck @ 2013-08-01 22:35 UTC (permalink / raw) To: neilb, linux-raid; +Cc: mwilck This is the test by Albert Pauw. Fail 2 disks, and add one. Signed-off-by: Martin Wilck <mwilck@arcor.de> --- tests/10ddf-fail-twice | 56 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 56 insertions(+), 0 deletions(-) create mode 100644 tests/10ddf-fail-twice diff --git a/tests/10ddf-fail-twice b/tests/10ddf-fail-twice new file mode 100644 index 0000000..269b424 --- /dev/null +++ b/tests/10ddf-fail-twice @@ -0,0 +1,56 @@ +# sanity check array creation + +ddf_check_hold() { + if mdadm --remove $1 $2; then + echo "$2 removal from $1 should have been blocked" >&2 + cat /proc/mdstat >&2 + mdadm -E $2 + exit 1 + fi +} + +ddf_check_removal() { + if ! mdadm --remove $1 $2 ; then + echo "$2 removal from $1 should have succeeded" >&2 + cat /proc/mdstat >&2 + mdadm -E $2 + exit 1 + fi +} + +. tests/env-ddf-template + +num_disks=5 +mdadm -CR $container -e ddf -n $num_disks $dev8 $dev9 $dev10 $dev11 $dev12 +ddf_check container $num_disks + +mdadm -CR $member0 -n 2 -l 1 $container +mdadm -CR $member1 -n 3 -l 5 $container + +mdadm --wait $member1 $member0 || mdadm --wait $member1 $member0 + +mdadm $member0 --fail $dev11 +mdadm $member1 --fail $dev9 + +mdadm $container --add $dev13 + +mdadm --wait $member1 $member0 || mdadm --wait $member1 $member0 + +{ grep -q 'external:/md127/1.*\[3/3\]' /proc/mdstat && + grep -q 'external:/md127/0.*\[2/1\]' /proc/mdstat; } || { + echo unexpected states in /proc/mdstat + cat /proc/mdstat + mdadm -Ss + exit 1 +} + +{ mdadm -E $dev10 | grep -q 'state\[0\] : Degraded, Consistent' && + mdadm -E $dev10 | grep -q 'state\[1\] : Optimal, Consistent'; } || { + echo unexpected meta data state + mdadm -E $dev10 + mdadm -Ss + exit 1 +} + +mdadm -Ss +exit 0 -- 1.7.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
end of thread, other threads:[~2013-08-05 5:11 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-08-01 22:35 [PATCH 1/4] mdmon: always get layout from sysfs mwilck 2013-08-01 22:35 ` [PATCH 2/4] DDF: no need for GET_LAYOUT any more mwilck 2013-08-01 22:35 ` [PATCH 3/4] DDF: ddf_process_update: delete removed disks from dlist mwilck 2013-08-02 19:32 ` Martin Wilck 2013-08-05 5:11 ` NeilBrown 2013-08-01 22:35 ` [PATCH 4/4] tests/10ddf-fail-twice: New unit test mwilck
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).