From mboxrd@z Thu Jan 1 00:00:00 1970
From: Neil Brown
Subject: Re: [PATCH 14/53] FIX: Cannot exit monitor after takeover
Date: Mon, 29 Nov 2010 10:38:29 +1100
Message-ID: <20101129103829.0964debb@notabene.brown>
References: <20101126075407.5221.62582.stgit@gklab-170-024.igk.intel.com> <20101126080537.5221.28837.stgit@gklab-170-024.igk.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20101126080537.5221.28837.stgit@gklab-170-024.igk.intel.com>
Sender: linux-raid-owner@vger.kernel.org
To: Adam Kwolek
Cc: linux-raid@vger.kernel.org, dan.j.williams@intel.com, ed.ciechanowski@intel.com
List-Id: linux-raid.ids

On Fri, 26 Nov 2010 09:05:37 +0100 Adam Kwolek wrote:

> When performing backward takeover to raid0 monitor cannot exit
> for single raid0 array configuration.
> Monitor is locked by communication (ping_manager()) after unfreeze()

I think you are saying that when we convert a RAID5 to a RAID0, mdmon
notices that there is nothing more for it to do, so it exits.  Then mdadm
has problems contacting it.  Is that right?
It doesn't seem quite right, as 'ping_monitor' should simply fail if mdmon
has disappeared.
Could you say a bit more about what you observe happening?

>
> Do not ping manager for raid0 array as they shouldn't be monitored.

Only this isn't quite what the patch does.  What it does is:
  if the 'last' subarray found is raid0, then don't ping the monitor.

In general (though possibly not in imsm) there could be multiple arrays,
some RAID0, some not.  So we would need to track whether there are any with
level > 0, and call ping_monitor if any such were found.

I would be reasonably happy with such a patch, except that I cannot yet see
exactly why it is needed.  So could you explain exactly what you are seeing,
please?
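To make the "track any level > 0" idea concrete, here is a rough,
standalone sketch.  It is not mdadm code: 'struct subarray' and
'container_needs_ping()' are names made up for illustration, and the real
change would presumably keep a flag inside the existing loop in
unblock_monitor() rather than walk a separate array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-subarray info read from sysfs;
 * only the RAID level matters for this sketch. */
struct subarray {
	int level;	/* 0 for RAID0, 5 for RAID5, ... */
};

/* Return true if at least one subarray has level > 0, i.e. there is
 * still something for the monitor to do and it is worth pinging.
 * The decision depends on every subarray seen, not just the last one. */
static bool container_needs_ping(const struct subarray *subs, size_t n)
{
	bool needs_ping = false;
	size_t i;

	assert(subs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (subs[i].level > 0)
			needs_ping = true;	/* remember it across iterations */
	}
	return needs_ping;
}
```

With a container holding a RAID5 followed by a RAID0, this returns true,
whereas a test on only the last subarray would skip the ping.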
Thanks,
NeilBrown

>
> Signed-off-by: Adam Kwolek
> ---
>
>  msg.c |    5 +++--
>  1 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/msg.c b/msg.c
> index 8e7ebfd..95c6f0b 100644
> --- a/msg.c
> +++ b/msg.c
> @@ -385,11 +385,12 @@ void unblock_monitor(char *container, const int unfreeze)
>  		if (!is_container_member(e, container))
>  			continue;
>  		sysfs_free(sra);
> -		sra = sysfs_read(-1, e->devnum, GET_VERSION);
> +		sra = sysfs_read(-1, e->devnum, GET_VERSION|GET_LEVEL);
>  		if (unblock_subarray(sra, unfreeze))
>  			fprintf(stderr, Name ": Failed to unfreeze %s\n", e->dev);
>  	}
> -	ping_monitor(container);
> +	if (sra && sra->array.level > 0)
> +		ping_monitor(container);
>
>  	sysfs_free(sra);
>  	free_mdstat(ent);