From mboxrd@z Thu Jan 1 00:00:00 1970
From: Neil Brown
Subject: Re: [PATCH 2/2] IMSM: do not rebuild the array if a non-redundant sub-array with failed disks is present
Date: Wed, 8 Dec 2010 13:32:08 +1100
Message-ID: <20101208133208.0f62b900@notabene.brown>
References: <905EDD02F158D948B186911EB64DB3D17676E3C3@irsmsx503.ger.corp.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <905EDD02F158D948B186911EB64DB3D17676E3C3@irsmsx503.ger.corp.intel.com>
Sender: linux-raid-owner@vger.kernel.org
To: "Labun, Marcin"
Cc: "linux-raid@vger.kernel.org", "Neubauer, Wojciech", "Williams, Dan J", "Czarnowska, Anna", "Ciechanowski, Ed", "Hawrylewicz Czarnowski, Przemyslaw"
List-Id: linux-raid.ids

On Tue, 7 Dec 2010 16:10:48 +0000 "Labun, Marcin" wrote:

> >From ee52735e3576dc998a837229ac5b9fb3ed1faeaf Mon Sep 17 00:00:00 2001
> From: Marcin Labun
> Date: Wed, 30 Nov 2010 15:47:19 +0100
> Subject: [PATCH 2/2] IMSM: do not rebuild the array if a non-redundant sub-array with failed disks is present
>
> Now Intel metadata handler rebuilds all sub-arrays even if one of them
> is non-redundant. In case of failed sub-array, failed disks are just replaced
> with new ones in the metadata mapping. The data for failed disk is not restored
> even the disk is present in the system. This fix requests to remove the failed
> disk from container to let the process of rebuilding the array with failed
> member. If the disk is physically pulled out of the system, the disk is removed
> from container automatically by exiting udev rules.

This mostly makes sense, though...

>
> +        /*
> +         * If there are any failed disks check state of the other volume.
> +         * Block rebuild if the other one is failed until failed disks
> +         * are removed from container.
> +         */

This comment was a lot clearer to me than the description at the top :-)

However:

> +        if (failed) {
> +                dprintf("found failed disks in %s, check if there is another"
> +                        "sub-array\n",
> +                        dev->volume);
> +                /* check the state of the other volume allows for rebuild */
> +                allowed = imsm_rebuild_allowed(a, (inst == 0) ? 1 : 0, failed);
                                                     ^^^^^^^^^^^^^^^^^^

This seems to imply that there are only ever at most 2 volumes in a container.
Is that really true? The rest of the code seems to assume that there could be
several.
If there can be more than two, then you need a loop over all the 'other'
devices to check that they are allowed to rebuild.

Thanks,
NeilBrown
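
To illustrate the kind of loop meant above, here is a minimal, self-contained sketch. It is not mdadm code: `struct volume`, `volume_allows_rebuild()` and `rebuild_allowed()` are hypothetical names invented for this example, and the per-volume rule (a non-redundant volume that still has failed disks blocks the rebuild) is only an assumption drawn from the patch subject. The real check would presumably call imsm_rebuild_allowed() for each other sub-array instead.

```c
#include <stdbool.h>

/* Hypothetical stand-in for one sub-array (volume) in a container. */
struct volume {
	const char *name;
	bool redundant;    /* false for a non-redundant level such as RAID0 */
	int failed_disks;  /* failed members not yet removed from the container */
};

/*
 * Assumed per-volume rule: a non-redundant volume that still has
 * failed disks blocks rebuilds elsewhere in the container.
 */
bool volume_allows_rebuild(const struct volume *v)
{
	return v->redundant || v->failed_disks == 0;
}

/*
 * Check every volume except 'inst', rather than assuming the only
 * other volume is (inst == 0) ? 1 : 0.
 */
bool rebuild_allowed(const struct volume *vols, int count, int inst)
{
	for (int i = 0; i < count; i++) {
		if (i == inst)
			continue;	/* skip the volume being rebuilt */
		if (!volume_allows_rebuild(&vols[i]))
			return false;	/* one blocking volume is enough */
	}
	return true;
}
```

With two volumes this gives the same answer as checking the single other index, so a loop of this shape would not change behaviour for the two-volume case while also covering containers with more.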