* [PATCH] imsm: fix: correct checking newly missing disks
From: Lukasz Dorau @ 2011-11-14 14:52 UTC
To: neilb; +Cc: linux-raid, dan.j.williams, marcin.labun, ed.ciechanowski

The problem occurs when a RAID10 array under rebuild
(after one disk fails) is assembled incrementally.
Mdadm tries to start the array just after adding the third disk
and the volume is assembled incorrectly (in degraded state).

The cause is that container_enough depends on
newly missing disks, which are currently checked incorrectly.
They should always be checked using the first map.

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
---
 super-intel.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/super-intel.c b/super-intel.c
index 4ebee78..511a32a 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -2529,13 +2529,13 @@ static void getinfo_super_imsm(struct supertype *st, struct mdinfo *info, char *

 		failed = imsm_count_failed(super, dev);
 		state = imsm_check_degraded(super, dev, failed);
-		map = get_imsm_map(dev, dev->vol.migr_state);
+		map = get_imsm_map(dev, 0);

 		/* any newly missing disks?
 		 * (catches single-degraded vs double-degraded)
 		 */
 		for (j = 0; j < map->num_members; j++) {
-			__u32 ord = get_imsm_ord_tbl_ent(dev, i, -1);
+			__u32 ord = get_imsm_ord_tbl_ent(dev, i, 0);
 			__u32 idx = ord_to_idx(ord);

 			if (!(ord & IMSM_ORD_REBUILD) &&

* Re: [PATCH] imsm: fix: correct checking newly missing disks
From: NeilBrown @ 2011-11-15 4:43 UTC
To: Lukasz Dorau; +Cc: linux-raid, dan.j.williams, marcin.labun, ed.ciechanowski

On Mon, 14 Nov 2011 15:52:52 +0100 Lukasz Dorau <lukasz.dorau@intel.com> wrote:

> The problem occurs when a RAID10 array under rebuild
> (after one disk fails) is assembled incrementally.
> Mdadm tries to start the array just after adding the third disk
> and the volume is assembled incorrectly (in degraded state).
>
> The cause is that container_enough depends on
> newly missing disks, which are currently checked incorrectly.
> They should always be checked using the first map.
>
> Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
> ---
>  super-intel.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/super-intel.c b/super-intel.c
> index 4ebee78..511a32a 100644
> --- a/super-intel.c
> +++ b/super-intel.c
> @@ -2529,13 +2529,13 @@ static void getinfo_super_imsm(struct supertype *st, struct mdinfo *info, char *
>
>  		failed = imsm_count_failed(super, dev);
>  		state = imsm_check_degraded(super, dev, failed);
> -		map = get_imsm_map(dev, dev->vol.migr_state);
> +		map = get_imsm_map(dev, 0);
>
>  		/* any newly missing disks?
>  		 * (catches single-degraded vs double-degraded)
>  		 */
>  		for (j = 0; j < map->num_members; j++) {
> -			__u32 ord = get_imsm_ord_tbl_ent(dev, i, -1);
> +			__u32 ord = get_imsm_ord_tbl_ent(dev, i, 0);
>  			__u32 idx = ord_to_idx(ord);
>
>  			if (!(ord & IMSM_ORD_REBUILD) &&

Applied, thanks,
NeilBrown

* Re: [PATCH] imsm: fix: correct checking newly missing disks
From: Dan Williams @ 2011-11-30 2:26 UTC
To: Lukasz Dorau; +Cc: neilb, linux-raid, marcin.labun, ed.ciechanowski

On Mon, Nov 14, 2011 at 6:52 AM, Lukasz Dorau <lukasz.dorau@intel.com> wrote:
> The problem occurs when a RAID10 array under rebuild
> (after one disk fails) is assembled incrementally.
> Mdadm tries to start the array just after adding the third disk
> and the volume is assembled incorrectly (in degraded state).
>
> The cause is that container_enough depends on
> newly missing disks, which are currently checked incorrectly.
> They should always be checked using the first map.
>
> Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
> ---
>  super-intel.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/super-intel.c b/super-intel.c
> index 4ebee78..511a32a 100644
> --- a/super-intel.c
> +++ b/super-intel.c
> @@ -2529,13 +2529,13 @@ static void getinfo_super_imsm(struct supertype *st, struct mdinfo *info, char *
>
>  		failed = imsm_count_failed(super, dev);
>  		state = imsm_check_degraded(super, dev, failed);
> -		map = get_imsm_map(dev, dev->vol.migr_state);
> +		map = get_imsm_map(dev, 0);
>
>  		/* any newly missing disks?
>  		 * (catches single-degraded vs double-degraded)
>  		 */
>  		for (j = 0; j < map->num_members; j++) {
> -			__u32 ord = get_imsm_ord_tbl_ent(dev, i, -1);
> +			__u32 ord = get_imsm_ord_tbl_ent(dev, i, 0);

This looks wrong.  I noticed this when looking over Przemyslaw's patch [1].

map[0] always contains the destination state of the migration so the
most reliable source for looking for out of sync disks is map[1].

--
Dan

[1]: http://marc.info/?l=linux-raid&m=132206766827484&w=2
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* RE: [PATCH] imsm: fix: correct checking newly missing disks
From: Dorau, Lukasz @ 2011-12-01 14:23 UTC
To: Williams, Dan J
Cc: neilb@suse.de, linux-raid@vger.kernel.org, Labun, Marcin, Ciechanowski, Ed, Kwolek, Adam

Regards,
Łukasz

> -----Original Message-----
> From: dan.j.williams@gmail.com [mailto:dan.j.williams@gmail.com] On Behalf
> Of Dan Williams
> Sent: Wednesday, November 30, 2011 3:27 AM
> To: Dorau, Lukasz
> Cc: neilb@suse.de; linux-raid@vger.kernel.org; Labun, Marcin; Ciechanowski, Ed
> Subject: Re: [PATCH] imsm: fix: correct checking newly missing disks
>
> On Mon, Nov 14, 2011 at 6:52 AM, Lukasz Dorau <lukasz.dorau@intel.com>
> wrote:
> > The problem occurs when a RAID10 array under rebuild
> > (after one disk fails) is assembled incrementally.
> > Mdadm tries to start the array just after adding the third disk
> > and the volume is assembled incorrectly (in degraded state).
> >
> > The cause is that container_enough depends on
> > newly missing disks, which are currently checked incorrectly.
> > They should always be checked using the first map.
> >
> > Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
> > ---
> >  super-intel.c |    4 ++--
> >  1 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/super-intel.c b/super-intel.c
> > index 4ebee78..511a32a 100644
> > --- a/super-intel.c
> > +++ b/super-intel.c
> > @@ -2529,13 +2529,13 @@ static void getinfo_super_imsm(struct supertype
> *st, struct mdinfo *info, char *
> >
> >  		failed = imsm_count_failed(super, dev);
> >  		state = imsm_check_degraded(super, dev, failed);
> > -		map = get_imsm_map(dev, dev->vol.migr_state);
> > +		map = get_imsm_map(dev, 0);
> >
> >  		/* any newly missing disks?
> >  		 * (catches single-degraded vs double-degraded)
> >  		 */
> >  		for (j = 0; j < map->num_members; j++) {
> > -			__u32 ord = get_imsm_ord_tbl_ent(dev, i, -1);
> > +			__u32 ord = get_imsm_ord_tbl_ent(dev, i, 0);
>
> This looks wrong.  I noticed this when looking over Przemyslaw's patch [1].
>
> map[0] always contains the destination state of the migration so the
> most reliable source for looking for out of sync disks is map[1].
>

I am convinced that the patch is good.
We are looking for information about what the state of the array was during migration (before it was stopped), so we have to use map[0].
map[1] contains information about the state of the array before migration, which we do not need.

Regards,
Lukasz

* Re: [PATCH] imsm: fix: correct checking newly missing disks
From: NeilBrown @ 2011-12-06 1:10 UTC
To: Dorau, Lukasz
Cc: Williams, Dan J, linux-raid@vger.kernel.org, Labun, Marcin, Ciechanowski, Ed, Kwolek, Adam

On Thu, 1 Dec 2011 14:23:16 +0000 "Dorau, Lukasz" <lukasz.dorau@intel.com>
wrote:
> > > diff --git a/super-intel.c b/super-intel.c
> > > index 4ebee78..511a32a 100644
> > > --- a/super-intel.c
> > > +++ b/super-intel.c
> > > @@ -2529,13 +2529,13 @@ static void getinfo_super_imsm(struct supertype
> > *st, struct mdinfo *info, char *
> > >
> > >  		failed = imsm_count_failed(super, dev);
> > >  		state = imsm_check_degraded(super, dev, failed);
> > > -		map = get_imsm_map(dev, dev->vol.migr_state);
> > > +		map = get_imsm_map(dev, 0);
> > >
> > >  		/* any newly missing disks?
> > >  		 * (catches single-degraded vs double-degraded)
> > >  		 */
> > >  		for (j = 0; j < map->num_members; j++) {
> > > -			__u32 ord = get_imsm_ord_tbl_ent(dev, i, -1);
> > > +			__u32 ord = get_imsm_ord_tbl_ent(dev, i, 0);
> >
> > This looks wrong.  I noticed this when looking over Przemyslaw's patch [1].
> >
> > map[0] always contains the destination state of the migration so the
> > most reliable source for looking for out of sync disks is map[1].
> >
>
> I am convinced that the patch is good.
> We are looking for information about what the state of the array was during migration (before it was stopped), so we have to use map[0].
> map[1] contains information about the state of the array before migration, which we do not need.
>
> Regards,
> Lukasz

Hi,
 do we have agreement on this?  Dan - do you stand by your original concern
 or have you seen the light :-)

The patch is in, but I'd like to be sure it is right and to be honest I
haven't followed the dance of the maps too closely...

Thanks,
NeilBrown

* Re: [PATCH] imsm: fix: correct checking newly missing disks
From: Williams, Dan J @ 2011-12-06 2:19 UTC
To: NeilBrown
Cc: Dorau, Lukasz, linux-raid@vger.kernel.org, Labun, Marcin, Ciechanowski, Ed, Kwolek, Adam

On Mon, Dec 5, 2011 at 5:10 PM, NeilBrown <neilb@suse.de> wrote:
> On Thu, 1 Dec 2011 14:23:16 +0000 "Dorau, Lukasz" <lukasz.dorau@intel.com>
> wrote:
>> > > diff --git a/super-intel.c b/super-intel.c
>> > > index 4ebee78..511a32a 100644
>> > > --- a/super-intel.c
>> > > +++ b/super-intel.c
>> > > @@ -2529,13 +2529,13 @@ static void getinfo_super_imsm(struct supertype
>> > *st, struct mdinfo *info, char *
>> > >
>> > >  		failed = imsm_count_failed(super, dev);
>> > >  		state = imsm_check_degraded(super, dev, failed);
>> > > -		map = get_imsm_map(dev, dev->vol.migr_state);
>> > > +		map = get_imsm_map(dev, 0);
>> > >
>> > >  		/* any newly missing disks?
>> > >  		 * (catches single-degraded vs double-degraded)
>> > >  		 */
>> > >  		for (j = 0; j < map->num_members; j++) {
>> > > -			__u32 ord = get_imsm_ord_tbl_ent(dev, i, -1);
>> > > +			__u32 ord = get_imsm_ord_tbl_ent(dev, i, 0);
>> >
>> > This looks wrong.  I noticed this when looking over Przemyslaw's patch [1].
>> >
>> > map[0] always contains the destination state of the migration so the
>> > most reliable source for looking for out of sync disks is map[1].
>> >
>>
>> I am convinced that the patch is good.
>> We are looking for information about what the state of the array was during migration (before it was stopped), so we have to use map[0].
>> map[1] contains information about the state of the array before migration, which we do not need.
>>
>> Regards,
>> Lukasz
>
> Hi,
>  do we have agreement on this?  Dan - do you stand by your original concern
>  or have you seen the light :-)
>
> The patch is in, but I'd like to be sure it is right and to be honest I
> haven't followed the dance of the maps too closely...

Lukasz is right.  We want to find out if starting the array with the
current list of disks in the container would regress the state of the
array recorded in the metadata.  map[0] should always record the best
possible state of all the slots in the array; if any of those have gone
missing we don't want incremental assembly to proceed.

Sorry for the noise, my point was 'correct' in isolation but it missed
that the context is looking for the most optimistic view of the disk.

--
Dan

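The check discussed in the message above can be sketched as a small, self-contained C model. This is only an illustration of the loop touched by the hunk, not the mdadm code: `struct toy_map`, `toy_ord_to_idx()`, `toy_count_newly_missing()`, `demo_newly_missing()` and the `TOY_ORD_REBUILD` bit position are all hypothetical stand-ins, and the set of disks present in the container is modelled as a plain bitmask. It assumes, as the thread states, that map[0] holds the best recorded state of every slot, so a slot that map[0] does not flag for rebuild but whose disk is absent counts as newly missing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; NOT the real mdadm definitions. */
#define TOY_ORD_REBUILD (1u << 24) /* assumed flag: slot is still being rebuilt */
#define TOY_MAX_SLOTS   4

struct toy_map {
	int num_members;
	uint32_t ord_tbl[TOY_MAX_SLOTS]; /* low 24 bits: disk index; bit 24: rebuild flag */
};

/* Strip the flag bits and keep only the disk index. */
static unsigned toy_ord_to_idx(uint32_t ord)
{
	return ord & 0xffffffu;
}

/*
 * Walk map[0] and count slots that it records as in sync (no rebuild flag)
 * but whose disk is not in present_mask (bit n set = disk n is present).
 */
static int toy_count_newly_missing(const struct toy_map *map0, unsigned present_mask)
{
	int missing = 0;

	assert(map0->num_members <= TOY_MAX_SLOTS);
	for (int j = 0; j < map0->num_members; j++) {
		uint32_t ord = map0->ord_tbl[j];
		unsigned idx = toy_ord_to_idx(ord);

		if (!(ord & TOY_ORD_REBUILD) && !(present_mask & (1u << idx)))
			missing++;
	}
	return missing;
}

/* Sample volume: four slots, slot 3 under rebuild, slots 0-2 in sync. */
int demo_newly_missing(unsigned present_mask)
{
	static const struct toy_map map0 = {
		.num_members = 4,
		.ord_tbl = { 0, 1, 2, 3 | TOY_ORD_REBUILD },
	};

	return toy_count_newly_missing(&map0, present_mask);
}
```

In this model `demo_newly_missing(0x7)` returns 0, because the only absent disk is the one already flagged for rebuild, while `demo_newly_missing(0x3)` returns 1, because disk 2 was recorded as in sync and is now absent. How the real `container_enough` value is derived from this result is outside the sketch.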
* Re: [PATCH] imsm: fix: correct checking newly missing disks
From: NeilBrown @ 2011-12-06 2:22 UTC
To: Williams, Dan J
Cc: Dorau, Lukasz, linux-raid@vger.kernel.org, Labun, Marcin, Ciechanowski, Ed, Kwolek, Adam

On Mon, 5 Dec 2011 18:19:09 -0800 "Williams, Dan J" <dan.j.williams@intel.com>
wrote:

> On Mon, Dec 5, 2011 at 5:10 PM, NeilBrown <neilb@suse.de> wrote:
> > On Thu, 1 Dec 2011 14:23:16 +0000 "Dorau, Lukasz" <lukasz.dorau@intel.com>
> > wrote:
> >> > > diff --git a/super-intel.c b/super-intel.c
> >> > > index 4ebee78..511a32a 100644
> >> > > --- a/super-intel.c
> >> > > +++ b/super-intel.c
> >> > > @@ -2529,13 +2529,13 @@ static void getinfo_super_imsm(struct supertype
> >> > *st, struct mdinfo *info, char *
> >> > >
> >> > >  		failed = imsm_count_failed(super, dev);
> >> > >  		state = imsm_check_degraded(super, dev, failed);
> >> > > -		map = get_imsm_map(dev, dev->vol.migr_state);
> >> > > +		map = get_imsm_map(dev, 0);
> >> > >
> >> > >  		/* any newly missing disks?
> >> > >  		 * (catches single-degraded vs double-degraded)
> >> > >  		 */
> >> > >  		for (j = 0; j < map->num_members; j++) {
> >> > > -			__u32 ord = get_imsm_ord_tbl_ent(dev, i, -1);
> >> > > +			__u32 ord = get_imsm_ord_tbl_ent(dev, i, 0);
> >> >
> >> > This looks wrong.  I noticed this when looking over Przemyslaw's patch [1].
> >> >
> >> > map[0] always contains the destination state of the migration so the
> >> > most reliable source for looking for out of sync disks is map[1].
> >> >
> >>
> >> I am convinced that the patch is good.
> >> We are looking for information about what the state of the array was during migration (before it was stopped), so we have to use map[0].
> >> map[1] contains information about the state of the array before migration, which we do not need.
> >>
> >> Regards,
> >> Lukasz
> >
> > Hi,
> >  do we have agreement on this?  Dan - do you stand by your original concern
> >  or have you seen the light :-)
> >
> > The patch is in, but I'd like to be sure it is right and to be honest I
> > haven't followed the dance of the maps too closely...
>
> Lukasz is right.  We want to find out if starting the array with the
> current list of disks in the container would regress the state of the
> array recorded in the metadata.  map[0] should always record the best
> possible state of all the slots in the array; if any of those have gone
> missing we don't want incremental assembly to proceed.
>
> Sorry for the noise, my point was 'correct' in isolation but it missed
> that the context is looking for the most optimistic view of the disk.

Great - thanks for clearing that up.

NeilBrown

* RE: [PATCH] imsm: fix: correct checking newly missing disks
From: Dorau, Lukasz @ 2011-12-01 14:40 UTC
To: Williams, Dan J
Cc: neilb@suse.de, linux-raid@vger.kernel.org, Labun, Marcin, Ciechanowski, Ed, Kwolek, Adam

> -----Original Message-----
> From: Dorau, Lukasz
> Sent: Thursday, December 01, 2011 3:23 PM
> To: 'Dan Williams'
> Cc: neilb@suse.de; linux-raid@vger.kernel.org; Labun, Marcin; Ciechanowski,
> Ed; Kwolek, Adam
> Subject: RE: [PATCH] imsm: fix: correct checking newly missing disks
>
> Regards,
> Łukasz
>

I apologize for the words at the beginning of the last message. They are unnecessary. It was an oversight.
The right reply is of course at the end of the message.

Lukasz

>
> > -----Original Message-----
> > From: dan.j.williams@gmail.com [mailto:dan.j.williams@gmail.com] On
> Behalf
> > Of Dan Williams
> > Sent: Wednesday, November 30, 2011 3:27 AM
> > To: Dorau, Lukasz
> > Cc: neilb@suse.de; linux-raid@vger.kernel.org; Labun, Marcin; Ciechanowski,
> Ed
> > Subject: Re: [PATCH] imsm: fix: correct checking newly missing disks
> >
> > On Mon, Nov 14, 2011 at 6:52 AM, Lukasz Dorau <lukasz.dorau@intel.com>
> > wrote:
> > > The problem occurs when a RAID10 array under rebuild
> > > (after one disk fails) is assembled incrementally.
> > > Mdadm tries to start the array just after adding the third disk
> > > and the volume is assembled incorrectly (in degraded state).
> > >
> > > The cause is that container_enough depends on
> > > newly missing disks, which are currently checked incorrectly.
> > > They should always be checked using the first map.
> > >
> > > Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
> > > ---
> > >  super-intel.c |    4 ++--
> > >  1 files changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/super-intel.c b/super-intel.c
> > > index 4ebee78..511a32a 100644
> > > --- a/super-intel.c
> > > +++ b/super-intel.c
> > > @@ -2529,13 +2529,13 @@ static void getinfo_super_imsm(struct
> supertype
> > *st, struct mdinfo *info, char *
> > >
> > >  		failed = imsm_count_failed(super, dev);
> > >  		state = imsm_check_degraded(super, dev, failed);
> > > -		map = get_imsm_map(dev, dev->vol.migr_state);
> > > +		map = get_imsm_map(dev, 0);
> > >
> > >  		/* any newly missing disks?
> > >  		 * (catches single-degraded vs double-degraded)
> > >  		 */
> > >  		for (j = 0; j < map->num_members; j++) {
> > > -			__u32 ord = get_imsm_ord_tbl_ent(dev, i, -1);
> > > +			__u32 ord = get_imsm_ord_tbl_ent(dev, i, 0);
> >
> > This looks wrong.  I noticed this when looking over Przemyslaw's patch [1].
> >
> > map[0] always contains the destination state of the migration so the
> > most reliable source for looking for out of sync disks is map[1].
> >
>
> I am convinced that the patch is good.
> We are looking for information about what the state of the array was during migration
> (before it was stopped), so we have to use map[0].
> map[1] contains information about the state of the array before migration, which
> we do not need.
>
> Regards,
> Lukasz
