* RAID 5 rebuild fails with power interruption.
@ 2009-11-16  3:56 senthilkumar.muthukalai
  2009-11-16  5:19 ` Goswin von Brederlow
  0 siblings, 1 reply; 12+ messages in thread
From: senthilkumar.muthukalai @ 2009-11-16  3:56 UTC
  To: linux-raid

Adding a subject line...

-----Original Message-----
From: SenthilKumar Muthukalai (WT01 - Telecom Equipment)
Sent: Monday, November 16, 2009 9:14 AM
To: linux-raid@vger.kernel.org
Subject:

Hi All,

Could you please help me out with the problem below?

1. Created a RAID5 array with 3 disks.
2. The initial rebuild completed.
3. Pulled a disk out of the array.
4. The array became degraded.
5. Added the disk back to the array with the 'assemble' command.
6. The disk was added successfully and the array started rebuilding
   again.
7. While it was rebuilding, reset the power to the NAS box.
8. When the NAS box booted up, the RAID was degraded, with the added
   disk thrown out.
9. The boot messages say 'kicking out of the non-fresh disk from the
   array'.

We tried the '--force' option with the 'assemble' command, but without
success.

Thanks,
Senthil M
* Re: RAID 5 rebuild fails with power interruption.
  2009-11-16  3:56 RAID 5 rebuild fails with power interruption senthilkumar.muthukalai
@ 2009-11-16  5:19 ` Goswin von Brederlow
  2009-11-16 10:30   ` senthilkumar.muthukalai
  0 siblings, 1 reply; 12+ messages in thread
From: Goswin von Brederlow @ 2009-11-16  5:19 UTC
  To: senthilkumar.muthukalai; +Cc: linux-raid

<senthilkumar.muthukalai@wipro.com> writes:

> Adding a subject line...
>
> -----Original Message-----
> From: SenthilKumar Muthukalai (WT01 - Telecom Equipment)
> Sent: Monday, November 16, 2009 9:14 AM
> To: linux-raid@vger.kernel.org
> Subject:
>
> Hi All,
>
> Could you pls help me out with the below problem?
>
> 1. Created a RAID5 with 3 disks.
> 2. Initial rebuild done.
> 3. Pulled out a disk from the array.
> 4. The array got degraded.
> 5. Added the disk back to the array with 'assemble' command.
> 6. The disk was successfully added and the array started rebuilding
> again.
> 7. While rebuilding, reset the power to the NAS box.
> 8. When the NAS box boot up, the RAID was in degraded with the added
> disk thrown out.
> 9. The boot messages say 'kicking out of the non-fresh disk from the
> array'.
>
> We tried '--force' option with the 'assemble' command but no success.
>
> Thanks,
> Senthil M

mdadm --add /dev/md0 /dev/sdc1

But normally it should just continue the resync.

Regards,
        Goswin
* RE: RAID 5 rebuild fails with power interruption.
  2009-11-16  5:19 ` Goswin von Brederlow
@ 2009-11-16 10:30   ` senthilkumar.muthukalai
  2009-11-16 22:47     ` Neil Brown
  0 siblings, 1 reply; 12+ messages in thread
From: senthilkumar.muthukalai @ 2009-11-16 10:30 UTC
  To: goswin-v-b; +Cc: linux-raid

We face this problem in our NAS product, where we handle RAID5.
In the scenario described below, we reset the power while the RAID5
array is rebuilding after a disk has been added.
Ideally, when the system comes back up, the array should accept the
disk, but in our case it does not.
We get the 'kicking the non-fresh disk from array' message among the
boot messages.
In our RAID init script we run 'mdadm --examine --scan', followed by
'mdadm --assemble'.
Could you please help me understand why this disk is being thrown out?
What could be the solution?

-----Original Message-----
From: goswin-v-b@web.de [mailto:goswin-v-b@web.de]
Sent: Monday, November 16, 2009 10:49 AM
To: SenthilKumar Muthukalai (WT01 - Telecom Equipment)
Cc: linux-raid@vger.kernel.org
Subject: Re: RAID 5 rebuild fails with power interruption.

<senthilkumar.muthukalai@wipro.com> writes:

> Hi All,
>
> Could you pls help me out with the below problem?
>
> 1. Created a RAID5 with 3 disks.
> 2. Initial rebuild done.
> 3. Pulled out a disk from the array.
> 4. The array got degraded.
> 5. Added the disk back to the array with 'assemble' command.
> 6. The disk was successfully added and the array started rebuilding
> again.
> 7. While rebuilding, reset the power to the NAS box.
> 8. When the NAS box boot up, the RAID was in degraded with the added
> disk thrown out.
> 9. The boot messages say 'kicking out of the non-fresh disk from the
> array'.
>
> We tried '--force' option with the 'assemble' command but no success.
>
> Thanks,
> Senthil M

mdadm --add /dev/md0 /dev/sdc1

But normally it should just continue the resync.

Regards,
        Goswin
* Re: RAID 5 rebuild fails with power interruption.
  2009-11-16 10:30   ` senthilkumar.muthukalai
@ 2009-11-16 22:47     ` Neil Brown
  2009-11-17  6:30       ` senthilkumar.muthukalai
  2009-11-24  1:36       ` Kasper Sandberg
  0 siblings, 2 replies; 12+ messages in thread
From: Neil Brown @ 2009-11-16 22:47 UTC
  To: senthilkumar.muthukalai; +Cc: goswin-v-b, linux-raid

On Mon, 16 Nov 2009 16:00:38 +0530
<senthilkumar.muthukalai@wipro.com> wrote:

> We face this problem in our NAS product where we handle RAID5.
> In the below mentioned scenario, when RAID5 is rebuilding after
> adding a disk, we reset the power.
> Ideally when the system comes up, the RAID5 should have accepted the
> disk but not in our case.
> We get the 'kicking the non-fresh disk from array' message with the
> boot message.
> In our RAID init script we run 'mdadm --examine --scan', followed by
> 'mdadm --assemble'.
> Could you pls help me to understand why this disk is being thrown out?

It is because the metadata being used (v0.90) does not have the ability
to record that a device is partially recovered. It can only record
that a device is either a full member of the array, or is not a member
of the array. So until the recovery completes, the metadata only
records that the device is not a member of the array. So when you
restart, you find that the device is not a member of the array.

> What could be the solution?

Use 1.x metadata, e.g. add
   --metadata=1.1
to your --create command.
1.x metadata is able to record that a device is only partially
recovered. So when the array is restarted the device will be included
and recovery will continue.

NeilBrown
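The distinction Neil describes can be sketched as a toy model in C. This is not the md on-disk format or driver code: the struct, the enum and all function names below are hypothetical, and the only assumption taken from the message above is that a 0.90-style record can say "member" or "not a member", while a 1.x-style record can additionally hold how far recovery has progressed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of what a superblock can persist
 * about one device. Not the real md metadata layout. */
struct dev_record {
    bool     full_member;      /* both formats can record this */
    bool     has_offset;       /* only the 1.x-style record has this */
    uint64_t recovery_offset;  /* sectors already rebuilt */
};

enum restart_action { USE_AS_IS, RESUME_RECOVERY, KICK_DEVICE };

/* A 0.90-style record for a device that is mid-recovery: there is no
 * field for the progress, so the device is simply "not a member". */
struct dev_record record_v090(uint64_t sectors_done)
{
    struct dev_record r = { false, false, 0 };
    (void)sectors_done;  /* progress cannot be stored */
    return r;
}

/* A 1.x-style record for the same device: still not a full member,
 * but the recovery progress is kept. */
struct dev_record record_v1x(uint64_t sectors_done)
{
    struct dev_record r = { false, true, sectors_done };
    return r;
}

/* What assembly can decide from the record alone after a restart. */
enum restart_action on_restart(struct dev_record r)
{
    if (r.full_member)
        return USE_AS_IS;
    if (r.has_offset)
        return RESUME_RECOVERY;
    return KICK_DEVICE;
}
```

In this model, on_restart(record_v090(1000)) yields KICK_DEVICE, while on_restart(record_v1x(1000)) yields RESUME_RECOVERY with the offset 1000 preserved.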
* RE: RAID 5 rebuild fails with power interruption.
  2009-11-16 22:47     ` Neil Brown
@ 2009-11-17  6:30       ` senthilkumar.muthukalai
       [not found]         ` <20091118163655.2ef3f00d@notabene.brown>
  2009-11-24  1:36       ` Kasper Sandberg
  1 sibling, 1 reply; 12+ messages in thread
From: senthilkumar.muthukalai @ 2009-11-17  6:30 UTC
  To: neilb; +Cc: goswin-v-b, linux-raid

Hi Neil,

We use metadata 1.2.

Thanks,
Senthil M
[parent not found: <20091118163655.2ef3f00d@notabene.brown>]
[parent not found: <8338BD137FF1B64EB341218BD702985E02B43A2A@BLR-EC-MBX03.wipro.com>]
* Re: RAID 5 rebuild fails with power interruption.
       [not found]           ` <8338BD137FF1B64EB341218BD702985E02B43A2A@BLR-EC-MBX03.wipro.com>
@ 2009-11-25  2:14             ` Neil Brown
  2009-11-25 14:20               ` senthilkumar.muthukalai
  0 siblings, 1 reply; 12+ messages in thread
From: Neil Brown @ 2009-11-25  2:14 UTC
  To: senthilkumar.muthukalai, linux-raid

(adding linux-raid back in to the CC list - please don't drop Cc's)

On Mon, 23 Nov 2009 19:01:31 +0530
<senthilkumar.muthukalai@wipro.com> wrote:

> Hi Neil,
>
> I applied the patch to our code as seen below.
> But then the disk is kicked out of the array while the system is power
> interrupted.
> Should I use --force option always to ensure the disk is not thrown
> out in this case?
> Please advise.

It looks like you need one extra change in that patch for it to be
completely reliable. See below.

Note that if you interrupt power while the array is degraded (which is
the case while it is recovering to a spare), and the array was active
at that time (i.e. there had been a write in the last 200ms or so),
then you will have a "dirty degraded" array and mdadm will refuse to
assemble such an array unless you use --force.
This is because when an array is 'dirty' you cannot trust the parity to
be correct, and when it is degraded you might have some data missing,
and that data cannot reliably be recovered from the parity (because we
don't trust the parity).

Pulling the power on a RAID5 array simply is not a good idea.

NeilBrown

diff --git a/drivers/md/md.c b/drivers/md/md.c
index b2a9ebc..e68b254 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -1517,12 +1517,10 @@ static void super_1_sync(mddev_t *mddev, mdk_rdev_t *rdev)
 
 	if (rdev->raid_disk >= 0 &&
 	    !test_bit(In_sync, &rdev->flags)) {
-		if (rdev->recovery_offset > 0) {
-			sb->feature_map |=
-				cpu_to_le32(MD_FEATURE_RECOVERY_OFFSET);
-			sb->recovery_offset =
-				cpu_to_le64(rdev->recovery_offset);
-		}
+		sb->feature_map |=
+			cpu_to_le32(MD_FEATURE_RECOVERY_OFFSET);
+		sb->recovery_offset =
+			cpu_to_le64(rdev->recovery_offset);
 	}
 
 	if (mddev->reshape_position != MaxSector) {
@@ -1556,7 +1554,7 @@ static void super_1_sync(mddev_t *mddev, mdk_rdev_t *rdev)
 			sb->dev_roles[i] = cpu_to_le16(0xfffe);
 		else if (test_bit(In_sync, &rdev2->flags))
 			sb->dev_roles[i] = cpu_to_le16(rdev2->raid_disk);
-		else if (rdev2->raid_disk >= 0 && rdev2->recovery_offset > 0)
+		else if (rdev2->raid_disk >= 0)
 			sb->dev_roles[i] = cpu_to_le16(rdev2->raid_disk);
 		else
 			sb->dev_roles[i] = cpu_to_le16(0xffff);
@@ -6769,6 +6767,7 @@ static int remove_and_add_spares(mddev_t *mddev)
 					nm, mdname(mddev));
 				spares++;
 				md_new_event(mddev);
+				set_bit(MD_CHANGE_DEVS, &mddev->flags);
 			} else
 				break;
 		}
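Neil's "dirty degraded" argument can be illustrated with a small XOR sketch. This is only a toy model of one 3-disk RAID5 stripe (two data blocks and one parity block, each a single byte), not md code, and the function names are hypothetical.

```c
#include <stdint.h>

/* Parity for a stripe with two data blocks: the XOR of the data. */
uint8_t parity_of(uint8_t d0, uint8_t d1)
{
    return (uint8_t)(d0 ^ d1);
}

/* Rebuild the missing data block from the surviving data block and
 * the parity block. The result is only right if the parity on disk
 * matches the data on disk. */
uint8_t reconstruct(uint8_t surviving, uint8_t parity)
{
    return (uint8_t)(surviving ^ parity);
}
```

With d0 = 0x11 and d1 = 0x22 the parity is 0x33, and reconstruct(0x11, 0x33) returns 0x22 as expected. Now suppose a write changes d0 to 0x44 and power is lost before the parity is updated, so the stripe is dirty, and the disk holding d1 is the missing one, so the array is degraded. Then reconstruct(0x44, 0x33) returns 0x77 rather than 0x22: the missing block is rebuilt from stale parity and comes out wrong, with nothing to signal the error.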
* RE: RAID 5 rebuild fails with power interruption.
  2009-11-25  2:14             ` Neil Brown
@ 2009-11-25 14:20               ` senthilkumar.muthukalai
  2009-11-25 20:58                 ` Neil Brown
  0 siblings, 1 reply; 12+ messages in thread
From: senthilkumar.muthukalai @ 2009-11-25 14:20 UTC
  To: neilb, linux-raid

Neil,

The patch you have provided doesn't seem to go with the code we use.
We use the md driver that comes with linux-1.16.8 source code.
Could you pls suggest the changes for this version of md?

Thanks,
Senthil M
* Re: RAID 5 rebuild fails with power interruption.
  2009-11-25 14:20               ` senthilkumar.muthukalai
@ 2009-11-25 20:58                 ` Neil Brown
  2009-11-26  6:26                   ` senthilkumar.muthukalai
  0 siblings, 1 reply; 12+ messages in thread
From: Neil Brown @ 2009-11-25 20:58 UTC
  To: senthilkumar.muthukalai; +Cc: linux-raid

On Wed, 25 Nov 2009 19:50:47 +0530
<senthilkumar.muthukalai@wipro.com> wrote:

> Neil,
>
> The patch you have provided doesn't seem to go with the code we use.
> We use the md driver that comes with linux-1.16.8 source code.
> Could you pls suggest the changes for this version of md?

I have no idea what you mean by "linux-1.16.8".
My patch is against the latest mainline kernel, which is 2.6.32-rc7 or
something close to that. I only provide patches against mainline.
If you need it backported to an earlier kernel, that is up to you.

NeilBrown
* RE: RAID 5 rebuild fails with power interruption.
  2009-11-25 20:58                 ` Neil Brown
@ 2009-11-26  6:26                   ` senthilkumar.muthukalai
       [not found]                     ` <200911260448.55466.tfjellstrom@shaw.ca>
  0 siblings, 1 reply; 12+ messages in thread
From: senthilkumar.muthukalai @ 2009-11-26  6:26 UTC
  To: neilb; +Cc: linux-raid

Oh sorry... That's a typo...
It's 2.16.8.
Anyway, thanks for the help.
Let me try to backport as you say.

Thanks,
Senthil M
[parent not found: <200911260448.55466.tfjellstrom@shaw.ca>]
* RE: RAID 5 rebuild fails with power interruption.
       [not found]                     ` <200911260448.55466.tfjellstrom@shaw.ca>
@ 2009-11-26 11:54                       ` senthilkumar.muthukalai
  0 siblings, 0 replies; 12+ messages in thread
From: senthilkumar.muthukalai @ 2009-11-26 11:54 UTC
  To: tfjellstrom; +Cc: linux-raid

Oh no... again a typo?!
Anyway, thanks for pointing that out...

-----Original Message-----
From: Thomas Fjellstrom [mailto:tfjellstrom@shaw.ca]
Sent: Thursday, November 26, 2009 5:19 PM
To: SenthilKumar Muthukalai (WT01 - Telecom Equipment)
Subject: Re: RAID 5 rebuild fails with power interruption.

On Wed November 25 2009, you wrote:
> Oh sorry... That's a typo...
> Its 2.16.8.

That kernel still doesn't exist ;) The highest-numbered kernel out now
is 2.6.31, with 2.6.32 coming out soon. Maybe you meant 2.6.18? That's
pretty old; if you can upgrade, you probably should.

> Anyways, thanks for the help.
> Let me try to backport as you say.
>
> Thanks,
> Senthil M

--
Thomas Fjellstrom
tfjellstrom@shaw.ca
* Re: RAID 5 rebuild fails with power interruption.
  2009-11-16 22:47     ` Neil Brown
  2009-11-17  6:30       ` senthilkumar.muthukalai
@ 2009-11-24  1:36       ` Kasper Sandberg
  2009-11-24  2:09         ` Neil Brown
  1 sibling, 1 reply; 12+ messages in thread
From: Kasper Sandberg @ 2009-11-24  1:36 UTC
  To: Neil Brown; +Cc: senthilkumar.muthukalai, goswin-v-b, linux-raid

On Tue, 2009-11-17 at 09:47 +1100, Neil Brown wrote:
> On Mon, 16 Nov 2009 16:00:38 +0530
> <senthilkumar.muthukalai@wipro.com> wrote:
>
> > We face this problem in our NAS product where we handle RAID5.
> > In the below mentioned scenario, when RAID5 is rebuilding after
> > adding a disk, we reset the power.
> > Ideally when the system comes up, the RAID5 should have accepted the
> > disk but not in our case.
> > We get the 'kicking the non-fresh disk from array' message with the
> > boot message.
> > In our RAID init script we run 'mdadm --examine --scan', followed by
> > 'mdadm --assemble'.
> > Could you pls help me to understand why this disk is being thrown out?
>
> It is because the metadata being used (v0.90) does not have the ability
> to record that a device is partially recovered. It can only record
> that a device is either a full member of the array, or is not a member
> of the array. So until the recovery completes, the metadata only
> records that the device is not a member of the array. So when you
> restart, you find that the device is not a member of the array.
>
> > What could be the solution?
>
> Use 1.x metadata, e.g. add
>    --metadata=1.1
> to your --create command.
> 1.x metadata is able to record that a device is only partially
> recovered. So when the array is restarted the device will be included
> and recovery will continue.

Might it be possible to upgrade metadata without having to recreate the
array?

> NeilBrown
<snip>
* Re: RAID 5 rebuild fails with power interruption.
  2009-11-24  1:36       ` Kasper Sandberg
@ 2009-11-24  2:09         ` Neil Brown
  0 siblings, 0 replies; 12+ messages in thread
From: Neil Brown @ 2009-11-24  2:09 UTC
  To: Kasper Sandberg; +Cc: senthilkumar.muthukalai, goswin-v-b, linux-raid

On Tue, 24 Nov 2009 02:36:37 +0100
Kasper Sandberg <postmaster@metanurb.dk> wrote:

> On Tue, 2009-11-17 at 09:47 +1100, Neil Brown wrote:
> > Use 1.x metadata. e.g. add
> >    --metadata=1.1
> > to your --create command.
> > 1.x metadata is able to record that a device is only partially
> > recovered. So when the array is restarted the device will be
> > included and recovery will continue.
>
> Might it be possible to upgrade metadata without having to recreate
> the array?

It isn't currently possible.
It would not be too hard to implement a conversion from 0.90 to 1.0
metadata, but I have no concrete plans to provide this.

NeilBrown
end of thread

Thread overview: 12+ messages
2009-11-16  3:56 RAID 5 rebuild fails with power interruption senthilkumar.muthukalai
2009-11-16  5:19 ` Goswin von Brederlow
2009-11-16 10:30   ` senthilkumar.muthukalai
2009-11-16 22:47     ` Neil Brown
2009-11-17  6:30       ` senthilkumar.muthukalai
     [not found]         ` <20091118163655.2ef3f00d@notabene.brown>
     [not found]           ` <8338BD137FF1B64EB341218BD702985E02B43A2A@BLR-EC-MBX03.wipro.com>
2009-11-25  2:14             ` Neil Brown
2009-11-25 14:20               ` senthilkumar.muthukalai
2009-11-25 20:58                 ` Neil Brown
2009-11-26  6:26                   ` senthilkumar.muthukalai
     [not found]                     ` <200911260448.55466.tfjellstrom@shaw.ca>
2009-11-26 11:54                       ` senthilkumar.muthukalai
2009-11-24  1:36       ` Kasper Sandberg
2009-11-24  2:09         ` Neil Brown