* [md PATCH 0/2] Fix problems with reshape on degraded raid5/6
@ 2017-10-17  5:18 NeilBrown
  2017-10-17  5:18 ` [md PATCH 2/2] md: be cautious about using ->curr_resync_completed for ->recovery_offset NeilBrown
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: NeilBrown @ 2017-10-17  5:18 UTC (permalink / raw)
  To: Shaohua Li; +Cc: linux-raid
Reshape on a degraded array is supposed to work, but it
obviously hasn't been tested.
These two patches make at least one test case (reported by
Curt <lightspd@gmail.com>) to work.
Thanks,
NeilBrown
---
NeilBrown (2):
      raid5: Set R5_Expanded on parity devices as well as data.
      md: be cautious about using ->curr_resync_completed for ->recovery_offset
 drivers/md/md.c    |   32 +++++++++++++++++++++-----------
 drivers/md/raid5.c |   23 ++++++++++++++++++++++-
 2 files changed, 43 insertions(+), 12 deletions(-)
--
Signature
^ permalink raw reply	[flat|nested] 10+ messages in thread* [md PATCH 2/2] md: be cautious about using ->curr_resync_completed for ->recovery_offset 2017-10-17 5:18 [md PATCH 0/2] Fix problems with reshape on degraded raid5/6 NeilBrown @ 2017-10-17 5:18 ` NeilBrown 2017-10-31 14:42 ` Tomasz Majchrzak 2017-10-17 5:18 ` [md PATCH 1/2] raid5: Set R5_Expanded on parity devices as well as data NeilBrown 2017-10-19 3:06 ` [md PATCH 0/2] Fix problems with reshape on degraded raid5/6 Shaohua Li 2 siblings, 1 reply; 10+ messages in thread From: NeilBrown @ 2017-10-17 5:18 UTC (permalink / raw) To: Shaohua Li; +Cc: linux-raid The ->recovery_offset shows how much of a non-InSync device is actually in sync - how much has been recoveryed. When performing a recovery, ->curr_resync and ->curr_resync_completed follow the device address being recovered and so can be used to update ->recovery_offset. When performing a reshape, ->curr_resync* might follow the device addresses (raid5) or might follow array addresses (raid10), so cannot in general be used to set ->recovery_offset. When reshaping backwards, ->curre_resync* measures from the *end* of the array-or-device, so is particularly unhelpful. So change the common code in md.c to only use ->curr_resync_complete for the simple recovery case, and add code to raid5.c to update ->recovery_offset during a forwards reshape. Signed-off-by: NeilBrown <neilb@suse.com> --- drivers/md/md.c | 32 +++++++++++++++++++++----------- drivers/md/raid5.c | 18 ++++++++++++++++++ 2 files changed, 39 insertions(+), 11 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index a9f1352b3849..d1dfc9879368 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -2453,10 +2453,17 @@ void md_update_sb(struct mddev *mddev, int force_change) } } - /* First make sure individual recovery_offsets are correct */ + /* First make sure individual recovery_offsets are correct + * curr_resync_completed can only be used during recovery. + * During reshape/resync it might use array-addresses rather + * that device addresses. + */ rdev_for_each(rdev, mddev) { if (rdev->raid_disk >= 0 && mddev->delta_disks >= 0 && + test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) && + test_bit(MD_RECOVERY_RECOVER, &mddev->recovery) && + !test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && !test_bit(Journal, &rdev->flags) && !test_bit(In_sync, &rdev->flags) && mddev->curr_resync_completed > rdev->recovery_offset) @@ -8507,16 +8514,19 @@ void md_do_sync(struct md_thread *thread) } else { if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery)) mddev->curr_resync = MaxSector; - rcu_read_lock(); - rdev_for_each_rcu(rdev, mddev) - if (rdev->raid_disk >= 0 && - mddev->delta_disks >= 0 && - !test_bit(Journal, &rdev->flags) && - !test_bit(Faulty, &rdev->flags) && - !test_bit(In_sync, &rdev->flags) && - rdev->recovery_offset < mddev->curr_resync) - rdev->recovery_offset = mddev->curr_resync; - rcu_read_unlock(); + if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && + test_bit(MD_RECOVERY_RECOVER, &mddev->recovery)) { + rcu_read_lock(); + rdev_for_each_rcu(rdev, mddev) + if (rdev->raid_disk >= 0 && + mddev->delta_disks >= 0 && + !test_bit(Journal, &rdev->flags) && + !test_bit(Faulty, &rdev->flags) && + !test_bit(In_sync, &rdev->flags) && + rdev->recovery_offset < mddev->curr_resync) + rdev->recovery_offset = mddev->curr_resync; + rcu_read_unlock(); + } } } skip: diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index a8df52130f8a..89ad79614309 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -5739,6 +5739,7 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk */ struct r5conf *conf = mddev->private; struct stripe_head *sh; + struct md_rdev *rdev; sector_t first_sector, last_sector; int raid_disks = conf->previous_raid_disks; int data_disks = raid_disks - conf->max_degraded; @@ -5861,6 +5862,15 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk return 0; mddev->reshape_position = conf->reshape_progress; mddev->curr_resync_completed = sector_nr; + if (!mddev->reshape_backwards) + /* Can update recovery_offset */ + rdev_for_each(rdev, mddev) + if (rdev->raid_disk >= 0 && + !test_bit(Journal, &rdev->flags) && + !test_bit(In_sync, &rdev->flags) && + rdev->recovery_offset < sector_nr) + rdev->recovery_offset = sector_nr; + conf->reshape_checkpoint = jiffies; set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags); md_wakeup_thread(mddev->thread); @@ -5959,6 +5969,14 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk goto ret; mddev->reshape_position = conf->reshape_progress; mddev->curr_resync_completed = sector_nr; + if (!mddev->reshape_backwards) + /* Can update recovery_offset */ + rdev_for_each(rdev, mddev) + if (rdev->raid_disk >= 0 && + !test_bit(Journal, &rdev->flags) && + !test_bit(In_sync, &rdev->flags) && + rdev->recovery_offset < sector_nr) + rdev->recovery_offset = sector_nr; conf->reshape_checkpoint = jiffies; set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags); md_wakeup_thread(mddev->thread); ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [md PATCH 2/2] md: be cautious about using ->curr_resync_completed for ->recovery_offset 2017-10-17 5:18 ` [md PATCH 2/2] md: be cautious about using ->curr_resync_completed for ->recovery_offset NeilBrown @ 2017-10-31 14:42 ` Tomasz Majchrzak 2017-11-01 1:13 ` NeilBrown 0 siblings, 1 reply; 10+ messages in thread From: Tomasz Majchrzak @ 2017-10-31 14:42 UTC (permalink / raw) To: NeilBrown; +Cc: Shaohua Li, linux-raid On Tue, Oct 17, 2017 at 04:18:36PM +1100, NeilBrown wrote: > The ->recovery_offset shows how much of a non-InSync device is actually > in sync - how much has been recoveryed. > > When performing a recovery, ->curr_resync and ->curr_resync_completed > follow the device address being recovered and so can be used to update > ->recovery_offset. > > When performing a reshape, ->curr_resync* might follow the device > addresses (raid5) or might follow array addresses (raid10), so cannot > in general be used to set ->recovery_offset. When reshaping backwards, > ->curre_resync* measures from the *end* of the array-or-device, so is > particularly unhelpful. > > So change the common code in md.c to only use ->curr_resync_complete > for the simple recovery case, and add code to raid5.c to update > ->recovery_offset during a forwards reshape. > > Signed-off-by: NeilBrown <neilb@suse.com> > --- > drivers/md/md.c | 32 +++++++++++++++++++++----------- > drivers/md/raid5.c | 18 ++++++++++++++++++ > 2 files changed, 39 insertions(+), 11 deletions(-) > > diff --git a/drivers/md/md.c b/drivers/md/md.c > index a9f1352b3849..d1dfc9879368 100644 > --- a/drivers/md/md.c > +++ b/drivers/md/md.c > @@ -2453,10 +2453,17 @@ void md_update_sb(struct mddev *mddev, int force_change) > } > } > > - /* First make sure individual recovery_offsets are correct */ > + /* First make sure individual recovery_offsets are correct > + * curr_resync_completed can only be used during recovery. > + * During reshape/resync it might use array-addresses rather > + * that device addresses. > + */ > rdev_for_each(rdev, mddev) { > if (rdev->raid_disk >= 0 && > mddev->delta_disks >= 0 && > + test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) && > + test_bit(MD_RECOVERY_RECOVER, &mddev->recovery) && > + !test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && > !test_bit(Journal, &rdev->flags) && > !test_bit(In_sync, &rdev->flags) && > mddev->curr_resync_completed > rdev->recovery_offset) > @@ -8507,16 +8514,19 @@ void md_do_sync(struct md_thread *thread) > } else { > if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery)) > mddev->curr_resync = MaxSector; > - rcu_read_lock(); > - rdev_for_each_rcu(rdev, mddev) > - if (rdev->raid_disk >= 0 && > - mddev->delta_disks >= 0 && > - !test_bit(Journal, &rdev->flags) && > - !test_bit(Faulty, &rdev->flags) && > - !test_bit(In_sync, &rdev->flags) && > - rdev->recovery_offset < mddev->curr_resync) > - rdev->recovery_offset = mddev->curr_resync; > - rcu_read_unlock(); > + if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && > + test_bit(MD_RECOVERY_RECOVER, &mddev->recovery)) { > + rcu_read_lock(); > + rdev_for_each_rcu(rdev, mddev) > + if (rdev->raid_disk >= 0 && > + mddev->delta_disks >= 0 && > + !test_bit(Journal, &rdev->flags) && > + !test_bit(Faulty, &rdev->flags) && > + !test_bit(In_sync, &rdev->flags) && > + rdev->recovery_offset < mddev->curr_resync) > + rdev->recovery_offset = mddev->curr_resync; > + rcu_read_unlock(); > + } > } > } > skip: > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c > index a8df52130f8a..89ad79614309 100644 > --- a/drivers/md/raid5.c > +++ b/drivers/md/raid5.c > @@ -5739,6 +5739,7 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk > */ > struct r5conf *conf = mddev->private; > struct stripe_head *sh; > + struct md_rdev *rdev; > sector_t first_sector, last_sector; > int raid_disks = conf->previous_raid_disks; > int data_disks = raid_disks - conf->max_degraded; > @@ -5861,6 +5862,15 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk > return 0; > mddev->reshape_position = conf->reshape_progress; > mddev->curr_resync_completed = sector_nr; > + if (!mddev->reshape_backwards) > + /* Can update recovery_offset */ > + rdev_for_each(rdev, mddev) > + if (rdev->raid_disk >= 0 && > + !test_bit(Journal, &rdev->flags) && > + !test_bit(In_sync, &rdev->flags) && > + rdev->recovery_offset < sector_nr) > + rdev->recovery_offset = sector_nr; > + > conf->reshape_checkpoint = jiffies; > set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags); > md_wakeup_thread(mddev->thread); > @@ -5959,6 +5969,14 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk > goto ret; > mddev->reshape_position = conf->reshape_progress; > mddev->curr_resync_completed = sector_nr; > + if (!mddev->reshape_backwards) > + /* Can update recovery_offset */ > + rdev_for_each(rdev, mddev) > + if (rdev->raid_disk >= 0 && > + !test_bit(Journal, &rdev->flags) && > + !test_bit(In_sync, &rdev->flags) && > + rdev->recovery_offset < sector_nr) > + rdev->recovery_offset = sector_nr; > conf->reshape_checkpoint = jiffies; > set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags); > md_wakeup_thread(mddev->thread); > > > -- Hi, This patch has caused a regression in following IMSM RAID scenario: mdadm --create /dev/md/imsm0 --metadata=imsm --raid-devices=4 /dev/nvme[0-3]n1 --run mdadm --create /dev/md/r10 --level=10 --size=100M --raid-devices=4 /dev/nvme[0-3]n1 --run --assume-clean MDADM_EXPERIMENTAL=1 mdadm --grow /dev/md/r10 --level=0 MDADM_EXPERIMENTAL=1 mdadm --grow /dev/md/imsm0 --raid-devices=4 cat /proc/mdstat Personalities : [raid10] [raid0] [raid6] [raid5] [raid4] md126 : active raid4 nvme0n1[3] nvme2n1[2] nvme1n1[1] 409600 blocks super external:/md127/0 level 4, 128k chunk, algorithm 0 [5/3] [UU_U_] md127 : inactive nvme3n1[3](S) nvme2n1[2](S) nvme1n1[1](S) nvme0n1[0](S) 4420 blocks super external:imsm The array should have been reshaped to 4-disk RAID0 but it has failed in the middle of reshape as RAID4. Following condition is not true for above scenario: > + if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && > + test_bit(MD_RECOVERY_RECOVER, &mddev->recovery)) { Tomek ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [md PATCH 2/2] md: be cautious about using ->curr_resync_completed for ->recovery_offset 2017-10-31 14:42 ` Tomasz Majchrzak @ 2017-11-01 1:13 ` NeilBrown 2017-11-02 8:40 ` Tomasz Majchrzak 0 siblings, 1 reply; 10+ messages in thread From: NeilBrown @ 2017-11-01 1:13 UTC (permalink / raw) To: Tomasz Majchrzak; +Cc: Shaohua Li, linux-raid [-- Attachment #1: Type: text/plain, Size: 7795 bytes --] On Tue, Oct 31 2017, Tomasz Majchrzak wrote: > On Tue, Oct 17, 2017 at 04:18:36PM +1100, NeilBrown wrote: >> The ->recovery_offset shows how much of a non-InSync device is actually >> in sync - how much has been recoveryed. >> >> When performing a recovery, ->curr_resync and ->curr_resync_completed >> follow the device address being recovered and so can be used to update >> ->recovery_offset. >> >> When performing a reshape, ->curr_resync* might follow the device >> addresses (raid5) or might follow array addresses (raid10), so cannot >> in general be used to set ->recovery_offset. When reshaping backwards, >> ->curre_resync* measures from the *end* of the array-or-device, so is >> particularly unhelpful. >> >> So change the common code in md.c to only use ->curr_resync_complete >> for the simple recovery case, and add code to raid5.c to update >> ->recovery_offset during a forwards reshape. >> >> Signed-off-by: NeilBrown <neilb@suse.com> >> --- >> drivers/md/md.c | 32 +++++++++++++++++++++----------- >> drivers/md/raid5.c | 18 ++++++++++++++++++ >> 2 files changed, 39 insertions(+), 11 deletions(-) >> >> diff --git a/drivers/md/md.c b/drivers/md/md.c >> index a9f1352b3849..d1dfc9879368 100644 >> --- a/drivers/md/md.c >> +++ b/drivers/md/md.c >> @@ -2453,10 +2453,17 @@ void md_update_sb(struct mddev *mddev, int force_change) >> } >> } >> >> - /* First make sure individual recovery_offsets are correct */ >> + /* First make sure individual recovery_offsets are correct >> + * curr_resync_completed can only be used during recovery. >> + * During reshape/resync it might use array-addresses rather >> + * that device addresses. >> + */ >> rdev_for_each(rdev, mddev) { >> if (rdev->raid_disk >= 0 && >> mddev->delta_disks >= 0 && >> + test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) && >> + test_bit(MD_RECOVERY_RECOVER, &mddev->recovery) && >> + !test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && >> !test_bit(Journal, &rdev->flags) && >> !test_bit(In_sync, &rdev->flags) && >> mddev->curr_resync_completed > rdev->recovery_offset) >> @@ -8507,16 +8514,19 @@ void md_do_sync(struct md_thread *thread) >> } else { >> if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery)) >> mddev->curr_resync = MaxSector; >> - rcu_read_lock(); >> - rdev_for_each_rcu(rdev, mddev) >> - if (rdev->raid_disk >= 0 && >> - mddev->delta_disks >= 0 && >> - !test_bit(Journal, &rdev->flags) && >> - !test_bit(Faulty, &rdev->flags) && >> - !test_bit(In_sync, &rdev->flags) && >> - rdev->recovery_offset < mddev->curr_resync) >> - rdev->recovery_offset = mddev->curr_resync; >> - rcu_read_unlock(); >> + if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && >> + test_bit(MD_RECOVERY_RECOVER, &mddev->recovery)) { >> + rcu_read_lock(); >> + rdev_for_each_rcu(rdev, mddev) >> + if (rdev->raid_disk >= 0 && >> + mddev->delta_disks >= 0 && >> + !test_bit(Journal, &rdev->flags) && >> + !test_bit(Faulty, &rdev->flags) && >> + !test_bit(In_sync, &rdev->flags) && >> + rdev->recovery_offset < mddev->curr_resync) >> + rdev->recovery_offset = mddev->curr_resync; >> + rcu_read_unlock(); >> + } >> } >> } >> skip: >> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c >> index a8df52130f8a..89ad79614309 100644 >> --- a/drivers/md/raid5.c >> +++ b/drivers/md/raid5.c >> @@ -5739,6 +5739,7 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk >> */ >> struct r5conf *conf = mddev->private; >> struct stripe_head *sh; >> + struct md_rdev *rdev; >> sector_t first_sector, last_sector; >> int raid_disks = conf->previous_raid_disks; >> int data_disks = raid_disks - conf->max_degraded; >> @@ -5861,6 +5862,15 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk >> return 0; >> mddev->reshape_position = conf->reshape_progress; >> mddev->curr_resync_completed = sector_nr; >> + if (!mddev->reshape_backwards) >> + /* Can update recovery_offset */ >> + rdev_for_each(rdev, mddev) >> + if (rdev->raid_disk >= 0 && >> + !test_bit(Journal, &rdev->flags) && >> + !test_bit(In_sync, &rdev->flags) && >> + rdev->recovery_offset < sector_nr) >> + rdev->recovery_offset = sector_nr; >> + >> conf->reshape_checkpoint = jiffies; >> set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags); >> md_wakeup_thread(mddev->thread); >> @@ -5959,6 +5969,14 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk >> goto ret; >> mddev->reshape_position = conf->reshape_progress; >> mddev->curr_resync_completed = sector_nr; >> + if (!mddev->reshape_backwards) >> + /* Can update recovery_offset */ >> + rdev_for_each(rdev, mddev) >> + if (rdev->raid_disk >= 0 && >> + !test_bit(Journal, &rdev->flags) && >> + !test_bit(In_sync, &rdev->flags) && >> + rdev->recovery_offset < sector_nr) >> + rdev->recovery_offset = sector_nr; >> conf->reshape_checkpoint = jiffies; >> set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags); >> md_wakeup_thread(mddev->thread); >> >> >> -- > > Hi, > > This patch has caused a regression in following IMSM RAID scenario: That's unfortunate. > > mdadm --create /dev/md/imsm0 --metadata=imsm --raid-devices=4 /dev/nvme[0-3]n1 --run > mdadm --create /dev/md/r10 --level=10 --size=100M --raid-devices=4 /dev/nvme[0-3]n1 --run --assume-clean > > MDADM_EXPERIMENTAL=1 mdadm --grow /dev/md/r10 --level=0 > MDADM_EXPERIMENTAL=1 mdadm --grow /dev/md/imsm0 --raid-devices=4 > > cat /proc/mdstat > Personalities : [raid10] [raid0] [raid6] [raid5] [raid4] > md126 : active raid4 nvme0n1[3] nvme2n1[2] nvme1n1[1] > 409600 blocks super external:/md127/0 level 4, 128k chunk, algorithm 0 [5/3] [UU_U_] > > md127 : inactive nvme3n1[3](S) nvme2n1[2](S) nvme1n1[1](S) nvme0n1[0](S) > 4420 blocks super external:imsm Why isn't nvme3n1 in md126? Did it not get added, or did it get removed for some reason? If it wasn't added, raid5_start_reshape() would have failed, so I guess it must have been removed?? > > The array should have been reshaped to 4-disk RAID0 but it has failed in > the middle of reshape as RAID4. > > Following condition is not true for above scenario: > >> + if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && >> + test_bit(MD_RECOVERY_RECOVER, &mddev->recovery)) { No, it isn't meant to be. This code doesn't work reliably in the reshape case. When reshaping raid5 (or raid4), reshape_request updates rdev->recovery_offset. ... maybe end_reshape needs to update it too. If you add this patch, does it help? Thanks, NeilBrown diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 89ad79614309..a99ae1be9175 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -7964,6 +7964,7 @@ static void end_reshape(struct r5conf *conf) { if (!test_bit(MD_RECOVERY_INTR, &conf->mddev->recovery)) { + struct md_rdev *rdev; spin_lock_irq(&conf->device_lock); conf->previous_raid_disks = conf->raid_disks; @@ -7971,6 +7972,12 @@ static void end_reshape(struct r5conf *conf) smp_wmb(); conf->reshape_progress = MaxSector; conf->mddev->reshape_position = MaxSector; + rdev_for_each(rdev, conf->mddev) + if (rdev->raid_disk >= 0 && + !test_bit(Journal, &rdev->flags) && + !test_bit(In_sync, &rdev->flags)) + rdev->recovery_offset = MaxSector; + spin_unlock_irq(&conf->device_lock); wake_up(&conf->wait_for_overlap); [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 832 bytes --] ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [md PATCH 2/2] md: be cautious about using ->curr_resync_completed for ->recovery_offset 2017-11-01 1:13 ` NeilBrown @ 2017-11-02 8:40 ` Tomasz Majchrzak 2017-11-09 0:08 ` Shaohua Li 0 siblings, 1 reply; 10+ messages in thread From: Tomasz Majchrzak @ 2017-11-02 8:40 UTC (permalink / raw) To: NeilBrown; +Cc: Shaohua Li, linux-raid On Wed, Nov 01, 2017 at 12:13:22PM +1100, NeilBrown wrote: > On Tue, Oct 31 2017, Tomasz Majchrzak wrote: > > > On Tue, Oct 17, 2017 at 04:18:36PM +1100, NeilBrown wrote: > >> The ->recovery_offset shows how much of a non-InSync device is actually > >> in sync - how much has been recoveryed. > >> > >> When performing a recovery, ->curr_resync and ->curr_resync_completed > >> follow the device address being recovered and so can be used to update > >> ->recovery_offset. > >> > >> When performing a reshape, ->curr_resync* might follow the device > >> addresses (raid5) or might follow array addresses (raid10), so cannot > >> in general be used to set ->recovery_offset. When reshaping backwards, > >> ->curre_resync* measures from the *end* of the array-or-device, so is > >> particularly unhelpful. > >> > >> So change the common code in md.c to only use ->curr_resync_complete > >> for the simple recovery case, and add code to raid5.c to update > >> ->recovery_offset during a forwards reshape. > >> > >> Signed-off-by: NeilBrown <neilb@suse.com> > >> --- > >> drivers/md/md.c | 32 +++++++++++++++++++++----------- > >> drivers/md/raid5.c | 18 ++++++++++++++++++ > >> 2 files changed, 39 insertions(+), 11 deletions(-) > >> > >> diff --git a/drivers/md/md.c b/drivers/md/md.c > >> index a9f1352b3849..d1dfc9879368 100644 > >> --- a/drivers/md/md.c > >> +++ b/drivers/md/md.c > >> @@ -2453,10 +2453,17 @@ void md_update_sb(struct mddev *mddev, int force_change) > >> } > >> } > >> > >> - /* First make sure individual recovery_offsets are correct */ > >> + /* First make sure individual recovery_offsets are correct > >> + * curr_resync_completed can only be used during recovery. > >> + * During reshape/resync it might use array-addresses rather > >> + * that device addresses. > >> + */ > >> rdev_for_each(rdev, mddev) { > >> if (rdev->raid_disk >= 0 && > >> mddev->delta_disks >= 0 && > >> + test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) && > >> + test_bit(MD_RECOVERY_RECOVER, &mddev->recovery) && > >> + !test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && > >> !test_bit(Journal, &rdev->flags) && > >> !test_bit(In_sync, &rdev->flags) && > >> mddev->curr_resync_completed > rdev->recovery_offset) > >> @@ -8507,16 +8514,19 @@ void md_do_sync(struct md_thread *thread) > >> } else { > >> if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery)) > >> mddev->curr_resync = MaxSector; > >> - rcu_read_lock(); > >> - rdev_for_each_rcu(rdev, mddev) > >> - if (rdev->raid_disk >= 0 && > >> - mddev->delta_disks >= 0 && > >> - !test_bit(Journal, &rdev->flags) && > >> - !test_bit(Faulty, &rdev->flags) && > >> - !test_bit(In_sync, &rdev->flags) && > >> - rdev->recovery_offset < mddev->curr_resync) > >> - rdev->recovery_offset = mddev->curr_resync; > >> - rcu_read_unlock(); > >> + if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && > >> + test_bit(MD_RECOVERY_RECOVER, &mddev->recovery)) { > >> + rcu_read_lock(); > >> + rdev_for_each_rcu(rdev, mddev) > >> + if (rdev->raid_disk >= 0 && > >> + mddev->delta_disks >= 0 && > >> + !test_bit(Journal, &rdev->flags) && > >> + !test_bit(Faulty, &rdev->flags) && > >> + !test_bit(In_sync, &rdev->flags) && > >> + rdev->recovery_offset < mddev->curr_resync) > >> + rdev->recovery_offset = mddev->curr_resync; > >> + rcu_read_unlock(); > >> + } > >> } > >> } > >> skip: > >> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c > >> index a8df52130f8a..89ad79614309 100644 > >> --- a/drivers/md/raid5.c > >> +++ b/drivers/md/raid5.c > >> @@ -5739,6 +5739,7 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk > >> */ > >> struct r5conf *conf = mddev->private; > >> struct stripe_head *sh; > >> + struct md_rdev *rdev; > >> sector_t first_sector, last_sector; > >> int raid_disks = conf->previous_raid_disks; > >> int data_disks = raid_disks - conf->max_degraded; > >> @@ -5861,6 +5862,15 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk > >> return 0; > >> mddev->reshape_position = conf->reshape_progress; > >> mddev->curr_resync_completed = sector_nr; > >> + if (!mddev->reshape_backwards) > >> + /* Can update recovery_offset */ > >> + rdev_for_each(rdev, mddev) > >> + if (rdev->raid_disk >= 0 && > >> + !test_bit(Journal, &rdev->flags) && > >> + !test_bit(In_sync, &rdev->flags) && > >> + rdev->recovery_offset < sector_nr) > >> + rdev->recovery_offset = sector_nr; > >> + > >> conf->reshape_checkpoint = jiffies; > >> set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags); > >> md_wakeup_thread(mddev->thread); > >> @@ -5959,6 +5969,14 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk > >> goto ret; > >> mddev->reshape_position = conf->reshape_progress; > >> mddev->curr_resync_completed = sector_nr; > >> + if (!mddev->reshape_backwards) > >> + /* Can update recovery_offset */ > >> + rdev_for_each(rdev, mddev) > >> + if (rdev->raid_disk >= 0 && > >> + !test_bit(Journal, &rdev->flags) && > >> + !test_bit(In_sync, &rdev->flags) && > >> + rdev->recovery_offset < sector_nr) > >> + rdev->recovery_offset = sector_nr; > >> conf->reshape_checkpoint = jiffies; > >> set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags); > >> md_wakeup_thread(mddev->thread); > >> > >> > >> -- > > > > Hi, > > > > This patch has caused a regression in following IMSM RAID scenario: > > That's unfortunate. > > > > > mdadm --create /dev/md/imsm0 --metadata=imsm --raid-devices=4 /dev/nvme[0-3]n1 --run > > mdadm --create /dev/md/r10 --level=10 --size=100M --raid-devices=4 /dev/nvme[0-3]n1 --run --assume-clean > > > > MDADM_EXPERIMENTAL=1 mdadm --grow /dev/md/r10 --level=0 > > MDADM_EXPERIMENTAL=1 mdadm --grow /dev/md/imsm0 --raid-devices=4 > > > > cat /proc/mdstat > > Personalities : [raid10] [raid0] [raid6] [raid5] [raid4] > > md126 : active raid4 nvme0n1[3] nvme2n1[2] nvme1n1[1] > > 409600 blocks super external:/md127/0 level 4, 128k chunk, algorithm 0 [5/3] [UU_U_] > > > > md127 : inactive nvme3n1[3](S) nvme2n1[2](S) nvme1n1[1](S) nvme0n1[0](S) > > 4420 blocks super external:imsm > > Why isn't nvme3n1 in md126? Did it not get added, or did it get removed > for some reason? > If it wasn't added, raid5_start_reshape() would have failed, so I guess > it must have been removed?? It got removed. It's in the array during the reshape: cat /proc/mdstat Personalities : [raid10] [raid0] [raid6] [raid5] [raid4] [raid1] md126 : active raid4 nvme3n1[4] nvme0n1[3] nvme2n1[2] nvme1n1[1] 4194304 blocks super external:-md127/0 level 4, 128k chunk, algorithm 5 [5/4] [UU_U_] [===>.................] reshape = 16.6% (350476/2097152) finish=0.1min speed=175238K/sec md127 : inactive nvme3n1[3](S) nvme2n1[2](S) nvme1n1[1](S) nvme0n1[0](S) 4420 blocks super external:imsm > > > > > The array should have been reshaped to 4-disk RAID0 but it has failed in > > the middle of reshape as RAID4. > > > > Following condition is not true for above scenario: > > > >> + if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && > >> + test_bit(MD_RECOVERY_RECOVER, &mddev->recovery)) { > > No, it isn't meant to be. This code doesn't work reliably in the > reshape case. When reshaping raid5 (or raid4), reshape_request updates > rdev->recovery_offset. ... maybe end_reshape needs to update it too. > > If you add this patch, does it help? > > > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c > index 89ad79614309..a99ae1be9175 100644 > --- a/drivers/md/raid5.c > +++ b/drivers/md/raid5.c > @@ -7964,6 +7964,7 @@ static void end_reshape(struct r5conf *conf) > { > > if (!test_bit(MD_RECOVERY_INTR, &conf->mddev->recovery)) { > + struct md_rdev *rdev; > > spin_lock_irq(&conf->device_lock); > conf->previous_raid_disks = conf->raid_disks; > @@ -7971,6 +7972,12 @@ static void end_reshape(struct r5conf *conf) > smp_wmb(); > conf->reshape_progress = MaxSector; > conf->mddev->reshape_position = MaxSector; > + rdev_for_each(rdev, conf->mddev) > + if (rdev->raid_disk >= 0 && > + !test_bit(Journal, &rdev->flags) && > + !test_bit(In_sync, &rdev->flags)) > + rdev->recovery_offset = MaxSector; > + > spin_unlock_irq(&conf->device_lock); > wake_up(&conf->wait_for_overlap); > Yes, this patch fixes the problem. I have also checked that reshape position is really stored and reshape interrupted in the middle restarts from the last checkpoint. Thank you, Tomek ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [md PATCH 2/2] md: be cautious about using ->curr_resync_completed for ->recovery_offset 2017-11-02 8:40 ` Tomasz Majchrzak @ 2017-11-09 0:08 ` Shaohua Li 2017-11-09 7:55 ` NeilBrown 0 siblings, 1 reply; 10+ messages in thread From: Shaohua Li @ 2017-11-09 0:08 UTC (permalink / raw) To: Tomasz Majchrzak; +Cc: NeilBrown, linux-raid On Thu, Nov 02, 2017 at 09:40:50AM +0100, Tomasz Majchrzak wrote: > On Wed, Nov 01, 2017 at 12:13:22PM +1100, NeilBrown wrote: > > On Tue, Oct 31 2017, Tomasz Majchrzak wrote: > > > > > On Tue, Oct 17, 2017 at 04:18:36PM +1100, NeilBrown wrote: > > >> The ->recovery_offset shows how much of a non-InSync device is actually > > >> in sync - how much has been recoveryed. > > >> > > >> When performing a recovery, ->curr_resync and ->curr_resync_completed > > >> follow the device address being recovered and so can be used to update > > >> ->recovery_offset. > > >> > > >> When performing a reshape, ->curr_resync* might follow the device > > >> addresses (raid5) or might follow array addresses (raid10), so cannot > > >> in general be used to set ->recovery_offset. When reshaping backwards, > > >> ->curre_resync* measures from the *end* of the array-or-device, so is > > >> particularly unhelpful. > > >> > > >> So change the common code in md.c to only use ->curr_resync_complete > > >> for the simple recovery case, and add code to raid5.c to update > > >> ->recovery_offset during a forwards reshape. > > >> > > >> Signed-off-by: NeilBrown <neilb@suse.com> > > >> --- > > >> drivers/md/md.c | 32 +++++++++++++++++++++----------- > > >> drivers/md/raid5.c | 18 ++++++++++++++++++ > > >> 2 files changed, 39 insertions(+), 11 deletions(-) > > >> > > >> diff --git a/drivers/md/md.c b/drivers/md/md.c > > >> index a9f1352b3849..d1dfc9879368 100644 > > >> --- a/drivers/md/md.c > > >> +++ b/drivers/md/md.c > > >> @@ -2453,10 +2453,17 @@ void md_update_sb(struct mddev *mddev, int force_change) > > >> } > > >> } > > >> > > >> - /* First make sure individual recovery_offsets are correct */ > > >> + /* First make sure individual recovery_offsets are correct > > >> + * curr_resync_completed can only be used during recovery. > > >> + * During reshape/resync it might use array-addresses rather > > >> + * that device addresses. > > >> + */ > > >> rdev_for_each(rdev, mddev) { > > >> if (rdev->raid_disk >= 0 && > > >> mddev->delta_disks >= 0 && > > >> + test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) && > > >> + test_bit(MD_RECOVERY_RECOVER, &mddev->recovery) && > > >> + !test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && > > >> !test_bit(Journal, &rdev->flags) && > > >> !test_bit(In_sync, &rdev->flags) && > > >> mddev->curr_resync_completed > rdev->recovery_offset) > > >> @@ -8507,16 +8514,19 @@ void md_do_sync(struct md_thread *thread) > > >> } else { > > >> if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery)) > > >> mddev->curr_resync = MaxSector; > > >> - rcu_read_lock(); > > >> - rdev_for_each_rcu(rdev, mddev) > > >> - if (rdev->raid_disk >= 0 && > > >> - mddev->delta_disks >= 0 && > > >> - !test_bit(Journal, &rdev->flags) && > > >> - !test_bit(Faulty, &rdev->flags) && > > >> - !test_bit(In_sync, &rdev->flags) && > > >> - rdev->recovery_offset < mddev->curr_resync) > > >> - rdev->recovery_offset = mddev->curr_resync; > > >> - rcu_read_unlock(); > > >> + if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && > > >> + test_bit(MD_RECOVERY_RECOVER, &mddev->recovery)) { > > >> + rcu_read_lock(); > > >> + rdev_for_each_rcu(rdev, mddev) > > >> + if (rdev->raid_disk >= 0 && > > >> + mddev->delta_disks >= 0 && > > >> + !test_bit(Journal, &rdev->flags) && > > >> + !test_bit(Faulty, &rdev->flags) && > > >> + !test_bit(In_sync, &rdev->flags) && > > >> + rdev->recovery_offset < mddev->curr_resync) > > >> + rdev->recovery_offset = mddev->curr_resync; > > >> + rcu_read_unlock(); > > >> + } > > >> } > > >> } > > >> skip: > > >> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c > > >> index a8df52130f8a..89ad79614309 100644 > > >> --- a/drivers/md/raid5.c > > >> +++ b/drivers/md/raid5.c > > >> @@ -5739,6 +5739,7 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk > > >> */ > > >> struct r5conf *conf = mddev->private; > > >> struct stripe_head *sh; > > >> + struct md_rdev *rdev; > > >> sector_t first_sector, last_sector; > > >> int raid_disks = conf->previous_raid_disks; > > >> int data_disks = raid_disks - conf->max_degraded; > > >> @@ -5861,6 +5862,15 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk > > >> return 0; > > >> mddev->reshape_position = conf->reshape_progress; > > >> mddev->curr_resync_completed = sector_nr; > > >> + if (!mddev->reshape_backwards) > > >> + /* Can update recovery_offset */ > > >> + rdev_for_each(rdev, mddev) > > >> + if (rdev->raid_disk >= 0 && > > >> + !test_bit(Journal, &rdev->flags) && > > >> + !test_bit(In_sync, &rdev->flags) && > > >> + rdev->recovery_offset < sector_nr) > > >> + rdev->recovery_offset = sector_nr; > > >> + > > >> conf->reshape_checkpoint = jiffies; > > >> set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags); > > >> md_wakeup_thread(mddev->thread); > > >> @@ -5959,6 +5969,14 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk > > >> goto ret; > > >> mddev->reshape_position = conf->reshape_progress; > > >> mddev->curr_resync_completed = sector_nr; > > >> + if (!mddev->reshape_backwards) > > >> + /* Can update recovery_offset */ > > >> + rdev_for_each(rdev, mddev) > > >> + if (rdev->raid_disk >= 0 && > > >> + !test_bit(Journal, &rdev->flags) && > > >> + !test_bit(In_sync, &rdev->flags) && > > >> + rdev->recovery_offset < sector_nr) > > >> + rdev->recovery_offset = sector_nr; > > >> conf->reshape_checkpoint = jiffies; > > >> set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags); > > >> md_wakeup_thread(mddev->thread); > > >> > > >> > > >> -- > > > > > > Hi, > > > > > > This patch has caused a regression in following IMSM RAID scenario: > > > > That's unfortunate. > > > > > > > > mdadm --create /dev/md/imsm0 --metadata=imsm --raid-devices=4 /dev/nvme[0-3]n1 --run > > > mdadm --create /dev/md/r10 --level=10 --size=100M --raid-devices=4 /dev/nvme[0-3]n1 --run --assume-clean > > > > > > MDADM_EXPERIMENTAL=1 mdadm --grow /dev/md/r10 --level=0 > > > MDADM_EXPERIMENTAL=1 mdadm --grow /dev/md/imsm0 --raid-devices=4 > > > > > > cat /proc/mdstat > > > Personalities : [raid10] [raid0] [raid6] [raid5] [raid4] > > > md126 : active raid4 nvme0n1[3] nvme2n1[2] nvme1n1[1] > > > 409600 blocks super external:/md127/0 level 4, 128k chunk, algorithm 0 [5/3] [UU_U_] > > > > > > md127 : inactive nvme3n1[3](S) nvme2n1[2](S) nvme1n1[1](S) nvme0n1[0](S) > > > 4420 blocks super external:imsm > > > > Why isn't nvme3n1 in md126? Did it not get added, or did it get removed > > for some reason? > > If it wasn't added, raid5_start_reshape() would have failed, so I guess > > it must have been removed?? > > It got removed. It's in the array during the reshape: > > cat /proc/mdstat > Personalities : [raid10] [raid0] [raid6] [raid5] [raid4] [raid1] > md126 : active raid4 nvme3n1[4] nvme0n1[3] nvme2n1[2] nvme1n1[1] > 4194304 blocks super external:-md127/0 level 4, 128k chunk, algorithm 5 [5/4] [UU_U_] > [===>.................] reshape = 16.6% (350476/2097152) finish=0.1min speed=175238K/sec > > md127 : inactive nvme3n1[3](S) nvme2n1[2](S) nvme1n1[1](S) nvme0n1[0](S) > 4420 blocks super external:imsm > > > > > > > > > The array should have been reshaped to 4-disk RAID0 but it has failed in > > > the middle of reshape as RAID4. > > > > > > Following condition is not true for above scenario: > > > > > >> + if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && > > >> + test_bit(MD_RECOVERY_RECOVER, &mddev->recovery)) { > > > > No, it isn't meant to be. This code doesn't work reliably in the > > reshape case. When reshaping raid5 (or raid4), reshape_request updates > > rdev->recovery_offset. ... maybe end_reshape needs to update it too. > > > > If you add this patch, does it help? > > > > > > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c > > index 89ad79614309..a99ae1be9175 100644 > > --- a/drivers/md/raid5.c > > +++ b/drivers/md/raid5.c > > @@ -7964,6 +7964,7 @@ static void end_reshape(struct r5conf *conf) > > { > > > > if (!test_bit(MD_RECOVERY_INTR, &conf->mddev->recovery)) { > > + struct md_rdev *rdev; > > > > spin_lock_irq(&conf->device_lock); > > conf->previous_raid_disks = conf->raid_disks; > > @@ -7971,6 +7972,12 @@ static void end_reshape(struct r5conf *conf) > > smp_wmb(); > > conf->reshape_progress = MaxSector; > > conf->mddev->reshape_position = MaxSector; > > + rdev_for_each(rdev, conf->mddev) > > + if (rdev->raid_disk >= 0 && > > + !test_bit(Journal, &rdev->flags) && > > + !test_bit(In_sync, &rdev->flags)) > > + rdev->recovery_offset = MaxSector; > > + > > spin_unlock_irq(&conf->device_lock); > > wake_up(&conf->wait_for_overlap); > > > > Yes, this patch fixes the problem. I have also checked that reshape position > is really stored and reshape interrupted in the middle restarts from the > last checkpoint. Neil, can you send me a formal patch for this? or I can just fold this into your original patch. Thanks, Shaohua ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [md PATCH 2/2] md: be cautious about using ->curr_resync_completed for ->recovery_offset 2017-11-09 0:08 ` Shaohua Li @ 2017-11-09 7:55 ` NeilBrown 2017-11-09 15:31 ` Shaohua Li 0 siblings, 1 reply; 10+ messages in thread From: NeilBrown @ 2017-11-09 7:55 UTC (permalink / raw) To: Shaohua Li, Tomasz Majchrzak; +Cc: linux-raid [-- Attachment #1: Type: text/plain, Size: 9883 bytes --] On Wed, Nov 08 2017, Shaohua Li wrote: > On Thu, Nov 02, 2017 at 09:40:50AM +0100, Tomasz Majchrzak wrote: >> On Wed, Nov 01, 2017 at 12:13:22PM +1100, NeilBrown wrote: >> > On Tue, Oct 31 2017, Tomasz Majchrzak wrote: >> > >> > > On Tue, Oct 17, 2017 at 04:18:36PM +1100, NeilBrown wrote: >> > >> The ->recovery_offset shows how much of a non-InSync device is actually >> > >> in sync - how much has been recoveryed. >> > >> >> > >> When performing a recovery, ->curr_resync and ->curr_resync_completed >> > >> follow the device address being recovered and so can be used to update >> > >> ->recovery_offset. >> > >> >> > >> When performing a reshape, ->curr_resync* might follow the device >> > >> addresses (raid5) or might follow array addresses (raid10), so cannot >> > >> in general be used to set ->recovery_offset. When reshaping backwards, >> > >> ->curre_resync* measures from the *end* of the array-or-device, so is >> > >> particularly unhelpful. >> > >> >> > >> So change the common code in md.c to only use ->curr_resync_complete >> > >> for the simple recovery case, and add code to raid5.c to update >> > >> ->recovery_offset during a forwards reshape. >> > >> >> > >> Signed-off-by: NeilBrown <neilb@suse.com> >> > >> --- >> > >> drivers/md/md.c | 32 +++++++++++++++++++++----------- >> > >> drivers/md/raid5.c | 18 ++++++++++++++++++ >> > >> 2 files changed, 39 insertions(+), 11 deletions(-) >> > >> >> > >> diff --git a/drivers/md/md.c b/drivers/md/md.c >> > >> index a9f1352b3849..d1dfc9879368 100644 >> > >> --- a/drivers/md/md.c >> > >> +++ b/drivers/md/md.c >> > >> @@ -2453,10 +2453,17 @@ void md_update_sb(struct mddev *mddev, int force_change) >> > >> } >> > >> } >> > >> >> > >> - /* First make sure individual recovery_offsets are correct */ >> > >> + /* First make sure individual recovery_offsets are correct >> > >> + * curr_resync_completed can only be used during recovery. >> > >> + * During reshape/resync it might use array-addresses rather >> > >> + * that device addresses. >> > >> + */ >> > >> rdev_for_each(rdev, mddev) { >> > >> if (rdev->raid_disk >= 0 && >> > >> mddev->delta_disks >= 0 && >> > >> + test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) && >> > >> + test_bit(MD_RECOVERY_RECOVER, &mddev->recovery) && >> > >> + !test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && >> > >> !test_bit(Journal, &rdev->flags) && >> > >> !test_bit(In_sync, &rdev->flags) && >> > >> mddev->curr_resync_completed > rdev->recovery_offset) >> > >> @@ -8507,16 +8514,19 @@ void md_do_sync(struct md_thread *thread) >> > >> } else { >> > >> if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery)) >> > >> mddev->curr_resync = MaxSector; >> > >> - rcu_read_lock(); >> > >> - rdev_for_each_rcu(rdev, mddev) >> > >> - if (rdev->raid_disk >= 0 && >> > >> - mddev->delta_disks >= 0 && >> > >> - !test_bit(Journal, &rdev->flags) && >> > >> - !test_bit(Faulty, &rdev->flags) && >> > >> - !test_bit(In_sync, &rdev->flags) && >> > >> - rdev->recovery_offset < mddev->curr_resync) >> > >> - rdev->recovery_offset = mddev->curr_resync; >> > >> - rcu_read_unlock(); >> > >> + if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && >> > >> + test_bit(MD_RECOVERY_RECOVER, &mddev->recovery)) { >> > >> + rcu_read_lock(); >> > >> + rdev_for_each_rcu(rdev, mddev) >> > >> + if (rdev->raid_disk >= 0 && >> > >> + mddev->delta_disks >= 0 && >> > >> + !test_bit(Journal, &rdev->flags) && >> > >> + !test_bit(Faulty, &rdev->flags) && >> > >> + !test_bit(In_sync, &rdev->flags) && >> > >> + rdev->recovery_offset < mddev->curr_resync) >> > >> + rdev->recovery_offset = mddev->curr_resync; >> > >> + rcu_read_unlock(); >> > >> + } >> > >> } >> > >> } >> > >> skip: >> > >> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c >> > >> index a8df52130f8a..89ad79614309 100644 >> > >> --- a/drivers/md/raid5.c >> > >> +++ b/drivers/md/raid5.c >> > >> @@ -5739,6 +5739,7 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk >> > >> */ >> > >> struct r5conf *conf = mddev->private; >> > >> struct stripe_head *sh; >> > >> + struct md_rdev *rdev; >> > >> sector_t first_sector, last_sector; >> > >> int raid_disks = conf->previous_raid_disks; >> > >> int data_disks = raid_disks - conf->max_degraded; >> > >> @@ -5861,6 +5862,15 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk >> > >> return 0; >> > >> mddev->reshape_position = conf->reshape_progress; >> > >> mddev->curr_resync_completed = sector_nr; >> > >> + if (!mddev->reshape_backwards) >> > >> + /* Can update recovery_offset */ >> > >> + rdev_for_each(rdev, mddev) >> > >> + if (rdev->raid_disk >= 0 && >> > >> + !test_bit(Journal, &rdev->flags) && >> > >> + !test_bit(In_sync, &rdev->flags) && >> > >> + rdev->recovery_offset < sector_nr) >> > >> + rdev->recovery_offset = sector_nr; >> > >> + >> > >> conf->reshape_checkpoint = jiffies; >> > >> set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags); >> > >> md_wakeup_thread(mddev->thread); >> > >> @@ -5959,6 +5969,14 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk >> > >> goto ret; >> > >> mddev->reshape_position = conf->reshape_progress; >> > >> mddev->curr_resync_completed = sector_nr; >> > >> + if (!mddev->reshape_backwards) >> > >> + /* Can update recovery_offset */ >> > >> + rdev_for_each(rdev, mddev) >> > >> + if (rdev->raid_disk >= 0 && >> > >> + !test_bit(Journal, &rdev->flags) && >> > >> + !test_bit(In_sync, &rdev->flags) && >> > >> + rdev->recovery_offset < sector_nr) >> > >> + rdev->recovery_offset = sector_nr; >> > >> conf->reshape_checkpoint = jiffies; >> > >> set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags); >> > >> md_wakeup_thread(mddev->thread); >> > >> >> > >> >> > >> -- >> > > >> > > Hi, >> > > >> > > This patch has caused a regression in following IMSM RAID scenario: >> > >> > That's unfortunate. >> > >> > > >> > > mdadm --create /dev/md/imsm0 --metadata=imsm --raid-devices=4 /dev/nvme[0-3]n1 --run >> > > mdadm --create /dev/md/r10 --level=10 --size=100M --raid-devices=4 /dev/nvme[0-3]n1 --run --assume-clean >> > > >> > > MDADM_EXPERIMENTAL=1 mdadm --grow /dev/md/r10 --level=0 >> > > MDADM_EXPERIMENTAL=1 mdadm --grow /dev/md/imsm0 --raid-devices=4 >> > > >> > > cat /proc/mdstat >> > > Personalities : [raid10] [raid0] [raid6] [raid5] [raid4] >> > > md126 : active raid4 nvme0n1[3] nvme2n1[2] nvme1n1[1] >> > > 409600 blocks super external:/md127/0 level 4, 128k chunk, algorithm 0 [5/3] [UU_U_] >> > > >> > > md127 : inactive nvme3n1[3](S) nvme2n1[2](S) nvme1n1[1](S) nvme0n1[0](S) >> > > 4420 blocks super external:imsm >> > >> > Why isn't nvme3n1 in md126? Did it not get added, or did it get removed >> > for some reason? >> > If it wasn't added, raid5_start_reshape() would have failed, so I guess >> > it must have been removed?? >> >> It got removed. It's in the array during the reshape: >> >> cat /proc/mdstat >> Personalities : [raid10] [raid0] [raid6] [raid5] [raid4] [raid1] >> md126 : active raid4 nvme3n1[4] nvme0n1[3] nvme2n1[2] nvme1n1[1] >> 4194304 blocks super external:-md127/0 level 4, 128k chunk, algorithm 5 [5/4] [UU_U_] >> [===>.................] reshape = 16.6% (350476/2097152) finish=0.1min speed=175238K/sec >> >> md127 : inactive nvme3n1[3](S) nvme2n1[2](S) nvme1n1[1](S) nvme0n1[0](S) >> 4420 blocks super external:imsm >> >> > >> > > >> > > The array should have been reshaped to 4-disk RAID0 but it has failed in >> > > the middle of reshape as RAID4. >> > > >> > > Following condition is not true for above scenario: >> > > >> > >> + if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && >> > >> + test_bit(MD_RECOVERY_RECOVER, &mddev->recovery)) { >> > >> > No, it isn't meant to be. This code doesn't work reliably in the >> > reshape case. When reshaping raid5 (or raid4), reshape_request updates >> > rdev->recovery_offset. ... maybe end_reshape needs to update it too. >> > >> > If you add this patch, does it help? >> > >> > >> > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c >> > index 89ad79614309..a99ae1be9175 100644 >> > --- a/drivers/md/raid5.c >> > +++ b/drivers/md/raid5.c >> > @@ -7964,6 +7964,7 @@ static void end_reshape(struct r5conf *conf) >> > { >> > >> > if (!test_bit(MD_RECOVERY_INTR, &conf->mddev->recovery)) { >> > + struct md_rdev *rdev; >> > >> > spin_lock_irq(&conf->device_lock); >> > conf->previous_raid_disks = conf->raid_disks; >> > @@ -7971,6 +7972,12 @@ static void end_reshape(struct r5conf *conf) >> > smp_wmb(); >> > conf->reshape_progress = MaxSector; >> > conf->mddev->reshape_position = MaxSector; >> > + rdev_for_each(rdev, conf->mddev) >> > + if (rdev->raid_disk >= 0 && >> > + !test_bit(Journal, &rdev->flags) && >> > + !test_bit(In_sync, &rdev->flags)) >> > + rdev->recovery_offset = MaxSector; >> > + >> > spin_unlock_irq(&conf->device_lock); >> > wake_up(&conf->wait_for_overlap); >> > >> >> Yes, this patch fixes the problem. I have also checked that reshape position >> is really stored and reshape interrupted in the middle restarts from the >> last checkpoint. > > Neil, > can you send me a formal patch for this? or I can just fold this into your > original patch. If you can fold it in, that would be great. Otherwise it may be a while before I get to it. Thanks, NeilBrown [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 832 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [md PATCH 2/2] md: be cautious about using ->curr_resync_completed for ->recovery_offset 2017-11-09 7:55 ` NeilBrown @ 2017-11-09 15:31 ` Shaohua Li 0 siblings, 0 replies; 10+ messages in thread From: Shaohua Li @ 2017-11-09 15:31 UTC (permalink / raw) To: NeilBrown; +Cc: Tomasz Majchrzak, linux-raid On Thu, Nov 09, 2017 at 06:55:57PM +1100, Neil Brown wrote: > On Wed, Nov 08 2017, Shaohua Li wrote: > > > On Thu, Nov 02, 2017 at 09:40:50AM +0100, Tomasz Majchrzak wrote: > >> On Wed, Nov 01, 2017 at 12:13:22PM +1100, NeilBrown wrote: > >> > On Tue, Oct 31 2017, Tomasz Majchrzak wrote: > >> > > >> > > On Tue, Oct 17, 2017 at 04:18:36PM +1100, NeilBrown wrote: > >> > >> The ->recovery_offset shows how much of a non-InSync device is actually > >> > >> in sync - how much has been recoveryed. > >> > >> > >> > >> When performing a recovery, ->curr_resync and ->curr_resync_completed > >> > >> follow the device address being recovered and so can be used to update > >> > >> ->recovery_offset. > >> > >> > >> > >> When performing a reshape, ->curr_resync* might follow the device > >> > >> addresses (raid5) or might follow array addresses (raid10), so cannot > >> > >> in general be used to set ->recovery_offset. When reshaping backwards, > >> > >> ->curre_resync* measures from the *end* of the array-or-device, so is > >> > >> particularly unhelpful. > >> > >> > >> > >> So change the common code in md.c to only use ->curr_resync_complete > >> > >> for the simple recovery case, and add code to raid5.c to update > >> > >> ->recovery_offset during a forwards reshape. > >> > >> > >> > >> Signed-off-by: NeilBrown <neilb@suse.com> > >> > >> --- > >> > >> drivers/md/md.c | 32 +++++++++++++++++++++----------- > >> > >> drivers/md/raid5.c | 18 ++++++++++++++++++ > >> > >> 2 files changed, 39 insertions(+), 11 deletions(-) > >> > >> > >> > >> diff --git a/drivers/md/md.c b/drivers/md/md.c > >> > >> index a9f1352b3849..d1dfc9879368 100644 > >> > >> --- a/drivers/md/md.c > >> > >> +++ b/drivers/md/md.c > >> > >> @@ -2453,10 +2453,17 @@ void md_update_sb(struct mddev *mddev, int force_change) > >> > >> } > >> > >> } > >> > >> > >> > >> - /* First make sure individual recovery_offsets are correct */ > >> > >> + /* First make sure individual recovery_offsets are correct > >> > >> + * curr_resync_completed can only be used during recovery. > >> > >> + * During reshape/resync it might use array-addresses rather > >> > >> + * that device addresses. > >> > >> + */ > >> > >> rdev_for_each(rdev, mddev) { > >> > >> if (rdev->raid_disk >= 0 && > >> > >> mddev->delta_disks >= 0 && > >> > >> + test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) && > >> > >> + test_bit(MD_RECOVERY_RECOVER, &mddev->recovery) && > >> > >> + !test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && > >> > >> !test_bit(Journal, &rdev->flags) && > >> > >> !test_bit(In_sync, &rdev->flags) && > >> > >> mddev->curr_resync_completed > rdev->recovery_offset) > >> > >> @@ -8507,16 +8514,19 @@ void md_do_sync(struct md_thread *thread) > >> > >> } else { > >> > >> if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery)) > >> > >> mddev->curr_resync = MaxSector; > >> > >> - rcu_read_lock(); > >> > >> - rdev_for_each_rcu(rdev, mddev) > >> > >> - if (rdev->raid_disk >= 0 && > >> > >> - mddev->delta_disks >= 0 && > >> > >> - !test_bit(Journal, &rdev->flags) && > >> > >> - !test_bit(Faulty, &rdev->flags) && > >> > >> - !test_bit(In_sync, &rdev->flags) && > >> > >> - rdev->recovery_offset < mddev->curr_resync) > >> > >> - rdev->recovery_offset = mddev->curr_resync; > >> > >> - rcu_read_unlock(); > >> > >> + if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && > >> > >> + test_bit(MD_RECOVERY_RECOVER, &mddev->recovery)) { > >> > >> + rcu_read_lock(); > >> > >> + rdev_for_each_rcu(rdev, mddev) > >> > >> + if (rdev->raid_disk >= 0 && > >> > >> + mddev->delta_disks >= 0 && > >> > >> + !test_bit(Journal, &rdev->flags) && > >> > >> + !test_bit(Faulty, &rdev->flags) && > >> > >> + !test_bit(In_sync, &rdev->flags) && > >> > >> + rdev->recovery_offset < mddev->curr_resync) > >> > >> + rdev->recovery_offset = mddev->curr_resync; > >> > >> + rcu_read_unlock(); > >> > >> + } > >> > >> } > >> > >> } > >> > >> skip: > >> > >> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c > >> > >> index a8df52130f8a..89ad79614309 100644 > >> > >> --- a/drivers/md/raid5.c > >> > >> +++ b/drivers/md/raid5.c > >> > >> @@ -5739,6 +5739,7 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk > >> > >> */ > >> > >> struct r5conf *conf = mddev->private; > >> > >> struct stripe_head *sh; > >> > >> + struct md_rdev *rdev; > >> > >> sector_t first_sector, last_sector; > >> > >> int raid_disks = conf->previous_raid_disks; > >> > >> int data_disks = raid_disks - conf->max_degraded; > >> > >> @@ -5861,6 +5862,15 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk > >> > >> return 0; > >> > >> mddev->reshape_position = conf->reshape_progress; > >> > >> mddev->curr_resync_completed = sector_nr; > >> > >> + if (!mddev->reshape_backwards) > >> > >> + /* Can update recovery_offset */ > >> > >> + rdev_for_each(rdev, mddev) > >> > >> + if (rdev->raid_disk >= 0 && > >> > >> + !test_bit(Journal, &rdev->flags) && > >> > >> + !test_bit(In_sync, &rdev->flags) && > >> > >> + rdev->recovery_offset < sector_nr) > >> > >> + rdev->recovery_offset = sector_nr; > >> > >> + > >> > >> conf->reshape_checkpoint = jiffies; > >> > >> set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags); > >> > >> md_wakeup_thread(mddev->thread); > >> > >> @@ -5959,6 +5969,14 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk > >> > >> goto ret; > >> > >> mddev->reshape_position = conf->reshape_progress; > >> > >> mddev->curr_resync_completed = sector_nr; > >> > >> + if (!mddev->reshape_backwards) > >> > >> + /* Can update recovery_offset */ > >> > >> + rdev_for_each(rdev, mddev) > >> > >> + if (rdev->raid_disk >= 0 && > >> > >> + !test_bit(Journal, &rdev->flags) && > >> > >> + !test_bit(In_sync, &rdev->flags) && > >> > >> + rdev->recovery_offset < sector_nr) > >> > >> + rdev->recovery_offset = sector_nr; > >> > >> conf->reshape_checkpoint = jiffies; > >> > >> set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags); > >> > >> md_wakeup_thread(mddev->thread); > >> > >> > >> > >> > >> > >> -- > >> > > > >> > > Hi, > >> > > > >> > > This patch has caused a regression in following IMSM RAID scenario: > >> > > >> > That's unfortunate. > >> > > >> > > > >> > > mdadm --create /dev/md/imsm0 --metadata=imsm --raid-devices=4 /dev/nvme[0-3]n1 --run > >> > > mdadm --create /dev/md/r10 --level=10 --size=100M --raid-devices=4 /dev/nvme[0-3]n1 --run --assume-clean > >> > > > >> > > MDADM_EXPERIMENTAL=1 mdadm --grow /dev/md/r10 --level=0 > >> > > MDADM_EXPERIMENTAL=1 mdadm --grow /dev/md/imsm0 --raid-devices=4 > >> > > > >> > > cat /proc/mdstat > >> > > Personalities : [raid10] [raid0] [raid6] [raid5] [raid4] > >> > > md126 : active raid4 nvme0n1[3] nvme2n1[2] nvme1n1[1] > >> > > 409600 blocks super external:/md127/0 level 4, 128k chunk, algorithm 0 [5/3] [UU_U_] > >> > > > >> > > md127 : inactive nvme3n1[3](S) nvme2n1[2](S) nvme1n1[1](S) nvme0n1[0](S) > >> > > 4420 blocks super external:imsm > >> > > >> > Why isn't nvme3n1 in md126? Did it not get added, or did it get removed > >> > for some reason? > >> > If it wasn't added, raid5_start_reshape() would have failed, so I guess > >> > it must have been removed?? > >> > >> It got removed. It's in the array during the reshape: > >> > >> cat /proc/mdstat > >> Personalities : [raid10] [raid0] [raid6] [raid5] [raid4] [raid1] > >> md126 : active raid4 nvme3n1[4] nvme0n1[3] nvme2n1[2] nvme1n1[1] > >> 4194304 blocks super external:-md127/0 level 4, 128k chunk, algorithm 5 [5/4] [UU_U_] > >> [===>.................] reshape = 16.6% (350476/2097152) finish=0.1min speed=175238K/sec > >> > >> md127 : inactive nvme3n1[3](S) nvme2n1[2](S) nvme1n1[1](S) nvme0n1[0](S) > >> 4420 blocks super external:imsm > >> > >> > > >> > > > >> > > The array should have been reshaped to 4-disk RAID0 but it has failed in > >> > > the middle of reshape as RAID4. > >> > > > >> > > Following condition is not true for above scenario: > >> > > > >> > >> + if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && > >> > >> + test_bit(MD_RECOVERY_RECOVER, &mddev->recovery)) { > >> > > >> > No, it isn't meant to be. This code doesn't work reliably in the > >> > reshape case. When reshaping raid5 (or raid4), reshape_request updates > >> > rdev->recovery_offset. ... maybe end_reshape needs to update it too. > >> > > >> > If you add this patch, does it help? > >> > > >> > > >> > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c > >> > index 89ad79614309..a99ae1be9175 100644 > >> > --- a/drivers/md/raid5.c > >> > +++ b/drivers/md/raid5.c > >> > @@ -7964,6 +7964,7 @@ static void end_reshape(struct r5conf *conf) > >> > { > >> > > >> > if (!test_bit(MD_RECOVERY_INTR, &conf->mddev->recovery)) { > >> > + struct md_rdev *rdev; > >> > > >> > spin_lock_irq(&conf->device_lock); > >> > conf->previous_raid_disks = conf->raid_disks; > >> > @@ -7971,6 +7972,12 @@ static void end_reshape(struct r5conf *conf) > >> > smp_wmb(); > >> > conf->reshape_progress = MaxSector; > >> > conf->mddev->reshape_position = MaxSector; > >> > + rdev_for_each(rdev, conf->mddev) > >> > + if (rdev->raid_disk >= 0 && > >> > + !test_bit(Journal, &rdev->flags) && > >> > + !test_bit(In_sync, &rdev->flags)) > >> > + rdev->recovery_offset = MaxSector; > >> > + > >> > spin_unlock_irq(&conf->device_lock); > >> > wake_up(&conf->wait_for_overlap); > >> > > >> > >> Yes, this patch fixes the problem. I have also checked that reshape position > >> is really stored and reshape interrupted in the middle restarts from the > >> last checkpoint. > > > > Neil, > > can you send me a formal patch for this? or I can just fold this into your > > original patch. > > If you can fold it in, that would be great. Otherwise it may be a while > before I get to it. Ok, done. ^ permalink raw reply [flat|nested] 10+ messages in thread
* [md PATCH 1/2] raid5: Set R5_Expanded on parity devices as well as data. 2017-10-17 5:18 [md PATCH 0/2] Fix problems with reshape on degraded raid5/6 NeilBrown 2017-10-17 5:18 ` [md PATCH 2/2] md: be cautious about using ->curr_resync_completed for ->recovery_offset NeilBrown @ 2017-10-17 5:18 ` NeilBrown 2017-10-19 3:06 ` [md PATCH 0/2] Fix problems with reshape on degraded raid5/6 Shaohua Li 2 siblings, 0 replies; 10+ messages in thread From: NeilBrown @ 2017-10-17 5:18 UTC (permalink / raw) To: Shaohua Li; +Cc: linux-raid When reshaping a fully degraded raid5/raid6 to a larger nubmer of devices, the new device(s) are not in-sync and so that can make the newly grown stripe appear to be "failed". To avoid this, we set the R5_Expanded flag to say "Even though this device is not fully in-sync, this block is safe so don't treat the device as failed for this stripe". This flag is set for data devices, not not for parity devices. Consequently, if you have a RAID6 with two devices that are partly recovered and a spare, and start a reshape to include the spare, then when the reshape gets past the point where the recovery was up to, it will think the stripes are failed and will get into an infinite loop, failing to make progress. So when contructing parity on an EXPAND_READY stripe, set R5_Expanded. Reported-by: Curt <lightspd@gmail.com> Signed-off-by: NeilBrown <neilb@suse.com> --- drivers/md/raid5.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index e070e5c68801..a8df52130f8a 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -1818,8 +1818,11 @@ static void ops_complete_reconstruct(void *stripe_head_ref) struct r5dev *dev = &sh->dev[i]; if (dev->written || i == pd_idx || i == qd_idx) { - if (!discard && !test_bit(R5_SkipCopy, &dev->flags)) + if (!discard && !test_bit(R5_SkipCopy, &dev->flags)) { set_bit(R5_UPTODATE, &dev->flags); + if (test_bit(STRIPE_EXPAND_READY, &sh->state)) + set_bit(R5_Expanded, &dev->flags); + } if (fua) set_bit(R5_WantFUA, &dev->flags); if (sync) ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [md PATCH 0/2] Fix problems with reshape on degraded raid5/6 2017-10-17 5:18 [md PATCH 0/2] Fix problems with reshape on degraded raid5/6 NeilBrown 2017-10-17 5:18 ` [md PATCH 2/2] md: be cautious about using ->curr_resync_completed for ->recovery_offset NeilBrown 2017-10-17 5:18 ` [md PATCH 1/2] raid5: Set R5_Expanded on parity devices as well as data NeilBrown @ 2017-10-19 3:06 ` Shaohua Li 2 siblings, 0 replies; 10+ messages in thread From: Shaohua Li @ 2017-10-19 3:06 UTC (permalink / raw) To: NeilBrown; +Cc: linux-raid On Tue, Oct 17, 2017 at 04:18:36PM +1100, Neil Brown wrote: > Reshape on a degraded array is supposed to work, but it > obviously hasn't been tested. > These two patches make at least one test case (reported by > Curt <lightspd@gmail.com>) to work. Applied, thanks! ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2017-11-09 15:31 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-10-17 5:18 [md PATCH 0/2] Fix problems with reshape on degraded raid5/6 NeilBrown 2017-10-17 5:18 ` [md PATCH 2/2] md: be cautious about using ->curr_resync_completed for ->recovery_offset NeilBrown 2017-10-31 14:42 ` Tomasz Majchrzak 2017-11-01 1:13 ` NeilBrown 2017-11-02 8:40 ` Tomasz Majchrzak 2017-11-09 0:08 ` Shaohua Li 2017-11-09 7:55 ` NeilBrown 2017-11-09 15:31 ` Shaohua Li 2017-10-17 5:18 ` [md PATCH 1/2] raid5: Set R5_Expanded on parity devices as well as data NeilBrown 2017-10-19 3:06 ` [md PATCH 0/2] Fix problems with reshape on degraded raid5/6 Shaohua Li
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).