* Superblock of raid5 log can't be updated when array stoped @ 2016-10-15 2:19 liuzhengyuan 2016-10-18 0:28 ` Shaohua Li 0 siblings, 1 reply; 4+ messages in thread From: liuzhengyuan @ 2016-10-15 2:19 UTC (permalink / raw) To: shli; +Cc: Song Liu, linux-raid Hi, Shaohua. when we stop raid5 array with "mdadm -S" or reboot the system, md module will call raid5_quiesce and r5l_quiesce to do some clean work. Some code of r5l_quiesce was pasted bellow. /* make sure r5l_write_super_and_discard_space exits */ mddev = log->rdev->mddev; wake_up(&mddev->sb_wait); r5l_wake_reclaim(log, -1L); md_unregister_thread(&log->reclaim_thread); r5l_do_reclaim(log); + md_update_sb(mddev, 1); It will reclaim all used space of log and call r5l_write_super to reset rdev->journal_tail to log->next_checkpoint . However, new rdev->journal_tail would not be written to journal device for persistent because journal device may not support discard operation or due to mddev_trylock fail (this trylock should always get failed since raid5_quiesce was called with reconfig_mutex hold, isn't it?). As a result, it will take a long time to recovery the log when the arrary was restarted. Should r5l_quiesce call md_update_sb directly to guarantee superblock update? ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Superblock of raid5 log can't be updated when array stoped 2016-10-15 2:19 Superblock of raid5 log can't be updated when array stoped liuzhengyuan @ 2016-10-18 0:28 ` Shaohua Li 2016-10-18 5:51 ` Zhengyuan Liu 0 siblings, 1 reply; 4+ messages in thread From: Shaohua Li @ 2016-10-18 0:28 UTC (permalink / raw) To: liuzhengyuan; +Cc: shli, Song Liu, linux-raid On Sat, Oct 15, 2016 at 10:19:36AM +0800, liuzhengyuan wrote: > Hi, Shaohua. > > when we stop raid5 array with "mdadm -S" or reboot the system, md module will > call raid5_quiesce and r5l_quiesce to do some clean work. Some code of r5l_quiesce > was pasted bellow. > > /* make sure r5l_write_super_and_discard_space exits */ > mddev = log->rdev->mddev; > wake_up(&mddev->sb_wait); > r5l_wake_reclaim(log, -1L); > md_unregister_thread(&log->reclaim_thread); > r5l_do_reclaim(log); > + md_update_sb(mddev, 1); > > It will reclaim all used space of log and call r5l_write_super to reset rdev->journal_tail > to log->next_checkpoint . However, new rdev->journal_tail would not be written to > journal device for persistent because journal device may not support discard operation > or due to mddev_trylock fail (this trylock should always get failed since raid5_quiesce > was called with reconfig_mutex hold, isn't it?). As a result, it will take a long time to > recovery the log when the arrary was restarted. Should r5l_quiesce call md_update_sb > directly to guarantee superblock update? Yep, that's problem here. Unfortunately we can't call md_update_sb here, because we might not hold the mddev lock. I think we should call it at do_md_stop. Thanks, Shaohua ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Superblock of raid5 log can't be updated when array stoped 2016-10-18 0:28 ` Shaohua Li @ 2016-10-18 5:51 ` Zhengyuan Liu 2016-10-21 22:43 ` Shaohua Li 0 siblings, 1 reply; 4+ messages in thread From: Zhengyuan Liu @ 2016-10-18 5:51 UTC (permalink / raw) To: Shaohua Li; +Cc: shli, Song Liu, linux-raid If that's the problem, I think there is another problem about next_checkpoint initialization. No initial operation was done to this field when we loaded/recovery the log , it got assignment only when IO to raid disk was finished. So r5l_quiesce may use wrong next_checkpoint to reclaim log space, that would confuse log recovery. Bellow patch may help the expression. diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c index 1b1ab4a..998ea00 100644 --- a/drivers/md/raid5-cache.c +++ b/drivers/md/raid5-cache.c @@ -1096,6 +1096,8 @@ static int r5l_recovery_log(struct r5l_log *log) log->seq = ctx.seq + 11; log->log_start = r5l_ring_add(log, ctx.pos, BLOCK_SECTORS); r5l_write_super(log, ctx.pos); + log->last_checkpoint = ctx.pos; + log->next_checkpoint = ctx.pos; } else { log->log_start = ctx.pos; log->seq = ctx.seq; @@ -1168,6 +1170,7 @@ create: if (log->max_free_space > RECLAIM_MAX_FREE_SPACE) log->max_free_space = RECLAIM_MAX_FREE_SPACE; log->last_checkpoint = cp; + log->next_checkpoint = cp; __free_page(page); Thanks, --Zhengyuan ------------------ Original ------------------ From: "Shaohua Li"<shli@kernel.org>; Date: Tue, Oct 18, 2016 08:28 AM To: "Zhengyuan Liu"<liuzhengyuan@kylinos.cn>; Cc: "shli"<shli@fb.com>; "Song Liu"<songliubraving@fb.com>; "linux-raid"<linux-raid@vger.kernel.org>; Subject: Re: Superblock of raid5 log can't be updated when array stoped On Sat, Oct 15, 2016 at 10:19:36AM +0800, liuzhengyuan wrote: > Hi, Shaohua. > > when we stop raid5 array with "mdadm -S" or reboot the system, md module will > call raid5_quiesce and r5l_quiesce to do some clean work. Some code of r5l_quiesce > was pasted bellow. > > /* make sure r5l_write_super_and_discard_space exits */ > mddev = log->rdev->mddev; > wake_up(&mddev->sb_wait); > r5l_wake_reclaim(log, -1L); > md_unregister_thread(&log->reclaim_thread); > r5l_do_reclaim(log); > + md_update_sb(mddev, 1); > > It will reclaim all used space of log and call r5l_write_super to reset rdev->journal_tail > to log->next_checkpoint . However, new rdev->journal_tail would not be written to > journal device for persistent because journal device may not support discard operation > or due to mddev_trylock fail (this trylock should always get failed since raid5_quiesce > was called with reconfig_mutex hold, isn't it?). As a result, it will take a long time to > recovery the log when the arrary was restarted. Should r5l_quiesce call md_update_sb > directly to guarantee superblock update? Yep, that's problem here. Unfortunately we can't call md_update_sb here, because we might not hold the mddev lock. I think we should call it at do_md_stop. Thanks, Shaohua ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: Superblock of raid5 log can't be updated when array stoped 2016-10-18 5:51 ` Zhengyuan Liu @ 2016-10-21 22:43 ` Shaohua Li 0 siblings, 0 replies; 4+ messages in thread From: Shaohua Li @ 2016-10-21 22:43 UTC (permalink / raw) To: Zhengyuan Liu; +Cc: shli, Song Liu, linux-raid On Tue, Oct 18, 2016 at 01:51:57PM +0800, Zhengyuan Liu wrote: > If that's the problem, I think there is another problem about next_checkpoint initialization. > No initial operation was done to this field when we loaded/recovery the log , it got > assignment only when IO to raid disk was finished. So r5l_quiesce may use wrong > next_checkpoint to reclaim log space, that would confuse log recovery. Bellow patch > may help the expression. Good catch. This doesn't confuse log recovery but maks reclaimable space calculation confused. Could you please send a formal patch? Thanks, Shaohua > diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c > index 1b1ab4a..998ea00 100644 > --- a/drivers/md/raid5-cache.c > +++ b/drivers/md/raid5-cache.c > @@ -1096,6 +1096,8 @@ static int r5l_recovery_log(struct r5l_log *log) > log->seq = ctx.seq + 11; > log->log_start = r5l_ring_add(log, ctx.pos, BLOCK_SECTORS); > r5l_write_super(log, ctx.pos); > + log->last_checkpoint = ctx.pos; > + log->next_checkpoint = ctx.pos; > } else { > log->log_start = ctx.pos; > log->seq = ctx.seq; > @@ -1168,6 +1170,7 @@ create: > if (log->max_free_space > RECLAIM_MAX_FREE_SPACE) > log->max_free_space = RECLAIM_MAX_FREE_SPACE; > log->last_checkpoint = cp; > + log->next_checkpoint = cp; > > __free_page(page); > > Thanks, > --Zhengyuan > > ------------------ Original ------------------ > From: "Shaohua Li"<shli@kernel.org>; > Date: Tue, Oct 18, 2016 08:28 AM > To: "Zhengyuan Liu"<liuzhengyuan@kylinos.cn>; > Cc: "shli"<shli@fb.com>; "Song Liu"<songliubraving@fb.com>; "linux-raid"<linux-raid@vger.kernel.org>; > Subject: Re: Superblock of raid5 log can't be updated when array stoped > > On Sat, Oct 15, 2016 at 10:19:36AM +0800, liuzhengyuan wrote: > > Hi, Shaohua. > > > > when we stop raid5 array with "mdadm -S" or reboot the system, md module will > > call raid5_quiesce and r5l_quiesce to do some clean work. Some code of r5l_quiesce > > was pasted bellow. > > > > /* make sure r5l_write_super_and_discard_space exits */ > > mddev = log->rdev->mddev; > > wake_up(&mddev->sb_wait); > > r5l_wake_reclaim(log, -1L); > > md_unregister_thread(&log->reclaim_thread); > > r5l_do_reclaim(log); > > + md_update_sb(mddev, 1); > > > > It will reclaim all used space of log and call r5l_write_super to reset rdev->journal_tail > > to log->next_checkpoint . However, new rdev->journal_tail would not be written to > > journal device for persistent because journal device may not support discard operation > > or due to mddev_trylock fail (this trylock should always get failed since raid5_quiesce > > was called with reconfig_mutex hold, isn't it?). As a result, it will take a long time to > > recovery the log when the arrary was restarted. Should r5l_quiesce call md_update_sb > > directly to guarantee superblock update? > > Yep, that's problem here. Unfortunately we can't call md_update_sb here, > because we might not hold the mddev lock. I think we should call it at do_md_stop. > > Thanks, > Shaohua ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2016-10-21 22:43 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-10-15 2:19 Superblock of raid5 log can't be updated when array stoped liuzhengyuan 2016-10-18 0:28 ` Shaohua Li 2016-10-18 5:51 ` Zhengyuan Liu 2016-10-21 22:43 ` Shaohua Li
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).