[MD PATCH 1/1] Don't add nr_pending for resync requests
From: Xiao Ni @ 2017-04-24 8:47 UTC
To: linux-raid; +Cc: shli, ncroxon

In the new barrier code, raise_barrier() waits until conf->nr_pending[idx]
drops to zero. Once all of its conditions are satisfied, the resync request
can be handled, but raise_barrier() then increments conf->nr_pending[idx]
again. The next resync request that hits the same bucket idx has to wait for
the previously submitted resync request, so resync/recovery performance is
degraded.

I did some tests:

1. Before removing the code, the resync performance is:

Device:  rrqm/s   wrqm/s   r/s     w/s     rMB/s   wMB/s   avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdc      0.00     0.00     0.00    164.00  0.00    10.13   126.46    0.95      5.80   0.00     5.80     5.82   95.40
sdb      0.00     0.00     163.00  2.00    10.19   0.00    126.47    0.04      0.25   0.18     5.50     0.25   4.10

2. After removing the code, the resync performance is:

Device:  rrqm/s   wrqm/s   r/s     w/s     rMB/s   wMB/s   avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdc      0.00     2164.00  0.00    751.00  0.00    182.06  496.49    7.49      9.95   0.00     9.95     1.33   100.00
sdb      2162.00  0.00     752.00  0.00    182.25  0.00    496.34    0.56      0.74   0.74     0.00     0.65   48.80

Signed-off-by: Xiao Ni <xni@redhat.com>
---
 drivers/md/raid1.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index a34f587..48567ea 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -869,7 +869,6 @@ static void raise_barrier(struct r1conf *conf, sector_t sector_nr)
 			    atomic_read(&conf->barrier[idx]) < RESYNC_DEPTH,
 			    conf->resync_lock);
 
-	atomic_inc(&conf->nr_pending[idx]);
 	spin_unlock_irq(&conf->resync_lock);
 }
 
@@ -880,7 +879,6 @@ static void lower_barrier(struct r1conf *conf, sector_t sector_nr)
 	BUG_ON(atomic_read(&conf->barrier[idx]) <= 0);
 
 	atomic_dec(&conf->barrier[idx]);
-	atomic_dec(&conf->nr_pending[idx]);
 	wake_up(&conf->wait_barrier);
 }
 
-- 
2.7.4
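For readers without the barrier-bucket code in front of them, the following is
a simplified sketch of the raise_barrier() path under discussion. It is not
the exact drivers/md/raid1.c source: the nr_waiting/array_frozen handling and
the memory barriers are elided, and only the parts relevant to the
serialization described in the commit message are shown.

        /* Simplified sketch, not the exact drivers/md/raid1.c source. */
        static void raise_barrier(struct r1conf *conf, sector_t sector_nr)
        {
        	int idx = sector_to_idx(sector_nr);

        	spin_lock_irq(&conf->resync_lock);

        	/* ... wait for nr_waiting[idx] to drain, bump barrier[idx] ... */

        	/* Wait until pending I/O in this bucket has completed. */
        	wait_event_lock_irq(conf->wait_barrier,
        			    !atomic_read(&conf->nr_pending[idx]) &&
        			    atomic_read(&conf->barrier[idx]) < RESYNC_DEPTH,
        			    conf->resync_lock);

        	/*
        	 * This increment is what the patch removes: the next resync
        	 * request that maps to the same bucket idx sees
        	 * nr_pending[idx] != 0 and blocks until lower_barrier() runs
        	 * for the current one, serializing resync within a bucket.
        	 */
        	atomic_inc(&conf->nr_pending[idx]);
        	spin_unlock_irq(&conf->resync_lock);
        }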
Re: [MD PATCH 1/1] Don't add nr_pending for resync requests
From: Shaohua Li @ 2017-04-25 17:10 UTC
To: Xiao Ni; +Cc: linux-raid, ncroxon, neilb, colyli

Hi,
Cc: Coly and Neil.

On Mon, Apr 24, 2017 at 04:47:15PM +0800, Xiao Ni wrote:
> In the new barrier code, raise_barrier() waits until conf->nr_pending[idx]
> drops to zero. Once all of its conditions are satisfied, the resync request
> can be handled, but raise_barrier() then increments conf->nr_pending[idx]
> again. The next resync request that hits the same bucket idx has to wait for
> the previously submitted resync request, so resync/recovery performance is
> degraded.
>
> I did some tests:
>
> 1. Before removing the code, the resync performance is:
>
> Device:  rrqm/s   wrqm/s   r/s     w/s     rMB/s   wMB/s   avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
> sdc      0.00     0.00     0.00    164.00  0.00    10.13   126.46    0.95      5.80   0.00     5.80     5.82   95.40
> sdb      0.00     0.00     163.00  2.00    10.19   0.00    126.47    0.04      0.25   0.18     5.50     0.25   4.10
>
> 2. After removing the code, the resync performance is:
>
> Device:  rrqm/s   wrqm/s   r/s     w/s     rMB/s   wMB/s   avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
> sdc      0.00     2164.00  0.00    751.00  0.00    182.06  496.49    7.49      9.95   0.00     9.95     1.33   100.00
> sdb      2162.00  0.00     752.00  0.00    182.25  0.00    496.34    0.56      0.74   0.74     0.00     0.65   48.80

A big resync regression, thanks for checking it! Your analysis makes sense to
me, but the patch doesn't. freeze_array() depends on nr_pending to wait for
all normal I/O and resync I/O; this change makes the wait for resync
disappear.

I think the problem is that we are abusing nr_pending here. freeze_array()
waits for a condition which doesn't need to be maintained by nr_pending. It
makes sense for raise_barrier() to wait for nr_pending caused by normal I/O,
but it doesn't make sense for it to wait for other raise_barrier() calls.

We could add another variable, increase it in raise_barrier(), decrease it in
lower_barrier(), and let freeze_array() check the new variable. That should
fix the problem without exposing a race condition in freeze_array().

Thanks,
Shaohua

> Signed-off-by: Xiao Ni <xni@redhat.com>
> ---
>  drivers/md/raid1.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index a34f587..48567ea 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -869,7 +869,6 @@ static void raise_barrier(struct r1conf *conf, sector_t sector_nr)
>  			    atomic_read(&conf->barrier[idx]) < RESYNC_DEPTH,
>  			    conf->resync_lock);
>  
> -	atomic_inc(&conf->nr_pending[idx]);
>  	spin_unlock_irq(&conf->resync_lock);
>  }
>  
> @@ -880,7 +879,6 @@ static void lower_barrier(struct r1conf *conf, sector_t sector_nr)
>  	BUG_ON(atomic_read(&conf->barrier[idx]) <= 0);
>  
>  	atomic_dec(&conf->barrier[idx]);
> -	atomic_dec(&conf->nr_pending[idx]);
>  	wake_up(&conf->wait_barrier);
>  }
>  
> -- 
> 2.7.4
Re: [MD PATCH 1/1] Don't add nr_pending for resync requests
From: Coly Li @ 2017-04-26 8:40 UTC
To: Shaohua Li, Xiao Ni; +Cc: linux-raid, ncroxon, neilb

On 2017/4/26 1:10 AM, Shaohua Li wrote:
> Hi,
> Cc: Coly and Neil.
>
> On Mon, Apr 24, 2017 at 04:47:15PM +0800, Xiao Ni wrote:
>> In the new barrier code, raise_barrier() waits until conf->nr_pending[idx]
>> drops to zero. Once all of its conditions are satisfied, the resync request
>> can be handled, but raise_barrier() then increments conf->nr_pending[idx]
>> again. The next resync request that hits the same bucket idx has to wait for
>> the previously submitted resync request, so resync/recovery performance is
>> degraded.
>>
>> I did some tests:
>>
>> 1. Before removing the code, the resync performance is:
>>
>> Device:  rrqm/s   wrqm/s   r/s     w/s     rMB/s   wMB/s   avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
>> sdc      0.00     0.00     0.00    164.00  0.00    10.13   126.46    0.95      5.80   0.00     5.80     5.82   95.40
>> sdb      0.00     0.00     163.00  2.00    10.19   0.00    126.47    0.04      0.25   0.18     5.50     0.25   4.10
>>
>> 2. After removing the code, the resync performance is:
>>
>> Device:  rrqm/s   wrqm/s   r/s     w/s     rMB/s   wMB/s   avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
>> sdc      0.00     2164.00  0.00    751.00  0.00    182.06  496.49    7.49      9.95   0.00     9.95     1.33   100.00
>> sdb      2162.00  0.00     752.00  0.00    182.25  0.00    496.34    0.56      0.74   0.74     0.00     0.65   48.80
>
> A big resync regression, thanks for checking it! Your analysis makes sense to
> me, but the patch doesn't. freeze_array() depends on nr_pending to wait for
> all normal I/O and resync I/O; this change makes the wait for resync
> disappear.
>
> I think the problem is that we are abusing nr_pending here. freeze_array()
> waits for a condition which doesn't need to be maintained by nr_pending. It
> makes sense for raise_barrier() to wait for nr_pending caused by normal I/O,
> but it doesn't make sense for it to wait for other raise_barrier() calls.
>
> We could add another variable, increase it in raise_barrier(), decrease it in
> lower_barrier(), and let freeze_array() check the new variable. That should
> fix the problem without exposing a race condition in freeze_array().

Shaohua,

Adding another variable to record pending resync I/O is a good idea. We can
do something like this:

@@ -869,7 +869,7 @@ static void raise_barrier()

-	atomic_inc(&conf->nr_pending[idx]);
+	atomic_inc(&conf->nr_resync_pending[idx]);
 	spin_unlock_irq(&conf->resync_lock);
 }

@@ -880,7 +880,7 @@ static void lower_barrier()

 	atomic_dec(&conf->barrier[idx]);
-	atomic_dec(&conf->nr_pending[idx]);
+	atomic_dec(&conf->nr_resync_pending[idx]);
 	wake_up(&conf->wait_barrier);
 }

@@ -1018,7 +1018,8 @@ static int get_unqueued_pending()
 	int idx, ret;

 	for (ret = 0, idx = 0; idx < BARRIER_BUCKETS_NR; idx++)
-		ret += atomic_read(&conf->nr_pending[idx]) -
+		ret += atomic_read(&conf->nr_pending[idx]) +
+			atomic_read(&conf->nr_resync_pending[idx]) -
 			atomic_read(&conf->nr_queued[idx]);

 	return ret;

Xiao,

Nice catch! From your email, it seems resync requests on a raid1 device can be
non-linear; otherwise different resync I/Os raising the barrier would not hit
the same barrier bucket. This is something I didn't know. Thank you!

The reason to increase conf->nr_pending[idx] comes from commit 34e97f17
("md/raid1: count resync requests in nr_pending."): normal I/O and resync I/O
can happen at the same time on the same raid1 device, and both of them can be
queued by reschedule_retry(), so raise_barrier() needs to increase
conf->nr_pending[idx] to make sure the pending I/O count equals the queued I/O
count when the array is frozen. We cannot simply drop the
conf->nr_pending[idx] accounting from raise_barrier()/lower_barrier().

Would you like to compose a patch following Shaohua's suggestion?

Thanks.

Coly Li
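To make the dependency described above concrete, here is a simplified sketch
of how freeze_array() relies on the counters that get_unqueued_pending() sums.
It is abbreviated and not the exact kernel source; the flushing of the retry
list while waiting is elided.

        /* Simplified sketch, not the exact drivers/md/raid1.c source. */
        static void freeze_array(struct r1conf *conf, int extra)
        {
        	spin_lock_irq(&conf->resync_lock);
        	conf->array_frozen = 1;
        	/*
        	 * Wait until every pending I/O that is not already queued for
        	 * retry has completed.  If in-flight resync requests are not
        	 * counted somewhere that get_unqueued_pending() can see, this
        	 * wait could finish while resync I/O is still outstanding.
        	 */
        	wait_event_lock_irq(conf->wait_barrier,
        			    get_unqueued_pending(conf) == extra,
        			    conf->resync_lock);
        	spin_unlock_irq(&conf->resync_lock);
        }

This is why the accounting cannot simply be deleted: either nr_pending[idx]
or a new counter such as the proposed nr_resync_pending[idx] must keep the
sum seen by get_unqueued_pending() consistent.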
Re: [MD PATCH 1/1] Don't add nr_pending for resync requests
From: Xiao Ni @ 2017-04-26 9:22 UTC
To: Coly Li; +Cc: Shaohua Li, linux-raid, ncroxon, neilb

----- Original Message -----
> From: "Coly Li" <colyli@suse.de>
> To: "Shaohua Li" <shli@kernel.org>, "Xiao Ni" <xni@redhat.com>
> Cc: linux-raid@vger.kernel.org, ncroxon@redhat.com, neilb@suse.com
> Sent: Wednesday, April 26, 2017 4:40:20 PM
> Subject: Re: [MD PATCH 1/1] Don't add nr_pending for resync requests
>
> On 2017/4/26 1:10 AM, Shaohua Li wrote:
> > Hi,
> > Cc: Coly and Neil.
> >
> > On Mon, Apr 24, 2017 at 04:47:15PM +0800, Xiao Ni wrote:
> >> In the new barrier code, raise_barrier() waits until conf->nr_pending[idx]
> >> drops to zero. Once all of its conditions are satisfied, the resync
> >> request can be handled, but raise_barrier() then increments
> >> conf->nr_pending[idx] again. The next resync request that hits the same
> >> bucket idx has to wait for the previously submitted resync request, so
> >> resync/recovery performance is degraded.
> >>
> >> I did some tests:
> >>
> >> 1. Before removing the code, the resync performance is:
> >>
> >> Device:  rrqm/s   wrqm/s   r/s     w/s     rMB/s   wMB/s   avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
> >> sdc      0.00     0.00     0.00    164.00  0.00    10.13   126.46    0.95      5.80   0.00     5.80     5.82   95.40
> >> sdb      0.00     0.00     163.00  2.00    10.19   0.00    126.47    0.04      0.25   0.18     5.50     0.25   4.10
> >>
> >> 2. After removing the code, the resync performance is:
> >>
> >> Device:  rrqm/s   wrqm/s   r/s     w/s     rMB/s   wMB/s   avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
> >> sdc      0.00     2164.00  0.00    751.00  0.00    182.06  496.49    7.49      9.95   0.00     9.95     1.33   100.00
> >> sdb      2162.00  0.00     752.00  0.00    182.25  0.00    496.34    0.56      0.74   0.74     0.00     0.65   48.80
> >
> > A big resync regression, thanks for checking it! Your analysis makes sense
> > to me, but the patch doesn't. freeze_array() depends on nr_pending to wait
> > for all normal I/O and resync I/O; this change makes the wait for resync
> > disappear.
> >
> > I think the problem is that we are abusing nr_pending here. freeze_array()
> > waits for a condition which doesn't need to be maintained by nr_pending.
> > It makes sense for raise_barrier() to wait for nr_pending caused by normal
> > I/O, but it doesn't make sense for it to wait for other raise_barrier()
> > calls.
> >
> > We could add another variable, increase it in raise_barrier(), decrease it
> > in lower_barrier(), and let freeze_array() check the new variable. That
> > should fix the problem without exposing a race condition in freeze_array().
>
> Shaohua,
>
> Adding another variable to record pending resync I/O is a good idea. We can
> do something like this:
>
> @@ -869,7 +869,7 @@ static void raise_barrier()
>
> -	atomic_inc(&conf->nr_pending[idx]);
> +	atomic_inc(&conf->nr_resync_pending[idx]);
>  	spin_unlock_irq(&conf->resync_lock);
>  }
>
> @@ -880,7 +880,7 @@ static void lower_barrier()
>
>  	atomic_dec(&conf->barrier[idx]);
> -	atomic_dec(&conf->nr_pending[idx]);
> +	atomic_dec(&conf->nr_resync_pending[idx]);
>  	wake_up(&conf->wait_barrier);
>  }
>
> @@ -1018,7 +1018,8 @@ static int get_unqueued_pending()
>  	int idx, ret;
>
>  	for (ret = 0, idx = 0; idx < BARRIER_BUCKETS_NR; idx++)
> -		ret += atomic_read(&conf->nr_pending[idx]) -
> +		ret += atomic_read(&conf->nr_pending[idx]) +
> +			atomic_read(&conf->nr_resync_pending[idx]) -
>  			atomic_read(&conf->nr_queued[idx]);
>
>  	return ret;
>
> Xiao,
>
> Nice catch! From your email, it seems resync requests on a raid1 device can
> be non-linear; otherwise different resync I/Os raising the barrier would not
> hit the same barrier bucket. This is something I didn't know. Thank you!

Hi Coly and Shaohua,

Sorry, I don't understand "non-linear" here. What does that mean? In
raid1_sync_request(), each call handles RESYNC_PAGES pages, which is 64KB,
while the bucket unit is 64MB. So there is a chance that different resync
requests hit the same bucket idx. Am I right?

> The reason to increase conf->nr_pending[idx] comes from commit 34e97f17
> ("md/raid1: count resync requests in nr_pending."): normal I/O and resync
> I/O can happen at the same time on the same raid1 device, and both of them
> can be queued by reschedule_retry(), so raise_barrier() needs to increase
> conf->nr_pending[idx] to make sure the pending I/O count equals the queued
> I/O count when the array is frozen. We cannot simply drop the
> conf->nr_pending[idx] accounting from raise_barrier()/lower_barrier().
>
> Would you like to compose a patch following Shaohua's suggestion?

Sure, I'll re-send the patch later. At first I didn't understand the
relationship between nr_pending and nr_queued: if I just remove the increment
of nr_pending for resync requests, there can be a deadlock in freeze_array().
Thanks for the explanation.

Regards
Xiao
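As a rough illustration of the arithmetic behind this question, here is a
small standalone sketch. The constants restate the 64MB barrier unit and 64KB
resync window mentioned above; the macro names are illustrative, not the
kernel's own, so treat them as assumptions.

        /*
         * Illustrative only: shows how many consecutive resync requests fall
         * into one barrier bucket and, before the fix, serialize on
         * nr_pending[idx].
         */
        #include <stdio.h>

        #define BARRIER_UNIT_BYTES	(64ULL * 1024 * 1024)	/* one bucket spans 64MB */
        #define RESYNC_BLOCK_BYTES	(64ULL * 1024)		/* one resync request covers 64KB */

        int main(void)
        {
        	unsigned long long per_bucket =
        		BARRIER_UNIT_BYTES / RESYNC_BLOCK_BYTES;

        	/* prints 1024: every 64MB window issues ~1024 resync requests
        	 * that all map to the same bucket idx */
        	printf("resync requests per barrier bucket: %llu\n", per_bucket);
        	return 0;
        }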