linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] raid1: prevent freeze_array/wait_all_barriers deadlock
@ 2017-10-17 12:17 Nate Dailey
  2017-10-17 16:31 ` Coly Li
  0 siblings, 1 reply; 3+ messages in thread
From: Nate Dailey @ 2017-10-17 12:17 UTC (permalink / raw)
  To: linux-raid; +Cc: Nate Dailey

If freeze_array is attempted in the middle of close_sync/
wait_all_barriers, deadlock can occur.

freeze_array will wait for nr_pending and nr_queued to line up.
wait_all_barriers increments nr_pending for each barrier bucket, one
at a time, but doesn't actually issue IO that could be counted in
nr_queued. So freeze_array is blocked until wait_all_barriers
completes and allow_all_barriers runs. At the same time, when
_wait_barrier sees array_frozen == 1, it stops and waits for
freeze_array to complete.

Prevent the deadlock by making close_sync call _wait_barrier and
_allow_barrier for one bucket at a time, instead of deferring the
_allow_barrier calls until after all _wait_barriers are complete.

Signed-off-by: Nate Dailey <nate.dailey@stratus.com>
---
 drivers/md/raid1.c | 24 ++++++------------------
 1 file changed, 6 insertions(+), 18 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index f3f3e40dc9d8..e4e8f9e565b7 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -990,14 +990,6 @@ static void wait_barrier(struct r1conf *conf, sector_t sector_nr)
 	_wait_barrier(conf, idx);
 }
 
-static void wait_all_barriers(struct r1conf *conf)
-{
-	int idx;
-
-	for (idx = 0; idx < BARRIER_BUCKETS_NR; idx++)
-		_wait_barrier(conf, idx);
-}
-
 static void _allow_barrier(struct r1conf *conf, int idx)
 {
 	atomic_dec(&conf->nr_pending[idx]);
@@ -1011,14 +1003,6 @@ static void allow_barrier(struct r1conf *conf, sector_t sector_nr)
 	_allow_barrier(conf, idx);
 }
 
-static void allow_all_barriers(struct r1conf *conf)
-{
-	int idx;
-
-	for (idx = 0; idx < BARRIER_BUCKETS_NR; idx++)
-		_allow_barrier(conf, idx);
-}
-
 /* conf->resync_lock should be held */
 static int get_unqueued_pending(struct r1conf *conf)
 {
@@ -1654,8 +1638,12 @@ static void print_conf(struct r1conf *conf)
 
 static void close_sync(struct r1conf *conf)
 {
-	wait_all_barriers(conf);
-	allow_all_barriers(conf);
+	int idx;
+
+	for (idx = 0; idx < BARRIER_BUCKETS_NR; idx++) {
+		_wait_barrier(conf, idx);
+		_allow_barrier(conf, idx);
+	}
 
 	mempool_destroy(conf->r1buf_pool);
 	conf->r1buf_pool = NULL;
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] raid1: prevent freeze_array/wait_all_barriers deadlock
  2017-10-17 12:17 [PATCH] raid1: prevent freeze_array/wait_all_barriers deadlock Nate Dailey
@ 2017-10-17 16:31 ` Coly Li
  2017-10-19  4:10   ` Shaohua Li
  0 siblings, 1 reply; 3+ messages in thread
From: Coly Li @ 2017-10-17 16:31 UTC (permalink / raw)
  To: Nate Dailey; +Cc: linux-raid

On 2017/10/17 下午8:17, Nate Dailey wrote:
> If freeze_array is attempted in the middle of close_sync/
> wait_all_barriers, deadlock can occur.
> 
> freeze_array will wait for nr_pending and nr_queued to line up.
> wait_all_barriers increments nr_pending for each barrier bucket, one
> at a time, but doesn't actually issue IO that could be counted in
> nr_queued. So freeze_array is blocked until wait_all_barriers
> completes and allow_all_barriers runs. At the same time, when
> _wait_barrier sees array_frozen == 1, it stops and waits for
> freeze_array to complete.
> 
> Prevent the deadlock by making close_sync call _wait_barrier and
> _allow_barrier for one bucket at a time, instead of deferring the
> _allow_barrier calls until after all _wait_barriers are complete.
> 
> Signed-off-by: Nate Dailey <nate.dailey@stratus.com>

Reviewed-by: Coly Li <colyli@suse.de>

Thank you for the great effort on analyzing this deadlock, and provide a
simpler and clean fix :-)

Coly Li

> ---
>  drivers/md/raid1.c | 24 ++++++------------------
>  1 file changed, 6 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index f3f3e40dc9d8..e4e8f9e565b7 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -990,14 +990,6 @@ static void wait_barrier(struct r1conf *conf, sector_t sector_nr)
>  	_wait_barrier(conf, idx);
>  }
>  
> -static void wait_all_barriers(struct r1conf *conf)
> -{
> -	int idx;
> -
> -	for (idx = 0; idx < BARRIER_BUCKETS_NR; idx++)
> -		_wait_barrier(conf, idx);
> -}
> -
>  static void _allow_barrier(struct r1conf *conf, int idx)
>  {
>  	atomic_dec(&conf->nr_pending[idx]);
> @@ -1011,14 +1003,6 @@ static void allow_barrier(struct r1conf *conf, sector_t sector_nr)
>  	_allow_barrier(conf, idx);
>  }
>  
> -static void allow_all_barriers(struct r1conf *conf)
> -{
> -	int idx;
> -
> -	for (idx = 0; idx < BARRIER_BUCKETS_NR; idx++)
> -		_allow_barrier(conf, idx);
> -}
> -
>  /* conf->resync_lock should be held */
>  static int get_unqueued_pending(struct r1conf *conf)
>  {
> @@ -1654,8 +1638,12 @@ static void print_conf(struct r1conf *conf)
>  
>  static void close_sync(struct r1conf *conf)
>  {
> -	wait_all_barriers(conf);
> -	allow_all_barriers(conf);
> +	int idx;
> +
> +	for (idx = 0; idx < BARRIER_BUCKETS_NR; idx++) {
> +		_wait_barrier(conf, idx);
> +		_allow_barrier(conf, idx);
> +	}
>  
>  	mempool_destroy(conf->r1buf_pool);
>  	conf->r1buf_pool = NULL;
> 


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] raid1: prevent freeze_array/wait_all_barriers deadlock
  2017-10-17 16:31 ` Coly Li
@ 2017-10-19  4:10   ` Shaohua Li
  0 siblings, 0 replies; 3+ messages in thread
From: Shaohua Li @ 2017-10-19  4:10 UTC (permalink / raw)
  To: Coly Li; +Cc: Nate Dailey, linux-raid

On Wed, Oct 18, 2017 at 12:31:31AM +0800, Coly Li wrote:
> On 2017/10/17 下午8:17, Nate Dailey wrote:
> > If freeze_array is attempted in the middle of close_sync/
> > wait_all_barriers, deadlock can occur.
> > 
> > freeze_array will wait for nr_pending and nr_queued to line up.
> > wait_all_barriers increments nr_pending for each barrier bucket, one
> > at a time, but doesn't actually issue IO that could be counted in
> > nr_queued. So freeze_array is blocked until wait_all_barriers
> > completes and allow_all_barriers runs. At the same time, when
> > _wait_barrier sees array_frozen == 1, it stops and waits for
> > freeze_array to complete.
> > 
> > Prevent the deadlock by making close_sync call _wait_barrier and
> > _allow_barrier for one bucket at a time, instead of deferring the
> > _allow_barrier calls until after all _wait_barriers are complete.
> > 
> > Signed-off-by: Nate Dailey <nate.dailey@stratus.com>
> 
> Reviewed-by: Coly Li <colyli@suse.de>
> 
> Thank you for the great effort on analyzing this deadlock, and provide a
> simpler and clean fix :-)

Nice catch, thanks Nate and Coly! I'll mark this for-stable.

> 
> Coly Li
> 
> > ---
> >  drivers/md/raid1.c | 24 ++++++------------------
> >  1 file changed, 6 insertions(+), 18 deletions(-)
> > 
> > diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> > index f3f3e40dc9d8..e4e8f9e565b7 100644
> > --- a/drivers/md/raid1.c
> > +++ b/drivers/md/raid1.c
> > @@ -990,14 +990,6 @@ static void wait_barrier(struct r1conf *conf, sector_t sector_nr)
> >  	_wait_barrier(conf, idx);
> >  }
> >  
> > -static void wait_all_barriers(struct r1conf *conf)
> > -{
> > -	int idx;
> > -
> > -	for (idx = 0; idx < BARRIER_BUCKETS_NR; idx++)
> > -		_wait_barrier(conf, idx);
> > -}
> > -
> >  static void _allow_barrier(struct r1conf *conf, int idx)
> >  {
> >  	atomic_dec(&conf->nr_pending[idx]);
> > @@ -1011,14 +1003,6 @@ static void allow_barrier(struct r1conf *conf, sector_t sector_nr)
> >  	_allow_barrier(conf, idx);
> >  }
> >  
> > -static void allow_all_barriers(struct r1conf *conf)
> > -{
> > -	int idx;
> > -
> > -	for (idx = 0; idx < BARRIER_BUCKETS_NR; idx++)
> > -		_allow_barrier(conf, idx);
> > -}
> > -
> >  /* conf->resync_lock should be held */
> >  static int get_unqueued_pending(struct r1conf *conf)
> >  {
> > @@ -1654,8 +1638,12 @@ static void print_conf(struct r1conf *conf)
> >  
> >  static void close_sync(struct r1conf *conf)
> >  {
> > -	wait_all_barriers(conf);
> > -	allow_all_barriers(conf);
> > +	int idx;
> > +
> > +	for (idx = 0; idx < BARRIER_BUCKETS_NR; idx++) {
> > +		_wait_barrier(conf, idx);
> > +		_allow_barrier(conf, idx);
> > +	}
> >  
> >  	mempool_destroy(conf->r1buf_pool);
> >  	conf->r1buf_pool = NULL;
> > 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2017-10-19  4:10 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-10-17 12:17 [PATCH] raid1: prevent freeze_array/wait_all_barriers deadlock Nate Dailey
2017-10-17 16:31 ` Coly Li
2017-10-19  4:10   ` Shaohua Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).