All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH][RESEND]dm-mpath: fix for race condition between multipath_dtr and pg_init_done.
@ 2013-10-30  3:26 Merla, ShivaKrishna
  2013-10-31  9:03 ` Junichi Nomura
  0 siblings, 1 reply; 8+ messages in thread
From: Merla, ShivaKrishna @ 2013-10-30  3:26 UTC (permalink / raw)
  To: dm-devel@redhat.com; +Cc: agk@redhat.com, snitzer@redhat.com

Whenever multipath_dtr is happening, we should prevent queueing any further path
activation work. There was a kernel panic where after pg_init_done() decrements
pg_init_in_progress to 0, wait_for_pg_init_completion call assumes there are no
more pending path management commands. But if pg_init_required is set by
pg_init_done call due to retriable mode_select errors , then process_queued_ios()
will again queue the path activation work. If free_multipath call has been
completed by the time activate_path work is called, kernel panic was seen on
accessing multipath members.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
RIP: 0010:[<ffffffffa003db1b>]  [<ffffffffa003db1b>] activate_path+0x1b/0x30 [dm_multipath]
[<ffffffff81090ac0>] worker_thread+0x170/0x2a0
[<ffffffff81096c80>] ? autoremove_wake_function+0x0/0x40

Signed-off-by: Shiva Krishna Merla<shivakrishna.merla@netapp.com>
Reviewed-by: Krishnasamy Somasundaram<somasundaram.krishnasamy@netapp.com>
Tested-by: Speagle Andy<Andy.Speagle@netapp.com>

---
--- a/drivers/md/dm-mpath.c	2013-01-29 10:12:10.000000000 -0600
+++ b/drivers/md/dm-mpath.c	2013-10-29 21:02:03.267685017 -0500
@@ -73,6 +73,7 @@ struct multipath {
 
 	wait_queue_head_t pg_init_wait;	/* Wait for pg_init completion */
 
+	unsigned dtr_in_progress;	/* multipath destroy in progress */
 	unsigned pg_init_required;	/* pg_init needs calling? */
 	unsigned pg_init_in_progress;	/* Only one pg_init allowed at once */
 	unsigned pg_init_delay_retry;	/* Delay pg_init retry? */
@@ -498,7 +499,8 @@ static void process_queued_ios(struct wo
 	    (!pgpath && !m->queue_if_no_path))
 		must_queue = 0;
 
-	if (m->pg_init_required && !m->pg_init_in_progress && pgpath)
+	if (m->pg_init_required && !m->pg_init_in_progress && pgpath &&
+	    !m->dtr_in_progress)
 		__pg_init_all_paths(m);
 
 	spin_unlock_irqrestore(&m->lock, flags);
@@ -951,6 +953,11 @@ static void flush_multipath_work(struct 
 static void multipath_dtr(struct dm_target *ti)
 {
 	struct multipath *m = ti->private;
+	unsigned long flags;
+
+	spin_lock_irqsave(&m->lock, flags);
+	m->dtr_in_progress = 1;
+	spin_unlock_irqrestore(&m->lock, flags);
 
 	flush_multipath_work(m);
 	free_multipath(m);
@@ -1164,7 +1171,7 @@ static int pg_init_limit_reached(struct 
 
 	spin_lock_irqsave(&m->lock, flags);
 
-	if (m->pg_init_count <= m->pg_init_retries)
+	if (m->pg_init_count <= m->pg_init_retries && !m->dtr_in_progress)
 		m->pg_init_required = 1;
 	else
 		limit_reached = 1;
--

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH][RESEND]dm-mpath: fix for race condition between multipath_dtr and pg_init_done.
  2013-10-30  3:26 [PATCH][RESEND]dm-mpath: fix for race condition between multipath_dtr and pg_init_done Merla, ShivaKrishna
@ 2013-10-31  9:03 ` Junichi Nomura
  2013-10-31 14:29   ` Merla, ShivaKrishna
  2013-10-31 14:48   ` [PATCH v2] dm mpath: fix " Mike Snitzer
  0 siblings, 2 replies; 8+ messages in thread
From: Junichi Nomura @ 2013-10-31  9:03 UTC (permalink / raw)
  To: device-mapper development, ShivaKrishna.Merla@netapp.com
  Cc: snitzer@redhat.com, agk@redhat.com

On 10/30/13 12:26, Merla, ShivaKrishna wrote:
> Whenever multipath_dtr is happening, we should prevent queueing any further path
> activation work. There was a kernel panic where after pg_init_done() decrements
> pg_init_in_progress to 0, wait_for_pg_init_completion call assumes there are no
> more pending path management commands. But if pg_init_required is set by
> pg_init_done call due to retriable mode_select errors , then process_queued_ios()
> will again queue the path activation work. If free_multipath call has been
> completed by the time activate_path work is called, kernel panic was seen on
> accessing multipath members.
> 
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
> RIP: 0010:[<ffffffffa003db1b>]  [<ffffffffa003db1b>] activate_path+0x1b/0x30 [dm_multipath]
> [<ffffffff81090ac0>] worker_thread+0x170/0x2a0
> [<ffffffff81096c80>] ? autoremove_wake_function+0x0/0x40
> 
> Signed-off-by: Shiva Krishna Merla<shivakrishna.merla@netapp.com>
> Reviewed-by: Krishnasamy Somasundaram<somasundaram.krishnasamy@netapp.com>
> Tested-by: Speagle Andy<Andy.Speagle@netapp.com>
> 
> ---
> --- a/drivers/md/dm-mpath.c	2013-01-29 10:12:10.000000000 -0600
> +++ b/drivers/md/dm-mpath.c	2013-10-29 21:02:03.267685017 -0500
> @@ -73,6 +73,7 @@ struct multipath {
>  
>  	wait_queue_head_t pg_init_wait;	/* Wait for pg_init completion */
>  
> +	unsigned dtr_in_progress;	/* multipath destroy in progress */
>  	unsigned pg_init_required;	/* pg_init needs calling? */
>  	unsigned pg_init_in_progress;	/* Only one pg_init allowed at once */
>  	unsigned pg_init_delay_retry;	/* Delay pg_init retry? */
> @@ -498,7 +499,8 @@ static void process_queued_ios(struct wo
>  	    (!pgpath && !m->queue_if_no_path))
>  		must_queue = 0;
>  
> -	if (m->pg_init_required && !m->pg_init_in_progress && pgpath)
> +	if (m->pg_init_required && !m->pg_init_in_progress && pgpath &&
> +	    !m->dtr_in_progress)
>  		__pg_init_all_paths(m);
>  
>  	spin_unlock_irqrestore(&m->lock, flags);
> @@ -951,6 +953,11 @@ static void flush_multipath_work(struct 
>  static void multipath_dtr(struct dm_target *ti)
>  {
>  	struct multipath *m = ti->private;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&m->lock, flags);
> +	m->dtr_in_progress = 1;
> +	spin_unlock_irqrestore(&m->lock, flags);

Hi,

how about moving this to flush_multipath_work(), which is supposed
to silence background activities?
I.e.
  flush_multipath_work() {
     <disable pg_init retry>
     ...
     <enable pg_init retry>
  }

Then it not only fixes the crash you hit, it also fixes the hidden bug
that pg_init continues retrying while the device is suspended.

-- 
Jun'ichi Nomura, NEC Corporation

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH][RESEND]dm-mpath: fix for race condition between multipath_dtr and pg_init_done.
  2013-10-31  9:03 ` Junichi Nomura
@ 2013-10-31 14:29   ` Merla, ShivaKrishna
  2013-10-31 14:48   ` [PATCH v2] dm mpath: fix " Mike Snitzer
  1 sibling, 0 replies; 8+ messages in thread
From: Merla, ShivaKrishna @ 2013-10-31 14:29 UTC (permalink / raw)
  To: Junichi Nomura, device-mapper development
  Cc: snitzer@redhat.com, agk@redhat.com



> -----Original Message-----
> From: Junichi Nomura [mailto:j-nomura@ce.jp.nec.com]
> Sent: Thursday, October 31, 2013 4:04 AM
> To: device-mapper development; Merla, ShivaKrishna
> Cc: agk@redhat.com; snitzer@redhat.com
> Subject: Re: [dm-devel] [PATCH][RESEND]dm-mpath: fix for race condition
> between multipath_dtr and pg_init_done.
> 
> On 10/30/13 12:26, Merla, ShivaKrishna wrote:
> > Whenever multipath_dtr is happening, we should prevent queueing any
> further path
> > activation work. There was a kernel panic where after pg_init_done()
> decrements
> > pg_init_in_progress to 0, wait_for_pg_init_completion call assumes there
> are no
> > more pending path management commands. But if pg_init_required is set
> by
> > pg_init_done call due to retriable mode_select errors , then
> process_queued_ios()
> > will again queue the path activation work. If free_multipath call has been
> > completed by the time activate_path work is called, kernel panic was seen
> on
> > accessing multipath members.
> >
> > BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000090
> > RIP: 0010:[<ffffffffa003db1b>]  [<ffffffffa003db1b>]
> activate_path+0x1b/0x30 [dm_multipath]
> > [<ffffffff81090ac0>] worker_thread+0x170/0x2a0
> > [<ffffffff81096c80>] ? autoremove_wake_function+0x0/0x40
> >
> > Signed-off-by: Shiva Krishna Merla<shivakrishna.merla@netapp.com>
> > Reviewed-by: Krishnasamy
> Somasundaram<somasundaram.krishnasamy@netapp.com>
> > Tested-by: Speagle Andy<Andy.Speagle@netapp.com>
> >
> > ---
> > --- a/drivers/md/dm-mpath.c	2013-01-29 10:12:10.000000000 -0600
> > +++ b/drivers/md/dm-mpath.c	2013-10-29 21:02:03.267685017 -0500
> > @@ -73,6 +73,7 @@ struct multipath {
> >
> >  	wait_queue_head_t pg_init_wait;	/* Wait for pg_init
> completion */
> >
> > +	unsigned dtr_in_progress;	/* multipath destroy in progress */
> >  	unsigned pg_init_required;	/* pg_init needs calling? */
> >  	unsigned pg_init_in_progress;	/* Only one pg_init allowed at once
> */
> >  	unsigned pg_init_delay_retry;	/* Delay pg_init retry? */
> > @@ -498,7 +499,8 @@ static void process_queued_ios(struct wo
> >  	    (!pgpath && !m->queue_if_no_path))
> >  		must_queue = 0;
> >
> > -	if (m->pg_init_required && !m->pg_init_in_progress && pgpath)
> > +	if (m->pg_init_required && !m->pg_init_in_progress && pgpath &&
> > +	    !m->dtr_in_progress)
> >  		__pg_init_all_paths(m);
> >
> >  	spin_unlock_irqrestore(&m->lock, flags);
> > @@ -951,6 +953,11 @@ static void flush_multipath_work(struct
> >  static void multipath_dtr(struct dm_target *ti)
> >  {
> >  	struct multipath *m = ti->private;
> > +	unsigned long flags;
> > +
> > +	spin_lock_irqsave(&m->lock, flags);
> > +	m->dtr_in_progress = 1;
> > +	spin_unlock_irqrestore(&m->lock, flags);
> 
> Hi,
> 
> how about moving this to flush_multipath_work(), which is supposed
> to silence background activities?
> I.e.
>   flush_multipath_work() {
>      <disable pg_init retry>
>      ...
>      <enable pg_init retry>
>   }
> 
> Then it not only fixes the crash you hit, it also fixes the hidden bug
> that pg_init continues retrying while the device is suspended.
We looked into adding checks for pg_init_required as well in
multipath_wait_for_pg_init_completion(), but none of these will prevent
pg_init_limit_reached() i.e from pg_init_done() from setting pg_init_required.
From code I see we end up waiting indefinitely for its completion. The reason 
for adding check in process_queued_ios() is __choose_pgpath() might again
trigger pg_init_required and end up calling __pg_init_all_paths() which will unset
pg_init_required and sets pg_init_in_progress within a lock. Also I believe the
patch given above will prevent further path activations and no work will be queued
again with dtr_in_progress. With Hannes patch IO's will not be queued in when
pg_init_in_progress so we are clean there as well.  Let me know your thoughts.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2] dm mpath: fix race condition between multipath_dtr and pg_init_done
  2013-10-31  9:03 ` Junichi Nomura
  2013-10-31 14:29   ` Merla, ShivaKrishna
@ 2013-10-31 14:48   ` Mike Snitzer
  2013-10-31 15:26     ` Merla, ShivaKrishna
                       ` (2 more replies)
  1 sibling, 3 replies; 8+ messages in thread
From: Mike Snitzer @ 2013-10-31 14:48 UTC (permalink / raw)
  To: Junichi Nomura, Shiva Krishna Merla
  Cc: device-mapper development, agk@redhat.com

On Thu, Oct 31 2013 at  5:03am -0400,
Junichi Nomura <j-nomura@ce.jp.nec.com> wrote:

> 
> Hi,
> 
> how about moving this to flush_multipath_work(), which is supposed
> to silence background activities?
> I.e.
>   flush_multipath_work() {
>      <disable pg_init retry>
>      ...
>      <enable pg_init retry>
>   }
> 
> Then it not only fixes the crash you hit, it also fixes the hidden bug
> that pg_init continues retrying while the device is suspended.

I ran with your suggestion.  Please see below.

To be clear, pg_init isn't disabled while mpath device is suspended
(meaning m->pg_init_disabled isn't set until the device is resumed).
But flush_multipath_work() will no longer start pg_init during suspend
-- which could otherwise occur while the mpath device is suspended.  So
in practice it accomplishes the stated goal.

Thanks for the suggestion Junichi.  Are you OK with this?  If so please
provide your Ack.

Shiva, can you please verify that this patch resolves the race, should
accomplish the same: just pushes the disabling of pg_init inside
flush_multipath_work().


From: Shiva Krishna Merla <shivakrishna.merla@netapp.com>
Subject: dm mpath: fix race condition between multipath_dtr and pg_init_done
Date: Wed, 30 Oct 2013 03:26:38 +0000

Whenever multipath_dtr() is happening we must prevent queueing any
further path activation work.  Implement this by adding a new
'pg_init_disabled' flag to the multipath structure that denotes future
path activation work should be skipped if it is set.  By disabling
pg_init and then re-enabling in flush_multipath_work() we also avoid the
potential for pg_init to be initiated while suspending an mpath device.

Without this patch a race condition exists that may result in a kernel
panic:

1) If after pg_init_done() decrements pg_init_in_progress to 0, a call
   to wait_for_pg_init_completion() assumes there are no more pending path
   management commands.
2) If pg_init_required is set by pg_init_done(), due to retryable
   mode_select errors, then process_queued_ios() will again queue the
   path activation work.
3) If free_multipath() completes before activate_path() work is called a
   NULL pointer dereference like the following can be seen when
   accessing members of the recently destructed multipath:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
RIP: 0010:[<ffffffffa003db1b>]  [<ffffffffa003db1b>] activate_path+0x1b/0x30 [dm_multipath]
[<ffffffff81090ac0>] worker_thread+0x170/0x2a0
[<ffffffff81096c80>] ? autoremove_wake_function+0x0/0x40

[switched to disabling pg_init in flush_multipath_work, header edited by Mike Snitzer]
Suggested-by: Junichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Shiva Krishna Merla <shivakrishna.merla@netapp.com>
Reviewed-by: Krishnasamy Somasundaram <somasundaram.krishnasamy@netapp.com>
Tested-by: Speagle Andy <Andy.Speagle@netapp.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
---
 drivers/md/dm-mpath.c |   18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

Index: linux/drivers/md/dm-mpath.c
===================================================================
--- linux.orig/drivers/md/dm-mpath.c
+++ linux/drivers/md/dm-mpath.c
@@ -87,6 +87,7 @@ struct multipath {
 	unsigned queue_if_no_path:1;	/* Queue I/O if last path fails? */
 	unsigned saved_queue_if_no_path:1; /* Saved state during suspension */
 	unsigned retain_attached_hw_handler:1; /* If there's already a hw_handler present, don't change it. */
+	unsigned pg_init_disabled:1;	/* pginit is currently not allowed */
 
 	unsigned pg_init_retries;	/* Number of times to retry pg_init */
 	unsigned pg_init_count;		/* Number of times pg_init called */
@@ -497,7 +498,8 @@ static void process_queued_ios(struct wo
 	    (!pgpath && !m->queue_if_no_path))
 		must_queue = 0;
 
-	if (m->pg_init_required && !m->pg_init_in_progress && pgpath)
+	if (m->pg_init_required && !m->pg_init_in_progress && pgpath &&
+	    !m->pg_init_disabled)
 		__pg_init_all_paths(m);
 
 	spin_unlock_irqrestore(&m->lock, flags);
@@ -942,10 +944,20 @@ static void multipath_wait_for_pg_init_c
 
 static void flush_multipath_work(struct multipath *m)
 {
+	unsigned long flags;
+
+	spin_lock_irqsave(&m->lock, flags);
+	m->pg_init_disabled = 1;
+	spin_unlock_irqrestore(&m->lock, flags);
+
 	flush_workqueue(kmpath_handlerd);
 	multipath_wait_for_pg_init_completion(m);
 	flush_workqueue(kmultipathd);
 	flush_work(&m->trigger_event);
+
+	spin_lock_irqsave(&m->lock, flags);
+	m->pg_init_disabled = 0;
+	spin_unlock_irqrestore(&m->lock, flags);
 }
 
 static void multipath_dtr(struct dm_target *ti)
@@ -1164,7 +1176,7 @@ static int pg_init_limit_reached(struct
 
 	spin_lock_irqsave(&m->lock, flags);
 
-	if (m->pg_init_count <= m->pg_init_retries)
+	if (m->pg_init_count <= m->pg_init_retries && !m->pg_init_disabled)
 		m->pg_init_required = 1;
 	else
 		limit_reached = 1;
@@ -1714,7 +1726,7 @@ out:
  *---------------------------------------------------------------*/
 static struct target_type multipath_target = {
 	.name = "multipath",
-	.version = {1, 5, 1},
+	.version = {1, 6, 0},
 	.module = THIS_MODULE,
 	.ctr = multipath_ctr,
 	.dtr = multipath_dtr,

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] dm mpath: fix race condition between multipath_dtr and pg_init_done
  2013-10-31 14:48   ` [PATCH v2] dm mpath: fix " Mike Snitzer
@ 2013-10-31 15:26     ` Merla, ShivaKrishna
  2013-11-01  1:10     ` Junichi Nomura
  2013-11-05 15:56     ` Alasdair G Kergon
  2 siblings, 0 replies; 8+ messages in thread
From: Merla, ShivaKrishna @ 2013-10-31 15:26 UTC (permalink / raw)
  To: Mike Snitzer, Junichi Nomura; +Cc: device-mapper development, agk@redhat.com


> -----Original Message-----
> From: Mike Snitzer [mailto:snitzer@redhat.com]
> Sent: Thursday, October 31, 2013 9:48 AM
> To: Junichi Nomura; Merla, ShivaKrishna
> Cc: device-mapper development; agk@redhat.com
> Subject: [PATCH v2] dm mpath: fix race condition between multipath_dtr
> and pg_init_done
> 
> On Thu, Oct 31 2013 at  5:03am -0400,
> Junichi Nomura <j-nomura@ce.jp.nec.com> wrote:
> 
> >
> > Hi,
> >
> > how about moving this to flush_multipath_work(), which is supposed
> > to silence background activities?
> > I.e.
> >   flush_multipath_work() {
> >      <disable pg_init retry>
> >      ...
> >      <enable pg_init retry>
> >   }
> >
> > Then it not only fixes the crash you hit, it also fixes the hidden bug
> > that pg_init continues retrying while the device is suspended.
> 
> I ran with your suggestion.  Please see below.
> 
> To be clear, pg_init isn't disabled while mpath device is suspended
> (meaning m->pg_init_disabled isn't set until the device is resumed).
> But flush_multipath_work() will no longer start pg_init during suspend
> -- which could otherwise occur while the mpath device is suspended.  So
> in practice it accomplishes the stated goal.
> 
> Thanks for the suggestion Junichi.  Are you OK with this?  If so please
> provide your Ack.
> 
> Shiva, can you please verify that this patch resolves the race, should
> accomplish the same: just pushes the disabling of pg_init inside
> flush_multipath_work().

Yes Mike, This looks good. Thanks Junichi for the suggestion.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] dm mpath: fix race condition between multipath_dtr and pg_init_done
  2013-10-31 14:48   ` [PATCH v2] dm mpath: fix " Mike Snitzer
  2013-10-31 15:26     ` Merla, ShivaKrishna
@ 2013-11-01  1:10     ` Junichi Nomura
  2013-11-05 15:56     ` Alasdair G Kergon
  2 siblings, 0 replies; 8+ messages in thread
From: Junichi Nomura @ 2013-11-01  1:10 UTC (permalink / raw)
  To: Mike Snitzer, Shiva Krishna Merla
  Cc: device-mapper development, agk@redhat.com

On 10/31/13 23:48, Mike Snitzer wrote:
> On Thu, Oct 31 2013 at  5:03am -0400,
> Junichi Nomura <j-nomura@ce.jp.nec.com> wrote:
>> how about moving this to flush_multipath_work(), which is supposed
>> to silence background activities?
>> I.e.
>>   flush_multipath_work() {
>>      <disable pg_init retry>
>>      ...
>>      <enable pg_init retry>
>>   }
>>
>> Then it not only fixes the crash you hit, it also fixes the hidden bug
>> that pg_init continues retrying while the device is suspended.
> 
> I ran with your suggestion.  Please see below.
> 
> To be clear, pg_init isn't disabled while mpath device is suspended
> (meaning m->pg_init_disabled isn't set until the device is resumed).
> But flush_multipath_work() will no longer start pg_init during suspend
> -- which could otherwise occur while the mpath device is suspended.  So
> in practice it accomplishes the stated goal.
> 
> Thanks for the suggestion Junichi.  Are you OK with this?  If so please
> provide your Ack.

Yes, that's perfect. Thank you.
Acked-by: Junichi Nomura <j-nomura@ce.jp.nec.com>

And thanks Shiva Krishna for finding/fixing this.

> 
> Shiva, can you please verify that this patch resolves the race, should
> accomplish the same: just pushes the disabling of pg_init inside
> flush_multipath_work().

-- 
Jun'ichi Nomura, NEC Corporation

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] dm mpath: fix race condition between multipath_dtr and pg_init_done
  2013-10-31 14:48   ` [PATCH v2] dm mpath: fix " Mike Snitzer
  2013-10-31 15:26     ` Merla, ShivaKrishna
  2013-11-01  1:10     ` Junichi Nomura
@ 2013-11-05 15:56     ` Alasdair G Kergon
  2013-11-05 17:12       ` Mike Snitzer
  2 siblings, 1 reply; 8+ messages in thread
From: Alasdair G Kergon @ 2013-11-05 15:56 UTC (permalink / raw)
  To: Mike Snitzer, Junichi Nomura, Shiva Krishna Merla,
	device-mapper development, agk@redhat.com

On Thu, Oct 31, 2013 at 10:48:13AM -0400, Mike Snitzer wrote:
> --- linux.orig/drivers/md/dm-mpath.c
> +++ linux/drivers/md/dm-mpath.c

> +	unsigned pg_init_disabled:1;	/* pginit is currently not allowed */

> +	if (m->pg_init_required && !m->pg_init_in_progress && pgpath &&
> +	    !m->pg_init_disabled)

So we're accepting that pg_init might be both "disabled" and "required"
simultaneously (otherwise we wouldn't need this test)?

> @@ -942,10 +944,20 @@ static void multipath_wait_for_pg_init_c
>  static void flush_multipath_work(struct multipath *m)
>  {
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&m->lock, flags);
> +	m->pg_init_disabled = 1;
> +	spin_unlock_irqrestore(&m->lock, flags);
> +
>  	flush_workqueue(kmpath_handlerd);
>  	multipath_wait_for_pg_init_completion(m);

So this implies pg_init could even still be running while it is disabled?

If the logic is correct, then I find the new variable name quite confusing and
think it should be changed, or as a minimum have inline comments explaining
the apparent contradictions!

Alasdair

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] dm mpath: fix race condition between multipath_dtr and pg_init_done
  2013-11-05 15:56     ` Alasdair G Kergon
@ 2013-11-05 17:12       ` Mike Snitzer
  0 siblings, 0 replies; 8+ messages in thread
From: Mike Snitzer @ 2013-11-05 17:12 UTC (permalink / raw)
  To: Junichi Nomura, Shiva Krishna Merla, device-mapper development,
	agk@redhat.com

On Tue, Nov 05 2013 at 10:56am -0500,
Alasdair G Kergon <agk@redhat.com> wrote:

> On Thu, Oct 31, 2013 at 10:48:13AM -0400, Mike Snitzer wrote:
> > --- linux.orig/drivers/md/dm-mpath.c
> > +++ linux/drivers/md/dm-mpath.c
> 
> > +	unsigned pg_init_disabled:1;	/* pginit is currently not allowed */
> 
> > +	if (m->pg_init_required && !m->pg_init_in_progress && pgpath &&
> > +	    !m->pg_init_disabled)
> 
> So we're accepting that pg_init might be both "disabled" and "required"
> simultaneously (otherwise we wouldn't need this test)?

Yes, needs to be independent of pg_init_required.

> > @@ -942,10 +944,20 @@ static void multipath_wait_for_pg_init_c
> >  static void flush_multipath_work(struct multipath *m)
> >  {
> > +	unsigned long flags;
> > +
> > +	spin_lock_irqsave(&m->lock, flags);
> > +	m->pg_init_disabled = 1;
> > +	spin_unlock_irqrestore(&m->lock, flags);
> > +
> >  	flush_workqueue(kmpath_handlerd);
> >  	multipath_wait_for_pg_init_completion(m);
> 
> So this implies pg_init could even still be running while it is disabled?

pg_init_disabled doesn't allow pg_init to be initiated once set (even if
pg_init_required is).  Hence pg_init is disabled.

> If the logic is correct, then I find the new variable name quite confusing and
> think it should be changed, or as a minimum have inline comments explaining
> the apparent contradictions!

I'm not seeing a contradiction.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2013-11-05 17:12 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-10-30  3:26 [PATCH][RESEND]dm-mpath: fix for race condition between multipath_dtr and pg_init_done Merla, ShivaKrishna
2013-10-31  9:03 ` Junichi Nomura
2013-10-31 14:29   ` Merla, ShivaKrishna
2013-10-31 14:48   ` [PATCH v2] dm mpath: fix " Mike Snitzer
2013-10-31 15:26     ` Merla, ShivaKrishna
2013-11-01  1:10     ` Junichi Nomura
2013-11-05 15:56     ` Alasdair G Kergon
2013-11-05 17:12       ` Mike Snitzer

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.