* [PATCH V3] multipathd: release uxsocket and resource when cancel thread
@ 2018-01-15 12:09 Wuchongyun
  2018-01-15 14:10 ` Martin Wilck
  0 siblings, 1 reply; 9+ messages in thread
From: Wuchongyun @ 2018-01-15 12:09 UTC
  To: Martin Wilck, dm-devel@redhat.com; +Cc: Guozhonghua, Changwei Ge

Hi Martin,
Thank you for replying so quickly. Below is the new patch, revised according to your comments. Please review it. Thanks a lot.

Issue description: when multipathd initializes and calls uxsock_listen()
to create the unix domain socket, the socket creation returns -1 with
errno 98, and uxsock_listen() then returns NULL. After multipathd has
started up, it can no longer receive any user commands, so new multipath
maps cannot be created and no other operation can be carried out.

We found that the uxlsnr thread's cleanup function neither closes the
sockets nor releases the clients when the thread is cancelled; the
domain socket is left for the system to release. In some environments,
for example when the machine is under very heavy load, the system may
not have closed the old domain socket yet when we try to create and
bind the new one, and the bind may fail with errno 98 (Address already
in use).

We also ran an experiment: if uxsock_cleanup() closes ux_sock first and
we then immediately call ux_socket_listen() to create a new ux_sock,
initialization succeeds; if we do not close ux_sock,
ux_socket_listen() returns -1 with errno = 98.

So we believe that closing the uxsocket and releasing the clients when
the thread is cancelled ensures that a newly started multipathd can
create its uxsocket successfully and receive multipathd commands
properly. This patch also fixes the memory leak of the clients.

Signed-off-by: Chongyun Wu <wu.chongyun@h3c.com>
---
 multipathd/uxlsnr.c |   25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/multipathd/uxlsnr.c b/multipathd/uxlsnr.c
index 98ac25a..f0041c8 100644
--- a/multipathd/uxlsnr.c
+++ b/multipathd/uxlsnr.c
@@ -102,16 +102,21 @@ static void new_client(int ux_sock)
 /*
  * kill off a dead client
  */
-static void dead_client(struct client *c)
+static void _dead_client(struct client *c)
 {
-	pthread_mutex_lock(&client_lock);
 	list_del_init(&c->node);
-	pthread_mutex_unlock(&client_lock);
 	close(c->fd);
 	c->fd = -1;
 	FREE(c);
 }
 
+static void dead_client(struct client *c)
+{
+	pthread_mutex_lock(&client_lock);
+	_dead_client(c);
+	pthread_mutex_unlock(&client_lock);
+}
+
 void free_polls (void)
 {
 	if (polls)
@@ -139,6 +144,18 @@ void check_timeout(struct timespec start_time, char *inbuf,
 
 void uxsock_cleanup(void *arg)
 {
+	struct client *client_loop;
+	struct client *client_tmp;
+	int ux_sock = (int)arg;
+
+	pthread_mutex_lock(&client_lock);
+	list_for_each_entry_safe(client_loop, client_tmp, &clients, node) {
+		_dead_client(client_loop);
+	}
+	pthread_mutex_unlock(&client_lock);
+
+	close(ux_sock);
+
 	cli_exit();
 	free_polls();
 }
@@ -162,7 +179,7 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, void * trigger_data)
 		return NULL;
 	}
 
-	pthread_cleanup_push(uxsock_cleanup, NULL);
+	pthread_cleanup_push(uxsock_cleanup, (void *)ux_sock);
 
 	condlog(3, "uxsock: startup listener");
 	polls = (struct pollfd *)MALLOC((MIN_POLLS + 1) * sizeof(struct pollfd));
-- 
1.7.9.5
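
The experiment from the issue description can be approximated outside
multipathd. The following is only a minimal sketch: it assumes a Linux
abstract-namespace unix socket, and the helper names listen_abstract() and
rebind_experiment() as well as the socket name passed in are hypothetical,
not taken from multipath-tools. It tries a second bind while the first
listening socket is still open, then retries after closing it.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Bind and listen on an abstract-namespace unix socket (Linux only).
 * Returns the descriptor on success, or -1 with errno set. */
static int listen_abstract(const char *name)
{
	struct sockaddr_un addr;
	socklen_t len;
	int fd, saved;

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	/* sun_path[0] == '\0' selects the abstract namespace */
	strncpy(&addr.sun_path[1], name, sizeof(addr.sun_path) - 2);
	len = offsetof(struct sockaddr_un, sun_path) + 1 +
	      strlen(&addr.sun_path[1]);

	if (bind(fd, (struct sockaddr *)&addr, len) < 0 ||
	    listen(fd, 10) < 0) {
		saved = errno;
		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}

/* Returns the errno of a second bind made while the first socket is
 * still open (0 if it unexpectedly succeeded, -1 on setup failure).
 * *rebound is set to 1 if binding works again after the first socket
 * has been closed. */
static int rebind_experiment(const char *name, int *rebound)
{
	int first, second, err = 0;

	first = listen_abstract(name);
	if (first < 0)
		return -1;

	second = listen_abstract(name);
	if (second < 0)
		err = errno;	/* the name is still in use */
	else
		close(second);

	close(first);		/* releases the abstract name */

	second = listen_abstract(name);
	*rebound = (second >= 0);
	if (second >= 0)
		close(second);
	return err;
}
```

Under these assumptions the first retry fails with EADDRINUSE (98 on
Linux) and the retry after close() succeeds, which matches the behaviour
reported above.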


* Re: [PATCH V3] multipathd: release uxsocket and resource when cancel thread
  2018-01-15 12:09 [PATCH V3] multipathd: release uxsocket and resource when cancel thread Wuchongyun
@ 2018-01-15 14:10 ` Martin Wilck
  0 siblings, 0 replies; 9+ messages in thread
From: Martin Wilck @ 2018-01-15 14:10 UTC
  To: Wuchongyun, dm-devel@redhat.com; +Cc: Guozhonghua, Changwei Ge

On Mon, 2018-01-15 at 12:09 +0000, Wuchongyun wrote:
> Hi Martin,
> Thank you for reply so quickly.  Below is the new patch according to
> your comments, please help to review this patch, thanks a lot~
> 
> 
[...]

>   */
> -static void dead_client(struct client *c)
> +static void _dead_client(struct client *c)
>  {
> -	pthread_mutex_lock(&client_lock);
>  	list_del_init(&c->node);
> -	pthread_mutex_unlock(&client_lock);
>  	close(c->fd);
>  	c->fd = -1;
>  	FREE(c);
>  }

You may need to use pthread_cleanup_push() here for the unlock, 
because close() is a cancellation point.
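
A minimal sketch of that pattern, not taken from multipath-tools: the
names demo_lock, unlock_handler(), worker() and cancel_while_locked() are
hypothetical, and pause() stands in for close() because it is a
cancellation point that reliably blocks. The worker is cancelled while it
holds the mutex, and the cleanup handler is what releases it.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t locked;	/* posted once the worker holds demo_lock */

/* Cleanup handler: release the mutex passed in arg. */
static void unlock_handler(void *arg)
{
	pthread_mutex_unlock((pthread_mutex_t *)arg);
}

/* Takes demo_lock, then blocks in pause(), a cancellation point. */
static void *worker(void *unused)
{
	(void)unused;
	pthread_cleanup_push(unlock_handler, &demo_lock);
	pthread_mutex_lock(&demo_lock);
	sem_post(&locked);
	pause();
	pthread_cleanup_pop(1);	/* normal path: run the handler as well */
	return NULL;
}

/* Cancels the worker while it holds demo_lock. Returns 0 if the mutex
 * is free afterwards, i.e. the cleanup handler released it. */
static int cancel_while_locked(void)
{
	pthread_t tid;
	void *res;
	int rc;

	sem_init(&locked, 0, 0);
	rc = pthread_create(&tid, NULL, worker, NULL);
	assert(rc == 0);
	sem_wait(&locked);	/* the worker now owns demo_lock */
	pthread_cancel(tid);
	pthread_join(tid, &res);
	assert(res == PTHREAD_CANCELED);
	sem_destroy(&locked);

	rc = pthread_mutex_trylock(&demo_lock);
	if (rc == 0)
		pthread_mutex_unlock(&demo_lock);
	return rc;
}
```

Without the push/pop pair the mutex would stay locked after the
cancellation and the trylock in cancel_while_locked() would fail.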

Regards
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


* Re: [PATCH V3] multipathd: release uxsocket and resource when cancel thread
@ 2018-01-16 11:48 Wuchongyun
  2018-01-16 13:19 ` Martin Wilck
  2018-01-16 22:45 ` Benjamin Marzinski
  0 siblings, 2 replies; 9+ messages in thread
From: Wuchongyun @ 2018-01-16 11:48 UTC
  To: Martin Wilck, dm-devel@redhat.com; +Cc: Guozhonghua, Changwei Ge

Hi Martin,
Sorry, I forgot about that. Actually, I found that dead_client() will not be interrupted by thread cancellation: handle_signals() only gets a chance to be called by uxsock_listen() after all dead_client() calls have completed, and it is handle_signals() that calls exit_daemon() and sends the thread-cancel signal to all child threads, including uxlsnr.
But your comment is a good one and makes the code safer. Below is the new patch; please have a look. Thanks.

Issue description: when multipathd initializes and calls uxsock_listen()
to create the unix domain socket, the socket creation returns -1 with
errno 98, and uxsock_listen() then returns NULL. After multipathd has
started up, it can no longer receive any user commands, so new multipath
maps cannot be created and no other operation can be carried out.

We found that the uxlsnr thread's cleanup function neither closes the
sockets nor releases the clients when the thread is cancelled; the
domain socket is left for the system to release. In some environments,
for example when the machine is under very heavy load, the system may
not have closed the old domain socket yet when we try to create and
bind the new one, and the bind may fail with errno 98 (Address already
in use).

We also ran an experiment: if uxsock_cleanup() closes ux_sock first and
we then immediately call ux_socket_listen() to create a new ux_sock,
initialization succeeds; if we do not close ux_sock,
ux_socket_listen() returns -1 with errno = 98.

So we believe that closing the uxsocket and releasing the clients when
the thread is cancelled ensures that a newly started multipathd can
create its uxsocket successfully and receive multipathd commands
properly. This patch also fixes the memory leak of the clients.

Signed-off-by: Chongyun Wu <wu.chongyun@h3c.com>
---
 multipathd/uxlsnr.c |   29 ++++++++++++++++++++++++-----
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/multipathd/uxlsnr.c b/multipathd/uxlsnr.c
index 98ac25a..c8065ea 100644
--- a/multipathd/uxlsnr.c
+++ b/multipathd/uxlsnr.c
@@ -102,14 +102,21 @@ static void new_client(int ux_sock)
 /*
  * kill off a dead client
  */
-static void dead_client(struct client *c)
+static void _dead_client(struct client *c)
 {
-	pthread_mutex_lock(&client_lock);
+	int fd = c->fd;
 	list_del_init(&c->node);
-	pthread_mutex_unlock(&client_lock);
-	close(c->fd);
 	c->fd = -1;
 	FREE(c);
+	close(fd);
+}
+
+static void dead_client(struct client *c)
+{
+	pthread_cleanup_push(cleanup_lock, &client_lock);
+	pthread_mutex_lock(&client_lock);
+	_dead_client(c);
+	pthread_cleanup_pop(1);
 }
 
 void free_polls (void)
@@ -139,6 +146,18 @@ void check_timeout(struct timespec start_time, char *inbuf,
 
 void uxsock_cleanup(void *arg)
 {
+	struct client *client_loop;
+	struct client *client_tmp;
+	int ux_sock = (int)arg;
+
+	pthread_mutex_lock(&client_lock);
+	list_for_each_entry_safe(client_loop, client_tmp, &clients, node) {
+		_dead_client(client_loop);
+	}
+	pthread_mutex_unlock(&client_lock);
+
+	close(ux_sock);
+
 	cli_exit();
 	free_polls();
 }
@@ -162,7 +181,7 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, void * trigger_data)
 		return NULL;
 	}
 
-	pthread_cleanup_push(uxsock_cleanup, NULL);
+	pthread_cleanup_push(uxsock_cleanup, (void *)ux_sock);
 
 	condlog(3, "uxsock: startup listener");
 	polls = (struct pollfd *)MALLOC((MIN_POLLS + 1) * sizeof(struct pollfd));
-- 
1.7.9.5




* Re: [PATCH V3] multipathd: release uxsocket and resource when cancel thread
  2018-01-16 11:48 Wuchongyun
@ 2018-01-16 13:19 ` Martin Wilck
  2018-01-16 22:39   ` Benjamin Marzinski
  2018-01-16 22:45 ` Benjamin Marzinski
  1 sibling, 1 reply; 9+ messages in thread
From: Martin Wilck @ 2018-01-16 13:19 UTC
  To: Wuchongyun, dm-devel@redhat.com, Christophe Varoqui
  Cc: Guozhonghua, Changwei Ge

On Tue, 2018-01-16 at 11:48 +0000, Wuchongyun wrote:
> Hi Martin,
> Sorry to forget that, actually I found that dead_client() will not be
> interrupt by thread cancle, because after all dead_client() calling
> point be done then handle_signals() have chance to be called by
> uxsock_listen() which will call exit_daemon() and send 
> cancel threads signal to all child process include uxlsnr.

Fair enough.

> But your comments is good can make code more safer. Below is the new
> patch, please have a look, thanks.

I think it's really safer this way, should anyone see the need to
cancel the listener thread from another point in the code.

The patch looks good now.

Reviewed-by: Martin Wilck <mwilck@suse.com>


* Re: [PATCH V3] multipathd: release uxsocket and resource when cancel thread
  2018-01-16 13:19 ` Martin Wilck
@ 2018-01-16 22:39   ` Benjamin Marzinski
  2018-01-17  0:55     ` Martin Wilck
  0 siblings, 1 reply; 9+ messages in thread
From: Benjamin Marzinski @ 2018-01-16 22:39 UTC
  To: Martin Wilck; +Cc: Wuchongyun, dm-devel@redhat.com, Guozhonghua, Changwei Ge

On Tue, Jan 16, 2018 at 02:19:20PM +0100, Martin Wilck wrote:
> On Tue, 2018-01-16 at 11:48 +0000, Wuchongyun wrote:
> > Hi Martin,
> > Sorry to forget that, actually I found that dead_client() will not be
> > interrupt by thread cancle, because after all dead_client() calling
> > point be done then handle_signals() have chance to be called by
> > uxsock_listen() which will call exit_daemon() and send 
> > cancel threads signal to all child process include uxlsnr.
> 
> Fair enough.
> 
> > But your comments is good can make code more safer. Below is the new
> > patch, please have a look, thanks.
> 
> I think it's really safer whis way, should anyone see the need to
> cancel the listener thread from another point in the code.

I'm confused why this is safe. After uxsock_listen() calls exit_daemon()
from handle_signals(), it doesn't exit. It loops around and polls again,
and could in theory find a client that has died.  In fact if the client
is killing multipathd via

# multipathd shutdown

instead of a signal, won't it be very likely that it will find a dead
client when it loops right after calling exit_daemon() in
cli_shutdown()? This could hit the deadlock that you noticed, where
uxsock_cleanup() can't run because dead_client() is already holding the
mutex.

Or am I missing something here?
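
The concern can be reproduced in miniature. This is only a sketch under
assumptions: outer_cleanup() plays the role of a cleanup handler that
needs the client mutex, listener() takes the mutex without pushing an
unlock handler, and pause() stands in for a cancellation point such as
close(). All names are hypothetical. pthread_mutex_trylock() is used so
the demo terminates; a blocking pthread_mutex_lock() at that point would
hang.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t locked;		/* posted once the listener holds demo_lock */
static int trylock_rc = -1;	/* what the cleanup handler observed */

/* Cleanup handler that needs demo_lock, like a handler walking a
 * client list would. */
static void outer_cleanup(void *unused)
{
	(void)unused;
	trylock_rc = pthread_mutex_trylock(&demo_lock);
	if (trylock_rc == 0)
		pthread_mutex_unlock(&demo_lock);
}

static void *listener(void *unused)
{
	(void)unused;
	pthread_cleanup_push(outer_cleanup, NULL);
	pthread_mutex_lock(&demo_lock);		/* no unlock handler pushed */
	sem_post(&locked);
	pause();				/* cancellation point */
	pthread_mutex_unlock(&demo_lock);	/* skipped on cancellation */
	pthread_cleanup_pop(1);
	return NULL;
}

/* Returns 1 if the cleanup handler found the mutex still held (EBUSY),
 * which is where a blocking lock would deadlock. */
static int cancel_without_unlock_handler(void)
{
	pthread_t tid;
	void *res;
	int rc;

	sem_init(&locked, 0, 0);
	rc = pthread_create(&tid, NULL, listener, NULL);
	assert(rc == 0);
	sem_wait(&locked);
	pthread_cancel(tid);
	pthread_join(tid, &res);
	assert(res == PTHREAD_CANCELED);
	sem_destroy(&locked);

	return trylock_rc == EBUSY;
}
```

Note that the mutex stays locked after this run, because the cancelled
thread never released it.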
-Ben

> 
> The patch is looks good now.
> 
> Reviewed-by: Martin Wilck <mwilck@suse.com>
> 

* Re: [PATCH V3] multipathd: release uxsocket and resource when cancel thread
  2018-01-16 11:48 Wuchongyun
  2018-01-16 13:19 ` Martin Wilck
@ 2018-01-16 22:45 ` Benjamin Marzinski
  1 sibling, 0 replies; 9+ messages in thread
From: Benjamin Marzinski @ 2018-01-16 22:45 UTC
  To: Wuchongyun; +Cc: Guozhonghua, dm-devel@redhat.com, Martin Wilck, Changwei Ge

On Tue, Jan 16, 2018 at 11:48:28AM +0000, Wuchongyun wrote:
> Hi Martin,
> Sorry to forget that, actually I found that dead_client() will not be interrupt by thread cancle, because after all dead_client() calling point be done then handle_signals() have chance to be called by uxsock_listen() which will call exit_daemon() and send 
> cancel threads signal to all child process include uxlsnr.
> But your comments is good can make code more safer. Below is the new patch, please have a look, thanks.
> 

I have one small issue with this patch.

Since you are now closing ux_sock in uxsock_cleanup(), you should remove
the close(ux_sock) at the end of uxsock_listen(). pthread_cleanup_pop()
will already take care of that.
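
Why a redundant close() is more than cosmetic can be shown with a small
sketch. It is not multipathd code; stale_close_demo() is a hypothetical
helper, and it relies on the POSIX rule that a new descriptor gets the
lowest free number. The second close() of a stale descriptor number ends
up closing an unrelated descriptor that was opened in between.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if a stale second close() closed an unrelated, newly opened
 * descriptor, 0 if it did not, -1 on setup failure. */
static int stale_close_demo(void)
{
	int a[2], b[2], stale, hit;

	if (pipe(a) < 0)
		return -1;
	stale = a[0];
	close(stale);		/* first close, e.g. from a cleanup handler */

	if (pipe(b) < 0) {	/* new descriptors take the lowest free numbers */
		close(a[1]);
		return -1;
	}

	close(stale);		/* second close of the "same" descriptor */

	/* If b[0] reused the number, it has just been closed behind our back. */
	hit = (b[0] == stale &&
	       fcntl(b[0], F_GETFD) == -1 && errno == EBADF);

	if (!hit)
		close(b[0]);
	close(b[1]);
	close(a[1]);
	return hit;
}
```

In a multi-threaded daemon another thread may open a descriptor between
the two close() calls, so the second call can hit a descriptor it does
not own.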

-Ben



* Re: [PATCH V3] multipathd: release uxsocket and resource when cancel thread
  2018-01-16 22:39   ` Benjamin Marzinski
@ 2018-01-17  0:55     ` Martin Wilck
  0 siblings, 0 replies; 9+ messages in thread
From: Martin Wilck @ 2018-01-17  0:55 UTC
  To: Benjamin Marzinski, Wuchongyun
  Cc: Guozhonghua, dm-devel@redhat.com, Changwei Ge

On Tue, 2018-01-16 at 16:39 -0600, Benjamin Marzinski wrote:
> On Tue, Jan 16, 2018 at 02:19:20PM +0100, Martin Wilck wrote:
> > On Tue, 2018-01-16 at 11:48 +0000, Wuchongyun wrote:
> > > Hi Martin,
> > > Sorry to forget that, actually I found that dead_client() will
> > > not be
> > > interrupt by thread cancle, because after all dead_client()
> > > calling
> > > point be done then handle_signals() have chance to be called by
> > > uxsock_listen() which will call exit_daemon() and send 
> > > cancel threads signal to all child process include uxlsnr.
> > 
> > Fair enough.
> > 
> > > But your comments is good can make code more safer. Below is the
> > > new
> > > patch, please have a look, thanks.
> > 
> > I think it's really safer whis way, should anyone see the need to
> > cancel the listener thread from another point in the code.
> 
> I'm confused why this is safe. After uxsock_listen() calls
> exit_daemon()
> from handle_signals(), it doesn't exit. It loops around and polls
> again,
> and could in theory find a client that has died.  In fact if the
> client
> is killing multipathd via
> 
> # multipathd shutdown
> 
> instead of a signal, won't it be very likely that it will find a dead
> client when it loops right after calling exit_daemon() in
> cli_shutdown()? This could hit the deadlock that you noticed, where
> uxsock_cleanup() can't run because dead_client() already holding the
> mutex.
> 
> Or am I missing something here?

With "safer", I was only referring to the fact that Wuchongyun replaced
pthread_mutex_unlock() with pthread_cleanup_push(cleanup_lock,
&client_lock) ... pthread_cleanup_pop(). The original deadlock is
avoided either way, AFAICS.

Martin


* Re: [PATCH V3] multipathd: release uxsocket and resource when cancel thread
@ 2018-01-17  2:04 Wuchongyun
  2018-01-17 18:59 ` Benjamin Marzinski
  0 siblings, 1 reply; 9+ messages in thread
From: Wuchongyun @ 2018-01-17  2:04 UTC
  To: Benjamin Marzinski, Martin Wilck
  Cc: dm-devel@redhat.com, Changlimin, Changwei Ge, Guozhonghua

On Tue, Jan 16, 2018 at 04:39PM -0600, Benjamin Marzinski wrote:
> On Tue, Jan 16, 2018 at 02:19:20PM +0100, Martin Wilck wrote:
> > On Tue, 2018-01-16 at 11:48 +0000, Wuchongyun wrote:
> > > Hi Martin,
> > > Sorry to forget that, actually I found that dead_client() will not 
> > > be interrupt by thread cancle, because after all dead_client() 
> > > calling point be done then handle_signals() have chance to be called 
> > > by
> > > uxsock_listen() which will call exit_daemon() and send cancel 
> > > threads signal to all child process include uxlsnr.
> > 
> > Fair enough.
> > 
> > > But your comments is good can make code more safer. Below is the new 
> > > patch, please have a look, thanks.
> > 
> > I think it's really safer whis way, should anyone see the need to 
> > > cancel the listener thread from another point in the code.

> I'm confused why this is safe. After uxsock_listen() calls exit_daemon() from handle_signals(), it doesn't exit. It loops around and polls again, and could in theory find a client that has died.  In fact if the client is killing multipathd via
> # multipathd shutdown
> instead of a signal, won't it be very likely that it will find a dead client when it loops right after calling exit_daemon() in cli_shutdown()? This could hit the deadlock that you noticed, where
> uxsock_cleanup() can't run because dead_client() already holding the mutex.
> Or am I missing something here?

Hi Benjamin,
Thanks for your comments. My reply is below.

You have found exactly the scenario that requires calling pthread_cleanup_push(cleanup_lock, &client_lock) before taking the lock in dead_client() in order to avoid the deadlock.
Suppose the client is killing multipathd via "multipathd shutdown" and the listener finds a dead client when it loops right after calling exit_daemon() in cli_shutdown(). This will not deadlock: in dead_client() we call pthread_cleanup_push(cleanup_lock, &client_lock) before taking the lock, so when the thread cancellation happens, the thread cleanup handlers are popped in reverse order, cleanup_lock() first and then uxsock_cleanup(). This makes sure that client_lock has been released before uxsock_cleanup() is called. So it is safer this way.
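
The ordering argument can be checked with a small sketch. It is
illustrative only: inner_unlock() stands in for cleanup_lock(),
outer_cleanup() for uxsock_cleanup(), pause() for the cancellation point,
and every name in it is hypothetical. Each handler records when it runs,
and the outer one also checks whether the mutex is free by then.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t locked;		/* posted once the listener holds demo_lock */
static char order[3];		/* handler run order, NUL-terminated */
static int n_run;
static int outer_saw_free;	/* 1 if the outer handler could take the lock */

/* Pushed last, so it runs first: release the mutex. */
static void inner_unlock(void *arg)
{
	order[n_run++] = 'I';
	pthread_mutex_unlock((pthread_mutex_t *)arg);
}

/* Pushed first, so it runs last: needs the mutex itself. */
static void outer_cleanup(void *arg)
{
	pthread_mutex_t *m = arg;

	order[n_run++] = 'O';
	if (pthread_mutex_trylock(m) == 0) {
		outer_saw_free = 1;
		pthread_mutex_unlock(m);
	}
}

static void *listener(void *unused)
{
	(void)unused;
	pthread_cleanup_push(outer_cleanup, &demo_lock);
	pthread_cleanup_push(inner_unlock, &demo_lock);
	pthread_mutex_lock(&demo_lock);
	sem_post(&locked);
	pause();		/* cancellation point */
	pthread_cleanup_pop(1);
	pthread_cleanup_pop(1);
	return NULL;
}

/* Cancels the listener while it holds the lock and returns the order in
 * which the two handlers ran. */
static const char *cancel_and_report_order(void)
{
	pthread_t tid;
	void *res;
	int rc;

	sem_init(&locked, 0, 0);
	rc = pthread_create(&tid, NULL, listener, NULL);
	assert(rc == 0);
	sem_wait(&locked);
	pthread_cancel(tid);
	pthread_join(tid, &res);
	assert(res == PTHREAD_CANCELED);
	sem_destroy(&locked);
	return order;
}
```

Handlers run in the reverse of the order in which they were pushed, so
the unlock handler runs before the outer handler tries to take the mutex.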

Your other comment is also right; I will send another patch for it.
> Since you are now closing ux_sock in uxsock_cleanup(), you should remove the close(ux_sock) at the end of uxsock_listen(). pthread_cleanup_pop() will already take care of that.

Regards
Chongyun


* Re: [PATCH V3] multipathd: release uxsocket and resource when cancel thread
  2018-01-17  2:04 Wuchongyun
@ 2018-01-17 18:59 ` Benjamin Marzinski
  0 siblings, 0 replies; 9+ messages in thread
From: Benjamin Marzinski @ 2018-01-17 18:59 UTC
  To: Wuchongyun
  Cc: Guozhonghua, dm-devel@redhat.com, Changwei Ge, Changlimin,
	Martin Wilck

On Wed, Jan 17, 2018 at 02:04:19AM +0000, Wuchongyun wrote:
> On Tue, Jan 17, 2018 at 06:39:20PM +0100, Benjamin Marzinski wrote:
> > On Tue, Jan 16, 2018 at 02:19:20PM +0100, Martin Wilck wrote:
> > > On Tue, 2018-01-16 at 11:48 +0000, Wuchongyun wrote:
> > > > Hi Martin,
> > > > Sorry to forget that, actually I found that dead_client() will not 
> > > > be interrupt by thread cancle, because after all dead_client() 
> > > > calling point be done then handle_signals() have chance to be called 
> > > > by
> > > > uxsock_listen() which will call exit_daemon() and send cancel 
> > > > threads signal to all child process include uxlsnr.
> > > 
> > > Fair enough.
> > > 
> > > > But your comments is good can make code more safer. Below is the new 
> > > > patch, please have a look, thanks.
> > > 
> > > I think it's really safer whis way, should anyone see the need to 
> >>  cancel the listener thread from another point in the code.
> 
> > I'm confused why this is safe. After uxsock_listen() calls exit_daemon() from handle_signals(), it doesn't exit. It loops around and polls again, and could in theory find a client that has died.  In fact if the client is killing multipathd via
> > # multipathd shutdown
> > instead of a signal, won't it be very likely that it will find a dead client when it loops right after calling exit_daemon() in cli_shutdown()? This could hit the deadlock that you noticed, where
> > uxsock_cleanup() can't run because dead_client() already holding the mutex.
> > Or am I missing something here?
> 
> Hi Benjiamin,
> Thanks for your comments below are my rely, thanks.
> 
> You really found the scenario which need to add pthread_cleanup_push(cleanup_lock, &client_lock) before get lock in dead_client to avoid the dead lock:
> If the client is killing multipathd via multipathd shutdown and it find a dead client when it loops right after calling exit_daemon() in cli_shutdown(), This will not hit deadlock because in dead_client before get lock we call pthread_cleanup_push(cleanup_lock, &client_lock) first, then thread cancelation happened, thread cleanup functions been pop up by reverse order: cleanup_lock() first, then uxsock_cleanup(), which make sure client_lock been release before calling uxsock_cleanup(). So it's safer in this way.

I thought Martin was asking you to not add the pthread_cleanup_push/pop
in dead_client, and I was trying to argue that they were necessary. But
I think I just made the conversation more muddled.

So yes, please add the pthread_cleanup_push/pop in dead_client. It does
make the program safer. Sorry if I'm just restating what everyone had
already agreed to.

Thanks.
-Ben

> 
> And your next comment is right I will make another patch for this.
> >Since you are now closing ux_sock in uxsock_cleanup(), you should remove the close(ux_sock) at the end of uxsock_listen(). pthread_cleanup_pop() will already take care of that.
> 
> Regards
> Chongyun
