xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* Shutdown problems in xs.c
@ 2010-05-11 22:01 Jeremy Fitzhardinge
  2010-05-11 22:02 ` [PATCH 1 of 2] xs: make sure mutexes are cleaned up and memory freed if the read thread is cancelled Jeremy Fitzhardinge
  2010-05-11 22:03 ` [PATCH 2 of 2] xs: avoid pthread_join deadlock in xs_daemon_close Jeremy Fitzhardinge
  0 siblings, 2 replies; 3+ messages in thread
From: Jeremy Fitzhardinge @ 2010-05-11 22:01 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Xen-devel, Stefano Stabellini

I've been getting deadlocks in xl, particularly "xl destroy".  It turns
out the main thread is stuck in a pthread_join while holding all the
mutexes, while the xenstore reading thread is stuck in a
pthread_mutex_lock before it can get to a cancellation point and exit.

This looks like it is a very long-standing deadlock (the code in
question mostly dates back to 2005), but perhaps something has changed
that makes it more likely to happen.  I think the original intention of
the code was to hold all the mutexes while doing the cancel/join to
avoid cancelling while the reader is holding any mutexes.  This fails
when the reader loop is not holding any, but needs to take one before
getting to a cancellation point (pthread_mutex_lock is not itself a
cancellation point).

The following two patches address it by 1) making sure that the read
thread has sufficient pthread cleanup handlers to free any
allocated-but-unused memory and release the mutexes when cancelled, and
2) do the pthread cancel/join while not holding any mutexes.

Thanks,
    J

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH 1 of 2] xs: make sure mutexes are cleaned up and memory freed if the read thread is cancelled
  2010-05-11 22:01 Shutdown problems in xs.c Jeremy Fitzhardinge
@ 2010-05-11 22:02 ` Jeremy Fitzhardinge
  2010-05-11 22:03 ` [PATCH 2 of 2] xs: avoid pthread_join deadlock in xs_daemon_close Jeremy Fitzhardinge
  1 sibling, 0 replies; 3+ messages in thread
From: Jeremy Fitzhardinge @ 2010-05-11 22:02 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Xen-devel, Stefano Stabellini

If the read thread is terminated with pthread cancel, it must make sure
all memory is freed and mutexes are unlocked.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

diff -r d77a88f938c6 tools/xenstore/xs.c
--- a/tools/xenstore/xs.c	Tue May 11 14:05:28 2010 +0100
+++ b/tools/xenstore/xs.c	Tue May 11 14:55:14 2010 -0700
@@ -85,6 +85,8 @@
 #define mutex_unlock(m)		pthread_mutex_unlock(m)
 #define condvar_signal(c)	pthread_cond_signal(c)
 #define condvar_wait(c,m,hnd)	pthread_cond_wait(c,m)
+#define cleanup_push(f, a)	pthread_cleanup_push((void (*)(void *))(f), (void *)(a))
+#define cleanup_pop(run)	pthread_cleanup_pop(run)
 
 static void *read_thread(void *arg);
 
@@ -102,6 +104,8 @@
 #define mutex_unlock(m)		((void)0)
 #define condvar_signal(c)	((void)0)
 #define condvar_wait(c,m,hnd)	read_message(hnd)
+#define cleanup_push(f, a)	((void)0)
+#define cleanup_pop(run)	((void)0)
 
 #endif
 
@@ -262,7 +266,6 @@
 
 #ifdef USE_PTHREAD
 	if (h->read_thr_exists) {
-		/* XXX FIXME: May leak an unpublished message buffer. */
 		pthread_cancel(h->read_thr);
 		pthread_join(h->read_thr, NULL);
 	}
@@ -860,44 +863,53 @@
 {
 	struct xs_stored_msg *msg = NULL;
 	char *body = NULL;
-	int saved_errno;
+	int saved_errno = 0;
+	int ret = -1;
 
 	/* Allocate message structure and read the message header. */
 	msg = malloc(sizeof(*msg));
 	if (msg == NULL)
 		goto error;
-	if (!read_all(h->fd, &msg->hdr, sizeof(msg->hdr)))
-		goto error;
+	cleanup_push(free, msg);
+	if (!read_all(h->fd, &msg->hdr, sizeof(msg->hdr))) { /* Cancellation point */
+		saved_errno = errno;
+		goto error_freemsg;
+	}
 
 	/* Allocate and read the message body. */
 	body = msg->body = malloc(msg->hdr.len + 1);
 	if (body == NULL)
-		goto error;
-	if (!read_all(h->fd, body, msg->hdr.len))
-		goto error;
+		goto error_freemsg;
+	cleanup_push(free, body);
+	if (!read_all(h->fd, body, msg->hdr.len)) { /* Cancellation point */
+		saved_errno = errno;
+		goto error_freebody;
+	}
+
 	body[msg->hdr.len] = '\0';
 
 	if (msg->hdr.type == XS_WATCH_EVENT) {
 		mutex_lock(&h->watch_mutex);
+		cleanup_push(pthread_mutex_unlock, &h->watch_mutex);
 
 		/* Kick users out of their select() loop. */
 		if (list_empty(&h->watch_list) &&
 		    (h->watch_pipe[1] != -1))
-			while (write(h->watch_pipe[1], body, 1) != 1)
+			while (write(h->watch_pipe[1], body, 1) != 1) /* Cancellation point */
 				continue;
 
 		list_add_tail(&msg->list, &h->watch_list);
 
 		condvar_signal(&h->watch_condvar);
 
-		mutex_unlock(&h->watch_mutex);
+		cleanup_pop(1);
 	} else {
 		mutex_lock(&h->reply_mutex);
 
 		/* There should only ever be one response pending! */
 		if (!list_empty(&h->reply_list)) {
 			mutex_unlock(&h->reply_mutex);
-			goto error;
+			goto error_freebody;
 		}
 
 		list_add_tail(&msg->list, &h->reply_list);
@@ -906,14 +918,16 @@
 		mutex_unlock(&h->reply_mutex);
 	}
 
-	return 0;
+	ret = 0;
 
- error:
-	saved_errno = errno;
-	free(msg);
-	free(body);
+error_freebody:
+	cleanup_pop(ret == -1);
+error_freemsg:
+	cleanup_pop(ret == -1);
+error:
 	errno = saved_errno;
-	return -1;
+
+	return ret;
 }
 
 #ifdef USE_PTHREAD

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH 2 of 2] xs: avoid pthread_join deadlock in xs_daemon_close
  2010-05-11 22:01 Shutdown problems in xs.c Jeremy Fitzhardinge
  2010-05-11 22:02 ` [PATCH 1 of 2] xs: make sure mutexes are cleaned up and memory freed if the read thread is cancelled Jeremy Fitzhardinge
@ 2010-05-11 22:03 ` Jeremy Fitzhardinge
  1 sibling, 0 replies; 3+ messages in thread
From: Jeremy Fitzhardinge @ 2010-05-11 22:03 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Xen-devel, Stefano Stabellini

Doing a pthread_cancel and join on the reader thread while holding all
the request/reply/watch mutexes can deadlock if the thread needs to
take any of those mutexes to exit.  Kill off the reader thread before
taking any mutexes (which should be redundant if we're  single-threaded
at that point).

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

diff -r 84ee0559ddc1 tools/xenstore/xs.c
--- a/tools/xenstore/xs.c	Tue May 11 14:55:14 2010 -0700
+++ b/tools/xenstore/xs.c	Tue May 11 14:55:20 2010 -0700
@@ -260,10 +260,6 @@
 
 void xs_daemon_close(struct xs_handle *h)
 {
-	mutex_lock(&h->request_mutex);
-	mutex_lock(&h->reply_mutex);
-	mutex_lock(&h->watch_mutex);
-
 #ifdef USE_PTHREAD
 	if (h->read_thr_exists) {
 		pthread_cancel(h->read_thr);
@@ -271,6 +267,10 @@
 	}
 #endif
 
+	mutex_lock(&h->request_mutex);
+	mutex_lock(&h->reply_mutex);
+	mutex_lock(&h->watch_mutex);
+
         close_free_msgs(h);
 
 	mutex_unlock(&h->request_mutex);

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2010-05-11 22:03 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-05-11 22:01 Shutdown problems in xs.c Jeremy Fitzhardinge
2010-05-11 22:02 ` [PATCH 1 of 2] xs: make sure mutexes are cleaned up and memory freed if the read thread is cancelled Jeremy Fitzhardinge
2010-05-11 22:03 ` [PATCH 2 of 2] xs: avoid pthread_join deadlock in xs_daemon_close Jeremy Fitzhardinge

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).