xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4
@ 2014-02-03 16:14 Ian Jackson
  2014-02-03 16:14 ` [PATCH 01/18] libxl: fork: Break out checked_waitpid Ian Jackson
                   ` (19 more replies)
  0 siblings, 20 replies; 36+ messages in thread
From: Ian Jackson @ 2014-02-03 16:14 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Jim Fehlig, Ian Campbell

This is the latest version of my libxl event fixes apropos of Jim's
libvirt testing.

  at  01/18 libxl: fork: Break out checked_waitpid
  at  02/18 libxl: fork: Break out childproc_reaped_ours
  at  03/18 libxl: fork: Clarify docs for libxl_sigchld_owner
  at  04/18 libxl: fork: Document libxl_sigchld_owner_libxl better
  at  05/18 libxl: fork: assert that chldmode is right
  at  06/18 libxl: fork: Provide libxl_childproc_sigchld_occurred
  at  07/18 libxl: fork: Provide ..._always_selective_reap
  at  08/18 libxl: fork: Provide LIBXL_HAVE_SIGCHLD_SELECTIVE_REAP
  at  09/18 libxl: fork: Rename sigchld handler functions
  at  10/18 libxl: fork: Break out sigchld_installhandler_core
 * t  11/18 libxl: fork: Break out sigchld_sethandler_raw
 1at  12/18 libxl: fork: Share SIGCHLD handler amongst ctxs
 +at  13/18 libxl: events: Break out libxl__pipe_nonblock, _close
  at  14/18 libxl: fork: Make SIGCHLD self-pipe nonblocking
 N    15/18 libxl: events: Makefile builds internal unit tests
 N    16/18 libxl: events: timedereg internal unit test
 n    17/18 libxl: timeouts: Break out time_occurs
 n    18/18 libxl: timeouts: Record deregistration when one occurs

Notes:
   a    acked by Ian Campbell
   +    modified description in this patch
   1    this patch combined out of two patches which were part
         of "v2.1"; consequently modified in v3 since v2
   N    entirely new patch
   n    new in this series, but previously passed to Jim for testing
   t    AIUI up to here has been tested by Jim and improves (but does
        not entirely eliminate) the problems

Ian.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 01/18] libxl: fork: Break out checked_waitpid
  2014-02-03 16:14 [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
@ 2014-02-03 16:14 ` Ian Jackson
  2014-02-03 16:14 ` [PATCH 02/18] libxl: fork: Break out childproc_reaped_ours Ian Jackson
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Ian Jackson @ 2014-02-03 16:14 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Jim Fehlig, Ian Jackson, Ian Campbell

This is a simple error-handling wrapper for waitpid.  We're going to
want to call waitpid somewhere else and this avoids some of the
duplication.

No functional change in this patch.  (Technically, we used to check
chldmode_ours again in the EINTR case, and don't now, but that can't
have changed because we continuously hold the libxl ctx lock.)

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_fork.c |   26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index 4ae9f94..2252370 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -155,6 +155,22 @@ int libxl__carefd_fd(const libxl__carefd *cf)
  * Actual child process handling
  */
 
+/* Like waitpid(,,WNOHANG) but handles all errors except ECHILD. */
+static pid_t checked_waitpid(libxl__egc *egc, pid_t want, int *status)
+{
+    for (;;) {
+        pid_t got = waitpid(want, status, WNOHANG);
+        if (got != -1)
+            return got;
+        if (errno == ECHILD)
+            return got;
+        if (errno == EINTR)
+            continue;
+        LIBXL__EVENT_DISASTER(egc, "waitpid() failed", errno, 0);
+        return 0;
+    }
+}
+
 static void sigchld_selfpipe_handler(libxl__egc *egc, libxl__ev_fd *ev,
                                      int fd, short events, short revents);
 
@@ -331,16 +347,10 @@ static void sigchld_selfpipe_handler(libxl__egc *egc, libxl__ev_fd *ev,
 
     while (chldmode_ours(CTX, 0) /* in case the app changes the mode */) {
         int status;
-        pid_t pid = waitpid(-1, &status, WNOHANG);
-
-        if (pid == 0) return;
+        pid_t pid = checked_waitpid(egc, -1, &status);
 
-        if (pid == -1) {
-            if (errno == ECHILD) return;
-            if (errno == EINTR) continue;
-            LIBXL__EVENT_DISASTER(egc, "waitpid() failed", errno, 0);
+        if (pid == 0 || pid == -1 /* ECHILD */)
             return;
-        }
 
         int rc = childproc_reaped(egc, pid, status);
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 02/18] libxl: fork: Break out childproc_reaped_ours
  2014-02-03 16:14 [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
  2014-02-03 16:14 ` [PATCH 01/18] libxl: fork: Break out checked_waitpid Ian Jackson
@ 2014-02-03 16:14 ` Ian Jackson
  2014-02-03 16:14 ` [PATCH 03/18] libxl: fork: Clarify docs for libxl_sigchld_owner Ian Jackson
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Ian Jackson @ 2014-02-03 16:14 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Jim Fehlig, Ian Jackson, Ian Campbell

We're going to want to do this again at a new call site.

No functional change.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
---
 tools/libxl/libxl_fork.c |   12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index 2252370..7b84765 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -290,6 +290,14 @@ static int perhaps_installhandler(libxl__gc *gc, bool creating)
     return 0;
 }
 
+static void childproc_reaped_ours(libxl__egc *egc, libxl__ev_child *ch,
+                                 int status)
+{
+    LIBXL_LIST_REMOVE(ch, entry);
+    ch->pid = -1;
+    ch->callback(egc, ch, ch->pid, status);
+}
+
 static int childproc_reaped(libxl__egc *egc, pid_t pid, int status)
 {
     EGC_GC;
@@ -303,9 +311,7 @@ static int childproc_reaped(libxl__egc *egc, pid_t pid, int status)
     return ERROR_UNKNOWN_CHILD;
 
  found:
-    LIBXL_LIST_REMOVE(ch, entry);
-    ch->pid = -1;
-    ch->callback(egc, ch, pid, status);
+    childproc_reaped_ours(egc, ch, status);
 
     perhaps_removehandler(gc);
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 03/18] libxl: fork: Clarify docs for libxl_sigchld_owner
  2014-02-03 16:14 [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
  2014-02-03 16:14 ` [PATCH 01/18] libxl: fork: Break out checked_waitpid Ian Jackson
  2014-02-03 16:14 ` [PATCH 02/18] libxl: fork: Break out childproc_reaped_ours Ian Jackson
@ 2014-02-03 16:14 ` Ian Jackson
  2014-02-03 16:14 ` [PATCH 04/18] libxl: fork: Document libxl_sigchld_owner_libxl better Ian Jackson
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Ian Jackson @ 2014-02-03 16:14 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Jim Fehlig, Ian Jackson, Ian Campbell

Clarify that libxl_sigchld_owner_libxl causes libxl to reap all the
process's children, and clarify the wording of the description of
libxl_sigchld_owner_libxl_always.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_event.h |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h
index 6261f99..ff0b2fa 100644
--- a/tools/libxl/libxl_event.h
+++ b/tools/libxl/libxl_event.h
@@ -467,7 +467,8 @@ void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl)
 
 
 typedef enum {
-    /* libxl owns SIGCHLD whenever it has a child. */
+    /* libxl owns SIGCHLD whenever it has a child, and reaps
+     * all children, including those not spawned by libxl. */
     libxl_sigchld_owner_libxl,
 
     /* Application promises to call libxl_childproc_exited but NOT
@@ -476,7 +477,7 @@ typedef enum {
     libxl_sigchld_owner_mainloop,
 
     /* libxl owns SIGCHLD all the time, and the application is
-     * relying on libxl's event loop for reaping its own children. */
+     * relying on libxl's event loop for reaping its children too. */
     libxl_sigchld_owner_libxl_always,
 } libxl_sigchld_owner;
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 04/18] libxl: fork: Document libxl_sigchld_owner_libxl better
  2014-02-03 16:14 [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
                   ` (2 preceding siblings ...)
  2014-02-03 16:14 ` [PATCH 03/18] libxl: fork: Clarify docs for libxl_sigchld_owner Ian Jackson
@ 2014-02-03 16:14 ` Ian Jackson
  2014-02-03 16:14 ` [PATCH 05/18] libxl: fork: assert that chldmode is right Ian Jackson
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Ian Jackson @ 2014-02-03 16:14 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Jim Fehlig, Ian Jackson, Ian Campbell

libxl_sigchld_owner_libxl ought to have been mentioned in the list of
options for chldowner.  Since it's the default, move the description
of the its behaviour into the description of that option.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
---
 tools/libxl/libxl_event.h |   27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h
index ff0b2fa..4f72c4b 100644
--- a/tools/libxl/libxl_event.h
+++ b/tools/libxl/libxl_event.h
@@ -442,9 +442,26 @@ void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl)
  * For programs which run their own children alongside libxl's:
  *
  *     A program which does this must call libxl_childproc_setmode.
- *     There are two options:
+ *     There are three options:
  * 
+ *     libxl_sigchld_owner_libxl:
+ *
+ *       While any libxl operation which might use child processes
+ *       is running, works like libxl_sigchld_owner_libxl_always;
+ *       but, deinstalls the handler the rest of the time.
+ *
+ *       In this mode, the application, while it uses any libxl
+ *       operation which might create or use child processes (see
+ *       above):
+ *           - Must not have any child processes running.
+ *           - Must not install a SIGCHLD handler.
+ *           - Must not reap any children.
+ *
+ *       This is the default (i.e. if setmode is not called, or 0 is
+ *       passed for hooks).
+ *
  *     libxl_sigchld_owner_mainloop:
+ *
  *       The application must install a SIGCHLD handler and reap (at
  *       least) all of libxl's children and pass their exit status to
  *       libxl by calling libxl_childproc_exited.  (If the application
@@ -452,17 +469,11 @@ void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl)
  *       on each ctx.)
  *
  *     libxl_sigchld_owner_libxl_always:
+ *
  *       The application expects libxl to reap all of its children,
  *       and provides a callback to be notified of their exit
  *       statues.  The application must have only one libxl_ctx
  *       configured this way.
- *
- * An application which fails to call setmode, or which passes 0 for
- * hooks, while it uses any libxl operation which might
- * create or use child processes (see above):
- *   - Must not have any child processes running.
- *   - Must not install a SIGCHLD handler.
- *   - Must not reap any children.
  */
 
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 05/18] libxl: fork: assert that chldmode is right
  2014-02-03 16:14 [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
                   ` (3 preceding siblings ...)
  2014-02-03 16:14 ` [PATCH 04/18] libxl: fork: Document libxl_sigchld_owner_libxl better Ian Jackson
@ 2014-02-03 16:14 ` Ian Jackson
  2014-02-03 16:14 ` [PATCH 06/18] libxl: fork: Provide libxl_childproc_sigchld_occurred Ian Jackson
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Ian Jackson @ 2014-02-03 16:14 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Jim Fehlig, Ian Jackson, Ian Campbell

In libxl_childproc_reaped, check that the chldmode is as expected.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_fork.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index 7b84765..85db2fb 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -322,6 +322,8 @@ int libxl_childproc_reaped(libxl_ctx *ctx, pid_t pid, int status)
 {
     EGC_INIT(ctx);
     CTX_LOCK;
+    assert(CTX->childproc_hooks->chldowner
+           == libxl_sigchld_owner_mainloop);
     int rc = childproc_reaped(egc, pid, status);
     CTX_UNLOCK;
     EGC_FREE;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 06/18] libxl: fork: Provide libxl_childproc_sigchld_occurred
  2014-02-03 16:14 [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
                   ` (4 preceding siblings ...)
  2014-02-03 16:14 ` [PATCH 05/18] libxl: fork: assert that chldmode is right Ian Jackson
@ 2014-02-03 16:14 ` Ian Jackson
  2014-02-03 16:14 ` [PATCH 07/18] libxl: fork: Provide ..._always_selective_reap Ian Jackson
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Ian Jackson @ 2014-02-03 16:14 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Jim Fehlig, Ian Jackson, Ian Campbell

Applications exist which don't keep track of all their child processes
in a manner suitable for coherent dispatch of their termination.  In
such a situation, nothing in the whole process may call wait, or
waitpid(-1,,).  Doing so reaps processes belonging to other parts of
the application and there is then no way to deliver the exit status to
the right place.

To facilitate this, provide a facility for such an application to ask
libxl to call waitpid on each of its children individually.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_event.h |   29 +++++++++++++++++++++++++----
 tools/libxl/libxl_fork.c  |   45 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+), 4 deletions(-)

diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h
index 4f72c4b..3c93955 100644
--- a/tools/libxl/libxl_event.h
+++ b/tools/libxl/libxl_event.h
@@ -482,9 +482,10 @@ typedef enum {
      * all children, including those not spawned by libxl. */
     libxl_sigchld_owner_libxl,
 
-    /* Application promises to call libxl_childproc_exited but NOT
-     * from within a signal handler.  libxl will not itself arrange to
-     * (un)block or catch SIGCHLD. */
+    /* Application promises to discover when SIGCHLD occurs and call
+     * libxl_childproc_exited or libxl_childproc_sigchld_occurred (but
+     * NOT from within a signal handler).  libxl will not itself
+     * arrange to (un)block or catch SIGCHLD. */
     libxl_sigchld_owner_mainloop,
 
     /* libxl owns SIGCHLD all the time, and the application is
@@ -542,7 +543,8 @@ void libxl_childproc_setmode(libxl_ctx *ctx, const libxl_childproc_hooks *hooks,
 
 /*
  * This function is for an application which owns SIGCHLD and which
- * therefore reaps all of the process's children.
+ * reaps all of the process's children, and dispatches the exit status
+ * to the correct place inside the application.
  *
  * May be called only by an application which has called setmode with
  * chldowner == libxl_sigchld_owner_mainloop.  If pid was a process started
@@ -558,6 +560,25 @@ void libxl_childproc_setmode(libxl_ctx *ctx, const libxl_childproc_hooks *hooks,
 int libxl_childproc_reaped(libxl_ctx *ctx, pid_t, int status)
                            LIBXL_EXTERNAL_CALLERS_ONLY;
 
+/*
+ * This function is for an application which owns SIGCHLD but which
+ * doesn't keep track of all of its own children in a manner suitable
+ * for reaping all of them and then dispatching them.
+ *
+ * Such an the application must notify libxl, by calling this
+ * function, that a SIGCHLD occurred.  libxl will then check all its
+ * children, reap any that are ready, and take any action necessary -
+ * but it will not reap anything else.
+ *
+ * May be called only by an application which has called setmode with
+ * chldowner == libxl_sigchld_owner_mainloop.
+ *
+ * May NOT be called from within a signal handler which might
+ * interrupt any libxl operation (just like libxl_childproc_reaped).
+ */
+void libxl_childproc_sigchld_occurred(libxl_ctx *ctx)
+                           LIBXL_EXTERNAL_CALLERS_ONLY;
+
 
 /*
  * An application which initialises a libxl_ctx in a parent process
diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index 85db2fb..b2325e0 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -330,6 +330,51 @@ int libxl_childproc_reaped(libxl_ctx *ctx, pid_t pid, int status)
     return rc;
 }
 
+static void childproc_checkall(libxl__egc *egc)
+{
+    EGC_GC;
+    libxl__ev_child *ch;
+
+    for (;;) {
+        int status;
+        pid_t got;
+
+        LIBXL_LIST_FOREACH(ch, &CTX->children, entry) {
+            got = checked_waitpid(egc, ch->pid, &status);
+            if (got)
+                goto found;
+        }
+        /* not found */
+        return;
+
+    found:
+        if (got == -1) {
+            LIBXL__EVENT_DISASTER
+                (egc, "waitpid() gave ECHILD but we have a child",
+                 ECHILD, 0);
+            /* it must have finished but we don't know its status */
+            status = 255<<8; /* no wait.h macro for this! */
+            assert(WIFEXITED(status));
+            assert(WEXITSTATUS(status)==255);
+            assert(!WIFSIGNALED(status));
+            assert(!WIFSTOPPED(status));
+        }
+        childproc_reaped_ours(egc, ch, status);
+        /* we need to restart the loop, as children may have been edited */
+    }
+}
+
+void libxl_childproc_sigchld_occurred(libxl_ctx *ctx)
+{
+    EGC_INIT(ctx);
+    CTX_LOCK;
+    assert(CTX->childproc_hooks->chldowner
+           == libxl_sigchld_owner_mainloop);
+    childproc_checkall(egc);
+    CTX_UNLOCK;
+    EGC_FREE;
+}
+
 static void sigchld_selfpipe_handler(libxl__egc *egc, libxl__ev_fd *ev,
                                      int fd, short events, short revents)
 {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 07/18] libxl: fork: Provide ..._always_selective_reap
  2014-02-03 16:14 [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
                   ` (5 preceding siblings ...)
  2014-02-03 16:14 ` [PATCH 06/18] libxl: fork: Provide libxl_childproc_sigchld_occurred Ian Jackson
@ 2014-02-03 16:14 ` Ian Jackson
  2014-02-03 16:14 ` [PATCH 08/18] libxl: fork: Provide LIBXL_HAVE_SIGCHLD_SELECTIVE_REAP Ian Jackson
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Ian Jackson @ 2014-02-03 16:14 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Jim Fehlig, Ian Jackson, Ian Campbell

Applications exist which want to use libxl in an event-driven mode but
which do not integrate child termination into their event system, but
instead reap all their own children synchronously.

In such an application libxl must own SIGCHLD but avoid reaping any
children that don't belong to libxl.

Provide libxl_sigchld_owner_libxl_always_selective_reap which has this
behaviour.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>

---
v2: Document the new mode in the big "Subprocess handling" comment.
---
 tools/libxl/libxl_event.h |   11 +++++++++++
 tools/libxl/libxl_fork.c  |    7 +++++++
 2 files changed, 18 insertions(+)

diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h
index 3c93955..824ac88 100644
--- a/tools/libxl/libxl_event.h
+++ b/tools/libxl/libxl_event.h
@@ -474,6 +474,12 @@ void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl)
  *       and provides a callback to be notified of their exit
  *       statues.  The application must have only one libxl_ctx
  *       configured this way.
+ *
+ *     libxl_sigchld_owner_libxl_always_selective_reap:
+ *
+ *       The application expects to reap all of its own children
+ *       synchronously, and does not use SIGCHLD.  libxl is
+ *       to install a SIGCHLD handler.
  */
 
 
@@ -491,6 +497,11 @@ typedef enum {
     /* libxl owns SIGCHLD all the time, and the application is
      * relying on libxl's event loop for reaping its children too. */
     libxl_sigchld_owner_libxl_always,
+
+    /* libxl owns SIGCHLD all the time, but it must only reap its own
+     * children.  The application will reap its own children
+     * synchronously with waitpid, without the assistance of SIGCHLD. */
+    libxl_sigchld_owner_libxl_always_selective_reap,
 } libxl_sigchld_owner;
 
 typedef struct {
diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index b2325e0..16e17f6 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -268,6 +268,7 @@ static bool chldmode_ours(libxl_ctx *ctx, bool creating)
     case libxl_sigchld_owner_mainloop:
         return 0;
     case libxl_sigchld_owner_libxl_always:
+    case libxl_sigchld_owner_libxl_always_selective_reap:
         return 1;
     }
     abort();
@@ -398,6 +399,12 @@ static void sigchld_selfpipe_handler(libxl__egc *egc, libxl__ev_fd *ev,
     int e = libxl__self_pipe_eatall(selfpipe);
     if (e) LIBXL__EVENT_DISASTER(egc, "read sigchld pipe", e, 0);
 
+    if (CTX->childproc_hooks->chldowner
+        == libxl_sigchld_owner_libxl_always_selective_reap) {
+        childproc_checkall(egc);
+        return;
+    }
+
     while (chldmode_ours(CTX, 0) /* in case the app changes the mode */) {
         int status;
         pid_t pid = checked_waitpid(egc, -1, &status);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 08/18] libxl: fork: Provide LIBXL_HAVE_SIGCHLD_SELECTIVE_REAP
  2014-02-03 16:14 [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
                   ` (6 preceding siblings ...)
  2014-02-03 16:14 ` [PATCH 07/18] libxl: fork: Provide ..._always_selective_reap Ian Jackson
@ 2014-02-03 16:14 ` Ian Jackson
  2014-02-03 16:14 ` [PATCH 09/18] libxl: fork: Rename sigchld handler functions Ian Jackson
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Ian Jackson @ 2014-02-03 16:14 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Jim Fehlig, Ian Jackson, Ian Campbell

This is the feature test macro for libxl_childproc_sigchld_occurred
and libxl_sigchld_owner_libxl_always_selective_reap.

It is split out into this separate patch because: a single feature
test is sensible because we do not intend anyone to release or ship
libxl versions with one of these but not the other; but, the two
features are in separate patches for clarity; and, this just makes
reading the actual code easier.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
---
 tools/libxl/libxl.h |   13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 12d6c31..1ac34c3 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -409,6 +409,19 @@
  */
 #define LIBXL_HAVE_DRIVER_DOMAIN_CREATION 1
 
+/*
+ * LIBXL_HAVE_SIGCHLD_SELECTIVE_REAP
+ *
+ * If this is defined:
+ *
+ * Firstly, the enum libxl_sigchld_owner (in libxl_event.h) has the
+ * value libxl_sigchld_owner_libxl_always_selective_reap which may be
+ * passed to libxl_childproc_setmode in hooks->chldmode.
+ *
+ * Secondly, the function libxl_childproc_sigchld_occurred exists.
+ */
+#define LIBXL_HAVE_SIGCHLD_OWNER_SELECTIVE_REAP 1
+
 /* Functions annotated with LIBXL_EXTERNAL_CALLERS_ONLY may not be
  * called from within libxl itself. Callers outside libxl, who
  * do not #include libxl_internal.h, are fine. */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 09/18] libxl: fork: Rename sigchld handler functions
  2014-02-03 16:14 [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
                   ` (7 preceding siblings ...)
  2014-02-03 16:14 ` [PATCH 08/18] libxl: fork: Provide LIBXL_HAVE_SIGCHLD_SELECTIVE_REAP Ian Jackson
@ 2014-02-03 16:14 ` Ian Jackson
  2014-02-03 16:14 ` [PATCH 10/18] libxl: fork: Break out sigchld_installhandler_core Ian Jackson
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Ian Jackson @ 2014-02-03 16:14 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Jim Fehlig, Ian Jackson, Ian Campbell

We are going to change these functions so that different libxl ctx's
can share a single SIGCHLD handler.  Rename them now to a new name
which doesn't imply unconditional handler installation or removal.

Also note in the comments that they are idempotent.

No functional change.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
---
 tools/libxl/libxl.c          |    2 +-
 tools/libxl/libxl_fork.c     |   22 +++++++++++-----------
 tools/libxl/libxl_internal.h |    4 ++--
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 2845ca4..4679b51 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -170,7 +170,7 @@ int libxl_ctx_free(libxl_ctx *ctx)
     /* If we have outstanding children, then the application inherits
      * them; we wish the application good luck with understanding
      * this if and when it reaps them. */
-    libxl__sigchld_removehandler(gc);
+    libxl__sigchld_notneeded(gc);
 
     if (ctx->sigchld_selfpipe[0] >= 0) {
         close(ctx->sigchld_selfpipe[0]);
diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index 16e17f6..a15af8e 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -194,7 +194,7 @@ static void sigchld_removehandler_core(void)
     sigchld_owner = 0;
 }
 
-void libxl__sigchld_removehandler(libxl__gc *gc) /* non-reentrant */
+void libxl__sigchld_notneeded(libxl__gc *gc) /* non-reentrant, idempotent */
 {
     int rc;
 
@@ -210,7 +210,7 @@ void libxl__sigchld_removehandler(libxl__gc *gc) /* non-reentrant */
     }
 }
 
-int libxl__sigchld_installhandler(libxl__gc *gc) /* non-reentrant */
+int libxl__sigchld_needed(libxl__gc *gc) /* non-reentrant, idempotent */
 {
     int r, rc;
 
@@ -274,18 +274,18 @@ static bool chldmode_ours(libxl_ctx *ctx, bool creating)
     abort();
 }
 
-static void perhaps_removehandler(libxl__gc *gc)
+static void perhaps_sigchld_notneeded(libxl__gc *gc)
 {
     if (!chldmode_ours(CTX, 0))
-        libxl__sigchld_removehandler(gc);
+        libxl__sigchld_notneeded(gc);
 }
 
-static int perhaps_installhandler(libxl__gc *gc, bool creating)
+static int perhaps_sigchld_needed(libxl__gc *gc, bool creating)
 {
     int rc;
 
     if (chldmode_ours(CTX, creating)) {
-        rc = libxl__sigchld_installhandler(gc);
+        rc = libxl__sigchld_needed(gc);
         if (rc) return rc;
     }
     return 0;
@@ -314,7 +314,7 @@ static int childproc_reaped(libxl__egc *egc, pid_t pid, int status)
  found:
     childproc_reaped_ours(egc, ch, status);
 
-    perhaps_removehandler(gc);
+    perhaps_sigchld_notneeded(gc);
 
     return 0;
 }
@@ -445,7 +445,7 @@ pid_t libxl__ev_child_fork(libxl__gc *gc, libxl__ev_child *ch,
     CTX_LOCK;
     int rc;
 
-    perhaps_installhandler(gc, 1);
+    perhaps_sigchld_needed(gc, 1);
 
     pid_t pid =
         CTX->childproc_hooks->fork_replacement
@@ -473,7 +473,7 @@ pid_t libxl__ev_child_fork(libxl__gc *gc, libxl__ev_child *ch,
     rc = pid;
 
  out:
-    perhaps_removehandler(gc);
+    perhaps_sigchld_notneeded(gc);
     CTX_UNLOCK;
     return rc;
 }
@@ -492,8 +492,8 @@ void libxl_childproc_setmode(libxl_ctx *ctx, const libxl_childproc_hooks *hooks,
     ctx->childproc_hooks = hooks;
     ctx->childproc_user = user;
 
-    perhaps_removehandler(gc);
-    perhaps_installhandler(gc, 0); /* idempotent, ok to ignore errors for now */
+    perhaps_sigchld_notneeded(gc);
+    perhaps_sigchld_needed(gc, 0); /* idempotent, ok to ignore errors for now */
 
     CTX_UNLOCK;
     GC_FREE;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 1bd23ff..fba681d 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -838,8 +838,8 @@ _hidden void libxl__poller_wakeup(libxl__egc *egc, libxl__poller *p);
 
 /* Internal to fork and child reaping machinery */
 extern const libxl_childproc_hooks libxl__childproc_default_hooks;
-int libxl__sigchld_installhandler(libxl__gc*); /* non-reentrant; logs errs */
-void libxl__sigchld_removehandler(libxl__gc*); /* non-reentrant */
+int libxl__sigchld_needed(libxl__gc*); /* non-reentrant idempotent, logs errs */
+void libxl__sigchld_notneeded(libxl__gc*); /* non-reentrant idempotent */
 int libxl__self_pipe_wakeup(int fd); /* returns 0 or -1 setting errno */
 int libxl__self_pipe_eatall(int fd); /* returns 0 or -1 setting errno */
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 10/18] libxl: fork: Break out sigchld_installhandler_core
  2014-02-03 16:14 [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
                   ` (8 preceding siblings ...)
  2014-02-03 16:14 ` [PATCH 09/18] libxl: fork: Rename sigchld handler functions Ian Jackson
@ 2014-02-03 16:14 ` Ian Jackson
  2014-02-03 16:14 ` [PATCH 11/18] libxl: fork: Break out sigchld_sethandler_raw Ian Jackson
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Ian Jackson @ 2014-02-03 16:14 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Jim Fehlig, Ian Jackson, Ian Campbell

Pure code motion.  This is going to make the final substantive patch
easier to read.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
---
 tools/libxl/libxl_fork.c |   38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)

diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index a15af8e..ce8e8eb 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -194,6 +194,27 @@ static void sigchld_removehandler_core(void)
     sigchld_owner = 0;
 }
 
+static void sigchld_installhandler_core(libxl__gc *gc)
+{
+    struct sigaction ours;
+    int r;
+
+    assert(!sigchld_owner);
+    sigchld_owner = CTX;
+
+    memset(&ours,0,sizeof(ours));
+    ours.sa_handler = sigchld_handler;
+    sigemptyset(&ours.sa_mask);
+    ours.sa_flags = SA_NOCLDSTOP | SA_RESTART;
+    r = sigaction(SIGCHLD, &ours, &sigchld_saved_action);
+    assert(!r);
+
+    assert(((void)"application must negotiate with libxl about SIGCHLD",
+            !(sigchld_saved_action.sa_flags & SA_SIGINFO) &&
+            (sigchld_saved_action.sa_handler == SIG_DFL ||
+             sigchld_saved_action.sa_handler == SIG_IGN)));
+}
+
 void libxl__sigchld_notneeded(libxl__gc *gc) /* non-reentrant, idempotent */
 {
     int rc;
@@ -236,22 +257,7 @@ int libxl__sigchld_needed(libxl__gc *gc) /* non-reentrant, idempotent */
 
     atfork_lock();
     if (sigchld_owner != CTX) {
-        struct sigaction ours;
-
-        assert(!sigchld_owner);
-        sigchld_owner = CTX;
-
-        memset(&ours,0,sizeof(ours));
-        ours.sa_handler = sigchld_handler;
-        sigemptyset(&ours.sa_mask);
-        ours.sa_flags = SA_NOCLDSTOP | SA_RESTART;
-        r = sigaction(SIGCHLD, &ours, &sigchld_saved_action);
-        assert(!r);
-
-        assert(((void)"application must negotiate with libxl about SIGCHLD",
-                !(sigchld_saved_action.sa_flags & SA_SIGINFO) &&
-                (sigchld_saved_action.sa_handler == SIG_DFL ||
-                 sigchld_saved_action.sa_handler == SIG_IGN)));
+        sigchld_installhandler_core(gc);
     }
     atfork_unlock();
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 11/18] libxl: fork: Break out sigchld_sethandler_raw
  2014-02-03 16:14 [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
                   ` (9 preceding siblings ...)
  2014-02-03 16:14 ` [PATCH 10/18] libxl: fork: Break out sigchld_installhandler_core Ian Jackson
@ 2014-02-03 16:14 ` Ian Jackson
  2014-02-06 13:53   ` Ian Campbell
  2014-02-03 16:14 ` [PATCH 12/18] libxl: fork: Share SIGCHLD handler amongst ctxs Ian Jackson
                   ` (8 subsequent siblings)
  19 siblings, 1 reply; 36+ messages in thread
From: Ian Jackson @ 2014-02-03 16:14 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Jim Fehlig, Ian Jackson, Ian Campbell

We are going to want introduce another call site in the final
substantive patch.

Pure code motion; no functional change.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>

---
v3: Remove now-unused variables from sigchld_installhandler_core
---
 tools/libxl/libxl_fork.c |   23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index ce8e8eb..084d86a 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -182,6 +182,19 @@ static void sigchld_handler(int signo)
     errno = esave;
 }
 
+static void sigchld_sethandler_raw(void (*handler)(int), struct sigaction *old)
+{
+    struct sigaction ours;
+    int r;
+
+    memset(&ours,0,sizeof(ours));
+    ours.sa_handler = handler;
+    sigemptyset(&ours.sa_mask);
+    ours.sa_flags = SA_NOCLDSTOP | SA_RESTART;
+    r = sigaction(SIGCHLD, &ours, old);
+    assert(!r);
+}
+
 static void sigchld_removehandler_core(void)
 {
     struct sigaction was;
@@ -196,18 +209,10 @@ static void sigchld_removehandler_core(void)
 
 static void sigchld_installhandler_core(libxl__gc *gc)
 {
-    struct sigaction ours;
-    int r;
-
     assert(!sigchld_owner);
     sigchld_owner = CTX;
 
-    memset(&ours,0,sizeof(ours));
-    ours.sa_handler = sigchld_handler;
-    sigemptyset(&ours.sa_mask);
-    ours.sa_flags = SA_NOCLDSTOP | SA_RESTART;
-    r = sigaction(SIGCHLD, &ours, &sigchld_saved_action);
-    assert(!r);
+    sigchld_sethandler_raw(sigchld_handler, &sigchld_saved_action);
 
     assert(((void)"application must negotiate with libxl about SIGCHLD",
             !(sigchld_saved_action.sa_flags & SA_SIGINFO) &&
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 12/18] libxl: fork: Share SIGCHLD handler amongst ctxs
  2014-02-03 16:14 [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
                   ` (10 preceding siblings ...)
  2014-02-03 16:14 ` [PATCH 11/18] libxl: fork: Break out sigchld_sethandler_raw Ian Jackson
@ 2014-02-03 16:14 ` Ian Jackson
  2014-02-03 16:14 ` [PATCH 13/18] libxl: events: Break out libxl__pipe_nonblock, _close Ian Jackson
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Ian Jackson @ 2014-02-03 16:14 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Jim Fehlig, Ian Jackson, Ian Campbell

Previously, an application which had multiple libxl ctxs in multiple
threads, would have to itself plumb SIGCHLD through to each ctx.
Instead, permit multiple libxl ctxs to all share the SIGCHLD handler.

We keep a list of all the ctxs which are interested in SIGCHLD and
notify all of their self-pipes.

In more detail:

 * sigchld_owner, the ctx* of the SIGCHLD owner, is replaced by
   sigchld_users, a list of SIGCHLD users.

 * Each ctx keeps track of whether it is on the users list, so that
   libxl__sigchld_needed and libxl__sigchld_notneeded now instead of
   idempotently installing and removing the handler, idempotently add
   or remove the ctx from the list.

   We ensure that we always have the SIGCHLD handler installed
   iff the sigchld_users list is nonempty.  To make this a bit
   easier we make sigchld_installhandler_core and
   sigchld_removehandler_core idempotent.

   Specifically, the call sites for sigchld_installhandler_core and
   sigchld_removehandler_core are updated to manipulate sigchld_users
   and only call the install or remove functions as applicable.

 * In the signal handler we walk the list of SIGCHLD users and write
   to each of their self-pipes.  That means that we need to arrange to
   defer SIGCHLD when we are manipulating the list (to avoid the
   signal handler interrupting our list manipulation); this is quite
   tiresome to arrange.

   The code as written will, on the first installation of the SIGCHLD
   handler, firstly install the real handler, then immediately replace
   it with the deferral handler.  Doing it this way makes the code
   clearer as it makes the SIGCHLD deferral machinery much more
   self-contained (and hence easier to reason about).

 * The first part of libxl__sigchld_notneeded is broken out into a new
   function sigchld_user_remove (which is also needed during for
   postfork).  And of course that first part of the function is now
   rather different, as explained above.

 * sigchld_installhandler_core no longer takes the gc argument,
   because it now deals with SIGCHLD for all ctxs.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>

---
v3: Include bugfixes from "Fixup SIGCHLD sharing" patch:

    * Use a mutex for defer_sigchld, to guard against concurrency
      between the thread calling defer_sigchld and an instance of the
      primary signal handler on another thread.

    * libxl_sigchld_owner_libxl_always is incompatible with SIGCHLD
      sharing.  Document this correctly.

    Fix "have have" error in comment.

    Move removal of newly unused variables to previous patch.

v2.1: Provide feature test macro LIBXL_HAVE_SIGCHLD_SHARING
---
 tools/libxl/libxl.h          |    9 +++
 tools/libxl/libxl_event.h    |   14 ++--
 tools/libxl/libxl_fork.c     |  153 ++++++++++++++++++++++++++++++++++++------
 tools/libxl/libxl_internal.h |    3 +
 4 files changed, 153 insertions(+), 26 deletions(-)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 1ac34c3..0b992d1 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -422,6 +422,15 @@
  */
 #define LIBXL_HAVE_SIGCHLD_OWNER_SELECTIVE_REAP 1
 
+/*
+ * LIBXL_HAVE_SIGCHLD_SHARING
+ *
+ * If this is defined, it is permissible for multiple libxl ctxs
+ * to simultaneously "own" SIGCHLD.  See "Subprocess handling"
+ * in libxl_event.h.
+ */
+#define LIBXL_HAVE_SIGCHLD_SHARING 1
+
 /* Functions annotated with LIBXL_EXTERNAL_CALLERS_ONLY may not be
  * called from within libxl itself. Callers outside libxl, who
  * do not #include libxl_internal.h, are fine. */
diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h
index 824ac88..ca43cb9 100644
--- a/tools/libxl/libxl_event.h
+++ b/tools/libxl/libxl_event.h
@@ -470,16 +470,18 @@ void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl)
  *
  *     libxl_sigchld_owner_libxl_always:
  *
- *       The application expects libxl to reap all of its children,
- *       and provides a callback to be notified of their exit
- *       statues.  The application must have only one libxl_ctx
- *       configured this way.
+ *       The application expects this libxl ctx to reap all of the
+ *       process's children, and provides a callback to be notified of
+ *       their exit statuses.  The application must have only one
+ *       libxl_ctx configured this way.
  *
  *     libxl_sigchld_owner_libxl_always_selective_reap:
  *
  *       The application expects to reap all of its own children
- *       synchronously, and does not use SIGCHLD.  libxl is
- *       to install a SIGCHLD handler.
+ *       synchronously, and does not use SIGCHLD.  libxl is to install
+ *       a SIGCHLD handler.  The application may have multiple
+ *       libxl_ctxs configured this way; in which case all of its ctxs
+ *       must be so configured.
  */
 
 
diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index 084d86a..2432512 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -46,11 +46,19 @@ static int atfork_registered;
 static LIBXL_LIST_HEAD(, libxl__carefd) carefds =
     LIBXL_LIST_HEAD_INITIALIZER(carefds);
 
-/* non-null iff installed, protected by no_forking */
-static libxl_ctx *sigchld_owner;
+/* Protected against concurrency by no_forking.  sigchld_users is
+ * protected against being interrupted by SIGCHLD (and thus read
+ * asynchronously by the signal handler) by sigchld_defer (see
+ * below). */
+static bool sigchld_installed; /* 0 means not */
+static pthread_mutex_t sigchld_defer_mutex = PTHREAD_MUTEX_INITIALIZER;
+static LIBXL_LIST_HEAD(, libxl_ctx) sigchld_users =
+    LIBXL_LIST_HEAD_INITIALIZER(sigchld_users);
 static struct sigaction sigchld_saved_action;
 
-static void sigchld_removehandler_core(void);
+static void sigchld_removehandler_core(void); /* idempotent */
+static void sigchld_user_remove(libxl_ctx *ctx); /* idempotent */
+static void sigchld_sethandler_raw(void (*handler)(int), struct sigaction *old);
 
 static void atfork_lock(void)
 {
@@ -126,8 +134,7 @@ void libxl_postfork_child_noexec(libxl_ctx *ctx)
     }
     LIBXL_LIST_INIT(&carefds);
 
-    if (sigchld_owner)
-        sigchld_removehandler_core();
+    sigchld_user_remove(ctx);
 
     atfork_unlock();
 }
@@ -152,7 +159,8 @@ int libxl__carefd_fd(const libxl__carefd *cf)
 }
 
 /*
- * Actual child process handling
+ * Low-level functions for child process handling, including
+ * the main SIGCHLD handler.
  */
 
 /* Like waitpid(,,WNOHANG) but handles all errors except ECHILD. */
@@ -176,9 +184,22 @@ static void sigchld_selfpipe_handler(libxl__egc *egc, libxl__ev_fd *ev,
 
 static void sigchld_handler(int signo)
 {
+    /* This function has to be reentrant!  Luckily it is. */
+
+    libxl_ctx *notify;
     int esave = errno;
-    int e = libxl__self_pipe_wakeup(sigchld_owner->sigchld_selfpipe[1]);
-    assert(!e); /* errors are probably EBADF, very bad */
+
+    int r = pthread_mutex_lock(&sigchld_defer_mutex);
+    assert(!r);
+
+    LIBXL_LIST_FOREACH(notify, &sigchld_users, sigchld_users_entry) {
+        int e = libxl__self_pipe_wakeup(notify->sigchld_selfpipe[1]);
+        assert(!e); /* errors are probably EBADF, very bad */
+    }
+
+    r = pthread_mutex_unlock(&sigchld_defer_mutex);
+    assert(!r);
+
     errno = esave;
 }
 
@@ -195,22 +216,89 @@ static void sigchld_sethandler_raw(void (*handler)(int), struct sigaction *old)
     assert(!r);
 }
 
-static void sigchld_removehandler_core(void)
+/*
+ * SIGCHLD deferral
+ *
+ * sigchld_defer and sigchld_release are a bit like using sigprocmask
+ * to block the signal only they work for the whole process.  Sadly
+ * this has to be done by setting a special handler that records the
+ * "pendingness" of the signal here in the program.  How tedious.
+ *
+ * A property of this approach is that the signal handler itself
+ * must be reentrant (see the comment in release_sigchld).
+ *
+ * Callers have the atfork_lock so there is no risk of concurrency
+ * within these functions, aside from the risk of being interrupted by
+ * the signal.  We use sigchld_defer_mutex to guard against the
+ * possibility of the real signal handler being still running on
+ * another thread.
+ */
+
+static volatile sig_atomic_t sigchld_occurred_while_deferred;
+
+static void sigchld_handler_when_deferred(int signo)
+{
+    sigchld_occurred_while_deferred = 1;
+}
+
+static void defer_sigchld(void)
+{
+    assert(sigchld_installed);
+
+    sigchld_sethandler_raw(sigchld_handler_when_deferred, 0);
+
+    /* Now _this thread_ cannot any longer be interrupted by the
+     * signal, so we can take the mutex without risk of deadlock.  If
+     * another thread is in the signal handler, either it or we will
+     * block and wait for the other. */
+
+    int r = pthread_mutex_lock(&sigchld_defer_mutex);
+    assert(!r);
+}
+
+static void release_sigchld(void)
+{
+    assert(sigchld_installed);
+
+    int r = pthread_mutex_unlock(&sigchld_defer_mutex);
+    assert(!r);
+
+    sigchld_sethandler_raw(sigchld_handler, 0);
+    if (sigchld_occurred_while_deferred) {
+        sigchld_occurred_while_deferred = 0;
+        /* We might get another SIGCHLD here, in which case
+         * sigchld_handler will be interrupted and re-entered.
+         * This is OK. */
+        sigchld_handler(SIGCHLD);
+    }
+}
+
+/*
+ * Meat of the child process handling.
+ */
+
+static void sigchld_removehandler_core(void) /* idempotent */
 {
     struct sigaction was;
     int r;
     
+    if (!sigchld_installed)
+        return;
+
     r = sigaction(SIGCHLD, &sigchld_saved_action, &was);
     assert(!r);
     assert(!(was.sa_flags & SA_SIGINFO));
     assert(was.sa_handler == sigchld_handler);
-    sigchld_owner = 0;
+
+    sigchld_installed = 0;
 }
 
-static void sigchld_installhandler_core(libxl__gc *gc)
+static void sigchld_installhandler_core(void) /* idempotent */
 {
-    assert(!sigchld_owner);
-    sigchld_owner = CTX;
+    if (sigchld_installed)
+        return;
+
+    sigchld_installed = 1;
 
     sigchld_sethandler_raw(sigchld_handler, &sigchld_saved_action);
 
@@ -220,15 +308,32 @@ static void sigchld_installhandler_core(libxl__gc *gc)
              sigchld_saved_action.sa_handler == SIG_IGN)));
 }
 
-void libxl__sigchld_notneeded(libxl__gc *gc) /* non-reentrant, idempotent */
+static void sigchld_user_remove(libxl_ctx *ctx) /* idempotent */
 {
-    int rc;
+    if (!ctx->sigchld_user_registered)
+        return;
 
     atfork_lock();
-    if (sigchld_owner == CTX)
+    defer_sigchld();
+
+    LIBXL_LIST_REMOVE(ctx, sigchld_users_entry);
+
+    release_sigchld();
+
+    if (LIBXL_LIST_EMPTY(&sigchld_users))
         sigchld_removehandler_core();
+
     atfork_unlock();
 
+    ctx->sigchld_user_registered = 0;
+}
+
+void libxl__sigchld_notneeded(libxl__gc *gc) /* non-reentrant, idempotent */
+{
+    int rc;
+
+    sigchld_user_remove(CTX);
+
     if (libxl__ev_fd_isregistered(&CTX->sigchld_selfpipe_efd)) {
         rc = libxl__ev_fd_modify(gc, &CTX->sigchld_selfpipe_efd, 0);
         if (rc)
@@ -259,12 +364,20 @@ int libxl__sigchld_needed(libxl__gc *gc) /* non-reentrant, idempotent */
         rc = libxl__ev_fd_modify(gc, &CTX->sigchld_selfpipe_efd, POLLIN);
         if (rc) goto out;
     }
+    if (!CTX->sigchld_user_registered) {
+        atfork_lock();
 
-    atfork_lock();
-    if (sigchld_owner != CTX) {
-        sigchld_installhandler_core(gc);
+        sigchld_installhandler_core();
+
+        defer_sigchld();
+
+        LIBXL_LIST_INSERT_HEAD(&sigchld_users, CTX, sigchld_users_entry);
+
+        release_sigchld();
+        atfork_unlock();
+
+        CTX->sigchld_user_registered = 1;
     }
-    atfork_unlock();
 
     rc = 0;
  out:
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index fba681d..8429448 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -355,6 +355,8 @@ struct libxl__ctx {
     int sigchld_selfpipe[2]; /* [0]==-1 means handler not installed */
     libxl__ev_fd sigchld_selfpipe_efd;
     LIBXL_LIST_HEAD(, libxl__ev_child) children;
+    bool sigchld_user_registered;
+    LIBXL_LIST_ENTRY(libxl_ctx) sigchld_users_entry;
 
     libxl_version_info version_info;
 };
@@ -840,6 +842,7 @@ _hidden void libxl__poller_wakeup(libxl__egc *egc, libxl__poller *p);
 extern const libxl_childproc_hooks libxl__childproc_default_hooks;
 int libxl__sigchld_needed(libxl__gc*); /* non-reentrant idempotent, logs errs */
 void libxl__sigchld_notneeded(libxl__gc*); /* non-reentrant idempotent */
+void libxl__sigchld_check_stale_handler(void);
 int libxl__self_pipe_wakeup(int fd); /* returns 0 or -1 setting errno */
 int libxl__self_pipe_eatall(int fd); /* returns 0 or -1 setting errno */
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 13/18] libxl: events: Break out libxl__pipe_nonblock, _close
  2014-02-03 16:14 [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
                   ` (11 preceding siblings ...)
  2014-02-03 16:14 ` [PATCH 12/18] libxl: fork: Share SIGCHLD handler amongst ctxs Ian Jackson
@ 2014-02-03 16:14 ` Ian Jackson
  2014-02-03 16:14 ` [PATCH 14/18] libxl: fork: Make SIGCHLD self-pipe nonblocking Ian Jackson
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Ian Jackson @ 2014-02-03 16:14 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Jim Fehlig, Ian Jackson, Ian Campbell

Break out the pipe creation and destruction from the poller code
into two new functions libxl__pipe_nonblock and libxl__pipe_close.
Also change direct use of pipe() to libxl_pipe.

No overall functional difference other than minor differences in exact
log messages.

Also move libxl__self_pipe_wakeup and libxl__self_pipe_eatall into the
new pipe utilities section in libxl_event.c; this is pure code motion.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>

--
v3: Mention that we switched pipe() -> libxl_pipe()
---
 tools/libxl/libxl_event.c    |  104 ++++++++++++++++++++++++++----------------
 tools/libxl/libxl_internal.h |    9 ++++
 2 files changed, 73 insertions(+), 40 deletions(-)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 1c48fee..93f8fdc 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -1271,26 +1271,81 @@ int libxl_event_check(libxl_ctx *ctx, libxl_event **event_r,
 }
 
 /*
- * Manipulation of pollers
+ * Utilities for pipes (specifically, useful for self-pipes)
  */
 
-int libxl__poller_init(libxl_ctx *ctx, libxl__poller *p)
+void libxl__pipe_close(int fds[2])
+{
+    if (fds[0] >= 0) close(fds[0]);
+    if (fds[1] >= 0) close(fds[1]);
+    fds[0] = fds[1] = -1;
+}
+
+int libxl__pipe_nonblock(libxl_ctx *ctx, int fds[2])
 {
     int r, rc;
-    p->fd_polls = 0;
-    p->fd_rindices = 0;
 
-    r = pipe(p->wakeup_pipe);
+    r = libxl_pipe(ctx, fds);
     if (r) {
-        LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "cannot create poller pipe");
+        fds[0] = fds[1] = -1;
         rc = ERROR_FAIL;
         goto out;
     }
 
-    rc = libxl_fd_set_nonblock(ctx, p->wakeup_pipe[0], 1);
+    rc = libxl_fd_set_nonblock(ctx, fds[0], 1);
     if (rc) goto out;
 
-    rc = libxl_fd_set_nonblock(ctx, p->wakeup_pipe[1], 1);
+    rc = libxl_fd_set_nonblock(ctx, fds[1], 1);
+    if (rc) goto out;
+
+    return 0;
+
+ out:
+    libxl__pipe_close(fds);
+    return rc;
+}
+
+int libxl__self_pipe_wakeup(int fd)
+{
+    static const char buf[1] = "";
+
+    for (;;) {
+        int r = write(fd, buf, 1);
+        if (r==1) return 0;
+        assert(r==-1);
+        if (errno == EINTR) continue;
+        if (errno == EWOULDBLOCK) return 0;
+        assert(errno);
+        return errno;
+    }
+}
+
+int libxl__self_pipe_eatall(int fd)
+{
+    char buf[256];
+    for (;;) {
+        int r = read(fd, buf, sizeof(buf));
+        if (r == sizeof(buf)) continue;
+        if (r >= 0) return 0;
+        assert(r == -1);
+        if (errno == EINTR) continue;
+        if (errno == EWOULDBLOCK) return 0;
+        assert(errno);
+        return errno;
+    }
+}
+
+/*
+ * Manipulation of pollers
+ */
+
+int libxl__poller_init(libxl_ctx *ctx, libxl__poller *p)
+{
+    int rc;
+    p->fd_polls = 0;
+    p->fd_rindices = 0;
+
+    rc = libxl__pipe_nonblock(ctx, p->wakeup_pipe);
     if (rc) goto out;
 
     return 0;
@@ -1302,8 +1357,7 @@ int libxl__poller_init(libxl_ctx *ctx, libxl__poller *p)
 
 void libxl__poller_dispose(libxl__poller *p)
 {
-    if (p->wakeup_pipe[1] > 0) close(p->wakeup_pipe[1]);
-    if (p->wakeup_pipe[0] > 0) close(p->wakeup_pipe[0]);
+    libxl__pipe_close(p->wakeup_pipe);
     free(p->fd_polls);
     free(p->fd_rindices);
 }
@@ -1347,36 +1401,6 @@ void libxl__poller_wakeup(libxl__egc *egc, libxl__poller *p)
     if (e) LIBXL__EVENT_DISASTER(egc, "cannot poke watch pipe", e, 0);
 }
 
-int libxl__self_pipe_wakeup(int fd)
-{
-    static const char buf[1] = "";
-
-    for (;;) {
-        int r = write(fd, buf, 1);
-        if (r==1) return 0;
-        assert(r==-1);
-        if (errno == EINTR) continue;
-        if (errno == EWOULDBLOCK) return 0;
-        assert(errno);
-        return errno;
-    }
-}
-
-int libxl__self_pipe_eatall(int fd)
-{
-    char buf[256];
-    for (;;) {
-        int r = read(fd, buf, sizeof(buf));
-        if (r == sizeof(buf)) continue;
-        if (r >= 0) return 0;
-        assert(r == -1);
-        if (errno == EINTR) continue;
-        if (errno == EWOULDBLOCK) return 0;
-        assert(errno);
-        return errno;
-    }
-}
-
 /*
  * Main event loop iteration
  */
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 8429448..9d17586 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -509,6 +509,15 @@ _hidden char *libxl__strndup(libxl__gc *gc_opt, const char *c, size_t n) NN1;
  * string. (similar to a gc'd dirname(3)). */
 _hidden char *libxl__dirname(libxl__gc *gc_opt, const char *s) NN1;
 
+/* Make a pipe and set both ends nonblocking.  On error, nothing
+ * is left open and both fds[]==-1, and a message is logged.
+ * Useful for self-pipes. */
+_hidden int libxl__pipe_nonblock(libxl_ctx *ctx, int fds[2]);
+/* Closes the pipe fd(s).  Either or both of fds[] may be -1 meaning
+ * `not open'.  Ignores any errors.  Sets fds[] to -1. */
+_hidden void libxl__pipe_close(int fds[2]);
+
+
 /* Each of these logs errors and returns a libxl error code.
  * They do not mind if path is already removed.
  * For _file, path must not be a directory; for _directory it must be. */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 14/18] libxl: fork: Make SIGCHLD self-pipe nonblocking
  2014-02-03 16:14 [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
                   ` (12 preceding siblings ...)
  2014-02-03 16:14 ` [PATCH 13/18] libxl: events: Break out libxl__pipe_nonblock, _close Ian Jackson
@ 2014-02-03 16:14 ` Ian Jackson
  2014-02-03 16:14 ` [PATCH 15/18] libxl: events: Makefile builds internal unit tests Ian Jackson
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Ian Jackson @ 2014-02-03 16:14 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Jim Fehlig, Ian Jackson, Ian Campbell

Use the new libxl__pipe_nonblock and _close functions, rather than
open coding the same logic.  Now the pipe is nonblocking, which avoids
a race which could result in libxl deadlocking in a multithreaded
program.

Reported-by: Jim Fehlig <jfehlig@suse.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl.c      |    6 +-----
 tools/libxl/libxl_fork.c |   12 +++---------
 2 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 4679b51..3730074 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -171,11 +171,7 @@ int libxl_ctx_free(libxl_ctx *ctx)
      * them; we wish the application good luck with understanding
      * this if and when it reaps them. */
     libxl__sigchld_notneeded(gc);
-
-    if (ctx->sigchld_selfpipe[0] >= 0) {
-        close(ctx->sigchld_selfpipe[0]);
-        close(ctx->sigchld_selfpipe[1]);
-    }
+    libxl__pipe_close(ctx->sigchld_selfpipe);
 
     pthread_mutex_destroy(&ctx->lock);
 
diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index 2432512..1d0017b 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -343,17 +343,11 @@ void libxl__sigchld_notneeded(libxl__gc *gc) /* non-reentrant, idempotent */
 
 int libxl__sigchld_needed(libxl__gc *gc) /* non-reentrant, idempotent */
 {
-    int r, rc;
+    int rc;
 
     if (CTX->sigchld_selfpipe[0] < 0) {
-        r = pipe(CTX->sigchld_selfpipe);
-        if (r) {
-            CTX->sigchld_selfpipe[0] = -1;
-            LIBXL__LOG_ERRNO(CTX, LIBXL__LOG_ERROR,
-                             "failed to create sigchld pipe");
-            rc = ERROR_FAIL;
-            goto out;
-        }
+        rc = libxl__pipe_nonblock(CTX, CTX->sigchld_selfpipe);
+        if (rc) goto out;
     }
     if (!libxl__ev_fd_isregistered(&CTX->sigchld_selfpipe_efd)) {
         rc = libxl__ev_fd_register(gc, &CTX->sigchld_selfpipe_efd,
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 15/18] libxl: events: Makefile builds internal unit tests
  2014-02-03 16:14 [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
                   ` (13 preceding siblings ...)
  2014-02-03 16:14 ` [PATCH 14/18] libxl: fork: Make SIGCHLD self-pipe nonblocking Ian Jackson
@ 2014-02-03 16:14 ` Ian Jackson
  2014-02-06 14:00   ` Ian Campbell
  2014-02-03 16:14 ` [PATCH 16/18] libxl: events: timedereg internal unit test Ian Jackson
                   ` (4 subsequent siblings)
  19 siblings, 1 reply; 36+ messages in thread
From: Ian Jackson @ 2014-02-03 16:14 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Jim Fehlig, Ian Jackson, Ian Campbell

We provide a new LIBXL_TESTS facility in the Makefile.
Also provide some helpful common routines for unit tests to use.

We don't want to put the weird test case entrypoints and the weird
test case code in the main libxl.so library.  Symbol hiding prevents
us from simply directly linking the libxl_test_FOO.o in later.  So
instead we provide a special library libxenlight_test.so which is used
only locally.

There are not yet any test cases defined; that will come in the next
patch.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
---
 tools/libxl/Makefile      |   32 ++++++++++++++++++++++++++++----
 tools/libxl/test_common.c |   15 +++++++++++++++
 tools/libxl/test_common.h |   14 ++++++++++++++
 3 files changed, 57 insertions(+), 4 deletions(-)
 create mode 100644 tools/libxl/test_common.c
 create mode 100644 tools/libxl/test_common.h

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index d8495bb..f04cba7 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -79,7 +79,23 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
 			libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
 LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o
 
-$(LIBXL_OBJS): CFLAGS += $(CFLAGS_LIBXL) -include $(XEN_ROOT)/tools/config.h
+LIBXL_TESTS +=
+# Each entry FOO in LIBXL_TESTS has two main .c files:
+#   libxl_test_FOO.c  "inside libxl" code to support the test case
+#   test_FOO.c        "outside libxl" code to exercise the test case
+# Conventionally there will also be:
+#   libxl_test_FOO.h  interface between the "inside" and "outside" parts
+# The "inside libxl" file is compiled exactly like a piece of libxl, and the
+# "outside libxl" file is compiled exactly like a piece of application
+# code.  They must share information via explicit libxl entrypoints.
+# Unlike proper parts of libxl, it is permissible for libxl_test_FOO.c
+# to use private global variables for its state.
+
+LIBXL_TEST_OBJS += $(foreach t, $(LIBXL_TESTS),libxl_test_$t.o)
+TEST_PROG_OBJS += $(foreach t, $(LIBXL_TESTS),test_$t.o) test_common.o
+TEST_PROGS += $(foreach t, $(LIBXL_TESTS),test_$t)
+
+$(LIBXL_OBJS) $(LIBXL_TEST_OBJS): CFLAGS += $(CFLAGS_LIBXL) -include $(XEN_ROOT)/tools/config.h
 
 AUTOINCS= libxlu_cfg_y.h libxlu_cfg_l.h _libxl_list.h _paths.h \
 	_libxl_save_msgs_callout.h _libxl_save_msgs_helper.h
@@ -95,7 +111,7 @@ CFLAGS_XL += $(CFLAGS_libxenlight)
 CFLAGS_XL += -Wshadow
 
 XL_OBJS = xl.o xl_cmdimpl.o xl_cmdtable.o xl_sxp.o
-$(XL_OBJS) _libxl.api-for-check: \
+$(XL_OBJS) $(TEST_PROG_OBJS) _libxl.api-for-check: \
             CFLAGS += $(CFLAGS_libxenctrl) # For xentoollog.h
 $(XL_OBJS): CFLAGS += $(CFLAGS_XL)
 $(XL_OBJS): CFLAGS += -include $(XEN_ROOT)/tools/config.h # libxl_json.h needs it.
@@ -109,10 +125,12 @@ testidl.c: libxl_types.idl gentest.py libxl.h $(AUTOINCS)
 	mv testidl.c.new testidl.c
 
 .PHONY: all
-all: $(CLIENTS) libxenlight.so libxenlight.a libxlutil.so libxlutil.a \
+all: $(CLIENTS) $(TEST_PROGS) \
+		libxenlight.so libxenlight.a libxlutil.so libxlutil.a \
 	$(AUTOSRCS) $(AUTOINCS)
 
-$(LIBXL_OBJS) $(LIBXLU_OBJS) $(XL_OBJS) $(SAVE_HELPER_OBJS): \
+$(LIBXL_OBJS) $(LIBXLU_OBJS) $(XL_OBJS) $(SAVE_HELPER_OBJS) \
+		$(LIBXL_TEST_OBJS): \
 	$(AUTOINCS) libxl.api-ok
 
 %.c %.h:: %.y
@@ -175,6 +193,9 @@ libxenlight.so.$(MAJOR): libxenlight.so.$(MAJOR).$(MINOR)
 libxenlight.so.$(MAJOR).$(MINOR): $(LIBXL_OBJS)
 	$(CC) $(LDFLAGS) -Wl,$(SONAME_LDFLAG) -Wl,libxenlight.so.$(MAJOR) $(SHLIB_LDFLAGS) -o $@ $^ $(LIBXL_LIBS) $(APPEND_LDFLAGS)
 
+libxenlight_test.so: $(LIBXL_OBJS) $(LIBXL_TEST_OBJS)
+	$(CC) $(LDFLAGS) -Wl,$(SONAME_LDFLAG) -Wl,libxenlight_test.so $(SHLIB_LDFLAGS) -o $@ $^ $(LIBXL_LIBS) $(APPEND_LDFLAGS)
+
 libxenlight.a: $(LIBXL_OBJS)
 	$(AR) rcs libxenlight.a $^
 
@@ -193,6 +214,9 @@ libxlutil.a: $(LIBXLU_OBJS)
 xl: $(XL_OBJS) libxlutil.so libxenlight.so
 	$(CC) $(LDFLAGS) -o $@ $(XL_OBJS) libxlutil.so $(LDLIBS_libxenlight) $(LDLIBS_libxenctrl) -lyajl $(APPEND_LDFLAGS)
 
+test_%: test_%.o test_common.o libxlutil.so libxenlight_test.so
+	$(CC) $(LDFLAGS) -o $@ $^ $(LDLIBS_libxenlight) $(LDLIBS_libxenctrl) -lyajl $(APPEND_LDFLAGS)
+
 libxl-save-helper: $(SAVE_HELPER_OBJS) libxenlight.so
 	$(CC) $(LDFLAGS) -o $@ $(SAVE_HELPER_OBJS) $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) $(APPEND_LDFLAGS)
 
diff --git a/tools/libxl/test_common.c b/tools/libxl/test_common.c
new file mode 100644
index 0000000..83b94eb
--- /dev/null
+++ b/tools/libxl/test_common.c
@@ -0,0 +1,15 @@
+#include "test_common.h"
+
+libxl_ctx *ctx;
+
+void test_common_setup(int level)
+{
+    xentoollog_logger_stdiostream *logger_s
+        = xtl_createlogger_stdiostream(stderr, level,  0);
+    assert(logger_s);
+
+    xentoollog_logger *logger = (xentoollog_logger*)logger_s;
+
+    int rc = libxl_ctx_alloc(&ctx, LIBXL_VERSION, 0, logger);
+    assert(!rc);
+}    
diff --git a/tools/libxl/test_common.h b/tools/libxl/test_common.h
new file mode 100644
index 0000000..8b2471e
--- /dev/null
+++ b/tools/libxl/test_common.h
@@ -0,0 +1,14 @@
+#ifndef TEST_COMMON_H
+#define TEST_COMMON_H
+
+#include "libxl.h"
+
+#include <assert.h>
+#include <stdlib.h>
+#include <unistd.h>
+
+void test_common_setup(int level);
+
+extern libxl_ctx *ctx;
+
+#endif /*TEST_COMMON_H*/
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 16/18] libxl: events: timedereg internal unit test
  2014-02-03 16:14 [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
                   ` (14 preceding siblings ...)
  2014-02-03 16:14 ` [PATCH 15/18] libxl: events: Makefile builds internal unit tests Ian Jackson
@ 2014-02-03 16:14 ` Ian Jackson
  2014-02-06 14:01   ` Ian Campbell
  2014-02-03 16:14 ` [PATCH 17/18] libxl: timeouts: Break out time_occurs Ian Jackson
                   ` (3 subsequent siblings)
  19 siblings, 1 reply; 36+ messages in thread
From: Ian Jackson @ 2014-02-03 16:14 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Jim Fehlig, Ian Jackson, Ian Campbell

Test timeout deregistration idempotency.  In the current tree this
test fails because ev->func is not cleared, meaning that a timeout
can be removed from the list more than once, corrupting the list.

It is necessary to use multiple timeouts to demonstrate this bug,
because removing the very same entry twice from a list in quick
succession, without modifying the list in other ways in between,
doesn't actually corrupt the list.  (Since removing an entry from a
doubly-linked list just copies next and back from the disappearing
entry into its neighbours.)

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
---
 .gitignore                         |    1 +
 tools/libxl/Makefile               |    2 +-
 tools/libxl/libxl_test_timedereg.c |   97 ++++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_test_timedereg.h |    9 ++++
 tools/libxl/test_timedereg.c       |   11 ++++
 5 files changed, 119 insertions(+), 1 deletion(-)
 create mode 100644 tools/libxl/libxl_test_timedereg.c
 create mode 100644 tools/libxl/libxl_test_timedereg.h
 create mode 100644 tools/libxl/test_timedereg.c

diff --git a/.gitignore b/.gitignore
index 3504584..db3b083 100644
--- a/.gitignore
+++ b/.gitignore
@@ -361,6 +361,7 @@ tools/libxl/testidl
 tools/libxl/testidl.c
 tools/libxl/*.pyc
 tools/libxl/libxl-save-helper
+tools/libxl/test_timedereg
 tools/blktap2/control/tap-ctl
 tools/firmware/etherboot/eb-roms.h
 tools/firmware/etherboot/gpxe-git-snapshot.tar.gz
diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index f04cba7..1dccbf0 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -79,7 +79,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
 			libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
 LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o
 
-LIBXL_TESTS +=
+LIBXL_TESTS += timedereg
 # Each entry FOO in LIBXL_TESTS has two main .c files:
 #   libxl_test_FOO.c  "inside libxl" code to support the test case
 #   test_FOO.c        "outside libxl" code to exercise the test case
diff --git a/tools/libxl/libxl_test_timedereg.c b/tools/libxl/libxl_test_timedereg.c
new file mode 100644
index 0000000..a44639f
--- /dev/null
+++ b/tools/libxl/libxl_test_timedereg.c
@@ -0,0 +1,97 @@
+/*
+ * timedereg test case for the libxl event system
+ *
+ * To run this test:
+ *    ./test_timedereg
+ * Success:
+ *    program takes a few seconds, prints some debugging output and exits 0
+ * Failure:
+ *    crash
+ *
+ * set up [0]-group timeouts 0 1 2
+ * wait for timeout 1 to occur
+ * deregister 0 and 2.  1 is supposed to be deregistered already
+ * register [1]-group 0 1 2
+ * deregister 1 (should be a no-op)
+ * wait for [1]-group 0 1 2 in turn
+ * on final callback assert that all have been deregistered
+ */
+
+#include "libxl_internal.h"
+
+#include "libxl_test_timedereg.h"
+
+#define NTIMES 3
+static const int ms[2][NTIMES] = { { 2000,1000,2000 }, { 1000,2000,3000 } };
+static libxl__ev_time et[2][NTIMES];
+static libxl__ao *tao;
+static int seq;
+
+static void occurs(libxl__egc *egc, libxl__ev_time *ev,
+                   const struct timeval *requested_abs);
+
+static void regs(libxl__gc *gc, int j)
+{
+    int rc, i;
+    LOG(DEBUG,"regs(%d)", j);
+    for (i=0; i<NTIMES; i++) {
+        rc = libxl__ev_time_register_rel(gc, &et[j][i], occurs, ms[j][i]);
+        assert(!rc);
+    }    
+}
+
+int libxl_test_timedereg(libxl_ctx *ctx, libxl_asyncop_how *ao_how)
+{
+    int i;
+    AO_CREATE(ctx, 0, ao_how);
+
+    tao = ao;
+
+    for (i=0; i<NTIMES; i++) {
+        libxl__ev_time_init(&et[0][i]);
+        libxl__ev_time_init(&et[1][i]);
+    }
+
+    regs(gc, 0);
+
+    return AO_INPROGRESS;
+}
+
+static void occurs(libxl__egc *egc, libxl__ev_time *ev,
+                   const struct timeval *requested_abs)
+{
+    EGC_GC;
+    int i;
+
+    int off = ev - &et[0][0];
+    LOG(DEBUG,"occurs[%d][%d] seq=%d", off/NTIMES, off%NTIMES, seq);
+
+    switch (seq) {
+    case 0:
+        assert(ev == &et[0][1]);
+        libxl__ev_time_deregister(gc, &et[0][0]);
+        libxl__ev_time_deregister(gc, &et[0][2]);
+        regs(gc, 1);
+        libxl__ev_time_deregister(gc, &et[0][1]);
+        break;
+
+    case 1:
+    case 2:
+        assert(ev == &et[1][seq-1]);
+        break;
+        
+    case 3:
+        assert(ev == &et[1][2]);
+        for (i=0; i<NTIMES; i++) {
+            assert(!libxl__ev_time_isregistered(&et[0][i]));
+            assert(!libxl__ev_time_isregistered(&et[1][i]));
+        }
+        libxl__ao_complete(egc, tao, 0);
+        return;
+
+    default:
+        abort();
+    }
+
+    seq++;
+}
diff --git a/tools/libxl/libxl_test_timedereg.h b/tools/libxl/libxl_test_timedereg.h
new file mode 100644
index 0000000..9547dba
--- /dev/null
+++ b/tools/libxl/libxl_test_timedereg.h
@@ -0,0 +1,9 @@
+#ifndef TEST_TIMEDEREG_H
+#define TEST_TIMEDEREG_H
+
+#include <pthread.h>
+
+int libxl_test_timedereg(libxl_ctx *ctx, libxl_asyncop_how *ao_how)
+    LIBXL_EXTERNAL_CALLERS_ONLY;
+
+#endif /*TEST_TIMEDEREG_H*/
diff --git a/tools/libxl/test_timedereg.c b/tools/libxl/test_timedereg.c
new file mode 100644
index 0000000..0081ce3
--- /dev/null
+++ b/tools/libxl/test_timedereg.c
@@ -0,0 +1,11 @@
+#include "test_common.h"
+#include "libxl_test_timedereg.h"
+
+int main(int argc, char **argv) {
+    int rc;
+
+    test_common_setup(XTL_DEBUG);
+
+    rc = libxl_test_timedereg(ctx, 0);
+    assert(!rc);
+}
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 17/18] libxl: timeouts: Break out time_occurs
  2014-02-03 16:14 [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
                   ` (15 preceding siblings ...)
  2014-02-03 16:14 ` [PATCH 16/18] libxl: events: timedereg internal unit test Ian Jackson
@ 2014-02-03 16:14 ` Ian Jackson
  2014-02-06 14:02   ` Ian Campbell
  2014-02-03 16:14 ` [PATCH 18/18] libxl: timeouts: Record deregistration when one occurs Ian Jackson
                   ` (2 subsequent siblings)
  19 siblings, 1 reply; 36+ messages in thread
From: Ian Jackson @ 2014-02-03 16:14 UTC (permalink / raw)
  To: xen-devel
  Cc: George Dunlap, Jim Fehlig, Ian Jackson, Ian Campbell, Ian Jackson

From: Ian Jackson <Ian.Jackson@eu.citrix.com>

Bring together the two places where etime->func() is called into a new
function time_occurs.  For one call site this is pure code motion.
For the other the only semantic change is the introduction of a new
debugging message.

Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
---
 tools/libxl/libxl_event.c |   18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 93f8fdc..5a99932 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -381,6 +381,15 @@ void libxl__ev_time_deregister(libxl__gc *gc, libxl__ev_time *ev)
     return;
 }
 
+static void time_occurs(libxl__egc *egc, libxl__ev_time *etime)
+{
+    DBG("ev_time=%p occurs abs=%lu.%06lu",
+        etime, (unsigned long)etime->abs.tv_sec,
+        (unsigned long)etime->abs.tv_usec);
+
+    etime->func(egc, etime, &etime->abs);
+}
+
 
 /*
  * xenstore watches
@@ -1007,11 +1016,7 @@ static void afterpoll_internal(libxl__egc *egc, libxl__poller *poller,
 
         time_deregister(gc, etime);
 
-        DBG("ev_time=%p occurs abs=%lu.%06lu",
-            etime, (unsigned long)etime->abs.tv_sec,
-            (unsigned long)etime->abs.tv_usec);
-
-        etime->func(egc, etime, &etime->abs);
+        time_occurs(egc, etime);
     }
 }
 
@@ -1092,7 +1097,8 @@ void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl)
     assert(!ev->infinite);
 
     LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry);
-    ev->func(egc, ev, &ev->abs);
+
+    time_occurs(egc, ev);
 
  out:
     CTX_UNLOCK;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 18/18] libxl: timeouts: Record deregistration when one occurs
  2014-02-03 16:14 [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
                   ` (16 preceding siblings ...)
  2014-02-03 16:14 ` [PATCH 17/18] libxl: timeouts: Break out time_occurs Ian Jackson
@ 2014-02-03 16:14 ` Ian Jackson
  2014-02-06 14:04   ` Ian Campbell
  2014-02-03 16:16 ` [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
  2014-02-05 14:10 ` George Dunlap
  19 siblings, 1 reply; 36+ messages in thread
From: Ian Jackson @ 2014-02-03 16:14 UTC (permalink / raw)
  To: xen-devel
  Cc: George Dunlap, Jim Fehlig, Ian Jackson, Ian Campbell, Ian Jackson

From: Ian Jackson <Ian.Jackson@eu.citrix.com>

When a timeout has occurred, it is deregistered.  However, we failed
to record this fact by updating etime->func.  As a result,
libxl__ev_time_isregistered would say `true' for a timeout which has
already happened.

The results are that we might try to have the timeout occur again
(causing problems for the call site), and/or corrupt the timeout list.

This fixes the timedereg event system unit test.

Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
---
 tools/libxl/libxl_event.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 5a99932..ea8c744 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -387,7 +387,9 @@ static void time_occurs(libxl__egc *egc, libxl__ev_time *etime)
         etime, (unsigned long)etime->abs.tv_sec,
         (unsigned long)etime->abs.tv_usec);
 
-    etime->func(egc, etime, &etime->abs);
+    libxl__ev_time_callback *func = etime->func;
+    etime->func = 0;
+    func(egc, etime, &etime->abs);
 }
 
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4
  2014-02-03 16:14 [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
                   ` (17 preceding siblings ...)
  2014-02-03 16:14 ` [PATCH 18/18] libxl: timeouts: Record deregistration when one occurs Ian Jackson
@ 2014-02-03 16:16 ` Ian Jackson
  2014-02-05  5:46   ` Jim Fehlig
  2014-02-05 14:10 ` George Dunlap
  19 siblings, 1 reply; 36+ messages in thread
From: Ian Jackson @ 2014-02-03 16:16 UTC (permalink / raw)
  To: xen-devel, George Dunlap, Ian Campbell, Jim Fehlig

Ian Jackson writes ("[PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4"):
> This is the latest version of my libxl event fixes apropos of Jim's
> libvirt testing.

This can also be found here:
  git://xenbits.xen.org/people/iwj/xen.git#wip.enumerate-pids-v3

Ian.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4
  2014-02-03 16:16 ` [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
@ 2014-02-05  5:46   ` Jim Fehlig
  2014-02-05 11:21     ` Ian Jackson
  0 siblings, 1 reply; 36+ messages in thread
From: Jim Fehlig @ 2014-02-05  5:46 UTC (permalink / raw)
  To: Ian Jackson; +Cc: George Dunlap, xen-devel, Ian Campbell

Ian Jackson wrote:
> Ian Jackson writes ("[PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4"):
>   
>> This is the latest version of my libxl event fixes apropos of Jim's
>> libvirt testing.
>>     
>
> This can also be found here:
>   git://xenbits.xen.org/people/iwj/xen.git#wip.enumerate-pids-v3
>   

I haven't reviewed this version of the series, but have been running it
in my test setup for several hours now without any problems.  The test
setup includes some fixes on the libvirt side, which I will post
tomorrow after fixing up the commit messages.  I'll cc xen-devel in case
you have some comments on those fixes.

Regards,
Jim

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4
  2014-02-05  5:46   ` Jim Fehlig
@ 2014-02-05 11:21     ` Ian Jackson
  0 siblings, 0 replies; 36+ messages in thread
From: Ian Jackson @ 2014-02-05 11:21 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: George Dunlap, xen-devel, Ian Campbell

Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4"):
> I haven't reviewed this version of the series, but have been running it
> in my test setup for several hours now without any problems.  The test
> setup includes some fixes on the libvirt side, which I will post
> tomorrow after fixing up the commit messages.  I'll cc xen-devel in case
> you have some comments on those fixes.

Thanks!

Ian.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4
  2014-02-03 16:14 [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
                   ` (18 preceding siblings ...)
  2014-02-03 16:16 ` [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
@ 2014-02-05 14:10 ` George Dunlap
  2014-02-05 15:03   ` Ian Jackson
  19 siblings, 1 reply; 36+ messages in thread
From: George Dunlap @ 2014-02-05 14:10 UTC (permalink / raw)
  To: Ian Jackson, xen-devel; +Cc: Jim Fehlig, Ian Campbell

On 02/03/2014 04:14 PM, Ian Jackson wrote:
> This is the latest version of my libxl event fixes apropos of Jim's
> libvirt testing.

Did you have any opinions on the suitability of this for 4.4?

  -George

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4
  2014-02-05 14:10 ` George Dunlap
@ 2014-02-05 15:03   ` Ian Jackson
  2014-02-06 10:52     ` George Dunlap
  0 siblings, 1 reply; 36+ messages in thread
From: Ian Jackson @ 2014-02-05 15:03 UTC (permalink / raw)
  To: George Dunlap; +Cc: Jim Fehlig, xen-devel, Ian Campbell

George Dunlap writes ("Re: [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4"):
> On 02/03/2014 04:14 PM, Ian Jackson wrote:
> > This is the latest version of my libxl event fixes apropos of Jim's
> > libvirt testing.
> 
> Did you have any opinions on the suitability of this for 4.4?

Sorry, I should have made that clear in the body text rather than just
the subject line.

I think this needs a freeze exception on the following grounds:

 * There is little change visible to non-eventy/thready callers and
   the risk of new races there is limited; basic functional testing
   ought to catch those errors.

 * The most prominent eventy/thready caller we are currently aware of
   is libvirt.  Without these changes it is nearly impossible to have
   a reliable libvirt.

 * These changes fall into three categories:

     - Bugfixes (3 or so);

     - The new SIGCHLD sharing feature
        "libxl: fork: Share SIGCHLD handler amongst ctxs";

     - New libxl innards unit test framework and a unit test;

     - Documentation improvements;

     - Changes with no functional effect which have been broken out
       from the above in order to facilitate review (lots, but
       all small easily reviewable).

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4
  2014-02-05 15:03   ` Ian Jackson
@ 2014-02-06 10:52     ` George Dunlap
  2014-02-06 12:35       ` Ian Jackson
  2014-02-07  4:17       ` Jim Fehlig
  0 siblings, 2 replies; 36+ messages in thread
From: George Dunlap @ 2014-02-06 10:52 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Jim Fehlig, xen-devel, Ian Campbell

On 02/05/2014 03:03 PM, Ian Jackson wrote:
> George Dunlap writes ("Re: [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4"):
>> On 02/03/2014 04:14 PM, Ian Jackson wrote:
>>> This is the latest version of my libxl event fixes apropos of Jim's
>>> libvirt testing.
>> Did you have any opinions on the suitability of this for 4.4?
> Sorry, I should have made that clear in the body text rather than just
> the subject line.
>
> I think this needs a freeze exception on the following grounds:
>
>   * There is little change visible to non-eventy/thready callers and
>     the risk of new races there is limited; basic functional testing
>     ought to catch those errors.
>
>   * The most prominent eventy/thready caller we are currently aware of
>     is libvirt.  Without these changes it is nearly impossible to have
>     a reliable libvirt.

Thanks.

I think libvirt support for libxl is a really important functionality 
from a strategic perspective: a solid support should make it much easier 
to integrate with other projects such as OpenStack and Cloudstack, as 
well as (in theory) other tools built on top of libvirt.

So I'm inclined to consider this a blocker*; I think we should accept it 
and delay the release until we feel comfortable that it has been 
sufficiently tested.

Release-acked-by: George Dunlap <george.dunlap@eu.citrix.com>

* "blocker" is never an absolute specification; there is almost always a 
point where we would say, "we're just going to have to release without 
this".  Specifying a feature or bug a blocker just means, "At this 
point, we are still willing to slip the release if necessary to include 
this feature / bug fix."

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4
  2014-02-06 10:52     ` George Dunlap
@ 2014-02-06 12:35       ` Ian Jackson
  2014-02-06 14:07         ` Ian Campbell
  2014-02-07  4:17       ` Jim Fehlig
  1 sibling, 1 reply; 36+ messages in thread
From: Ian Jackson @ 2014-02-06 12:35 UTC (permalink / raw)
  To: George Dunlap; +Cc: Jim Fehlig, xen-devel, Ian Campbell

George Dunlap writes ("Re: [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4"):
> I think libvirt support for libxl is a really important functionality 
> from a strategic perspective: a solid support should make it much easier 
> to integrate with other projects such as OpenStack and Cloudstack, as 
> well as (in theory) other tools built on top of libvirt.

Right.

> So I'm inclined to consider this a blocker*; I think we should accept it 
> and delay the release until we feel comfortable that it has been 
> sufficiently tested.

Thanks.

> Release-acked-by: George Dunlap <george.dunlap@eu.citrix.com>

I think the right answer then would be to commit it as soon as it has
been acked.  There are four patches at the end that could do with an
ack from Ian C.

Ian.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 11/18] libxl: fork: Break out sigchld_sethandler_raw
  2014-02-03 16:14 ` [PATCH 11/18] libxl: fork: Break out sigchld_sethandler_raw Ian Jackson
@ 2014-02-06 13:53   ` Ian Campbell
  0 siblings, 0 replies; 36+ messages in thread
From: Ian Campbell @ 2014-02-06 13:53 UTC (permalink / raw)
  To: Ian Jackson; +Cc: George Dunlap, Jim Fehlig, xen-devel

On Mon, 2014-02-03 at 16:14 +0000, Ian Jackson wrote:
> We are going to want introduce another call site in the final
> substantive patch.
> 
> Pure code motion; no functional change.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> Cc: Jim Fehlig <jfehlig@suse.com>

Acked-by: Ian Campbell <Ian.Campbell@citrix.com>

> 
> ---
> v3: Remove now-unused variables from sigchld_installhandler_core
> ---
>  tools/libxl/libxl_fork.c |   23 ++++++++++++++---------
>  1 file changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
> index ce8e8eb..084d86a 100644
> --- a/tools/libxl/libxl_fork.c
> +++ b/tools/libxl/libxl_fork.c
> @@ -182,6 +182,19 @@ static void sigchld_handler(int signo)
>      errno = esave;
>  }
>  
> +static void sigchld_sethandler_raw(void (*handler)(int), struct sigaction *old)
> +{
> +    struct sigaction ours;
> +    int r;
> +
> +    memset(&ours,0,sizeof(ours));
> +    ours.sa_handler = handler;
> +    sigemptyset(&ours.sa_mask);
> +    ours.sa_flags = SA_NOCLDSTOP | SA_RESTART;
> +    r = sigaction(SIGCHLD, &ours, old);
> +    assert(!r);
> +}
> +
>  static void sigchld_removehandler_core(void)
>  {
>      struct sigaction was;
> @@ -196,18 +209,10 @@ static void sigchld_removehandler_core(void)
>  
>  static void sigchld_installhandler_core(libxl__gc *gc)
>  {
> -    struct sigaction ours;
> -    int r;
> -
>      assert(!sigchld_owner);
>      sigchld_owner = CTX;
>  
> -    memset(&ours,0,sizeof(ours));
> -    ours.sa_handler = sigchld_handler;
> -    sigemptyset(&ours.sa_mask);
> -    ours.sa_flags = SA_NOCLDSTOP | SA_RESTART;
> -    r = sigaction(SIGCHLD, &ours, &sigchld_saved_action);
> -    assert(!r);
> +    sigchld_sethandler_raw(sigchld_handler, &sigchld_saved_action);
>  
>      assert(((void)"application must negotiate with libxl about SIGCHLD",
>              !(sigchld_saved_action.sa_flags & SA_SIGINFO) &&

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 15/18] libxl: events: Makefile builds internal unit tests
  2014-02-03 16:14 ` [PATCH 15/18] libxl: events: Makefile builds internal unit tests Ian Jackson
@ 2014-02-06 14:00   ` Ian Campbell
  0 siblings, 0 replies; 36+ messages in thread
From: Ian Campbell @ 2014-02-06 14:00 UTC (permalink / raw)
  To: Ian Jackson; +Cc: George Dunlap, Jim Fehlig, xen-devel

On Mon, 2014-02-03 at 16:14 +0000, Ian Jackson wrote:
> We provide a new LIBXL_TESTS facility in the Makefile.
> Also provide some helpful common routines for unit tests to use.
> 
> We don't want to put the weird test case entrypoints and the weird
> test case code in the main libxl.so library.  Symbol hiding prevents
> us from simply directly linking the libxl_test_FOO.o in later.  So
> instead we provide a special library libxenlight_test.so which is used
> only locally.
> 
> There are not yet any test cases defined; that will come in the next
> patch.
> 
> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Jim Fehlig <jfehlig@suse.com>

Acked-by: Ian Campbell <Ian.Campbell@citrix.com>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 16/18] libxl: events: timedereg internal unit test
  2014-02-03 16:14 ` [PATCH 16/18] libxl: events: timedereg internal unit test Ian Jackson
@ 2014-02-06 14:01   ` Ian Campbell
  0 siblings, 0 replies; 36+ messages in thread
From: Ian Campbell @ 2014-02-06 14:01 UTC (permalink / raw)
  To: Ian Jackson; +Cc: George Dunlap, Jim Fehlig, xen-devel

On Mon, 2014-02-03 at 16:14 +0000, Ian Jackson wrote:
> Test timeout deregistration idempotency.  In the current tree this
> test fails because ev->func is not cleared, meaning that a timeout
> can be removed from the list more than once, corrupting the list.
> 
> It is necessary to use multiple timeouts to demonstrate this bug,
> because removing the very same entry twice from a list in quick
> succession, without modifying the list in other ways in between,
> doesn't actually corrupt the list.  (Since removing an entry from a
> doubly-linked list just copies next and back from the disappearing
> entry into its neighbours.)
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> Cc: Jim Fehlig <jfehlig@suse.com>

I've only glanced at the test, but I don't think it needs thorough
review so:

Acked-by: Ian Campbell <Ian.Campbell@citrix.com>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 17/18] libxl: timeouts: Break out time_occurs
  2014-02-03 16:14 ` [PATCH 17/18] libxl: timeouts: Break out time_occurs Ian Jackson
@ 2014-02-06 14:02   ` Ian Campbell
  0 siblings, 0 replies; 36+ messages in thread
From: Ian Campbell @ 2014-02-06 14:02 UTC (permalink / raw)
  To: Ian Jackson; +Cc: George Dunlap, Jim Fehlig, xen-devel, Ian Jackson

On Mon, 2014-02-03 at 16:14 +0000, Ian Jackson wrote:
> From: Ian Jackson <Ian.Jackson@eu.citrix.com>
> 
> Bring together the two places where etime->func() is called into a new
> function time_occurs.  For one call site this is pure code motion.
> For the other the only semantic change is the introduction of a new
> debugging message.
> 
> Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
> Cc: Jim Fehlig <jfehlig@suse.com>

Acked-by: Ian Campbell <Ian.Campbell@citrix.com>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 18/18] libxl: timeouts: Record deregistration when one occurs
  2014-02-03 16:14 ` [PATCH 18/18] libxl: timeouts: Record deregistration when one occurs Ian Jackson
@ 2014-02-06 14:04   ` Ian Campbell
  2014-02-06 14:24     ` Ian Jackson
  0 siblings, 1 reply; 36+ messages in thread
From: Ian Campbell @ 2014-02-06 14:04 UTC (permalink / raw)
  To: Ian Jackson; +Cc: George Dunlap, Jim Fehlig, xen-devel, Ian Jackson

On Mon, 2014-02-03 at 16:14 +0000, Ian Jackson wrote:
> From: Ian Jackson <Ian.Jackson@eu.citrix.com>
> 
> When a timeout has occurred, it is deregistered.  However, we failed
> to record this fact by updating etime->func.  As a result,
> libxl__ev_time_isregistered would say `true' for a timeout which has
> already happened.
> 
> The results are that we might try to have the timeout occur again
> (causing problems for the call site), and/or corrupt the timeout list.
> 
> This fixes the timedereg event system unit test.
> 
> Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
> Cc: Jim Fehlig <jfehlig@suse.com>

Acked-by: Ian Campbell <Ian.Campbell@citrix.com>

Although if you were minded to explicitly note in the changelog that it
is necessary to nuke ->func before running the callback so that it
doesn't try and fire it again in parallel while it is running then that
would be good.

Ian.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4
  2014-02-06 12:35       ` Ian Jackson
@ 2014-02-06 14:07         ` Ian Campbell
  2014-02-06 14:33           ` Ian Jackson
  0 siblings, 1 reply; 36+ messages in thread
From: Ian Campbell @ 2014-02-06 14:07 UTC (permalink / raw)
  To: Ian Jackson; +Cc: George Dunlap, Jim Fehlig, xen-devel

On Thu, 2014-02-06 at 12:35 +0000, Ian Jackson wrote:
> George Dunlap writes ("Re: [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4"):
> > I think libvirt support for libxl is a really important functionality 
> > from a strategic perspective: a solid support should make it much easier 
> > to integrate with other projects such as OpenStack and Cloudstack, as 
> > well as (in theory) other tools built on top of libvirt.
> 
> Right.
> 
> > So I'm inclined to consider this a blocker*; I think we should accept it 
> > and delay the release until we feel comfortable that it has been 
> > sufficiently tested.
> 
> Thanks.
> 
> > Release-acked-by: George Dunlap <george.dunlap@eu.citrix.com>
> 
> I think the right answer then would be to commit it as soon as it has
> been acked.  There are four patches at the end that could do with an
> ack from Ian C.

I think I've now acked everything which was lacking one. Apart from a
nitpick on the commit message of the final one (which you may choose to
ignore) I think this is good to go.

I'm assuming you will shovel this one in yourself.

Ian

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 18/18] libxl: timeouts: Record deregistration when one occurs
  2014-02-06 14:04   ` Ian Campbell
@ 2014-02-06 14:24     ` Ian Jackson
  2014-02-06 14:27       ` Ian Campbell
  0 siblings, 1 reply; 36+ messages in thread
From: Ian Jackson @ 2014-02-06 14:24 UTC (permalink / raw)
  To: Ian Campbell; +Cc: George Dunlap, Jim Fehlig, xen-devel

Ian Campbell writes ("Re: [PATCH 18/18] libxl: timeouts: Record deregistration when one occurs"):
> On Mon, 2014-02-03 at 16:14 +0000, Ian Jackson wrote:
> > When a timeout has occurred, it is deregistered.  However, we failed
> > to record this fact by updating etime->func.  As a result,
> > libxl__ev_time_isregistered would say `true' for a timeout which has
> > already happened.
> 
> Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
> 
> Although if you were minded to explicitly note in the changelog that it
> is necessary to nuke ->func before running the callback so that it
> doesn't try and fire it again in parallel while it is running then that
> would be good.

That's not the reason.  This is the reason, which I have just put in
the commit message:

    It is necessary to clear etime->func before the callback, because
    the callback might want to reinstate the timeout, or might free
    the etime (or its containing struct) entirely.

I have also fixed my S-o-b which incorrectly used my personal email
address.

Ian.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 18/18] libxl: timeouts: Record deregistration when one occurs
  2014-02-06 14:24     ` Ian Jackson
@ 2014-02-06 14:27       ` Ian Campbell
  0 siblings, 0 replies; 36+ messages in thread
From: Ian Campbell @ 2014-02-06 14:27 UTC (permalink / raw)
  To: Ian Jackson; +Cc: George Dunlap, Jim Fehlig, xen-devel

On Thu, 2014-02-06 at 14:24 +0000, Ian Jackson wrote:
> Ian Campbell writes ("Re: [PATCH 18/18] libxl: timeouts: Record deregistration when one occurs"):
> > On Mon, 2014-02-03 at 16:14 +0000, Ian Jackson wrote:
> > > When a timeout has occurred, it is deregistered.  However, we failed
> > > to record this fact by updating etime->func.  As a result,
> > > libxl__ev_time_isregistered would say `true' for a timeout which has
> > > already happened.
> > 
> > Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
> > 
> > Although if you were minded to explicitly note in the changelog that it
> > is necessary to nuke ->func before running the callback so that it
> > doesn't try and fire it again in parallel while it is running then that
> > would be good.
> 
> That's not the reason.  This is the reason, which I have just put in
> the commit message:
> 
>     It is necessary to clear etime->func before the callback, because
>     the callback might want to reinstate the timeout, or might free
>     the etime (or its containing struct) entirely.

Ah, right!

That's fine -- my ack stands.

> I have also fixed my S-o-b which incorrectly used my personal email
> address.
> 
> Ian.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4
  2014-02-06 14:07         ` Ian Campbell
@ 2014-02-06 14:33           ` Ian Jackson
  0 siblings, 0 replies; 36+ messages in thread
From: Ian Jackson @ 2014-02-06 14:33 UTC (permalink / raw)
  To: Ian Campbell; +Cc: George Dunlap, Jim Fehlig, xen-devel

Ian Campbell writes ("Re: [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4"):
> I'm assuming you will shovel this one in yourself.

Now done, as modified, thanks.

Ian.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4
  2014-02-06 10:52     ` George Dunlap
  2014-02-06 12:35       ` Ian Jackson
@ 2014-02-07  4:17       ` Jim Fehlig
  1 sibling, 0 replies; 36+ messages in thread
From: Jim Fehlig @ 2014-02-07  4:17 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Ian Jackson, Ian Campbell

George Dunlap wrote:
> On 02/05/2014 03:03 PM, Ian Jackson wrote:
>> George Dunlap writes ("Re: [PATCH 00/18 v3] libxl: fork and event
>> fixes for libvirt and 4.4"):
>>> On 02/03/2014 04:14 PM, Ian Jackson wrote:
>>>> This is the latest version of my libxl event fixes apropos of Jim's
>>>> libvirt testing.
>>> Did you have any opinions on the suitability of this for 4.4?
>> Sorry, I should have made that clear in the body text rather than just
>> the subject line.
>>
>> I think this needs a freeze exception on the following grounds:
>>
>>   * There is little change visible to non-eventy/thready callers and
>>     the risk of new races there is limited; basic functional testing
>>     ought to catch those errors.
>>
>>   * The most prominent eventy/thready caller we are currently aware of
>>     is libvirt.  Without these changes it is nearly impossible to have
>>     a reliable libvirt.
>
> Thanks.
>
> I think libvirt support for libxl is a really important functionality
> from a strategic perspective: a solid support should make it much
> easier to integrate with other projects such as OpenStack and
> Cloudstack, as well as (in theory) other tools built on top of libvirt.

Thanks for considering this series for inclusion in 4.4!

> So I'm inclined to consider this a blocker*; I think we should accept
> it and delay the release until we feel comfortable that it has been
> sufficiently tested.

I've been running libvirt-based tests that continuously start/shutdown
persistent domains, create/shutdown transient domains, save/restore
domains, and dump info on all these domains, for the past 35 hours.  The
tests have been restarted a few times, as I fine tune some libvirt
patches, but did run for 20 hours before the first restart.

Regards,
Jim

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2014-02-07  4:17 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-02-03 16:14 [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
2014-02-03 16:14 ` [PATCH 01/18] libxl: fork: Break out checked_waitpid Ian Jackson
2014-02-03 16:14 ` [PATCH 02/18] libxl: fork: Break out childproc_reaped_ours Ian Jackson
2014-02-03 16:14 ` [PATCH 03/18] libxl: fork: Clarify docs for libxl_sigchld_owner Ian Jackson
2014-02-03 16:14 ` [PATCH 04/18] libxl: fork: Document libxl_sigchld_owner_libxl better Ian Jackson
2014-02-03 16:14 ` [PATCH 05/18] libxl: fork: assert that chldmode is right Ian Jackson
2014-02-03 16:14 ` [PATCH 06/18] libxl: fork: Provide libxl_childproc_sigchld_occurred Ian Jackson
2014-02-03 16:14 ` [PATCH 07/18] libxl: fork: Provide ..._always_selective_reap Ian Jackson
2014-02-03 16:14 ` [PATCH 08/18] libxl: fork: Provide LIBXL_HAVE_SIGCHLD_SELECTIVE_REAP Ian Jackson
2014-02-03 16:14 ` [PATCH 09/18] libxl: fork: Rename sigchld handler functions Ian Jackson
2014-02-03 16:14 ` [PATCH 10/18] libxl: fork: Break out sigchld_installhandler_core Ian Jackson
2014-02-03 16:14 ` [PATCH 11/18] libxl: fork: Break out sigchld_sethandler_raw Ian Jackson
2014-02-06 13:53   ` Ian Campbell
2014-02-03 16:14 ` [PATCH 12/18] libxl: fork: Share SIGCHLD handler amongst ctxs Ian Jackson
2014-02-03 16:14 ` [PATCH 13/18] libxl: events: Break out libxl__pipe_nonblock, _close Ian Jackson
2014-02-03 16:14 ` [PATCH 14/18] libxl: fork: Make SIGCHLD self-pipe nonblocking Ian Jackson
2014-02-03 16:14 ` [PATCH 15/18] libxl: events: Makefile builds internal unit tests Ian Jackson
2014-02-06 14:00   ` Ian Campbell
2014-02-03 16:14 ` [PATCH 16/18] libxl: events: timedereg internal unit test Ian Jackson
2014-02-06 14:01   ` Ian Campbell
2014-02-03 16:14 ` [PATCH 17/18] libxl: timeouts: Break out time_occurs Ian Jackson
2014-02-06 14:02   ` Ian Campbell
2014-02-03 16:14 ` [PATCH 18/18] libxl: timeouts: Record deregistration when one occurs Ian Jackson
2014-02-06 14:04   ` Ian Campbell
2014-02-06 14:24     ` Ian Jackson
2014-02-06 14:27       ` Ian Campbell
2014-02-03 16:16 ` [PATCH 00/18 v3] libxl: fork and event fixes for libvirt and 4.4 Ian Jackson
2014-02-05  5:46   ` Jim Fehlig
2014-02-05 11:21     ` Ian Jackson
2014-02-05 14:10 ` George Dunlap
2014-02-05 15:03   ` Ian Jackson
2014-02-06 10:52     ` George Dunlap
2014-02-06 12:35       ` Ian Jackson
2014-02-06 14:07         ` Ian Campbell
2014-02-06 14:33           ` Ian Jackson
2014-02-07  4:17       ` Jim Fehlig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).