* [PATCH 0/2 v4] libxl: fix event races exposed by libvirt
@ 2013-01-23 16:08 Ian Jackson
2013-01-23 16:08 ` [PATCH 1/2] libxl: fix stale fd event callback race Ian Jackson
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Ian Jackson @ 2013-01-23 16:08 UTC (permalink / raw)
To: xen-devel; +Cc: Jim Fehlig
This is the final version of these patches which are needed to make
Jim Fehlig's libvirt libxl driver work.
1/2 libxl: fix stale fd event callback race
2/2 libxl: fix stale timeout event callback race
They are identical to the previous posting apart from commit messages
and fixing the bug Jim mentioned in <50FD98DC.4020203@suse.com>.
^ permalink raw reply [flat|nested] 13+ messages in thread* [PATCH 1/2] libxl: fix stale fd event callback race 2013-01-23 16:08 [PATCH 0/2 v4] libxl: fix event races exposed by libvirt Ian Jackson @ 2013-01-23 16:08 ` Ian Jackson 2013-01-23 16:08 ` [PATCH 2/2] libxl: fix stale timeout " Ian Jackson 2013-01-23 17:58 ` DO NOT APPLY Re: [PATCH 0/2 v4] libxl: fix event races exposed by libvirt Ian Jackson 2 siblings, 0 replies; 13+ messages in thread From: Ian Jackson @ 2013-01-23 16:08 UTC (permalink / raw) To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell, Bamvor Jian Zhang Because there is not necessarily any lock held at the point the application (eg, libvirt) calls libxl_osevent_occurred_timeout and ..._fd, in a multithreaded program those calls may be arbitrarily delayed in relation to other activities within the program. libxl therefore needs to be prepared to receive very old event callbacks. Arrange for this to be the case for fd callbacks. This requires a new layer of indirection through a "hook nexus" struct which can outlive the libxl__ev_foo. Allocation and deallocation of these nexi is mostly handled in the OSEVENT macros which wrap up the application's callbacks. Document the problem and the solution in a comment in libxl_event.c just before the definition of struct libxl__osevent_hook_nexus. There is still a race relating to libxl__osevent_occurred_timeout; this will be addressed in the following patch. Reported-by: Bamvor Jian Zhang <bjzhang@suse.com> Cc: Bamvor Jian Zhang <bjzhang@suse.com> Cc: Ian Campbell <Ian.Campbell@citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> -- v2: - Prepare for fixing timeout race too - Break out osevent_release_nexus() - nexusop argument to OSEVENT* macros - Clarify OSEVENT* nexusop hooks - osevent_ev_from_hook_nexus takes a libxl__osevent_hook_nexus* --- tools/libxl/libxl_event.c | 184 +++++++++++++++++++++++++++++++++++------ tools/libxl/libxl_internal.h | 8 ++- 2 files changed, 163 insertions(+), 29 deletions(-) diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c index 72cb723..f1fe425 100644 --- a/tools/libxl/libxl_event.c +++ b/tools/libxl/libxl_event.c @@ -38,23 +38,131 @@ * The application's registration hooks should be called ONLY via * these macros, with the ctx locked. Likewise all the "occurred" * entrypoints from the application should assert(!in_hook); + * + * During the hook call - including while the arguments are being + * evaluated - ev->nexus is guaranteed to be valid and refer to the + * nexus which is being used for this event registration. The + * arguments should specify ev->nexus for the for_libxl argument and + * ev->nexus->for_app_reg (or a pointer to it) for for_app_reg. */ -#define OSEVENT_HOOK_INTERN(retval, hookname, ...) do { \ - if (CTX->osevent_hooks) { \ - CTX->osevent_in_hook++; \ - retval CTX->osevent_hooks->hookname(CTX->osevent_user, __VA_ARGS__); \ - CTX->osevent_in_hook--; \ - } \ +#define OSEVENT_HOOK_INTERN(retval, failedp, evkind, hookop, nexusop, ...) do { \ + if (CTX->osevent_hooks) { \ + CTX->osevent_in_hook++; \ + libxl__osevent_hook_nexi *nexi = &CTX->hook_##evkind##_nexi_idle; \ + osevent_hook_pre_##nexusop(gc, ev, nexi, &ev->nexus); \ + retval CTX->osevent_hooks->evkind##_##hookop \ + (CTX->osevent_user, __VA_ARGS__); \ + if ((failedp)) \ + osevent_hook_failed_##nexusop(gc, ev, nexi, &ev->nexus); \ + CTX->osevent_in_hook--; \ + } \ } while (0) -#define OSEVENT_HOOK(hookname, ...) ({ \ - int osevent_hook_rc = 0; \ - OSEVENT_HOOK_INTERN(osevent_hook_rc = , hookname, __VA_ARGS__); \ - osevent_hook_rc; \ +#define OSEVENT_HOOK(evkind, hookop, nexusop, ...) ({ \ + int osevent_hook_rc = 0; \ + OSEVENT_HOOK_INTERN(osevent_hook_rc =, !!osevent_hook_rc, \ + evkind, hookop, nexusop, __VA_ARGS__); \ + osevent_hook_rc; \ }) -#define OSEVENT_HOOK_VOID(hookname, ...) \ - OSEVENT_HOOK_INTERN(/* void */, hookname, __VA_ARGS__) +#define OSEVENT_HOOK_VOID(evkind, hookop, nexusop, ...) \ + OSEVENT_HOOK_INTERN(/* void */, 0, evkind, hookop, nexusop, __VA_ARGS__) + +/* + * The application's calls to libxl_osevent_occurred_... may be + * indefinitely delayed with respect to the rest of the program (since + * they are not necessarily called with any lock held). So the + * for_libxl value we receive may be (almost) arbitrarily old. All we + * know is that it came from this ctx. + * + * Therefore we may not free the object referred to by any for_libxl + * value until we free the whole libxl_ctx. And if we reuse it we + * must be able to tell when an old use turns up, and discard the + * stale event. + * + * Thus we cannot use the ev directly as the for_libxl value - we need + * a layer of indirection. + * + * We do this by keeping a pool of libxl__osevent_hook_nexus structs, + * and use pointers to them as for_libxl values. In fact, there are + * two pools: one for fds and one for timeouts. This ensures that we + * don't risk a type error when we upcast nexus->ev. In each nexus + * the ev is either null or points to a valid libxl__ev_time or + * libxl__ev_fd, as applicable. + * + * We /do/ allow ourselves to reassociate an old nexus with a new ev + * as otherwise we would have to leak nexi. (This reassociation + * might, of course, be an old ev being reused for a new purpose so + * simply comparing the ev pointer is not sufficient.) Thus the + * libxl_osevent_occurred functions need to check that the condition + * allegedly signalled by this event actually exists. + * + * The nexi and the lists are all protected by the ctx lock. + */ + +struct libxl__osevent_hook_nexus { + void *ev; + void *for_app_reg; + LIBXL_SLIST_ENTRY(libxl__osevent_hook_nexus) next; +}; + +static void *osevent_ev_from_hook_nexus(libxl_ctx *ctx, + libxl__osevent_hook_nexus *nexus /* pass void *for_libxl */) +{ + return nexus->ev; +} + +static void osevent_release_nexus(libxl__gc *gc, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus *nexus) +{ + nexus->ev = 0; + LIBXL_SLIST_INSERT_HEAD(nexi_idle, nexus, next); +} + +/*----- OSEVENT* hook functions for nexusop "alloc" -----*/ +static void osevent_hook_pre_alloc(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus_r) +{ + libxl__osevent_hook_nexus *nexus = LIBXL_SLIST_FIRST(nexi_idle); + if (nexus) { + LIBXL_SLIST_REMOVE_HEAD(nexi_idle, next); + } else { + nexus = libxl__zalloc(NOGC, sizeof(*nexus)); + } + nexus->ev = ev; + *nexus_r = nexus; +} +static void osevent_hook_failed_alloc(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) +{ + osevent_release_nexus(gc, nexi_idle, *nexus); +} + +/*----- OSEVENT* hook functions for nexusop "release" -----*/ +static void osevent_hook_pre_release(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) +{ + osevent_release_nexus(gc, nexi_idle, *nexus); +} +static void osevent_hook_failed_release(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) +{ + abort(); +} + +/*----- OSEVENT* hook functions for nexusop "noop" -----*/ +static void osevent_hook_pre_noop(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) { } +static void osevent_hook_failed_noop(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) { } + /* * fd events @@ -72,7 +180,8 @@ int libxl__ev_fd_register(libxl__gc *gc, libxl__ev_fd *ev, DBG("ev_fd=%p register fd=%d events=%x", ev, fd, events); - rc = OSEVENT_HOOK(fd_register, fd, &ev->for_app_reg, events, ev); + rc = OSEVENT_HOOK(fd,register, alloc, fd, &ev->nexus->for_app_reg, + events, ev->nexus); if (rc) goto out; ev->fd = fd; @@ -97,7 +206,7 @@ int libxl__ev_fd_modify(libxl__gc *gc, libxl__ev_fd *ev, short events) DBG("ev_fd=%p modify fd=%d events=%x", ev, ev->fd, events); - rc = OSEVENT_HOOK(fd_modify, ev->fd, &ev->for_app_reg, events); + rc = OSEVENT_HOOK(fd,modify, noop, ev->fd, &ev->nexus->for_app_reg, events); if (rc) goto out; ev->events = events; @@ -119,7 +228,7 @@ void libxl__ev_fd_deregister(libxl__gc *gc, libxl__ev_fd *ev) DBG("ev_fd=%p deregister fd=%d", ev, ev->fd); - OSEVENT_HOOK_VOID(fd_deregister, ev->fd, ev->for_app_reg); + OSEVENT_HOOK_VOID(fd,deregister, release, ev->fd, ev->nexus->for_app_reg); LIBXL_LIST_REMOVE(ev, entry); ev->fd = -1; @@ -171,7 +280,8 @@ static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev, { int rc; - rc = OSEVENT_HOOK(timeout_register, &ev->for_app_reg, absolute, ev); + rc = OSEVENT_HOOK(timeout,register, alloc, &ev->nexus->for_app_reg, + absolute, ev->nexus); if (rc) return rc; ev->infinite = 0; @@ -184,7 +294,7 @@ static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev, static void time_deregister(libxl__gc *gc, libxl__ev_time *ev) { if (!ev->infinite) { - OSEVENT_HOOK_VOID(timeout_deregister, ev->for_app_reg); + OSEVENT_HOOK_VOID(timeout,deregister, release, ev->nexus->for_app_reg); LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); } } @@ -270,7 +380,8 @@ int libxl__ev_time_modify_abs(libxl__gc *gc, libxl__ev_time *ev, rc = time_register_finite(gc, ev, absolute); if (rc) goto out; } else { - rc = OSEVENT_HOOK(timeout_modify, &ev->for_app_reg, absolute); + rc = OSEVENT_HOOK(timeout,modify, noop, + &ev->nexus->for_app_reg, absolute); if (rc) goto out; LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); @@ -1010,35 +1121,54 @@ void libxl_osevent_register_hooks(libxl_ctx *ctx, void libxl_osevent_occurred_fd(libxl_ctx *ctx, void *for_libxl, - int fd, short events, short revents) + int fd, short events_ign, short revents_ign) { - libxl__ev_fd *ev = for_libxl; - EGC_INIT(ctx); CTX_LOCK; assert(!CTX->osevent_in_hook); - assert(fd == ev->fd); - revents &= ev->events; - if (revents) - ev->func(egc, ev, fd, ev->events, revents); + libxl__ev_fd *ev = osevent_ev_from_hook_nexus(ctx, for_libxl); + if (!ev) goto out; + if (ev->fd != fd) goto out; + struct pollfd check; + for (;;) { + check.fd = fd; + check.events = ev->events; + int r = poll(&check, 1, 0); + if (!r) + goto out; + if (r==1) + break; + assert(r<0); + if (errno != EINTR) { + LIBXL__EVENT_DISASTER(egc, "failed poll to check for fd", errno, 0); + goto out; + } + } + + if (check.revents) + ev->func(egc, ev, fd, ev->events, check.revents); + + out: CTX_UNLOCK; EGC_FREE; } void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl) { - libxl__ev_time *ev = for_libxl; - EGC_INIT(ctx); CTX_LOCK; assert(!CTX->osevent_in_hook); + libxl__ev_time *ev = osevent_ev_from_hook_nexus(ctx, for_libxl); + if (!ev) goto out; assert(!ev->infinite); + LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); ev->func(egc, ev, &ev->abs); + out: CTX_UNLOCK; EGC_FREE; } diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index cba3616..6484bcb 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -136,6 +136,8 @@ typedef struct libxl__gc libxl__gc; typedef struct libxl__egc libxl__egc; typedef struct libxl__ao libxl__ao; typedef struct libxl__aop_occurred libxl__aop_occurred; +typedef struct libxl__osevent_hook_nexus libxl__osevent_hook_nexus; +typedef struct libxl__osevent_hook_nexi libxl__osevent_hook_nexi; _hidden void libxl__alloc_failed(libxl_ctx *, const char *func, size_t nmemb, size_t size) __attribute__((noreturn)); @@ -163,7 +165,7 @@ struct libxl__ev_fd { libxl__ev_fd_callback *func; /* remainder is private for libxl__ev_fd... */ LIBXL_LIST_ENTRY(libxl__ev_fd) entry; - void *for_app_reg; + libxl__osevent_hook_nexus *nexus; }; @@ -178,7 +180,7 @@ struct libxl__ev_time { int infinite; /* not registered in list or with app if infinite */ LIBXL_TAILQ_ENTRY(libxl__ev_time) entry; struct timeval abs; - void *for_app_reg; + libxl__osevent_hook_nexus *nexus; }; typedef struct libxl__ev_xswatch libxl__ev_xswatch; @@ -329,6 +331,8 @@ struct libxl__ctx { libxl__poller poller_app; /* libxl_osevent_beforepoll and _afterpoll */ LIBXL_LIST_HEAD(, libxl__poller) pollers_event, pollers_idle; + LIBXL_SLIST_HEAD(libxl__osevent_hook_nexi, libxl__osevent_hook_nexus) + hook_fd_nexi_idle, hook_timeout_nexi_idle; LIBXL_LIST_HEAD(, libxl__ev_fd) efds; LIBXL_TAILQ_HEAD(, libxl__ev_time) etimes; -- 1.7.2.5 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 2/2] libxl: fix stale timeout event callback race 2013-01-23 16:08 [PATCH 0/2 v4] libxl: fix event races exposed by libvirt Ian Jackson 2013-01-23 16:08 ` [PATCH 1/2] libxl: fix stale fd event callback race Ian Jackson @ 2013-01-23 16:08 ` Ian Jackson 2013-01-23 17:48 ` Jim Fehlig 2013-01-23 17:58 ` DO NOT APPLY Re: [PATCH 0/2 v4] libxl: fix event races exposed by libvirt Ian Jackson 2 siblings, 1 reply; 13+ messages in thread From: Ian Jackson @ 2013-01-23 16:08 UTC (permalink / raw) To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell, Bamvor Jian Zhang Because there is not necessarily any lock held at the point the application (eg, libvirt) calls libxl_osevent_occurred_timeout, in a multithreaded program those calls may be arbitrarily delayed in relation to other activities within the program. Specifically this means when ->timeout_deregister returns, libxl does not know whether it can safely dispose of the for_libxl value or whether it needs to retain it in case of an in-progress call to _occurred_timeout. The interface could be fixed by requiring the application to make a new call into libxl to say that the deregistration was complete. However that new call would have to be threaded through the application's event loop; this is complicated and some application authors are likely not to implement it properly. Furthermore the easiest way to implement this facility in most event loops is to queue up a time event for "now". Shortcut all of this by having libxl always call timeout_modify setting abs={0,0} (ie, ASAP) instead of timeout_deregister. This will cause the application to call _occurred_timeout. When processing this calldown we see that we were no longer actually interested and simply throw it away. Additionally, there is a race between _occurred_timeout and ->timeout_modify. If libxl ever adjusts the deadline for a timeout the application may already be in the process of calling _occurred, in which case the situation with for_app's lifetime becomes very complicated. Therefore abolish libxl__ev_time_modify_{abs,rel} (which have no callers) and promise to the application only ever to call ->timeout_modify with abs=={0,0}. The application still needs to cope with ->timeout_modify racing with its internal function which calls _occurred_timeout. Document this. This is a forwards-compatible change for applications using the libxl API, and will hopefully eliminate these races in callback-supplying applications (such as libvirt) without the need for corresponding changes to the application. For clarity, fold the body of time_register_finite into its one remaining call site. This makes the semantics of ev->infinite slightly clearer. Cc: Bamvor Jian Zhang <bjzhang@suse.com> Cc: Ian Campbell <Ian.Campbell@citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> -- v3: - Fix null pointer dereference in case when hooks not supplied. --- tools/libxl/libxl_event.c | 89 +++++++-------------------------------------- tools/libxl/libxl_event.h | 17 ++++++++- 2 files changed, 29 insertions(+), 77 deletions(-) diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c index f1fe425..f86c528 100644 --- a/tools/libxl/libxl_event.c +++ b/tools/libxl/libxl_event.c @@ -267,18 +267,11 @@ static int time_rel_to_abs(libxl__gc *gc, int ms, struct timeval *abs_out) return 0; } -static void time_insert_finite(libxl__gc *gc, libxl__ev_time *ev) -{ - libxl__ev_time *evsearch; - LIBXL_TAILQ_INSERT_SORTED(&CTX->etimes, entry, ev, evsearch, /*empty*/, - timercmp(&ev->abs, &evsearch->abs, >)); - ev->infinite = 0; -} - static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev, struct timeval absolute) { int rc; + libxl__ev_time *evsearch; rc = OSEVENT_HOOK(timeout,register, alloc, &ev->nexus->for_app_reg, absolute, ev->nexus); @@ -286,7 +279,8 @@ static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev, ev->infinite = 0; ev->abs = absolute; - time_insert_finite(gc, ev); + LIBXL_TAILQ_INSERT_SORTED(&CTX->etimes, entry, ev, evsearch, /*empty*/, + timercmp(&ev->abs, &evsearch->abs, >)); return 0; } @@ -294,7 +288,12 @@ static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev, static void time_deregister(libxl__gc *gc, libxl__ev_time *ev) { if (!ev->infinite) { - OSEVENT_HOOK_VOID(timeout,deregister, release, ev->nexus->for_app_reg); + struct timeval right_away = { 0, 0 }; + if (ev->nexus) /* only set if app provided hooks */ + ev->nexus->ev = 0; + OSEVENT_HOOK_VOID(timeout,modify, + noop /* release nexus in _occurred_ */, + ev->nexus->for_app_reg, right_away); LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); } } @@ -364,70 +363,6 @@ int libxl__ev_time_register_rel(libxl__gc *gc, libxl__ev_time *ev, return rc; } -int libxl__ev_time_modify_abs(libxl__gc *gc, libxl__ev_time *ev, - struct timeval absolute) -{ - int rc; - - CTX_LOCK; - - DBG("ev_time=%p modify abs==%lu.%06lu", - ev, (unsigned long)absolute.tv_sec, (unsigned long)absolute.tv_usec); - - assert(libxl__ev_time_isregistered(ev)); - - if (ev->infinite) { - rc = time_register_finite(gc, ev, absolute); - if (rc) goto out; - } else { - rc = OSEVENT_HOOK(timeout,modify, noop, - &ev->nexus->for_app_reg, absolute); - if (rc) goto out; - - LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); - ev->abs = absolute; - time_insert_finite(gc, ev); - } - - rc = 0; - out: - time_done_debug(gc,__func__,ev,rc); - CTX_UNLOCK; - return rc; -} - -int libxl__ev_time_modify_rel(libxl__gc *gc, libxl__ev_time *ev, - int milliseconds) -{ - struct timeval absolute; - int rc; - - CTX_LOCK; - - DBG("ev_time=%p modify ms=%d", ev, milliseconds); - - assert(libxl__ev_time_isregistered(ev)); - - if (milliseconds < 0) { - time_deregister(gc, ev); - ev->infinite = 1; - rc = 0; - goto out; - } - - rc = time_rel_to_abs(gc, milliseconds, &absolute); - if (rc) goto out; - - rc = libxl__ev_time_modify_abs(gc, ev, absolute); - if (rc) goto out; - - rc = 0; - out: - time_done_debug(gc,__func__,ev,rc); - CTX_UNLOCK; - return rc; -} - void libxl__ev_time_deregister(libxl__gc *gc, libxl__ev_time *ev) { CTX_LOCK; @@ -1161,7 +1096,11 @@ void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl) CTX_LOCK; assert(!CTX->osevent_in_hook); - libxl__ev_time *ev = osevent_ev_from_hook_nexus(ctx, for_libxl); + libxl__osevent_hook_nexus *nexus = for_libxl; + libxl__ev_time *ev = osevent_ev_from_hook_nexus(ctx, nexus); + + osevent_release_nexus(gc, &CTX->hook_timeout_nexi_idle, nexus); + if (!ev) goto out; assert(!ev->infinite); diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h index 3bcb6d3..51f2721 100644 --- a/tools/libxl/libxl_event.h +++ b/tools/libxl/libxl_event.h @@ -287,8 +287,10 @@ typedef struct libxl_osevent_hooks { int (*timeout_register)(void *user, void **for_app_registration_out, struct timeval abs, void *for_libxl); int (*timeout_modify)(void *user, void **for_app_registration_update, - struct timeval abs); - void (*timeout_deregister)(void *user, void *for_app_registration); + struct timeval abs) + /* only ever called with abs={0,0}, meaning ASAP */; + void (*timeout_deregister)(void *user, void *for_app_registration) + /* will never be called */; } libxl_osevent_hooks; /* The application which calls register_fd_hooks promises to @@ -337,6 +339,17 @@ typedef struct libxl_osevent_hooks { * register (or modify), and pass it to subsequent calls to modify * or deregister. * + * Note that the application must cope with a call from libxl to + * timeout_modify racing with its own call to + * libxl__osevent_occurred_timeout. libxl guarantees that + * timeout_modify will only be called with abs={0,0} but the + * application must still ensure that libxl's attempt to cause the + * timeout to occur immediately is safely ignored even the timeout is + * actually already in the process of occurring. + * + * timeout_deregister is not used because it forms part of a + * deprecated unsafe mode of use of the API. + * * osevent_register_hooks may be called only once for each libxl_ctx. * libxl may make calls to register/modify/deregister from within * any libxl function (indeed, it will usually call register from -- 1.7.2.5 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] libxl: fix stale timeout event callback race 2013-01-23 16:08 ` [PATCH 2/2] libxl: fix stale timeout " Ian Jackson @ 2013-01-23 17:48 ` Jim Fehlig 0 siblings, 0 replies; 13+ messages in thread From: Jim Fehlig @ 2013-01-23 17:48 UTC (permalink / raw) To: Ian Jackson; +Cc: Bamvor Jian Zhang, Ian Campbell, xen-devel Ian Jackson wrote: > Because there is not necessarily any lock held at the point the > application (eg, libvirt) calls libxl_osevent_occurred_timeout, in a > multithreaded program those calls may be arbitrarily delayed in > relation to other activities within the program. > > Specifically this means when ->timeout_deregister returns, libxl does > not know whether it can safely dispose of the for_libxl value or > whether it needs to retain it in case of an in-progress call to > _occurred_timeout. > > The interface could be fixed by requiring the application to make a > new call into libxl to say that the deregistration was complete. > > However that new call would have to be threaded through the > application's event loop; this is complicated and some application > authors are likely not to implement it properly. Furthermore the > easiest way to implement this facility in most event loops is to queue > up a time event for "now". > > Shortcut all of this by having libxl always call timeout_modify > setting abs={0,0} (ie, ASAP) instead of timeout_deregister. This will > cause the application to call _occurred_timeout. When processing this > calldown we see that we were no longer actually interested and simply > throw it away. > > Additionally, there is a race between _occurred_timeout and > ->timeout_modify. If libxl ever adjusts the deadline for a timeout > the application may already be in the process of calling _occurred, in > which case the situation with for_app's lifetime becomes very > complicated. Therefore abolish libxl__ev_time_modify_{abs,rel} (which > have no callers) and promise to the application only ever to call > ->timeout_modify with abs=={0,0}. The application still needs to cope > with ->timeout_modify racing with its internal function which calls > _occurred_timeout. Document this. > > This is a forwards-compatible change for applications using the libxl > API, and will hopefully eliminate these races in callback-supplying > applications (such as libvirt) without the need for corresponding > changes to the application. > > For clarity, fold the body of time_register_finite into its one > remaining call site. This makes the semantics of ev->infinite > slightly clearer. > > Cc: Bamvor Jian Zhang <bjzhang@suse.com> > Cc: Ian Campbell <Ian.Campbell@citrix.com> > Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> > > -- > v3: > - Fix null pointer dereference in case when hooks not supplied. > --- > tools/libxl/libxl_event.c | 89 +++++++-------------------------------------- > tools/libxl/libxl_event.h | 17 ++++++++- > 2 files changed, 29 insertions(+), 77 deletions(-) > > diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c > index f1fe425..f86c528 100644 > --- a/tools/libxl/libxl_event.c > +++ b/tools/libxl/libxl_event.c > @@ -267,18 +267,11 @@ static int time_rel_to_abs(libxl__gc *gc, int ms, struct timeval *abs_out) > return 0; > } > > -static void time_insert_finite(libxl__gc *gc, libxl__ev_time *ev) > -{ > - libxl__ev_time *evsearch; > - LIBXL_TAILQ_INSERT_SORTED(&CTX->etimes, entry, ev, evsearch, /*empty*/, > - timercmp(&ev->abs, &evsearch->abs, >)); > - ev->infinite = 0; > -} > - > static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev, > struct timeval absolute) > { > int rc; > + libxl__ev_time *evsearch; > > rc = OSEVENT_HOOK(timeout,register, alloc, &ev->nexus->for_app_reg, > absolute, ev->nexus); > @@ -286,7 +279,8 @@ static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev, > > ev->infinite = 0; > ev->abs = absolute; > - time_insert_finite(gc, ev); > + LIBXL_TAILQ_INSERT_SORTED(&CTX->etimes, entry, ev, evsearch, /*empty*/, > + timercmp(&ev->abs, &evsearch->abs, >)); > > return 0; > } > @@ -294,7 +288,12 @@ static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev, > static void time_deregister(libxl__gc *gc, libxl__ev_time *ev) > { > if (!ev->infinite) { > - OSEVENT_HOOK_VOID(timeout,deregister, release, ev->nexus->for_app_reg); > + struct timeval right_away = { 0, 0 }; > + if (ev->nexus) /* only set if app provided hooks */ > + ev->nexus->ev = 0; > + OSEVENT_HOOK_VOID(timeout,modify, > + noop /* release nexus in _occurred_ */, > + ev->nexus->for_app_reg, right_away); > I still hit a segfault in the modify hook. The modify hook prototype is int (*timeout_modify)(void *user, void **for_app_registration_update, struct timeval abs) Shouldn't for_app_registration_update contain the data provided by the app in timeout_register, with the option to update it? With the below change, I don't see the segfault. + OSEVENT_HOOK_VOID(timeout,modify, + noop /* release nexus in _occurred_ */, + &ev->nexus->for_app_reg, right_away); Thanks, Jim > LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); > } > } > @@ -364,70 +363,6 @@ int libxl__ev_time_register_rel(libxl__gc *gc, libxl__ev_time *ev, > return rc; > } > > -int libxl__ev_time_modify_abs(libxl__gc *gc, libxl__ev_time *ev, > - struct timeval absolute) > -{ > - int rc; > - > - CTX_LOCK; > - > - DBG("ev_time=%p modify abs==%lu.%06lu", > - ev, (unsigned long)absolute.tv_sec, (unsigned long)absolute.tv_usec); > - > - assert(libxl__ev_time_isregistered(ev)); > - > - if (ev->infinite) { > - rc = time_register_finite(gc, ev, absolute); > - if (rc) goto out; > - } else { > - rc = OSEVENT_HOOK(timeout,modify, noop, > - &ev->nexus->for_app_reg, absolute); > - if (rc) goto out; > - > - LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); > - ev->abs = absolute; > - time_insert_finite(gc, ev); > - } > - > - rc = 0; > - out: > - time_done_debug(gc,__func__,ev,rc); > - CTX_UNLOCK; > - return rc; > -} > - > -int libxl__ev_time_modify_rel(libxl__gc *gc, libxl__ev_time *ev, > - int milliseconds) > -{ > - struct timeval absolute; > - int rc; > - > - CTX_LOCK; > - > - DBG("ev_time=%p modify ms=%d", ev, milliseconds); > - > - assert(libxl__ev_time_isregistered(ev)); > - > - if (milliseconds < 0) { > - time_deregister(gc, ev); > - ev->infinite = 1; > - rc = 0; > - goto out; > - } > - > - rc = time_rel_to_abs(gc, milliseconds, &absolute); > - if (rc) goto out; > - > - rc = libxl__ev_time_modify_abs(gc, ev, absolute); > - if (rc) goto out; > - > - rc = 0; > - out: > - time_done_debug(gc,__func__,ev,rc); > - CTX_UNLOCK; > - return rc; > -} > - > void libxl__ev_time_deregister(libxl__gc *gc, libxl__ev_time *ev) > { > CTX_LOCK; > @@ -1161,7 +1096,11 @@ void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl) > CTX_LOCK; > assert(!CTX->osevent_in_hook); > > - libxl__ev_time *ev = osevent_ev_from_hook_nexus(ctx, for_libxl); > + libxl__osevent_hook_nexus *nexus = for_libxl; > + libxl__ev_time *ev = osevent_ev_from_hook_nexus(ctx, nexus); > + > + osevent_release_nexus(gc, &CTX->hook_timeout_nexi_idle, nexus); > + > if (!ev) goto out; > assert(!ev->infinite); > > diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h > index 3bcb6d3..51f2721 100644 > --- a/tools/libxl/libxl_event.h > +++ b/tools/libxl/libxl_event.h > @@ -287,8 +287,10 @@ typedef struct libxl_osevent_hooks { > int (*timeout_register)(void *user, void **for_app_registration_out, > struct timeval abs, void *for_libxl); > int (*timeout_modify)(void *user, void **for_app_registration_update, > - struct timeval abs); > - void (*timeout_deregister)(void *user, void *for_app_registration); > + struct timeval abs) > + /* only ever called with abs={0,0}, meaning ASAP */; > + void (*timeout_deregister)(void *user, void *for_app_registration) > + /* will never be called */; > } libxl_osevent_hooks; > > /* The application which calls register_fd_hooks promises to > @@ -337,6 +339,17 @@ typedef struct libxl_osevent_hooks { > * register (or modify), and pass it to subsequent calls to modify > * or deregister. > * > + * Note that the application must cope with a call from libxl to > + * timeout_modify racing with its own call to > + * libxl__osevent_occurred_timeout. libxl guarantees that > + * timeout_modify will only be called with abs={0,0} but the > + * application must still ensure that libxl's attempt to cause the > + * timeout to occur immediately is safely ignored even the timeout is > + * actually already in the process of occurring. > + * > + * timeout_deregister is not used because it forms part of a > + * deprecated unsafe mode of use of the API. > + * > * osevent_register_hooks may be called only once for each libxl_ctx. > * libxl may make calls to register/modify/deregister from within > * any libxl function (indeed, it will usually call register from > ^ permalink raw reply [flat|nested] 13+ messages in thread
* DO NOT APPLY Re: [PATCH 0/2 v4] libxl: fix event races exposed by libvirt 2013-01-23 16:08 [PATCH 0/2 v4] libxl: fix event races exposed by libvirt Ian Jackson 2013-01-23 16:08 ` [PATCH 1/2] libxl: fix stale fd event callback race Ian Jackson 2013-01-23 16:08 ` [PATCH 2/2] libxl: fix stale timeout " Ian Jackson @ 2013-01-23 17:58 ` Ian Jackson 2 siblings, 0 replies; 13+ messages in thread From: Ian Jackson @ 2013-01-23 17:58 UTC (permalink / raw) To: xen-devel@lists.xen.org, Jim Fehlig, Ian Campbell Ian Jackson writes ("[PATCH 0/2 v4] libxl: fix event races exposed by libvirt"): > This is the final version of these patches which are needed to make > Jim Fehlig's libvirt libxl driver work. > > 1/2 libxl: fix stale fd event callback race > 2/2 libxl: fix stale timeout event callback race > > They are identical to the previous posting apart from commit messages > and fixing the bug Jim mentioned in <50FD98DC.4020203@suse.com>. Something has gone wrong with my patch toolchain and the version posted is not the version in my tree. Thanks to Jim for spotting this. Ian. ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 0/2 v4.1] libxl: fix event races exposed by libvirt @ 2013-01-23 18:21 Ian Jackson 2013-01-23 18:21 ` [PATCH 1/2] libxl: fix stale fd event callback race Ian Jackson 0 siblings, 1 reply; 13+ messages in thread From: Ian Jackson @ 2013-01-23 18:21 UTC (permalink / raw) To: xen-devel; +Cc: Jim Fehlig, Ian Campbell This is the really honestly final version of these patches which are needed to make Jim Fehlig's libvirt libxl driver work. 1/2 libxl: fix stale fd event callback race 2/2 libxl: fix stale timeout event callback race It appears that I didn't run git-format-patch before git-send-email when I sent out v4, so they are missing an important bugfix from Jim Fehlig. Sorry about that. Ian. ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/2] libxl: fix stale fd event callback race 2013-01-23 18:21 [PATCH 0/2 v4.1] " Ian Jackson @ 2013-01-23 18:21 ` Ian Jackson 0 siblings, 0 replies; 13+ messages in thread From: Ian Jackson @ 2013-01-23 18:21 UTC (permalink / raw) To: xen-devel; +Cc: Jim Fehlig, Ian Jackson, Ian Campbell, Bamvor Jian Zhang Because there is not necessarily any lock held at the point the application (eg, libvirt) calls libxl_osevent_occurred_timeout and ..._fd, in a multithreaded program those calls may be arbitrarily delayed in relation to other activities within the program. libxl therefore needs to be prepared to receive very old event callbacks. Arrange for this to be the case for fd callbacks. This requires a new layer of indirection through a "hook nexus" struct which can outlive the libxl__ev_foo. Allocation and deallocation of these nexi is mostly handled in the OSEVENT macros which wrap up the application's callbacks. Document the problem and the solution in a comment in libxl_event.c just before the definition of struct libxl__osevent_hook_nexus. There is still a race relating to libxl__osevent_occurred_timeout; this will be addressed in the following patch. Reported-by: Bamvor Jian Zhang <bjzhang@suse.com> Cc: Bamvor Jian Zhang <bjzhang@suse.com> Cc: Ian Campbell <Ian.Campbell@citrix.com> Tested-by: Jim Fehlig <jfehlig@suse.com> Acked-by: Jim Fehlig <jfehlig@suse.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> -- v4: - Add Jim Fehlig's Tested-by and Acked-by. --- tools/libxl/libxl_event.c | 184 +++++++++++++++++++++++++++++++++++------ tools/libxl/libxl_internal.h | 8 ++- 2 files changed, 163 insertions(+), 29 deletions(-) diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c index 72cb723..f1fe425 100644 --- a/tools/libxl/libxl_event.c +++ b/tools/libxl/libxl_event.c @@ -38,23 +38,131 @@ * The application's registration hooks should be called ONLY via * these macros, with the ctx locked. Likewise all the "occurred" * entrypoints from the application should assert(!in_hook); + * + * During the hook call - including while the arguments are being + * evaluated - ev->nexus is guaranteed to be valid and refer to the + * nexus which is being used for this event registration. The + * arguments should specify ev->nexus for the for_libxl argument and + * ev->nexus->for_app_reg (or a pointer to it) for for_app_reg. */ -#define OSEVENT_HOOK_INTERN(retval, hookname, ...) do { \ - if (CTX->osevent_hooks) { \ - CTX->osevent_in_hook++; \ - retval CTX->osevent_hooks->hookname(CTX->osevent_user, __VA_ARGS__); \ - CTX->osevent_in_hook--; \ - } \ +#define OSEVENT_HOOK_INTERN(retval, failedp, evkind, hookop, nexusop, ...) do { \ + if (CTX->osevent_hooks) { \ + CTX->osevent_in_hook++; \ + libxl__osevent_hook_nexi *nexi = &CTX->hook_##evkind##_nexi_idle; \ + osevent_hook_pre_##nexusop(gc, ev, nexi, &ev->nexus); \ + retval CTX->osevent_hooks->evkind##_##hookop \ + (CTX->osevent_user, __VA_ARGS__); \ + if ((failedp)) \ + osevent_hook_failed_##nexusop(gc, ev, nexi, &ev->nexus); \ + CTX->osevent_in_hook--; \ + } \ } while (0) -#define OSEVENT_HOOK(hookname, ...) ({ \ - int osevent_hook_rc = 0; \ - OSEVENT_HOOK_INTERN(osevent_hook_rc = , hookname, __VA_ARGS__); \ - osevent_hook_rc; \ +#define OSEVENT_HOOK(evkind, hookop, nexusop, ...) ({ \ + int osevent_hook_rc = 0; \ + OSEVENT_HOOK_INTERN(osevent_hook_rc =, !!osevent_hook_rc, \ + evkind, hookop, nexusop, __VA_ARGS__); \ + osevent_hook_rc; \ }) -#define OSEVENT_HOOK_VOID(hookname, ...) \ - OSEVENT_HOOK_INTERN(/* void */, hookname, __VA_ARGS__) +#define OSEVENT_HOOK_VOID(evkind, hookop, nexusop, ...) \ + OSEVENT_HOOK_INTERN(/* void */, 0, evkind, hookop, nexusop, __VA_ARGS__) + +/* + * The application's calls to libxl_osevent_occurred_... may be + * indefinitely delayed with respect to the rest of the program (since + * they are not necessarily called with any lock held). So the + * for_libxl value we receive may be (almost) arbitrarily old. All we + * know is that it came from this ctx. + * + * Therefore we may not free the object referred to by any for_libxl + * value until we free the whole libxl_ctx. And if we reuse it we + * must be able to tell when an old use turns up, and discard the + * stale event. + * + * Thus we cannot use the ev directly as the for_libxl value - we need + * a layer of indirection. + * + * We do this by keeping a pool of libxl__osevent_hook_nexus structs, + * and use pointers to them as for_libxl values. In fact, there are + * two pools: one for fds and one for timeouts. This ensures that we + * don't risk a type error when we upcast nexus->ev. In each nexus + * the ev is either null or points to a valid libxl__ev_time or + * libxl__ev_fd, as applicable. + * + * We /do/ allow ourselves to reassociate an old nexus with a new ev + * as otherwise we would have to leak nexi. (This reassociation + * might, of course, be an old ev being reused for a new purpose so + * simply comparing the ev pointer is not sufficient.) Thus the + * libxl_osevent_occurred functions need to check that the condition + * allegedly signalled by this event actually exists. + * + * The nexi and the lists are all protected by the ctx lock. + */ + +struct libxl__osevent_hook_nexus { + void *ev; + void *for_app_reg; + LIBXL_SLIST_ENTRY(libxl__osevent_hook_nexus) next; +}; + +static void *osevent_ev_from_hook_nexus(libxl_ctx *ctx, + libxl__osevent_hook_nexus *nexus /* pass void *for_libxl */) +{ + return nexus->ev; +} + +static void osevent_release_nexus(libxl__gc *gc, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus *nexus) +{ + nexus->ev = 0; + LIBXL_SLIST_INSERT_HEAD(nexi_idle, nexus, next); +} + +/*----- OSEVENT* hook functions for nexusop "alloc" -----*/ +static void osevent_hook_pre_alloc(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus_r) +{ + libxl__osevent_hook_nexus *nexus = LIBXL_SLIST_FIRST(nexi_idle); + if (nexus) { + LIBXL_SLIST_REMOVE_HEAD(nexi_idle, next); + } else { + nexus = libxl__zalloc(NOGC, sizeof(*nexus)); + } + nexus->ev = ev; + *nexus_r = nexus; +} +static void osevent_hook_failed_alloc(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) +{ + osevent_release_nexus(gc, nexi_idle, *nexus); +} + +/*----- OSEVENT* hook functions for nexusop "release" -----*/ +static void osevent_hook_pre_release(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) +{ + osevent_release_nexus(gc, nexi_idle, *nexus); +} +static void osevent_hook_failed_release(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) +{ + abort(); +} + +/*----- OSEVENT* hook functions for nexusop "noop" -----*/ +static void osevent_hook_pre_noop(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) { } +static void osevent_hook_failed_noop(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) { } + /* * fd events @@ -72,7 +180,8 @@ int libxl__ev_fd_register(libxl__gc *gc, libxl__ev_fd *ev, DBG("ev_fd=%p register fd=%d events=%x", ev, fd, events); - rc = OSEVENT_HOOK(fd_register, fd, &ev->for_app_reg, events, ev); + rc = OSEVENT_HOOK(fd,register, alloc, fd, &ev->nexus->for_app_reg, + events, ev->nexus); if (rc) goto out; ev->fd = fd; @@ -97,7 +206,7 @@ int libxl__ev_fd_modify(libxl__gc *gc, libxl__ev_fd *ev, short events) DBG("ev_fd=%p modify fd=%d events=%x", ev, ev->fd, events); - rc = OSEVENT_HOOK(fd_modify, ev->fd, &ev->for_app_reg, events); + rc = OSEVENT_HOOK(fd,modify, noop, ev->fd, &ev->nexus->for_app_reg, events); if (rc) goto out; ev->events = events; @@ -119,7 +228,7 @@ void libxl__ev_fd_deregister(libxl__gc *gc, libxl__ev_fd *ev) DBG("ev_fd=%p deregister fd=%d", ev, ev->fd); - OSEVENT_HOOK_VOID(fd_deregister, ev->fd, ev->for_app_reg); + OSEVENT_HOOK_VOID(fd,deregister, release, ev->fd, ev->nexus->for_app_reg); LIBXL_LIST_REMOVE(ev, entry); ev->fd = -1; @@ -171,7 +280,8 @@ static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev, { int rc; - rc = OSEVENT_HOOK(timeout_register, &ev->for_app_reg, absolute, ev); + rc = OSEVENT_HOOK(timeout,register, alloc, &ev->nexus->for_app_reg, + absolute, ev->nexus); if (rc) return rc; ev->infinite = 0; @@ -184,7 +294,7 @@ static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev, static void time_deregister(libxl__gc *gc, libxl__ev_time *ev) { if (!ev->infinite) { - OSEVENT_HOOK_VOID(timeout_deregister, ev->for_app_reg); + OSEVENT_HOOK_VOID(timeout,deregister, release, ev->nexus->for_app_reg); LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); } } @@ -270,7 +380,8 @@ int libxl__ev_time_modify_abs(libxl__gc *gc, libxl__ev_time *ev, rc = time_register_finite(gc, ev, absolute); if (rc) goto out; } else { - rc = OSEVENT_HOOK(timeout_modify, &ev->for_app_reg, absolute); + rc = OSEVENT_HOOK(timeout,modify, noop, + &ev->nexus->for_app_reg, absolute); if (rc) goto out; LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); @@ -1010,35 +1121,54 @@ void libxl_osevent_register_hooks(libxl_ctx *ctx, void libxl_osevent_occurred_fd(libxl_ctx *ctx, void *for_libxl, - int fd, short events, short revents) + int fd, short events_ign, short revents_ign) { - libxl__ev_fd *ev = for_libxl; - EGC_INIT(ctx); CTX_LOCK; assert(!CTX->osevent_in_hook); - assert(fd == ev->fd); - revents &= ev->events; - if (revents) - ev->func(egc, ev, fd, ev->events, revents); + libxl__ev_fd *ev = osevent_ev_from_hook_nexus(ctx, for_libxl); + if (!ev) goto out; + if (ev->fd != fd) goto out; + struct pollfd check; + for (;;) { + check.fd = fd; + check.events = ev->events; + int r = poll(&check, 1, 0); + if (!r) + goto out; + if (r==1) + break; + assert(r<0); + if (errno != EINTR) { + LIBXL__EVENT_DISASTER(egc, "failed poll to check for fd", errno, 0); + goto out; + } + } + + if (check.revents) + ev->func(egc, ev, fd, ev->events, check.revents); + + out: CTX_UNLOCK; EGC_FREE; } void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl) { - libxl__ev_time *ev = for_libxl; - EGC_INIT(ctx); CTX_LOCK; assert(!CTX->osevent_in_hook); + libxl__ev_time *ev = osevent_ev_from_hook_nexus(ctx, for_libxl); + if (!ev) goto out; assert(!ev->infinite); + LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); ev->func(egc, ev, &ev->abs); + out: CTX_UNLOCK; EGC_FREE; } diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index cba3616..6484bcb 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -136,6 +136,8 @@ typedef struct libxl__gc libxl__gc; typedef struct libxl__egc libxl__egc; typedef struct libxl__ao libxl__ao; typedef struct libxl__aop_occurred libxl__aop_occurred; +typedef struct libxl__osevent_hook_nexus libxl__osevent_hook_nexus; +typedef struct libxl__osevent_hook_nexi libxl__osevent_hook_nexi; _hidden void libxl__alloc_failed(libxl_ctx *, const char *func, size_t nmemb, size_t size) __attribute__((noreturn)); @@ -163,7 +165,7 @@ struct libxl__ev_fd { libxl__ev_fd_callback *func; /* remainder is private for libxl__ev_fd... */ LIBXL_LIST_ENTRY(libxl__ev_fd) entry; - void *for_app_reg; + libxl__osevent_hook_nexus *nexus; }; @@ -178,7 +180,7 @@ struct libxl__ev_time { int infinite; /* not registered in list or with app if infinite */ LIBXL_TAILQ_ENTRY(libxl__ev_time) entry; struct timeval abs; - void *for_app_reg; + libxl__osevent_hook_nexus *nexus; }; typedef struct libxl__ev_xswatch libxl__ev_xswatch; @@ -329,6 +331,8 @@ struct libxl__ctx { libxl__poller poller_app; /* libxl_osevent_beforepoll and _afterpoll */ LIBXL_LIST_HEAD(, libxl__poller) pollers_event, pollers_idle; + LIBXL_SLIST_HEAD(libxl__osevent_hook_nexi, libxl__osevent_hook_nexus) + hook_fd_nexi_idle, hook_timeout_nexi_idle; LIBXL_LIST_HEAD(, libxl__ev_fd) efds; LIBXL_TAILQ_HEAD(, libxl__ev_time) etimes; -- 1.7.2.5 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] fix race condition between libvirtd event handling and libxl fd deregister
@ 2012-12-10 16:56 Ian Jackson
2012-12-10 16:57 ` [PATCH 1/2] libxl: fix stale fd event callback race Ian Jackson
0 siblings, 1 reply; 13+ messages in thread
From: Ian Jackson @ 2012-12-10 16:56 UTC (permalink / raw)
To: Ian Campbell, Bamvor Jian Zhang, xen-devel@lists.xen.org,
jfehlig@suse.com
Ian Jackson writes ("Re: [PATCH] fix race condition between libvirtd event handling and libxl fd deregister"):
> I'm not surprised that the original patch makes Bamvor's symptoms go
> away. Bamvor had one of the possible races (the fd-related one) but
> not the other.
Here (followups to this message, shortly) is v3 of my two-patch series
which after conversation with Ian C I think fully fixes the race, and
which I have tested now.
Bamvor, can you test this and let us know hwether it fixes your problem?
Thanks,
Ian.
^ permalink raw reply [flat|nested] 13+ messages in thread* [PATCH 1/2] libxl: fix stale fd event callback race 2012-12-10 16:56 [PATCH] fix race condition between libvirtd event handling and libxl fd deregister Ian Jackson @ 2012-12-10 16:57 ` Ian Jackson 2012-12-11 22:35 ` Jim Fehlig 0 siblings, 1 reply; 13+ messages in thread From: Ian Jackson @ 2012-12-10 16:57 UTC (permalink / raw) To: xen-devel; +Cc: Ian Jackson, Ian Campbell, Bamvor Jian Zhang Because there is not necessarily any lock held at the point the application (eg, libvirt) calls libxl_osevent_occurred_timeout and ..._fd, in a multithreaded program those calls may be arbitrarily delayed in relation to other activities within the program. libxl therefore needs to be prepared to receive very old event callbacks. Arrange for this to be the case for fd callbacks. This requires a new layer of indirection through a "hook nexus" struct which can outlive the libxl__ev_foo. Allocation and deallocation of these nexi is mostly handled in the OSEVENT macros which wrap up the application's callbacks. Document the problem and the solution in a comment in libxl_event.c just before the definition of struct libxl__osevent_hook_nexus. There is still a race relating to libxl__osevent_occurred_timeout; this will be addressed in the following patch. Reported-by: Bamvor Jian Zhang <bjzhang@suse.com> Cc: Bamvor Jian Zhang <bjzhang@suse.com> Cc: Ian Campbell <Ian.Campbell@citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> -- v2: - Prepare for fixing timeout race too - Break out osevent_release_nexus() - nexusop argument to OSEVENT* macros - Clarify OSEVENT* nexusop hooks - osevent_ev_from_hook_nexus takes a libxl__osevent_hook_nexus* --- tools/libxl/libxl_event.c | 184 +++++++++++++++++++++++++++++++++++------ tools/libxl/libxl_internal.h | 8 ++- 2 files changed, 163 insertions(+), 29 deletions(-) diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c index 72cb723..f1fe425 100644 --- a/tools/libxl/libxl_event.c +++ b/tools/libxl/libxl_event.c @@ -38,23 +38,131 @@ * The application's registration hooks should be called ONLY via * these macros, with the ctx locked. Likewise all the "occurred" * entrypoints from the application should assert(!in_hook); + * + * During the hook call - including while the arguments are being + * evaluated - ev->nexus is guaranteed to be valid and refer to the + * nexus which is being used for this event registration. The + * arguments should specify ev->nexus for the for_libxl argument and + * ev->nexus->for_app_reg (or a pointer to it) for for_app_reg. */ -#define OSEVENT_HOOK_INTERN(retval, hookname, ...) do { \ - if (CTX->osevent_hooks) { \ - CTX->osevent_in_hook++; \ - retval CTX->osevent_hooks->hookname(CTX->osevent_user, __VA_ARGS__); \ - CTX->osevent_in_hook--; \ - } \ +#define OSEVENT_HOOK_INTERN(retval, failedp, evkind, hookop, nexusop, ...) do { \ + if (CTX->osevent_hooks) { \ + CTX->osevent_in_hook++; \ + libxl__osevent_hook_nexi *nexi = &CTX->hook_##evkind##_nexi_idle; \ + osevent_hook_pre_##nexusop(gc, ev, nexi, &ev->nexus); \ + retval CTX->osevent_hooks->evkind##_##hookop \ + (CTX->osevent_user, __VA_ARGS__); \ + if ((failedp)) \ + osevent_hook_failed_##nexusop(gc, ev, nexi, &ev->nexus); \ + CTX->osevent_in_hook--; \ + } \ } while (0) -#define OSEVENT_HOOK(hookname, ...) ({ \ - int osevent_hook_rc = 0; \ - OSEVENT_HOOK_INTERN(osevent_hook_rc = , hookname, __VA_ARGS__); \ - osevent_hook_rc; \ +#define OSEVENT_HOOK(evkind, hookop, nexusop, ...) ({ \ + int osevent_hook_rc = 0; \ + OSEVENT_HOOK_INTERN(osevent_hook_rc =, !!osevent_hook_rc, \ + evkind, hookop, nexusop, __VA_ARGS__); \ + osevent_hook_rc; \ }) -#define OSEVENT_HOOK_VOID(hookname, ...) \ - OSEVENT_HOOK_INTERN(/* void */, hookname, __VA_ARGS__) +#define OSEVENT_HOOK_VOID(evkind, hookop, nexusop, ...) \ + OSEVENT_HOOK_INTERN(/* void */, 0, evkind, hookop, nexusop, __VA_ARGS__) + +/* + * The application's calls to libxl_osevent_occurred_... may be + * indefinitely delayed with respect to the rest of the program (since + * they are not necessarily called with any lock held). So the + * for_libxl value we receive may be (almost) arbitrarily old. All we + * know is that it came from this ctx. + * + * Therefore we may not free the object referred to by any for_libxl + * value until we free the whole libxl_ctx. And if we reuse it we + * must be able to tell when an old use turns up, and discard the + * stale event. + * + * Thus we cannot use the ev directly as the for_libxl value - we need + * a layer of indirection. + * + * We do this by keeping a pool of libxl__osevent_hook_nexus structs, + * and use pointers to them as for_libxl values. In fact, there are + * two pools: one for fds and one for timeouts. This ensures that we + * don't risk a type error when we upcast nexus->ev. In each nexus + * the ev is either null or points to a valid libxl__ev_time or + * libxl__ev_fd, as applicable. + * + * We /do/ allow ourselves to reassociate an old nexus with a new ev + * as otherwise we would have to leak nexi. (This reassociation + * might, of course, be an old ev being reused for a new purpose so + * simply comparing the ev pointer is not sufficient.) Thus the + * libxl_osevent_occurred functions need to check that the condition + * allegedly signalled by this event actually exists. + * + * The nexi and the lists are all protected by the ctx lock. + */ + +struct libxl__osevent_hook_nexus { + void *ev; + void *for_app_reg; + LIBXL_SLIST_ENTRY(libxl__osevent_hook_nexus) next; +}; + +static void *osevent_ev_from_hook_nexus(libxl_ctx *ctx, + libxl__osevent_hook_nexus *nexus /* pass void *for_libxl */) +{ + return nexus->ev; +} + +static void osevent_release_nexus(libxl__gc *gc, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus *nexus) +{ + nexus->ev = 0; + LIBXL_SLIST_INSERT_HEAD(nexi_idle, nexus, next); +} + +/*----- OSEVENT* hook functions for nexusop "alloc" -----*/ +static void osevent_hook_pre_alloc(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus_r) +{ + libxl__osevent_hook_nexus *nexus = LIBXL_SLIST_FIRST(nexi_idle); + if (nexus) { + LIBXL_SLIST_REMOVE_HEAD(nexi_idle, next); + } else { + nexus = libxl__zalloc(NOGC, sizeof(*nexus)); + } + nexus->ev = ev; + *nexus_r = nexus; +} +static void osevent_hook_failed_alloc(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) +{ + osevent_release_nexus(gc, nexi_idle, *nexus); +} + +/*----- OSEVENT* hook functions for nexusop "release" -----*/ +static void osevent_hook_pre_release(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) +{ + osevent_release_nexus(gc, nexi_idle, *nexus); +} +static void osevent_hook_failed_release(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) +{ + abort(); +} + +/*----- OSEVENT* hook functions for nexusop "noop" -----*/ +static void osevent_hook_pre_noop(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) { } +static void osevent_hook_failed_noop(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) { } + /* * fd events @@ -72,7 +180,8 @@ int libxl__ev_fd_register(libxl__gc *gc, libxl__ev_fd *ev, DBG("ev_fd=%p register fd=%d events=%x", ev, fd, events); - rc = OSEVENT_HOOK(fd_register, fd, &ev->for_app_reg, events, ev); + rc = OSEVENT_HOOK(fd,register, alloc, fd, &ev->nexus->for_app_reg, + events, ev->nexus); if (rc) goto out; ev->fd = fd; @@ -97,7 +206,7 @@ int libxl__ev_fd_modify(libxl__gc *gc, libxl__ev_fd *ev, short events) DBG("ev_fd=%p modify fd=%d events=%x", ev, ev->fd, events); - rc = OSEVENT_HOOK(fd_modify, ev->fd, &ev->for_app_reg, events); + rc = OSEVENT_HOOK(fd,modify, noop, ev->fd, &ev->nexus->for_app_reg, events); if (rc) goto out; ev->events = events; @@ -119,7 +228,7 @@ void libxl__ev_fd_deregister(libxl__gc *gc, libxl__ev_fd *ev) DBG("ev_fd=%p deregister fd=%d", ev, ev->fd); - OSEVENT_HOOK_VOID(fd_deregister, ev->fd, ev->for_app_reg); + OSEVENT_HOOK_VOID(fd,deregister, release, ev->fd, ev->nexus->for_app_reg); LIBXL_LIST_REMOVE(ev, entry); ev->fd = -1; @@ -171,7 +280,8 @@ static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev, { int rc; - rc = OSEVENT_HOOK(timeout_register, &ev->for_app_reg, absolute, ev); + rc = OSEVENT_HOOK(timeout,register, alloc, &ev->nexus->for_app_reg, + absolute, ev->nexus); if (rc) return rc; ev->infinite = 0; @@ -184,7 +294,7 @@ static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev, static void time_deregister(libxl__gc *gc, libxl__ev_time *ev) { if (!ev->infinite) { - OSEVENT_HOOK_VOID(timeout_deregister, ev->for_app_reg); + OSEVENT_HOOK_VOID(timeout,deregister, release, ev->nexus->for_app_reg); LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); } } @@ -270,7 +380,8 @@ int libxl__ev_time_modify_abs(libxl__gc *gc, libxl__ev_time *ev, rc = time_register_finite(gc, ev, absolute); if (rc) goto out; } else { - rc = OSEVENT_HOOK(timeout_modify, &ev->for_app_reg, absolute); + rc = OSEVENT_HOOK(timeout,modify, noop, + &ev->nexus->for_app_reg, absolute); if (rc) goto out; LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); @@ -1010,35 +1121,54 @@ void libxl_osevent_register_hooks(libxl_ctx *ctx, void libxl_osevent_occurred_fd(libxl_ctx *ctx, void *for_libxl, - int fd, short events, short revents) + int fd, short events_ign, short revents_ign) { - libxl__ev_fd *ev = for_libxl; - EGC_INIT(ctx); CTX_LOCK; assert(!CTX->osevent_in_hook); - assert(fd == ev->fd); - revents &= ev->events; - if (revents) - ev->func(egc, ev, fd, ev->events, revents); + libxl__ev_fd *ev = osevent_ev_from_hook_nexus(ctx, for_libxl); + if (!ev) goto out; + if (ev->fd != fd) goto out; + struct pollfd check; + for (;;) { + check.fd = fd; + check.events = ev->events; + int r = poll(&check, 1, 0); + if (!r) + goto out; + if (r==1) + break; + assert(r<0); + if (errno != EINTR) { + LIBXL__EVENT_DISASTER(egc, "failed poll to check for fd", errno, 0); + goto out; + } + } + + if (check.revents) + ev->func(egc, ev, fd, ev->events, check.revents); + + out: CTX_UNLOCK; EGC_FREE; } void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl) { - libxl__ev_time *ev = for_libxl; - EGC_INIT(ctx); CTX_LOCK; assert(!CTX->osevent_in_hook); + libxl__ev_time *ev = osevent_ev_from_hook_nexus(ctx, for_libxl); + if (!ev) goto out; assert(!ev->infinite); + LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); ev->func(egc, ev, &ev->abs); + out: CTX_UNLOCK; EGC_FREE; } diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index cba3616..6484bcb 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -136,6 +136,8 @@ typedef struct libxl__gc libxl__gc; typedef struct libxl__egc libxl__egc; typedef struct libxl__ao libxl__ao; typedef struct libxl__aop_occurred libxl__aop_occurred; +typedef struct libxl__osevent_hook_nexus libxl__osevent_hook_nexus; +typedef struct libxl__osevent_hook_nexi libxl__osevent_hook_nexi; _hidden void libxl__alloc_failed(libxl_ctx *, const char *func, size_t nmemb, size_t size) __attribute__((noreturn)); @@ -163,7 +165,7 @@ struct libxl__ev_fd { libxl__ev_fd_callback *func; /* remainder is private for libxl__ev_fd... */ LIBXL_LIST_ENTRY(libxl__ev_fd) entry; - void *for_app_reg; + libxl__osevent_hook_nexus *nexus; }; @@ -178,7 +180,7 @@ struct libxl__ev_time { int infinite; /* not registered in list or with app if infinite */ LIBXL_TAILQ_ENTRY(libxl__ev_time) entry; struct timeval abs; - void *for_app_reg; + libxl__osevent_hook_nexus *nexus; }; typedef struct libxl__ev_xswatch libxl__ev_xswatch; @@ -329,6 +331,8 @@ struct libxl__ctx { libxl__poller poller_app; /* libxl_osevent_beforepoll and _afterpoll */ LIBXL_LIST_HEAD(, libxl__poller) pollers_event, pollers_idle; + LIBXL_SLIST_HEAD(libxl__osevent_hook_nexi, libxl__osevent_hook_nexus) + hook_fd_nexi_idle, hook_timeout_nexi_idle; LIBXL_LIST_HEAD(, libxl__ev_fd) efds; LIBXL_TAILQ_HEAD(, libxl__ev_time) etimes; -- 1.7.2.5 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] libxl: fix stale fd event callback race 2012-12-10 16:57 ` [PATCH 1/2] libxl: fix stale fd event callback race Ian Jackson @ 2012-12-11 22:35 ` Jim Fehlig 2012-12-12 17:04 ` Ian Jackson 0 siblings, 1 reply; 13+ messages in thread From: Jim Fehlig @ 2012-12-11 22:35 UTC (permalink / raw) To: Ian Jackson; +Cc: Bamvor Jian Zhang, Ian Campbell, xen-devel Ian Jackson wrote: > Because there is not necessarily any lock held at the point the > application (eg, libvirt) calls libxl_osevent_occurred_timeout and > ..._fd, in a multithreaded program those calls may be arbitrarily > delayed in relation to other activities within the program. > > libxl therefore needs to be prepared to receive very old event > callbacks. Arrange for this to be the case for fd callbacks. > > This requires a new layer of indirection through a "hook nexus" struct > which can outlive the libxl__ev_foo. Allocation and deallocation of > these nexi is mostly handled in the OSEVENT macros which wrap up > the application's callbacks. > > Document the problem and the solution in a comment in libxl_event.c > just before the definition of struct libxl__osevent_hook_nexus. > > There is still a race relating to libxl__osevent_occurred_timeout; > this will be addressed in the following patch. > > Reported-by: Bamvor Jian Zhang <bjzhang@suse.com> > Cc: Bamvor Jian Zhang <bjzhang@suse.com> > Cc: Ian Campbell <Ian.Campbell@citrix.com> > Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> > Hi Ian, Thanks for the patches! I've found some time to test them in the context of Xen 4.2 and have some comments. For this patch, only a few nits below. > -- > v2: > - Prepare for fixing timeout race too > - Break out osevent_release_nexus() > - nexusop argument to OSEVENT* macros > - Clarify OSEVENT* nexusop hooks > - osevent_ev_from_hook_nexus takes a libxl__osevent_hook_nexus* > --- > tools/libxl/libxl_event.c | 184 +++++++++++++++++++++++++++++++++++------ > tools/libxl/libxl_internal.h | 8 ++- > 2 files changed, 163 insertions(+), 29 deletions(-) > > diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c > index 72cb723..f1fe425 100644 > --- a/tools/libxl/libxl_event.c > +++ b/tools/libxl/libxl_event.c > @@ -38,23 +38,131 @@ > * The application's registration hooks should be called ONLY via > * these macros, with the ctx locked. Likewise all the "occurred" > * entrypoints from the application should assert(!in_hook); > + * > + * During the hook call - including while the arguments are being > + * evaluated - ev->nexus is guaranteed to be valid and refer to the > + * nexus which is being used for this event registration. The > + * arguments should specify ev->nexus for the for_libxl argument and > + * ev->nexus->for_app_reg (or a pointer to it) for for_app_reg. > */ > -#define OSEVENT_HOOK_INTERN(retval, hookname, ...) do { \ > - if (CTX->osevent_hooks) { \ > - CTX->osevent_in_hook++; \ > - retval CTX->osevent_hooks->hookname(CTX->osevent_user, __VA_ARGS__); \ > - CTX->osevent_in_hook--; \ > - } \ > +#define OSEVENT_HOOK_INTERN(retval, failedp, evkind, hookop, nexusop, ...) do { \ > + if (CTX->osevent_hooks) { \ > + CTX->osevent_in_hook++; \ > + libxl__osevent_hook_nexi *nexi = &CTX->hook_##evkind##_nexi_idle; \ > + osevent_hook_pre_##nexusop(gc, ev, nexi, &ev->nexus); \ > + retval CTX->osevent_hooks->evkind##_##hookop \ > + (CTX->osevent_user, __VA_ARGS__); \ > + if ((failedp)) \ > + osevent_hook_failed_##nexusop(gc, ev, nexi, &ev->nexus); \ > + CTX->osevent_in_hook--; \ > + } \ > } while (0) > > -#define OSEVENT_HOOK(hookname, ...) ({ \ > - int osevent_hook_rc = 0; \ > - OSEVENT_HOOK_INTERN(osevent_hook_rc = , hookname, __VA_ARGS__); \ > - osevent_hook_rc; \ > +#define OSEVENT_HOOK(evkind, hookop, nexusop, ...) ({ \ > + int osevent_hook_rc = 0; \ > + OSEVENT_HOOK_INTERN(osevent_hook_rc =, !!osevent_hook_rc, \ > + evkind, hookop, nexusop, __VA_ARGS__); \ > + osevent_hook_rc; \ > }) > > -#define OSEVENT_HOOK_VOID(hookname, ...) \ > - OSEVENT_HOOK_INTERN(/* void */, hookname, __VA_ARGS__) > +#define OSEVENT_HOOK_VOID(evkind, hookop, nexusop, ...) \ > + OSEVENT_HOOK_INTERN(/* void */, 0, evkind, hookop, nexusop, __VA_ARGS__) > + > +/* > + * The application's calls to libxl_osevent_occurred_... may be > + * indefinitely delayed with respect to the rest of the program (since > + * they are not necessarily called with any lock held). So the > + * for_libxl value we receive may be (almost) arbitrarily old. All we > + * know is that it came from this ctx. > + * > + * Therefore we may not free the object referred to by any for_libxl > + * value until we free the whole libxl_ctx. And if we reuse it we > + * must be able to tell when an old use turns up, and discard the > + * stale event. > + * > + * Thus we cannot use the ev directly as the for_libxl value - we need > + * a layer of indirection. > + * > + * We do this by keeping a pool of libxl__osevent_hook_nexus structs, > + * and use pointers to them as for_libxl values. In fact, there are > + * two pools: one for fds and one for timeouts. This ensures that we > + * don't risk a type error when we upcast nexus->ev. In each nexus > + * the ev is either null or points to a valid libxl__ev_time or > + * libxl__ev_fd, as applicable. > + * > + * We /do/ allow ourselves to reassociate an old nexus with a new ev > + * as otherwise we would have to leak nexi. (This reassociation > + * might, of course, be an old ev being reused for a new purpose so > + * simply comparing the ev pointer is not sufficient.) Thus the > + * libxl_osevent_occurred functions need to check that the condition > + * allegedly signalled by this event actually exists. > + * > + * The nexi and the lists are all protected by the ctx lock. > + */ > + > +struct libxl__osevent_hook_nexus { > + void *ev; > + void *for_app_reg; > + LIBXL_SLIST_ENTRY(libxl__osevent_hook_nexus) next; > +}; > + > +static void *osevent_ev_from_hook_nexus(libxl_ctx *ctx, > + libxl__osevent_hook_nexus *nexus /* pass void *for_libxl */) > +{ > + return nexus->ev; > +} > + > +static void osevent_release_nexus(libxl__gc *gc, > + libxl__osevent_hook_nexi *nexi_idle, > + libxl__osevent_hook_nexus *nexus) > +{ > + nexus->ev = 0; > + LIBXL_SLIST_INSERT_HEAD(nexi_idle, nexus, next); > +} > + > +/*----- OSEVENT* hook functions for nexusop "alloc" -----*/ > +static void osevent_hook_pre_alloc(libxl__gc *gc, void *ev, > + libxl__osevent_hook_nexi *nexi_idle, > + libxl__osevent_hook_nexus **nexus_r) > +{ > + libxl__osevent_hook_nexus *nexus = LIBXL_SLIST_FIRST(nexi_idle); > + if (nexus) { > + LIBXL_SLIST_REMOVE_HEAD(nexi_idle, next); > + } else { > + nexus = libxl__zalloc(NOGC, sizeof(*nexus)); > + } > + nexus->ev = ev; > + *nexus_r = nexus; > +} > +static void osevent_hook_failed_alloc(libxl__gc *gc, void *ev, > + libxl__osevent_hook_nexi *nexi_idle, > + libxl__osevent_hook_nexus **nexus) > +{ > + osevent_release_nexus(gc, nexi_idle, *nexus); > +} > + > +/*----- OSEVENT* hook functions for nexusop "release" -----*/ > +static void osevent_hook_pre_release(libxl__gc *gc, void *ev, > + libxl__osevent_hook_nexi *nexi_idle, > + libxl__osevent_hook_nexus **nexus) > +{ > + osevent_release_nexus(gc, nexi_idle, *nexus); > +} > +static void osevent_hook_failed_release(libxl__gc *gc, void *ev, > + libxl__osevent_hook_nexi *nexi_idle, > + libxl__osevent_hook_nexus **nexus) > +{ > + abort(); > +} > + > +/*----- OSEVENT* hook functions for nexusop "noop" -----*/ > +static void osevent_hook_pre_noop(libxl__gc *gc, void *ev, > + libxl__osevent_hook_nexi *nexi_idle, > + libxl__osevent_hook_nexus **nexus) { } > +static void osevent_hook_failed_noop(libxl__gc *gc, void *ev, > + libxl__osevent_hook_nexi *nexi_idle, > + libxl__osevent_hook_nexus **nexus) { } > + > > /* > * fd events > @@ -72,7 +180,8 @@ int libxl__ev_fd_register(libxl__gc *gc, libxl__ev_fd *ev, > > DBG("ev_fd=%p register fd=%d events=%x", ev, fd, events); > > - rc = OSEVENT_HOOK(fd_register, fd, &ev->for_app_reg, events, ev); > + rc = OSEVENT_HOOK(fd,register, alloc, fd, &ev->nexus->for_app_reg, > Nit, should there be a space between 'fd,' and 'register'? Also, not that gcc complained, but register is a keyword. > + events, ev->nexus); > if (rc) goto out; > > ev->fd = fd; > @@ -97,7 +206,7 @@ int libxl__ev_fd_modify(libxl__gc *gc, libxl__ev_fd *ev, short events) > > DBG("ev_fd=%p modify fd=%d events=%x", ev, ev->fd, events); > > - rc = OSEVENT_HOOK(fd_modify, ev->fd, &ev->for_app_reg, events); > + rc = OSEVENT_HOOK(fd,modify, noop, ev->fd, &ev->nexus->for_app_reg, events); > Same nit here, and in a few other uses of OSEVENT_HOOK* below. Regards, Jim > if (rc) goto out; > > ev->events = events; > @@ -119,7 +228,7 @@ void libxl__ev_fd_deregister(libxl__gc *gc, libxl__ev_fd *ev) > > DBG("ev_fd=%p deregister fd=%d", ev, ev->fd); > > - OSEVENT_HOOK_VOID(fd_deregister, ev->fd, ev->for_app_reg); > + OSEVENT_HOOK_VOID(fd,deregister, release, ev->fd, ev->nexus->for_app_reg); > LIBXL_LIST_REMOVE(ev, entry); > ev->fd = -1; > > @@ -171,7 +280,8 @@ static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev, > { > int rc; > > - rc = OSEVENT_HOOK(timeout_register, &ev->for_app_reg, absolute, ev); > + rc = OSEVENT_HOOK(timeout,register, alloc, &ev->nexus->for_app_reg, > + absolute, ev->nexus); > if (rc) return rc; > > ev->infinite = 0; > @@ -184,7 +294,7 @@ static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev, > static void time_deregister(libxl__gc *gc, libxl__ev_time *ev) > { > if (!ev->infinite) { > - OSEVENT_HOOK_VOID(timeout_deregister, ev->for_app_reg); > + OSEVENT_HOOK_VOID(timeout,deregister, release, ev->nexus->for_app_reg); > LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); > } > } > @@ -270,7 +380,8 @@ int libxl__ev_time_modify_abs(libxl__gc *gc, libxl__ev_time *ev, > rc = time_register_finite(gc, ev, absolute); > if (rc) goto out; > } else { > - rc = OSEVENT_HOOK(timeout_modify, &ev->for_app_reg, absolute); > + rc = OSEVENT_HOOK(timeout,modify, noop, > + &ev->nexus->for_app_reg, absolute); > if (rc) goto out; > > LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); > @@ -1010,35 +1121,54 @@ void libxl_osevent_register_hooks(libxl_ctx *ctx, > > > void libxl_osevent_occurred_fd(libxl_ctx *ctx, void *for_libxl, > - int fd, short events, short revents) > + int fd, short events_ign, short revents_ign) > { > - libxl__ev_fd *ev = for_libxl; > - > EGC_INIT(ctx); > CTX_LOCK; > assert(!CTX->osevent_in_hook); > > - assert(fd == ev->fd); > - revents &= ev->events; > - if (revents) > - ev->func(egc, ev, fd, ev->events, revents); > + libxl__ev_fd *ev = osevent_ev_from_hook_nexus(ctx, for_libxl); > + if (!ev) goto out; > + if (ev->fd != fd) goto out; > > + struct pollfd check; > + for (;;) { > + check.fd = fd; > + check.events = ev->events; > + int r = poll(&check, 1, 0); > + if (!r) > + goto out; > + if (r==1) > + break; > + assert(r<0); > + if (errno != EINTR) { > + LIBXL__EVENT_DISASTER(egc, "failed poll to check for fd", errno, 0); > + goto out; > + } > + } > + > + if (check.revents) > + ev->func(egc, ev, fd, ev->events, check.revents); > + > + out: > CTX_UNLOCK; > EGC_FREE; > } > > void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl) > { > - libxl__ev_time *ev = for_libxl; > - > EGC_INIT(ctx); > CTX_LOCK; > assert(!CTX->osevent_in_hook); > > + libxl__ev_time *ev = osevent_ev_from_hook_nexus(ctx, for_libxl); > + if (!ev) goto out; > assert(!ev->infinite); > + > LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); > ev->func(egc, ev, &ev->abs); > > + out: > CTX_UNLOCK; > EGC_FREE; > } > diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h > index cba3616..6484bcb 100644 > --- a/tools/libxl/libxl_internal.h > +++ b/tools/libxl/libxl_internal.h > @@ -136,6 +136,8 @@ typedef struct libxl__gc libxl__gc; > typedef struct libxl__egc libxl__egc; > typedef struct libxl__ao libxl__ao; > typedef struct libxl__aop_occurred libxl__aop_occurred; > +typedef struct libxl__osevent_hook_nexus libxl__osevent_hook_nexus; > +typedef struct libxl__osevent_hook_nexi libxl__osevent_hook_nexi; > > _hidden void libxl__alloc_failed(libxl_ctx *, const char *func, > size_t nmemb, size_t size) __attribute__((noreturn)); > @@ -163,7 +165,7 @@ struct libxl__ev_fd { > libxl__ev_fd_callback *func; > /* remainder is private for libxl__ev_fd... */ > LIBXL_LIST_ENTRY(libxl__ev_fd) entry; > - void *for_app_reg; > + libxl__osevent_hook_nexus *nexus; > }; > > > @@ -178,7 +180,7 @@ struct libxl__ev_time { > int infinite; /* not registered in list or with app if infinite */ > LIBXL_TAILQ_ENTRY(libxl__ev_time) entry; > struct timeval abs; > - void *for_app_reg; > + libxl__osevent_hook_nexus *nexus; > }; > > typedef struct libxl__ev_xswatch libxl__ev_xswatch; > @@ -329,6 +331,8 @@ struct libxl__ctx { > libxl__poller poller_app; /* libxl_osevent_beforepoll and _afterpoll */ > LIBXL_LIST_HEAD(, libxl__poller) pollers_event, pollers_idle; > > + LIBXL_SLIST_HEAD(libxl__osevent_hook_nexi, libxl__osevent_hook_nexus) > + hook_fd_nexi_idle, hook_timeout_nexi_idle; > LIBXL_LIST_HEAD(, libxl__ev_fd) efds; > LIBXL_TAILQ_HEAD(, libxl__ev_time) etimes; > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] libxl: fix stale fd event callback race 2012-12-11 22:35 ` Jim Fehlig @ 2012-12-12 17:04 ` Ian Jackson 2012-12-12 17:20 ` Jim Fehlig 0 siblings, 1 reply; 13+ messages in thread From: Ian Jackson @ 2012-12-12 17:04 UTC (permalink / raw) To: Jim Fehlig; +Cc: Bamvor Jian Zhang, Ian Campbell, xen-devel@lists.xen.org Jim Fehlig writes ("Re: [Xen-devel] [PATCH 1/2] libxl: fix stale fd event callback race"): > Thanks for the patches! I've found some time to test them in the context > of Xen 4.2 and have some comments. For this patch, only a few nits below. Thanks for the review and testing. > > - rc = OSEVENT_HOOK(fd_register, fd, &ev->for_app_reg, events, ev); > > + rc = OSEVENT_HOOK(fd,register, alloc, fd, &ev->nexus->for_app_reg, > > > > Nit, should there be a space between 'fd,' and 'register'? I did this deliberately because the macro uses token pasting to turn this into fd_register, at least in many of the uses. >>Also, not that gcc complained, but register is a keyword. The token "register" gets pasted together with other things by the preprocessor before the compiler sees it, so it's correct as far as the language spec goes. As for it being potentially confusing, changing what appears here is difficult without changing the names in libxl_osevent_hooks. Thanks, Ian. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] libxl: fix stale fd event callback race 2012-12-12 17:04 ` Ian Jackson @ 2012-12-12 17:20 ` Jim Fehlig 0 siblings, 0 replies; 13+ messages in thread From: Jim Fehlig @ 2012-12-12 17:20 UTC (permalink / raw) To: Ian Jackson; +Cc: xen-devel@lists.xen.org, Ian Campbell, Bamvor Jian Zhang Ian Jackson wrote: > Jim Fehlig writes ("Re: [Xen-devel] [PATCH 1/2] libxl: fix stale fd event callback race"): > >> Thanks for the patches! I've found some time to test them in the context >> of Xen 4.2 and have some comments. For this patch, only a few nits below. >> > > Thanks for the review and testing. > > >>> - rc = OSEVENT_HOOK(fd_register, fd, &ev->for_app_reg, events, ev); >>> + rc = OSEVENT_HOOK(fd,register, alloc, fd, &ev->nexus->for_app_reg, >>> >>> >> Nit, should there be a space between 'fd,' and 'register'? >> > > I did this deliberately because the macro uses token pasting to turn > this into fd_register, at least in many of the uses. > Ok. > >>> Also, not that gcc complained, but register is a keyword. >>> > > The token "register" gets pasted together with other things by the > preprocessor before the compiler sees it, so it's correct as far as > the language spec goes. As for it being potentially confusing, > changing what appears here is difficult without changing the names > in libxl_osevent_hooks. > Yes, understood. Regards, Jim ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] fix race condition between libvirtd event handling and libxl fd deregister
@ 2012-12-07 19:11 Ian Jackson
2012-12-07 19:15 ` [PATCH 1/2] libxl: fix stale fd event callback race Ian Jackson
2012-12-07 19:21 ` Ian Jackson
0 siblings, 2 replies; 13+ messages in thread
From: Ian Jackson @ 2012-12-07 19:11 UTC (permalink / raw)
To: Ian Campbell; +Cc: jfehlig@suse.com, xen-devel@lists.xen.org, Bamvor Jian Zhang
Ian Campbell writes ("Re: [PATCH] fix race condition between libvirtd event handling and libxl fd deregister"):
> Can we provide, or (more likely) require the application to provide, a
> lock (perhaps per-event or, again more likely, per-event-loop) which
> must be held while processing callbacks and also while events are being
> registered/unregistered with the application's event handling subsystem?
> With such a lock in place the application would be able to guarantee
> that having returned from the deregister hook any further events would
> be seen as spurious events by its own event processing loop.
I think this might be difficult to get right without deadlocks.
...
> Last half brained idea would be to split the deregistration into two.
> libxl calls up to the app saying "please deregister" and the app calls
> back to libxl to say "I am no longer watching for this event and
> guarantee that I won't deliver it any more". (Presumably this would be
> implemented by the application via some combination of the above). This
> could be done in a somewhat compatible way by allowing the deregister
> hook to return "PENDING".
This is in fact straightforward and is a subset of the existing API.
If we have libxl always call timeout_modify with abs={0,0}, it will
get timeout_occurred when the application's event loop has dealt with
it. We can simply never call timeout_deregister.
I have implemented this in the 2-patch RFD series I'm about to send.
NB this series has been complied but not (as yet) executed by me...
Ian.
^ permalink raw reply [flat|nested] 13+ messages in thread* [PATCH 1/2] libxl: fix stale fd event callback race 2012-12-07 19:11 [PATCH] fix race condition between libvirtd event handling and libxl fd deregister Ian Jackson @ 2012-12-07 19:15 ` Ian Jackson 2012-12-07 19:22 ` Ian Jackson 2012-12-07 19:21 ` Ian Jackson 1 sibling, 1 reply; 13+ messages in thread From: Ian Jackson @ 2012-12-07 19:15 UTC (permalink / raw) To: xen-devel; +Cc: Ian Jackson, Ian Campbell, Bamvor Jian Zhang Because there is not necessarily any lock held at the point the application (eg, libvirt) calls libxl_osevent_occurred_timeout and ..._fd, in a multithreaded program those calls may be arbitrarily delayed in relation to other activities within the program. libxl therefore needs to be prepared to receive very old event callbacks. Arrange for this to be the case for fd callbacks. This requires a new layer of indirection through a "hook nexus" struct which can outlive the libxl__ev_foo. Allocation and deallocation of these nexi is mostly handled in the OSEVENT macros which wrap up the application's callbacks. Document the problem and the solution in a comment in libxl_event.c just before the definition of struct libxl__osevent_hook_nexus. There is still a race relating to libxl__osevent_occurred_timeout; this will be addressed in the following patch. Reported-by: Bamvor Jian Zhang <bjzhang@suse.com> Cc: Bamvor Jian Zhang <bjzhang@suse.com> Cc: Ian Campbell <Ian.Campbell@citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> -- v2: - Prepare for fixing timeout race too - Break out osevent_release_nexus() - nexusop argument to OSEVENT* macros - Clarify OSEVENT* nexusop hooks - osevent_ev_from_hook_nexus takes a libxl__osevent_hook_nexus* --- tools/libxl/libxl_event.c | 184 +++++++++++++++++++++++++++++++++++------ tools/libxl/libxl_internal.h | 8 ++- 2 files changed, 163 insertions(+), 29 deletions(-) diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c index 72cb723..f1fe425 100644 --- a/tools/libxl/libxl_event.c +++ b/tools/libxl/libxl_event.c @@ -38,23 +38,131 @@ * The application's registration hooks should be called ONLY via * these macros, with the ctx locked. Likewise all the "occurred" * entrypoints from the application should assert(!in_hook); + * + * During the hook call - including while the arguments are being + * evaluated - ev->nexus is guaranteed to be valid and refer to the + * nexus which is being used for this event registration. The + * arguments should specify ev->nexus for the for_libxl argument and + * ev->nexus->for_app_reg (or a pointer to it) for for_app_reg. */ -#define OSEVENT_HOOK_INTERN(retval, hookname, ...) do { \ - if (CTX->osevent_hooks) { \ - CTX->osevent_in_hook++; \ - retval CTX->osevent_hooks->hookname(CTX->osevent_user, __VA_ARGS__); \ - CTX->osevent_in_hook--; \ - } \ +#define OSEVENT_HOOK_INTERN(retval, failedp, evkind, hookop, nexusop, ...) do { \ + if (CTX->osevent_hooks) { \ + CTX->osevent_in_hook++; \ + libxl__osevent_hook_nexi *nexi = &CTX->hook_##evkind##_nexi_idle; \ + osevent_hook_pre_##nexusop(gc, ev, nexi, &ev->nexus); \ + retval CTX->osevent_hooks->evkind##_##hookop \ + (CTX->osevent_user, __VA_ARGS__); \ + if ((failedp)) \ + osevent_hook_failed_##nexusop(gc, ev, nexi, &ev->nexus); \ + CTX->osevent_in_hook--; \ + } \ } while (0) -#define OSEVENT_HOOK(hookname, ...) ({ \ - int osevent_hook_rc = 0; \ - OSEVENT_HOOK_INTERN(osevent_hook_rc = , hookname, __VA_ARGS__); \ - osevent_hook_rc; \ +#define OSEVENT_HOOK(evkind, hookop, nexusop, ...) ({ \ + int osevent_hook_rc = 0; \ + OSEVENT_HOOK_INTERN(osevent_hook_rc =, !!osevent_hook_rc, \ + evkind, hookop, nexusop, __VA_ARGS__); \ + osevent_hook_rc; \ }) -#define OSEVENT_HOOK_VOID(hookname, ...) \ - OSEVENT_HOOK_INTERN(/* void */, hookname, __VA_ARGS__) +#define OSEVENT_HOOK_VOID(evkind, hookop, nexusop, ...) \ + OSEVENT_HOOK_INTERN(/* void */, 0, evkind, hookop, nexusop, __VA_ARGS__) + +/* + * The application's calls to libxl_osevent_occurred_... may be + * indefinitely delayed with respect to the rest of the program (since + * they are not necessarily called with any lock held). So the + * for_libxl value we receive may be (almost) arbitrarily old. All we + * know is that it came from this ctx. + * + * Therefore we may not free the object referred to by any for_libxl + * value until we free the whole libxl_ctx. And if we reuse it we + * must be able to tell when an old use turns up, and discard the + * stale event. + * + * Thus we cannot use the ev directly as the for_libxl value - we need + * a layer of indirection. + * + * We do this by keeping a pool of libxl__osevent_hook_nexus structs, + * and use pointers to them as for_libxl values. In fact, there are + * two pools: one for fds and one for timeouts. This ensures that we + * don't risk a type error when we upcast nexus->ev. In each nexus + * the ev is either null or points to a valid libxl__ev_time or + * libxl__ev_fd, as applicable. + * + * We /do/ allow ourselves to reassociate an old nexus with a new ev + * as otherwise we would have to leak nexi. (This reassociation + * might, of course, be an old ev being reused for a new purpose so + * simply comparing the ev pointer is not sufficient.) Thus the + * libxl_osevent_occurred functions need to check that the condition + * allegedly signalled by this event actually exists. + * + * The nexi and the lists are all protected by the ctx lock. + */ + +struct libxl__osevent_hook_nexus { + void *ev; + void *for_app_reg; + LIBXL_SLIST_ENTRY(libxl__osevent_hook_nexus) next; +}; + +static void *osevent_ev_from_hook_nexus(libxl_ctx *ctx, + libxl__osevent_hook_nexus *nexus /* pass void *for_libxl */) +{ + return nexus->ev; +} + +static void osevent_release_nexus(libxl__gc *gc, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus *nexus) +{ + nexus->ev = 0; + LIBXL_SLIST_INSERT_HEAD(nexi_idle, nexus, next); +} + +/*----- OSEVENT* hook functions for nexusop "alloc" -----*/ +static void osevent_hook_pre_alloc(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus_r) +{ + libxl__osevent_hook_nexus *nexus = LIBXL_SLIST_FIRST(nexi_idle); + if (nexus) { + LIBXL_SLIST_REMOVE_HEAD(nexi_idle, next); + } else { + nexus = libxl__zalloc(NOGC, sizeof(*nexus)); + } + nexus->ev = ev; + *nexus_r = nexus; +} +static void osevent_hook_failed_alloc(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) +{ + osevent_release_nexus(gc, nexi_idle, *nexus); +} + +/*----- OSEVENT* hook functions for nexusop "release" -----*/ +static void osevent_hook_pre_release(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) +{ + osevent_release_nexus(gc, nexi_idle, *nexus); +} +static void osevent_hook_failed_release(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) +{ + abort(); +} + +/*----- OSEVENT* hook functions for nexusop "noop" -----*/ +static void osevent_hook_pre_noop(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) { } +static void osevent_hook_failed_noop(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) { } + /* * fd events @@ -72,7 +180,8 @@ int libxl__ev_fd_register(libxl__gc *gc, libxl__ev_fd *ev, DBG("ev_fd=%p register fd=%d events=%x", ev, fd, events); - rc = OSEVENT_HOOK(fd_register, fd, &ev->for_app_reg, events, ev); + rc = OSEVENT_HOOK(fd,register, alloc, fd, &ev->nexus->for_app_reg, + events, ev->nexus); if (rc) goto out; ev->fd = fd; @@ -97,7 +206,7 @@ int libxl__ev_fd_modify(libxl__gc *gc, libxl__ev_fd *ev, short events) DBG("ev_fd=%p modify fd=%d events=%x", ev, ev->fd, events); - rc = OSEVENT_HOOK(fd_modify, ev->fd, &ev->for_app_reg, events); + rc = OSEVENT_HOOK(fd,modify, noop, ev->fd, &ev->nexus->for_app_reg, events); if (rc) goto out; ev->events = events; @@ -119,7 +228,7 @@ void libxl__ev_fd_deregister(libxl__gc *gc, libxl__ev_fd *ev) DBG("ev_fd=%p deregister fd=%d", ev, ev->fd); - OSEVENT_HOOK_VOID(fd_deregister, ev->fd, ev->for_app_reg); + OSEVENT_HOOK_VOID(fd,deregister, release, ev->fd, ev->nexus->for_app_reg); LIBXL_LIST_REMOVE(ev, entry); ev->fd = -1; @@ -171,7 +280,8 @@ static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev, { int rc; - rc = OSEVENT_HOOK(timeout_register, &ev->for_app_reg, absolute, ev); + rc = OSEVENT_HOOK(timeout,register, alloc, &ev->nexus->for_app_reg, + absolute, ev->nexus); if (rc) return rc; ev->infinite = 0; @@ -184,7 +294,7 @@ static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev, static void time_deregister(libxl__gc *gc, libxl__ev_time *ev) { if (!ev->infinite) { - OSEVENT_HOOK_VOID(timeout_deregister, ev->for_app_reg); + OSEVENT_HOOK_VOID(timeout,deregister, release, ev->nexus->for_app_reg); LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); } } @@ -270,7 +380,8 @@ int libxl__ev_time_modify_abs(libxl__gc *gc, libxl__ev_time *ev, rc = time_register_finite(gc, ev, absolute); if (rc) goto out; } else { - rc = OSEVENT_HOOK(timeout_modify, &ev->for_app_reg, absolute); + rc = OSEVENT_HOOK(timeout,modify, noop, + &ev->nexus->for_app_reg, absolute); if (rc) goto out; LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); @@ -1010,35 +1121,54 @@ void libxl_osevent_register_hooks(libxl_ctx *ctx, void libxl_osevent_occurred_fd(libxl_ctx *ctx, void *for_libxl, - int fd, short events, short revents) + int fd, short events_ign, short revents_ign) { - libxl__ev_fd *ev = for_libxl; - EGC_INIT(ctx); CTX_LOCK; assert(!CTX->osevent_in_hook); - assert(fd == ev->fd); - revents &= ev->events; - if (revents) - ev->func(egc, ev, fd, ev->events, revents); + libxl__ev_fd *ev = osevent_ev_from_hook_nexus(ctx, for_libxl); + if (!ev) goto out; + if (ev->fd != fd) goto out; + struct pollfd check; + for (;;) { + check.fd = fd; + check.events = ev->events; + int r = poll(&check, 1, 0); + if (!r) + goto out; + if (r==1) + break; + assert(r<0); + if (errno != EINTR) { + LIBXL__EVENT_DISASTER(egc, "failed poll to check for fd", errno, 0); + goto out; + } + } + + if (check.revents) + ev->func(egc, ev, fd, ev->events, check.revents); + + out: CTX_UNLOCK; EGC_FREE; } void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl) { - libxl__ev_time *ev = for_libxl; - EGC_INIT(ctx); CTX_LOCK; assert(!CTX->osevent_in_hook); + libxl__ev_time *ev = osevent_ev_from_hook_nexus(ctx, for_libxl); + if (!ev) goto out; assert(!ev->infinite); + LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); ev->func(egc, ev, &ev->abs); + out: CTX_UNLOCK; EGC_FREE; } diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index cba3616..6484bcb 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -136,6 +136,8 @@ typedef struct libxl__gc libxl__gc; typedef struct libxl__egc libxl__egc; typedef struct libxl__ao libxl__ao; typedef struct libxl__aop_occurred libxl__aop_occurred; +typedef struct libxl__osevent_hook_nexus libxl__osevent_hook_nexus; +typedef struct libxl__osevent_hook_nexi libxl__osevent_hook_nexi; _hidden void libxl__alloc_failed(libxl_ctx *, const char *func, size_t nmemb, size_t size) __attribute__((noreturn)); @@ -163,7 +165,7 @@ struct libxl__ev_fd { libxl__ev_fd_callback *func; /* remainder is private for libxl__ev_fd... */ LIBXL_LIST_ENTRY(libxl__ev_fd) entry; - void *for_app_reg; + libxl__osevent_hook_nexus *nexus; }; @@ -178,7 +180,7 @@ struct libxl__ev_time { int infinite; /* not registered in list or with app if infinite */ LIBXL_TAILQ_ENTRY(libxl__ev_time) entry; struct timeval abs; - void *for_app_reg; + libxl__osevent_hook_nexus *nexus; }; typedef struct libxl__ev_xswatch libxl__ev_xswatch; @@ -329,6 +331,8 @@ struct libxl__ctx { libxl__poller poller_app; /* libxl_osevent_beforepoll and _afterpoll */ LIBXL_LIST_HEAD(, libxl__poller) pollers_event, pollers_idle; + LIBXL_SLIST_HEAD(libxl__osevent_hook_nexi, libxl__osevent_hook_nexus) + hook_fd_nexi_idle, hook_timeout_nexi_idle; LIBXL_LIST_HEAD(, libxl__ev_fd) efds; LIBXL_TAILQ_HEAD(, libxl__ev_time) etimes; -- 1.7.2.5 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] libxl: fix stale fd event callback race 2012-12-07 19:15 ` [PATCH 1/2] libxl: fix stale fd event callback race Ian Jackson @ 2012-12-07 19:22 ` Ian Jackson 0 siblings, 0 replies; 13+ messages in thread From: Ian Jackson @ 2012-12-07 19:22 UTC (permalink / raw) To: xen-devel@lists.xen.org, Bamvor Jian Zhang, Ian Campbell Ian Jackson writes ("[PATCH 1/2] libxl: fix stale fd event callback race"): > Because there is not necessarily any lock held at the point the > application (eg, libvirt) calls libxl_osevent_occurred_timeout and > ..._fd, in a multithreaded program those calls may be arbitrarily > delayed in relation to other activities within the program. Sorry for the duplicate; the first seemed to have vanished so I resent it and then naturally it turned up. Ian. ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/2] libxl: fix stale fd event callback race 2012-12-07 19:11 [PATCH] fix race condition between libvirtd event handling and libxl fd deregister Ian Jackson 2012-12-07 19:15 ` [PATCH 1/2] libxl: fix stale fd event callback race Ian Jackson @ 2012-12-07 19:21 ` Ian Jackson 1 sibling, 0 replies; 13+ messages in thread From: Ian Jackson @ 2012-12-07 19:21 UTC (permalink / raw) To: xen-devel; +Cc: Ian Jackson, Ian Campbell, Bamvor Jian Zhang Because there is not necessarily any lock held at the point the application (eg, libvirt) calls libxl_osevent_occurred_timeout and ..._fd, in a multithreaded program those calls may be arbitrarily delayed in relation to other activities within the program. libxl therefore needs to be prepared to receive very old event callbacks. Arrange for this to be the case for fd callbacks. This requires a new layer of indirection through a "hook nexus" struct which can outlive the libxl__ev_foo. Allocation and deallocation of these nexi is mostly handled in the OSEVENT macros which wrap up the application's callbacks. Document the problem and the solution in a comment in libxl_event.c just before the definition of struct libxl__osevent_hook_nexus. There is still a race relating to libxl__osevent_occurred_timeout; this will be addressed in the following patch. Reported-by: Bamvor Jian Zhang <bjzhang@suse.com> Cc: Bamvor Jian Zhang <bjzhang@suse.com> Cc: Ian Campbell <Ian.Campbell@citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> -- v2: - Prepare for fixing timeout race too - Break out osevent_release_nexus() - nexusop argument to OSEVENT* macros - Clarify OSEVENT* nexusop hooks - osevent_ev_from_hook_nexus takes a libxl__osevent_hook_nexus* --- tools/libxl/libxl_event.c | 184 +++++++++++++++++++++++++++++++++++------ tools/libxl/libxl_internal.h | 8 ++- 2 files changed, 163 insertions(+), 29 deletions(-) diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c index 72cb723..f1fe425 100644 --- a/tools/libxl/libxl_event.c +++ b/tools/libxl/libxl_event.c @@ -38,23 +38,131 @@ * The application's registration hooks should be called ONLY via * these macros, with the ctx locked. Likewise all the "occurred" * entrypoints from the application should assert(!in_hook); + * + * During the hook call - including while the arguments are being + * evaluated - ev->nexus is guaranteed to be valid and refer to the + * nexus which is being used for this event registration. The + * arguments should specify ev->nexus for the for_libxl argument and + * ev->nexus->for_app_reg (or a pointer to it) for for_app_reg. */ -#define OSEVENT_HOOK_INTERN(retval, hookname, ...) do { \ - if (CTX->osevent_hooks) { \ - CTX->osevent_in_hook++; \ - retval CTX->osevent_hooks->hookname(CTX->osevent_user, __VA_ARGS__); \ - CTX->osevent_in_hook--; \ - } \ +#define OSEVENT_HOOK_INTERN(retval, failedp, evkind, hookop, nexusop, ...) do { \ + if (CTX->osevent_hooks) { \ + CTX->osevent_in_hook++; \ + libxl__osevent_hook_nexi *nexi = &CTX->hook_##evkind##_nexi_idle; \ + osevent_hook_pre_##nexusop(gc, ev, nexi, &ev->nexus); \ + retval CTX->osevent_hooks->evkind##_##hookop \ + (CTX->osevent_user, __VA_ARGS__); \ + if ((failedp)) \ + osevent_hook_failed_##nexusop(gc, ev, nexi, &ev->nexus); \ + CTX->osevent_in_hook--; \ + } \ } while (0) -#define OSEVENT_HOOK(hookname, ...) ({ \ - int osevent_hook_rc = 0; \ - OSEVENT_HOOK_INTERN(osevent_hook_rc = , hookname, __VA_ARGS__); \ - osevent_hook_rc; \ +#define OSEVENT_HOOK(evkind, hookop, nexusop, ...) ({ \ + int osevent_hook_rc = 0; \ + OSEVENT_HOOK_INTERN(osevent_hook_rc =, !!osevent_hook_rc, \ + evkind, hookop, nexusop, __VA_ARGS__); \ + osevent_hook_rc; \ }) -#define OSEVENT_HOOK_VOID(hookname, ...) \ - OSEVENT_HOOK_INTERN(/* void */, hookname, __VA_ARGS__) +#define OSEVENT_HOOK_VOID(evkind, hookop, nexusop, ...) \ + OSEVENT_HOOK_INTERN(/* void */, 0, evkind, hookop, nexusop, __VA_ARGS__) + +/* + * The application's calls to libxl_osevent_occurred_... may be + * indefinitely delayed with respect to the rest of the program (since + * they are not necessarily called with any lock held). So the + * for_libxl value we receive may be (almost) arbitrarily old. All we + * know is that it came from this ctx. + * + * Therefore we may not free the object referred to by any for_libxl + * value until we free the whole libxl_ctx. And if we reuse it we + * must be able to tell when an old use turns up, and discard the + * stale event. + * + * Thus we cannot use the ev directly as the for_libxl value - we need + * a layer of indirection. + * + * We do this by keeping a pool of libxl__osevent_hook_nexus structs, + * and use pointers to them as for_libxl values. In fact, there are + * two pools: one for fds and one for timeouts. This ensures that we + * don't risk a type error when we upcast nexus->ev. In each nexus + * the ev is either null or points to a valid libxl__ev_time or + * libxl__ev_fd, as applicable. + * + * We /do/ allow ourselves to reassociate an old nexus with a new ev + * as otherwise we would have to leak nexi. (This reassociation + * might, of course, be an old ev being reused for a new purpose so + * simply comparing the ev pointer is not sufficient.) Thus the + * libxl_osevent_occurred functions need to check that the condition + * allegedly signalled by this event actually exists. + * + * The nexi and the lists are all protected by the ctx lock. + */ + +struct libxl__osevent_hook_nexus { + void *ev; + void *for_app_reg; + LIBXL_SLIST_ENTRY(libxl__osevent_hook_nexus) next; +}; + +static void *osevent_ev_from_hook_nexus(libxl_ctx *ctx, + libxl__osevent_hook_nexus *nexus /* pass void *for_libxl */) +{ + return nexus->ev; +} + +static void osevent_release_nexus(libxl__gc *gc, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus *nexus) +{ + nexus->ev = 0; + LIBXL_SLIST_INSERT_HEAD(nexi_idle, nexus, next); +} + +/*----- OSEVENT* hook functions for nexusop "alloc" -----*/ +static void osevent_hook_pre_alloc(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus_r) +{ + libxl__osevent_hook_nexus *nexus = LIBXL_SLIST_FIRST(nexi_idle); + if (nexus) { + LIBXL_SLIST_REMOVE_HEAD(nexi_idle, next); + } else { + nexus = libxl__zalloc(NOGC, sizeof(*nexus)); + } + nexus->ev = ev; + *nexus_r = nexus; +} +static void osevent_hook_failed_alloc(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) +{ + osevent_release_nexus(gc, nexi_idle, *nexus); +} + +/*----- OSEVENT* hook functions for nexusop "release" -----*/ +static void osevent_hook_pre_release(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) +{ + osevent_release_nexus(gc, nexi_idle, *nexus); +} +static void osevent_hook_failed_release(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) +{ + abort(); +} + +/*----- OSEVENT* hook functions for nexusop "noop" -----*/ +static void osevent_hook_pre_noop(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) { } +static void osevent_hook_failed_noop(libxl__gc *gc, void *ev, + libxl__osevent_hook_nexi *nexi_idle, + libxl__osevent_hook_nexus **nexus) { } + /* * fd events @@ -72,7 +180,8 @@ int libxl__ev_fd_register(libxl__gc *gc, libxl__ev_fd *ev, DBG("ev_fd=%p register fd=%d events=%x", ev, fd, events); - rc = OSEVENT_HOOK(fd_register, fd, &ev->for_app_reg, events, ev); + rc = OSEVENT_HOOK(fd,register, alloc, fd, &ev->nexus->for_app_reg, + events, ev->nexus); if (rc) goto out; ev->fd = fd; @@ -97,7 +206,7 @@ int libxl__ev_fd_modify(libxl__gc *gc, libxl__ev_fd *ev, short events) DBG("ev_fd=%p modify fd=%d events=%x", ev, ev->fd, events); - rc = OSEVENT_HOOK(fd_modify, ev->fd, &ev->for_app_reg, events); + rc = OSEVENT_HOOK(fd,modify, noop, ev->fd, &ev->nexus->for_app_reg, events); if (rc) goto out; ev->events = events; @@ -119,7 +228,7 @@ void libxl__ev_fd_deregister(libxl__gc *gc, libxl__ev_fd *ev) DBG("ev_fd=%p deregister fd=%d", ev, ev->fd); - OSEVENT_HOOK_VOID(fd_deregister, ev->fd, ev->for_app_reg); + OSEVENT_HOOK_VOID(fd,deregister, release, ev->fd, ev->nexus->for_app_reg); LIBXL_LIST_REMOVE(ev, entry); ev->fd = -1; @@ -171,7 +280,8 @@ static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev, { int rc; - rc = OSEVENT_HOOK(timeout_register, &ev->for_app_reg, absolute, ev); + rc = OSEVENT_HOOK(timeout,register, alloc, &ev->nexus->for_app_reg, + absolute, ev->nexus); if (rc) return rc; ev->infinite = 0; @@ -184,7 +294,7 @@ static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev, static void time_deregister(libxl__gc *gc, libxl__ev_time *ev) { if (!ev->infinite) { - OSEVENT_HOOK_VOID(timeout_deregister, ev->for_app_reg); + OSEVENT_HOOK_VOID(timeout,deregister, release, ev->nexus->for_app_reg); LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); } } @@ -270,7 +380,8 @@ int libxl__ev_time_modify_abs(libxl__gc *gc, libxl__ev_time *ev, rc = time_register_finite(gc, ev, absolute); if (rc) goto out; } else { - rc = OSEVENT_HOOK(timeout_modify, &ev->for_app_reg, absolute); + rc = OSEVENT_HOOK(timeout,modify, noop, + &ev->nexus->for_app_reg, absolute); if (rc) goto out; LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); @@ -1010,35 +1121,54 @@ void libxl_osevent_register_hooks(libxl_ctx *ctx, void libxl_osevent_occurred_fd(libxl_ctx *ctx, void *for_libxl, - int fd, short events, short revents) + int fd, short events_ign, short revents_ign) { - libxl__ev_fd *ev = for_libxl; - EGC_INIT(ctx); CTX_LOCK; assert(!CTX->osevent_in_hook); - assert(fd == ev->fd); - revents &= ev->events; - if (revents) - ev->func(egc, ev, fd, ev->events, revents); + libxl__ev_fd *ev = osevent_ev_from_hook_nexus(ctx, for_libxl); + if (!ev) goto out; + if (ev->fd != fd) goto out; + struct pollfd check; + for (;;) { + check.fd = fd; + check.events = ev->events; + int r = poll(&check, 1, 0); + if (!r) + goto out; + if (r==1) + break; + assert(r<0); + if (errno != EINTR) { + LIBXL__EVENT_DISASTER(egc, "failed poll to check for fd", errno, 0); + goto out; + } + } + + if (check.revents) + ev->func(egc, ev, fd, ev->events, check.revents); + + out: CTX_UNLOCK; EGC_FREE; } void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl) { - libxl__ev_time *ev = for_libxl; - EGC_INIT(ctx); CTX_LOCK; assert(!CTX->osevent_in_hook); + libxl__ev_time *ev = osevent_ev_from_hook_nexus(ctx, for_libxl); + if (!ev) goto out; assert(!ev->infinite); + LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry); ev->func(egc, ev, &ev->abs); + out: CTX_UNLOCK; EGC_FREE; } diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index cba3616..6484bcb 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -136,6 +136,8 @@ typedef struct libxl__gc libxl__gc; typedef struct libxl__egc libxl__egc; typedef struct libxl__ao libxl__ao; typedef struct libxl__aop_occurred libxl__aop_occurred; +typedef struct libxl__osevent_hook_nexus libxl__osevent_hook_nexus; +typedef struct libxl__osevent_hook_nexi libxl__osevent_hook_nexi; _hidden void libxl__alloc_failed(libxl_ctx *, const char *func, size_t nmemb, size_t size) __attribute__((noreturn)); @@ -163,7 +165,7 @@ struct libxl__ev_fd { libxl__ev_fd_callback *func; /* remainder is private for libxl__ev_fd... */ LIBXL_LIST_ENTRY(libxl__ev_fd) entry; - void *for_app_reg; + libxl__osevent_hook_nexus *nexus; }; @@ -178,7 +180,7 @@ struct libxl__ev_time { int infinite; /* not registered in list or with app if infinite */ LIBXL_TAILQ_ENTRY(libxl__ev_time) entry; struct timeval abs; - void *for_app_reg; + libxl__osevent_hook_nexus *nexus; }; typedef struct libxl__ev_xswatch libxl__ev_xswatch; @@ -329,6 +331,8 @@ struct libxl__ctx { libxl__poller poller_app; /* libxl_osevent_beforepoll and _afterpoll */ LIBXL_LIST_HEAD(, libxl__poller) pollers_event, pollers_idle; + LIBXL_SLIST_HEAD(libxl__osevent_hook_nexi, libxl__osevent_hook_nexus) + hook_fd_nexi_idle, hook_timeout_nexi_idle; LIBXL_LIST_HEAD(, libxl__ev_fd) efds; LIBXL_TAILQ_HEAD(, libxl__ev_time) etimes; -- 1.7.2.5 ^ permalink raw reply related [flat|nested] 13+ messages in thread
end of thread, other threads:[~2013-01-23 18:21 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-01-23 16:08 [PATCH 0/2 v4] libxl: fix event races exposed by libvirt Ian Jackson 2013-01-23 16:08 ` [PATCH 1/2] libxl: fix stale fd event callback race Ian Jackson 2013-01-23 16:08 ` [PATCH 2/2] libxl: fix stale timeout " Ian Jackson 2013-01-23 17:48 ` Jim Fehlig 2013-01-23 17:58 ` DO NOT APPLY Re: [PATCH 0/2 v4] libxl: fix event races exposed by libvirt Ian Jackson -- strict thread matches above, loose matches on Subject: below -- 2013-01-23 18:21 [PATCH 0/2 v4.1] " Ian Jackson 2013-01-23 18:21 ` [PATCH 1/2] libxl: fix stale fd event callback race Ian Jackson 2012-12-10 16:56 [PATCH] fix race condition between libvirtd event handling and libxl fd deregister Ian Jackson 2012-12-10 16:57 ` [PATCH 1/2] libxl: fix stale fd event callback race Ian Jackson 2012-12-11 22:35 ` Jim Fehlig 2012-12-12 17:04 ` Ian Jackson 2012-12-12 17:20 ` Jim Fehlig 2012-12-07 19:11 [PATCH] fix race condition between libvirtd event handling and libxl fd deregister Ian Jackson 2012-12-07 19:15 ` [PATCH 1/2] libxl: fix stale fd event callback race Ian Jackson 2012-12-07 19:22 ` Ian Jackson 2012-12-07 19:21 ` Ian Jackson
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).