From: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
To: Dan Smith <danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Cc: containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org,
matthltc-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org
Subject: Re: [PATCH] Clear the objhash before completing restart, but delay free (v2)
Date: Mon, 01 Nov 2010 12:24:13 -0400 [thread overview]
Message-ID: <4CCEE9AD.7010105@cs.columbia.edu> (raw)
In-Reply-To: <1287496991-18207-1-git-send-email-danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Dan,
Does this patch solve the performance problem in freeing all
the objhash entries upfront (e.g. before letting userspace
resume) ?
If so, is there still a performance hit for doing the 'clear'
portion early before resuming the tasks ? How much does it
depend on the complexity of the hierarchy being checkpointed ?
Oren.
On 10/19/2010 10:03 AM, Dan Smith wrote:
> This patch causes the restart coordinator to clear the object hash
> before releasing the restarted tasks. It does this to make sure
> that any objects being held exclusively by the hash are released
> before the tasks start running again.
>
> If we continue to postpone clearing the object hash until restart
> returns to userspace there can be a race where the restarted tasks
> behave differently due to the references held by the objhash.
> One specific example of this is restarting half-closed pipes.
> Without this patch we've got a race between the coordinator --
> about to clear the object hash -- and two restarted tasks connected
> via a half-closed pipe. Because the object hash contains a reference
> to both ends of the pipe one end of the pipe will not be closed
> and EPIPE/SIGPIPE won't be handled when reading from the pipe
> for instance. As far as the restarted userspace task can tell the
> pipe may briefly appear to re-open. Moving the object hash clear
> prevents this race and others like it.
>
> Note that eventually the coordinator would close the pipe and correct
> behavior would be restored. Thus this bug would only affect the
> correctness of userspace -- after a close() the pipe may briefly re-open
> and allow more data to be sent before automatically closing again.
>
> To avoid the overhead of actually freeing the object hash's structures
> at the same time, this adds a queue to ckpt_obj_hash and pushes
> the ckpt_obj structures there to be free()'d later during the cleanup
> process.
>
> Changes in v2:
> - Avoid adding another list_head to ckpt_obj by re-using the hlist
> hash node for the free list
>
> Signed-off-by: Dan Smith <danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
> ---
> include/linux/checkpoint.h | 1 +
> kernel/checkpoint/objhash.c | 18 +++++++++++++++---
> kernel/checkpoint/restart.c | 2 ++
> 3 files changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/checkpoint.h b/include/linux/checkpoint.h
> index a11d40e..f888363 100644
> --- a/include/linux/checkpoint.h
> +++ b/include/linux/checkpoint.h
> @@ -179,6 +179,7 @@ extern void restore_notify_error(struct ckpt_ctx *ctx);
> extern int ckpt_obj_module_get(void);
> extern void ckpt_obj_module_put(void);
>
> +extern void ckpt_obj_hash_clear(struct ckpt_ctx *ctx);
> extern void ckpt_obj_hash_free(struct ckpt_ctx *ctx);
> extern int ckpt_obj_hash_alloc(struct ckpt_ctx *ctx);
>
> diff --git a/kernel/checkpoint/objhash.c b/kernel/checkpoint/objhash.c
> index 62c34ff..4aae0c0 100644
> --- a/kernel/checkpoint/objhash.c
> +++ b/kernel/checkpoint/objhash.c
> @@ -36,6 +36,7 @@ struct ckpt_obj {
> struct ckpt_obj_hash {
> struct hlist_head *head;
> struct hlist_head list;
> + struct hlist_head free;
> int next_free_objref;
> };
>
> @@ -128,8 +129,9 @@ int ckpt_obj_module_get(void)
> #define CKPT_OBJ_HASH_NBITS 10
> #define CKPT_OBJ_HASH_TOTAL (1UL << CKPT_OBJ_HASH_NBITS)
>
> -static void obj_hash_clear(struct ckpt_obj_hash *obj_hash)
> +void ckpt_obj_hash_clear(struct ckpt_ctx *ctx)
> {
> + struct ckpt_obj_hash *obj_hash = ctx->obj_hash;
> struct hlist_head *h = obj_hash->head;
> struct hlist_node *n, *t;
> struct ckpt_obj *obj;
> @@ -139,7 +141,9 @@ static void obj_hash_clear(struct ckpt_obj_hash *obj_hash)
> hlist_for_each_entry_safe(obj, n, t, &h[i], hash) {
> if (obj->ops->ref_drop)
> obj->ops->ref_drop(obj->ptr, 1);
> - kfree(obj);
> + hlist_del_init(&obj->hash);
> + hlist_del(&obj->next);
> + hlist_add_head(&obj->hash, &obj_hash->free);
> }
> }
> }
> @@ -149,7 +153,14 @@ void ckpt_obj_hash_free(struct ckpt_ctx *ctx)
> struct ckpt_obj_hash *obj_hash = ctx->obj_hash;
>
> if (obj_hash) {
> - obj_hash_clear(obj_hash);
> + struct hlist_node *pos, *n;
> + struct ckpt_obj *obj;
> +
> + ckpt_obj_hash_clear(ctx);
> +
> + hlist_for_each_entry_safe(obj, pos, n, &obj_hash->free, hash)
> + kfree(obj);
> +
> kfree(obj_hash->head);
> kfree(ctx->obj_hash);
> ctx->obj_hash = NULL;
> @@ -173,6 +184,7 @@ int ckpt_obj_hash_alloc(struct ckpt_ctx *ctx)
> obj_hash->head = head;
> obj_hash->next_free_objref = 1;
> INIT_HLIST_HEAD(&obj_hash->list);
> + INIT_HLIST_HEAD(&obj_hash->free);
>
> ctx->obj_hash = obj_hash;
> return 0;
> diff --git a/kernel/checkpoint/restart.c b/kernel/checkpoint/restart.c
> index 17270b8..f2241a9 100644
> --- a/kernel/checkpoint/restart.c
> +++ b/kernel/checkpoint/restart.c
> @@ -1349,6 +1349,8 @@ static int do_restore_coord(struct ckpt_ctx *ctx, pid_t pid)
> if (ret < 0)
> ckpt_err(ctx, ret, "restart failed (coordinator)\n");
>
> + ckpt_obj_hash_clear(ctx);
> +
> if (ckpt_test_error(ctx)) {
> destroy_descendants(ctx);
> ret = ckpt_get_error(ctx);
next prev parent reply other threads:[~2010-11-01 16:24 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-10-19 14:03 [PATCH] Clear the objhash before completing restart, but delay free (v2) Dan Smith
[not found] ` <1287496991-18207-1-git-send-email-danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2010-10-19 18:46 ` Matt Helsley
2010-11-01 16:24 ` Oren Laadan [this message]
[not found] ` <4CCEE9AD.7010105-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2010-11-01 17:01 ` Dan Smith
[not found] ` <87oca9vuyw.fsf-bi+AKbBUZKagILUCTcTcHdKyNwTtLsGr@public.gmane.org>
2010-11-01 17:50 ` Oren Laadan
[not found] ` <4CCEFDD4.1030308-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2010-11-01 18:10 ` Dan Smith
2011-02-01 0:31 ` Oren Laadan
[not found] ` <4D47544E.7080600-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2011-02-01 19:38 ` Dan Smith
[not found] ` <87pqrb5yvn.fsf-FLMGYpZoEPULwtHQx/6qkW3U47Q5hpJU@public.gmane.org>
2011-02-02 14:55 ` Oren Laadan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4CCEE9AD.7010105@cs.columbia.edu \
--to=orenl-eqauephvms7envbuuze7ea@public.gmane.org \
--cc=containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org \
--cc=danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org \
--cc=matthltc-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.