Linux Container Development
 help / color / mirror / Atom feed
From: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
To: Dan Smith <danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Cc: containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org,
	matthltc-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org
Subject: Re: [PATCH] Clear the objhash before completing restart, but delay free (v2)
Date: Mon, 01 Nov 2010 12:24:13 -0400	[thread overview]
Message-ID: <4CCEE9AD.7010105@cs.columbia.edu> (raw)
In-Reply-To: <1287496991-18207-1-git-send-email-danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>

Dan,

Does this patch solve the performance problem in freeing all
the objhash entries upfront (e.g. before letting userspace
resume) ?

If so, is there still a performance hit for doing the 'clear'
portion early before resuming the tasks ?   How much does it
depend on the complexity of the hierarchy being checkpointed ?

Oren.

On 10/19/2010 10:03 AM, Dan Smith wrote:
> This patch causes the restart coordinator to clear the object hash
> before releasing the restarted tasks.  It does this to make sure
> that any objects being held exclusively by the hash are released
> before the tasks start running again.
> 
> If we continue to postpone clearing the object hash until restart
> returns to userspace there can be a race where the restarted tasks
> behave differently due to the references held by the objhash.
> One specific example of this is restarting half-closed pipes.
> Without this patch we've got a race between the coordinator --
> about to clear the object hash -- and two restarted tasks connected
> via a half-closed pipe. Because the object hash contains a reference
> to both ends of the pipe one end of the pipe will not be closed
> and EPIPE/SIGPIPE won't be handled when reading from the pipe
> for instance. As far as the restarted userspace task can tell the
> pipe may briefly appear to re-open. Moving the object hash clear
> prevents this race and others like it.
> 
> Note that eventually the coordinator would close the pipe and correct
> behavior would be restored. Thus this bug would only affect the
> correctness of userspace -- after a close() the pipe may briefly re-open
> and allow more data to be sent before automatically closing again.
> 
> To avoid the overhead of actually freeing the object hash's structures
> at the same time, this adds a queue to ckpt_obj_hash and pushes
> the ckpt_obj structures there to be free()'d later during the cleanup
> process.
> 
> Changes in v2:
>  - Avoid adding another list_head to ckpt_obj by re-using the hlist
>    hash node for the free list
> 
> Signed-off-by: Dan Smith <danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
> ---
>  include/linux/checkpoint.h  |    1 +
>  kernel/checkpoint/objhash.c |   18 +++++++++++++++---
>  kernel/checkpoint/restart.c |    2 ++
>  3 files changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/checkpoint.h b/include/linux/checkpoint.h
> index a11d40e..f888363 100644
> --- a/include/linux/checkpoint.h
> +++ b/include/linux/checkpoint.h
> @@ -179,6 +179,7 @@ extern void restore_notify_error(struct ckpt_ctx *ctx);
>  extern int ckpt_obj_module_get(void);
>  extern void ckpt_obj_module_put(void);
>  
> +extern void ckpt_obj_hash_clear(struct ckpt_ctx *ctx);
>  extern void ckpt_obj_hash_free(struct ckpt_ctx *ctx);
>  extern int ckpt_obj_hash_alloc(struct ckpt_ctx *ctx);
>  
> diff --git a/kernel/checkpoint/objhash.c b/kernel/checkpoint/objhash.c
> index 62c34ff..4aae0c0 100644
> --- a/kernel/checkpoint/objhash.c
> +++ b/kernel/checkpoint/objhash.c
> @@ -36,6 +36,7 @@ struct ckpt_obj {
>  struct ckpt_obj_hash {
>  	struct hlist_head *head;
>  	struct hlist_head list;
> +	struct hlist_head free;
>  	int next_free_objref;
>  };
>  
> @@ -128,8 +129,9 @@ int ckpt_obj_module_get(void)
>  #define CKPT_OBJ_HASH_NBITS  10
>  #define CKPT_OBJ_HASH_TOTAL  (1UL << CKPT_OBJ_HASH_NBITS)
>  
> -static void obj_hash_clear(struct ckpt_obj_hash *obj_hash)
> +void ckpt_obj_hash_clear(struct ckpt_ctx *ctx)
>  {
> +	struct ckpt_obj_hash *obj_hash = ctx->obj_hash;
>  	struct hlist_head *h = obj_hash->head;
>  	struct hlist_node *n, *t;
>  	struct ckpt_obj *obj;
> @@ -139,7 +141,9 @@ static void obj_hash_clear(struct ckpt_obj_hash *obj_hash)
>  		hlist_for_each_entry_safe(obj, n, t, &h[i], hash) {
>  			if (obj->ops->ref_drop)
>  				obj->ops->ref_drop(obj->ptr, 1);
> -			kfree(obj);
> +			hlist_del_init(&obj->hash);
> +			hlist_del(&obj->next);
> +			hlist_add_head(&obj->hash, &obj_hash->free);
>  		}
>  	}
>  }
> @@ -149,7 +153,14 @@ void ckpt_obj_hash_free(struct ckpt_ctx *ctx)
>  	struct ckpt_obj_hash *obj_hash = ctx->obj_hash;
>  
>  	if (obj_hash) {
> -		obj_hash_clear(obj_hash);
> +		struct hlist_node *pos, *n;
> +		struct ckpt_obj *obj;
> +
> +		ckpt_obj_hash_clear(ctx);
> +
> +		hlist_for_each_entry_safe(obj, pos, n, &obj_hash->free, hash)
> +			kfree(obj);
> +
>  		kfree(obj_hash->head);
>  		kfree(ctx->obj_hash);
>  		ctx->obj_hash = NULL;
> @@ -173,6 +184,7 @@ int ckpt_obj_hash_alloc(struct ckpt_ctx *ctx)
>  	obj_hash->head = head;
>  	obj_hash->next_free_objref = 1;
>  	INIT_HLIST_HEAD(&obj_hash->list);
> +	INIT_HLIST_HEAD(&obj_hash->free);
>  
>  	ctx->obj_hash = obj_hash;
>  	return 0;
> diff --git a/kernel/checkpoint/restart.c b/kernel/checkpoint/restart.c
> index 17270b8..f2241a9 100644
> --- a/kernel/checkpoint/restart.c
> +++ b/kernel/checkpoint/restart.c
> @@ -1349,6 +1349,8 @@ static int do_restore_coord(struct ckpt_ctx *ctx, pid_t pid)
>  	if (ret < 0)
>  		ckpt_err(ctx, ret, "restart failed (coordinator)\n");
>  
> +	ckpt_obj_hash_clear(ctx);
> +
>  	if (ckpt_test_error(ctx)) {
>  		destroy_descendants(ctx);
>  		ret = ckpt_get_error(ctx);

  parent reply	other threads:[~2010-11-01 16:24 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-10-19 14:03 [PATCH] Clear the objhash before completing restart, but delay free (v2) Dan Smith
     [not found] ` <1287496991-18207-1-git-send-email-danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2010-10-19 18:46   ` Matt Helsley
2010-11-01 16:24   ` Oren Laadan [this message]
     [not found]     ` <4CCEE9AD.7010105-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2010-11-01 17:01       ` Dan Smith
     [not found]         ` <87oca9vuyw.fsf-bi+AKbBUZKagILUCTcTcHdKyNwTtLsGr@public.gmane.org>
2010-11-01 17:50           ` Oren Laadan
     [not found]             ` <4CCEFDD4.1030308-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2010-11-01 18:10               ` Dan Smith
2011-02-01  0:31   ` Oren Laadan
     [not found]     ` <4D47544E.7080600-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2011-02-01 19:38       ` Dan Smith
     [not found]         ` <87pqrb5yvn.fsf-FLMGYpZoEPULwtHQx/6qkW3U47Q5hpJU@public.gmane.org>
2011-02-02 14:55           ` Oren Laadan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4CCEE9AD.7010105@cs.columbia.edu \
    --to=orenl-eqauephvms7envbuuze7ea@public.gmane.org \
    --cc=containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org \
    --cc=danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org \
    --cc=matthltc-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox