From: Oren Laadan <orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
To: Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Subject: Re: [PATCH 2/3] epoll: Add support for checkpointing large numbers of epoll items
Date: Fri, 23 Oct 2009 19:51:59 -0400
Message-ID: <4AE2419F.7040506@librato.com>
In-Reply-To: <d0fd1f3eb4eaa326488f59955e5b4790080f3073.1255971848.git.matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>



Matt Helsley wrote:
> Currently we allocate memory to output all of the epoll items in one
> big chunk. At 20 bytes per item, and since epoll was designed to
> support on the order of 10,000 items, we may find ourselves kmalloc'ing
> 200,000 bytes. That's an order 7 allocation whereas the heuristic for
> difficult allocations, PAGE_ALLOC_COST_ORDER, is 3.
> 
> Instead, output the epoll header and items separately. Chunk the output
> much like the pid array gets chunked. This ensures that even sub-order 0
> allocations will enable checkpoint of large epoll sets. A subsequent
> patch will do something similar for the restore path.
> 
> Signed-off-by: Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
> ---
>  fs/eventpoll.c |   71 ++++++++++++++++++++++++++++++++++++-------------------
>  1 files changed, 46 insertions(+), 25 deletions(-)
> 
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index 4706ec5..2506b40 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -1480,7 +1480,7 @@ static int ep_items_checkpoint(void *data)
>  	struct rb_node *rbp;
>  	struct eventpoll *ep;
>  	__s32 epfile_objref;
> -	int i, num_items, ret;
> +	int num_items = 0, nchunk, ret;
>  
>  	ctx = dq_entry->ctx;
>  
> @@ -1489,9 +1489,8 @@ static int ep_items_checkpoint(void *data)
>  
>  	ep = dq_entry->epfile->private_data;
>  	mutex_lock(&ep->mtx);
> -	for (i = 0, rbp = rb_first(&ep->rbr); rbp; rbp = rb_next(rbp), i++) {}
> +	for (rbp = rb_first(&ep->rbr); rbp; rbp = rb_next(rbp), num_items++) {}
>  	mutex_unlock(&ep->mtx);
> -	num_items = i;
>  
>  	h = ckpt_hdr_get_type(ctx, sizeof(*h), CKPT_HDR_EPOLL_ITEMS);
>  	if (!h)
> @@ -1503,36 +1502,58 @@ static int ep_items_checkpoint(void *data)
>  	if (ret || !num_items)
>  		return ret;
>  
> -	items = kzalloc(sizeof(*items)*num_items, GFP_KERNEL);
> +	ret = ckpt_write_obj_type(ctx, NULL, sizeof(*items)*num_items,
> +				  CKPT_HDR_BUFFER);
> +	if (ret < 0)
> +		return ret;
> +
> +	nchunk = num_items;
> +	do {
> +		items = kzalloc(sizeof(*items)*nchunk, GFP_KERNEL);
> +		if (items)
> +			break;
> +		nchunk = nchunk >> 1;
> +	} while (nchunk > 0);

An allocation sized for all num_items may or may not succeed; however,
if it does succeed, it may unnecessarily fragment memory.

So I wonder if it would be simpler to fix the chunk size at 1-2 pages,
as the pids code does?

The other advantage is that if we eventually optimize by allocating
a generic buffer for the c/r (e.g. ctx->buffer), we could easily
reuse it here.
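
To illustrate the fixed-size chunk idea, here is a rough userspace sketch,
not the actual patch. All demo_* names are hypothetical. The 4096-byte page
size and the two-page chunk are assumptions, the 20-byte record only mirrors
the item size quoted in the patch description, and the emit callback stands
in for whatever checkpoint write helper the kernel code would call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the 20-byte checkpointed epoll item record. */
struct demo_epoll_item {
	unsigned char raw[20];
};

#define DEMO_PAGE_SIZE   4096u	/* assumed page size */
#define DEMO_CHUNK_PAGES 2u	/* assumed chunk size, "1-2 pages" */

/* Number of items that fit in one fixed-size chunk buffer. */
static size_t demo_items_per_chunk(void)
{
	return (DEMO_PAGE_SIZE * DEMO_CHUNK_PAGES) /
	       sizeof(struct demo_epoll_item);
}

/* Hypothetical output sink; returns a negative value on failure. */
typedef int (*demo_emit_fn)(void *cookie, const void *buf, size_t len);

/*
 * Emit num_items records through one small buffer that is allocated
 * once and reused for every chunk, so the allocation size does not
 * depend on num_items.
 * Returns the number of chunks emitted, or -1 on failure.
 */
static int demo_write_items_chunked(const struct demo_epoll_item *src,
				    size_t num_items,
				    demo_emit_fn emit, void *cookie)
{
	size_t per_chunk = demo_items_per_chunk();
	struct demo_epoll_item *buf;
	size_t done = 0;
	int chunks = 0;

	assert(per_chunk > 0);

	buf = calloc(per_chunk, sizeof(*buf));
	if (!buf)
		return -1;

	while (done < num_items) {
		size_t n = num_items - done;

		if (n > per_chunk)
			n = per_chunk;

		/* Fill the chunk buffer, then hand it to the sink. */
		memcpy(buf, src + done, n * sizeof(*buf));
		if (emit(cookie, buf, n * sizeof(*buf)) < 0) {
			free(buf);
			return -1;
		}
		done += n;
		chunks++;
	}

	free(buf);
	return chunks;
}

/* Example sink that only counts the bytes it receives. */
static int demo_count_bytes(void *cookie, const void *buf, size_t len)
{
	(void)buf;
	*(size_t *)cookie += len;
	return 0;
}
```

Under these assumptions a chunk holds 409 items, so 10,000 items go out in
25 chunks through a single 8 KB buffer. That same buffer could later be
replaced by a shared one such as ctx->buffer.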

[...]

Oren.

