From: Oren Laadan <orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
To: Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Subject: Re: [PATCH 2/3] epoll: Add support for checkpointing large numbers of epoll items
Date: Fri, 23 Oct 2009 19:51:59 -0400
Message-ID: <4AE2419F.7040506@librato.com>
In-Reply-To: <d0fd1f3eb4eaa326488f59955e5b4790080f3073.1255971848.git.matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Matt Helsley wrote:
> Currently we allocate memory to output all of the epoll items in one
> big chunk. At 20 bytes per item, and since epoll was designed to
> support on the order of 10,000 items, we may find ourselves kmalloc'ing
> 200,000 bytes. That's an order 7 allocation whereas the heuristic for
> difficult allocations, PAGE_ALLOC_COST_ORDER, is 3.
>
> Instead, output the epoll header and items separately. Chunk the output
> much like the pid array gets chunked. This ensures that even sub-order 0
> allocations will enable checkpoint of large epoll sets. A subsequent
> patch will do something similar for the restore path.
>
> Signed-off-by: Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
> ---
> fs/eventpoll.c | 71 ++++++++++++++++++++++++++++++++++++-------------------
> 1 files changed, 46 insertions(+), 25 deletions(-)
>
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index 4706ec5..2506b40 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -1480,7 +1480,7 @@ static int ep_items_checkpoint(void *data)
> struct rb_node *rbp;
> struct eventpoll *ep;
> __s32 epfile_objref;
> - int i, num_items, ret;
> + int num_items = 0, nchunk, ret;
>
> ctx = dq_entry->ctx;
>
> @@ -1489,9 +1489,8 @@ static int ep_items_checkpoint(void *data)
>
> ep = dq_entry->epfile->private_data;
> mutex_lock(&ep->mtx);
> - for (i = 0, rbp = rb_first(&ep->rbr); rbp; rbp = rb_next(rbp), i++) {}
> + for (rbp = rb_first(&ep->rbr); rbp; rbp = rb_next(rbp), num_items++) {}
> mutex_unlock(&ep->mtx);
> - num_items = i;
>
> h = ckpt_hdr_get_type(ctx, sizeof(*h), CKPT_HDR_EPOLL_ITEMS);
> if (!h)
> @@ -1503,36 +1502,58 @@ static int ep_items_checkpoint(void *data)
> if (ret || !num_items)
> return ret;
>
> - items = kzalloc(sizeof(*items)*num_items, GFP_KERNEL);
> + ret = ckpt_write_obj_type(ctx, NULL, sizeof(*items)*num_items,
> + CKPT_HDR_BUFFER);
> + if (ret < 0)
> + return ret;
> +
> + nchunk = num_items;
> + do {
> + items = kzalloc(sizeof(*items)*nchunk, GFP_KERNEL);
> + if (items)
> + break;
> + nchunk = nchunk >> 1;
> + } while (nchunk > 0);
An allocation for num_items may or may not succeed; however, if it
does succeed, it may unnecessarily fragment memory.
So I wonder if it's simpler to set the chunk size to 1-2 pages, like
in the pids code?
The other advantage is that if we eventually optimize by allocating
a generic buffer for the c/r (e.g. ctx->buffer), we could easily
reuse it here.
[...]
Oren.