From: Balbir Singh <balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
To: KAMEZAWA Hiroyuki
<kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
Cc: "containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org"
<containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org>,
"yamamoto-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org"
<yamamoto-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org>,
xemul-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org
Subject: Re: [RFC] memory controller : backgorund reclaim and avoid excessive locking [3/5] throttling
Date: Thu, 14 Feb 2008 14:44:50 +0530 [thread overview]
Message-ID: <47B4068A.2050104@linux.vnet.ibm.com> (raw)
In-Reply-To: <20080214173043.d57b3619.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
KAMEZAWA Hiroyuki wrote:
> Throttle memory reclaim interface for cgroup.
>
> This patch adds..
> - a limit for simultaneous callers of try_to_free_mem_cgroup_pages().
> - interface for that. memory.shrinkers.
>
> There are some reasons.
> - try_to_free... is very heavy and shoulnd't be called too much at once.
> - When the number of callers of try_to_free.. is big, we'll reclaim
> too much memory.
>
> By this interface, a user can control the # of threads which can enter
> try_to_free...
>
What happens to the other threads, do they sleep?
> Default is 10240 ...a enough big number for unlimited. Maybe this should
> be changed.
>
>
> Changes from previous one.
> - Added an interface to control the limit.
> - don't call wake_up at uncharge()...it seems hevay..
> Instead of that, sleepers use schedule_timeout(HZ/100). (10ms)
>
> Considerations:
> - Should we add this 'throttle' to global_lru at first ?
Hmm.. interesting question. I think Rik is looking into some of these issues
> - Do we need more knobs ?
> - Should default value to be automtically estimated value ?
> (I used '# of cpus/4' as default in previous version.)
>
>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
>
>
> mm/memcontrol.c | 235 ++++++++++++++++++++++++++++++--------------------------
> 1 files changed, 127 insertions(+), 108 deletions(-)
>
> Index: linux-2.6.24-mm1/mm/memcontrol.c
> ===================================================================
> --- linux-2.6.24-mm1.orig/mm/memcontrol.c
> +++ linux-2.6.24-mm1/mm/memcontrol.c
> @@ -145,8 +145,17 @@ struct mem_cgroup {
> wait_queue_head_t waitq;
> struct task_struct *kthread;
> } daemon;
> + /*
> + * throttling params for reclaim.
> + */
> + struct {
> + int limit;
We might want to use something else instead of limit. How about
max_parallel_reclaimers?
> + atomic_t reclaimers;
> + wait_queue_head_t waitq;
> + } throttle;
> };
>
> +
> /*
> * We use the lower bit of the page->page_cgroup pointer as a bit spin
> * lock. We need to ensure that page->page_cgroup is atleast two
> @@ -520,6 +529,27 @@ static inline void mem_cgroup_schedule_d
> wake_up_interruptible(&mem->daemon.waitq);
> }
>
> +static inline void mem_cgroup_wait_reclaim(struct mem_cgroup *mem)
> +{
> + DEFINE_WAIT(wait);
> + while (1) {
> + prepare_to_wait(&mem->throttle.waitq, &wait,
> + TASK_INTERRUPTIBLE);
> + if (res_counter_check_under_limit(&mem->res)) {
> + finish_wait(&mem->throttle.waitq, &wait);
> + break;
> + }
> + /* expect some progress in... */
> + schedule_timeout(HZ/50);
Can't we wait on someone to wake us up? Why do we need to do a schedule_timeout?
And why HZ/50 and not HZ/10? Too many timers distract the system from power
management :)
> + finish_wait(&mem->throttle.waitq, &wait);
> + }
> +}
> +
> +static inline int mem_cgroup_throttle_reclaim(struct mem_cgroup *mem)
> +{
> + return atomic_add_unless(&mem->throttle.reclaimers, 1,
> + mem->throttle.limit);
> +}
>
> unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan,
> struct list_head *dst,
> @@ -652,11 +682,22 @@ retry:
> * the cgroup limit.
> */
> while (res_counter_charge(&mem->res, PAGE_SIZE)) {
> + int ret;
> if (!(gfp_mask & __GFP_WAIT))
> goto out;
>
> - if (try_to_free_mem_cgroup_pages(mem, gfp_mask))
> + if (((gfp_mask & (__GFP_FS|__GFP_IO)) != (__GFP_FS|__GFP_IO))
> + || mem_cgroup_throttle_reclaim(mem)) {
I think we should split this check and add detailed comments. If we cannot
sleep, then we cannot be throttled, right?
> + ret = try_to_free_mem_cgroup_pages(mem, gfp_mask);
> + atomic_dec(&mem->throttle.reclaimers);
> + if (waitqueue_active(&mem->throttle.waitq))
> + wake_up_all(&mem->throttle.waitq);
> + if (ret)
> + continue;
> + } else {
> + mem_cgroup_wait_reclaim(mem);
> continue;
> + }
>
> /*
> * try_to_free_mem_cgroup_pages() might not give us a full
> @@ -1054,6 +1095,19 @@ static ssize_t mem_force_empty_read(stru
> return -EINVAL;
> }
>
> +static int mem_throttle_write(struct cgroup *cont, struct cftype *cft, u64 val)
> +{
> + struct mem_cgroup *mem = mem_cgroup_from_cont(cont);
> + int limit = (int)val;
> + mem->throttle.limit = limit;
> + return 0;
> +}
> +
> +static u64 mem_throttle_read(struct cgroup *cont, struct cftype *cft)
> +{
> + struct mem_cgroup *mem = mem_cgroup_from_cont(cont);
> + return (u64)mem->throttle.limit;
> +}
>
> static const struct mem_cgroup_stat_desc {
> const char *msg;
> @@ -1146,6 +1200,11 @@ static struct cftype mem_cgroup_files[]
> .read = mem_force_empty_read,
> },
> {
> + .name = "shrinks",
> + .write_uint = mem_throttle_write,
> + .read_uint = mem_throttle_read,
> + },
> + {
> .name = "stat",
> .open = mem_control_stat_open,
> },
> @@ -1215,6 +1274,11 @@ mem_cgroup_create(struct cgroup_subsys *
> goto free_out;
> init_waitqueue_head(&mem->daemon.waitq);
> mem->daemon.kthread = NULL;
> +
> + init_waitqueue_head(&mem->throttle.waitq);
> + mem->throttle.limit = 10240; /* maybe enough big for no throttle */
Why 10240? So, 10,000 threads can run in parallel, isn't that an overkill? How
about setting it to number of cpus/2 or something like that?
> + atomic_set(&mem->throttle.reclaimers, 0);
> +
> return &mem->css;
> free_out:
> for_each_node_state(node, N_POSSIBLE)
>
--
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL
next prev parent reply other threads:[~2008-02-14 9:14 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-02-14 8:21 [RFC] memory controller : backgorund reclaim and avoid excessive locking [0/5] KAMEZAWA Hiroyuki
[not found] ` <20080214172148.bfd564d5.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2008-02-14 8:28 ` [RFC] memory controller : backgorund reclaim and avoid excessive locking [1/5] high-low watermark KAMEZAWA Hiroyuki
[not found] ` <20080214172801.b1c35067.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2008-02-14 8:48 ` Balbir Singh
[not found] ` <47B40061.20200-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2008-02-14 9:12 ` KAMEZAWA Hiroyuki
2008-02-14 8:29 ` [RFC] memory controller : backgorund reclaim and avoid excessive locking [2/5] background reclaim KAMEZAWA Hiroyuki
[not found] ` <20080214172910.017cc755.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2008-02-14 8:32 ` Balbir Singh
[not found] ` <47B3FCAC.6010109-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2008-02-14 8:57 ` KAMEZAWA Hiroyuki
2008-02-14 8:30 ` [RFC] memory controller : backgorund reclaim and avoid excessive locking [3/5] throttling KAMEZAWA Hiroyuki
[not found] ` <20080214173043.d57b3619.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2008-02-14 9:14 ` Balbir Singh [this message]
[not found] ` <47B4068A.2050104-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2008-02-14 9:38 ` KAMEZAWA Hiroyuki
2008-02-14 8:35 ` [RFC] memory controller : backgorund reclaim and avoid excessive locking [4/5] borrow resource KAMEZAWA Hiroyuki
[not found] ` <20080214173504.cc1aa42e.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2008-02-18 0:53 ` YAMAMOTO Takashi
[not found] ` <20080218005335.D93F51E3C5A-Pcsii4f/SVk@public.gmane.org>
2008-02-18 2:06 ` KAMEZAWA Hiroyuki
2008-02-14 8:36 ` [RFC] memory controller : backgorund reclaim and avoid excessive locking [5/5] lazy page_cgroup freeing KAMEZAWA Hiroyuki
[not found] ` <20080214173624.d29b26d9.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2008-02-18 1:58 ` YAMAMOTO Takashi
[not found] ` <20080218015840.CE6FF1E3C58-Pcsii4f/SVk@public.gmane.org>
2008-02-18 2:11 ` KAMEZAWA Hiroyuki
2008-02-18 4:35 ` Balbir Singh
[not found] ` <47B90B04.3090001-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2008-02-18 4:55 ` KAMEZAWA Hiroyuki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=47B4068A.2050104@linux.vnet.ibm.com \
--to=balbir-23vcf4htsmix0ybbhkvfkdbpr1lh4cv8@public.gmane.org \
--cc=containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org \
--cc=kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org \
--cc=xemul-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org \
--cc=yamamoto-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.