linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Paul Menage <menage@google.com>
Cc: linux-mm@kvack.org, containers@lists.linux-foundation.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Phil Carmody <ext-phil.2.carmody@nokia.com>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Li Zefan <lizf@cn.fujitsu.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] cgroups: make cftype.unregister_event() void-returning
Date: Sun, 9 May 2010 00:35:15 +0300	[thread overview]
Message-ID: <i2mcc557aab1005081435g3cd209fbg4eedd0791c2fb358@mail.gmail.com> (raw)
In-Reply-To: <p2l6599ad831005071407yaa994357s1261317cc7f552b@mail.gmail.com>

On Sat, May 8, 2010 at 12:07 AM, Paul Menage <menage@google.com> wrote:
> I like the principle. I think this patch leaks arrays, though.
>
> I think the sequence:
>
> register;register;unregister;unregister
>
> will leak the array of size 2. Using the notation Ax, Bx, Cx, etc to
> represent distinct buffers of size x, we have:
>
> initially: size = 0, thresholds = NULL, spare = NULL
> register: size = 1, thresholds = A1, spare = NULL
> register: size = 2, thresholds = B2, spare = A1
> unregister: size = 1, thresholds = A1, spare = B2
> unregister: size = 0, thresholds = NULL, spare = A1 (B2 is leaked)
>
> In the case when you're unregistering and the size goes down to 0, you
> need to free the spare before doing the swap.

Nice catch!

> Maybe get rid of the
> thresholds_new local variable, and instead in the if(!size) {} branch
> just free and the spare buffer and set its pointer to NULL? Then at
> swap_buffers:, unconditionally swap the two.

Good idea. Thanks.

> Also, I think the code would be cleaner if you created a structure to
> hold a primary threshold and its spare; then you could have one for
> each threshold set, and just pass that to the register/unregister
> functions, rather than them having to be aware of how the type maps to
> the primary and backup array pointers.

Ok. I'll try to implement it in separate patch.

Thank you for reviewing.

>
> Paul
>
> On Fri, May 7, 2010 at 4:46 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
>> Since we unable to handle error returned by cftype.unregister_event()
>> properly, let's make the callback void-returning.
>>
>> mem_cgroup_unregister_event() has been rewritten to be "never fail"
>> function. On mem_cgroup_usage_register_event() we save old buffer
>> for thresholds array and reuse it in mem_cgroup_usage_unregister_event()
>> to avoid allocation.
>>
>> Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
>> ---
>>  include/linux/cgroup.h |    2 +-
>>  kernel/cgroup.c        |    1 -
>>  mm/memcontrol.c        |   64 ++++++++++++++++++++++++++++++------------------
>>  3 files changed, 41 insertions(+), 26 deletions(-)
>>
>> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
>> index 8f78073..0c62160 100644
>> --- a/include/linux/cgroup.h
>> +++ b/include/linux/cgroup.h
>> @@ -397,7 +397,7 @@ struct cftype {
>>         * This callback must be implemented, if you want provide
>>         * notification functionality.
>>         */
>> -       int (*unregister_event)(struct cgroup *cgrp, struct cftype *cft,
>> +       void (*unregister_event)(struct cgroup *cgrp, struct cftype *cft,
>>                        struct eventfd_ctx *eventfd);
>>  };
>>
>> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
>> index 06dbf97..6675e8c 100644
>> --- a/kernel/cgroup.c
>> +++ b/kernel/cgroup.c
>> @@ -2988,7 +2988,6 @@ static void cgroup_event_remove(struct work_struct *work)
>>                        remove);
>>        struct cgroup *cgrp = event->cgrp;
>>
>> -       /* TODO: check return code */
>>        event->cft->unregister_event(cgrp, event->cft, event->eventfd);
>>
>>        eventfd_ctx_put(event->eventfd);
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index 8cb2722..0a37b5d 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -226,9 +226,19 @@ struct mem_cgroup {
>>        /* thresholds for memory usage. RCU-protected */
>>        struct mem_cgroup_threshold_ary *thresholds;
>>
>> +       /*
>> +        * Preallocated buffer to be used in mem_cgroup_unregister_event()
>> +        * to make it "never fail".
>> +        * It must be able to store at least thresholds->size - 1 entries.
>> +        */
>> +       struct mem_cgroup_threshold_ary *__thresholds;
>> +
>>        /* thresholds for mem+swap usage. RCU-protected */
>>        struct mem_cgroup_threshold_ary *memsw_thresholds;
>>
>> +       /* the same as __thresholds, but for memsw_thresholds */
>> +       struct mem_cgroup_threshold_ary *__memsw_thresholds;
>> +
>>        /* For oom notifier event fd */
>>        struct list_head oom_notify;
>>
>> @@ -3575,17 +3585,27 @@ static int
>> mem_cgroup_usage_register_event(struct cgroup *cgrp,
>>        else
>>                rcu_assign_pointer(memcg->memsw_thresholds, thresholds_new);
>>
>> -       /* To be sure that nobody uses thresholds before freeing it */
>> +       /* To be sure that nobody uses thresholds */
>>        synchronize_rcu();
>>
>> -       kfree(thresholds);
>> +       /*
>> +        * Free old preallocated buffer and use thresholds as new
>> +        * preallocated buffer.
>> +        */
>> +       if (type == _MEM) {
>> +               kfree(memcg->__thresholds);
>> +               memcg->__thresholds = thresholds;
>> +       } else {
>> +               kfree(memcg->__memsw_thresholds);
>> +               memcg->__memsw_thresholds = thresholds;
>> +       }
>>  unlock:
>>        mutex_unlock(&memcg->thresholds_lock);
>>
>>        return ret;
>>  }
>>
>> -static int mem_cgroup_usage_unregister_event(struct cgroup *cgrp,
>> +static void mem_cgroup_usage_unregister_event(struct cgroup *cgrp,
>>        struct cftype *cft, struct eventfd_ctx *eventfd)
>>  {
>>        struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
>> @@ -3593,7 +3613,7 @@ static int
>> mem_cgroup_usage_unregister_event(struct cgroup *cgrp,
>>        int type = MEMFILE_TYPE(cft->private);
>>        u64 usage;
>>        int size = 0;
>> -       int i, j, ret = 0;
>> +       int i, j;
>>
>>        mutex_lock(&memcg->thresholds_lock);
>>        if (type == _MEM)
>> @@ -3623,17 +3643,15 @@ static int
>> mem_cgroup_usage_unregister_event(struct cgroup *cgrp,
>>        /* Set thresholds array to NULL if we don't have thresholds */
>>        if (!size) {
>>                thresholds_new = NULL;
>> -               goto assign;
>> +               goto swap_buffers;
>>        }
>>
>> -       /* Allocate memory for new array of thresholds */
>> -       thresholds_new = kmalloc(sizeof(*thresholds_new) +
>> -                       size * sizeof(struct mem_cgroup_threshold),
>> -                       GFP_KERNEL);
>> -       if (!thresholds_new) {
>> -               ret = -ENOMEM;
>> -               goto unlock;
>> -       }
>> +       /* Use preallocated buffer for new array of thresholds */
>> +       if (type == _MEM)
>> +               thresholds_new = memcg->__thresholds;
>> +       else
>> +               thresholds_new = memcg->__memsw_thresholds;
>> +
>>        thresholds_new->size = size;
>>
>>        /* Copy thresholds and find current threshold */
>> @@ -3654,20 +3672,20 @@ static int
>> mem_cgroup_usage_unregister_event(struct cgroup *cgrp,
>>                j++;
>>        }
>>
>> -assign:
>> -       if (type == _MEM)
>> +swap_buffers:
>> +       /* Swap thresholds array and preallocated buffer */
>> +       if (type == _MEM) {
>> +               memcg->__thresholds = thresholds;
>>                rcu_assign_pointer(memcg->thresholds, thresholds_new);
>> -       else
>> +       } else {
>> +               memcg->__memsw_thresholds = thresholds;
>>                rcu_assign_pointer(memcg->memsw_thresholds, thresholds_new);
>> +       }
>>
>> -       /* To be sure that nobody uses thresholds before freeing it */
>> +       /* To be sure that nobody uses thresholds */
>>        synchronize_rcu();
>>
>> -       kfree(thresholds);
>> -unlock:
>>        mutex_unlock(&memcg->thresholds_lock);
>> -
>> -       return ret;
>>  }
>>
>>  static int mem_cgroup_oom_register_event(struct cgroup *cgrp,
>> @@ -3695,7 +3713,7 @@ static int mem_cgroup_oom_register_event(struct
>> cgroup *cgrp,
>>        return 0;
>>  }
>>
>> -static int mem_cgroup_oom_unregister_event(struct cgroup *cgrp,
>> +static void mem_cgroup_oom_unregister_event(struct cgroup *cgrp,
>>        struct cftype *cft, struct eventfd_ctx *eventfd)
>>  {
>>        struct mem_cgroup *mem = mem_cgroup_from_cont(cgrp);
>> @@ -3714,8 +3732,6 @@ static int
>> mem_cgroup_oom_unregister_event(struct cgroup *cgrp,
>>        }
>>
>>        mutex_unlock(&memcg_oom_mutex);
>> -
>> -       return 0;
>>  }
>>
>>  static int mem_cgroup_oom_control_read(struct cgroup *cgrp,
>> --
>> 1.7.0.4
>>
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

      reply	other threads:[~2010-05-08 21:35 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-05-07 11:46 [PATCH] cgroups: make cftype.unregister_event() void-returning Kirill A. Shutemov
2010-05-07 21:07 ` Paul Menage
2010-05-08 21:35   ` Kirill A. Shutemov [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=i2mcc557aab1005081435g3cd209fbg4eedd0791c2fb358@mail.gmail.com \
    --to=kirill@shutemov.name \
    --cc=akpm@linux-foundation.org \
    --cc=balbir@linux.vnet.ibm.com \
    --cc=containers@lists.linux-foundation.org \
    --cc=ext-phil.2.carmody@nokia.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lizf@cn.fujitsu.com \
    --cc=menage@google.com \
    --cc=nishimura@mxp.nes.nec.co.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).