From: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
To: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	mhocko-AlSwsSmVLrQ@public.gmane.org,
	hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Subject: Re: [PATCHSET] cgroup: simplify cgroup removal path
Date: Wed, 31 Oct 2012 17:49:33 +0400
Message-ID: <50912C6D.6020000@parallels.com>
In-Reply-To: <1351657365-25055-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

On 10/31/2012 08:22 AM, Tejun Heo wrote:
> Hello, guys.
> 
> cgroup removal path is quite ugly.  A lot of the ugliness comes from
> the weird design which allows ->pre_destroy() to fail and the feature
> to drain existing CSS reference counts before committing to removal.
> Both mean that it should be possible to roll-back cgroup destruction
> after some or all ->pre_destroy() invocations.
> 
> This weird design has never really worked.  To list a couple examples.
> 
>  * Some ->pre_destroy() implementations aren't side-effect free.
>    Roll-back happens after a lot of state is already lost.
> 
>  * Some ->pre_destroy() implementations (naturally) assume that the
>    cgroup being destroyed would stay quiescent between successful
>    ->pre_destroy() and its destruction.  Unfortunately, any operation
>    can happen in between and the cgroup could be in a very different
>    state by the time it actually gets destroyed.
> 
> It's just such an unusual design which unnecessarily contains weird
> code path combinations which are tricky to hit, reproduce and expect.
> Moreover, the design's deficiencies attract kludges on top as
> workarounds and we end up with stuff like cgroup_exclude_rmdir() and
> cgroup_release_and_wakeup_rmdir() which really make me want to cry.
> 
> Now that memcg has moved away from failable ->pre_destroy(), we can do
> away with all these.  I tested some basic operations and some corner
> cases but am still a bit scared.  Would love to get acks from Li and
> memcg people.
> 
> This patchset contains the following eight patches.
> 
>  0001-cgroup-kill-cgroup_subsys-__DEPRECATED_clear_css_ref.patch
>  0002-cgroup-kill-CSS_REMOVED.patch
>  0003-cgroup-use-cgroup_lock_live_group-parent-in-cgroup_c.patch
>  0004-cgroup-deactivate-CSS-s-and-mark-cgroup-dead-before-.patch
>  0005-cgroup-remove-CGRP_WAIT_ON_RMDIR-cgroup_exclude_rmdi.patch
>  0006-memcg-make-mem_cgroup_reparent_charges-non-failing.patch
>  0007-hugetlb-do-not-fail-in-hugetlb_cgroup_pre_destroy.patch
>  0008-cgroup-make-pre_destroy-return-void.patch
> 
> 0001-0002 remove now unused ->pre_destroy() failure handling and do
> follow-up simplification.
> 
> 0003-0004 update removal path such that each ->pre_destroy() is
> guaranteed to be invoked once per removal and the cgroup being
> destroyed stays quiescent until destruction is complete.
> 
> 0005 removes the scary CGRP_WAIT_ON_RMDIR mechanism.
> 
> 0006-0008 are follow-up clean-ups.  0006 and 0007 are from Michal's
> patchset[1].
> 
> This patchset is on top of
> 
>   v3.6 (a0d271cbfe)
> + [1] the first three patches of
>       "memcg/cgroup: do not fail fail on pre_destroy callbacks" patchset
> 
> and available in the following git branch.
> 
>  git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-cgroup-rmdir-updates
> 
> Thanks.
> 
>  block/blk-cgroup.c     |    3 
>  include/linux/cgroup.h |   41 -------
>  kernel/cgroup.c        |  256 +++++++++++--------------------------------------
>  mm/hugetlb_cgroup.c    |   11 --
>  mm/memcontrol.c        |   51 +--------
>  5 files changed, 75 insertions(+), 287 deletions(-)


The patches are quite straightforward, and you are basically throwing
useless code away...

The only thing that drew my attention is that you are changing the
local_irq_save callsite to local_irq_disable. It shouldn't be a problem,
since this is never expected to be called in interrupt context.
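
For readers less familiar with the two idioms, here is a minimal
userspace sketch of why the switch is only safe when the caller is known
to enter with interrupts enabled. It is a model, not kernel code: a
plain bool stands in for the CPU interrupt flag, and every sim_* and
section_* name is hypothetical and exists only for this illustration.

```c
/*
 * Userspace model of the two interrupt-masking idioms. Not kernel code:
 * sim_irqs_enabled stands in for the CPU interrupt flag, and all sim_*
 * and section_* names are hypothetical.
 */
#include <stdbool.h>

static bool sim_irqs_enabled = true;

/* Models local_irq_save(): remember the current state, then mask. */
static void sim_irq_save(bool *flags)
{
	*flags = sim_irqs_enabled;
	sim_irqs_enabled = false;
}

/* Models local_irq_restore(): put back whatever state was saved. */
static void sim_irq_restore(bool flags)
{
	sim_irqs_enabled = flags;
}

/* Models local_irq_disable(): mask unconditionally. */
static void sim_irq_disable(void)
{
	sim_irqs_enabled = false;
}

/* Models local_irq_enable(): unmask unconditionally. */
static void sim_irq_enable(void)
{
	sim_irqs_enabled = true;
}

/* Critical section using save/restore; returns the state on exit. */
static bool section_with_save(bool enabled_on_entry)
{
	bool flags;

	sim_irqs_enabled = enabled_on_entry;
	sim_irq_save(&flags);
	/* ... critical section would run here ... */
	sim_irq_restore(flags);
	return sim_irqs_enabled;
}

/* Critical section using disable/enable; returns the state on exit. */
static bool section_with_disable(bool enabled_on_entry)
{
	sim_irqs_enabled = enabled_on_entry;
	sim_irq_disable();
	/* ... critical section would run here ... */
	sim_irq_enable();
	return sim_irqs_enabled;
}
```

In this model section_with_save() always leaves the flag as it found it,
while section_with_disable() leaves it enabled even if the caller had it
masked on entry. That second case is the one that cannot arise if the
path is never reached with interrupts already off.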

Still... it makes me wonder whether that disabled-interrupt block is
still needed. According to the changelogs, it was introduced in e7c5ec919
for the css_tryget mechanism. But css_tryget itself will never scan
subsystems, so if we can no longer fail, we should be able to just ditch
it. Unless I am missing something.
