cgroups.vger.kernel.org archive mirror
From: Aristeu Rozanski <aris-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	mhocko-AlSwsSmVLrQ@public.gmane.org,
	hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCHSET] cgroup: allow dropping RCU read lock while iterating
Date: Wed, 22 May 2013 10:53:17 -0400
Message-ID: <20130522145316.GD16739@redhat.com>
In-Reply-To: <1369101025-28335-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

Hi Tejun,
On Tue, May 21, 2013 at 10:50:20AM +0900, Tejun Heo wrote:
> Currently all cgroup iterators require the whole traversal to be
> contained in a single RCU read critical section, which can be too
> restrictive as there are times when blocking operations are necessary
> during traversal.  This forces controllers to implement specific
> workarounds in those cases - building separate iteration list, punting
> actual operations to work items and so on.
> 
> This patchset updates cgroup iterators so that they allow dropping RCU
> read lock while iteration is in progress so that controllers which
> require sleeping during iteration don't need to implement their own
> mechanisms.
> 
> Dropping RCU read lock during iteration is unsafe because
> cgroup->sibling.next can't be trusted once RCU read lock is dropped.
> The sibling list is an RCU list and when a cgroup is removed the next
> pointer is retained to keep RCU traversal working.  If the next
> sibling is removed while RCU read lock is dropped, the removed current
> cgroup's next won't be updated and the next sibling may complete its
> grace period and get freed leaving the next pointer dangling.
> 
> Working around the problem is relatively simple.  Whether
> ->sibling.next can be trusted can be decided by looking at
> CGRP_REMOVED - as cgroup removals are fully serialized, the flag is
> guaranteed to be visible before the next sibling finishes its grace
> period.  For those cases, each cgroup is assigned a monotonically
> increasing serial number.  Because new cgroups are always appended to
> the children list, it's guaranteed that all children lists are sorted
> in the ascending order of the serial numbers.  When the next pointer
> can't be trusted, the next sibling can be located by walking the
> parent's children list from the beginning looking for the first cgroup
> with higher serial number.
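
For readers following along, the fallback lookup described above can be
sketched in userspace C.  This is only a rough model under stated
assumptions, not the code from patch 0003: struct node, struct parent,
add_child(), remove_child() and next_sibling() are hypothetical names,
the "removed" field stands in for CGRP_REMOVED, and RCU, locking and
memory freeing are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a cgroup; not the kernel's struct cgroup. */
struct node {
	uint64_t serial_nr;	/* assigned once, monotonically increasing */
	bool removed;		/* models the CGRP_REMOVED flag */
	struct node *next;	/* models ->sibling.next */
};

/* Hypothetical stand-in for the parent's children list. */
struct parent {
	struct node *children;	/* head, kept in ascending serial_nr order */
	struct node *tail;
	uint64_t next_serial;
};

/* New children are always appended, so the list stays sorted by serial. */
static void add_child(struct parent *p, struct node *n)
{
	n->serial_nr = p->next_serial++;
	n->removed = false;
	n->next = NULL;
	if (p->tail)
		p->tail->next = n;
	else
		p->children = n;
	p->tail = n;
}

/*
 * Unlink @n but leave n->next untouched, the way an RCU list removal
 * keeps the next pointer so that concurrent readers can keep walking.
 */
static void remove_child(struct parent *p, struct node *n)
{
	struct node **link = &p->children;
	struct node *prev = NULL;

	while (*link && *link != n) {
		prev = *link;
		link = &(*link)->next;
	}
	if (!*link)
		return;
	*link = n->next;
	if (p->tail == n)
		p->tail = prev;
	n->removed = true;
}

/*
 * If @pos is still linked, its next pointer can be trusted.  Otherwise
 * rescan the parent's list for the first child with a higher serial.
 */
static struct node *next_sibling(struct parent *p, struct node *pos)
{
	struct node *c;

	if (!pos->removed)
		return pos->next;

	for (c = p->children; c; c = c->next)
		if (c->serial_nr > pos->serial_nr)
			return c;
	return NULL;
}
```

In this model, if an iterator is parked on a node that is removed and
the following sibling is removed afterwards, the parked node's next
pointer goes stale, and next_sibling() skips it by rescanning from the
head of the parent's list.
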
> 
> The above is implemented in cgroup_next_sibling() and all iterators
> are updated to use it to find out the next sibling thus allowing
> dropping RCU read lock while iteration is in progress.  This patchset
> replaces separate iteration list in device_cgroup with direct
> descendant walk and there will be further patches making use of this
> update.
> 
> This patchset contains the following five patches.
> 
>  0001-cgroup-fix-a-subtle-bug-in-descendant-pre-order-walk.patch
>  0002-cgroup-make-cgroup_is_removed-static.patch
>  0003-cgroup-add-cgroup-serial_nr-and-implement-cgroup_nex.patch
>  0004-cgroup-update-iterators-to-use-cgroup_next_sibling.patch
>  0005-device_cgroup-simplify-cgroup-tree-walk-in-propagate.patch
> 
> 0001 fixes a subtle iteration bug.  Will be applied to for-3.10-fixes.
> 
> 0002 is a trivial prep patch.
> 
> 0003 implements cgroup_next_sibling() which can find out the next
> sibling regardless of the state of the current cgroup.
> 
> 0004 updates all iterators to use cgroup_next_sibling().
> 
> 0005 replaces iteration list workaround in device_cgroup with direct
> iteration.
> 
> This patchset is on top of cgroup/for-3.11 23958e729e ("cgroup.h:
> remove some functions that are now gone") and available in the
> following git branch.
> 
>  git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-interruptible-iter

Patchset looks good to me.  I ran some tests on a kernel with it applied
and hit no problems.

Acked-by: Aristeu Rozanski <aris-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

-- 
Aristeu

Thread overview: 21+ messages
2013-05-21  1:50 [PATCHSET] cgroup: allow dropping RCU read lock while iterating Tejun Heo
2013-05-21  1:50 ` [PATCH 1/5] cgroup: fix a subtle bug in descendant pre-order walk Tejun Heo
2013-05-22 18:22   ` Michal Hocko
2013-05-24  1:51   ` Tejun Heo
2013-05-21  1:50 ` [PATCH 2/5] cgroup: make cgroup_is_removed() static Tejun Heo
2013-05-24  1:56   ` Tejun Heo
2013-05-24  3:32     ` Li Zefan
2013-05-21  1:50 ` [PATCH 3/5] cgroup: add cgroup->serial_nr and implement cgroup_next_sibling() Tejun Heo
2013-05-21 14:33   ` Serge Hallyn
2013-05-22 14:36   ` Aristeu Rozanski
2013-05-22 14:38     ` Tejun Heo
2013-05-22 18:41   ` Michal Hocko
2013-05-21  1:50 ` [PATCH 4/5] cgroup: update iterators to use cgroup_next_sibling() Tejun Heo
2013-05-21 22:31   ` Serge Hallyn
2013-05-22  9:09   ` Li Zefan
2013-05-22  9:17     ` Tejun Heo
2013-05-22 18:46   ` Michal Hocko
2013-05-21  1:50 ` [PATCH 5/5] device_cgroup: simplify cgroup tree walk in propagate_exception() Tejun Heo
2013-05-21 22:35   ` Serge Hallyn
2013-05-21  3:20 ` [PATCHSET] cgroup: allow dropping RCU read lock while iterating Tejun Heo
2013-05-22 14:53 ` Aristeu Rozanski [this message]
