Linux Container Development
  • * [PATCHSET cgroup-for-3.14] cgroup: restructure pidlist handling
    @ 2013-11-24 22:11 Tejun Heo
    From: Tejun Heo @ 2013-11-24 22:11 UTC
      To: lizefan-hv44wF8Li93QT0dZR+AlfA
      Cc: cgroups-u79uwXL29TY76Z2rM5mHXA,
    	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
    
    Hello,
    
    pidlist handling is quite elaborate.  Because the pidlist files -
    "tasks" and "cgroup.procs" - guarantee that the result is sorted and a
    task can be associated with different pids, with no inherent order
    among them, depending on namespaces, it is impossible to give a
    certain order to tasks of a cgroup and then just iterate through them.
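
    To illustrate the ordering problem, here is a minimal userspace
    sketch, not kernel code.  struct task, NR_NS and the sample ids are
    hypothetical; the point is only that the same set of tasks sorts
    differently depending on which namespace's ids are compared.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_NS 2			/* hypothetical: two pid namespaces */

struct task {
	const char *name;
	int pid[NR_NS];		/* id of this task as seen from each namespace */
};

static int sort_ns;		/* namespace whose ids drive the comparison */

/* qsort comparator: order tasks by their id in namespace sort_ns */
static int cmp_task(const void *a, const void *b)
{
	int x = ((const struct task *)a)->pid[sort_ns];
	int y = ((const struct task *)b)->pid[sort_ns];

	return (x > y) - (x < y);
}

/* Sort tasks as a reader in namespace ns would expect to see them */
static void sort_tasks_for_ns(struct task *tasks, size_t n, int ns)
{
	assert(ns >= 0 && ns < NR_NS);
	sort_ns = ns;		/* global state, so not thread safe */
	qsort(tasks, n, sizeof(*tasks), cmp_task);
}
```

    With two tasks whose ids are { 100, 7 } and { 200, 3 }, sorting for
    namespace 0 puts the first task first while sorting for namespace 1
    puts it last, so no single stored order serves both readers.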
    
    Instead, we end up creating tables of the relevant ids and then sort
    them before serving them out for reads.  As those tables can be huge,
    we also implement logic to share those tables if the id type and
    namespace match, which in turn involves reference counting those
    tables and synchronizing accesses to them.
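
    The table building step described above can be sketched in
    userspace roughly as follows.  This is an assumption-laden
    illustration rather than the code in kernel/cgroup.c: the function
    names are made up, and the duplicate stripping assumes that the
    same id can be collected more than once, e.g. when several threads
    share a process id.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

/* qsort comparator for pid_t; avoids the overflow a subtraction could cause */
static int cmp_pid(const void *a, const void *b)
{
	pid_t x = *(const pid_t *)a;
	pid_t y = *(const pid_t *)b;

	return (x > y) - (x < y);
}

/* Strip adjacent duplicates from a sorted array; returns the new length */
static size_t pidlist_uniq(pid_t *list, size_t n)
{
	size_t src, dst;

	if (n < 2)
		return n;

	dst = 1;
	for (src = 1; src < n; src++)
		if (list[src] != list[dst - 1])
			list[dst++] = list[src];
	return dst;
}

/* Sort a snapshot of ids in place and dedupe it; returns the final length */
static size_t pidlist_build(pid_t *list, size_t n)
{
	assert(list != NULL || n == 0);
	if (n == 0)
		return 0;

	qsort(list, n, sizeof(*list), cmp_pid);
	return pidlist_uniq(list, n);
}
```

    The whole snapshot has to be collected and sorted before the first
    byte can be served, which is why the tables end up being cached and
    shared.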
    
    What could have been a simple iteration through the member tasks
    became this unnecessary hunk of complexity because it, for some
    reason, wanted to guarantee sorted output, which is extremely unusual
    for this type of interface.
    
    The refcnting is done from open() and release() callbacks, which
    kernfs doesn't expose.  This patchset updates pidlist handling so that
    pidlists are managed from seq_file operations proper.  As the duration
    between the paired start and stop denotes a single read invocation and
    we don't want to reload pidlist for each instance of consecutive read
    calls, pidlist is released with a time delay.  This also bounds how
    stale the output of read calls can be.  This makes refcnting
    unnecessary - locking is simplified and refcnting is dropped.
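
    The delayed release idea can be modelled in userspace as below.
    This is only a sketch: struct pidlist_cache, the one second delay
    and the explicit "now" argument are all hypothetical, and the real
    patches would use kernel delayed work rather than a polled reap
    function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define PIDLIST_RELEASE_DELAY 1	/* hypothetical delay, in seconds */

struct pidlist_cache {
	int *list;		/* loaded table, NULL once released */
	size_t len;
	long expires;		/* time from which the table may be freed */
	unsigned int loads;	/* how many times the table was (re)built */
};

/* Start of a read: reuse the cached table if present, else load it */
static bool cache_start(struct pidlist_cache *c, size_t len)
{
	if (c->list)
		return true;

	c->list = calloc(len ? len : 1, sizeof(*c->list));
	if (!c->list)
		return false;
	c->len = len;
	c->loads++;
	return true;
}

/* End of a read: keep the table, just push the release deadline out */
static void cache_stop(struct pidlist_cache *c, long now)
{
	assert(c->list != NULL);
	c->expires = now + PIDLIST_RELEASE_DELAY;
}

/* Deferred work: free the table once the deadline has passed */
static void cache_reap(struct pidlist_cache *c, long now)
{
	if (c->list && now >= c->expires) {
		free(c->list);
		c->list = NULL;
		c->len = 0;
	}
}
```

    Consecutive reads that arrive before the deadline hit the same
    table, so no per-open reference count is needed, and a table can
    never be served for longer than the delay after its last read.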
    
    In the long term, we want to do away with pidlist and make this a
    simple iteration over member tasks.  The last patch scrambles the sort
    order of "cgroup.procs" if sane_behavior, so that the sorted
    expectation is broken in the new interface and we can eventually drop
    pidlist logic.
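
    One way to scramble the order while keeping ids unique is to sort
    on a key that is a one-to-one shuffle of the id's bits.  The sketch
    below swaps every pair of adjacent bits; treat both the function
    names and the choice of mapping as illustrative assumptions, not as
    a statement of what the last patch does.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Swap every two consecutive bits.  The mapping is one-to-one and not
 * the identity, so keys stay unique but no longer follow numeric order.
 */
static uint32_t pid_scramble(uint32_t pid)
{
	uint32_t even = pid & 0x55555555u;	/* bits 0, 2, 4, ... */
	uint32_t odd = pid & 0xAAAAAAAAu;	/* bits 1, 3, 5, ... */

	return (even << 1) | (odd >> 1);
}

/* Debug helper: applying the mapping twice must give back the input */
static void pid_scramble_check(uint32_t pid)
{
	assert(pid_scramble(pid_scramble(pid)) == pid);
}
```

    For example, ids 1 and 2 map to keys 2 and 1, so a list sorted by
    key no longer comes out in ascending id order.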
    
    This patchset contains the following nine patches.
    
     0001-cgroup-don-t-skip-seq_open-on-write-only-opens-on-pi.patch
     0002-cgroup-remove-cftype-release.patch
     0003-cgroup-implement-delayed-destruction-for-cgroup_pidl.patch
     0004-cgroup-introduce-struct-cgroup_pidlist_open_file.patch
     0005-cgroup-refactor-cgroup_pidlist_find.patch
     0006-cgroup-remove-cgroup_pidlist-rwsem.patch
     0007-cgroup-load-and-release-pidlists-from-seq_file-start.patch
     0008-cgroup-remove-cgroup_pidlist-use_count.patch
     0009-cgroup-don-t-guarantee-cgroup.procs-is-sorted-if-san.patch
    
    0001-0002 are prep patches.
    
    0003-0008 restructure pidlist handling so that it's managed from
    seq_file operations.
    
    0009 scrambles the sort order of cgroup.procs if sane_behavior.
    
    This patchset is on top of cgroup/for-3.14 edab95103d3a ("cgroup:
    Merge branch 'memcg_event' into for-3.14") and available in the
    following git branch.
    
     git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-pidlist
    
    diffstat follows.
    
     include/linux/cgroup.h |    5 
     kernel/cgroup.c        |  310 +++++++++++++++++++++++++++++++------------------
     2 files changed, 204 insertions(+), 111 deletions(-)
    
    Thanks.
    
    --
    tejun
    
    Thread overview: 16+ messages
         [not found] <1385331096-7918-1-git-send-email-tj@kernel.org>
         [not found] ` <1385331096-7918-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
    2013-11-24 22:11   ` [PATCH 1/9] cgroup: don't skip seq_open on write only opens on pidlist files Tejun Heo
    2013-11-24 22:11   ` [PATCH 2/9] cgroup: remove cftype->release() Tejun Heo
    2013-11-24 22:11   ` [PATCH 3/9] cgroup: implement delayed destruction for cgroup_pidlist Tejun Heo
    2013-11-24 22:11   ` [PATCH 4/9] cgroup: introduce struct cgroup_pidlist_open_file Tejun Heo
    2013-11-24 22:11   ` [PATCH 5/9] cgroup: refactor cgroup_pidlist_find() Tejun Heo
    2013-11-24 22:11   ` [PATCH 6/9] cgroup: remove cgroup_pidlist->rwsem Tejun Heo
    2013-11-24 22:11   ` [PATCH 7/9] cgroup: load and release pidlists from seq_file start and stop respectively Tejun Heo
    2013-11-24 22:11   ` [PATCH 8/9] cgroup: remove cgroup_pidlist->use_count Tejun Heo
    2013-11-24 22:11   ` [PATCH 9/9] cgroup: don't guarantee cgroup.procs is sorted if sane_behavior Tejun Heo
    2013-11-27 23:23   ` [PATCHSET cgroup-for-3.14] cgroup: restructure pidlist handling Tejun Heo
    2013-11-29  1:03   ` Li Zefan
         [not found]     ` <5297E7E2.8080404-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
    2013-11-29 15:46       ` Tejun Heo
         [not found] ` <1385331096-7918-5-git-send-email-tj@kernel.org>
         [not found]   ` <1385331096-7918-5-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
    2013-11-29  1:03     ` [PATCH 4/9] cgroup: introduce struct cgroup_pidlist_open_file Li Zefan
    2013-11-29 15:44     ` [PATCH v2 " Tejun Heo
         [not found] ` <1385331096-7918-8-git-send-email-tj@kernel.org>
         [not found]   ` <1385331096-7918-8-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
    2013-11-29 15:45     ` [PATCH v2 7/9] cgroup: load and release pidlists from seq_file start and stop respectively Tejun Heo
    2013-11-24 22:11 [PATCHSET cgroup-for-3.14] cgroup: restructure pidlist handling Tejun Heo
    
