Linux Container Development
From: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
To: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Subject: Re: [PATCHSET cgroup-for-3.14] cgroup: restructure pidlist handling
Date: Fri, 29 Nov 2013 09:03:30 +0800
Message-ID: <5297E7E2.8080404@huawei.com>
In-Reply-To: <1385331096-7918-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

> pidlist handling is quite elaborate.  Because the pidlist files -
> "tasks" and "cgroup.procs" - guarantee that the result is sorted and a
> task can be associated with different pids, with no inherent order
> among them, depending on namespaces, it is impossible to give a
> certain order to tasks of a cgroup and then just iterate through them.
> 
> Instead, we end up creating tables of the relevant ids and then sort
> them before serving them out for reads.  As those tables can be huge,
> we also implement logic to share those tables if the id type and
> namespace match, which in turn involves reference counting those
> tables and synchronizing accesses to them.
> 
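
For illustration only, the "build a table of ids, then sort it" step
described above could look roughly like the userspace sketch below.  It
is not the kernel code; the names cmp_pid() and pidlist_sort_uniq() are
made up for this example, and the duplicate removal is an assumption
(it matters when several threads share one thread group id).

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

/* qsort comparator for pid_t values; avoids overflow from subtraction. */
static int cmp_pid(const void *a, const void *b)
{
	pid_t x = *(const pid_t *)a;
	pid_t y = *(const pid_t *)b;

	return (x > y) - (x < y);
}

/*
 * Hypothetical helper: sort a table of ids in place and drop duplicates.
 * Returns the number of unique entries left at the front of the table.
 */
static size_t pidlist_sort_uniq(pid_t *list, size_t n)
{
	size_t src, dst;

	assert(list != NULL || n == 0);
	if (n < 2)
		return n;

	qsort(list, n, sizeof(*list), cmp_pid);

	/* Compact the sorted table, keeping the first of each run. */
	for (src = 1, dst = 1; src < n; src++) {
		if (list[src] != list[dst - 1])
			list[dst++] = list[src];
	}
	return dst;
}
```

The whole table has to exist before the first id can be returned, which
is why a sorted guarantee rules out plain iteration over the tasks.
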
> What could have been a simple iteration through the member tasks
> became this unnecessary hunk of complexity because it, for some
> reason, wanted to guarantee sorted output, which is extremely unusual
> for this type of interface.
> 
> The refcnting is done from open() and release() callbacks, which
> kernfs doesn't expose.  This patchset updates pidlist handling so that
> pidlists are managed from seq_file operations proper.  As the duration
> between the paired start and stop denotes a single read invocation and
> we don't want to reload pidlist for each instance of consecutive read
> calls, pidlist is released with time delay.  This also bounds how
> stale the output of read calls can be.  This makes refcnting
> unnecessary - locking is simplified and refcnting is dropped.
> 
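
A rough model of the delayed release described above, as a userspace
sketch under stated assumptions: struct pidlist_cache, pidlist_start(),
pidlist_stop() and PIDLIST_DESTROY_DELAY are hypothetical names, time
is a plain integer passed in by the caller, and expiry is checked
lazily on the next read rather than by a timer as a real
implementation would likely do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: how long an idle list is kept, in arbitrary time units. */
#define PIDLIST_DESTROY_DELAY 1

struct pidlist_cache {
	bool loaded;	/* is a list currently cached? */
	long expires;	/* an idle list is dropped after this time */
	unsigned loads;	/* how many times the list had to be rebuilt */
};

/* Start of one read: reuse the cached list or rebuild it. */
static void pidlist_start(struct pidlist_cache *c, long now)
{
	assert(c != NULL);

	/* The delayed release fired while nobody was reading. */
	if (c->loaded && now > c->expires)
		c->loaded = false;

	if (!c->loaded) {
		c->loaded = true;	/* stands in for rebuilding the table */
		c->loads++;
	}
}

/* End of one read: do not free, only arm the delayed release. */
static void pidlist_stop(struct pidlist_cache *c, long now)
{
	assert(c != NULL);
	c->expires = now + PIDLIST_DESTROY_DELAY;
}
```

Back-to-back reads inside the delay share one load; a read after the
delay rebuilds the list, so the delay is also the bound on staleness.
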
> In the long term, we want to do away with pidlist and make this a
> simple iteration over member tasks.  The last patch scrambles the sort
> order of "cgroup.procs" if sane_behavior, so that the sorted
> expectation is broken in the new interface and we can eventually drop
> pidlist logic.
> 
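
One way to picture "scrambling the sort order" is to keep sorting, but
on a permuted key, so the output is deterministic yet no longer in
numeric order.  The sketch below is an assumption about how that could
be done, not a description of the patch; scramble_key() and
sort_scrambled() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scramble: swap each pair of adjacent bits (a bijection). */
static uint32_t scramble_key(uint32_t id)
{
	uint32_t even = id & 0x55555555u;
	uint32_t odd = id & 0xAAAAAAAAu;

	return (even << 1) | (odd >> 1);
}

/* Compare two ids by their scrambled keys instead of their values. */
static int cmp_scrambled(const void *a, const void *b)
{
	uint32_t x = scramble_key(*(const uint32_t *)a);
	uint32_t y = scramble_key(*(const uint32_t *)b);

	return (x > y) - (x < y);
}

/* Sort ids in place into scrambled-key order. */
static void sort_scrambled(uint32_t *ids, size_t n)
{
	assert(ids != NULL || n == 0);
	if (n > 1)
		qsort(ids, n, sizeof(*ids), cmp_scrambled);
}
```

With ids 1, 2, 3, 4 the keys are 2, 1, 3, 8, so the result comes out as
2, 1, 3, 4: stable across reads, but useless to anyone relying on
ascending order.
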
> This patchset contains the following nine patches.
> 
>  0001-cgroup-don-t-skip-seq_open-on-write-only-opens-on-pi.patch
>  0002-cgroup-remove-cftype-release.patch
>  0003-cgroup-implement-delayed-destruction-for-cgroup_pidl.patch
>  0004-cgroup-introduce-struct-cgroup_pidlist_open_file.patch
>  0005-cgroup-refactor-cgroup_pidlist_find.patch
>  0006-cgroup-remove-cgroup_pidlist-rwsem.patch
>  0007-cgroup-load-and-release-pidlists-from-seq_file-start.patch
>  0008-cgroup-remove-cgroup_pidlist-use_count.patch
>  0009-cgroup-don-t-guarantee-cgroup.procs-is-sorted-if-san.patch
> 

Acked-by: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>


Thread overview: 16+ messages
     [not found] <1385331096-7918-1-git-send-email-tj@kernel.org>
     [not found] ` <1385331096-7918-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2013-11-24 22:11   ` [PATCH 1/9] cgroup: don't skip seq_open on write only opens on pidlist files Tejun Heo
2013-11-24 22:11   ` [PATCH 2/9] cgroup: remove cftype->release() Tejun Heo
2013-11-24 22:11   ` [PATCH 3/9] cgroup: implement delayed destruction for cgroup_pidlist Tejun Heo
2013-11-24 22:11   ` [PATCH 4/9] cgroup: introduce struct cgroup_pidlist_open_file Tejun Heo
2013-11-24 22:11   ` [PATCH 5/9] cgroup: refactor cgroup_pidlist_find() Tejun Heo
2013-11-24 22:11   ` [PATCH 6/9] cgroup: remove cgroup_pidlist->rwsem Tejun Heo
2013-11-24 22:11   ` [PATCH 7/9] cgroup: load and release pidlists from seq_file start and stop respectively Tejun Heo
2013-11-24 22:11   ` [PATCH 8/9] cgroup: remove cgroup_pidlist->use_count Tejun Heo
2013-11-24 22:11   ` [PATCH 9/9] cgroup: don't guarantee cgroup.procs is sorted if sane_behavior Tejun Heo
2013-11-27 23:23   ` [PATCHSET cgroup-for-3.14] cgroup: restructure pidlist handling Tejun Heo
2013-11-29  1:03   ` Li Zefan [this message]
     [not found]     ` <5297E7E2.8080404-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2013-11-29 15:46       ` Tejun Heo
     [not found] ` <1385331096-7918-5-git-send-email-tj@kernel.org>
     [not found]   ` <1385331096-7918-5-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2013-11-29  1:03     ` [PATCH 4/9] cgroup: introduce struct cgroup_pidlist_open_file Li Zefan
2013-11-29 15:44     ` [PATCH v2 " Tejun Heo
     [not found] ` <1385331096-7918-8-git-send-email-tj@kernel.org>
     [not found]   ` <1385331096-7918-8-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2013-11-29 15:45     ` [PATCH v2 7/9] cgroup: load and release pidlists from seq_file start and stop respectively Tejun Heo
2013-11-24 22:11 [PATCHSET cgroup-for-3.14] cgroup: restructure pidlist handling Tejun Heo
