public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@suse.cz>
To: Tejun Heo <tj@kernel.org>
Cc: lizefan@huawei.com, containers@lists.linux-foundation.org,
	cgroups@vger.kernel.org, koverstreet@google.com,
	linux-kernel@vger.kernel.org, cl@linux-foundation.org,
	Mike Snitzer <snitzer@redhat.com>,
	Vivek Goyal <vgoyal@redhat.com>,
	"Alasdair G. Kergon" <agk@redhat.com>,
	Jens Axboe <axboe@kernel.dk>,
	Mikulas Patocka <mpatocka@redhat.com>,
	Glauber Costa <glommer@gmail.com>
Subject: Re: [PATCH 11/11] cgroup: use percpu refcnt for cgroup_subsys_states
Date: Fri, 14 Jun 2013 15:20:26 +0200	[thread overview]
Message-ID: <20130614132026.GD10084@dhcp22.suse.cz> (raw)
In-Reply-To: <1371096298-24402-12-git-send-email-tj@kernel.org>

On Wed 12-06-13 21:04:58, Tejun Heo wrote:
> A css (cgroup_subsys_state) is how each cgroup is represented to a
> controller.  As such, it can be used in hot paths across the various
> subsystems different controllers are associated with.
> 
> One of the common operations is reference counting, which up until now
> has been implemented using a global atomic counter and can have
> significant adverse impact on scalability.  For example, css refcnt
> can be gotten and put multiple times by blkcg for each IO request.
> For highops configurations which try to do as much per-cpu as
> possible, the global frequent refcnting can be very expensive.
> 
> In general, given the various and hugely diverse paths css's end up
> being used from, we need to make it cheap and highly scalable.  In its
> usage, css refcnting isn't very different from module refcnting.
> 
> This patch converts css refcnting to use the recently added
> percpu_ref. 

I have no objections to change css reference counting scheme if the
guarantees we used to have are still valid. I am just missing some
comparisons. Do you have any numbers that would show benefits clearly?

You are mentioning that especially controllers that are strongly per-cpu
oriented will see the biggest improvements. What about others?
A single atomic_add resp. atomic_dec_return is much less heavy than the
new ref counting. Is it possible that those could regress? Or the
differences would be only within the noise?

I do realize that it is not possible to test all controllers but I
would be interested to see at least that those for which it really
matters get a nice boost. Memcg uses css with caution although there are
places which are in really hot paths (e.g. charging) so any improvement
would be really welcome.

Sorry, if this information has been posted along with the series. I was
CCed only on this one and didn't get to look at the rest yet (apart from
"percpu: implement generic percpu refcounting" in your
review-css-percpu-ref branch).
[...]
-- 
Michal Hocko
SUSE Labs

  parent reply	other threads:[~2013-06-14 13:20 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-13  4:04 [PATCHSET v2 cgroup/for-3.11] cgroup: convert cgroup_subsys_state refcnt to percpu_ref Tejun Heo
2013-06-13  4:04 ` [PATCH 01/11] cgroup: remove now unused css_depth() Tejun Heo
2013-06-13  4:04 ` [PATCH 02/11] cgroup: consistently use @cset for struct css_set variables Tejun Heo
2013-06-13  4:04 ` [PATCH 03/11] cgroup: bring some sanity to naming around cg_cgroup_link Tejun Heo
2013-06-13  4:04 ` [PATCH 04/11] cgroup: use kzalloc() instead of kmalloc() Tejun Heo
2013-06-13  4:04 ` [PATCH 05/11] cgroup: clean up css_[try]get() and css_put() Tejun Heo
2013-06-13  4:04 ` [PATCH 06/11] cgroup: rename CGRP_REMOVED to CGRP_DEAD Tejun Heo
2013-06-13  4:04 ` [PATCH 07/11] cgroup: drop unnecessary RCU dancing from __put_css_set() Tejun Heo
2013-06-13  4:04 ` [PATCH 08/11] cgroup: remove cgroup->count and use Tejun Heo
2013-06-13  4:04 ` [PATCH 09/11] cgroup: reorder the operations in cgroup_destroy_locked() Tejun Heo
2013-06-13  4:04 ` [PATCH 10/11] cgroup: split cgroup destruction into two steps Tejun Heo
2013-06-13  4:04 ` [PATCH 11/11] cgroup: use percpu refcnt for cgroup_subsys_states Tejun Heo
2013-06-13 23:16   ` Kent Overstreet
2013-06-14 12:55   ` Michal Hocko
2013-06-14 14:15     ` Glauber Costa
2013-06-14 14:22       ` Michal Hocko
2013-06-14 13:20   ` Michal Hocko [this message]
2013-06-14 22:31     ` Tejun Heo
2013-06-15  5:35       ` Tejun Heo
2013-06-15  5:39         ` Tejun Heo
2013-06-15  6:31         ` Tejun Heo
2013-06-17 13:27         ` Michal Hocko
2013-06-17 17:16           ` Tejun Heo
2013-06-13  6:04 ` [PATCHSET v2 cgroup/for-3.11] cgroup: convert cgroup_subsys_state refcnt to percpu_ref Li Zefan
2013-06-13 17:56   ` Tejun Heo
2013-06-14  2:41     ` Tejun Heo
  -- strict thread matches above, loose matches on Subject: below --
2013-06-12 21:03 [PATCHSET " Tejun Heo
2013-06-12 21:03 ` [PATCH 11/11] cgroup: use percpu refcnt for cgroup_subsys_states Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130614132026.GD10084@dhcp22.suse.cz \
    --to=mhocko@suse.cz \
    --cc=agk@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=cgroups@vger.kernel.org \
    --cc=cl@linux-foundation.org \
    --cc=containers@lists.linux-foundation.org \
    --cc=glommer@gmail.com \
    --cc=koverstreet@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizefan@huawei.com \
    --cc=mpatocka@redhat.com \
    --cc=snitzer@redhat.com \
    --cc=tj@kernel.org \
    --cc=vgoyal@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox