From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Glauber Costa <glommer@parallels.com>
Cc: cgroups@vger.kernel.org, Li Zefan <lizefan@huawei.com>,
	Tejun Heo <tj@kernel.org>,
	devel@openvz.org, Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.cz>, Linux MM <linux-mm@kvack.org>,
	Pavel Emelyanov <xemul@parallels.com>
Subject: Re: [RFC 0/7] Initial proposal for faster res_counter updates
Date: Fri, 30 Mar 2012 17:32:06 +0900
Message-ID: <4F756F86.8030906@jp.fujitsu.com>
In-Reply-To: <1333094685-5507-1-git-send-email-glommer@parallels.com>

(2012/03/30 17:04), Glauber Costa wrote:

> Hi,
> 
> Here is my take about how we can make res_counter updates faster.
> Keep in mind this is a bit of a hack intended as a proof of concept.
> 
> The pros I see with this:
> 
> * free updates in non-constrained paths. Non-constrained paths include
>   unlimited scenarios, but also ones in which we are far from the limit.
> 
> * No need to have a special cache mechanism in memcg. The problem with
>   the caching, in my opinion, is that we will forward-account pages, meaning
>   that we'll count as accounted pages we never used. I am not sure
>   anyone actually ran into this, but in theory, this can fire events
>   much earlier than it should.
> 


Note: Assume a big system with many cpus, which the user wants to divide
into containers. memcg's current percpu caching happens only while a task
in the memcg is running on that cpu. So it's not as dangerous as it looks.

But yes, if we can drop that code from memcg, good. It would let us remove
a fair amount of code.

> But the cons:
> 
> * percpu counters have signed quantities, so this would limit us to 4G.
>   We can add a shift and then count pages instead of bytes, but we
>   are still in the 16T area here. Maybe we really need more than that.
> 

....
struct percpu_counter {
        raw_spinlock_t lock;
        s64 count;

Does s64 limit us to 4G?
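
To make the arithmetic concrete, here is a minimal userspace sketch (not
kernel code; max_bytes() is a name made up for this illustration). It
computes the largest byte count a signed counter of a given width can
represent when it counts in units of (1 << shift) bytes:

```c
#include <stdint.h>

/*
 * Hypothetical helper: largest number of bytes representable by a signed
 * counter that is `bits` wide (1..64) and counts units of (1 << shift)
 * bytes. The caller must pick a shift that does not overflow 64 bits.
 */
static uint64_t max_bytes(unsigned int bits, unsigned int shift)
{
	/* Largest positive value of a signed integer of this width. */
	uint64_t max_units = (UINT64_C(1) << (bits - 1)) - 1;

	return max_units << shift;
}
```

Under these assumptions a signed 32-bit counter tops out just below 2G
when counting bytes and just below 8T with a 12-bit page shift (the 4G and
16T figures correspond to a full 32 bits), while an s64 holds up to
2^63 - 1 bytes.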


> * some of the additions here may slow down the percpu_counters for
>   users that don't care about our usage. Things like min/max tracking
>   fall into this category.
> 


I don't think it's a good idea to increase the size of percpu_counter. It's
already very big... Hm. How about

	struct percpu_counter_lazy {
		struct percpu_counter pcp;
		/* extra information */
		s64 margin;
	};
?
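
A rough userspace sketch of how such a wrapper might behave, just to make
the idea concrete. Everything in it is an assumption: the names
(lazy_charge, sketch_sum, NR_SLOTS, BATCH), the limit field standing in for
the "extra information", and the reading of margin as "how far below the
limit the global count must be before the exact sum can be skipped". Plain
arrays stand in for percpu data, there is no locking, and only charges
(val > 0) are handled.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_SLOTS 4	/* stand-in for the number of cpus (assumption) */
#define BATCH    8	/* fold a slot into the global count at this size */

/* Simplified stand-in for struct percpu_counter, without locking. */
struct counter_sketch {
	int64_t count;			/* global count, excludes unfolded deltas */
	int32_t local[NR_SLOTS];	/* per-slot deltas not yet folded */
};

struct lazy_counter_sketch {
	struct counter_sketch pcp;
	int64_t limit;	/* hypothetical "extra information" */
	int64_t margin;	/* must be >= NR_SLOTS * BATCH for the fast path */
};

/* Exact value: global count plus every unfolded per-slot delta. */
static int64_t sketch_sum(const struct counter_sketch *c)
{
	int64_t sum = c->count;
	int i;

	for (i = 0; i < NR_SLOTS; i++)
		sum += c->local[i];
	return sum;
}

/* Charge val (> 0) from the given slot; returns false if over the limit. */
static bool lazy_charge(struct lazy_counter_sketch *lc, int slot, int32_t val)
{
	int64_t sum;
	int i;

	/* Fast path: far from the limit even if every slot is nearly full. */
	if (lc->pcp.count + val + lc->margin <= lc->limit) {
		int64_t delta = (int64_t)lc->pcp.local[slot] + val;

		if (delta >= BATCH) {
			/* Slot grew past the batch size: fold it into the global count. */
			lc->pcp.count += delta;
			lc->pcp.local[slot] = 0;
		} else {
			lc->pcp.local[slot] = (int32_t)delta;
		}
		return true;
	}

	/* Slow path: close to the limit, so check against the exact sum. */
	sum = sketch_sum(&lc->pcp);
	if (sum + val > lc->limit)
		return false;
	for (i = 0; i < NR_SLOTS; i++)
		lc->pcp.local[i] = 0;
	lc->pcp.count = sum + val;
	return true;
}
```

With margin >= NR_SLOTS * BATCH, the unfolded deltas can never add up to
more than the margin, so the fast path cannot push usage over the limit.
The extra fields live in the wrapper, and plain percpu_counter users pay
nothing for them.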

> * growth of the percpu memory.
>


This may be a concern.

I'll look into the patches.

Thanks,
-Kame

 
> It is still not clear to me if we should use percpu_counters as this
> patch implies, or if we should just replicate its functionality.
> 
> I need to go through at least one more full round of auditing before
> making sure the locking is safe, especially my use of synchronize_rcu().
> 
> As for measurements, the cache we have in memcg kind of distorts things.
> I need to either disable it, or find the cases in which it is likely
> to lose and benchmark them, such as deep hierarchy concurrent updates
> with common parents.
> 
> I also included a possible optimization that can be done when we
> are close to the limit to avoid the initial tests altogether, but
> it needs to be extended to avoid scanning the percpu areas as well.
> 
> In summary, if this is to be carried forward, it definitely needs
> some love. It should be, however, more than enough to make the
> proposal clear.
> 
> Comments are appreciated.
> 
> Glauber Costa (7):
>   split percpu_counter_sum
>   consolidate all res_counter manipulation
>   bundle a percpu counter into res_counters and use its lock
>   move res_counter_set limit to res_counter.c
>   use percpu_counters for res_counter usage
>   Add min and max statistics to percpu_counter
>   Global optimization
> 
>  include/linux/percpu_counter.h |    3 +
>  include/linux/res_counter.h    |   63 ++++++-----------
>  kernel/res_counter.c           |  151 +++++++++++++++++++++++++++++-----------
>  lib/percpu_counter.c           |   16 ++++-
>  4 files changed, 151 insertions(+), 82 deletions(-)
> 



