From: Frederic Weisbecker <fweisbec@gmail.com>
To: Andrew Morton <akpm@linux-foundation.org>,
Tejun Heo <tj@kernel.org>, Li Zefan <lizf@cn.fujitsu.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Aditya Kali <adityakali@google.com>,
Oleg Nesterov <oleg@redhat.com>, Tim Hockin <thockin@hockin.org>,
Tejun Heo <htejun@gmail.com>,
Containers <containers@lists.linux-foundation.org>,
Glauber Costa <glommer@gmail.com>,
Cgroups <cgroups@vger.kernel.org>,
Daniel J Walsh <dwalsh@redhat.com>,
"Daniel P. Berrange" <berrange@redhat.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Max Kellermann <mk@cm4all.com>,
Mandeep Singh Baines <msb@chromium.org>
Subject: [PATCH 03/10] cgroups: ability to stop res charge propagation on bounded ancestor
Date: Wed, 1 Feb 2012 04:37:43 +0100
Message-ID: <1328067470-5980-4-git-send-email-fweisbec@gmail.com>
In-Reply-To: <1328067470-5980-1-git-send-email-fweisbec@gmail.com>
Moving a task from one cgroup to another may require subtracting its
resource charge from the old cgroup and adding it to the new one.

For this to happen, the uncharge/charge propagation can just stop when we
reach the common ancestor of the two cgroups. Beyond the performance
benefit, we also want to avoid temporarily overloading the common
ancestors with an inaccurate resource counter usage if we charge the new
cgroup first and uncharge the old one afterwards. This is going to be a
requirement for the upcoming max number of tasks subsystem.

To solve this, provide a pair of new APIs that charge/uncharge a
resource counter until a given ancestor is reached.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul Menage <paul@paulmenage.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Aditya Kali <adityakali@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tim Hockin <thockin@hockin.org>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Containers <containers@lists.linux-foundation.org>
Cc: Glauber Costa <glommer@gmail.com>
Cc: Cgroups <cgroups@vger.kernel.org>
Cc: Daniel J Walsh <dwalsh@redhat.com>
Cc: "Daniel P. Berrange" <berrange@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Max Kellermann <mk@cm4all.com>
Cc: Mandeep Singh Baines <msb@chromium.org>
Acked-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
Documentation/cgroups/resource_counter.txt | 18 +++++++++++++++++-
include/linux/res_counter.h | 20 +++++++++++++++++---
kernel/res_counter.c | 13 ++++++++-----
3 files changed, 42 insertions(+), 9 deletions(-)
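
As an illustration of the motivation in the changelog, here is a minimal
userspace sketch of a bounded charge/uncharge. It is not the kernel code:
the struct is a simplified stand-in (no spinlock, no IRQ handling), the
"cap" field, move_charge() and demo() are hypothetical names, and the
rollback on failure is part of the sketch rather than something taken from
the hunks in this patch. What it does borrow from the patch is the loop
condition "c != limit", which means the "limit" counter itself is never
touched and a NULL limit walks all the way to the root.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct res_counter (assumption: no locking). */
struct res_counter {
	unsigned long usage;		/* current charge */
	unsigned long cap;		/* hypothetical name for the usage ceiling */
	struct res_counter *parent;
};

/*
 * Charge val on counter and its ancestors, stopping before "limit".
 * On failure, undo what was charged so far and report the counter
 * that refused the charge.
 */
static int charge_until(struct res_counter *counter, struct res_counter *limit,
			unsigned long val, struct res_counter **fail_at)
{
	struct res_counter *c, *u;

	*fail_at = NULL;
	for (c = counter; c != limit; c = c->parent) {
		if (c->usage + val > c->cap) {
			*fail_at = c;
			for (u = counter; u != c; u = u->parent)
				u->usage -= val;
			return -1;
		}
		c->usage += val;
	}
	return 0;
}

/* Uncharge val on counter and its ancestors, stopping before "limit". */
static void uncharge_until(struct res_counter *counter,
			   struct res_counter *limit, unsigned long val)
{
	struct res_counter *c;

	for (c = counter; c != limit; c = c->parent) {
		assert(c->usage >= val);
		c->usage -= val;
	}
}

/* Move a charge between two counters; "common" is supplied by the caller. */
static int move_charge(struct res_counter *from, struct res_counter *to,
		       struct res_counter *common, unsigned long val)
{
	struct res_counter *fail_at;

	if (charge_until(to, common, val, &fail_at))
		return -1;
	uncharge_until(from, common, val);
	return 0;
}

/* Returns 0 if the scenario behaves as described in the note below. */
static int demo(void)
{
	struct res_counter root   = { .usage = 1, .cap = 10, .parent = NULL };
	struct res_counter common = { .usage = 1, .cap = 1,  .parent = &root };
	struct res_counter a      = { .usage = 1, .cap = 5,  .parent = &common };
	struct res_counter b      = { .usage = 0, .cap = 5,  .parent = &common };
	struct res_counter *fail_at;

	/* Full walk: the common ancestor would count the task twice. */
	if (charge_until(&b, NULL, 1, &fail_at) != -1 || fail_at != &common)
		return 1;
	/* Bounded walk: the common ancestor and the root are left alone. */
	if (move_charge(&a, &b, &common, 1) != 0)
		return 2;
	return !(a.usage == 0 && b.usage == 1 &&
		 common.usage == 1 && root.usage == 1);
}
```

In demo(), the common ancestor is already at its ceiling. Charging the
destination all the way to the root before uncharging the source fails at
that ancestor, while the bounded variant succeeds and leaves the ancestor's
usage unchanged.
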
diff --git a/Documentation/cgroups/resource_counter.txt b/Documentation/cgroups/resource_counter.txt
index 95b24d7..a2cd05b 100644
--- a/Documentation/cgroups/resource_counter.txt
+++ b/Documentation/cgroups/resource_counter.txt
@@ -83,7 +83,15 @@ to work with it.
res_counter->lock internally (it must be called with res_counter->lock
held).
- e. void res_counter_uncharge[_locked]
+ e. int res_counter_charge_until(struct res_counter *counter,
+ struct res_counter *limit, unsigned long val,
+ struct res_counter **limit_fail_at)
+
+ The same as res_counter_charge(), but the charge propagation to
+ the hierarchy stops at the limit given in the "limit" parameter.
+
+
+ f. void res_counter_uncharge[_locked]
(struct res_counter *rc, unsigned long val)
When a resource is released (freed) it should be de-accounted
@@ -92,6 +100,14 @@ to work with it.
The _locked routines imply that the res_counter->lock is taken.
+
+ g. void res_counter_uncharge_until(struct res_counter *counter,
+ struct res_counter *limit,
+ unsigned long val)
+
+ The same as res_counter_uncharge(), but the uncharge propagation to
+ the hierarchy stops at the limit given in the "limit" parameter.
+
2.1 Other accounting routines
There are more routines that may help you with common needs, like
diff --git a/include/linux/res_counter.h b/include/linux/res_counter.h
index 109d118..de4ba29 100644
--- a/include/linux/res_counter.h
+++ b/include/linux/res_counter.h
@@ -117,8 +117,16 @@ void res_counter_init(struct res_counter *counter, struct res_counter *parent);
int __must_check res_counter_charge_locked(struct res_counter *counter,
unsigned long val);
-int __must_check res_counter_charge(struct res_counter *counter,
- unsigned long val, struct res_counter **limit_fail_at);
+int __must_check res_counter_charge_until(struct res_counter *counter,
+ struct res_counter *limit,
+ unsigned long val,
+ struct res_counter **limit_fail_at);
+static inline int __must_check
+res_counter_charge(struct res_counter *counter, unsigned long val,
+ struct res_counter **limit_fail_at)
+{
+ return res_counter_charge_until(counter, NULL, val, limit_fail_at);
+}
/*
* uncharge - tell that some portion of the resource is released
@@ -131,7 +139,13 @@ int __must_check res_counter_charge(struct res_counter *counter,
*/
void res_counter_uncharge_locked(struct res_counter *counter, unsigned long val);
-void res_counter_uncharge(struct res_counter *counter, unsigned long val);
+void res_counter_uncharge_until(struct res_counter *counter,
+ struct res_counter *limit, unsigned long val);
+static inline void res_counter_uncharge(struct res_counter *counter,
+ unsigned long val)
+{
+ res_counter_uncharge_until(counter, NULL, val);
+}
/**
* res_counter_margin - calculate chargeable space of a counter
diff --git a/kernel/res_counter.c b/kernel/res_counter.c
index 3a93a82..40f15aa 100644
--- a/kernel/res_counter.c
+++ b/kernel/res_counter.c
@@ -35,8 +35,9 @@ int res_counter_charge_locked(struct res_counter *counter, unsigned long val)
return 0;
}
-int res_counter_charge(struct res_counter *counter, unsigned long val,
- struct res_counter **limit_fail_at)
+int res_counter_charge_until(struct res_counter *counter,
+ struct res_counter *limit, unsigned long val,
+ struct res_counter **limit_fail_at)
{
int ret;
unsigned long flags;
@@ -44,7 +45,7 @@ int res_counter_charge(struct res_counter *counter, unsigned long val,
*limit_fail_at = NULL;
local_irq_save(flags);
- for (c = counter; c != NULL; c = c->parent) {
+ for (c = counter; c != limit; c = c->parent) {
spin_lock(&c->lock);
ret = res_counter_charge_locked(c, val);
spin_unlock(&c->lock);
@@ -74,13 +75,15 @@ void res_counter_uncharge_locked(struct res_counter *counter, unsigned long val)
counter->usage -= val;
}
-void res_counter_uncharge(struct res_counter *counter, unsigned long val)
+void res_counter_uncharge_until(struct res_counter *counter,
+ struct res_counter *limit,
+ unsigned long val)
{
unsigned long flags;
struct res_counter *c;
local_irq_save(flags);
- for (c = counter; c != NULL; c = c->parent) {
+ for (c = counter; c != limit; c = c->parent) {
spin_lock(&c->lock);
res_counter_uncharge_locked(c, val);
spin_unlock(&c->lock);
--
1.7.5.4