* [RFC][PATCH 0/9] memcg soft limit v2 (new design)
@ 2009-04-03 8:08 KAMEZAWA Hiroyuki
2009-04-03 8:09 ` [RFC][PATCH 1/9] " KAMEZAWA Hiroyuki
` (11 more replies)
0 siblings, 12 replies; 22+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-04-03 8:08 UTC (permalink / raw)
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, balbir@linux.vnet.ibm.com,
kosaki.motohiro@jp.fujitsu.com
Hi,
Memory cgroup's soft limit is a feature for telling the global LRU
"please reclaim from this memcg at memory shortage".
This is v2. Fixed some troubles under hierarchy, and added soft limit
update hooks in proper places.
This patch is on top of
mmotm-Mar23 + memcg-cleanup-cache_charge.patch
+ vmscan-fix-it-to-take-care-of-nodemask.patch
So, not for wide use ;)
This patch tries to avoid using memcg's existing reclaim routine and
just gives "hints" to the global LRU. This patch is briefly tested and shows
good results for me. (But maybe not for you; please blame me.)
Major characteristics are:
- a memcg is inserted into the softlimit-queue at charge() if its usage
  exceeds the soft limit.
- the softlimit-queue is a queue with priority; priority is determined by
  the size of the excess usage.
- memcg's soft limit hook is called by shrink_xxx_list() to give hints.
- behavior is affected by vm.swappiness, and the LRU scan rate is determined
  by the global LRU's status.
In this v2:
- problems under the use_hierarchy=1 case are fixed.
- more hooks are added.
- code is cleaned up.
This shows good results on my private box under several workloads.
But in a special artificial case, when the victim memcg's Active/Inactive
ratio of ANON is very different from the global LRU's, the result seems
not very good.
i.e.
under the victim memcg, ACTIVE_ANON=100%, INACTIVE=0% (access memory in busy loop)
under global, ACTIVE_ANON=10%, INACTIVE=90% (almost all processes are sleeping.)
Memory can be swapped out from the global LRU, but not from the victim.
(If there are file caches in the victims, the file caches will be pushed out.)
But in this case, even if we successfully swap out anon pages under the victim
memcg, they will come back to memory soon and can show heavy thrashing.
While using soft limit, I felt this is a useful feature :)
But I'll keep this as an RFC for a while. I'll prepare Documentation by the next post.
Thanks,
-Kame
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
* [RFC][PATCH 1/9] memcg soft limit v2 (new design)
2009-04-03 8:08 [RFC][PATCH 0/9] memcg soft limit v2 (new design) KAMEZAWA Hiroyuki
@ 2009-04-03 8:09 ` KAMEZAWA Hiroyuki
2009-04-03 8:10 ` [RFC][PATCH 2/9] soft limit framework for memcg KAMEZAWA Hiroyuki
` (10 subsequent siblings)
11 siblings, 0 replies; 22+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-04-03 8:09 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
balbir@linux.vnet.ibm.com, kosaki.motohiro@jp.fujitsu.com
No changes from v1.
==
From: Balbir Singh <balbir@linux.vnet.ibm.com>
Changelog v2...v1
1. Add support for res_counter_check_soft_limit_locked. This is used
by the hierarchy code.
Add an interface to allow get/set of soft limits. Soft limits for the memory
plus swap controller (memsw) are currently not supported. Resource counters
have been enhanced to support soft limits, and a new type RES_SOFT_LIMIT has
been added. Unlike hard limits, soft limits can be set directly and do not
need any reclaim or checks before being set to a new value.
Kamezawa-San raised a question as to whether the soft limit should belong
to res_counter. Since all resources understand the basic concepts of
hard and soft limits, it is justified to add soft limits here. Soft limits
are a generic resource usage feature; even file system quotas support
soft limits.
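The soft-limit semantics the patch adds to res_counter can be sketched in plain userspace C (a hedged sketch: locking and the counter hierarchy are omitted, and the "_sketch" names are mine, not the kernel's):

```c
#include <assert.h>

/* Minimal userspace sketch of the res_counter soft-limit semantics. */
struct res_counter_sketch {
	unsigned long long usage;
	unsigned long long limit;      /* hard limit: charges fail beyond this */
	unsigned long long soft_limit; /* may be exceeded; only a reclaim hint */
};

/* Mirrors res_counter_soft_limit_excess(): 0 if usage <= soft_limit,
 * otherwise the difference. */
static unsigned long long
soft_limit_excess(const struct res_counter_sketch *cnt)
{
	if (cnt->usage <= cnt->soft_limit)
		return 0;
	return cnt->usage - cnt->soft_limit;
}

/* Unlike the hard limit, setting a soft limit needs no reclaim first. */
static int set_soft_limit(struct res_counter_sketch *cnt,
			  unsigned long long val)
{
	cnt->soft_limit = val;
	return 0;
}
```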
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
---
Index: softlimit-test2/include/linux/res_counter.h
===================================================================
--- softlimit-test2.orig/include/linux/res_counter.h
+++ softlimit-test2/include/linux/res_counter.h
@@ -35,6 +35,10 @@ struct res_counter {
*/
unsigned long long limit;
/*
+ * the limit that usage may exceed
+ */
+ unsigned long long soft_limit;
+ /*
* the number of unsuccessful attempts to consume the resource
*/
unsigned long long failcnt;
@@ -85,6 +89,7 @@ enum {
RES_MAX_USAGE,
RES_LIMIT,
RES_FAILCNT,
+ RES_SOFT_LIMIT,
};
/*
@@ -130,6 +135,36 @@ static inline bool res_counter_limit_che
return false;
}
+static inline bool res_counter_soft_limit_check_locked(struct res_counter *cnt)
+{
+ if (cnt->usage < cnt->soft_limit)
+ return true;
+
+ return false;
+}
+
+/**
+ * Get the difference between the usage and the soft limit
+ * @cnt: The counter
+ *
+ * Returns 0 if usage is less than or equal to soft limit
+ * The difference between usage and soft limit, otherwise.
+ */
+static inline unsigned long long
+res_counter_soft_limit_excess(struct res_counter *cnt)
+{
+ unsigned long long excess;
+ unsigned long flags;
+
+ spin_lock_irqsave(&cnt->lock, flags);
+ if (cnt->usage <= cnt->soft_limit)
+ excess = 0;
+ else
+ excess = cnt->usage - cnt->soft_limit;
+ spin_unlock_irqrestore(&cnt->lock, flags);
+ return excess;
+}
+
/*
* Helper function to detect if the cgroup is within it's limit or
* not. It's currently called from cgroup_rss_prepare()
@@ -145,6 +180,17 @@ static inline bool res_counter_check_und
return ret;
}
+static inline bool res_counter_check_under_soft_limit(struct res_counter *cnt)
+{
+ bool ret;
+ unsigned long flags;
+
+ spin_lock_irqsave(&cnt->lock, flags);
+ ret = res_counter_soft_limit_check_locked(cnt);
+ spin_unlock_irqrestore(&cnt->lock, flags);
+ return ret;
+}
+
static inline void res_counter_reset_max(struct res_counter *cnt)
{
unsigned long flags;
@@ -178,4 +224,16 @@ static inline int res_counter_set_limit(
return ret;
}
+static inline int
+res_counter_set_soft_limit(struct res_counter *cnt,
+ unsigned long long soft_limit)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&cnt->lock, flags);
+ cnt->soft_limit = soft_limit;
+ spin_unlock_irqrestore(&cnt->lock, flags);
+ return 0;
+}
+
#endif
Index: softlimit-test2/kernel/res_counter.c
===================================================================
--- softlimit-test2.orig/kernel/res_counter.c
+++ softlimit-test2/kernel/res_counter.c
@@ -19,6 +19,7 @@ void res_counter_init(struct res_counter
{
spin_lock_init(&counter->lock);
counter->limit = (unsigned long long)LLONG_MAX;
+ counter->soft_limit = (unsigned long long)LLONG_MAX;
counter->parent = parent;
}
@@ -101,6 +102,8 @@ res_counter_member(struct res_counter *c
return &counter->limit;
case RES_FAILCNT:
return &counter->failcnt;
+ case RES_SOFT_LIMIT:
+ return &counter->soft_limit;
};
BUG();
Index: softlimit-test2/mm/memcontrol.c
===================================================================
--- softlimit-test2.orig/mm/memcontrol.c
+++ softlimit-test2/mm/memcontrol.c
@@ -1988,6 +1988,20 @@ static int mem_cgroup_write(struct cgrou
else
ret = mem_cgroup_resize_memsw_limit(memcg, val);
break;
+ case RES_SOFT_LIMIT:
+ ret = res_counter_memparse_write_strategy(buffer, &val);
+ if (ret)
+ break;
+ /*
+ * For memsw, soft limits are hard to implement in terms
+ * of semantics, for now, we support soft limits for
+ * control without swap
+ */
+ if (type == _MEM)
+ ret = res_counter_set_soft_limit(&memcg->res, val);
+ else
+ ret = -EINVAL;
+ break;
default:
ret = -EINVAL; /* should be BUG() ? */
break;
@@ -2237,6 +2251,12 @@ static struct cftype mem_cgroup_files[]
.read_u64 = mem_cgroup_read,
},
{
+ .name = "soft_limit_in_bytes",
+ .private = MEMFILE_PRIVATE(_MEM, RES_SOFT_LIMIT),
+ .write_string = mem_cgroup_write,
+ .read_u64 = mem_cgroup_read,
+ },
+ {
.name = "failcnt",
.private = MEMFILE_PRIVATE(_MEM, RES_FAILCNT),
.trigger = mem_cgroup_reset,
--
* [RFC][PATCH 2/9] soft limit framework for memcg.
2009-04-03 8:08 [RFC][PATCH 0/9] memcg soft limit v2 (new design) KAMEZAWA Hiroyuki
2009-04-03 8:09 ` [RFC][PATCH 1/9] " KAMEZAWA Hiroyuki
@ 2009-04-03 8:10 ` KAMEZAWA Hiroyuki
2009-04-03 8:12 ` [RFC][PATCH 3/9] soft limit update filter KAMEZAWA Hiroyuki
` (9 subsequent siblings)
11 siblings, 0 replies; 22+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-04-03 8:10 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
balbir@linux.vnet.ibm.com, kosaki.motohiro@jp.fujitsu.com
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Add minimal modifications for soft limit to res_counter_charge() and memcontrol.c.
Based on Balbir Singh <balbir@linux.vnet.ibm.com> 's work, but most of the
features are removed. (dropped or moved to later patches.)
This is for building a frame to implement the soft limit handler in memcg.
- Checks soft limit status at every charge.
- Adds mem_cgroup_soft_limit_check() as a function to detect whether we
  need a check now or not.
- mem_cgroup_update_soft_limit() is a function that updates the internal
  status of memcg's soft limit controller.
- As an experiment, this has no hooks in the uncharge path.
Changelog: v1 -> v2
- removed "update" from mem_cgroup_free() (revisit in later patch.)
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
include/linux/res_counter.h | 3 ++-
kernel/res_counter.c | 12 +++++++++++-
mm/memcontrol.c | 19 +++++++++++++++++--
3 files changed, 30 insertions(+), 4 deletions(-)
Index: softlimit-test2/include/linux/res_counter.h
===================================================================
--- softlimit-test2.orig/include/linux/res_counter.h
+++ softlimit-test2/include/linux/res_counter.h
@@ -112,7 +112,8 @@ void res_counter_init(struct res_counter
int __must_check res_counter_charge_locked(struct res_counter *counter,
unsigned long val);
int __must_check res_counter_charge(struct res_counter *counter,
- unsigned long val, struct res_counter **limit_fail_at);
+ unsigned long val, struct res_counter **limit_fail_at,
+ bool *soft_limit_failure);
/*
* uncharge - tell that some portion of the resource is released
Index: softlimit-test2/kernel/res_counter.c
===================================================================
--- softlimit-test2.orig/kernel/res_counter.c
+++ softlimit-test2/kernel/res_counter.c
@@ -37,9 +37,11 @@ int res_counter_charge_locked(struct res
}
int res_counter_charge(struct res_counter *counter, unsigned long val,
- struct res_counter **limit_fail_at)
+ struct res_counter **limit_fail_at,
+ bool *soft_limit_failure)
{
int ret;
+ int soft_cnt = 0;
unsigned long flags;
struct res_counter *c, *u;
@@ -48,6 +50,8 @@ int res_counter_charge(struct res_counte
for (c = counter; c != NULL; c = c->parent) {
spin_lock(&c->lock);
ret = res_counter_charge_locked(c, val);
+ if (!res_counter_soft_limit_check_locked(c))
+ soft_cnt += 1;
spin_unlock(&c->lock);
if (ret < 0) {
*limit_fail_at = c;
@@ -55,6 +59,12 @@ int res_counter_charge(struct res_counte
}
}
ret = 0;
+ if (soft_limit_failure) {
+ if (!soft_cnt)
+ *soft_limit_failure = false;
+ else
+ *soft_limit_failure = true;
+ }
goto done;
undo:
for (u = counter; u != c; u = u->parent) {
Index: softlimit-test2/mm/memcontrol.c
===================================================================
--- softlimit-test2.orig/mm/memcontrol.c
+++ softlimit-test2/mm/memcontrol.c
@@ -897,6 +897,15 @@ static void record_last_oom(struct mem_c
mem_cgroup_walk_tree(mem, NULL, record_last_oom_cb);
}
+static bool mem_cgroup_soft_limit_check(struct mem_cgroup *mem)
+{
+ return false;
+}
+
+static void mem_cgroup_update_soft_limit(struct mem_cgroup *mem)
+{
+ return;
+}
/*
* Unlike exported interface, "oom" parameter is added. if oom==true,
@@ -909,6 +918,7 @@ static int __mem_cgroup_try_charge(struc
struct mem_cgroup *mem, *mem_over_limit;
int nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
struct res_counter *fail_res;
+ bool soft_fail;
if (unlikely(test_thread_flag(TIF_MEMDIE))) {
/* Don't account this! */
@@ -938,12 +948,13 @@ static int __mem_cgroup_try_charge(struc
int ret;
bool noswap = false;
- ret = res_counter_charge(&mem->res, PAGE_SIZE, &fail_res);
+ ret = res_counter_charge(&mem->res, PAGE_SIZE, &fail_res,
+ &soft_fail);
if (likely(!ret)) {
if (!do_swap_account)
break;
ret = res_counter_charge(&mem->memsw, PAGE_SIZE,
- &fail_res);
+ &fail_res, NULL);
if (likely(!ret))
break;
/* mem+swap counter fails */
@@ -985,6 +996,10 @@ static int __mem_cgroup_try_charge(struc
goto nomem;
}
}
+
+ if (soft_fail && mem_cgroup_soft_limit_check(mem))
+ mem_cgroup_update_soft_limit(mem);
+
return 0;
nomem:
css_put(&mem->css);
--
* [RFC][PATCH 3/9] soft limit update filter
2009-04-03 8:08 [RFC][PATCH 0/9] memcg soft limit v2 (new design) KAMEZAWA Hiroyuki
2009-04-03 8:09 ` [RFC][PATCH 1/9] " KAMEZAWA Hiroyuki
2009-04-03 8:10 ` [RFC][PATCH 2/9] soft limit framework for memcg KAMEZAWA Hiroyuki
@ 2009-04-03 8:12 ` KAMEZAWA Hiroyuki
2009-04-06 9:43 ` Balbir Singh
2009-04-03 8:12 ` [RFC][PATCH 4/9] soft limit queue and priority KAMEZAWA Hiroyuki
` (8 subsequent siblings)
11 siblings, 1 reply; 22+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-04-03 8:12 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
balbir@linux.vnet.ibm.com, kosaki.motohiro@jp.fujitsu.com
No changes from v1.
==
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Checking/updating soft limit information at every charge is overkill, so
we need some filter.
This patch counts events in the memcg and, if events > threshold, tries
to update the memcg's soft limit status and resets the event counter to 0.
The event counter is maintained in the per-cpu area that is already in use,
so there is no significant overhead (extra cache misses etc.) in theory.
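The filter described above can be sketched in userspace C (the struct name is mine; the patch keeps one such counter per cpu inside mem_cgroup_stat_cpu):

```c
#include <assert.h>
#include <stdbool.h>

#define SOFTLIMIT_EVENTS_THRESH 1024	/* as in the patch */

/* Userspace sketch of the event filter: every page-in/page-out bumps a
 * counter; only when it crosses the threshold does soft_limit_check()
 * report "update now" and reset it. */
struct event_counter {
	long long events;	/* stands in for MEM_CGROUP_STAT_EVENTS */
};

static void count_event(struct event_counter *c)
{
	c->events += 1;
}

static bool soft_limit_check(struct event_counter *c)
{
	if (c->events > SOFTLIMIT_EVENTS_THRESH) {
		c->events = 0;	/* next check is another ~1024 events away */
		return true;
	}
	return false;
}
```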
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
Index: mmotm-2.6.29-Mar23/mm/memcontrol.c
===================================================================
--- mmotm-2.6.29-Mar23.orig/mm/memcontrol.c
+++ mmotm-2.6.29-Mar23/mm/memcontrol.c
@@ -66,6 +66,7 @@ enum mem_cgroup_stat_index {
MEM_CGROUP_STAT_PGPGIN_COUNT, /* # of pages paged in */
MEM_CGROUP_STAT_PGPGOUT_COUNT, /* # of pages paged out */
+ MEM_CGROUP_STAT_EVENTS, /* sum of page-in/page-out for internal use */
MEM_CGROUP_STAT_NSTATS,
};
@@ -105,6 +106,22 @@ static s64 mem_cgroup_local_usage(struct
return ret;
}
+/* For internal use of per-cpu event counting. */
+
+static inline void
+__mem_cgroup_stat_reset_safe(struct mem_cgroup_stat_cpu *stat,
+ enum mem_cgroup_stat_index idx)
+{
+ stat->count[idx] = 0;
+}
+
+static inline s64
+__mem_cgroup_stat_read_local(struct mem_cgroup_stat_cpu *stat,
+ enum mem_cgroup_stat_index idx)
+{
+ return stat->count[idx];
+}
+
/*
* per-zone information in memory controller.
*/
@@ -235,6 +252,8 @@ static void mem_cgroup_charge_statistics
else
__mem_cgroup_stat_add_safe(cpustat,
MEM_CGROUP_STAT_PGPGOUT_COUNT, 1);
+ __mem_cgroup_stat_add_safe(cpustat, MEM_CGROUP_STAT_EVENTS, 1);
+
put_cpu();
}
@@ -897,9 +916,26 @@ static void record_last_oom(struct mem_c
mem_cgroup_walk_tree(mem, NULL, record_last_oom_cb);
}
+#define SOFTLIMIT_EVENTS_THRESH (1024) /* 1024 times of page-in/out */
+/*
+ * Returns true if sum of page-in/page-out events since last check is
+ * over SOFTLIMIT_EVENTS_THRESH. (counter is per-cpu.)
+ */
static bool mem_cgroup_soft_limit_check(struct mem_cgroup *mem)
{
- return false;
+ bool ret = false;
+ int cpu = get_cpu();
+ s64 val;
+ struct mem_cgroup_stat_cpu *cpustat;
+
+ cpustat = &mem->stat.cpustat[cpu];
+ val = __mem_cgroup_stat_read_local(cpustat, MEM_CGROUP_STAT_EVENTS);
+ if (unlikely(val > SOFTLIMIT_EVENTS_THRESH)) {
+ __mem_cgroup_stat_reset_safe(cpustat, MEM_CGROUP_STAT_EVENTS);
+ ret = true;
+ }
+ put_cpu();
+ return ret;
}
static void mem_cgroup_update_soft_limit(struct mem_cgroup *mem)
--
* [RFC][PATCH 4/9] soft limit queue and priority
2009-04-03 8:08 [RFC][PATCH 0/9] memcg soft limit v2 (new design) KAMEZAWA Hiroyuki
` (2 preceding siblings ...)
2009-04-03 8:12 ` [RFC][PATCH 3/9] soft limit update filter KAMEZAWA Hiroyuki
@ 2009-04-03 8:12 ` KAMEZAWA Hiroyuki
2009-04-06 11:05 ` Balbir Singh
2009-04-06 18:42 ` Balbir Singh
2009-04-03 8:13 ` [RFC][PATCH 5/9] add more hooks and check in lazy manner KAMEZAWA Hiroyuki
` (7 subsequent siblings)
11 siblings, 2 replies; 22+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-04-03 8:12 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
balbir@linux.vnet.ibm.com, kosaki.motohiro@jp.fujitsu.com
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Softlimit queue for memcg.
This implements an array of queues to list memcgs; the array index is
determined by the amount of memory usage exceeding the soft limit.
While Balbir's version uses an RB-tree and my old one used a per-zone queue
(with round-robin), this is a mixture of the two.
(I'd like to use rotation of the queue in later patches.)
Priority is determined as follows.
Assume unit = total pages/1024. (the code uses a different value)
if excess is...
< unit, priority = 0,
< unit*2, priority = 1,
< unit*2^2, priority = 2,
...
< unit*2^9, priority = 9,
< unit*2^10, priority = 10. (> 50% of total mem)
This patch includes only the queue management part, not the selection
logic from the queue. Some trick will be used for selecting victims at
soft limit reclaim in an efficient way.
This also equips 2 queues, for anon and file. Insert/delete on both lists
is done at once, but scans will be independent. (These 2 queues are used later.)
The major difference from Balbir's version, other than the RB-tree, is
behavior under hierarchy. This one adds all children to the queue by checking
hierarchical priority. This helps the per-zone usage check in the
victim-selection logic.
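The priority table above can be sketched in userspace C. TOTAL_PAGES here is a made-up stand-in for totalram_pages (pretend 4GB of 4k pages), and fls() ("find last set bit") is open-coded instead of using the kernel's:

```c
#include <assert.h>

#define TOTAL_PAGES	(1024UL * 1024UL)	/* assumed system size */
#define SLQ_PRIO_FACTOR	1024UL			/* 2^10, as in the patch */

static int fls_ulong(unsigned long x)
{
	int r = 0;

	while (x) {
		x >>= 1;
		r++;
	}
	return r;
}

/* excess is (usage - soft limit) in pages; larger excess => higher prio */
static int calc_soft_limit_prio(unsigned long excess)
{
	unsigned long unit = TOTAL_PAGES / SLQ_PRIO_FACTOR;

	return fls_ulong(excess / unit);	/* 0..10 while excess < total */
}
```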
Changelog: v1->v2
- fixed comments.
- changed the base size to an exponent.
- some micro optimizations to reduce code size.
- considering memory hotplug, recording a value calculated from
  totalram_pages at boot and using it later is bad manner. Fixed it.
- removed soft_limit_lock (spinlock)
- added a soft_limit_update counter to avoid multiple updates at once.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
mm/memcontrol.c | 118 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 117 insertions(+), 1 deletion(-)
Index: softlimit-test2/mm/memcontrol.c
===================================================================
--- softlimit-test2.orig/mm/memcontrol.c
+++ softlimit-test2/mm/memcontrol.c
@@ -192,7 +192,14 @@ struct mem_cgroup {
atomic_t refcnt;
unsigned int swappiness;
-
+ /*
+ * For soft limit.
+ */
+ int soft_limit_priority;
+ struct list_head soft_limit_list[2];
+#define SL_ANON (0)
+#define SL_FILE (1)
+ atomic_t soft_limit_update;
/*
* statistics. This must be placed at the end of memcg.
*/
@@ -938,11 +945,115 @@ static bool mem_cgroup_soft_limit_check(
return ret;
}
+/*
+ * Assume "base_amount", and excess = usage - soft limit.
+ *
+ * 0...... if excess < base_amount
+ * 1...... if excess < base_amount * 2
+ * 2...... if excess < base_amount * 2^2
+ * 3.......if excess < base_amount * 2^3
+ * ....
+ * 9.......if excess < base_amount * 2^9
+ * 10 .....if excess < base_amount * 2^10
+ *
+ * base_amount is determined from total pages in the system.
+ */
+
+#define SLQ_MAXPRIO (11)
+static struct {
+ spinlock_t lock;
+ struct list_head queue[SLQ_MAXPRIO][2]; /* 0:anon 1:file */
+} softlimitq;
+
+#define SLQ_PRIO_FACTOR (1024) /* 2^10 */
+
+static int __calc_soft_limit_prio(unsigned long excess)
+{
+ unsigned long factor = totalram_pages / SLQ_PRIO_FACTOR;
+
+ return fls(excess/factor);
+}
+
+static int mem_cgroup_soft_limit_prio(struct mem_cgroup *mem)
+{
+ unsigned long excess, max_excess = 0;
+ struct res_counter *c = &mem->res;
+
+ do {
+ excess = res_counter_soft_limit_excess(c) >> PAGE_SHIFT;
+ if (max_excess < excess)
+ max_excess = excess;
+ c = c->parent;
+ } while (c);
+
+ return __calc_soft_limit_prio(max_excess);
+}
+
+static void __mem_cgroup_requeue(struct mem_cgroup *mem, int prio)
+{
+ /* enqueue to softlimit queue */
+ int i;
+
+ spin_lock(&softlimitq.lock);
+ if (prio != mem->soft_limit_priority) {
+ mem->soft_limit_priority = prio;
+ for (i = 0; i < 2; i++) {
+ list_del_init(&mem->soft_limit_list[i]);
+ list_add_tail(&mem->soft_limit_list[i],
+ &softlimitq.queue[prio][i]);
+ }
+ }
+ spin_unlock(&softlimitq.lock);
+}
+
+static void __mem_cgroup_dequeue(struct mem_cgroup *mem)
+{
+ int i;
+
+ spin_lock(&softlimitq.lock);
+ for (i = 0; i < 2; i++)
+ list_del_init(&mem->soft_limit_list[i]);
+ spin_unlock(&softlimitq.lock);
+}
+
+static int
+__mem_cgroup_update_soft_limit_cb(struct mem_cgroup *mem, void *data)
+{
+ int priority;
+ /* If someone updates, we don't need more */
+ priority = mem_cgroup_soft_limit_prio(mem);
+
+ if (priority != mem->soft_limit_priority)
+ __mem_cgroup_requeue(mem, priority);
+ return 0;
+}
+
static void mem_cgroup_update_soft_limit(struct mem_cgroup *mem)
{
+ int priority;
+
+ /* check status change */
+ priority = mem_cgroup_soft_limit_prio(mem);
+ if (priority != mem->soft_limit_priority &&
+ atomic_inc_return(&mem->soft_limit_update) > 1) {
+ mem_cgroup_walk_tree(mem, NULL,
+ __mem_cgroup_update_soft_limit_cb);
+ atomic_set(&mem->soft_limit_update, 0);
+ }
return;
}
+static void softlimitq_init(void)
+{
+ int i;
+
+ spin_lock_init(&softlimitq.lock);
+ for (i = 0; i < SLQ_MAXPRIO; i++) {
+ INIT_LIST_HEAD(&softlimitq.queue[i][SL_ANON]);
+ INIT_LIST_HEAD(&softlimitq.queue[i][SL_FILE]);
+ }
+}
+
/*
* Unlike exported interface, "oom" parameter is added. if oom==true,
* oom-killer can be invoked.
@@ -2512,6 +2623,7 @@ mem_cgroup_create(struct cgroup_subsys *
if (cont->parent == NULL) {
enable_swap_cgroup();
parent = NULL;
+ softlimitq_init();
} else {
parent = mem_cgroup_from_cont(cont->parent);
mem->use_hierarchy = parent->use_hierarchy;
@@ -2532,6 +2644,9 @@ mem_cgroup_create(struct cgroup_subsys *
res_counter_init(&mem->memsw, NULL);
}
mem->last_scanned_child = 0;
+ mem->soft_limit_priority = 0;
+ INIT_LIST_HEAD(&mem->soft_limit_list[SL_ANON]);
+ INIT_LIST_HEAD(&mem->soft_limit_list[SL_FILE]);
spin_lock_init(&mem->reclaim_param_lock);
if (parent)
@@ -2556,6 +2671,7 @@ static void mem_cgroup_destroy(struct cg
{
struct mem_cgroup *mem = mem_cgroup_from_cont(cont);
+ __mem_cgroup_dequeue(mem);
mem_cgroup_put(mem);
}
--
* [RFC][PATCH 5/9] add more hooks and check in lazy manner
2009-04-03 8:08 [RFC][PATCH 0/9] memcg soft limit v2 (new design) KAMEZAWA Hiroyuki
` (3 preceding siblings ...)
2009-04-03 8:12 ` [RFC][PATCH 4/9] soft limit queue and priority KAMEZAWA Hiroyuki
@ 2009-04-03 8:13 ` KAMEZAWA Hiroyuki
2009-04-03 8:14 ` [RFC][PATCH 6/9] active inactive ratio for private KAMEZAWA Hiroyuki
` (6 subsequent siblings)
11 siblings, 0 replies; 22+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-04-03 8:13 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
balbir@linux.vnet.ibm.com, kosaki.motohiro@jp.fujitsu.com
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Adds 2 more soft limit update hooks:
- uncharge
- write to the memory.soft_limit_in_bytes file.
And fixes issues under hierarchy. (This is the most complicated part...)
Because uncharge() can be called under a very busy spin_lock, all checks
should be done lazily. We can apply this lazy work to the charge() path as
well and make use of it.
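The "queue the update only once" guard used in mem_cgroup_update_soft_limit_lazy() can be sketched in userspace C: atomic_inc_return() > 1 means a work item is already pending, so we skip scheduling another. A plain counter stands in for the kernel atomic here, so no real concurrency is modeled, and all names are illustrative:

```c
#include <assert.h>
#include <stdbool.h>

struct lazy_updater {
	int pending;	/* stands in for atomic_t soft_limit_update */
	int scheduled;	/* how many work items were actually queued */
};

static bool try_schedule_update(struct lazy_updater *u)
{
	if (++u->pending > 1)
		return false;	/* work already queued; nothing to do */
	u->scheduled++;		/* stands in for schedule_work() */
	return true;
}

/* The work handler walks the tree, then re-arms the guard. */
static void work_done(struct lazy_updater *u)
{
	u->pending = 0;		/* atomic_set(..., 0) in the patch */
}
```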
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
mm/memcontrol.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++----------
1 file changed, 55 insertions(+), 11 deletions(-)
Index: softlimit-test2/mm/memcontrol.c
===================================================================
--- softlimit-test2.orig/mm/memcontrol.c
+++ softlimit-test2/mm/memcontrol.c
@@ -200,6 +200,8 @@ struct mem_cgroup {
#define SL_ANON (0)
#define SL_FILE (1)
atomic_t soft_limit_update;
+ struct work_struct soft_limit_work;
+
/*
* statistics. This must be placed at the end of memcg.
*/
@@ -989,6 +991,23 @@ static int mem_cgroup_soft_limit_prio(st
return __calc_soft_limit_prio(max_excess);
}
+static struct mem_cgroup *
+mem_cgroup_soft_limit_need_check(struct mem_cgroup *mem)
+{
+ struct res_counter *c = &mem->res;
+ unsigned long excess, prio;
+
+ do {
+ excess = res_counter_soft_limit_excess(c) >> PAGE_SHIFT;
+ prio = __calc_soft_limit_prio(excess);
+ mem = container_of(c, struct mem_cgroup, res);
+ if (mem->soft_limit_priority != prio)
+ return mem;
+ c = c->parent;
+ } while (c);
+ return NULL;
+}
+
static void __mem_cgroup_requeue(struct mem_cgroup *mem, int prio)
{
/* enqueue to softlimit queue */
@@ -1028,18 +1047,36 @@ __mem_cgroup_update_soft_limit_cb(struct
return 0;
}
-static void mem_cgroup_update_soft_limit(struct mem_cgroup *mem)
+static void mem_cgroup_update_soft_limit_work(struct work_struct *work)
{
- int priority;
+ struct mem_cgroup *mem;
+
+ mem = container_of(work, struct mem_cgroup, soft_limit_work);
+
+ mem_cgroup_walk_tree(mem, NULL, __mem_cgroup_update_soft_limit_cb);
+ atomic_set(&mem->soft_limit_update, 0);
+ css_put(&mem->css);
+}
+
+static void mem_cgroup_update_soft_limit_lazy(struct mem_cgroup *mem)
+{
+ int ret, priority;
+ struct mem_cgroup * root;
+
+ /*
+ * check status change under hierarchy.
+ */
+ root = mem_cgroup_soft_limit_need_check(mem);
+ if (!root)
+ return;
+
+ if (atomic_inc_return(&root->soft_limit_update) > 1)
+ return;
+ css_get(&root->css);
+ ret = schedule_work(&root->soft_limit_work);
+ if (!ret)
+ css_put(&root->css);
- /* check status change */
- priority = mem_cgroup_soft_limit_prio(mem);
- if (priority != mem->soft_limit_priority &&
- atomic_inc_return(&mem->soft_limit_update) > 1) {
- mem_cgroup_walk_tree(mem, NULL,
- __mem_cgroup_update_soft_limit_cb);
- atomic_set(&mem->soft_limit_update, 0);
- }
return;
}
@@ -1145,7 +1182,7 @@ static int __mem_cgroup_try_charge(struc
}
if (soft_fail && mem_cgroup_soft_limit_check(mem))
- mem_cgroup_update_soft_limit(mem);
+ mem_cgroup_update_soft_limit_lazy(mem);
return 0;
nomem:
@@ -1625,6 +1662,9 @@ __mem_cgroup_uncharge_common(struct page
mz = page_cgroup_zoneinfo(pc);
unlock_page_cgroup(pc);
+ if (mem->soft_limit_priority && mem_cgroup_soft_limit_check(mem))
+ mem_cgroup_update_soft_limit_lazy(mem);
+
/* at swapout, this memcg will be accessed to record to swap */
if (ctype != MEM_CGROUP_CHARGE_TYPE_SWAPOUT)
css_put(&mem->css);
@@ -2163,6 +2203,9 @@ static int mem_cgroup_write(struct cgrou
ret = res_counter_set_soft_limit(&memcg->res, val);
else
ret = -EINVAL;
+ if (!ret)
+ mem_cgroup_update_soft_limit_lazy(memcg);
+
break;
default:
ret = -EINVAL; /* should be BUG() ? */
@@ -2648,6 +2691,7 @@ mem_cgroup_create(struct cgroup_subsys *
INIT_LIST_HEAD(&mem->soft_limit_list[SL_ANON]);
INIT_LIST_HEAD(&mem->soft_limit_list[SL_FILE]);
spin_lock_init(&mem->reclaim_param_lock);
+ INIT_WORK(&mem->soft_limit_work, mem_cgroup_update_soft_limit_work);
if (parent)
mem->swappiness = get_swappiness(parent);
--
* [RFC][PATCH 6/9] active inactive ratio for private
2009-04-03 8:08 [RFC][PATCH 0/9] memcg soft limit v2 (new design) KAMEZAWA Hiroyuki
` (4 preceding siblings ...)
2009-04-03 8:13 ` [RFC][PATCH 5/9] add more hooks and check in lazy manner KAMEZAWA Hiroyuki
@ 2009-04-03 8:14 ` KAMEZAWA Hiroyuki
2009-04-03 8:15 ` [RFC][PATCH 7/9] victim selection logic KAMEZAWA Hiroyuki
` (5 subsequent siblings)
11 siblings, 0 replies; 22+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-04-03 8:14 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
balbir@linux.vnet.ibm.com, kosaki.motohiro@jp.fujitsu.com
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
The current memcg active/inactive ratio calculation ignores zones.
(It was designed for reducing a memcg's usage, not for recovering
from memory shortage.)
But soft limit reclaim should take care of zones, later.
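For reference, a userspace sketch of the inactive_ratio formula of this kernel era (ratio = int_sqrt(10 * gigabytes), minimum 1); the per-zone change in this patch only alters which counters are summed, not the formula itself. 4k pages (PAGE_SHIFT == 12) are assumed:

```c
#include <assert.h>

static unsigned long int_sqrt_sketch(unsigned long x)
{
	unsigned long r = 0;

	while ((r + 1) * (r + 1) <= x)
		r++;
	return r;
}

static int calc_inactive_ratio_sketch(unsigned long inactive,
				      unsigned long active)
{
	const int page_shift = 12;	/* assumed 4k pages */
	unsigned long gb = (inactive + active) >> (30 - page_shift);
	unsigned long ratio = gb ? int_sqrt_sketch(10 * gb) : 0;

	return ratio ? ratio : 1;
}
```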
Changelog v1->v2:
- fixed buggy argument in mem_cgroup_inactive_anon_is_low
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
include/linux/memcontrol.h | 4 ++--
mm/memcontrol.c | 26 ++++++++++++++++++++------
mm/vmscan.c | 2 +-
3 files changed, 23 insertions(+), 9 deletions(-)
Index: softlimit-test2/mm/memcontrol.c
===================================================================
--- softlimit-test2.orig/mm/memcontrol.c
+++ softlimit-test2/mm/memcontrol.c
@@ -564,15 +564,28 @@ void mem_cgroup_record_reclaim_priority(
spin_unlock(&mem->reclaim_param_lock);
}
-static int calc_inactive_ratio(struct mem_cgroup *memcg, unsigned long *present_pages)
+static int calc_inactive_ratio(struct mem_cgroup *memcg,
+ unsigned long *present_pages,
+ struct zone *z)
{
unsigned long active;
unsigned long inactive;
unsigned long gb;
unsigned long inactive_ratio;
- inactive = mem_cgroup_get_local_zonestat(memcg, LRU_INACTIVE_ANON);
- active = mem_cgroup_get_local_zonestat(memcg, LRU_ACTIVE_ANON);
+ if (!z) {
+ inactive = mem_cgroup_get_local_zonestat(memcg,
+ LRU_INACTIVE_ANON);
+ active = mem_cgroup_get_local_zonestat(memcg, LRU_ACTIVE_ANON);
+ } else {
+ int nid = z->zone_pgdat->node_id;
+ int zid = zone_idx(z);
+ struct mem_cgroup_per_zone *mz;
+
+ mz = mem_cgroup_zoneinfo(memcg, nid, zid);
+ inactive = MEM_CGROUP_ZSTAT(mz, LRU_INACTIVE_ANON);
+ active = MEM_CGROUP_ZSTAT(mz, LRU_ACTIVE_ANON);
+ }
gb = (inactive + active) >> (30 - PAGE_SHIFT);
if (gb)
@@ -588,14 +601,14 @@ static int calc_inactive_ratio(struct me
return inactive_ratio;
}
-int mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg)
+int mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg, struct zone *z)
{
unsigned long active;
unsigned long inactive;
unsigned long present_pages[2];
unsigned long inactive_ratio;
- inactive_ratio = calc_inactive_ratio(memcg, present_pages);
+ inactive_ratio = calc_inactive_ratio(memcg, present_pages, z);
inactive = present_pages[0];
active = present_pages[1];
@@ -2366,7 +2379,8 @@ static int mem_control_stat_show(struct
#ifdef CONFIG_DEBUG_VM
- cb->fill(cb, "inactive_ratio", calc_inactive_ratio(mem_cont, NULL));
+ cb->fill(cb, "inactive_ratio",
+ calc_inactive_ratio(mem_cont, NULL, NULL));
{
int nid, zid;
Index: softlimit-test2/include/linux/memcontrol.h
===================================================================
--- softlimit-test2.orig/include/linux/memcontrol.h
+++ softlimit-test2/include/linux/memcontrol.h
@@ -93,7 +93,7 @@ extern void mem_cgroup_note_reclaim_prio
int priority);
extern void mem_cgroup_record_reclaim_priority(struct mem_cgroup *mem,
int priority);
-int mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg);
+int mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg, struct zone *z);
unsigned long mem_cgroup_zone_nr_pages(struct mem_cgroup *memcg,
struct zone *zone,
enum lru_list lru);
@@ -234,7 +234,7 @@ static inline bool mem_cgroup_oom_called
}
static inline int
-mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg)
+mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg, struct zone *z)
{
return 1;
}
Index: softlimit-test2/mm/vmscan.c
===================================================================
--- softlimit-test2.orig/mm/vmscan.c
+++ softlimit-test2/mm/vmscan.c
@@ -1347,7 +1347,7 @@ static int inactive_anon_is_low(struct z
if (scanning_global_lru(sc))
low = inactive_anon_is_low_global(zone);
else
- low = mem_cgroup_inactive_anon_is_low(sc->mem_cgroup);
+ low = mem_cgroup_inactive_anon_is_low(sc->mem_cgroup, NULL);
return low;
}
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 22+ messages in thread
* [RFC][PATCH 7/9] vicitim selection logic
2009-04-03 8:08 [RFC][PATCH 0/9] memcg soft limit v2 (new design) KAMEZAWA Hiroyuki
` (5 preceding siblings ...)
2009-04-03 8:14 ` [RFC][PATCH 6/9] active inactive ratio for private KAMEZAWA Hiroyuki
@ 2009-04-03 8:15 ` KAMEZAWA Hiroyuki
2009-04-03 8:17 ` [RFC][PATCH 8/9] lru reordering KAMEZAWA Hiroyuki
` (4 subsequent siblings)
11 siblings, 0 replies; 22+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-04-03 8:15 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
balbir@linux.vnet.ibm.com, kosaki.motohiro@jp.fujitsu.com
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Soft Limit victim selection/cache logic.
This patch implements the victim selection logic and its caching method.
A victim memcg is selected in the following way, assuming the zone under
shrinking is specified. The selected memcg
- has the highest priority (largest excess usage), and
- has memory on that zone.
When a memcg is selected, it is rotated in the queue and cached per cpu
with tickets. The cache is refreshed when
- the given ticket is exhausted,
- a long time has passed since the last update, or
- the cached memcg has no pages on the requested zone.
Even when no proper memcg is found by the victim selection logic,
some tickets are assigned to the NULL victim.
Like softlimitq, this cache keeps two entries, one for anon and one for file.
Change Log v1 -> v2:
- clean up.
- cpu hotplug support.
- change "bonus" calclation of victime.
- try to make the code slim.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
mm/memcontrol.c | 198 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 198 insertions(+)
Index: softlimit-test2/mm/memcontrol.c
===================================================================
--- softlimit-test2.orig/mm/memcontrol.c
+++ softlimit-test2/mm/memcontrol.c
@@ -37,6 +37,8 @@
#include <linux/vmalloc.h>
#include <linux/mm_inline.h>
#include <linux/page_cgroup.h>
+#include <linux/cpu.h>
+
#include "internal.h"
#include <asm/uaccess.h>
@@ -1093,6 +1095,169 @@ static void mem_cgroup_update_soft_limit
return;
}
+/* softlimit victim selection logic */
+
+/* Returns the amount of evictable memory in memcg */
+static unsigned long
+mem_cgroup_usage(struct mem_cgroup *mem, struct zone *zone, int file)
+{
+ struct mem_cgroup_per_zone *mz;
+ int nid = zone->zone_pgdat->node_id;
+ int zid = zone_idx(zone);
+ unsigned long usage = 0;
+ enum lru_list l = LRU_BASE;
+
+ mz = mem_cgroup_zoneinfo(mem, nid, zid);
+ if (file)
+ l += LRU_FILE;
+ usage = MEM_CGROUP_ZSTAT(mz, l) + MEM_CGROUP_ZSTAT(mz, l + LRU_ACTIVE);
+
+ return usage;
+}
+
+struct soft_limit_cache {
+ /* If ticket is 0, refresh and refill the cache.*/
+ int ticket[2];
+ /* next update time for ticket(jiffies)*/
+ unsigned long next_update;
+ /* victim memcg */
+ struct mem_cgroup *mem[2];
+};
+
+/*
+ * Typically, 32 pages are reclaimed per call, so 4*32=128 pages per base
+ * ticket. 4 * prio scans are added as a bonus for high priority.
+ */
+#define SLCACHE_NULL_TICKET (4)
+#define SLCACHE_UPDATE_JIFFIES (HZ*5) /* 5 seconds is very long. */
+DEFINE_PER_CPU(struct soft_limit_cache, soft_limit_cache);
+
+#ifdef CONFIG_HOTPLUG_CPU
+static void forget_soft_limit_cache(long cpu)
+{
+ struct soft_limit_cache *slc;
+
+ slc = &per_cpu(soft_limit_cache, cpu);
+ slc->ticket[0] = 0;
+ slc->ticket[1] = 0;
+ slc->next_update = jiffies;
+ if (slc->mem[0])
+ mem_cgroup_put(slc->mem[0]);
+ if (slc->mem[1])
+ mem_cgroup_put(slc->mem[1]);
+ slc->mem[0] = NULL;
+ slc->mem[1] = NULL;
+}
+#endif
+
+
+/* This is called under preempt disabled context....*/
+static noinline void reload_softlimit_victim(struct soft_limit_cache *slc,
+ struct zone *zone, int file)
+{
+ struct mem_cgroup *mem, *tmp;
+ struct list_head *queue, *cur;
+ int prio;
+ unsigned long usage = 0;
+
+ if (slc->mem[file]) {
+ mem_cgroup_put(slc->mem[file]);
+ slc->mem[file] = NULL;
+ }
+ slc->ticket[file] = SLCACHE_NULL_TICKET;
+ slc->next_update = jiffies + SLCACHE_UPDATE_JIFFIES;
+
+ /* brief check the queue */
+ for (prio = SLQ_MAXPRIO - 1; prio > 0; prio--) {
+ if (!list_empty(&softlimitq.queue[prio][file]))
+ break;
+ }
+retry:
+ if (prio == 0)
+ return;
+
+ /* check queue in priority order */
+
+ queue = &softlimitq.queue[prio][file];
+
+ spin_lock(&softlimitq.lock);
+ mem = NULL;
+ /*
+ * does same behavior as list_for_each_entry but
+ * member for next entity depends on "file".
+ */
+ list_for_each(cur, queue) {
+ if (!file)
+ tmp = container_of(cur, struct mem_cgroup,
+ soft_limit_list[0]);
+ else
+ tmp = container_of(cur, struct mem_cgroup,
+ soft_limit_list[1]);
+
+ usage = mem_cgroup_usage(tmp, zone, file);
+ if (usage) {
+ mem = tmp;
+ list_move_tail(&mem->soft_limit_list[file], queue);
+ break;
+ }
+ }
+ spin_unlock(&softlimitq.lock);
+
+ /* If not found, goes to next priority */
+ if (!mem) {
+ prio--;
+ goto retry;
+ }
+
+ if (!css_is_removed(&mem->css)) {
+ int bonus = 0;
+ unsigned long estimated_excess;
+ estimated_excess = totalram_pages/SLQ_PRIO_FACTOR;
+ estimated_excess <<= prio;
+ slc->mem[file] = mem;
+ /*
+ * If not using hierarchy, this memcg itself consumes memory.
+ * Then, add extra scan bonus to this memcg itself.
+ * If not, this memcg itself may not be very bad one. If
+ * this memcg's (anon or file )usage > 12% of excess,
+ * add extra scan bonus. if not, just small scan.
+ */
+ if (!mem->use_hierarchy || (usage > estimated_excess/8))
+ bonus = SLCACHE_NULL_TICKET * prio;
+ else
+ bonus = SLCACHE_NULL_TICKET; /* twice to NULL */
+ slc->ticket[file] += bonus;
+ mem_cgroup_get(mem);
+ }
+}
+
+static void slc_reset_cache_ticket(int file)
+{
+ struct soft_limit_cache *slc = &get_cpu_var(soft_limit_cache);
+
+ slc->ticket[file] = 0;
+ put_cpu_var(soft_limit_cache);
+}
+
+static struct mem_cgroup *get_soft_limit_victim(struct zone *zone, int file)
+{
+ struct mem_cgroup *ret;
+ struct soft_limit_cache *slc;
+
+ slc = &get_cpu_var(soft_limit_cache);
+ /*
+ * If ticket is expired or long time since last ticket.
+ * reload victim.
+ */
+ if ((--slc->ticket[file] < 0) ||
+ (time_after(jiffies, slc->next_update)))
+ reload_softlimit_victim(slc, zone, file);
+ ret = slc->mem[file];
+ put_cpu_var(soft_limit_cache);
+ return ret;
+}
+
+
static void softlimitq_init(void)
{
int i;
@@ -2780,3 +2945,36 @@ static int __init disable_swap_account(c
}
__setup("noswapaccount", disable_swap_account);
#endif
+
+#ifdef CONFIG_HOTPLUG_CPU
+/*
+ * _NOW_, what we have to handle is just cpu removal.
+ */
+static int __cpuinit memcg_cpu_callback(struct notifier_block *nfb,
+ unsigned long action,
+ void *hcpu)
+{
+ long cpu = (long) hcpu;
+
+ switch (action) {
+ case CPU_DEAD:
+ case CPU_DEAD_FROZEN:
+ forget_soft_limit_cache(cpu);
+ break;
+ default:
+ break;
+ }
+ return NOTIFY_OK;
+}
+
+static struct notifier_block __cpuinitdata soft_limit_notifier = {
+ &memcg_cpu_callback, NULL, 0
+};
+
+static int __cpuinit memcg_cpuhp_init(void)
+{
+ register_cpu_notifier(&soft_limit_notifier);
+ return 0;
+}
+__initcall(memcg_cpuhp_init);
+#endif
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 22+ messages in thread
* [RFC][PATCH 8/9] lru reordering
2009-04-03 8:08 [RFC][PATCH 0/9] memcg soft limit v2 (new design) KAMEZAWA Hiroyuki
` (6 preceding siblings ...)
2009-04-03 8:15 ` [RFC][PATCH 7/9] vicitim selection logic KAMEZAWA Hiroyuki
@ 2009-04-03 8:17 ` KAMEZAWA Hiroyuki
2009-04-03 8:18 ` [RFC][PATCH 9/9] more event filter depend on priority KAMEZAWA Hiroyuki
` (3 subsequent siblings)
11 siblings, 0 replies; 22+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-04-03 8:17 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
balbir@linux.vnet.ibm.com, kosaki.motohiro@jp.fujitsu.com
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
This patch adds a function that changes the LRU order of pages in the global
LRU under control of the soft limit's victim memcg.
FILE and ANON victims are tracked separately, and LRU rotation is done
independently (a memcg that contains only FILE cache or only ANON can exist).
The function finds a specified number of pages on the memcg's LRU and moves
them to the top of the global LRU, where they become the first targets of
shrink_xxx_list().
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
include/linux/memcontrol.h | 15 ++++++++++
mm/memcontrol.c | 67 +++++++++++++++++++++++++++++++++++++++++++++
mm/vmscan.c | 18 +++++++++++-
3 files changed, 99 insertions(+), 1 deletion(-)
Index: softlimit-test2/include/linux/memcontrol.h
===================================================================
--- softlimit-test2.orig/include/linux/memcontrol.h
+++ softlimit-test2/include/linux/memcontrol.h
@@ -117,6 +117,9 @@ static inline bool mem_cgroup_disabled(v
extern bool mem_cgroup_oom_called(struct task_struct *task);
+void mem_cgroup_soft_limit_reorder_lru(struct zone *zone,
+ unsigned long nr_to_scan, enum lru_list l);
+int mem_cgroup_soft_limit_inactive_anon_is_low(struct zone *zone);
#else /* CONFIG_CGROUP_MEM_RES_CTLR */
struct mem_cgroup;
@@ -264,6 +267,18 @@ mem_cgroup_print_oom_info(struct mem_cgr
{
}
+static inline void
+mem_cgroup_soft_limit_reorder_lru(struct zone *zone, unsigned long nr_to_scan,
+ enum lru_list lru);
+{
+}
+
+static inline
+int mem_cgroup_soft_limit_inactive_anon_is_low(struct zone *zone)
+{
+ return 0;
+}
+
#endif /* CONFIG_CGROUP_MEM_CONT */
#endif /* _LINUX_MEMCONTROL_H */
Index: softlimit-test2/mm/memcontrol.c
===================================================================
--- softlimit-test2.orig/mm/memcontrol.c
+++ softlimit-test2/mm/memcontrol.c
@@ -1257,6 +1257,73 @@ static struct mem_cgroup *get_soft_limit
return ret;
}
+/*
+ * zone->lru and the memcg's lru are kept in sync under zone->lock.
+ * This tries to rotate pages in the specified LRU.
+ */
+void mem_cgroup_soft_limit_reorder_lru(struct zone *zone,
+ unsigned long nr_to_scan,
+ enum lru_list l)
+{
+ struct mem_cgroup *mem;
+ struct mem_cgroup_per_zone *mz;
+ int nid, zid, file;
+ unsigned long scan, flags;
+ struct list_head *src;
+ LIST_HEAD(found);
+ struct page_cgroup *pc;
+ struct page *page;
+
+ nid = zone->zone_pgdat->node_id;
+ zid = zone_idx(zone);
+
+ file = is_file_lru(l);
+
+ mem = get_soft_limit_victim(zone, file);
+ if (!mem)
+ return;
+ mz = mem_cgroup_zoneinfo(mem, nid, zid);
+ src = &mz->lists[l];
+ scan = 0;
+
+ /* Find at most nr_to_scan pages from local LRU */
+ spin_lock_irqsave(&zone->lru_lock, flags);
+ list_for_each_entry_reverse(pc, src, lru) {
+ if (scan >= nr_to_scan)
+ break;
+ /* We don't check Used bit */
+ page = pc->page;
+ /* Can happen ? */
+ if (unlikely(!PageLRU(page)))
+ continue;
+ /* This page is on (the same) LRU */
+ list_move(&page->lru, &found);
+ scan++;
+ }
+ /* vmscan searches pages from lru->prev. link this to lru->prev. */
+ list_splice_tail(&found, &zone->lru[l].list);
+ spin_unlock_irqrestore(&zone->lru_lock, flags);
+
+ /* When we cannot fill the request, check whether we should forget
+ this cache or not */
+ if (scan < nr_to_scan &&
+ !is_active_lru(l) &&
+ mem_cgroup_usage(mem, zone, file) < SWAP_CLUSTER_MAX)
+ slc_reset_cache_ticket(file);
+}
+
+/* Returns 1 if a soft limit victim is cached and its zone's status says so */
+int mem_cgroup_soft_limit_inactive_anon_is_low(struct zone *zone)
+{
+ struct soft_limit_cache *slc;
+ int ret = 0;
+
+ slc = &get_cpu_var(soft_limit_cache);
+ if (slc->mem[SL_ANON])
+ ret = mem_cgroup_inactive_anon_is_low(slc->mem[SL_ANON], zone);
+ put_cpu_var(soft_limit_cache);
+ return ret;
+}
static void softlimitq_init(void)
{
Index: softlimit-test2/mm/vmscan.c
===================================================================
--- softlimit-test2.orig/mm/vmscan.c
+++ softlimit-test2/mm/vmscan.c
@@ -1066,6 +1066,13 @@ static unsigned long shrink_inactive_lis
pagevec_init(&pvec, 1);
lru_add_drain();
+ if (scanning_global_lru(sc)) {
+ enum lru_list l = LRU_INACTIVE_ANON;
+ if (file)
+ l = LRU_INACTIVE_FILE;
+ mem_cgroup_soft_limit_reorder_lru(zone, max_scan, l);
+ }
+
spin_lock_irq(&zone->lru_lock);
do {
struct page *page;
@@ -1233,6 +1240,13 @@ static void shrink_active_list(unsigned
struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(zone, sc);
lru_add_drain();
+ if (scanning_global_lru(sc)) {
+ enum lru_list l = LRU_ACTIVE_ANON;
+ if (file)
+ l = LRU_ACTIVE_FILE;
+ mem_cgroup_soft_limit_reorder_lru(zone, nr_pages, l);
+ }
+
spin_lock_irq(&zone->lru_lock);
pgmoved = sc->isolate_pages(nr_pages, &l_hold, &pgscanned, sc->order,
ISOLATE_ACTIVE, zone,
@@ -1328,7 +1342,9 @@ static int inactive_anon_is_low_global(s
if (inactive * zone->inactive_ratio < active)
return 1;
-
+ /* check soft limit victim's status */
+ if (mem_cgroup_soft_limit_inactive_anon_is_low(zone))
+ return 1;
return 0;
}
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 22+ messages in thread
* [RFC][PATCH 9/9] more event filter depend on priority
2009-04-03 8:08 [RFC][PATCH 0/9] memcg soft limit v2 (new design) KAMEZAWA Hiroyuki
` (7 preceding siblings ...)
2009-04-03 8:17 ` [RFC][PATCH 8/9] lru reordering KAMEZAWA Hiroyuki
@ 2009-04-03 8:18 ` KAMEZAWA Hiroyuki
2009-04-03 8:24 ` [RFC][PATCH ex/9] for debug KAMEZAWA Hiroyuki
` (2 subsequent siblings)
11 siblings, 0 replies; 22+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-04-03 8:18 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
balbir@linux.vnet.ibm.com, kosaki.motohiro@jp.fujitsu.com
I'll revisit this one before v3...
==
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reduce the soft limit update ratio depending on its priority (usage).
After this patch:
if priority=0,1 -> check once in 1024 page-in/out
if priority=2,3 -> check once in 2048 page-in/out
...
if priority=10,11 -> check once in 32k page-in/out
(Note: this is called only when the usage exceeds soft limit)
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
mm/memcontrol.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
Index: softlimit-test2/mm/memcontrol.c
===================================================================
--- softlimit-test2.orig/mm/memcontrol.c
+++ softlimit-test2/mm/memcontrol.c
@@ -940,7 +940,7 @@ static void record_last_oom(struct mem_c
mem_cgroup_walk_tree(mem, NULL, record_last_oom_cb);
}
-#define SOFTLIMIT_EVENTS_THRESH (1024) /* 1024 times of page-in/out */
+#define SOFTLIMIT_EVENTS_THRESH (512) /* 512 times of page-in/out */
/*
* Returns true if sum of page-in/page-out events since last check is
* over SOFTLIMIT_EVENT_THRESH. (counter is per-cpu.)
@@ -950,11 +950,15 @@ static bool mem_cgroup_soft_limit_check(
bool ret = false;
int cpu = get_cpu();
s64 val;
+ int thresh;
struct mem_cgroup_stat_cpu *cpustat;
cpustat = &mem->stat.cpustat[cpu];
val = __mem_cgroup_stat_read_local(cpustat, MEM_CGROUP_STAT_EVENTS);
- if (unlikely(val > SOFTLIMIT_EVENTS_THRESH)) {
+ /* If usage is big, this check can be rough */
+ thresh = SOFTLIMIT_EVENTS_THRESH;
+ thresh <<= ((mem->soft_limit_priority >> 1) + 1);
+ if (unlikely(val > thresh)) {
__mem_cgroup_stat_reset_safe(cpustat, MEM_CGROUP_STAT_EVENTS);
ret = true;
}
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 22+ messages in thread
* [RFC][PATCH ex/9] for debug
2009-04-03 8:08 [RFC][PATCH 0/9] memcg soft limit v2 (new design) KAMEZAWA Hiroyuki
` (8 preceding siblings ...)
2009-04-03 8:18 ` [RFC][PATCH 9/9] more event filter depend on priority KAMEZAWA Hiroyuki
@ 2009-04-03 8:24 ` KAMEZAWA Hiroyuki
2009-04-06 9:08 ` [RFC][PATCH 0/9] memcg soft limit v2 (new design) Balbir Singh
2009-04-24 12:24 ` Balbir Singh
11 siblings, 0 replies; 22+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-04-03 8:24 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 396 bytes --]
This mail attaches a patch and scripts for debugging.
soft_limit_show_prio.patch shows the priority in the memory.stat file.
I wonder whether I should add this to the patch series or not...
cgroup.rb and ctop.rb are my personal ruby scripts, a utility to manage cgroups.
I sometimes use them. Place both files in the same directory and run ctop.rb:
#ruby ctop.rb
The help will show what it is for.
Thanks,
-Kame
[-- Attachment #2: soft_limit_show_prio.patch --]
[-- Type: application/octet-stream, Size: 698 bytes --]
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Show internal control information of soft limit when DEBUG_VM is on.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
mm/memcontrol.c | 1 +
1 file changed, 1 insertion(+)
Index: softlimit-test2/mm/memcontrol.c
===================================================================
--- softlimit-test2.orig/mm/memcontrol.c
+++ softlimit-test2/mm/memcontrol.c
@@ -2617,6 +2617,7 @@ static int mem_control_stat_show(struct
#ifdef CONFIG_DEBUG_VM
cb->fill(cb, "inactive_ratio",
calc_inactive_ratio(mem_cont, NULL, NULL));
+ cb->fill(cb, "soft_limit_prio", mem_cont->soft_limit_priority);
{
int nid, zid;
[-- Attachment #3: cgroup.rb --]
[-- Type: application/octet-stream, Size: 10798 bytes --]
require 'find'
$subsys_array = Array.new
$allsubsys = Hash.new
$allmounts = Hash.new
class Sub_system
def initialize(name, mount, option)
@name = name
@mount= mount
@hierarchy = Array.new
if (option =~ /.*noprefix.*/) then
@prefix =""
else
@prefix = name +"."
end
@option = option
@writable_files = Array.new
end
def mount_point
@mount
end
def type
@name
end
def myfile(name, attr)
name + "/" + @prefix + attr
end
def option
@option
end
#
# walk directory tree and add Cgroups to hash.
#
def reload
@hierarchy.clear
len = @mount.size
Find.find(@mount) do |file|
if File.directory?(file) then
@hierarchy.push(file);
end
end
end
def each_cgroup(&block)
@hierarchy.each(&block)
end
def ent(id)
if (id < 0) then return nil
end
return @hierarchy.at(id)
end
def size
@hierarchy.size
end
def stat (name)
[["Not implemented", ""]]
end
def each_writable_files(name)
@writable_files.each {|x| yield myfile(name,x)}
end
def tasks(name)
list=Array.new
begin
File.open(name+"/tasks", "r") do |file|
file.each_line do |x|
x.chomp!
list.push(x)
end
end
rescue
return nil
end
return list
end
end
def read_oneline_file(filename)
val=nil
begin
f = File.open(filename, "r")
line = f.readline
val = line.to_i
rescue
throw :readfailure,false
ensure
f.close if f != nil
end
return val
end
#
#for CPU subsystem
#
class Cpu_Subsys < Sub_system
def initialize(mount, option)
super("cpu", mount, option)
@writable_files += ["shares"]
end
def read_share (name)
ret = nil
catch :readfailure do
val = read_oneline_file(myfile(name,"shares"))
return [val.to_s, val.to_s+" (100%)"] if (name == @mount)
all=0
dirname = File.dirname(name)
Dir.foreach(dirname) do |x|
next if ((x == ".") || (x == ".."))
x = "#{dirname}/#{x}"
next unless File.directory?(x)
next unless File.exist?(myfile(name,"shares"))
got = read_oneline_file(myfile(name,"shares"))
all+=got
end
share=sprintf("%d (%.1f%%)", all, val*100.0/all)
ret = [val.to_s, share]
end
return ret
end
def stat(name)
level=0
data = Array.new
pos = @mount
name_array = Array.new
loop do
name_array.push(name)
break if name == @mount
name = File.dirname(name)
end
name_array.reverse!
name_array.each do |x|
val = read_share(x)
if val == nil then
data = nil
break
end
str = sprintf("%5s / %s", val[0], val[1])
data.push([x, str])
end
return data if (data != nil && data.size > 0)
return nil
end
end
#
#for CPUacct subsystem
#
class Cpuacct_Subsys < Sub_system
def initialize(mount, option)
super("cpuacct", mount, option)
end
def stat(name)
data = Array.new
catch :read_failure do
val = read_oneline_file(myfile(name, "usage"))
data.push(["All", val.to_s])
begin
f = File.open(myfile(name,"usage_percpu"), "r")
id=0
line = f.readline
while (line =~/\d+/) do
line =$'
data.push(["cpu"+id.to_s, $&])
id += 1
end
rescue
data.clear
ensure
f.close if f != nil
end
return data if data.size > 0
return nil
end
end
end
#
# For cpuset
#
class Cpuset_Subsys < Sub_system
def initialize(mount, option)
super("cpuset", mount, option)
@elements =["cpu_exclusive","cpus","mems", "mem_exclusive","mem_hardwall",
"memory_migrate", "memory_pressure", "memory_pressure_enabled",
"memory_spread_page","memory_spread_slab",
"sched_load_balance","sched_relax_domain_level"]
@writable_files += @elements
end
def stat(name)
data = Array.new
for x in @elements
begin
filename = myfile(name, x)
next unless (File.file?(filename))
File.open(filename, "r") do | file |
str = file.readline
str.chomp!
case x
when "cpus", "mems"
str = "empty" if (str == "")
end
data.push([x,str])
end
rescue
#data = nil
break
end
end
return data
end
end
#
#for Memory Subsys
#
def convert_bytes(bytes, precise)
case
when (precise == 0) && (bytes > 64 * 1024*1024*1024*1024)
sprintf("Unlimited")
when (precise == 0) && (bytes > 1024*1024*1024*1024)
sprintf("%dT",bytes/1024/1024/1024/1024)
when (precise == 0) && (bytes > 1024*1024*1024)
sprintf("%dG", bytes/1024/1024/1024)
when (bytes > 1024*1024)
sprintf("%dM", bytes/1024/1024)
when (bytes > 1024)
sprintf("%dk", bytes/1024)
else
sprintf("%d", bytes)
end
end
#
#for Memory Subsystem
#
class Memory_Subsys < Sub_system
def initialize(mount, option)
super("memory", mount, option)
if (File.exist?("#{mount}/memory.memsw.usage_in_bytes")) then
@memsw=true
else
@memsw=false
end
@writable_files += ["limit_in_bytes", "use_hierarchy","swappiness", "soft_limit_in_bytes"]
if (@memsw) then
@writable_files += ["memsw.limit_in_bytes"]
end
end
#
# Find a root directory of hierarchy.
#
def find_hierarchy_root(name)
cur=[name, File.dirname(name)]
ret=@mount
while (cur[0] != @mount)
ret="hoge"
under = read_oneline_file("#{cur[1]}/memory.use_hierarchy")
if (under == 0) then
return cur[0]
end
cur[0] = cur[1]
cur[1] = File.dirname(cur[1])
end
return ret
end
#
# Generate an array for reporting status
#
def stat(name)
data = Array.new
success = catch(:readfailure) do
under =read_oneline_file(myfile(name,"use_hierarchy"))
if (under == 1) then
str=find_hierarchy_root(name)
if (str != name) then
str="under #{str}"
under=2
else
str="hierarchy ROOT"
end
else #Not under hierarchy
str=""
end
ent = ["Memory Subsys", str]
data.push(ent)
# Limit and Usage
x=Array.new
x.push("Usage/Limit")
bytes = read_oneline_file(myfile(name,"usage_in_bytes"))
usage = convert_bytes(bytes, 1)
if (@memsw) then
bytes = read_oneline_file(myfile(name,"memsw.usage_in_bytes"))
usage2 = convert_bytes(bytes, 1)
usage = "#{usage} (#{usage2})"
end
bytes = read_oneline_file(myfile(name,"limit_in_bytes"))
limit = convert_bytes(bytes, 0)
usage = "#{usage} / #{limit}"
if (@memsw) then
bytes = read_oneline_file(myfile(name,"memsw.limit_in_bytes"))
limit2 = convert_bytes(bytes, 0)
usage = "#{usage} (#{limit2})"
end
x.push(usage)
data.push(x)
# MAX USAGE
x = Array.new
x.push("Max Usage")
bytes = read_oneline_file(myfile(name, "max_usage_in_bytes"))
usage = convert_bytes(bytes, 1)
if (@memsw) then
bytes = read_oneline_file(myfile(name,"memsw.max_usage_in_bytes"))
usage2 = convert_bytes(bytes, 1)
usage = "#{usage} (#{usage2})"
end
x.push(usage)
data.push(x)
# soft limit
x = Array.new
x.push("Soft limit")
bytes = read_oneline_file(myfile(name, "soft_limit_in_bytes"))
usage = convert_bytes(bytes, 1)
x.push(usage)
data.push(x)
# failcnt
x = Array.new
x.push("Fail Count")
cnt = read_oneline_file(myfile(name,"failcnt"))
failcnt = cnt.to_s
if (@memsw) then
cnt = read_oneline_file(myfile(name, "memsw.failcnt"))
failcnt="#{failcnt} (#{cnt.to_s})"
end
x.push(failcnt)
data.push(x)
begin
f = File.open(myfile(name,"stat"), "r")
for x in ["Cache","Rss","Pagein","Pageout",nil, nil, nil, nil, nil,
"HierarchyLimit","SubtreeCache","SubtreeRss", nil, nil,
nil, nil, nil, nil, nil, nil, "soft_limit_prio"]
line =f.readline
next if x == nil
line =~ /^\S+\s+(.+)/
val=$1
case x
when "Cache","Rss","SubtreeCache","SubtreeRss"
bytes = convert_bytes(val.to_i, 1)
data.push([x, bytes])
when "Pagein","Pageout"
data.push([x, val])
when "soft_limit_prio"
data.push([x, val])
when "HierarchyLimit"
memlimit = convert_bytes(val.to_i, 0)
if (@memsw) then
line =f.readline
line =~ /^\S+\s+(.+)/
memswlimit = convert_bytes(val.to_i, 0)
memlimit += " (" + memswlimit + ")"
end
data.push([x, memlimit])
end
end
ensure
f.close if f != nil
end
true
end
return data if success==true
return nil
end
end
#
# Read /proc/mounts and parse each lines.
# When cgroup mount point is found, each subsystem's cgroups are added
# to subsystem's Hash.
#
def register_subsys(name, mount, option)
if $allsubsys[name] == nil then
subsys = nil
case name
when "cpu" then subsys = Cpu_Subsys.new(mount, option)
when "cpuacct" then subsys = Cpuacct_Subsys.new(mount, option)
when "memory" then subsys = Memory_Subsys.new(mount, option)
when "cpuset" then subsys = Cpuset_Subsys.new(mount, option)
end
if subsys != nil then
$subsys_array.push(name)
$allsubsys[name] = subsys
end
end
end
#
# Read /proc/mounts and prepare subsys array
#
def parse_mount(line)
parsed = line.split(/\s+/)
if parsed[2] == "cgroup" then
mount=parsed[1]
opts=parsed[3].split(/\,/)
opts.each do |name|
case name
when "rw" then next
else
register_subsys(name, mount, parsed[3])
$allmounts[mount]=name
end
end
end
end
def read_mount
File.open("/proc/mounts", "r") do |file|
file.each_line {|line| parse_mount(line) }
end
$subsys_array.sort!
end
#
# Read all /proc/mounts and scan directory under mount point.
#
def refresh_all
$allmounts.clear
$subsys_array.clear
$allsubsys.clear
read_mount
end
def check_and_refresh_mount_info
mysubsys=Array.new
File.open("/proc/mounts", "r") do |file|
file.each_line do |line|
parsed = line.split(/\s+/)
if (parsed[2] == "cgroup") then
mysubsys.push(parsed[1])
end
end
end
if (mysubsys.size != $allmounts.size) then
refresh_all
return true
end
mysubsys.each do |x|
if ($allmounts[x] == nil) then
refresh_all
return true
end
end
return false
end
[-- Attachment #4: ctop.rb --]
[-- Type: application/octet-stream, Size: 20106 bytes --]
#
# ctop.rb
# written by KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
# Copyright 2009 Fujitsu Limited
#
# Changelog:
#
# v003
# - fixed bug in rmdir/mkdir
# - changed command-mode interface
# - added comments and made codes clean
#
# v002 (2009/02/25)
# - fixed leak of file descriptor
# - mount/umount <-> reload data problem is fixed.
# - "mount twice" problem is fixed.
# - removed R key for reload all. it's now automatic
# - handle "noprefix" mount option
# - show mount option in help window
# - add cpuset support
# - add command-mode
#
# v001 (2009/02/04)
# - first version released
# - cpu, cpuacct, memory subsys is supported
# known bugs -> noprefix, umount, mount twice
#
require 'cgroup.rb'
require 'curses'
require 'etc'
require 'timeout'
require 'singleton'
DIRWIN_LINES=7
DIRWIN_FIELDS= DIRWIN_LINES - 2
UPKEY=256
DOWNKEY=257
RIGHTKEY=258
LEFTKEY=259
#mode
SHOWMOUNT=0
SHOWTASKS=1
SHOWSUBSYS=2
#for 'ps'
PID=0
STATE=1
PPID=2
UID=3
COMMAND=4
PGID=5
#for process status filter
RUNNING=0
#
# Helper function for curses
#
def hit_any_key(str, window)
window.addstr(str) if str != nil
window.addstr("\n[Hit Any Key]")
window.getch
end
def window_printf(window, format, *arg)
str = sprintf(format, *arg)
window.addstr(str)
end
#
# Cursor holds current status of subsys's window.
#
#
class Cursor
def initialize(name)
@subsysname=name # name of subsys
@cursor=0 # current directroy position
@mode=SHOWTASKS # current mode (ps-mode/stat-mode)
@info_startline=0 # used for scroll in infowin
@info_endline=0 # used for scorll in infowin
@show_only_running = 0 # a filter for ps-mode
@user_name_filter=nil # a filter for ps-mode
@command_name_filter=nil # a filter for ps-mode
end
def pos
@cursor
end
def mode
@mode
end
def change_mode # switch mode ps-mode <-> stat-mode
case @mode
when SHOWTASKS then @mode=SHOWSUBSYS
when SHOWSUBSYS then @mode=SHOWTASKS
end
end
#
# Filter for PS-MODE
#
def process_status_filter(stat)
return true if (@show_only_running == 0)
return true if (stat =="R")
return false
end
def user_name_filter(str)
return true if (@user_name_filter == nil)
return true if (@user_name_filter == str)
return false
end
def command_name_filter(str)
return true if (@command_name_filter == nil)
return true if (str =~ /#{@command_name_filter}/)
return false
end
def toggle_show_only_running
if (@show_only_running == 0) then
@show_only_running = 1 # show only running process in ps-mode
else
@show_only_running = 0 # show all processes
end
end
def set_user_name_filter(str)
str = nil if (str == "")
@user_name_filter=str
end
def set_command_name_filter(str)
str=nil if (str == "")
@command_name_filter=str
end
#
# Scroll management for infowin
#
def info_startline
@info_startline
end
def set_infoendline(num)
@info_endline=num
end
def set_infoline(num)
if ((num < 0) || (num >= @info_endline)) then
@info_startline=0
else
@info_startline=num
end
end
#
# chdir() for subsys.
#
def move(direction)
subsys =$allsubsys[@subsysname]
if (subsys == nil) then return
end
if (direction == -1) then
@cursor -= 1 if @cursor > 0
elsif (direction == 1)
@cursor += 1 if @cursor < subsys.size-1
end
end
end
#
# Current is a singleton holds current status of this program.
#
class Current
include Singleton
def initialize
@index=-1 #current subsys index in $subsys_array[]
@name=nil #current name of subsys
@cursor=nil #reference to current Cursor
@subsys=nil #reference to current Subsys
@subsys_cursor = Hash.new
end
def set(x)
@index=x
if (x == -1) then
@index, @name, @cursor, @subsys = -1, "help", nil, nil
else
@name = $subsys_array[x]
@subsys = $allsubsys[@name]
if (@subsys_cursor[@name] == nil) then
@subsys_cursor[@name] = Cursor.new(@name)
end
@cursor = @subsys_cursor[@name]
end
end
#change subsys view
def move (dir)
case dir
when "left"
@index -= 1 if (@index > -1)
when "right"
@index += 1 if (@index < $subsys_array.size - 1)
end
set(@index)
end
#change directory view of current cursor
def chdir(direction)
if (@cursor != nil) then
@cursor.move(direction)
end
end
#switch current mode of cursor
def change_mode
if (@cursor != nil) then
@cursor.change_mode
end
end
def name
@name
end
def cursor
@cursor
end
def subsys
@subsys
end
end
$cur = Current.instance
#
# Show directory window
#
def detect_dirlist_position(subsysname, subsys)
pos = 0
size=subsys.size
cursor = $cur.cursor
return [0, 0, 0] if cursor == nil
pos = cursor.pos
if ((size < 4) || (pos <= 2)) then
head=0
tail=4
elsif (pos < size - 2) then
head=pos-1
tail=pos+2
else
head = size - 4
tail = size - 1
end
return [pos, head, tail]
end
def get_owner_name(name)
begin
stat = File.stat(name)
rescue
return ""
end
begin
info = Etc::getpwuid(stat.uid)
uname = info.name
rescue
$barwin.addstr($!)
uname = stat.uid.to_s
end
begin
info = Etc::getgrgid(stat.gid)
gname = info.name
rescue
gname = stat.gid.to_s
end
sprintf("\t-\t(%s/%s)", uname, gname)
end
def draw_dirlist(dirwin, subsys)
now, head, tail = detect_dirlist_position($cur.name, subsys)
lines=1
i=head
while i <= tail
name = subsys.ent(i)
break if (name == nil)
dirwin.setpos(lines, 3)
dirwin.standout if (i == now)
dirwin.addstr(name + get_owner_name(name))
dirwin.standend if (i == now)
lines+=1
i += 1
end
end
#
# Fill dirwin contents.
#
def draw_dirwin(dirwin)
dirwin.clear
dirwin.box(?|,?-,?*)
dirwin.setpos(0, 1)
#show all subsystems in the header line
-1.upto($subsys_array.size - 1) do |x|
dirwin.addstr("-")
if (x == -1) then
str="help"
else
str=sprintf("%s",$subsys_array[x])
end
break if (str == nil)
dirwin.standout if (str == $cur.name)
dirwin.addstr(str)
dirwin.standend if (str == $cur.name)
end
#show current time
dirwin.setpos(6,dirwin.maxx-32)
dirwin.addstr("[#{Time.now.asctime}]")
#
# Show directory list
#
if $cur.subsys != nil then
#Reload information
$cur.subsys.reload
draw_dirlist(dirwin, $cur.subsys)
end
end
#
#
# for infowin
#
#
# Contents of infowin are passed in data[].
# This function shows the contents based on the current scroll information.
# To convert each array entry to a string, the given code block is called via yield.
#
def draw_infowin_limited(infowin, cursor, data)
#
# Generate Header
#
str = yield nil # write a header if necessary
if (str != nil) then
draw=1
infowin.setpos(0,2)
infowin.addstr(str)
else
draw=0
end
#
# print the lines which fall inside the window
#
startline = cursor.info_startline
endline = cursor.info_startline + infowin.maxy-2
startline.upto(endline) do |linenumber|
x = data.at(linenumber)
break if (x == nil) #no more data
str = yield(x)
infowin.setpos(draw, 2)
infowin.addstr(str)
draw = 1+infowin.cury
break if (draw == infowin.maxy)
end
cursor.set_infoendline(data.size)
end
#
#
# Show help and current mount information in help window
#
def show_mount_info(infowin)
if ($allsubsys.empty?) then
$barwin.addstr("cgroups are not mounted\n")
end
$allsubsys.each do |name, subsys|
window_printf(infowin, "%12s\t%s\t#%s\n",
name, subsys.mount_point, subsys.option)
end
#$barwin.addstr("mounted subsystems")
#
# Help
#
infowin.addstr("Command\n")
infowin.addstr("[LEFT, RIGHT]\t move subsystems\n")
infowin.addstr("[UP, DOWN]\t move directory\n")
infowin.addstr("[n, b]\t\t scroll information window\n")
infowin.addstr("[s]\t\t switch shown information (ps-mode/stat-mode)\n")
infowin.addstr("[r]\t\t set refresh rate\n")
infowin.addstr("[c]\t\t Enter command-mode\n")
infowin.addstr("ps mode option\n")
infowin.addstr("[t]\t\t (ps-mode)toggle show only running process\n")
infowin.addstr("[u]\t\t (ps-mode)set/unset user name filter\n")
infowin.addstr("[f]\t\t (ps-mode)set/unset command name filter")
end
#
# Read one line of /proc/<pid>/status and return the field matched by the regexp es
#
def parse_pid_status(f, es)
input = f.readline
input =~ es
return $1
end
#
# data[] = [PID, STATE, PPID, UID, COMMAND]
#
def parse_process(pid)
#
# Status
#
data = Array.new
stat = nil
stat = catch(:bad_task_status) do
data[PID]=pid.to_i
begin
f = File.open("/proc/#{pid}/status", "r")
#Name
data[COMMAND] = parse_pid_status(f,/^Name:\s+(.+)/)
unless (File.exist?("/proc/#{pid}/exe")) then
data[COMMAND] = "[" + data[COMMAND] + "]"
end
#State
data[STATE] = parse_pid_status(f, /^State:\s+([A-Z]).+/)
# TGID: Is this the thread group leader ?
if (parse_pid_status(f, /^Tgid:\s+(.+)/) != pid) then
throw :bad_task_status, false
end
#skip PID
input = f.readline
#PPID
data[PPID]= parse_pid_status(f,/^PPid:\s+(.+)/)
ppid=data[PPID]
#TracerPID
input = f.readline
#UID
uid = parse_pid_status(f,/^Uid:\s+([0-9]+).+/)
begin
info=Etc::getpwuid(uid.to_i)
data[UID]=info.name
rescue
data[UID]=uid
end
rescue
throw :bad_task_status, false
ensure
f.close unless f.nil?
end
end
return data unless stat.nil?
return nil
end
#
# PS-MODE
# Cat the "tasks" file and visit all /proc/<pid>/status files
# All information will be pushed into "ps" array
#
def show_tasks(subsys, cursor, infowin)
# Get Name of Current Cgroup and read task file
ps = Array.new
catch :quit do
group = subsys.ent(cursor.pos)
throw :quit,"nogroup" if group==nil
tasks = subsys.tasks(group)
throw :quit,"nogroup" if tasks==nil
tasks.each do |x|
data = parse_process(x)
next if (data == nil)
next unless (cursor.process_status_filter(data[STATE]))
next unless (cursor.command_name_filter(data[COMMAND]))
ps.push(data) if (cursor.user_name_filter(data[UID]))
end
#
# Sort ps's result, "R" first.
#
ps.sort! do |x , y|
if (x[STATE] == "R" && y[STATE] != "R") then
-1
elsif (x[STATE] != "R" && y[STATE] == "R") then
1
else
0
end
end
end
return if (ps.size == 0)
draw_infowin_limited(infowin, cursor, ps)do |x|
if (x == nil) then
sprintf("%6s %6s %8s %5s %16s", "PID","PPID","USER","STATE", "COMMAND")
else
sprintf("%6d %6d %8s %5s %16s",
x[PID], x[PPID], x[UID], x[STATE], x[COMMAND])
end
end
unless ($cur.cursor.process_status_filter("S")) then
$barwin.addstr("[t]")
end
unless ($cur.cursor.user_name_filter("badnamemandab")) then
$barwin.addstr("[u]")
end
unless ($cur.cursor.command_name_filter("badnamemandab")) then
$barwin.addstr("[f]")
end
end
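The comparator above only distinguishes running tasks from everything else, so
every "R" row sorts ahead of the rest while ties stay unordered. A
self-contained sketch with hypothetical [pid, state] rows:

```ruby
STATE = 1  # index of the state field, mirroring the script's constant
# Hypothetical rows; only the STATE column matters for the ordering.
rows = [[10, "S"], [11, "R"], [12, "D"], [13, "R"]]
sorted = rows.sort do |x, y|
  if x[STATE] == "R" && y[STATE] != "R"
    -1
  elsif x[STATE] != "R" && y[STATE] == "R"
    1
  else
    0
  end
end
# Both "R" rows now lead the list; relative order among equals is unspecified.
```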
def show_subsys_stat(subsys, cursor, infowin)
group = subsys.ent(cursor.pos)
return if group == nil
data = subsys.stat(group)
return if data == nil
draw_infowin_limited(infowin, cursor, data) do |x|
next if x == nil
if (x[0].size > 24) then
len = x[0].size - 24
x[0].slice!(0..len)
end
sprintf("%24s\t%s", x[0], x[1])
end
end
#
# [n],[b] Move cursor's current position in infowin
#
def set_scroll(infowin, direction)
cursor = $cur.cursor
return if (cursor == nil)
if (direction == 1) then
curline=cursor.info_startline
cursor.set_infoline(curline+infowin.maxy)
else
curline=cursor.info_startline
cursor.set_infoline(curline-infowin.maxy)
end
end
#
# [t] Set/Unset Show-Running-Only filter
#
def toggle_running_filter
if ($cur.cursor != nil) then
$cur.cursor.toggle_show_only_running
end
end
#
# Filters for ps-mode
#
#
# [u] Filter by UID
#
def user_name_filter(infowin)
infowin.clear
window_printf(infowin, "user name filter:")
str=infowin.getstr
cursor= $cur.cursor
cursor.set_user_name_filter(str) if (cursor != nil)
end
#
# [f] Filter by name of command
#
def command_name_filter(infowin)
infowin.clear
window_printf(infowin, "command name filter:")
str=infowin.getstr
cursor =$cur.cursor
cursor.set_command_name_filter(str) if (cursor != nil)
end
#
# [r] set refresh time
#
def set_refresh_time(time, infowin)
infowin.clear
window_printf(infowin, "set refresh time(now %ds)",time)
str=infowin.getstr
return time if (str.to_i == 0)
return str.to_i
end
#
# [c] Below are subroutines for command-mode.
#
def smart_print(str, window)
if (window.maxx - window.curx < str.size-2) then
window.addstr("\n"+str)
else
window.addstr(str)
end
end
def show_writable_files(subsys, cursor, infowin)
group = subsys.ent(cursor.pos)
return nil if group == nil
ent=1
data = Array.new
subsys.each_writable_files(group) do |x|
str = sprintf("%2d: %s ", ent, File.basename(x))
ent=ent+1
smart_print(str, infowin)
data.push(x)
end
infowin.refresh
return data
end
#
# Scan directory and change owner/group of all regular files
# and current directory.
#
def chown_all_files(uid, gid, group, infowin)
# change owner/group of current dir
begin
File.chown(uid, gid, group)
rescue
hit_any_key("Error:"+$!, infowin)
return
end
# change owner/group of regular files
Dir.foreach(group) do |x|
name = group+"/"+x
next if File.directory?(name)
begin
File.chown(uid, gid, name)
rescue
hit_any_key("Error:"+$!, infowin)
break
end
end
end
#
# Check whether "/" is included in the name given to mkdir/rmdir
#
def check_mkrmdir_string(str, infowin)
if (str =~ /\//) then
infowin.addstr("don't include /\n")
return false
elsif (str == ".") then
infowin.addstr("can't remove current\n")
return false
end
return true
end
#
# Get string and return uid or gid as integer
#
def parse_id(window, uid, str)
if (str =~ /\D/) then
begin
if (uid == 1) then
info = Etc::getpwnam(str)
id = info.uid
else
info = Etc::getgrnam(str)
id = info.gid
end
rescue
hit_any_key("Error:"+$!, window)
id=nil
end
else
id = str.to_i
end
return id
end
#
#
# Command mode interface
#
#
def command_mode(infowin)
return if ($cur.subsys == nil)
infowin.clear
$barwin.clear
$barwin.addstr("[command-mode]")
$barwin.refresh
#
# Subsys special files are selected by number
#
infowin.addstr("====subsys command====\n")
data = show_writable_files($cur.subsys, $cur.cursor, infowin)
if data==nil then
infowin.addstr("no subsys command")
end
#
# Cgroup generic ops are selected by letter
#
infowin.addstr("\n====cgroup command====\n")
smart_print("[A] attach task(PID)", infowin)
smart_print(" [M] mkdir", infowin)
smart_print(" [R] rmdir",infowin)
smart_print(" [O] chown(OWNER)", infowin)
smart_print(" [G] chown(GID)", infowin)
infowin.addstr("\n\nModify which ? [and Hit return]:")
#line to show prompt
endline = infowin.cury+1
#wait for a number or one of A/O/G/M/R
str=infowin.getstr
#target directory is this.
group = $cur.subsys.ent($cur.cursor.pos)
case str.to_i # if str is not number, returns 0.
# Subsystem commands
when 1..99
if (data != nil) then
name = data.at(str.to_i - 1)
#get input
infowin.setpos(endline, 0)
window_printf(infowin, "#echo to >%s:", File.basename(name))
str = infowin.getstr
#write
begin
File.open(name, "w") {|f| f.write(str) }
rescue
hit_any_key("Error:"+$!, infowin)
end
end
# Cgroup commands (str.to_i returns 0)
when 0
case str
when "a","A" #Attach
window_printf(infowin, "Attach task to %s:", group)
str = infowin.getstr
begin
File.open(group + "/tasks", "w") {|f| f.write(str) }
rescue
hit_any_key("Error:"+$!, infowin)
end
when "o","O" #chown (OWNER)
infowin.addstr("change owner id of all files to:")
id = parse_id(infowin, 1, infowin.getstr)
chown_all_files(id, -1, group, infowin) if id != nil
when "g","G" #chown (GROUP)
infowin.addstr("change group id of all files to:")
id = parse_id(infowin, 0, infowin.getstr)
chown_all_files(-1, id, group, infowin) if id != nil
when "m","M" #mkdir
infowin.addstr("mkdir - enter name:")
str = infowin.getstr
if (check_mkrmdir_string(str, infowin)) then
begin
if (Dir.mkdir(group+"/"+str) != 0) then
hit_any_key("Error:"+$!, infowin)
end
rescue
hit_any_key("Error:"+$!, infowin)
end
else
hit_any_key(nil, infowin)
end
when "r","R" #rmdir
infowin.addstr("rmdir - enter name:")
str = infowin.getstr
if (check_mkrmdir_string(str, infowin)) then
begin
if (Dir.rmdir(group+"/"+str) != 0) then
hit_any_key("Error:"+$!, infowin)
end
rescue
hit_any_key("Error:"+$!, infowin)
end
else
hit_any_key(nil, infowin)
end
end
end
$barwin.clear
end
#
# Main draw routine
#
def draw_infowin(infowin)
infowin.clear
cursor = $cur.cursor
if cursor == nil then
mode = SHOWMOUNT
else
mode = cursor.mode
end
#
# If no subsys is specified, just show mount information.
#
case mode
when SHOWMOUNT
show_mount_info(infowin)
when SHOWTASKS
$barwin.addstr("[ps-mode]")
show_tasks($cur.subsys, cursor, infowin)
when SHOWSUBSYS
$barwin.addstr("[stat-mode]")
show_subsys_stat($cur.subsys, cursor, infowin)
end
end
#
# Main loop
#
#
# For stdscreen
#
# Check /proc/mounts and read all subsys.
#
refresh_all
#
# Main loop. create windows and wait for inputs
#
Curses::init_screen
begin
$lines=Curses::lines
$cols=Curses::cols
off=0
#
# Create window
#
dirwin = Curses::stdscr.subwin(DIRWIN_LINES, $cols, off, 0)
#for misc info
off+=DIRWIN_LINES
$barwin = Curses::stdscr.subwin(1, $cols, off, 0);
$barwin.standout
off+=1
infowin = Curses::stdscr.subwin($lines-off, $cols, off, 0)
mode=SHOWTASKS
quit=0
refresh_time=15
while quit == 0
#$barwin.clear
#$barwin.addstr("Info:")
draw_dirwin(dirwin)
draw_infowin(infowin)
dirwin.refresh
infowin.refresh
$barwin.refresh
#
# handle input.
#
$barwin.clear
Curses::setpos(0,0)
ch=0
Curses::noecho
begin
Timeout::timeout(refresh_time) do
ch=Curses::getch
end
rescue Timeout::Error
#$barwin.addstr("timeout")
end
Curses::echo
#check escape sequence
if ch == 27 then
ch = Curses::getch
if ch == 91 then
ch = Curses::getch
case ch
when 65 then ch = UPKEY
when 66 then ch = DOWNKEY
when 67 then ch = RIGHTKEY
when 68 then ch = LEFTKEY
end
end
end
#
#
#
if (check_and_refresh_mount_info) then
$cur.set(-1)
end
#$barwin.addstr(Time.now.asctime)
case ch
when ?q
quit=1
break
when LEFTKEY then $cur.move("left")
when RIGHTKEY then $cur.move("right")
when UPKEY then $cur.chdir(-1)
when DOWNKEY then $cur.chdir(1)
when ?s then $cur.change_mode
when ?n then set_scroll(infowin, 1)
when ?b then set_scroll(infowin, -1)
when ?t then toggle_running_filter
when ?u then user_name_filter(infowin)
when ?f then command_name_filter(infowin)
when ?c then command_mode(infowin)
when ?r then refresh_time=set_refresh_time(refresh_time, infowin)
end
end
ensure
Curses::close_screen
end
* Re: [RFC][PATCH 0/9] memcg soft limit v2 (new design)
2009-04-03 8:08 [RFC][PATCH 0/9] memcg soft limit v2 (new design) KAMEZAWA Hiroyuki
` (9 preceding siblings ...)
2009-04-03 8:24 ` [RFC][PATCH ex/9] for debug KAMEZAWA Hiroyuki
@ 2009-04-06 9:08 ` Balbir Singh
2009-04-07 0:16 ` KAMEZAWA Hiroyuki
2009-04-24 12:24 ` Balbir Singh
11 siblings, 1 reply; 22+ messages in thread
From: Balbir Singh @ 2009-04-06 9:08 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
kosaki.motohiro@jp.fujitsu.com
* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-04-03 17:08:35]:
> Hi,
>
> Memory cgroup's soft limit feature is a feature to tell global LRU
> "please reclaim from this memcg at memory shortage".
>
> This is v2. Fixed some troubles under hierarchy. and increase soft limit
> update hooks to proper places.
>
> This patch is on to
> mmotom-Mar23 + memcg-cleanup-cache_charge.patch
> + vmscan-fix-it-to-take-care-of-nodemask.patch
>
> So, not for wide use ;)
>
> This patch tries to avoid to use existing memcg's reclaim routine and
> just tell "Hints" to global LRU. This patch is briefly tested and shows
> good result to me. (But may not to you. plz brame me.)
>
> Major characteristic is.
> - memcg will be inserted to softlimit-queue at charge() if usage excess
> soft limit.
> - softlimit-queue is a queue with priority. priority is detemined by size
> of excessing usage.
This is critical and good that you have this now. In my patchset, it
helps me achieve a lot of the expected functionality.
> - memcg's soft limit hooks is called by shrink_xxx_list() to show hints.
Based on my comments earlier, I am not too happy with moving pages in
the global LRU according to soft limits. My objection is not too strong,
since reclaiming from the memcg also exhibits functionally similar
behaviour.
> - Behavior is affected by vm.swappiness and LRU scan rate is determined by
> global LRU's status.
>
I also have concerns about not sorting the list of memcg's. I need to
write some scalability tests and check.
> In this v2.
> - problems under use_hierarchy=1 case are fixed.
> - more hooks are added.
> - codes are cleaned up.
>
> Shows good results on my private box test under several work loads.
>
> But in special artificial case, when victim memcg's Active/Inactive ratio of
> ANON is very different from global LRU, the result seems not very good.
> i.e.
> under vicitm memcg, ACTIVE_ANON=100%, INACTIVE=0% (access memory in busy loop)
> under global, ACTIVE_ANON=10%, INACTIVE=90% (almost all processes are sleeping.)
> memory can be swapped out from global LRU, not from vicitm.
> (If there are file cache in victims, file cacahes will be out.)
>
> But, in this case, even if we successfully swap out anon pages under victime memcg,
> they will come back to memory soon and can show heavy slashing.
heavy slashing? Not sure I understand what you mean.
>
> While using soft limit, I felt this is useful feature :)
> But keep this RFC for a while. I'll prepare Documentation until the next post.
>
--
Balbir
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
* Re: [RFC][PATCH 3/9] soft limit update filter
2009-04-03 8:12 ` [RFC][PATCH 3/9] soft limit update filter KAMEZAWA Hiroyuki
@ 2009-04-06 9:43 ` Balbir Singh
2009-04-07 0:04 ` KAMEZAWA Hiroyuki
0 siblings, 1 reply; 22+ messages in thread
From: Balbir Singh @ 2009-04-06 9:43 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
kosaki.motohiro@jp.fujitsu.com
* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-04-03 17:12:02]:
> No changes from v1.
> ==
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>
> Check/Update softlimit information at every charge is over-killing, so
> we need some filter.
>
> This patch tries to count events in the memcg and if events > threshold
> tries to update memcg's soft limit status and reset event counter to 0.
>
> Event counter is maintained by per-cpu which has been already used,
> Then, no significant overhead (extra cache-miss etc.) in theory.
>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> ---
> Index: mmotm-2.6.29-Mar23/mm/memcontrol.c
> ===================================================================
> --- mmotm-2.6.29-Mar23.orig/mm/memcontrol.c
> +++ mmotm-2.6.29-Mar23/mm/memcontrol.c
> @@ -66,6 +66,7 @@ enum mem_cgroup_stat_index {
> MEM_CGROUP_STAT_PGPGIN_COUNT, /* # of pages paged in */
> MEM_CGROUP_STAT_PGPGOUT_COUNT, /* # of pages paged out */
>
> + MEM_CGROUP_STAT_EVENTS, /* sum of page-in/page-out for internal use */
> MEM_CGROUP_STAT_NSTATS,
> };
>
> @@ -105,6 +106,22 @@ static s64 mem_cgroup_local_usage(struct
> return ret;
> }
>
> +/* For intenal use of per-cpu event counting. */
> +
> +static inline void
> +__mem_cgroup_stat_reset_safe(struct mem_cgroup_stat_cpu *stat,
> + enum mem_cgroup_stat_index idx)
> +{
> + stat->count[idx] = 0;
> +}
Why do we do this and why do we need a special event?
> +
> +static inline s64
> +__mem_cgroup_stat_read_local(struct mem_cgroup_stat_cpu *stat,
> + enum mem_cgroup_stat_index idx)
> +{
> + return stat->count[idx];
> +}
> +
> /*
> * per-zone information in memory controller.
> */
> @@ -235,6 +252,8 @@ static void mem_cgroup_charge_statistics
> else
> __mem_cgroup_stat_add_safe(cpustat,
> MEM_CGROUP_STAT_PGPGOUT_COUNT, 1);
> + __mem_cgroup_stat_add_safe(cpustat, MEM_CGROUP_STAT_EVENTS, 1);
> +
> put_cpu();
> }
>
> @@ -897,9 +916,26 @@ static void record_last_oom(struct mem_c
> mem_cgroup_walk_tree(mem, NULL, record_last_oom_cb);
> }
>
> +#define SOFTLIMIT_EVENTS_THRESH (1024) /* 1024 times of page-in/out */
> +/*
> + * Returns true if sum of page-in/page-out events since last check is
> + * over SOFTLIMIT_EVENT_THRESH. (counter is per-cpu.)
> + */
> static bool mem_cgroup_soft_limit_check(struct mem_cgroup *mem)
> {
> - return false;
> + bool ret = false;
> + int cpu = get_cpu();
> + s64 val;
> + struct mem_cgroup_stat_cpu *cpustat;
> +
> + cpustat = &mem->stat.cpustat[cpu];
> + val = __mem_cgroup_stat_read_local(cpustat, MEM_CGROUP_STAT_EVENTS);
> + if (unlikely(val > SOFTLIMIT_EVENTS_THRESH)) {
> + __mem_cgroup_stat_reset_safe(cpustat, MEM_CGROUP_STAT_EVENTS);
> + ret = true;
> + }
> + put_cpu();
> + return ret;
> }
>
It is good to have the caller and the function in the same patch.
Otherwise, you'll notice unused warnings. I think this function can be
simplified further
1. Let's get rid of MEM_CGROUP_STAT_EVENTS
2. Let's rewrite mem_cgroup_soft_limit_check as
static bool mem_cgroup_soft_limit_check(struct mem_cgroup *mem)
{
bool ret = false;
int cpu = get_cpu();
s64 val, pgin, pgout;
struct mem_cgroup_stat_cpu *cpustat;
cpustat = &mem->stat.cpustat[cpu];
pgin = __mem_cgroup_stat_read_local(cpustat, MEM_CGROUP_STAT_PGPGIN_COUNT);
pgout = __mem_cgroup_stat_read_local(cpustat, MEM_CGROUP_STAT_PGPGOUT_COUNT);
val = pgin + pgout - mem->last_event_count;
if (unlikely(val > SOFTLIMIT_EVENTS_THRESH)) {
mem->last_event_count = pgin + pgout;
ret = true;
}
put_cpu();
return ret;
}
mem->last_event_count can either be atomic or protected using one of
the locks you intend to introduce. This will avoid the overhead of
incrementing event at every charge_statistics.
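The suggested filter is small enough to model outside the kernel. Here is a
sketch in Ruby (the language of the tool earlier in this thread) with
hypothetical names, purely to illustrate the fire-once-per-threshold
behaviour of the pgin+pgout counter:

```ruby
SOFTLIMIT_EVENTS_THRESH = 1024  # same threshold value as the patch

# Minimal model of the suggested check: report true roughly once per
# SOFTLIMIT_EVENTS_THRESH page-in/page-out events since the last report.
class EventFilter
  def initialize
    @last_event_count = 0
  end
  def check(pgin, pgout)
    val = pgin + pgout - @last_event_count
    if val > SOFTLIMIT_EVENTS_THRESH
      @last_event_count = pgin + pgout
      true
    else
      false
    end
  end
end
```

Unlike the per-cpu reset in the original patch, this keeps the running counters
intact and only remembers the level at which it last fired.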
> static void mem_cgroup_update_soft_limit(struct mem_cgroup *mem)
>
>
--
Balbir
* Re: [RFC][PATCH 4/9] soft limit queue and priority
2009-04-03 8:12 ` [RFC][PATCH 4/9] soft limit queue and priority KAMEZAWA Hiroyuki
@ 2009-04-06 11:05 ` Balbir Singh
2009-04-06 23:55 ` KAMEZAWA Hiroyuki
2009-04-06 18:42 ` Balbir Singh
1 sibling, 1 reply; 22+ messages in thread
From: Balbir Singh @ 2009-04-06 11:05 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
kosaki.motohiro@jp.fujitsu.com
* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-04-03 17:12:48]:
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>
> Softlimitq. for memcg.
>
> Implements an array of queue to list memcgs, array index is determined by
> the amount of memory usage excess the soft limit.
>
> While Balbir's one uses RB-tree and my old one used a per-zone queue
> (with round-robin), this is one of mixture of them.
> (I'd like to use rotation of queue in later patches)
>
> Priority is determined by following.
> Assume unit = total pages/1024. (the code uses different value)
> if excess is...
> < unit, priority = 0,
> < unit*2, priority = 1,
> < unit*2*2, priority = 2,
> ...
> < unit*2^9, priority = 9,
> < unit*2^10, priority = 10, (> 50% to total mem)
>
> This patch just includes queue management part and not includes
> selection logic from queue. Some trick will be used for selecting victims at
> soft limit in efficient way.
>
> And this equips 2 queues, for anon and file. Inset/Delete of both list is
> done at once but scan will be independent. (These 2 queues are used later.)
>
> Major difference from Balbir's one other than RB-tree is behavior under
> hierarchy. This one adds all children to queue by checking hierarchical
> priority. This is for helping per-zone usage check on victim-selection logic.
>
> Changelog: v1->v2
> - fixed comments.
> - change base size to exponent.
> - some micro optimization to reduce code size.
> - considering memory hotplug, it's not good to record a value calculated
> from totalram_pages at boot and using it later is bad manner. Fixed it.
> - removed soft_limit_lock (spinlock)
> - added soft_limit_update counter for avoiding multiple updates at once.
>
>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> ---
> mm/memcontrol.c | 118 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 117 insertions(+), 1 deletion(-)
>
> Index: softlimit-test2/mm/memcontrol.c
> ===================================================================
> --- softlimit-test2.orig/mm/memcontrol.c
> +++ softlimit-test2/mm/memcontrol.c
> @@ -192,7 +192,14 @@ struct mem_cgroup {
> atomic_t refcnt;
>
> unsigned int swappiness;
> -
> + /*
> + * For soft limit.
> + */
> + int soft_limit_priority;
> + struct list_head soft_limit_list[2];
> +#define SL_ANON (0)
> +#define SL_FILE (1)
Comments for the #define please.
> + atomic_t soft_limit_update;
> /*
> * statistics. This must be placed at the end of memcg.
> */
> @@ -938,11 +945,115 @@ static bool mem_cgroup_soft_limit_check(
> return ret;
> }
>
> +/*
> + * Assume "base_amount", and excess = usage - soft limit.
> + *
> + * 0...... if excess < base_amount
> + * 1...... if excess < base_amount * 2
> + * 2...... if excess < base_amount * 2^2
> + * 3.......if excess < base_amount * 2^3
> + * ....
> + * 9.......if excess < base_amount * 2^9
> + * 10 .....if excess < base_amount * 2^10
> + *
> + * base_amount is detemined from total pages in the system.
> + */
> +
> +#define SLQ_MAXPRIO (11)
> +static struct {
> + spinlock_t lock;
> + struct list_head queue[SLQ_MAXPRIO][2]; /* 0:anon 1:file */
> +} softlimitq;
> +
> +#define SLQ_PRIO_FACTOR (1024) /* 2^10 */
> +
> +static int __calc_soft_limit_prio(unsigned long excess)
> +{
> + unsigned long factor = totalram_pages /SLQ_PRIO_FACTOR;
I would prefer to use global_lru_pages()
> +
> + return fls(excess/factor);
> +}
> +
> +static int mem_cgroup_soft_limit_prio(struct mem_cgroup *mem)
> +{
> + unsigned long excess, max_excess = 0;
> + struct res_counter *c = &mem->res;
> +
> + do {
> + excess = res_counter_soft_limit_excess(c) >> PAGE_SHIFT;
> + if (max_excess < excess)
> + max_excess = excess;
max_excess = max(max_excess, excess)
> + c = c->parent;
> + } while (c);
> +
> + return __calc_soft_limit_prio(max_excess);
> +}
> +
> +static void __mem_cgroup_requeue(struct mem_cgroup *mem, int prio)
> +{
> + /* enqueue to softlimit queue */
> + int i;
> +
> + spin_lock(&softlimitq.lock);
> + if (prio != mem->soft_limit_priority) {
> + mem->soft_limit_priority = prio;
> + for (i = 0; i < 2; i++) {
> + list_del_init(&mem->soft_limit_list[i]);
> + list_add_tail(&mem->soft_limit_list[i],
> + &softlimitq.queue[prio][i]);
> + }
> + }
> + spin_unlock(&softlimitq.lock);
> +}
> +
> +static void __mem_cgroup_dequeue(struct mem_cgroup *mem)
> +{
> + int i;
> +
> + spin_lock(&softlimitq.lock);
> + for (i = 0; i < 2; i++)
> + list_del_init(&mem->soft_limit_list[i]);
> + spin_unlock(&softlimitq.lock);
> +}
> +
> +static int
> +__mem_cgroup_update_soft_limit_cb(struct mem_cgroup *mem, void *data)
> +{
> + int priority;
> + /* If someone updates, we don't need more */
> + priority = mem_cgroup_soft_limit_prio(mem);
> +
> + if (priority != mem->soft_limit_priority)
> + __mem_cgroup_requeue(mem, priority);
> + return 0;
> +}
> +
> static void mem_cgroup_update_soft_limit(struct mem_cgroup *mem)
> {
> + int priority;
> +
> + /* check status change */
> + priority = mem_cgroup_soft_limit_prio(mem);
> + if (priority != mem->soft_limit_priority &&
> + atomic_inc_return(&mem->soft_limit_update) > 1) {
> + mem_cgroup_walk_tree(mem, NULL,
> + __mem_cgroup_update_soft_limit_cb);
> + atomic_set(&mem->soft_limit_update, 0);
> + }
> return;
> }
>
> +static void softlimitq_init(void)
> +{
> + int i;
> +
> + spin_lock_init(&softlimitq.lock);
> + for (i = 0; i < SLQ_MAXPRIO; i++) {
> + INIT_LIST_HEAD(&softlimitq.queue[i][SL_ANON]);
> + INIT_LIST_HEAD(&softlimitq.queue[i][SL_FILE]);
> + }
> +}
> +
> /*
> * Unlike exported interface, "oom" parameter is added. if oom==true,
> * oom-killer can be invoked.
> @@ -2512,6 +2623,7 @@ mem_cgroup_create(struct cgroup_subsys *
> if (cont->parent == NULL) {
> enable_swap_cgroup();
> parent = NULL;
> + softlimitq_init();
> } else {
> parent = mem_cgroup_from_cont(cont->parent);
> mem->use_hierarchy = parent->use_hierarchy;
> @@ -2532,6 +2644,9 @@ mem_cgroup_create(struct cgroup_subsys *
> res_counter_init(&mem->memsw, NULL);
> }
> mem->last_scanned_child = 0;
> + mem->soft_limit_priority = 0;
> + INIT_LIST_HEAD(&mem->soft_limit_list[SL_ANON]);
> + INIT_LIST_HEAD(&mem->soft_limit_list[SL_FILE]);
> spin_lock_init(&mem->reclaim_param_lock);
>
> if (parent)
> @@ -2556,6 +2671,7 @@ static void mem_cgroup_destroy(struct cg
> {
> struct mem_cgroup *mem = mem_cgroup_from_cont(cont);
>
> + __mem_cgroup_dequeue(mem);
> mem_cgroup_put(mem);
> }
>
>
>
--
Balbir
* Re: [RFC][PATCH 4/9] soft limit queue and priority
2009-04-03 8:12 ` [RFC][PATCH 4/9] soft limit queue and priority KAMEZAWA Hiroyuki
2009-04-06 11:05 ` Balbir Singh
@ 2009-04-06 18:42 ` Balbir Singh
2009-04-06 23:54 ` KAMEZAWA Hiroyuki
1 sibling, 1 reply; 22+ messages in thread
From: Balbir Singh @ 2009-04-06 18:42 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
kosaki.motohiro@jp.fujitsu.com
* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-04-03 17:12:48]:
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>
> Softlimitq. for memcg.
>
> Implements an array of queue to list memcgs, array index is determined by
> the amount of memory usage excess the soft limit.
>
> While Balbir's one uses RB-tree and my old one used a per-zone queue
> (with round-robin), this is one of mixture of them.
> (I'd like to use rotation of queue in later patches)
>
> Priority is determined by following.
> Assume unit = total pages/1024. (the code uses different value)
> if excess is...
> < unit, priority = 0,
> < unit*2, priority = 1,
> < unit*2*2, priority = 2,
> ...
> < unit*2^9, priority = 9,
> < unit*2^10, priority = 10, (> 50% to total mem)
>
> This patch just includes queue management part and not includes
> selection logic from queue. Some trick will be used for selecting victims at
> soft limit in efficient way.
>
> And this equips 2 queues, for anon and file. Inset/Delete of both list is
> done at once but scan will be independent. (These 2 queues are used later.)
>
> Major difference from Balbir's one other than RB-tree is behavior under
> hierarchy. This one adds all children to queue by checking hierarchical
> priority. This is for helping per-zone usage check on victim-selection logic.
>
> Changelog: v1->v2
> - fixed comments.
> - change base size to exponent.
> - some micro optimization to reduce code size.
> - considering memory hotplug, it's not good to record a value calculated
> from totalram_pages at boot and using it later is bad manner. Fixed it.
> - removed soft_limit_lock (spinlock)
> - added soft_limit_update counter for avoiding multiple updates at once.
>
>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> ---
> mm/memcontrol.c | 118 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 117 insertions(+), 1 deletion(-)
>
> Index: softlimit-test2/mm/memcontrol.c
> ===================================================================
> --- softlimit-test2.orig/mm/memcontrol.c
> +++ softlimit-test2/mm/memcontrol.c
> @@ -192,7 +192,14 @@ struct mem_cgroup {
> atomic_t refcnt;
>
> unsigned int swappiness;
> -
> + /*
> + * For soft limit.
> + */
> + int soft_limit_priority;
> + struct list_head soft_limit_list[2];
Looking at the rest of the code in the patch, it is not apparent as to
why we need two list_heads/array of list_heads?
--
Balbir
* Re: [RFC][PATCH 4/9] soft limit queue and priority
2009-04-06 18:42 ` Balbir Singh
@ 2009-04-06 23:54 ` KAMEZAWA Hiroyuki
0 siblings, 0 replies; 22+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-04-06 23:54 UTC (permalink / raw)
To: balbir
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
kosaki.motohiro@jp.fujitsu.com
On Tue, 7 Apr 2009 00:12:21 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-04-03 17:12:48]:
>
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> >
> > Softlimitq. for memcg.
> >
> > Implements an array of queue to list memcgs, array index is determined by
> > the amount of memory usage excess the soft limit.
> >
> > While Balbir's one uses RB-tree and my old one used a per-zone queue
> > (with round-robin), this is one of mixture of them.
> > (I'd like to use rotation of queue in later patches)
> >
> > Priority is determined by following.
> > Assume unit = total pages/1024. (the code uses different value)
> > if excess is...
> > < unit, priority = 0,
> > < unit*2, priority = 1,
> > < unit*2*2, priority = 2,
> > ...
> > < unit*2^9, priority = 9,
> > < unit*2^10, priority = 10, (> 50% to total mem)
> >
> > This patch includes just the queue management part, not the selection
> > logic from the queue. Some trick will be used to select victims at the
> > soft limit in an efficient way.
> >
> > And this equips 2 queues, for anon and file. Insertion/deletion into both
> > lists is done at once, but scans will be independent. (These 2 queues are used later.)
> >
> > The major difference from Balbir's version, other than the RB-tree, is the
> > behavior under hierarchy. This one adds all children to the queue by checking
> > hierarchical priority. This helps the per-zone usage check in the victim-selection logic.
> >
> > Changelog: v1->v2
> > - fixed comments.
> > - change base size to exponent.
> > - some micro optimization to reduce code size.
> > - considering memory hotplug, it's bad manner to record a value calculated
> > from totalram_pages at boot and use it later. Fixed it.
> > - removed soft_limit_lock (spinlock)
> > - added a soft_limit_update counter to avoid multiple updates at once.
> >
> >
> > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > ---
> > mm/memcontrol.c | 118 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
> > 1 file changed, 117 insertions(+), 1 deletion(-)
> >
> > Index: softlimit-test2/mm/memcontrol.c
> > ===================================================================
> > --- softlimit-test2.orig/mm/memcontrol.c
> > +++ softlimit-test2/mm/memcontrol.c
> > @@ -192,7 +192,14 @@ struct mem_cgroup {
> > atomic_t refcnt;
> >
> > unsigned int swappiness;
> > -
> > + /*
> > + * For soft limit.
> > + */
> > + int soft_limit_priority;
> > + struct list_head soft_limit_list[2];
>
> Looking at the rest of the code in the patch, it is not apparent
> why we need two list_heads/an array of list_heads.
>
Considering LRU rotation, reclaim is done per anon/file list in each zone:
ACTIVE -> INACTIVE -> out.
And there can be 'file only' and 'anon only' cgroups.
Then, we have 2 design choices:
1. Use one list for selecting victims.
   If the target memory type (FILE/ANON) is empty, select another victim.
2. Use two lists for selecting victims.
   FILE and ANON victim selection can be done independently of each other.
This series uses "2", because "1" can make the "ticket" parameter useless in
victim selection.
Sorry for short text.
Thanks,
-Kame
* Re: [RFC][PATCH 4/9] soft limit queue and priority
2009-04-06 11:05 ` Balbir Singh
@ 2009-04-06 23:55 ` KAMEZAWA Hiroyuki
0 siblings, 0 replies; 22+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-04-06 23:55 UTC (permalink / raw)
To: balbir
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
kosaki.motohiro@jp.fujitsu.com
On Mon, 6 Apr 2009 16:35:34 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-04-03 17:12:48]:
>
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> >
> > Softlimitq. for memcg.
> >
> > Implements an array of queues to list memcgs; the array index is determined
> > by the amount by which memory usage exceeds the soft limit.
> >
> > While Balbir's version uses an RB-tree and my old one used a per-zone queue
> > (with round-robin), this one is a mixture of the two.
> > (I'd like to use rotation of the queue in later patches.)
> >
> > Priority is determined by the following.
> > Assume unit = total pages/1024. (the code uses different value)
> > if excess is...
> > < unit, priority = 0,
> > < unit*2, priority = 1,
> > < unit*2*2, priority = 2,
> > ...
> > < unit*2^9, priority = 9,
> > < unit*2^10, priority = 10, (> 50% to total mem)
> >
> > This patch includes just the queue management part, not the selection
> > logic from the queue. Some trick will be used to select victims at the
> > soft limit in an efficient way.
> >
> > And this equips 2 queues, for anon and file. Insertion/deletion into both
> > lists is done at once, but scans will be independent. (These 2 queues are used later.)
> >
> > The major difference from Balbir's version, other than the RB-tree, is the
> > behavior under hierarchy. This one adds all children to the queue by checking
> > hierarchical priority. This helps the per-zone usage check in the victim-selection logic.
> >
> > Changelog: v1->v2
> > - fixed comments.
> > - change base size to exponent.
> > - some micro optimization to reduce code size.
> > - considering memory hotplug, it's bad manner to record a value calculated
> > from totalram_pages at boot and use it later. Fixed it.
> > - removed soft_limit_lock (spinlock)
> > - added a soft_limit_update counter to avoid multiple updates at once.
> >
> >
> > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > ---
> > mm/memcontrol.c | 118 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
> > 1 file changed, 117 insertions(+), 1 deletion(-)
> >
> > Index: softlimit-test2/mm/memcontrol.c
> > ===================================================================
> > --- softlimit-test2.orig/mm/memcontrol.c
> > +++ softlimit-test2/mm/memcontrol.c
> > @@ -192,7 +192,14 @@ struct mem_cgroup {
> > atomic_t refcnt;
> >
> > unsigned int swappiness;
> > -
> > + /*
> > + * For soft limit.
> > + */
> > + int soft_limit_priority;
> > + struct list_head soft_limit_list[2];
> > +#define SL_ANON (0)
> > +#define SL_FILE (1)
>
> Comments for the #define please.
>
Sure.
> > + atomic_t soft_limit_update;
> > /*
> > * statistics. This must be placed at the end of memcg.
> > */
> > @@ -938,11 +945,115 @@ static bool mem_cgroup_soft_limit_check(
> > return ret;
> > }
> >
> > +/*
> > + * Assume "base_amount", and excess = usage - soft limit.
> > + *
> > + * 0...... if excess < base_amount
> > + * 1...... if excess < base_amount * 2
> > + * 2...... if excess < base_amount * 2^2
> > + * 3.......if excess < base_amount * 2^3
> > + * ....
> > + * 9.......if excess < base_amount * 2^9
> > + * 10 .....if excess < base_amount * 2^10
> > + *
> > + * base_amount is detemined from total pages in the system.
> > + */
> > +
> > +#define SLQ_MAXPRIO (11)
> > +static struct {
> > + spinlock_t lock;
> > + struct list_head queue[SLQ_MAXPRIO][2]; /* 0:anon 1:file */
> > +} softlimitq;
> > +
> > +#define SLQ_PRIO_FACTOR (1024) /* 2^10 */
> > +
> > +static int __calc_soft_limit_prio(unsigned long excess)
> > +{
> > + unsigned long factor = totalram_pages /SLQ_PRIO_FACTOR;
>
> I would prefer to use global_lru_pages()
>
Hmm, ok.
Thanks,
-Kame
* Re: [RFC][PATCH 3/9] soft limit update filter
2009-04-06 9:43 ` Balbir Singh
@ 2009-04-07 0:04 ` KAMEZAWA Hiroyuki
2009-04-07 2:26 ` Balbir Singh
0 siblings, 1 reply; 22+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-04-07 0:04 UTC (permalink / raw)
To: balbir
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
kosaki.motohiro@jp.fujitsu.com
On Mon, 6 Apr 2009 15:13:51 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-04-03 17:12:02]:
>
> > No changes from v1.
> > ==
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> >
> > Checking/updating soft limit information at every charge is overkill, so
> > we need some filter.
> >
> > This patch tries to count events in the memcg and, if events > threshold,
> > tries to update the memcg's soft limit status and reset the event counter to 0.
> >
> > The event counter is maintained per-cpu, which is already in use, so in
> > theory there is no significant overhead (extra cache misses etc.).
> >
> > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > ---
> > Index: mmotm-2.6.29-Mar23/mm/memcontrol.c
> > ===================================================================
> > --- mmotm-2.6.29-Mar23.orig/mm/memcontrol.c
> > +++ mmotm-2.6.29-Mar23/mm/memcontrol.c
> > @@ -66,6 +66,7 @@ enum mem_cgroup_stat_index {
> > MEM_CGROUP_STAT_PGPGIN_COUNT, /* # of pages paged in */
> > MEM_CGROUP_STAT_PGPGOUT_COUNT, /* # of pages paged out */
> >
> > + MEM_CGROUP_STAT_EVENTS, /* sum of page-in/page-out for internal use */
> > MEM_CGROUP_STAT_NSTATS,
> > };
> >
> > @@ -105,6 +106,22 @@ static s64 mem_cgroup_local_usage(struct
> > return ret;
> > }
> >
> > +/* For intenal use of per-cpu event counting. */
> > +
> > +static inline void
> > +__mem_cgroup_stat_reset_safe(struct mem_cgroup_stat_cpu *stat,
> > + enum mem_cgroup_stat_index idx)
> > +{
> > + stat->count[idx] = 0;
> > +}
>
> Why do we do this and why do we need a special event?
>
2 points.
1. We do "reset" this counter.
2. We're counting page-in/page-out. I wonder whether I should count others...
> > +
> > +static inline s64
> > +__mem_cgroup_stat_read_local(struct mem_cgroup_stat_cpu *stat,
> > + enum mem_cgroup_stat_index idx)
> > +{
> > + return stat->count[idx];
> > +}
> > +
> > /*
> > * per-zone information in memory controller.
> > */
> > @@ -235,6 +252,8 @@ static void mem_cgroup_charge_statistics
> > else
> > __mem_cgroup_stat_add_safe(cpustat,
> > MEM_CGROUP_STAT_PGPGOUT_COUNT, 1);
> > + __mem_cgroup_stat_add_safe(cpustat, MEM_CGROUP_STAT_EVENTS, 1);
> > +
> > put_cpu();
> > }
> >
> > @@ -897,9 +916,26 @@ static void record_last_oom(struct mem_c
> > mem_cgroup_walk_tree(mem, NULL, record_last_oom_cb);
> > }
> >
> > +#define SOFTLIMIT_EVENTS_THRESH (1024) /* 1024 times of page-in/out */
> > +/*
> > + * Returns true if sum of page-in/page-out events since last check is
> > + * over SOFTLIMIT_EVENT_THRESH. (counter is per-cpu.)
> > + */
> > static bool mem_cgroup_soft_limit_check(struct mem_cgroup *mem)
> > {
> > - return false;
> > + bool ret = false;
> > + int cpu = get_cpu();
> > + s64 val;
> > + struct mem_cgroup_stat_cpu *cpustat;
> > +
> > + cpustat = &mem->stat.cpustat[cpu];
> > + val = __mem_cgroup_stat_read_local(cpustat, MEM_CGROUP_STAT_EVENTS);
> > + if (unlikely(val > SOFTLIMIT_EVENTS_THRESH)) {
> > + __mem_cgroup_stat_reset_safe(cpustat, MEM_CGROUP_STAT_EVENTS);
> > + ret = true;
> > + }
> > + put_cpu();
> > + return ret;
> > }
> >
>
> It is good to have the caller and the function in the same patch.
> Otherwise, you'll get unused-function warnings. I think this function can be
> simplified further:
>
> 1. Let's get rid of MEM_CGROUP_STAT_EVENTS
> 2. Let's rewrite mem_cgroup_soft_limit_check as
>
> static bool mem_cgroup_soft_limit_check(struct mem_cgroup *mem)
> {
> bool ret = false;
> int cpu = get_cpu();
> s64 pgin, pgout, val;
> struct mem_cgroup_stat_cpu *cpustat;
>
> cpustat = &mem->stat.cpustat[cpu];
> pgin = __mem_cgroup_stat_read_local(cpustat, MEM_CGROUP_STAT_PGPGIN_COUNT);
> pgout = __mem_cgroup_stat_read_local(cpustat, MEM_CGROUP_STAT_PGPGOUT_COUNT);
> val = pgin + pgout - mem->last_event_count;
> if (unlikely(val > SOFTLIMIT_EVENTS_THRESH)) {
> mem->last_event_count = pgin + pgout;
> ret = true;
> }
> put_cpu();
> return ret;
> }
>
> mem->last_event_count can either be atomic or protected using one of
> the locks you intend to introduce. This will avoid the overhead of
> incrementing event at every charge_statistics.
>
Incrementing always hits the cache anyway.
Hmm, by making mem->last_event_count per-cpu, we can do the above, and there
would likely be no difference from the current code. But since you don't seem
to like the extra counting, it's ok to change the shape.
Thanks,
-Kame
* Re: [RFC][PATCH 0/9] memcg soft limit v2 (new design)
2009-04-06 9:08 ` [RFC][PATCH 0/9] memcg soft limit v2 (new design) Balbir Singh
@ 2009-04-07 0:16 ` KAMEZAWA Hiroyuki
0 siblings, 0 replies; 22+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-04-07 0:16 UTC (permalink / raw)
To: balbir
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
kosaki.motohiro@jp.fujitsu.com
On Mon, 6 Apr 2009 14:38:00 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-04-03 17:08:35]:
>
> > Hi,
> >
> > Memory cgroup's soft limit feature is a feature to tell global LRU
> > "please reclaim from this memcg at memory shortage".
> >
> > This is v2. Fixed some troubles under hierarchy, and added soft limit
> > update hooks at proper places.
> >
> > This patch is on to
> > mmotom-Mar23 + memcg-cleanup-cache_charge.patch
> > + vmscan-fix-it-to-take-care-of-nodemask.patch
> >
> > So, not for wide use ;)
> >
> > This patch tries to avoid using the existing memcg reclaim routine and
> > just tells "hints" to the global LRU. This patch is briefly tested and shows
> > good results for me. (But maybe not for you; please blame me.)
> >
> > Major characteristics are:
> > - a memcg will be inserted into the softlimit-queue at charge() if usage
> > exceeds the soft limit.
> > - the softlimit-queue is a queue with priority; priority is determined by
> > the size of the excess usage.
>
> This is critical and good that you have this now. In my patchset, it
> helps me achieve a lot of the expected functionality.
>
> > - memcg's soft limit hooks are called by shrink_xxx_list() to show hints.
>
> I am not too happy with moving pages in the global LRU based on soft
> limits, per my comments earlier. My objection is not too strong,
> since reclaiming from the memcg also exhibits functionally similar
> behaviour.
Yes, there is not much difference from the memcg reclaim routine, other than
that this is called under scanning_global_lru() == true.
>
> > - Behavior is affected by vm.swappiness and LRU scan rate is determined by
> > global LRU's status.
> >
>
> I also have concerns about not sorting the list of memcgs. I need to
> write some scalability tests and check.
Ah yes, I admit scalability is my concern, too.
About sorting, this priority list uses an exponent as the parameter. Then,
when the excess is small, priority control is done under close observation;
when the excess is big, priority control is done under rough observation.
I'm wondering how big ->ticket can get, now.
>
> > In this v2.
> > - problems under use_hierarchy=1 case are fixed.
> > - more hooks are added.
> > - codes are cleaned up.
> >
> > Shows good results on my private box test under several work loads.
> >
> > But in a special artificial case, when the victim memcg's active/inactive
> > ratio of ANON is very different from the global LRU's, the result seems
> > not very good. i.e.
> > under the victim memcg, ACTIVE_ANON=100%, INACTIVE=0% (access memory in busy loop)
> > under global, ACTIVE_ANON=10%, INACTIVE=90% (almost all processes are sleeping.)
> > memory can be swapped out from the global LRU, not from the victim.
> > (If there are file caches in the victims, the file caches will go out.)
> >
> > But, in this case, even if we successfully swap out anon pages under the victim memcg,
> > they will come back to memory soon and can show heavy slashing.
>
> heavy slashing? Not sure I understand what you mean.
>
Heavy swap-in <-> swap-out, and user applications can't make progress.
Thanks,
-Kame
* Re: [RFC][PATCH 3/9] soft limit update filter
2009-04-07 0:04 ` KAMEZAWA Hiroyuki
@ 2009-04-07 2:26 ` Balbir Singh
0 siblings, 0 replies; 22+ messages in thread
From: Balbir Singh @ 2009-04-07 2:26 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
kosaki.motohiro@jp.fujitsu.com
* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-04-07 09:04:38]:
> On Mon, 6 Apr 2009 15:13:51 +0530
> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>
> > * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-04-03 17:12:02]:
> >
> > > No changes from v1.
> > > ==
> > > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > >
> > > Checking/updating soft limit information at every charge is overkill, so
> > > we need some filter.
> > >
> > > This patch tries to count events in the memcg and, if events > threshold,
> > > tries to update the memcg's soft limit status and reset the event counter to 0.
> > >
> > > The event counter is maintained per-cpu, which is already in use, so in
> > > theory there is no significant overhead (extra cache misses etc.).
> > >
> > > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > > ---
> > > Index: mmotm-2.6.29-Mar23/mm/memcontrol.c
> > > ===================================================================
> > > --- mmotm-2.6.29-Mar23.orig/mm/memcontrol.c
> > > +++ mmotm-2.6.29-Mar23/mm/memcontrol.c
> > > @@ -66,6 +66,7 @@ enum mem_cgroup_stat_index {
> > > MEM_CGROUP_STAT_PGPGIN_COUNT, /* # of pages paged in */
> > > MEM_CGROUP_STAT_PGPGOUT_COUNT, /* # of pages paged out */
> > >
> > > + MEM_CGROUP_STAT_EVENTS, /* sum of page-in/page-out for internal use */
> > > MEM_CGROUP_STAT_NSTATS,
> > > };
> > >
> > > @@ -105,6 +106,22 @@ static s64 mem_cgroup_local_usage(struct
> > > return ret;
> > > }
> > >
> > > +/* For intenal use of per-cpu event counting. */
> > > +
> > > +static inline void
> > > +__mem_cgroup_stat_reset_safe(struct mem_cgroup_stat_cpu *stat,
> > > + enum mem_cgroup_stat_index idx)
> > > +{
> > > + stat->count[idx] = 0;
> > > +}
> >
> > Why do we do this and why do we need a special event?
> >
> 2 points.
>
> 1. we do "reset" this counter.
> 2. We're counting page-in/page-out. I wonder I should counter others...
>
> > > +
> > > +static inline s64
> > > +__mem_cgroup_stat_read_local(struct mem_cgroup_stat_cpu *stat,
> > > + enum mem_cgroup_stat_index idx)
> > > +{
> > > + return stat->count[idx];
> > > +}
> > > +
> > > /*
> > > * per-zone information in memory controller.
> > > */
> > > @@ -235,6 +252,8 @@ static void mem_cgroup_charge_statistics
> > > else
> > > __mem_cgroup_stat_add_safe(cpustat,
> > > MEM_CGROUP_STAT_PGPGOUT_COUNT, 1);
> > > + __mem_cgroup_stat_add_safe(cpustat, MEM_CGROUP_STAT_EVENTS, 1);
> > > +
> > > put_cpu();
> > > }
> > >
> > > @@ -897,9 +916,26 @@ static void record_last_oom(struct mem_c
> > > mem_cgroup_walk_tree(mem, NULL, record_last_oom_cb);
> > > }
> > >
> > > +#define SOFTLIMIT_EVENTS_THRESH (1024) /* 1024 times of page-in/out */
> > > +/*
> > > + * Returns true if sum of page-in/page-out events since last check is
> > > + * over SOFTLIMIT_EVENT_THRESH. (counter is per-cpu.)
> > > + */
> > > static bool mem_cgroup_soft_limit_check(struct mem_cgroup *mem)
> > > {
> > > - return false;
> > > + bool ret = false;
> > > + int cpu = get_cpu();
> > > + s64 val;
> > > + struct mem_cgroup_stat_cpu *cpustat;
> > > +
> > > + cpustat = &mem->stat.cpustat[cpu];
> > > + val = __mem_cgroup_stat_read_local(cpustat, MEM_CGROUP_STAT_EVENTS);
> > > + if (unlikely(val > SOFTLIMIT_EVENTS_THRESH)) {
> > > + __mem_cgroup_stat_reset_safe(cpustat, MEM_CGROUP_STAT_EVENTS);
> > > + ret = true;
> > > + }
> > > + put_cpu();
> > > + return ret;
> > > }
> > >
> >
> > It is good to have the caller and the function in the same patch.
> > Otherwise, you'll get unused-function warnings. I think this function can be
> > simplified further:
> >
> > 1. Let's get rid of MEM_CGROUP_STAT_EVENTS
> > 2. Let's rewrite mem_cgroup_soft_limit_check as
> >
> > static bool mem_cgroup_soft_limit_check(struct mem_cgroup *mem)
> > {
> > bool ret = false;
> > int cpu = get_cpu();
> > s64 pgin, pgout, val;
> > struct mem_cgroup_stat_cpu *cpustat;
> >
> > cpustat = &mem->stat.cpustat[cpu];
> > pgin = __mem_cgroup_stat_read_local(cpustat, MEM_CGROUP_STAT_PGPGIN_COUNT);
> > pgout = __mem_cgroup_stat_read_local(cpustat, MEM_CGROUP_STAT_PGPGOUT_COUNT);
> > val = pgin + pgout - mem->last_event_count;
> > if (unlikely(val > SOFTLIMIT_EVENTS_THRESH)) {
> > mem->last_event_count = pgin + pgout;
> > ret = true;
> > }
> > put_cpu();
> > return ret;
> > }
> >
> > mem->last_event_count can either be atomic or protected using one of
> > the locks you intend to introduce. This will avoid the overhead of
> > incrementing event at every charge_statistics.
> >
> Incrementing always hits cache.
>
> Hmm, making mem->last_event_count as per-cpu, we can do above. And maybe no
> difference with current code. But you don't seem to like counting,
> it's ok to change the shape.
>
I was wondering why we were adding another EVENT counter when we can sum up
pgpgin and pgpgout, but we already have the infrastructure to make EVENT
per-cpu, so let's stick with it for now.
--
Balbir
* Re: [RFC][PATCH 0/9] memcg soft limit v2 (new design)
2009-04-03 8:08 [RFC][PATCH 0/9] memcg soft limit v2 (new design) KAMEZAWA Hiroyuki
` (10 preceding siblings ...)
2009-04-06 9:08 ` [RFC][PATCH 0/9] memcg soft limit v2 (new design) Balbir Singh
@ 2009-04-24 12:24 ` Balbir Singh
2009-04-24 15:19 ` KAMEZAWA Hiroyuki
11 siblings, 1 reply; 22+ messages in thread
From: Balbir Singh @ 2009-04-24 12:24 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
kosaki.motohiro@jp.fujitsu.com
* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-04-03 17:08:35]:
> Hi,
>
> Memory cgroup's soft limit feature is a feature to tell global LRU
> "please reclaim from this memcg at memory shortage".
>
> This is v2. Fixed some troubles under hierarchy, and added soft limit
> update hooks at proper places.
>
> This patch is on to
> mmotom-Mar23 + memcg-cleanup-cache_charge.patch
> + vmscan-fix-it-to-take-care-of-nodemask.patch
>
> So, not for wide use ;)
>
> This patch tries to avoid using the existing memcg reclaim routine and
> just tells "hints" to the global LRU. This patch is briefly tested and shows
> good results for me. (But maybe not for you; please blame me.)
>
> Major characteristics are:
> - a memcg will be inserted into the softlimit-queue at charge() if usage
> exceeds the soft limit.
> - the softlimit-queue is a queue with priority; priority is determined by
> the size of the excess usage.
> - memcg's soft limit hooks are called by shrink_xxx_list() to show hints.
> - Behavior is affected by vm.swappiness and LRU scan rate is determined by
> global LRU's status.
>
> In this v2.
> - problems under use_hierarchy=1 case are fixed.
> - more hooks are added.
> - codes are cleaned up.
>
The results seem good so far with some basic tests I've been doing.
I'll come back with more feedback; I would like to see this feature in
-mm soon.
--
Balbir
* Re: [RFC][PATCH 0/9] memcg soft limit v2 (new design)
2009-04-24 12:24 ` Balbir Singh
@ 2009-04-24 15:19 ` KAMEZAWA Hiroyuki
0 siblings, 0 replies; 22+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-04-24 15:19 UTC (permalink / raw)
To: balbir
Cc: KAMEZAWA Hiroyuki, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, kosaki.motohiro@jp.fujitsu.com
Balbir Singh wrote:
> * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-04-03
> 17:08:35]:
>> In this v2.
>> - problems under use_hierarchy=1 case are fixed.
>> - more hooks are added.
>> - codes are cleaned up.
>>
>
> The results seem good so far with some basic tests I've been doing.
> I'll come back with more feedback, I would like to see this feature in
> -mm soon.
>
Thank you. I'll update this. But right now I have a bugfix patch for stale
swap caches (in cooperation with Nishimura), so I'll go ahead one by one.
Regards,
-Kame
Thread overview: 22+ messages
2009-04-03 8:08 [RFC][PATCH 0/9] memcg soft limit v2 (new design) KAMEZAWA Hiroyuki
2009-04-03 8:09 ` [RFC][PATCH 1/9] " KAMEZAWA Hiroyuki
2009-04-03 8:10 ` [RFC][PATCH 2/9] soft limit framework for memcg KAMEZAWA Hiroyuki
2009-04-03 8:12 ` [RFC][PATCH 3/9] soft limit update filter KAMEZAWA Hiroyuki
2009-04-06 9:43 ` Balbir Singh
2009-04-07 0:04 ` KAMEZAWA Hiroyuki
2009-04-07 2:26 ` Balbir Singh
2009-04-03 8:12 ` [RFC][PATCH 4/9] soft limit queue and priority KAMEZAWA Hiroyuki
2009-04-06 11:05 ` Balbir Singh
2009-04-06 23:55 ` KAMEZAWA Hiroyuki
2009-04-06 18:42 ` Balbir Singh
2009-04-06 23:54 ` KAMEZAWA Hiroyuki
2009-04-03 8:13 ` [RFC][PATCH 5/9] add more hooks and check in lazy manner KAMEZAWA Hiroyuki
2009-04-03 8:14 ` [RFC][PATCH 6/9] active inactive ratio for private KAMEZAWA Hiroyuki
2009-04-03 8:15 ` [RFC][PATCH 7/9] vicitim selection logic KAMEZAWA Hiroyuki
2009-04-03 8:17 ` [RFC][PATCH 8/9] lru reordering KAMEZAWA Hiroyuki
2009-04-03 8:18 ` [RFC][PATCH 9/9] more event filter depend on priority KAMEZAWA Hiroyuki
2009-04-03 8:24 ` [RFC][PATCH ex/9] for debug KAMEZAWA Hiroyuki
2009-04-06 9:08 ` [RFC][PATCH 0/9] memcg soft limit v2 (new design) Balbir Singh
2009-04-07 0:16 ` KAMEZAWA Hiroyuki
2009-04-24 12:24 ` Balbir Singh
2009-04-24 15:19 ` KAMEZAWA Hiroyuki