[RFC] [PATCH 0/4] Add soft limits to the memory controller

* [RFC] [PATCH 0/4] Add soft limits to the memory controller
@ 2008-02-13 15:12 Balbir Singh
  2008-02-13 15:12 ` [RFC] [PATCH 1/4] Modify resource counters to add soft limit support Balbir Singh
                   ` (3 more replies)
  0 siblings, 4 replies; 26+ messages in thread
From: Balbir Singh @ 2008-02-13 15:12 UTC (permalink / raw)
  To: linux-mm
  Cc: Nick Piggin, Paul Menage, Hugh Dickins, YAMAMOTO Takashi,
	Herbert Poetzl, Peter Zijlstra, Lee Schermerhorn,
	Eric W. Biederman, David Rientjes, Pavel Emelianov, Balbir Singh,
	Rik Van Riel, Andrew Morton, KAMEZAWA Hiroyuki

This patchset implements the basic changes required to implement soft limits
in the memory controller. A soft limit is a variation of the currently
supported hard limit feature. A memory cgroup can exceed its soft limit
provided there is no contention for memory.
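
As a rough illustration of the semantics described above, here is a minimal
userspace sketch. It is not the kernel's struct res_counter; the struct and
function names (limit_model, limit_model_charge, limit_model_over_soft) are
hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; not the kernel's struct res_counter. */
struct limit_model {
	unsigned long long usage;
	unsigned long long soft_limit;	/* may be exceeded while memory is free */
	unsigned long long hard_limit;	/* may never be exceeded */
};

/* Returns 0 on success, -1 if the charge would cross the hard limit. */
static int limit_model_charge(struct limit_model *m, unsigned long long val)
{
	assert(m != NULL);
	if (m->usage + val > m->hard_limit)
		return -1;
	m->usage += val;
	return 0;
}

/* True when usage is above the soft limit, i.e. a reclaim candidate. */
static bool limit_model_over_soft(const struct limit_model *m)
{
	assert(m != NULL);
	return m->usage > m->soft_limit;
}
```

In this model a charge that crosses the soft limit still succeeds; it only
marks the group as a candidate for reclaim once memory becomes contended.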

These patches were tested under KVM and on a PowerPC box, by running
programs in parallel and checking their behaviour for various soft limit
values.

These patches were developed on top of 2.6.24-mm1. Comments, suggestions,
criticism are all welcome!

TODOs:

1. Currently there is no ordering of memory cgroups over their soft limit.
   We use a simple linked list to maintain the list of groups over their
   limit. In the future, we might want to create a heap of objects ordered
   by the amount by which they exceed their soft limit.
2. Distribute the excess (non-contended) resources between groups
   in the ratio of their soft limits.
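
One possible shape for the two TODO items, as a userspace sketch only. It
uses a sorted array rather than a heap, and every name in it (excess_group,
order_by_excess, split_spare_by_soft_limit) is hypothetical. It also assumes
that spare * soft_limit does not overflow 64 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a group; not struct mem_cgroup. */
struct excess_group {
	unsigned long long usage;
	unsigned long long soft_limit;
};

/* Bytes by which a group exceeds its soft limit, or 0 if it does not. */
static unsigned long long group_excess(const struct excess_group *g)
{
	return g->usage > g->soft_limit ? g->usage - g->soft_limit : 0;
}

/* qsort() comparator: the group with the largest excess sorts first. */
static int cmp_excess_desc(const void *a, const void *b)
{
	const struct excess_group *ga = a;
	const struct excess_group *gb = b;
	unsigned long long ea = group_excess(ga);
	unsigned long long eb = group_excess(gb);

	return (ea < eb) - (ea > eb);
}

/* TODO 1: order groups by the amount they exceed their soft limit. */
static void order_by_excess(struct excess_group *groups, size_t n)
{
	qsort(groups, n, sizeof(groups[0]), cmp_excess_desc);
}

/*
 * TODO 2: split `spare` bytes between groups in the ratio of their
 * soft limits. Assumes spare * soft_limit fits in 64 bits.
 */
static void split_spare_by_soft_limit(unsigned long long spare,
				      const struct excess_group *groups,
				      unsigned long long *shares, size_t n)
{
	unsigned long long total = 0;
	size_t i;

	assert(groups != NULL && shares != NULL);
	for (i = 0; i < n; i++)
		total += groups[i].soft_limit;
	for (i = 0; i < n; i++)
		shares[i] = total ? spare * groups[i].soft_limit / total : 0;
}
```

A heap would avoid re-sorting on every change; the sorted array is used here
only because it keeps the sketch short.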

series
------
memory-controller-res_counters-soft-limit-setup.patch
memory-controller-add-soft-limit-interface.patch
memory-controller-reclaim-on-contention.patch
memory-controller-add-soft-limit-documentation.patch

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org


* [RFC] [PATCH 1/4] Modify resource counters to add soft limit support
  2008-02-13 15:12 [RFC] [PATCH 0/4] Add soft limits to the memory controller Balbir Singh
@ 2008-02-13 15:12 ` Balbir Singh
  2008-02-13 17:12   ` Pavel Emelyanov
  2008-02-13 15:12 ` [RFC] [PATCH 2/4] Add the soft limit interface Balbir Singh
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 26+ messages in thread
From: Balbir Singh @ 2008-02-13 15:12 UTC (permalink / raw)
  To: linux-mm
  Cc: Hugh Dickins, Peter Zijlstra, YAMAMOTO Takashi, Paul Menage,
	Lee Schermerhorn, Nick Piggin, Eric W. Biederman, David Rientjes,
	Andrew Morton, Pavel Emelianov, Balbir Singh, Rik Van Riel,
	Herbert Poetzl, KAMEZAWA Hiroyuki


The resource counter member limit is split into soft and hard limits.
The same locking rules apply to both limits.

Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
---

 include/linux/res_counter.h |   34 ++++++++++++++++++++++++++--------
 kernel/res_counter.c        |   11 +++++++----
 mm/memcontrol.c             |   10 +++++-----
 3 files changed, 38 insertions(+), 17 deletions(-)

diff -puN mm/vmscan.c~memory-controller-res_counters-soft-limit-setup mm/vmscan.c
diff -puN mm/memcontrol.c~memory-controller-res_counters-soft-limit-setup mm/memcontrol.c
--- linux-2.6.24/mm/memcontrol.c~memory-controller-res_counters-soft-limit-setup	2008-02-13 19:50:24.000000000 +0530
+++ linux-2.6.24-balbir/mm/memcontrol.c	2008-02-13 19:50:24.000000000 +0530
@@ -568,7 +568,7 @@ unsigned long mem_cgroup_isolate_pages(u
  * Charge the memory controller for page usage.
  * Return
  * 0 if the charge was successful
- * < 0 if the cgroup is over its limit
+ * < 0 if the cgroup is over its hard limit
  */
 static int mem_cgroup_charge_common(struct page *page, struct mm_struct *mm,
 				gfp_t gfp_mask, enum charge_type ctype)
@@ -632,7 +632,7 @@ retry:
 
 	/*
 	 * If we created the page_cgroup, we should free it on exceeding
-	 * the cgroup limit.
+	 * the cgroup hard limit.
 	 */
 	while (res_counter_charge(&mem->res, PAGE_SIZE)) {
 		if (!(gfp_mask & __GFP_WAIT))
@@ -645,10 +645,10 @@ retry:
  		 * try_to_free_mem_cgroup_pages() might not give us a full
  		 * picture of reclaim. Some pages are reclaimed and might be
  		 * moved to swap cache or just unmapped from the cgroup.
- 		 * Check the limit again to see if the reclaim reduced the
+ 		 * Check the hard limit again to see if the reclaim reduced the
  		 * current usage of the cgroup before giving up
  		 */
-		if (res_counter_check_under_limit(&mem->res))
+		if (res_counter_check_under_limit(&mem->res, RES_HARD_LIMIT))
 			continue;
 
 		if (!nr_retries--) {
@@ -1028,7 +1028,7 @@ static struct cftype mem_cgroup_files[] 
 	},
 	{
 		.name = "limit_in_bytes",
-		.private = RES_LIMIT,
+		.private = RES_HARD_LIMIT,
 		.write = mem_cgroup_write,
 		.read = mem_cgroup_read,
 	},
diff -puN kernel/res_counter.c~memory-controller-res_counters-soft-limit-setup kernel/res_counter.c
--- linux-2.6.24/kernel/res_counter.c~memory-controller-res_counters-soft-limit-setup	2008-02-13 19:50:24.000000000 +0530
+++ linux-2.6.24-balbir/kernel/res_counter.c	2008-02-13 19:50:24.000000000 +0530
@@ -16,12 +16,13 @@
 void res_counter_init(struct res_counter *counter)
 {
 	spin_lock_init(&counter->lock);
-	counter->limit = (unsigned long long)LLONG_MAX;
+	counter->soft_limit = (unsigned long long)LLONG_MAX;
+	counter->hard_limit = (unsigned long long)LLONG_MAX;
 }
 
 int res_counter_charge_locked(struct res_counter *counter, unsigned long val)
 {
-	if (counter->usage + val > counter->limit) {
+	if (counter->usage + val > counter->hard_limit) {
 		counter->failcnt++;
 		return -ENOMEM;
 	}
@@ -65,8 +66,10 @@ res_counter_member(struct res_counter *c
 	switch (member) {
 	case RES_USAGE:
 		return &counter->usage;
-	case RES_LIMIT:
-		return &counter->limit;
+	case RES_SOFT_LIMIT:
+		return &counter->soft_limit;
+	case RES_HARD_LIMIT:
+		return &counter->hard_limit;
 	case RES_FAILCNT:
 		return &counter->failcnt;
 	};
diff -puN include/linux/res_counter.h~memory-controller-res_counters-soft-limit-setup include/linux/res_counter.h
--- linux-2.6.24/include/linux/res_counter.h~memory-controller-res_counters-soft-limit-setup	2008-02-13 19:50:24.000000000 +0530
+++ linux-2.6.24-balbir/include/linux/res_counter.h	2008-02-13 19:50:24.000000000 +0530
@@ -27,7 +27,13 @@ struct res_counter {
 	/*
 	 * the limit that usage cannot exceed
 	 */
-	unsigned long long limit;
+	unsigned long long hard_limit;
+	/*
+	 * the limit that usage can exceed, but under memory
+	 * pressure, we will reclaim back memory above the
+	 * soft limit mark
+	 */
+	unsigned long long soft_limit;
 	/*
 	 * the number of unsuccessful attempts to consume the resource
 	 */
@@ -64,7 +70,8 @@ ssize_t res_counter_write(struct res_cou
 
 enum {
 	RES_USAGE,
-	RES_LIMIT,
+	RES_SOFT_LIMIT,
+	RES_HARD_LIMIT,
 	RES_FAILCNT,
 };
 
@@ -101,11 +108,21 @@ int res_counter_charge(struct res_counte
 void res_counter_uncharge_locked(struct res_counter *counter, unsigned long val);
 void res_counter_uncharge(struct res_counter *counter, unsigned long val);
 
-static inline bool res_counter_limit_check_locked(struct res_counter *cnt)
+static inline bool res_counter_limit_check_locked(struct res_counter *cnt,
+							int member)
 {
-	if (cnt->usage < cnt->limit)
-		return true;
-
+	switch (member) {
+	case RES_HARD_LIMIT:
+		if (cnt->usage < cnt->hard_limit)
+			return true;
+		break;
+	case RES_SOFT_LIMIT:
+		if (cnt->usage < cnt->soft_limit)
+			return true;
+		break;
+	default:
+		BUG_ON(1);
+	}
 	return false;
 }
 
@@ -113,13 +130,14 @@ static inline bool res_counter_limit_che
  * Helper function to detect if the cgroup is within it's limit or
  * not. It's currently called from cgroup_rss_prepare()
  */
-static inline bool res_counter_check_under_limit(struct res_counter *cnt)
+static inline bool res_counter_check_under_limit(struct res_counter *cnt,
+							int member)
 {
 	bool ret;
 	unsigned long flags;
 
 	spin_lock_irqsave(&cnt->lock, flags);
-	ret = res_counter_limit_check_locked(cnt);
+	ret = res_counter_limit_check_locked(cnt, member);
 	spin_unlock_irqrestore(&cnt->lock, flags);
 	return ret;
 }
diff -puN include/linux/memcontrol.h~memory-controller-res_counters-soft-limit-setup include/linux/memcontrol.h
_

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL


* Re: [RFC] [PATCH 1/4] Modify resource counters to add soft limit support
  2008-02-13 15:12 ` [RFC] [PATCH 1/4] Modify resource counters to add soft limit support Balbir Singh
@ 2008-02-13 17:12   ` Pavel Emelyanov
  2008-02-13 17:19     ` Balbir Singh
  0 siblings, 1 reply; 26+ messages in thread
From: Pavel Emelyanov @ 2008-02-13 17:12 UTC (permalink / raw)
  To: Balbir Singh
  Cc: linux-mm, Hugh Dickins, Peter Zijlstra, YAMAMOTO Takashi,
	Paul Menage, Lee Schermerhorn, Nick Piggin, Eric W. Biederman,
	David Rientjes, Andrew Morton, Rik Van Riel, Herbert Poetzl,
	KAMEZAWA Hiroyuki

Balbir Singh wrote:
> The resource counter member limit is split into soft and hard limits.
> The same locking rule apply for both limits.
> 
> Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
> ---
> 
>  include/linux/res_counter.h |   34 ++++++++++++++++++++++++++--------
>  kernel/res_counter.c        |   11 +++++++----
>  mm/memcontrol.c             |   10 +++++-----
>  3 files changed, 38 insertions(+), 17 deletions(-)
> 
> diff -puN mm/vmscan.c~memory-controller-res_counters-soft-limit-setup mm/vmscan.c
> diff -puN mm/memcontrol.c~memory-controller-res_counters-soft-limit-setup mm/memcontrol.c
> --- linux-2.6.24/mm/memcontrol.c~memory-controller-res_counters-soft-limit-setup	2008-02-13 19:50:24.000000000 +0530
> +++ linux-2.6.24-balbir/mm/memcontrol.c	2008-02-13 19:50:24.000000000 +0530
> @@ -568,7 +568,7 @@ unsigned long mem_cgroup_isolate_pages(u
>   * Charge the memory controller for page usage.
>   * Return
>   * 0 if the charge was successful
> - * < 0 if the cgroup is over its limit
> + * < 0 if the cgroup is over its hard limit
>   */
>  static int mem_cgroup_charge_common(struct page *page, struct mm_struct *mm,
>  				gfp_t gfp_mask, enum charge_type ctype)
> @@ -632,7 +632,7 @@ retry:
>  
>  	/*
>  	 * If we created the page_cgroup, we should free it on exceeding
> -	 * the cgroup limit.
> +	 * the cgroup hard limit.
>  	 */
>  	while (res_counter_charge(&mem->res, PAGE_SIZE)) {
>  		if (!(gfp_mask & __GFP_WAIT))
> @@ -645,10 +645,10 @@ retry:
>   		 * try_to_free_mem_cgroup_pages() might not give us a full
>   		 * picture of reclaim. Some pages are reclaimed and might be
>   		 * moved to swap cache or just unmapped from the cgroup.
> - 		 * Check the limit again to see if the reclaim reduced the
> + 		 * Check the hard limit again to see if the reclaim reduced the
>   		 * current usage of the cgroup before giving up
>   		 */
> -		if (res_counter_check_under_limit(&mem->res))
> +		if (res_counter_check_under_limit(&mem->res, RES_HARD_LIMIT))
>  			continue;
>  
>  		if (!nr_retries--) {
> @@ -1028,7 +1028,7 @@ static struct cftype mem_cgroup_files[] 
>  	},
>  	{
>  		.name = "limit_in_bytes",
> -		.private = RES_LIMIT,
> +		.private = RES_HARD_LIMIT,
>  		.write = mem_cgroup_write,
>  		.read = mem_cgroup_read,
>  	},
> diff -puN kernel/res_counter.c~memory-controller-res_counters-soft-limit-setup kernel/res_counter.c
> --- linux-2.6.24/kernel/res_counter.c~memory-controller-res_counters-soft-limit-setup	2008-02-13 19:50:24.000000000 +0530
> +++ linux-2.6.24-balbir/kernel/res_counter.c	2008-02-13 19:50:24.000000000 +0530
> @@ -16,12 +16,13 @@
>  void res_counter_init(struct res_counter *counter)
>  {
>  	spin_lock_init(&counter->lock);
> -	counter->limit = (unsigned long long)LLONG_MAX;
> +	counter->soft_limit = (unsigned long long)LLONG_MAX;
> +	counter->hard_limit = (unsigned long long)LLONG_MAX;
>  }
>  
>  int res_counter_charge_locked(struct res_counter *counter, unsigned long val)
>  {
> -	if (counter->usage + val > counter->limit) {
> +	if (counter->usage + val > counter->hard_limit) {
>  		counter->failcnt++;
>  		return -ENOMEM;
>  	}
> @@ -65,8 +66,10 @@ res_counter_member(struct res_counter *c
>  	switch (member) {
>  	case RES_USAGE:
>  		return &counter->usage;
> -	case RES_LIMIT:
> -		return &counter->limit;
> +	case RES_SOFT_LIMIT:
> +		return &counter->soft_limit;
> +	case RES_HARD_LIMIT:
> +		return &counter->hard_limit;
>  	case RES_FAILCNT:
>  		return &counter->failcnt;
>  	};
> diff -puN include/linux/res_counter.h~memory-controller-res_counters-soft-limit-setup include/linux/res_counter.h
> --- linux-2.6.24/include/linux/res_counter.h~memory-controller-res_counters-soft-limit-setup	2008-02-13 19:50:24.000000000 +0530
> +++ linux-2.6.24-balbir/include/linux/res_counter.h	2008-02-13 19:50:24.000000000 +0530
> @@ -27,7 +27,13 @@ struct res_counter {
>  	/*
>  	 * the limit that usage cannot exceed
>  	 */
> -	unsigned long long limit;
> +	unsigned long long hard_limit;
> +	/*
> +	 * the limit that usage can exceed, but under memory
> +	 * pressure, we will reclaim back memory above the
> +	 * soft limit mark
> +	 */

Resource counter accounts for arbitrary resource. Memory pressure
and memory reclamation both only make sense in case we're dealing
with memory controller. Please, remove this comment or move it to
memcontrol.c.

> +	unsigned long long soft_limit;
>  	/*
>  	 * the number of unsuccessful attempts to consume the resource
>  	 */
> @@ -64,7 +70,8 @@ ssize_t res_counter_write(struct res_cou
>  
>  enum {
>  	RES_USAGE,
> -	RES_LIMIT,
> +	RES_SOFT_LIMIT,
> +	RES_HARD_LIMIT,
>  	RES_FAILCNT,
>  };
>  
> @@ -101,11 +108,21 @@ int res_counter_charge(struct res_counte
>  void res_counter_uncharge_locked(struct res_counter *counter, unsigned long val);
>  void res_counter_uncharge(struct res_counter *counter, unsigned long val);
>  
> -static inline bool res_counter_limit_check_locked(struct res_counter *cnt)
> +static inline bool res_counter_limit_check_locked(struct res_counter *cnt,
> +							int member)
>  {
> -	if (cnt->usage < cnt->limit)
> -		return true;
> -
> +	switch (member) {
> +	case RES_HARD_LIMIT:
> +		if (cnt->usage < cnt->hard_limit)
> +			return true;
> +		break;
> +	case RES_SOFT_LIMIT:
> +		if (cnt->usage < cnt->soft_limit)
> +			return true;
> +		break;
> +	default:
> +		BUG_ON(1);
> +	}

Does the compiler optimize this when the member is a built in const?

>  	return false;
>  }
>  
> @@ -113,13 +130,14 @@ static inline bool res_counter_limit_che
>   * Helper function to detect if the cgroup is within it's limit or
>   * not. It's currently called from cgroup_rss_prepare()
>   */
> -static inline bool res_counter_check_under_limit(struct res_counter *cnt)
> +static inline bool res_counter_check_under_limit(struct res_counter *cnt,
> +							int member)
>  {
>  	bool ret;
>  	unsigned long flags;
>  
>  	spin_lock_irqsave(&cnt->lock, flags);
> -	ret = res_counter_limit_check_locked(cnt);
> +	ret = res_counter_limit_check_locked(cnt, member);
>  	spin_unlock_irqrestore(&cnt->lock, flags);
>  	return ret;
>  }
> diff -puN include/linux/memcontrol.h~memory-controller-res_counters-soft-limit-setup include/linux/memcontrol.h
> _
> 


* Re: [RFC] [PATCH 1/4] Modify resource counters to add soft limit support
  2008-02-13 17:12   ` Pavel Emelyanov
@ 2008-02-13 17:19     ` Balbir Singh
  2008-02-13 17:38       ` Pavel Emelyanov
  0 siblings, 1 reply; 26+ messages in thread
From: Balbir Singh @ 2008-02-13 17:19 UTC (permalink / raw)
  To: Pavel Emelyanov
  Cc: linux-mm, Hugh Dickins, Peter Zijlstra, YAMAMOTO Takashi,
	Paul Menage, Lee Schermerhorn, Nick Piggin, Eric W. Biederman,
	David Rientjes, Andrew Morton, Rik Van Riel, Herbert Poetzl,
	KAMEZAWA Hiroyuki

Pavel Emelyanov wrote:
> Balbir Singh wrote:

> Resource counter accounts for arbitrary resource. Memory pressure
> and memory reclamation both only make sense in case we're dealing
> with memory controller. Please, remove this comment or move it to
> memcontrol.c.
> 

Yes, they always have. The concept of soft limits, hard limits, guarantees
applies to all resources. Why do you say they apply only to memory controller? I
can change the comment to make the definition generic for all resources.


-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL


* Re: [RFC] [PATCH 1/4] Modify resource counters to add soft limit support
  2008-02-13 17:19     ` Balbir Singh
@ 2008-02-13 17:38       ` Pavel Emelyanov
  2008-02-13 17:54         ` Balbir Singh
  0 siblings, 1 reply; 26+ messages in thread
From: Pavel Emelyanov @ 2008-02-13 17:38 UTC (permalink / raw)
  To: balbir
  Cc: linux-mm, Hugh Dickins, Peter Zijlstra, YAMAMOTO Takashi,
	Paul Menage, Lee Schermerhorn, Nick Piggin, Eric W. Biederman,
	David Rientjes, Andrew Morton, Rik Van Riel, Herbert Poetzl,
	KAMEZAWA Hiroyuki

Balbir Singh wrote:
> Pavel Emelyanov wrote:
>> Balbir Singh wrote:
> 
>> Resource counter accounts for arbitrary resource. Memory pressure
>> and memory reclamation both only make sense in case we're dealing
>> with memory controller. Please, remove this comment or move it to
>> memcontrol.c.
>>
> 
> Yes, they always have. The concept of soft limits, hard limits, guarantees
> applies to all resources. Why do you say they apply only to memory controller? I

I said that *memory pressure and memory reclamation*, not the soft limits
in general, applies to memory controller only :)

> can change the comment to make the definition generic for all resources.
> 
> 


* Re: [RFC] [PATCH 1/4] Modify resource counters to add soft limit support
  2008-02-13 17:38       ` Pavel Emelyanov
@ 2008-02-13 17:54         ` Balbir Singh
  0 siblings, 0 replies; 26+ messages in thread
From: Balbir Singh @ 2008-02-13 17:54 UTC (permalink / raw)
  To: Pavel Emelyanov
  Cc: linux-mm, Hugh Dickins, Peter Zijlstra, YAMAMOTO Takashi,
	Paul Menage, Lee Schermerhorn, Nick Piggin, Eric W. Biederman,
	David Rientjes, Andrew Morton, Rik Van Riel, Herbert Poetzl,
	KAMEZAWA Hiroyuki

Pavel Emelyanov wrote:
> Balbir Singh wrote:
>> Pavel Emelyanov wrote:
>>> Balbir Singh wrote:
>>> Resource counter accounts for arbitrary resource. Memory pressure
>>> and memory reclamation both only make sense in case we're dealing
>>> with memory controller. Please, remove this comment or move it to
>>> memcontrol.c.
>>>
>> Yes, they always have. The concept of soft limits, hard limits, guarantees
>> applies to all resources. Why do you say they apply only to memory controller? I
> 
> I said that *memory pressure and memory reclamation*, not the soft limits
> in general, applies to memory controller only :)
> 

I suspected that, that's why I asked if I should change the comment to make it
generic :) I'll make that change in the next revision of the patches.

>> can change the comment to make the definition generic for all resources.
>>

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL


* [RFC] [PATCH 2/4] Add the soft limit interface
  2008-02-13 15:12 [RFC] [PATCH 0/4] Add soft limits to the memory controller Balbir Singh
  2008-02-13 15:12 ` [RFC] [PATCH 1/4] Modify resource counters to add soft limit support Balbir Singh
@ 2008-02-13 15:12 ` Balbir Singh
  2008-02-13 15:12 ` [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure Balbir Singh
  2008-02-13 15:12 ` [RFC] [PATCH 4/4] Add soft limit documentation Balbir Singh
  3 siblings, 0 replies; 26+ messages in thread
From: Balbir Singh @ 2008-02-13 15:12 UTC (permalink / raw)
  To: linux-mm
  Cc: Hugh Dickins, Paul Menage, YAMAMOTO Takashi, Herbert Poetzl,
	Peter Zijlstra, Lee Schermerhorn, Nick Piggin, David Rientjes,
	Andrew Morton, Pavel Emelianov, Balbir Singh, Rik Van Riel,
	Eric W. Biederman, KAMEZAWA Hiroyuki


A new configuration file called soft_limit_in_bytes is added. The parsing
and configuration rules remain the same as for the limit_in_bytes user
interface.

A global list of all memory cgroups over their soft limit is maintained.
This list is then used to reclaim memory on global pressure. A cgroup is
removed from the list when the cgroup is deleted.

The global list is protected with a spinlock; patch 3/4 converts it to a
read-write lock.
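
The list handling described above can be sketched in userspace as follows.
This is an assumption-laden model, not the patch itself: the tiny list
helpers stand in for the kernel's list_head API, locking is omitted, and
the names (sl_node, sl_group, sl_note_charge, sl_group_delete) are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, standing in for list_head. */
struct sl_node {
	struct sl_node *prev, *next;
};

static void sl_node_init(struct sl_node *n)
{
	n->prev = n->next = n;
}

/* A node that points at itself is not on any list. */
static bool sl_node_unlinked(const struct sl_node *n)
{
	return n->next == n;
}

static void sl_node_add_tail(struct sl_node *n, struct sl_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void sl_node_del_init(struct sl_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	sl_node_init(n);
}

/* Hypothetical model of a group; not struct mem_cgroup. */
struct sl_group {
	unsigned long long usage;
	unsigned long long soft_limit;
	struct sl_node over_sl;
};

/* Global list of groups over their soft limit (lock omitted here). */
static struct sl_node sl_exceeded_list = { &sl_exceeded_list, &sl_exceeded_list };

/* After a charge: queue the group once if it is not under its soft limit. */
static void sl_note_charge(struct sl_group *g)
{
	assert(g != NULL);
	if (g->usage >= g->soft_limit && sl_node_unlinked(&g->over_sl))
		sl_node_add_tail(&g->over_sl, &sl_exceeded_list);
}

/* On group deletion: drop the group from the global list if queued. */
static void sl_group_delete(struct sl_group *g)
{
	assert(g != NULL);
	if (!sl_node_unlinked(&g->over_sl))
		sl_node_del_init(&g->over_sl);
}
```

The unlinked check is what keeps a group from being queued twice when it is
charged repeatedly while over its soft limit.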

Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
---

 mm/memcontrol.c |   33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff -puN mm/memcontrol.c~memory-controller-add-soft-limit-interface mm/memcontrol.c
--- linux-2.6.24/mm/memcontrol.c~memory-controller-add-soft-limit-interface	2008-02-13 19:50:27.000000000 +0530
+++ linux-2.6.24-balbir/mm/memcontrol.c	2008-02-13 19:50:27.000000000 +0530
@@ -35,6 +35,10 @@
 
 struct cgroup_subsys mem_cgroup_subsys;
 static const int MEM_CGROUP_RECLAIM_RETRIES = 5;
+static spinlock_t mem_cgroup_sl_list_lock;	/* spin lock that protects */
+						/* the list of cgroups over*/
+						/* their soft limit */
+static struct list_head mem_cgroup_sl_exceeded_list;
 
 /*
  * Statistics for memory cgroup.
@@ -136,6 +140,10 @@ struct mem_cgroup {
 	 * statistics.
 	 */
 	struct mem_cgroup_stat stat;
+	/*
+	 * List of all mem_cgroup's that exceed their soft limit
+	 */
+	struct list_head sl_exceeded_list;
 };
 
 /*
@@ -679,6 +687,18 @@ retry:
 		goto retry;
 	}
 
+	/*
+	 * If we exceed our soft limit, we get added to the list of
+	 * cgroups over their soft limit
+	 */
+	if (!res_counter_check_under_limit(&mem->res, RES_SOFT_LIMIT)) {
+		spin_lock_irqsave(&mem_cgroup_sl_list_lock, flags);
+		if (list_empty(&mem->sl_exceeded_list))
+			list_add_tail(&mem->sl_exceeded_list,
+						&mem_cgroup_sl_exceeded_list);
+		spin_unlock_irqrestore(&mem_cgroup_sl_list_lock, flags);
+	}
+
 	mz = page_cgroup_zoneinfo(pc);
 	spin_lock_irqsave(&mz->lru_lock, flags);
 	/* Update statistics vector */
@@ -736,13 +756,14 @@ void mem_cgroup_uncharge(struct page_cgr
 	if (atomic_dec_and_test(&pc->ref_cnt)) {
 		page = pc->page;
 		mz = page_cgroup_zoneinfo(pc);
+		mem = pc->mem_cgroup;
 		/*
 		 * get page->cgroup and clear it under lock.
 		 * force_empty can drop page->cgroup without checking refcnt.
 		 */
 		unlock_page_cgroup(page);
+
 		if (clear_page_cgroup(page, pc) == pc) {
-			mem = pc->mem_cgroup;
 			css_put(&mem->css);
 			res_counter_uncharge(&mem->res, PAGE_SIZE);
 			spin_lock_irqsave(&mz->lru_lock, flags);
@@ -1046,6 +1067,12 @@ static struct cftype mem_cgroup_files[] 
 		.name = "stat",
 		.open = mem_control_stat_open,
 	},
+	{
+		.name = "soft_limit_in_bytes",
+		.private = RES_SOFT_LIMIT,
+		.write = mem_cgroup_write,
+		.read = mem_cgroup_read,
+	},
 };
 
 static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *mem, int node)
@@ -1097,6 +1124,9 @@ mem_cgroup_create(struct cgroup_subsys *
 	if (unlikely((cont->parent) == NULL)) {
 		mem = &init_mem_cgroup;
 		init_mm.mem_cgroup = mem;
+		INIT_LIST_HEAD(&mem->sl_exceeded_list);
+		spin_lock_init(&mem_cgroup_sl_list_lock);
+		INIT_LIST_HEAD(&mem_cgroup_sl_exceeded_list);
 	} else
 		mem = kzalloc(sizeof(struct mem_cgroup), GFP_KERNEL);
 
@@ -1104,6 +1134,7 @@ mem_cgroup_create(struct cgroup_subsys *
 		return NULL;
 
 	res_counter_init(&mem->res);
+	INIT_LIST_HEAD(&mem->sl_exceeded_list);
 
 	memset(&mem->info, 0, sizeof(mem->info));
 
diff -puN include/linux/memcontrol.h~memory-controller-add-soft-limit-interface include/linux/memcontrol.h
_

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL


* [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure
  2008-02-13 15:12 [RFC] [PATCH 0/4] Add soft limits to the memory controller Balbir Singh
  2008-02-13 15:12 ` [RFC] [PATCH 1/4] Modify resource counters to add soft limit support Balbir Singh
  2008-02-13 15:12 ` [RFC] [PATCH 2/4] Add the soft limit interface Balbir Singh
@ 2008-02-13 15:12 ` Balbir Singh
  2008-02-14  7:30   ` KAMEZAWA Hiroyuki
  2008-02-14 10:27   ` YAMAMOTO Takashi
  2008-02-13 15:12 ` [RFC] [PATCH 4/4] Add soft limit documentation Balbir Singh
  3 siblings, 2 replies; 26+ messages in thread
From: Balbir Singh @ 2008-02-13 15:12 UTC (permalink / raw)
  To: linux-mm
  Cc: Hugh Dickins, Peter Zijlstra, YAMAMOTO Takashi, Paul Menage,
	Lee Schermerhorn, Herbert Poetzl, Eric W. Biederman,
	David Rientjes, Pavel Emelianov, Nick Piggin, Balbir Singh,
	Rik Van Riel, Andrew Morton, KAMEZAWA Hiroyuki


The global list of all cgroups over their soft limit is scanned under
memory pressure. We call mem_cgroup_pushback_groups_over_soft_limit()
from __alloc_pages() prior to calling try_to_free_pages(), in an attempt
to reclaim memory from groups that are using memory above their soft limit.
If this attempt is unsuccessful, we call try_to_free_pages() and take
the normal global reclaim path.
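
The ordering of the two reclaim attempts can be modelled in userspace like
this. It is a simplified sketch under several assumptions: reclaim always
succeeds in trimming a group to its soft limit, pages are 4 KiB, and the
return value counts pages freed. All names here (reclaim_group,
model_pushback, model_global_reclaim, model_reclaim) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical model of a group; not struct mem_cgroup. */
struct reclaim_group {
	unsigned long long usage;	/* bytes */
	unsigned long long soft_limit;	/* bytes */
};

static unsigned long model_global_reclaim_calls;

/* Stand-in for the global reclaim path; frees nothing in this model. */
static unsigned long model_global_reclaim(void)
{
	model_global_reclaim_calls++;
	return 0;
}

/* Trim every group above its soft limit; returns the pages freed. */
static unsigned long model_pushback(struct reclaim_group *groups, size_t n)
{
	unsigned long freed = 0;
	size_t i;

	assert(groups != NULL || n == 0);
	for (i = 0; i < n; i++) {
		unsigned long long excess, pages;

		if (groups[i].usage <= groups[i].soft_limit)
			continue;
		excess = groups[i].usage - groups[i].soft_limit;
		pages = excess >> MODEL_PAGE_SHIFT;
		groups[i].usage -= pages << MODEL_PAGE_SHIFT;
		freed += (unsigned long)pages;
	}
	return freed;
}

/* Try soft limit pushback first; fall back to global reclaim if it frees nothing. */
static unsigned long model_reclaim(struct reclaim_group *groups, size_t n)
{
	unsigned long freed = model_pushback(groups, n);

	return freed ? freed : model_global_reclaim();
}
```

Groups at or below their soft limit are left untouched by the first pass, so
global reclaim only runs once no group has any excess left to give back.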

Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
---

 include/linux/memcontrol.h  |    9 +++++
 include/linux/res_counter.h |   11 +++++++
 include/linux/swap.h        |    4 +-
 mm/memcontrol.c             |   67 ++++++++++++++++++++++++++++++++++++++++----
 mm/page_alloc.c             |   10 +++++-
 mm/vmscan.c                 |   12 +++++--
 6 files changed, 101 insertions(+), 12 deletions(-)

diff -puN mm/vmscan.c~memory-controller-reclaim-on-contention mm/vmscan.c
--- linux-2.6.24/mm/vmscan.c~memory-controller-reclaim-on-contention	2008-02-13 20:16:04.000000000 +0530
+++ linux-2.6.24-balbir/mm/vmscan.c	2008-02-13 20:16:04.000000000 +0530
@@ -1440,22 +1440,26 @@ unsigned long try_to_free_pages(struct z
 #ifdef CONFIG_CGROUP_MEM_CONT
 
 unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
-						gfp_t gfp_mask)
+						gfp_t gfp_mask,
+						unsigned long nr_pages,
+						struct zone **zones)
 {
 	struct scan_control sc = {
 		.gfp_mask = gfp_mask,
 		.may_writepage = !laptop_mode,
 		.may_swap = 1,
-		.swap_cluster_max = SWAP_CLUSTER_MAX,
+		.swap_cluster_max = nr_pages,
 		.swappiness = vm_swappiness,
 		.order = 0,
 		.mem_cgroup = mem_cont,
 		.isolate_pages = mem_cgroup_isolate_pages,
 	};
-	struct zone **zones;
 	int target_zone = gfp_zone(GFP_HIGHUSER_MOVABLE);
 
-	zones = NODE_DATA(numa_node_id())->node_zonelists[target_zone].zones;
+	if (!zones)
+		zones =
+		NODE_DATA(numa_node_id())->node_zonelists[target_zone].zones;
+
 	if (do_try_to_free_pages(zones, sc.gfp_mask, &sc))
 		return 1;
 	return 0;
diff -puN include/linux/memcontrol.h~memory-controller-reclaim-on-contention include/linux/memcontrol.h
--- linux-2.6.24/include/linux/memcontrol.h~memory-controller-reclaim-on-contention	2008-02-13 20:16:04.000000000 +0530
+++ linux-2.6.24-balbir/include/linux/memcontrol.h	2008-02-13 20:19:20.000000000 +0530
@@ -76,6 +76,8 @@ extern long mem_cgroup_calc_reclaim_acti
 				struct zone *zone, int priority);
 extern long mem_cgroup_calc_reclaim_inactive(struct mem_cgroup *mem,
 				struct zone *zone, int priority);
+extern unsigned long
+mem_cgroup_pushback_groups_over_soft_limit(struct zone **zones, gfp_t gfp_mask);
 
 #else /* CONFIG_CGROUP_MEM_CONT */
 static inline void mm_init_cgroup(struct mm_struct *mm,
@@ -184,6 +186,13 @@ static inline long mem_cgroup_calc_recla
 {
 	return 0;
 }
+
+static inline unsigned long
+mem_cgroup_pushback_groups_over_soft_limit(struct zone **zones, gfp_t gfp_mask)
+{
+	return 0;
+}
+
 #endif /* CONFIG_CGROUP_MEM_CONT */
 
 #endif /* _LINUX_MEMCONTROL_H */
diff -puN mm/memcontrol.c~memory-controller-reclaim-on-contention mm/memcontrol.c
--- linux-2.6.24/mm/memcontrol.c~memory-controller-reclaim-on-contention	2008-02-13 20:16:04.000000000 +0530
+++ linux-2.6.24-balbir/mm/memcontrol.c	2008-02-13 20:16:04.000000000 +0530
@@ -35,7 +35,7 @@
 
 struct cgroup_subsys mem_cgroup_subsys;
 static const int MEM_CGROUP_RECLAIM_RETRIES = 5;
-static spinlock_t mem_cgroup_sl_list_lock;	/* spin lock that protects */
+static rwlock_t mem_cgroup_sl_list_lock;	/* spin lock that protects */
 						/* the list of cgroups over*/
 						/* their soft limit */
 static struct list_head mem_cgroup_sl_exceeded_list;
@@ -646,7 +646,8 @@ retry:
 		if (!(gfp_mask & __GFP_WAIT))
 			goto out;
 
-		if (try_to_free_mem_cgroup_pages(mem, gfp_mask))
+		if (try_to_free_mem_cgroup_pages(mem, gfp_mask,
+							SWAP_CLUSTER_MAX, NULL))
 			continue;
 
 		/*
@@ -692,11 +693,11 @@ retry:
 	 * cgroups over their soft limit
 	 */
 	if (!res_counter_check_under_limit(&mem->res, RES_SOFT_LIMIT)) {
-		spin_lock_irqsave(&mem_cgroup_sl_list_lock, flags);
+		write_lock_irqsave(&mem_cgroup_sl_list_lock, flags);
 		if (list_empty(&mem->sl_exceeded_list))
 			list_add_tail(&mem->sl_exceeded_list,
 						&mem_cgroup_sl_exceeded_list);
-		spin_unlock_irqrestore(&mem_cgroup_sl_list_lock, flags);
+		write_unlock_irqrestore(&mem_cgroup_sl_list_lock, flags);
 	}
 
 	mz = page_cgroup_zoneinfo(pc);
@@ -928,7 +929,55 @@ out:
 	return ret;
 }
 
+/*
+ * Free all control groups, which are over their soft limit
+ */
+unsigned long mem_cgroup_pushback_groups_over_soft_limit(struct zone **zones,
+								gfp_t gfp_mask)
+{
+	struct mem_cgroup *mem;
+	unsigned long nr_pages;
+	long long nr_bytes_over_sl;
+	unsigned long ret = 0;
+	unsigned long flags;
+	struct list_head reclaimed_groups;
 
+	INIT_LIST_HEAD(&reclaimed_groups);
+	read_lock_irqsave(&mem_cgroup_sl_list_lock, flags);
+	while (!list_empty(&mem_cgroup_sl_exceeded_list)) {
+		mem = list_first_entry(&mem_cgroup_sl_exceeded_list,
+				struct mem_cgroup, sl_exceeded_list);
+		list_move(&mem->sl_exceeded_list, &reclaimed_groups);
+		read_unlock_irqrestore(&mem_cgroup_sl_list_lock, flags);
+
+		nr_bytes_over_sl = res_counter_sl_excess(&mem->res);
+		if (nr_bytes_over_sl <= 0)
+			goto next;
+		nr_pages = (nr_bytes_over_sl >> PAGE_SHIFT);
+		ret += try_to_free_mem_cgroup_pages(mem, gfp_mask, nr_pages,
+							zones);
+next:
+		read_lock_irqsave(&mem_cgroup_sl_list_lock, flags);
+	}
+
+	while (!list_empty(&reclaimed_groups)) {
+		/*
+		 * Check again to see if we've gone below the soft
+		 * limit. XXX: Consider giving up the &mem_cgroup_sl_list_lock
+		 * before calling res_counter_sl_excess.
+		 */
+		mem = list_first_entry(&reclaimed_groups, struct mem_cgroup,
+					sl_exceeded_list);
+		nr_bytes_over_sl = res_counter_sl_excess(&mem->res);
+		if (nr_bytes_over_sl <= 0)
+			list_del_init(&mem->sl_exceeded_list);
+		else
+			list_move(&mem->sl_exceeded_list,
+				&mem_cgroup_sl_exceeded_list);
+	}
+	read_unlock_irqrestore(&mem_cgroup_sl_list_lock, flags);
+	return ret;
+}
 
 int mem_cgroup_write_strategy(char *buf, unsigned long long *tmp)
 {
@@ -1124,8 +1173,7 @@ mem_cgroup_create(struct cgroup_subsys *
 	if (unlikely((cont->parent) == NULL)) {
 		mem = &init_mem_cgroup;
 		init_mm.mem_cgroup = mem;
-		INIT_LIST_HEAD(&mem->sl_exceeded_list);
-		spin_lock_init(&mem_cgroup_sl_list_lock);
+		rwlock_init(&mem_cgroup_sl_list_lock);
 		INIT_LIST_HEAD(&mem_cgroup_sl_exceeded_list);
 	} else
 		mem = kzalloc(sizeof(struct mem_cgroup), GFP_KERNEL);
@@ -1155,7 +1203,14 @@ static void mem_cgroup_pre_destroy(struc
 					struct cgroup *cont)
 {
 	struct mem_cgroup *mem = mem_cgroup_from_cont(cont);
+	unsigned long flags;
+
 	mem_cgroup_force_empty(mem);
+
+	write_lock_irqsave(&mem_cgroup_sl_list_lock, flags);
+	if (!list_empty(&mem->sl_exceeded_list))
+		list_del_init(&mem->sl_exceeded_list);
+	write_unlock_irqrestore(&mem_cgroup_sl_list_lock, flags);
 }
 
 static void mem_cgroup_destroy(struct cgroup_subsys *ss,
diff -puN include/linux/res_counter.h~memory-controller-reclaim-on-contention include/linux/res_counter.h
--- linux-2.6.24/include/linux/res_counter.h~memory-controller-reclaim-on-contention	2008-02-13 20:16:04.000000000 +0530
+++ linux-2.6.24-balbir/include/linux/res_counter.h	2008-02-13 20:16:04.000000000 +0530
@@ -142,4 +142,15 @@ static inline bool res_counter_check_und
 	return ret;
 }
 
+static inline long long res_counter_sl_excess(struct res_counter *cnt)
+{
+	unsigned long flags;
+	long long ret;
+
+	spin_lock_irqsave(&cnt->lock, flags);
+	ret = cnt->usage - cnt->soft_limit;
+	spin_unlock_irqrestore(&cnt->lock, flags);
+	return ret;
+}
+
 #endif
diff -puN kernel/res_counter.c~memory-controller-reclaim-on-contention kernel/res_counter.c
diff -puN mm/page_alloc.c~memory-controller-reclaim-on-contention mm/page_alloc.c
--- linux-2.6.24/mm/page_alloc.c~memory-controller-reclaim-on-contention	2008-02-13 20:16:04.000000000 +0530
+++ linux-2.6.24-balbir/mm/page_alloc.c	2008-02-13 20:16:04.000000000 +0530
@@ -1635,7 +1635,15 @@ nofail_alloc:
 	reclaim_state.reclaimed_slab = 0;
 	p->reclaim_state = &reclaim_state;
 
-	did_some_progress = try_to_free_pages(zonelist->zones, order, gfp_mask);
+	/*
+	 * First reclaim from all memory control groups over their
+	 * soft limit
+	 */
+	did_some_progress = mem_cgroup_pushback_groups_over_soft_limit(
+						zonelist->zones, gfp_mask);
+	if (!did_some_progress)
+		did_some_progress =
+			try_to_free_pages(zonelist->zones, order, gfp_mask);
 
 	p->reclaim_state = NULL;
 	p->flags &= ~PF_MEMALLOC;
diff -puN include/linux/swap.h~memory-controller-reclaim-on-contention include/linux/swap.h
--- linux-2.6.24/include/linux/swap.h~memory-controller-reclaim-on-contention	2008-02-13 20:16:04.000000000 +0530
+++ linux-2.6.24-balbir/include/linux/swap.h	2008-02-13 20:16:04.000000000 +0530
@@ -184,7 +184,9 @@ extern void swap_setup(void);
 extern unsigned long try_to_free_pages(struct zone **zones, int order,
 					gfp_t gfp_mask);
 extern unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem,
-							gfp_t gfp_mask);
+							gfp_t gfp_mask,
+							unsigned long nr_pages,
+							struct zone **zones);
 extern int __isolate_lru_page(struct page *page, int mode);
 extern unsigned long shrink_all_memory(unsigned long nr_pages);
 extern int vm_swappiness;
_

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure
  2008-02-13 15:12 ` [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure Balbir Singh
@ 2008-02-14  7:30   ` KAMEZAWA Hiroyuki
  2008-02-14  7:40     ` Balbir Singh
  2008-02-14 10:27   ` YAMAMOTO Takashi
  1 sibling, 1 reply; 26+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-02-14  7:30 UTC (permalink / raw)
  To: Balbir Singh
  Cc: linux-mm, Hugh Dickins, Peter Zijlstra, YAMAMOTO Takashi,
	Paul Menage, Lee Schermerhorn, Herbert Poetzl, Eric W. Biederman,
	David Rientjes, Pavel Emelianov, Nick Piggin, Rik Van Riel,
	Andrew Morton

On Wed, 13 Feb 2008 20:42:42 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:

> 
> +	read_lock_irqsave(&mem_cgroup_sl_list_lock, flags);
> +	while (!list_empty(&mem_cgroup_sl_exceeded_list)) {
> +		mem = list_first_entry(&mem_cgroup_sl_exceeded_list,
> +				struct mem_cgroup, sl_exceeded_list);
> +		list_move(&mem->sl_exceeded_list, &reclaimed_groups);
> +		read_unlock_irqrestore(&mem_cgroup_sl_list_lock, flags);
> +
> +		nr_bytes_over_sl = res_counter_sl_excess(&mem->res);
> +		if (nr_bytes_over_sl <= 0)
> +			goto next;
> +		nr_pages = (nr_bytes_over_sl >> PAGE_SHIFT);
> +		ret += try_to_free_mem_cgroup_pages(mem, gfp_mask, nr_pages,
> +							zones);
> +next:
> +		read_lock_irqsave(&mem_cgroup_sl_list_lock, flags);

Hmm...
This is triggered by page allocation failure (fast path) in alloc_pages()
after try_to_free_pages(). Then, which pages should be reclaimed
depends on zones[]. Because nr_bytes_over_sl is counted globally, the cgroup's
pages may not be included in zones[].

And I think it's too much work to reclaim all the excess pages at once.

How about reclaiming just a small number of pages? Like:
==
if (nr_bytes_over_sl <= 0)
	goto next;
nr_pages = SWAP_CLUSTER_MAX;
==

Regards,
-Kame
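
As an illustration of the suggestion above, here is a minimal userspace sketch of capping the per-pass reclaim target. It is a variant rather than the exact snippet: it takes the smaller of the excess and the cap instead of always using the cap. The values of PAGE_SHIFT (12) and SWAP_CLUSTER_MAX (32) are assumptions made for the example, and sl_reclaim_target() is a hypothetical helper that is not part of the patch.

```c
/* Assumed values for illustration; the real ones come from kernel headers. */
#define PAGE_SHIFT        12
#define SWAP_CLUSTER_MAX  32UL

/*
 * Hypothetical helper: given the number of bytes a group is over its
 * soft limit, return how many pages one reclaim pass should target.
 * Returns 0 when the group is at or under its soft limit.
 */
static unsigned long sl_reclaim_target(long long nr_bytes_over_sl)
{
	unsigned long nr_pages;

	if (nr_bytes_over_sl <= 0)
		return 0;

	nr_pages = (unsigned long)(nr_bytes_over_sl >> PAGE_SHIFT);
	if (nr_pages == 0)
		nr_pages = 1;	/* a sub-page excess still needs one page freed */
	if (nr_pages > SWAP_CLUSTER_MAX)
		nr_pages = SWAP_CLUSTER_MAX;	/* bound the work done per pass */
	return nr_pages;
}
```

With a cap like this, an allocating task that trips over a group far above its soft limit does a bounded amount of reclaim work per pass, at the cost of pushing the group back more slowly.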





^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure
  2008-02-14  7:30   ` KAMEZAWA Hiroyuki
@ 2008-02-14  7:40     ` Balbir Singh
  2008-02-14  8:42       ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 26+ messages in thread
From: Balbir Singh @ 2008-02-14  7:40 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-mm, Hugh Dickins, Peter Zijlstra, YAMAMOTO Takashi,
	Paul Menage, Lee Schermerhorn, Herbert Poetzl, Eric W. Biederman,
	David Rientjes, Pavel Emelianov, Nick Piggin, Rik Van Riel,
	Andrew Morton

KAMEZAWA Hiroyuki wrote:
> On Wed, 13 Feb 2008 20:42:42 +0530
> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> 
>> +	read_lock_irqsave(&mem_cgroup_sl_list_lock, flags);
>> +	while (!list_empty(&mem_cgroup_sl_exceeded_list)) {
>> +		mem = list_first_entry(&mem_cgroup_sl_exceeded_list,
>> +				struct mem_cgroup, sl_exceeded_list);
>> +		list_move(&mem->sl_exceeded_list, &reclaimed_groups);
>> +		read_unlock_irqrestore(&mem_cgroup_sl_list_lock, flags);
>> +
>> +		nr_bytes_over_sl = res_counter_sl_excess(&mem->res);
>> +		if (nr_bytes_over_sl <= 0)
>> +			goto next;
>> +		nr_pages = (nr_bytes_over_sl >> PAGE_SHIFT);
>> +		ret += try_to_free_mem_cgroup_pages(mem, gfp_mask, nr_pages,
>> +							zones);
>> +next:
>> +		read_lock_irqsave(&mem_cgroup_sl_list_lock, flags);
> 
> Hmm... 
> This is triggered by page allocation failure (fast path) in alloc_pages()
> after try_to_free_pages(). 

We trigger it prior to try_to_free_pages() in __alloc_pages()

> Then, what pages should be reclaimed is
> depends on zones[]. Because nr-bytes_over_sl is counted globally, cgroup's
> pages may not be included in zones[].
> 

True, that is quite possible.

> And I think it's big workload to relclaim all excessed pages at once.
> 
> How about just reclaiming small # of pages ? like
> ==
> if (nr_bytes_over_sl <= 0)
> 	goto next;
> nr_pages = SWAP_CLUSTER_MAX;

I thought about this, but wanted to push all groups over their soft limit
back to their soft limit quickly. I'll experiment with your suggestion and see
how the system behaves when we push back pages slowly. Thanks for the suggestion.

> ==
> 
> Regards,
> -Kame


-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure
  2008-02-14  7:40     ` Balbir Singh
@ 2008-02-14  8:42       ` KAMEZAWA Hiroyuki
  2008-02-14  9:16         ` Balbir Singh
  0 siblings, 1 reply; 26+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-02-14  8:42 UTC (permalink / raw)
  To: balbir
  Cc: linux-mm, Hugh Dickins, Peter Zijlstra, YAMAMOTO Takashi,
	Paul Menage, Lee Schermerhorn, Herbert Poetzl, Eric W. Biederman,
	David Rientjes, Pavel Emelianov, Nick Piggin, Rik Van Riel,
	Andrew Morton

On Thu, 14 Feb 2008 13:10:35 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:

> > And I think it's big workload to relclaim all excessed pages at once.
> > 
> > How about just reclaiming small # of pages ? like
> > ==
> > if (nr_bytes_over_sl <= 0)
> > 	goto next;
> > nr_pages = SWAP_CLUSTER_MAX;
> 
> I thought about this, but wanted to push back all groups over their soft limit
> back to their soft limit quickly. I'll experiment with your suggestion and see
> how the system behaves when we push back pages slowly. Thanks for the suggestion.

My point is that an unlucky process may have to reclaim tons of pages even if
all it wants is just 1 page. That's not good, IMO.

The background-reclaim patch will probably be able to help in this soft-limit
situation, if a daemon can know whether it should reclaim or not.

Thanks,
-Kame


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure
  2008-02-14  8:42       ` KAMEZAWA Hiroyuki
@ 2008-02-14  9:16         ` Balbir Singh
  2008-02-15  4:17           ` Paul Menage
  0 siblings, 1 reply; 26+ messages in thread
From: Balbir Singh @ 2008-02-14  9:16 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-mm, Hugh Dickins, Peter Zijlstra, YAMAMOTO Takashi,
	Paul Menage, Lee Schermerhorn, Herbert Poetzl, Eric W. Biederman,
	David Rientjes, Pavel Emelianov, Nick Piggin, Rik Van Riel,
	Andrew Morton

KAMEZAWA Hiroyuki wrote:
> On Thu, 14 Feb 2008 13:10:35 +0530
> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> 
>>> And I think it's big workload to relclaim all excessed pages at once.
>>>
>>> How about just reclaiming small # of pages ? like
>>> ==
>>> if (nr_bytes_over_sl <= 0)
>>> 	goto next;
>>> nr_pages = SWAP_CLUSTER_MAX;
>> I thought about this, but wanted to push back all groups over their soft limit
>> back to their soft limit quickly. I'll experiment with your suggestion and see
>> how the system behaves when we push back pages slowly. Thanks for the suggestion.
> 
> My point is an unlucky process may have to reclaim tons of pages even if
> what he wants is just 1 page. It's not good, IMO.
> 

Yes, that makes sense.

> Probably backgound-reclaim patch will be able to help this soft-limit situation,
> if a daemon can know it should reclaim or not.
> 

Yes, I agree. I might just need to schedule the daemon under memory pressure.

> Thanks,
> -Kame

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure
  2008-02-14  9:16         ` Balbir Singh
@ 2008-02-15  4:17           ` Paul Menage
  2008-02-15  4:25             ` Balbir Singh
  0 siblings, 1 reply; 26+ messages in thread
From: Paul Menage @ 2008-02-15  4:17 UTC (permalink / raw)
  To: balbir
  Cc: KAMEZAWA Hiroyuki, linux-mm, Hugh Dickins, Peter Zijlstra,
	YAMAMOTO Takashi, Lee Schermerhorn, Herbert Poetzl,
	Eric W. Biederman, David Rientjes, Pavel Emelianov, Nick Piggin,
	Rik Van Riel, Andrew Morton

On Thu, Feb 14, 2008 at 1:16 AM, Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>  > Probably backgound-reclaim patch will be able to help this soft-limit situation,
>  > if a daemon can know it should reclaim or not.
>  >
>
>  Yes, I agree. I might just need to schedule the daemon under memory pressure.
>

Can we also have a way to trigger a one-off reclaim (of a configurable
magnitude) from userspace? Having a background daemon doing it may be
fine as a default, but there will be cases when a userspace machine
manager knows better than the kernel how frequently/hard to try to
reclaim on a given cgroup.

Paul


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure
  2008-02-15  4:17           ` Paul Menage
@ 2008-02-15  4:25             ` Balbir Singh
  2008-02-15  5:07               ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 26+ messages in thread
From: Balbir Singh @ 2008-02-15  4:25 UTC (permalink / raw)
  To: Paul Menage
  Cc: KAMEZAWA Hiroyuki, linux-mm, Hugh Dickins, Peter Zijlstra,
	YAMAMOTO Takashi, Lee Schermerhorn, Herbert Poetzl,
	Eric W. Biederman, David Rientjes, Pavel Emelianov, Nick Piggin,
	Rik Van Riel, Andrew Morton

Paul Menage wrote:
> On Thu, Feb 14, 2008 at 1:16 AM, Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>>  > Probably backgound-reclaim patch will be able to help this soft-limit situation,
>>  > if a daemon can know it should reclaim or not.
>>  >
>>
>>  Yes, I agree. I might just need to schedule the daemon under memory pressure.
>>
> 
> Can we also have a way to trigger a one-off reclaim (of a configurable
> magnitude) from userspace? Having a background daemon doing it may be
> fine as a default, but there will be cases when a userspace machine
> manager knows better than the kernel how frequently/hard to try to
> reclaim on a given cgroup.
> 
> Paul

We have that capability, but we cannot specify how much to reclaim.
There is a force_empty file that when written to, tries to reclaim all pages
from the cgroup. Depending on the need, it can be extended so that the number of
pages to be reclaimed can be specified.

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure
  2008-02-15  4:25             ` Balbir Singh
@ 2008-02-15  5:07               ` KAMEZAWA Hiroyuki
  2008-02-15  5:16                 ` Paul Menage
  0 siblings, 1 reply; 26+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-02-15  5:07 UTC (permalink / raw)
  To: balbir
  Cc: Paul Menage, linux-mm, Hugh Dickins, Peter Zijlstra,
	YAMAMOTO Takashi, Lee Schermerhorn, Herbert Poetzl,
	Eric W. Biederman, David Rientjes, Pavel Emelianov, Nick Piggin,
	Rik Van Riel, Andrew Morton

On Fri, 15 Feb 2008 09:55:20 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:

> Paul Menage wrote:
> > On Thu, Feb 14, 2008 at 1:16 AM, Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> >>  > Probably backgound-reclaim patch will be able to help this soft-limit situation,
> >>  > if a daemon can know it should reclaim or not.
> >>  >
> >>
> >>  Yes, I agree. I might just need to schedule the daemon under memory pressure.
> >>
> > 
> > Can we also have a way to trigger a one-off reclaim (of a configurable
> > magnitude) from userspace? Having a background daemon doing it may be
> > fine as a default, but there will be cases when a userspace machine
> > manager knows better than the kernel how frequently/hard to try to
> > reclaim on a given cgroup.
> > 
> > Paul
> 
> We have that capability, but we cannot specify how much to reclaim.
> There is a force_empty file that when written to, tries to reclaim all pages
> from the cgroup. Depending on the need, it can be extended so that the number of
> pages to be reclaimed can be specified.
> 
Note:
Currently, force_empty doesn't try to free memory; it just drops charges.

We can free memory simply by setting memory.limit to a smaller number.
(This may cause OOM. If we added high/low watermarks, making memory.high smaller
 could work well for freeing memory, to some extent.)

Thanks,
-Kame
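
For illustration, lowering a limit from userspace amounts to writing a new byte count into the group's control file. The sketch below is a plain C helper with a hypothetical name, write_limit(); the path is supplied by the caller, and the file name used in the usage note (memory.limit_in_bytes, which the thread abbreviates as memory.limit) and the mount point are assumptions, not something this thread specifies.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write a byte count to a cgroup control file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
static int write_limit(const char *path, unsigned long long bytes)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;

	ok = fprintf(f, "%llu\n", bytes) > 0;
	/* Buffered write errors are only reported when the file is closed. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

A caller might use something like write_limit("/cgroups/0/memory.limit_in_bytes", 64ULL << 20), where the mount point is made up for the example. As noted above, shrinking the hard limit this way may trigger OOM inside the group.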


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure
  2008-02-15  5:07               ` KAMEZAWA Hiroyuki
@ 2008-02-15  5:16                 ` Paul Menage
  2008-02-15  5:18                   ` Balbir Singh
  2008-02-15  5:29                   ` KAMEZAWA Hiroyuki
  0 siblings, 2 replies; 26+ messages in thread
From: Paul Menage @ 2008-02-15  5:16 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: balbir, linux-mm, Hugh Dickins, Peter Zijlstra, YAMAMOTO Takashi,
	Lee Schermerhorn, Herbert Poetzl, Eric W. Biederman,
	David Rientjes, Pavel Emelianov, Nick Piggin, Rik Van Riel,
	Andrew Morton

On Thu, Feb 14, 2008 at 9:07 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
>  We can free memory by just making memory.limit to smaller number.
>  (This may cause OOM. If we added high-low watermark, making memory.high smaller
>   can works well for memory freeing to some extent.)
>

What about if we want to apply memory pressure to a cgroup to push out
unused memory, but not push out memory that it's actively using?

Paul


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure
  2008-02-15  5:16                 ` Paul Menage
@ 2008-02-15  5:18                   ` Balbir Singh
  2008-02-15  5:30                     ` Paul Menage
  2008-02-15  5:33                     ` KAMEZAWA Hiroyuki
  2008-02-15  5:29                   ` KAMEZAWA Hiroyuki
  1 sibling, 2 replies; 26+ messages in thread
From: Balbir Singh @ 2008-02-15  5:18 UTC (permalink / raw)
  To: Paul Menage
  Cc: KAMEZAWA Hiroyuki, linux-mm, Hugh Dickins, Peter Zijlstra,
	YAMAMOTO Takashi, Lee Schermerhorn, Herbert Poetzl,
	Eric W. Biederman, David Rientjes, Pavel Emelianov, Nick Piggin,
	Rik Van Riel, Andrew Morton

Paul Menage wrote:
> On Thu, Feb 14, 2008 at 9:07 PM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>>  We can free memory by just making memory.limit to smaller number.
>>  (This may cause OOM. If we added high-low watermark, making memory.high smaller
>>   can works well for memory freeing to some extent.)
>>
> 
> What about if we want to apply memory pressure to a cgroup to push out
> unused memory, but not push out memory that it's actively using?

Both watermarks and reducing the limit will reclaim from the inactive list
first. The reclaim logic is the same as that of the per zone LRU. It would be
right to assume that both would push out unused memory first. Am I missing
something?

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure
  2008-02-15  5:18                   ` Balbir Singh
@ 2008-02-15  5:30                     ` Paul Menage
  2008-02-15  5:33                     ` KAMEZAWA Hiroyuki
  1 sibling, 0 replies; 26+ messages in thread
From: Paul Menage @ 2008-02-15  5:30 UTC (permalink / raw)
  To: balbir
  Cc: KAMEZAWA Hiroyuki, linux-mm, Hugh Dickins, Peter Zijlstra,
	YAMAMOTO Takashi, Lee Schermerhorn, Herbert Poetzl,
	Eric W. Biederman, David Rientjes, Pavel Emelianov, Nick Piggin,
	Rik Van Riel, Andrew Morton

On Thu, Feb 14, 2008 at 9:18 PM, Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>
> Paul Menage wrote:
>  > On Thu, Feb 14, 2008 at 9:07 PM, KAMEZAWA Hiroyuki
>  > <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>  >>  We can free memory by just making memory.limit to smaller number.
>  >>  (This may cause OOM. If we added high-low watermark, making memory.high smaller
>  >>   can works well for memory freeing to some extent.)
>  >>
>  >
>  > What about if we want to apply memory pressure to a cgroup to push out
>  > unused memory, but not push out memory that it's actively using?
>
>  Both watermarks and reducing the limit will reclaim from the inactive list
>  first. The reclaim logic is the same as that of the per zone LRU. It would be
>  right to assume that both would push out unused memory first. Am I missing
>  something?
>

Doesn't the per-zone LRU logic try to keep the inactive list at a
certain percentage of memory? In which case you can't really tell from
the active/inactive stats for a cgroup how much of that memory it's
really using.

Paul


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure
  2008-02-15  5:18                   ` Balbir Singh
  2008-02-15  5:30                     ` Paul Menage
@ 2008-02-15  5:33                     ` KAMEZAWA Hiroyuki
  1 sibling, 0 replies; 26+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-02-15  5:33 UTC (permalink / raw)
  To: balbir
  Cc: Paul Menage, linux-mm, Hugh Dickins, Peter Zijlstra,
	YAMAMOTO Takashi, Lee Schermerhorn, Herbert Poetzl,
	Eric W. Biederman, David Rientjes, Pavel Emelianov, Nick Piggin,
	Rik Van Riel, Andrew Morton

On Fri, 15 Feb 2008 10:48:46 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:

> Paul Menage wrote:
> > On Thu, Feb 14, 2008 at 9:07 PM, KAMEZAWA Hiroyuki
> > <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> >>  We can free memory by just making memory.limit to smaller number.
> >>  (This may cause OOM. If we added high-low watermark, making memory.high smaller
> >>   can works well for memory freeing to some extent.)
> >>
> > 
> > What about if we want to apply memory pressure to a cgroup to push out
> > unused memory, but not push out memory that it's actively using?
> 
> Both watermarks and reducing the limit will reclaim from the inactive list
> first. The reclaim logic is the same as that of the per zone LRU. It would be
> right to assume that both would push out unused memory first. Am I missing
> something?
> 
You are right to some extent.  If memory.limit is very small and there is
heavy memory pressure, we have no chance.
(For example, the text of a program used by a shell script can be paged out
 easily because it's not always mapped.)

Thanks,
-Kame



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure
  2008-02-15  5:16                 ` Paul Menage
  2008-02-15  5:18                   ` Balbir Singh
@ 2008-02-15  5:29                   ` KAMEZAWA Hiroyuki
  2008-02-15  6:36                     ` Balbir Singh
  1 sibling, 1 reply; 26+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-02-15  5:29 UTC (permalink / raw)
  To: Paul Menage
  Cc: balbir, linux-mm, Hugh Dickins, Peter Zijlstra, YAMAMOTO Takashi,
	Lee Schermerhorn, Herbert Poetzl, Eric W. Biederman,
	David Rientjes, Pavel Emelianov, Nick Piggin, Rik Van Riel,
	Andrew Morton

On Thu, 14 Feb 2008 21:16:48 -0800
"Paul Menage" <menage@google.com> wrote:

> On Thu, Feb 14, 2008 at 9:07 PM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> >  We can free memory by just making memory.limit to smaller number.
> >  (This may cause OOM. If we added high-low watermark, making memory.high smaller
> >   can works well for memory freeing to some extent.)
> >
> 
> What about if we want to apply memory pressure to a cgroup to push out
> unused memory, but not push out memory that it's actively using?
> 
Generally, the only way to avoid pageout is mlock(), because "actively used" is
determined only by the reference bit, and heavy pressure can cause too much
page scanning. I hope that RvR's LRU improvements will change things for the better.


Thanks,
-Kame
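
The reference-bit point can be illustrated with a deliberately simplified second-chance scan. This is a toy model and not the kernel's LRU code; struct page_sim and scan_once() are hypothetical names made up for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a page: only the two bits the scanner cares about. */
struct page_sim {
	bool referenced;	/* set on access, cleared by the scanner */
	bool resident;		/* false once the page has been evicted */
};

/*
 * One scanner pass over all pages: a resident page whose reference bit
 * is clear is evicted; otherwise the bit is cleared and the page gets
 * one more chance.  Returns the number of pages evicted in this pass.
 */
static size_t scan_once(struct page_sim *pages, size_t n)
{
	size_t evicted = 0;

	for (size_t i = 0; i < n; i++) {
		if (!pages[i].resident)
			continue;
		if (pages[i].referenced) {
			pages[i].referenced = false;
		} else {
			pages[i].resident = false;
			evicted++;
		}
	}
	return evicted;
}
```

In this model, a page that is used regularly survives only if it is touched again between two scanner passes. When pressure makes the scanner run faster than the workload touches its pages, even frequently used pages look idle and are evicted.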


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure
  2008-02-15  5:29                   ` KAMEZAWA Hiroyuki
@ 2008-02-15  6:36                     ` Balbir Singh
  0 siblings, 0 replies; 26+ messages in thread
From: Balbir Singh @ 2008-02-15  6:36 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Paul Menage, linux-mm, Hugh Dickins, Peter Zijlstra,
	YAMAMOTO Takashi, Lee Schermerhorn, Herbert Poetzl,
	Eric W. Biederman, David Rientjes, Pavel Emelianov, Nick Piggin,
	Rik Van Riel, Andrew Morton

KAMEZAWA Hiroyuki wrote:
> On Thu, 14 Feb 2008 21:16:48 -0800
> "Paul Menage" <menage@google.com> wrote:
> 
>> On Thu, Feb 14, 2008 at 9:07 PM, KAMEZAWA Hiroyuki
>> <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>>>  We can free memory by just making memory.limit to smaller number.
>>>  (This may cause OOM. If we added high-low watermark, making memory.high smaller
>>>   can works well for memory freeing to some extent.)
>>>
>> What about if we want to apply memory pressure to a cgroup to push out
>> unused memory, but not push out memory that it's actively using?
>>
> Generally, only way to avoid pageout is mlock() because actively-used is just
> determeined by reference-bit and heavy pressure can do page-scanning too much.
> I hope that RvR's LRU improvement may change things better.

There are two other controllers I plan to work on soon: the mlock() controller
and the virtual memory limit controller. Hopefully they will fix the mlock()
problem to some extent.

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure
  2008-02-13 15:12 ` [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure Balbir Singh
  2008-02-14  7:30   ` KAMEZAWA Hiroyuki
@ 2008-02-14 10:27   ` YAMAMOTO Takashi
  2008-02-15  3:19     ` Balbir Singh
  1 sibling, 1 reply; 26+ messages in thread
From: YAMAMOTO Takashi @ 2008-02-14 10:27 UTC (permalink / raw)
  To: balbir
  Cc: linux-mm, hugh, a.p.zijlstra, menage, Lee.Schermerhorn, herbert,
	ebiederm, rientjes, xemul, nickpiggin, riel, akpm,
	kamezawa.hiroyu

> +/*
> + * Free all control groups, which are over their soft limit
> + */
> +unsigned long mem_cgroup_pushback_groups_over_soft_limit(struct zone **zones,
> +								gfp_t gfp_mask)
> +{
> +	struct mem_cgroup *mem;
> +	unsigned long nr_pages;
> +	long long nr_bytes_over_sl;
> +	unsigned long ret = 0;
> +	unsigned long flags;
> +	struct list_head reclaimed_groups;
>  
> +	INIT_LIST_HEAD(&reclaimed_groups);
> +	read_lock_irqsave(&mem_cgroup_sl_list_lock, flags);
> +	while (!list_empty(&mem_cgroup_sl_exceeded_list)) {
> +		mem = list_first_entry(&mem_cgroup_sl_exceeded_list,
> +				struct mem_cgroup, sl_exceeded_list);
> +		list_move(&mem->sl_exceeded_list, &reclaimed_groups);
> +		read_unlock_irqrestore(&mem_cgroup_sl_list_lock, flags);
> +
> +		nr_bytes_over_sl = res_counter_sl_excess(&mem->res);
> +		if (nr_bytes_over_sl <= 0)
> +			goto next;
> +		nr_pages = (nr_bytes_over_sl >> PAGE_SHIFT);
> +		ret += try_to_free_mem_cgroup_pages(mem, gfp_mask, nr_pages,
> +							zones);
> +next:
> +		read_lock_irqsave(&mem_cgroup_sl_list_lock, flags);
> +	}

What prevents the cgroup 'mem' from disappearing while we have dropped
mem_cgroup_sl_list_lock?

YAMAMOTO Takashi


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure
  2008-02-14 10:27   ` YAMAMOTO Takashi
@ 2008-02-15  3:19     ` Balbir Singh
  0 siblings, 0 replies; 26+ messages in thread
From: Balbir Singh @ 2008-02-15  3:19 UTC (permalink / raw)
  To: YAMAMOTO Takashi
  Cc: linux-mm, hugh, a.p.zijlstra, menage, Lee.Schermerhorn, herbert,
	ebiederm, rientjes, xemul, nickpiggin, riel, akpm,
	kamezawa.hiroyu

YAMAMOTO Takashi wrote:
>> +/*
>> + * Free all control groups, which are over their soft limit
>> + */
>> +unsigned long mem_cgroup_pushback_groups_over_soft_limit(struct zone **zones,
>> +								gfp_t gfp_mask)
>> +{
>> +	struct mem_cgroup *mem;
>> +	unsigned long nr_pages;
>> +	long long nr_bytes_over_sl;
>> +	unsigned long ret = 0;
>> +	unsigned long flags;
>> +	struct list_head reclaimed_groups;
>>  
>> +	INIT_LIST_HEAD(&reclaimed_groups);
>> +	read_lock_irqsave(&mem_cgroup_sl_list_lock, flags);
>> +	while (!list_empty(&mem_cgroup_sl_exceeded_list)) {
>> +		mem = list_first_entry(&mem_cgroup_sl_exceeded_list,
>> +				struct mem_cgroup, sl_exceeded_list);
>> +		list_move(&mem->sl_exceeded_list, &reclaimed_groups);
>> +		read_unlock_irqrestore(&mem_cgroup_sl_list_lock, flags);
>> +
>> +		nr_bytes_over_sl = res_counter_sl_excess(&mem->res);
>> +		if (nr_bytes_over_sl <= 0)
>> +			goto next;
>> +		nr_pages = (nr_bytes_over_sl >> PAGE_SHIFT);
>> +		ret += try_to_free_mem_cgroup_pages(mem, gfp_mask, nr_pages,
>> +							zones);
>> +next:
>> +		read_lock_irqsave(&mem_cgroup_sl_list_lock, flags);
>> +	}
> 
> what prevents the cgroup 'mem' from disappearing while we are dropping
> mem_cgroup_sl_list_lock?
> 

I thought I had a css_get/put around it, but I don't. Thanks for catching the
problem.
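
The general shape of the fix (pin the object before dropping the lock, unpin when done) can be sketched in userspace. This is a minimal illustration only: a pthread mutex and a plain counter stand in for the kernel lock and the css reference count, and every name below (struct group, group_get(), group_put(), pick_and_pin()) is a hypothetical stand-in, not the kernel API.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a memory cgroup on the over-soft-limit list. */
struct group {
	int refcnt;		/* protected by list_lock */
	struct group *next;	/* singly linked list, for brevity */
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct group *over_limit_list;

/* Take a reference.  Caller must hold list_lock. */
static void group_get(struct group *g)
{
	g->refcnt++;
}

/* Drop a reference; free the group when the last one goes away. */
static void group_put(struct group *g)
{
	int last;

	pthread_mutex_lock(&list_lock);
	last = (--g->refcnt == 0);
	pthread_mutex_unlock(&list_lock);
	if (last)
		free(g);
}

/*
 * Take the first group off the list and pin it *before* the lock is
 * dropped, so a concurrent destroy cannot free it while the caller
 * reclaims from it without holding the lock.
 */
static struct group *pick_and_pin(void)
{
	struct group *g;

	pthread_mutex_lock(&list_lock);
	g = over_limit_list;
	if (g) {
		over_limit_list = g->next;
		g->next = NULL;
		group_get(g);
	}
	pthread_mutex_unlock(&list_lock);
	return g;
}
```

The caller would do its reclaim work on the pinned group with the lock released, then call group_put() once it no longer touches the group.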

> YAMAMOTO Takashi


-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [RFC] [PATCH 4/4] Add soft limit documentation
  2008-02-13 15:12 [RFC] [PATCH 0/4] Add soft limits to the memory controller Balbir Singh
                   ` (2 preceding siblings ...)
  2008-02-13 15:12 ` [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure Balbir Singh
@ 2008-02-13 15:12 ` Balbir Singh
  2008-02-13 15:59   ` Randy Dunlap
  3 siblings, 1 reply; 26+ messages in thread
From: Balbir Singh @ 2008-02-13 15:12 UTC (permalink / raw)
  To: linux-mm
  Cc: Hugh Dickins, Paul Menage, YAMAMOTO Takashi, Peter Zijlstra,
	Lee Schermerhorn, Herbert Poetzl, David Rientjes, Andrew Morton,
	Pavel Emelianov, Nick Piggin, Balbir Singh, Rik Van Riel,
	Eric W. Biederman, KAMEZAWA Hiroyuki


Add documentation for the soft limit feature.

Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
---

 Documentation/controllers/memory.txt |   16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff -puN Documentation/controllers/memory.txt~memory-controller-add-soft-limit-documentation Documentation/controllers/memory.txt
--- linux-2.6.24/Documentation/controllers/memory.txt~memory-controller-add-soft-limit-documentation	2008-02-13 18:45:40.000000000 +0530
+++ linux-2.6.24-balbir/Documentation/controllers/memory.txt	2008-02-13 18:49:58.000000000 +0530
@@ -201,6 +201,22 @@ The memory.force_empty gives an interfac
 
 will drop all charges in cgroup. Currently, this is maintained for test.
 
+The file memory.soft_limit_in_bytes allows users to set soft limits. A soft
+limit is set in a manner similar to limit. The limit feature described
+earlier is a hard limit, a group can never exceed it's hard limit. A soft
+limit on the other hand can be exceeded. A group will be shrunk back
+to it's soft limit, when there is memory pressure/contention.
+
+Ideally the soft limit should always be set to a value smaller than the
+hard limit. However, the code does not force the user to do so. The soft
+limit can be greater than the hard limit; then the soft limit has
+no meaning in that setup, since the group will alwasy be restrained to its
+hard limit.
+
+Example setting of soft limit
+
+# echo -n 100M > memory.soft_limit_in_bytes
+
 4. Testing
 
 Balbir posted lmbench, AIM9, LTP and vmmstress results [10] and [11].
_
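
The semantics described in the documentation hunk above can be sketched in
user-space C. This is an illustration only, not code from the series: struct
limits, charge_allowed(), soft_excess_bytes() and pages_to_reclaim() are
hypothetical names, and SKETCH_PAGE_SHIFT assumes 4 KiB pages. The shift by
the page size mirrors the nr_bytes_over_sl >> PAGE_SHIFT step in patch 3/4.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical per-group accounting, all values in bytes. */
struct limits {
	unsigned long long usage;	/* currently charged */
	unsigned long long hard_limit;	/* can never be exceeded */
	unsigned long long soft_limit;	/* may be exceeded without contention */
};

/* A charge is refused only if it would push usage past the hard limit. */
static bool charge_allowed(const struct limits *l, unsigned long long bytes)
{
	return l->usage <= l->hard_limit &&
	       bytes <= l->hard_limit - l->usage;
}

/* Bytes by which usage exceeds the soft limit, or 0 if it does not. */
static unsigned long long soft_excess_bytes(const struct limits *l)
{
	return l->usage > l->soft_limit ? l->usage - l->soft_limit : 0;
}

/* Whole pages to push back when there is memory pressure. */
static unsigned long long pages_to_reclaim(const struct limits *l)
{
	assert(l);
	return soft_excess_bytes(l) >> SKETCH_PAGE_SHIFT;
}
```

With the soft limit set above the hard limit, soft_excess_bytes() is always 0,
because usage is already capped by the hard limit. That is the "no meaning in
that setup" case from the documentation text.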

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL


* Re: [RFC] [PATCH 4/4] Add soft limit documentation
  2008-02-13 15:12 ` [RFC] [PATCH 4/4] Add soft limit documentation Balbir Singh
@ 2008-02-13 15:59   ` Randy Dunlap
  2008-02-13 16:08     ` Balbir Singh
  0 siblings, 1 reply; 26+ messages in thread
From: Randy Dunlap @ 2008-02-13 15:59 UTC (permalink / raw)
  To: Balbir Singh
  Cc: linux-mm, Hugh Dickins, Paul Menage, YAMAMOTO Takashi,
	Peter Zijlstra, Lee Schermerhorn, Herbert Poetzl, David Rientjes,
	Andrew Morton, Pavel Emelianov, Nick Piggin, Rik Van Riel,
	Eric W. Biederman, KAMEZAWA Hiroyuki

On Wed, 13 Feb 2008 20:42:56 +0530 Balbir Singh wrote:

> 
> Add documentation for the soft limit feature.
> 
> Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
> ---
> 
>  Documentation/controllers/memory.txt |   16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff -puN Documentation/controllers/memory.txt~memory-controller-add-soft-limit-documentation Documentation/controllers/memory.txt
> --- linux-2.6.24/Documentation/controllers/memory.txt~memory-controller-add-soft-limit-documentation	2008-02-13 18:45:40.000000000 +0530
> +++ linux-2.6.24-balbir/Documentation/controllers/memory.txt	2008-02-13 18:49:58.000000000 +0530
> @@ -201,6 +201,22 @@ The memory.force_empty gives an interfac
>  
>  will drop all charges in cgroup. Currently, this is maintained for test.
>  
> +The file memory.soft_limit_in_bytes allows users to set soft limits. A soft
> +limit is set in a manner similar to limit. The limit feature described
> +earlier is a hard limit, a group can never exceed it's hard limit. A soft

                          ;  [or: ". A group ..."]
and s/it's/its/

> +limit on the other hand can be exceeded. A group will be shrunk back
> +to it's soft limit, when there is memory pressure/contention.

      its  [it's == it is]

> +
> +Ideally the soft limit should always be set to a value smaller than the
> +hard limit. However, the code does not force the user to do so. The soft
> +limit can be greater than the hard limit; then the soft limit has
> +no meaning in that setup, since the group will alwasy be restrained to its

                                                  always

> +hard limit.
> +
> +Example setting of soft limit
> +
> +# echo -n 100M > memory.soft_limit_in_bytes
> +
>  4. Testing

---
~Randy


* Re: [RFC] [PATCH 4/4] Add soft limit documentation
  2008-02-13 15:59   ` Randy Dunlap
@ 2008-02-13 16:08     ` Balbir Singh
  0 siblings, 0 replies; 26+ messages in thread
From: Balbir Singh @ 2008-02-13 16:08 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: linux-mm, Hugh Dickins, Paul Menage, YAMAMOTO Takashi,
	Peter Zijlstra, Lee Schermerhorn, Herbert Poetzl, David Rientjes,
	Andrew Morton, Pavel Emelianov, Nick Piggin, Rik Van Riel,
	Eric W. Biederman, KAMEZAWA Hiroyuki

Randy Dunlap wrote:
> On Wed, 13 Feb 2008 20:42:56 +0530 Balbir Singh wrote:
> 
>> Add documentation for the soft limit feature.
>>
>> Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
>> ---
>>
>>  Documentation/controllers/memory.txt |   16 ++++++++++++++++
>>  1 file changed, 16 insertions(+)
>>
>> diff -puN Documentation/controllers/memory.txt~memory-controller-add-soft-limit-documentation Documentation/controllers/memory.txt
>> --- linux-2.6.24/Documentation/controllers/memory.txt~memory-controller-add-soft-limit-documentation	2008-02-13 18:45:40.000000000 +0530
>> +++ linux-2.6.24-balbir/Documentation/controllers/memory.txt	2008-02-13 18:49:58.000000000 +0530
>> @@ -201,6 +201,22 @@ The memory.force_empty gives an interfac
>>  
>>  will drop all charges in cgroup. Currently, this is maintained for test.
>>  
>> +The file memory.soft_limit_in_bytes allows users to set soft limits. A soft
>> +limit is set in a manner similar to limit. The limit feature described
>> +earlier is a hard limit, a group can never exceed it's hard limit. A soft
> 
>                           ;  [or: ". A group ..."]


Will do

> and s/it's/its/
> 

Thanks, I seem to use "it's" instead of "its" at times. I'll double-check next time.

>> +limit on the other hand can be exceeded. A group will be shrunk back
>> +to it's soft limit, when there is memory pressure/contention.
> 
>       its  [it's == it is]
> 
>> +
>> +Ideally the soft limit should always be set to a value smaller than the
>> +hard limit. However, the code does not force the user to do so. The soft
>> +limit can be greater than the hard limit; then the soft limit has
>> +no meaning in that setup, since the group will alwasy be restrained to its
> 
>                                                   always
> 

Will fix

>> +hard limit.
>> +
>> +Example setting of soft limit
>> +
>> +# echo -n 100M > memory.soft_limit_in_bytes
>> +
>>  4. Testing
> 

Thanks for helping us keep the documentation readable.

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL


end of thread

Thread overview: 26+ messages
2008-02-13 15:12 [RFC] [PATCH 0/4] Add soft limits to the memory controller Balbir Singh
2008-02-13 15:12 ` [RFC] [PATCH 1/4] Modify resource counters to add soft limit support Balbir Singh
2008-02-13 17:12   ` Pavel Emelyanov
2008-02-13 17:19     ` Balbir Singh
2008-02-13 17:38       ` Pavel Emelyanov
2008-02-13 17:54         ` Balbir Singh
2008-02-13 15:12 ` [RFC] [PATCH 2/4] Add the soft limit interface Balbir Singh
2008-02-13 15:12 ` [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure Balbir Singh
2008-02-14  7:30   ` KAMEZAWA Hiroyuki
2008-02-14  7:40     ` Balbir Singh
2008-02-14  8:42       ` KAMEZAWA Hiroyuki
2008-02-14  9:16         ` Balbir Singh
2008-02-15  4:17           ` Paul Menage
2008-02-15  4:25             ` Balbir Singh
2008-02-15  5:07               ` KAMEZAWA Hiroyuki
2008-02-15  5:16                 ` Paul Menage
2008-02-15  5:18                   ` Balbir Singh
2008-02-15  5:30                     ` Paul Menage
2008-02-15  5:33                     ` KAMEZAWA Hiroyuki
2008-02-15  5:29                   ` KAMEZAWA Hiroyuki
2008-02-15  6:36                     ` Balbir Singh
2008-02-14 10:27   ` YAMAMOTO Takashi
2008-02-15  3:19     ` Balbir Singh
2008-02-13 15:12 ` [RFC] [PATCH 4/4] Add soft limit documentation Balbir Singh
2008-02-13 15:59   ` Randy Dunlap
2008-02-13 16:08     ` Balbir Singh
