public inbox for linux-kernel@vger.kernel.org
* sched-idle and disk-priorities for 2.6.X
@ 2004-01-16 18:10 Pavel Machek
  2004-01-16 19:37 ` Valdis.Kletnieks
  0 siblings, 1 reply; 19+ messages in thread
From: Pavel Machek @ 2004-01-16 18:10 UTC (permalink / raw)
  To: kernel list

Hi!

I have a linguistics application here that is pretty
demanding... it eats a lot of memory, overloads the disk, and basically
makes the system unusable even for tasks as simple as reading mailing lists.

I basically don't care about the performance of that job, as long as it
makes progress at least when I'm not in front of the computer. But I'd
like to keep my machine usable...

Any ideas?

Where are the latest versions of the disk-prio and sched-idle patches?

								Pavel
-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: sched-idle and disk-priorities for 2.6.X
  2004-01-16 18:10 sched-idle and disk-priorities for 2.6.X Pavel Machek
@ 2004-01-16 19:37 ` Valdis.Kletnieks
  2004-01-17 15:24   ` Bill Davidsen
  2004-01-17 23:29   ` Pavel Machek
  0 siblings, 2 replies; 19+ messages in thread
From: Valdis.Kletnieks @ 2004-01-16 19:37 UTC (permalink / raw)
  To: Pavel Machek; +Cc: kernel list

On Fri, 16 Jan 2004 19:10:47 +0100, Pavel Machek <pavel@ucw.cz>  said:
> I have a linguistics application here that is pretty
> demanding... it eats a lot of memory, overloads the disk, and basically
> makes the system unusable even for tasks as simple as reading mailing lists.

> Where are the latest versions of the disk-prio and sched-idle patches?

The most likely culprit here is "eats memory".  What's almost certainly
happening is that your application is using a lot of pages, and leaving the
system in thrashing mode.  Quite likely, "sched-idle" won't do what you want,
as all that will happen is that the application will dirty pages while the CPU
is otherwise idle (and cause the same problem you're seeing already).
Similarly, disk-prio doesn't do you any good, because it isn't the I/O of the
huge application that's the problem, it's the I/O for the processes that are
thrashing.

A better bet would be a patch that allowed you to set the maximum RSS size for
the process so it can basically thrash itself while leaving enough memory for
everybody else (and yes, I *know* how this can be self-defeating if the
thrashing app then increases the total I/O consumed to be higher than the I/O
bandwidth available - the point is that it's probably the high RSS value for
his application causing OTHER things to thrash that's the root cause of his
performance problem).




* Re: sched-idle and disk-priorities for 2.6.X
  2004-01-16 19:37 ` Valdis.Kletnieks
@ 2004-01-17 15:24   ` Bill Davidsen
  2004-01-17 15:29     ` Nick Piggin
  2004-01-17 23:29   ` Pavel Machek
  1 sibling, 1 reply; 19+ messages in thread
From: Bill Davidsen @ 2004-01-17 15:24 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Pavel Machek, kernel list

Valdis.Kletnieks@vt.edu wrote:

> A better bet would be a patch that allowed you to set the maximum RSS size for
> the process so it can basically thrash itself while leaving enough memory for
> everybody else (and yes, I *know* how this can be self-defeating if the
> thrashing app then increases the total I/O consumed to be higher than the I/O
> bandwidth available - the point is that it's probably the high RSS value for
> his application causing OTHER things to thrash that's the root cause of his
> performance problem).

Or you could use "ulimit -m" to set the RSS, of course.

-- 
bill davidsen <davidsen@tmr.com>
   CTO TMR Associates, Inc
   Doing interesting things with small computers since 1979


* Re: sched-idle and disk-priorities for 2.6.X
  2004-01-17 15:24   ` Bill Davidsen
@ 2004-01-17 15:29     ` Nick Piggin
  2004-01-19  4:50       ` Bill Davidsen
  0 siblings, 1 reply; 19+ messages in thread
From: Nick Piggin @ 2004-01-17 15:29 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Valdis.Kletnieks, Pavel Machek, kernel list



Bill Davidsen wrote:

> Valdis.Kletnieks@vt.edu wrote:
>
>> A better bet would be a patch that allowed you to set the maximum RSS 
>> size for
>> the process so it can basically thrash itself while leaving enough 
>> memory for
>> everybody else (and yes, I *know* how this can be self-defeating if the
>> thrashing app then increases the total I/O consumed to be higher than 
>> the I/O
>> bandwidth available - the point is that it's probably the high RSS 
>> value for
>> his application causing OTHER things to thrash that's the root cause 
>> of his
>> performance problem).
>
>
> Or you could use "ulimit -m" to set the RSS, of course.


I don't think that would do anything with 2.6 :P




* Re: sched-idle and disk-priorities for 2.6.X
  2004-01-16 19:37 ` Valdis.Kletnieks
  2004-01-17 15:24   ` Bill Davidsen
@ 2004-01-17 23:29   ` Pavel Machek
  2004-01-18 18:47     ` Rik van Riel
  1 sibling, 1 reply; 19+ messages in thread
From: Pavel Machek @ 2004-01-17 23:29 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: kernel list

Hi!

> > I have a linguistics application here that is pretty
> > demanding... it eats a lot of memory, overloads the disk, and basically
> > makes the system unusable even for tasks as simple as reading mailing lists.
> 
> > Where are the latest versions of the disk-prio and sched-idle patches?
> 
> The most likely culprit here is "eats memory".  What's almost certainly
> happening is that your application is using a lot of pages, and leaving the
> system in thrashing mode.  Quite likely, "sched-idle" won't do what you want,
> as all that will happen is that the application will dirty pages while the CPU
> is otherwise idle (and cause the same problem you're seeing
> already).

Well, the problem is that it takes a long time even to wake galeon up
after you have not used it for a few seconds. The disk is utilized by
galeon at this point, and disk-prio + sched-idle should guarantee that
when I ask the system to do something, it does it at the maximum
possible performance.

> A better bet would be a patch that allowed you to set the maximum RSS size for
> the process so it can basically thrash itself while leaving enough memory for
> everybody else (and yes, I *know* how this can be self-defeating if the
> thrashing app then increases the total I/O consumed to be higher than the I/O
> bandwidth available - the point is that it's probably the high RSS value for
> his application causing OTHER things to thrash that's the root cause of his
> performance problem).

Is there an effective way to limit RSS?
									Pavel
-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]


* Re: sched-idle and disk-priorities for 2.6.X
  2004-01-17 23:29   ` Pavel Machek
@ 2004-01-18 18:47     ` Rik van Riel
  2004-01-18 19:58       ` Pavel Machek
  2004-01-26  2:06       ` Bill Davidsen
  0 siblings, 2 replies; 19+ messages in thread
From: Rik van Riel @ 2004-01-18 18:47 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Valdis.Kletnieks, kernel list

On Sun, 18 Jan 2004, Pavel Machek wrote:

> Is there effective way to limit RSS?

Want me to port the RSS stuff from 2.4-rmap to 2.6 ?

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan



* Re: sched-idle and disk-priorities for 2.6.X
  2004-01-18 18:47     ` Rik van Riel
@ 2004-01-18 19:58       ` Pavel Machek
  2004-01-21 19:49         ` Rik van Riel
  2004-01-26  2:06       ` Bill Davidsen
  1 sibling, 1 reply; 19+ messages in thread
From: Pavel Machek @ 2004-01-18 19:58 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Pavel Machek, Valdis.Kletnieks, kernel list

Hi!

> > Is there effective way to limit RSS?
> 
> Want me to port the RSS stuff from 2.4-rmap to 2.6 ?

Well, if it allows me to limit memory for one task so that it does not
make the system unusable... yes, that would be great.
								Pavel
-- 
When do you have heart between your knees?


* Re: sched-idle and disk-priorities for 2.6.X
  2004-01-17 15:29     ` Nick Piggin
@ 2004-01-19  4:50       ` Bill Davidsen
  2004-01-19  5:39         ` Nick Piggin
  0 siblings, 1 reply; 19+ messages in thread
From: Bill Davidsen @ 2004-01-19  4:50 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Valdis.Kletnieks, Pavel Machek, kernel list

Nick Piggin wrote:
> 
> 
> Bill Davidsen wrote:
> 
>> Valdis.Kletnieks@vt.edu wrote:
>>
>>> A better bet would be a patch that allowed you to set the maximum RSS 
>>> size for
>>> the process so it can basically thrash itself while leaving enough 
>>> memory for
>>> everybody else (and yes, I *know* how this can be self-defeating if the
>>> thrashing app then increases the total I/O consumed to be higher than 
>>> the I/O
>>> bandwidth available - the point is that it's probably the high RSS 
>>> value for
>>> his application causing OTHER things to thrash that's the root cause 
>>> of his
>>> performance problem).
>>
>>
>>
>> Or you could use "ulimit -m" to set the RSS, of course.
> 
> 
> 
> I don't think that would do anything with 2.6 :P

Does that imply that the feature doesn't function as documented in 2.6? 
Or is that a SysV-ism not in SuS and documented but not implemented, or 
what other reason would there be for it to not work?


-- 
bill davidsen <davidsen@tmr.com>
   CTO TMR Associates, Inc
   Doing interesting things with small computers since 1979


* Re: sched-idle and disk-priorities for 2.6.X
  2004-01-19  4:50       ` Bill Davidsen
@ 2004-01-19  5:39         ` Nick Piggin
  0 siblings, 0 replies; 19+ messages in thread
From: Nick Piggin @ 2004-01-19  5:39 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Valdis.Kletnieks, Pavel Machek, kernel list



Bill Davidsen wrote:

> Nick Piggin wrote:
>
>>
>>
>> Bill Davidsen wrote:
>>
>>>
>>> Or you could use "ulimit -m" to set the RSS, of course.
>>
>>
>>
>>
>> I don't think that would do anything with 2.6 :P
>
>
> Does that imply that the feature doesn't function as documented in 
> 2.6? Or is that a SysV-ism not in SuS and documented but not 
> implemented, or what other reason would there be for it to not work?
>

The first one. AFAIKS ulimit RSS doesn't do anything in the 2.6 vm.

Rik has a fairly straightforward looking implementation in his 2.4 vm
which probably wouldn't be too hard to forward port. It doesn't impose
a hard limit on RSS though: I'm not sure what the standards say about that.
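
For reference, "ulimit -m" is a shell front end to setrlimit(RLIMIT_RSS).
A minimal sketch of the same call from C follows. The helper names
set_rss_limit_kib() and get_rss_limit_bytes() are made up for this
example; only getrlimit() and setrlimit() are real interfaces. The call
succeeds and the value is stored regardless of whether the VM acts on
it, which is the point being discussed above.

```c
#include <assert.h>
#include <sys/resource.h>

/*
 * Set the soft RSS limit of the calling process to `kib` KiB (the unit
 * bash's "ulimit -m" uses). The hard limit is left untouched.
 * Returns 0 on success, -1 on error.
 */
int set_rss_limit_kib(unsigned long kib)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_RSS, &rl) != 0)
		return -1;

	rl.rlim_cur = (rlim_t)kib * 1024;	/* the kernel takes bytes */
	if (rl.rlim_max != RLIM_INFINITY && rl.rlim_cur > rl.rlim_max)
		rl.rlim_cur = rl.rlim_max;	/* soft may not exceed hard */

	return setrlimit(RLIMIT_RSS, &rl);
}

/* Read the soft RSS limit back, in bytes. */
rlim_t get_rss_limit_bytes(void)
{
	struct rlimit rl;
	int err = getrlimit(RLIMIT_RSS, &rl);

	assert(err == 0);
	(void)err;
	return rl.rlim_cur;
}
```

A process would call set_rss_limit_kib() before starting its
memory-hungry work; children inherit the limit across fork() and exec().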




* Re: sched-idle and disk-priorities for 2.6.X
  2004-01-18 19:58       ` Pavel Machek
@ 2004-01-21 19:49         ` Rik van Riel
  2004-01-22  1:04           ` Pavel Machek
  0 siblings, 1 reply; 19+ messages in thread
From: Rik van Riel @ 2004-01-21 19:49 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Valdis.Kletnieks, kernel list

On Sun, 18 Jan 2004, Pavel Machek wrote:

> > > Is there effective way to limit RSS?
> > 
> > Want me to port the RSS stuff from 2.4-rmap to 2.6 ?
> 
> Well, if it allows me to limit memory for one task so that it does not
> make system unusable... yes, that would be great.

Here it is.  Untested, except for whether it compiles cleanly ;)

Let me know how it works, if the enforcement is aggressive
enough or not, whether I need to tweak things etc...

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan



===== include/linux/init_task.h 1.27 vs edited =====
--- 1.27/include/linux/init_task.h	Mon Aug 18 22:46:23 2003
+++ edited/include/linux/init_task.h	Tue Jan 20 17:34:40 2004
@@ -2,6 +2,7 @@
 #define _LINUX__INIT_TASK_H
 
 #include <linux/file.h>
+#include <asm/resource.h>
 
 #define INIT_FILES \
 { 							\
@@ -41,6 +42,7 @@
 	.page_table_lock =  SPIN_LOCK_UNLOCKED, 		\
 	.mmlist		= LIST_HEAD_INIT(name.mmlist),		\
 	.default_kioctx = INIT_KIOCTX(name.default_kioctx, name),	\
+	.rlimit_rss	= RLIM_INFINITY			\
 }
 
 #define INIT_SIGNALS(sig) {	\
===== include/linux/sched.h 1.178 vs edited =====
--- 1.178/include/linux/sched.h	Mon Jan 19 18:38:15 2004
+++ edited/include/linux/sched.h	Tue Jan 20 17:32:56 2004
@@ -204,6 +204,7 @@
 	unsigned long arg_start, arg_end, env_start, env_end;
 	unsigned long rss, total_vm, locked_vm;
 	unsigned long def_flags;
+	unsigned long rlimit_rss;
 	cpumask_t cpu_vm_mask;
 
 	unsigned long saved_auxv[40]; /* for /proc/PID/auxv */
===== include/linux/swap.h 1.80 vs edited =====
--- 1.80/include/linux/swap.h	Mon Jan 19 01:28:35 2004
+++ edited/include/linux/swap.h	Tue Jan 20 18:16:28 2004
@@ -179,7 +179,7 @@
 
 /* linux/mm/rmap.c */
 #ifdef CONFIG_MMU
-int FASTCALL(page_referenced(struct page *));
+int FASTCALL(page_referenced(struct page *, int *));
 struct pte_chain *FASTCALL(page_add_rmap(struct page *, pte_t *,
 					struct pte_chain *));
 void FASTCALL(page_remove_rmap(struct page *, pte_t *));
@@ -188,7 +188,7 @@
 /* linux/mm/shmem.c */
 extern int shmem_unuse(swp_entry_t entry, struct page *page);
 #else
-#define page_referenced(page)	TestClearPageReferenced(page)
+#define page_referenced(page, _x)	TestClearPageReferenced(page)
 #define try_to_unmap(page)	SWAP_FAIL
 #endif /* CONFIG_MMU */
 
===== kernel/sys.c 1.69 vs edited =====
--- 1.69/kernel/sys.c	Mon Jan 19 18:38:13 2004
+++ edited/kernel/sys.c	Tue Jan 20 18:02:19 2004
@@ -1308,6 +1308,14 @@
 	if (retval)
 		return retval;
 
+	/* The rlimit is specified in bytes, convert to pages for mm. */
+	if (resource == RLIMIT_RSS && current->mm) {
+		unsigned long pages = RLIM_INFINITY;
+		if (new_rlim.rlim_cur != RLIM_INFINITY)
+			pages = new_rlim.rlim_cur >> PAGE_SHIFT;
+		current->mm->rlimit_rss = pages;
+	}
+
 	*old_rlim = new_rlim;
 	return 0;
 }
===== mm/rmap.c 1.34 vs edited =====
--- 1.34/mm/rmap.c	Mon Jan 19 01:36:00 2004
+++ edited/mm/rmap.c	Tue Jan 20 18:26:03 2004
@@ -104,6 +104,7 @@
 /**
  * page_referenced - test if the page was referenced
  * @page: the page to test
+ * @rsslimit: set if the process(es) using the page is(are) over RSS limit
  *
  * Quick test_and_clear_referenced for all mappings to a page,
  * returns the number of processes which referenced the page.
@@ -112,8 +113,9 @@
  * If the page has a single-entry pte_chain, collapse that back to a PageDirect
  * representation.  This way, it's only done under memory pressure.
  */
-int page_referenced(struct page * page)
+int page_referenced(struct page * page, int * rsslimit)
 {
+	struct mm_struct * mm;
 	struct pte_chain *pc;
 	int referenced = 0;
 
@@ -127,10 +129,17 @@
 		pte_t *pte = rmap_ptep_map(page->pte.direct);
 		if (ptep_test_and_clear_young(pte))
 			referenced++;
+
+		mm = ptep_to_mm(pte);
+		if (mm->rss > mm->rlimit_rss)
+			*rsslimit = 1;
 		rmap_ptep_unmap(pte);
 	} else {
 		int nr_chains = 0;
 
+		/* We clear it if any task using the page is under its limit. */
+		*rsslimit = 1;
+
 		/* Check all the page tables mapping this page. */
 		for (pc = page->pte.chain; pc; pc = pte_chain_next(pc)) {
 			int i;
@@ -142,6 +151,10 @@
 				p = rmap_ptep_map(pte_paddr);
 				if (ptep_test_and_clear_young(p))
 					referenced++;
+
+				mm = ptep_to_mm(p);
+				if (mm->rss < mm->rlimit_rss)
+					*rsslimit = 0;
 				rmap_ptep_unmap(p);
 				nr_chains++;
 			}
===== mm/vmscan.c 1.177 vs edited =====
--- 1.177/mm/vmscan.c	Mon Jan 19 18:38:07 2004
+++ edited/mm/vmscan.c	Wed Jan 21 14:34:44 2004
@@ -250,6 +250,7 @@
 	LIST_HEAD(ret_pages);
 	struct pagevec freed_pvec;
 	int pgactivate = 0;
+	int over_rsslimit;
 	int ret = 0;
 
 	cond_resched();
@@ -278,10 +279,12 @@
 			goto keep_locked;
 
 		pte_chain_lock(page);
-		referenced = page_referenced(page);
+		referenced = page_referenced(page, &over_rsslimit);
 		if (referenced && page_mapping_inuse(page)) {
 			/* In active use or really unfreeable.  Activate it. */
 			pte_chain_unlock(page);
+			if (over_rsslimit)
+				goto keep_locked;
 			goto activate_locked;
 		}
 
@@ -597,6 +600,7 @@
 	long mapped_ratio;
 	long distress;
 	long swap_tendency;
+	int over_rsslimit;
 
 	lru_add_drain();
 	pgmoved = 0;
@@ -657,7 +661,7 @@
 		list_del(&page->lru);
 		if (page_mapped(page)) {
 			pte_chain_lock(page);
-			if (page_mapped(page) && page_referenced(page)) {
+			if (page_mapped(page) && page_referenced(page, &over_rsslimit) && !over_rsslimit) {
 				pte_chain_unlock(page);
 				list_add(&page->lru, &l_active);
 				continue;
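
The way the patched page_referenced() derives the over-limit flag can be
modelled outside the kernel. This is only a sketch of the logic as it
reads in the hunks above: struct mm_sketch and both function names are
hypothetical, and returning 0 in the single-mapping case is an
assumption about the intent, since the patch leaves the caller's
variable unwritten there. The two branches also treat a task sitting
exactly at its limit differently (">" in one, "<" in the other); the
sketch reproduces that as posted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two mm_struct fields the patch reads. */
struct mm_sketch {
	unsigned long rss;		/* pages currently resident */
	unsigned long rlimit_rss;	/* limit, in pages */
};

/*
 * PageDirect case: exactly one mapping. The patch only writes 1 when the
 * task is over its limit; returning 0 otherwise is this sketch's choice.
 */
int over_limit_direct(const struct mm_sketch *mm)
{
	return mm->rss > mm->rlimit_rss;
}

/*
 * pte_chain case: start from "over limit" and clear the flag as soon as
 * one mapper is below its limit, so a shared page is only penalised when
 * no task mapping it is under its limit.
 */
int over_limit_chain(const struct mm_sketch *mappers, size_t n)
{
	int over = 1;

	assert(mappers != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (mappers[i].rss < mappers[i].rlimit_rss)
			over = 0;
	}
	return over;
}
```

With the default limit of RLIM_INFINITY every ordinary task is below
its limit, so any page shared with an unlimited task keeps its normal
treatment.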



* Re: sched-idle and disk-priorities for 2.6.X
  2004-01-21 19:49         ` Rik van Riel
@ 2004-01-22  1:04           ` Pavel Machek
  2004-01-22  1:13             ` Rik van Riel
  0 siblings, 1 reply; 19+ messages in thread
From: Pavel Machek @ 2004-01-22  1:04 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Valdis.Kletnieks, kernel list

Hi!

> > > > Is there effective way to limit RSS?
> > > 
> > > Want me to port the RSS stuff from 2.4-rmap to 2.6 ?
> > 
> > Well, if it allows me to limit memory for one task so that it does not
> > make system unusable... yes, that would be great.
> 
> Here it is.  Untested, except for whether it compiles cleanly ;)
> 
> Let me know how it works, if the enforcement is aggressive
> enough or not, whether I need to tweak things etc...

It boots, and seems to have no ill effects. I've yet to see any good
effects, though...

doing 

ulimit -m 1
<some task>

should make that task run with extremely low priority, right?

								Pavel

-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]


* Re: sched-idle and disk-priorities for 2.6.X
  2004-01-22  1:04           ` Pavel Machek
@ 2004-01-22  1:13             ` Rik van Riel
  2004-01-23 18:59               ` Pavel Machek
  0 siblings, 1 reply; 19+ messages in thread
From: Rik van Riel @ 2004-01-22  1:13 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Valdis.Kletnieks, kernel list

On Thu, 22 Jan 2004, Pavel Machek wrote:

> doing 
> 
> ulimit -m 1
> <some task>
> 
> should make that task run with extremely low priority, right?

Yeah, when the box is under memory pressure, pages from that
task should never hit the active list.  Instead, they should
always stay on the inactive list and the non-referenced pages
from that app should get reclaimed.

OTOH, if the app keeps referencing all pages, maybe I need
to tune up the aggressiveness a bit and also reclaim the
referenced pages ... if the current patch doesn't work right
I'll make a more aggressive one.

Note that RSS limit enforcement is always lazy, because
otherwise the RSS limited task will hog the IO subsystem
full-time and slow everything else down ... even when there's
more than enough memory.

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan



* Re: sched-idle and disk-priorities for 2.6.X
  2004-01-22  1:13             ` Rik van Riel
@ 2004-01-23 18:59               ` Pavel Machek
  2004-01-23 19:04                 ` Rik van Riel
  0 siblings, 1 reply; 19+ messages in thread
From: Pavel Machek @ 2004-01-23 18:59 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Valdis.Kletnieks, kernel list

Hi!

> > ulimit -m 1
> > <some task>
> > 
> > should make that task run with extremely low priority, right?
> 
> Yeah, when the box is under memory pressure, pages from that
> task should never hit the active list.  Instead, they should
> always stay on the inactive list and the non-referenced pages
> from that app should get reclaimed.
> 
> OTOH, if the app keeps referencing all pages, maybe I need
> to tune up the aggressiveness a bit and also reclaim the
> referenced pages ... if the current patch doesn't work right
> I'll make a more aggressive one.

I'm afraid it needs to be more aggressive.

I made two programs, each walking over 150MB of memory, and ran them
at the same time on a 250MB machine. One of them ran with ulimit -m 1...
Both got about the same amount of RAM and progressed at similar speed.
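
The test programs themselves were not posted. A minimal sketch of a
walker of this kind might look like the following; walk_memory() and
the 4 KiB page size are assumptions made for illustration, not the
code that was actually run.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096	/* assumption: 4 KiB pages */

/*
 * Allocate `bytes` of memory and write one byte in every page, `passes`
 * times over, so that every page is referenced and dirtied repeatedly.
 * Returns the number of page touches, or 0 if the allocation failed.
 */
unsigned long walk_memory(size_t bytes, int passes)
{
	unsigned long touched = 0;
	volatile unsigned char *buf;

	assert(passes >= 0);
	buf = malloc(bytes);
	if (!buf)
		return 0;

	for (int pass = 0; pass < passes; pass++) {
		for (size_t off = 0; off < bytes; off += SKETCH_PAGE_SIZE) {
			buf[off] = (unsigned char)pass;	/* dirty the page */
			touched++;
		}
	}

	free((void *)buf);
	return touched;
}
```

Calling walk_memory(150UL * 1024 * 1024, passes) from two processes,
one of them started under "ulimit -m 1", would approximate the
experiment described here.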

							Pavel
-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]


* Re: sched-idle and disk-priorities for 2.6.X
  2004-01-23 18:59               ` Pavel Machek
@ 2004-01-23 19:04                 ` Rik van Riel
  2004-01-23 21:04                   ` Pavel Machek
  0 siblings, 1 reply; 19+ messages in thread
From: Rik van Riel @ 2004-01-23 19:04 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Valdis.Kletnieks, kernel list

On Fri, 23 Jan 2004, Pavel Machek wrote:

> I'm afraid it needs to be more aggressive.

OK, is the patch below any better ?

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan

===== include/linux/init_task.h 1.27 vs edited =====
--- 1.27/include/linux/init_task.h	Mon Aug 18 22:46:23 2003
+++ edited/include/linux/init_task.h	Tue Jan 20 17:34:40 2004
@@ -2,6 +2,7 @@
 #define _LINUX__INIT_TASK_H
 
 #include <linux/file.h>
+#include <asm/resource.h>
 
 #define INIT_FILES \
 { 							\
@@ -41,6 +42,7 @@
 	.page_table_lock =  SPIN_LOCK_UNLOCKED, 		\
 	.mmlist		= LIST_HEAD_INIT(name.mmlist),		\
 	.default_kioctx = INIT_KIOCTX(name.default_kioctx, name),	\
+	.rlimit_rss	= RLIM_INFINITY			\
 }
 
 #define INIT_SIGNALS(sig) {	\
===== include/linux/sched.h 1.178 vs edited =====
--- 1.178/include/linux/sched.h	Mon Jan 19 18:38:15 2004
+++ edited/include/linux/sched.h	Tue Jan 20 17:32:56 2004
@@ -204,6 +204,7 @@
 	unsigned long arg_start, arg_end, env_start, env_end;
 	unsigned long rss, total_vm, locked_vm;
 	unsigned long def_flags;
+	unsigned long rlimit_rss;
 	cpumask_t cpu_vm_mask;
 
 	unsigned long saved_auxv[40]; /* for /proc/PID/auxv */
===== include/linux/swap.h 1.80 vs edited =====
--- 1.80/include/linux/swap.h	Mon Jan 19 01:28:35 2004
+++ edited/include/linux/swap.h	Tue Jan 20 18:16:28 2004
@@ -179,7 +179,7 @@
 
 /* linux/mm/rmap.c */
 #ifdef CONFIG_MMU
-int FASTCALL(page_referenced(struct page *));
+int FASTCALL(page_referenced(struct page *, int *));
 struct pte_chain *FASTCALL(page_add_rmap(struct page *, pte_t *,
 					struct pte_chain *));
 void FASTCALL(page_remove_rmap(struct page *, pte_t *));
@@ -188,7 +188,7 @@
 /* linux/mm/shmem.c */
 extern int shmem_unuse(swp_entry_t entry, struct page *page);
 #else
-#define page_referenced(page)	TestClearPageReferenced(page)
+#define page_referenced(page, _x)	TestClearPageReferenced(page)
 #define try_to_unmap(page)	SWAP_FAIL
 #endif /* CONFIG_MMU */
 
===== kernel/sys.c 1.69 vs edited =====
--- 1.69/kernel/sys.c	Mon Jan 19 18:38:13 2004
+++ edited/kernel/sys.c	Tue Jan 20 18:02:19 2004
@@ -1308,6 +1308,14 @@
 	if (retval)
 		return retval;
 
+	/* The rlimit is specified in bytes, convert to pages for mm. */
+	if (resource == RLIMIT_RSS && current->mm) {
+		unsigned long pages = RLIM_INFINITY;
+		if (new_rlim.rlim_cur != RLIM_INFINITY)
+			pages = new_rlim.rlim_cur >> PAGE_SHIFT;
+		current->mm->rlimit_rss = pages;
+	}
+
 	*old_rlim = new_rlim;
 	return 0;
 }
===== mm/rmap.c 1.34 vs edited =====
--- 1.34/mm/rmap.c	Mon Jan 19 01:36:00 2004
+++ edited/mm/rmap.c	Tue Jan 20 18:26:03 2004
@@ -104,6 +104,7 @@
 /**
  * page_referenced - test if the page was referenced
  * @page: the page to test
+ * @rsslimit: set if the process(es) using the page is(are) over RSS limit
  *
  * Quick test_and_clear_referenced for all mappings to a page,
  * returns the number of processes which referenced the page.
@@ -112,8 +113,9 @@
  * If the page has a single-entry pte_chain, collapse that back to a PageDirect
  * representation.  This way, it's only done under memory pressure.
  */
-int page_referenced(struct page * page)
+int page_referenced(struct page * page, int * rsslimit)
 {
+	struct mm_struct * mm;
 	struct pte_chain *pc;
 	int referenced = 0;
 
@@ -127,10 +129,17 @@
 		pte_t *pte = rmap_ptep_map(page->pte.direct);
 		if (ptep_test_and_clear_young(pte))
 			referenced++;
+
+		mm = ptep_to_mm(pte);
+		if (mm->rss > mm->rlimit_rss)
+			*rsslimit = 1;
 		rmap_ptep_unmap(pte);
 	} else {
 		int nr_chains = 0;
 
+		/* We clear it if any task using the page is under its limit. */
+		*rsslimit = 1;
+
 		/* Check all the page tables mapping this page. */
 		for (pc = page->pte.chain; pc; pc = pte_chain_next(pc)) {
 			int i;
@@ -142,6 +151,10 @@
 				p = rmap_ptep_map(pte_paddr);
 				if (ptep_test_and_clear_young(p))
 					referenced++;
+
+				mm = ptep_to_mm(p);
+				if (mm->rss < mm->rlimit_rss)
+					*rsslimit = 0;
 				rmap_ptep_unmap(p);
 				nr_chains++;
 			}
===== mm/vmscan.c 1.177 vs edited =====
--- 1.177/mm/vmscan.c	Mon Jan 19 18:38:07 2004
+++ edited/mm/vmscan.c	Fri Jan 23 14:00:48 2004
@@ -250,6 +250,7 @@
 	LIST_HEAD(ret_pages);
 	struct pagevec freed_pvec;
 	int pgactivate = 0;
+	int over_rsslimit;
 	int ret = 0;
 
 	cond_resched();
@@ -278,8 +279,8 @@
 			goto keep_locked;
 
 		pte_chain_lock(page);
-		referenced = page_referenced(page);
-		if (referenced && page_mapping_inuse(page)) {
+		referenced = page_referenced(page, &over_rsslimit);
+		if (referenced && page_mapping_inuse(page) && !over_rsslimit) {
 			/* In active use or really unfreeable.  Activate it. */
 			pte_chain_unlock(page);
 			goto activate_locked;
@@ -597,6 +598,7 @@
 	long mapped_ratio;
 	long distress;
 	long swap_tendency;
+	int over_rsslimit;
 
 	lru_add_drain();
 	pgmoved = 0;
@@ -657,7 +659,7 @@
 		list_del(&page->lru);
 		if (page_mapped(page)) {
 			pte_chain_lock(page);
-			if (page_mapped(page) && page_referenced(page)) {
+			if (page_mapped(page) && page_referenced(page, &over_rsslimit) && !over_rsslimit) {
 				pte_chain_unlock(page);
 				list_add(&page->lru, &l_active);
 				continue;
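
The only functional difference between the two versions of the patch is
in the shrink_list() hunk of mm/vmscan.c. The sketch below models that
one decision in plain C. The enum and function names are hypothetical;
the three inputs correspond to the result of page_referenced(), the
result of page_mapping_inuse(), and the over_rsslimit flag in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* What shrink_list() does next with the page it is looking at. */
enum page_fate {
	PAGE_ACTIVATE,		/* goto activate_locked */
	PAGE_KEEP_INACTIVE,	/* goto keep_locked: stays inactive, not freed */
	PAGE_TRY_RECLAIM	/* fall through to the reclaim path */
};

/* First patch: a referenced page of an over-limit task is kept, not freed. */
enum page_fate fate_patch1(bool referenced, bool inuse, bool over_rsslimit)
{
	if (referenced && inuse)
		return over_rsslimit ? PAGE_KEEP_INACTIVE : PAGE_ACTIVATE;
	return PAGE_TRY_RECLAIM;
}

/* Second patch: being over the limit cancels the "recently used" protection. */
enum page_fate fate_patch2(bool referenced, bool inuse, bool over_rsslimit)
{
	if (referenced && inuse && !over_rsslimit)
		return PAGE_ACTIVATE;
	return PAGE_TRY_RECLAIM;
}

/* The one case where the two versions disagree. */
void fate_example(void)
{
	assert(fate_patch1(true, true, true) == PAGE_KEEP_INACTIVE);
	assert(fate_patch2(true, true, true) == PAGE_TRY_RECLAIM);
}
```

A task that keeps touching all of its pages, like the 150MB walkers,
always presents referenced pages, which would explain why the first
version made no visible difference in that test.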



* Re: sched-idle and disk-priorities for 2.6.X
  2004-01-23 19:04                 ` Rik van Riel
@ 2004-01-23 21:04                   ` Pavel Machek
  2004-01-26 23:08                     ` bill davidsen
  0 siblings, 1 reply; 19+ messages in thread
From: Pavel Machek @ 2004-01-23 21:04 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Valdis.Kletnieks, kernel list

Hi!

> > I'm afraid it needs to be more aggressive.
> 
> OK, is the patch below any better ?

Yes, this one actually works. When I launched two 150MB tasks, one of
them with ulimit -m 1, the limited task yielded its memory to the
unlimited one. It worked as expected.
								Pavel
-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]


* Re: sched-idle and disk-priorities for 2.6.X
  2004-01-18 18:47     ` Rik van Riel
  2004-01-18 19:58       ` Pavel Machek
@ 2004-01-26  2:06       ` Bill Davidsen
  2004-01-26  7:19         ` Pavel Machek
  1 sibling, 1 reply; 19+ messages in thread
From: Bill Davidsen @ 2004-01-26  2:06 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Pavel Machek, Valdis.Kletnieks, kernel list

Rik van Riel wrote:
> On Sun, 18 Jan 2004, Pavel Machek wrote:
> 
> 
>>Is there effective way to limit RSS?
> 
> 
> Want me to port the RSS stuff from 2.4-rmap to 2.6 ?

What's the effort? It's useful for programs which use a lot of memory, 
particularly for those which only do it on some data sets. It would 
certainly act as a safety net.

-- 
bill davidsen <davidsen@tmr.com>
   CTO TMR Associates, Inc
   Doing interesting things with small computers since 1979


* Re: sched-idle and disk-priorities for 2.6.X
  2004-01-26  2:06       ` Bill Davidsen
@ 2004-01-26  7:19         ` Pavel Machek
  0 siblings, 0 replies; 19+ messages in thread
From: Pavel Machek @ 2004-01-26  7:19 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Rik van Riel, Valdis.Kletnieks, kernel list

Hi!

> >>Is there effective way to limit RSS?
> >
> >
> >Want me to port the RSS stuff from 2.4-rmap to 2.6 ?
> 
> What's the effort? It's useful for programs which use a lot of memory, 
> particularly for those which only do it on some data sets. It would 
> certainly act as a safety net.

If you have two apps which both need lots of memory, you may want to
"renice" one of them to run slower.

Alternatively, you may have one app eating lots of memory and want
your system to remain usable.
								Pavel
-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]


* Re: sched-idle and disk-priorities for 2.6.X
  2004-01-23 21:04                   ` Pavel Machek
@ 2004-01-26 23:08                     ` bill davidsen
  2004-02-03 19:13                       ` Pavel Machek
  0 siblings, 1 reply; 19+ messages in thread
From: bill davidsen @ 2004-01-26 23:08 UTC (permalink / raw)
  To: linux-kernel

In article <20040123210449.GA250@elf.ucw.cz>,
Pavel Machek  <pavel@ucw.cz> wrote:

| > > I'm afraid it needs to be more aggressive.
| > 
| > OK, is the patch below any better ?
| 
| Yes, this one actually works. When I launched two 150MB tasks, one of
| them with ulimit -m 1, the limited task yielded its memory to
| unlimited one. It worked as expected.

I'm not sure what "as expected" means with this small a limit, hopefully
not "pages its butt off." I am printing a hardcopy of the 2nd patch and
a bit of the surrounding code, and also compiling a new kernel with the
patch in place, so I can play a bit in the morning.

I also wonder if a sanity check is desirable on the minimum size. At
some point I would think the system would get a lot of overhead trying
to actually use a single 1k page :-(

Thanks for this prompt implementation, I do have a few applications
which can use it!
-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.


* Re: sched-idle and disk-priorities for 2.6.X
  2004-01-26 23:08                     ` bill davidsen
@ 2004-02-03 19:13                       ` Pavel Machek
  0 siblings, 0 replies; 19+ messages in thread
From: Pavel Machek @ 2004-02-03 19:13 UTC (permalink / raw)
  To: bill davidsen; +Cc: linux-kernel

Hi!

> | > > I'm afraid it needs to be more aggressive.
> | > 
> | > OK, is the patch below any better ?
> | 
> | Yes, this one actually works. When I launched two 150MB tasks, one of
> | them with ulimit -m 1, the limited task yielded its memory to
> | unlimited one. It worked as expected.
> 
> I'm not sure what "as expected" means with this small a limit, hopefully
> not "pages its butt off." I am printing a hardcopy of the 2nd patch and
> a bit of the surrounding code, and also compiling a new kernel with the
> patch in place, so I can play a bit in the morning.

Well, it should mean "all the memory from this process should be
reclaimed if it is needed".

> I also wonder if a sanity check is desirable on the minimum size. At
> some point I would think the system would get a lot of overhead trying
> to actually use a single 1k page :-(

I believe it is okay as is. It just gives it very low "memory-priority".
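
To make the "ulimit -m 1" case concrete: bash passes the value in units
of 1024 bytes, and the kernel/sys.c hunk of the patch shifts the byte
count right by PAGE_SHIFT. The sketch below repeats that arithmetic in
user space. PAGE_SHIFT = 12 (4 KiB pages) and the SKETCH_ names are
assumptions for illustration only.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_PAGE_SHIFT	12		/* assumption: 4 KiB pages */
#define SKETCH_RLIM_INFINITY	ULONG_MAX	/* stand-in for RLIM_INFINITY */

/* Mirrors the bytes-to-pages conversion the patch adds to setrlimit. */
unsigned long rss_limit_pages(unsigned long rlim_cur_bytes)
{
	if (rlim_cur_bytes == SKETCH_RLIM_INFINITY)
		return SKETCH_RLIM_INFINITY;
	return rlim_cur_bytes >> SKETCH_PAGE_SHIFT;
}

/* Mirrors the single-mapping comparison in the patched page_referenced(). */
int over_rss_limit(unsigned long rss_pages, unsigned long limit_pages)
{
	return rss_pages > limit_pages;
}

/* "ulimit -m 1" is 1024 bytes, which rounds down to a limit of 0 pages. */
void ulimit_m_1_example(void)
{
	unsigned long limit = rss_limit_pages(1UL * 1024);

	assert(limit == 0);
	assert(over_rss_limit(1, limit));	/* one resident page is already over */
}
```

Under these assumptions the limit is never "a single 1k page"; it
becomes zero pages, so the task is simply always over its limit and
its pages are always candidates for reclaim under memory pressure.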

									Pavel
-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]
