public inbox for linux-kernel@vger.kernel.org
From: "M.H.VanLeeuwen" <vanl@megsinet.net>
To: Andreas Hartmann <andihartmann@freenet.de>
Cc: Rik van Riel <riel@conectiva.com.br>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>,
	Andrea Arcangeli <andrea@suse.de>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] [2.4.17/18pre] VM and swap - it's really unusable
Date: Mon, 31 Dec 2001 11:14:04 -0600	[thread overview]
Message-ID: <3C309CDC.DEA9960A@megsinet.net> (raw)
In-Reply-To: <Pine.LNX.4.33L.0112292256490.24031-100000@imladris.surriel.com> <3C2F04F6.7030700@athlon.maya.org>

[-- Attachment #1: Type: text/plain, Size: 2312 bytes --]

Andreas,

Glad you liked oom2 patch ;)

I split the patch into 2 pieces, mostly because the OOM killer tamer isn't
really needed anymore given the characteristics of the vmscan/cache recovery
changes.  Both are a little different than before.

a. fixed a bug in the OOM killer check for whether the cache was shrinking:
   a greater-or-equal comparison should have been strictly greater-than.  Unless
   you were severely stressing your system you probably wouldn't have hit this,
   especially since vmscan/cache recovery is working quite well.

b. removed nr_swap_pages from the check before deciding to go to swap_out().  It seems
   that we really do need to go to swap_out() even for systems w/o any swap, to clean
   up various resources before some cache can be recovered.  Unless you ran out of
   swap space or have no swap device you probably didn't hit this problem either.

I'm a little concerned about the number of pages we are potentially bouncing between
the active and inactive lists (refill_inactive) and back again once max_mapped is hit, but
a few experiments with limiting page movement between lists just seemed to make
near-OOM performance much worse (lengthy system pauses & mouse freezes).

Anyone else have any comments on the patch(es)?

Do they make any sense?  If so, should either/both patches, or pieces thereof, be
pushed to Marcelo?

Martin
--------------------------------------------------------------------------------------
OOM tamer patch:

a. if the cache continues to shrink, we're not OOM; just reset the OOM variables and exit.
b. instead of waiting 1 second, wait a variable time based on Mb of cache, so the
   larger the cache, the longer we wait before starting to kill processes.  E.g. 10 Mb of
   cache causes the OOM killer to wait 10 seconds before starting "c" below.
c. instead of a fixed 10 occurrences after the N second wait above, require 10 * Mb-of-cache
   occurrences.

vmscan patch:

a. instead of calling swap_out() as soon as max_mapped is reached, continue trying
   to free pages.  This reduces the number of times we hit try_to_free_pages() and
   swap_out().
b. once the max_mapped count is reached, move any referenced pages to the active list
   until enough pages have been freed or there are no more pages that can be freed.
c. only call swap_out() if max_mapped is reached and we still fail to free enough pages.

[-- Attachment #2: oom.patch.2.4.17 --]
[-- Type: application/octet-stream, Size: 1285 bytes --]

--- ./mm/oom_kill.c	Mon Nov  5 19:45:10 2001
+++ /usr/src/linux/./mm/oom_kill.c	Sun Dec 30 21:35:19 2001
@@ -198,7 +198,7 @@
  */
 void out_of_memory(void)
 {
-	static unsigned long first, last, count;
+	static unsigned long first, last, count, mega;
 	unsigned long now, since;
 
 	/*
@@ -220,19 +220,30 @@
 		goto reset;
 
 	/*
-	 * If we haven't tried for at least one second,
+	 * If cache is still shrinking, we're not oom.
+	 */
+	if (mega > pages_to_mb(atomic_read(&page_cache_size))) {
+		goto reset;
+	}
+
+	/*
+	 * If we haven't tried for at least mega*second(s),
 	 * we're not really oom.
 	 */
 	since = now - first;
-	if (since < HZ)
+	if (since < mega * HZ) {
+		printk(KERN_DEBUG "out_of_memory: cache size %lu Mb, since = %lu.%02lu\n", mega, since/HZ, since%HZ);
 		return;
+	}
 
 	/*
-	 * If we have gotten only a few failures,
+	 * If we haven't gotten mega failures,
 	 * we're not really oom. 
 	 */
-	if (++count < 10)
+	if (++count < mega * 10) {
+		printk(KERN_DEBUG "out_of_memory: cache size %lu Mb, since = %lu.%02lu, count %lu\n", mega, since/HZ, since%HZ, count);
 		return;
+	}
 
 	/*
 	 * Ok, really out of memory. Kill something.
@@ -240,6 +251,7 @@
 	oom_kill();
 
 reset:
+	mega = pages_to_mb(atomic_read(&page_cache_size));
 	first = now;
 	count = 0;
 }

[-- Attachment #3: vmscan.patch.2.4.17 --]
[-- Type: application/octet-stream, Size: 1929 bytes --]

--- ./mm/swap.c	Sat Nov 24 23:59:28 2001
+++ /usr/src/linux/./mm/swap.c	Sun Dec 30 14:26:45 2001
@@ -36,7 +36,7 @@
 /*
  * Move an inactive page to the active list.
  */
-static inline void activate_page_nolock(struct page * page)
+void activate_page_nolock(struct page * page)
 {
 	if (PageLRU(page) && !PageActive(page)) {
 		del_page_from_inactive_list(page);
--- ./mm/vmscan.c	Sat Dec 22 09:35:54 2001
+++ /usr/src/linux/./mm/vmscan.c	Sun Dec 30 23:40:10 2001
@@ -394,9 +394,9 @@
 		if (PageDirty(page) && is_page_cache_freeable(page) && page->mapping) {
 			/*
 			 * It is not critical here to write it only if
-			 * the page is unmapped beause any direct writer
+			 * the page is unmapped because any direct writer
 			 * like O_DIRECT would set the PG_dirty bitflag
-			 * on the phisical page after having successfully
+			 * on the physical page after having successfully
 			 * pinned it and after the I/O to the page is finished,
 			 * so the direct writes to the page cannot get lost.
 			 */
@@ -480,11 +480,12 @@
 
 			/*
 			 * Alert! We've found too many mapped pages on the
-			 * inactive list, so we start swapping out now!
+			 * inactive list.
+			 * Move referenced pages to the active list.
 			 */
-			spin_unlock(&pagemap_lru_lock);
-			swap_out(priority, gfp_mask, classzone);
-			return nr_pages;
+			if (PageReferenced(page))
+				activate_page_nolock(page);
+			continue;
 		}
 
 		/*
@@ -521,6 +522,9 @@
 	}
 	spin_unlock(&pagemap_lru_lock);
 
+	if (max_mapped <= 0 && nr_pages > 0)
+		swap_out(priority, gfp_mask, classzone);
+
 	return nr_pages;
 }
 
--- ./include/linux/swap.h	Sat Nov 24 23:59:24 2001
+++ /usr/src/linux/./include/linux/swap.h	Sun Dec 30 15:01:21 2001
@@ -106,6 +106,7 @@
 extern void FASTCALL(lru_cache_del(struct page *));
 
 extern void FASTCALL(activate_page(struct page *));
+extern void FASTCALL(activate_page_nolock(struct page *));
 
 extern void swap_setup(void);
 

       reply	other threads:[~2001-12-31 17:15 UTC|newest]

Thread overview: 11+ messages
     [not found] <Pine.LNX.4.33L.0112292256490.24031-100000@imladris.surriel.com>
     [not found] ` <3C2F04F6.7030700@athlon.maya.org>
2001-12-31 17:14   ` M.H.VanLeeuwen [this message]
2001-12-31 17:53     ` [PATCH] [2.4.17/18pre] VM and swap - it's really unusable Stephan von Krawczynski
2001-12-31 20:13       ` M.H.VanLeeuwen
2002-01-04  2:14       ` M.H.VanLeeuwen
2002-01-04 12:33         ` Stephan von Krawczynski
2002-01-04 14:14           ` Andrea Arcangeli
2002-01-04 14:24             ` Stephan von Krawczynski
2002-01-04 14:51               ` Andrea Arcangeli
2002-01-05  2:20                 ` M.H.VanLeeuwen
2002-01-04 14:11         ` Andrea Arcangeli
2002-01-05 19:47         ` José Luis Domingo López
