From: Roger Larsson <roger.larsson@norran.net>
To: Linus Torvalds <torvalds@transmeta.com>,
Rik van Riel <riel@conectiva.com.br>
Cc: linux-mm@kvack.org
Subject: [RFC][PATCH] shrink_mmap avoid list_del (Was: Re: [PATCH] Recent VM fiasco - fixed)
Date: Fri, 12 May 2000 04:51:42 +0200
Message-ID: <391B71BE.6302F9BD@norran.net>
In-Reply-To: <Pine.LNX.4.10.10005111700520.1319-100000@penguin.transmeta.com>
[-- Attachment #1: Type: text/plain, Size: 2210 bytes --]
Hi,
I tried to find a way to walk the LRU list without doing a list_del on every page.
Here is my patch:
- neither compiled nor run (low on HD space...)
Could something like this be used?
If not, why not?
/RogerL
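The traversal idea can be sketched in userspace C roughly as follows. This is only a minimal illustration, not the kernel code: struct demo_page, demo_entry and scan_lru are made-up names, the list helpers are simplified re-implementations of the kernel ones, and no locking is modelled. Unlike the patch, the sketch also counts referenced pages against the scan budget, to keep it short.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on the kernel's
 * struct list_head. Everything here is illustrative only. */
struct list_head { struct list_head *next, *prev; };

struct demo_page {              /* hypothetical stand-in for struct page */
    struct list_head lru;
    int id;
    int referenced;
};

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }

/* Insert n right after head, i.e. at the front of the list. */
static void list_add(struct list_head *n, struct list_head *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Unlink n from whatever list it is on. */
static void list_del(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

#define demo_entry(ptr) \
    ((struct demo_page *)((char *)(ptr) - offsetof(struct demo_page, lru)))

/* Walk from the tail towards the head, looking at no more than `count`
 * entries. Referenced pages get their flag cleared and are moved to the
 * head; all other pages stay where they are, so nothing has to be
 * unlinked and re-added just to advance. Returns the number moved. */
static int scan_lru(struct list_head *lru, int count)
{
    struct list_head *cur = lru;
    int moved = 0;

    assert(lru != NULL);
    while (count > 0 && (cur = cur->prev) != lru) {
        struct demo_page *page = demo_entry(cur);

        count--;
        if (page->referenced) {
            struct list_head *to_move = cur;

            page->referenced = 0;
            cur = cur->next;    /* the loop continues with cur->prev */
            list_del(to_move);
            list_add(to_move, lru);
            moved++;
        }
    }
    return moved;
}
```

The key step is saving the cursor as cur->next before unlinking the current entry, so that the next cur->prev lands on the entry that preceded the one just moved.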
Linus Torvalds wrote:
>
> On Thu, 11 May 2000, Simon Kirby wrote:
> >
> > Hrm! pre7 release seems to be even better. 113 vmstat-line-seconds now
> > (yes, I know this isn't a very scientific testing method :)). Second try
> > was 114 vmstat-line-seconds. classzone-27 did it in 107, so that's not
> > very far off! Also, it swapped much less this time, and used less CPU.
> > vmstat output attached.
>
> The final pre7 did something that I'm not entirely excited about, but that
> kind of makes sense at least from a CPU standpoint (as the SGI people have
> repeated multiple times). What the real pre7 does is to just move any page
> that has problems getting free'd to the head of the LRU list, so that we
> won't try it immediately the next time. This way we don't test the same
> pages over and over again when they are either shared, in the wrong zone,
> or have dirty/locked buffers.
>
> It means that the "LRU" is less LRU, but you could see it as a "how hard
> do we want to free this" pressure-based system rather than really a least
> recently _used_ system. And it avoids the "repeat the whole thing on the
> same page" issue. And it looks like it behaves reasonably well, while
> saving a lot of CPU.
>
> Knock wood.
>
> I'm still considering the pre7 as more a "ok, I tried to get rid of the
> cruft" thing. Most of the special case code that has accumulated lately is
> gone. We can start adding stuff back now, I'm happy that the basics are
> reasonably clean.
>
> I think Ingo already posted a very valid concern about high-memory
> machines, and there are other issues we should look at. I just want to be
> in a position where we can look at the code and say "we do X because Y",
> rather than a collection of random tweaks that just happens to work.
>
> Linus
--
Home page:
http://www.norran.net/nra02596/
[-- Attachment #2: patch-2.3.99-pre7-9-shrink_mmap.1 --]
[-- Type: text/plain, Size: 3624 bytes --]
diff -Naur linux-2.3-pre9--/mm/filemap.c linux-2.3/mm/filemap.c
--- linux-2.3-pre9--/mm/filemap.c Fri May 12 02:42:19 2000
+++ linux-2.3/mm/filemap.c Fri May 12 04:28:30 2000
@@ -236,7 +236,6 @@
int shrink_mmap(int priority, int gfp_mask)
{
int ret = 0, count;
- LIST_HEAD(old);
struct list_head * page_lru, * dispose;
struct page * page = NULL;
@@ -244,26 +243,29 @@
/* we need pagemap_lru_lock for list_del() ... subtle code below */
spin_lock(&pagemap_lru_lock);
- while (count > 0 && (page_lru = lru_cache.prev) != &lru_cache) {
+ page_lru = &lru_cache;
+ while (count > 0 && (page_lru = page_lru->prev) != &lru_cache) {
page = list_entry(page_lru, struct page, lru);
- list_del(page_lru);
dispose = &lru_cache;
if (PageTestandClearReferenced(page))
goto dispose_continue;
count--;
- dispose = &old;
+
+ dispose = NULL;
/*
* Avoid unscalable SMP locking for pages we can
* immediate tell are untouchable..
*/
if (!page->buffers && page_count(page) > 1)
- goto dispose_continue;
+ continue;
+ /* Lock this lru page, reentrant
+ * will be disposed correctly when unlocked */
if (TryLockPage(page))
- goto dispose_continue;
+ continue;
/* Release the pagemap_lru lock even if the page is not yet
queued in any lru queue since we have just locked down
@@ -281,7 +283,7 @@
*/
if (page->buffers) {
if (!try_to_free_buffers(page))
- goto unlock_continue;
+ goto page_unlock_continue;
/* page was locked, inode can't go away under us */
if (!page->mapping) {
atomic_dec(&buffermem_pages);
@@ -336,27 +338,43 @@
cache_unlock_continue:
spin_unlock(&pagecache_lock);
-unlock_continue:
+page_unlock_continue:
spin_lock(&pagemap_lru_lock);
UnlockPage(page);
put_page(page);
+ continue;
+
dispose_continue:
- list_add(page_lru, dispose);
- }
- goto out;
+ /* have the pagemap_lru_lock, lru cannot change */
+ {
+ struct list_head * page_lru_to_move = page_lru;
+ page_lru = page_lru->next; /* continues with page_lru.prev */
+ list_del(page_lru_to_move);
+ list_add(page_lru_to_move, dispose);
+ }
+ continue;
made_inode_progress:
- page_cache_release(page);
+ page_cache_release(page);
made_buffer_progress:
- UnlockPage(page);
- put_page(page);
- ret = 1;
- spin_lock(&pagemap_lru_lock);
- /* nr_lru_pages needs the spinlock */
- nr_lru_pages--;
+ /* like to have the lru lock before UnlockPage */
+ spin_lock(&pagemap_lru_lock);
-out:
- list_splice(&old, lru_cache.prev);
+ UnlockPage(page);
+ put_page(page);
+ ret++;
+
+ /* lru manipulation needs the spin lock */
+ {
+ struct list_head * page_lru_to_free = page_lru;
+ page_lru = page_lru->next; /* continues with page_lru.prev */
+ list_del(page_lru_to_free);
+ }
+
+ /* nr_lru_pages needs the spinlock */
+ nr_lru_pages--;
+
+ }
spin_unlock(&pagemap_lru_lock);
diff -Naur linux-2.3-pre9--/mm/vmscan.c linux-2.3/mm/vmscan.c
--- linux-2.3-pre9--/mm/vmscan.c Fri May 12 02:42:19 2000
+++ linux-2.3/mm/vmscan.c Fri May 12 04:32:16 2000
@@ -443,10 +443,9 @@
priority = 6;
do {
- while (shrink_mmap(priority, gfp_mask)) {
- if (!--count)
- goto done;
- }
+ count -= shrink_mmap(priority, gfp_mask);
+ if (count <= 0)
+ goto done;
/* Try to get rid of some shared memory pages.. */
if (gfp_mask & __GFP_IO) {
@@ -481,10 +480,9 @@
} while (--priority >= 0);
/* Always end on a shrink_mmap.. */
- while (shrink_mmap(0, gfp_mask)) {
- if (!--count)
- goto done;
- }
+ count -= shrink_mmap(0, gfp_mask);
+ if (count <= 0)
+ goto done;
return 0;
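
The change of calling convention in the vmscan.c hunks (shrink_mmap returning the number of pages freed, with the caller subtracting that from its target) can be sketched as below. This is a hedged userspace model, not kernel code: demo_pool, demo_shrink and demo_free_pages are hypothetical names, the "half of the scanned pages are freeable" rule is invented for the example, and deriving the scan budget by dividing by priority + 1 is an assumption, since that part of shrink_mmap is not shown in the patch.

```c
#include <assert.h>

/* Hypothetical pool of reclaimable pages. */
struct demo_pool { int pages; };

/* Stand-in for one reclaim pass. Returns the number of pages freed,
 * which may be zero or more than one. */
static int demo_shrink(struct demo_pool *pool, int priority)
{
    int scan, freed;

    assert(priority >= 0);          /* priority + 1 is used as a divisor */
    scan = pool->pages / (priority + 1);
    freed = scan / 2;               /* pretend half the scanned pages free up */
    pool->pages -= freed;
    return freed;
}

/* Try to free `count` pages, scanning harder as priority drops.
 * Returns 1 once the target has been met, 0 otherwise. */
static int demo_free_pages(struct demo_pool *pool, int count)
{
    int priority = 6;

    do {
        count -= demo_shrink(pool, priority);
        if (count <= 0)
            return 1;
    } while (--priority >= 0);

    return 0;
}
```

Because one pass can free several pages, the target may be overshot, which is why the check is count <= 0 rather than a test for exactly zero.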