Re: need help interpreting 'free' output.

From: Andrea Arcangeli <andrea@suse.de>
To: Linus Torvalds <torvalds@transmeta.com>
Cc: Hugh Dickins <hugh@veritas.com>,
	Frank Dekervel <Frank.dekervel@student.kuleuven.ac.Be>,
	Marcelo Tosatti <marcelo@conectiva.com.br>,
	linux-kernel@vger.kernel.org
Subject: Re: need help interpreting 'free' output.
Date: Tue, 30 Oct 2001 21:05:53 +0100
Message-ID: <20011030210553.A1340@athlon.random>
In-Reply-To: <20011030195828.X1340@athlon.random> <Pine.LNX.4.33.0110301117220.12145-100000@penguin.transmeta.com>
In-Reply-To: <Pine.LNX.4.33.0110301117220.12145-100000@penguin.transmeta.com>; from torvalds@transmeta.com on Tue, Oct 30, 2001 at 11:21:46AM -0800

On Tue, Oct 30, 2001 at 11:21:46AM -0800, Linus Torvalds wrote:
> 
> On Tue, 30 Oct 2001, Andrea Arcangeli wrote:
> >
> > So in short we only need to replace the lock_page with a TryLockPage
> > (plus your wait_on_page if page is not uptodate to catch the major
> > faults) and here we go, faster than pre5.
> 
> Wrong.
> 
> If _anybody_ accesses the page unlocked, you cannot do the swap_count() at
> all, because then you don't have anything that serializes the accesses to
> swap_count vs page_count any more.

Incidentally, if the trylock fails, do_wp_page doesn't even try to check
the swap count; it just leaves the swap cache there. do_swap_page can do
the same thing at the early-COW stage. This is the only point I'm making.

And as I said, if you want to do any remove_exclusive_swap_page() in
do_swap_page, as you claimed in an earlier email, you also need to take
the page lock.

As far as I can tell, the magic key here is the "trylock" and nothing
else. Neither the remove_exclusive_swap_page nor the avoidance of the
early COW can make any difference per se (let's ignore swapoff), except
running slower. The only improvement under swapout load is that you're
delegating the work of remove_exclusive_swap_page to do_wp_page, which
does a trylock instead of a lock_page. This is the only point I'm making.
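
To make the trylock argument concrete, here is a minimal user-space sketch.
It is not kernel code: struct model_page, model_trylock() and
try_drop_swap_cache() are hypothetical names, the atomic_flag stands in for
the PG_locked bit, and the "exclusive" thresholds (page_count == 2,
swap_count == 1) are assumptions made only for illustration. The point it
shows is that the counts are compared only while the lock is held, and that
a failed trylock simply leaves the swap cache alone instead of sleeping.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a page; not the kernel's struct page. */
struct model_page {
    atomic_flag locked;     /* models the PG_locked bit */
    int page_count;         /* references to the page itself */
    int swap_count;         /* references to its swap entry */
    bool in_swap_cache;
};

static void model_page_init(struct model_page *page, int page_count, int swap_count)
{
    atomic_flag_clear(&page->locked);
    page->page_count = page_count;
    page->swap_count = swap_count;
    page->in_swap_cache = true;
}

/* Returns true if we took the lock, false if somebody else holds it. */
static bool model_trylock(struct model_page *page)
{
    return !atomic_flag_test_and_set(&page->locked);
}

static void model_unlock(struct model_page *page)
{
    atomic_flag_clear(&page->locked);
}

/*
 * Never sleep on the page lock. If the lock is contended, skip the
 * count check entirely and leave the page in the swap cache.
 */
static bool try_drop_swap_cache(struct model_page *page)
{
    bool dropped = false;

    if (!model_trylock(page))
        return false;               /* lock busy: do not look at the counts */

    assert(page->page_count >= 1);  /* sanity check, done under the lock */

    /* Assumed exclusivity test: one mapper plus the swap cache reference. */
    if (page->in_swap_cache && page->page_count == 2 && page->swap_count == 1) {
        page->in_swap_cache = false;
        page->page_count = 1;
        page->swap_count = 0;
        dropped = true;
    }
    model_unlock(page);
    return dropped;
}
```

In this model a caller that gets false back just carries on with the page
still in the swap cache, which is the cheap fallback described above.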

Go ahead and implement this thing in do_swap_page:

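        remove = 0;
        /*
         * Only try to drop the swap cache when swap is nearly full;
         * otherwise take this path only if we are the sole swap user.
         */
        if ((vm_swap_full() && (remove = exclusive_swap_cache_delete())) ||
            only_swap_user()) {
                pte = mk_pte(page, vma->vm_page_prot);
                /* Swap copy dropped, or a write fault: the pte must be dirty. */
                if (remove || write_access)
                        pte = pte_mkdirty(pte);
                /* Writable mapping: map it writable now, avoiding the early COW. */
                if (vma->vm_flags & VM_WRITE)
                        pte = pte_mkwrite(pte);
                install_pte();
                return;
        }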

and you'll find yourself grabbing the page lock somehow first in the
do_swap_page path, or exclusive_swap_cache_delete will obviously BUG()
on you.
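
The branch in that pseudocode reduces to a small decision table. The sketch
below restates it as a pure function so the cases can be checked in user
space. All names here (struct fault_inputs, struct pte_decision,
decide_pte()) are hypothetical, the boolean inputs merely stand in for the
results that vm_swap_full(), exclusive_swap_cache_delete() and
only_swap_user() would return, and the locking requirement is deliberately
left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical inputs; each field stands in for a call in the pseudocode. */
struct fault_inputs {
    bool swap_full;          /* what vm_swap_full() would return */
    bool exclusive_deleted;  /* what exclusive_swap_cache_delete() would return */
    bool only_swap_user;     /* what only_swap_user() would return */
    bool write_access;       /* the fault was a write */
    bool vma_writable;       /* the mapping allows writes */
};

struct pte_decision {
    bool fast_path;  /* install the pte directly and return */
    bool dirty;      /* pte_mkdirty() would be applied */
    bool writable;   /* pte_mkwrite() would be applied */
};

static struct pte_decision decide_pte(const struct fault_inputs *in)
{
    struct pte_decision d = { false, false, false };

    /* The delete is only attempted when swap is full (short-circuit &&). */
    bool remove = in->swap_full && in->exclusive_deleted;

    if (remove || in->only_swap_user) {
        d.fast_path = true;
        /* With the swap copy gone, the page has to be marked dirty. */
        d.dirty = remove || in->write_access;
        d.writable = in->vma_writable;
    }
    return d;
}
```

For example, with swap full and the delete succeeding on a writable mapping,
the result is a direct install with the pte both dirty and writable, even on
a read fault.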

This is why I'm saying the real magic is to convert the lock_page of pre4
into a TryLockPage. All the other changes are not interesting under real
load, though I obviously agree that it's a very good idea to fix the minor
faults, which in pre4 (and all previous kernels, including all -ac and
-aa) run as slowly as major faults!

Now, about whether exclusive_swap_cache_delete is really needed compared
to exclusive_swap_page: I need to think a little more about it to be sure.

In short, previously we ran exclusive_swap_page only with the page lock
held; page->buffers is constant while the page is locked, and the swap
count and page count _can't_ increase under us once the page happens to be
exclusive. This was the previous rule at least, but as usual there's the
evil swapoff coming out and doing the lookup on an exclusive swap page...
Hugh may provide more hints on this case.

Andrea
