From: Paul Jackson <pj@sgi.com>
To: Robin Holt <holt@sgi.com>
Cc: raybry@sgi.com, linux-mm@kvack.org, ak@muc.de,
	haveblue@us.ibm.com, marcello@cyclades.com,
	stevel@mwwireless.net, peterc@gelato.unsw.edu.au
Subject: Re: manual page migration -- issue list
Date: Wed, 16 Feb 2005 02:20:09 -0800
Message-ID: <20050216022009.7afb2e6d.pj@sgi.com>
In-Reply-To: <20050216092011.GA6616@lnx-holt.americas.sgi.com>

Robin wrote:
> What that would result in is a syscall for each
> non-overlapping vma per node.

My latest, most radical, proposal did not take an address range.  It was
simply:

    sys_page_migrate(pid, oldnode, newnode)

It would be called once per node.  In your example, this would be 128
calls.  Nothing "for each non-overlapping vma".  Just per node.

Until I drove you to near distraction, and you spelled out the details
of an example that migrated 96% of the address space in the first call,
and only needed 3 calls total, I would have presumed that the API:

    sys_page_migrate(pid, va_start, va_end, count, old_nodes, new_nodes)

would have required one call per pid, or 256 calls, for your example.

My method did not look insanely worse to me, indeed it would have looked
better in this example with two tasks per node, since I did one call per
node, and I thought you did one per task.

... However, I see now that you can routinely get by with dramatically
fewer calls than the number of tasks, by noticing what portions of the
typically huge shared address space have already been covered, and not
covering them again.
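
A minimal sketch of that "skip what is already covered" idea, for
illustration only.  The region identifiers, the pids and the helper
name plan_migration_calls are all hypothetical; the thread does not say
how a tool would recognise that two tasks map the same region, and this
does not try to reproduce the 3-call example.

```python
def plan_migration_calls(task_regions):
    """Return (pid, regions) pairs, skipping regions an earlier call covered.

    task_regions maps a pid to identifiers of the regions that task maps.
    The identifiers are made up for this sketch (an assumption, not part
    of any proposed API).
    """
    covered = set()
    calls = []
    for pid, regions in task_regions.items():
        # Only regions that no earlier call has handled still need work.
        todo = [r for r in regions if r not in covered]
        if todo:
            calls.append((pid, todo))
            covered.update(todo)
    return calls


# Hypothetical job: three tasks sharing most of their address space.
example = {
    100: ["shared_data", "libc_text", "stack_100"],
    101: ["shared_data", "libc_text", "stack_101"],
    102: ["shared_data", "libc_text"],
}

calls = plan_migration_calls(example)
for pid, regions in calls:
    print(pid, regions)
```

In this toy input the first task's call covers the shared regions, the
second only needs its private stack, and the third needs no call at all.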

There is no need to convince me that 384 syscalls and 128 full scans
is insanely worse than 3 syscalls with 1 full scan, and no need to
get frustrated that I cannot see the insanity of it.

However, you might have wanted to allow for the possibility, when you
reduced what you thought I was proposing to insanity, that rather than
my proposing something insane, perhaps we had different numbers ... as
happened here.  Your numbers for the array API had 80 times fewer system
calls than I would have expected, and your numbers for the single
parameter call had 3 times _more_ system calls than I had in mind (I had
one call per node, period, not one per node per vma or whatever).
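
To make the mismatch in numbers concrete, here is the arithmetic as a
small sketch.  The counts (128 nodes, two tasks per node, 3 calls, 384
calls) are the ones quoted in this message; the variable names are just
labels for this illustration.

```python
# Figures from the example discussed in this thread.
NODES = 128
TASKS_PER_NODE = 2
tasks = NODES * TASKS_PER_NODE          # 256 tasks

# Array API: sys_page_migrate(pid, va_start, va_end, count, old_nodes, new_nodes)
array_calls_presumed = tasks            # one call per pid, as first presumed
array_calls_example = 3                 # calls needed in the worked example

# Per-node API: sys_page_migrate(pid, oldnode, newnode)
per_node_calls_intended = NODES         # one call per node
per_node_calls_counted = 384            # count attributed to this API

# Ratios behind "80 times fewer" and "3 times more".
array_ratio = array_calls_presumed / array_calls_example
per_node_ratio = per_node_calls_counted / per_node_calls_intended

print(f"array API: presumed/example = {array_ratio:.1f}")
print(f"per-node API: counted/intended = {per_node_ratio:.1f}")
```

The first ratio comes out a little above 80, so "80 times fewer" is a
round figure; the second is exactly 3.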

> How much opposition is there to the array of integers?

My opposition to the array was not profound.  It needed to provide
an advantage, which I didn't see that it did.

I now see it provides an advantage, dramatically reducing the number of
system calls and scans in typical cases, to substantially fewer than
either the number of tasks or of nodes.

Ok ... onward.  I'll take the node arrays.

The next concern that rises to the top for me was best expressed by Andi:
>
> The main reasons for that is that I don't think external
> processes should mess with virtual addresses of another process.
> It just feels unclean and has many drawbacks (parsing /proc/*/maps
> needs complicated user code, racy, locking difficult).  
> 
> In kernel space handling full VMs is much easier and safer due to better 
> locking facilities.

I share Andi's concerns, but I don't see what to do about this.  Andi's
recommendations seem to be about memory policies (which guide future
allocations), and not about migration of already allocated physical
pages.  So for now at least, his recommendations don't seem like answers
to me.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.650.933.1373, 1.925.600.0401
