From: Andi Kleen <ak@suse.de>
To: Ray Bryant <raybry@sgi.com>
Cc: Andi Kleen <ak@muc.de>, Ray Bryant <raybry@austin.rr.com>,
linux-mm <linux-mm@kvack.org>,
linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [RFC 2.6.11-rc2-mm2 0/7] mm: manual page migration -- overview II
Date: Tue, 15 Feb 2005 22:48:31 +0100 [thread overview]
Message-ID: <20050215214831.GC7345@wotan.suse.de> (raw)
In-Reply-To: <421241A2.8040407@sgi.com>
> Making memory migration a subset of page migration is not a general
> solution. It only works for programs that are using memory policy
> to control placement. As I've tried to point out multiple times
> before, most programs that I am aware of use placement based on
> first-touch. When we migrate such programs, we have to respect
> the placement decisions that the program has implicitly made in
> this way.
Sorry, but the only real difference between your API and mbind is that
yours has a pid argument.
I think we are talking by each other, here's a more structured
overview of my thinking on the issue.
Main cases:
- Program is NUMA API aware. Fine. It takes care of its own.
- Program is not aware, but is started with a process policy from
numactl/cpusets/batch scheduler. Already covered too in NUMA API.
- Program is not aware and hasn't been started with a policy
(or has and you change your mind) but you want to change it later.
That's the new case we discuss here.
Now how to change policy of objects in an already running process.
First there are already some special cases already handled or
with existing patches:
- tmpfs/hugetlbfs/sysv shm: numactl can handle this by just mapping
the object into a different process and changing the policy there.
- shared libraries and mmaped files in general: this is a generialization of
the previous point. SteveL's patch is the beginning of handling this, although
it needs some more work (xattrs) to make the policy persistent over
memory pressure.
Only case not covered left is anonymous memory.
You said it would need user space control, but the main reason for
wanting that seems to be to handle the non anonymous cases which
are already covered above.
My thinking is the simplest way to handle that is to have a call that just o
migrates everything. The main reasons for that is that I don't think external
processes should mess with virtual addresses of another process.
It just feels unclean and has many drawbacks (parsing /proc/*/maps
needs complicated user code, racy, locking difficult).
In kernel space handling full VMs is much easier and safer due to better
locking facilities.
In user space only the process itself really can handle its own virtual
addresses well, and if it does that it can use NUMA API directly anyways.
You argued that it may be costly to walk everything, but I don't see this
as a big problem - first walking mempolicies is not very costly and then
fork() and exit() do exactly this already.
The main missing piece for this would be a way to make policies for
files persistent. One way would be to use xattrs like selinux, but
that may be costly (not sure we want to read xattrs all the time
when reading a file).
A hackish way to do this that already
works would be to do a mlock on one page of the file to keep
the inode pinned. E.g. the batch manager could do this. That's
not very clean, but would probably work.
-Andi
next prev parent reply other threads:[~2005-02-15 21:48 UTC|newest]
Thread overview: 88+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-02-12 3:25 [RFC 2.6.11-rc2-mm2 0/7] mm: manual page migration -- overview Ray Bryant
2005-02-12 3:25 ` [RFC 2.6.11-rc2-mm2 1/7] mm: manual page migration -- cleanup 1 Ray Bryant
2005-02-12 3:25 ` [RFC 2.6.11-rc2-mm2 2/7] mm: manual page migration -- cleanup 2 Ray Bryant
2005-02-12 3:25 ` [RFC 2.6.11-rc2-mm2 3/7] mm: manual page migration -- cleanup 3 Ray Bryant
2005-02-12 3:26 ` [RFC 2.6.11-rc2-mm2 4/7] mm: manual page migration -- cleanup 4 Ray Bryant
2005-02-12 3:26 ` [RFC 2.6.11-rc2-mm2 5/7] mm: manual page migration -- cleanup 5 Ray Bryant
2005-02-12 3:26 ` [RFC 2.6.11-rc2-mm2 6/7] mm: manual page migration -- add node_map arg to try_to_migrate_pages() Ray Bryant
2005-02-12 3:26 ` [RFC 2.6.11-rc2-mm2 7/7] mm: manual page migration -- sys_page_migrate Ray Bryant
2005-02-12 8:08 ` Paul Jackson
2005-02-12 12:34 ` Arjan van de Ven
2005-02-12 14:48 ` Andi Kleen
2005-02-12 20:51 ` Paul Jackson
2005-02-12 21:04 ` Dave Hansen
2005-02-12 21:44 ` Paul Jackson
2005-02-14 13:52 ` Robin Holt
2005-02-14 18:50 ` Dave Hansen
2005-02-14 22:01 ` Robin Holt
2005-02-14 22:22 ` Dave Hansen
2005-02-15 10:50 ` Robin Holt
2005-02-15 15:38 ` Paul Jackson
2005-02-15 18:39 ` Dave Hansen
2005-02-15 18:54 ` Ray Bryant
2005-02-15 15:49 ` Paul Jackson
2005-02-15 16:21 ` Robin Holt
2005-02-15 16:35 ` Paul Jackson
2005-02-15 18:59 ` Robin Holt
2005-02-15 20:54 ` Dave Hansen
[not found] ` <16914.28795.316835.291470@wombat.chubb.wattle.id.au>
2005-02-15 22:10 ` Paul Jackson
2005-02-15 22:51 ` Robin Holt
2005-02-15 23:00 ` Paul Jackson
2005-02-15 15:40 ` Paul Jackson
2005-02-12 11:17 ` [RFC 2.6.11-rc2-mm2 0/7] mm: manual page migration -- overview Andi Kleen
2005-02-12 12:12 ` Robin Holt
2005-02-14 19:18 ` Andi Kleen
2005-02-15 1:02 ` Steve Longerbeam
2005-02-12 15:54 ` Marcelo Tosatti
2005-02-12 16:18 ` Marcelo Tosatti
2005-02-12 21:29 ` Andi Kleen
2005-02-14 16:38 ` Robin Holt
2005-02-14 19:15 ` Andi Kleen
2005-02-14 23:49 ` Ray Bryant
2005-02-15 3:16 ` Paul Jackson
2005-02-15 9:14 ` Ray Bryant
2005-02-15 15:21 ` Paul Jackson
2005-02-15 0:29 ` Ray Bryant
2005-02-15 11:05 ` Robin Holt
2005-02-15 17:44 ` Ray Bryant
2005-02-15 11:53 ` Andi Kleen
2005-02-15 12:15 ` Robin Holt
2005-02-15 15:07 ` Paul Jackson
2005-02-15 15:11 ` Paul Jackson
2005-02-15 18:16 ` Ray Bryant
2005-02-15 18:24 ` Andi Kleen
2005-02-15 12:14 ` [RFC 2.6.11-rc2-mm2 0/7] mm: manual page migration -- overview II Andi Kleen
2005-02-15 18:38 ` Ray Bryant
2005-02-15 21:48 ` Andi Kleen [this message]
2005-02-15 22:37 ` Paul Jackson
2005-02-16 3:44 ` Ray Bryant
2005-02-17 23:54 ` Andi Kleen
2005-02-18 8:38 ` Ray Bryant
2005-02-18 13:02 ` Andi Kleen
2005-02-18 16:18 ` Paul Jackson
2005-02-18 16:20 ` Paul Jackson
2005-02-18 16:22 ` Paul Jackson
2005-02-18 16:25 ` Paul Jackson
2005-02-19 1:01 ` Ray Bryant
2005-02-20 21:49 ` Andi Kleen
2005-02-20 22:30 ` Paul Jackson
2005-02-20 22:35 ` Andi Kleen
2005-02-21 1:50 ` Paul Jackson
2005-02-21 7:39 ` Ray Bryant
2005-02-21 7:29 ` Ray Bryant
2005-02-21 9:57 ` Andi Kleen
2005-02-21 12:02 ` Paul Jackson
2005-02-21 8:42 ` Ray Bryant
2005-02-21 12:10 ` Andi Kleen
2005-02-21 17:12 ` Ray Bryant
2005-02-22 18:03 ` Andi Kleen
2005-02-22 6:40 ` Ray Bryant
2005-02-22 18:01 ` Andi Kleen
2005-02-22 18:45 ` Ray Bryant
2005-02-22 18:49 ` Andi Kleen
2005-02-22 22:04 ` Ray Bryant
2005-02-22 6:44 ` Ray Bryant
2005-02-21 4:20 ` Ray Bryant
2005-02-18 16:58 ` Ray Bryant
2005-02-18 17:02 ` Ray Bryant
2005-02-18 17:11 ` Ray Bryant
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20050215214831.GC7345@wotan.suse.de \
--to=ak@suse.de \
--cc=ak@muc.de \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=raybry@austin.rr.com \
--cc=raybry@sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox