From: Ray Bryant <raybry@sgi.com>
To: Andi Kleen <ak@suse.de>
Cc: Paul Jackson <pj@sgi.com>,
ak@muc.de, raybry@austin.rr.com, linux-mm@kvack.org,
Nathan Scott <nathans@sgi.com>,
Dave Hansen <haveblue@us.ibm.com>,
Jack Steiner <steiner@sgi.com>, Robin Holt <holt@sgi.com>,
Dean Roe <roe@sgi.com>
Subject: Re: [RFC 2.6.11-rc2-mm2 0/7] mm: manual page migration -- overview II
Date: Sat, 26 Feb 2005 12:22:42 -0600
Message-ID: <4220BE72.2010400@sgi.com>
In-Reply-To: <20050222184915.GA8981@wotan.suse.de>
Andi,
Just to give you an update on where our thinking is on the
page migration system call. Our current proposal would be the
following:
(1) The system call would look like:
migrate_pages(pid, count, old_nodes, new_nodes);
(2) The old nodes and new nodes lists would have to be disjoint.
A library routine has been written to convert the case where
the lists are not disjoint to a series of migrations each of
which only uses disjoint lists.
This has the advantage that the system call is restartable
and can be repeated if an error condition occurs that causes
the system call to return before completing the migration
without fear of migrating a page more than once.
In extreme situations, this can cause an O(N**2) effect to
occur, but we think that these extreme situations are less
likely to occur than we had previously thought.
(3) We have a patch for xfs (thanks to Nathan Scott) that supports
the "system.migration" extended attribute for files stored in
xfs. We intend to use two values for this extended attribute:
"none" and "libr". "none" implies that no pages of this file
should be migrated if it is found as a mapped file in a pid that
is being migrated, and "libr" implies that only writable pages
should be migrated. The latter is intended to support
per-process read/write data associated with a process, as well as
handle some special edge cases (e.g. what happens if you put
a breakpoint in a shared library?).
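As an illustration of item (2), here is a minimal sketch of how a
user-level routine might split an overlapping node mapping into a
series of calls that each use disjoint lists. It is written in Python
for brevity and is not the library routine mentioned above. The
function name split_into_disjoint_migrations is hypothetical, and the
sketch assumes that old_nodes[i] is migrated to new_nodes[i]. Cyclic
mappings (for example, swapping two nodes) would need a spare node and
are simply rejected here.

```python
def split_into_disjoint_migrations(old_nodes, new_nodes):
    """Split a node mapping into ordered steps with disjoint lists.

    Hypothetical helper: assumes old_nodes[i] -> new_nodes[i].
    Returns a list of (old_list, new_list) pairs. Within each pair the
    two lists share no node, and no destination is used as a source in
    a later step, so no page is moved twice.
    """
    if len(old_nodes) != len(new_nodes):
        raise ValueError("old_nodes and new_nodes must have the same length")
    if len(set(old_nodes)) != len(old_nodes):
        raise ValueError("each old node may appear only once")

    # Entries that map a node to itself are no-ops and are dropped.
    pending = {o: n for o, n in zip(old_nodes, new_nodes) if o != n}

    steps = []
    while pending:
        # A move is ready once its destination is not itself waiting
        # to be emptied by some other pending move.
        ready = [o for o in pending if pending[o] not in pending]
        if not ready:
            # Only cycles remain; this sketch does not handle them.
            raise ValueError("cyclic node mapping; a spare node would be needed")
        steps.append((ready, [pending[o] for o in ready]))
        for o in ready:
            del pending[o]
    return steps
```

For a shift such as old_nodes = [0, 1, 2] and new_nodes = [1, 2, 3],
the sketch yields three single-node steps: 2 -> 3, then 1 -> 2, then
0 -> 1. Each step would be passed to migrate_pages() in that order. A
mapping that is already disjoint comes back as a single step.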
Part of the reason for making this change is your concern that
adding va_start and length fields to the system call could produce
a "ptrace()"-like system call, with the problems that this entails.
The other part is the realization that the information required to
figure out what to migrate is not sufficiently encoded in the
/proc/pid/maps files. As an example, it is impossible to figure
out whether an anonymous page range contains COW pages shared with
the process's parent or pages at the same address range that have
been written, so that the COW sharing has been broken. While there
are ways around this, I'd rather handle all such cases than
special-case each such edge condition.
I should have a new patch with this implementation done by the
end of next week.
The resulting system call will not require the target pid to be
suspended, because the underlying page migration code will work
even if the target is not suspended. However, there is no guarantee
that all of the pages will be migrated off of the old_nodes unless
the target is suspended, since the process could allocate new pages
on the target nodes after that portion of the address space has
been scanned.
I don't have a good solution for this at the moment, other than
to require that the target task be suspended. We could add the
suspend/resume logic to the system call, but given that we are
using a library call to handle the overlapped node list cases,
we probably want to do the suspend/resume as part of that library
call rather than the base system call.
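To make the idea concrete, here is a minimal sketch of a library-level
wrapper that stops the target, issues one migration call per disjoint
(old_nodes, new_nodes) pair, and then resumes the target. It is written
in Python for brevity and is only an illustration. The name
migrate_with_suspend is hypothetical, and the migrate argument stands
in for a binding to the proposed migrate_pages() call, which is assumed
here and does not exist as a Python function. SIGSTOP/SIGCONT is one
possible way to suspend and resume; the proposal does not specify the
mechanism.

```python
import os
import signal


def migrate_with_suspend(pid, steps, migrate, kill=os.kill):
    """Suspend pid, run each disjoint migration step, then resume it.

    steps   -- list of (old_nodes, new_nodes) pairs, each already disjoint
    migrate -- hypothetical callable taking (pid, count, old_nodes, new_nodes)
    kill    -- signal sender, injectable so the sketch can be tested
    """
    # Note: SIGSTOP is delivered asynchronously. A real library would
    # also have to wait until the target has actually stopped.
    kill(pid, signal.SIGSTOP)
    try:
        for old_nodes, new_nodes in steps:
            migrate(pid, len(old_nodes), old_nodes, new_nodes)
    finally:
        # Resume the target even if one of the migration calls failed.
        kill(pid, signal.SIGCONT)
```

Keeping the suspend/resume in the wrapper means the target stays
stopped across the whole series of calls, not just one of them.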
--
Best Regards,
Ray
-----------------------------------------------
Ray Bryant
512-453-9679 (work) 512-507-7807 (cell)
raybry@sgi.com raybry@austin.rr.com
The box said: "Requires Windows 98 or better",
so I installed Linux.
-----------------------------------------------