From: Ray Bryant <raybry@sgi.com>
To: Andi Kleen <ak@suse.de>
Cc: Paul Jackson <pj@sgi.com>,
	ak@muc.de, raybry@austin.rr.com, linux-mm@kvack.org,
	Nathan Scott <nathans@sgi.com>,
	Dave Hansen <haveblue@us.ibm.com>,
	Jack Steiner <steiner@sgi.com>, Robin Holt <holt@sgi.com>,
	Dean Roe <roe@sgi.com>
Subject: Re: [RFC 2.6.11-rc2-mm2 0/7] mm: manual page migration -- overview II
Date: Sat, 26 Feb 2005 12:22:42 -0600
Message-ID: <4220BE72.2010400@sgi.com>
In-Reply-To: <20050222184915.GA8981@wotan.suse.de>

Andi,

Just to give you an update on where our thinking is on the
page migration system call.  Our current proposal would be the
following:

(1)  The system call would look like:
	migrate_pages(pid, count, old_nodes, new_nodes);

(2)  The old nodes and new nodes lists would have to be disjoint.
      A library routine has been written to convert the case where
      the lists are not disjoint to a series of migrations each of
      which only uses disjoint lists.

      This has the advantage that the system call is restartable
      and can be repeated if an error condition occurs that causes
      the system call to return before completing the migration
      without fear of migrating a page more than once.

      In extreme situations, this can cause an O(N**2) effect to
      occur, but we think that these extreme situations are less
      likely to occur than we had previously thought.

(3)  We have a patch for xfs (thanks to Nathan Scott) that supports
      the "system.migration" extended attribute for files stored in
      xfs.  We intend to use two values for this extended attribute:
      "none" and "libr".  "none" implies that no pages of this file
      should be migrated if it is found as a mapped file in a pid that
      is being migrated, and "libr" implies that only writable pages
      should be migrated.  The latter is intended to support per
      process read/write data associated with a process, as well as
      handle some special edge cases (e.g. what happens if you put
      a breakpoint in a shared library?).
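
To make the intended attribute semantics in (3) concrete, here is a
minimal sketch of the per-page decision.  It is not the xfs patch and
not kernel code: the names mig_policy, parse_policy and should_migrate
are hypothetical, and the assumption that a file with no attribute
(or an unrecognized value) has all of its pages migrated is mine.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical policy values mirroring the two attribute strings. */
enum mig_policy { MIG_DEFAULT, MIG_NONE, MIG_LIBR };

/* Map the "system.migration" attribute value to a policy.
 * Assumption: a missing or unknown value means "migrate everything". */
static enum mig_policy parse_policy(const char *val)
{
	if (val == NULL)
		return MIG_DEFAULT;
	if (strcmp(val, "none") == 0)
		return MIG_NONE;
	if (strcmp(val, "libr") == 0)
		return MIG_LIBR;
	return MIG_DEFAULT;
}

/* Decide whether one page of a mapped file should be migrated. */
static bool should_migrate(enum mig_policy policy, bool writable)
{
	switch (policy) {
	case MIG_NONE:
		return false;		/* never migrate pages of this file */
	case MIG_LIBR:
		return writable;	/* only the writable (per-process) pages */
	default:
		return true;		/* assumed default: migrate all pages */
	}
}
```

With this sketch, a read-only text page of a "libr" file stays where
it is, while a writable data page of the same file is migrated along
with the process.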

Part of the reason for making this change is your concern that
adding va_start and length fields to the system call could produce
a "ptrace()"-like system call, with the problems that this entails.

The other part is the realization that the information required to
figure out what to migrate is not sufficiently encoded in the
/proc/pid/maps files.  As an example, it is impossible to figure
out whether an anonymous page range contains COW pages shared with
the process's parent or pages at the same address range that have
been written, so that the COW sharing has been broken.  While there
are ways around this, I'd rather handle all such cases uniformly
than special-case each such edge condition.

I should have a new patch with this implementation done by the
end of next week.

While the resulting system call will not require the target pid
to be suspended, because the underlying page migration code will
work even if the target is not suspended, there is no guarantee
that all of the pages will be migrated off of the old_nodes unless
the target is suspended, since the process could allocate new pages
on those nodes after that portion of the address space has been
scanned.

I don't have a good solution for this at the moment, other than
to require that the target task be suspended.  We could add the
suspend/resume logic to the system call, but given that we are
using a library call to handle the overlapped node list cases,
we probably want to do the suspend/resume as part of that library
call rather than the base system call.
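
As an illustration of what such a library routine might do, here is
a rough sketch that splits an overlapping old/new node mapping into
a series of steps whose lists are disjoint.  This is not the actual
library code.  The function names, the int node arrays, the MAX_NODES
bound and the error codes are all assumptions, and the callback
do_migrate stands in for one migrate_pages() call.  A pair is issued
only once its destination is no longer the source of any unfinished
pair, so a page is never moved twice.  A chain of N nodes
(0->1, 1->2, ...) needs N steps, each rescanning the address space,
which is the O(N**2) case mentioned in (2).  Cycles (0->1, 1->0) are
simply rejected here.

```c
#define MAX_NODES 64	/* arbitrary bound for this sketch */

/* Hypothetical stand-in for one migrate_pages() call. */
typedef int (*migrate_fn)(int pid, int count,
			  const int *old_nodes, const int *new_nodes);

enum { PENDING, PICKED, DONE };

/* Assumes the entries of old_nodes are distinct.
 * Returns 0 on success, -1 on a bad count, -2 if the mapping has a cycle. */
int migrate_in_disjoint_steps(int pid, int count, const int *old_nodes,
			      const int *new_nodes, migrate_fn do_migrate)
{
	int state[MAX_NODES] = { 0 };	/* all PENDING */
	int remaining = count;
	int i, j;

	if (count < 0 || count > MAX_NODES)
		return -1;

	/* Pairs that map a node to itself need no work. */
	for (i = 0; i < count; i++)
		if (old_nodes[i] == new_nodes[i]) {
			state[i] = DONE;
			remaining--;
		}

	/* A real library would probably suspend the target here ... */
	while (remaining > 0) {
		int step_old[MAX_NODES], step_new[MAX_NODES];
		int n = 0, ret;

		for (i = 0; i < count; i++) {
			int blocked = 0;

			if (state[i] != PENDING)
				continue;
			/* Blocked while the destination is still a source. */
			for (j = 0; j < count; j++)
				if (state[j] != DONE &&
				    old_nodes[j] == new_nodes[i])
					blocked = 1;
			if (!blocked) {
				step_old[n] = old_nodes[i];
				step_new[n] = new_nodes[i];
				state[i] = PICKED;
				n++;
			}
		}
		if (n == 0)
			return -2;	/* cycle: would need a spare node */

		ret = do_migrate(pid, n, step_old, step_new);
		if (ret < 0)
			return ret;	/* safe to retry: lists are disjoint */

		for (i = 0; i < count; i++)
			if (state[i] == PICKED)
				state[i] = DONE;
		remaining -= n;
	}
	/* ... and resume it here. */
	return 0;
}

/* Illustration-only callback: records the first old node of each step. */
static int logged_steps;
static int logged_first_old[MAX_NODES];

static int log_step(int pid, int count,
		    const int *old_nodes, const int *new_nodes)
{
	(void)pid;
	(void)new_nodes;
	if (count > 0 && logged_steps < MAX_NODES)
		logged_first_old[logged_steps++] = old_nodes[0];
	return 0;
}
```

Called with old_nodes = {0, 1, 2} and new_nodes = {1, 2, 3}, the
sketch issues three steps, 2->3, then 1->2, then 0->1, so pages that
land on node 1 or node 2 are never picked up by a later step.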
-- 
Best Regards,
Ray
-----------------------------------------------
                   Ray Bryant
512-453-9679 (work)         512-507-7807 (cell)
raybry@sgi.com             raybry@austin.rr.com
The box said: "Requires Windows 98 or better",
            so I installed Linux.
-----------------------------------------------