public inbox for linux-kernel@vger.kernel.org
From: Ray Bryant <raybry@sgi.com>
To: Andi Kleen <ak@suse.de>
Cc: Paul Jackson <pj@sgi.com>,
	ak@muc.de, raybry@austin.rr.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC 2.6.11-rc2-mm2 0/7] mm: manual page migration -- overview II
Date: Mon, 21 Feb 2005 01:29:41 -0600
Message-ID: <42198DE5.2040703@sgi.com>
In-Reply-To: <20050220223510.GB14486@wotan.suse.de>

Andi Kleen wrote:
>>Do you have any better way to suggest, Andi, for a batch manager to
>>relocate a job?  The typical scenario, as Ray explained it to me, is
> 
> 
> - Give the shared libraries and any other files a suitable policy
> (by mapping them and applying mbind) 
> 
> - Then execute migrate_pages() for the anonymous pages with a suitable
> old node -> new node mapping.
> 
> 
>>How would you recommend that the batch manager move that job to the
>>nodes that can run it?  The layout of allocated memory pages and tasks
>>for that job must be preserved in order to keep the same performance.
>>The migration method needs to scale to hundreds, or more, of nodes.
> 
> 
> You have to walk the full node mapping for each array, but
> even with hundreds of nodes that should not be that costly
> (in the worst case you could create a small hash table for it
> in the kernel, but I'm not sure it's worth it) 
> 
> -Andi
> -

I'm going to assume that there have been some "crossed emails" here.
I don't think that this is the interface that you and I have been
converging on.  As I understood it, we were converging on the following:

(1)  extended attributes will be used to mark files as non-migratable
(2)  the page_migrate() system call will be defined as:

          page_migrate(pid, count, old_nodes, new_nodes);

      and it will migrate all pages that are either anonymous or part
      of mapped files that are not marked non-migratable.
(3)  The mbind() system call with MPOL_MF_STRICT will be hooked up
      to the migration code so that it actually causes a migration.
      Processes can use this interface to migrate a portion of their own
      address space containing a mapped file.
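
For illustration, the old_nodes and new_nodes arguments in (2) can be read
as two parallel arrays of length count, where pages found on old_nodes[i]
move to new_nodes[i].  The following is a minimal userspace sketch under
that assumption.  map_node() is a hypothetical helper, not part of the
posted patches, and leaving unlisted nodes in place is also an assumption:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the node that pages on `node` should move
 * to, given parallel old_nodes[]/new_nodes[] arrays of length `count`.
 * Assumption: nodes not listed in old_nodes[] are left where they are.
 */
static int map_node(int node, size_t count,
                    const int *old_nodes, const int *new_nodes)
{
    assert(count == 0 || (old_nodes != NULL && new_nodes != NULL));

    /* Linear walk of the mapping: O(count) per lookup. */
    for (size_t i = 0; i < count; i++) {
        if (old_nodes[i] == node)
            return new_nodes[i];
    }
    return node; /* not in the old node set: do not migrate */
}
```

Because the mapping is applied per node, a page layout spread across
old nodes {0, 1, 2} keeps the same relative layout on new nodes {4, 5, 6}.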

This is different from your reply above, which seems to imply that:

(A)  Step 1 is to migrate mapped files using mbind().  I don't understand
      how to do this in general, because:
      (a)  I don't know how to make a non-racy list of the mapped files to
           migrate without assuming that the process to be migrated is
           stopped, and
      (b)  if the mapped file is associated with the DEFAULT memory policy,
           and page placement was done by first touch, then it is not clear
           how to use mbind() to cause the pages to be migrated, and still
           end up with the identical topological placement of pages after
           the migration.
(B)  Step 2 is to use page_migrate() to migrate just the anonymous pages.
      I don't like the restriction of this to just anonymous pages.

Fundamentally, I don't see why (A) is much different from allowing one
process to manipulate the physical storage for another process.  It's
just stated in terms of mmap'd objects instead of pids.  So I don't
see why that is fundamentally different from a page_migrate() call
with va_start and va_end arguments.

So I'm going to assume that the agreement was really (1)-(3) above.

The only problem I see with that is the following:  Suppose that a user
wants to migrate a portion of their own address space that is composed
of (at least partly) anonymous pages or pages mapped to a file associated
with the DEFAULT memory policy, and we want the pages to be topologically
allocated the same way after the migration as they were before the
migration.

The only way I know how to do that is with a system call of the form:

	page_migrate(pid, va_start, va_end, count, old_nodes, new_nodes);

where the permission model is that a pid can migrate any process that it
can send a signal to.  So a root pid can migrate any process, and a user
pid can migrate pages of any pid started by the user.
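
The "can send a signal to" rule can be probed from userspace today with
kill() and signal 0, which performs the existence and permission checks
without delivering a signal.  A minimal sketch follows.  may_migrate() is
a hypothetical name, and the proposed system call would presumably make
the equivalent check inside the kernel rather than rely on a caller-side
test like this one:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical userspace check: may the calling process signal `target`?
 * kill(pid, 0) sends nothing; it only reports whether a signal could be
 * sent (0 on success, -1 with errno EPERM or ESRCH otherwise).
 */
static bool may_migrate(pid_t target)
{
    /* pid <= 0 addresses process groups or "all processes": reject. */
    if (target <= 0)
        return false;

    return kill(target, 0) == 0;
}
```

A process can always signal itself, so may_migrate(getpid()) is true.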
-- 
Best Regards,
Ray
-----------------------------------------------
                   Ray Bryant
512-453-9679 (work)         512-507-7807 (cell)
raybry@sgi.com             raybry@austin.rr.com
The box said: "Requires Windows 98 or better",
            so I installed Linux.
-----------------------------------------------
