From: Steve Longerbeam <stevel@mvista.com>
To: Ray Bryant <raybry@sgi.com>
Cc: Andi Kleen <ak@muc.de>, Hirokazu Takahashi <taka@valinux.co.jp>,
Dave Hansen <haveblue@us.ibm.com>,
Marcello Tosatti <marcelo.tosatti@cyclades.com>,
Kernel Mailing List <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>, andrew morton <akpm@osdl.org>
Subject: Re: page migration patchset
Date: Tue, 11 Jan 2005 11:00:56 -0800
Message-ID: <41E42268.5090404@mvista.com>
In-Reply-To: <41E3F2DA.5030900@sgi.com>
Ray Bryant wrote:
> Andi and Steve,
>
> Steve Longerbeam wrote:
> <snip>
>
>>>
>>> My personal preference would be to keep as much of this as possible
>>> under user space control; that is, rather than having a big autonomous
>>> system call that migrates pages and then updates policy information,
>>> I'd prefer to split the work into several smaller system calls that
>>> are issued by a user space program responsible for coordinating the
>>> process migration as a series of steps, e.g.:
>>>
>>> (1) suspend the process via SIGSTOP
>>> (2) update the mempolicy information
>>> (3) migrate the process's pages
>>> (4) migrate the process to the new cpu via sched_setaffinity()
>>> (5) resume the process via SIGCONT
>>>
>>
>> steps 2 and 3 can be accomplished by a call to mbind() and
>> specifying MPOL_MF_MOVE. And since mbind() takes an
>> address range, you could probably migrate pages and change
>> the policies for all of the process' mappings in a single mbind()
>> call.
>
>
> OK, I just got around to looking into this suggestion. Unfortunately,
> it doesn't look as if this will do what I want. I need to be able to
> conserve the topology of the application when it is migrated (required
> to give the application the same performance in its new location that
> it got in its old location).
I see what you mean. Unless the requested address range lines up exactly
with an existing vma, existing vmas will get split up.
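To make the splitting concrete, here is a rough user-space model of what
happens when a policy is applied to a sub-range of one vma. This is only a
sketch: struct toy_vma and toy_split() are names made up for illustration,
not kernel code, and the model assumes the range lies entirely inside a
single vma.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a vma: a half-open address range [start, end). */
struct toy_vma {
    unsigned long start;
    unsigned long end;
};

/*
 * Model of applying a new policy to [start, end) inside one vma.
 * Writes the resulting pieces to out[] (room for 3) and returns how many
 * pieces there are: 1 if the range covers the vma exactly, 2 if it shares
 * one boundary with the vma, 3 if it sits strictly inside.
 * Assumption: the range is non-empty and lies within the vma.
 */
size_t toy_split(struct toy_vma vma, unsigned long start,
                 unsigned long end, struct toy_vma out[3])
{
    size_t n = 0;

    assert(vma.start <= start && start < end && end <= vma.end);

    if (vma.start < start)
        out[n++] = (struct toy_vma){ vma.start, start }; /* untouched head */
    out[n++] = (struct toy_vma){ start, end };           /* piece with new policy */
    if (end < vma.end)
        out[n++] = (struct toy_vma){ end, vma.end };     /* untouched tail */
    return n;
}
```

Only the exact-cover case leaves the vma count unchanged; any other range
turns one vma into two or three.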
> So, I need to be able to say "take the
> pages on this node and move them to that node". The sys_mbind() call
> doesn't have the necessary arguments to do this. I'm thinking of
> something like:
>
> migrate_process_pages(pid, numnodes, oldnodelist, newnodelist);
>
> This would scan the address space of process pid, and each page that
> is found on oldnodelist[i] would be moved to node newnodelist[i].
Right, that's something I'd be interested in as well. In fact, an address
range is not ideal for me either. What I really need is an API that
lets me specify a single existing vma (or all of the process's
regions, in your case) whose policy is to be changed and whose resident
pages are to be migrated, without changing the topology (e.g. by
splitting vmas).
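As I read the proposed oldnodelist/newnodelist semantics, they amount to a
per-page node remapping. The following is a user-space sketch of that
mapping only; toy_migrate_pages() and the page_node[] array are
hypothetical, and nothing here touches real pages or any existing
interface.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model of migrate_process_pages(pid, numnodes, oldnodelist, newnodelist).
 * page_node[p] is the node that page p currently lives on. Every page found
 * on oldnodelist[i] is reassigned to newnodelist[i]; pages on nodes that are
 * not listed stay where they are. Returns the number of pages that moved.
 */
size_t toy_migrate_pages(int *page_node, size_t npages, size_t numnodes,
                         const int *oldnodelist, const int *newnodelist)
{
    size_t moved = 0;

    assert(numnodes == 0 || (oldnodelist && newnodelist));

    for (size_t p = 0; p < npages; p++) {
        for (size_t i = 0; i < numnodes; i++) {
            if (page_node[p] != oldnodelist[i])
                continue;
            if (newnodelist[i] != page_node[p]) {
                page_node[p] = newnodelist[i];
                moved++;
            }
            /* Map each page at most once, so 0->1 followed by 1->2 does not chain. */
            break;
        }
    }
    return moved;
}
```

Note that the mapping is applied once per page, which matters when a node
appears in both lists.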
>
> Pages that are found to be swapped out would be handled as follows:
> Add the original node id to either the swap pte or the swp_entry_t.
> Swap in will be modified to allocate the page on the same node it
> came from. Then, as part of migrate_process_pages, all that would
> be done for swapped out pages would be to change the "original node"
> field to point at the new node.
Isn't this already taken care of? read_swap_cache_async() is given
a vma and passes it to alloc_page_vma(). So if you have already
changed the policy for that vma, the new policy will be used
when the page is allocated during swap-in.
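A minimal model of that argument, assuming a policy that simply names one
preferred node. The toy_ functions are stand-ins I made up for
alloc_page_vma() and the swap-in path; they do not reproduce the real
signatures.

```c
#include <assert.h>

/* Toy vma that carries only the node its current policy prefers. */
struct toy_vma_policy {
    int preferred_node;
};

/* Stand-in for alloc_page_vma(): the new page lands on whichever node
 * the vma's policy names at the time of the call. */
int toy_alloc_page_vma(const struct toy_vma_policy *vma)
{
    assert(vma);
    return vma->preferred_node;
}

/* Stand-in for the swap-in path: it allocates through the vma, so it
 * picks up whatever policy the vma has now, not the node the page was
 * on when it was swapped out. */
int toy_swap_in(const struct toy_vma_policy *vma)
{
    return toy_alloc_page_vma(vma);
}
```

In this model, changing preferred_node before the swap-in is enough to
place the page on the new node, with no per-entry "original node" field.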
Steve