public inbox for linux-kernel@vger.kernel.org
From: Zoltan Menyhart <Zoltan.Menyhart_AT_bull.net@nospam.org>
To: Robin Holt in <holt@sgi.com>
Cc: linux-ia64@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Migrate pages from a ccNUMA node to another
Date: Fri, 26 Mar 2004 13:38:52 +0100
Message-ID: <4064245C.50B74C67@nospam.org>
In-Reply-To: 20040326103959.GB14360@lnx-holt

Robin Holt wrote:
> 
> We have found that "automatic" migration tends to result in the
> system deciding to move the wrong pieces around.  Since applications
> can be so varied, I would recommend we let the application decide
> when it thinks it is beneficial to move a memory range to a nearby
> node.

I am not saying it is for every application
(see the paragraph of the "if's").
There are a couple of applications which run for a long time, with
relatively stable memory working sets, and I can help them.
You launch your application with and without migration, and you use
it if you gain enough.

> The placement policy doesn't really fit the bill entirely.  We are
> currently tracking a problem with repeatability of a benchmark.  We
> found that the newer libc we are using used to result in a newly
> forked process touching a page before the parent did and therefore
> the page, which had been marked COW, would, on the old libc, end up
> on the child's node for the child and the parent's node for the parent.
> After the update, both pages ended up on the parent's.

I haven't modified anything in the existing page fault handler,
nor have I changed the placement policy.
With my proposed syscall, you need to specify explicitly where the
pages go.

> If your syscall would simply do the copy to the destination node
> for COW pages, this would have worked terrifically in both cases.

The COW pages are referenced by more than one PGD (by that of the
parent and those of its children). As I state in RESTRICTIONS, I skip
these pages.
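
The restriction above can be pictured with a small user-space sketch.
It is an illustration only: `struct page_ref`, its `pgd_refs` field and
`drop_shared_pages()` are hypothetical names, not kernel structures and
not code from the patch. It simply keeps the pages that are mapped by
exactly one PGD and drops the rest from the candidate list.

```c
#include <stddef.h>

/* Hypothetical per-page descriptor; NOT a kernel structure. */
struct page_ref {
    unsigned long vaddr;     /* user virtual address of the page */
    unsigned int  pgd_refs;  /* number of PGDs (address spaces) mapping it */
};

/*
 * Compact the array in place so that it holds only the pages mapped by
 * exactly one PGD. Pages shared by several PGDs (e.g. COW pages after a
 * fork) are dropped. Returns the number of pages kept.
 */
size_t drop_shared_pages(struct page_ref *pages, size_t n)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++)
        if (pages[i].pgd_refs == 1)
            pages[kept++] = pages[i];
    return kept;
}
```

A caller would run this over its candidate list first and hand only the
first `kept` entries to whatever does the actual migration.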

I think this issue with the COW pages is a fork() - exec()
placement problem; I do not address it with my stuff.


> >
> > 3. NUMA aware scheduler
> > .......................
> >
> 
> Back to my earlier comment about magic.  This is a second tier of
> magic.  Here we are talking about inferring a reason to migrate based
> on memory access patterns, but what if that migration results in
> some other process being hurt more than this one is helped?
> 
> Honestly, we have beaten on the scheduler quite a bit and the "allocate
> memory close to my node" has helped considerably.
> 
> One thing that would probably help considerably, in addition to the
> syscall you seem to be proposing, would be an addition to the
> task_struct.  The new field would specify which node to attempt
> allocations on.  Before doing a fork, the parent would do a
> syscall to set this field to the node the child will target.  It
> would then call fork.  The PGDs et al and associated memory, including
> the task struct and pages would end up being allocated based upon
> that numa node's allocation preference.
> 
> What do you think of combining these two items into a single syscall?

I can agree with Robin Holt, it's a NUMA API issue.
I just provide a tool: if someone somehow knows that this piece of
memory would be better on another node, I can move it.

> > NAME
> >         migrate_ph_pages        - migrate pages to another NUMA node
> 
> At first, I thought "Wow, this could result in some nice admin tools."
> The more I scratch my head on this, the less useful I see it, but
> would not argue against it.

We are working on the prototype of a device driver to read out the
"hot page" counters on the n-th Scalable Node Controller
(say: "/dev/snc/n/hotpage").
An "artificial intelligence" can then guess what to move and call this
service.
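
As a rough idea of what such a guess could look like, here is a minimal
sketch in user-space C. Everything in it is an assumption made for
illustration: the `hot_page` record layout, the `MAX_NODES` limit, the
`pick_target_node()` name and the ratio rule are hypothetical. It does
not describe the real counter format of the Scalable Node Controller,
nor the arguments of the proposed syscall.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 4   /* hypothetical limit, for illustration only */

/* Hypothetical record: accesses to one page, counted per requesting node. */
struct hot_page {
    uint64_t phys_addr;        /* physical address of the page */
    int      home_node;        /* node that currently holds the page */
    uint32_t hits[MAX_NODES];  /* accesses seen from each node */
};

/*
 * Return the node the page should be moved to, or -1 to leave it alone.
 * The page is moved only if the busiest remote node accesses it at least
 * `ratio` times as often as the home node does, so that small differences
 * do not make pages bounce between nodes.
 */
int pick_target_node(const struct hot_page *p, uint32_t ratio)
{
    assert(p != NULL && p->home_node >= 0 && p->home_node < MAX_NODES);

    int best = p->home_node;
    for (int n = 0; n < MAX_NODES; n++)
        if (p->hits[n] > p->hits[best])
            best = n;

    if (best == p->home_node)
        return -1;                      /* home node is already the busiest */

    /* Compare in 64 bits so that ratio * hits cannot overflow. */
    if ((uint64_t)p->hits[best] < (uint64_t)ratio * p->hits[p->home_node])
        return -1;                      /* gain too small to justify a move */

    return best;
}
```

A tool reading the counters would call `pick_target_node()` for each
record and pass the pages with a non-negative result to the migration
service. The ratio is the knob that trades responsiveness against the
risk of moving the wrong pieces around.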


BTW, does anyone have a machine with a chipset other than the i82870?


Thanks,


Zoltan Menyhart
