From: Samuel Thibault <samuel.thibault@ens-lyon.org>
To: christophe.lameter@sgi.com
Cc: linux-mm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Direct Migration and "Affinity on next touch" ?
Date: Wed, 8 Feb 2006 10:58:23 +0100
Message-ID: <20060208095823.GD5752@implementation.labri.fr>

Hi,

Direct Migration support is great, but "migration on next touch"
(aka affinity on next touch) would be quite useful too.

A bunch of parallel applications have a sequential part that
initializes all data. Then threads are launched to perform the actual
computation in parallel. However, with a simple affinity on first touch
policy, all data is allocated on the node on which initialization ran.
Manually migrating data to where threads will eventually run may
really not be easy.

A "simple" (from the userland point of view) solution is to have
an "affinity on next touch" policy that would be set _after_
initialization. For instance, the application would:
- initialize data, which gets allocated on some node ;
- call mbind(data, size, MPOL_DEFAULT, NULL, 0, MPOL_MF_NEXTTOUCH), that
  records the new policy and invalidates pages ;
- run threads ;
- threads start computing, hence they touch data pages; the page fault
  handler migrates these pages to the node on which the fault occurred,
  i.e. hopefully the node on which they will be mostly used (this is
  generally true with such applications) ;
- after very little time, data pages are distributed as appropriate, and
  then the computation runs fast.
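
The steps above could look roughly like this from userland. This is only
a sketch under assumptions: MPOL_MF_NEXTTOUCH is the flag proposed in
this mail and is not in the Linux headers, so the hint is wrapped in a
hypothetical helper, request_next_touch(), which uses the Solaris
madvise() advice when MADV_ACCESS_LWP is defined and is otherwise a
no-op. The names alloc_and_init(), touch_slice() and struct slice are
made up for illustration.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS and madvise() on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* One thread's share of the data, plus the result it computes. */
struct slice {
    unsigned char *base;
    size_t len;
    unsigned long sum;
};

/* Step 1: sequential initialization. With first-touch placement, every
 * page lands on the node where this code runs. Returns NULL on failure. */
static unsigned char *alloc_and_init(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    memset(p, 1, len);
    return (unsigned char *)p;
}

/* Step 2 (hypothetical helper): ask for migration on next touch. */
static int request_next_touch(void *addr, size_t len)
{
#ifdef MADV_ACCESS_LWP
    /* Solaris: the next thread to touch the range will use it heavily. */
    return madvise(addr, len, MADV_ACCESS_LWP);
#else
    /* Proposed Linux call, NOT available in current headers:
     *   mbind(addr, len, MPOL_DEFAULT, NULL, 0, MPOL_MF_NEXTTOUCH);
     * Without it this helper does nothing and pages stay where they are. */
    (void)addr;
    (void)len;
    return 0;
#endif
}

/* Steps 3-4: thread body, with the pthread start-routine signature.
 * Touching the slice is what would trigger the fault and the migration. */
static void *touch_slice(void *arg)
{
    struct slice *s = arg;
    assert(s != NULL);
    s->sum = 0;
    for (size_t i = 0; i < s->len; i++)
        s->sum += s->base[i];
    return NULL;
}
```

The application would call alloc_and_init(), then request_next_touch()
on the whole range, then start one thread per slice, e.g. with
pthread_create(&tid, NULL, touch_slice, &slices[i]). No migration code
appears in the threads themselves.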

The cost of page fault + migration is quickly compensated by the
resulting better data distribution. Being able to ask for bigger page
sizes would also reduce page fault cost.

Solaris implements this solution through madvise(data, size,
MADV_ACCESS_LWP); (see Solaris' madvise() manpage
http://docs.sun.com/app/docs/doc/817-0677/6mgf9b66i?a=view ). Using this
facility can bring quite interesting performance improvements; see for
instance "affinity-on-next-touch: increasing the
performance of an industrial PDE solver on a cc-NUMA system":
http://portal.acm.org/ft_gateway.cfm%3Fid=1088201%26type=pdf

Could such a facility be implemented?

Regards,
Samuel
