public inbox for linux-kernel@vger.kernel.org
From: Andrea Arcangeli <aarcange@redhat.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <pzijlstr@redhat.com>, Ingo Molnar <mingo@elte.hu>,
	Hugh Dickins <hughd@google.com>, Rik van Riel <riel@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Hillf Danton <dhillf@gmail.com>,
	Andrew Jones <drjones@redhat.com>, Dan Smith <danms@us.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Paul Turner <pjt@google.com>, Christoph Lameter <cl@linux.com>,
	Suresh Siddha <suresh.b.siddha@intel.com>,
	Mike Galbraith <efault@gmx.de>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Subject: Re: [PATCH 04/33] autonuma: define _PAGE_NUMA
Date: Thu, 11 Oct 2012 18:43:00 +0200
Message-ID: <20121011164300.GN1818@redhat.com>
In-Reply-To: <20121011110137.GQ3317@csn.ul.ie>

On Thu, Oct 11, 2012 at 12:01:37PM +0100, Mel Gorman wrote:
> On Thu, Oct 04, 2012 at 01:50:46AM +0200, Andrea Arcangeli wrote:
> > The objective of _PAGE_NUMA is to be able to trigger NUMA hinting page
> > faults to identify the per NUMA node working set of the thread at
> > runtime.
> > 
> > Arming the NUMA hinting page fault mechanism works similarly to
> > setting up a mprotect(PROT_NONE) virtual range: the present bit is
> > cleared at the same time that _PAGE_NUMA is set, so when the fault
> > triggers we can identify it as a NUMA hinting page fault.
> > 
> 
> That implies that there is an atomic update requirement or at least
> an ordering requirement -- present bit must be cleared before setting
> NUMA bit. No doubt it'll be clear later in the series how this is
> accomplished. What you propose seems ok but it all depends how it's
> implemented so I'm leaving my ack off this particular patch for now.

Correct. The switch is done atomically (_PAGE_PRESENT is cleared at
the same time _PAGE_NUMA is set). The TLB flush is deferred (it's
batched to avoid firing an IPI for every pte/pmd_numa we establish).
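
As a rough illustration of the "switch is done atomically" point, here
is a minimal userspace sketch, not the code from the series. The bit
positions and the sketch_* names are hypothetical, and the
compare-and-swap loop is only a stand-in for whatever the kernel
really does when it updates the pte/pmd. The point is just that no
intermediate state with both bits clear or both bits set is ever
visible.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen for illustration only. */
#define SKETCH_PAGE_PRESENT ((uint64_t)1 << 0)
#define SKETCH_PAGE_NUMA    ((uint64_t)1 << 8)

/*
 * Arm the hinting fault: clear PRESENT and set NUMA in one atomic
 * update. Returns false if the entry was not present to begin with.
 */
bool sketch_arm_numa(_Atomic uint64_t *pte)
{
	uint64_t old, val;

	assert(pte != NULL);
	old = atomic_load(pte);
	do {
		if (!(old & SKETCH_PAGE_PRESENT))
			return false;
		val = (old & ~SKETCH_PAGE_PRESENT) | SKETCH_PAGE_NUMA;
	} while (!atomic_compare_exchange_weak(pte, &old, val));
	return true;
}

/*
 * Fault side: recognise the entry as a NUMA hinting fault and make
 * it present again. Returns false if NUMA was not set.
 */
bool sketch_numa_fault(_Atomic uint64_t *pte)
{
	uint64_t old, val;

	assert(pte != NULL);
	old = atomic_load(pte);
	do {
		if (!(old & SKETCH_PAGE_NUMA))
			return false;
		val = (old & ~SKETCH_PAGE_NUMA) | SKETCH_PAGE_PRESENT;
	} while (!atomic_compare_exchange_weak(pte, &old, val));
	return true;
}
```

All other bits of the entry are carried over unchanged by both
helpers; only the two flag bits flip.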

It's still similar to setting a range PROT_NONE (except the way
_PAGE_PROTNONE and _PAGE_NUMA work is the opposite, and they are
mutually exclusive, so they can easily share the same pte/pmd
bitflag). The other difference is that PROT_NONE must be synchronous,
while _PAGE_NUMA is set lazily.

The NUMA hinting page fault also won't require any TLB flush ever.

So the whole process (establish/teardown) has an incredibly low TLB
flushing cost.
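
To make the batching argument concrete, here is a small userspace
sketch under stated assumptions: the pte array, the bit positions and
the sketch_* names are all hypothetical, and the "flush" is just a
counter standing in for a ranged TLB flush. It shows one flush per
scanned range rather than one per entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen for illustration only. */
#define SKETCH_PAGE_PRESENT ((uint64_t)1 << 0)
#define SKETCH_PAGE_NUMA    ((uint64_t)1 << 8)

/* Counts how many times the simulated flush was issued. */
struct sketch_flush_stats {
	unsigned long flushes;
};

/* Stand-in for a ranged TLB flush; a real one may send IPIs. */
static void sketch_flush_range(struct sketch_flush_stats *st)
{
	st->flushes++;
}

/*
 * Arm every present entry in ptes[0..n) and issue a single flush at
 * the end, and only if at least one entry changed. Returns the
 * number of entries armed.
 */
size_t sketch_arm_range(uint64_t *ptes, size_t n,
			struct sketch_flush_stats *st)
{
	size_t armed = 0;

	assert(ptes != NULL && st != NULL);
	for (size_t i = 0; i < n; i++) {
		if (!(ptes[i] & SKETCH_PAGE_PRESENT))
			continue;
		ptes[i] = (ptes[i] & ~SKETCH_PAGE_PRESENT) | SKETCH_PAGE_NUMA;
		armed++;
	}
	if (armed)
		sketch_flush_range(st);
	return armed;
}
```

With this shape the flush count does not grow with the number of
entries armed in the range.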

The only fixed cost is knuma_scand itself plus the kernel enter/exit
for every non-shared page every 10 sec (or whatever duration you set
for a knuma_scand pass in sysfs).

Furthermore, if the pmd_scan mode is activated, I guarantee there's at
most 1 NUMA hinting page fault per 2M virtual region (even if some
accuracy is lost). To measure the exact cost per 4k page, you can set
scan_pmd = 0 in sysfs and also disable THP (echo never >enabled).
It's hardly measurable here. With THP the fault is also 1 per 2M
virtual region, but no accuracy is lost in that case (or more
precisely, there's no way to get more accuracy than that as we deal
with a pmd).
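
For a sense of scale, the bound above can be worked out with a few
lines of C. This is only arithmetic under the assumption of 4k base
pages and 2M pmds; the sketch_* names are hypothetical and nothing
here comes from the patches.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes: 4k base pages and 2M pmd-mapped regions. */
#define SKETCH_PAGE_SIZE ((uint64_t)4 << 10)
#define SKETCH_PMD_SIZE  ((uint64_t)2 << 20)

/*
 * Upper bound on NUMA hinting faults for one scan pass over len
 * bytes (len must be a multiple of 2M). per_pmd != 0 models the pmd
 * scan mode or THP: one fault per 2M region. per_pmd == 0 models one
 * fault per 4k page.
 */
uint64_t sketch_max_hinting_faults(uint64_t len, int per_pmd)
{
	uint64_t granule = per_pmd ? SKETCH_PMD_SIZE : SKETCH_PAGE_SIZE;

	assert(len % SKETCH_PMD_SIZE == 0);
	return len / granule;
}
```

For a 1G range this gives at most 512 faults per pass at pmd
granularity versus 262144 at 4k granularity, a factor of 512.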
