From: Andrea Arcangeli <aarcange@redhat.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Peter Zijlstra <pzijlstr@redhat.com>, Ingo Molnar <mingo@elte.hu>,
Hugh Dickins <hughd@google.com>, Rik van Riel <riel@redhat.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Hillf Danton <dhillf@gmail.com>,
Andrew Jones <drjones@redhat.com>, Dan Smith <danms@us.ibm.com>,
Thomas Gleixner <tglx@linutronix.de>,
Paul Turner <pjt@google.com>, Christoph Lameter <cl@linux.com>,
Suresh Siddha <suresh.b.siddha@intel.com>,
Mike Galbraith <efault@gmx.de>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Subject: Re: [PATCH 04/33] autonuma: define _PAGE_NUMA
Date: Thu, 11 Oct 2012 18:43:00 +0200
Message-ID: <20121011164300.GN1818@redhat.com>
In-Reply-To: <20121011110137.GQ3317@csn.ul.ie>

On Thu, Oct 11, 2012 at 12:01:37PM +0100, Mel Gorman wrote:
> On Thu, Oct 04, 2012 at 01:50:46AM +0200, Andrea Arcangeli wrote:
> > The objective of _PAGE_NUMA is to be able to trigger NUMA hinting page
> > faults to identify the per NUMA node working set of the thread at
> > runtime.
> >
> > Arming the NUMA hinting page fault mechanism works similarly to
> > setting up a mprotect(PROT_NONE) virtual range: the present bit is
> > cleared at the same time that _PAGE_NUMA is set, so when the fault
> > triggers we can identify it as a NUMA hinting page fault.
> >
>
> That implies that there is an atomic update requirement or at least
> an ordering requirement -- present bit must be cleared before setting
> NUMA bit. No doubt it'll be clear later in the series how this is
> accomplished. What you propose seems ok but it all depends how it's
> implemented so I'm leaving my ack off this particular patch for now.

Correct. The switch is done atomically: _PAGE_PRESENT is cleared at
the same time that _PAGE_NUMA is set. The TLB flush is deferred (it's
batched to avoid firing an IPI for every pte/pmd_numa we establish).

It's still similar to setting a range PROT_NONE, except that
_PAGE_PROTNONE and _PAGE_NUMA work in opposite ways, and they are
mutually exclusive, so they can easily share the same pte/pmd
bitflag. The other difference is that PROT_NONE must be applied
synchronously, while _PAGE_NUMA is set lazily.

The NUMA hinting page fault also never requires a TLB flush.

So the whole process (establish/teardown) has an incredibly low TLB
flushing cost.

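To make the atomic switch concrete, here is a minimal userspace
sketch, not the code from the series. It assumes a pte is a single
64-bit word and uses C11 atomics; the bit positions and the sk_*
names are hypothetical and chosen only for illustration. How the
series itself gets atomicity (for example by holding the page table
lock) is not shown in this message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this sketch. */
#define SK_PAGE_PRESENT ((uint64_t)1 << 0)
#define SK_PAGE_NUMA    ((uint64_t)1 << 8)

/*
 * Arm a hinting fault: clear PRESENT and set NUMA in one atomic
 * update, so no observer sees the entry with both bits or neither.
 * Returns false if the entry was not present (nothing to arm).
 */
static bool sk_arm_numa(_Atomic uint64_t *pte)
{
	uint64_t old, val;

	assert(pte != NULL);
	old = atomic_load(pte);
	do {
		if (!(old & SK_PAGE_PRESENT))
			return false;
		val = (old & ~SK_PAGE_PRESENT) | SK_PAGE_NUMA;
		/* On failure, old is refreshed and the check is redone. */
	} while (!atomic_compare_exchange_weak(pte, &old, val));
	return true;
}

/*
 * Hinting fault path: the reverse transition, again in one update.
 * Returns false if the entry was not armed.
 */
static bool sk_hinting_fault(_Atomic uint64_t *pte)
{
	uint64_t old, val;

	assert(pte != NULL);
	old = atomic_load(pte);
	do {
		if (!(old & SK_PAGE_NUMA))
			return false;
		val = (old & ~SK_PAGE_NUMA) | SK_PAGE_PRESENT;
	} while (!atomic_compare_exchange_weak(pte, &old, val));
	return true;
}
```

The sketch only models the bit transition. It says nothing about the
deferred TLB flush, which in the real code is what keeps the
establish side cheap.
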
The only fixed cost is in knuma_scand, plus the kernel entry/exit for
every non-shared page every 10 seconds (or whatever duration you set
for a knuma_scand pass in sysfs).

Furthermore, if the pmd_scan mode is activated, I guarantee there's at
most one NUMA hinting page fault per 2M virtual region (even if some
accuracy is lost). You can try setting scan_pmd = 0 in sysfs and also
disabling THP (echo never >enabled) to measure the exact cost per 4k
page. It's hardly measurable here. With THP the fault is also one per
2M virtual region, but no accuracy is lost in that case (or more
precisely, there's no way to get more accuracy than that, as we deal
with a pmd).
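
As a rough illustration of the difference between the two
granularities, the sketch below computes an upper bound on hinting
faults for one scan pass over an aligned range. It assumes 4k base
pages and 2M pmd regions as described above; the function and macro
names are hypothetical and not part of the series.

```c
#include <stdint.h>

#define SK_PAGE_SIZE ((uint64_t)4096)     /* 4k base page */
#define SK_PMD_SIZE  ((uint64_t)2 << 20)  /* 2M region mapped by one pmd */

/*
 * Upper bound on NUMA hinting faults for one scan pass over len
 * bytes, assuming the range is aligned to the chosen unit.
 * pmd_granularity != 0 models pmd scan mode or THP (one fault per
 * 2M region); 0 models per-pte scanning (one fault per 4k page).
 */
static uint64_t sk_max_hinting_faults(uint64_t len, int pmd_granularity)
{
	uint64_t unit = pmd_granularity ? SK_PMD_SIZE : SK_PAGE_SIZE;

	return (len + unit - 1) / unit;
}
```

For a 1G range this gives at most 512 faults per pass at pmd
granularity and at most 262144 at 4k granularity, a factor of 512.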