From: Rik van Riel <riel@redhat.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Paul Turner <pjt@google.com>,
Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
Christoph Lameter <cl@linux.com>, Mel Gorman <mgorman@suse.de>,
Andrew Morton <akpm@linux-foundation.org>,
Andrea Arcangeli <aarcange@redhat.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Thomas Gleixner <tglx@linutronix.de>,
Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH 2/2] mm: Optimize the TLB flush of sys_mprotect() and change_protection() users
Date: Wed, 14 Nov 2012 13:39:15 -0500
Message-ID: <50A3E553.2040501@redhat.com>
In-Reply-To: <1352883029-7885-3-git-send-email-mingo@kernel.org>

On 11/14/2012 03:50 AM, Ingo Molnar wrote:
> Reuse the NUMA code's 'modified page protections' count that
> change_protection() computes and skip the TLB flush if there are
> no changes to a range that sys_mprotect() modifies.
>
> Given that mprotect() already optimizes the same-flags case
> I expected this optimization to dominantly trigger on
> CONFIG_NUMA_BALANCING=y kernels - but even with that feature
> disabled it triggers rather often.
>
> There are two reasons for that:
>
> 1)
>
> sys_mprotect() already optimizes the same-flag case:
>
> 	if (newflags == oldflags) {
> 		*pprev = vma;
> 		return 0;
> 	}
>
> This test works in many cases, but it is too sharp in some
> others, where it differentiates between protection values that the
> underlying PTE format makes no distinction about, such as
> PROT_EXEC == PROT_READ on x86.
>
> 2)
>
> Even where, given the pte format, the vma flag change necessitates
> a modification of the pagetables, there might be no pagetables
> yet to modify: they might not be instantiated yet.
>
> During a regular desktop bootup this optimization hits a couple
> of hundred times. During a Java test I measured thousands of
> hits.
>
> So this optimization improves sys_mprotect() in general, not just
> CONFIG_NUMA_BALANCING=y kernels.
>
> [ We could further increase the efficiency of this optimization if
> change_pte_range() and change_huge_pmd() were a bit smarter about
> recognizing exact-same-value protection masks - when the hardware
> can do that safely. This would probably further speed up mprotect(). ]
>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> ---
> mm/mprotect.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index ce0377b..6ff2d5e 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -145,7 +145,10 @@ static unsigned long change_protection_range(struct vm_area_struct *vma,
>  		pages += change_pud_range(vma, pgd, addr, next, newprot,
>  						 dirty_accountable);
>  	} while (pgd++, addr = next, addr != end);
> -	flush_tlb_range(vma, start, end);
> +
> +	/* Only flush the TLB if we actually modified any entries: */
> +	if (pages)
> +		flush_tlb_range(vma, start, end);
>
>  	return pages;
>  }

Ahh, this explains why the previous patch does what it does.
Would be nice to have that explained in the changelog for that patch,
too :)
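
For readers following along outside the kernel tree, the idea in the quoted hunk can be sketched as a small user-space model. This is only an illustration under loud assumptions: struct toy_pte, toy_change_protection(), toy_flush_tlb_range() and toy_flush_count are hypothetical names invented here, the "page table" is a flat array, and the "flush" is just a counter. It is not the kernel's implementation; it only shows how a count of touched entries can gate the flush when a range has no instantiated entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy PTE: only a present bit and a protection value. */
struct toy_pte {
	bool present;
	unsigned int prot;
};

/* Number of simulated TLB flushes issued so far. */
static unsigned long toy_flush_count;

/* Stand-in for flush_tlb_range(); in this model it only counts calls. */
static void toy_flush_tlb_range(void)
{
	toy_flush_count++;
}

/*
 * Rewrite the protection of every present entry in [start, end) and
 * return how many entries were touched, loosely mirroring the 'pages'
 * count in the quoted hunk. Entries that are not present are skipped.
 */
static unsigned long toy_change_protection(struct toy_pte *table,
					   size_t start, size_t end,
					   unsigned int newprot)
{
	unsigned long pages = 0;
	size_t i;

	for (i = start; i < end; i++) {
		if (!table[i].present)
			continue;
		table[i].prot = newprot;
		pages++;
	}

	/* Only flush if we actually modified any entries. */
	if (pages)
		toy_flush_tlb_range();

	return pages;
}

/* Small self-check: one range with no entries, one with a single entry. */
int toy_demo(void)
{
	struct toy_pte table[8] = {0};
	unsigned long before = toy_flush_count;

	table[5].present = true;

	/* Nothing instantiated in [0, 4): no entries touched, no flush. */
	assert(toy_change_protection(table, 0, 4, 1) == 0);
	assert(toy_flush_count == before);

	/* One present entry in [4, 8): one entry touched, one flush. */
	assert(toy_change_protection(table, 4, 8, 1) == 1);
	assert(toy_flush_count == before + 1);

	return 0;
}
```

Note that this model, like the quoted hunk, counts every present entry it rewrites; it makes no attempt to detect the case where the new protection value equals the old one, which is the further refinement mentioned in the bracketed note of the changelog.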
--
All rights reversed