linux-mm.kvack.org archive mirror
From: Rik van Riel <riel@redhat.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Paul Turner <pjt@google.com>,
	Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
	Christoph Lameter <cl@linux.com>, Mel Gorman <mgorman@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Thomas Gleixner <tglx@linutronix.de>,
	Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH 2/2] mm: Optimize the TLB flush of sys_mprotect() and change_protection() users
Date: Wed, 14 Nov 2012 13:39:15 -0500
Message-ID: <50A3E553.2040501@redhat.com>
In-Reply-To: <1352883029-7885-3-git-send-email-mingo@kernel.org>

On 11/14/2012 03:50 AM, Ingo Molnar wrote:
> Reuse the NUMA code's 'modified page protections' count that
> change_protection() computes and skip the TLB flush if there are
> no changes to a range that sys_mprotect() modifies.
>
> Given that mprotect() already optimizes the same-flags case
> I expected this optimization to dominantly trigger on
> CONFIG_NUMA_BALANCING=y kernels - but even with that feature
> disabled it triggers rather often.
>
> There are two reasons for that:
>
> 1)
>
> sys_mprotect() already optimizes the same-flag case:
>
>          if (newflags == oldflags) {
>                  *pprev = vma;
>                  return 0;
>          }
>
> This test works in many cases, but it is too sharp in some
> others, where it differentiates between protection values that the
> underlying PTE format makes no distinction about, such as
> PROT_EXEC == PROT_READ on x86.
>
> 2)
>
> Even where the vma flag change does map to a different pte format
> and necessitates a modification of the pagetables, there might be
> no pagetables yet to modify: they might not be instantiated yet.
>
> During a regular desktop bootup this optimization hits a couple
> of hundred times. During a Java test I measured thousands of
> hits.
>
> So this optimization improves sys_mprotect() in general, not just
> CONFIG_NUMA_BALANCING=y kernels.
>
> [ We could further increase the efficiency of this optimization if
>    change_pte_range() and change_huge_pmd() were a bit smarter about
>    recognizing exact-same-value protection masks - when the hardware
>    can do that safely. This would probably further speed up mprotect(). ]
>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> ---
>   mm/mprotect.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index ce0377b..6ff2d5e 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -145,7 +145,10 @@ static unsigned long change_protection_range(struct vm_area_struct *vma,
>   		pages += change_pud_range(vma, pgd, addr, next, newprot,
>   				 dirty_accountable);
>   	} while (pgd++, addr = next, addr != end);
> -	flush_tlb_range(vma, start, end);
> +
> +	/* Only flush the TLB if we actually modified any entries: */
> +	if (pages)
> +		flush_tlb_range(vma, start, end);
>
>   	return pages;
>   }

Ahh, this explains why the previous patch does what it does.

Would be nice to have that explained in the changelog for that patch,
too :)
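
For readers following the thread, the effect of the quoted hunk can be
sketched in user space. This is a minimal toy model, not the kernel code:
struct toy_pte, toy_change_protection(), toy_flush_tlb_range() and
toy_flush_count are hypothetical names invented for illustration. It
assumes, as the patch does, that only entries which are actually
instantiated count as modified, and it skips the flush when that count
is zero.

```c
#include <stddef.h>

/* Hypothetical toy model of a page-table entry; not a kernel type. */
struct toy_pte {
	int present;        /* 0 if the entry has not been instantiated yet */
	unsigned int prot;  /* protection bits as stored in the entry */
};

/* Counts how many times the (simulated) TLB flush was issued. */
static unsigned long toy_flush_count;

/* Stand-in for flush_tlb_range(); here it only bumps a counter. */
static void toy_flush_tlb_range(void)
{
	toy_flush_count++;
}

/*
 * Apply newprot to every present entry and return the number of
 * entries touched. Entries that are not present are skipped, so a
 * range with no instantiated entries returns 0 and triggers no flush.
 */
static unsigned long toy_change_protection(struct toy_pte *ptes, size_t n,
					   unsigned int newprot)
{
	unsigned long pages = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!ptes[i].present)
			continue;
		ptes[i].prot = newprot;
		pages++;
	}

	/* Only flush the TLB if we actually modified any entries. */
	if (pages)
		toy_flush_tlb_range();

	return pages;
}
```

Like the patch, the sketch counts a present entry as modified even when
the new value equals the old one, so it does not cover the further
refinement suggested in the bracketed note of the changelog.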

-- 
All rights reversed


