linux-mm.kvack.org archive mirror
From: Ingo Molnar <mingo@kernel.org>
To: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>, Hugh Dickins <hughd@google.com>,
	Minchan Kim <minchan@kernel.org>,
	Andi Kleen <andi@firstfloor.org>, H Peter Anvin <hpa@zytor.com>,
	Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH 0/3] TLB flush multiple pages per IPI v5
Date: Mon, 8 Jun 2015 21:52:37 +0200	[thread overview]
Message-ID: <20150608195237.GA15429@gmail.com> (raw)
In-Reply-To: <5575DD33.3000400@intel.com>


* Dave Hansen <dave.hansen@intel.com> wrote:

> On 06/08/2015 10:45 AM, Ingo Molnar wrote:
> > As per my measurements the __flush_tlb_single() primitive (which you use in patch
> > #2) is very expensive on most Intel and AMD CPUs. It barely makes sense for 2
> > pages and gets rapidly worse. It's probably done in microcode and its
> > performance is horrible.
> 
> I discussed this a bit in commit a5102476a2.  I'd be curious what
> numbers you came up with.

... which for those of us who don't have sha1s cached in our brains is:

  a5102476a24b ("x86/mm: Set TLB flush tunable to sane value (33)")

;-)

So what I measured agrees generally with the comment you added in the commit:

 + * Each single flush is about 100 ns, so this caps the maximum overhead at
 + * _about_ 3,000 ns.

Let that sink in: 3,000 nsecs = 3 usecs, that's an eternity!
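
For reference, a minimal sketch of the heuristic that tunable controls
(simplified from the arch/x86/mm/tlb.c of that era; the function name and
exact structure here are illustrative, not the upstream code):

  #include <asm/tlbflush.h>	/* __flush_tlb(), __flush_tlb_single() */

  #define TLB_FLUSH_CEILING	33	/* the tunable set by a5102476a24b */

  static void flush_tlb_range_sketch(unsigned long start, unsigned long end)
  {
          unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
          unsigned long addr;

          if (nr_pages > TLB_FLUSH_CEILING) {
                  /* One CR3 reload: drops all non-global entries at once. */
                  __flush_tlb();
                  return;
          }

          /* Per-page INVLPG, ~100 ns each: caps the overhead at ~3,000 ns. */
          for (addr = start; addr < end; addr += PAGE_SIZE)
                  __flush_tlb_single(addr);
  }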

A CR3-driven TLB flush takes less time than even a single INVLPG (!):

   [    0.389028] x86/fpu: Cost of: __flush_tlb()               fn            :    96 cycles
   [    0.405885] x86/fpu: Cost of: __flush_tlb_one()           fn            :   260 cycles
   [    0.414302] x86/fpu: Cost of: __flush_tlb_range()         fn            :   404 cycles
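
(The usual pattern behind such 'Cost of:' numbers is a best-of-N cycle-counting
loop; a hypothetical user-space sketch of that pattern, not the actual x86/fpu
measurement code, looks like this:)

  #include <stdint.h>
  #include <x86intrin.h>

  static inline uint64_t cycles(void)
  {
          unsigned int aux;

          return __rdtscp(&aux);	/* waits for prior instructions to retire */
  }

  /* Time fn() 'reps' times and keep the minimum, to filter out noise. */
  static uint64_t measure(void (*fn)(void), int reps)
  {
          uint64_t best = UINT64_MAX;
          int i;

          for (i = 0; i < reps; i++) {
                  uint64_t t0 = cycles();
                  fn();
                  uint64_t t1 = cycles();

                  if (t1 - t0 < best)
                          best = t1 - t0;
          }
          return best;
  }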

it's true that a full flush has hidden costs not measured above, because it has 
knock-on effects (it drops all non-global TLB entries), but it's not _that_ 
bad, because:

  - a TLB miss almost always coincides with an L1 or L2 cache miss, whose
    latency can be overlapped with the refill

  - kernel entries have the global bit set, so they survive the flush

  - user-space under high memory pressure is typically thrashing through
    the TLB anyway

... and especially given caching and Intel's historically phenomenally low TLB 
refill latency, it's difficult to measure the effects of local TLB refills at 
all, let alone see them in any macro benchmark.

Cross-CPU flushes are expensive, absolutely no argument about that - my suggestion 
here is to keep the batching but simplify it, because I strongly suspect that the 
biggest win is the batching, not the pfn queueing.

We might even win a bit more performance due to the simplification.
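
In kernel terms the simplified scheme would look something like this (an
illustrative sketch, not Mel's actual patch; smp_call_function_many() is the
stock cross-call primitive):

  #include <linux/smp.h>
  #include <asm/tlbflush.h>

  /* Full flush on the target CPU: one CR3 reload, no per-pfn INVLPG walk. */
  static void do_full_flush(void *unused)
  {
          __flush_tlb();
  }

  /* One IPI per CPU in the batch; no pfn queue to build or drain. */
  static void flush_tlb_batched(const struct cpumask *cpus)
  {
          smp_call_function_many(cpus, do_full_flush, NULL, true);
  }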

> But, don't we have to take into account the cost of refilling the TLB in 
> addition to the cost of emptying it?  The TLB size is historically increasing on 
> a per-core basis, so isn't this refill cost only going to get worse?

Only if TLB refill latency sucks - but Intel's is very good and AMD's is pretty 
good as well.

Also, usually if you miss the TLB you miss the cache line as well (you definitely 
miss the L1 cache, and TLB caches are sized to hold a fair chunk of your L2 
cache), and the CPU can overlap the two latencies.

So while it might sound counter-intuitive, a full TLB flush might be faster than 
trying to do software-based TLB cache management ...
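
Back-of-envelope, using the cycle numbers above: flushing even just 2 pages via
INVLPG costs 2 x 260 = 520 cycles, more than five full 96-cycle flushes, and
that's before any refill cost enters the picture.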

INVLPG really sucks. I can be convinced by numbers, but this isn't nearly as 
clear-cut as it might look.

Thanks,

	Ingo

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <dont@kvack.org>


Thread overview: 42+ messages
2015-06-08 12:50 [PATCH 0/3] TLB flush multiple pages per IPI v5 Mel Gorman
2015-06-08 12:50 ` [PATCH 1/3] x86, mm: Trace when an IPI is about to be sent Mel Gorman
2015-06-08 12:50 ` [PATCH 2/3] mm: Send one IPI per CPU to TLB flush multiple pages that were recently unmapped Mel Gorman
2015-06-08 22:38   ` Andrew Morton
2015-06-09 11:07     ` Mel Gorman
2015-06-08 12:50 ` [PATCH 3/3] mm: Defer flush of writable TLB entries Mel Gorman
2015-06-08 17:45 ` [PATCH 0/3] TLB flush multiple pages per IPI v5 Ingo Molnar
2015-06-08 18:21   ` Dave Hansen
2015-06-08 19:52     ` Ingo Molnar [this message]
2015-06-08 20:03       ` Ingo Molnar
2015-06-08 21:07       ` Dave Hansen
2015-06-08 21:50         ` Ingo Molnar
2015-06-09  8:47   ` Mel Gorman
2015-06-09 10:32     ` Ingo Molnar
2015-06-09 11:20       ` Mel Gorman
2015-06-09 12:43         ` Ingo Molnar
2015-06-09 13:05           ` Mel Gorman
2015-06-10  8:51             ` Ingo Molnar
2015-06-10  9:08               ` Ingo Molnar
2015-06-10 10:15                 ` Mel Gorman
2015-06-11 15:26                   ` Ingo Molnar
2015-06-10  9:19               ` Mel Gorman
2015-06-09 15:34           ` Dave Hansen
2015-06-09 16:49             ` Dave Hansen
2015-06-09 21:14               ` Dave Hansen
2015-06-09 21:54                 ` Linus Torvalds
2015-06-09 22:32                   ` Mel Gorman
2015-06-09 22:35                     ` Mel Gorman
2015-06-10 13:13                   ` Andi Kleen
2015-06-10 16:17                     ` Linus Torvalds
2015-06-10 16:42                       ` Linus Torvalds
2015-06-10 17:24                         ` Mel Gorman
2015-06-10 17:31                           ` Linus Torvalds
2015-06-10 18:08                         ` Josh Boyer
2015-06-10 17:07                       ` Mel Gorman
2015-06-21 20:22             ` Kirill A. Shutemov
2015-06-25 11:48               ` Ingo Molnar
2015-06-25 18:36                 ` Linus Torvalds
2015-06-25 19:15                   ` Vlastimil Babka
2015-06-25 22:04                     ` Linus Torvalds
2015-06-25 18:46                 ` Dave Hansen
2015-06-26  9:08                   ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the thread as an mbox file, import it into your mail client,
  and reply-to-all from there.

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150608195237.GA15429@gmail.com \
    --to=mingo@kernel.org \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=andi@firstfloor.org \
    --cc=dave.hansen@intel.com \
    --cc=hpa@zytor.com \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=minchan@kernel.org \
    --cc=riel@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
