public inbox for linux-kernel@vger.kernel.org
From: Peter Zijlstra <peterz@infradead.org>
To: Nadav Amit <namit@vmware.com>
Cc: Ingo Molnar <mingo@redhat.com>, Andy Lutomirski <luto@kernel.org>,
	Borislav Petkov <bp@alien8.de>,
	linux-kernel@vger.kernel.org, jgross@suse.com, kys@microsoft.com,
	haiyangz@microsoft.com, sthemmin@microsoft.com,
	sashal@kernel.org
Subject: Re: [RFC PATCH 0/6] x86/mm: Flush remote and local TLBs concurrently
Date: Mon, 27 May 2019 11:59:59 +0200
Message-ID: <20190527095959.GV2623@hirez.programming.kicks-ass.net>
In-Reply-To: <20190525082203.6531-1-namit@vmware.com>

On Sat, May 25, 2019 at 01:21:57AM -0700, Nadav Amit wrote:
> Currently, local and remote TLB flushes are not performed concurrently,
> which introduces unnecessary overhead - each INVLPG can take 100s of
> cycles. This patch-set allows TLB flushes to be run concurrently: first
> request the remote CPUs to initiate the flush, then run it locally, and
> finally wait for the remote CPUs to finish their work.
> 
> The proposed changes should also improve the performance of other
> invocations of on_each_cpu(). Hopefully, no one has relied on the
> behavior of on_each_cpu() that functions were first executed remotely
> and only then locally.
> 
> On my Haswell machine (bare-metal), running a TLB flush microbenchmark
> (MADV_DONTNEED/touch for a single page on one thread), takes the
> following time (ns):
> 
> 	n_threads	before		after
> 	---------	------		-----
> 	1		661		663
> 	2		1436		1225 (-14%)
> 	4		1571		1421 (-10%)
> 
> Note that since the benchmark also causes page-faults, the speedup of
> the TLB shootdowns themselves is greater than shown. Also note the
> higher improvement in performance with 2 threads (a single remote TLB
> flush target). This seems to be a side-effect of holding synchronization
> data-structures (csd) off the stack, unlike the way it is currently done
> (in smp_call_function_single()).
> 
> Patches 1-2 do small cleanups. Patches 3-5 actually implement the
> concurrent execution of TLB flushes. Patch 6 restores the performance of
> local TLB flushes, which was hurt by the optimization, to its previous
> level by introducing a fast-path for this specific case.

I like; ideally we'll get Hyper-V and Xen sorted before the final
version and avoid having to introduce more PV crud and static keys for
that.

The Hyper-V implementation in particular is horrifically ugly; the Xen
one doesn't win any prizes either, esp. that on-stack CPU mask needs to
go.

Looking at them, I'm not sure they can actually win anything from using
the new interface, but at least we can avoid making our PV crud uglier
than it has to be.
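
For readers following along, the ordering described in the quoted cover
letter (request the remote flushes, flush locally, then wait for the
remote CPUs) can be modelled in userspace roughly as below. This is only
a sketch in Python, not kernel code: threads stand in for remote CPUs,
starting a thread stands in for sending an IPI, and flush_concurrently()
and its flush callback are hypothetical names that do not appear in the
patch set.

```python
import threading


def flush_concurrently(remote_cpus, flush, local_cpu=0):
    """Model of the proposed ordering; returns the observed event log.

    `flush` is a caller-supplied callback (hypothetical) that performs
    the per-CPU flush work for the CPU number it is given.
    """
    log = []
    lock = threading.Lock()

    def record(event):
        # Serialize log appends so the recorded order is well defined.
        with lock:
            log.append(event)

    def remote_worker(cpu):
        flush(cpu)
        record(("remote_done", cpu))

    # 1. Ask the remote CPUs to start flushing (stand-in for the IPIs).
    workers = [threading.Thread(target=remote_worker, args=(cpu,))
               for cpu in remote_cpus]
    for worker in workers:
        worker.start()
    record(("requested", None))

    # 2. Flush locally while the remote flushes are in flight.
    flush(local_cpu)
    record(("local_done", local_cpu))

    # 3. Only now wait for the remote CPUs to finish.
    for worker in workers:
        worker.join()
    record(("all_done", None))
    return log
```

In this model the local flush overlaps with the remote ones instead of
starting only after they have completed, which is the overlap the
cover letter is after.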
