[PATCH] arm/arm64: KVM: Perform local TLB invalidation when multiplexing vcpus on a single CPU

From: christoffer.dall@linaro.org (Christoffer Dall)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH] arm/arm64: KVM: Perform local TLB invalidation when multiplexing vcpus on a single CPU
Date: Thu, 27 Oct 2016 14:27:57 +0200
Message-ID: <20161027122757.GA19614@cbox>
In-Reply-To: <f00b6e69-7b45-5238-4d3a-af8d77e85670@arm.com>

On Thu, Oct 27, 2016 at 11:40:00AM +0100, Marc Zyngier wrote:
> On 27/10/16 11:04, Christoffer Dall wrote:
> > On Thu, Oct 27, 2016 at 10:49:00AM +0100, Marc Zyngier wrote:
> >> Hi Christoffer,
> >>
> >> On 27/10/16 10:19, Christoffer Dall wrote:
> >>> On Mon, Oct 24, 2016 at 04:31:28PM +0100, Marc Zyngier wrote:
> >>>> Architecturally, TLBs are private to the (physical) CPU they're
> >>>> associated with. But when multiple vcpus from the same VM are
> >>>> being multiplexed on the same CPU, the TLBs are not private
> >>>> to the vcpus (and are actually shared across the VMID).
> >>>>
> >>>> Let's consider the following scenario:
> >>>>
> >>>> - vcpu-0 maps PA to VA
> >>>> - vcpu-1 maps PA' to VA
> >>>>
> >>>> If run on the same physical CPU, vcpu-1 can hit TLB entries generated
> >>>> by vcpu-0 accesses, and access the wrong physical page.
> >>>>
> >>>> The solution to this is to keep a per-VM map of which vcpu ran last
> >>>> on each given physical CPU, and invalidate local TLBs when switching
> >>>> to a different vcpu from the same VM.
> >>>
> >>> Just making sure I understand this:  The reason you cannot rely on the
> >>> guest doing the necessary distinction with ASIDs or invalidating the TLB
> >>> is that a guest (which assumes it's running on hardware) can validly
> >>> defer any necessary invalidation until it starts running on other
> >>> physical CPUs, but we do this transparently in KVM?
> >>
> >> The guest wouldn't have to do any invalidation at all on real HW,
> >> because the TLBs are strictly private to a physical CPU (only the
> >> invalidation can be broadcast to the Inner Shareable domain). But when
> >> we multiplex two vcpus on the same physical CPU, we break the private
> >> semantics, and a vcpu could hit in the TLB entries populated by
> >> another one.
> > 
> > Such a guest would be using a mapping of the same VA with the same ASID
> > on two separate CPUs, each pointing to a separate PA.  If it ever were
> > to, say, migrate a task, it would have to do invalidations then.  Right?
> 
> This doesn't have to be ASID tagged. Actually, it is more likely to
> affect global mappings. Imagine for example that the kernel (which uses
> global mappings for its own page tables) decides to create a per-cpu
> variable using this trick (all the CPUs have the same VA, but use
> different PAs). No invalidation at all, everything looks perfectly fine,
> until you start virtualizing it.
> 
> > Does Linux or other guests actually do this?
> 
> Linux may hit it with CPU hotplug, which uses global mappings (which a
> vcpu using an ASID tagged mapping could then hit if the VAs overlap).
> 

Right, ok, it's more threatening than I first thought.  Thanks for the
explanation.

> > 
> > I would suspect Linux has to eventually invalidate those mappings if it
> > wants the scheduler to be allowed to freely move things around.
> > 
> >>
> >> As we cannot segregate the TLB entries per vcpu (but only per VMID), the
> >> workaround is to nuke all the TLBs for this VMID (locally only - no
> >> broadcast) each time we find that two vcpus are sharing the same
> >> physical CPU.
> >>
> >> Is that clearer?
> > 
> > Yes, the fix is clear, just want to make sure I understand that it's a
> > valid circumstance where this actually happens.  But in either case, we
> > probably have to fix this to emulate the hardware correctly.
> > 
> > Another fix would be to allocate a VMID per VCPU I suppose, just to
> > introduce a terrible TLB hit ratio :)
> 
> But that would break TLB invalidations that are broadcast in the Inner
> Shareable domain. To do so, you'd have to trap every TLBI, and issue
> corresponding invalidations for all the vcpus. I'm not sure I want to
> see the performance number of that solution... ;-)
> 
Ah, yeah, that's ridiculous.  Forget what I said.

Thanks,
-Christoffer
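
The scenario and fix discussed above (two vcpus of one VM mapping the same
VA to different PAs, and a per-VM record of which vcpu last ran on each
physical CPU) can be illustrated with a small user-space model. This is only
a sketch under stated assumptions, not the code from the patch: every name
here (toy_vm, toy_vcpu, last_vcpu_ran, toy_vcpu_load, run_scenario) is
hypothetical, the TLB is a tiny array tagged by VMID and VA only, and each
vcpu's "page table" is a single VA-to-PA pair.

```c
#include <assert.h>
#include <string.h>

#define NR_CPUS   2
#define TLB_SLOTS 8
#define NO_VCPU   (-1)

/* Toy TLB entry: tagged by VMID and VA only, so it cannot tell vcpus apart. */
struct tlb_entry { int valid; int vmid; unsigned long va, pa; };
struct toy_cpu   { struct tlb_entry tlb[TLB_SLOTS]; };

struct toy_vm {
    int vmid;
    int last_vcpu_ran[NR_CPUS]; /* hypothetical map: last vcpu id per physical CPU */
};

struct toy_vcpu {
    int id;
    struct toy_vm *vm;
    unsigned long va, pa;       /* one-entry "page table", for illustration only */
};

static struct toy_cpu cpus[NR_CPUS];

/* Invalidate this CPU's entries for one VMID; other CPUs are left untouched. */
static void local_tlb_flush_vmid(int cpu, int vmid)
{
    for (int i = 0; i < TLB_SLOTS; i++)
        if (cpus[cpu].tlb[i].vmid == vmid)
            cpus[cpu].tlb[i].valid = 0;
}

/* Called when a vcpu is scheduled onto a physical CPU. */
static void toy_vcpu_load(struct toy_vcpu *v, int cpu, int apply_fix)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    /* A different vcpu of the same VM ran here last: its entries may be stale. */
    if (apply_fix && v->vm->last_vcpu_ran[cpu] != v->id)
        local_tlb_flush_vmid(cpu, v->vm->vmid);
    v->vm->last_vcpu_ran[cpu] = v->id;
}

/* Translate the vcpu's VA on the given CPU: TLB lookup first, then "walk". */
static unsigned long toy_translate(struct toy_vcpu *v, int cpu)
{
    struct tlb_entry *t = cpus[cpu].tlb;
    int slot = 0;

    for (int i = 0; i < TLB_SLOTS; i++)
        if (t[i].valid && t[i].vmid == v->vm->vmid && t[i].va == v->va)
            return t[i].pa;     /* hit, possibly on another vcpu's entry */

    for (int i = 0; i < TLB_SLOTS; i++)
        if (!t[i].valid) { slot = i; break; }

    /* Miss: fill from the vcpu's own mapping (slot 0 is reused if full). */
    t[slot] = (struct tlb_entry){ 1, v->vm->vmid, v->va, v->pa };
    return v->pa;
}

/* Run vcpu-0 then vcpu-1 on CPU 0; return the PA that vcpu-1 ends up using. */
static unsigned long run_scenario(int apply_fix)
{
    struct toy_vm vm = { .vmid = 1, .last_vcpu_ran = { NO_VCPU, NO_VCPU } };
    struct toy_vcpu v0 = { 0, &vm, 0x1000UL, 0xA000UL };
    struct toy_vcpu v1 = { 1, &vm, 0x1000UL, 0xB000UL };

    memset(cpus, 0, sizeof(cpus));
    toy_vcpu_load(&v0, 0, apply_fix);
    (void)toy_translate(&v0, 0);
    toy_vcpu_load(&v1, 0, apply_fix);
    return toy_translate(&v1, 0);
}
```

In this model, run_scenario(0) has vcpu-1 hit the entry left behind by vcpu-0
and use vcpu-0's PA, while run_scenario(1) flushes the VMID's entries on that
CPU before vcpu-1 runs, so vcpu-1 gets its own PA. The flush is deliberately
local: as noted in the thread, no broadcast is needed because only the CPU
being shared can hold the conflicting entries.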

Thread overview: 10+ messages
2016-10-24 15:31 [PATCH] arm/arm64: KVM: Perform local TLB invalidation when multiplexing vcpus on a single CPU Marc Zyngier
2016-10-24 16:16 ` Mark Rutland
2016-10-25 10:20   ` Marc Zyngier
2016-10-27  9:19 ` Christoffer Dall
2016-10-27  9:49   ` Marc Zyngier
2016-10-27 10:04     ` Christoffer Dall
2016-10-27 10:40       ` Marc Zyngier
2016-10-27 12:27         ` Christoffer Dall [this message]
2016-10-27 10:51       ` Mark Rutland
2016-10-27 12:28         ` Christoffer Dall