From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
linuxppc-dev@lists.ozlabs.org
Cc: npiggin@gmail.com, paulus@samba.org
Subject: Re: [PATCH v2] powerpc/mm/radix: Workaround prefetch issue with KVM
Date: Mon, 17 Jul 2017 10:40:10 +0530
Message-ID: <87poczps25.fsf@skywalker.in.ibm.com>
In-Reply-To: <1500014988.2865.78.camel@kernel.crashing.org>

Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
> There's a somewhat architectural issue with Radix MMU and KVM.
>
> When coming out of a guest with AIL (i.e., MMU enabled), we start
> executing hypervisor code with the PID register still containing
> whatever the guest has been using.
>
> The problem is that the CPU can (and will) then start prefetching
> or speculatively loading from whatever host context has that same
> PID (if any), thus bringing translations for that context into
> the TLB, which Linux doesn't know about.
>
> This can cause stale translations and subsequent crashes.
>
> Fixing this in a way that is neither racy nor a huge performance
> impact is difficult. We could just make the host invalidations
> always use broadcast forms but that would hurt single threaded
> programs for example.
>
> We chose to fix it instead by partitioning the PID space between
> guest and host. This is possible because today Linux only uses 19
> out of the 20 bits of PID space, so existing guests will work
> if we make the host use the top half of the 20-bit space.
>
> We additionally add a property to indicate to Linux the size of
> the PID register which will be useful if we eventually have
> processors with a larger PID space available.
>
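As a rough illustration of the partitioning described above (a sketch under
assumptions, not the patch's code): with a 20-bit PID register the host range
would start at 1 << 19, and guests would be expected to stay below it. The
helpers host_base_pid() and guest_pid_is_legal() are hypothetical names used
only here; the patch itself only shows a variable called mmu_base_pid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: first PID of the host half, given the width of
 * the PID register in bits (20 in the case discussed above).
 */
static uint32_t host_base_pid(unsigned int pid_bits)
{
    assert(pid_bits >= 1 && pid_bits <= 32);
    return UINT32_C(1) << (pid_bits - 1);
}

/*
 * Hypothetical helper: a guest PID is acceptable when it lies strictly
 * below the host base, i.e. in the bottom half of the PID space.
 */
static bool guest_pid_is_legal(uint32_t guest_pid, uint32_t base_pid)
{
    return guest_pid < base_pid;
}
```

With pid_bits = 20 the base comes out as 0x80000, so an existing guest that
only uses 19 bits of PID never reaches the host range.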
> There is still an issue with malicious guests purposefully setting
> the PID register to a value in the host range. Hopefully future HW
> can prevent that, but in the meantime, we handle it with a pair of
> kludges:
>
> - On the way out of a guest, before we clear the current VCPU
> in the PACA, we check the PID and if it's outside of the permitted
> range we flush the TLB for that PID.
>
> - When context switching, if the mm is "new" on that CPU (the
> corresponding bit was set for the first time in the mm cpumask), we
> check if any sibling thread is in KVM (has a non-NULL VCPU pointer
> in the PACA). If that is the case, we also flush the PID for that
> CPU (core).
>
> This second part is needed to handle the case where a process is
> migrated (or starts a new pthread) on a sibling thread of the CPU
> coming out of KVM, as there's a window where stale translations
> can exist before we detect it and flush them out.
>
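The context-switch kludge above could be modelled roughly as follows. This is
a simplified sketch, not the kernel implementation: struct thread_state stands
in for the PACA, THREADS_PER_CORE is an assumed constant, and the names
sibling_in_kvm() and need_pid_flush() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define THREADS_PER_CORE 4 /* assumption for this sketch */

/* Hypothetical stand-in for the PACA: only the field the check reads. */
struct thread_state {
    const void *kvm_vcpu; /* non-NULL while the thread is in KVM */
};

/* True when any sibling of thread 'self' currently has a VCPU set. */
static bool sibling_in_kvm(const struct thread_state *core, size_t self)
{
    assert(self < THREADS_PER_CORE);
    for (size_t i = 0; i < THREADS_PER_CORE; i++) {
        if (i != self && core[i].kvm_vcpu != NULL)
            return true;
    }
    return false;
}

/*
 * The PID is flushed only when the mm is new on this CPU (its cpumask
 * bit was just set for the first time) and a sibling thread is in KVM.
 */
static bool need_pid_flush(bool mm_new_on_cpu,
                           const struct thread_state *core, size_t self)
{
    return mm_new_on_cpu && sibling_in_kvm(core, self);
}
```

The sketch only captures the decision; the actual flush and the ordering
against the VCPU pointer being cleared in the PACA are left out.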
> A future optimization could be added by keeping track of whether
> the PID has ever been used and avoiding the flush for completely
> fresh PIDs. We could similarly mark PIDs that have been the subject of
> a global invalidation as "fresh". But for now this will do.
>
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
>
> v2. Do the check on KVM exit *after* we've restored the host PID
>
....
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index 6ea4b53..e744d11 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -1522,6 +1522,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
> std r6, VCPU_BESCR(r9)
> stw r7, VCPU_GUEST_PID(r9)
> std r8, VCPU_WORT(r9)
> +
> BEGIN_FTR_SECTION
> mfspr r5, SPRN_TCSCR
> mfspr r6, SPRN_ACOP
> @@ -1728,6 +1729,19 @@ BEGIN_FTR_SECTION
> mtspr SPRN_PSSCR, r6
> mtspr SPRN_PID, r7
> mtspr SPRN_IAMR, r8
> +
> + /* Handle the case where the guest used an illegal PID */
> + LOAD_REG_ADDR(r4, mmu_base_pid)
> + lwz r3, VCPU_GUEST_PID(r9)
> + lwz r5, 0(r4)
> + cmpw cr0,r3,r5
> + blt 1f
> +
> + /* Illegal PID, flush the TLB */
> + isync
> + bl radix_flush_pid
> +1:
This needs to be done only for radix, right? Do we need a radix feature
check here?
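
To make the question concrete, here is a small C model of the quoted exit-path
check with a radix guard added. It is only a sketch: exit_needs_flush() and
the radix_enabled parameter are hypothetical, and whether the hash MMU can
skip the workaround is an assumption, not something the patch states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the check on guest exit. The signed comparison
 * mirrors "cmpw cr0,r3,r5; blt 1f": the flush is skipped when the guest
 * PID is below mmu_base_pid.
 */
static bool exit_needs_flush(bool radix_enabled, int32_t guest_pid,
                             int32_t mmu_base_pid)
{
    assert(mmu_base_pid > 0);
    if (!radix_enabled)
        return false; /* assumption: workaround only matters for radix */
    return guest_pid >= mmu_base_pid;
}
```

In the assembly this would mean wrapping the new block in its own MMU feature
section rather than relying on the enclosing CPU_FTR_ARCH_300 one.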
> +
> END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
> BEGIN_FTR_SECTION
> PPC_INVALIDATE_ERAT
-aneesh