From: Daniel Henrique Barboza <danielhb413@gmail.com>
To: Fabiano Rosas <farosas@linux.ibm.com>, qemu-devel@nongnu.org
Cc: aneesh.kumar@linux.ibm.com, qemu-ppc@nongnu.org, clg@kaod.org,
npiggin@gmail.com, david@gibson.dropbear.id.au
Subject: Re: [RFC PATCH 1/2] spapr: Report correct GTSE support via ov5
Date: Wed, 9 Mar 2022 16:09:04 -0300 [thread overview]
Message-ID: <d9cb6b05-9c06-8081-ee7f-a44f3a85848b@gmail.com> (raw)
In-Reply-To: <20220309012400.2527157-1-farosas@linux.ibm.com>
On 3/8/22 22:23, Fabiano Rosas wrote:
> QEMU reports MMU support to the guest via the ibm,architecture-vec-5
> property of the /chosen node. Byte number 26 specifies Radix Table
> Expansions, currently only GTSE (Guest Translation Shootdown
> Enable). This feature determines whether the tlbie instruction (and
> others) are HV privileged.
>
> Up until now, we always reported GTSE=1 to guests. Even after the
> support for GTSE=0 was added. As part of that support, a kernel
> command line radix_hcall_invalidate=on was introduced that overrides
> the GTSE value received via CAS. So a guest can run with GTSE=0 and
> use the H_RPT_INVALIDATE hcall instead of tlbie.
>
> In this scenario, having GTSE always set to 1 by QEMU leads to a crash
> when running nested KVM guests because KVM does not allow a nested
> hypervisor to set GTSE support for its nested guests. So a nested
> guest always uses the same value for LPCR_GTSE as its HV. Since the
> nested HV disabled GTSE, but the L2 QEMU always reports GTSE=1, we run
> into a crash when:
>
> L1 LPCR_GTSE=0
> L2 LPCR_GTSE=0
> L2 CAS GTSE=1
>
> The nested guest will run 'tlbie' and crash because the HW looks at
> LPCR_GTSE, which is clear.
>
> Having GTSE disabled in the L1 and enabled in the L2 is not an option
> because the whole purpose of GTSE is to disallow access to tlbie and
> we cannot allow L1 to spawn L2s that can access features that L1
> itself cannot.
>
> We also cannot have the guest check the LPCR bit, because LPCR is
> HV-privileged.
>
> So this patch goes through the most intuitive route which is to have
> QEMU ask KVM about GTSE support and advertise the correct value to the
> guest. A new KVM_CAP_PPC_GTSE capability is being added.
>
> TCG continues to always enable GTSE.
>
> Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
> ---
I'm not sure if I fully understood the situation, so let me recap. Once upon a time,
QEMU advertised GTSE=1 and the host would never advertise other value, and everyone
was happy.
The host started to support GTSE=0, but QEMU kept advertising GTSE=1 regardless, and no
KVM GTSE cap was added to reflect the host support. I'll then assume that:
- all guests would break if running in a GTSE=0 host prior to rpt_invalidate support (which
is necessary to allow the guest to run in GTSE=0)
- apparently no one ever tried to run a KVM guest in a GTSE=0 host, so no bugs were opened
After commit 82123b756a1a2f1 (target/ppc: Support for H_RPT_INVALIDATE hcall) we added
cap-rpt-invalidate. I didn't follow the discussions of this cap but it seems like, as with
almost every other cap we have, there would be a migration problem for a guest that was in
a rpt_invalidate aware host to migrate to another where this wouldn't be true, and the cap
solves that.
What I'm not following is why, even after having cap-rpt-invalidate, we are still "lying"
about the GTSE=1 regardless of what the host supports. We could've added the GTSE KVM cap
at the same time rpt_invalidate was introduced, and guests that want to ignore this setting
can use the cap to bypass it.
In the end this patch is a needed fix IMHO. My confusion is why we're doing this just now.
The patch itself LGTM.
Thanks,
Daniel
> hw/ppc/spapr.c | 38 +++++++++++++++++++++++++++++++-------
> target/ppc/kvm.c | 8 ++++++++
> target/ppc/kvm_ppc.h | 6 ++++++
> 3 files changed, 45 insertions(+), 7 deletions(-)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 4cc204f90d..3e95a1831f 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -971,7 +971,7 @@ static void spapr_dt_ov5_platform_support(SpaprMachineState *spapr, void *fdt,
> 23, 0x00, /* XICS / XIVE mode */
> 24, 0x00, /* Hash/Radix, filled in below. */
> 25, 0x00, /* Hash options: Segment Tables == no, GTSE == no. */
> - 26, 0x40, /* Radix options: GTSE == yes. */
> + 26, 0x00, /* Radix options, filled in below. */
> };
>
> if (spapr->irq->xics && spapr->irq->xive) {
> @@ -1000,10 +1000,16 @@ static void spapr_dt_ov5_platform_support(SpaprMachineState *spapr, void *fdt,
> } else {
> val[3] = 0x00; /* Hash */
> }
> +
> + if (kvmppc_has_cap_gtse()) {
> + val[7] = 0x40 /* OV5_MMU_RADIX_GTSE */;
> + }
> } else {
> /* V3 MMU supports both hash and radix in tcg (with dynamic switching) */
> val[3] = 0xC0;
> + val[7] = 0x40 /* OV5_MMU_RADIX_GTSE */;
> }
> +
> _FDT(fdt_setprop(fdt, chosen, "ibm,arch-vec-5-platform-support",
> val, sizeof(val)));
> }
> @@ -2824,14 +2830,32 @@ static void spapr_machine_init(MachineState *machine)
> /* Init numa_assoc_array */
> spapr_numa_associativity_init(spapr, machine);
>
> - if ((!kvm_enabled() || kvmppc_has_cap_mmu_radix()) &&
> - ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00, 0,
> + if (ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00, 0,
> spapr->max_compat_pvr)) {
> - spapr_ovec_set(spapr->ov5, OV5_MMU_RADIX_300);
> - /* KVM and TCG always allow GTSE with radix... */
> - spapr_ovec_set(spapr->ov5, OV5_MMU_RADIX_GTSE);
> +
> + /* TCG always supports Radix w/ GTSE */
> + if (!kvm_enabled()) {
> + spapr_ovec_set(spapr->ov5, OV5_MMU_RADIX_300);
> + spapr_ovec_set(spapr->ov5, OV5_MMU_RADIX_GTSE);
> + } else {
> + if (kvmppc_has_cap_mmu_radix()) {
> + spapr_ovec_set(spapr->ov5, OV5_MMU_RADIX_300);
> + }
> +
> + /*
> + * Only disable Guest Translation Shootdown if KVM
> + * supports the H_RPT_INVALIDATE hypercall, otherwise we'd
> + * leave the guest with no way to make TLB invalidations.
> + */
> + if (kvmppc_has_cap_rpt_invalidate()) {
> + if (kvmppc_has_cap_gtse()) {
> + spapr_ovec_set(spapr->ov5, OV5_MMU_RADIX_GTSE);
> + }
> + } else {
> + spapr_ovec_set(spapr->ov5, OV5_MMU_RADIX_GTSE);
> + }
> + }
> }
> - /* ... but not with hash (currently). */
>
> if (kvm_enabled()) {
> /* Enable H_LOGICAL_CI_* so SLOF can talk to in-kernel devices */
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index dc93b99189..91582c4b15 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -90,6 +90,7 @@ static int cap_ppc_nested_kvm_hv;
> static int cap_large_decr;
> static int cap_fwnmi;
> static int cap_rpt_invalidate;
> +static int cap_gtse;
>
> static uint32_t debug_inst_opcode;
>
> @@ -154,6 +155,8 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
> }
>
> cap_rpt_invalidate = kvm_vm_check_extension(s, KVM_CAP_PPC_RPT_INVALIDATE);
> + cap_gtse = kvm_vm_check_extension(s, KVM_CAP_PPC_GTSE);
> +
> kvm_ppc_register_host_cpu_type();
>
> return 0;
> @@ -2397,6 +2400,11 @@ bool kvmppc_has_cap_mmu_hash_v3(void)
> return cap_mmu_hash_v3;
> }
>
> +bool kvmppc_has_cap_gtse(void)
> +{
> + return cap_gtse;
> +}
> +
> static bool kvmppc_power8_host(void)
> {
> bool ret = false;
> diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> index ee9325bf9a..7d6980edb7 100644
> --- a/target/ppc/kvm_ppc.h
> +++ b/target/ppc/kvm_ppc.h
> @@ -63,6 +63,7 @@ bool kvmppc_has_cap_fixup_hcalls(void);
> bool kvmppc_has_cap_htm(void);
> bool kvmppc_has_cap_mmu_radix(void);
> bool kvmppc_has_cap_mmu_hash_v3(void);
> +bool kvmppc_has_cap_gtse(void);
> bool kvmppc_has_cap_xive(void);
> int kvmppc_get_cap_safe_cache(void);
> int kvmppc_get_cap_safe_bounds_check(void);
> @@ -343,6 +344,11 @@ static inline bool kvmppc_has_cap_mmu_hash_v3(void)
> return false;
> }
>
> +static inline bool kvmppc_has_cap_gtse(void)
> +{
> + return false;
> +}
> +
> static inline bool kvmppc_has_cap_xive(void)
> {
> return false;
next prev parent reply other threads:[~2022-03-09 19:10 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-09 1:23 [RFC PATCH 1/2] spapr: Report correct GTSE support via ov5 Fabiano Rosas
2022-03-09 1:24 ` [RFC PATCH 2/2] linux-headers: Add KVM_CAP_PPC_GTSE Fabiano Rosas
2022-03-09 19:09 ` Daniel Henrique Barboza [this message]
2022-03-11 14:45 ` [RFC PATCH 1/2] spapr: Report correct GTSE support via ov5 Fabiano Rosas
2022-03-10 19:51 ` Fabiano Rosas
2022-03-11 0:30 ` Daniel Henrique Barboza
2022-03-12 9:17 ` David Gibson
2022-03-14 22:10 ` Fabiano Rosas
2022-03-31 4:29 ` David Gibson
2022-04-01 7:01 ` Aneesh Kumar K.V
2022-04-01 15:50 ` Fabiano Rosas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d9cb6b05-9c06-8081-ee7f-a44f3a85848b@gmail.com \
--to=danielhb413@gmail.com \
--cc=aneesh.kumar@linux.ibm.com \
--cc=clg@kaod.org \
--cc=david@gibson.dropbear.id.au \
--cc=farosas@linux.ibm.com \
--cc=npiggin@gmail.com \
--cc=qemu-devel@nongnu.org \
--cc=qemu-ppc@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).