From mboxrd@z Thu Jan 1 00:00:00 1970 From: David Gibson Date: Thu, 27 Sep 2018 00:57:17 +0000 Subject: Re: [RFC PATCH 20/32] KVM: PPC: Book3S HV: Nested guest entry via hypercall Message-Id: <20180927005717.GC30868@umbus.fritz.box> MIME-Version: 1 Content-Type: multipart/mixed; boundary="PmeXHqXogrb8Ap3n" List-Id: References: <1537524123-9578-21-git-send-email-paulus@ozlabs.org> In-Reply-To: <1537524123-9578-21-git-send-email-paulus@ozlabs.org> To: kvm-ppc@vger.kernel.org --PmeXHqXogrb8Ap3n Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Sep 26, 2018 at 08:59:22PM +1000, Paul Mackerras wrote: > On Wed, Sep 26, 2018 at 03:41:07PM +1000, David Gibson wrote: > > On Fri, Sep 21, 2018 at 08:01:51PM +1000, Paul Mackerras wrote: > > > This adds a new hypercall, H_ENTER_NESTED, which is used by a nested > > > hypervisor to enter one of its nested guests. The hypercall supplies > > > register values in two structs. Those values are copied by the level= 0 > > > (L0) hypervisor (the one which is running in hypervisor mode) into the > > > vcpu struct of the L1 guest, and then the guest is run until an > > > interrupt or error occurs which needs to be reported to L1 via the > > > hypercall return value. > > >=20 > > > Currently this assumes that the L0 and L1 hypervisors are the same > > > endianness, and the structs passed as arguments are in native > > > endianness. > >=20 > > That's nasty. It'd be good to at least detect this and bail. >=20 > OK, so I need to reword the commit message, because we do detect this > via the version number check, and reject the hcall in the cross-endian > case. Ah, ok. > > > case H_ENTER_NESTED: > > > ret =3D H_FUNCTION; > > > + if (!vcpu->kvm->arch.nested_enable) > > > + break; > >=20 > > Wouldn't H_AUTHORITY make more sense than H_FUNCTION for the > > no-nested-allowed case? >=20 > So the guest can distinguish "I don't know about nested virt" from "I > know about it but you're not allowed to use it", you mean? I'm not > sure it makes much difference in practice. I guess not. I think I'm just used to avoiding H_FUNCTION in qemu for anything except "we know nothing about this hypercall whatsoever". > > > + ret =3D kvmhv_enter_nested_guest(vcpu); > > > + if (ret =3D=3D H_INTERRUPT) { > > > + kvmppc_set_gpr(vcpu, 3, 0); > > > + return RESUME_HOST; > > > + } > > > break; > > > =20 > > > default: > > > @@ -1269,6 +1276,104 @@ static int kvmppc_handle_exit_hv(struct kvm_r= un *run, struct kvm_vcpu *vcpu, > > > return r; > > > } > > > =20 > > > +static int kvmppc_handle_nested_exit(struct kvm_vcpu *vcpu) > >=20 > > Might be nice to rename to make it clear if this is the L0 or L1 > > handling for a nested exit. >=20 > Well, it's Ln handling the exit of Ln+2. Got a suggestion for a > better name? Yeah, as I said elsewhere feels like we need better terms for absolute vs. relative hypervisor levels. What about kvmppc_handle_guestguest_exit() or kvmppc_handle_metaguest_exit()? That would work best if "guestguest" or "metaguest" was used to refer to Ln+2 throughout, of course. > > > @@ -4462,8 +4600,14 @@ static int kvmppc_core_init_vm_hv(struct kvm *= kvm) > > > * On POWER9, we only need to do this if the "indep_threads_mode" > > > * module parameter has been set to N. > > > */ > > > - if (cpu_has_feature(CPU_FTR_ARCH_300)) > > > - kvm->arch.threads_indep =3D indep_threads_mode; > > > + if (cpu_has_feature(CPU_FTR_ARCH_300)) { > > > + if (!indep_threads_mode && !cpu_has_feature(CPU_FTR_HVMODE)) { > > > + pr_warn("KVM: Ignoring indep_threads_mode=3DN in nested hyperviso= r\n"); > > > + kvm->arch.threads_indep =3D true; > >=20 > > Wouldn't it be cleaner to enforce this at the point indep_threads_mode > > is set, rather than altering the value here? >=20 > I'm not sure if we get notified when the administrator changes the > value via sysfs. In any case, doing it this way causes a warning > every time you start a VM, which is more likely to get noticed than a > single warning in the middle of all the boot-time messages. Hm, good point. > > > +long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu) > > > +{ > > > + long int err, r; > > > + struct kvm_nested_guest *l2; > > > + struct pt_regs l2_regs, saved_l1_regs; > > > + struct hv_guest_state l2_hv, saved_l1_hv; > > > + struct kvmppc_vcore *vc =3D vcpu->arch.vcore; > > > + u64 hv_ptr, regs_ptr; > > > + u64 hdec_exp; > > > + s64 delta_purr, delta_spurr, delta_ic, delta_vtb; > > > + u64 mask; > > > + > > > + if (!kvm_is_radix(vcpu->kvm)) > > > + return H_FUNCTION; > >=20 > > Would it be safer / cleaner to have this instead check that the L1 has > > completed an H_SET_PARTITION_TABLE? Which wouldn't be allowed for an > > HPT guest. >=20 > There is no valid bit in the PTCR value, and the PTCR could have been > set via the one-reg interface without there having been a > H_SET_PARTITION_TABLE in the lifetime of this KVM instance (for > example when the L1 guest has been migrated in from another machine), > so I don't see a foolproof way to do what you suggest. Ah, right. Would checking if the vPTCR is !=3D it's initial value (0?) do the job? (AFAICT PTCR=3D=3D0 would be unusable, even if technically allowed). --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson --PmeXHqXogrb8Ap3n Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAlusKuoACgkQbDjKyiDZ s5LbUw//TEK7t5JdD2fw0aoi+Yq+pIsF7ZNHI88lVy1O2YbR1Zf4pnEttqjc2Z1E va5grJXfMSEQ/ZQUecgeF8tDhnwwVJbrNSQr29SKGIJ7o8QSKeEtIQmn1xWApAGH 56vFHp6kQVpGjNUyl/R0njhnlVmdCuFJ0sku6M2zaY4p30r3M9aVKIKioFLAX8vX A7Uy82Ejv49Ru+9/V9ZpRLPcSvSFZsPW7vaNPXDkx2YjpNLiVo9MbhdR07x5tqzI LL9ZkSOYhN5NZU3Sea4pSUSiXHVgykch2+e37dAlzc0lLnwqRWSvjOpS1ajO45lG /hlG4egzUkzR0GUG8gZAbt3Nbv87BwGJVkyo05BhDnBcMjkWOczU9mx3pRpIFI7a jEStPbu2QX+rcibEkdI8KdkI7V8DMqc0srBf+Z5Q2MtgJTJU0LNdzKXzIWjAPhAz kJs5OZyc5soNtvBOBFS0zo98gykaYFPm6XghgyFCi+lM8zv4oggLomqnXAq5rerR MCeNeZSMaReQbXqgSAijyHioA3pxUWI9bxefy/e3MfgRq7K5uAQ7Hx/hBdCmQYhT tFs5wA1tKI2/T+tC7eJA3jQ05Ya4f1mblDwE8D/JqI/IXKOc0mhN+mW1NVfGdExt TVIFQH3V61478pfKsNNRFNB8bJkynFnXJi68NTgIeb2LZonFkUM= =DoZN -----END PGP SIGNATURE----- --PmeXHqXogrb8Ap3n--