* Re: [PATCH 2/5] memory: fsl_ifc: populate child devices without relying on simple-bus
From: Rob Herring @ 2021-10-04 16:27 UTC (permalink / raw)
To: Li Yang
Cc: devicetree, Shawn Guo, linux-kernel, Krzysztof Kozlowski,
linuxppc-dev, linux-arm-kernel
In-Reply-To: <20211001000924.15421-3-leoyang.li@nxp.com>
On Thu, Sep 30, 2021 at 07:09:21PM -0500, Li Yang wrote:
> After we update the binding to not use simple-bus compatible for the
> controller, we need the driver to populate the child devices explicitly.
>
> Signed-off-by: Li Yang <leoyang.li@nxp.com>
> ---
> drivers/memory/fsl_ifc.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/drivers/memory/fsl_ifc.c b/drivers/memory/fsl_ifc.c
> index d062c2f8250f..251d713cd50b 100644
> --- a/drivers/memory/fsl_ifc.c
> +++ b/drivers/memory/fsl_ifc.c
> @@ -88,6 +88,7 @@ static int fsl_ifc_ctrl_remove(struct platform_device *dev)
> {
> struct fsl_ifc_ctrl *ctrl = dev_get_drvdata(&dev->dev);
>
> + of_platform_depopulate(&dev->dev);
> free_irq(ctrl->nand_irq, ctrl);
> free_irq(ctrl->irq, ctrl);
>
> @@ -285,6 +286,14 @@ static int fsl_ifc_ctrl_probe(struct platform_device *dev)
> }
> }
>
> + /* legacy dts may still use "simple-bus" compatible */
> + if (!of_device_is_compatible(dev->dev.of_node, "simple-bus")) {
> + ret = of_platform_populate(dev->dev.of_node, NULL, NULL,
> + &dev->dev);
There's no need to make this conditional. of_platform_populate() is safe
to call multiple times. If that doesn't work, it's a bug.
Rob
^ permalink raw reply
* Re: [PATCH 3/3] powerpc: Set crashkernel offset to mid of RMA region
From: Aneesh Kumar K.V @ 2021-10-04 16:06 UTC (permalink / raw)
To: Sourabh Jain, mpe
Cc: Abdul haleem, mahesh, linux-kernel, hbathini, linuxppc-dev
In-Reply-To: <20211004151142.256251-4-sourabhjain@linux.ibm.com>
On 10/4/21 20:41, Sourabh Jain wrote:
> On large config LPARs (having 192 and more cores), Linux fails to boot
> due to insufficient memory in the first memory block. It is due to the
> reserve crashkernel area starts at 128MB offset by default and which
> doesn't leave enough space in the first memory block to accommodate
> memory for other essential system resources.
>
> Given that the RMA region size can be 512MB or more, setting the
> crashkernel offset to mid of RMA size will leave enough space to
> kernel to allocate memory for other system resources in the first
> memory block.
>
> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
> Reported-and-tested-by: Abdul haleem <abdhalee@linux.vnet.ibm.com>
> ---
> arch/powerpc/kernel/rtas.c | 3 +++
> arch/powerpc/kexec/core.c | 13 +++++++++----
> 2 files changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index ff80bbad22a5..ce5e62bb4d8e 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -1235,6 +1235,9 @@ int __init early_init_dt_scan_rtas(unsigned long node,
> entryp = of_get_flat_dt_prop(node, "linux,rtas-entry", NULL);
> sizep = of_get_flat_dt_prop(node, "rtas-size", NULL);
>
> + if (of_get_flat_dt_prop(node, "ibm,hypertas-functions", NULL))
> + powerpc_firmware_features |= FW_FEATURE_LPAR;
> +
The equivalent check that we currently do more than checking
ibm,hypertas-functions.
if (!strcmp(uname, "rtas") || !strcmp(uname, "rtas@0")) {
prop = of_get_flat_dt_prop(node, "ibm,hypertas-functions",
&len);
if (prop) {
powerpc_firmware_features |= FW_FEATURE_LPAR;
fw_hypertas_feature_init(prop, len);
}
also do we expect other firmware features to be set along with
FW_FEATURE_LPAR?
> if (basep && entryp && sizep) {
> rtas.base = *basep;
> rtas.entry = *entryp;
> diff --git a/arch/powerpc/kexec/core.c b/arch/powerpc/kexec/core.c
> index 48525e8b5730..f69cf3e370ec 100644
> --- a/arch/powerpc/kexec/core.c
> +++ b/arch/powerpc/kexec/core.c
> @@ -147,11 +147,16 @@ void __init reserve_crashkernel(void)
> if (!crashk_res.start) {
> #ifdef CONFIG_PPC64
> /*
> - * On 64bit we split the RMO in half but cap it at half of
> - * a small SLB (128MB) since the crash kernel needs to place
> - * itself and some stacks to be in the first segment.
> + * crash kernel needs to placed in the first segment. On LPAR
> + * setting crash kernel start to mid of RMA size (512MB or more)
> + * would help primary kernel to boot properly on large config
> + * LPAR (with core count 192 or more) and for the reset keep
> + * cap the crash kernel start at 128MB offse.
> */
> - crashk_res.start = min(0x8000000ULL, (ppc64_rma_size / 2));
> + if (firmware_has_feature(FW_FEATURE_LPAR))
> + crashk_res.start = ppc64_rma_size / 2;
> + else
> + crashk_res.start = min(0x8000000ULL, (ppc64_rma_size / 2));
> #else
> crashk_res.start = KDUMP_KERNELBASE;
> #endif
>
^ permalink raw reply
* [RFC PATCH] KVM: PPC: Book3S HV P9: Move H_CEDE logic mostly to one place
From: Nicholas Piggin @ 2021-10-04 16:10 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin
Move the vcpu->arch.ceded, hrtimer, and blocking handling to one place,
except the xive escalation rearm case. The only special case is the
xive handling, as it is to be done before the xive context is pulled.
This means the P9 path does not run with ceded==1 or the hrtimer armed
except in the kvmhv_handle_cede function, and hopefully cede handling is
a bit more understandable.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/include/asm/kvm_ppc.h | 4 +-
arch/powerpc/kvm/book3s_hv.c | 137 +++++++++++++-------------
arch/powerpc/kvm/book3s_hv_p9_entry.c | 3 +-
arch/powerpc/kvm/book3s_xive.c | 26 ++---
4 files changed, 88 insertions(+), 82 deletions(-)
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 70ffcb3c91bf..e2e6cee9dddf 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -674,7 +674,7 @@ extern int kvmppc_xive_set_irq(struct kvm *kvm, int irq_source_id, u32 irq,
int level, bool line_status);
extern void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu);
extern void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu);
-extern void kvmppc_xive_rearm_escalation(struct kvm_vcpu *vcpu);
+extern bool kvmppc_xive_rearm_escalation(struct kvm_vcpu *vcpu);
static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu)
{
@@ -712,7 +712,7 @@ static inline int kvmppc_xive_set_irq(struct kvm *kvm, int irq_source_id, u32 ir
int level, bool line_status) { return -ENODEV; }
static inline void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu) { }
static inline void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu) { }
-static inline void kvmppc_xive_rearm_escalation(struct kvm_vcpu *vcpu) { }
+static inline bool kvmppc_xive_rearm_escalation(struct kvm_vcpu *vcpu) { return false; }
static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu)
{ return 0; }
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 36c54f483a02..230f10b67f98 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1019,6 +1019,8 @@ static long kvmppc_h_rpt_invalidate(struct kvm_vcpu *vcpu,
return H_SUCCESS;
}
+static int kvmhv_handle_cede(struct kvm_vcpu *vcpu);
+
int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
@@ -1080,7 +1082,9 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
break;
case H_CEDE:
+ ret = kvmhv_handle_cede(vcpu);
break;
+
case H_PROD:
target = kvmppc_get_gpr(vcpu, 4);
tvcpu = kvmppc_find_vcpu(kvm, target);
@@ -1292,25 +1296,6 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
return RESUME_GUEST;
}
-/*
- * Handle H_CEDE in the P9 path where we don't call the real-mode hcall
- * handlers in book3s_hv_rmhandlers.S.
- *
- * This has to be done early, not in kvmppc_pseries_do_hcall(), so
- * that the cede logic in kvmppc_run_single_vcpu() works properly.
- */
-static void kvmppc_cede(struct kvm_vcpu *vcpu)
-{
- vcpu->arch.shregs.msr |= MSR_EE;
- vcpu->arch.ceded = 1;
- smp_mb();
- if (vcpu->arch.prodded) {
- vcpu->arch.prodded = 0;
- smp_mb();
- vcpu->arch.ceded = 0;
- }
-}
-
static int kvmppc_hcall_impl_hv(unsigned long cmd)
{
switch (cmd) {
@@ -2971,7 +2956,7 @@ static int kvmppc_core_check_requests_hv(struct kvm_vcpu *vcpu)
return 1;
}
-static void kvmppc_set_timer(struct kvm_vcpu *vcpu)
+static bool kvmppc_set_timer(struct kvm_vcpu *vcpu)
{
unsigned long dec_nsec, now;
@@ -2980,11 +2965,12 @@ static void kvmppc_set_timer(struct kvm_vcpu *vcpu)
/* decrementer has already gone negative */
kvmppc_core_queue_dec(vcpu);
kvmppc_core_prepare_to_enter(vcpu);
- return;
+ return false;
}
dec_nsec = tb_to_ns(kvmppc_dec_expires_host_tb(vcpu) - now);
hrtimer_start(&vcpu->arch.dec_timer, dec_nsec, HRTIMER_MODE_REL);
vcpu->arch.timer_running = 1;
+ return true;
}
extern int __kvmppc_vcore_entry(void);
@@ -4015,21 +4001,11 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
else if (*tb >= time_limit) /* nested time limit */
return BOOK3S_INTERRUPT_NESTED_HV_DECREMENTER;
- vcpu->arch.ceded = 0;
-
vcpu_vpa_increment_dispatch(vcpu);
if (kvmhv_on_pseries()) {
trap = kvmhv_vcpu_entry_p9_nested(vcpu, time_limit, lpcr, tb);
- /* H_CEDE has to be handled now, not later */
- if (trap == BOOK3S_INTERRUPT_SYSCALL && !vcpu->arch.nested &&
- kvmppc_get_gpr(vcpu, 3) == H_CEDE) {
- kvmppc_cede(vcpu);
- kvmppc_set_gpr(vcpu, 3, 0);
- trap = 0;
- }
-
} else {
struct kvm *kvm = vcpu->kvm;
@@ -4043,12 +4019,14 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
!(vcpu->arch.shregs.msr & MSR_PR)) {
unsigned long req = kvmppc_get_gpr(vcpu, 3);
- /* H_CEDE has to be handled now, not later */
+ /* Have to rearm before pulling xive, may abort cede */
if (req == H_CEDE) {
- kvmppc_cede(vcpu);
- kvmppc_xive_rearm_escalation(vcpu); /* may un-cede */
- kvmppc_set_gpr(vcpu, 3, 0);
- trap = 0;
+ if (kvmppc_xive_rearm_escalation(vcpu)) {
+ vcpu->arch.shregs.msr |= MSR_EE;
+ vcpu->arch.prodded = 0;
+ kvmppc_set_gpr(vcpu, 3, H_SUCCESS);
+ trap = 0;
+ }
/* XICS hcalls must be handled before xive is pulled */
} else if (hcall_is_xics(req)) {
@@ -4423,6 +4401,63 @@ static int kvmppc_run_vcpu(struct kvm_vcpu *vcpu)
return vcpu->arch.ret;
}
+/* H_CEDE for the P9 path, P7/8 is handled by realmode handlers */
+static int kvmhv_handle_cede(struct kvm_vcpu *vcpu)
+{
+ struct kvmppc_vcore *vc = vcpu->arch.vcore;
+ struct kvm_run *run = vcpu->run;
+
+ kvmppc_set_gpr(vcpu, 3, H_SUCCESS);
+ vcpu->arch.trap = 0;
+ vcpu->arch.shregs.msr |= MSR_EE;
+
+ vcpu->arch.ceded = 1;
+ smp_mb();
+ if (vcpu->arch.prodded) {
+ vcpu->arch.prodded = 0;
+ smp_mb();
+ vcpu->arch.ceded = 0;
+ return RESUME_GUEST;
+ }
+
+ if (kvmppc_vcpu_woken(vcpu)) {
+ vcpu->arch.ceded = 0;
+ return RESUME_GUEST;
+ }
+
+ if (!kvmppc_set_timer(vcpu)) {
+ vcpu->arch.ceded = 0;
+ return RESUME_GUEST;
+ }
+
+ prepare_to_rcuwait(&vcpu->wait);
+ for (;;) {
+ set_current_state(TASK_INTERRUPTIBLE);
+ if (signal_pending(current)) {
+ vcpu->stat.signal_exits++;
+ run->exit_reason = KVM_EXIT_INTR;
+ vcpu->arch.ret = -EINTR;
+ break;
+ }
+
+ if (kvmppc_vcpu_woken(vcpu) || !vcpu->arch.ceded)
+ break;
+
+ trace_kvmppc_vcore_blocked(vc, 0);
+ schedule();
+ trace_kvmppc_vcore_blocked(vc, 1);
+ }
+ finish_rcuwait(&vcpu->wait);
+
+ if (vcpu->arch.timer_running) {
+ hrtimer_try_to_cancel(&vcpu->arch.dec_timer);
+ vcpu->arch.timer_running = 0;
+ }
+
+ vcpu->arch.ceded = 0;
+ return RESUME_GUEST;
+}
+
int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
unsigned long lpcr)
{
@@ -4442,7 +4477,6 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
vcpu->arch.trap = 0;
vc = vcpu->arch.vcore;
- vcpu->arch.ceded = 0;
vcpu->arch.run_task = current;
vcpu->arch.state = KVMPPC_VCPU_RUNNABLE;
vcpu->arch.last_inst = KVM_INST_FETCH_FAILED;
@@ -4488,11 +4522,6 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
goto out;
}
- if (vcpu->arch.timer_running) {
- hrtimer_try_to_cancel(&vcpu->arch.dec_timer);
- vcpu->arch.timer_running = 0;
- }
-
tb = mftb();
vcpu->cpu = pcpu;
@@ -4555,30 +4584,6 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
}
vcpu->arch.ret = r;
- if (is_kvmppc_resume_guest(r) && !kvmppc_vcpu_check_block(vcpu)) {
- kvmppc_set_timer(vcpu);
-
- prepare_to_rcuwait(&vcpu->wait);
- for (;;) {
- set_current_state(TASK_INTERRUPTIBLE);
- if (signal_pending(current)) {
- vcpu->stat.signal_exits++;
- run->exit_reason = KVM_EXIT_INTR;
- vcpu->arch.ret = -EINTR;
- break;
- }
-
- if (kvmppc_vcpu_check_block(vcpu))
- break;
-
- trace_kvmppc_vcore_blocked(vc, 0);
- schedule();
- trace_kvmppc_vcore_blocked(vc, 1);
- }
- finish_rcuwait(&vcpu->wait);
- }
- vcpu->arch.ceded = 0;
-
done:
trace_kvmppc_run_vcpu_exit(vcpu);
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 86a222f97e8e..a1fdfcba608f 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -713,11 +713,10 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
WARN_ON_ONCE(vcpu->arch.shregs.msr & MSR_HV);
WARN_ON_ONCE(!(vcpu->arch.shregs.msr & MSR_ME));
+ WARN_ON_ONCE(vcpu->arch.ceded);
start_timing(vcpu, &vcpu->arch.rm_entry);
- vcpu->arch.ceded = 0;
-
/* Save MSR for restore, with EE clear. */
msr = mfmsr() & ~MSR_EE;
diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
index a18db9e16ea4..7d45764fd6a8 100644
--- a/arch/powerpc/kvm/book3s_xive.c
+++ b/arch/powerpc/kvm/book3s_xive.c
@@ -179,37 +179,39 @@ void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu)
}
EXPORT_SYMBOL_GPL(kvmppc_xive_pull_vcpu);
-void kvmppc_xive_rearm_escalation(struct kvm_vcpu *vcpu)
+/* Return true if H_CEDE is to be aborted */
+bool kvmppc_xive_rearm_escalation(struct kvm_vcpu *vcpu)
{
void __iomem *esc_vaddr = (void __iomem *)vcpu->arch.xive_esc_vaddr;
if (!esc_vaddr)
- return;
+ return false;
/* we are using XIVE with single escalation */
if (vcpu->arch.xive_esc_on) {
/*
- * If we still have a pending escalation, abort the cede,
- * and we must set PQ to 10 rather than 00 so that we don't
- * potentially end up with two entries for the escalation
- * interrupt in the XIVE interrupt queue. In that case
- * we also don't want to set xive_esc_on to 1 here in
- * case we race with xive_esc_irq().
- */
- vcpu->arch.ceded = 0;
- /*
+ * If we still have a pending escalation, return true to abort
+ * the cede, and we must set PQ to 10 rather than 00 so that we
+ * don't potentially end up with two entries for the escalation
+ * interrupt in the XIVE interrupt queue. In that case we also
+ * don't want to set xive_esc_on to 1 here in case we race with
+ * xive_esc_irq().
+ *
* The escalation interrupts are special as we don't EOI them.
* There is no need to use the load-after-store ordering offset
* to set PQ to 10 as we won't use StoreEOI.
*/
__raw_readq(esc_vaddr + XIVE_ESB_SET_PQ_10);
+ mb();
+ return true;
} else {
vcpu->arch.xive_esc_on = true;
mb();
__raw_readq(esc_vaddr + XIVE_ESB_SET_PQ_00);
+ mb();
+ return false;
}
- mb();
}
EXPORT_SYMBOL_GPL(kvmppc_xive_rearm_escalation);
--
2.23.0
^ permalink raw reply related
* [PATCH v3 52/52] KVM: PPC: Book3S HV P9: Remove subcore HMI handling
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
On POWER9 and newer, rather than the complex HMI synchronisation and
subcore state, have each thread un-apply the guest TB offset before
calling into the early HMI handler.
This allows the subcore state to be avoided, including subcore enter
/ exit guest, which includes an expensive divide that shows up
slightly in profiles.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/include/asm/kvm_ppc.h | 1 +
arch/powerpc/kvm/book3s_hv.c | 12 +++---
arch/powerpc/kvm/book3s_hv_hmi.c | 7 +++-
arch/powerpc/kvm/book3s_hv_p9_entry.c | 2 +-
arch/powerpc/kvm/book3s_hv_ras.c | 54 +++++++++++++++++++++++++++
5 files changed, 67 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 671fbd1a765e..70ffcb3c91bf 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -760,6 +760,7 @@ void kvmppc_realmode_machine_check(struct kvm_vcpu *vcpu);
void kvmppc_subcore_enter_guest(void);
void kvmppc_subcore_exit_guest(void);
long kvmppc_realmode_hmi_handler(void);
+long kvmppc_p9_realmode_hmi_handler(struct kvm_vcpu *vcpu);
long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
long pte_index, unsigned long pteh, unsigned long ptel);
long kvmppc_h_remove(struct kvm_vcpu *vcpu, unsigned long flags,
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 351018f617fb..449ac0a19ceb 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4014,8 +4014,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
vcpu->arch.ceded = 0;
- kvmppc_subcore_enter_guest();
-
vcpu_vpa_increment_dispatch(vcpu);
if (kvmhv_on_pseries()) {
@@ -4068,8 +4066,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
vcpu_vpa_increment_dispatch(vcpu);
- kvmppc_subcore_exit_guest();
-
return trap;
}
@@ -6069,9 +6065,11 @@ static int kvmppc_book3s_init_hv(void)
if (r)
return r;
- r = kvm_init_subcore_bitmap();
- if (r)
- return r;
+ if (!cpu_has_feature(CPU_FTR_ARCH_300)) {
+ r = kvm_init_subcore_bitmap();
+ if (r)
+ return r;
+ }
/*
* We need a way of accessing the XICS interrupt controller,
diff --git a/arch/powerpc/kvm/book3s_hv_hmi.c b/arch/powerpc/kvm/book3s_hv_hmi.c
index 9af660476314..1ec50c69678b 100644
--- a/arch/powerpc/kvm/book3s_hv_hmi.c
+++ b/arch/powerpc/kvm/book3s_hv_hmi.c
@@ -20,10 +20,15 @@ void wait_for_subcore_guest_exit(void)
/*
* NULL bitmap pointer indicates that KVM module hasn't
- * been loaded yet and hence no guests are running.
+ * been loaded yet and hence no guests are running, or running
+ * on POWER9 or newer CPU.
+ *
* If no KVM is in use, no need to co-ordinate among threads
* as all of them will always be in host and no one is going
* to modify TB other than the opal hmi handler.
+ *
+ * POWER9 and newer don't need this synchronisation.
+ *
* Hence, just return from here.
*/
if (!local_paca->sibling_subcore_state)
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index fbecbdc42c26..86a222f97e8e 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -938,7 +938,7 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
kvmppc_realmode_machine_check(vcpu);
} else if (unlikely(trap == BOOK3S_INTERRUPT_HMI)) {
- kvmppc_realmode_hmi_handler();
+ kvmppc_p9_realmode_hmi_handler(vcpu);
} else if (trap == BOOK3S_INTERRUPT_H_EMUL_ASSIST) {
vcpu->arch.emul_inst = mfspr(SPRN_HEIR);
diff --git a/arch/powerpc/kvm/book3s_hv_ras.c b/arch/powerpc/kvm/book3s_hv_ras.c
index d4bca93b79f6..ccfd96965630 100644
--- a/arch/powerpc/kvm/book3s_hv_ras.c
+++ b/arch/powerpc/kvm/book3s_hv_ras.c
@@ -136,6 +136,60 @@ void kvmppc_realmode_machine_check(struct kvm_vcpu *vcpu)
vcpu->arch.mce_evt = mce_evt;
}
+
+long kvmppc_p9_realmode_hmi_handler(struct kvm_vcpu *vcpu)
+{
+ struct kvmppc_vcore *vc = vcpu->arch.vcore;
+ long ret = 0;
+
+ /*
+ * Unapply and clear the offset first. That way, if the TB was not
+ * resynced then it will remain in host-offset, and if it was resynced
+ * then it is brought into host-offset. Then the tb offset is
+ * re-applied before continuing with the KVM exit.
+ *
+ * This way, we don't need to actually know whether not OPAL resynced
+ * the timebase or do any of the complicated dance that the P7/8
+ * path requires.
+ */
+ if (vc->tb_offset_applied) {
+ u64 new_tb = mftb() - vc->tb_offset_applied;
+ mtspr(SPRN_TBU40, new_tb);
+ if ((mftb() & 0xffffff) < (new_tb & 0xffffff)) {
+ new_tb += 0x1000000;
+ mtspr(SPRN_TBU40, new_tb);
+ }
+ vc->tb_offset_applied = 0;
+ }
+
+ local_paca->hmi_irqs++;
+
+ if (hmi_handle_debugtrig(NULL) >= 0) {
+ ret = 1;
+ goto out;
+ }
+
+ if (ppc_md.hmi_exception_early)
+ ppc_md.hmi_exception_early(NULL);
+
+out:
+ if (vc->tb_offset) {
+ u64 new_tb = mftb() + vc->tb_offset;
+ mtspr(SPRN_TBU40, new_tb);
+ if ((mftb() & 0xffffff) < (new_tb & 0xffffff)) {
+ new_tb += 0x1000000;
+ mtspr(SPRN_TBU40, new_tb);
+ }
+ vc->tb_offset_applied = vc->tb_offset;
+ }
+
+ return ret;
+}
+
+/*
+ * The following subcore HMI handling is all only for pre-POWER9 CPUs.
+ */
+
/* Check if dynamic split is in force and return subcore size accordingly. */
static inline int kvmppc_cur_subcore_size(void)
{
--
2.23.0
^ permalink raw reply related
* [PATCH v3 51/52] KVM: PPC: Book3S HV P9: Stop using vc->dpdes
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
The P9 path uses vc->dpdes only for msgsndp / SMT emulation. This adds
an ordering requirement between vcpu->doorbell_request and vc->dpdes for
no real benefit. Use vcpu->doorbell_request directly.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kvm/book3s_hv.c | 18 ++++++++++--------
arch/powerpc/kvm/book3s_hv_builtin.c | 2 ++
arch/powerpc/kvm/book3s_hv_p9_entry.c | 14 ++++++++++----
3 files changed, 22 insertions(+), 12 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 57bf49c90e73..351018f617fb 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -761,6 +761,8 @@ static bool kvmppc_doorbell_pending(struct kvm_vcpu *vcpu)
if (vcpu->arch.doorbell_request)
return true;
+ if (cpu_has_feature(CPU_FTR_ARCH_300))
+ return false;
/*
* Ensure that the read of vcore->dpdes comes after the read
* of vcpu->doorbell_request. This barrier matches the
@@ -2185,8 +2187,10 @@ static int kvmppc_get_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
* either vcore->dpdes or doorbell_request.
* On POWER8, doorbell_request is 0.
*/
- *val = get_reg_val(id, vcpu->arch.vcore->dpdes |
- vcpu->arch.doorbell_request);
+ if (cpu_has_feature(CPU_FTR_ARCH_300))
+ *val = get_reg_val(id, vcpu->arch.doorbell_request);
+ else
+ *val = get_reg_val(id, vcpu->arch.vcore->dpdes);
break;
case KVM_REG_PPC_VTB:
*val = get_reg_val(id, vcpu->arch.vcore->vtb);
@@ -2423,7 +2427,10 @@ static int kvmppc_set_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
vcpu->arch.pspb = set_reg_val(id, *val);
break;
case KVM_REG_PPC_DPDES:
- vcpu->arch.vcore->dpdes = set_reg_val(id, *val);
+ if (cpu_has_feature(CPU_FTR_ARCH_300))
+ vcpu->arch.doorbell_request = set_reg_val(id, *val) & 1;
+ else
+ vcpu->arch.vcore->dpdes = set_reg_val(id, *val);
break;
case KVM_REG_PPC_VTB:
vcpu->arch.vcore->vtb = set_reg_val(id, *val);
@@ -4472,11 +4479,6 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
if (!nested) {
kvmppc_core_prepare_to_enter(vcpu);
- if (vcpu->arch.doorbell_request) {
- vc->dpdes = 1;
- smp_wmb();
- vcpu->arch.doorbell_request = 0;
- }
if (test_bit(BOOK3S_IRQPRIO_EXTERNAL,
&vcpu->arch.pending_exceptions))
lpcr |= LPCR_MER;
diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
index fcf4760a3a0e..a4fc4b2d3806 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -649,6 +649,8 @@ void kvmppc_guest_entry_inject_int(struct kvm_vcpu *vcpu)
int ext;
unsigned long lpcr;
+ WARN_ON_ONCE(cpu_has_feature(CPU_FTR_ARCH_300));
+
/* Insert EXTERNAL bit into LPCR at the MER bit position */
ext = (vcpu->arch.pending_exceptions >> BOOK3S_IRQPRIO_EXTERNAL) & 1;
lpcr = mfspr(SPRN_LPCR);
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 5a71532a3adf..fbecbdc42c26 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -705,6 +705,7 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
unsigned long host_pidr;
unsigned long host_dawr1;
unsigned long host_dawrx1;
+ unsigned long dpdes;
hdec = time_limit - *tb;
if (hdec < 0)
@@ -767,8 +768,10 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
if (vc->pcr)
mtspr(SPRN_PCR, vc->pcr | PCR_MASK);
- if (vc->dpdes)
- mtspr(SPRN_DPDES, vc->dpdes);
+ if (vcpu->arch.doorbell_request) {
+ vcpu->arch.doorbell_request = 0;
+ mtspr(SPRN_DPDES, 1);
+ }
if (dawr_enabled()) {
if (vcpu->arch.dawr0 != host_dawr0)
@@ -999,7 +1002,10 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
vcpu->arch.shregs.sprg2 = mfspr(SPRN_SPRG2);
vcpu->arch.shregs.sprg3 = mfspr(SPRN_SPRG3);
- vc->dpdes = mfspr(SPRN_DPDES);
+ dpdes = mfspr(SPRN_DPDES);
+ if (dpdes)
+ vcpu->arch.doorbell_request = 1;
+
vc->vtb = mfspr(SPRN_VTB);
dec = mfspr(SPRN_DEC);
@@ -1061,7 +1067,7 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
}
}
- if (vc->dpdes)
+ if (dpdes)
mtspr(SPRN_DPDES, 0);
if (vc->pcr)
mtspr(SPRN_PCR, PCR_MASK);
--
2.23.0
^ permalink raw reply related
* [PATCH v3 50/52] KVM: PPC: Book3S HV P9: Tidy kvmppc_create_dtl_entry
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
This goes further to removing vcores from the P9 path. Also avoid the
memset in favour of explicitly initialising all fields.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kvm/book3s_hv.c | 61 +++++++++++++++++++++---------------
1 file changed, 35 insertions(+), 26 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index d614f83c7b3f..57bf49c90e73 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -698,41 +698,30 @@ static u64 vcore_stolen_time(struct kvmppc_vcore *vc, u64 now)
return p;
}
-static void kvmppc_create_dtl_entry(struct kvm_vcpu *vcpu,
- struct kvmppc_vcore *vc, u64 tb)
+static void __kvmppc_create_dtl_entry(struct kvm_vcpu *vcpu,
+ unsigned int pcpu, u64 now,
+ unsigned long stolen)
{
struct dtl_entry *dt;
struct lppaca *vpa;
- unsigned long stolen;
- unsigned long core_stolen;
- u64 now;
- unsigned long flags;
dt = vcpu->arch.dtl_ptr;
vpa = vcpu->arch.vpa.pinned_addr;
- now = tb;
-
- if (cpu_has_feature(CPU_FTR_ARCH_300)) {
- stolen = 0;
- } else {
- core_stolen = vcore_stolen_time(vc, now);
- stolen = core_stolen - vcpu->arch.stolen_logged;
- vcpu->arch.stolen_logged = core_stolen;
- spin_lock_irqsave(&vcpu->arch.tbacct_lock, flags);
- stolen += vcpu->arch.busy_stolen;
- vcpu->arch.busy_stolen = 0;
- spin_unlock_irqrestore(&vcpu->arch.tbacct_lock, flags);
- }
if (!dt || !vpa)
return;
- memset(dt, 0, sizeof(struct dtl_entry));
+
dt->dispatch_reason = 7;
- dt->processor_id = cpu_to_be16(vc->pcpu + vcpu->arch.ptid);
- dt->timebase = cpu_to_be64(now + vc->tb_offset);
+ dt->preempt_reason = 0;
+ dt->processor_id = cpu_to_be16(pcpu + vcpu->arch.ptid);
dt->enqueue_to_dispatch_time = cpu_to_be32(stolen);
+ dt->ready_to_enqueue_time = 0;
+ dt->waiting_to_ready_time = 0;
+ dt->timebase = cpu_to_be64(now);
+ dt->fault_addr = 0;
dt->srr0 = cpu_to_be64(kvmppc_get_pc(vcpu));
dt->srr1 = cpu_to_be64(vcpu->arch.shregs.msr);
+
++dt;
if (dt == vcpu->arch.dtl.pinned_end)
dt = vcpu->arch.dtl.pinned_addr;
@@ -743,6 +732,27 @@ static void kvmppc_create_dtl_entry(struct kvm_vcpu *vcpu,
vcpu->arch.dtl.dirty = true;
}
+static void kvmppc_create_dtl_entry(struct kvm_vcpu *vcpu,
+ struct kvmppc_vcore *vc)
+{
+ unsigned long stolen;
+ unsigned long core_stolen;
+ u64 now;
+ unsigned long flags;
+
+ now = mftb();
+
+ core_stolen = vcore_stolen_time(vc, now);
+ stolen = core_stolen - vcpu->arch.stolen_logged;
+ vcpu->arch.stolen_logged = core_stolen;
+ spin_lock_irqsave(&vcpu->arch.tbacct_lock, flags);
+ stolen += vcpu->arch.busy_stolen;
+ vcpu->arch.busy_stolen = 0;
+ spin_unlock_irqrestore(&vcpu->arch.tbacct_lock, flags);
+
+ __kvmppc_create_dtl_entry(vcpu, vc->pcpu, now + vc->tb_offset, stolen);
+}
+
/* See if there is a doorbell interrupt pending for a vcpu */
static bool kvmppc_doorbell_pending(struct kvm_vcpu *vcpu)
{
@@ -3750,7 +3760,7 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
pvc->pcpu = pcpu + thr;
for_each_runnable_thread(i, vcpu, pvc) {
kvmppc_start_thread(vcpu, pvc);
- kvmppc_create_dtl_entry(vcpu, pvc, mftb());
+ kvmppc_create_dtl_entry(vcpu, pvc);
trace_kvm_guest_enter(vcpu);
if (!vcpu->arch.ptid)
thr0_done = true;
@@ -4313,7 +4323,7 @@ static int kvmppc_run_vcpu(struct kvm_vcpu *vcpu)
if ((vc->vcore_state == VCORE_PIGGYBACK ||
vc->vcore_state == VCORE_RUNNING) &&
!VCORE_IS_EXITING(vc)) {
- kvmppc_create_dtl_entry(vcpu, vc, mftb());
+ kvmppc_create_dtl_entry(vcpu, vc);
kvmppc_start_thread(vcpu, vc);
trace_kvm_guest_enter(vcpu);
} else if (vc->vcore_state == VCORE_SLEEPING) {
@@ -4490,8 +4500,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
local_paca->kvm_hstate.ptid = 0;
local_paca->kvm_hstate.fake_suspend = 0;
- vc->pcpu = pcpu; // for kvmppc_create_dtl_entry
- kvmppc_create_dtl_entry(vcpu, vc, tb);
+ __kvmppc_create_dtl_entry(vcpu, pcpu, tb + vc->tb_offset, 0);
trace_kvm_guest_enter(vcpu);
--
2.23.0
^ permalink raw reply related
* [PATCH v3 49/52] KVM: PPC: Book3S HV P9: Remove most of the vcore logic
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
The P9 path always uses one vcpu per vcore, so none of the vcore, locks,
stolen time, blocking logic, shared waitq, etc., is required.
Remove most of it.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kvm/book3s_hv.c | 147 ++++++++++++++++++++---------------
1 file changed, 85 insertions(+), 62 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 6574e8a3731e..d614f83c7b3f 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -276,6 +276,8 @@ static void kvmppc_core_start_stolen(struct kvmppc_vcore *vc, u64 tb)
{
unsigned long flags;
+ WARN_ON_ONCE(cpu_has_feature(CPU_FTR_ARCH_300));
+
spin_lock_irqsave(&vc->stoltb_lock, flags);
vc->preempt_tb = tb;
spin_unlock_irqrestore(&vc->stoltb_lock, flags);
@@ -285,6 +287,8 @@ static void kvmppc_core_end_stolen(struct kvmppc_vcore *vc, u64 tb)
{
unsigned long flags;
+ WARN_ON_ONCE(cpu_has_feature(CPU_FTR_ARCH_300));
+
spin_lock_irqsave(&vc->stoltb_lock, flags);
if (vc->preempt_tb != TB_NIL) {
vc->stolen_tb += tb - vc->preempt_tb;
@@ -297,7 +301,12 @@ static void kvmppc_core_vcpu_load_hv(struct kvm_vcpu *vcpu, int cpu)
{
struct kvmppc_vcore *vc = vcpu->arch.vcore;
unsigned long flags;
- u64 now = mftb();
+ u64 now;
+
+ if (cpu_has_feature(CPU_FTR_ARCH_300))
+ return;
+
+ now = mftb();
/*
* We can test vc->runner without taking the vcore lock,
@@ -321,7 +330,12 @@ static void kvmppc_core_vcpu_put_hv(struct kvm_vcpu *vcpu)
{
struct kvmppc_vcore *vc = vcpu->arch.vcore;
unsigned long flags;
- u64 now = mftb();
+ u64 now;
+
+ if (cpu_has_feature(CPU_FTR_ARCH_300))
+ return;
+
+ now = mftb();
if (vc->runner == vcpu && vc->vcore_state >= VCORE_SLEEPING)
kvmppc_core_start_stolen(vc, now);
@@ -673,6 +687,8 @@ static u64 vcore_stolen_time(struct kvmppc_vcore *vc, u64 now)
u64 p;
unsigned long flags;
+ WARN_ON_ONCE(cpu_has_feature(CPU_FTR_ARCH_300));
+
spin_lock_irqsave(&vc->stoltb_lock, flags);
p = vc->stolen_tb;
if (vc->vcore_state != VCORE_INACTIVE &&
@@ -695,13 +711,19 @@ static void kvmppc_create_dtl_entry(struct kvm_vcpu *vcpu,
dt = vcpu->arch.dtl_ptr;
vpa = vcpu->arch.vpa.pinned_addr;
now = tb;
- core_stolen = vcore_stolen_time(vc, now);
- stolen = core_stolen - vcpu->arch.stolen_logged;
- vcpu->arch.stolen_logged = core_stolen;
- spin_lock_irqsave(&vcpu->arch.tbacct_lock, flags);
- stolen += vcpu->arch.busy_stolen;
- vcpu->arch.busy_stolen = 0;
- spin_unlock_irqrestore(&vcpu->arch.tbacct_lock, flags);
+
+ if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+ stolen = 0;
+ } else {
+ core_stolen = vcore_stolen_time(vc, now);
+ stolen = core_stolen - vcpu->arch.stolen_logged;
+ vcpu->arch.stolen_logged = core_stolen;
+ spin_lock_irqsave(&vcpu->arch.tbacct_lock, flags);
+ stolen += vcpu->arch.busy_stolen;
+ vcpu->arch.busy_stolen = 0;
+ spin_unlock_irqrestore(&vcpu->arch.tbacct_lock, flags);
+ }
+
if (!dt || !vpa)
return;
memset(dt, 0, sizeof(struct dtl_entry));
@@ -898,13 +920,14 @@ static int kvm_arch_vcpu_yield_to(struct kvm_vcpu *target)
* mode handler is not called but no other threads are in the
* source vcore.
*/
-
- spin_lock(&vcore->lock);
- if (target->arch.state == KVMPPC_VCPU_RUNNABLE &&
- vcore->vcore_state != VCORE_INACTIVE &&
- vcore->runner)
- target = vcore->runner;
- spin_unlock(&vcore->lock);
+ if (!cpu_has_feature(CPU_FTR_ARCH_300)) {
+ spin_lock(&vcore->lock);
+ if (target->arch.state == KVMPPC_VCPU_RUNNABLE &&
+ vcore->vcore_state != VCORE_INACTIVE &&
+ vcore->runner)
+ target = vcore->runner;
+ spin_unlock(&vcore->lock);
+ }
return kvm_vcpu_yield_to(target);
}
@@ -3125,13 +3148,6 @@ static void kvmppc_start_thread(struct kvm_vcpu *vcpu, struct kvmppc_vcore *vc)
kvmppc_ipi_thread(cpu);
}
-/* Old path does this in asm */
-static void kvmppc_stop_thread(struct kvm_vcpu *vcpu)
-{
- vcpu->cpu = -1;
- vcpu->arch.thread_cpu = -1;
-}
-
static void kvmppc_wait_for_nap(int n_threads)
{
int cpu = smp_processor_id();
@@ -3220,6 +3236,8 @@ static void kvmppc_vcore_preempt(struct kvmppc_vcore *vc)
{
struct preempted_vcore_list *lp = this_cpu_ptr(&preempted_vcores);
+ WARN_ON_ONCE(cpu_has_feature(CPU_FTR_ARCH_300));
+
vc->vcore_state = VCORE_PREEMPT;
vc->pcpu = smp_processor_id();
if (vc->num_threads < threads_per_vcore(vc->kvm)) {
@@ -3236,6 +3254,8 @@ static void kvmppc_vcore_end_preempt(struct kvmppc_vcore *vc)
{
struct preempted_vcore_list *lp;
+ WARN_ON_ONCE(cpu_has_feature(CPU_FTR_ARCH_300));
+
kvmppc_core_end_stolen(vc, mftb());
if (!list_empty(&vc->preempt_list)) {
lp = &per_cpu(preempted_vcores, vc->pcpu);
@@ -3964,7 +3984,6 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
unsigned long lpcr, u64 *tb)
{
- struct kvmppc_vcore *vc = vcpu->arch.vcore;
u64 next_timer;
int trap;
@@ -3980,9 +3999,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
kvmppc_subcore_enter_guest();
- vc->entry_exit_map = 1;
- vc->in_guest = 1;
-
vcpu_vpa_increment_dispatch(vcpu);
if (kvmhv_on_pseries()) {
@@ -4035,9 +4051,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
vcpu_vpa_increment_dispatch(vcpu);
- vc->entry_exit_map = 0x101;
- vc->in_guest = 0;
-
kvmppc_subcore_exit_guest();
return trap;
@@ -4103,6 +4116,13 @@ static bool kvmppc_vcpu_woken(struct kvm_vcpu *vcpu)
return false;
}
+static bool kvmppc_vcpu_check_block(struct kvm_vcpu *vcpu)
+{
+ if (!vcpu->arch.ceded || kvmppc_vcpu_woken(vcpu))
+ return true;
+ return false;
+}
+
/*
* Check to see if any of the runnable vcpus on the vcore have pending
* exceptions or are no longer ceded
@@ -4113,7 +4133,7 @@ static int kvmppc_vcore_check_block(struct kvmppc_vcore *vc)
int i;
for_each_runnable_thread(i, vcpu, vc) {
- if (!vcpu->arch.ceded || kvmppc_vcpu_woken(vcpu))
+ if (kvmppc_vcpu_check_block(vcpu))
return 1;
}
@@ -4130,6 +4150,8 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
int do_sleep = 1;
u64 block_ns;
+ WARN_ON_ONCE(cpu_has_feature(CPU_FTR_ARCH_300));
+
/* Poll for pending exceptions and ceded state */
cur = start_poll = ktime_get();
if (vc->halt_poll_ns) {
@@ -4407,11 +4429,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
vcpu->arch.ceded = 0;
vcpu->arch.run_task = current;
vcpu->arch.state = KVMPPC_VCPU_RUNNABLE;
- vcpu->arch.busy_preempt = TB_NIL;
vcpu->arch.last_inst = KVM_INST_FETCH_FAILED;
- vc->runnable_threads[0] = vcpu;
- vc->n_runnable = 1;
- vc->runner = vcpu;
/* See if the MMU is ready to go */
if (unlikely(!kvm->arch.mmu_ready)) {
@@ -4429,11 +4447,8 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
kvmppc_update_vpas(vcpu);
- init_vcore_to_run(vc);
-
preempt_disable();
pcpu = smp_processor_id();
- vc->pcpu = pcpu;
if (kvm_is_radix(kvm))
kvmppc_prepare_radix_vcpu(vcpu, pcpu);
@@ -4462,21 +4477,23 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
goto out;
}
- tb = mftb();
+ if (vcpu->arch.timer_running) {
+ hrtimer_try_to_cancel(&vcpu->arch.dec_timer);
+ vcpu->arch.timer_running = 0;
+ }
- vcpu->arch.stolen_logged = vcore_stolen_time(vc, tb);
- vc->preempt_tb = TB_NIL;
+ tb = mftb();
- kvmppc_clear_host_core(pcpu);
+ vcpu->cpu = pcpu;
+ vcpu->arch.thread_cpu = pcpu;
+ local_paca->kvm_hstate.kvm_vcpu = vcpu;
+ local_paca->kvm_hstate.ptid = 0;
+ local_paca->kvm_hstate.fake_suspend = 0;
- local_paca->kvm_hstate.napping = 0;
- local_paca->kvm_hstate.kvm_split_mode = NULL;
- kvmppc_start_thread(vcpu, vc);
+ vc->pcpu = pcpu; // for kvmppc_create_dtl_entry
kvmppc_create_dtl_entry(vcpu, vc, tb);
- trace_kvm_guest_enter(vcpu);
- vc->vcore_state = VCORE_RUNNING;
- trace_kvmppc_run_core(vc, 0);
+ trace_kvm_guest_enter(vcpu);
guest_enter_irqoff();
@@ -4498,11 +4515,10 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
set_irq_happened(trap);
- kvmppc_set_host_core(pcpu);
-
guest_exit_irqoff();
- kvmppc_stop_thread(vcpu);
+ vcpu->cpu = -1;
+ vcpu->arch.thread_cpu = -1;
powerpc_local_irq_pmu_restore(flags);
@@ -4529,28 +4545,31 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
}
vcpu->arch.ret = r;
- if (is_kvmppc_resume_guest(r) && vcpu->arch.ceded &&
- !kvmppc_vcpu_woken(vcpu)) {
+ if (is_kvmppc_resume_guest(r) && !kvmppc_vcpu_check_block(vcpu)) {
kvmppc_set_timer(vcpu);
- while (vcpu->arch.ceded && !kvmppc_vcpu_woken(vcpu)) {
+
+ prepare_to_rcuwait(&vcpu->wait);
+ for (;;) {
+ set_current_state(TASK_INTERRUPTIBLE);
if (signal_pending(current)) {
vcpu->stat.signal_exits++;
run->exit_reason = KVM_EXIT_INTR;
vcpu->arch.ret = -EINTR;
break;
}
- spin_lock(&vc->lock);
- kvmppc_vcore_blocked(vc);
- spin_unlock(&vc->lock);
+
+ if (kvmppc_vcpu_check_block(vcpu))
+ break;
+
+ trace_kvmppc_vcore_blocked(vc, 0);
+ schedule();
+ trace_kvmppc_vcore_blocked(vc, 1);
}
+ finish_rcuwait(&vcpu->wait);
}
vcpu->arch.ceded = 0;
- vc->vcore_state = VCORE_INACTIVE;
- trace_kvmppc_run_core(vc, 1);
-
done:
- kvmppc_remove_runnable(vc, vcpu, tb);
trace_kvmppc_run_vcpu_exit(vcpu);
return vcpu->arch.ret;
@@ -4632,7 +4651,8 @@ static int kvmppc_vcpu_run_hv(struct kvm_vcpu *vcpu)
kvmppc_save_current_sprs();
- vcpu->arch.waitp = &vcpu->arch.vcore->wait;
+ if (!cpu_has_feature(CPU_FTR_ARCH_300))
+ vcpu->arch.waitp = &vcpu->arch.vcore->wait;
vcpu->arch.pgdir = kvm->mm->pgd;
vcpu->arch.state = KVMPPC_VCPU_BUSY_IN_HOST;
@@ -5094,6 +5114,9 @@ void kvmppc_alloc_host_rm_ops(void)
int cpu, core;
int size;
+ if (cpu_has_feature(CPU_FTR_ARCH_300))
+ return;
+
/* Not the first time here ? */
if (kvmppc_host_rm_ops_hv != NULL)
return;
--
2.23.0
^ permalink raw reply related
* [PATCH v3 48/52] KVM: PPC: Book3S HV P9: Avoid cpu_in_guest atomics on entry and exit
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
cpu_in_guest is set to determine if a CPU needs to be IPI'ed to exit
the guest and notice the need_tlb_flush bit.
This can be implemented as a global per-CPU pointer to the currently
running guest instead of per-guest cpumasks, saving 2 atomics per
entry/exit. P7/8 doesn't require cpu_in_guest, nor does a nested HV
(only the L0 does), so move it to the P9 HV path.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/include/asm/kvm_book3s_64.h | 1 -
arch/powerpc/include/asm/kvm_host.h | 1 -
arch/powerpc/kvm/book3s_hv.c | 38 +++++++++++++-----------
3 files changed, 21 insertions(+), 19 deletions(-)
diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h b/arch/powerpc/include/asm/kvm_book3s_64.h
index 96f0fda50a07..fe07558173ef 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -44,7 +44,6 @@ struct kvm_nested_guest {
struct mutex tlb_lock; /* serialize page faults and tlbies */
struct kvm_nested_guest *next;
cpumask_t need_tlb_flush;
- cpumask_t cpu_in_guest;
short prev_cpu[NR_CPUS];
u8 radix; /* is this nested guest radix */
};
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 92925f82a1e3..4de418f6c0a2 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -287,7 +287,6 @@ struct kvm_arch {
u32 online_vcores;
atomic_t hpte_mod_interest;
cpumask_t need_tlb_flush;
- cpumask_t cpu_in_guest;
u8 radix;
u8 fwnmi_enabled;
u8 secure_guest;
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 6e072e2e130a..6574e8a3731e 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3009,30 +3009,33 @@ static void kvmppc_release_hwthread(int cpu)
tpaca->kvm_hstate.kvm_split_mode = NULL;
}
+static DEFINE_PER_CPU(struct kvm *, cpu_in_guest);
+
static void radix_flush_cpu(struct kvm *kvm, int cpu, struct kvm_vcpu *vcpu)
{
struct kvm_nested_guest *nested = vcpu->arch.nested;
- cpumask_t *cpu_in_guest;
int i;
cpu = cpu_first_tlb_thread_sibling(cpu);
- if (nested) {
+ if (nested)
cpumask_set_cpu(cpu, &nested->need_tlb_flush);
- cpu_in_guest = &nested->cpu_in_guest;
- } else {
+ else
cpumask_set_cpu(cpu, &kvm->arch.need_tlb_flush);
- cpu_in_guest = &kvm->arch.cpu_in_guest;
- }
/*
- * Make sure setting of bit in need_tlb_flush precedes
- * testing of cpu_in_guest bits. The matching barrier on
- * the other side is the first smp_mb() in kvmppc_run_core().
+ * Make sure setting of bit in need_tlb_flush precedes testing of
+ * cpu_in_guest. The matching barrier on the other side is hwsync
+ * when switching to guest MMU mode, which happens between
+ * cpu_in_guest being set to the guest kvm, and need_tlb_flush bit
+ * being tested.
*/
smp_mb();
for (i = cpu; i <= cpu_last_tlb_thread_sibling(cpu);
- i += cpu_tlb_thread_sibling_step())
- if (cpumask_test_cpu(i, cpu_in_guest))
+ i += cpu_tlb_thread_sibling_step()) {
+ struct kvm *running = *per_cpu_ptr(&cpu_in_guest, i);
+
+ if (running == kvm)
smp_call_function_single(i, do_nothing, NULL, 1);
+ }
}
static void do_migrate_away_vcpu(void *arg)
@@ -3100,7 +3103,6 @@ static void kvmppc_start_thread(struct kvm_vcpu *vcpu, struct kvmppc_vcore *vc)
{
int cpu;
struct paca_struct *tpaca;
- struct kvm *kvm = vc->kvm;
cpu = vc->pcpu;
if (vcpu) {
@@ -3111,7 +3113,6 @@ static void kvmppc_start_thread(struct kvm_vcpu *vcpu, struct kvmppc_vcore *vc)
cpu += vcpu->arch.ptid;
vcpu->cpu = vc->pcpu;
vcpu->arch.thread_cpu = cpu;
- cpumask_set_cpu(cpu, &kvm->arch.cpu_in_guest);
}
tpaca = paca_ptrs[cpu];
tpaca->kvm_hstate.kvm_vcpu = vcpu;
@@ -3829,7 +3830,6 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
kvmppc_release_hwthread(pcpu + i);
if (sip && sip->napped[i])
kvmppc_ipi_thread(pcpu + i);
- cpumask_clear_cpu(pcpu + i, &vc->kvm->arch.cpu_in_guest);
}
spin_unlock(&vc->lock);
@@ -3997,8 +3997,14 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
}
} else {
+ struct kvm *kvm = vcpu->kvm;
+
kvmppc_xive_push_vcpu(vcpu);
+
+ __this_cpu_write(cpu_in_guest, kvm);
trap = kvmhv_vcpu_entry_p9(vcpu, time_limit, lpcr, tb);
+ __this_cpu_write(cpu_in_guest, NULL);
+
if (trap == BOOK3S_INTERRUPT_SYSCALL && !vcpu->arch.nested &&
!(vcpu->arch.shregs.msr & MSR_PR)) {
unsigned long req = kvmppc_get_gpr(vcpu, 3);
@@ -4023,7 +4029,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
}
kvmppc_xive_pull_vcpu(vcpu);
- if (kvm_is_radix(vcpu->kvm))
+ if (kvm_is_radix(kvm))
vcpu->arch.slb_max = 0;
}
@@ -4500,8 +4506,6 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
powerpc_local_irq_pmu_restore(flags);
- cpumask_clear_cpu(pcpu, &kvm->arch.cpu_in_guest);
-
preempt_enable();
/*
--
2.23.0
^ permalink raw reply related
* [PATCH v3 47/52] KVM: PPC: Book3S HV P9: Add unlikely annotation for !mmu_ready
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
The mmu will almost always be ready.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kvm/book3s_hv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 0bbef4587f41..6e072e2e130a 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4408,7 +4408,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
vc->runner = vcpu;
/* See if the MMU is ready to go */
- if (!kvm->arch.mmu_ready) {
+ if (unlikely(!kvm->arch.mmu_ready)) {
r = kvmhv_setup_mmu(vcpu);
if (r) {
run->exit_reason = KVM_EXIT_FAIL_ENTRY;
--
2.23.0
^ permalink raw reply related
* [PATCH v3 46/52] KVM: PPC: Book3S HV P9: Avoid changing MSR[RI] in entry and exit
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
kvm_hstate.in_guest provides the equivalent of MSR[RI]=0 protection,
and it covers the existing MSR[RI]=0 section in late entry and early
exit, so clearing and setting MSR[RI] in those cases does not
actually do anything useful.
Remove the RI manipulation and replace it with comments. Make the
in_guest memory accesses a bit closer to a proper critical section
pattern. This speeds up guest entry/exit performance.
This also removes the MSR[RI] warnings which aren't very interesting
and would cause crashes if they hit due to causing an interrupt in
non-recoverable code.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kvm/book3s_hv_p9_entry.c | 50 ++++++++++++---------------
1 file changed, 23 insertions(+), 27 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 99ce5805ea28..5a71532a3adf 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -829,7 +829,15 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
* But TM could be split out if this would be a significant benefit.
*/
- local_paca->kvm_hstate.in_guest = KVM_GUEST_MODE_HV_P9;
+ /*
+ * MSR[RI] does not need to be cleared (and is not, for radix guests
+ * with no prefetch bug), because in_guest is set. If we take a SRESET
+ * or MCE with in_guest set but still in HV mode, then
+ * kvmppc_p9_bad_interrupt handles the interrupt, which effectively
+ * clears MSR[RI] and doesn't return.
+ */
+ WRITE_ONCE(local_paca->kvm_hstate.in_guest, KVM_GUEST_MODE_HV_P9);
+ barrier(); /* Open in_guest critical section */
/*
* Hash host, hash guest, or radix guest with prefetch bug, all have
@@ -841,14 +849,10 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
save_clear_host_mmu(kvm);
- if (kvm_is_radix(kvm)) {
+ if (kvm_is_radix(kvm))
switch_mmu_to_guest_radix(kvm, vcpu, lpcr);
- if (!cpu_has_feature(CPU_FTR_P9_RADIX_PREFETCH_BUG))
- __mtmsrd(0, 1); /* clear RI */
-
- } else {
+ else
switch_mmu_to_guest_hpt(kvm, vcpu, lpcr);
- }
/* TLBIEL uses LPID=LPIDR, so run this after setting guest LPID */
kvmppc_check_need_tlb_flush(kvm, vc->pcpu, nested);
@@ -903,19 +907,16 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
vcpu->arch.regs.gpr[3] = local_paca->kvm_hstate.scratch2;
/*
- * Only set RI after reading machine check regs (DAR, DSISR, SRR0/1)
- * and hstate scratch (which we need to move into exsave to make
- * re-entrant vs SRESET/MCE)
+ * After reading machine check regs (DAR, DSISR, SRR0/1) and hstate
+ * scratch (which we need to move into exsave to make re-entrant vs
+ * SRESET/MCE), register state is protected from reentrancy. However
+ * timebase, MMU, among other state is still set to guest, so don't
+ * enable MSR[RI] here. It gets enabled at the end, after in_guest
+ * is cleared.
+ *
+ * It is possible an NMI could come in here, which is why it is
+ * important to save the above state early so it can be debugged.
*/
- if (ri_set) {
- if (unlikely(!(mfmsr() & MSR_RI))) {
- __mtmsrd(MSR_RI, 1);
- WARN_ON_ONCE(1);
- }
- } else {
- WARN_ON_ONCE(mfmsr() & MSR_RI);
- __mtmsrd(MSR_RI, 1);
- }
vcpu->arch.regs.gpr[9] = exsave[EX_R9/sizeof(u64)];
vcpu->arch.regs.gpr[10] = exsave[EX_R10/sizeof(u64)];
@@ -973,13 +974,6 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
*/
mtspr(SPRN_HSRR0, vcpu->arch.regs.nip);
mtspr(SPRN_HSRR1, vcpu->arch.shregs.msr);
-
- /*
- * tm_return_to_guest re-loads SRR0/1, DAR,
- * DSISR after RI is cleared, in case they had
- * been clobbered by a MCE.
- */
- __mtmsrd(0, 1); /* clear RI */
goto tm_return_to_guest;
}
}
@@ -1079,7 +1073,9 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
restore_p9_host_os_sprs(vcpu, &host_os_sprs);
- local_paca->kvm_hstate.in_guest = KVM_GUEST_MODE_NONE;
+ barrier(); /* Close in_guest critical section */
+ WRITE_ONCE(local_paca->kvm_hstate.in_guest, KVM_GUEST_MODE_NONE);
+ /* Interrupts are recoverable at this point */
/*
* cp_abort is required if the processor supports local copy-paste
--
2.23.0
^ permalink raw reply related
* [PATCH v3 45/52] KVM: PPC: Book3S HV P9: Optimise hash guest SLB saving
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
slbmfee/slbmfev instructions are very expensive, moreso than a regular
mfspr instruction, so minimising them significantly improves hash guest
exit performance. The slbmfev is only required if slbmfee found a valid
SLB entry.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kvm/book3s_hv_p9_entry.c | 22 ++++++++++++++++++----
1 file changed, 18 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 646f487ebf97..99ce5805ea28 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -487,10 +487,22 @@ static void __accumulate_time(struct kvm_vcpu *vcpu, struct kvmhv_tb_accumulator
#define accumulate_time(vcpu, next) do {} while (0)
#endif
-static inline void mfslb(unsigned int idx, u64 *slbee, u64 *slbev)
+static inline u64 mfslbv(unsigned int idx)
{
- asm volatile("slbmfev %0,%1" : "=r" (*slbev) : "r" (idx));
- asm volatile("slbmfee %0,%1" : "=r" (*slbee) : "r" (idx));
+ u64 slbev;
+
+ asm volatile("slbmfev %0,%1" : "=r" (slbev) : "r" (idx));
+
+ return slbev;
+}
+
+static inline u64 mfslbe(unsigned int idx)
+{
+ u64 slbee;
+
+ asm volatile("slbmfee %0,%1" : "=r" (slbee) : "r" (idx));
+
+ return slbee;
}
static inline void mtslb(u64 slbee, u64 slbev)
@@ -620,8 +632,10 @@ static void save_clear_guest_mmu(struct kvm *kvm, struct kvm_vcpu *vcpu)
*/
for (i = 0; i < vcpu->arch.slb_nr; i++) {
u64 slbee, slbev;
- mfslb(i, &slbee, &slbev);
+
+ slbee = mfslbe(i);
if (slbee & SLB_ESID_V) {
+ slbev = mfslbv(i);
vcpu->arch.slb[nr].orige = slbee | i;
vcpu->arch.slb[nr].origv = slbev;
nr++;
--
2.23.0
^ permalink raw reply related
* [PATCH v3 44/52] KVM: PPC: Book3S HV P9: Improve mfmsr performance on entry
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
Rearrange the MSR saving on entry so it does not follow the mtmsrd to
disable interrupts, avoiding a possible RAW scoreboard stall.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/include/asm/kvm_book3s_64.h | 2 +
arch/powerpc/kvm/book3s_hv.c | 18 ++-----
arch/powerpc/kvm/book3s_hv_p9_entry.c | 66 +++++++++++++++---------
3 files changed, 47 insertions(+), 39 deletions(-)
diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h b/arch/powerpc/include/asm/kvm_book3s_64.h
index 0a319ed9c2fd..96f0fda50a07 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -154,6 +154,8 @@ static inline bool kvmhv_vcpu_is_radix(struct kvm_vcpu *vcpu)
return radix;
}
+unsigned long kvmppc_msr_hard_disable_set_facilities(struct kvm_vcpu *vcpu, unsigned long msr);
+
int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpcr, u64 *tb);
#define KVM_DEFAULT_HPT_ORDER 24 /* 16MB HPT by default */
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 342b4f125d03..0bbef4587f41 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3878,6 +3878,8 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
s64 dec;
int trap;
+ msr = mfmsr();
+
save_p9_host_os_sprs(&host_os_sprs);
/*
@@ -3888,24 +3890,10 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
*/
host_psscr = mfspr(SPRN_PSSCR_PR);
- hard_irq_disable();
+ kvmppc_msr_hard_disable_set_facilities(vcpu, msr);
if (lazy_irq_pending())
return 0;
- /* MSR bits may have been cleared by context switch */
- msr = 0;
- if (IS_ENABLED(CONFIG_PPC_FPU))
- msr |= MSR_FP;
- if (cpu_has_feature(CPU_FTR_ALTIVEC))
- msr |= MSR_VEC;
- if (cpu_has_feature(CPU_FTR_VSX))
- msr |= MSR_VSX;
- if ((cpu_has_feature(CPU_FTR_TM) ||
- cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
- (vcpu->arch.hfscr & HFSCR_TM))
- msr |= MSR_TM;
- msr = msr_check_and_set(msr);
-
if (unlikely(load_vcpu_state(vcpu, &host_os_sprs)))
msr = mfmsr(); /* TM restore can update msr */
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index a45ba584c734..646f487ebf97 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -632,6 +632,44 @@ static void save_clear_guest_mmu(struct kvm *kvm, struct kvm_vcpu *vcpu)
}
}
+unsigned long kvmppc_msr_hard_disable_set_facilities(struct kvm_vcpu *vcpu, unsigned long msr)
+{
+ unsigned long msr_needed = 0;
+
+ msr &= ~MSR_EE;
+
+ /* MSR bits may have been cleared by context switch so must recheck */
+ if (IS_ENABLED(CONFIG_PPC_FPU))
+ msr_needed |= MSR_FP;
+ if (cpu_has_feature(CPU_FTR_ALTIVEC))
+ msr_needed |= MSR_VEC;
+ if (cpu_has_feature(CPU_FTR_VSX))
+ msr_needed |= MSR_VSX;
+ if ((cpu_has_feature(CPU_FTR_TM) ||
+ cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
+ (vcpu->arch.hfscr & HFSCR_TM))
+ msr_needed |= MSR_TM;
+
+ /*
+ * This could be combined with MSR[RI] clearing, but that expands
+ * the unrecoverable window. It would be better to cover unrecoverable
+ * with KVM bad interrupt handling rather than use MSR[RI] at all.
+ *
+ * Much more difficult and less worthwhile to combine with IR/DR
+ * disable.
+ */
+ if ((msr & msr_needed) != msr_needed) {
+ msr |= msr_needed;
+ __mtmsrd(msr, 0);
+ } else {
+ __hard_irq_disable();
+ }
+ local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
+
+ return msr;
+}
+EXPORT_SYMBOL_GPL(kvmppc_msr_hard_disable_set_facilities);
+
int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpcr, u64 *tb)
{
struct p9_host_os_sprs host_os_sprs;
@@ -665,6 +703,9 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
vcpu->arch.ceded = 0;
+ /* Save MSR for restore, with EE clear. */
+ msr = mfmsr() & ~MSR_EE;
+
host_hfscr = mfspr(SPRN_HFSCR);
host_ciabr = mfspr(SPRN_CIABR);
host_psscr = mfspr(SPRN_PSSCR_PR);
@@ -686,35 +727,12 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
save_p9_host_os_sprs(&host_os_sprs);
- /*
- * This could be combined with MSR[RI] clearing, but that expands
- * the unrecoverable window. It would be better to cover unrecoverable
- * with KVM bad interrupt handling rather than use MSR[RI] at all.
- *
- * Much more difficult and less worthwhile to combine with IR/DR
- * disable.
- */
- hard_irq_disable();
+ msr = kvmppc_msr_hard_disable_set_facilities(vcpu, msr);
if (lazy_irq_pending()) {
trap = 0;
goto out;
}
- /* MSR bits may have been cleared by context switch */
- msr = 0;
- if (IS_ENABLED(CONFIG_PPC_FPU))
- msr |= MSR_FP;
- if (cpu_has_feature(CPU_FTR_ALTIVEC))
- msr |= MSR_VEC;
- if (cpu_has_feature(CPU_FTR_VSX))
- msr |= MSR_VSX;
- if ((cpu_has_feature(CPU_FTR_TM) ||
- cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
- (vcpu->arch.hfscr & HFSCR_TM))
- msr |= MSR_TM;
- msr = msr_check_and_set(msr);
- /* Save MSR for restore. This is after hard disable, so EE is clear. */
-
if (unlikely(load_vcpu_state(vcpu, &host_os_sprs)))
msr = mfmsr(); /* MSR may have been updated */
--
2.23.0
^ permalink raw reply related
* [PATCH v3 43/52] KVM: PPC: Book3S HV Nested: Avoid extra mftb() in nested entry
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
mftb() is expensive and one can be avoided on nested guest dispatch.
If the time checking code distinguishes between the L0 timer and the
nested HV timer, then both can be tested in the same place with the
same mftb() value.
This also nicely illustrates the relationship between the L0 and nested
HV timers.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/include/asm/kvm_asm.h | 1 +
arch/powerpc/kvm/book3s_hv.c | 12 ++++++++++++
arch/powerpc/kvm/book3s_hv_nested.c | 5 -----
3 files changed, 13 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/include/asm/kvm_asm.h b/arch/powerpc/include/asm/kvm_asm.h
index fbbf3cec92e9..d68d71987d5c 100644
--- a/arch/powerpc/include/asm/kvm_asm.h
+++ b/arch/powerpc/include/asm/kvm_asm.h
@@ -79,6 +79,7 @@
#define BOOK3S_INTERRUPT_FP_UNAVAIL 0x800
#define BOOK3S_INTERRUPT_DECREMENTER 0x900
#define BOOK3S_INTERRUPT_HV_DECREMENTER 0x980
+#define BOOK3S_INTERRUPT_NESTED_HV_DECREMENTER 0x1980
#define BOOK3S_INTERRUPT_DOORBELL 0xa00
#define BOOK3S_INTERRUPT_SYSCALL 0xc00
#define BOOK3S_INTERRUPT_TRACE 0xd00
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index d2a9c930f33e..342b4f125d03 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1486,6 +1486,10 @@ static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu,
run->ready_for_interrupt_injection = 1;
switch (vcpu->arch.trap) {
/* We're good on these - the host merely wanted to get our attention */
+ case BOOK3S_INTERRUPT_NESTED_HV_DECREMENTER:
+ WARN_ON_ONCE(1); /* Should never happen */
+ vcpu->arch.trap = BOOK3S_INTERRUPT_HV_DECREMENTER;
+ fallthrough;
case BOOK3S_INTERRUPT_HV_DECREMENTER:
vcpu->stat.dec_exits++;
r = RESUME_GUEST;
@@ -1814,6 +1818,12 @@ static int kvmppc_handle_nested_exit(struct kvm_vcpu *vcpu)
vcpu->stat.ext_intr_exits++;
r = RESUME_GUEST;
break;
+ /* These need to go to the nested HV */
+ case BOOK3S_INTERRUPT_NESTED_HV_DECREMENTER:
+ vcpu->arch.trap = BOOK3S_INTERRUPT_HV_DECREMENTER;
+ vcpu->stat.dec_exits++;
+ r = RESUME_HOST;
+ break;
/* SR/HMI/PMI are HV interrupts that host has handled. Resume guest.*/
case BOOK3S_INTERRUPT_HMI:
case BOOK3S_INTERRUPT_PERFMON:
@@ -3975,6 +3985,8 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
return BOOK3S_INTERRUPT_HV_DECREMENTER;
if (next_timer < time_limit)
time_limit = next_timer;
+ else if (*tb >= time_limit) /* nested time limit */
+ return BOOK3S_INTERRUPT_NESTED_HV_DECREMENTER;
vcpu->arch.ceded = 0;
diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
index 7bed0b91245e..e57c08b968c0 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -375,11 +375,6 @@ long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
vcpu->arch.ret = RESUME_GUEST;
vcpu->arch.trap = 0;
do {
- if (mftb() >= hdec_exp) {
- vcpu->arch.trap = BOOK3S_INTERRUPT_HV_DECREMENTER;
- r = RESUME_HOST;
- break;
- }
r = kvmhv_run_single_vcpu(vcpu, hdec_exp, lpcr);
} while (is_kvmppc_resume_guest(r));
--
2.23.0
^ permalink raw reply related
* [PATCH v3 42/52] KVM: PPC: Book3S HV P9: Avoid tlbsync sequence on radix guest exit
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
Use the existing TLB flushing logic to IPI the previous CPU and run the
necessary barriers before running a guest vCPU on a new physical CPU,
to do the necessary radix GTSE barriers for handling the case of an
interrupted guest tlbie sequence.
This results in more IPIs than the TLB flush logic requires, but it's
a significant win for common case scheduling when the vCPU remains on
the same physical CPU.
This saves about 520 cycles (nearly 10%) on a guest entry+exit micro
benchmark on a POWER9.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kvm/book3s_hv.c | 31 +++++++++++++++++++++++----
arch/powerpc/kvm/book3s_hv_p9_entry.c | 9 --------
2 files changed, 27 insertions(+), 13 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index f10cb4167549..d2a9c930f33e 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3025,6 +3025,25 @@ static void radix_flush_cpu(struct kvm *kvm, int cpu, struct kvm_vcpu *vcpu)
smp_call_function_single(i, do_nothing, NULL, 1);
}
+static void do_migrate_away_vcpu(void *arg)
+{
+ struct kvm_vcpu *vcpu = arg;
+ struct kvm *kvm = vcpu->kvm;
+
+ /*
+ * If the guest has GTSE, it may execute tlbie, so do a eieio; tlbsync;
+ * ptesync sequence on the old CPU before migrating to a new one, in
+ * case we interrupted the guest between a tlbie ; eieio ;
+ * tlbsync; ptesync sequence.
+ *
+ * Otherwise, ptesync is sufficient.
+ */
+ if (kvm->arch.lpcr & LPCR_GTSE)
+ asm volatile("eieio; tlbsync; ptesync");
+ else
+ asm volatile("ptesync");
+}
+
static void kvmppc_prepare_radix_vcpu(struct kvm_vcpu *vcpu, int pcpu)
{
struct kvm_nested_guest *nested = vcpu->arch.nested;
@@ -3052,10 +3071,14 @@ static void kvmppc_prepare_radix_vcpu(struct kvm_vcpu *vcpu, int pcpu)
* so we use a single bit in .need_tlb_flush for all 4 threads.
*/
if (prev_cpu != pcpu) {
- if (prev_cpu >= 0 &&
- cpu_first_tlb_thread_sibling(prev_cpu) !=
- cpu_first_tlb_thread_sibling(pcpu))
- radix_flush_cpu(kvm, prev_cpu, vcpu);
+ if (prev_cpu >= 0) {
+ if (cpu_first_tlb_thread_sibling(prev_cpu) !=
+ cpu_first_tlb_thread_sibling(pcpu))
+ radix_flush_cpu(kvm, prev_cpu, vcpu);
+
+ smp_call_function_single(prev_cpu,
+ do_migrate_away_vcpu, vcpu, 1);
+ }
if (nested)
nested->prev_cpu[vcpu->arch.nested_vcpu_id] = pcpu;
else
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index eae9d806d704..a45ba584c734 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -1049,15 +1049,6 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
local_paca->kvm_hstate.in_guest = KVM_GUEST_MODE_NONE;
- if (kvm_is_radix(kvm)) {
- /*
- * Since this is radix, do a eieio; tlbsync; ptesync sequence
- * in case we interrupted the guest between a tlbie and a
- * ptesync.
- */
- asm volatile("eieio; tlbsync; ptesync");
- }
-
/*
* cp_abort is required if the processor supports local copy-paste
* to clear the copy buffer that was under control of the guest.
--
2.23.0
^ permalink raw reply related
* [PATCH v3 41/52] KVM: PPC: Book3S HV P9: Don't restore PSSCR if not needed
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
This also moves the PSSCR update in nested entry to avoid a SPR
scoreboard stall.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kvm/book3s_hv.c | 7 +++++--
arch/powerpc/kvm/book3s_hv_p9_entry.c | 26 +++++++++++++++++++-------
2 files changed, 24 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index f4445cc5a29a..f10cb4167549 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3876,7 +3876,9 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
if (unlikely(load_vcpu_state(vcpu, &host_os_sprs)))
msr = mfmsr(); /* TM restore can update msr */
- mtspr(SPRN_PSSCR_PR, vcpu->arch.psscr);
+ if (vcpu->arch.psscr != host_psscr)
+ mtspr(SPRN_PSSCR_PR, vcpu->arch.psscr);
+
kvmhv_save_hv_regs(vcpu, &hvregs);
hvregs.lpcr = lpcr;
vcpu->arch.regs.msr = vcpu->arch.shregs.msr;
@@ -3917,7 +3919,6 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
vcpu->arch.shregs.dar = mfspr(SPRN_DAR);
vcpu->arch.shregs.dsisr = mfspr(SPRN_DSISR);
vcpu->arch.psscr = mfspr(SPRN_PSSCR_PR);
- mtspr(SPRN_PSSCR_PR, host_psscr);
store_vcpu_state(vcpu);
@@ -3930,6 +3931,8 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
timer_rearm_host_dec(*tb);
restore_p9_host_os_sprs(vcpu, &host_os_sprs);
+ if (vcpu->arch.psscr != host_psscr)
+ mtspr(SPRN_PSSCR_PR, host_psscr);
return trap;
}
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 0f341011816c..eae9d806d704 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -649,6 +649,7 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
unsigned long host_dawr0;
unsigned long host_dawrx0;
unsigned long host_psscr;
+ unsigned long host_hpsscr;
unsigned long host_pidr;
unsigned long host_dawr1;
unsigned long host_dawrx1;
@@ -666,7 +667,9 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
host_hfscr = mfspr(SPRN_HFSCR);
host_ciabr = mfspr(SPRN_CIABR);
- host_psscr = mfspr(SPRN_PSSCR);
+ host_psscr = mfspr(SPRN_PSSCR_PR);
+ if (cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
+ host_hpsscr = mfspr(SPRN_PSSCR);
host_pidr = mfspr(SPRN_PID);
if (dawr_enabled()) {
@@ -750,8 +753,14 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
if (vcpu->arch.ciabr != host_ciabr)
mtspr(SPRN_CIABR, vcpu->arch.ciabr);
- mtspr(SPRN_PSSCR, vcpu->arch.psscr | PSSCR_EC |
- (local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
+
+ if (cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) {
+ mtspr(SPRN_PSSCR, vcpu->arch.psscr | PSSCR_EC |
+ (local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
+ } else {
+ if (vcpu->arch.psscr != host_psscr)
+ mtspr(SPRN_PSSCR_PR, vcpu->arch.psscr);
+ }
mtspr(SPRN_HFSCR, vcpu->arch.hfscr);
@@ -957,7 +966,7 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
vcpu->arch.ic = mfspr(SPRN_IC);
vcpu->arch.pid = mfspr(SPRN_PID);
- vcpu->arch.psscr = mfspr(SPRN_PSSCR) & PSSCR_GUEST_VIS;
+ vcpu->arch.psscr = mfspr(SPRN_PSSCR_PR);
vcpu->arch.shregs.sprg0 = mfspr(SPRN_SPRG0);
vcpu->arch.shregs.sprg1 = mfspr(SPRN_SPRG1);
@@ -1003,9 +1012,12 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
mtspr(SPRN_PURR, local_paca->kvm_hstate.host_purr);
mtspr(SPRN_SPURR, local_paca->kvm_hstate.host_spurr);
- /* Preserve PSSCR[FAKE_SUSPEND] until we've called kvmppc_save_tm_hv */
- mtspr(SPRN_PSSCR, host_psscr |
- (local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
+ if (cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) {
+ /* Preserve PSSCR[FAKE_SUSPEND] until we've called kvmppc_save_tm_hv */
+ mtspr(SPRN_PSSCR, host_hpsscr |
+ (local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
+ }
+
mtspr(SPRN_HFSCR, host_hfscr);
if (vcpu->arch.ciabr != host_ciabr)
mtspr(SPRN_CIABR, host_ciabr);
--
2.23.0
^ permalink raw reply related
* [PATCH v3 40/52] KVM: PPC: Book3S HV P9: Test dawr_enabled() before saving host DAWR SPRs
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
Some of the DAWR SPR access is already predicated on dawr_enabled(),
apply this to the remainder of the accesses.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kvm/book3s_hv_p9_entry.c | 34 ++++++++++++++++-----------
1 file changed, 20 insertions(+), 14 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 323b692bbfe2..0f341011816c 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -666,13 +666,16 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
host_hfscr = mfspr(SPRN_HFSCR);
host_ciabr = mfspr(SPRN_CIABR);
- host_dawr0 = mfspr(SPRN_DAWR0);
- host_dawrx0 = mfspr(SPRN_DAWRX0);
host_psscr = mfspr(SPRN_PSSCR);
host_pidr = mfspr(SPRN_PID);
- if (cpu_has_feature(CPU_FTR_DAWR1)) {
- host_dawr1 = mfspr(SPRN_DAWR1);
- host_dawrx1 = mfspr(SPRN_DAWRX1);
+
+ if (dawr_enabled()) {
+ host_dawr0 = mfspr(SPRN_DAWR0);
+ host_dawrx0 = mfspr(SPRN_DAWRX0);
+ if (cpu_has_feature(CPU_FTR_DAWR1)) {
+ host_dawr1 = mfspr(SPRN_DAWR1);
+ host_dawrx1 = mfspr(SPRN_DAWRX1);
+ }
}
local_paca->kvm_hstate.host_purr = mfspr(SPRN_PURR);
@@ -1006,15 +1009,18 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
mtspr(SPRN_HFSCR, host_hfscr);
if (vcpu->arch.ciabr != host_ciabr)
mtspr(SPRN_CIABR, host_ciabr);
- if (vcpu->arch.dawr0 != host_dawr0)
- mtspr(SPRN_DAWR0, host_dawr0);
- if (vcpu->arch.dawrx0 != host_dawrx0)
- mtspr(SPRN_DAWRX0, host_dawrx0);
- if (cpu_has_feature(CPU_FTR_DAWR1)) {
- if (vcpu->arch.dawr1 != host_dawr1)
- mtspr(SPRN_DAWR1, host_dawr1);
- if (vcpu->arch.dawrx1 != host_dawrx1)
- mtspr(SPRN_DAWRX1, host_dawrx1);
+
+ if (dawr_enabled()) {
+ if (vcpu->arch.dawr0 != host_dawr0)
+ mtspr(SPRN_DAWR0, host_dawr0);
+ if (vcpu->arch.dawrx0 != host_dawrx0)
+ mtspr(SPRN_DAWRX0, host_dawrx0);
+ if (cpu_has_feature(CPU_FTR_DAWR1)) {
+ if (vcpu->arch.dawr1 != host_dawr1)
+ mtspr(SPRN_DAWR1, host_dawr1);
+ if (vcpu->arch.dawrx1 != host_dawrx1)
+ mtspr(SPRN_DAWRX1, host_dawrx1);
+ }
}
if (vc->dpdes)
--
2.23.0
^ permalink raw reply related
* [PATCH v3 39/52] KVM: PPC: Book3S HV P9: Comment and fix MMU context switching code
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
Tighten up partition switching code synchronisation and comments.
In particular, hwsync ; isync is required after the last access that is
performed in the context of a partition, before the partition is
switched away from.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kvm/book3s_64_entry.S | 11 +++++--
arch/powerpc/kvm/book3s_64_mmu_radix.c | 4 +++
arch/powerpc/kvm/book3s_hv_p9_entry.c | 40 +++++++++++++++++++-------
3 files changed, 42 insertions(+), 13 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_64_entry.S b/arch/powerpc/kvm/book3s_64_entry.S
index 983b8c18bc31..05e003eb5d90 100644
--- a/arch/powerpc/kvm/book3s_64_entry.S
+++ b/arch/powerpc/kvm/book3s_64_entry.S
@@ -374,11 +374,16 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_RADIX)
BEGIN_FTR_SECTION
mtspr SPRN_DAWRX1,r10
END_FTR_SECTION_IFSET(CPU_FTR_DAWR1)
- mtspr SPRN_PID,r10
/*
- * Switch to host MMU mode
+ * Switch to host MMU mode (don't have the real host PID but we aren't
+ * going back to userspace).
*/
+ hwsync
+ isync
+
+ mtspr SPRN_PID,r10
+
ld r10, HSTATE_KVM_VCPU(r13)
ld r10, VCPU_KVM(r10)
lwz r10, KVM_HOST_LPID(r10)
@@ -389,6 +394,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_DAWR1)
ld r10, KVM_HOST_LPCR(r10)
mtspr SPRN_LPCR,r10
+ isync
+
/*
* Set GUEST_MODE_NONE so the handler won't branch to KVM, and clear
* MSR_RI in r12 ([H]SRR1) so the handler won't try to return.
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 16359525a40f..8cebe5542256 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -57,6 +57,8 @@ unsigned long __kvmhv_copy_tofrom_guest_radix(int lpid, int pid,
preempt_disable();
+ asm volatile("hwsync" ::: "memory");
+ isync();
/* switch the lpid first to avoid running host with unallocated pid */
old_lpid = mfspr(SPRN_LPID);
if (old_lpid != lpid)
@@ -75,6 +77,8 @@ unsigned long __kvmhv_copy_tofrom_guest_radix(int lpid, int pid,
ret = __copy_to_user_inatomic((void __user *)to, from, n);
pagefault_enable();
+ asm volatile("hwsync" ::: "memory");
+ isync();
/* switch the pid first to avoid running host with unallocated pid */
if (quadrant == 1 && pid != old_pid)
mtspr(SPRN_PID, old_pid);
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 093ac0453d91..323b692bbfe2 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -531,17 +531,19 @@ static void switch_mmu_to_guest_radix(struct kvm *kvm, struct kvm_vcpu *vcpu, u6
lpid = nested ? nested->shadow_lpid : kvm->arch.lpid;
/*
- * All the isync()s are overkill but trivially follow the ISA
- * requirements. Some can likely be replaced with justification
- * comment for why they are not needed.
+ * Prior memory accesses to host PID Q3 must be completed before we
+ * start switching, and stores must be drained to avoid not-my-LPAR
+ * logic (see switch_mmu_to_host).
*/
+ asm volatile("hwsync" ::: "memory");
isync();
mtspr(SPRN_LPID, lpid);
- isync();
mtspr(SPRN_LPCR, lpcr);
- isync();
mtspr(SPRN_PID, vcpu->arch.pid);
- isync();
+ /*
+ * isync not required here because we are HRFID'ing to guest before
+ * any guest context access, which is context synchronising.
+ */
}
static void switch_mmu_to_guest_hpt(struct kvm *kvm, struct kvm_vcpu *vcpu, u64 lpcr)
@@ -551,25 +553,41 @@ static void switch_mmu_to_guest_hpt(struct kvm *kvm, struct kvm_vcpu *vcpu, u64
lpid = kvm->arch.lpid;
+ /*
+ * See switch_mmu_to_guest_radix. ptesync should not be required here
+ * even if the host is in HPT mode because speculative accesses would
+ * not cause RC updates (we are in real mode).
+ */
+ asm volatile("hwsync" ::: "memory");
+ isync();
mtspr(SPRN_LPID, lpid);
mtspr(SPRN_LPCR, lpcr);
mtspr(SPRN_PID, vcpu->arch.pid);
for (i = 0; i < vcpu->arch.slb_max; i++)
mtslb(vcpu->arch.slb[i].orige, vcpu->arch.slb[i].origv);
-
- isync();
+ /*
+ * isync not required here, see switch_mmu_to_guest_radix.
+ */
}
static void switch_mmu_to_host(struct kvm *kvm, u32 pid)
{
+ /*
+ * The guest has exited, so guest MMU context is no longer being
+ * non-speculatively accessed, but a hwsync is needed before the
+ * mtLPIDR / mtPIDR switch, in order to ensure all stores are drained,
+ * so the not-my-LPAR tlbie logic does not overlook them.
+ */
+ asm volatile("hwsync" ::: "memory");
isync();
mtspr(SPRN_PID, pid);
- isync();
mtspr(SPRN_LPID, kvm->arch.host_lpid);
- isync();
mtspr(SPRN_LPCR, kvm->arch.host_lpcr);
- isync();
+ /*
+ * isync is not required after the switch, because mtmsrd with L=0
+ * is performed after this switch, which is context synchronising.
+ */
if (!radix_enabled())
slb_restore_bolted_realmode();
--
2.23.0
^ permalink raw reply related
* [PATCH v3 38/52] KVM: PPC: Book3S HV P9: Use Linux SPR save/restore to manage some host SPRs
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
Linux implements SPR save/restore including storage space for registers
in the task struct for process context switching. Make use of this
similarly to the way we make use of the context switching fp/vec save
restore.
This improves code reuse, allows some stack space to be saved, and helps
with avoiding VRSAVE updates if they are not required.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/include/asm/switch_to.h | 1 +
arch/powerpc/kernel/process.c | 6 ++
arch/powerpc/kvm/book3s_hv.c | 21 +-----
arch/powerpc/kvm/book3s_hv.h | 3 -
arch/powerpc/kvm/book3s_hv_p9_entry.c | 93 +++++++++++++++++++--------
5 files changed, 73 insertions(+), 51 deletions(-)
diff --git a/arch/powerpc/include/asm/switch_to.h b/arch/powerpc/include/asm/switch_to.h
index e8013cd6b646..1f43ef696033 100644
--- a/arch/powerpc/include/asm/switch_to.h
+++ b/arch/powerpc/include/asm/switch_to.h
@@ -113,6 +113,7 @@ static inline void clear_task_ebb(struct task_struct *t)
}
void kvmppc_save_user_regs(void);
+void kvmppc_save_current_sprs(void);
extern int set_thread_tidr(struct task_struct *t);
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 3fca321b820d..b2d191a8fbf9 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1182,6 +1182,12 @@ void kvmppc_save_user_regs(void)
#endif
}
EXPORT_SYMBOL_GPL(kvmppc_save_user_regs);
+
+void kvmppc_save_current_sprs(void)
+{
+ save_sprs(¤t->thread);
+}
+EXPORT_SYMBOL_GPL(kvmppc_save_current_sprs);
#endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
static inline void restore_sprs(struct thread_struct *old_thread,
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index e9037f7a3737..f4445cc5a29a 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4540,9 +4540,6 @@ static int kvmppc_vcpu_run_hv(struct kvm_vcpu *vcpu)
struct kvm_run *run = vcpu->run;
int r;
int srcu_idx;
- unsigned long ebb_regs[3] = {}; /* shut up GCC */
- unsigned long user_tar = 0;
- unsigned int user_vrsave;
struct kvm *kvm;
unsigned long msr;
@@ -4603,14 +4600,7 @@ static int kvmppc_vcpu_run_hv(struct kvm_vcpu *vcpu)
kvmppc_save_user_regs();
- /* Save userspace EBB and other register values */
- if (cpu_has_feature(CPU_FTR_ARCH_207S)) {
- ebb_regs[0] = mfspr(SPRN_EBBHR);
- ebb_regs[1] = mfspr(SPRN_EBBRR);
- ebb_regs[2] = mfspr(SPRN_BESCR);
- user_tar = mfspr(SPRN_TAR);
- }
- user_vrsave = mfspr(SPRN_VRSAVE);
+ kvmppc_save_current_sprs();
vcpu->arch.waitp = &vcpu->arch.vcore->wait;
vcpu->arch.pgdir = kvm->mm->pgd;
@@ -4651,15 +4641,6 @@ static int kvmppc_vcpu_run_hv(struct kvm_vcpu *vcpu)
}
} while (is_kvmppc_resume_guest(r));
- /* Restore userspace EBB and other register values */
- if (cpu_has_feature(CPU_FTR_ARCH_207S)) {
- mtspr(SPRN_EBBHR, ebb_regs[0]);
- mtspr(SPRN_EBBRR, ebb_regs[1]);
- mtspr(SPRN_BESCR, ebb_regs[2]);
- mtspr(SPRN_TAR, user_tar);
- }
- mtspr(SPRN_VRSAVE, user_vrsave);
-
vcpu->arch.state = KVMPPC_VCPU_NOTREADY;
atomic_dec(&kvm->arch.vcpus_running);
diff --git a/arch/powerpc/kvm/book3s_hv.h b/arch/powerpc/kvm/book3s_hv.h
index d7485b9e9762..6b7f07d9026b 100644
--- a/arch/powerpc/kvm/book3s_hv.h
+++ b/arch/powerpc/kvm/book3s_hv.h
@@ -4,11 +4,8 @@
* Privileged (non-hypervisor) host registers to save.
*/
struct p9_host_os_sprs {
- unsigned long dscr;
- unsigned long tidr;
unsigned long iamr;
unsigned long amr;
- unsigned long fscr;
unsigned int pmc1;
unsigned int pmc2;
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 8499e8a9ca8f..093ac0453d91 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -231,15 +231,26 @@ EXPORT_SYMBOL_GPL(switch_pmu_to_host);
static void load_spr_state(struct kvm_vcpu *vcpu,
struct p9_host_os_sprs *host_os_sprs)
{
+ /* TAR is very fast */
mtspr(SPRN_TAR, vcpu->arch.tar);
+#ifdef CONFIG_ALTIVEC
+ if (cpu_has_feature(CPU_FTR_ALTIVEC) &&
+ current->thread.vrsave != vcpu->arch.vrsave)
+ mtspr(SPRN_VRSAVE, vcpu->arch.vrsave);
+#endif
+
if (vcpu->arch.hfscr & HFSCR_EBB) {
- mtspr(SPRN_EBBHR, vcpu->arch.ebbhr);
- mtspr(SPRN_EBBRR, vcpu->arch.ebbrr);
- mtspr(SPRN_BESCR, vcpu->arch.bescr);
+ if (current->thread.ebbhr != vcpu->arch.ebbhr)
+ mtspr(SPRN_EBBHR, vcpu->arch.ebbhr);
+ if (current->thread.ebbrr != vcpu->arch.ebbrr)
+ mtspr(SPRN_EBBRR, vcpu->arch.ebbrr);
+ if (current->thread.bescr != vcpu->arch.bescr)
+ mtspr(SPRN_BESCR, vcpu->arch.bescr);
}
- if (cpu_has_feature(CPU_FTR_P9_TIDR))
+ if (cpu_has_feature(CPU_FTR_P9_TIDR) &&
+ current->thread.tidr != vcpu->arch.tid)
mtspr(SPRN_TIDR, vcpu->arch.tid);
if (host_os_sprs->iamr != vcpu->arch.iamr)
mtspr(SPRN_IAMR, vcpu->arch.iamr);
@@ -247,9 +258,9 @@ static void load_spr_state(struct kvm_vcpu *vcpu,
mtspr(SPRN_AMR, vcpu->arch.amr);
if (vcpu->arch.uamor != 0)
mtspr(SPRN_UAMOR, vcpu->arch.uamor);
- if (host_os_sprs->fscr != vcpu->arch.fscr)
+ if (current->thread.fscr != vcpu->arch.fscr)
mtspr(SPRN_FSCR, vcpu->arch.fscr);
- if (host_os_sprs->dscr != vcpu->arch.dscr)
+ if (current->thread.dscr != vcpu->arch.dscr)
mtspr(SPRN_DSCR, vcpu->arch.dscr);
if (vcpu->arch.pspb != 0)
mtspr(SPRN_PSPB, vcpu->arch.pspb);
@@ -269,20 +280,15 @@ static void store_spr_state(struct kvm_vcpu *vcpu)
{
vcpu->arch.tar = mfspr(SPRN_TAR);
+#ifdef CONFIG_ALTIVEC
+ if (cpu_has_feature(CPU_FTR_ALTIVEC))
+ vcpu->arch.vrsave = mfspr(SPRN_VRSAVE);
+#endif
+
if (vcpu->arch.hfscr & HFSCR_EBB) {
vcpu->arch.ebbhr = mfspr(SPRN_EBBHR);
vcpu->arch.ebbrr = mfspr(SPRN_EBBRR);
vcpu->arch.bescr = mfspr(SPRN_BESCR);
- /*
- * This is like load_fp in context switching, turn off the
- * facility after it wraps the u8 to try avoiding saving
- * and restoring the registers each partition switch.
- */
- if (!vcpu->arch.nested) {
- vcpu->arch.load_ebb++;
- if (!vcpu->arch.load_ebb)
- vcpu->arch.hfscr &= ~HFSCR_EBB;
- }
}
if (cpu_has_feature(CPU_FTR_P9_TIDR))
@@ -324,7 +330,6 @@ bool load_vcpu_state(struct kvm_vcpu *vcpu,
#ifdef CONFIG_ALTIVEC
load_vr_state(&vcpu->arch.vr);
#endif
- mtspr(SPRN_VRSAVE, vcpu->arch.vrsave);
return ret;
}
@@ -338,7 +343,6 @@ void store_vcpu_state(struct kvm_vcpu *vcpu)
#ifdef CONFIG_ALTIVEC
store_vr_state(&vcpu->arch.vr);
#endif
- vcpu->arch.vrsave = mfspr(SPRN_VRSAVE);
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
if (cpu_has_feature(CPU_FTR_TM) ||
@@ -364,12 +368,8 @@ EXPORT_SYMBOL_GPL(store_vcpu_state);
void save_p9_host_os_sprs(struct p9_host_os_sprs *host_os_sprs)
{
- if (cpu_has_feature(CPU_FTR_P9_TIDR))
- host_os_sprs->tidr = mfspr(SPRN_TIDR);
host_os_sprs->iamr = mfspr(SPRN_IAMR);
host_os_sprs->amr = mfspr(SPRN_AMR);
- host_os_sprs->fscr = mfspr(SPRN_FSCR);
- host_os_sprs->dscr = mfspr(SPRN_DSCR);
}
EXPORT_SYMBOL_GPL(save_p9_host_os_sprs);
@@ -377,26 +377,63 @@ EXPORT_SYMBOL_GPL(save_p9_host_os_sprs);
void restore_p9_host_os_sprs(struct kvm_vcpu *vcpu,
struct p9_host_os_sprs *host_os_sprs)
{
+ /*
+ * current->thread.xxx registers must all be restored to host
+ * values before a potential context switch, othrewise the context
+ * switch itself will overwrite current->thread.xxx with the values
+ * from the guest SPRs.
+ */
+
mtspr(SPRN_SPRG_VDSO_WRITE, local_paca->sprg_vdso);
- if (cpu_has_feature(CPU_FTR_P9_TIDR))
- mtspr(SPRN_TIDR, host_os_sprs->tidr);
+ if (cpu_has_feature(CPU_FTR_P9_TIDR) &&
+ current->thread.tidr != vcpu->arch.tid)
+ mtspr(SPRN_TIDR, current->thread.tidr);
if (host_os_sprs->iamr != vcpu->arch.iamr)
mtspr(SPRN_IAMR, host_os_sprs->iamr);
if (vcpu->arch.uamor != 0)
mtspr(SPRN_UAMOR, 0);
if (host_os_sprs->amr != vcpu->arch.amr)
mtspr(SPRN_AMR, host_os_sprs->amr);
- if (host_os_sprs->fscr != vcpu->arch.fscr)
- mtspr(SPRN_FSCR, host_os_sprs->fscr);
- if (host_os_sprs->dscr != vcpu->arch.dscr)
- mtspr(SPRN_DSCR, host_os_sprs->dscr);
+ if (current->thread.fscr != vcpu->arch.fscr)
+ mtspr(SPRN_FSCR, current->thread.fscr);
+ if (current->thread.dscr != vcpu->arch.dscr)
+ mtspr(SPRN_DSCR, current->thread.dscr);
if (vcpu->arch.pspb != 0)
mtspr(SPRN_PSPB, 0);
/* Save guest CTRL register, set runlatch to 1 */
if (!(vcpu->arch.ctrl & 1))
mtspr(SPRN_CTRLT, 1);
+
+#ifdef CONFIG_ALTIVEC
+ if (cpu_has_feature(CPU_FTR_ALTIVEC) &&
+ vcpu->arch.vrsave != current->thread.vrsave)
+ mtspr(SPRN_VRSAVE, current->thread.vrsave);
+#endif
+ if (vcpu->arch.hfscr & HFSCR_EBB) {
+ if (vcpu->arch.bescr != current->thread.bescr)
+ mtspr(SPRN_BESCR, current->thread.bescr);
+ if (vcpu->arch.ebbhr != current->thread.ebbhr)
+ mtspr(SPRN_EBBHR, current->thread.ebbhr);
+ if (vcpu->arch.ebbrr != current->thread.ebbrr)
+ mtspr(SPRN_EBBRR, current->thread.ebbrr);
+
+ if (!vcpu->arch.nested) {
+ /*
+ * This is like load_fp in context switching, turn off
+ * the facility after it wraps the u8 to try avoiding
+ * saving and restoring the registers each partition
+ * switch.
+ */
+ vcpu->arch.load_ebb++;
+ if (!vcpu->arch.load_ebb)
+ vcpu->arch.hfscr &= ~HFSCR_EBB;
+ }
+ }
+
+ if (vcpu->arch.tar != current->thread.tar)
+ mtspr(SPRN_TAR, current->thread.tar);
}
EXPORT_SYMBOL_GPL(restore_p9_host_os_sprs);
--
2.23.0
^ permalink raw reply related
* [PATCH v3 37/52] KVM: PPC: Book3S HV P9: Demand fault TM facility registers
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin, Fabiano Rosas
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
Use HFSCR facility disabling to implement demand faulting for TM, with
a hysteresis counter similar to the load_fp etc counters in context
switching that implement the equivalent demand faulting for userspace
facilities.
This speeds up guest entry/exit by avoiding the register save/restore
when a guest is not frequently using them. When a guest does use them
often, there will be some additional demand fault overhead, but these
are not commonly used facilities.
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/include/asm/kvm_host.h | 3 +++
arch/powerpc/kvm/book3s_hv.c | 26 ++++++++++++++++++++------
arch/powerpc/kvm/book3s_hv_p9_entry.c | 15 +++++++++++----
3 files changed, 34 insertions(+), 10 deletions(-)
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 9c63eff35812..92925f82a1e3 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -580,6 +580,9 @@ struct kvm_vcpu_arch {
ulong ppr;
u32 pspb;
u8 load_ebb;
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+ u8 load_tm;
+#endif
ulong fscr;
ulong shadow_fscr;
ulong ebbhr;
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 070469867bf5..e9037f7a3737 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1446,6 +1446,16 @@ static int kvmppc_ebb_unavailable(struct kvm_vcpu *vcpu)
return RESUME_GUEST;
}
+static int kvmppc_tm_unavailable(struct kvm_vcpu *vcpu)
+{
+ if (!(vcpu->arch.hfscr_permitted & HFSCR_TM))
+ return EMULATE_FAIL;
+
+ vcpu->arch.hfscr |= HFSCR_TM;
+
+ return RESUME_GUEST;
+}
+
static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu,
struct task_struct *tsk)
{
@@ -1739,6 +1749,8 @@ static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu,
r = kvmppc_pmu_unavailable(vcpu);
if (cause == FSCR_EBB_LG)
r = kvmppc_ebb_unavailable(vcpu);
+ if (cause == FSCR_TM_LG)
+ r = kvmppc_tm_unavailable(vcpu);
}
if (r == EMULATE_FAIL) {
kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
@@ -2783,9 +2795,9 @@ static int kvmppc_core_vcpu_create_hv(struct kvm_vcpu *vcpu)
vcpu->arch.hfscr_permitted = vcpu->arch.hfscr;
/*
- * PM, EBB is demand-faulted so start with it clear.
+ * PM, EBB, TM are demand-faulted so start with it clear.
*/
- vcpu->arch.hfscr &= ~(HFSCR_PM | HFSCR_EBB);
+ vcpu->arch.hfscr &= ~(HFSCR_PM | HFSCR_EBB | HFSCR_TM);
kvmppc_mmu_book3s_hv_init(vcpu);
@@ -3855,8 +3867,9 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
msr |= MSR_VEC;
if (cpu_has_feature(CPU_FTR_VSX))
msr |= MSR_VSX;
- if (cpu_has_feature(CPU_FTR_TM) ||
- cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
+ if ((cpu_has_feature(CPU_FTR_TM) ||
+ cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
+ (vcpu->arch.hfscr & HFSCR_TM))
msr |= MSR_TM;
msr = msr_check_and_set(msr);
@@ -4582,8 +4595,9 @@ static int kvmppc_vcpu_run_hv(struct kvm_vcpu *vcpu)
msr |= MSR_VEC;
if (cpu_has_feature(CPU_FTR_VSX))
msr |= MSR_VSX;
- if (cpu_has_feature(CPU_FTR_TM) ||
- cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
+ if ((cpu_has_feature(CPU_FTR_TM) ||
+ cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
+ (vcpu->arch.hfscr & HFSCR_TM))
msr |= MSR_TM;
msr = msr_check_and_set(msr);
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 929a7c336b09..8499e8a9ca8f 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -310,7 +310,7 @@ bool load_vcpu_state(struct kvm_vcpu *vcpu,
if (MSR_TM_ACTIVE(guest_msr)) {
kvmppc_restore_tm_hv(vcpu, guest_msr, true);
ret = true;
- } else {
+ } else if (vcpu->arch.hfscr & HFSCR_TM) {
mtspr(SPRN_TEXASR, vcpu->arch.texasr);
mtspr(SPRN_TFHAR, vcpu->arch.tfhar);
mtspr(SPRN_TFIAR, vcpu->arch.tfiar);
@@ -346,10 +346,16 @@ void store_vcpu_state(struct kvm_vcpu *vcpu)
unsigned long guest_msr = vcpu->arch.shregs.msr;
if (MSR_TM_ACTIVE(guest_msr)) {
kvmppc_save_tm_hv(vcpu, guest_msr, true);
- } else {
+ } else if (vcpu->arch.hfscr & HFSCR_TM) {
vcpu->arch.texasr = mfspr(SPRN_TEXASR);
vcpu->arch.tfhar = mfspr(SPRN_TFHAR);
vcpu->arch.tfiar = mfspr(SPRN_TFIAR);
+
+ if (!vcpu->arch.nested) {
+ vcpu->arch.load_tm++; /* see load_ebb comment */
+ if (!vcpu->arch.load_tm)
+ vcpu->arch.hfscr &= ~HFSCR_TM;
+ }
}
}
#endif
@@ -641,8 +647,9 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
msr |= MSR_VEC;
if (cpu_has_feature(CPU_FTR_VSX))
msr |= MSR_VSX;
- if (cpu_has_feature(CPU_FTR_TM) ||
- cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
+ if ((cpu_has_feature(CPU_FTR_TM) ||
+ cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
+ (vcpu->arch.hfscr & HFSCR_TM))
msr |= MSR_TM;
msr = msr_check_and_set(msr);
/* Save MSR for restore. This is after hard disable, so EE is clear. */
--
2.23.0
^ permalink raw reply related
* [PATCH v3 36/52] KVM: PPC: Book3S HV P9: Demand fault EBB facility registers
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin, Fabiano Rosas
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
Use HFSCR facility disabling to implement demand faulting for EBB, with
a hysteresis counter similar to the load_fp etc counters in context
switching that implement the equivalent demand faulting for userspace
facilities.
This speeds up guest entry/exit by avoiding the register save/restore
when a guest is not frequently using them. When a guest does use them
often, there will be some additional demand fault overhead, but these
are not commonly used facilities.
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/include/asm/kvm_host.h | 1 +
arch/powerpc/kvm/book3s_hv.c | 16 +++++++++++++--
arch/powerpc/kvm/book3s_hv_p9_entry.c | 28 +++++++++++++++++++++------
3 files changed, 37 insertions(+), 8 deletions(-)
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index c5fc4d016695..9c63eff35812 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -579,6 +579,7 @@ struct kvm_vcpu_arch {
ulong cfar;
ulong ppr;
u32 pspb;
+ u8 load_ebb;
ulong fscr;
ulong shadow_fscr;
ulong ebbhr;
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 6fb941aa77f1..070469867bf5 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1436,6 +1436,16 @@ static int kvmppc_pmu_unavailable(struct kvm_vcpu *vcpu)
return RESUME_GUEST;
}
+static int kvmppc_ebb_unavailable(struct kvm_vcpu *vcpu)
+{
+ if (!(vcpu->arch.hfscr_permitted & HFSCR_EBB))
+ return EMULATE_FAIL;
+
+ vcpu->arch.hfscr |= HFSCR_EBB;
+
+ return RESUME_GUEST;
+}
+
static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu,
struct task_struct *tsk)
{
@@ -1727,6 +1737,8 @@ static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu,
r = kvmppc_emulate_doorbell_instr(vcpu);
if (cause == FSCR_PM_LG)
r = kvmppc_pmu_unavailable(vcpu);
+ if (cause == FSCR_EBB_LG)
+ r = kvmppc_ebb_unavailable(vcpu);
}
if (r == EMULATE_FAIL) {
kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
@@ -2771,9 +2783,9 @@ static int kvmppc_core_vcpu_create_hv(struct kvm_vcpu *vcpu)
vcpu->arch.hfscr_permitted = vcpu->arch.hfscr;
/*
- * PM is demand-faulted so start with it clear.
+ * PM, EBB is demand-faulted so start with it clear.
*/
- vcpu->arch.hfscr &= ~HFSCR_PM;
+ vcpu->arch.hfscr &= ~(HFSCR_PM | HFSCR_EBB);
kvmppc_mmu_book3s_hv_init(vcpu);
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index a23f09fa7d2d..929a7c336b09 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -232,9 +232,12 @@ static void load_spr_state(struct kvm_vcpu *vcpu,
struct p9_host_os_sprs *host_os_sprs)
{
mtspr(SPRN_TAR, vcpu->arch.tar);
- mtspr(SPRN_EBBHR, vcpu->arch.ebbhr);
- mtspr(SPRN_EBBRR, vcpu->arch.ebbrr);
- mtspr(SPRN_BESCR, vcpu->arch.bescr);
+
+ if (vcpu->arch.hfscr & HFSCR_EBB) {
+ mtspr(SPRN_EBBHR, vcpu->arch.ebbhr);
+ mtspr(SPRN_EBBRR, vcpu->arch.ebbrr);
+ mtspr(SPRN_BESCR, vcpu->arch.bescr);
+ }
if (cpu_has_feature(CPU_FTR_P9_TIDR))
mtspr(SPRN_TIDR, vcpu->arch.tid);
@@ -265,9 +268,22 @@ static void load_spr_state(struct kvm_vcpu *vcpu,
static void store_spr_state(struct kvm_vcpu *vcpu)
{
vcpu->arch.tar = mfspr(SPRN_TAR);
- vcpu->arch.ebbhr = mfspr(SPRN_EBBHR);
- vcpu->arch.ebbrr = mfspr(SPRN_EBBRR);
- vcpu->arch.bescr = mfspr(SPRN_BESCR);
+
+ if (vcpu->arch.hfscr & HFSCR_EBB) {
+ vcpu->arch.ebbhr = mfspr(SPRN_EBBHR);
+ vcpu->arch.ebbrr = mfspr(SPRN_EBBRR);
+ vcpu->arch.bescr = mfspr(SPRN_BESCR);
+ /*
+ * This is like load_fp in context switching, turn off the
+ * facility after it wraps the u8 to try avoiding saving
+ * and restoring the registers each partition switch.
+ */
+ if (!vcpu->arch.nested) {
+ vcpu->arch.load_ebb++;
+ if (!vcpu->arch.load_ebb)
+ vcpu->arch.hfscr &= ~HFSCR_EBB;
+ }
+ }
if (cpu_has_feature(CPU_FTR_P9_TIDR))
vcpu->arch.tid = mfspr(SPRN_TIDR);
--
2.23.0
^ permalink raw reply related
* [PATCH v3 35/52] KVM: PPC: Book3S HV P9: More SPR speed improvements
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
This avoids more scoreboard stalls and reduces mtSPRs.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kvm/book3s_hv_p9_entry.c | 73 ++++++++++++++++-----------
1 file changed, 43 insertions(+), 30 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 67f57b03a896..a23f09fa7d2d 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -645,24 +645,29 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
vc->tb_offset_applied = vc->tb_offset;
}
- if (vc->pcr)
- mtspr(SPRN_PCR, vc->pcr | PCR_MASK);
- mtspr(SPRN_DPDES, vc->dpdes);
mtspr(SPRN_VTB, vc->vtb);
-
mtspr(SPRN_PURR, vcpu->arch.purr);
mtspr(SPRN_SPURR, vcpu->arch.spurr);
+ if (vc->pcr)
+ mtspr(SPRN_PCR, vc->pcr | PCR_MASK);
+ if (vc->dpdes)
+ mtspr(SPRN_DPDES, vc->dpdes);
+
if (dawr_enabled()) {
- mtspr(SPRN_DAWR0, vcpu->arch.dawr0);
- mtspr(SPRN_DAWRX0, vcpu->arch.dawrx0);
+ if (vcpu->arch.dawr0 != host_dawr0)
+ mtspr(SPRN_DAWR0, vcpu->arch.dawr0);
+ if (vcpu->arch.dawrx0 != host_dawrx0)
+ mtspr(SPRN_DAWRX0, vcpu->arch.dawrx0);
if (cpu_has_feature(CPU_FTR_DAWR1)) {
- mtspr(SPRN_DAWR1, vcpu->arch.dawr1);
- mtspr(SPRN_DAWRX1, vcpu->arch.dawrx1);
+ if (vcpu->arch.dawr1 != host_dawr1)
+ mtspr(SPRN_DAWR1, vcpu->arch.dawr1);
+ if (vcpu->arch.dawrx1 != host_dawrx1)
+ mtspr(SPRN_DAWRX1, vcpu->arch.dawrx1);
}
}
- mtspr(SPRN_CIABR, vcpu->arch.ciabr);
- mtspr(SPRN_IC, vcpu->arch.ic);
+ if (vcpu->arch.ciabr != host_ciabr)
+ mtspr(SPRN_CIABR, vcpu->arch.ciabr);
mtspr(SPRN_PSSCR, vcpu->arch.psscr | PSSCR_EC |
(local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
@@ -881,20 +886,6 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
vc->dpdes = mfspr(SPRN_DPDES);
vc->vtb = mfspr(SPRN_VTB);
- save_clear_guest_mmu(kvm, vcpu);
- switch_mmu_to_host(kvm, host_pidr);
-
- /*
- * If we are in real mode, only switch MMU on after the MMU is
- * switched to host, to avoid the P9_RADIX_PREFETCH_BUG.
- */
- if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
- vcpu->arch.shregs.msr & MSR_TS_MASK)
- msr |= MSR_TS_S;
- __mtmsrd(msr, 0);
-
- store_vcpu_state(vcpu);
-
dec = mfspr(SPRN_DEC);
if (!(lpcr & LPCR_LD)) /* Sign extend if not using large decrementer */
dec = (s32) dec;
@@ -912,6 +903,22 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
vc->tb_offset_applied = 0;
}
+ save_clear_guest_mmu(kvm, vcpu);
+ switch_mmu_to_host(kvm, host_pidr);
+
+ /*
+ * Enable MSR here in order to have facilities enabled to save
+ * guest registers. This enables MMU (if we were in realmode), so
+ * only switch MMU on after the MMU is switched to host, to avoid
+ * the P9_RADIX_PREFETCH_BUG or hash guest context.
+ */
+ if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
+ vcpu->arch.shregs.msr & MSR_TS_MASK)
+ msr |= MSR_TS_S;
+ __mtmsrd(msr, 0);
+
+ store_vcpu_state(vcpu);
+
mtspr(SPRN_PURR, local_paca->kvm_hstate.host_purr);
mtspr(SPRN_SPURR, local_paca->kvm_hstate.host_spurr);
@@ -919,15 +926,21 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
mtspr(SPRN_PSSCR, host_psscr |
(local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
mtspr(SPRN_HFSCR, host_hfscr);
- mtspr(SPRN_CIABR, host_ciabr);
- mtspr(SPRN_DAWR0, host_dawr0);
- mtspr(SPRN_DAWRX0, host_dawrx0);
+ if (vcpu->arch.ciabr != host_ciabr)
+ mtspr(SPRN_CIABR, host_ciabr);
+ if (vcpu->arch.dawr0 != host_dawr0)
+ mtspr(SPRN_DAWR0, host_dawr0);
+ if (vcpu->arch.dawrx0 != host_dawrx0)
+ mtspr(SPRN_DAWRX0, host_dawrx0);
if (cpu_has_feature(CPU_FTR_DAWR1)) {
- mtspr(SPRN_DAWR1, host_dawr1);
- mtspr(SPRN_DAWRX1, host_dawrx1);
+ if (vcpu->arch.dawr1 != host_dawr1)
+ mtspr(SPRN_DAWR1, host_dawr1);
+ if (vcpu->arch.dawrx1 != host_dawrx1)
+ mtspr(SPRN_DAWRX1, host_dawrx1);
}
- mtspr(SPRN_DPDES, 0);
+ if (vc->dpdes)
+ mtspr(SPRN_DPDES, 0);
if (vc->pcr)
mtspr(SPRN_PCR, PCR_MASK);
--
2.23.0
^ permalink raw reply related
* [PATCH v3 34/52] KVM: PPC: Book3S HV P9: Restrict DSISR canary workaround to processors that require it
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
Use CPU_FTR_P9_RADIX_PREFETCH_BUG to apply the workaround, to test for
DD2.1 and below processors. This saves a mtSPR in guest entry.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kvm/book3s_hv.c | 3 ++-
arch/powerpc/kvm/book3s_hv_p9_entry.c | 6 ++++--
2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 5a1859311b3e..6fb941aa77f1 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1590,7 +1590,8 @@ static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu,
unsigned long vsid;
long err;
- if (vcpu->arch.fault_dsisr == HDSISR_CANARY) {
+ if (cpu_has_feature(CPU_FTR_P9_RADIX_PREFETCH_BUG) &&
+ unlikely(vcpu->arch.fault_dsisr == HDSISR_CANARY)) {
r = RESUME_GUEST; /* Just retry if it's the canary */
break;
}
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 619bbcd47b92..67f57b03a896 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -683,9 +683,11 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
* HDSI which should correctly update the HDSISR the second time HDSI
* entry.
*
- * Just do this on all p9 processors for now.
+ * The "radix prefetch bug" test can be used to test for this bug, as
+ * it also exists fo DD2.1 and below.
*/
- mtspr(SPRN_HDSISR, HDSISR_CANARY);
+ if (cpu_has_feature(CPU_FTR_P9_RADIX_PREFETCH_BUG))
+ mtspr(SPRN_HDSISR, HDSISR_CANARY);
mtspr(SPRN_SPRG0, vcpu->arch.shregs.sprg0);
mtspr(SPRN_SPRG1, vcpu->arch.shregs.sprg1);
--
2.23.0
^ permalink raw reply related
* [PATCH v3 33/52] KVM: PPC: Book3S HV P9: Switch PMU to guest as late as possible
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
This moves PMU switch to guest as late as possible in entry, and switch
back to host as early as possible at exit. This helps the host get the
most perf coverage of KVM entry/exit code as possible.
This is slightly suboptimal for SPR scheduling point of view when the
PMU is enabled, but when perf is disabled there is no real difference.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kvm/book3s_hv.c | 6 ++----
arch/powerpc/kvm/book3s_hv_p9_entry.c | 6 ++----
2 files changed, 4 insertions(+), 8 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index db42eeb27c15..5a1859311b3e 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3820,8 +3820,6 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
s64 dec;
int trap;
- switch_pmu_to_guest(vcpu, &host_os_sprs);
-
save_p9_host_os_sprs(&host_os_sprs);
/*
@@ -3884,9 +3882,11 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
mtspr(SPRN_DAR, vcpu->arch.shregs.dar);
mtspr(SPRN_DSISR, vcpu->arch.shregs.dsisr);
+ switch_pmu_to_guest(vcpu, &host_os_sprs);
trap = plpar_hcall_norets(H_ENTER_NESTED, __pa(&hvregs),
__pa(&vcpu->arch.regs));
kvmhv_restore_hv_return_state(vcpu, &hvregs);
+ switch_pmu_to_host(vcpu, &host_os_sprs);
vcpu->arch.shregs.msr = vcpu->arch.regs.msr;
vcpu->arch.shregs.dar = mfspr(SPRN_DAR);
vcpu->arch.shregs.dsisr = mfspr(SPRN_DSISR);
@@ -3905,8 +3905,6 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
restore_p9_host_os_sprs(vcpu, &host_os_sprs);
- switch_pmu_to_host(vcpu, &host_os_sprs);
-
return trap;
}
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 6bef509bccb8..619bbcd47b92 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -601,8 +601,6 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
local_paca->kvm_hstate.host_purr = mfspr(SPRN_PURR);
local_paca->kvm_hstate.host_spurr = mfspr(SPRN_SPURR);
- switch_pmu_to_guest(vcpu, &host_os_sprs);
-
save_p9_host_os_sprs(&host_os_sprs);
/*
@@ -744,7 +742,9 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
accumulate_time(vcpu, &vcpu->arch.guest_time);
+ switch_pmu_to_guest(vcpu, &host_os_sprs);
kvmppc_p9_enter_guest(vcpu);
+ switch_pmu_to_host(vcpu, &host_os_sprs);
accumulate_time(vcpu, &vcpu->arch.rm_intr);
@@ -955,8 +955,6 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
asm volatile(PPC_CP_ABORT);
out:
- switch_pmu_to_host(vcpu, &host_os_sprs);
-
end_timing(vcpu);
return trap;
--
2.23.0
^ permalink raw reply related
* [PATCH v3 32/52] KVM: PPC: Book3S HV P9: Implement TM fastpath for guest entry/exit
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
If TM is not active, only TM register state needs to be saved and
restored, avoiding several mfmsr/mtmsrd instructions and improving
performance.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kvm/book3s_hv_p9_entry.c | 27 +++++++++++++++++++++++----
1 file changed, 23 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index fa080533bd8d..6bef509bccb8 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -287,11 +287,20 @@ bool load_vcpu_state(struct kvm_vcpu *vcpu,
{
bool ret = false;
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
if (cpu_has_feature(CPU_FTR_TM) ||
cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) {
- kvmppc_restore_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
- ret = true;
+ unsigned long guest_msr = vcpu->arch.shregs.msr;
+ if (MSR_TM_ACTIVE(guest_msr)) {
+ kvmppc_restore_tm_hv(vcpu, guest_msr, true);
+ ret = true;
+ } else {
+ mtspr(SPRN_TEXASR, vcpu->arch.texasr);
+ mtspr(SPRN_TFHAR, vcpu->arch.tfhar);
+ mtspr(SPRN_TFIAR, vcpu->arch.tfiar);
+ }
}
+#endif
load_spr_state(vcpu, host_os_sprs);
@@ -315,9 +324,19 @@ void store_vcpu_state(struct kvm_vcpu *vcpu)
#endif
vcpu->arch.vrsave = mfspr(SPRN_VRSAVE);
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
if (cpu_has_feature(CPU_FTR_TM) ||
- cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
- kvmppc_save_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
+ cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) {
+ unsigned long guest_msr = vcpu->arch.shregs.msr;
+ if (MSR_TM_ACTIVE(guest_msr)) {
+ kvmppc_save_tm_hv(vcpu, guest_msr, true);
+ } else {
+ vcpu->arch.texasr = mfspr(SPRN_TEXASR);
+ vcpu->arch.tfhar = mfspr(SPRN_TFHAR);
+ vcpu->arch.tfiar = mfspr(SPRN_TFIAR);
+ }
+ }
+#endif
}
EXPORT_SYMBOL_GPL(store_vcpu_state);
--
2.23.0
^ permalink raw reply related
* [PATCH v3 31/52] KVM: PPC: Book3S HV P9: Move remaining SPR and MSR access into low level entry
From: Nicholas Piggin @ 2021-10-04 16:00 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20211004160049.1338837-1-npiggin@gmail.com>
Move register saving and loading from kvmhv_p9_guest_entry() into the HV
and nested entry handlers.
Accesses are scheduled to reduce mtSPR / mfSPR interleaving which
reduces SPR scoreboard stalls.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kvm/book3s_hv.c | 79 ++++++++++------------
arch/powerpc/kvm/book3s_hv_p9_entry.c | 96 ++++++++++++++++++++-------
2 files changed, 109 insertions(+), 66 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index a57727463980..db42eeb27c15 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3814,9 +3814,15 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
{
struct kvmppc_vcore *vc = vcpu->arch.vcore;
unsigned long host_psscr;
+ unsigned long msr;
struct hv_guest_state hvregs;
- int trap;
+ struct p9_host_os_sprs host_os_sprs;
s64 dec;
+ int trap;
+
+ switch_pmu_to_guest(vcpu, &host_os_sprs);
+
+ save_p9_host_os_sprs(&host_os_sprs);
/*
* We need to save and restore the guest visible part of the
@@ -3825,6 +3831,27 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
* this is done in kvmhv_vcpu_entry_p9() below otherwise.
*/
host_psscr = mfspr(SPRN_PSSCR_PR);
+
+ hard_irq_disable();
+ if (lazy_irq_pending())
+ return 0;
+
+ /* MSR bits may have been cleared by context switch */
+ msr = 0;
+ if (IS_ENABLED(CONFIG_PPC_FPU))
+ msr |= MSR_FP;
+ if (cpu_has_feature(CPU_FTR_ALTIVEC))
+ msr |= MSR_VEC;
+ if (cpu_has_feature(CPU_FTR_VSX))
+ msr |= MSR_VSX;
+ if (cpu_has_feature(CPU_FTR_TM) ||
+ cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
+ msr |= MSR_TM;
+ msr = msr_check_and_set(msr);
+
+ if (unlikely(load_vcpu_state(vcpu, &host_os_sprs)))
+ msr = mfmsr(); /* TM restore can update msr */
+
mtspr(SPRN_PSSCR_PR, vcpu->arch.psscr);
kvmhv_save_hv_regs(vcpu, &hvregs);
hvregs.lpcr = lpcr;
@@ -3866,12 +3893,20 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
vcpu->arch.psscr = mfspr(SPRN_PSSCR_PR);
mtspr(SPRN_PSSCR_PR, host_psscr);
+ store_vcpu_state(vcpu);
+
dec = mfspr(SPRN_DEC);
if (!(lpcr & LPCR_LD)) /* Sign extend if not using large decrementer */
dec = (s32) dec;
*tb = mftb();
vcpu->arch.dec_expires = dec + (*tb + vc->tb_offset);
+ timer_rearm_host_dec(*tb);
+
+ restore_p9_host_os_sprs(vcpu, &host_os_sprs);
+
+ switch_pmu_to_host(vcpu, &host_os_sprs);
+
return trap;
}
@@ -3882,9 +3917,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
unsigned long lpcr, u64 *tb)
{
struct kvmppc_vcore *vc = vcpu->arch.vcore;
- struct p9_host_os_sprs host_os_sprs;
u64 next_timer;
- unsigned long msr;
int trap;
next_timer = timer_get_next_tb();
@@ -3895,33 +3928,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
vcpu->arch.ceded = 0;
- save_p9_host_os_sprs(&host_os_sprs);
-
- /*
- * This could be combined with MSR[RI] clearing, but that expands
- * the unrecoverable window. It would be better to cover unrecoverable
- * with KVM bad interrupt handling rather than use MSR[RI] at all.
- *
- * Much more difficult and less worthwhile to combine with IR/DR
- * disable.
- */
- hard_irq_disable();
- if (lazy_irq_pending())
- return 0;
-
- /* MSR bits may have been cleared by context switch */
- msr = 0;
- if (IS_ENABLED(CONFIG_PPC_FPU))
- msr |= MSR_FP;
- if (cpu_has_feature(CPU_FTR_ALTIVEC))
- msr |= MSR_VEC;
- if (cpu_has_feature(CPU_FTR_VSX))
- msr |= MSR_VSX;
- if (cpu_has_feature(CPU_FTR_TM) ||
- cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
- msr |= MSR_TM;
- msr = msr_check_and_set(msr);
-
kvmppc_subcore_enter_guest();
vc->entry_exit_map = 1;
@@ -3929,11 +3935,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
vcpu_vpa_increment_dispatch(vcpu);
- if (unlikely(load_vcpu_state(vcpu, &host_os_sprs)))
- msr = mfmsr(); /* MSR may have been updated */
-
- switch_pmu_to_guest(vcpu, &host_os_sprs);
-
if (kvmhv_on_pseries()) {
trap = kvmhv_vcpu_entry_p9_nested(vcpu, time_limit, lpcr, tb);
@@ -3976,16 +3977,8 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
vcpu->arch.slb_max = 0;
}
- switch_pmu_to_host(vcpu, &host_os_sprs);
-
- store_vcpu_state(vcpu);
-
vcpu_vpa_increment_dispatch(vcpu);
- timer_rearm_host_dec(*tb);
-
- restore_p9_host_os_sprs(vcpu, &host_os_sprs);
-
vc->entry_exit_map = 0x101;
vc->in_guest = 0;
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 784ff5429ebc..fa080533bd8d 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -538,6 +538,7 @@ static void save_clear_guest_mmu(struct kvm *kvm, struct kvm_vcpu *vcpu)
int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpcr, u64 *tb)
{
+ struct p9_host_os_sprs host_os_sprs;
struct kvm *kvm = vcpu->kvm;
struct kvm_nested_guest *nested = vcpu->arch.nested;
struct kvmppc_vcore *vc = vcpu->arch.vcore;
@@ -567,9 +568,6 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
vcpu->arch.ceded = 0;
- /* Could avoid mfmsr by passing around, but probably no big deal */
- msr = mfmsr();
-
host_hfscr = mfspr(SPRN_HFSCR);
host_ciabr = mfspr(SPRN_CIABR);
host_dawr0 = mfspr(SPRN_DAWR0);
@@ -584,6 +582,41 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
local_paca->kvm_hstate.host_purr = mfspr(SPRN_PURR);
local_paca->kvm_hstate.host_spurr = mfspr(SPRN_SPURR);
+ switch_pmu_to_guest(vcpu, &host_os_sprs);
+
+ save_p9_host_os_sprs(&host_os_sprs);
+
+ /*
+ * This could be combined with MSR[RI] clearing, but that expands
+ * the unrecoverable window. It would be better to cover unrecoverable
+ * with KVM bad interrupt handling rather than use MSR[RI] at all.
+ *
+ * Much more difficult and less worthwhile to combine with IR/DR
+ * disable.
+ */
+ hard_irq_disable();
+ if (lazy_irq_pending()) {
+ trap = 0;
+ goto out;
+ }
+
+ /* MSR bits may have been cleared by context switch */
+ msr = 0;
+ if (IS_ENABLED(CONFIG_PPC_FPU))
+ msr |= MSR_FP;
+ if (cpu_has_feature(CPU_FTR_ALTIVEC))
+ msr |= MSR_VEC;
+ if (cpu_has_feature(CPU_FTR_VSX))
+ msr |= MSR_VSX;
+ if (cpu_has_feature(CPU_FTR_TM) ||
+ cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
+ msr |= MSR_TM;
+ msr = msr_check_and_set(msr);
+ /* Save MSR for restore. This is after hard disable, so EE is clear. */
+
+ if (unlikely(load_vcpu_state(vcpu, &host_os_sprs)))
+ msr = mfmsr(); /* MSR may have been updated */
+
if (vc->tb_offset) {
u64 new_tb = *tb + vc->tb_offset;
mtspr(SPRN_TBU40, new_tb);
@@ -642,6 +675,14 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
mtspr(SPRN_SPRG2, vcpu->arch.shregs.sprg2);
mtspr(SPRN_SPRG3, vcpu->arch.shregs.sprg3);
+ /*
+ * It might be preferable to load_vcpu_state here, in order to get the
+ * GPR/FP register loads executing in parallel with the previous mtSPR
+ * instructions, but for now that can't be done because the TM handling
+ * in load_vcpu_state can change some SPRs and vcpu state (nip, msr).
+ * But TM could be split out if this would be a significant benefit.
+ */
+
local_paca->kvm_hstate.in_guest = KVM_GUEST_MODE_HV_P9;
/*
@@ -819,6 +860,20 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
vc->dpdes = mfspr(SPRN_DPDES);
vc->vtb = mfspr(SPRN_VTB);
+ save_clear_guest_mmu(kvm, vcpu);
+ switch_mmu_to_host(kvm, host_pidr);
+
+ /*
+ * If we are in real mode, only switch MMU on after the MMU is
+ * switched to host, to avoid the P9_RADIX_PREFETCH_BUG.
+ */
+ if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
+ vcpu->arch.shregs.msr & MSR_TS_MASK)
+ msr |= MSR_TS_S;
+ __mtmsrd(msr, 0);
+
+ store_vcpu_state(vcpu);
+
dec = mfspr(SPRN_DEC);
if (!(lpcr & LPCR_LD)) /* Sign extend if not using large decrementer */
dec = (s32) dec;
@@ -851,6 +906,19 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
mtspr(SPRN_DAWRX1, host_dawrx1);
}
+ mtspr(SPRN_DPDES, 0);
+ if (vc->pcr)
+ mtspr(SPRN_PCR, PCR_MASK);
+
+ /* HDEC must be at least as large as DEC, so decrementer_max fits */
+ mtspr(SPRN_HDEC, decrementer_max);
+
+ timer_rearm_host_dec(*tb);
+
+ restore_p9_host_os_sprs(vcpu, &host_os_sprs);
+
+ local_paca->kvm_hstate.in_guest = KVM_GUEST_MODE_NONE;
+
if (kvm_is_radix(kvm)) {
/*
* Since this is radix, do a eieio; tlbsync; ptesync sequence
@@ -867,26 +935,8 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
if (cpu_has_feature(CPU_FTR_ARCH_31))
asm volatile(PPC_CP_ABORT);
- mtspr(SPRN_DPDES, 0);
- if (vc->pcr)
- mtspr(SPRN_PCR, PCR_MASK);
-
- /* HDEC must be at least as large as DEC, so decrementer_max fits */
- mtspr(SPRN_HDEC, decrementer_max);
-
- save_clear_guest_mmu(kvm, vcpu);
- switch_mmu_to_host(kvm, host_pidr);
- local_paca->kvm_hstate.in_guest = KVM_GUEST_MODE_NONE;
-
- /*
- * If we are in real mode, only switch MMU on after the MMU is
- * switched to host, to avoid the P9_RADIX_PREFETCH_BUG.
- */
- if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
- vcpu->arch.shregs.msr & MSR_TS_MASK)
- msr |= MSR_TS_S;
-
- __mtmsrd(msr, 0);
+out:
+ switch_pmu_to_host(vcpu, &host_os_sprs);
end_timing(vcpu);
--
2.23.0
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox