qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/4] spapr: clean up nested hv
@ 2023-05-03  0:39 Nicholas Piggin
  2023-05-03  0:39 ` [RFC PATCH 1/4] spapr: H_ENTER_NESTED should restore host XER ca field Nicholas Piggin
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Nicholas Piggin @ 2023-05-03  0:39 UTC (permalink / raw)
  To: qemu-ppc; +Cc: Nicholas Piggin, qemu-devel, Harsh Prateek Bora

Something like this is the way I'd been wanting to refactor nested hv.
The state load/store functions and data is (somewhat) abstracted, and
the hcall interface remains in the hcall handlers.

If, hypothetically, you had a new flavour of nested enter hcall that had
some other way of specifying the L2 state to load, then you would
(hopefully) be able to extend and reuse the state struct and load/store
helpers.

Thanks,
Nick

Nicholas Piggin (4):
  spapr: H_ENTER_NESTED should restore host XER ca field
  spapr: Add a nested state struct
  spapr: load and store l2 state with helper functions
  spapr: Move spapr nested HV to a new file

 hw/ppc/meson.build              |   1 +
 hw/ppc/spapr_hcall.c            | 348 +---------------------
 hw/ppc/spapr_nested.c           | 496 ++++++++++++++++++++++++++++++++
 include/hw/ppc/spapr.h          |  61 +---
 include/hw/ppc/spapr_cpu_core.h |   5 +-
 5 files changed, 502 insertions(+), 409 deletions(-)
 create mode 100644 hw/ppc/spapr_nested.c

-- 
2.40.1



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [RFC PATCH 1/4] spapr: H_ENTER_NESTED should restore host XER ca field
  2023-05-03  0:39 [RFC PATCH 0/4] spapr: clean up nested hv Nicholas Piggin
@ 2023-05-03  0:39 ` Nicholas Piggin
  2023-05-05 10:20   ` Harsh Prateek Bora
  2023-05-03  0:39 ` [RFC PATCH 2/4] spapr: Add a nested state struct Nicholas Piggin
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Nicholas Piggin @ 2023-05-03  0:39 UTC (permalink / raw)
  To: qemu-ppc; +Cc: Nicholas Piggin, qemu-devel, Harsh Prateek Bora

Fix missing env->ca restore when going from L2 back to the host.

Fixes: 120f738a467 ("spapr: implement nested-hv capability for the virtual hypervisor")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/ppc/spapr_hcall.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index ec4def62f8..be225adaf6 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1785,6 +1785,7 @@ out_restore_l1:
     env->cfar = spapr_cpu->nested_host_state->cfar;
     env->xer = spapr_cpu->nested_host_state->xer;
     env->so = spapr_cpu->nested_host_state->so;
+    env->ca = spapr_cpu->nested_host_state->ca;
     env->ov = spapr_cpu->nested_host_state->ov;
     env->ov32 = spapr_cpu->nested_host_state->ov32;
     env->ca32 = spapr_cpu->nested_host_state->ca32;
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC PATCH 2/4] spapr: Add a nested state struct
  2023-05-03  0:39 [RFC PATCH 0/4] spapr: clean up nested hv Nicholas Piggin
  2023-05-03  0:39 ` [RFC PATCH 1/4] spapr: H_ENTER_NESTED should restore host XER ca field Nicholas Piggin
@ 2023-05-03  0:39 ` Nicholas Piggin
  2023-05-05 10:54   ` Harsh Prateek Bora
  2023-05-03  0:39 ` [RFC PATCH 3/4] spapr: load and store l2 state with helper functions Nicholas Piggin
  2023-05-03  0:39 ` [RFC PATCH 4/4] spapr: Move spapr nested HV to a new file Nicholas Piggin
  3 siblings, 1 reply; 12+ messages in thread
From: Nicholas Piggin @ 2023-05-03  0:39 UTC (permalink / raw)
  To: qemu-ppc; +Cc: Nicholas Piggin, qemu-devel, Harsh Prateek Bora

Rather than use a copy of CPUPPCState to store the host state while
the environment has been switched to the L2, use a new struct for
this purpose.

Have helper functions to save and load this host state.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/ppc/spapr_hcall.c            | 164 ++++++++++++++++++++++++--------
 include/hw/ppc/spapr_cpu_core.h |   5 +-
 2 files changed, 129 insertions(+), 40 deletions(-)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index be225adaf6..6679150ac7 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1544,6 +1544,126 @@ static target_ulong h_copy_tofrom_guest(PowerPCCPU *cpu,
     return H_FUNCTION;
 }
 
+struct nested_ppc_state {
+    uint64_t gpr[32];
+    uint64_t lr;
+    uint64_t ctr;
+    uint64_t cfar;
+    uint64_t msr;
+    uint64_t nip;
+    uint32_t cr;
+
+    uint64_t xer;
+
+    uint64_t lpcr;
+    uint64_t lpidr;
+    uint64_t pidr;
+    uint64_t pcr;
+    uint64_t dpdes;
+    uint64_t hfscr;
+    uint64_t srr0;
+    uint64_t srr1;
+    uint64_t sprg0;
+    uint64_t sprg1;
+    uint64_t sprg2;
+    uint64_t sprg3;
+    uint64_t ppr;
+
+    int64_t tb_offset;
+};
+
+static void nested_save_state(struct nested_ppc_state *save, PowerPCCPU *cpu)
+{
+    CPUPPCState *env = &cpu->env;
+    uint32_t cr;
+    int i;
+
+    memcpy(save->gpr, env->gpr, sizeof(save->gpr));
+
+    save->lr = env->lr;
+    save->ctr = env->ctr;
+    save->cfar = env->cfar;
+    save->msr = env->msr;
+    save->nip = env->nip;
+
+    cr = 0;
+    for (i = 0; i < 8; i++) {
+        cr |= (env->crf[i] & 15) << (4 * (7 - i));
+    }
+    save->cr = cr;
+
+    save->xer = cpu_read_xer(env);
+
+    save->lpcr = env->spr[SPR_LPCR];
+    save->lpidr = env->spr[SPR_LPIDR];
+    save->pcr = env->spr[SPR_PCR];
+    save->dpdes = env->spr[SPR_DPDES];
+    save->hfscr = env->spr[SPR_HFSCR];
+    save->srr0 = env->spr[SPR_SRR0];
+    save->srr1 = env->spr[SPR_SRR1];
+    save->sprg0 = env->spr[SPR_SPRG0];
+    save->sprg1 = env->spr[SPR_SPRG1];
+    save->sprg2 = env->spr[SPR_SPRG2];
+    save->sprg3 = env->spr[SPR_SPRG3];
+    save->pidr = env->spr[SPR_BOOKS_PID];
+    save->ppr = env->spr[SPR_PPR];
+
+    save->tb_offset = env->tb_env->tb_offset;
+}
+
+static void nested_load_state(PowerPCCPU *cpu, struct nested_ppc_state *load)
+{
+    CPUState *cs = CPU(cpu);
+    CPUPPCState *env = &cpu->env;
+    uint32_t cr;
+    int i;
+
+    memcpy(env->gpr, load->gpr, sizeof(env->gpr));
+
+    env->lr = load->lr;
+    env->ctr = load->ctr;
+    env->cfar = load->cfar;
+    env->msr = load->msr;
+    env->nip = load->nip;
+
+    cr = load->cr;
+    for (i = 7; i >= 0; i--) {
+        env->crf[i] = cr & 15;
+        cr >>= 4;
+    }
+
+    cpu_write_xer(env, load->xer);
+
+    env->spr[SPR_LPCR] = load->lpcr;
+    env->spr[SPR_LPIDR] = load->lpidr;
+    env->spr[SPR_PCR] = load->pcr;
+    env->spr[SPR_DPDES] = load->dpdes;
+    env->spr[SPR_HFSCR] = load->hfscr;
+    env->spr[SPR_SRR0] = load->srr0;
+    env->spr[SPR_SRR1] = load->srr1;
+    env->spr[SPR_SPRG0] = load->sprg0;
+    env->spr[SPR_SPRG1] = load->sprg1;
+    env->spr[SPR_SPRG2] = load->sprg2;
+    env->spr[SPR_SPRG3] = load->sprg3;
+    env->spr[SPR_BOOKS_PID] = load->pidr;
+    env->spr[SPR_PPR] = load->ppr;
+
+    env->tb_env->tb_offset = load->tb_offset;
+
+    /*
+     * MSR updated, compute hflags and possible interrupts.
+     */
+    hreg_compute_hflags(env);
+    ppc_maybe_interrupt(env);
+
+    /*
+     * Nested HV does not tag TLB entries between L1 and L2, so must
+     * flush on transition.
+     */
+    tlb_flush(cs);
+    env->reserve_addr = -1; /* Reset the reservation */
+}
+
 /*
  * When this handler returns, the environment is switched to the L2 guest
  * and TCG begins running that. spapr_exit_nested() performs the switch from
@@ -1593,12 +1713,14 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
         return H_PARAMETER;
     }
 
-    spapr_cpu->nested_host_state = g_try_new(CPUPPCState, 1);
+    spapr_cpu->nested_host_state = g_try_new(struct nested_ppc_state, 1);
     if (!spapr_cpu->nested_host_state) {
         return H_NO_MEM;
     }
 
-    memcpy(spapr_cpu->nested_host_state, env, sizeof(CPUPPCState));
+    assert(env->spr[SPR_LPIDR] == 0);
+    assert(env->spr[SPR_DPDES] == 0);
+    nested_save_state(spapr_cpu->nested_host_state, cpu);
 
     len = sizeof(*regs);
     regs = address_space_map(CPU(cpu)->as, regs_ptr, &len, false,
@@ -1644,7 +1766,6 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
     env->spr[SPR_DPDES] = hv_state.dpdes;
     env->spr[SPR_HFSCR] = hv_state.hfscr;
     hdec = hv_state.hdec_expiry - now;
-    spapr_cpu->nested_tb_offset = hv_state.tb_offset;
     /* TCG does not implement DAWR*, CIABR, PURR, SPURR, IC, VTB, HEIR SPRs*/
     env->spr[SPR_SRR0] = hv_state.srr0;
     env->spr[SPR_SRR1] = hv_state.srr1;
@@ -1670,7 +1791,7 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
      * and it's not obviously worth a new data structure to do it.
      */
 
-    env->tb_env->tb_offset += spapr_cpu->nested_tb_offset;
+    env->tb_env->tb_offset += hv_state.tb_offset;
     spapr_cpu->in_nested = true;
 
     hreg_compute_hflags(env);
@@ -1689,7 +1810,6 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
 
 void spapr_exit_nested(PowerPCCPU *cpu, int excp)
 {
-    CPUState *cs = CPU(cpu);
     CPUPPCState *env = &cpu->env;
     SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
     target_ulong r3_return = env->excp_vectors[excp]; /* hcall return value */
@@ -1778,34 +1898,8 @@ void spapr_exit_nested(PowerPCCPU *cpu, int excp)
     address_space_unmap(CPU(cpu)->as, regs, len, len, true);
 
 out_restore_l1:
-    memcpy(env->gpr, spapr_cpu->nested_host_state->gpr, sizeof(env->gpr));
-    env->lr = spapr_cpu->nested_host_state->lr;
-    env->ctr = spapr_cpu->nested_host_state->ctr;
-    memcpy(env->crf, spapr_cpu->nested_host_state->crf, sizeof(env->crf));
-    env->cfar = spapr_cpu->nested_host_state->cfar;
-    env->xer = spapr_cpu->nested_host_state->xer;
-    env->so = spapr_cpu->nested_host_state->so;
-    env->ca = spapr_cpu->nested_host_state->ca;
-    env->ov = spapr_cpu->nested_host_state->ov;
-    env->ov32 = spapr_cpu->nested_host_state->ov32;
-    env->ca32 = spapr_cpu->nested_host_state->ca32;
-    env->msr = spapr_cpu->nested_host_state->msr;
-    env->nip = spapr_cpu->nested_host_state->nip;
-
     assert(env->spr[SPR_LPIDR] != 0);
-    env->spr[SPR_LPCR] = spapr_cpu->nested_host_state->spr[SPR_LPCR];
-    env->spr[SPR_LPIDR] = spapr_cpu->nested_host_state->spr[SPR_LPIDR];
-    env->spr[SPR_PCR] = spapr_cpu->nested_host_state->spr[SPR_PCR];
-    env->spr[SPR_DPDES] = 0;
-    env->spr[SPR_HFSCR] = spapr_cpu->nested_host_state->spr[SPR_HFSCR];
-    env->spr[SPR_SRR0] = spapr_cpu->nested_host_state->spr[SPR_SRR0];
-    env->spr[SPR_SRR1] = spapr_cpu->nested_host_state->spr[SPR_SRR1];
-    env->spr[SPR_SPRG0] = spapr_cpu->nested_host_state->spr[SPR_SPRG0];
-    env->spr[SPR_SPRG1] = spapr_cpu->nested_host_state->spr[SPR_SPRG1];
-    env->spr[SPR_SPRG2] = spapr_cpu->nested_host_state->spr[SPR_SPRG2];
-    env->spr[SPR_SPRG3] = spapr_cpu->nested_host_state->spr[SPR_SPRG3];
-    env->spr[SPR_BOOKS_PID] = spapr_cpu->nested_host_state->spr[SPR_BOOKS_PID];
-    env->spr[SPR_PPR] = spapr_cpu->nested_host_state->spr[SPR_PPR];
+    nested_load_state(cpu, spapr_cpu->nested_host_state);
 
     /*
      * Return the interrupt vector address from H_ENTER_NESTED to the L1
@@ -1813,14 +1907,8 @@ out_restore_l1:
      */
     env->gpr[3] = r3_return;
 
-    env->tb_env->tb_offset -= spapr_cpu->nested_tb_offset;
     spapr_cpu->in_nested = false;
 
-    hreg_compute_hflags(env);
-    ppc_maybe_interrupt(env);
-    tlb_flush(cs);
-    env->reserve_addr = -1; /* Reset the reservation */
-
     g_free(spapr_cpu->nested_host_state);
     spapr_cpu->nested_host_state = NULL;
 }
diff --git a/include/hw/ppc/spapr_cpu_core.h b/include/hw/ppc/spapr_cpu_core.h
index b560514560..69a52e39b8 100644
--- a/include/hw/ppc/spapr_cpu_core.h
+++ b/include/hw/ppc/spapr_cpu_core.h
@@ -41,6 +41,8 @@ void spapr_cpu_set_entry_state(PowerPCCPU *cpu, target_ulong nip,
                                target_ulong r1, target_ulong r3,
                                target_ulong r4);
 
+struct nested_ppc_state;
+
 typedef struct SpaprCpuState {
     uint64_t vpa_addr;
     uint64_t slb_shadow_addr, slb_shadow_size;
@@ -51,8 +53,7 @@ typedef struct SpaprCpuState {
 
     /* Fields for nested-HV support */
     bool in_nested; /* true while the L2 is executing */
-    CPUPPCState *nested_host_state; /* holds the L1 state while L2 executes */
-    int64_t nested_tb_offset; /* L1->L2 TB offset */
+    struct nested_ppc_state *nested_host_state; /* holds the L1 state while L2 executes */
 } SpaprCpuState;
 
 static inline SpaprCpuState *spapr_cpu_state(PowerPCCPU *cpu)
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC PATCH 3/4] spapr: load and store l2 state with helper functions
  2023-05-03  0:39 [RFC PATCH 0/4] spapr: clean up nested hv Nicholas Piggin
  2023-05-03  0:39 ` [RFC PATCH 1/4] spapr: H_ENTER_NESTED should restore host XER ca field Nicholas Piggin
  2023-05-03  0:39 ` [RFC PATCH 2/4] spapr: Add a nested state struct Nicholas Piggin
@ 2023-05-03  0:39 ` Nicholas Piggin
  2023-05-05 11:03   ` Harsh Prateek Bora
  2023-05-03  0:39 ` [RFC PATCH 4/4] spapr: Move spapr nested HV to a new file Nicholas Piggin
  3 siblings, 1 reply; 12+ messages in thread
From: Nicholas Piggin @ 2023-05-03  0:39 UTC (permalink / raw)
  To: qemu-ppc; +Cc: Nicholas Piggin, qemu-devel, Harsh Prateek Bora

Arguably this is just shuffling around register accesses, but one nice
thing it does is allow the exit to save away the L2 state then switch
the environment to the L1 before copying L2 data back to the L1, which
logically flows more naturally and simplifies the error paths.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/ppc/spapr_hcall.c | 178 +++++++++++++++++++++----------------------
 1 file changed, 85 insertions(+), 93 deletions(-)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 6679150ac7..783a06ba98 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1675,9 +1675,9 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
                                    target_ulong *args)
 {
     PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
-    CPUState *cs = CPU(cpu);
     CPUPPCState *env = &cpu->env;
     SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
+    struct nested_ppc_state l2_state;
     target_ulong hv_ptr = args[0];
     target_ulong regs_ptr = args[1];
     target_ulong hdec, now = cpu_ppc_load_tbl(env);
@@ -1686,8 +1686,6 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
     struct kvmppc_hv_guest_state hv_state;
     struct kvmppc_pt_regs *regs;
     hwaddr len;
-    uint64_t cr;
-    int i;
 
     if (spapr->nested_ptcr == 0) {
         return H_NOT_AVAILABLE;
@@ -1713,6 +1711,10 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
         return H_PARAMETER;
     }
 
+    if (hv_state.lpid == 0) {
+        return H_PARAMETER;
+    }
+
     spapr_cpu->nested_host_state = g_try_new(struct nested_ppc_state, 1);
     if (!spapr_cpu->nested_host_state) {
         return H_NO_MEM;
@@ -1731,51 +1733,49 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
         return H_P2;
     }
 
-    len = sizeof(env->gpr);
+    len = sizeof(l2_state.gpr);
     assert(len == sizeof(regs->gpr));
-    memcpy(env->gpr, regs->gpr, len);
+    memcpy(l2_state.gpr, regs->gpr, len);
 
-    env->lr = regs->link;
-    env->ctr = regs->ctr;
-    cpu_write_xer(env, regs->xer);
-
-    cr = regs->ccr;
-    for (i = 7; i >= 0; i--) {
-        env->crf[i] = cr & 15;
-        cr >>= 4;
-    }
-
-    env->msr = regs->msr;
-    env->nip = regs->nip;
+    l2_state.lr = regs->link;
+    l2_state.ctr = regs->ctr;
+    l2_state.xer = regs->xer;
+    l2_state.cr = regs->ccr;
+    l2_state.msr = regs->msr;
+    l2_state.nip = regs->nip;
 
     address_space_unmap(CPU(cpu)->as, regs, len, len, false);
 
-    env->cfar = hv_state.cfar;
-
-    assert(env->spr[SPR_LPIDR] == 0);
-    env->spr[SPR_LPIDR] = hv_state.lpid;
+    l2_state.cfar = hv_state.cfar;
+    l2_state.lpidr = hv_state.lpid;
 
     lpcr_mask = LPCR_DPFD | LPCR_ILE | LPCR_AIL | LPCR_LD | LPCR_MER;
     lpcr = (env->spr[SPR_LPCR] & ~lpcr_mask) | (hv_state.lpcr & lpcr_mask);
     lpcr |= LPCR_HR | LPCR_UPRT | LPCR_GTSE | LPCR_HVICE | LPCR_HDICE;
     lpcr &= ~LPCR_LPES0;
-    env->spr[SPR_LPCR] = lpcr & pcc->lpcr_mask;
+    l2_state.lpcr = lpcr & pcc->lpcr_mask;
 
-    env->spr[SPR_PCR] = hv_state.pcr;
+    l2_state.pcr = hv_state.pcr;
     /* hv_state.amor is not used */
-    env->spr[SPR_DPDES] = hv_state.dpdes;
-    env->spr[SPR_HFSCR] = hv_state.hfscr;
-    hdec = hv_state.hdec_expiry - now;
+    l2_state.dpdes = hv_state.dpdes;
+    l2_state.hfscr = hv_state.hfscr;
     /* TCG does not implement DAWR*, CIABR, PURR, SPURR, IC, VTB, HEIR SPRs*/
-    env->spr[SPR_SRR0] = hv_state.srr0;
-    env->spr[SPR_SRR1] = hv_state.srr1;
-    env->spr[SPR_SPRG0] = hv_state.sprg[0];
-    env->spr[SPR_SPRG1] = hv_state.sprg[1];
-    env->spr[SPR_SPRG2] = hv_state.sprg[2];
-    env->spr[SPR_SPRG3] = hv_state.sprg[3];
-    env->spr[SPR_BOOKS_PID] = hv_state.pidr;
-    env->spr[SPR_PPR] = hv_state.ppr;
+    l2_state.srr0 = hv_state.srr0;
+    l2_state.srr1 = hv_state.srr1;
+    l2_state.sprg0 = hv_state.sprg[0];
+    l2_state.sprg1 = hv_state.sprg[1];
+    l2_state.sprg2 = hv_state.sprg[2];
+    l2_state.sprg3 = hv_state.sprg[3];
+    l2_state.pidr = hv_state.pidr;
+    l2_state.ppr = hv_state.ppr;
+    l2_state.tb_offset = env->tb_env->tb_offset + hv_state.tb_offset;
 
+    /*
+     * Switch to the nested guest environment and start the "hdec" timer.
+     */
+    nested_load_state(cpu, &l2_state);
+
+    hdec = hv_state.hdec_expiry - now;
     cpu_ppc_hdecr_init(env);
     cpu_ppc_store_hdecr(env, hdec);
 
@@ -1791,14 +1791,8 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
      * and it's not obviously worth a new data structure to do it.
      */
 
-    env->tb_env->tb_offset += hv_state.tb_offset;
     spapr_cpu->in_nested = true;
 
-    hreg_compute_hflags(env);
-    ppc_maybe_interrupt(env);
-    tlb_flush(cs);
-    env->reserve_addr = -1; /* Reset the reservation */
-
     /*
      * The spapr hcall helper sets env->gpr[3] to the return value, but at
      * this point the L1 is not returning from the hcall but rather we
@@ -1812,51 +1806,69 @@ void spapr_exit_nested(PowerPCCPU *cpu, int excp)
 {
     CPUPPCState *env = &cpu->env;
     SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
-    target_ulong r3_return = env->excp_vectors[excp]; /* hcall return value */
+    struct nested_ppc_state l2_state;
     target_ulong hv_ptr = spapr_cpu->nested_host_state->gpr[4];
     target_ulong regs_ptr = spapr_cpu->nested_host_state->gpr[5];
+    target_ulong hsrr0, hsrr1, hdar, asdr, hdsisr;
     struct kvmppc_hv_guest_state *hvstate;
     struct kvmppc_pt_regs *regs;
     hwaddr len;
-    uint64_t cr;
-    int i;
 
     assert(spapr_cpu->in_nested);
 
+    nested_save_state(&l2_state, cpu);
+    hsrr0 = env->spr[SPR_HSRR0];
+    hsrr1 = env->spr[SPR_HSRR1];
+    hdar = env->spr[SPR_HDAR];
+    hdsisr = env->spr[SPR_HDSISR];
+    asdr = env->spr[SPR_ASDR];
+
+    /*
+     * Switch back to the host environment (including for any error).
+     */
+    assert(env->spr[SPR_LPIDR] != 0);
+    nested_load_state(cpu, spapr_cpu->nested_host_state);
+    env->gpr[3] = env->excp_vectors[excp]; /* hcall return value */
+
     cpu_ppc_hdecr_exit(env);
 
+    spapr_cpu->in_nested = false;
+
+    g_free(spapr_cpu->nested_host_state);
+    spapr_cpu->nested_host_state = NULL;
+
     len = sizeof(*hvstate);
     hvstate = address_space_map(CPU(cpu)->as, hv_ptr, &len, true,
                                 MEMTXATTRS_UNSPECIFIED);
     if (len != sizeof(*hvstate)) {
         address_space_unmap(CPU(cpu)->as, hvstate, len, 0, true);
-        r3_return = H_PARAMETER;
-        goto out_restore_l1;
+        env->gpr[3] = H_PARAMETER;
+	return;
     }
 
-    hvstate->cfar = env->cfar;
-    hvstate->lpcr = env->spr[SPR_LPCR];
-    hvstate->pcr = env->spr[SPR_PCR];
-    hvstate->dpdes = env->spr[SPR_DPDES];
-    hvstate->hfscr = env->spr[SPR_HFSCR];
+    hvstate->cfar = l2_state.cfar;
+    hvstate->lpcr = l2_state.lpcr;
+    hvstate->pcr = l2_state.pcr;
+    hvstate->dpdes = l2_state.dpdes;
+    hvstate->hfscr = l2_state.hfscr;
 
     if (excp == POWERPC_EXCP_HDSI) {
-        hvstate->hdar = env->spr[SPR_HDAR];
-        hvstate->hdsisr = env->spr[SPR_HDSISR];
-        hvstate->asdr = env->spr[SPR_ASDR];
+        hvstate->hdar = hdar;
+        hvstate->hdsisr = hdsisr;
+        hvstate->asdr = asdr;
     } else if (excp == POWERPC_EXCP_HISI) {
-        hvstate->asdr = env->spr[SPR_ASDR];
+        hvstate->asdr = asdr;
     }
 
     /* HEIR should be implemented for HV mode and saved here. */
-    hvstate->srr0 = env->spr[SPR_SRR0];
-    hvstate->srr1 = env->spr[SPR_SRR1];
-    hvstate->sprg[0] = env->spr[SPR_SPRG0];
-    hvstate->sprg[1] = env->spr[SPR_SPRG1];
-    hvstate->sprg[2] = env->spr[SPR_SPRG2];
-    hvstate->sprg[3] = env->spr[SPR_SPRG3];
-    hvstate->pidr = env->spr[SPR_BOOKS_PID];
-    hvstate->ppr = env->spr[SPR_PPR];
+    hvstate->srr0 = l2_state.srr0;
+    hvstate->srr1 = l2_state.srr1;
+    hvstate->sprg[0] = l2_state.sprg0;
+    hvstate->sprg[1] = l2_state.sprg1;
+    hvstate->sprg[2] = l2_state.sprg2;
+    hvstate->sprg[3] = l2_state.sprg3;
+    hvstate->pidr = l2_state.pidr;
+    hvstate->ppr = l2_state.ppr;
 
     /* Is it okay to specify write length larger than actual data written? */
     address_space_unmap(CPU(cpu)->as, hvstate, len, len, true);
@@ -1866,51 +1878,31 @@ void spapr_exit_nested(PowerPCCPU *cpu, int excp)
                                 MEMTXATTRS_UNSPECIFIED);
     if (!regs || len != sizeof(*regs)) {
         address_space_unmap(CPU(cpu)->as, regs, len, 0, true);
-        r3_return = H_P2;
-        goto out_restore_l1;
+        env->gpr[3] = H_P2;
+	return;
     }
 
     len = sizeof(env->gpr);
     assert(len == sizeof(regs->gpr));
-    memcpy(regs->gpr, env->gpr, len);
+    memcpy(regs->gpr, l2_state.gpr, len);
 
-    regs->link = env->lr;
-    regs->ctr = env->ctr;
-    regs->xer = cpu_read_xer(env);
-
-    cr = 0;
-    for (i = 0; i < 8; i++) {
-        cr |= (env->crf[i] & 15) << (4 * (7 - i));
-    }
-    regs->ccr = cr;
+    regs->link = l2_state.lr;
+    regs->ctr = l2_state.ctr;
+    regs->xer = l2_state.xer;
+    regs->ccr = l2_state.cr;
 
     if (excp == POWERPC_EXCP_MCHECK ||
         excp == POWERPC_EXCP_RESET ||
         excp == POWERPC_EXCP_SYSCALL) {
-        regs->nip = env->spr[SPR_SRR0];
-        regs->msr = env->spr[SPR_SRR1] & env->msr_mask;
+        regs->nip = l2_state.srr0;
+        regs->msr = l2_state.srr1 & env->msr_mask;
     } else {
-        regs->nip = env->spr[SPR_HSRR0];
-        regs->msr = env->spr[SPR_HSRR1] & env->msr_mask;
+        regs->nip = hsrr0;
+        regs->msr = hsrr1 & env->msr_mask;
     }
 
     /* Is it okay to specify write length larger than actual data written? */
     address_space_unmap(CPU(cpu)->as, regs, len, len, true);
-
-out_restore_l1:
-    assert(env->spr[SPR_LPIDR] != 0);
-    nested_load_state(cpu, spapr_cpu->nested_host_state);
-
-    /*
-     * Return the interrupt vector address from H_ENTER_NESTED to the L1
-     * (or error code).
-     */
-    env->gpr[3] = r3_return;
-
-    spapr_cpu->in_nested = false;
-
-    g_free(spapr_cpu->nested_host_state);
-    spapr_cpu->nested_host_state = NULL;
 }
 
 static void hypercall_register_nested(void)
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC PATCH 4/4] spapr: Move spapr nested HV to a new file
  2023-05-03  0:39 [RFC PATCH 0/4] spapr: clean up nested hv Nicholas Piggin
                   ` (2 preceding siblings ...)
  2023-05-03  0:39 ` [RFC PATCH 3/4] spapr: load and store l2 state with helper functions Nicholas Piggin
@ 2023-05-03  0:39 ` Nicholas Piggin
  2023-05-05 11:09   ` Harsh Prateek Bora
  3 siblings, 1 reply; 12+ messages in thread
From: Nicholas Piggin @ 2023-05-03  0:39 UTC (permalink / raw)
  To: qemu-ppc; +Cc: Nicholas Piggin, qemu-devel, Harsh Prateek Bora

Create spapr_nested.c for the nested HV implementation (modulo small
pieces in MMU and exception handling).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/ppc/meson.build     |   1 +
 hw/ppc/spapr_hcall.c   | 429 +----------------------------------
 hw/ppc/spapr_nested.c  | 496 +++++++++++++++++++++++++++++++++++++++++
 include/hw/ppc/spapr.h |  61 +----
 4 files changed, 499 insertions(+), 488 deletions(-)
 create mode 100644 hw/ppc/spapr_nested.c

diff --git a/hw/ppc/meson.build b/hw/ppc/meson.build
index c927337da0..a313d4b964 100644
--- a/hw/ppc/meson.build
+++ b/hw/ppc/meson.build
@@ -15,6 +15,7 @@ ppc_ss.add(when: 'CONFIG_PSERIES', if_true: files(
   'spapr_vio.c',
   'spapr_events.c',
   'spapr_hcall.c',
+  'spapr_nested.c',
   'spapr_iommu.c',
   'spapr_rtas.c',
   'spapr_pci.c',
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 783a06ba98..dbdcbb0e9d 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1496,444 +1496,17 @@ target_ulong spapr_hypercall(PowerPCCPU *cpu, target_ulong opcode,
 }
 
 #ifdef CONFIG_TCG
-#define PRTS_MASK      0x1f
-
-static target_ulong h_set_ptbl(PowerPCCPU *cpu,
-                               SpaprMachineState *spapr,
-                               target_ulong opcode,
-                               target_ulong *args)
-{
-    target_ulong ptcr = args[0];
-
-    if (!spapr_get_cap(spapr, SPAPR_CAP_NESTED_KVM_HV)) {
-        return H_FUNCTION;
-    }
-
-    if ((ptcr & PRTS_MASK) + 12 - 4 > 12) {
-        return H_PARAMETER;
-    }
-
-    spapr->nested_ptcr = ptcr; /* Save new partition table */
-
-    return H_SUCCESS;
-}
-
-static target_ulong h_tlb_invalidate(PowerPCCPU *cpu,
-                                     SpaprMachineState *spapr,
-                                     target_ulong opcode,
-                                     target_ulong *args)
-{
-    /*
-     * The spapr virtual hypervisor nested HV implementation retains no L2
-     * translation state except for TLB. And the TLB is always invalidated
-     * across L1<->L2 transitions, so nothing is required here.
-     */
-
-    return H_SUCCESS;
-}
-
-static target_ulong h_copy_tofrom_guest(PowerPCCPU *cpu,
-                                        SpaprMachineState *spapr,
-                                        target_ulong opcode,
-                                        target_ulong *args)
-{
-    /*
-     * This HCALL is not required, L1 KVM will take a slow path and walk the
-     * page tables manually to do the data copy.
-     */
-    return H_FUNCTION;
-}
-
-struct nested_ppc_state {
-    uint64_t gpr[32];
-    uint64_t lr;
-    uint64_t ctr;
-    uint64_t cfar;
-    uint64_t msr;
-    uint64_t nip;
-    uint32_t cr;
-
-    uint64_t xer;
-
-    uint64_t lpcr;
-    uint64_t lpidr;
-    uint64_t pidr;
-    uint64_t pcr;
-    uint64_t dpdes;
-    uint64_t hfscr;
-    uint64_t srr0;
-    uint64_t srr1;
-    uint64_t sprg0;
-    uint64_t sprg1;
-    uint64_t sprg2;
-    uint64_t sprg3;
-    uint64_t ppr;
-
-    int64_t tb_offset;
-};
-
-static void nested_save_state(struct nested_ppc_state *save, PowerPCCPU *cpu)
-{
-    CPUPPCState *env = &cpu->env;
-    uint32_t cr;
-    int i;
-
-    memcpy(save->gpr, env->gpr, sizeof(save->gpr));
-
-    save->lr = env->lr;
-    save->ctr = env->ctr;
-    save->cfar = env->cfar;
-    save->msr = env->msr;
-    save->nip = env->nip;
-
-    cr = 0;
-    for (i = 0; i < 8; i++) {
-        cr |= (env->crf[i] & 15) << (4 * (7 - i));
-    }
-    save->cr = cr;
-
-    save->xer = cpu_read_xer(env);
-
-    save->lpcr = env->spr[SPR_LPCR];
-    save->lpidr = env->spr[SPR_LPIDR];
-    save->pcr = env->spr[SPR_PCR];
-    save->dpdes = env->spr[SPR_DPDES];
-    save->hfscr = env->spr[SPR_HFSCR];
-    save->srr0 = env->spr[SPR_SRR0];
-    save->srr1 = env->spr[SPR_SRR1];
-    save->sprg0 = env->spr[SPR_SPRG0];
-    save->sprg1 = env->spr[SPR_SPRG1];
-    save->sprg2 = env->spr[SPR_SPRG2];
-    save->sprg3 = env->spr[SPR_SPRG3];
-    save->pidr = env->spr[SPR_BOOKS_PID];
-    save->ppr = env->spr[SPR_PPR];
-
-    save->tb_offset = env->tb_env->tb_offset;
-}
-
-static void nested_load_state(PowerPCCPU *cpu, struct nested_ppc_state *load)
-{
-    CPUState *cs = CPU(cpu);
-    CPUPPCState *env = &cpu->env;
-    uint32_t cr;
-    int i;
-
-    memcpy(env->gpr, load->gpr, sizeof(env->gpr));
-
-    env->lr = load->lr;
-    env->ctr = load->ctr;
-    env->cfar = load->cfar;
-    env->msr = load->msr;
-    env->nip = load->nip;
-
-    cr = load->cr;
-    for (i = 7; i >= 0; i--) {
-        env->crf[i] = cr & 15;
-        cr >>= 4;
-    }
-
-    cpu_write_xer(env, load->xer);
-
-    env->spr[SPR_LPCR] = load->lpcr;
-    env->spr[SPR_LPIDR] = load->lpidr;
-    env->spr[SPR_PCR] = load->pcr;
-    env->spr[SPR_DPDES] = load->dpdes;
-    env->spr[SPR_HFSCR] = load->hfscr;
-    env->spr[SPR_SRR0] = load->srr0;
-    env->spr[SPR_SRR1] = load->srr1;
-    env->spr[SPR_SPRG0] = load->sprg0;
-    env->spr[SPR_SPRG1] = load->sprg1;
-    env->spr[SPR_SPRG2] = load->sprg2;
-    env->spr[SPR_SPRG3] = load->sprg3;
-    env->spr[SPR_BOOKS_PID] = load->pidr;
-    env->spr[SPR_PPR] = load->ppr;
-
-    env->tb_env->tb_offset = load->tb_offset;
-
-    /*
-     * MSR updated, compute hflags and possible interrupts.
-     */
-    hreg_compute_hflags(env);
-    ppc_maybe_interrupt(env);
-
-    /*
-     * Nested HV does not tag TLB entries between L1 and L2, so must
-     * flush on transition.
-     */
-    tlb_flush(cs);
-    env->reserve_addr = -1; /* Reset the reservation */
-}
-
-/*
- * When this handler returns, the environment is switched to the L2 guest
- * and TCG begins running that. spapr_exit_nested() performs the switch from
- * L2 back to L1 and returns from the H_ENTER_NESTED hcall.
- */
-static target_ulong h_enter_nested(PowerPCCPU *cpu,
-                                   SpaprMachineState *spapr,
-                                   target_ulong opcode,
-                                   target_ulong *args)
-{
-    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
-    CPUPPCState *env = &cpu->env;
-    SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
-    struct nested_ppc_state l2_state;
-    target_ulong hv_ptr = args[0];
-    target_ulong regs_ptr = args[1];
-    target_ulong hdec, now = cpu_ppc_load_tbl(env);
-    target_ulong lpcr, lpcr_mask;
-    struct kvmppc_hv_guest_state *hvstate;
-    struct kvmppc_hv_guest_state hv_state;
-    struct kvmppc_pt_regs *regs;
-    hwaddr len;
-
-    if (spapr->nested_ptcr == 0) {
-        return H_NOT_AVAILABLE;
-    }
-
-    len = sizeof(*hvstate);
-    hvstate = address_space_map(CPU(cpu)->as, hv_ptr, &len, false,
-                                MEMTXATTRS_UNSPECIFIED);
-    if (len != sizeof(*hvstate)) {
-        address_space_unmap(CPU(cpu)->as, hvstate, len, 0, false);
-        return H_PARAMETER;
-    }
-
-    memcpy(&hv_state, hvstate, len);
-
-    address_space_unmap(CPU(cpu)->as, hvstate, len, len, false);
-
-    /*
-     * We accept versions 1 and 2. Version 2 fields are unused because TCG
-     * does not implement DAWR*.
-     */
-    if (hv_state.version > HV_GUEST_STATE_VERSION) {
-        return H_PARAMETER;
-    }
-
-    if (hv_state.lpid == 0) {
-        return H_PARAMETER;
-    }
-
-    spapr_cpu->nested_host_state = g_try_new(struct nested_ppc_state, 1);
-    if (!spapr_cpu->nested_host_state) {
-        return H_NO_MEM;
-    }
-
-    assert(env->spr[SPR_LPIDR] == 0);
-    assert(env->spr[SPR_DPDES] == 0);
-    nested_save_state(spapr_cpu->nested_host_state, cpu);
-
-    len = sizeof(*regs);
-    regs = address_space_map(CPU(cpu)->as, regs_ptr, &len, false,
-                                MEMTXATTRS_UNSPECIFIED);
-    if (!regs || len != sizeof(*regs)) {
-        address_space_unmap(CPU(cpu)->as, regs, len, 0, false);
-        g_free(spapr_cpu->nested_host_state);
-        return H_P2;
-    }
-
-    len = sizeof(l2_state.gpr);
-    assert(len == sizeof(regs->gpr));
-    memcpy(l2_state.gpr, regs->gpr, len);
-
-    l2_state.lr = regs->link;
-    l2_state.ctr = regs->ctr;
-    l2_state.xer = regs->xer;
-    l2_state.cr = regs->ccr;
-    l2_state.msr = regs->msr;
-    l2_state.nip = regs->nip;
-
-    address_space_unmap(CPU(cpu)->as, regs, len, len, false);
-
-    l2_state.cfar = hv_state.cfar;
-    l2_state.lpidr = hv_state.lpid;
-
-    lpcr_mask = LPCR_DPFD | LPCR_ILE | LPCR_AIL | LPCR_LD | LPCR_MER;
-    lpcr = (env->spr[SPR_LPCR] & ~lpcr_mask) | (hv_state.lpcr & lpcr_mask);
-    lpcr |= LPCR_HR | LPCR_UPRT | LPCR_GTSE | LPCR_HVICE | LPCR_HDICE;
-    lpcr &= ~LPCR_LPES0;
-    l2_state.lpcr = lpcr & pcc->lpcr_mask;
-
-    l2_state.pcr = hv_state.pcr;
-    /* hv_state.amor is not used */
-    l2_state.dpdes = hv_state.dpdes;
-    l2_state.hfscr = hv_state.hfscr;
-    /* TCG does not implement DAWR*, CIABR, PURR, SPURR, IC, VTB, HEIR SPRs*/
-    l2_state.srr0 = hv_state.srr0;
-    l2_state.srr1 = hv_state.srr1;
-    l2_state.sprg0 = hv_state.sprg[0];
-    l2_state.sprg1 = hv_state.sprg[1];
-    l2_state.sprg2 = hv_state.sprg[2];
-    l2_state.sprg3 = hv_state.sprg[3];
-    l2_state.pidr = hv_state.pidr;
-    l2_state.ppr = hv_state.ppr;
-    l2_state.tb_offset = env->tb_env->tb_offset + hv_state.tb_offset;
-
-    /*
-     * Switch to the nested guest environment and start the "hdec" timer.
-     */
-    nested_load_state(cpu, &l2_state);
-
-    hdec = hv_state.hdec_expiry - now;
-    cpu_ppc_hdecr_init(env);
-    cpu_ppc_store_hdecr(env, hdec);
-
-    /*
-     * The hv_state.vcpu_token is not needed. It is used by the KVM
-     * implementation to remember which L2 vCPU last ran on which physical
-     * CPU so as to invalidate process scope translations if it is moved
-     * between physical CPUs. For now TLBs are always flushed on L1<->L2
-     * transitions so this is not a problem.
-     *
-     * Could validate that the same vcpu_token does not attempt to run on
-     * different L1 vCPUs at the same time, but that would be a L1 KVM bug
-     * and it's not obviously worth a new data structure to do it.
-     */
-
-    spapr_cpu->in_nested = true;
-
-    /*
-     * The spapr hcall helper sets env->gpr[3] to the return value, but at
-     * this point the L1 is not returning from the hcall but rather we
-     * start running the L2, so r3 must not be clobbered, so return env->gpr[3]
-     * to leave it unchanged.
-     */
-    return env->gpr[3];
-}
-
-void spapr_exit_nested(PowerPCCPU *cpu, int excp)
-{
-    CPUPPCState *env = &cpu->env;
-    SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
-    struct nested_ppc_state l2_state;
-    target_ulong hv_ptr = spapr_cpu->nested_host_state->gpr[4];
-    target_ulong regs_ptr = spapr_cpu->nested_host_state->gpr[5];
-    target_ulong hsrr0, hsrr1, hdar, asdr, hdsisr;
-    struct kvmppc_hv_guest_state *hvstate;
-    struct kvmppc_pt_regs *regs;
-    hwaddr len;
-
-    assert(spapr_cpu->in_nested);
-
-    nested_save_state(&l2_state, cpu);
-    hsrr0 = env->spr[SPR_HSRR0];
-    hsrr1 = env->spr[SPR_HSRR1];
-    hdar = env->spr[SPR_HDAR];
-    hdsisr = env->spr[SPR_HDSISR];
-    asdr = env->spr[SPR_ASDR];
-
-    /*
-     * Switch back to the host environment (including for any error).
-     */
-    assert(env->spr[SPR_LPIDR] != 0);
-    nested_load_state(cpu, spapr_cpu->nested_host_state);
-    env->gpr[3] = env->excp_vectors[excp]; /* hcall return value */
-
-    cpu_ppc_hdecr_exit(env);
-
-    spapr_cpu->in_nested = false;
-
-    g_free(spapr_cpu->nested_host_state);
-    spapr_cpu->nested_host_state = NULL;
-
-    len = sizeof(*hvstate);
-    hvstate = address_space_map(CPU(cpu)->as, hv_ptr, &len, true,
-                                MEMTXATTRS_UNSPECIFIED);
-    if (len != sizeof(*hvstate)) {
-        address_space_unmap(CPU(cpu)->as, hvstate, len, 0, true);
-        env->gpr[3] = H_PARAMETER;
-	return;
-    }
-
-    hvstate->cfar = l2_state.cfar;
-    hvstate->lpcr = l2_state.lpcr;
-    hvstate->pcr = l2_state.pcr;
-    hvstate->dpdes = l2_state.dpdes;
-    hvstate->hfscr = l2_state.hfscr;
-
-    if (excp == POWERPC_EXCP_HDSI) {
-        hvstate->hdar = hdar;
-        hvstate->hdsisr = hdsisr;
-        hvstate->asdr = asdr;
-    } else if (excp == POWERPC_EXCP_HISI) {
-        hvstate->asdr = asdr;
-    }
-
-    /* HEIR should be implemented for HV mode and saved here. */
-    hvstate->srr0 = l2_state.srr0;
-    hvstate->srr1 = l2_state.srr1;
-    hvstate->sprg[0] = l2_state.sprg0;
-    hvstate->sprg[1] = l2_state.sprg1;
-    hvstate->sprg[2] = l2_state.sprg2;
-    hvstate->sprg[3] = l2_state.sprg3;
-    hvstate->pidr = l2_state.pidr;
-    hvstate->ppr = l2_state.ppr;
-
-    /* Is it okay to specify write length larger than actual data written? */
-    address_space_unmap(CPU(cpu)->as, hvstate, len, len, true);
-
-    len = sizeof(*regs);
-    regs = address_space_map(CPU(cpu)->as, regs_ptr, &len, true,
-                                MEMTXATTRS_UNSPECIFIED);
-    if (!regs || len != sizeof(*regs)) {
-        address_space_unmap(CPU(cpu)->as, regs, len, 0, true);
-        env->gpr[3] = H_P2;
-	return;
-    }
-
-    len = sizeof(env->gpr);
-    assert(len == sizeof(regs->gpr));
-    memcpy(regs->gpr, l2_state.gpr, len);
-
-    regs->link = l2_state.lr;
-    regs->ctr = l2_state.ctr;
-    regs->xer = l2_state.xer;
-    regs->ccr = l2_state.cr;
-
-    if (excp == POWERPC_EXCP_MCHECK ||
-        excp == POWERPC_EXCP_RESET ||
-        excp == POWERPC_EXCP_SYSCALL) {
-        regs->nip = l2_state.srr0;
-        regs->msr = l2_state.srr1 & env->msr_mask;
-    } else {
-        regs->nip = hsrr0;
-        regs->msr = hsrr1 & env->msr_mask;
-    }
-
-    /* Is it okay to specify write length larger than actual data written? */
-    address_space_unmap(CPU(cpu)->as, regs, len, len, true);
-}
-
-static void hypercall_register_nested(void)
-{
-    spapr_register_hypercall(KVMPPC_H_SET_PARTITION_TABLE, h_set_ptbl);
-    spapr_register_hypercall(KVMPPC_H_ENTER_NESTED, h_enter_nested);
-    spapr_register_hypercall(KVMPPC_H_TLB_INVALIDATE, h_tlb_invalidate);
-    spapr_register_hypercall(KVMPPC_H_COPY_TOFROM_GUEST, h_copy_tofrom_guest);
-}
-
 static void hypercall_register_softmmu(void)
 {
     /* DO NOTHING */
 }
 #else
-void spapr_exit_nested(PowerPCCPU *cpu, int excp)
-{
-    g_assert_not_reached();
-}
-
 static target_ulong h_softmmu(PowerPCCPU *cpu, SpaprMachineState *spapr,
                             target_ulong opcode, target_ulong *args)
 {
     g_assert_not_reached();
 }
 
-static void hypercall_register_nested(void)
-{
-    /* DO NOTHING */
-}
-
 static void hypercall_register_softmmu(void)
 {
     /* hcall-pft */
@@ -2003,7 +1576,7 @@ static void hypercall_register_types(void)
 
     spapr_register_hypercall(KVMPPC_H_UPDATE_DT, h_update_dt);
 
-    hypercall_register_nested();
+    spapr_register_nested();
 }
 
 type_init(hypercall_register_types)
diff --git a/hw/ppc/spapr_nested.c b/hw/ppc/spapr_nested.c
new file mode 100644
index 0000000000..c06dd8903c
--- /dev/null
+++ b/hw/ppc/spapr_nested.c
@@ -0,0 +1,496 @@
+#include "qemu/osdep.h"
+#include "qemu/cutils.h"
+#include "exec/exec-all.h"
+#include "helper_regs.h"
+#include "hw/ppc/ppc.h"
+#include "hw/ppc/spapr.h"
+#include "hw/ppc/spapr_cpu_core.h"
+
+/*
+ * Register state for entering a nested guest with H_ENTER_NESTED.
+ * New member must be added at the end.
+ */
+struct kvmppc_hv_guest_state {
+    uint64_t version;      /* version of this structure layout, must be first */
+    uint32_t lpid;
+    uint32_t vcpu_token;
+    /* These registers are hypervisor privileged (at least for writing) */
+    uint64_t lpcr;
+    uint64_t pcr;
+    uint64_t amor;
+    uint64_t dpdes;
+    uint64_t hfscr;
+    int64_t tb_offset;
+    uint64_t dawr0;
+    uint64_t dawrx0;
+    uint64_t ciabr;
+    uint64_t hdec_expiry;
+    uint64_t purr;
+    uint64_t spurr;
+    uint64_t ic;
+    uint64_t vtb;
+    uint64_t hdar;
+    uint64_t hdsisr;
+    uint64_t heir;
+    uint64_t asdr;
+    /* These are OS privileged but need to be set late in guest entry */
+    uint64_t srr0;
+    uint64_t srr1;
+    uint64_t sprg[4];
+    uint64_t pidr;
+    uint64_t cfar;
+    uint64_t ppr;
+    /* Version 1 ends here */
+    uint64_t dawr1;
+    uint64_t dawrx1;
+    /* Version 2 ends here */
+};
+
+/* Latest version of hv_guest_state structure */
+#define HV_GUEST_STATE_VERSION  2
+
+/* Linux 64-bit powerpc pt_regs struct, used by nested HV */
+struct kvmppc_pt_regs {
+    uint64_t gpr[32];
+    uint64_t nip;
+    uint64_t msr;
+    uint64_t orig_gpr3;    /* Used for restarting system calls */
+    uint64_t ctr;
+    uint64_t link;
+    uint64_t xer;
+    uint64_t ccr;
+    uint64_t softe;        /* Soft enabled/disabled */
+    uint64_t trap;         /* Reason for being here */
+    uint64_t dar;          /* Fault registers */
+    uint64_t dsisr;        /* on 4xx/Book-E used for ESR */
+    uint64_t result;       /* Result of a system call */
+};
+
+#ifdef CONFIG_TCG
+#define PRTS_MASK      0x1f
+
+static target_ulong h_set_ptbl(PowerPCCPU *cpu,
+                               SpaprMachineState *spapr,
+                               target_ulong opcode,
+                               target_ulong *args)
+{
+    target_ulong ptcr = args[0];
+
+    if (!spapr_get_cap(spapr, SPAPR_CAP_NESTED_KVM_HV)) {
+        return H_FUNCTION;
+    }
+
+    if ((ptcr & PRTS_MASK) + 12 - 4 > 12) {
+        return H_PARAMETER;
+    }
+
+    spapr->nested_ptcr = ptcr; /* Save new partition table */
+
+    return H_SUCCESS;
+}
+
+static target_ulong h_tlb_invalidate(PowerPCCPU *cpu,
+                                     SpaprMachineState *spapr,
+                                     target_ulong opcode,
+                                     target_ulong *args)
+{
+    /*
+     * The spapr virtual hypervisor nested HV implementation retains no L2
+     * translation state except for TLB. And the TLB is always invalidated
+     * across L1<->L2 transitions, so nothing is required here.
+     */
+
+    return H_SUCCESS;
+}
+
+static target_ulong h_copy_tofrom_guest(PowerPCCPU *cpu,
+                                        SpaprMachineState *spapr,
+                                        target_ulong opcode,
+                                        target_ulong *args)
+{
+    /*
+     * This HCALL is not required, L1 KVM will take a slow path and walk the
+     * page tables manually to do the data copy.
+     */
+    return H_FUNCTION;
+}
+
+struct nested_ppc_state {
+    uint64_t gpr[32];
+    uint64_t lr;
+    uint64_t ctr;
+    uint64_t cfar;
+    uint64_t msr;
+    uint64_t nip;
+    uint32_t cr;
+
+    uint64_t xer;
+
+    uint64_t lpcr;
+    uint64_t lpidr;
+    uint64_t pidr;
+    uint64_t pcr;
+    uint64_t dpdes;
+    uint64_t hfscr;
+    uint64_t srr0;
+    uint64_t srr1;
+    uint64_t sprg0;
+    uint64_t sprg1;
+    uint64_t sprg2;
+    uint64_t sprg3;
+    uint64_t ppr;
+
+    int64_t tb_offset;
+};
+
+static void nested_save_state(struct nested_ppc_state *save, PowerPCCPU *cpu)
+{
+    CPUPPCState *env = &cpu->env;
+    uint32_t cr;
+    int i;
+
+    memcpy(save->gpr, env->gpr, sizeof(save->gpr));
+
+    save->lr = env->lr;
+    save->ctr = env->ctr;
+    save->cfar = env->cfar;
+    save->msr = env->msr;
+    save->nip = env->nip;
+
+    cr = 0;
+    for (i = 0; i < 8; i++) {
+        cr |= (env->crf[i] & 15) << (4 * (7 - i));
+    }
+    save->cr = cr;
+
+    save->xer = cpu_read_xer(env);
+
+    save->lpcr = env->spr[SPR_LPCR];
+    save->lpidr = env->spr[SPR_LPIDR];
+    save->pcr = env->spr[SPR_PCR];
+    save->dpdes = env->spr[SPR_DPDES];
+    save->hfscr = env->spr[SPR_HFSCR];
+    save->srr0 = env->spr[SPR_SRR0];
+    save->srr1 = env->spr[SPR_SRR1];
+    save->sprg0 = env->spr[SPR_SPRG0];
+    save->sprg1 = env->spr[SPR_SPRG1];
+    save->sprg2 = env->spr[SPR_SPRG2];
+    save->sprg3 = env->spr[SPR_SPRG3];
+    save->pidr = env->spr[SPR_BOOKS_PID];
+    save->ppr = env->spr[SPR_PPR];
+
+    save->tb_offset = env->tb_env->tb_offset;
+}
+
+static void nested_load_state(PowerPCCPU *cpu, struct nested_ppc_state *load)
+{
+    CPUState *cs = CPU(cpu);
+    CPUPPCState *env = &cpu->env;
+    uint32_t cr;
+    int i;
+
+    memcpy(env->gpr, load->gpr, sizeof(env->gpr));
+
+    env->lr = load->lr;
+    env->ctr = load->ctr;
+    env->cfar = load->cfar;
+    env->msr = load->msr;
+    env->nip = load->nip;
+
+    cr = load->cr;
+    for (i = 7; i >= 0; i--) {
+        env->crf[i] = cr & 15;
+        cr >>= 4;
+    }
+
+    cpu_write_xer(env, load->xer);
+
+    env->spr[SPR_LPCR] = load->lpcr;
+    env->spr[SPR_LPIDR] = load->lpidr;
+    env->spr[SPR_PCR] = load->pcr;
+    env->spr[SPR_DPDES] = load->dpdes;
+    env->spr[SPR_HFSCR] = load->hfscr;
+    env->spr[SPR_SRR0] = load->srr0;
+    env->spr[SPR_SRR1] = load->srr1;
+    env->spr[SPR_SPRG0] = load->sprg0;
+    env->spr[SPR_SPRG1] = load->sprg1;
+    env->spr[SPR_SPRG2] = load->sprg2;
+    env->spr[SPR_SPRG3] = load->sprg3;
+    env->spr[SPR_BOOKS_PID] = load->pidr;
+    env->spr[SPR_PPR] = load->ppr;
+
+    env->tb_env->tb_offset = load->tb_offset;
+
+    /*
+     * MSR updated, compute hflags and possible interrupts.
+     */
+    hreg_compute_hflags(env);
+    ppc_maybe_interrupt(env);
+
+    /*
+     * Nested HV does not tag TLB entries between L1 and L2, so must
+     * flush on transition.
+     */
+    tlb_flush(cs);
+    env->reserve_addr = -1; /* Reset the reservation */
+}
+
+/*
+ * When this handler returns, the environment is switched to the L2 guest
+ * and TCG begins running that. spapr_exit_nested() performs the switch from
+ * L2 back to L1 and returns from the H_ENTER_NESTED hcall.
+ */
+static target_ulong h_enter_nested(PowerPCCPU *cpu,
+                                   SpaprMachineState *spapr,
+                                   target_ulong opcode,
+                                   target_ulong *args)
+{
+    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
+    CPUPPCState *env = &cpu->env;
+    SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
+    struct nested_ppc_state l2_state;
+    target_ulong hv_ptr = args[0];
+    target_ulong regs_ptr = args[1];
+    target_ulong hdec, now = cpu_ppc_load_tbl(env);
+    target_ulong lpcr, lpcr_mask;
+    struct kvmppc_hv_guest_state *hvstate;
+    struct kvmppc_hv_guest_state hv_state;
+    struct kvmppc_pt_regs *regs;
+    hwaddr len;
+
+    if (spapr->nested_ptcr == 0) {
+        return H_NOT_AVAILABLE;
+    }
+
+    len = sizeof(*hvstate);
+    hvstate = address_space_map(CPU(cpu)->as, hv_ptr, &len, false,
+                                MEMTXATTRS_UNSPECIFIED);
+    if (len != sizeof(*hvstate)) {
+        address_space_unmap(CPU(cpu)->as, hvstate, len, 0, false);
+        return H_PARAMETER;
+    }
+
+    memcpy(&hv_state, hvstate, len);
+
+    address_space_unmap(CPU(cpu)->as, hvstate, len, len, false);
+
+    /*
+     * We accept versions 1 and 2. Version 2 fields are unused because TCG
+     * does not implement DAWR*.
+     */
+    if (hv_state.version > HV_GUEST_STATE_VERSION) {
+        return H_PARAMETER;
+    }
+
+    if (hv_state.lpid == 0) {
+        return H_PARAMETER;
+    }
+
+    spapr_cpu->nested_host_state = g_try_new(struct nested_ppc_state, 1);
+    if (!spapr_cpu->nested_host_state) {
+        return H_NO_MEM;
+    }
+
+    assert(env->spr[SPR_LPIDR] == 0);
+    assert(env->spr[SPR_DPDES] == 0);
+    nested_save_state(spapr_cpu->nested_host_state, cpu);
+
+    len = sizeof(*regs);
+    regs = address_space_map(CPU(cpu)->as, regs_ptr, &len, false,
+                                MEMTXATTRS_UNSPECIFIED);
+    if (!regs || len != sizeof(*regs)) {
+        address_space_unmap(CPU(cpu)->as, regs, len, 0, false);
+        g_free(spapr_cpu->nested_host_state);
+        return H_P2;
+    }
+
+    len = sizeof(l2_state.gpr);
+    assert(len == sizeof(regs->gpr));
+    memcpy(l2_state.gpr, regs->gpr, len);
+
+    l2_state.lr = regs->link;
+    l2_state.ctr = regs->ctr;
+    l2_state.xer = regs->xer;
+    l2_state.cr = regs->ccr;
+    l2_state.msr = regs->msr;
+    l2_state.nip = regs->nip;
+
+    address_space_unmap(CPU(cpu)->as, regs, len, len, false);
+
+    l2_state.cfar = hv_state.cfar;
+    l2_state.lpidr = hv_state.lpid;
+
+    lpcr_mask = LPCR_DPFD | LPCR_ILE | LPCR_AIL | LPCR_LD | LPCR_MER;
+    lpcr = (env->spr[SPR_LPCR] & ~lpcr_mask) | (hv_state.lpcr & lpcr_mask);
+    lpcr |= LPCR_HR | LPCR_UPRT | LPCR_GTSE | LPCR_HVICE | LPCR_HDICE;
+    lpcr &= ~LPCR_LPES0;
+    l2_state.lpcr = lpcr & pcc->lpcr_mask;
+
+    l2_state.pcr = hv_state.pcr;
+    /* hv_state.amor is not used */
+    l2_state.dpdes = hv_state.dpdes;
+    l2_state.hfscr = hv_state.hfscr;
+    /* TCG does not implement DAWR*, CIABR, PURR, SPURR, IC, VTB, HEIR SPRs*/
+    l2_state.srr0 = hv_state.srr0;
+    l2_state.srr1 = hv_state.srr1;
+    l2_state.sprg0 = hv_state.sprg[0];
+    l2_state.sprg1 = hv_state.sprg[1];
+    l2_state.sprg2 = hv_state.sprg[2];
+    l2_state.sprg3 = hv_state.sprg[3];
+    l2_state.pidr = hv_state.pidr;
+    l2_state.ppr = hv_state.ppr;
+    l2_state.tb_offset = env->tb_env->tb_offset + hv_state.tb_offset;
+
+    /*
+     * Switch to the nested guest environment and start the "hdec" timer.
+     */
+    nested_load_state(cpu, &l2_state);
+
+    hdec = hv_state.hdec_expiry - now;
+    cpu_ppc_hdecr_init(env);
+    cpu_ppc_store_hdecr(env, hdec);
+
+    /*
+     * The hv_state.vcpu_token is not needed. It is used by the KVM
+     * implementation to remember which L2 vCPU last ran on which physical
+     * CPU so as to invalidate process scope translations if it is moved
+     * between physical CPUs. For now TLBs are always flushed on L1<->L2
+     * transitions so this is not a problem.
+     *
+     * Could validate that the same vcpu_token does not attempt to run on
+     * different L1 vCPUs at the same time, but that would be a L1 KVM bug
+     * and it's not obviously worth a new data structure to do it.
+     */
+
+    spapr_cpu->in_nested = true;
+
+    /*
+     * The spapr hcall helper sets env->gpr[3] to the return value, but at
+     * this point the L1 is not returning from the hcall but rather we
+     * start running the L2, so r3 must not be clobbered, so return env->gpr[3]
+     * to leave it unchanged.
+     */
+    return env->gpr[3];
+}
+
+void spapr_exit_nested(PowerPCCPU *cpu, int excp)
+{
+    CPUPPCState *env = &cpu->env;
+    SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
+    struct nested_ppc_state l2_state;
+    target_ulong hv_ptr = spapr_cpu->nested_host_state->gpr[4];
+    target_ulong regs_ptr = spapr_cpu->nested_host_state->gpr[5];
+    target_ulong hsrr0, hsrr1, hdar, asdr, hdsisr;
+    struct kvmppc_hv_guest_state *hvstate;
+    struct kvmppc_pt_regs *regs;
+    hwaddr len;
+
+    assert(spapr_cpu->in_nested);
+
+    nested_save_state(&l2_state, cpu);
+    hsrr0 = env->spr[SPR_HSRR0];
+    hsrr1 = env->spr[SPR_HSRR1];
+    hdar = env->spr[SPR_HDAR];
+    hdsisr = env->spr[SPR_HDSISR];
+    asdr = env->spr[SPR_ASDR];
+
+    /*
+     * Switch back to the host environment (including for any error).
+     */
+    assert(env->spr[SPR_LPIDR] != 0);
+    nested_load_state(cpu, spapr_cpu->nested_host_state);
+    env->gpr[3] = env->excp_vectors[excp]; /* hcall return value */
+
+    cpu_ppc_hdecr_exit(env);
+
+    spapr_cpu->in_nested = false;
+
+    g_free(spapr_cpu->nested_host_state);
+    spapr_cpu->nested_host_state = NULL;
+
+    len = sizeof(*hvstate);
+    hvstate = address_space_map(CPU(cpu)->as, hv_ptr, &len, true,
+                                MEMTXATTRS_UNSPECIFIED);
+    if (len != sizeof(*hvstate)) {
+        address_space_unmap(CPU(cpu)->as, hvstate, len, 0, true);
+        env->gpr[3] = H_PARAMETER;
+	return;
+    }
+
+    hvstate->cfar = l2_state.cfar;
+    hvstate->lpcr = l2_state.lpcr;
+    hvstate->pcr = l2_state.pcr;
+    hvstate->dpdes = l2_state.dpdes;
+    hvstate->hfscr = l2_state.hfscr;
+
+    if (excp == POWERPC_EXCP_HDSI) {
+        hvstate->hdar = hdar;
+        hvstate->hdsisr = hdsisr;
+        hvstate->asdr = asdr;
+    } else if (excp == POWERPC_EXCP_HISI) {
+        hvstate->asdr = asdr;
+    }
+
+    /* HEIR should be implemented for HV mode and saved here. */
+    hvstate->srr0 = l2_state.srr0;
+    hvstate->srr1 = l2_state.srr1;
+    hvstate->sprg[0] = l2_state.sprg0;
+    hvstate->sprg[1] = l2_state.sprg1;
+    hvstate->sprg[2] = l2_state.sprg2;
+    hvstate->sprg[3] = l2_state.sprg3;
+    hvstate->pidr = l2_state.pidr;
+    hvstate->ppr = l2_state.ppr;
+
+    /* Is it okay to specify write length larger than actual data written? */
+    address_space_unmap(CPU(cpu)->as, hvstate, len, len, true);
+
+    len = sizeof(*regs);
+    regs = address_space_map(CPU(cpu)->as, regs_ptr, &len, true,
+                                MEMTXATTRS_UNSPECIFIED);
+    if (!regs || len != sizeof(*regs)) {
+        address_space_unmap(CPU(cpu)->as, regs, len, 0, true);
+        env->gpr[3] = H_P2;
+	return;
+    }
+
+    len = sizeof(env->gpr);
+    assert(len == sizeof(regs->gpr));
+    memcpy(regs->gpr, l2_state.gpr, len);
+
+    regs->link = l2_state.lr;
+    regs->ctr = l2_state.ctr;
+    regs->xer = l2_state.xer;
+    regs->ccr = l2_state.cr;
+
+    if (excp == POWERPC_EXCP_MCHECK ||
+        excp == POWERPC_EXCP_RESET ||
+        excp == POWERPC_EXCP_SYSCALL) {
+        regs->nip = l2_state.srr0;
+        regs->msr = l2_state.srr1 & env->msr_mask;
+    } else {
+        regs->nip = hsrr0;
+        regs->msr = hsrr1 & env->msr_mask;
+    }
+
+    /* Is it okay to specify write length larger than actual data written? */
+    address_space_unmap(CPU(cpu)->as, regs, len, len, true);
+}
+
+void spapr_register_nested(void)
+{
+    spapr_register_hypercall(KVMPPC_H_SET_PARTITION_TABLE, h_set_ptbl);
+    spapr_register_hypercall(KVMPPC_H_ENTER_NESTED, h_enter_nested);
+    spapr_register_hypercall(KVMPPC_H_TLB_INVALIDATE, h_tlb_invalidate);
+    spapr_register_hypercall(KVMPPC_H_COPY_TOFROM_GUEST, h_copy_tofrom_guest);
+}
+#else
+void spapr_exit_nested(PowerPCCPU *cpu, int excp)
+{
+    g_assert_not_reached();
+}
+
+void spapr_register_nested(void)
+{
+    /* DO NOTHING */
+}
+#endif
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 5c8aabd444..84cdde75da 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -619,66 +619,6 @@ struct SpaprMachineState {
 #define SVM_H_TPM_COMM              0xEF10
 #define SVM_HCALL_MAX               SVM_H_TPM_COMM
 
-/*
- * Register state for entering a nested guest with H_ENTER_NESTED.
- * New member must be added at the end.
- */
-struct kvmppc_hv_guest_state {
-    uint64_t version;      /* version of this structure layout, must be first */
-    uint32_t lpid;
-    uint32_t vcpu_token;
-    /* These registers are hypervisor privileged (at least for writing) */
-    uint64_t lpcr;
-    uint64_t pcr;
-    uint64_t amor;
-    uint64_t dpdes;
-    uint64_t hfscr;
-    int64_t tb_offset;
-    uint64_t dawr0;
-    uint64_t dawrx0;
-    uint64_t ciabr;
-    uint64_t hdec_expiry;
-    uint64_t purr;
-    uint64_t spurr;
-    uint64_t ic;
-    uint64_t vtb;
-    uint64_t hdar;
-    uint64_t hdsisr;
-    uint64_t heir;
-    uint64_t asdr;
-    /* These are OS privileged but need to be set late in guest entry */
-    uint64_t srr0;
-    uint64_t srr1;
-    uint64_t sprg[4];
-    uint64_t pidr;
-    uint64_t cfar;
-    uint64_t ppr;
-    /* Version 1 ends here */
-    uint64_t dawr1;
-    uint64_t dawrx1;
-    /* Version 2 ends here */
-};
-
-/* Latest version of hv_guest_state structure */
-#define HV_GUEST_STATE_VERSION  2
-
-/* Linux 64-bit powerpc pt_regs struct, used by nested HV */
-struct kvmppc_pt_regs {
-    uint64_t gpr[32];
-    uint64_t nip;
-    uint64_t msr;
-    uint64_t orig_gpr3;    /* Used for restarting system calls */
-    uint64_t ctr;
-    uint64_t link;
-    uint64_t xer;
-    uint64_t ccr;
-    uint64_t softe;        /* Soft enabled/disabled */
-    uint64_t trap;         /* Reason for being here */
-    uint64_t dar;          /* Fault registers */
-    uint64_t dsisr;        /* on 4xx/Book-E used for ESR */
-    uint64_t result;       /* Result of a system call */
-};
-
 typedef struct SpaprDeviceTreeUpdateHeader {
     uint32_t version_id;
 } SpaprDeviceTreeUpdateHeader;
@@ -696,6 +636,7 @@ void spapr_register_hypercall(target_ulong opcode, spapr_hcall_fn fn);
 target_ulong spapr_hypercall(PowerPCCPU *cpu, target_ulong opcode,
                              target_ulong *args);
 
+void spapr_register_nested(void);
 void spapr_exit_nested(PowerPCCPU *cpu, int excp);
 
 target_ulong softmmu_resize_hpt_prepare(PowerPCCPU *cpu, SpaprMachineState *spapr,
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 1/4] spapr: H_ENTER_NESTED should restore host XER ca field
  2023-05-03  0:39 ` [RFC PATCH 1/4] spapr: H_ENTER_NESTED should restore host XER ca field Nicholas Piggin
@ 2023-05-05 10:20   ` Harsh Prateek Bora
  0 siblings, 0 replies; 12+ messages in thread
From: Harsh Prateek Bora @ 2023-05-05 10:20 UTC (permalink / raw)
  To: Nicholas Piggin, qemu-ppc; +Cc: qemu-devel, Harsh Prateek Bora

<correcting my email in CC>

On 5/3/23 06:09, Nicholas Piggin wrote:
> Fix missing env->ca restore when going from L2 back to the host.
> 
> Fixes: 120f738a467 ("spapr: implement nested-hv capability for the virtual hypervisor")
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   hw/ppc/spapr_hcall.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> index ec4def62f8..be225adaf6 100644
> --- a/hw/ppc/spapr_hcall.c
> +++ b/hw/ppc/spapr_hcall.c
> @@ -1785,6 +1785,7 @@ out_restore_l1:
>       env->cfar = spapr_cpu->nested_host_state->cfar;
>       env->xer = spapr_cpu->nested_host_state->xer;
>       env->so = spapr_cpu->nested_host_state->so;
> +    env->ca = spapr_cpu->nested_host_state->ca;
>       env->ov = spapr_cpu->nested_host_state->ov;
>       env->ov32 = spapr_cpu->nested_host_state->ov32;
>       env->ca32 = spapr_cpu->nested_host_state->ca32;

Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 2/4] spapr: Add a nested state struct
  2023-05-03  0:39 ` [RFC PATCH 2/4] spapr: Add a nested state struct Nicholas Piggin
@ 2023-05-05 10:54   ` Harsh Prateek Bora
  2023-05-13  3:27     ` Nicholas Piggin
  0 siblings, 1 reply; 12+ messages in thread
From: Harsh Prateek Bora @ 2023-05-05 10:54 UTC (permalink / raw)
  To: Nicholas Piggin, qemu-ppc; +Cc: qemu-devel, Harsh Prateek Bora

<correcting my email in CC>

On 5/3/23 06:09, Nicholas Piggin wrote:
> Rather than use a copy of CPUPPCState to store the host state while
> the environment has been switched to the L2, use a new struct for
> this purpose.
> 
> Have helper functions to save and load this host state.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   hw/ppc/spapr_hcall.c            | 164 ++++++++++++++++++++++++--------
>   include/hw/ppc/spapr_cpu_core.h |   5 +-
>   2 files changed, 129 insertions(+), 40 deletions(-)
> 
> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> index be225adaf6..6679150ac7 100644
> --- a/hw/ppc/spapr_hcall.c
> +++ b/hw/ppc/spapr_hcall.c
> @@ -1544,6 +1544,126 @@ static target_ulong h_copy_tofrom_guest(PowerPCCPU *cpu,
>       return H_FUNCTION;
>   }
>   
> +struct nested_ppc_state {
> +    uint64_t gpr[32];
> +    uint64_t lr;
> +    uint64_t ctr;
> +    uint64_t cfar;
> +    uint64_t msr;
> +    uint64_t nip;
> +    uint32_t cr;
> +
> +    uint64_t xer;
> +
> +    uint64_t lpcr;
> +    uint64_t lpidr;
> +    uint64_t pidr;
> +    uint64_t pcr;
> +    uint64_t dpdes;
> +    uint64_t hfscr;
> +    uint64_t srr0;
> +    uint64_t srr1;
> +    uint64_t sprg0;
> +    uint64_t sprg1;
> +    uint64_t sprg2;
> +    uint64_t sprg3;
> +    uint64_t ppr;
> +
> +    int64_t tb_offset;
> +};
> +
> +static void nested_save_state(struct nested_ppc_state *save, PowerPCCPU *cpu)
> +{
> +    CPUPPCState *env = &cpu->env;
> +    uint32_t cr;
> +    int i;
> +
> +    memcpy(save->gpr, env->gpr, sizeof(save->gpr));
> +
> +    save->lr = env->lr;
> +    save->ctr = env->ctr;
> +    save->cfar = env->cfar;
> +    save->msr = env->msr;
> +    save->nip = env->nip;
> +
> +    cr = 0;
> +    for (i = 0; i < 8; i++) {
> +        cr |= (env->crf[i] & 15) << (4 * (7 - i));
> +    }
> +    save->cr = cr;
> +
> +    save->xer = cpu_read_xer(env);
> +
> +    save->lpcr = env->spr[SPR_LPCR];
> +    save->lpidr = env->spr[SPR_LPIDR];
> +    save->pcr = env->spr[SPR_PCR];
> +    save->dpdes = env->spr[SPR_DPDES];
> +    save->hfscr = env->spr[SPR_HFSCR];
> +    save->srr0 = env->spr[SPR_SRR0];
> +    save->srr1 = env->spr[SPR_SRR1];
> +    save->sprg0 = env->spr[SPR_SPRG0];
> +    save->sprg1 = env->spr[SPR_SPRG1];
> +    save->sprg2 = env->spr[SPR_SPRG2];
> +    save->sprg3 = env->spr[SPR_SPRG3];
> +    save->pidr = env->spr[SPR_BOOKS_PID];
> +    save->ppr = env->spr[SPR_PPR];
> +
> +    save->tb_offset = env->tb_env->tb_offset;
> +}
> +
> +static void nested_load_state(PowerPCCPU *cpu, struct nested_ppc_state *load)
> +{
> +    CPUState *cs = CPU(cpu);
> +    CPUPPCState *env = &cpu->env;
> +    uint32_t cr;
> +    int i;
> +
> +    memcpy(env->gpr, load->gpr, sizeof(env->gpr));
> +
> +    env->lr = load->lr;
> +    env->ctr = load->ctr;
> +    env->cfar = load->cfar;
> +    env->msr = load->msr;
> +    env->nip = load->nip;
> +
> +    cr = load->cr;
> +    for (i = 7; i >= 0; i--) {
> +        env->crf[i] = cr & 15;
> +        cr >>= 4;
> +    }
> +
> +    cpu_write_xer(env, load->xer);
> +
> +    env->spr[SPR_LPCR] = load->lpcr;
> +    env->spr[SPR_LPIDR] = load->lpidr;
> +    env->spr[SPR_PCR] = load->pcr;
> +    env->spr[SPR_DPDES] = load->dpdes;
> +    env->spr[SPR_HFSCR] = load->hfscr;
> +    env->spr[SPR_SRR0] = load->srr0;
> +    env->spr[SPR_SRR1] = load->srr1;
> +    env->spr[SPR_SPRG0] = load->sprg0;
> +    env->spr[SPR_SPRG1] = load->sprg1;
> +    env->spr[SPR_SPRG2] = load->sprg2;
> +    env->spr[SPR_SPRG3] = load->sprg3;
> +    env->spr[SPR_BOOKS_PID] = load->pidr;
> +    env->spr[SPR_PPR] = load->ppr;
> +
> +    env->tb_env->tb_offset = load->tb_offset;
> +
> +    /*
> +     * MSR updated, compute hflags and possible interrupts.
> +     */
> +    hreg_compute_hflags(env);
> +    ppc_maybe_interrupt(env);
> +
> +    /*
> +     * Nested HV does not tag TLB entries between L1 and L2, so must
> +     * flush on transition.
> +     */
> +    tlb_flush(cs);
> +    env->reserve_addr = -1; /* Reset the reservation */
> +}
> +
>   /*
>    * When this handler returns, the environment is switched to the L2 guest
>    * and TCG begins running that. spapr_exit_nested() performs the switch from
> @@ -1593,12 +1713,14 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
>           return H_PARAMETER;
>       }
>   
> -    spapr_cpu->nested_host_state = g_try_new(CPUPPCState, 1);
> +    spapr_cpu->nested_host_state = g_try_new(struct nested_ppc_state, 1);
>       if (!spapr_cpu->nested_host_state) {
>           return H_NO_MEM;
>       }
>   
> -    memcpy(spapr_cpu->nested_host_state, env, sizeof(CPUPPCState));
> +    assert(env->spr[SPR_LPIDR] == 0);
> +    assert(env->spr[SPR_DPDES] == 0);
> +    nested_save_state(spapr_cpu->nested_host_state, cpu);
>   
Ideally, we may want to save entire env for L1 host, while switching to 
L2 rather than keeping a subset of it for 2 reasons:
  - keeping enitre L1 env ensures it remains untouched by L2 during L2 
execution (shouldnt allow L2 to modify remaining L1 env bits unexpectedly)
  - I see some of the registers are retained only for L1 (so, ca, ov32, 
ca32, etc) but not for L2 (got missed in nested_load_state helper in 
this patch), are they not really needed anymore? Previous patch 
introduced one of them though.

regards,
Harsh
>       len = sizeof(*regs);
>       regs = address_space_map(CPU(cpu)->as, regs_ptr, &len, false,
> @@ -1644,7 +1766,6 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
>       env->spr[SPR_DPDES] = hv_state.dpdes;
>       env->spr[SPR_HFSCR] = hv_state.hfscr;
>       hdec = hv_state.hdec_expiry - now;
> -    spapr_cpu->nested_tb_offset = hv_state.tb_offset;
>       /* TCG does not implement DAWR*, CIABR, PURR, SPURR, IC, VTB, HEIR SPRs*/
>       env->spr[SPR_SRR0] = hv_state.srr0;
>       env->spr[SPR_SRR1] = hv_state.srr1;
> @@ -1670,7 +1791,7 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
>        * and it's not obviously worth a new data structure to do it.
>        */
>   
> -    env->tb_env->tb_offset += spapr_cpu->nested_tb_offset;
> +    env->tb_env->tb_offset += hv_state.tb_offset;
>       spapr_cpu->in_nested = true;
>   
>       hreg_compute_hflags(env);
> @@ -1689,7 +1810,6 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
>   
>   void spapr_exit_nested(PowerPCCPU *cpu, int excp)
>   {
> -    CPUState *cs = CPU(cpu);
>       CPUPPCState *env = &cpu->env;
>       SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
>       target_ulong r3_return = env->excp_vectors[excp]; /* hcall return value */
> @@ -1778,34 +1898,8 @@ void spapr_exit_nested(PowerPCCPU *cpu, int excp)
>       address_space_unmap(CPU(cpu)->as, regs, len, len, true);
>   
>   out_restore_l1:
> -    memcpy(env->gpr, spapr_cpu->nested_host_state->gpr, sizeof(env->gpr));
> -    env->lr = spapr_cpu->nested_host_state->lr;
> -    env->ctr = spapr_cpu->nested_host_state->ctr;
> -    memcpy(env->crf, spapr_cpu->nested_host_state->crf, sizeof(env->crf));
> -    env->cfar = spapr_cpu->nested_host_state->cfar;
> -    env->xer = spapr_cpu->nested_host_state->xer;
> -    env->so = spapr_cpu->nested_host_state->so;
> -    env->ca = spapr_cpu->nested_host_state->ca;
> -    env->ov = spapr_cpu->nested_host_state->ov;
> -    env->ov32 = spapr_cpu->nested_host_state->ov32;
> -    env->ca32 = spapr_cpu->nested_host_state->ca32;
> -    env->msr = spapr_cpu->nested_host_state->msr;
> -    env->nip = spapr_cpu->nested_host_state->nip;
> -
>       assert(env->spr[SPR_LPIDR] != 0);
> -    env->spr[SPR_LPCR] = spapr_cpu->nested_host_state->spr[SPR_LPCR];
> -    env->spr[SPR_LPIDR] = spapr_cpu->nested_host_state->spr[SPR_LPIDR];
> -    env->spr[SPR_PCR] = spapr_cpu->nested_host_state->spr[SPR_PCR];
> -    env->spr[SPR_DPDES] = 0;
> -    env->spr[SPR_HFSCR] = spapr_cpu->nested_host_state->spr[SPR_HFSCR];
> -    env->spr[SPR_SRR0] = spapr_cpu->nested_host_state->spr[SPR_SRR0];
> -    env->spr[SPR_SRR1] = spapr_cpu->nested_host_state->spr[SPR_SRR1];
> -    env->spr[SPR_SPRG0] = spapr_cpu->nested_host_state->spr[SPR_SPRG0];
> -    env->spr[SPR_SPRG1] = spapr_cpu->nested_host_state->spr[SPR_SPRG1];
> -    env->spr[SPR_SPRG2] = spapr_cpu->nested_host_state->spr[SPR_SPRG2];
> -    env->spr[SPR_SPRG3] = spapr_cpu->nested_host_state->spr[SPR_SPRG3];
> -    env->spr[SPR_BOOKS_PID] = spapr_cpu->nested_host_state->spr[SPR_BOOKS_PID];
> -    env->spr[SPR_PPR] = spapr_cpu->nested_host_state->spr[SPR_PPR];
> +    nested_load_state(cpu, spapr_cpu->nested_host_state);
>   
>       /*
>        * Return the interrupt vector address from H_ENTER_NESTED to the L1
> @@ -1813,14 +1907,8 @@ out_restore_l1:
>        */
>       env->gpr[3] = r3_return;
>   
> -    env->tb_env->tb_offset -= spapr_cpu->nested_tb_offset;
>       spapr_cpu->in_nested = false;
>   
> -    hreg_compute_hflags(env);
> -    ppc_maybe_interrupt(env);
> -    tlb_flush(cs);
> -    env->reserve_addr = -1; /* Reset the reservation */
> -
>       g_free(spapr_cpu->nested_host_state);
>       spapr_cpu->nested_host_state = NULL;
>   }
> diff --git a/include/hw/ppc/spapr_cpu_core.h b/include/hw/ppc/spapr_cpu_core.h
> index b560514560..69a52e39b8 100644
> --- a/include/hw/ppc/spapr_cpu_core.h
> +++ b/include/hw/ppc/spapr_cpu_core.h
> @@ -41,6 +41,8 @@ void spapr_cpu_set_entry_state(PowerPCCPU *cpu, target_ulong nip,
>                                  target_ulong r1, target_ulong r3,
>                                  target_ulong r4);
>   
> +struct nested_ppc_state;
> +
>   typedef struct SpaprCpuState {
>       uint64_t vpa_addr;
>       uint64_t slb_shadow_addr, slb_shadow_size;
> @@ -51,8 +53,7 @@ typedef struct SpaprCpuState {
>   
>       /* Fields for nested-HV support */
>       bool in_nested; /* true while the L2 is executing */
> -    CPUPPCState *nested_host_state; /* holds the L1 state while L2 executes */
> -    int64_t nested_tb_offset; /* L1->L2 TB offset */
> +    struct nested_ppc_state *nested_host_state; /* holds the L1 state while L2 executes */
>   } SpaprCpuState;
>   
>   static inline SpaprCpuState *spapr_cpu_state(PowerPCCPU *cpu)


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 3/4] spapr: load and store l2 state with helper functions
  2023-05-03  0:39 ` [RFC PATCH 3/4] spapr: load and store l2 state with helper functions Nicholas Piggin
@ 2023-05-05 11:03   ` Harsh Prateek Bora
  2023-05-13  3:30     ` Nicholas Piggin
  0 siblings, 1 reply; 12+ messages in thread
From: Harsh Prateek Bora @ 2023-05-05 11:03 UTC (permalink / raw)
  To: Nicholas Piggin, qemu-ppc; +Cc: qemu-devel, Harsh Prateek Bora

<correcting my email in CC>

On 5/3/23 06:09, Nicholas Piggin wrote:
> Arguably this is just shuffling around register accesses, but one nice
> thing it does is allow the exit to save away the L2 state then switch
> the environment to the L1 before copying L2 data back to the L1, which
> logically flows more naturally and simplifies the error paths.
> 
The supposed advantage you have mentioned is coming at the cost of 
double copy (env -> l2_state, switch to L1, l2_state -> hvstate/ptregs), 
but previously we were just doing a single copy that directly conveyed 
it to L1 before switching to L1. Additional copy means additional delay 
in transition of L1/L2. Not sure if it's worth it?

regards,
Harsh

> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   hw/ppc/spapr_hcall.c | 178 +++++++++++++++++++++----------------------
>   1 file changed, 85 insertions(+), 93 deletions(-)
> 
> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> index 6679150ac7..783a06ba98 100644
> --- a/hw/ppc/spapr_hcall.c
> +++ b/hw/ppc/spapr_hcall.c
> @@ -1675,9 +1675,9 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
>                                      target_ulong *args)
>   {
>       PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
> -    CPUState *cs = CPU(cpu);
>       CPUPPCState *env = &cpu->env;
>       SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
> +    struct nested_ppc_state l2_state;
>       target_ulong hv_ptr = args[0];
>       target_ulong regs_ptr = args[1];
>       target_ulong hdec, now = cpu_ppc_load_tbl(env);
> @@ -1686,8 +1686,6 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
>       struct kvmppc_hv_guest_state hv_state;
>       struct kvmppc_pt_regs *regs;
>       hwaddr len;
> -    uint64_t cr;
> -    int i;
>   
>       if (spapr->nested_ptcr == 0) {
>           return H_NOT_AVAILABLE;
> @@ -1713,6 +1711,10 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
>           return H_PARAMETER;
>       }
>   
> +    if (hv_state.lpid == 0) {
> +        return H_PARAMETER;
> +    }
> +
>       spapr_cpu->nested_host_state = g_try_new(struct nested_ppc_state, 1);
>       if (!spapr_cpu->nested_host_state) {
>           return H_NO_MEM;
> @@ -1731,51 +1733,49 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
>           return H_P2;
>       }
>   
> -    len = sizeof(env->gpr);
> +    len = sizeof(l2_state.gpr);
>       assert(len == sizeof(regs->gpr));
> -    memcpy(env->gpr, regs->gpr, len);
> +    memcpy(l2_state.gpr, regs->gpr, len);
>   
> -    env->lr = regs->link;
> -    env->ctr = regs->ctr;
> -    cpu_write_xer(env, regs->xer);
> -
> -    cr = regs->ccr;
> -    for (i = 7; i >= 0; i--) {
> -        env->crf[i] = cr & 15;
> -        cr >>= 4;
> -    }
> -
> -    env->msr = regs->msr;
> -    env->nip = regs->nip;
> +    l2_state.lr = regs->link;
> +    l2_state.ctr = regs->ctr;
> +    l2_state.xer = regs->xer;
> +    l2_state.cr = regs->ccr;
> +    l2_state.msr = regs->msr;
> +    l2_state.nip = regs->nip;
>   
>       address_space_unmap(CPU(cpu)->as, regs, len, len, false);
>   
> -    env->cfar = hv_state.cfar;
> -
> -    assert(env->spr[SPR_LPIDR] == 0);
> -    env->spr[SPR_LPIDR] = hv_state.lpid;
> +    l2_state.cfar = hv_state.cfar;
> +    l2_state.lpidr = hv_state.lpid;
>   
>       lpcr_mask = LPCR_DPFD | LPCR_ILE | LPCR_AIL | LPCR_LD | LPCR_MER;
>       lpcr = (env->spr[SPR_LPCR] & ~lpcr_mask) | (hv_state.lpcr & lpcr_mask);
>       lpcr |= LPCR_HR | LPCR_UPRT | LPCR_GTSE | LPCR_HVICE | LPCR_HDICE;
>       lpcr &= ~LPCR_LPES0;
> -    env->spr[SPR_LPCR] = lpcr & pcc->lpcr_mask;
> +    l2_state.lpcr = lpcr & pcc->lpcr_mask;
>   
> -    env->spr[SPR_PCR] = hv_state.pcr;
> +    l2_state.pcr = hv_state.pcr;
>       /* hv_state.amor is not used */
> -    env->spr[SPR_DPDES] = hv_state.dpdes;
> -    env->spr[SPR_HFSCR] = hv_state.hfscr;
> -    hdec = hv_state.hdec_expiry - now;
> +    l2_state.dpdes = hv_state.dpdes;
> +    l2_state.hfscr = hv_state.hfscr;
>       /* TCG does not implement DAWR*, CIABR, PURR, SPURR, IC, VTB, HEIR SPRs*/
> -    env->spr[SPR_SRR0] = hv_state.srr0;
> -    env->spr[SPR_SRR1] = hv_state.srr1;
> -    env->spr[SPR_SPRG0] = hv_state.sprg[0];
> -    env->spr[SPR_SPRG1] = hv_state.sprg[1];
> -    env->spr[SPR_SPRG2] = hv_state.sprg[2];
> -    env->spr[SPR_SPRG3] = hv_state.sprg[3];
> -    env->spr[SPR_BOOKS_PID] = hv_state.pidr;
> -    env->spr[SPR_PPR] = hv_state.ppr;
> +    l2_state.srr0 = hv_state.srr0;
> +    l2_state.srr1 = hv_state.srr1;
> +    l2_state.sprg0 = hv_state.sprg[0];
> +    l2_state.sprg1 = hv_state.sprg[1];
> +    l2_state.sprg2 = hv_state.sprg[2];
> +    l2_state.sprg3 = hv_state.sprg[3];
> +    l2_state.pidr = hv_state.pidr;
> +    l2_state.ppr = hv_state.ppr;
> +    l2_state.tb_offset = env->tb_env->tb_offset + hv_state.tb_offset;
>   
> +    /*
> +     * Switch to the nested guest environment and start the "hdec" timer.
> +     */
> +    nested_load_state(cpu, &l2_state);
> +
> +    hdec = hv_state.hdec_expiry - now;
>       cpu_ppc_hdecr_init(env);
>       cpu_ppc_store_hdecr(env, hdec);
>   
> @@ -1791,14 +1791,8 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
>        * and it's not obviously worth a new data structure to do it.
>        */
>   
> -    env->tb_env->tb_offset += hv_state.tb_offset;
>       spapr_cpu->in_nested = true;
>   
> -    hreg_compute_hflags(env);
> -    ppc_maybe_interrupt(env);
> -    tlb_flush(cs);
> -    env->reserve_addr = -1; /* Reset the reservation */
> -
>       /*
>        * The spapr hcall helper sets env->gpr[3] to the return value, but at
>        * this point the L1 is not returning from the hcall but rather we
> @@ -1812,51 +1806,69 @@ void spapr_exit_nested(PowerPCCPU *cpu, int excp)
>   {
>       CPUPPCState *env = &cpu->env;
>       SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
> -    target_ulong r3_return = env->excp_vectors[excp]; /* hcall return value */
> +    struct nested_ppc_state l2_state;
>       target_ulong hv_ptr = spapr_cpu->nested_host_state->gpr[4];
>       target_ulong regs_ptr = spapr_cpu->nested_host_state->gpr[5];
> +    target_ulong hsrr0, hsrr1, hdar, asdr, hdsisr;
>       struct kvmppc_hv_guest_state *hvstate;
>       struct kvmppc_pt_regs *regs;
>       hwaddr len;
> -    uint64_t cr;
> -    int i;
>   
>       assert(spapr_cpu->in_nested);
>   
> +    nested_save_state(&l2_state, cpu);
> +    hsrr0 = env->spr[SPR_HSRR0];
> +    hsrr1 = env->spr[SPR_HSRR1];
> +    hdar = env->spr[SPR_HDAR];
> +    hdsisr = env->spr[SPR_HDSISR];
> +    asdr = env->spr[SPR_ASDR];
> +
> +    /*
> +     * Switch back to the host environment (including for any error).
> +     */
> +    assert(env->spr[SPR_LPIDR] != 0);
> +    nested_load_state(cpu, spapr_cpu->nested_host_state);
> +    env->gpr[3] = env->excp_vectors[excp]; /* hcall return value */
> +
>       cpu_ppc_hdecr_exit(env);
>   
> +    spapr_cpu->in_nested = false;
> +
> +    g_free(spapr_cpu->nested_host_state);
> +    spapr_cpu->nested_host_state = NULL;
> +
>       len = sizeof(*hvstate);
>       hvstate = address_space_map(CPU(cpu)->as, hv_ptr, &len, true,
>                                   MEMTXATTRS_UNSPECIFIED);
>       if (len != sizeof(*hvstate)) {
>           address_space_unmap(CPU(cpu)->as, hvstate, len, 0, true);
> -        r3_return = H_PARAMETER;
> -        goto out_restore_l1;
> +        env->gpr[3] = H_PARAMETER;
> +	return;
>       }
>   
> -    hvstate->cfar = env->cfar;
> -    hvstate->lpcr = env->spr[SPR_LPCR];
> -    hvstate->pcr = env->spr[SPR_PCR];
> -    hvstate->dpdes = env->spr[SPR_DPDES];
> -    hvstate->hfscr = env->spr[SPR_HFSCR];
> +    hvstate->cfar = l2_state.cfar;
> +    hvstate->lpcr = l2_state.lpcr;
> +    hvstate->pcr = l2_state.pcr;
> +    hvstate->dpdes = l2_state.dpdes;
> +    hvstate->hfscr = l2_state.hfscr;
>   
>       if (excp == POWERPC_EXCP_HDSI) {
> -        hvstate->hdar = env->spr[SPR_HDAR];
> -        hvstate->hdsisr = env->spr[SPR_HDSISR];
> -        hvstate->asdr = env->spr[SPR_ASDR];
> +        hvstate->hdar = hdar;
> +        hvstate->hdsisr = hdsisr;
> +        hvstate->asdr = asdr;
>       } else if (excp == POWERPC_EXCP_HISI) {
> -        hvstate->asdr = env->spr[SPR_ASDR];
> +        hvstate->asdr = asdr;
>       }
>   
>       /* HEIR should be implemented for HV mode and saved here. */
> -    hvstate->srr0 = env->spr[SPR_SRR0];
> -    hvstate->srr1 = env->spr[SPR_SRR1];
> -    hvstate->sprg[0] = env->spr[SPR_SPRG0];
> -    hvstate->sprg[1] = env->spr[SPR_SPRG1];
> -    hvstate->sprg[2] = env->spr[SPR_SPRG2];
> -    hvstate->sprg[3] = env->spr[SPR_SPRG3];
> -    hvstate->pidr = env->spr[SPR_BOOKS_PID];
> -    hvstate->ppr = env->spr[SPR_PPR];
> +    hvstate->srr0 = l2_state.srr0;
> +    hvstate->srr1 = l2_state.srr1;
> +    hvstate->sprg[0] = l2_state.sprg0;
> +    hvstate->sprg[1] = l2_state.sprg1;
> +    hvstate->sprg[2] = l2_state.sprg2;
> +    hvstate->sprg[3] = l2_state.sprg3;
> +    hvstate->pidr = l2_state.pidr;
> +    hvstate->ppr = l2_state.ppr;
>   
>       /* Is it okay to specify write length larger than actual data written? */
>       address_space_unmap(CPU(cpu)->as, hvstate, len, len, true);
> @@ -1866,51 +1878,31 @@ void spapr_exit_nested(PowerPCCPU *cpu, int excp)
>                                   MEMTXATTRS_UNSPECIFIED);
>       if (!regs || len != sizeof(*regs)) {
>           address_space_unmap(CPU(cpu)->as, regs, len, 0, true);
> -        r3_return = H_P2;
> -        goto out_restore_l1;
> +        env->gpr[3] = H_P2;
> +	return;
>       }
>   
>       len = sizeof(env->gpr);
>       assert(len == sizeof(regs->gpr));
> -    memcpy(regs->gpr, env->gpr, len);
> +    memcpy(regs->gpr, l2_state.gpr, len);
>   
> -    regs->link = env->lr;
> -    regs->ctr = env->ctr;
> -    regs->xer = cpu_read_xer(env);
> -
> -    cr = 0;
> -    for (i = 0; i < 8; i++) {
> -        cr |= (env->crf[i] & 15) << (4 * (7 - i));
> -    }
> -    regs->ccr = cr;
> +    regs->link = l2_state.lr;
> +    regs->ctr = l2_state.ctr;
> +    regs->xer = l2_state.xer;
> +    regs->ccr = l2_state.cr;
>   
>       if (excp == POWERPC_EXCP_MCHECK ||
>           excp == POWERPC_EXCP_RESET ||
>           excp == POWERPC_EXCP_SYSCALL) {
> -        regs->nip = env->spr[SPR_SRR0];
> -        regs->msr = env->spr[SPR_SRR1] & env->msr_mask;
> +        regs->nip = l2_state.srr0;
> +        regs->msr = l2_state.srr1 & env->msr_mask;
>       } else {
> -        regs->nip = env->spr[SPR_HSRR0];
> -        regs->msr = env->spr[SPR_HSRR1] & env->msr_mask;
> +        regs->nip = hsrr0;
> +        regs->msr = hsrr1 & env->msr_mask;
>       }
>   
>       /* Is it okay to specify write length larger than actual data written? */
>       address_space_unmap(CPU(cpu)->as, regs, len, len, true);
> -
> -out_restore_l1:
> -    assert(env->spr[SPR_LPIDR] != 0);
> -    nested_load_state(cpu, spapr_cpu->nested_host_state);
> -
> -    /*
> -     * Return the interrupt vector address from H_ENTER_NESTED to the L1
> -     * (or error code).
> -     */
> -    env->gpr[3] = r3_return;
> -
> -    spapr_cpu->in_nested = false;
> -
> -    g_free(spapr_cpu->nested_host_state);
> -    spapr_cpu->nested_host_state = NULL;
>   }
>   
>   static void hypercall_register_nested(void)


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 4/4] spapr: Move spapr nested HV to a new file
  2023-05-03  0:39 ` [RFC PATCH 4/4] spapr: Move spapr nested HV to a new file Nicholas Piggin
@ 2023-05-05 11:09   ` Harsh Prateek Bora
  2023-05-13  3:32     ` Nicholas Piggin
  0 siblings, 1 reply; 12+ messages in thread
From: Harsh Prateek Bora @ 2023-05-05 11:09 UTC (permalink / raw)
  To: Nicholas Piggin, qemu-ppc; +Cc: qemu-devel, Harsh Prateek Bora

<correcting my email in CC>

On 5/3/23 06:09, Nicholas Piggin wrote:
> Create spapr_nested.c for the nested HV implementation (modulo small
> pieces in MMU and exception handling).
> 
This separation of nested code in its own file is very much needed, but 
this could have been a pre-patch to all the previous patches (at least 
2/3, 3/4 which are debatable).

regards,
Harsh
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   hw/ppc/meson.build     |   1 +
>   hw/ppc/spapr_hcall.c   | 429 +----------------------------------
>   hw/ppc/spapr_nested.c  | 496 +++++++++++++++++++++++++++++++++++++++++
>   include/hw/ppc/spapr.h |  61 +----
>   4 files changed, 499 insertions(+), 488 deletions(-)
>   create mode 100644 hw/ppc/spapr_nested.c
> 
> diff --git a/hw/ppc/meson.build b/hw/ppc/meson.build
> index c927337da0..a313d4b964 100644
> --- a/hw/ppc/meson.build
> +++ b/hw/ppc/meson.build
> @@ -15,6 +15,7 @@ ppc_ss.add(when: 'CONFIG_PSERIES', if_true: files(
>     'spapr_vio.c',
>     'spapr_events.c',
>     'spapr_hcall.c',
> +  'spapr_nested.c',
>     'spapr_iommu.c',
>     'spapr_rtas.c',
>     'spapr_pci.c',
> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> index 783a06ba98..dbdcbb0e9d 100644
> --- a/hw/ppc/spapr_hcall.c
> +++ b/hw/ppc/spapr_hcall.c
> @@ -1496,444 +1496,17 @@ target_ulong spapr_hypercall(PowerPCCPU *cpu, target_ulong opcode,
>   }
>   
>   #ifdef CONFIG_TCG
> -#define PRTS_MASK      0x1f
> -
> -static target_ulong h_set_ptbl(PowerPCCPU *cpu,
> -                               SpaprMachineState *spapr,
> -                               target_ulong opcode,
> -                               target_ulong *args)
> -{
> -    target_ulong ptcr = args[0];
> -
> -    if (!spapr_get_cap(spapr, SPAPR_CAP_NESTED_KVM_HV)) {
> -        return H_FUNCTION;
> -    }
> -
> -    if ((ptcr & PRTS_MASK) + 12 - 4 > 12) {
> -        return H_PARAMETER;
> -    }
> -
> -    spapr->nested_ptcr = ptcr; /* Save new partition table */
> -
> -    return H_SUCCESS;
> -}
> -
> -static target_ulong h_tlb_invalidate(PowerPCCPU *cpu,
> -                                     SpaprMachineState *spapr,
> -                                     target_ulong opcode,
> -                                     target_ulong *args)
> -{
> -    /*
> -     * The spapr virtual hypervisor nested HV implementation retains no L2
> -     * translation state except for TLB. And the TLB is always invalidated
> -     * across L1<->L2 transitions, so nothing is required here.
> -     */
> -
> -    return H_SUCCESS;
> -}
> -
> -static target_ulong h_copy_tofrom_guest(PowerPCCPU *cpu,
> -                                        SpaprMachineState *spapr,
> -                                        target_ulong opcode,
> -                                        target_ulong *args)
> -{
> -    /*
> -     * This HCALL is not required, L1 KVM will take a slow path and walk the
> -     * page tables manually to do the data copy.
> -     */
> -    return H_FUNCTION;
> -}
> -
> -struct nested_ppc_state {
> -    uint64_t gpr[32];
> -    uint64_t lr;
> -    uint64_t ctr;
> -    uint64_t cfar;
> -    uint64_t msr;
> -    uint64_t nip;
> -    uint32_t cr;
> -
> -    uint64_t xer;
> -
> -    uint64_t lpcr;
> -    uint64_t lpidr;
> -    uint64_t pidr;
> -    uint64_t pcr;
> -    uint64_t dpdes;
> -    uint64_t hfscr;
> -    uint64_t srr0;
> -    uint64_t srr1;
> -    uint64_t sprg0;
> -    uint64_t sprg1;
> -    uint64_t sprg2;
> -    uint64_t sprg3;
> -    uint64_t ppr;
> -
> -    int64_t tb_offset;
> -};
> -
> -static void nested_save_state(struct nested_ppc_state *save, PowerPCCPU *cpu)
> -{
> -    CPUPPCState *env = &cpu->env;
> -    uint32_t cr;
> -    int i;
> -
> -    memcpy(save->gpr, env->gpr, sizeof(save->gpr));
> -
> -    save->lr = env->lr;
> -    save->ctr = env->ctr;
> -    save->cfar = env->cfar;
> -    save->msr = env->msr;
> -    save->nip = env->nip;
> -
> -    cr = 0;
> -    for (i = 0; i < 8; i++) {
> -        cr |= (env->crf[i] & 15) << (4 * (7 - i));
> -    }
> -    save->cr = cr;
> -
> -    save->xer = cpu_read_xer(env);
> -
> -    save->lpcr = env->spr[SPR_LPCR];
> -    save->lpidr = env->spr[SPR_LPIDR];
> -    save->pcr = env->spr[SPR_PCR];
> -    save->dpdes = env->spr[SPR_DPDES];
> -    save->hfscr = env->spr[SPR_HFSCR];
> -    save->srr0 = env->spr[SPR_SRR0];
> -    save->srr1 = env->spr[SPR_SRR1];
> -    save->sprg0 = env->spr[SPR_SPRG0];
> -    save->sprg1 = env->spr[SPR_SPRG1];
> -    save->sprg2 = env->spr[SPR_SPRG2];
> -    save->sprg3 = env->spr[SPR_SPRG3];
> -    save->pidr = env->spr[SPR_BOOKS_PID];
> -    save->ppr = env->spr[SPR_PPR];
> -
> -    save->tb_offset = env->tb_env->tb_offset;
> -}
> -
> -static void nested_load_state(PowerPCCPU *cpu, struct nested_ppc_state *load)
> -{
> -    CPUState *cs = CPU(cpu);
> -    CPUPPCState *env = &cpu->env;
> -    uint32_t cr;
> -    int i;
> -
> -    memcpy(env->gpr, load->gpr, sizeof(env->gpr));
> -
> -    env->lr = load->lr;
> -    env->ctr = load->ctr;
> -    env->cfar = load->cfar;
> -    env->msr = load->msr;
> -    env->nip = load->nip;
> -
> -    cr = load->cr;
> -    for (i = 7; i >= 0; i--) {
> -        env->crf[i] = cr & 15;
> -        cr >>= 4;
> -    }
> -
> -    cpu_write_xer(env, load->xer);
> -
> -    env->spr[SPR_LPCR] = load->lpcr;
> -    env->spr[SPR_LPIDR] = load->lpidr;
> -    env->spr[SPR_PCR] = load->pcr;
> -    env->spr[SPR_DPDES] = load->dpdes;
> -    env->spr[SPR_HFSCR] = load->hfscr;
> -    env->spr[SPR_SRR0] = load->srr0;
> -    env->spr[SPR_SRR1] = load->srr1;
> -    env->spr[SPR_SPRG0] = load->sprg0;
> -    env->spr[SPR_SPRG1] = load->sprg1;
> -    env->spr[SPR_SPRG2] = load->sprg2;
> -    env->spr[SPR_SPRG3] = load->sprg3;
> -    env->spr[SPR_BOOKS_PID] = load->pidr;
> -    env->spr[SPR_PPR] = load->ppr;
> -
> -    env->tb_env->tb_offset = load->tb_offset;
> -
> -    /*
> -     * MSR updated, compute hflags and possible interrupts.
> -     */
> -    hreg_compute_hflags(env);
> -    ppc_maybe_interrupt(env);
> -
> -    /*
> -     * Nested HV does not tag TLB entries between L1 and L2, so must
> -     * flush on transition.
> -     */
> -    tlb_flush(cs);
> -    env->reserve_addr = -1; /* Reset the reservation */
> -}
> -
> -/*
> - * When this handler returns, the environment is switched to the L2 guest
> - * and TCG begins running that. spapr_exit_nested() performs the switch from
> - * L2 back to L1 and returns from the H_ENTER_NESTED hcall.
> - */
> -static target_ulong h_enter_nested(PowerPCCPU *cpu,
> -                                   SpaprMachineState *spapr,
> -                                   target_ulong opcode,
> -                                   target_ulong *args)
> -{
> -    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
> -    CPUPPCState *env = &cpu->env;
> -    SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
> -    struct nested_ppc_state l2_state;
> -    target_ulong hv_ptr = args[0];
> -    target_ulong regs_ptr = args[1];
> -    target_ulong hdec, now = cpu_ppc_load_tbl(env);
> -    target_ulong lpcr, lpcr_mask;
> -    struct kvmppc_hv_guest_state *hvstate;
> -    struct kvmppc_hv_guest_state hv_state;
> -    struct kvmppc_pt_regs *regs;
> -    hwaddr len;
> -
> -    if (spapr->nested_ptcr == 0) {
> -        return H_NOT_AVAILABLE;
> -    }
> -
> -    len = sizeof(*hvstate);
> -    hvstate = address_space_map(CPU(cpu)->as, hv_ptr, &len, false,
> -                                MEMTXATTRS_UNSPECIFIED);
> -    if (len != sizeof(*hvstate)) {
> -        address_space_unmap(CPU(cpu)->as, hvstate, len, 0, false);
> -        return H_PARAMETER;
> -    }
> -
> -    memcpy(&hv_state, hvstate, len);
> -
> -    address_space_unmap(CPU(cpu)->as, hvstate, len, len, false);
> -
> -    /*
> -     * We accept versions 1 and 2. Version 2 fields are unused because TCG
> -     * does not implement DAWR*.
> -     */
> -    if (hv_state.version > HV_GUEST_STATE_VERSION) {
> -        return H_PARAMETER;
> -    }
> -
> -    if (hv_state.lpid == 0) {
> -        return H_PARAMETER;
> -    }
> -
> -    spapr_cpu->nested_host_state = g_try_new(struct nested_ppc_state, 1);
> -    if (!spapr_cpu->nested_host_state) {
> -        return H_NO_MEM;
> -    }
> -
> -    assert(env->spr[SPR_LPIDR] == 0);
> -    assert(env->spr[SPR_DPDES] == 0);
> -    nested_save_state(spapr_cpu->nested_host_state, cpu);
> -
> -    len = sizeof(*regs);
> -    regs = address_space_map(CPU(cpu)->as, regs_ptr, &len, false,
> -                                MEMTXATTRS_UNSPECIFIED);
> -    if (!regs || len != sizeof(*regs)) {
> -        address_space_unmap(CPU(cpu)->as, regs, len, 0, false);
> -        g_free(spapr_cpu->nested_host_state);
> -        return H_P2;
> -    }
> -
> -    len = sizeof(l2_state.gpr);
> -    assert(len == sizeof(regs->gpr));
> -    memcpy(l2_state.gpr, regs->gpr, len);
> -
> -    l2_state.lr = regs->link;
> -    l2_state.ctr = regs->ctr;
> -    l2_state.xer = regs->xer;
> -    l2_state.cr = regs->ccr;
> -    l2_state.msr = regs->msr;
> -    l2_state.nip = regs->nip;
> -
> -    address_space_unmap(CPU(cpu)->as, regs, len, len, false);
> -
> -    l2_state.cfar = hv_state.cfar;
> -    l2_state.lpidr = hv_state.lpid;
> -
> -    lpcr_mask = LPCR_DPFD | LPCR_ILE | LPCR_AIL | LPCR_LD | LPCR_MER;
> -    lpcr = (env->spr[SPR_LPCR] & ~lpcr_mask) | (hv_state.lpcr & lpcr_mask);
> -    lpcr |= LPCR_HR | LPCR_UPRT | LPCR_GTSE | LPCR_HVICE | LPCR_HDICE;
> -    lpcr &= ~LPCR_LPES0;
> -    l2_state.lpcr = lpcr & pcc->lpcr_mask;
> -
> -    l2_state.pcr = hv_state.pcr;
> -    /* hv_state.amor is not used */
> -    l2_state.dpdes = hv_state.dpdes;
> -    l2_state.hfscr = hv_state.hfscr;
> -    /* TCG does not implement DAWR*, CIABR, PURR, SPURR, IC, VTB, HEIR SPRs*/
> -    l2_state.srr0 = hv_state.srr0;
> -    l2_state.srr1 = hv_state.srr1;
> -    l2_state.sprg0 = hv_state.sprg[0];
> -    l2_state.sprg1 = hv_state.sprg[1];
> -    l2_state.sprg2 = hv_state.sprg[2];
> -    l2_state.sprg3 = hv_state.sprg[3];
> -    l2_state.pidr = hv_state.pidr;
> -    l2_state.ppr = hv_state.ppr;
> -    l2_state.tb_offset = env->tb_env->tb_offset + hv_state.tb_offset;
> -
> -    /*
> -     * Switch to the nested guest environment and start the "hdec" timer.
> -     */
> -    nested_load_state(cpu, &l2_state);
> -
> -    hdec = hv_state.hdec_expiry - now;
> -    cpu_ppc_hdecr_init(env);
> -    cpu_ppc_store_hdecr(env, hdec);
> -
> -    /*
> -     * The hv_state.vcpu_token is not needed. It is used by the KVM
> -     * implementation to remember which L2 vCPU last ran on which physical
> -     * CPU so as to invalidate process scope translations if it is moved
> -     * between physical CPUs. For now TLBs are always flushed on L1<->L2
> -     * transitions so this is not a problem.
> -     *
> -     * Could validate that the same vcpu_token does not attempt to run on
> -     * different L1 vCPUs at the same time, but that would be a L1 KVM bug
> -     * and it's not obviously worth a new data structure to do it.
> -     */
> -
> -    spapr_cpu->in_nested = true;
> -
> -    /*
> -     * The spapr hcall helper sets env->gpr[3] to the return value, but at
> -     * this point the L1 is not returning from the hcall but rather we
> -     * start running the L2, so r3 must not be clobbered, so return env->gpr[3]
> -     * to leave it unchanged.
> -     */
> -    return env->gpr[3];
> -}
> -
> -void spapr_exit_nested(PowerPCCPU *cpu, int excp)
> -{
> -    CPUPPCState *env = &cpu->env;
> -    SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
> -    struct nested_ppc_state l2_state;
> -    target_ulong hv_ptr = spapr_cpu->nested_host_state->gpr[4];
> -    target_ulong regs_ptr = spapr_cpu->nested_host_state->gpr[5];
> -    target_ulong hsrr0, hsrr1, hdar, asdr, hdsisr;
> -    struct kvmppc_hv_guest_state *hvstate;
> -    struct kvmppc_pt_regs *regs;
> -    hwaddr len;
> -
> -    assert(spapr_cpu->in_nested);
> -
> -    nested_save_state(&l2_state, cpu);
> -    hsrr0 = env->spr[SPR_HSRR0];
> -    hsrr1 = env->spr[SPR_HSRR1];
> -    hdar = env->spr[SPR_HDAR];
> -    hdsisr = env->spr[SPR_HDSISR];
> -    asdr = env->spr[SPR_ASDR];
> -
> -    /*
> -     * Switch back to the host environment (including for any error).
> -     */
> -    assert(env->spr[SPR_LPIDR] != 0);
> -    nested_load_state(cpu, spapr_cpu->nested_host_state);
> -    env->gpr[3] = env->excp_vectors[excp]; /* hcall return value */
> -
> -    cpu_ppc_hdecr_exit(env);
> -
> -    spapr_cpu->in_nested = false;
> -
> -    g_free(spapr_cpu->nested_host_state);
> -    spapr_cpu->nested_host_state = NULL;
> -
> -    len = sizeof(*hvstate);
> -    hvstate = address_space_map(CPU(cpu)->as, hv_ptr, &len, true,
> -                                MEMTXATTRS_UNSPECIFIED);
> -    if (len != sizeof(*hvstate)) {
> -        address_space_unmap(CPU(cpu)->as, hvstate, len, 0, true);
> -        env->gpr[3] = H_PARAMETER;
> -	return;
> -    }
> -
> -    hvstate->cfar = l2_state.cfar;
> -    hvstate->lpcr = l2_state.lpcr;
> -    hvstate->pcr = l2_state.pcr;
> -    hvstate->dpdes = l2_state.dpdes;
> -    hvstate->hfscr = l2_state.hfscr;
> -
> -    if (excp == POWERPC_EXCP_HDSI) {
> -        hvstate->hdar = hdar;
> -        hvstate->hdsisr = hdsisr;
> -        hvstate->asdr = asdr;
> -    } else if (excp == POWERPC_EXCP_HISI) {
> -        hvstate->asdr = asdr;
> -    }
> -
> -    /* HEIR should be implemented for HV mode and saved here. */
> -    hvstate->srr0 = l2_state.srr0;
> -    hvstate->srr1 = l2_state.srr1;
> -    hvstate->sprg[0] = l2_state.sprg0;
> -    hvstate->sprg[1] = l2_state.sprg1;
> -    hvstate->sprg[2] = l2_state.sprg2;
> -    hvstate->sprg[3] = l2_state.sprg3;
> -    hvstate->pidr = l2_state.pidr;
> -    hvstate->ppr = l2_state.ppr;
> -
> -    /* Is it okay to specify write length larger than actual data written? */
> -    address_space_unmap(CPU(cpu)->as, hvstate, len, len, true);
> -
> -    len = sizeof(*regs);
> -    regs = address_space_map(CPU(cpu)->as, regs_ptr, &len, true,
> -                                MEMTXATTRS_UNSPECIFIED);
> -    if (!regs || len != sizeof(*regs)) {
> -        address_space_unmap(CPU(cpu)->as, regs, len, 0, true);
> -        env->gpr[3] = H_P2;
> -	return;
> -    }
> -
> -    len = sizeof(env->gpr);
> -    assert(len == sizeof(regs->gpr));
> -    memcpy(regs->gpr, l2_state.gpr, len);
> -
> -    regs->link = l2_state.lr;
> -    regs->ctr = l2_state.ctr;
> -    regs->xer = l2_state.xer;
> -    regs->ccr = l2_state.cr;
> -
> -    if (excp == POWERPC_EXCP_MCHECK ||
> -        excp == POWERPC_EXCP_RESET ||
> -        excp == POWERPC_EXCP_SYSCALL) {
> -        regs->nip = l2_state.srr0;
> -        regs->msr = l2_state.srr1 & env->msr_mask;
> -    } else {
> -        regs->nip = hsrr0;
> -        regs->msr = hsrr1 & env->msr_mask;
> -    }
> -
> -    /* Is it okay to specify write length larger than actual data written? */
> -    address_space_unmap(CPU(cpu)->as, regs, len, len, true);
> -}
> -
> -static void hypercall_register_nested(void)
> -{
> -    spapr_register_hypercall(KVMPPC_H_SET_PARTITION_TABLE, h_set_ptbl);
> -    spapr_register_hypercall(KVMPPC_H_ENTER_NESTED, h_enter_nested);
> -    spapr_register_hypercall(KVMPPC_H_TLB_INVALIDATE, h_tlb_invalidate);
> -    spapr_register_hypercall(KVMPPC_H_COPY_TOFROM_GUEST, h_copy_tofrom_guest);
> -}
> -
>   static void hypercall_register_softmmu(void)
>   {
>       /* DO NOTHING */
>   }
>   #else
> -void spapr_exit_nested(PowerPCCPU *cpu, int excp)
> -{
> -    g_assert_not_reached();
> -}
> -
>   static target_ulong h_softmmu(PowerPCCPU *cpu, SpaprMachineState *spapr,
>                               target_ulong opcode, target_ulong *args)
>   {
>       g_assert_not_reached();
>   }
>   
> -static void hypercall_register_nested(void)
> -{
> -    /* DO NOTHING */
> -}
> -
>   static void hypercall_register_softmmu(void)
>   {
>       /* hcall-pft */
> @@ -2003,7 +1576,7 @@ static void hypercall_register_types(void)
>   
>       spapr_register_hypercall(KVMPPC_H_UPDATE_DT, h_update_dt);
>   
> -    hypercall_register_nested();
> +    spapr_register_nested();
>   }
>   
>   type_init(hypercall_register_types)
> diff --git a/hw/ppc/spapr_nested.c b/hw/ppc/spapr_nested.c
> new file mode 100644
> index 0000000000..c06dd8903c
> --- /dev/null
> +++ b/hw/ppc/spapr_nested.c
> @@ -0,0 +1,496 @@
> +#include "qemu/osdep.h"
> +#include "qemu/cutils.h"
> +#include "exec/exec-all.h"
> +#include "helper_regs.h"
> +#include "hw/ppc/ppc.h"
> +#include "hw/ppc/spapr.h"
> +#include "hw/ppc/spapr_cpu_core.h"
> +
> +/*
> + * Register state for entering a nested guest with H_ENTER_NESTED.
> + * New member must be added at the end.
> + */
> +struct kvmppc_hv_guest_state {
> +    uint64_t version;      /* version of this structure layout, must be first */
> +    uint32_t lpid;
> +    uint32_t vcpu_token;
> +    /* These registers are hypervisor privileged (at least for writing) */
> +    uint64_t lpcr;
> +    uint64_t pcr;
> +    uint64_t amor;
> +    uint64_t dpdes;
> +    uint64_t hfscr;
> +    int64_t tb_offset;
> +    uint64_t dawr0;
> +    uint64_t dawrx0;
> +    uint64_t ciabr;
> +    uint64_t hdec_expiry;
> +    uint64_t purr;
> +    uint64_t spurr;
> +    uint64_t ic;
> +    uint64_t vtb;
> +    uint64_t hdar;
> +    uint64_t hdsisr;
> +    uint64_t heir;
> +    uint64_t asdr;
> +    /* These are OS privileged but need to be set late in guest entry */
> +    uint64_t srr0;
> +    uint64_t srr1;
> +    uint64_t sprg[4];
> +    uint64_t pidr;
> +    uint64_t cfar;
> +    uint64_t ppr;
> +    /* Version 1 ends here */
> +    uint64_t dawr1;
> +    uint64_t dawrx1;
> +    /* Version 2 ends here */
> +};
> +
> +/* Latest version of hv_guest_state structure */
> +#define HV_GUEST_STATE_VERSION  2
> +
> +/* Linux 64-bit powerpc pt_regs struct, used by nested HV */
> +struct kvmppc_pt_regs {
> +    uint64_t gpr[32];
> +    uint64_t nip;
> +    uint64_t msr;
> +    uint64_t orig_gpr3;    /* Used for restarting system calls */
> +    uint64_t ctr;
> +    uint64_t link;
> +    uint64_t xer;
> +    uint64_t ccr;
> +    uint64_t softe;        /* Soft enabled/disabled */
> +    uint64_t trap;         /* Reason for being here */
> +    uint64_t dar;          /* Fault registers */
> +    uint64_t dsisr;        /* on 4xx/Book-E used for ESR */
> +    uint64_t result;       /* Result of a system call */
> +};
> +
> +#ifdef CONFIG_TCG
> +#define PRTS_MASK      0x1f
> +
> +static target_ulong h_set_ptbl(PowerPCCPU *cpu,
> +                               SpaprMachineState *spapr,
> +                               target_ulong opcode,
> +                               target_ulong *args)
> +{
> +    target_ulong ptcr = args[0];
> +
> +    if (!spapr_get_cap(spapr, SPAPR_CAP_NESTED_KVM_HV)) {
> +        return H_FUNCTION;
> +    }
> +
> +    if ((ptcr & PRTS_MASK) + 12 - 4 > 12) {
> +        return H_PARAMETER;
> +    }
> +
> +    spapr->nested_ptcr = ptcr; /* Save new partition table */
> +
> +    return H_SUCCESS;
> +}
> +
> +static target_ulong h_tlb_invalidate(PowerPCCPU *cpu,
> +                                     SpaprMachineState *spapr,
> +                                     target_ulong opcode,
> +                                     target_ulong *args)
> +{
> +    /*
> +     * The spapr virtual hypervisor nested HV implementation retains no L2
> +     * translation state except for TLB. And the TLB is always invalidated
> +     * across L1<->L2 transitions, so nothing is required here.
> +     */
> +
> +    return H_SUCCESS;
> +}
> +
> +static target_ulong h_copy_tofrom_guest(PowerPCCPU *cpu,
> +                                        SpaprMachineState *spapr,
> +                                        target_ulong opcode,
> +                                        target_ulong *args)
> +{
> +    /*
> +     * This HCALL is not required, L1 KVM will take a slow path and walk the
> +     * page tables manually to do the data copy.
> +     */
> +    return H_FUNCTION;
> +}
> +
> +struct nested_ppc_state {
> +    uint64_t gpr[32];
> +    uint64_t lr;
> +    uint64_t ctr;
> +    uint64_t cfar;
> +    uint64_t msr;
> +    uint64_t nip;
> +    uint32_t cr;
> +
> +    uint64_t xer;
> +
> +    uint64_t lpcr;
> +    uint64_t lpidr;
> +    uint64_t pidr;
> +    uint64_t pcr;
> +    uint64_t dpdes;
> +    uint64_t hfscr;
> +    uint64_t srr0;
> +    uint64_t srr1;
> +    uint64_t sprg0;
> +    uint64_t sprg1;
> +    uint64_t sprg2;
> +    uint64_t sprg3;
> +    uint64_t ppr;
> +
> +    int64_t tb_offset;
> +};
> +
> +static void nested_save_state(struct nested_ppc_state *save, PowerPCCPU *cpu)
> +{
> +    CPUPPCState *env = &cpu->env;
> +    uint32_t cr;
> +    int i;
> +
> +    memcpy(save->gpr, env->gpr, sizeof(save->gpr));
> +
> +    save->lr = env->lr;
> +    save->ctr = env->ctr;
> +    save->cfar = env->cfar;
> +    save->msr = env->msr;
> +    save->nip = env->nip;
> +
> +    cr = 0;
> +    for (i = 0; i < 8; i++) {
> +        cr |= (env->crf[i] & 15) << (4 * (7 - i));
> +    }
> +    save->cr = cr;
> +
> +    save->xer = cpu_read_xer(env);
> +
> +    save->lpcr = env->spr[SPR_LPCR];
> +    save->lpidr = env->spr[SPR_LPIDR];
> +    save->pcr = env->spr[SPR_PCR];
> +    save->dpdes = env->spr[SPR_DPDES];
> +    save->hfscr = env->spr[SPR_HFSCR];
> +    save->srr0 = env->spr[SPR_SRR0];
> +    save->srr1 = env->spr[SPR_SRR1];
> +    save->sprg0 = env->spr[SPR_SPRG0];
> +    save->sprg1 = env->spr[SPR_SPRG1];
> +    save->sprg2 = env->spr[SPR_SPRG2];
> +    save->sprg3 = env->spr[SPR_SPRG3];
> +    save->pidr = env->spr[SPR_BOOKS_PID];
> +    save->ppr = env->spr[SPR_PPR];
> +
> +    save->tb_offset = env->tb_env->tb_offset;
> +}
> +
> +static void nested_load_state(PowerPCCPU *cpu, struct nested_ppc_state *load)
> +{
> +    CPUState *cs = CPU(cpu);
> +    CPUPPCState *env = &cpu->env;
> +    uint32_t cr;
> +    int i;
> +
> +    memcpy(env->gpr, load->gpr, sizeof(env->gpr));
> +
> +    env->lr = load->lr;
> +    env->ctr = load->ctr;
> +    env->cfar = load->cfar;
> +    env->msr = load->msr;
> +    env->nip = load->nip;
> +
> +    cr = load->cr;
> +    for (i = 7; i >= 0; i--) {
> +        env->crf[i] = cr & 15;
> +        cr >>= 4;
> +    }
> +
> +    cpu_write_xer(env, load->xer);
> +
> +    env->spr[SPR_LPCR] = load->lpcr;
> +    env->spr[SPR_LPIDR] = load->lpidr;
> +    env->spr[SPR_PCR] = load->pcr;
> +    env->spr[SPR_DPDES] = load->dpdes;
> +    env->spr[SPR_HFSCR] = load->hfscr;
> +    env->spr[SPR_SRR0] = load->srr0;
> +    env->spr[SPR_SRR1] = load->srr1;
> +    env->spr[SPR_SPRG0] = load->sprg0;
> +    env->spr[SPR_SPRG1] = load->sprg1;
> +    env->spr[SPR_SPRG2] = load->sprg2;
> +    env->spr[SPR_SPRG3] = load->sprg3;
> +    env->spr[SPR_BOOKS_PID] = load->pidr;
> +    env->spr[SPR_PPR] = load->ppr;
> +
> +    env->tb_env->tb_offset = load->tb_offset;
> +
> +    /*
> +     * MSR updated, compute hflags and possible interrupts.
> +     */
> +    hreg_compute_hflags(env);
> +    ppc_maybe_interrupt(env);
> +
> +    /*
> +     * Nested HV does not tag TLB entries between L1 and L2, so must
> +     * flush on transition.
> +     */
> +    tlb_flush(cs);
> +    env->reserve_addr = -1; /* Reset the reservation */
> +}
> +
> +/*
> + * When this handler returns, the environment is switched to the L2 guest
> + * and TCG begins running that. spapr_exit_nested() performs the switch from
> + * L2 back to L1 and returns from the H_ENTER_NESTED hcall.
> + */
> +static target_ulong h_enter_nested(PowerPCCPU *cpu,
> +                                   SpaprMachineState *spapr,
> +                                   target_ulong opcode,
> +                                   target_ulong *args)
> +{
> +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
> +    CPUPPCState *env = &cpu->env;
> +    SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
> +    struct nested_ppc_state l2_state;
> +    target_ulong hv_ptr = args[0];
> +    target_ulong regs_ptr = args[1];
> +    target_ulong hdec, now = cpu_ppc_load_tbl(env);
> +    target_ulong lpcr, lpcr_mask;
> +    struct kvmppc_hv_guest_state *hvstate;
> +    struct kvmppc_hv_guest_state hv_state;
> +    struct kvmppc_pt_regs *regs;
> +    hwaddr len;
> +
> +    if (spapr->nested_ptcr == 0) {
> +        return H_NOT_AVAILABLE;
> +    }
> +
> +    len = sizeof(*hvstate);
> +    hvstate = address_space_map(CPU(cpu)->as, hv_ptr, &len, false,
> +                                MEMTXATTRS_UNSPECIFIED);
> +    if (len != sizeof(*hvstate)) {
> +        address_space_unmap(CPU(cpu)->as, hvstate, len, 0, false);
> +        return H_PARAMETER;
> +    }
> +
> +    memcpy(&hv_state, hvstate, len);
> +
> +    address_space_unmap(CPU(cpu)->as, hvstate, len, len, false);
> +
> +    /*
> +     * We accept versions 1 and 2. Version 2 fields are unused because TCG
> +     * does not implement DAWR*.
> +     */
> +    if (hv_state.version > HV_GUEST_STATE_VERSION) {
> +        return H_PARAMETER;
> +    }
> +
> +    if (hv_state.lpid == 0) {
> +        return H_PARAMETER;
> +    }
> +
> +    spapr_cpu->nested_host_state = g_try_new(struct nested_ppc_state, 1);
> +    if (!spapr_cpu->nested_host_state) {
> +        return H_NO_MEM;
> +    }
> +
> +    assert(env->spr[SPR_LPIDR] == 0);
> +    assert(env->spr[SPR_DPDES] == 0);
> +    nested_save_state(spapr_cpu->nested_host_state, cpu);
> +
> +    len = sizeof(*regs);
> +    regs = address_space_map(CPU(cpu)->as, regs_ptr, &len, false,
> +                                MEMTXATTRS_UNSPECIFIED);
> +    if (!regs || len != sizeof(*regs)) {
> +        address_space_unmap(CPU(cpu)->as, regs, len, 0, false);
> +        g_free(spapr_cpu->nested_host_state);
> +        return H_P2;
> +    }
> +
> +    len = sizeof(l2_state.gpr);
> +    assert(len == sizeof(regs->gpr));
> +    memcpy(l2_state.gpr, regs->gpr, len);
> +
> +    l2_state.lr = regs->link;
> +    l2_state.ctr = regs->ctr;
> +    l2_state.xer = regs->xer;
> +    l2_state.cr = regs->ccr;
> +    l2_state.msr = regs->msr;
> +    l2_state.nip = regs->nip;
> +
> +    address_space_unmap(CPU(cpu)->as, regs, len, len, false);
> +
> +    l2_state.cfar = hv_state.cfar;
> +    l2_state.lpidr = hv_state.lpid;
> +
> +    lpcr_mask = LPCR_DPFD | LPCR_ILE | LPCR_AIL | LPCR_LD | LPCR_MER;
> +    lpcr = (env->spr[SPR_LPCR] & ~lpcr_mask) | (hv_state.lpcr & lpcr_mask);
> +    lpcr |= LPCR_HR | LPCR_UPRT | LPCR_GTSE | LPCR_HVICE | LPCR_HDICE;
> +    lpcr &= ~LPCR_LPES0;
> +    l2_state.lpcr = lpcr & pcc->lpcr_mask;
> +
> +    l2_state.pcr = hv_state.pcr;
> +    /* hv_state.amor is not used */
> +    l2_state.dpdes = hv_state.dpdes;
> +    l2_state.hfscr = hv_state.hfscr;
> +    /* TCG does not implement DAWR*, CIABR, PURR, SPURR, IC, VTB, HEIR SPRs*/
> +    l2_state.srr0 = hv_state.srr0;
> +    l2_state.srr1 = hv_state.srr1;
> +    l2_state.sprg0 = hv_state.sprg[0];
> +    l2_state.sprg1 = hv_state.sprg[1];
> +    l2_state.sprg2 = hv_state.sprg[2];
> +    l2_state.sprg3 = hv_state.sprg[3];
> +    l2_state.pidr = hv_state.pidr;
> +    l2_state.ppr = hv_state.ppr;
> +    l2_state.tb_offset = env->tb_env->tb_offset + hv_state.tb_offset;
> +
> +    /*
> +     * Switch to the nested guest environment and start the "hdec" timer.
> +     */
> +    nested_load_state(cpu, &l2_state);
> +
> +    hdec = hv_state.hdec_expiry - now;
> +    cpu_ppc_hdecr_init(env);
> +    cpu_ppc_store_hdecr(env, hdec);
> +
> +    /*
> +     * The hv_state.vcpu_token is not needed. It is used by the KVM
> +     * implementation to remember which L2 vCPU last ran on which physical
> +     * CPU so as to invalidate process scope translations if it is moved
> +     * between physical CPUs. For now TLBs are always flushed on L1<->L2
> +     * transitions so this is not a problem.
> +     *
> +     * Could validate that the same vcpu_token does not attempt to run on
> +     * different L1 vCPUs at the same time, but that would be a L1 KVM bug
> +     * and it's not obviously worth a new data structure to do it.
> +     */
> +
> +    spapr_cpu->in_nested = true;
> +
> +    /*
> +     * The spapr hcall helper sets env->gpr[3] to the return value, but at
> +     * this point the L1 is not returning from the hcall but rather we
> +     * start running the L2, so r3 must not be clobbered, so return env->gpr[3]
> +     * to leave it unchanged.
> +     */
> +    return env->gpr[3];
> +}
> +
> +void spapr_exit_nested(PowerPCCPU *cpu, int excp)
> +{
> +    CPUPPCState *env = &cpu->env;
> +    SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
> +    struct nested_ppc_state l2_state;
> +    target_ulong hv_ptr = spapr_cpu->nested_host_state->gpr[4];
> +    target_ulong regs_ptr = spapr_cpu->nested_host_state->gpr[5];
> +    target_ulong hsrr0, hsrr1, hdar, asdr, hdsisr;
> +    struct kvmppc_hv_guest_state *hvstate;
> +    struct kvmppc_pt_regs *regs;
> +    hwaddr len;
> +
> +    assert(spapr_cpu->in_nested);
> +
> +    nested_save_state(&l2_state, cpu);
> +    hsrr0 = env->spr[SPR_HSRR0];
> +    hsrr1 = env->spr[SPR_HSRR1];
> +    hdar = env->spr[SPR_HDAR];
> +    hdsisr = env->spr[SPR_HDSISR];
> +    asdr = env->spr[SPR_ASDR];
> +
> +    /*
> +     * Switch back to the host environment (including for any error).
> +     */
> +    assert(env->spr[SPR_LPIDR] != 0);
> +    nested_load_state(cpu, spapr_cpu->nested_host_state);
> +    env->gpr[3] = env->excp_vectors[excp]; /* hcall return value */
> +
> +    cpu_ppc_hdecr_exit(env);
> +
> +    spapr_cpu->in_nested = false;
> +
> +    g_free(spapr_cpu->nested_host_state);
> +    spapr_cpu->nested_host_state = NULL;
> +
> +    len = sizeof(*hvstate);
> +    hvstate = address_space_map(CPU(cpu)->as, hv_ptr, &len, true,
> +                                MEMTXATTRS_UNSPECIFIED);
> +    if (len != sizeof(*hvstate)) {
> +        address_space_unmap(CPU(cpu)->as, hvstate, len, 0, true);
> +        env->gpr[3] = H_PARAMETER;
> +	return;
> +    }
> +
> +    hvstate->cfar = l2_state.cfar;
> +    hvstate->lpcr = l2_state.lpcr;
> +    hvstate->pcr = l2_state.pcr;
> +    hvstate->dpdes = l2_state.dpdes;
> +    hvstate->hfscr = l2_state.hfscr;
> +
> +    if (excp == POWERPC_EXCP_HDSI) {
> +        hvstate->hdar = hdar;
> +        hvstate->hdsisr = hdsisr;
> +        hvstate->asdr = asdr;
> +    } else if (excp == POWERPC_EXCP_HISI) {
> +        hvstate->asdr = asdr;
> +    }
> +
> +    /* HEIR should be implemented for HV mode and saved here. */
> +    hvstate->srr0 = l2_state.srr0;
> +    hvstate->srr1 = l2_state.srr1;
> +    hvstate->sprg[0] = l2_state.sprg0;
> +    hvstate->sprg[1] = l2_state.sprg1;
> +    hvstate->sprg[2] = l2_state.sprg2;
> +    hvstate->sprg[3] = l2_state.sprg3;
> +    hvstate->pidr = l2_state.pidr;
> +    hvstate->ppr = l2_state.ppr;
> +
> +    /* Is it okay to specify write length larger than actual data written? */
> +    address_space_unmap(CPU(cpu)->as, hvstate, len, len, true);
> +
> +    len = sizeof(*regs);
> +    regs = address_space_map(CPU(cpu)->as, regs_ptr, &len, true,
> +                                MEMTXATTRS_UNSPECIFIED);
> +    if (!regs || len != sizeof(*regs)) {
> +        address_space_unmap(CPU(cpu)->as, regs, len, 0, true);
> +        env->gpr[3] = H_P2;
> +	return;
> +    }
> +
> +    len = sizeof(env->gpr);
> +    assert(len == sizeof(regs->gpr));
> +    memcpy(regs->gpr, l2_state.gpr, len);
> +
> +    regs->link = l2_state.lr;
> +    regs->ctr = l2_state.ctr;
> +    regs->xer = l2_state.xer;
> +    regs->ccr = l2_state.cr;
> +
> +    if (excp == POWERPC_EXCP_MCHECK ||
> +        excp == POWERPC_EXCP_RESET ||
> +        excp == POWERPC_EXCP_SYSCALL) {
> +        regs->nip = l2_state.srr0;
> +        regs->msr = l2_state.srr1 & env->msr_mask;
> +    } else {
> +        regs->nip = hsrr0;
> +        regs->msr = hsrr1 & env->msr_mask;
> +    }
> +
> +    /* Is it okay to specify write length larger than actual data written? */
> +    address_space_unmap(CPU(cpu)->as, regs, len, len, true);
> +}
> +
> +void spapr_register_nested(void)
> +{
> +    spapr_register_hypercall(KVMPPC_H_SET_PARTITION_TABLE, h_set_ptbl);
> +    spapr_register_hypercall(KVMPPC_H_ENTER_NESTED, h_enter_nested);
> +    spapr_register_hypercall(KVMPPC_H_TLB_INVALIDATE, h_tlb_invalidate);
> +    spapr_register_hypercall(KVMPPC_H_COPY_TOFROM_GUEST, h_copy_tofrom_guest);
> +}
> +#else
> +void spapr_exit_nested(PowerPCCPU *cpu, int excp)
> +{
> +    g_assert_not_reached();
> +}
> +
> +void spapr_register_nested(void)
> +{
> +    /* DO NOTHING */
> +}
> +#endif
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 5c8aabd444..84cdde75da 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -619,66 +619,6 @@ struct SpaprMachineState {
>   #define SVM_H_TPM_COMM              0xEF10
>   #define SVM_HCALL_MAX               SVM_H_TPM_COMM
>   
> -/*
> - * Register state for entering a nested guest with H_ENTER_NESTED.
> - * New member must be added at the end.
> - */
> -struct kvmppc_hv_guest_state {
> -    uint64_t version;      /* version of this structure layout, must be first */
> -    uint32_t lpid;
> -    uint32_t vcpu_token;
> -    /* These registers are hypervisor privileged (at least for writing) */
> -    uint64_t lpcr;
> -    uint64_t pcr;
> -    uint64_t amor;
> -    uint64_t dpdes;
> -    uint64_t hfscr;
> -    int64_t tb_offset;
> -    uint64_t dawr0;
> -    uint64_t dawrx0;
> -    uint64_t ciabr;
> -    uint64_t hdec_expiry;
> -    uint64_t purr;
> -    uint64_t spurr;
> -    uint64_t ic;
> -    uint64_t vtb;
> -    uint64_t hdar;
> -    uint64_t hdsisr;
> -    uint64_t heir;
> -    uint64_t asdr;
> -    /* These are OS privileged but need to be set late in guest entry */
> -    uint64_t srr0;
> -    uint64_t srr1;
> -    uint64_t sprg[4];
> -    uint64_t pidr;
> -    uint64_t cfar;
> -    uint64_t ppr;
> -    /* Version 1 ends here */
> -    uint64_t dawr1;
> -    uint64_t dawrx1;
> -    /* Version 2 ends here */
> -};
> -
> -/* Latest version of hv_guest_state structure */
> -#define HV_GUEST_STATE_VERSION  2
> -
> -/* Linux 64-bit powerpc pt_regs struct, used by nested HV */
> -struct kvmppc_pt_regs {
> -    uint64_t gpr[32];
> -    uint64_t nip;
> -    uint64_t msr;
> -    uint64_t orig_gpr3;    /* Used for restarting system calls */
> -    uint64_t ctr;
> -    uint64_t link;
> -    uint64_t xer;
> -    uint64_t ccr;
> -    uint64_t softe;        /* Soft enabled/disabled */
> -    uint64_t trap;         /* Reason for being here */
> -    uint64_t dar;          /* Fault registers */
> -    uint64_t dsisr;        /* on 4xx/Book-E used for ESR */
> -    uint64_t result;       /* Result of a system call */
> -};
> -
>   typedef struct SpaprDeviceTreeUpdateHeader {
>       uint32_t version_id;
>   } SpaprDeviceTreeUpdateHeader;
> @@ -696,6 +636,7 @@ void spapr_register_hypercall(target_ulong opcode, spapr_hcall_fn fn);
>   target_ulong spapr_hypercall(PowerPCCPU *cpu, target_ulong opcode,
>                                target_ulong *args);
>   
> +void spapr_register_nested(void);
>   void spapr_exit_nested(PowerPCCPU *cpu, int excp);
>   
>   target_ulong softmmu_resize_hpt_prepare(PowerPCCPU *cpu, SpaprMachineState *spapr,


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 2/4] spapr: Add a nested state struct
  2023-05-05 10:54   ` Harsh Prateek Bora
@ 2023-05-13  3:27     ` Nicholas Piggin
  0 siblings, 0 replies; 12+ messages in thread
From: Nicholas Piggin @ 2023-05-13  3:27 UTC (permalink / raw)
  To: Harsh Prateek Bora, qemu-ppc; +Cc: qemu-devel

On Fri May 5, 2023 at 8:54 PM AEST, Harsh Prateek Bora wrote:
> <correcting my email in CC>
>
> On 5/3/23 06:09, Nicholas Piggin wrote:
> > @@ -1593,12 +1713,14 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
> >           return H_PARAMETER;
> >       }
> >   
> > -    spapr_cpu->nested_host_state = g_try_new(CPUPPCState, 1);
> > +    spapr_cpu->nested_host_state = g_try_new(struct nested_ppc_state, 1);
> >       if (!spapr_cpu->nested_host_state) {
> >           return H_NO_MEM;
> >       }
> >   
> > -    memcpy(spapr_cpu->nested_host_state, env, sizeof(CPUPPCState));
> > +    assert(env->spr[SPR_LPIDR] == 0);
> > +    assert(env->spr[SPR_DPDES] == 0);
> > +    nested_save_state(spapr_cpu->nested_host_state, cpu);
> >   
> Ideally, we may want to save entire env for L1 host, while switching to 
> L2 rather than keeping a subset of it for 2 reasons:
>   - keeping enitre L1 env ensures it remains untouched by L2 during L2 
> execution (shouldnt allow L2 to modify remaining L1 env bits unexpectedly)

It doesn't because you need to restore it, and you can't just restore
all. Making it symmetrical and saving what you restore is better. It's
a pretty nasty layering violation to memcpy the whole CPUPPCState too,
conceptually.

I prefer that we have to think about each bit of state that is changed
and how we deal with it -- I'd not be at all surprised if there are bits
that are wrong as is, e.g., in interrupt handling.

>   - I see some of the registers are retained only for L1 (so, ca, ov32, 
> ca32, etc) but not for L2 (got missed in nested_load_state helper in 
> this patch), are they not really needed anymore? Previous patch 
> introduced one of them though.

I'm not sure. those fields aren't registers as such, but mirror in some
values from regisers. I didn't think I got it wrong but I'll double
check.

Thanks,
Nick


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 3/4] spapr: load and store l2 state with helper functions
  2023-05-05 11:03   ` Harsh Prateek Bora
@ 2023-05-13  3:30     ` Nicholas Piggin
  0 siblings, 0 replies; 12+ messages in thread
From: Nicholas Piggin @ 2023-05-13  3:30 UTC (permalink / raw)
  To: Harsh Prateek Bora, qemu-ppc; +Cc: qemu-devel

On Fri May 5, 2023 at 9:03 PM AEST, Harsh Prateek Bora wrote:
> <correcting my email in CC>
>
> On 5/3/23 06:09, Nicholas Piggin wrote:
> > Arguably this is just shuffling around register accesses, but one nice
> > thing it does is allow the exit to save away the L2 state then switch
> > the environment to the L1 before copying L2 data back to the L1, which
> > logically flows more naturally and simplifies the error paths.
> > 
> The supposed advantage you have mentioned is coming at the cost of 
> double copy (env -> l2_state, switch to L1, l2_state -> hvstate/ptregs), 
> but previously we were just doing a single copy that directly conveyed 
> it to L1 before switching to L1. Additional copy means additional delay 
> in transition of L1/L2. Not sure if it's worth it?

Yeah, the memcpy in the host  is the least of our performance concerns
though (and removing the CPUPPCState memcpy in the series will reduce
the total amount of copying done in entry/exit anyway).

I think clearer code is more important here.

Thanks,
Nick


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 4/4] spapr: Move spapr nested HV to a new file
  2023-05-05 11:09   ` Harsh Prateek Bora
@ 2023-05-13  3:32     ` Nicholas Piggin
  0 siblings, 0 replies; 12+ messages in thread
From: Nicholas Piggin @ 2023-05-13  3:32 UTC (permalink / raw)
  To: Harsh Prateek Bora, qemu-ppc; +Cc: qemu-devel

On Fri May 5, 2023 at 9:09 PM AEST, Harsh Prateek Bora wrote:
> <correcting my email in CC>
>
> On 5/3/23 06:09, Nicholas Piggin wrote:
> > Create spapr_nested.c for the nested HV implementation (modulo small
> > pieces in MMU and exception handling).
> > 
> This separation of nested code in its own file is very much needed, but 
> this could have been a pre-patch to all the previous patches (at least 
> 2/3, 3/4 which are debatable).

It could have. I don't mind getting it into a slightly better shape
before doing the rename but I don't know if there is really a one true
way for ordering code movement vs reworking.

Thanks,
Nick


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2023-05-13  3:32 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-05-03  0:39 [RFC PATCH 0/4] spapr: clean up nested hv Nicholas Piggin
2023-05-03  0:39 ` [RFC PATCH 1/4] spapr: H_ENTER_NESTED should restore host XER ca field Nicholas Piggin
2023-05-05 10:20   ` Harsh Prateek Bora
2023-05-03  0:39 ` [RFC PATCH 2/4] spapr: Add a nested state struct Nicholas Piggin
2023-05-05 10:54   ` Harsh Prateek Bora
2023-05-13  3:27     ` Nicholas Piggin
2023-05-03  0:39 ` [RFC PATCH 3/4] spapr: load and store l2 state with helper functions Nicholas Piggin
2023-05-05 11:03   ` Harsh Prateek Bora
2023-05-13  3:30     ` Nicholas Piggin
2023-05-03  0:39 ` [RFC PATCH 4/4] spapr: Move spapr nested HV to a new file Nicholas Piggin
2023-05-05 11:09   ` Harsh Prateek Bora
2023-05-13  3:32     ` Nicholas Piggin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).