qemu-devel.nongnu.org archive mirror
* [PATCH v2 00/14] Nested PAPR API (KVM on PowerVM)
@ 2023-10-12 10:49 Harsh Prateek Bora
  2023-10-12 10:49 ` [PATCH v2 01/14] spapr: nested: move nested part of spapr_get_pate into spapr_nested.c Harsh Prateek Bora
                   ` (14 more replies)
  0 siblings, 15 replies; 26+ messages in thread
From: Harsh Prateek Bora @ 2023-10-12 10:49 UTC (permalink / raw)
  To: npiggin, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

There is an existing Nested-HV API to enable nested guests on powernv
machines. However, it is not supported on pseries/PowerVM LPARs.
This patch series implements the required hcall interfaces to enable
nested guests with KVM on PowerVM.
Unlike Nested-HV, with this API the entire L2 state is retained by the
L0 across guest entry/exit, and a pre-defined Guest State Buffer (GSB)
format is used to communicate L2 state between the L1 and the L0.

L0 here refers to PHYP/PowerVM, or a QEMU TCG L0 launched with the
newly introduced option cap-nested-papr=true.
L1 refers to the LPAR host on PowerVM, or Linux booted on a QEMU TCG
L0 with the above-mentioned option cap-nested-papr=true.
L2 refers to the nested guest running on top of L1 using KVM.
No software changes are needed for QEMU running in the L1 Linux, nor
for the L2 kernel.
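
For illustration, a TCG L0 with this API enabled is launched as shown
in the documentation patch of this series:

  qemu-system-ppc64 -cpu POWER10 -machine pseries,cap-nested-papr=true ...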

There is a Linux kernel side patch series enabling support for Nested
PAPR in the L1; it can be found at the URL below:

Linux Kernel patch series:
- https://lore.kernel.org/linuxppc-dev/20230914030600.16993-1-jniethe5@gmail.com/

For more details, refer to the documentation included in either patch
series.

There are scripts available to assist in setting up an environment for
testing nested guests at https://github.com/iamjpn/kvm-powervm-test

A tree with this series is available at:
https://github.com/planetharsh/qemu/tree/upstream-kop-1012

Thanks to Michael Neuling, Shivaprasad Bhat, Kautuk Consul, Vaibhav Jain
and Jordan Niethe.

Changelog:

v2:
    - Addressed review comments from Nick on v1 series.
v1:
    - https://lore.kernel.org/qemu-devel/20230906043333.448244-1-harshpb@linux.ibm.com/

Harsh Prateek Bora (14):
  spapr: nested: move nested part of spapr_get_pate into spapr_nested.c
  spapr: nested: Introduce SpaprMachineStateNested to store related
    info.
  spapr: nested: Document Nested PAPR API
  spapr: nested: Introduce cap-nested-papr for Nested PAPR API
  spapr: nested: register nested-hv api hcalls only for cap-nested-hv
  spapr: nested: Introduce H_GUEST_[GET|SET]_CAPABILITIES hcalls.
  spapr: nested: Introduce H_GUEST_[CREATE|DELETE] hcalls.
  spapr: nested: Introduce H_GUEST_CREATE_VCPU hcall.
  spapr: nested: Initialize the GSB elements lookup table.
  spapr: nested: Introduce H_GUEST_[GET|SET]_STATE hcalls.
  spapr: nested: Use correct source for parttbl info for nested PAPR
    API.
  spapr: nested: rename nested_host_state to nested_hv_host
  spapr: nested: keep nested-hv exit code restricted to its API.
  spapr: nested: Introduce H_GUEST_RUN_VCPU hcall.

 docs/devel/nested-papr.txt      |  500 +++++++++++
 hw/ppc/spapr.c                  |   32 +-
 hw/ppc/spapr_caps.c             |   63 ++
 hw/ppc/spapr_hcall.c            |    2 -
 hw/ppc/spapr_nested.c           | 1439 ++++++++++++++++++++++++++++++-
 include/hw/ppc/spapr.h          |   21 +-
 include/hw/ppc/spapr_cpu_core.h |    7 +-
 include/hw/ppc/spapr_nested.h   |  361 ++++++++
 target/ppc/cpu.h                |    2 +
 9 files changed, 2368 insertions(+), 59 deletions(-)
 create mode 100644 docs/devel/nested-papr.txt

-- 
2.39.3




* [PATCH v2 01/14] spapr: nested: move nested part of spapr_get_pate into spapr_nested.c
  2023-10-12 10:49 [PATCH v2 00/14] Nested PAPR API (KVM on PowerVM) Harsh Prateek Bora
@ 2023-10-12 10:49 ` Harsh Prateek Bora
  2023-11-29  3:47   ` Nicholas Piggin
  2023-10-12 10:49 ` [PATCH v2 02/14] spapr: nested: Introduce SpaprMachineStateNested to store related info Harsh Prateek Bora
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 26+ messages in thread
From: Harsh Prateek Bora @ 2023-10-12 10:49 UTC (permalink / raw)
  To: npiggin, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

Most of the nested code has already been moved to spapr_nested.c.
The logic inside spapr_get_pate() dealing with nested guests is better
suited for spapr_nested.c as well, so move it there.

Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
---
 hw/ppc/spapr.c                | 28 ++--------------------------
 hw/ppc/spapr_nested.c         | 29 +++++++++++++++++++++++++++++
 include/hw/ppc/spapr_nested.h |  3 ++-
 3 files changed, 33 insertions(+), 27 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index cb840676d3..a2c69d0f4f 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1341,7 +1341,6 @@ void spapr_init_all_lpcrs(target_ulong value, target_ulong mask)
     }
 }
 
-
 static bool spapr_get_pate(PPCVirtualHypervisor *vhyp, PowerPCCPU *cpu,
                            target_ulong lpid, ppc_v3_pate_t *entry)
 {
@@ -1354,33 +1353,10 @@ static bool spapr_get_pate(PPCVirtualHypervisor *vhyp, PowerPCCPU *cpu,
         /* Copy PATE1:GR into PATE0:HR */
         entry->dw0 = spapr->patb_entry & PATE0_HR;
         entry->dw1 = spapr->patb_entry;
-
+        return true;
     } else {
-        uint64_t patb, pats;
-
-        assert(lpid != 0);
-
-        patb = spapr->nested_ptcr & PTCR_PATB;
-        pats = spapr->nested_ptcr & PTCR_PATS;
-
-        /* Check if partition table is properly aligned */
-        if (patb & MAKE_64BIT_MASK(0, pats + 12)) {
-            return false;
-        }
-
-        /* Calculate number of entries */
-        pats = 1ull << (pats + 12 - 4);
-        if (pats <= lpid) {
-            return false;
-        }
-
-        /* Grab entry */
-        patb += 16 * lpid;
-        entry->dw0 = ldq_phys(CPU(cpu)->as, patb);
-        entry->dw1 = ldq_phys(CPU(cpu)->as, patb + 8);
+        return spapr_get_pate_nested(spapr, cpu, lpid, entry);
     }
-
-    return true;
 }
 
 #define HPTE(_table, _i)   (void *)(((uint64_t *)(_table)) + ((_i) * 2))
diff --git a/hw/ppc/spapr_nested.c b/hw/ppc/spapr_nested.c
index 121aa96ddc..123e127b08 100644
--- a/hw/ppc/spapr_nested.c
+++ b/hw/ppc/spapr_nested.c
@@ -6,6 +6,35 @@
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/spapr_cpu_core.h"
 #include "hw/ppc/spapr_nested.h"
+#include "mmu-book3s-v3.h"
+
+bool spapr_get_pate_nested(SpaprMachineState *spapr, PowerPCCPU *cpu,
+                           target_ulong lpid, ppc_v3_pate_t *entry)
+{
+    uint64_t patb, pats;
+
+    assert(lpid != 0);
+
+    patb = spapr->nested_ptcr & PTCR_PATB;
+    pats = spapr->nested_ptcr & PTCR_PATS;
+
+    /* Check if partition table is properly aligned */
+    if (patb & MAKE_64BIT_MASK(0, pats + 12)) {
+        return false;
+    }
+
+    /* Calculate number of entries */
+    pats = 1ull << (pats + 12 - 4);
+    if (pats <= lpid) {
+        return false;
+    }
+
+    /* Grab entry */
+    patb += 16 * lpid;
+    entry->dw0 = ldq_phys(CPU(cpu)->as, patb);
+    entry->dw1 = ldq_phys(CPU(cpu)->as, patb + 8);
+    return true;
+}
 
 #ifdef CONFIG_TCG
 #define PRTS_MASK      0x1f
diff --git a/include/hw/ppc/spapr_nested.h b/include/hw/ppc/spapr_nested.h
index d383486476..e3d15d6d0b 100644
--- a/include/hw/ppc/spapr_nested.h
+++ b/include/hw/ppc/spapr_nested.h
@@ -98,5 +98,6 @@ struct nested_ppc_state {
 
 void spapr_register_nested(void);
 void spapr_exit_nested(PowerPCCPU *cpu, int excp);
-
+bool spapr_get_pate_nested(SpaprMachineState *spapr, PowerPCCPU *cpu,
+                           target_ulong lpid, ppc_v3_pate_t *entry);
 #endif /* HW_SPAPR_NESTED_H */
-- 
2.39.3




* [PATCH v2 02/14] spapr: nested: Introduce SpaprMachineStateNested to store related info.
  2023-10-12 10:49 [PATCH v2 00/14] Nested PAPR API (KVM on PowerVM) Harsh Prateek Bora
  2023-10-12 10:49 ` [PATCH v2 01/14] spapr: nested: move nested part of spapr_get_pate into spapr_nested.c Harsh Prateek Bora
@ 2023-10-12 10:49 ` Harsh Prateek Bora
  2023-11-29  3:48   ` Nicholas Piggin
  2023-10-12 10:49 ` [PATCH v2 03/14] spapr: nested: Document Nested PAPR API Harsh Prateek Bora
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 26+ messages in thread
From: Harsh Prateek Bora @ 2023-10-12 10:49 UTC (permalink / raw)
  To: npiggin, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

Currently, nested_ptcr is used by the existing nested-HV API to store
nested-guest related info. Organise this into a dedicated structure so
that support can be extended for the nested PAPR API, which needs to
store additional nested-guest info in subsequent patches.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
---
 hw/ppc/spapr_nested.c         | 8 ++++----
 include/hw/ppc/spapr.h        | 3 ++-
 include/hw/ppc/spapr_nested.h | 5 +++++
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/spapr_nested.c b/hw/ppc/spapr_nested.c
index 123e127b08..db47c1196f 100644
--- a/hw/ppc/spapr_nested.c
+++ b/hw/ppc/spapr_nested.c
@@ -15,8 +15,8 @@ bool spapr_get_pate_nested(SpaprMachineState *spapr, PowerPCCPU *cpu,
 
     assert(lpid != 0);
 
-    patb = spapr->nested_ptcr & PTCR_PATB;
-    pats = spapr->nested_ptcr & PTCR_PATS;
+    patb = spapr->nested.ptcr & PTCR_PATB;
+    pats = spapr->nested.ptcr & PTCR_PATS;
 
     /* Check if partition table is properly aligned */
     if (patb & MAKE_64BIT_MASK(0, pats + 12)) {
@@ -54,7 +54,7 @@ static target_ulong h_set_ptbl(PowerPCCPU *cpu,
         return H_PARAMETER;
     }
 
-    spapr->nested_ptcr = ptcr; /* Save new partition table */
+    spapr->nested.ptcr = ptcr; /* Save new partition table */
 
     return H_SUCCESS;
 }
@@ -186,7 +186,7 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
     struct kvmppc_pt_regs *regs;
     hwaddr len;
 
-    if (spapr->nested_ptcr == 0) {
+    if (spapr->nested.ptcr == 0) {
         return H_NOT_AVAILABLE;
     }
 
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index e91791a1a9..3e825f2787 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -12,6 +12,7 @@
 #include "hw/ppc/spapr_xive.h"  /* For SpaprXive */
 #include "hw/ppc/xics.h"        /* For ICSState */
 #include "hw/ppc/spapr_tpm_proxy.h"
+#include "hw/ppc/spapr_nested.h" /* For SpaprMachineStateNested */
 
 struct SpaprVioBus;
 struct SpaprPhbState;
@@ -213,7 +214,7 @@ struct SpaprMachineState {
     uint32_t vsmt;       /* Virtual SMT mode (KVM's "core stride") */
 
     /* Nested HV support (TCG only) */
-    uint64_t nested_ptcr;
+    SpaprMachineStateNested nested;
 
     Notifier epow_notifier;
     QTAILQ_HEAD(, SpaprEventLogEntry) pending_events;
diff --git a/include/hw/ppc/spapr_nested.h b/include/hw/ppc/spapr_nested.h
index e3d15d6d0b..0722b999cd 100644
--- a/include/hw/ppc/spapr_nested.h
+++ b/include/hw/ppc/spapr_nested.h
@@ -4,6 +4,10 @@
 #include "qemu/osdep.h"
 #include "target/ppc/cpu.h"
 
+typedef struct SpaprMachineStateNested {
+    uint64_t ptcr;
+} SpaprMachineStateNested;
+
 /*
  * Register state for entering a nested guest with H_ENTER_NESTED.
  * New member must be added at the end.
@@ -98,6 +102,7 @@ struct nested_ppc_state {
 
 void spapr_register_nested(void);
 void spapr_exit_nested(PowerPCCPU *cpu, int excp);
+typedef struct SpaprMachineState SpaprMachineState;
 bool spapr_get_pate_nested(SpaprMachineState *spapr, PowerPCCPU *cpu,
                            target_ulong lpid, ppc_v3_pate_t *entry);
 #endif /* HW_SPAPR_NESTED_H */
-- 
2.39.3




* [PATCH v2 03/14] spapr: nested: Document Nested PAPR API
  2023-10-12 10:49 [PATCH v2 00/14] Nested PAPR API (KVM on PowerVM) Harsh Prateek Bora
  2023-10-12 10:49 ` [PATCH v2 01/14] spapr: nested: move nested part of spapr_get_pate into spapr_nested.c Harsh Prateek Bora
  2023-10-12 10:49 ` [PATCH v2 02/14] spapr: nested: Introduce SpaprMachineStateNested to store related info Harsh Prateek Bora
@ 2023-10-12 10:49 ` Harsh Prateek Bora
  2023-10-12 10:49 ` [PATCH v2 04/14] spapr: nested: Introduce cap-nested-papr for " Harsh Prateek Bora
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Harsh Prateek Bora @ 2023-10-12 10:49 UTC (permalink / raw)
  To: npiggin, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

Add initial documentation about the Nested PAPR API describing the set
of hcalls and their usage. It also covers the Guest State Buffer
elements and their format, used between L0 and L1 to communicate L2
state.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
---
 docs/devel/nested-papr.txt | 500 +++++++++++++++++++++++++++++++++++++
 1 file changed, 500 insertions(+)
 create mode 100644 docs/devel/nested-papr.txt

diff --git a/docs/devel/nested-papr.txt b/docs/devel/nested-papr.txt
new file mode 100644
index 0000000000..3884794296
--- /dev/null
+++ b/docs/devel/nested-papr.txt
@@ -0,0 +1,500 @@
+Nested PAPR API (aka KVM on PowerVM)
+====================================
+
+This API aims to provide support for nested virtualization with KVM
+on PowerVM. The existing support for nested KVM on PowerNV was
+introduced with the cap-nested-hv option; enabling this on
+papr/pseries needs a slight design change, so a new cap-nested-papr
+option is added, e.g.:
+
+  qemu-system-ppc64 -cpu POWER10 -machine pseries,cap-nested-papr=true ...
+
+Work by:
+    Michael Neuling <mikey@neuling.org>
+    Vaibhav Jain <vaibhav@linux.ibm.com>
+    Jordan Niethe <jniethe5@gmail.com>
+    Harsh Prateek Bora <harshpb@linux.ibm.com>
+    Shivaprasad G Bhat <sbhat@linux.ibm.com>
+    Kautuk Consul <kconsul@linux.vnet.ibm.com>
+
+Below taken from the kernel documentation:
+
+Introduction
+============
+
+This document explains how a guest operating system can act as a
+hypervisor and run nested guests through the use of hypercalls, if the
+hypervisor has implemented them. The terms L0, L1, and L2 are used to
+refer to different software entities. L0 is the hypervisor mode entity
+that would normally be called the "host" or "hypervisor". L1 is a
+guest virtual machine that is directly run under L0 and is initiated
+and controlled by L0. L2 is a guest virtual machine that is initiated
+and controlled by L1 acting as a hypervisor. A significant design change
+with respect to the existing API is that the entire L2 state is now
+maintained within the L0.
+
+Existing Nested-HV API
+======================
+
+Linux/KVM has had support for nesting as an L0 or L1 since 2018.
+
+The L0 code was added::
+
+   commit 8e3f5fc1045dc49fd175b978c5457f5f51e7a2ce
+   Author: Paul Mackerras <paulus@ozlabs.org>
+   Date:   Mon Oct 8 16:31:03 2018 +1100
+   KVM: PPC: Book3S HV: Framework and hcall stubs for nested virtualization
+
+The L1 code was added::
+
+   commit 360cae313702cdd0b90f82c261a8302fecef030a
+   Author: Paul Mackerras <paulus@ozlabs.org>
+   Date:   Mon Oct 8 16:31:04 2018 +1100
+   KVM: PPC: Book3S HV: Nested guest entry via hypercall
+
+This API works primarily using a single hcall, h_enter_nested(). This
+call is made by the L1 to tell the L0 to start an L2 vCPU with the given
+state. The L0 then starts this L2 and runs until an L2 exit condition
+is reached. Once the L2 exits, the state of the L2 is given back to
+the L1 by the L0. The full L2 vCPU state is always transferred from
+and to L1 when the L2 is run. The L0 doesn't keep any state on the L2
+vCPU (except in the short sequence in the L0 on L1 -> L2 entry and L2
+-> L1 exit).
+
+The only state kept by the L0 is the partition table. The L1 registers
+its partition table using the h_set_partition_table() hcall. All
+other state held by the L0 about the L2s is cached state (such as
+shadow page tables).
+
+The L1 may run any L2 or vCPU without first informing the L0. It
+simply starts the vCPU using h_enter_nested(). The creation of L2s and
+vCPUs is done implicitly whenever h_enter_nested() is called.
+
+In this document, we call this existing API the v1 API.
+
+New PAPR API
+===============
+
+The new PAPR API changes from the v1 API such that creating the L2 and
+its associated vCPUs is explicit. In this document, we call this the v2
+API.
+
+h_enter_nested() is replaced with H_GUEST_RUN_VCPU(). Before this can
+be called, the L1 must explicitly create the L2 using H_GUEST_CREATE()
+and create any associated vCPUs with H_GUEST_CREATE_VCPU(). Getting
+and setting vCPU state can also be performed using the
+H_GUEST_{G,S}ET_STATE() hcalls.
+
+The basic execution flow for an L1 to create an L2, run it, and
+delete it is:
+
+- L1 and L0 negotiate capabilities with H_GUEST_{G,S}ET_CAPABILITIES()
+  (normally at L1 boot time).
+
+- L1 requests the L0 to create an L2 with H_GUEST_CREATE() and receives a token
+
+- L1 requests the L0 to create an L2 vCPU with H_GUEST_CREATE_VCPU()
+
+- L1 and L0 communicate the vCPU state using the H_GUEST_{G,S}ET_STATE()
+  hcalls
+
+- L1 requests the L0 to run the vCPU using the H_GUEST_RUN_VCPU() hcall
+
+- L1 deletes L2 with H_GUEST_DELETE()
+
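+As an end-to-end illustration of this flow, consider the sketch below.
+Every guest_*() helper is a hypothetical stand-in for issuing the
+corresponding hcall (stubbed out here so the sketch is self-contained);
+capability negotiation, GSB contents and error handling are elided::
+
+   #include <stdint.h>
+
+   /* hypothetical wrappers; a real L1 would trap to the L0 here */
+   static int64_t guest_create(uint64_t *id) { *id = 1; return 0; }
+   static int64_t guest_create_vcpu(uint64_t id, uint64_t vcpu) { return 0; }
+   static int64_t guest_set_state(uint64_t id, uint64_t vcpu) { return 0; }
+   static int64_t guest_run_vcpu(uint64_t id, uint64_t vcpu) { return 0; }
+   static int64_t guest_delete(uint64_t id) { return 0; }
+
+   static void l2_lifecycle(void)
+   {
+       uint64_t id, vcpu = 0;
+
+       guest_create(&id);           /* H_GUEST_CREATE -> guest token */
+       guest_create_vcpu(id, vcpu); /* H_GUEST_CREATE_VCPU */
+       guest_set_state(id, vcpu);   /* H_GUEST_SET_STATE, via a GSB */
+       guest_run_vcpu(id, vcpu);    /* H_GUEST_RUN_VCPU, until exit */
+       guest_delete(id);            /* H_GUEST_DELETE */
+   }
+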
+More details of the individual hcalls follow:
+
+HCALL Details
+=============
+
+This documentation is provided to give an overall understanding of the
+API. It doesn't aim to provide the full details required to implement
+an L1 or L0. Refer to the latest PAPR spec for more details.
+
+All these HCALLs are made by the L1 to the L0.
+
+H_GUEST_GET_CAPABILITIES()
+--------------------------
+
+This is called to get the capabilities of the L0 nested
+hypervisor. These include capabilities such as the CPU versions (e.g.
+POWER9, POWER10) that are supported as L2s.
+
+H_GUEST_SET_CAPABILITIES()
+--------------------------
+
+This is called to inform the L0 of the capabilities of the L1
+hypervisor. The set of flags passed here is the same as for
+H_GUEST_GET_CAPABILITIES().
+
+Typically, GET will be called first and then SET will be called with a
+subset of the flags returned from GET. This process allows the L0 and
+L1 to negotiate an agreed set of capabilities, as sketched below.
+
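+As a minimal sketch of this negotiation from the L1 side (the
+guest_{get,set}_caps() wrappers are hypothetical stand-ins for the
+actual hcall mechanism, stubbed out here; the flag values are the ones
+defined by this series)::
+
+   #include <stdint.h>
+
+   #define H_GUEST_CAPABILITIES_COPY_MEM 0x8000000000000000ULL
+   #define H_GUEST_CAPABILITIES_P9_MODE  0x4000000000000000ULL
+   #define H_GUEST_CAPABILITIES_P10_MODE 0x2000000000000000ULL
+
+   /* hypothetical wrappers; a real L1 would trap to the L0 here */
+   static int64_t guest_get_caps(uint64_t flags, uint64_t *caps)
+   {
+       *caps = H_GUEST_CAPABILITIES_P10_MODE; /* pretend L0 reply */
+       return 0;                              /* H_SUCCESS */
+   }
+
+   static int64_t guest_set_caps(uint64_t flags, uint64_t caps)
+   {
+       return 0; /* H_SUCCESS */
+   }
+
+   /* GET first, then SET with the subset this L1 also supports */
+   static int64_t negotiate_caps(uint64_t *agreed)
+   {
+       uint64_t l0_caps;
+       int64_t rc = guest_get_caps(0, &l0_caps);
+
+       if (rc != 0) {
+           return rc;
+       }
+       *agreed = l0_caps & H_GUEST_CAPABILITIES_P10_MODE;
+       return guest_set_caps(0, *agreed);
+   }
+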
+H_GUEST_CREATE()
+----------------
+
+This is called to create an L2. Returned is the ID of the newly
+created L2 (similar to an LPID), which can be used on subsequent
+hcalls to identify the L2.
+
+H_GUEST_CREATE_VCPU()
+---------------------
+
+This is called to create a vCPU associated with an L2. The L2 ID
+(returned from H_GUEST_CREATE()) should be passed in. Also passed in
+is a vCPU ID, unique within this L2, which is allocated by the
+L1.
+
+H_GUEST_SET_STATE()
+-------------------
+
+This is called to set L2 wide or vCPU specific L2 state. This info is
+passed via the Guest State Buffer (GSB), details below.
+
+This can set either L2-wide or vCPU-specific information. Examples of
+L2-wide state are the timebase offset or process-scoped page table
+info; examples of vCPU-specific state are GPRs or VSRs. A bit in the
+flags parameter specifies whether this call is L2 wide or vCPU
+specific, and the IDs in the GSB must match this.
+
+The L1 provides a pointer to the GSB as a parameter to this call. Also
+provided are the L2 and vCPU IDs associated with the state to set.
+
+The L1 writes all values in the GSB, and the L0 only reads the GSB for
+this call.
+
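+For example, a GSB setting just the L2 vCPU's NIA (element ID 0x1021,
+8 bytes, see the table below) could be serialized as in this sketch;
+it only builds the buffer, issuing the hcall with its address is left
+to the L1::
+
+   #include <stddef.h>
+   #include <stdint.h>
+
+   static uint8_t *put_be16(uint8_t *p, uint16_t v)
+   {
+       p[0] = v >> 8;
+       p[1] = v;
+       return p + 2;
+   }
+
+   static uint8_t *put_be32(uint8_t *p, uint32_t v)
+   {
+       p = put_be16(p, v >> 16);
+       return put_be16(p, v);
+   }
+
+   static uint8_t *put_be64(uint8_t *p, uint64_t v)
+   {
+       p = put_be32(p, v >> 32);
+       return put_be32(p, v);
+   }
+
+   /* header: element count, then one ID/size/value element */
+   static size_t gsb_set_nia(uint8_t *buf, uint64_t nia)
+   {
+       uint8_t *p = buf;
+
+       p = put_be32(p, 1);      /* number of elements */
+       p = put_be16(p, 0x1021); /* element ID: NIA */
+       p = put_be16(p, 0x08);   /* size of value */
+       p = put_be64(p, nia);    /* big-endian value */
+       return p - buf;          /* 16 bytes total */
+   }
+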
+H_GUEST_GET_STATE()
+-------------------
+
+This is called to get state associated with an L2 or an L2 vCPU. This
+info is passed via the GSB (details below).
+
+This can get either L2-wide or vCPU-specific information. Examples of
+L2-wide state are the timebase offset or process-scoped page table
+info; examples of vCPU-specific state are GPRs or VSRs. A bit in the
+flags parameter specifies whether this call is L2 wide or vCPU
+specific, and the IDs in the GSB must match this.
+
+The L1 provides a pointer to the GSB as a parameter to this call. Also
+provided are the L2 and vCPU IDs associated with the state to get.
+
+The L1 writes only the IDs and sizes in the GSB. The L0 writes the
+associated values for each ID in the GSB.
+
+H_GUEST_RUN_VCPU()
+------------------
+
+This is called to run an L2 vCPU. The L2 and vCPU IDs are passed in as
+parameters. The vCPU runs with the state set previously using
+H_GUEST_SET_STATE(). When the L2 exits, the L1 will resume from this
+hcall.
+
+This hcall also has associated input and output GSBs. Unlike
+H_GUEST_{S,G}ET_STATE(), these GSB pointers are not passed in as
+parameters to the hcall (this was done in the interest of
+performance). The locations of these GSBs must be preregistered using
+the H_GUEST_SET_STATE() call with IDs 0x0C00 and 0x0C01 (see the table
+below).
+
+The input GSB may contain only vCPU-specific elements to be set. This
+GSB may also contain zero elements (i.e. 0 in the first 4 bytes of the
+GSB) if nothing needs to be set.
+
+On exit from the hcall, the output buffer is filled with elements
+determined by the L0. The reason for the exit is contained in GPR4
+(i.e. the NIP is put in GPR4). The elements returned depend on the
+exit type. For example, if the exit reason is the L2 doing an hcall
+(GPR4 = 0xc00), then GPR3-12 are provided in the output GSB, as this
+is the state likely needed to service the hcall. If additional state
+is needed, H_GUEST_GET_STATE() may be called by the L1.
+
+To synthesize interrupts in the L2, when calling H_GUEST_RUN_VCPU()
+the L1 may set a flag (as an hcall parameter) and the L0 will
+synthesize the interrupt in the L2. Alternatively, the L1 may
+synthesize the interrupt itself using H_GUEST_SET_STATE() or the
+H_GUEST_RUN_VCPU() input GSB to set the state appropriately.
+
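+For illustration, preregistering the two run buffers could reuse the
+put_be*() helpers from the H_GUEST_SET_STATE() sketch above: a
+two-element GSB passed to H_GUEST_SET_STATE(), where each 0x10-byte
+value is an address followed by a size (see the PAPR spec for the
+exact addressing requirements)::
+
+   static size_t gsb_register_run_bufs(uint8_t *buf,
+                                       uint64_t in_addr, uint64_t in_size,
+                                       uint64_t out_addr, uint64_t out_size)
+   {
+       uint8_t *p = buf;
+
+       p = put_be32(p, 2);       /* two elements follow */
+
+       p = put_be16(p, 0x0C00);  /* Run vCPU Input Buffer */
+       p = put_be16(p, 0x10);    /* value size */
+       p = put_be64(p, in_addr);
+       p = put_be64(p, in_size);
+
+       p = put_be16(p, 0x0C01);  /* Run vCPU Output Buffer */
+       p = put_be16(p, 0x10);    /* value size */
+       p = put_be64(p, out_addr);
+       p = put_be64(p, out_size);
+
+       return p - buf;
+   }
+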
+H_GUEST_DELETE()
+----------------
+
+This is called to delete an L2. All associated vCPUs are also
+deleted. No specific vCPU delete call is provided.
+
+A flag may be provided to delete all guests. This is used to reset the
+L0 in the case of kdump/kexec.
+
+Guest State Buffer (GSB)
+========================
+
+The Guest State Buffer (GSB) is the main method of communicating state
+about the L2 between the L1 and L0, via the H_GUEST_{G,S}ET_STATE() and
+H_GUEST_RUN_VCPU() calls.
+
+State may be associated with a whole L2 (e.g. timebase offset) or a
+specific L2 vCPU (e.g. GPR state). Only L2 vCPU state may be set by
+H_GUEST_RUN_VCPU().
+
+All data in the GSB is big endian (as is standard in PAPR).
+
+The Guest state buffer has a header which gives the number of
+elements, followed by the GSB elements themselves.
+
+GSB header:
+
++----------+----------+-------------------------------------------+
+|  Offset  |  Size    |  Purpose                                  |
+|  Bytes   |  Bytes   |                                           |
++==========+==========+===========================================+
+|    0     |    4     |  Number of elements                       |
++----------+----------+-------------------------------------------+
+|    4     |          |  Guest state buffer elements              |
++----------+----------+-------------------------------------------+
+
+GSB element:
+
++----------+----------+-------------------------------------------+
+|  Offset  |  Size    |  Purpose                                  |
+|  Bytes   |  Bytes   |                                           |
++==========+==========+===========================================+
+|    0     |    2     |  ID                                       |
++----------+----------+-------------------------------------------+
+|    2     |    2     |  Size of Value                            |
++----------+----------+-------------------------------------------+
+|    4     | As above |  Value                                    |
++----------+----------+-------------------------------------------+
+
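+An illustrative (not normative) C view of the same layout; the field
+names here are ours, and since elements are variable length they
+cannot be indexed as a plain array::
+
+   #include <stdint.h>
+
+   struct gsb_element {
+       uint16_t id;       /* what state this element carries */
+       uint16_t size;     /* size of value[] in bytes */
+       uint8_t  value[];  /* big-endian value, 'size' bytes */
+   } __attribute__((packed));
+
+   struct gsb_header {
+       uint32_t nelems;   /* number of elements that follow */
+       /* 'nelems' struct gsb_element follow, back to back */
+   } __attribute__((packed));
+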
+The ID in the GSB element specifies what is to be set. This includes
+architected state like GPRs, VSRs and SPRs, plus some metadata about
+the partition, like the timebase offset and partition-scoped page
+table information.
+
++--------+-------+----+--------+----------------------------------+
+|   ID   | Size  | RW | Thread | Details                          |
+|        | Bytes |    | Guest  |                                  |
+|        |       |    | Scope  |                                  |
++========+=======+====+========+==================================+
+| 0x0000 |       | RW |   TG   | NOP element                      |
++--------+-------+----+--------+----------------------------------+
+| 0x0001 | 0x08  | R  |   G    | Size of L0 vCPU state            |
++--------+-------+----+--------+----------------------------------+
+| 0x0002 | 0x08  | R  |   G    | Size of run vCPU out buffer      |
++--------+-------+----+--------+----------------------------------+
+| 0x0003 | 0x04  | RW |   G    | Logical PVR                      |
++--------+-------+----+--------+----------------------------------+
+| 0x0004 | 0x08  | RW |   G    | TB Offset (L1 relative)          |
++--------+-------+----+--------+----------------------------------+
+| 0x0005 | 0x18  | RW |   G    |Partition scoped page tbl info:   |
+|        |       |    |        |                                  |
+|        |       |    |        |- 0x00 Addr part scope table      |
+|        |       |    |        |- 0x08 Num addr bits              |
+|        |       |    |        |- 0x10 Size root dir              |
++--------+-------+----+--------+----------------------------------+
+| 0x0006 | 0x10  | RW |   G    |Process Table Information:        |
+|        |       |    |        |                                  |
+|        |       |    |        |- 0x0 Addr proc scope table       |
+|        |       |    |        |- 0x8 Table size.                 |
++--------+-------+----+--------+----------------------------------+
+| 0x0007-|       |    |        | Reserved                         |
+| 0x0BFF |       |    |        |                                  |
++--------+-------+----+--------+----------------------------------+
+| 0x0C00 | 0x10  | RW |   T    |Run vCPU Input Buffer:            |
+|        |       |    |        |                                  |
+|        |       |    |        |- 0x0 Addr of buffer              |
+|        |       |    |        |- 0x8 Buffer Size.                |
++--------+-------+----+--------+----------------------------------+
+| 0x0C01 | 0x10  | RW |   T    |Run vCPU Output Buffer:           |
+|        |       |    |        |                                  |
+|        |       |    |        |- 0x0 Addr of buffer              |
+|        |       |    |        |- 0x8 Buffer Size.                |
++--------+-------+----+--------+----------------------------------+
+| 0x0C02 | 0x08  | RW |   T    | vCPU VPA Address                 |
++--------+-------+----+--------+----------------------------------+
+| 0x0C03-|       |    |        | Reserved                         |
+| 0x0FFF |       |    |        |                                  |
++--------+-------+----+--------+----------------------------------+
+| 0x1000-| 0x08  | RW |   T    | GPR 0-31                         |
+| 0x101F |       |    |        |                                  |
++--------+-------+----+--------+----------------------------------+
+| 0x1020 | 0x08  | T  |   T    | HDEC expiry TB                   |
++--------+-------+----+--------+----------------------------------+
+| 0x1021 | 0x08  | RW |   T    | NIA                              |
++--------+-------+----+--------+----------------------------------+
+| 0x1022 | 0x08  | RW |   T    | MSR                              |
++--------+-------+----+--------+----------------------------------+
+| 0x1023 | 0x08  | RW |   T    | LR                               |
++--------+-------+----+--------+----------------------------------+
+| 0x1024 | 0x08  | RW |   T    | XER                              |
++--------+-------+----+--------+----------------------------------+
+| 0x1025 | 0x08  | RW |   T    | CTR                              |
++--------+-------+----+--------+----------------------------------+
+| 0x1026 | 0x08  | RW |   T    | CFAR                             |
++--------+-------+----+--------+----------------------------------+
+| 0x1027 | 0x08  | RW |   T    | SRR0                             |
++--------+-------+----+--------+----------------------------------+
+| 0x1028 | 0x08  | RW |   T    | SRR1                             |
++--------+-------+----+--------+----------------------------------+
+| 0x1029 | 0x08  | RW |   T    | DAR                              |
++--------+-------+----+--------+----------------------------------+
+| 0x102A | 0x08  | RW |   T    | DEC expiry TB                    |
++--------+-------+----+--------+----------------------------------+
+| 0x102B | 0x08  | RW |   T    | VTB                              |
++--------+-------+----+--------+----------------------------------+
+| 0x102C | 0x08  | RW |   T    | LPCR                             |
++--------+-------+----+--------+----------------------------------+
+| 0x102D | 0x08  | RW |   T    | HFSCR                            |
++--------+-------+----+--------+----------------------------------+
+| 0x102E | 0x08  | RW |   T    | FSCR                             |
++--------+-------+----+--------+----------------------------------+
+| 0x102F | 0x08  | RW |   T    | FPSCR                            |
++--------+-------+----+--------+----------------------------------+
+| 0x1030 | 0x08  | RW |   T    | DAWR0                            |
++--------+-------+----+--------+----------------------------------+
+| 0x1031 | 0x08  | RW |   T    | DAWR1                            |
++--------+-------+----+--------+----------------------------------+
+| 0x1032 | 0x08  | RW |   T    | CIABR                            |
++--------+-------+----+--------+----------------------------------+
+| 0x1033 | 0x08  | RW |   T    | PURR                             |
++--------+-------+----+--------+----------------------------------+
+| 0x1034 | 0x08  | RW |   T    | SPURR                            |
++--------+-------+----+--------+----------------------------------+
+| 0x1035 | 0x08  | RW |   T    | IC                               |
++--------+-------+----+--------+----------------------------------+
+| 0x1036-| 0x08  | RW |   T    | SPRG 0-3                         |
+| 0x1039 |       |    |        |                                  |
++--------+-------+----+--------+----------------------------------+
+| 0x103A | 0x08  | W  |   T    | PPR                              |
++--------+-------+----+--------+----------------------------------+
+| 0x103B-| 0x08  | RW |   T    | MMCR 0-3                         |
+| 0x103E |       |    |        |                                  |
++--------+-------+----+--------+----------------------------------+
+| 0x103F | 0x08  | RW |   T    | MMCRA                            |
++--------+-------+----+--------+----------------------------------+
+| 0x1040 | 0x08  | RW |   T    | SIER                             |
++--------+-------+----+--------+----------------------------------+
+| 0x1041 | 0x08  | RW |   T    | SIER 2                           |
++--------+-------+----+--------+----------------------------------+
+| 0x1042 | 0x08  | RW |   T    | SIER 3                           |
++--------+-------+----+--------+----------------------------------+
+| 0x1043 | 0x08  | RW |   T    | BESCR                            |
++--------+-------+----+--------+----------------------------------+
+| 0x1044 | 0x08  | RW |   T    | EBBHR                            |
++--------+-------+----+--------+----------------------------------+
+| 0x1045 | 0x08  | RW |   T    | EBBRR                            |
++--------+-------+----+--------+----------------------------------+
+| 0x1046 | 0x08  | RW |   T    | AMR                              |
++--------+-------+----+--------+----------------------------------+
+| 0x1047 | 0x08  | RW |   T    | IAMR                             |
++--------+-------+----+--------+----------------------------------+
+| 0x1048 | 0x08  | RW |   T    | AMOR                             |
++--------+-------+----+--------+----------------------------------+
+| 0x1049 | 0x08  | RW |   T    | UAMOR                            |
++--------+-------+----+--------+----------------------------------+
+| 0x104A | 0x08  | RW |   T    | SDAR                             |
++--------+-------+----+--------+----------------------------------+
+| 0x104B | 0x08  | RW |   T    | SIAR                             |
++--------+-------+----+--------+----------------------------------+
+| 0x104C | 0x08  | RW |   T    | DSCR                             |
++--------+-------+----+--------+----------------------------------+
+| 0x104D | 0x08  | RW |   T    | TAR                              |
++--------+-------+----+--------+----------------------------------+
+| 0x104E | 0x08  | RW |   T    | DEXCR                            |
++--------+-------+----+--------+----------------------------------+
+| 0x104F | 0x08  | RW |   T    | HDEXCR                           |
++--------+-------+----+--------+----------------------------------+
+| 0x1050 | 0x08  | RW |   T    | HASHKEYR                         |
++--------+-------+----+--------+----------------------------------+
+| 0x1051 | 0x08  | RW |   T    | HASHPKEYR                        |
++--------+-------+----+--------+----------------------------------+
+| 0x1052 | 0x08  | RW |   T    | CTRL                             |
++--------+-------+----+--------+----------------------------------+
+| 0x1053-|       |    |        | Reserved                         |
+| 0x1FFF |       |    |        |                                  |
++--------+-------+----+--------+----------------------------------+
+| 0x2000 | 0x04  | RW |   T    | CR                               |
++--------+-------+----+--------+----------------------------------+
+| 0x2001 | 0x04  | RW |   T    | PIDR                             |
++--------+-------+----+--------+----------------------------------+
+| 0x2002 | 0x04  | RW |   T    | DSISR                            |
++--------+-------+----+--------+----------------------------------+
+| 0x2003 | 0x04  | RW |   T    | VSCR                             |
++--------+-------+----+--------+----------------------------------+
+| 0x2004 | 0x04  | RW |   T    | VRSAVE                           |
++--------+-------+----+--------+----------------------------------+
+| 0x2005 | 0x04  | RW |   T    | DAWRX0                           |
++--------+-------+----+--------+----------------------------------+
+| 0x2006 | 0x04  | RW |   T    | DAWRX1                           |
++--------+-------+----+--------+----------------------------------+
+| 0x2007-| 0x04  | RW |   T    | PMC 1-6                          |
+| 0x200C |       |    |        |                                  |
++--------+-------+----+--------+----------------------------------+
+| 0x200D | 0x04  | RW |   T    | WORT                             |
++--------+-------+----+--------+----------------------------------+
+| 0x200E | 0x04  | RW |   T    | PSPB                             |
++--------+-------+----+--------+----------------------------------+
+| 0x200F-|       |    |        | Reserved                         |
+| 0x2FFF |       |    |        |                                  |
++--------+-------+----+--------+----------------------------------+
+| 0x3000-| 0x10  | RW |   T    | VSR 0-63                         |
+| 0x303F |       |    |        |                                  |
++--------+-------+----+--------+----------------------------------+
+| 0x3040-|       |    |        | Reserved                         |
+| 0xEFFF |       |    |        |                                  |
++--------+-------+----+--------+----------------------------------+
+| 0xF000 | 0x08  | R  |   T    | HDAR                             |
++--------+-------+----+--------+----------------------------------+
+| 0xF001 | 0x04  | R  |   T    | HDSISR                           |
++--------+-------+----+--------+----------------------------------+
+| 0xF002 | 0x04  | R  |   T    | HEIR                             |
++--------+-------+----+--------+----------------------------------+
+| 0xF003 | 0x08  | R  |   T    | ASDR                             |
++--------+-------+----+--------+----------------------------------+
+
+Miscellaneous info
+==================
+
+State not in ptregs/hvregs
+--------------------------
+
+In the v1 API, some state is not in the ptregs/hvstate. This includes
+the vector registers and some SPRs. For the L1 to set this state for
+the L2, the L1 loads up these hardware registers before the
+h_enter_nested() call and the L0 ensures they end up as the L2 state
+(by not touching them).
+
+The v2 API removes this and explicitly sets this state via the GSB.
+
+L1 Implementation details: Caching state
+----------------------------------------
+
+In the v1 API, all state is sent from the L1 to the L0 and vice versa
+on every h_enter_nested() hcall. If the L0 is not currently running
+any L2s, the L0 has no state information about them. The only
+exception to this is the location of the partition table, registered
+via h_set_partition_table().
+
+The v2 API changes this so that the L0 retains the L2 state even when
+its vCPUs are no longer running. This means that the L1 only needs to
+communicate with the L0 about L2 state when it needs to modify the L2
+state, or when its value is out of date. This provides an opportunity
+for performance optimisation.
+
+When a vCPU exits from a H_GUEST_RUN_VCPU() call, the L1 internally
+marks all L2 state as invalid. This means that if the L1 wants to know
+the L2 state (say via a kvm_get_one_reg() call), it needs to call
+H_GUEST_GET_STATE() to get that state. Once read, it is marked as
+valid in the L1 until the L2 is run again.
+
+Also, when an L1 modifies L2 vCPU state, it doesn't need to write it
+to the L0 until that L2 vCPU runs again. Hence, when the L1 updates
+state (say via a kvm_set_one_reg() call), it writes to an internal L1
+copy and only flushes this copy to the L0 when the L2 runs again, via
+the H_GUEST_RUN_VCPU() input buffer.
+
+This lazy updating of state by the L1 avoids unnecessary
+H_GUEST_{G|S}ET_STATE() calls.
+
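+A sketch of the bookkeeping this implies on the L1 side (the struct
+and helpers are illustrative only, not the actual KVM implementation)::
+
+   #include <stdbool.h>
+   #include <stdint.h>
+
+   struct l1_cached_reg {
+       uint64_t value;
+       bool valid;  /* value reflects what the L0 holds */
+       bool dirty;  /* value must be flushed before the next run */
+   };
+
+   /* on H_GUEST_RUN_VCPU() exit: anything may have changed in the L0 */
+   static void on_vcpu_exit(struct l1_cached_reg *regs, int n)
+   {
+       for (int i = 0; i < n; i++) {
+           regs[i].valid = false;
+       }
+   }
+
+   /* kvm_get_one_reg() path: fetch lazily via H_GUEST_GET_STATE() */
+   static uint64_t get_reg(struct l1_cached_reg *r)
+   {
+       if (!r->valid) {
+           /* ... H_GUEST_GET_STATE() would refresh r->value ... */
+           r->valid = true;
+       }
+       return r->value;
+   }
+
+   /* kvm_set_one_reg() path: write locally, flush on the next run */
+   static void set_reg(struct l1_cached_reg *r, uint64_t v)
+   {
+       r->value = v;
+       r->valid = true;
+       r->dirty = true; /* flushed via the H_GUEST_RUN_VCPU() input GSB */
+   }
+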
+References
+==========
+
+For more details, please refer to:
+
+[1] Kernel documentation (patches accepted upstream):
+    https://lore.kernel.org/linuxppc-dev/169528846875.874757.8861595746180557787.b4-ty@ellerman.id.au/
-- 
2.39.3




* [PATCH v2 04/14] spapr: nested: Introduce cap-nested-papr for Nested PAPR API
  2023-10-12 10:49 [PATCH v2 00/14] Nested PAPR API (KVM on PowerVM) Harsh Prateek Bora
                   ` (2 preceding siblings ...)
  2023-10-12 10:49 ` [PATCH v2 03/14] spapr: nested: Document Nested PAPR API Harsh Prateek Bora
@ 2023-10-12 10:49 ` Harsh Prateek Bora
  2023-11-29  4:01   ` Nicholas Piggin
  2023-10-12 10:49 ` [PATCH v2 05/14] spapr: nested: register nested-hv api hcalls only for cap-nested-hv Harsh Prateek Bora
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 26+ messages in thread
From: Harsh Prateek Bora @ 2023-10-12 10:49 UTC (permalink / raw)
  To: npiggin, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

Introduce a SPAPR capability cap-nested-papr which provides a nested
HV facility to the guest. This is similar to cap-nested-hv, but uses
a different (incompatible) API, so the two are mutually exclusive.
This new API enables support for KVM on PowerVM; the Linux kernel
side patches have recently been accepted upstream as well [1].
Support for the related hcalls is added in the next set of patches.
[1]
https://lore.kernel.org/linuxppc-dev/169528846875.874757.8861595746180557787.b4-ty@ellerman.id.au/

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
---
 hw/ppc/spapr.c                |  9 +++++-
 hw/ppc/spapr_caps.c           | 61 +++++++++++++++++++++++++++++++++++
 hw/ppc/spapr_nested.c         | 15 +++++++++
 include/hw/ppc/spapr.h        |  5 ++-
 include/hw/ppc/spapr_nested.h |  5 +++
 5 files changed, 93 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a2c69d0f4f..14196fdd11 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1355,7 +1355,11 @@ static bool spapr_get_pate(PPCVirtualHypervisor *vhyp, PowerPCCPU *cpu,
         entry->dw1 = spapr->patb_entry;
         return true;
     } else {
-        return spapr_get_pate_nested(spapr, cpu, lpid, entry);
+        assert(spapr->nested.api);
+        if (spapr->nested.api == NESTED_API_KVM_HV) {
+            return spapr_get_pate_nested(spapr, cpu, lpid, entry);
+        }
+        return false;
     }
 }
 
@@ -2093,6 +2097,7 @@ static const VMStateDescription vmstate_spapr = {
         &vmstate_spapr_cap_fwnmi,
         &vmstate_spapr_fwnmi,
         &vmstate_spapr_cap_rpt_invalidate,
+        &vmstate_spapr_cap_nested_papr,
         NULL
     }
 };
@@ -3437,6 +3442,7 @@ static void spapr_instance_init(Object *obj)
         spapr_get_host_serial, spapr_set_host_serial);
     object_property_set_description(obj, "host-serial",
         "Host serial number to advertise in guest device tree");
+    spapr_nested_init(spapr);
 }
 
 static void spapr_machine_finalizefn(Object *obj)
@@ -4675,6 +4681,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_WORKAROUND;
     smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = 16; /* 64kiB */
     smc->default_caps.caps[SPAPR_CAP_NESTED_KVM_HV] = SPAPR_CAP_OFF;
+    smc->default_caps.caps[SPAPR_CAP_NESTED_PAPR] = SPAPR_CAP_OFF;
     smc->default_caps.caps[SPAPR_CAP_LARGE_DECREMENTER] = SPAPR_CAP_ON;
     smc->default_caps.caps[SPAPR_CAP_CCF_ASSIST] = SPAPR_CAP_ON;
     smc->default_caps.caps[SPAPR_CAP_FWNMI] = SPAPR_CAP_ON;
diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index 5a0755d34f..9b53f19ec8 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -454,6 +454,14 @@ static void cap_nested_kvm_hv_apply(SpaprMachineState *spapr,
         return;
     }
 
+    if (!spapr->nested.api) {
+        spapr->nested.api = NESTED_API_KVM_HV;
+    } else {
+        error_setg(errp, "Nested-HV APIs are mutually exclusive/incompatible");
+        error_append_hint(errp, "Please use either cap-nested-hv or "
+                                 "cap-nested-papr to proceed.\n");
+        return;
+    }
     if (kvm_enabled()) {
         if (!ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_3_00, 0,
                               spapr->max_compat_pvr)) {
@@ -490,6 +498,49 @@ static void cap_nested_kvm_hv_apply(SpaprMachineState *spapr,
     }
 }
 
+static void cap_nested_papr_apply(SpaprMachineState *spapr,
+                                    uint8_t val, Error **errp)
+{
+    ERRP_GUARD();
+    PowerPCCPU *cpu = POWERPC_CPU(first_cpu);
+    CPUPPCState *env = &cpu->env;
+
+    if (!val) {
+        /* capability disabled by default */
+        return;
+    }
+
+    if (!spapr->nested.api) {
+        spapr->nested.api = NESTED_API_PAPR;
+        spapr_register_nested_papr();
+    } else {
+        error_setg(errp, "Nested-HV APIs are mutually exclusive/incompatible");
+        error_append_hint(errp, "Please use either cap-nested-hv or "
+                                 "cap-nested-papr to proceed.\n");
+        return;
+    }
+
+    if (tcg_enabled()) {
+        if (!(env->insns_flags2 & PPC2_ISA300)) {
+            error_setg(errp, "Nested-PAPR only supported on POWER9 and later");
+            error_append_hint(errp,
+                              "Try appending -machine cap-nested-papr=off\n");
+            return;
+        }
+    } else if (kvm_enabled()) {
+        /*
+         * This is executed in the L1 QEMU when an L2 is launched and
+         * needs kvm-hv support in the L1 kernel.
+         */
+        if (!kvmppc_has_cap_nested_kvm_hv()) {
+            error_setg(errp,
+                       "KVM implementation does not support Nested-HV");
+        } else if (kvmppc_set_cap_nested_kvm_hv(val) < 0) {
+            error_setg(errp, "Error enabling Nested-HV with KVM");
+        }
+    }
+}
+
 static void cap_large_decr_apply(SpaprMachineState *spapr,
                                  uint8_t val, Error **errp)
 {
@@ -735,6 +786,15 @@ SpaprCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
         .type = "bool",
         .apply = cap_nested_kvm_hv_apply,
     },
+    [SPAPR_CAP_NESTED_PAPR] = {
+        .name = "nested-papr",
+        .description = "Allow Nested HV (PAPR API)",
+        .index = SPAPR_CAP_NESTED_PAPR,
+        .get = spapr_cap_get_bool,
+        .set = spapr_cap_set_bool,
+        .type = "bool",
+        .apply = cap_nested_papr_apply,
+    },
     [SPAPR_CAP_LARGE_DECREMENTER] = {
         .name = "large-decr",
         .description = "Allow Large Decrementer",
@@ -919,6 +979,7 @@ SPAPR_CAP_MIG_STATE(sbbc, SPAPR_CAP_SBBC);
 SPAPR_CAP_MIG_STATE(ibs, SPAPR_CAP_IBS);
 SPAPR_CAP_MIG_STATE(hpt_maxpagesize, SPAPR_CAP_HPT_MAXPAGESIZE);
 SPAPR_CAP_MIG_STATE(nested_kvm_hv, SPAPR_CAP_NESTED_KVM_HV);
+SPAPR_CAP_MIG_STATE(nested_papr, SPAPR_CAP_NESTED_PAPR);
 SPAPR_CAP_MIG_STATE(large_decr, SPAPR_CAP_LARGE_DECREMENTER);
 SPAPR_CAP_MIG_STATE(ccf_assist, SPAPR_CAP_CCF_ASSIST);
 SPAPR_CAP_MIG_STATE(fwnmi, SPAPR_CAP_FWNMI);
diff --git a/hw/ppc/spapr_nested.c b/hw/ppc/spapr_nested.c
index db47c1196f..87a0db22a5 100644
--- a/hw/ppc/spapr_nested.c
+++ b/hw/ppc/spapr_nested.c
@@ -8,6 +8,11 @@
 #include "hw/ppc/spapr_nested.h"
 #include "mmu-book3s-v3.h"
 
+void spapr_nested_init(SpaprMachineState *spapr)
+{
+    spapr->nested.api = 0;
+}
+
 bool spapr_get_pate_nested(SpaprMachineState *spapr, PowerPCCPU *cpu,
                            target_ulong lpid, ppc_v3_pate_t *entry)
 {
@@ -411,6 +416,11 @@ void spapr_register_nested(void)
     spapr_register_hypercall(KVMPPC_H_TLB_INVALIDATE, h_tlb_invalidate);
     spapr_register_hypercall(KVMPPC_H_COPY_TOFROM_GUEST, h_copy_tofrom_guest);
 }
+
+void spapr_register_nested_papr(void)
+{
+    /* register hcalls here */
+}
 #else
 void spapr_exit_nested(PowerPCCPU *cpu, int excp)
 {
@@ -421,4 +431,9 @@ void spapr_register_nested(void)
 {
     /* DO NOTHING */
 }
+
+void spapr_register_nested_papr(void)
+{
+    /* DO NOTHING */
+}
 #endif
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 3e825f2787..e33ee87ba4 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -81,8 +81,10 @@ typedef enum {
 #define SPAPR_CAP_RPT_INVALIDATE        0x0B
 /* Support for AIL modes */
 #define SPAPR_CAP_AIL_MODE_3            0x0C
+/* Nested PAPR */
+#define SPAPR_CAP_NESTED_PAPR           0x0D
 /* Num Caps */
-#define SPAPR_CAP_NUM                   (SPAPR_CAP_AIL_MODE_3 + 1)
+#define SPAPR_CAP_NUM                   (SPAPR_CAP_NESTED_PAPR + 1)
 
 /*
  * Capability Values
@@ -982,6 +984,7 @@ extern const VMStateDescription vmstate_spapr_cap_sbbc;
 extern const VMStateDescription vmstate_spapr_cap_ibs;
 extern const VMStateDescription vmstate_spapr_cap_hpt_maxpagesize;
 extern const VMStateDescription vmstate_spapr_cap_nested_kvm_hv;
+extern const VMStateDescription vmstate_spapr_cap_nested_papr;
 extern const VMStateDescription vmstate_spapr_cap_large_decr;
 extern const VMStateDescription vmstate_spapr_cap_ccf_assist;
 extern const VMStateDescription vmstate_spapr_cap_fwnmi;
diff --git a/include/hw/ppc/spapr_nested.h b/include/hw/ppc/spapr_nested.h
index 0722b999cd..efdfc78200 100644
--- a/include/hw/ppc/spapr_nested.h
+++ b/include/hw/ppc/spapr_nested.h
@@ -6,6 +6,9 @@
 
 typedef struct SpaprMachineStateNested {
     uint64_t ptcr;
+    uint8_t api;
+#define NESTED_API_KVM_HV  1
+#define NESTED_API_PAPR    2
 } SpaprMachineStateNested;
 
 /*
@@ -105,4 +108,6 @@ void spapr_exit_nested(PowerPCCPU *cpu, int excp);
 typedef struct SpaprMachineState SpaprMachineState;
 bool spapr_get_pate_nested(SpaprMachineState *spapr, PowerPCCPU *cpu,
                            target_ulong lpid, ppc_v3_pate_t *entry);
+void spapr_register_nested_papr(void);
+void spapr_nested_init(SpaprMachineState *spapr);
 #endif /* HW_SPAPR_NESTED_H */
-- 
2.39.3




* [PATCH v2 05/14] spapr: nested: register nested-hv api hcalls only for cap-nested-hv
  2023-10-12 10:49 [PATCH v2 00/14] Nested PAPR API (KVM on PowerVM) Harsh Prateek Bora
                   ` (3 preceding siblings ...)
  2023-10-12 10:49 ` [PATCH v2 04/14] spapr: nested: Introduce cap-nested-papr for " Harsh Prateek Bora
@ 2023-10-12 10:49 ` Harsh Prateek Bora
  2023-11-29  4:03   ` Nicholas Piggin
  2023-10-12 10:49 ` [PATCH v2 06/14] spapr: nested: Introduce H_GUEST_[GET|SET]_CAPABILITIES hcalls Harsh Prateek Bora
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 26+ messages in thread
From: Harsh Prateek Bora @ 2023-10-12 10:49 UTC (permalink / raw)
  To: npiggin, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

Since cap-nested-hv and cap-nested-papr are mutually exclusive, it now
makes sense to register the API-specific hcalls only when the
respective capability is enabled, hence this change.

Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
---
 hw/ppc/spapr_caps.c  | 1 +
 hw/ppc/spapr_hcall.c | 2 --
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index 9b53f19ec8..ed3e638334 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -456,6 +456,7 @@ static void cap_nested_kvm_hv_apply(SpaprMachineState *spapr,
 
     if (!spapr->nested.api) {
         spapr->nested.api = NESTED_API_KVM_HV;
+        spapr_register_nested();
     } else {
         error_setg(errp, "Nested-HV APIs are mutually exclusive/incompatible");
         error_append_hint(errp, "Please use either cap-nested-hv or "
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 522a2396c7..8ae55087ec 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1635,8 +1635,6 @@ static void hypercall_register_types(void)
     spapr_register_hypercall(KVMPPC_H_CAS, h_client_architecture_support);
 
     spapr_register_hypercall(KVMPPC_H_UPDATE_DT, h_update_dt);
-
-    spapr_register_nested();
 }
 
 type_init(hypercall_register_types)
-- 
2.39.3




* [PATCH v2 06/14] spapr: nested: Introduce H_GUEST_[GET|SET]_CAPABILITIES hcalls.
  2023-10-12 10:49 [PATCH v2 00/14] Nested PAPR API (KVM on PowerVM) Harsh Prateek Bora
                   ` (4 preceding siblings ...)
  2023-10-12 10:49 ` [PATCH v2 05/14] spapr: nested: register nested-hv api hcalls only for cap-nested-hv Harsh Prateek Bora
@ 2023-10-12 10:49 ` Harsh Prateek Bora
  2023-10-12 10:49 ` [PATCH v2 07/14] spapr: nested: Introduce H_GUEST_[CREATE|DELETE] hcalls Harsh Prateek Bora
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Harsh Prateek Bora @ 2023-10-12 10:49 UTC (permalink / raw)
  To: npiggin, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

Introduce the nested PAPR hcalls:
 - H_GUEST_GET_CAPABILITIES, which is used to query the capabilities
   of the API and of the L2 guests it provides.
 - H_GUEST_SET_CAPABILITIES, which is used to set the Guest API
   capabilities that the Host Partition supports and may use.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
---
 hw/ppc/spapr_nested.c         | 70 +++++++++++++++++++++++++++++++++++
 include/hw/ppc/spapr.h        |  5 ++-
 include/hw/ppc/spapr_nested.h | 10 +++++
 3 files changed, 84 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr_nested.c b/hw/ppc/spapr_nested.c
index 87a0db22a5..c53b310f82 100644
--- a/hw/ppc/spapr_nested.c
+++ b/hw/ppc/spapr_nested.c
@@ -7,10 +7,12 @@
 #include "hw/ppc/spapr_cpu_core.h"
 #include "hw/ppc/spapr_nested.h"
 #include "mmu-book3s-v3.h"
+#include "cpu-models.h"
 
 void spapr_nested_init(SpaprMachineState *spapr)
 {
     spapr->nested.api = 0;
+    spapr->nested.capabilities_set = false;
 }
 
 bool spapr_get_pate_nested(SpaprMachineState *spapr, PowerPCCPU *cpu,
@@ -409,6 +411,72 @@ void spapr_exit_nested(PowerPCCPU *cpu, int excp)
     address_space_unmap(CPU(cpu)->as, regs, len, len, true);
 }
 
+static target_ulong h_guest_get_capabilities(PowerPCCPU *cpu,
+                                             SpaprMachineState *spapr,
+                                             target_ulong opcode,
+                                             target_ulong *args)
+{
+    CPUPPCState *env = &cpu->env;
+    target_ulong flags = args[0];
+
+    if (flags) { /* don't handle any flags for now */
+        return H_PARAMETER;
+    }
+
+    if ((env->spr[SPR_PVR] & CPU_POWERPC_POWER_SERVER_MASK) ==
+        (CPU_POWERPC_POWER9_BASE)) {
+        env->gpr[4] = H_GUEST_CAPABILITIES_P9_MODE;
+    }
+
+    if ((env->spr[SPR_PVR] & CPU_POWERPC_POWER_SERVER_MASK) ==
+        (CPU_POWERPC_POWER10_BASE)) {
+        env->gpr[4] = H_GUEST_CAPABILITIES_P10_MODE;
+    }
+
+    return H_SUCCESS;
+}
+
+static target_ulong h_guest_set_capabilities(PowerPCCPU *cpu,
+                                             SpaprMachineState *spapr,
+                                             target_ulong opcode,
+                                             target_ulong *args)
+{
+    CPUPPCState *env = &cpu->env;
+    target_ulong flags = args[0];
+    target_ulong capabilities = args[1];
+
+    if (flags) { /* don't handle any flags for now */
+        return H_PARAMETER;
+    }
+
+    if (capabilities & H_GUEST_CAPABILITIES_COPY_MEM) {
+        env->gpr[4] = 0;
+        return H_P2; /* isn't supported */
+    }
+
+    if ((env->spr[SPR_PVR] & CPU_POWERPC_POWER_SERVER_MASK) ==
+        (CPU_POWERPC_POWER9_BASE)) {
+        if (!(capabilities & H_GUEST_CAPABILITIES_P9_MODE)) {
+            env->gpr[4] = 1;
+            return H_P2;
+        }
+    }
+
+    if ((env->spr[SPR_PVR] & CPU_POWERPC_POWER_SERVER_MASK) ==
+        (CPU_POWERPC_POWER10_BASE)) {
+        if (!(capabilities & H_GUEST_CAPABILITIES_P10_MODE)) {
+            env->gpr[4] = 2;
+            return H_P2;
+        }
+    }
+
+    if (!spapr->nested.capabilities_set) {
+        spapr->nested.capabilities_set = true;
+        spapr->nested.pvr_base = env->spr[SPR_PVR];
+        return H_SUCCESS;
+    } else {
+        return H_STATE;
+    }
+}
+
 void spapr_register_nested(void)
 {
     spapr_register_hypercall(KVMPPC_H_SET_PARTITION_TABLE, h_set_ptbl);
@@ -420,6 +488,8 @@ void spapr_register_nested(void)
 void spapr_register_nested_papr(void)
 {
     /* register hcalls here */
+    spapr_register_hypercall(H_GUEST_GET_CAPABILITIES,
+                             h_guest_get_capabilities);
+    spapr_register_hypercall(H_GUEST_SET_CAPABILITIES,
+                             h_guest_set_capabilities);
 }
 #else
 void spapr_exit_nested(PowerPCCPU *cpu, int excp)
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index e33ee87ba4..c83dd76a84 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -366,6 +366,7 @@ struct SpaprMachineState {
 #define H_NOOP            -63
 #define H_UNSUPPORTED     -67
 #define H_OVERLAP         -68
+#define H_STATE           -75
 #define H_UNSUPPORTED_FLAG -256
 #define H_MULTI_THREADS_ACTIVE -9005
 
@@ -585,8 +586,10 @@ struct SpaprMachineState {
 #define H_RPT_INVALIDATE        0x448
 #define H_SCM_FLUSH             0x44C
 #define H_WATCHDOG              0x45C
+#define H_GUEST_GET_CAPABILITIES 0x460
+#define H_GUEST_SET_CAPABILITIES 0x464
 
-#define MAX_HCALL_OPCODE        H_WATCHDOG
+#define MAX_HCALL_OPCODE         H_GUEST_SET_CAPABILITIES
 
 /* The hcalls above are standardized in PAPR and implemented by pHyp
  * as well.
diff --git a/include/hw/ppc/spapr_nested.h b/include/hw/ppc/spapr_nested.h
index efdfc78200..b7586d06a4 100644
--- a/include/hw/ppc/spapr_nested.h
+++ b/include/hw/ppc/spapr_nested.h
@@ -9,8 +9,18 @@ typedef struct SpaprMachineStateNested {
     uint8_t api;
 #define NESTED_API_KVM_HV  1
 #define NESTED_API_PAPR    2
+    bool capabilities_set;
+    uint32_t pvr_base;
 } SpaprMachineStateNested;
 
+/* Nested PAPR API related macros */
+#define H_GUEST_CAPABILITIES_COPY_MEM 0x8000000000000000
+#define H_GUEST_CAPABILITIES_P9_MODE  0x4000000000000000
+#define H_GUEST_CAPABILITIES_P10_MODE 0x2000000000000000
+#define H_GUEST_CAP_COPY_MEM_BMAP     0
+#define H_GUEST_CAP_P9_MODE_BMAP      1
+#define H_GUEST_CAP_P10_MODE_BMAP     2
+
 /*
  * Register state for entering a nested guest with H_ENTER_NESTED.
  * New member must be added at the end.
-- 
2.39.3



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 07/14] spapr: nested: Introduce H_GUEST_[CREATE|DELETE] hcalls.
  2023-10-12 10:49 [PATCH v2 00/14] Nested PAPR API (KVM on PowerVM) Harsh Prateek Bora
                   ` (5 preceding siblings ...)
  2023-10-12 10:49 ` [PATCH v2 06/14] spapr: nested: Introduce H_GUEST_[GET|SET]_CAPABILITIES hcalls Harsh Prateek Bora
@ 2023-10-12 10:49 ` Harsh Prateek Bora
  2023-10-12 10:49 ` [PATCH v2 08/14] spapr: nested: Introduce H_GUEST_CREATE_VCPU hcall Harsh Prateek Bora
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Harsh Prateek Bora @ 2023-10-12 10:49 UTC (permalink / raw)
  To: npiggin, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

Introduce the nested PAPR hcalls:
    - H_GUEST_CREATE which is used to create and allocate resources for
      the nested guest being created.
    - H_GUEST_DELETE which is used to delete and deallocate resources
      for the nested guest being deleted. It also supports deleting all
      nested guests at once using a deleteAll flag (see the sketch after
      this list).

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
---
 hw/ppc/spapr_nested.c         | 101 ++++++++++++++++++++++++++++++++++
 include/hw/ppc/spapr.h        |   4 +-
 include/hw/ppc/spapr_nested.h |   7 +++
 3 files changed, 111 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr_nested.c b/hw/ppc/spapr_nested.c
index c53b310f82..e95e70e20c 100644
--- a/hw/ppc/spapr_nested.c
+++ b/hw/ppc/spapr_nested.c
@@ -477,6 +477,105 @@ static target_ulong h_guest_set_capabilities(PowerPCCPU *cpu,
     }
 }
 
+static void
+destroy_guest_helper(gpointer value)
+{
+    struct SpaprMachineStateNestedGuest *guest = value;
+    g_free(guest);
+}
+
+static target_ulong h_guest_create(PowerPCCPU *cpu,
+                                   SpaprMachineState *spapr,
+                                   target_ulong opcode,
+                                   target_ulong *args)
+{
+    CPUPPCState *env = &cpu->env;
+    target_ulong flags = args[0];
+    target_ulong continue_token = args[1];
+    uint64_t guestid;
+    int nguests = 0;
+    struct SpaprMachineStateNestedGuest *guest;
+
+    if (flags) { /* don't handle any flags for now */
+        return H_UNSUPPORTED_FLAG;
+    }
+
+    if (continue_token != -1) {
+        return H_P2;
+    }
+
+    if (!spapr->nested.capabilities_set) {
+        return H_STATE;
+    }
+
+    if (!spapr->nested.guests) {
+        spapr->nested.guests = g_hash_table_new_full(NULL,
+                                                     NULL,
+                                                     NULL,
+                                                     destroy_guest_helper);
+    }
+
+    nguests = g_hash_table_size(spapr->nested.guests);
+
+    if (nguests == PAPR_NESTED_GUEST_MAX) {
+        return H_NO_MEM;
+    }
+
+    /* Look up an available guestid */
+    for (guestid = 1; guestid < PAPR_NESTED_GUEST_MAX; guestid++) {
+        if (!(g_hash_table_lookup(spapr->nested.guests,
+                                  GINT_TO_POINTER(guestid)))) {
+            break;
+        }
+    }
+
+    if (guestid == PAPR_NESTED_GUEST_MAX) {
+        return H_NO_MEM;
+    }
+
+    guest = g_try_new0(struct SpaprMachineStateNestedGuest, 1);
+    if (!guest) {
+        return H_NO_MEM;
+    }
+
+    guest->pvr_logical = spapr->nested.pvr_base;
+    g_hash_table_insert(spapr->nested.guests, GINT_TO_POINTER(guestid), guest);
+    env->gpr[4] = guestid;
+
+    return H_SUCCESS;
+}
+
+static target_ulong h_guest_delete(PowerPCCPU *cpu,
+                                   SpaprMachineState *spapr,
+                                   target_ulong opcode,
+                                   target_ulong *args)
+{
+    target_ulong flags = args[0];
+    target_ulong guestid = args[1];
+    struct SpaprMachineStateNestedGuest *guest;
+
+    /*
+     * Handle the deleteAllGuests flag: if set, guestid is ignored and
+     * all guests are deleted.
+     */
+    if (flags & ~H_GUEST_DELETE_ALL_FLAG) {
+        return H_UNSUPPORTED_FLAG; /* other flag bits reserved */
+    } else if (flags & H_GUEST_DELETE_ALL_FLAG) {
+        g_hash_table_destroy(spapr->nested.guests);
+        spapr->nested.guests = NULL; /* don't leave a dangling pointer */
+        return H_SUCCESS;
+    }
+
+    guest = g_hash_table_lookup(spapr->nested.guests, GINT_TO_POINTER(guestid));
+    if (!guest) {
+        return H_P2;
+    }
+
+    g_hash_table_remove(spapr->nested.guests, GINT_TO_POINTER(guestid));
+
+    return H_SUCCESS;
+}
+
 void spapr_register_nested(void)
 {
     spapr_register_hypercall(KVMPPC_H_SET_PARTITION_TABLE, h_set_ptbl);
@@ -490,6 +589,8 @@ void spapr_register_nested_papr(void)
     /* register hcalls here */
     spapr_register_hypercall(H_GUEST_GET_CAPABILITIES, h_guest_get_capabilities);
     spapr_register_hypercall(H_GUEST_SET_CAPABILITIES, h_guest_set_capabilities);
+    spapr_register_hypercall(H_GUEST_CREATE, h_guest_create);
+    spapr_register_hypercall(H_GUEST_DELETE, h_guest_delete);
 }
 #else
 void spapr_exit_nested(PowerPCCPU *cpu, int excp)
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index c83dd76a84..fcc6f0a3d9 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -588,8 +588,10 @@ struct SpaprMachineState {
 #define H_WATCHDOG              0x45C
 #define H_GUEST_GET_CAPABILITIES 0x460
 #define H_GUEST_SET_CAPABILITIES 0x464
+#define H_GUEST_CREATE           0x470
+#define H_GUEST_DELETE           0x488
 
-#define MAX_HCALL_OPCODE         H_GUEST_SET_CAPABILITIES
+#define MAX_HCALL_OPCODE         H_GUEST_DELETE
 
 /* The hcalls above are standardized in PAPR and implemented by pHyp
  * as well.
diff --git a/include/hw/ppc/spapr_nested.h b/include/hw/ppc/spapr_nested.h
index b7586d06a4..f186005f00 100644
--- a/include/hw/ppc/spapr_nested.h
+++ b/include/hw/ppc/spapr_nested.h
@@ -11,8 +11,13 @@ typedef struct SpaprMachineStateNested {
 #define NESTED_API_PAPR    2
     bool capabilities_set;
     uint32_t pvr_base;
+    GHashTable *guests;
 } SpaprMachineStateNested;
 
+typedef struct SpaprMachineStateNestedGuest {
+    uint32_t pvr_logical;
+} SpaprMachineStateNestedGuest;
+
 /* Nested PAPR API related macros */
 #define H_GUEST_CAPABILITIES_COPY_MEM 0x8000000000000000
 #define H_GUEST_CAPABILITIES_P9_MODE  0x4000000000000000
@@ -20,6 +25,8 @@ typedef struct SpaprMachineStateNested {
 #define H_GUEST_CAP_COPY_MEM_BMAP     0
 #define H_GUEST_CAP_P9_MODE_BMAP      1
 #define H_GUEST_CAP_P10_MODE_BMAP     2
+#define PAPR_NESTED_GUEST_MAX         4096
+#define H_GUEST_DELETE_ALL_FLAG       0x8000000000000000ULL
 
 /*
  * Register state for entering a nested guest with H_ENTER_NESTED.
-- 
2.39.3



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 08/14] spapr: nested: Introduce H_GUEST_CREATE_VCPU hcall.
  2023-10-12 10:49 [PATCH v2 00/14] Nested PAPR API (KVM on PowerVM) Harsh Prateek Bora
                   ` (6 preceding siblings ...)
  2023-10-12 10:49 ` [PATCH v2 07/14] spapr: nested: Introduce H_GUEST_[CREATE|DELETE] hcalls Harsh Prateek Bora
@ 2023-10-12 10:49 ` Harsh Prateek Bora
  2023-10-12 10:49 ` [PATCH v2 09/14] spapr: nested: Initialize the GSB elements lookup table Harsh Prateek Bora
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Harsh Prateek Bora @ 2023-10-12 10:49 UTC (permalink / raw)
  To: npiggin, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

Introduce the nested PAPR hcall H_GUEST_CREATE_VCPU which is used to
create and initialize the specified VCPU resource for the previously
created guest. Each guest can have multiple VCPUs, up to a maximum of
2048. All VCPUs for a guest get deallocated when the guest is deleted.
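
For illustration, a minimal L1-side sketch, reusing the hypothetical
hv_call() wrapper from the H_GUEST_CREATE sketch (not an API from this
series). The implementation asserts linear vcpuid allocation, so ids
are created in increasing order starting from 0:

    #define H_GUEST_CREATE_VCPU  0x474

    static long l1_create_vcpus(unsigned long guest_id, unsigned long n)
    {
        unsigned long vcpuid;

        for (vcpuid = 0; vcpuid < n; vcpuid++) {
            /* args: flags (must be 0), guest id, vcpu id */
            long rc = hv_call(H_GUEST_CREATE_VCPU, NULL, 0UL, guest_id,
                              vcpuid);

            if (rc != H_SUCCESS) {
                /*
                 * H_P2: unknown guest id, H_IN_USE: vcpuid already
                 * exists, H_P3: too many vcpus, H_NO_MEM: out of memory
                 */
                return rc;
            }
        }
        return H_SUCCESS;
    }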

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
---
 hw/ppc/spapr_nested.c         | 100 ++++++++++++++++++++++++++++++++++
 include/hw/ppc/spapr.h        |   2 +
 include/hw/ppc/spapr_nested.h |   8 +++
 3 files changed, 110 insertions(+)

diff --git a/hw/ppc/spapr_nested.c b/hw/ppc/spapr_nested.c
index e95e70e20c..b38116b7c3 100644
--- a/hw/ppc/spapr_nested.c
+++ b/hw/ppc/spapr_nested.c
@@ -411,6 +411,41 @@ void spapr_exit_nested(PowerPCCPU *cpu, int excp)
     address_space_unmap(CPU(cpu)->as, regs, len, len, true);
 }
 
+static
+SpaprMachineStateNestedGuest *spapr_get_nested_guest(SpaprMachineState *spapr,
+                                                     target_ulong guestid)
+{
+    SpaprMachineStateNestedGuest *guest;
+
+    guest = g_hash_table_lookup(spapr->nested.guests, GINT_TO_POINTER(guestid));
+    return guest;
+}
+
+static bool spapr_nested_vcpu_check(SpaprMachineStateNestedGuest *guest,
+                                    target_ulong vcpuid)
+{
+    struct SpaprMachineStateNestedGuestVcpu *vcpu;
+    /*
+     * Perform sanity checks for the provided vcpuid of a guest.
+     * For now, ensure it's valid, allocated and enabled for use.
+     */
+
+    if (vcpuid >= PAPR_NESTED_GUEST_VCPU_MAX) {
+        return false;
+    }
+
+    if (vcpuid >= guest->vcpus) {
+        return false;
+    }
+
+    vcpu = &guest->vcpu[vcpuid];
+    if (!vcpu->enabled) {
+        return false;
+    }
+
+    return true;
+}
+
 static target_ulong h_guest_get_capabilities(PowerPCCPU *cpu,
                                              SpaprMachineState *spapr,
                                              target_ulong opcode,
@@ -481,6 +516,11 @@ static void
 destroy_guest_helper(gpointer value)
 {
     struct SpaprMachineStateNestedGuest *guest = value;
+
+    for (int i = 0; i < guest->vcpus; i++) {
+        cpu_ppc_tb_free(&guest->vcpu[i].env);
+    }
+    g_free(guest->vcpu);
     g_free(guest);
 }
 
@@ -576,6 +616,65 @@ static target_ulong h_guest_delete(PowerPCCPU *cpu,
     return H_SUCCESS;
 }
 
+static target_ulong h_guest_create_vcpu(PowerPCCPU *cpu,
+                                        SpaprMachineState *spapr,
+                                        target_ulong opcode,
+                                        target_ulong *args)
+{
+    CPUPPCState *env = &cpu->env, *l2env;
+    target_ulong flags = args[0];
+    target_ulong guestid = args[1];
+    target_ulong vcpuid = args[2];
+    SpaprMachineStateNestedGuest *guest;
+
+    if (flags) { /* don't handle any flags for now */
+        return H_UNSUPPORTED_FLAG;
+    }
+
+    guest = spapr_get_nested_guest(spapr, guestid);
+    if (!guest) {
+        return H_P2;
+    }
+
+    if (vcpuid < guest->vcpus) {
+        return H_IN_USE;
+    }
+
+    if (guest->vcpus >= PAPR_NESTED_GUEST_VCPU_MAX) {
+        return H_P3;
+    }
+
+    if (guest->vcpus) {
+        SpaprMachineStateNestedGuestVcpu *vcpus;
+        vcpus = g_try_renew(struct SpaprMachineStateNestedGuestVcpu,
+                            guest->vcpu,
+                            guest->vcpus + 1);
+        if (!vcpus) {
+            return H_NO_MEM;
+        }
+        memset(&vcpus[guest->vcpus], 0,
+               sizeof(SpaprMachineStateNestedGuestVcpu));
+        guest->vcpu = vcpus;
+    } else {
+        guest->vcpu = g_try_new0(SpaprMachineStateNestedGuestVcpu, 1);
+        if (guest->vcpu == NULL) {
+            return H_NO_MEM;
+        }
+    }
+    l2env = &guest->vcpu[guest->vcpus].env;
+    guest->vcpus++;
+    assert(vcpuid < guest->vcpus); /* linear vcpuid allocation only */
+    /* Copy L1 PVR to L2 */
+    l2env->spr[SPR_PVR] = env->spr[SPR_PVR];
+    cpu_ppc_tb_init(l2env, SPAPR_TIMEBASE_FREQ);
+    guest->vcpu[vcpuid].enabled = true;
+
+    if (!spapr_nested_vcpu_check(guest, vcpuid)) {
+        return H_PARAMETER;
+    }
+    return H_SUCCESS;
+}
+
 void spapr_register_nested(void)
 {
     spapr_register_hypercall(KVMPPC_H_SET_PARTITION_TABLE, h_set_ptbl);
@@ -591,6 +690,7 @@ void spapr_register_nested_papr(void)
     spapr_register_hypercall(H_GUEST_SET_CAPABILITIES, h_guest_set_capabilities);
     spapr_register_hypercall(H_GUEST_CREATE, h_guest_create);
     spapr_register_hypercall(H_GUEST_DELETE, h_guest_delete);
+    spapr_register_hypercall(H_GUEST_CREATE_VCPU, h_guest_create_vcpu);
 }
 #else
 void spapr_exit_nested(PowerPCCPU *cpu, int excp)
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index fcc6f0a3d9..25aa8c5ffe 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -367,6 +367,7 @@ struct SpaprMachineState {
 #define H_UNSUPPORTED     -67
 #define H_OVERLAP         -68
 #define H_STATE           -75
+#define H_IN_USE          -77
 #define H_UNSUPPORTED_FLAG -256
 #define H_MULTI_THREADS_ACTIVE -9005
 
@@ -589,6 +590,7 @@ struct SpaprMachineState {
 #define H_GUEST_GET_CAPABILITIES 0x460
 #define H_GUEST_SET_CAPABILITIES 0x464
 #define H_GUEST_CREATE           0x470
+#define H_GUEST_CREATE_VCPU      0x474
 #define H_GUEST_DELETE           0x488
 
 #define MAX_HCALL_OPCODE         H_GUEST_DELETE
diff --git a/include/hw/ppc/spapr_nested.h b/include/hw/ppc/spapr_nested.h
index f186005f00..91cb744a5b 100644
--- a/include/hw/ppc/spapr_nested.h
+++ b/include/hw/ppc/spapr_nested.h
@@ -16,8 +16,15 @@ typedef struct SpaprMachineStateNested {
 
 typedef struct SpaprMachineStateNestedGuest {
     uint32_t pvr_logical;
+    unsigned long vcpus;
+    struct SpaprMachineStateNestedGuestVcpu *vcpu;
 } SpaprMachineStateNestedGuest;
 
+typedef struct SpaprMachineStateNestedGuestVcpu {
+    bool enabled;
+    CPUPPCState env;
+} SpaprMachineStateNestedGuestVcpu;
+
 /* Nested PAPR API related macros */
 #define H_GUEST_CAPABILITIES_COPY_MEM 0x8000000000000000
 #define H_GUEST_CAPABILITIES_P9_MODE  0x4000000000000000
@@ -27,6 +34,7 @@ typedef struct SpaprMachineStateNestedGuest {
 #define H_GUEST_CAP_P10_MODE_BMAP     2
 #define PAPR_NESTED_GUEST_MAX         4096
 #define H_GUEST_DELETE_ALL_FLAG       0x8000000000000000ULL
+#define PAPR_NESTED_GUEST_VCPU_MAX    2048
 
 /*
  * Register state for entering a nested guest with H_ENTER_NESTED.
-- 
2.39.3



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 09/14] spapr: nested: Initialize the GSB elements lookup table.
  2023-10-12 10:49 [PATCH v2 00/14] Nested PAPR API (KVM on PowerVM) Harsh Prateek Bora
                   ` (7 preceding siblings ...)
  2023-10-12 10:49 ` [PATCH v2 08/14] spapr: nested: Introduce H_GUEST_CREATE_VCPU hcall Harsh Prateek Bora
@ 2023-10-12 10:49 ` Harsh Prateek Bora
  2023-10-12 10:49 ` [PATCH v2 10/14] spapr: nested: Introduce H_GUEST_[GET|SET]_STATE hcalls Harsh Prateek Bora
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Harsh Prateek Bora @ 2023-10-12 10:49 UTC (permalink / raw)
  To: npiggin, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

The nested PAPR API provides a standard Guest State Buffer (GSB) format
with a unique ID for each guest state element for which get/set state
is supported by the API. Some of the elements are read-only and/or
guest-wide. Introduce helper routines for the state exchange of each of
the nested guest state elements for which get/set state should be
supported by the API.
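
As an illustration of what the lookup table encodes, here is what one
entry, GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR0, gpr[0]), expands to;
it ties a GSB element ID to a location callback, a field offset and a
byte-swapping copy helper (all names as defined in this patch):

    {
        .id       = GSB_VCPU_GPR0,                 /* 0x1000 */
        .size     = 8,                             /* doubleword */
        .location = get_vcpu_env_ptr,              /* &guest->vcpu[i].env */
        .offset   = offsetof(CPUPPCState, gpr[0]),
        .copy     = copy_state_8to8,               /* BE buffer <-> host */
        .mask     = HVMASK_DEFAULT                 /* no reserved bits */
    }

At get/set time, location(guest, vcpuid) + offset yields the host
pointer and copy() moves data to or from the big-endian element buffer.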

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
---
 hw/ppc/spapr_caps.c           |   1 +
 hw/ppc/spapr_nested.c         | 500 +++++++++++++++++++++++++++++++++-
 include/hw/ppc/spapr_nested.h | 299 ++++++++++++++++++++
 target/ppc/cpu.h              |   2 +
 4 files changed, 799 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index ed3e638334..1d57d1b7c8 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -514,6 +514,7 @@ static void cap_nested_papr_apply(SpaprMachineState *spapr,
     if (!spapr->nested.api) {
         spapr->nested.api = NESTED_API_PAPR;
         spapr_register_nested_papr();
+        spapr_nested_gsb_init();
     } else {
         error_setg(errp, "Nested-HV APIs are mutually exclusive/incompatible");
         error_append_hint(errp, "Please use either cap-nested-hv or "
diff --git a/hw/ppc/spapr_nested.c b/hw/ppc/spapr_nested.c
index b38116b7c3..8f4152bd54 100644
--- a/hw/ppc/spapr_nested.c
+++ b/hw/ppc/spapr_nested.c
@@ -422,7 +422,7 @@ SpaprMachineStateNestedGuest *spapr_get_nested_guest(SpaprMachineState *spapr,
 }
 
 static bool spapr_nested_vcpu_check(SpaprMachineStateNestedGuest *guest,
-                                    target_ulong vcpuid)
+                                    target_ulong vcpuid, bool inoutbuf)
 {
     struct SpaprMachineStateNestedGuestVcpu *vcpu;
     /*
@@ -443,7 +443,495 @@ static bool spapr_nested_vcpu_check(SpaprMachineStateNestedGuest *guest,
         return false;
     }
 
-    return true;
+    if (!inoutbuf) {
+        return true;
+    }
+
+    /* Check to see if the in/out buffers are registered */
+    if (vcpu->runbufin.addr && vcpu->runbufout.addr) {
+        return true;
+    }
+
+    return false;
+}
+
+static void *get_vcpu_env_ptr(SpaprMachineStateNestedGuest *guest,
+                              target_ulong vcpuid)
+{
+    assert(spapr_nested_vcpu_check(guest, vcpuid, false));
+    return &guest->vcpu[vcpuid].env;
+}
+
+static void *get_vcpu_ptr(SpaprMachineStateNestedGuest *guest,
+                          target_ulong vcpuid)
+{
+    assert(spapr_nested_vcpu_check(guest, vcpuid, false));
+    return &guest->vcpu[vcpuid];
+}
+
+static void *get_guest_ptr(SpaprMachineStateNestedGuest *guest,
+                           target_ulong vcpuid)
+{
+    return guest; /* guest-wide elements: vcpuid is ignored */
+}
+
+/*
+ * set=1 means the L1 is trying to set some state
+ * set=0 means the L1 is trying to get some state
+ */
+static void copy_state_8to8(void *a, void *b, bool set)
+{
+    /* 'set' copies from the big-endian element_buf into internal state */
+
+    if (set) {
+        *(uint64_t *)a = be64_to_cpu(*(uint64_t *)b);
+    } else {
+        *(uint64_t *)b = cpu_to_be64(*(uint64_t *)a);
+    }
+}
+
+static void copy_state_16to16(void *a, void *b, bool set)
+{
+    uint64_t *src, *dst;
+
+    if (set) {
+        src = b;
+        dst = a;
+
+        dst[1] = be64_to_cpu(src[0]);
+        dst[0] = be64_to_cpu(src[1]);
+    } else {
+        src = a;
+        dst = b;
+
+        dst[1] = cpu_to_be64(src[0]);
+        dst[0] = cpu_to_be64(src[1]);
+    }
+}
+
+static void copy_state_4to8(void *a, void *b, bool set)
+{
+    if (set) {
+        *(uint64_t *)a = (uint64_t)be32_to_cpu(*(uint32_t *)b);
+    } else {
+        *(uint32_t *)b = cpu_to_be32((uint32_t) (*((uint64_t *)a)));
+    }
+}
+
+static void copy_state_pagetbl(void *a, void *b, bool set)
+{
+    uint64_t *pagetbl;
+    uint64_t *buf; /* 3 double words */
+    uint64_t rts;
+
+    assert(set);
+
+    pagetbl = a;
+    buf = b;
+
+    *pagetbl = be64_to_cpu(buf[0]);
+    /* as per ISA section 6.7.6.1 */
+    *pagetbl |= PATE0_HR; /* Host Radix bit is 1 */
+
+    /* RTS */
+    rts = be64_to_cpu(buf[1]);
+    assert(rts == 52);
+    rts = rts - 31; /* since radix tree size = 2^(RTS+31) */
+    *pagetbl |=  ((rts & 0x7) << 5); /* RTS2 is bit 56:58 */
+    *pagetbl |=  (((rts >> 3) & 0x3) << 61); /* RTS1 is bit 1:2 */
+
+    /* RPDS {Size = 2^(RPDS+3) , RPDS >=5} */
+    *pagetbl |= 63 - clz64(be64_to_cpu(buf[2])) - 3;
+}
+
+static void copy_state_proctbl(void *a, void *b, bool set)
+{
+    uint64_t *proctbl;
+    uint64_t *buf; /* 2 double words */
+
+    assert(set);
+
+    proctbl = a;
+    buf = b;
+    /* PRTB: Process Table Base */
+    *proctbl = be64_to_cpu(buf[0]);
+    /* PRTS: Process Table Size = 2^(12+PRTS) */
+    if (be64_to_cpu(buf[1]) == (1ULL << 12)) {
+        *proctbl |= 0;
+    } else if (be64_to_cpu(buf[1]) == (1ULL << 24)) {
+        *proctbl |= 12;
+    } else {
+        g_assert_not_reached();
+    }
+}
+
+static void copy_state_runbuf(void *a, void *b, bool set)
+{
+    uint64_t *buf; /* 2 double words */
+    struct SpaprMachineStateNestedGuestVcpuRunBuf *runbuf;
+
+    assert(set);
+
+    runbuf = a;
+    buf = b;
+
+    runbuf->addr = be64_to_cpu(buf[0]);
+    assert(runbuf->addr);
+
+    /* per spec */
+    assert(be64_to_cpu(buf[1]) <= 16384);
+
+    /*
+     * This will also hit in the input buffer but should be fine for
+     * now. If not we can split this function.
+     */
+    assert(be64_to_cpu(buf[1]) >= VCPU_OUT_BUF_MIN_SZ);
+
+    runbuf->size = be64_to_cpu(buf[1]);
+}
+
+/* tell the L1 how big we want the output vcpu run buffer */
+static void out_buf_min_size(void *a, void *b, bool set)
+{
+    uint64_t *buf; /* 1 double word */
+
+    assert(!set);
+
+    buf = b;
+
+    buf[0] = cpu_to_be64(VCPU_OUT_BUF_MIN_SZ);
+}
+
+static void copy_logical_pvr(void *a, void *b, bool set)
+{
+    uint32_t *buf; /* 1 word */
+    uint32_t *pvr_logical_ptr;
+    uint32_t pvr_logical;
+
+    pvr_logical_ptr = a;
+    buf = b;
+
+    if (!set) {
+        buf[0] = cpu_to_be32(*pvr_logical_ptr);
+        return;
+    }
+
+    pvr_logical = be32_to_cpu(buf[0]);
+    /* don't change the major version */
+    assert((pvr_logical & CPU_POWERPC_POWER_SERVER_MASK) ==
+           (*pvr_logical_ptr & CPU_POWERPC_POWER_SERVER_MASK));
+
+    *pvr_logical_ptr = pvr_logical;
+}
+
+static void copy_tb_offset(void *a, void *b, bool set)
+{
+    SpaprMachineStateNestedGuest *guest;
+    uint64_t *buf; /* 1 double word */
+    uint64_t *tb_offset_ptr;
+    uint64_t tb_offset;
+
+    tb_offset_ptr = a;
+    buf = b;
+
+    if (!set) {
+        buf[0] = cpu_to_be64(*tb_offset_ptr);
+        return;
+    }
+
+    tb_offset = be64_to_cpu(buf[0]);
+    /* need to copy this to the individual tb_offset for each vcpu */
+    guest = container_of(tb_offset_ptr,
+                         struct SpaprMachineStateNestedGuest,
+                         tb_offset);
+    for (int i = 0; i < guest->vcpus; i++) {
+        guest->vcpu[i].tb_offset = tb_offset;
+    }
+}
+
+static void copy_state_dec_expire_tb(void *a, void *b, bool set)
+{
+    int64_t *dec_expiry_tb;
+    uint64_t *buf; /* 1 double word */
+
+    dec_expiry_tb = a;
+    buf = b;
+
+    if (!set) {
+        buf[0] = cpu_to_be64(*dec_expiry_tb);
+        return;
+    }
+
+    *dec_expiry_tb = be64_to_cpu(buf[0]);
+}
+
+static void copy_state_hdecr(void *a, void *b, bool set)
+{
+    uint64_t *buf; /* 1 double word */
+    uint64_t *hdecr_expiry_tb;
+
+    hdecr_expiry_tb = a;
+    buf = b;
+
+    if (!set) {
+        buf[0] = cpu_to_be64(*hdecr_expiry_tb);
+        return;
+    }
+
+    *hdecr_expiry_tb = be64_to_cpu(buf[0]);
+}
+
+static void copy_state_vscr(void *a, void *b, bool set)
+{
+    uint32_t *buf; /* 1 word */
+    CPUPPCState *env;
+
+    env = a;
+    buf = b;
+
+    if (!set) {
+        buf[0] = cpu_to_be32(ppc_get_vscr(env));
+        return;
+    }
+
+    ppc_store_vscr(env, be32_to_cpu(buf[0]));
+}
+
+static void copy_state_fpscr(void *a, void *b, bool set)
+{
+    uint64_t *buf; /* 1 double word */
+    CPUPPCState *env;
+
+    env = a;
+    buf = b;
+
+    if (!set) {
+        buf[0] = cpu_to_be64(env->fpscr);
+        return;
+    }
+
+    ppc_store_fpscr(env, be64_to_cpu(buf[0]));
+}
+
+static void copy_state_cr(void *a, void *b, bool set)
+{
+    uint32_t *buf; /* 1 word */
+    CPUPPCState *env;
+    uint64_t cr; /* API v1 uses uint64_t but PAPR ACR v2 mentions 4 bytes */
+
+    env = a;
+    buf = b;
+
+    if (!set) {
+        buf[0] = cpu_to_be32((uint32_t)ppc_get_cr(env));
+        return;
+    }
+    cr = be32_to_cpu(buf[0]);
+    ppc_set_cr(env, cr);
+}
+
+struct guest_state_element_type guest_state_element_types[] = {
+    GUEST_STATE_ELEMENT_NOP(GSB_HV_VCPU_IGNORED_ID, 0),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR0,  gpr[0]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR1,  gpr[1]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR2,  gpr[2]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR3,  gpr[3]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR4,  gpr[4]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR5,  gpr[5]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR6,  gpr[6]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR7,  gpr[7]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR8,  gpr[8]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR9,  gpr[9]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR10, gpr[10]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR11, gpr[11]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR12, gpr[12]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR13, gpr[13]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR14, gpr[14]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR15, gpr[15]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR16, gpr[16]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR17, gpr[17]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR18, gpr[18]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR19, gpr[19]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR20, gpr[20]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR21, gpr[21]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR22, gpr[22]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR23, gpr[23]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR24, gpr[24]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR25, gpr[25]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR26, gpr[26]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR27, gpr[27]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR28, gpr[28]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR29, gpr[29]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR30, gpr[30]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_GPR31, gpr[31]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_NIA, nip),
+    GSE_ENV_DWM(GSB_VCPU_SPR_MSR, msr, HVMASK_MSR),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_CTR, ctr),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_LR, lr),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_XER, xer),
+    GUEST_STATE_ELEMENT_ENV_BASE(GSB_VCPU_SPR_CR, 4, copy_state_cr),
+    GUEST_STATE_ELEMENT_NOP_DW(GSB_VCPU_SPR_MMCR3),
+    GUEST_STATE_ELEMENT_NOP_DW(GSB_VCPU_SPR_SIER2),
+    GUEST_STATE_ELEMENT_NOP_DW(GSB_VCPU_SPR_SIER3),
+    GUEST_STATE_ELEMENT_NOP_W(GSB_VCPU_SPR_WORT),
+    GSE_ENV_DWM(GSB_VCPU_SPR_LPCR, spr[SPR_LPCR], HVMASK_LPCR),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_AMOR, spr[SPR_AMOR]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_HFSCR, spr[SPR_HFSCR]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_DAWR0, spr[SPR_DAWR0]),
+    GUEST_STATE_ELEMENT_ENV_W(GSB_VCPU_SPR_DAWRX0, spr[SPR_DAWRX0]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_CIABR, spr[SPR_CIABR]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_PURR,  spr[SPR_PURR]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_SPURR, spr[SPR_SPURR]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_IC,    spr[SPR_IC]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_VTB,   spr[SPR_VTB]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_HDAR,  spr[SPR_HDAR]),
+    GUEST_STATE_ELEMENT_ENV_W(GSB_VCPU_SPR_HDSISR, spr[SPR_HDSISR]),
+    GUEST_STATE_ELEMENT_ENV_W(GSB_VCPU_SPR_HEIR,   spr[SPR_HEIR]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_ASDR,  spr[SPR_ASDR]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_SRR0, spr[SPR_SRR0]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_SRR1, spr[SPR_SRR1]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_SPRG0, spr[SPR_SPRG0]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_SPRG1, spr[SPR_SPRG1]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_SPRG2, spr[SPR_SPRG2]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_SPRG3, spr[SPR_SPRG3]),
+    GUEST_STATE_ELEMENT_ENV_W(GSB_VCPU_SPR_PIDR,   spr[SPR_BOOKS_PID]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_CFAR, cfar),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_PPR, spr[SPR_PPR]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_DAWR1, spr[SPR_DAWR1]),
+    GUEST_STATE_ELEMENT_ENV_W(GSB_VCPU_SPR_DAWRX1, spr[SPR_DAWRX1]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_DEXCR, spr[SPR_DEXCR]),
+    GSE_ENV_DWM(GSB_VCPU_SPR_HDEXCR, spr[SPR_HDEXCR], HVMASK_HDEXCR),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_HASHKEYR,  spr[SPR_HASHKEYR]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_HASHPKEYR, spr[SPR_HASHPKEYR]),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR0, 16, vsr[0], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR1, 16, vsr[1], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR2, 16, vsr[2], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR3, 16, vsr[3], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR4, 16, vsr[4], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR5, 16, vsr[5], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR6, 16, vsr[6], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR7, 16, vsr[7], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR8, 16, vsr[8], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR9, 16, vsr[9], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR10, 16, vsr[10], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR11, 16, vsr[11], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR12, 16, vsr[12], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR13, 16, vsr[13], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR14, 16, vsr[14], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR15, 16, vsr[15], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR16, 16, vsr[16], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR17, 16, vsr[17], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR18, 16, vsr[18], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR19, 16, vsr[19], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR20, 16, vsr[20], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR21, 16, vsr[21], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR22, 16, vsr[22], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR23, 16, vsr[23], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR24, 16, vsr[24], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR25, 16, vsr[25], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR26, 16, vsr[26], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR27, 16, vsr[27], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR28, 16, vsr[28], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR29, 16, vsr[29], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR30, 16, vsr[30], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR31, 16, vsr[31], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR32, 16, vsr[32], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR33, 16, vsr[33], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR34, 16, vsr[34], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR35, 16, vsr[35], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR36, 16, vsr[36], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR37, 16, vsr[37], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR38, 16, vsr[38], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR39, 16, vsr[39], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR40, 16, vsr[40], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR41, 16, vsr[41], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR42, 16, vsr[42], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR43, 16, vsr[43], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR44, 16, vsr[44], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR45, 16, vsr[45], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR46, 16, vsr[46], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR47, 16, vsr[47], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR48, 16, vsr[48], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR49, 16, vsr[49], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR50, 16, vsr[50], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR51, 16, vsr[51], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR52, 16, vsr[52], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR53, 16, vsr[53], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR54, 16, vsr[54], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR55, 16, vsr[55], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR56, 16, vsr[56], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR57, 16, vsr[57], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR58, 16, vsr[58], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR59, 16, vsr[59], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR60, 16, vsr[60], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR61, 16, vsr[61], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR62, 16, vsr[62], copy_state_16to16),
+    GUEST_STATE_ELEMENT_ENV(GSB_VCPU_SPR_VSR63, 16, vsr[63], copy_state_16to16),
+    GSBE_NESTED(GSB_PART_SCOPED_PAGETBL, 0x18, parttbl[0],  copy_state_pagetbl),
+    GSBE_NESTED(GSB_PROCESS_TBL,         0x10, parttbl[1],  copy_state_proctbl),
+    GSBE_NESTED(GSB_VCPU_LPVR,           0x4,  pvr_logical, copy_logical_pvr),
+    GSBE_NESTED_MSK(GSB_TB_OFFSET, 0x8, tb_offset, copy_tb_offset,
+                    HVMASK_TB_OFFSET),
+    GSBE_NESTED_VCPU(GSB_VCPU_IN_BUFFER, 0x10, runbufin,    copy_state_runbuf),
+    GSBE_NESTED_VCPU(GSB_VCPU_OUT_BUFFER, 0x10, runbufout,   copy_state_runbuf),
+    GSBE_NESTED_VCPU(GSB_VCPU_OUT_BUF_MIN_SZ, 0x8, runbufout, out_buf_min_size),
+    GSBE_NESTED_VCPU(GSB_VCPU_DEC_EXPIRE_TB, 0x8, dec_expiry_tb,
+                     copy_state_dec_expire_tb),
+    GSBE_NESTED_VCPU(GSB_VCPU_HDEC_EXPIRY_TB, 0x8, hdecr_expiry_tb,
+                     copy_state_hdecr),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_EBBHR, spr[SPR_EBBHR]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_TAR,   spr[SPR_TAR]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_EBBRR, spr[SPR_EBBRR]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_BESCR, spr[SPR_BESCR]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_IAMR , spr[SPR_IAMR]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_AMR  , spr[SPR_AMR]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_UAMOR, spr[SPR_UAMOR]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_DSCR , spr[SPR_DSCR]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_FSCR , spr[SPR_FSCR]),
+    GUEST_STATE_ELEMENT_ENV_W(GSB_VCPU_SPR_PSPB , spr[SPR_PSPB]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_CTRL , spr[SPR_CTRL]),
+    GUEST_STATE_ELEMENT_ENV_W(GSB_VCPU_SPR_VRSAVE, spr[SPR_VRSAVE]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_DAR , spr[SPR_DAR]),
+    GUEST_STATE_ELEMENT_ENV_W(GSB_VCPU_SPR_DSISR , spr[SPR_DSISR]),
+    GUEST_STATE_ELEMENT_ENV_W(GSB_VCPU_SPR_PMC1, spr[SPR_POWER_PMC1]),
+    GUEST_STATE_ELEMENT_ENV_W(GSB_VCPU_SPR_PMC2, spr[SPR_POWER_PMC2]),
+    GUEST_STATE_ELEMENT_ENV_W(GSB_VCPU_SPR_PMC3, spr[SPR_POWER_PMC3]),
+    GUEST_STATE_ELEMENT_ENV_W(GSB_VCPU_SPR_PMC4, spr[SPR_POWER_PMC4]),
+    GUEST_STATE_ELEMENT_ENV_W(GSB_VCPU_SPR_PMC5, spr[SPR_POWER_PMC5]),
+    GUEST_STATE_ELEMENT_ENV_W(GSB_VCPU_SPR_PMC6, spr[SPR_POWER_PMC6]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_MMCR0, spr[SPR_POWER_MMCR0]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_MMCR1, spr[SPR_POWER_MMCR1]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_MMCR2, spr[SPR_POWER_MMCR2]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_MMCRA, spr[SPR_POWER_MMCRA]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_SDAR , spr[SPR_POWER_SDAR]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_SIAR , spr[SPR_POWER_SIAR]),
+    GUEST_STATE_ELEMENT_ENV_DW(GSB_VCPU_SPR_SIER , spr[SPR_POWER_SIER]),
+    GUEST_STATE_ELEMENT_ENV_BASE(GSB_VCPU_SPR_VSCR,  4, copy_state_vscr),
+    GUEST_STATE_ELEMENT_ENV_BASE(GSB_VCPU_SPR_FPSCR, 8, copy_state_fpscr)
+};
+
+void spapr_nested_gsb_init(void)
+{
+    struct guest_state_element_type *type;
+
+    /* Init the guest state elements lookup table, flags for now */
+    for (int i = 0; i < ARRAY_SIZE(guest_state_element_types); i++) {
+        type = &guest_state_element_types[i];
+
+        assert(type->id <= GSB_LAST);
+        if (type->id >= GSB_VCPU_SPR_HDAR) {
+            /* 0xf000 - 0xf005 Thread + RO */
+            type->flags = GUEST_STATE_ELEMENT_TYPE_FLAG_READ_ONLY;
+        } else if (type->id >= GSB_VCPU_IN_BUFFER) {
+            /* 0x0c00 - 0xf000 Thread + RW */
+            type->flags = 0;
+        } else if (type->id >= GSB_VCPU_LPVR) {
+            /* 0x0003 - 0x0bff Guest + RW */
+            type->flags = GUEST_STATE_ELEMENT_TYPE_FLAG_GUEST_WIDE;
+        } else if (type->id >= GSB_HV_VCPU_STATE_SIZE) {
+            /* 0x0001 - 0x0002 Guest + RO */
+            type->flags = GUEST_STATE_ELEMENT_TYPE_FLAG_READ_ONLY |
+                          GUEST_STATE_ELEMENT_TYPE_FLAG_GUEST_WIDE;
+        }
+    }
 }
 
 static target_ulong h_guest_get_capabilities(PowerPCCPU *cpu,
@@ -669,7 +1157,7 @@ static target_ulong h_guest_create_vcpu(PowerPCCPU *cpu,
     cpu_ppc_tb_init(l2env, SPAPR_TIMEBASE_FREQ);
     guest->vcpu[vcpuid].enabled = true;
 
-    if (!spapr_nested_vcpu_check(guest, vcpuid)) {
+    if (!spapr_nested_vcpu_check(guest, vcpuid, false)) {
         return H_PARAMETER;
     }
     return H_SUCCESS;
@@ -707,4 +1195,10 @@ void spapr_register_nested_papr(void)
 {
     /* DO NOTHING */
 }
+
+void spapr_nested_gsb_init(void)
+{
+    /* DO NOTHING */
+}
+
 #endif
diff --git a/include/hw/ppc/spapr_nested.h b/include/hw/ppc/spapr_nested.h
index 91cb744a5b..9b647c0389 100644
--- a/include/hw/ppc/spapr_nested.h
+++ b/include/hw/ppc/spapr_nested.h
@@ -4,6 +4,191 @@
 #include "qemu/osdep.h"
 #include "target/ppc/cpu.h"
 
+/* Guest State Buffer Element IDs */
+#define GSB_HV_VCPU_IGNORED_ID  0x0000 /* An element whose value is ignored */
+#define GSB_HV_VCPU_STATE_SIZE  0x0001 /* HV internal format VCPU state size */
+#define GSB_VCPU_OUT_BUF_MIN_SZ 0x0002 /* Min size of the Run VCPU o/p buffer */
+#define GSB_VCPU_LPVR           0x0003 /* Logical PVR */
+#define GSB_TB_OFFSET           0x0004 /* Timebase Offset */
+#define GSB_PART_SCOPED_PAGETBL 0x0005 /* Partition Scoped Page Table */
+#define GSB_PROCESS_TBL         0x0006 /* Process Table */
+                    /* RESERVED 0x0007 - 0x0BFF */
+#define GSB_VCPU_IN_BUFFER      0x0C00 /* Run VCPU Input Buffer */
+#define GSB_VCPU_OUT_BUFFER     0x0C01 /* Run VCPU Out Buffer */
+#define GSB_VCPU_VPA            0x0C02 /* HRA to Guest VCPU VPA */
+                    /* RESERVED 0x0C03 - 0x0FFF */
+#define GSB_VCPU_GPR0           0x1000
+#define GSB_VCPU_GPR1           0x1001
+#define GSB_VCPU_GPR2           0x1002
+#define GSB_VCPU_GPR3           0x1003
+#define GSB_VCPU_GPR4           0x1004
+#define GSB_VCPU_GPR5           0x1005
+#define GSB_VCPU_GPR6           0x1006
+#define GSB_VCPU_GPR7           0x1007
+#define GSB_VCPU_GPR8           0x1008
+#define GSB_VCPU_GPR9           0x1009
+#define GSB_VCPU_GPR10          0x100A
+#define GSB_VCPU_GPR11          0x100B
+#define GSB_VCPU_GPR12          0x100C
+#define GSB_VCPU_GPR13          0x100D
+#define GSB_VCPU_GPR14          0x100E
+#define GSB_VCPU_GPR15          0x100F
+#define GSB_VCPU_GPR16          0x1010
+#define GSB_VCPU_GPR17          0x1011
+#define GSB_VCPU_GPR18          0x1012
+#define GSB_VCPU_GPR19          0x1013
+#define GSB_VCPU_GPR20          0x1014
+#define GSB_VCPU_GPR21          0x1015
+#define GSB_VCPU_GPR22          0x1016
+#define GSB_VCPU_GPR23          0x1017
+#define GSB_VCPU_GPR24          0x1018
+#define GSB_VCPU_GPR25          0x1019
+#define GSB_VCPU_GPR26          0x101A
+#define GSB_VCPU_GPR27          0x101B
+#define GSB_VCPU_GPR28          0x101C
+#define GSB_VCPU_GPR29          0x101D
+#define GSB_VCPU_GPR30          0x101E
+#define GSB_VCPU_GPR31          0x101F
+#define GSB_VCPU_HDEC_EXPIRY_TB 0x1020
+#define GSB_VCPU_SPR_NIA        0x1021
+#define GSB_VCPU_SPR_MSR        0x1022
+#define GSB_VCPU_SPR_LR         0x1023
+#define GSB_VCPU_SPR_XER        0x1024
+#define GSB_VCPU_SPR_CTR        0x1025
+#define GSB_VCPU_SPR_CFAR       0x1026
+#define GSB_VCPU_SPR_SRR0       0x1027
+#define GSB_VCPU_SPR_SRR1       0x1028
+#define GSB_VCPU_SPR_DAR        0x1029
+#define GSB_VCPU_DEC_EXPIRE_TB  0x102A
+#define GSB_VCPU_SPR_VTB        0x102B
+#define GSB_VCPU_SPR_LPCR       0x102C
+#define GSB_VCPU_SPR_HFSCR      0x102D
+#define GSB_VCPU_SPR_FSCR       0x102E
+#define GSB_VCPU_SPR_FPSCR      0x102F
+#define GSB_VCPU_SPR_DAWR0      0x1030
+#define GSB_VCPU_SPR_DAWR1      0x1031
+#define GSB_VCPU_SPR_CIABR      0x1032
+#define GSB_VCPU_SPR_PURR       0x1033
+#define GSB_VCPU_SPR_SPURR      0x1034
+#define GSB_VCPU_SPR_IC         0x1035
+#define GSB_VCPU_SPR_SPRG0      0x1036
+#define GSB_VCPU_SPR_SPRG1      0x1037
+#define GSB_VCPU_SPR_SPRG2      0x1038
+#define GSB_VCPU_SPR_SPRG3      0x1039
+#define GSB_VCPU_SPR_PPR        0x103A
+#define GSB_VCPU_SPR_MMCR0      0x103B
+#define GSB_VCPU_SPR_MMCR1      0x103C
+#define GSB_VCPU_SPR_MMCR2      0x103D
+#define GSB_VCPU_SPR_MMCR3      0x103E
+#define GSB_VCPU_SPR_MMCRA      0x103F
+#define GSB_VCPU_SPR_SIER       0x1040
+#define GSB_VCPU_SPR_SIER2      0x1041
+#define GSB_VCPU_SPR_SIER3      0x1042
+#define GSB_VCPU_SPR_BESCR      0x1043
+#define GSB_VCPU_SPR_EBBHR      0x1044
+#define GSB_VCPU_SPR_EBBRR      0x1045
+#define GSB_VCPU_SPR_AMR        0x1046
+#define GSB_VCPU_SPR_IAMR       0x1047
+#define GSB_VCPU_SPR_AMOR       0x1048
+#define GSB_VCPU_SPR_UAMOR      0x1049
+#define GSB_VCPU_SPR_SDAR       0x104A
+#define GSB_VCPU_SPR_SIAR       0x104B
+#define GSB_VCPU_SPR_DSCR       0x104C
+#define GSB_VCPU_SPR_TAR        0x104D
+#define GSB_VCPU_SPR_DEXCR      0x104E
+#define GSB_VCPU_SPR_HDEXCR     0x104F
+#define GSB_VCPU_SPR_HASHKEYR   0x1050
+#define GSB_VCPU_SPR_HASHPKEYR  0x1051
+#define GSB_VCPU_SPR_CTRL       0x1052
+                    /* RESERVED 0x1053 - 0x1FFF */
+#define GSB_VCPU_SPR_CR         0x2000
+#define GSB_VCPU_SPR_PIDR       0x2001
+#define GSB_VCPU_SPR_DSISR      0x2002
+#define GSB_VCPU_SPR_VSCR       0x2003
+#define GSB_VCPU_SPR_VRSAVE     0x2004
+#define GSB_VCPU_SPR_DAWRX0     0x2005
+#define GSB_VCPU_SPR_DAWRX1     0x2006
+#define GSB_VCPU_SPR_PMC1       0x2007
+#define GSB_VCPU_SPR_PMC2       0x2008
+#define GSB_VCPU_SPR_PMC3       0x2009
+#define GSB_VCPU_SPR_PMC4       0x200A
+#define GSB_VCPU_SPR_PMC5       0x200B
+#define GSB_VCPU_SPR_PMC6       0x200C
+#define GSB_VCPU_SPR_WORT       0x200D
+#define GSB_VCPU_SPR_PSPB       0x200E
+                    /* RESERVED 0x200F - 0x2FFF */
+#define GSB_VCPU_SPR_VSR0       0x3000
+#define GSB_VCPU_SPR_VSR1       0x3001
+#define GSB_VCPU_SPR_VSR2       0x3002
+#define GSB_VCPU_SPR_VSR3       0x3003
+#define GSB_VCPU_SPR_VSR4       0x3004
+#define GSB_VCPU_SPR_VSR5       0x3005
+#define GSB_VCPU_SPR_VSR6       0x3006
+#define GSB_VCPU_SPR_VSR7       0x3007
+#define GSB_VCPU_SPR_VSR8       0x3008
+#define GSB_VCPU_SPR_VSR9       0x3009
+#define GSB_VCPU_SPR_VSR10      0x300A
+#define GSB_VCPU_SPR_VSR11      0x300B
+#define GSB_VCPU_SPR_VSR12      0x300C
+#define GSB_VCPU_SPR_VSR13      0x300D
+#define GSB_VCPU_SPR_VSR14      0x300E
+#define GSB_VCPU_SPR_VSR15      0x300F
+#define GSB_VCPU_SPR_VSR16      0x3010
+#define GSB_VCPU_SPR_VSR17      0x3011
+#define GSB_VCPU_SPR_VSR18      0x3012
+#define GSB_VCPU_SPR_VSR19      0x3013
+#define GSB_VCPU_SPR_VSR20      0x3014
+#define GSB_VCPU_SPR_VSR21      0x3015
+#define GSB_VCPU_SPR_VSR22      0x3016
+#define GSB_VCPU_SPR_VSR23      0x3017
+#define GSB_VCPU_SPR_VSR24      0x3018
+#define GSB_VCPU_SPR_VSR25      0x3019
+#define GSB_VCPU_SPR_VSR26      0x301A
+#define GSB_VCPU_SPR_VSR27      0x301B
+#define GSB_VCPU_SPR_VSR28      0x301C
+#define GSB_VCPU_SPR_VSR29      0x301D
+#define GSB_VCPU_SPR_VSR30      0x301E
+#define GSB_VCPU_SPR_VSR31      0x301F
+#define GSB_VCPU_SPR_VSR32      0x3020
+#define GSB_VCPU_SPR_VSR33      0x3021
+#define GSB_VCPU_SPR_VSR34      0x3022
+#define GSB_VCPU_SPR_VSR35      0x3023
+#define GSB_VCPU_SPR_VSR36      0x3024
+#define GSB_VCPU_SPR_VSR37      0x3025
+#define GSB_VCPU_SPR_VSR38      0x3026
+#define GSB_VCPU_SPR_VSR39      0x3027
+#define GSB_VCPU_SPR_VSR40      0x3028
+#define GSB_VCPU_SPR_VSR41      0x3029
+#define GSB_VCPU_SPR_VSR42      0x302A
+#define GSB_VCPU_SPR_VSR43      0x302B
+#define GSB_VCPU_SPR_VSR44      0x302C
+#define GSB_VCPU_SPR_VSR45      0x302D
+#define GSB_VCPU_SPR_VSR46      0x302E
+#define GSB_VCPU_SPR_VSR47      0x302F
+#define GSB_VCPU_SPR_VSR48      0x3030
+#define GSB_VCPU_SPR_VSR49      0x3031
+#define GSB_VCPU_SPR_VSR50      0x3032
+#define GSB_VCPU_SPR_VSR51      0x3033
+#define GSB_VCPU_SPR_VSR52      0x3034
+#define GSB_VCPU_SPR_VSR53      0x3035
+#define GSB_VCPU_SPR_VSR54      0x3036
+#define GSB_VCPU_SPR_VSR55      0x3037
+#define GSB_VCPU_SPR_VSR56      0x3038
+#define GSB_VCPU_SPR_VSR57      0x3039
+#define GSB_VCPU_SPR_VSR58      0x303A
+#define GSB_VCPU_SPR_VSR59      0x303B
+#define GSB_VCPU_SPR_VSR60      0x303C
+#define GSB_VCPU_SPR_VSR61      0x303D
+#define GSB_VCPU_SPR_VSR62      0x303E
+#define GSB_VCPU_SPR_VSR63      0x303F
+                    /* RESERVED 0x3040 - 0xEFFF */
+#define GSB_VCPU_SPR_HDAR       0xF000
+#define GSB_VCPU_SPR_HDSISR     0xF001
+#define GSB_VCPU_SPR_HEIR       0xF002
+#define GSB_VCPU_SPR_ASDR       0xF003
+/* End of list of Guest State Buffer Element IDs */
+#define GSB_LAST                GSB_VCPU_SPR_ASDR
+
 typedef struct SpaprMachineStateNested {
     uint64_t ptcr;
     uint8_t api;
@@ -17,14 +202,38 @@ typedef struct SpaprMachineStateNested {
 typedef struct SpaprMachineStateNestedGuest {
     uint32_t pvr_logical;
     unsigned long vcpus;
+    uint64_t parttbl[2];
+    uint64_t tb_offset;
     struct SpaprMachineStateNestedGuestVcpu *vcpu;
 } SpaprMachineStateNestedGuest;
 
+struct SpaprMachineStateNestedGuestVcpuRunBuf {
+    uint64_t addr;
+    uint64_t size;
+};
+
 typedef struct SpaprMachineStateNestedGuestVcpu {
     bool enabled;
+    struct SpaprMachineStateNestedGuestVcpuRunBuf runbufin;
+    struct SpaprMachineStateNestedGuestVcpuRunBuf runbufout;
     CPUPPCState env;
+    int64_t tb_offset;
+    int64_t dec_expiry_tb;
+    uint64_t hdecr_expiry_tb;
 } SpaprMachineStateNestedGuestVcpu;
 
+struct guest_state_element_type {
+    uint16_t id;
+    int size;
+#define GUEST_STATE_ELEMENT_TYPE_FLAG_GUEST_WIDE 0x1
+#define GUEST_STATE_ELEMENT_TYPE_FLAG_READ_ONLY  0x2
+    uint16_t flags;
+    void *(*location)(SpaprMachineStateNestedGuest *, target_ulong);
+    size_t offset;
+    void (*copy)(void *, void *, bool);
+    uint64_t mask;
+};
+
 /* Nested PAPR API related macros */
 #define H_GUEST_CAPABILITIES_COPY_MEM 0x8000000000000000
 #define H_GUEST_CAPABILITIES_P9_MODE  0x4000000000000000
@@ -35,6 +244,95 @@ typedef struct SpaprMachineStateNestedGuestVcpu {
 #define PAPR_NESTED_GUEST_MAX         4096
 #define H_GUEST_DELETE_ALL_FLAG       0x8000000000000000ULL
 #define PAPR_NESTED_GUEST_VCPU_MAX    2048
+#define VCPU_OUT_BUF_MIN_SZ           0x80ULL
+#define HVMASK_DEFAULT                0xffffffffffffffff
+#define HVMASK_LPCR                   0x0070000003820800
+#define HVMASK_MSR                    0xEBFFFFFFFFBFEFFF
+#define HVMASK_HDEXCR                 0x00000000FFFFFFFF
+#define HVMASK_TB_OFFSET              0x000000FFFFFFFFFF
+
+#define GUEST_STATE_ELEMENT(i, sz, s, f, ptr, c) { \
+    .id = (i),                                     \
+    .size = (sz),                                  \
+    .location = ptr,                               \
+    .offset = offsetof(struct s, f),               \
+    .copy = (c),                                   \
+    .mask = HVMASK_DEFAULT                         \
+}
+
+#define GSBE_NESTED(i, sz, f, c) {                             \
+    .id = (i),                                                 \
+    .size = (sz),                                              \
+    .location = get_guest_ptr,                                 \
+    .offset = offsetof(struct SpaprMachineStateNestedGuest, f),\
+    .copy = (c),                                               \
+    .mask = HVMASK_DEFAULT                                     \
+}
+
+#define GSBE_NESTED_MSK(i, sz, f, c, m) {                      \
+    .id = (i),                                                 \
+    .size = (sz),                                              \
+    .location = get_guest_ptr,                                 \
+    .offset = offsetof(struct SpaprMachineStateNestedGuest, f),\
+    .copy = (c),                                               \
+    .mask = (m)                                                \
+}
+
+#define GSBE_NESTED_VCPU(i, sz, f, c) {                            \
+    .id = (i),                                                     \
+    .size = (sz),                                                  \
+    .location = get_vcpu_ptr,                                      \
+    .offset = offsetof(struct SpaprMachineStateNestedGuestVcpu, f),\
+    .copy = (c),                                                   \
+    .mask = HVMASK_DEFAULT                                         \
+}
+
+#define GUEST_STATE_ELEMENT_NOP(i, sz) { \
+    .id = (i),                             \
+    .size = (sz),                          \
+    .location = NULL,                      \
+    .offset = 0,                           \
+    .copy = NULL,                          \
+    .mask = HVMASK_DEFAULT                 \
+}
+
+#define GUEST_STATE_ELEMENT_NOP_DW(i)   \
+        GUEST_STATE_ELEMENT_NOP(i, 8)
+#define GUEST_STATE_ELEMENT_NOP_W(i) \
+        GUEST_STATE_ELEMENT_NOP(i, 4)
+
+#define GUEST_STATE_ELEMENT_ENV_BASE(i, s, c) {  \
+            .id = (i),                           \
+            .size = (s),                         \
+            .location = get_vcpu_env_ptr,        \
+            .offset = 0,                         \
+            .copy = (c),                         \
+            .mask = HVMASK_DEFAULT               \
+    }
+
+#define GUEST_STATE_ELEMENT_ENV(i, s, f, c) {    \
+            .id = (i),                           \
+            .size = (s),                         \
+            .location = get_vcpu_env_ptr,        \
+            .offset = offsetof(CPUPPCState, f),  \
+            .copy = (c),                         \
+            .mask = HVMASK_DEFAULT               \
+    }
+
+#define GUEST_STATE_ELEMENT_MSK(i, s, f, c, m) { \
+            .id = (i),                           \
+            .size = (s),                         \
+            .location = get_vcpu_env_ptr,        \
+            .offset = offsetof(CPUPPCState, f),  \
+            .copy = (c),                         \
+            .mask = (m)                          \
+    }
+
+#define GUEST_STATE_ELEMENT_ENV_DW(i, f) \
+    GUEST_STATE_ELEMENT_ENV(i, 8, f, copy_state_8to8)
+#define GUEST_STATE_ELEMENT_ENV_W(i, f) \
+    GUEST_STATE_ELEMENT_ENV(i, 4, f, copy_state_4to8)
+#define GSE_ENV_DWM(i, f, m) \
+    GUEST_STATE_ELEMENT_MSK(i, 8, f, copy_state_8to8, m)
 
 /*
  * Register state for entering a nested guest with H_ENTER_NESTED.
@@ -135,4 +433,5 @@ bool spapr_get_pate_nested(SpaprMachineState *spapr, PowerPCCPU *cpu,
                            target_ulong lpid, ppc_v3_pate_t *entry);
 void spapr_register_nested_papr(void);
 void spapr_nested_init(SpaprMachineState *spapr);
+void spapr_nested_gsb_init(void);
 #endif /* HW_SPAPR_NESTED_H */
diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 30392ebeee..e3f0bca573 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1596,9 +1596,11 @@ void ppc_compat_add_property(Object *obj, const char *name,
 #define SPR_PSPB              (0x09F)
 #define SPR_DPDES             (0x0B0)
 #define SPR_DAWR0             (0x0B4)
+#define SPR_DAWR1             (0x0B5)
 #define SPR_RPR               (0x0BA)
 #define SPR_CIABR             (0x0BB)
 #define SPR_DAWRX0            (0x0BC)
+#define SPR_DAWRX1            (0x0BD)
 #define SPR_HFSCR             (0x0BE)
 #define SPR_VRSAVE            (0x100)
 #define SPR_USPRG0            (0x100)
-- 
2.39.3



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 10/14] spapr: nested: Introduce H_GUEST_[GET|SET]_STATE hcalls.
  2023-10-12 10:49 [PATCH v2 00/14] Nested PAPR API (KVM on PowerVM) Harsh Prateek Bora
                   ` (8 preceding siblings ...)
  2023-10-12 10:49 ` [PATCH v2 09/14] spapr: nested: Initialize the GSB elements lookup table Harsh Prateek Bora
@ 2023-10-12 10:49 ` Harsh Prateek Bora
  2023-10-12 10:49 ` [PATCH v2 11/14] spapr: nested: Use correct source for parttbl info for nested PAPR API Harsh Prateek Bora
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Harsh Prateek Bora @ 2023-10-12 10:49 UTC (permalink / raw)
  To: npiggin, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

Introduce the nested PAPR hcalls:
    - H_GUEST_GET_STATE which is used to get the state of a nested guest
      or a guest VCPU. The value field for each element in the request
      is ignored and, on success, is updated to reflect the current
      state.
    - H_GUEST_SET_STATE which is used to modify the state of a guest or
      a guest VCPU. On success, guest (or its VCPU) state shall be
      updated as per the value field for the requested element(s). A
      worked buffer example follows the list.
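
For illustration, a one-element Guest State Buffer that an L1 could
pass to H_GUEST_SET_STATE to load GPR4 of a vcpu. All fields are
big-endian, matching the guest_state_element layout parsed by these
hcalls; the packed struct, the fixed 8-byte value field and the
htobe*() helpers are used here only for the sketch:

    #include <stdint.h>
    #include <endian.h> /* htobe16/htobe32/htobe64 */

    struct __attribute__((packed)) one_elem_gsb {
        uint32_t num_elements; /* count of elements that follow */
        uint16_t id;           /* guest state element ID */
        uint16_t size;         /* size of value[] only */
        uint64_t value;        /* element data, 'size' bytes here */
    };

    static void fill_gsb_set_gpr4(struct one_elem_gsb *gsb, uint64_t val)
    {
        gsb->num_elements = htobe32(1);
        gsb->id           = htobe16(0x1004); /* GSB_VCPU_GPR4 */
        gsb->size         = htobe16(8);      /* a doubleword register */
        gsb->value        = htobe64(val);
    }

The guest physical address and length of such a buffer are then passed
to H_GUEST_SET_STATE along with the guest and vcpu ids, as described in
the API documentation added earlier in this series.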

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
---
 hw/ppc/spapr_nested.c         | 270 ++++++++++++++++++++++++++++++++++
 include/hw/ppc/spapr.h        |   3 +
 include/hw/ppc/spapr_nested.h |  23 +++
 3 files changed, 296 insertions(+)

diff --git a/hw/ppc/spapr_nested.c b/hw/ppc/spapr_nested.c
index 8f4152bd54..f41b0c0944 100644
--- a/hw/ppc/spapr_nested.c
+++ b/hw/ppc/spapr_nested.c
@@ -8,6 +8,7 @@
 #include "hw/ppc/spapr_nested.h"
 #include "mmu-book3s-v3.h"
 #include "cpu-models.h"
+#include "qemu/log.h"
 
 void spapr_nested_init(SpaprMachineState *spapr)
 {
@@ -934,6 +935,140 @@ void spapr_nested_gsb_init(void)
     }
 }
 
+static struct guest_state_element *guest_state_element_next(
+    struct guest_state_element *element,
+    int64_t *len,
+    int64_t *num_elements)
+{
+    uint16_t size;
+
+    /* size is of element->value[] only. Not whole guest_state_element */
+    size = be16_to_cpu(element->size);
+
+    if (len) {
+        *len -= size + offsetof(struct guest_state_element, value);
+    }
+
+    if (num_elements) {
+        *num_elements -= 1;
+    }
+
+    return (struct guest_state_element *)(element->value + size);
+}
+
+static
+struct guest_state_element_type *guest_state_element_type_find(uint16_t id)
+{
+    int i;
+
+    for (i = 0; i < ARRAY_SIZE(guest_state_element_types); i++) {
+        if (id == guest_state_element_types[i].id) {
+            return &guest_state_element_types[i];
+        }
+    }
+
+    return NULL;
+}
+
+static void log_element(struct guest_state_element *element,
+                        struct guest_state_request *gsr)
+{
+    qemu_log_mask(LOG_GUEST_ERROR, "h_guest_%s_state id:0x%04x size:0x%04x",
+                  gsr->flags & GUEST_STATE_REQUEST_SET ? "set" : "get",
+                  be16_to_cpu(element->id), be16_to_cpu(element->size));
+    qemu_log_mask(LOG_GUEST_ERROR, "buf:0x%016lx ...\n",
+                  be64_to_cpu(*(uint64_t *)element->value));
+}
+
+static bool guest_state_request_check(struct guest_state_request *gsr)
+{
+    int64_t num_elements, len = gsr->len;
+    struct guest_state_buffer *gsb = gsr->gsb;
+    struct guest_state_element *element;
+    struct guest_state_element_type *type;
+    uint16_t id, size;
+
+    /* even an empty buffer must hold the 32-bit num_elements field */
+    assert(len >= 4);
+
+    num_elements = be32_to_cpu(gsb->num_elements);
+    element = gsb->elements;
+    len -= sizeof(gsb->num_elements);
+
+    /* Walk the buffer to validate the length */
+    while (num_elements) {
+
+        id = be16_to_cpu(element->id);
+        size = be16_to_cpu(element->size);
+
+        if (false) { /* flip to true to trace each element while debugging */
+            log_element(element, gsr);
+        }
+        /* buffer size too small */
+        if (len < 0) {
+            return false;
+        }
+
+        type = guest_state_element_type_find(id);
+        if (!type) {
+            qemu_log_mask(LOG_GUEST_ERROR,"Element ID %04x unknown\n", id);
+            log_element(element, gsr);
+            return false;
+        }
+
+        if (id == GSB_HV_VCPU_IGNORED_ID) {
+            goto next_element;
+        }
+
+        if (size != type->size) {
+            qemu_log_mask(LOG_GUEST_ERROR,"Size mismatch. Element ID:%04x."
+                          "Size Exp:%i Got:%i\n", id, type->size, size);
+            log_element(element, gsr);
+            return false;
+        }
+
+        if ((type->flags & GUEST_STATE_ELEMENT_TYPE_FLAG_READ_ONLY) &&
+            (gsr->flags & GUEST_STATE_REQUEST_SET)) {
+            qemu_log_mask(LOG_GUEST_ERROR,"trying to set a read-only Element "
+                          "ID:%04x.\n", id);
+            return false;
+        }
+
+        if (type->flags & GUEST_STATE_ELEMENT_TYPE_FLAG_GUEST_WIDE) {
+            /* guest wide element type */
+            if (!(gsr->flags & GUEST_STATE_REQUEST_GUEST_WIDE)) {
+                qemu_log_mask(LOG_GUEST_ERROR, "guest wide Element ID:%04x "
+                              "accessed without guest wide flag.\n", id);
+                return false;
+            }
+        } else {
+            /* thread wide element type */
+            if (gsr->flags & GUEST_STATE_REQUEST_GUEST_WIDE) {
+                qemu_log_mask(LOG_GUEST_ERROR, "thread wide Element ID:%04x "
+                              "accessed with guest wide flag set.\n", id);
+                return false;
+            }
+        }
+next_element:
+        element = guest_state_element_next(element, &len, &num_elements);
+
+    }
+    return true;
+}
+
+static bool is_gsr_invalid(struct guest_state_request *gsr,
+                                   struct guest_state_element *element,
+                                   struct guest_state_element_type *type)
+{
+    if ((gsr->flags & GUEST_STATE_REQUEST_SET) &&
+        (*(uint64_t *)(element->value) & ~(type->mask))) {
+        log_element(element, gsr);
+        qemu_log_mask(LOG_GUEST_ERROR, "L1 can't set reserved bits i"
+                      "(allowed mask: 0x%08lx)\n", type->mask);
+        return true;
+    }
+    return false;
+}
+
 static target_ulong h_guest_get_capabilities(PowerPCCPU *cpu,
                                              SpaprMachineState *spapr,
                                              target_ulong opcode,
@@ -1163,6 +1298,139 @@ static target_ulong h_guest_create_vcpu(PowerPCCPU *cpu,
     return H_SUCCESS;
 }
 
+static target_ulong getset_state(SpaprMachineStateNestedGuest *guest,
+                                 uint64_t vcpuid,
+                                 struct guest_state_request *gsr)
+{
+    void *ptr;
+    uint16_t id;
+    struct guest_state_element *element;
+    struct guest_state_element_type *type;
+    int64_t lenleft, num_elements;
+
+    lenleft = gsr->len;
+
+    if (!guest_state_request_check(gsr)) {
+        return H_P3;
+    }
+
+    num_elements = be32_to_cpu(gsr->gsb->num_elements);
+    element = gsr->gsb->elements;
+    /* Process the elements */
+    while (num_elements) {
+        type = NULL;
+        /* Debug print before doing anything */
+        if (false) {
+            log_element(element, gsr);
+        }
+
+        id = be16_to_cpu(element->id);
+        if (id == GSB_HV_VCPU_IGNORED_ID) {
+            goto next_element;
+        }
+
+        type = guest_state_element_type_find(id);
+        assert(type);
+
+        /* Get pointer to guest data to get/set */
+        if (type->location && type->copy) {
+            ptr = type->location(guest, vcpuid);
+            assert(ptr);
+            if (type->mask && is_gsr_invalid(gsr, element, type)) {
+                return H_INVALID_ELEMENT_VALUE;
+            }
+            type->copy(ptr + type->offset, element->value,
+                       gsr->flags & GUEST_STATE_REQUEST_SET ? true : false);
+        }
+
+next_element:
+        element = guest_state_element_next(element, &lenleft, &num_elements);
+    }
+
+    return H_SUCCESS;
+}
+
+static target_ulong map_and_getset_state(PowerPCCPU *cpu,
+                                         SpaprMachineStateNestedGuest *guest,
+                                         uint64_t vcpuid,
+                                         struct guest_state_request *gsr)
+{
+    target_ulong rc;
+    int64_t len;
+    bool is_write;
+
+    len = gsr->len;
+    /* only get_state would require write access to the provided buffer */
+    is_write = (gsr->flags & GUEST_STATE_REQUEST_SET) ? false : true;
+    gsr->gsb = address_space_map(CPU(cpu)->as, gsr->buf, (uint64_t *)&len,
+                                 is_write, MEMTXATTRS_UNSPECIFIED);
+    if (!gsr->gsb) {
+        rc = H_P3;
+        goto out1;
+    }
+
+    if (len != gsr->len) {
+        rc = H_P3;
+        goto out1;
+    }
+
+    rc = getset_state(guest, vcpuid, gsr);
+
+out1:
+    address_space_unmap(CPU(cpu)->as, gsr->gsb, len, is_write, len);
+    return rc;
+}
+
+static target_ulong h_guest_getset_state(PowerPCCPU *cpu,
+                                         SpaprMachineState *spapr,
+                                         target_ulong *args,
+                                         bool set)
+{
+    target_ulong flags = args[0];
+    target_ulong lpid = args[1];
+    target_ulong vcpuid = args[2];
+    target_ulong buf = args[3];
+    target_ulong buflen = args[4];
+    struct guest_state_request gsr;
+    SpaprMachineStateNestedGuest *guest;
+
+    guest = spapr_get_nested_guest(spapr, lpid);
+    if (!guest) {
+        return H_P2;
+    }
+    gsr.buf = buf;
+    assert(buflen <= GSB_MAX_BUF_SIZE);
+    gsr.len = buflen;
+    gsr.flags = 0;
+    if (flags & H_GUEST_GETSET_STATE_FLAG_GUEST_WIDE) {
+        gsr.flags |= GUEST_STATE_REQUEST_GUEST_WIDE;
+    }
+    if (flags & ~H_GUEST_GETSET_STATE_FLAG_GUEST_WIDE) {
+        return H_PARAMETER; /* flag not supported yet */
+    }
+
+    if (set) {
+        gsr.flags |= GUEST_STATE_REQUEST_SET;
+    }
+    return map_and_getset_state(cpu, guest, vcpuid, &gsr);
+}
+
+static target_ulong h_guest_set_state(PowerPCCPU *cpu,
+                                      SpaprMachineState *spapr,
+                                      target_ulong opcode,
+                                      target_ulong *args)
+{
+    return h_guest_getset_state(cpu, spapr, args, true);
+}
+
+static target_ulong h_guest_get_state(PowerPCCPU *cpu,
+                                      SpaprMachineState *spapr,
+                                      target_ulong opcode,
+                                      target_ulong *args)
+{
+    return h_guest_getset_state(cpu, spapr, args, false);
+}
+
 void spapr_register_nested(void)
 {
     spapr_register_hypercall(KVMPPC_H_SET_PARTITION_TABLE, h_set_ptbl);
@@ -1179,6 +1447,8 @@ void spapr_register_nested_papr(void)
     spapr_register_hypercall(H_GUEST_CREATE          , h_guest_create);
     spapr_register_hypercall(H_GUEST_DELETE          , h_guest_delete);
     spapr_register_hypercall(H_GUEST_CREATE_VCPU     , h_guest_create_vcpu);
+    spapr_register_hypercall(H_GUEST_SET_STATE       , h_guest_set_state);
+    spapr_register_hypercall(H_GUEST_GET_STATE       , h_guest_get_state);
 }
 #else
 void spapr_exit_nested(PowerPCCPU *cpu, int excp)
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 25aa8c5ffe..b9a67895bb 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -368,6 +368,7 @@ struct SpaprMachineState {
 #define H_OVERLAP         -68
 #define H_STATE           -75
 #define H_IN_USE          -77
+#define H_INVALID_ELEMENT_VALUE            -81
 #define H_UNSUPPORTED_FLAG -256
 #define H_MULTI_THREADS_ACTIVE -9005
 
@@ -591,6 +592,8 @@ struct SpaprMachineState {
 #define H_GUEST_SET_CAPABILITIES 0x464
 #define H_GUEST_CREATE           0x470
 #define H_GUEST_CREATE_VCPU      0x474
+#define H_GUEST_GET_STATE        0x478
+#define H_GUEST_SET_STATE        0x47C
 #define H_GUEST_DELETE           0x488
 
 #define MAX_HCALL_OPCODE         H_GUEST_DELETE
diff --git a/include/hw/ppc/spapr_nested.h b/include/hw/ppc/spapr_nested.h
index 9b647c0389..d27688b1b8 100644
--- a/include/hw/ppc/spapr_nested.h
+++ b/include/hw/ppc/spapr_nested.h
@@ -250,6 +250,10 @@ struct guest_state_element_type {
 #define HVMASK_MSR                    0xEBFFFFFFFFBFEFFF
 #define HVMASK_HDEXCR                 0x00000000FFFFFFFF
 #define HVMASK_TB_OFFSET              0x000000FFFFFFFFFF
+#define GSB_MAX_BUF_SIZE              (1024 * 1024)
+#define H_GUEST_GETSET_STATE_FLAG_GUEST_WIDE 0x8000000000000000
+#define GUEST_STATE_REQUEST_GUEST_WIDE       0x1
+#define GUEST_STATE_REQUEST_SET              0x2
 
 #define GUEST_STATE_ELEMENT(i, sz, s, f, ptr, c) { \
     .id = (i),                                     \
@@ -334,6 +338,25 @@ struct guest_state_element_type {
 #define GSE_ENV_DWM(i, f, m) \
     GUEST_STATE_ELEMENT_MSK(i, 8, f, copy_state_8to8, m)
 
+struct guest_state_element {
+    uint16_t id;
+    uint16_t size;
+    uint8_t value[];
+} QEMU_PACKED;
+
+struct guest_state_buffer {
+    uint32_t num_elements;
+    struct guest_state_element elements[];
+} QEMU_PACKED;
+
+/* Actual buffer plus some metadata about the request */
+struct guest_state_request {
+    struct guest_state_buffer *gsb;
+    int64_t buf;
+    int64_t len;
+    uint16_t flags;
+};
+
 /*
  * Register state for entering a nested guest with H_ENTER_NESTED.
  * New member must be added at the end.
-- 
2.39.3




* [PATCH v2 11/14] spapr: nested: Use correct source for parttbl info for nested PAPR API.
  2023-10-12 10:49 [PATCH v2 00/14] Nested PAPR API (KVM on PowerVM) Harsh Prateek Bora
                   ` (9 preceding siblings ...)
  2023-10-12 10:49 ` [PATCH v2 10/14] spapr: nested: Introduce H_GUEST_[GET|SET]_STATE hcalls Harsh Prateek Bora
@ 2023-10-12 10:49 ` Harsh Prateek Bora
  2023-11-29  4:15   ` Nicholas Piggin
  2023-10-12 10:49 ` [PATCH v2 12/14] spapr: nested: rename nested_host_state to nested_hv_host Harsh Prateek Bora
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 26+ messages in thread
From: Harsh Prateek Bora @ 2023-10-12 10:49 UTC (permalink / raw)
  To: npiggin, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

For the nested PAPR API, we use the SpaprMachineStateNestedGuest struct
to store partition table info, so use the same in
spapr_get_pate_nested() as well.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
---
 hw/ppc/spapr.c                |  5 +---
 hw/ppc/spapr_nested.c         | 49 +++++++++++++++++++++++------------
 include/hw/ppc/spapr_nested.h |  3 +++
 3 files changed, 36 insertions(+), 21 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 14196fdd11..e1f7b29842 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1356,10 +1356,7 @@ static bool spapr_get_pate(PPCVirtualHypervisor *vhyp, PowerPCCPU *cpu,
         return true;
     } else {
         assert(spapr->nested.api);
-        if (spapr->nested.api == NESTED_API_KVM_HV) {
-            return spapr_get_pate_nested(spapr, cpu, lpid, entry);
-        }
-        return false;
+        return spapr_get_pate_nested(spapr, cpu, lpid, entry);
     }
 }
 
diff --git a/hw/ppc/spapr_nested.c b/hw/ppc/spapr_nested.c
index f41b0c0944..1917d8bb13 100644
--- a/hw/ppc/spapr_nested.c
+++ b/hw/ppc/spapr_nested.c
@@ -19,29 +19,45 @@ void spapr_nested_init(SpaprMachineState *spapr)
 bool spapr_get_pate_nested(SpaprMachineState *spapr, PowerPCCPU *cpu,
                            target_ulong lpid, ppc_v3_pate_t *entry)
 {
-    uint64_t patb, pats;
+    if (spapr->nested.api == NESTED_API_KVM_HV) {
+        uint64_t patb, pats;
 
-    assert(lpid != 0);
+        assert(lpid != 0);
 
-    patb = spapr->nested.ptcr & PTCR_PATB;
-    pats = spapr->nested.ptcr & PTCR_PATS;
+        patb = spapr->nested.ptcr & PTCR_PATB;
+        pats = spapr->nested.ptcr & PTCR_PATS;
 
-    /* Check if partition table is properly aligned */
-    if (patb & MAKE_64BIT_MASK(0, pats + 12)) {
-        return false;
-    }
+        /* Check if partition table is properly aligned */
+        if (patb & MAKE_64BIT_MASK(0, pats + 12)) {
+            return false;
+        }
 
-    /* Calculate number of entries */
-    pats = 1ull << (pats + 12 - 4);
-    if (pats <= lpid) {
-        return false;
+        /* Calculate number of entries */
+        pats = 1ull << (pats + 12 - 4);
+        if (pats <= lpid) {
+            return false;
+        }
+
+        /* Grab entry */
+        patb += 16 * lpid;
+        entry->dw0 = ldq_phys(CPU(cpu)->as, patb);
+        entry->dw1 = ldq_phys(CPU(cpu)->as, patb + 8);
+        return true;
     }
+#ifdef CONFIG_TCG
+    /* Nested PAPR API */
+    SpaprMachineStateNestedGuest *guest;
+    assert(lpid != 0);
+    guest = spapr_get_nested_guest(spapr, lpid);
+    assert(guest != NULL);
+
+    entry->dw0 = guest->parttbl[0];
+    entry->dw1 = guest->parttbl[1];
 
-    /* Grab entry */
-    patb += 16 * lpid;
-    entry->dw0 = ldq_phys(CPU(cpu)->as, patb);
-    entry->dw1 = ldq_phys(CPU(cpu)->as, patb + 8);
     return true;
+#else
+    return false;
+#endif
 }
 
 #ifdef CONFIG_TCG
@@ -412,7 +428,6 @@ void spapr_exit_nested(PowerPCCPU *cpu, int excp)
     address_space_unmap(CPU(cpu)->as, regs, len, len, true);
 }
 
-static
 SpaprMachineStateNestedGuest *spapr_get_nested_guest(SpaprMachineState *spapr,
                                                      target_ulong guestid)
 {
diff --git a/include/hw/ppc/spapr_nested.h b/include/hw/ppc/spapr_nested.h
index d27688b1b8..e04202ecb4 100644
--- a/include/hw/ppc/spapr_nested.h
+++ b/include/hw/ppc/spapr_nested.h
@@ -457,4 +457,7 @@ bool spapr_get_pate_nested(SpaprMachineState *spapr, PowerPCCPU *cpu,
 void spapr_register_nested_papr(void);
 void spapr_nested_init(SpaprMachineState *spapr);
 void spapr_nested_gsb_init(void);
+SpaprMachineStateNestedGuest *spapr_get_nested_guest(SpaprMachineState *spapr,
+                                                     target_ulong lpid);
+
 #endif /* HW_SPAPR_NESTED_H */
-- 
2.39.3




* [PATCH v2 12/14] spapr: nested: rename nested_host_state to nested_hv_host
  2023-10-12 10:49 [PATCH v2 00/14] Nested PAPR API (KVM on PowerVM) Harsh Prateek Bora
                   ` (10 preceding siblings ...)
  2023-10-12 10:49 ` [PATCH v2 11/14] spapr: nested: Use correct source for parttbl info for nested PAPR API Harsh Prateek Bora
@ 2023-10-12 10:49 ` Harsh Prateek Bora
  2023-10-12 10:49 ` [PATCH v2 13/14] spapr: nested: keep nested-hv exit code restricted to its API Harsh Prateek Bora
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Harsh Prateek Bora @ 2023-10-12 10:49 UTC (permalink / raw)
  To: npiggin, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

Rename the nested host state variable to reflect the nested-hv API it
belongs to. An upcoming patch will follow a similar convention for the
nested PAPR API, which helps distinguish the two.

Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
---
 hw/ppc/spapr_nested.c           | 18 +++++++++---------
 include/hw/ppc/spapr_cpu_core.h |  2 +-
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/hw/ppc/spapr_nested.c b/hw/ppc/spapr_nested.c
index 1917d8bb13..607fb7ead2 100644
--- a/hw/ppc/spapr_nested.c
+++ b/hw/ppc/spapr_nested.c
@@ -238,21 +238,21 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
         return H_PARAMETER;
     }
 
-    spapr_cpu->nested_host_state = g_try_new(struct nested_ppc_state, 1);
-    if (!spapr_cpu->nested_host_state) {
+    spapr_cpu->nested_hv_host = g_try_new(struct nested_ppc_state, 1);
+    if (!spapr_cpu->nested_hv_host) {
         return H_NO_MEM;
     }
 
     assert(env->spr[SPR_LPIDR] == 0);
     assert(env->spr[SPR_DPDES] == 0);
-    nested_save_state(spapr_cpu->nested_host_state, cpu);
+    nested_save_state(spapr_cpu->nested_hv_host, cpu);
 
     len = sizeof(*regs);
     regs = address_space_map(CPU(cpu)->as, regs_ptr, &len, false,
                                 MEMTXATTRS_UNSPECIFIED);
     if (!regs || len != sizeof(*regs)) {
         address_space_unmap(CPU(cpu)->as, regs, len, 0, false);
-        g_free(spapr_cpu->nested_host_state);
+        g_free(spapr_cpu->nested_hv_host);
         return H_P2;
     }
 
@@ -330,8 +330,8 @@ void spapr_exit_nested(PowerPCCPU *cpu, int excp)
     CPUPPCState *env = &cpu->env;
     SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
     struct nested_ppc_state l2_state;
-    target_ulong hv_ptr = spapr_cpu->nested_host_state->gpr[4];
-    target_ulong regs_ptr = spapr_cpu->nested_host_state->gpr[5];
+    target_ulong hv_ptr = spapr_cpu->nested_hv_host->gpr[4];
+    target_ulong regs_ptr = spapr_cpu->nested_hv_host->gpr[5];
     target_ulong hsrr0, hsrr1, hdar, asdr, hdsisr;
     struct kvmppc_hv_guest_state *hvstate;
     struct kvmppc_pt_regs *regs;
@@ -350,15 +350,15 @@ void spapr_exit_nested(PowerPCCPU *cpu, int excp)
      * Switch back to the host environment (including for any error).
      */
     assert(env->spr[SPR_LPIDR] != 0);
-    nested_load_state(cpu, spapr_cpu->nested_host_state);
+    nested_load_state(cpu, spapr_cpu->nested_hv_host);
     env->gpr[3] = env->excp_vectors[excp]; /* hcall return value */
 
     cpu_ppc_hdecr_exit(env);
 
     spapr_cpu->in_nested = false;
 
-    g_free(spapr_cpu->nested_host_state);
-    spapr_cpu->nested_host_state = NULL;
+    g_free(spapr_cpu->nested_hv_host);
+    spapr_cpu->nested_hv_host = NULL;
 
     len = sizeof(*hvstate);
     hvstate = address_space_map(CPU(cpu)->as, hv_ptr, &len, true,
diff --git a/include/hw/ppc/spapr_cpu_core.h b/include/hw/ppc/spapr_cpu_core.h
index 69a52e39b8..9c8c59f173 100644
--- a/include/hw/ppc/spapr_cpu_core.h
+++ b/include/hw/ppc/spapr_cpu_core.h
@@ -53,7 +53,7 @@ typedef struct SpaprCpuState {
 
     /* Fields for nested-HV support */
     bool in_nested; /* true while the L2 is executing */
-    struct nested_ppc_state *nested_host_state; /* holds the L1 state while L2 executes */
+    struct nested_ppc_state *nested_hv_host; /* holds the L1 state while L2 executes */
 } SpaprCpuState;
 
 static inline SpaprCpuState *spapr_cpu_state(PowerPCCPU *cpu)
-- 
2.39.3




* [PATCH v2 13/14] spapr: nested: keep nested-hv exit code restricted to its API.
  2023-10-12 10:49 [PATCH v2 00/14] Nested PAPR API (KVM on PowerVM) Harsh Prateek Bora
                   ` (11 preceding siblings ...)
  2023-10-12 10:49 ` [PATCH v2 12/14] spapr: nested: rename nested_host_state to nested_hv_host Harsh Prateek Bora
@ 2023-10-12 10:49 ` Harsh Prateek Bora
  2023-11-29  4:16   ` Nicholas Piggin
  2023-10-12 10:49 ` [PATCH v2 14/14] spapr: nested: Introduce H_GUEST_RUN_VCPU hcall Harsh Prateek Bora
  2023-11-13 13:15 ` [PATCH v2 00/14] Nested PAPR API (KVM on PowerVM) Nicholas Piggin
  14 siblings, 1 reply; 26+ messages in thread
From: Harsh Prateek Bora @ 2023-10-12 10:49 UTC (permalink / raw)
  To: npiggin, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

spapr_exit_nested gets triggered on a nested guest exit and is
currently used only by the nested-hv API. Isolating the code flow per
API makes it easier to extend it for the nested PAPR API as well.

Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
---
 hw/ppc/spapr_nested.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr_nested.c b/hw/ppc/spapr_nested.c
index 607fb7ead2..e2d0cb5559 100644
--- a/hw/ppc/spapr_nested.c
+++ b/hw/ppc/spapr_nested.c
@@ -325,7 +325,7 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
     return env->gpr[3];
 }
 
-void spapr_exit_nested(PowerPCCPU *cpu, int excp)
+static void spapr_exit_nested_hv(PowerPCCPU *cpu, int excp)
 {
     CPUPPCState *env = &cpu->env;
     SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
@@ -337,8 +337,6 @@ void spapr_exit_nested(PowerPCCPU *cpu, int excp)
     struct kvmppc_pt_regs *regs;
     hwaddr len;
 
-    assert(spapr_cpu->in_nested);
-
     nested_save_state(&l2_state, cpu);
     hsrr0 = env->spr[SPR_HSRR0];
     hsrr1 = env->spr[SPR_HSRR1];
@@ -428,6 +426,17 @@ void spapr_exit_nested(PowerPCCPU *cpu, int excp)
     address_space_unmap(CPU(cpu)->as, regs, len, len, true);
 }
 
+void spapr_exit_nested(PowerPCCPU *cpu, int excp)
+{
+    SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
+    SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
+
+    assert(spapr_cpu->in_nested);
+    if (spapr->nested.api == NESTED_API_KVM_HV) {
+        spapr_exit_nested_hv(cpu, excp);
+    }
+}
+
 SpaprMachineStateNestedGuest *spapr_get_nested_guest(SpaprMachineState *spapr,
                                                      target_ulong guestid)
 {
-- 
2.39.3




* [PATCH v2 14/14] spapr: nested: Introduce H_GUEST_RUN_VCPU hcall.
  2023-10-12 10:49 [PATCH v2 00/14] Nested PAPR API (KVM on PowerVM) Harsh Prateek Bora
                   ` (12 preceding siblings ...)
  2023-10-12 10:49 ` [PATCH v2 13/14] spapr: nested: keep nested-hv exit code restricted to its API Harsh Prateek Bora
@ 2023-10-12 10:49 ` Harsh Prateek Bora
  2023-11-29  4:58   ` Nicholas Piggin
  2023-11-13 13:15 ` [PATCH v2 00/14] Nested PAPR API (KVM on PowerVM) Nicholas Piggin
  14 siblings, 1 reply; 26+ messages in thread
From: Harsh Prateek Bora @ 2023-10-12 10:49 UTC (permalink / raw)
  To: npiggin, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

The H_GUEST_RUN_VCPU hcall is used to start execution of a Guest VCPU.
The Hypervisor will update the state of the Guest VCPU based on the
input buffer, restore the saved Guest VCPU state, and start its execution.

The Guest VCPU can stop running for numerous reasons including HCALLs,
hypervisor exceptions, or an outstanding Host Partition Interrupt.
The reason that the Guest VCPU stopped running is communicated through
R4 and the output buffer will be filled in with any relevant state.
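
As a rough sketch of the L1 flow this enables (the wrapper and
handler below are hypothetical names; the run input/output buffers
correspond to vcpu->runbufin / vcpu->runbufout set up beforehand):

    /* sketch: L1 keeps re-entering the L2 VCPU until it must act */
    uint64_t exit_reason;
    long rc;

    for (;;) {
        /* args: flags (none supported yet), lpid, vcpuid */
        rc = h_guest_run_vcpu_hcall(0, lpid, vcpuid, &exit_reason);
        if (rc != H_SUCCESS) {
            break;              /* e.g. H_P2/H_P3/H_NOT_AVAILABLE */
        }
        /* exit_reason comes back in R4; the output buffer now holds
         * the relevant state (e.g. HDAR/HDSISR/NIA/MSR for 0xe00) */
        handle_l2_exit(exit_reason);
    }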

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Kautuk Consul <kconsul@linux.vnet.ibm.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
---
 hw/ppc/spapr_nested.c           | 308 ++++++++++++++++++++++++++++++--
 include/hw/ppc/spapr.h          |   1 +
 include/hw/ppc/spapr_cpu_core.h |   7 +-
 3 files changed, 302 insertions(+), 14 deletions(-)

diff --git a/hw/ppc/spapr_nested.c b/hw/ppc/spapr_nested.c
index e2d0cb5559..d3e7629f63 100644
--- a/hw/ppc/spapr_nested.c
+++ b/hw/ppc/spapr_nested.c
@@ -141,6 +141,15 @@ static void nested_save_state(struct nested_ppc_state *save, PowerPCCPU *cpu)
     save->tb_offset = env->tb_env->tb_offset;
 }
 
+static void nested_post_state_update(CPUPPCState *env, CPUState *cs)
+{
+    hreg_compute_hflags(env);
+    ppc_maybe_interrupt(env);
+    tlb_flush(cs);
+    env->reserve_addr = -1; /* Reset the reservation */
+
+}
+
 static void nested_load_state(PowerPCCPU *cpu, struct nested_ppc_state *load)
 {
     CPUState *cs = CPU(cpu);
@@ -172,19 +181,7 @@ static void nested_load_state(PowerPCCPU *cpu, struct nested_ppc_state *load)
     env->spr[SPR_PPR] = load->ppr;
 
     env->tb_env->tb_offset = load->tb_offset;
-
-    /*
-     * MSR updated, compute hflags and possible interrupts.
-     */
-    hreg_compute_hflags(env);
-    ppc_maybe_interrupt(env);
-
-    /*
-     * Nested HV does not tag TLB entries between L1 and L2, so must
-     * flush on transition.
-     */
-    tlb_flush(cs);
-    env->reserve_addr = -1; /* Reset the reservation */
+    nested_post_state_update(env, cs);
 }
 
 /*
@@ -426,6 +423,9 @@ static void spapr_exit_nested_hv(PowerPCCPU *cpu, int excp)
     address_space_unmap(CPU(cpu)->as, regs, len, len, true);
 }
 
+static
+void spapr_exit_nested_papr(SpaprMachineState *spapr, PowerPCCPU *cpu, int excp);
+
 void spapr_exit_nested(PowerPCCPU *cpu, int excp)
 {
     SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
@@ -434,6 +434,10 @@ void spapr_exit_nested(PowerPCCPU *cpu, int excp)
     assert(spapr_cpu->in_nested);
     if (spapr->nested.api == NESTED_API_KVM_HV) {
         spapr_exit_nested_hv(cpu, excp);
+    } else if (spapr->nested.api == NESTED_API_PAPR) {
+        spapr_exit_nested_papr(spapr, cpu, excp);
+    } else {
+        g_assert_not_reached();
     }
 }
 
@@ -1455,6 +1459,283 @@ static target_ulong h_guest_get_state(PowerPCCPU *cpu,
     return h_guest_getset_state(cpu, spapr, args, false);
 }
 
+static void restore_common_regs(CPUPPCState *dst, CPUPPCState *src)
+{
+    memcpy(dst->gpr, src->gpr, sizeof(dst->gpr));
+    memcpy(dst->crf, src->crf, sizeof(dst->crf));
+    memcpy(dst->vsr, src->vsr, sizeof(dst->vsr));
+    dst->nip = src->nip;
+    dst->msr = src->msr;
+    dst->lr  = src->lr;
+    dst->ctr = src->ctr;
+    dst->cfar = src->cfar;
+    cpu_write_xer(dst, src->xer);
+    ppc_store_vscr(dst, ppc_get_vscr(src));
+    ppc_store_fpscr(dst, src->fpscr);
+    memcpy(dst->spr, src->spr, sizeof(dst->spr));
+}
+
+static void exit_nested_restore_vcpu(PowerPCCPU *cpu, int excp,
+                                     SpaprMachineStateNestedGuestVcpu *vcpu)
+{
+    CPUPPCState *env = &cpu->env;
+    target_ulong now, hdar, hdsisr, asdr;
+
+    assert(sizeof(env->gpr) == sizeof(vcpu->env.gpr)); /* sanity check */
+
+    now = cpu_ppc_load_tbl(env); /* L2 timebase */
+    now -= vcpu->tb_offset; /* L1 timebase */
+    vcpu->dec_expiry_tb = now - cpu_ppc_load_decr(env);
+    /* back up hdar, hdsisr, asdr in case they must be restored below */
+    hdar   = vcpu->env.spr[SPR_HDAR];
+    hdsisr = vcpu->env.spr[SPR_HDSISR];
+    asdr   = vcpu->env.spr[SPR_ASDR];
+
+    restore_common_regs(&vcpu->env, env);
+
+    if (excp == POWERPC_EXCP_MCHECK ||
+        excp == POWERPC_EXCP_RESET ||
+        excp == POWERPC_EXCP_SYSCALL) {
+        vcpu->env.nip = env->spr[SPR_SRR0];
+        vcpu->env.msr = env->spr[SPR_SRR1] & env->msr_mask;
+    } else {
+        vcpu->env.nip = env->spr[SPR_HSRR0];
+        vcpu->env.msr = env->spr[SPR_HSRR1] & env->msr_mask;
+    }
+
+    /* hdar, hdsisr, asdr should be retained unless certain exceptions */
+    if ((excp != POWERPC_EXCP_HDSI) && (excp != POWERPC_EXCP_HISI)) {
+        vcpu->env.spr[SPR_ASDR] = asdr;
+    } else if (excp != POWERPC_EXCP_HDSI) {
+        vcpu->env.spr[SPR_HDAR]   = hdar;
+        vcpu->env.spr[SPR_HDSISR] = hdsisr;
+    }
+}
+
+static int get_exit_ids(uint64_t srr0, uint16_t ids[16])
+{
+    int nr;
+
+    switch (srr0) {
+    case 0xc00:
+        nr = 10;
+        ids[0] = GSB_VCPU_GPR3;
+        ids[1] = GSB_VCPU_GPR4;
+        ids[2] = GSB_VCPU_GPR5;
+        ids[3] = GSB_VCPU_GPR6;
+        ids[4] = GSB_VCPU_GPR7;
+        ids[5] = GSB_VCPU_GPR8;
+        ids[6] = GSB_VCPU_GPR9;
+        ids[7] = GSB_VCPU_GPR10;
+        ids[8] = GSB_VCPU_GPR11;
+        ids[9] = GSB_VCPU_GPR12;
+        break;
+    case 0xe00:
+        nr = 5;
+        ids[0] = GSB_VCPU_SPR_HDAR;
+        ids[1] = GSB_VCPU_SPR_HDSISR;
+        ids[2] = GSB_VCPU_SPR_ASDR;
+        ids[3] = GSB_VCPU_SPR_NIA;
+        ids[4] = GSB_VCPU_SPR_MSR;
+        break;
+    case 0xe20:
+        nr = 4;
+        ids[0] = GSB_VCPU_SPR_HDAR;
+        ids[1] = GSB_VCPU_SPR_ASDR;
+        ids[2] = GSB_VCPU_SPR_NIA;
+        ids[3] = GSB_VCPU_SPR_MSR;
+        break;
+    case 0xe40:
+        nr = 3;
+        ids[0] = GSB_VCPU_SPR_HEIR;
+        ids[1] = GSB_VCPU_SPR_NIA;
+        ids[2] = GSB_VCPU_SPR_MSR;
+        break;
+    case 0xf80:
+        nr = 3;
+        ids[0] = GSB_VCPU_SPR_HFSCR;
+        ids[1] = GSB_VCPU_SPR_NIA;
+        ids[2] = GSB_VCPU_SPR_MSR;
+        break;
+    default:
+        nr = 0;
+        break;
+    }
+
+    return nr;
+}
+
+static void exit_process_output_buffer(PowerPCCPU *cpu,
+                                      SpaprMachineStateNestedGuest *guest,
+                                      target_ulong vcpuid,
+                                      target_ulong *r3)
+{
+    SpaprMachineStateNestedGuestVcpu *vcpu = &guest->vcpu[vcpuid];
+    struct guest_state_request gsr;
+    struct guest_state_buffer *gsb;
+    struct guest_state_element *element;
+    struct guest_state_element_type *type;
+    int exit_id_count = 0;
+    uint16_t exit_cause_ids[16];
+    hwaddr len;
+
+    len = vcpu->runbufout.size;
+    gsb = address_space_map(CPU(cpu)->as, vcpu->runbufout.addr, &len, true,
+                            MEMTXATTRS_UNSPECIFIED);
+    if (!gsb || len != vcpu->runbufout.size) {
+        address_space_unmap(CPU(cpu)->as, gsb, len, true, len);
+        *r3 = H_P2;
+        return;
+    }
+
+    exit_id_count = get_exit_ids(*r3, exit_cause_ids);
+
+    /* Create a buffer of elements to send back */
+    gsb->num_elements = cpu_to_be32(exit_id_count);
+    element = gsb->elements;
+    for (int i = 0; i < exit_id_count; i++) {
+        type = guest_state_element_type_find(exit_cause_ids[i]);
+        assert(type);
+        element->id = cpu_to_be16(exit_cause_ids[i]);
+        element->size = cpu_to_be16(type->size);
+        element = guest_state_element_next(element, NULL, NULL);
+    }
+    gsr.gsb = gsb;
+    gsr.len = VCPU_OUT_BUF_MIN_SZ;
+    gsr.flags = 0; /* get + never guest wide */
+    getset_state(guest, vcpuid, &gsr);
+
+    address_space_unmap(CPU(cpu)->as, gsb, len, true, len);
+    return;
+}
+
+static
+void spapr_exit_nested_papr(SpaprMachineState *spapr, PowerPCCPU *cpu, int excp)
+{
+    CPUState *cs = CPU(cpu);
+    CPUPPCState *env = &cpu->env;
+    SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
+    target_ulong r3_return = env->excp_vectors[excp]; /* exit reason for R4 */
+    target_ulong lpid = 0, vcpuid = 0;
+    struct SpaprMachineStateNestedGuestVcpu *vcpu = NULL;
+    struct SpaprMachineStateNestedGuest *guest = NULL;
+
+    lpid = spapr_cpu->nested_papr_host->gpr[5];
+    vcpuid = spapr_cpu->nested_papr_host->gpr[6];
+    guest = spapr_get_nested_guest(spapr, lpid);
+    assert(guest);
+    spapr_nested_vcpu_check(guest, vcpuid, false);
+    vcpu = &guest->vcpu[vcpuid];
+
+    exit_nested_restore_vcpu(cpu, excp, vcpu);
+    /* do the output buffer for run_vcpu */
+    exit_process_output_buffer(cpu, guest, vcpuid, &r3_return);
+
+    assert(env->spr[SPR_LPIDR] != 0);
+    restore_common_regs(env, spapr_cpu->nested_papr_host);
+    env->tb_env->tb_offset -= vcpu->tb_offset;
+    env->gpr[3] = H_SUCCESS;
+    env->gpr[4] = r3_return;
+    nested_post_state_update(env, cs);
+    cpu_ppc_hdecr_exit(env);
+
+    spapr_cpu->in_nested = false;
+    g_free(spapr_cpu->nested_papr_host);
+    spapr_cpu->nested_papr_host = NULL;
+}
+
+static void nested_papr_restore_l2_state(PowerPCCPU *cpu,
+                                         CPUPPCState *env,
+                                         SpaprMachineStateNestedGuestVcpu *vcpu,
+                                         target_ulong now)
+{
+    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
+    target_ulong lpcr, lpcr_mask, hdec;
+    lpcr_mask = LPCR_DPFD | LPCR_ILE | LPCR_AIL | LPCR_LD | LPCR_MER;
+
+    assert(vcpu);
+    assert(sizeof(env->gpr) == sizeof(vcpu->env.gpr));
+    restore_common_regs(env, &vcpu->env);
+    lpcr = (env->spr[SPR_LPCR] & ~lpcr_mask) |
+           (vcpu->env.spr[SPR_LPCR] & lpcr_mask);
+    lpcr |= LPCR_HR | LPCR_UPRT | LPCR_GTSE | LPCR_HVICE | LPCR_HDICE;
+    lpcr &= ~LPCR_LPES0;
+    env->spr[SPR_LPCR] = lpcr & pcc->lpcr_mask;
+
+    hdec = vcpu->hdecr_expiry_tb - now;
+    cpu_ppc_store_decr(env, vcpu->dec_expiry_tb - now);
+    cpu_ppc_hdecr_init(env);
+    cpu_ppc_store_hdecr(env, hdec);
+
+    env->tb_env->tb_offset += vcpu->tb_offset;
+}
+
+static void nested_papr_run_vcpu(PowerPCCPU *cpu,
+                                 uint64_t lpid,
+                                 SpaprMachineStateNestedGuestVcpu *vcpu)
+{
+    SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
+    CPUState *cs = CPU(cpu);
+    CPUPPCState *env = &cpu->env;
+    SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
+    target_ulong now = cpu_ppc_load_tbl(env);
+
+    assert(env->spr[SPR_LPIDR] == 0);
+    assert(spapr->nested.api); /* ensure API version is initialized */
+    spapr_cpu->nested_papr_host = g_try_new(CPUPPCState, 1);
+    assert(spapr_cpu->nested_papr_host);
+    memcpy(spapr_cpu->nested_papr_host, env, sizeof(CPUPPCState));
+
+    nested_papr_restore_l2_state(cpu, env, vcpu, now);
+    env->spr[SPR_LPIDR] = lpid; /* post restore l2 state */
+
+    spapr_cpu->in_nested = true;
+
+    nested_post_state_update(env, cs);
+}
+
+static target_ulong h_guest_run_vcpu(PowerPCCPU *cpu,
+                                     SpaprMachineState *spapr,
+                                     target_ulong opcode,
+                                     target_ulong *args)
+{
+    CPUPPCState *env = &cpu->env;
+    target_ulong flags = args[0];
+    target_ulong lpid = args[1];
+    target_ulong vcpuid = args[2];
+    struct SpaprMachineStateNestedGuestVcpu *vcpu;
+    struct guest_state_request gsr;
+    SpaprMachineStateNestedGuest *guest;
+
+    if (flags) /* don't handle any flags for now */
+        return H_PARAMETER;
+
+    guest = spapr_get_nested_guest(spapr, lpid);
+    if (!guest) {
+        return H_P2;
+    }
+    if (!spapr_nested_vcpu_check(guest, vcpuid, true)) {
+        return H_P3;
+    }
+
+    if (guest->parttbl[0] == 0) {
+        /* At least need a partition scoped radix tree */
+        return H_NOT_AVAILABLE;
+    }
+
+    vcpu = &guest->vcpu[vcpuid];
+
+    /* Read run_vcpu input buffer to update state */
+    gsr.buf = vcpu->runbufin.addr;
+    gsr.len = vcpu->runbufin.size;
+    gsr.flags = GUEST_STATE_REQUEST_SET; /* Thread wide + writing */
+    if (!map_and_getset_state(cpu, guest, vcpuid, &gsr)) {
+        nested_papr_run_vcpu(cpu, lpid, vcpu);
+    }
+
+    return env->gpr[3];
+}
+
 void spapr_register_nested(void)
 {
     spapr_register_hypercall(KVMPPC_H_SET_PARTITION_TABLE, h_set_ptbl);
@@ -1473,6 +1754,7 @@ void spapr_register_nested_papr(void)
     spapr_register_hypercall(H_GUEST_CREATE_VCPU     , h_guest_create_vcpu);
     spapr_register_hypercall(H_GUEST_SET_STATE       , h_guest_set_state);
     spapr_register_hypercall(H_GUEST_GET_STATE       , h_guest_get_state);
+    spapr_register_hypercall(H_GUEST_RUN_VCPU        , h_guest_run_vcpu);
 }
 #else
 void spapr_exit_nested(PowerPCCPU *cpu, int excp)
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index b9a67895bb..e278ddc7cf 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -594,6 +594,7 @@ struct SpaprMachineState {
 #define H_GUEST_CREATE_VCPU      0x474
 #define H_GUEST_GET_STATE        0x478
 #define H_GUEST_SET_STATE        0x47C
+#define H_GUEST_RUN_VCPU         0x480
 #define H_GUEST_DELETE           0x488
 
 #define MAX_HCALL_OPCODE         H_GUEST_DELETE
diff --git a/include/hw/ppc/spapr_cpu_core.h b/include/hw/ppc/spapr_cpu_core.h
index 9c8c59f173..a9749a2df1 100644
--- a/include/hw/ppc/spapr_cpu_core.h
+++ b/include/hw/ppc/spapr_cpu_core.h
@@ -53,7 +53,12 @@ typedef struct SpaprCpuState {
 
     /* Fields for nested-HV support */
     bool in_nested; /* true while the L2 is executing */
-    struct nested_ppc_state *nested_hv_host; /* holds the L1 state while L2 executes */
+    union {
+        /* holds the L1 state while L2 executes */
+        struct nested_ppc_state *nested_hv_host;
+        CPUPPCState             *nested_papr_host;
+    };
+
 } SpaprCpuState;
 
 static inline SpaprCpuState *spapr_cpu_state(PowerPCCPU *cpu)
-- 
2.39.3




* Re: [PATCH v2 00/14] Nested PAPR API (KVM on PowerVM)
  2023-10-12 10:49 [PATCH v2 00/14] Nested PAPR API (KVM on PowerVM) Harsh Prateek Bora
                   ` (13 preceding siblings ...)
  2023-10-12 10:49 ` [PATCH v2 14/14] spapr: nested: Introduce H_GUEST_RUN_VCPU hcall Harsh Prateek Bora
@ 2023-11-13 13:15 ` Nicholas Piggin
  14 siblings, 0 replies; 26+ messages in thread
From: Nicholas Piggin @ 2023-11-13 13:15 UTC (permalink / raw)
  To: Harsh Prateek Bora, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

This will have to wait until the next release. There should not be
big changes to rebase on. I'll take a better look before then.

Linux now has this merged upstream so it will be much easier to test.

I posted some RFCs for new avocado tests including a KVM guest boot
(https://lists.gnu.org/archive/html/qemu-ppc/2023-10/msg00260.html).
But I guess it won't be too easy to adapt that to test the new API
until there is a usable distro image with support. We should open an
issue for that.

Thanks,
Nick

On Thu Oct 12, 2023 at 8:49 PM AEST, Harsh Prateek Bora wrote:
> There is an existing Nested-HV API to enable nested guests on powernv
> machines. However, that is not supported on pseries/PowerVM LPARs.
> This patch series implements required hcall interfaces to enable nested
> guests with KVM on PowerVM.
> Unlike Nested-HV, with this API, entire L2 state is retained by L0
> during guest entry/exit and uses pre-defined Guest State Buffer (GSB)
> format to communicate guest state between L1 and L2 via L0.
>
> L0 here refers to the phyp/PowerVM, or launching a Qemu TCG L0 with the
> newly introduced option cap-nested-papr=true.
> L1 refers to the LPAR host on PowerVM or Linux booted on Qemu TCG with
> above mentioned option cap-nested-papr=true.
> L2 refers to nested guest running on top of L1 using KVM.
> No SW changes needed for Qemu running in L1 Linux as well as L2 Kernel.
>
> There is a Linux Kernel side patch series to enable support for Nested
> PAPR in L1 and same can be found at below url:
>
> Linux Kernel patch series:
> - https://lore.kernel.org/linuxppc-dev/20230914030600.16993-1-jniethe5@gmail.com/
>
> For more details, documentation can be referred in either of patch
> series.
>
> There are scripts available to assist in setting up an environment for
> testing nested guests at https://github.com/iamjpn/kvm-powervm-test
>
> A tree with this series is available at:
> https://github.com/planetharsh/qemu/tree/upstream-kop-1012
>
> Thanks to Michael Neuling, Shivaprasad Bhat, Kautuk Consul, Vaibhav Jain
> and Jordan Niethe.
>
> Changelog:
>
> v2:
>     - Addressed review comments from Nick on v1 series.
> v1:
>     - https://lore.kernel.org/qemu-devel/20230906043333.448244-1-harshpb@linux.ibm.com/
>
> Harsh Prateek Bora (14):
>   spapr: nested:  move nested part of spapr_get_pate into spapr_nested.c
>   spapr: nested: Introduce SpaprMachineStateNested to store related
>     info.
>   spapr: nested: Document Nested PAPR API
>   spapr: nested: Introduce cap-nested-papr for Nested PAPR API
>   spapr: nested: register nested-hv api hcalls only for cap-nested-hv
>   spapr: nested: Introduce H_GUEST_[GET|SET]_CAPABILITIES hcalls.
>   spapr: nested: Introduce H_GUEST_[CREATE|DELETE] hcalls.
>   spapr: nested: Introduce H_GUEST_CREATE_VPCU hcall.
>   spapr: nested: Initialize the GSB elements lookup table.
>   spapr: nested: Introduce H_GUEST_[GET|SET]_STATE hcalls.
>   spapr: nested: Use correct source for parttbl info for nested PAPR
>     API.
>   spapr: nested: rename nested_host_state to nested_hv_host
>   spapr: nested: keep nested-hv exit code restricted to its API.
>   spapr: nested: Introduce H_GUEST_RUN_VCPU hcall.
>
>  docs/devel/nested-papr.txt      |  500 +++++++++++
>  hw/ppc/spapr.c                  |   32 +-
>  hw/ppc/spapr_caps.c             |   63 ++
>  hw/ppc/spapr_hcall.c            |    2 -
>  hw/ppc/spapr_nested.c           | 1439 ++++++++++++++++++++++++++++++-
>  include/hw/ppc/spapr.h          |   21 +-
>  include/hw/ppc/spapr_cpu_core.h |    7 +-
>  include/hw/ppc/spapr_nested.h   |  361 ++++++++
>  target/ppc/cpu.h                |    2 +
>  9 files changed, 2368 insertions(+), 59 deletions(-)
>  create mode 100644 docs/devel/nested-papr.txt




* Re: [PATCH v2 01/14] spapr: nested: move nested part of spapr_get_pate into spapr_nested.c
  2023-10-12 10:49 ` [PATCH v2 01/14] spapr: nested: move nested part of spapr_get_pate into spapr_nested.c Harsh Prateek Bora
@ 2023-11-29  3:47   ` Nicholas Piggin
  0 siblings, 0 replies; 26+ messages in thread
From: Nicholas Piggin @ 2023-11-29  3:47 UTC (permalink / raw)
  To: Harsh Prateek Bora, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

On Thu Oct 12, 2023 at 8:49 PM AEST, Harsh Prateek Bora wrote:
> Most of the nested code has already been moved to spapr_nested.c.
> This logic inside spapr_get_pate is related to nested guests and
> better suited for spapr_nested.c, hence moving there.
>
> Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>

> ---
>  hw/ppc/spapr.c                | 28 ++--------------------------
>  hw/ppc/spapr_nested.c         | 29 +++++++++++++++++++++++++++++
>  include/hw/ppc/spapr_nested.h |  3 ++-
>  3 files changed, 33 insertions(+), 27 deletions(-)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index cb840676d3..a2c69d0f4f 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1341,7 +1341,6 @@ void spapr_init_all_lpcrs(target_ulong value, target_ulong mask)
>      }
>  }
>  
> -
>  static bool spapr_get_pate(PPCVirtualHypervisor *vhyp, PowerPCCPU *cpu,
>                             target_ulong lpid, ppc_v3_pate_t *entry)
>  {
> @@ -1354,33 +1353,10 @@ static bool spapr_get_pate(PPCVirtualHypervisor *vhyp, PowerPCCPU *cpu,
>          /* Copy PATE1:GR into PATE0:HR */
>          entry->dw0 = spapr->patb_entry & PATE0_HR;
>          entry->dw1 = spapr->patb_entry;
> -
> +        return true;
>      } else {
> -        uint64_t patb, pats;
> -
> -        assert(lpid != 0);
> -
> -        patb = spapr->nested_ptcr & PTCR_PATB;
> -        pats = spapr->nested_ptcr & PTCR_PATS;
> -
> -        /* Check if partition table is properly aligned */
> -        if (patb & MAKE_64BIT_MASK(0, pats + 12)) {
> -            return false;
> -        }
> -
> -        /* Calculate number of entries */
> -        pats = 1ull << (pats + 12 - 4);
> -        if (pats <= lpid) {
> -            return false;
> -        }
> -
> -        /* Grab entry */
> -        patb += 16 * lpid;
> -        entry->dw0 = ldq_phys(CPU(cpu)->as, patb);
> -        entry->dw1 = ldq_phys(CPU(cpu)->as, patb + 8);
> +        return spapr_get_pate_nested(spapr, cpu, lpid, entry);
>      }
> -
> -    return true;
>  }
>  
>  #define HPTE(_table, _i)   (void *)(((uint64_t *)(_table)) + ((_i) * 2))
> diff --git a/hw/ppc/spapr_nested.c b/hw/ppc/spapr_nested.c
> index 121aa96ddc..123e127b08 100644
> --- a/hw/ppc/spapr_nested.c
> +++ b/hw/ppc/spapr_nested.c
> @@ -6,6 +6,35 @@
>  #include "hw/ppc/spapr.h"
>  #include "hw/ppc/spapr_cpu_core.h"
>  #include "hw/ppc/spapr_nested.h"
> +#include "mmu-book3s-v3.h"
> +
> +bool spapr_get_pate_nested(SpaprMachineState *spapr, PowerPCCPU *cpu,
> +                           target_ulong lpid, ppc_v3_pate_t *entry)
> +{
> +    uint64_t patb, pats;
> +
> +    assert(lpid != 0);
> +
> +    patb = spapr->nested_ptcr & PTCR_PATB;
> +    pats = spapr->nested_ptcr & PTCR_PATS;
> +
> +    /* Check if partition table is properly aligned */
> +    if (patb & MAKE_64BIT_MASK(0, pats + 12)) {
> +        return false;
> +    }
> +
> +    /* Calculate number of entries */
> +    pats = 1ull << (pats + 12 - 4);
> +    if (pats <= lpid) {
> +        return false;
> +    }
> +
> +    /* Grab entry */
> +    patb += 16 * lpid;
> +    entry->dw0 = ldq_phys(CPU(cpu)->as, patb);
> +    entry->dw1 = ldq_phys(CPU(cpu)->as, patb + 8);
> +    return true;
> +}
>  
>  #ifdef CONFIG_TCG
>  #define PRTS_MASK      0x1f
> diff --git a/include/hw/ppc/spapr_nested.h b/include/hw/ppc/spapr_nested.h
> index d383486476..e3d15d6d0b 100644
> --- a/include/hw/ppc/spapr_nested.h
> +++ b/include/hw/ppc/spapr_nested.h
> @@ -98,5 +98,6 @@ struct nested_ppc_state {
>  
>  void spapr_register_nested(void);
>  void spapr_exit_nested(PowerPCCPU *cpu, int excp);
> -
> +bool spapr_get_pate_nested(SpaprMachineState *spapr, PowerPCCPU *cpu,
> +                           target_ulong lpid, ppc_v3_pate_t *entry);
>  #endif /* HW_SPAPR_NESTED_H */




* Re: [PATCH v2 02/14] spapr: nested: Introduce SpaprMachineStateNested to store related info.
  2023-10-12 10:49 ` [PATCH v2 02/14] spapr: nested: Introduce SpaprMachineStateNested to store related info Harsh Prateek Bora
@ 2023-11-29  3:48   ` Nicholas Piggin
  0 siblings, 0 replies; 26+ messages in thread
From: Nicholas Piggin @ 2023-11-29  3:48 UTC (permalink / raw)
  To: Harsh Prateek Bora, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

On Thu Oct 12, 2023 at 8:49 PM AEST, Harsh Prateek Bora wrote:
> Currently, nested_ptcr is being used by existing nested-hv API to store
> nested guest related info. This needs to be organised to extend
> support for the nested PAPR API, which will need to store additional
> info related to nested guests in the next patches of this series.
>
> Signed-off-by: Michael Neuling <mikey@neuling.org>
> Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>

> ---
>  hw/ppc/spapr_nested.c         | 8 ++++----
>  include/hw/ppc/spapr.h        | 3 ++-
>  include/hw/ppc/spapr_nested.h | 5 +++++
>  3 files changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/hw/ppc/spapr_nested.c b/hw/ppc/spapr_nested.c
> index 123e127b08..db47c1196f 100644
> --- a/hw/ppc/spapr_nested.c
> +++ b/hw/ppc/spapr_nested.c
> @@ -15,8 +15,8 @@ bool spapr_get_pate_nested(SpaprMachineState *spapr, PowerPCCPU *cpu,
>  
>      assert(lpid != 0);
>  
> -    patb = spapr->nested_ptcr & PTCR_PATB;
> -    pats = spapr->nested_ptcr & PTCR_PATS;
> +    patb = spapr->nested.ptcr & PTCR_PATB;
> +    pats = spapr->nested.ptcr & PTCR_PATS;
>  
>      /* Check if partition table is properly aligned */
>      if (patb & MAKE_64BIT_MASK(0, pats + 12)) {
> @@ -54,7 +54,7 @@ static target_ulong h_set_ptbl(PowerPCCPU *cpu,
>          return H_PARAMETER;
>      }
>  
> -    spapr->nested_ptcr = ptcr; /* Save new partition table */
> +    spapr->nested.ptcr = ptcr; /* Save new partition table */
>  
>      return H_SUCCESS;
>  }
> @@ -186,7 +186,7 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
>      struct kvmppc_pt_regs *regs;
>      hwaddr len;
>  
> -    if (spapr->nested_ptcr == 0) {
> +    if (spapr->nested.ptcr == 0) {
>          return H_NOT_AVAILABLE;
>      }
>  
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index e91791a1a9..3e825f2787 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -12,6 +12,7 @@
>  #include "hw/ppc/spapr_xive.h"  /* For SpaprXive */
>  #include "hw/ppc/xics.h"        /* For ICSState */
>  #include "hw/ppc/spapr_tpm_proxy.h"
> +#include "hw/ppc/spapr_nested.h" /* For SpaprMachineStateNested */
>  
>  struct SpaprVioBus;
>  struct SpaprPhbState;
> @@ -213,7 +214,7 @@ struct SpaprMachineState {
>      uint32_t vsmt;       /* Virtual SMT mode (KVM's "core stride") */
>  
>      /* Nested HV support (TCG only) */
> -    uint64_t nested_ptcr;
> +    SpaprMachineStateNested nested;
>  
>      Notifier epow_notifier;
>      QTAILQ_HEAD(, SpaprEventLogEntry) pending_events;
> diff --git a/include/hw/ppc/spapr_nested.h b/include/hw/ppc/spapr_nested.h
> index e3d15d6d0b..0722b999cd 100644
> --- a/include/hw/ppc/spapr_nested.h
> +++ b/include/hw/ppc/spapr_nested.h
> @@ -4,6 +4,10 @@
>  #include "qemu/osdep.h"
>  #include "target/ppc/cpu.h"
>  
> +typedef struct SpaprMachineStateNested {
> +    uint64_t ptcr;
> +} SpaprMachineStateNested;
> +
>  /*
>   * Register state for entering a nested guest with H_ENTER_NESTED.
>   * New member must be added at the end.
> @@ -98,6 +102,7 @@ struct nested_ppc_state {
>  
>  void spapr_register_nested(void);
>  void spapr_exit_nested(PowerPCCPU *cpu, int excp);
> +typedef struct SpaprMachineState SpaprMachineState;
>  bool spapr_get_pate_nested(SpaprMachineState *spapr, PowerPCCPU *cpu,
>                             target_ulong lpid, ppc_v3_pate_t *entry);
>  #endif /* HW_SPAPR_NESTED_H */




* Re: [PATCH v2 04/14] spapr: nested: Introduce cap-nested-papr for Nested PAPR API
  2023-10-12 10:49 ` [PATCH v2 04/14] spapr: nested: Introduce cap-nested-papr for " Harsh Prateek Bora
@ 2023-11-29  4:01   ` Nicholas Piggin
  2023-11-30  6:19     ` Harsh Prateek Bora
  0 siblings, 1 reply; 26+ messages in thread
From: Nicholas Piggin @ 2023-11-29  4:01 UTC (permalink / raw)
  To: Harsh Prateek Bora, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

On Thu Oct 12, 2023 at 8:49 PM AEST, Harsh Prateek Bora wrote:
> Introduce a SPAPR capability cap-nested-papr which provides a nested
> HV facility to the guest. This is similar to cap-nested-hv, but uses
> a different (incompatible) API and so they are mutually exclusive.
> This new API is to enable support for KVM on PowerVM and recently the
> Linux kernel side patches have been accepted upstream as well [1].
> Support for related hcalls is being added in next set of patches.

We do want to be able to support both APIs on a per-guest basis. It
doesn't look like the vmstate bits will be a problem, both could be
enabled if the logic permitted it and that wouldn't cause a
compatibility problem I think?

And it's a bit of a nitpick, but the capability should not be permitted
before the actual APIs are supported IMO. You could split this into
adding .api first, so the implementation can test it, and add the spapr
caps at the end.
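
i.e. the PAPR cap apply could end up looking something like this
(a sketch of the mutual-exclusion part only, mirroring
cap_nested_kvm_hv_apply()):

    static void cap_nested_papr_apply(SpaprMachineState *spapr,
                                      uint8_t val, Error **errp)
    {
        if (!val) {
            return; /* capability disabled, nothing to apply */
        }
        if (!spapr->nested.api) {
            spapr->nested.api = NESTED_API_PAPR;
            spapr_register_nested_papr();
        } else {
            error_setg(errp,
                       "Nested-HV APIs are mutually exclusive/incompatible");
        }
    }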

Thanks,
Nick



* Re: [PATCH v2 05/14] spapr: nested: register nested-hv api hcalls only for cap-nested-hv
  2023-10-12 10:49 ` [PATCH v2 05/14] spapr: nested: register nested-hv api hcalls only for cap-nested-hv Harsh Prateek Bora
@ 2023-11-29  4:03   ` Nicholas Piggin
  0 siblings, 0 replies; 26+ messages in thread
From: Nicholas Piggin @ 2023-11-29  4:03 UTC (permalink / raw)
  To: Harsh Prateek Bora, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

On Thu Oct 12, 2023 at 8:49 PM AEST, Harsh Prateek Bora wrote:
> Since cap-nested-hv and cap-nested-papr are mutually exclusive, it now
> makes sense to register API-specific hcalls only when the respective
> capability is enabled, hence this change.
>
> Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>

I think this basically makes sense anyway since if you don't enable
cap-nested-hv, then no need to register the hcalls. This patch could
be pulled to the front of the series.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>

> ---
>  hw/ppc/spapr_caps.c  | 1 +
>  hw/ppc/spapr_hcall.c | 2 --
>  2 files changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> index 9b53f19ec8..ed3e638334 100644
> --- a/hw/ppc/spapr_caps.c
> +++ b/hw/ppc/spapr_caps.c
> @@ -456,6 +456,7 @@ static void cap_nested_kvm_hv_apply(SpaprMachineState *spapr,
>  
>      if (!spapr->nested.api) {
>          spapr->nested.api = NESTED_API_KVM_HV;
> +        spapr_register_nested();
>      } else {
>          error_setg(errp, "Nested-HV APIs are mutually exclusive/incompatible");
>          error_append_hint(errp, "Please use either cap-nested-hv or "
> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> index 522a2396c7..8ae55087ec 100644
> --- a/hw/ppc/spapr_hcall.c
> +++ b/hw/ppc/spapr_hcall.c
> @@ -1635,8 +1635,6 @@ static void hypercall_register_types(void)
>      spapr_register_hypercall(KVMPPC_H_CAS, h_client_architecture_support);
>  
>      spapr_register_hypercall(KVMPPC_H_UPDATE_DT, h_update_dt);
> -
> -    spapr_register_nested();
>  }
>  
>  type_init(hypercall_register_types)




* Re: [PATCH v2 11/14] spapr: nested: Use correct source for parttbl info for nested PAPR API.
  2023-10-12 10:49 ` [PATCH v2 11/14] spapr: nested: Use correct source for parttbl info for nested PAPR API Harsh Prateek Bora
@ 2023-11-29  4:15   ` Nicholas Piggin
  0 siblings, 0 replies; 26+ messages in thread
From: Nicholas Piggin @ 2023-11-29  4:15 UTC (permalink / raw)
  To: Harsh Prateek Bora, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

On Thu Oct 12, 2023 at 8:49 PM AEST, Harsh Prateek Bora wrote:
> For the nested PAPR API, we use the SpaprMachineStateNestedGuest struct
> to store partition table info, so use the same in
> spapr_get_pate_nested() as well.
>
> Signed-off-by: Michael Neuling <mikey@neuling.org>
> Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
> ---
>  hw/ppc/spapr.c                |  5 +---
>  hw/ppc/spapr_nested.c         | 49 +++++++++++++++++++++++------------
>  include/hw/ppc/spapr_nested.h |  3 +++
>  3 files changed, 36 insertions(+), 21 deletions(-)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 14196fdd11..e1f7b29842 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1356,10 +1356,7 @@ static bool spapr_get_pate(PPCVirtualHypervisor *vhyp, PowerPCCPU *cpu,
>          return true;
>      } else {
>          assert(spapr->nested.api);
> -        if (spapr->nested.api == NESTED_API_KVM_HV) {
> -            return spapr_get_pate_nested(spapr, cpu, lpid, entry);
> -        }
> -        return false;
> +        return spapr_get_pate_nested(spapr, cpu, lpid, entry);

If you don't permit the api to be changed to PAPR at the start, then you
don't need to juggle this back and forth.

>      }
>  }
>  
> diff --git a/hw/ppc/spapr_nested.c b/hw/ppc/spapr_nested.c
> index f41b0c0944..1917d8bb13 100644
> --- a/hw/ppc/spapr_nested.c
> +++ b/hw/ppc/spapr_nested.c
> @@ -19,29 +19,45 @@ void spapr_nested_init(SpaprMachineState *spapr)
>  bool spapr_get_pate_nested(SpaprMachineState *spapr, PowerPCCPU *cpu,
>                             target_ulong lpid, ppc_v3_pate_t *entry)
>  {
> -    uint64_t patb, pats;
> +    if (spapr->nested.api == NESTED_API_KVM_HV) {

Maybe split out the implementations into two functions?

And actually instead of open coding the API test, I'm thinking
maybe you should add a helper function

spapr_cpu_nested_api(cpu) which returns NESTED_API_... and you
can just hard code it to return spapr->nested.api for now.
That should avoid a bunch of churn when we permit guests to
run with different APIs.
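
Something like this (sketch):

    static inline int spapr_cpu_nested_api(PowerPCCPU *cpu)
    {
        SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());

        /* machine-wide for now; later derive from the cpu's guest */
        return spapr->nested.api;
    }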

> +        uint64_t patb, pats;
>  
> -    assert(lpid != 0);
> +        assert(lpid != 0);
>  
> -    patb = spapr->nested.ptcr & PTCR_PATB;
> -    pats = spapr->nested.ptcr & PTCR_PATS;
> +        patb = spapr->nested.ptcr & PTCR_PATB;
> +        pats = spapr->nested.ptcr & PTCR_PATS;
>  
> -    /* Check if partition table is properly aligned */
> -    if (patb & MAKE_64BIT_MASK(0, pats + 12)) {
> -        return false;
> -    }
> +        /* Check if partition table is properly aligned */
> +        if (patb & MAKE_64BIT_MASK(0, pats + 12)) {
> +            return false;
> +        }
>  
> -    /* Calculate number of entries */
> -    pats = 1ull << (pats + 12 - 4);
> -    if (pats <= lpid) {
> -        return false;
> +        /* Calculate number of entries */
> +        pats = 1ull << (pats + 12 - 4);
> +        if (pats <= lpid) {
> +            return false;
> +        }
> +
> +        /* Grab entry */
> +        patb += 16 * lpid;
> +        entry->dw0 = ldq_phys(CPU(cpu)->as, patb);
> +        entry->dw1 = ldq_phys(CPU(cpu)->as, patb + 8);
> +        return true;
>      }
> +#ifdef CONFIG_TCG

Nested spapr is entirely TCG, no? If we're not properly guarding the
nested code with CONFIG_TCG, that should go as an initial prep patch.

Thanks,
Nick

> +    /* Nested PAPR API */
> +    SpaprMachineStateNestedGuest *guest;
> +    assert(lpid != 0);
> +    guest = spapr_get_nested_guest(spapr, lpid);
> +    assert(guest != NULL);
> +
> +    entry->dw0 = guest->parttbl[0];
> +    entry->dw1 = guest->parttbl[1];
>  
> -    /* Grab entry */
> -    patb += 16 * lpid;
> -    entry->dw0 = ldq_phys(CPU(cpu)->as, patb);
> -    entry->dw1 = ldq_phys(CPU(cpu)->as, patb + 8);
>      return true;
> +#else
> +    return false;
> +#endif
>  }
>  
>  #ifdef CONFIG_TCG
> @@ -412,7 +428,6 @@ void spapr_exit_nested(PowerPCCPU *cpu, int excp)
>      address_space_unmap(CPU(cpu)->as, regs, len, len, true);
>  }
>  
> -static
>  SpaprMachineStateNestedGuest *spapr_get_nested_guest(SpaprMachineState *spapr,
>                                                       target_ulong guestid)
>  {
> diff --git a/include/hw/ppc/spapr_nested.h b/include/hw/ppc/spapr_nested.h
> index d27688b1b8..e04202ecb4 100644
> --- a/include/hw/ppc/spapr_nested.h
> +++ b/include/hw/ppc/spapr_nested.h
> @@ -457,4 +457,7 @@ bool spapr_get_pate_nested(SpaprMachineState *spapr, PowerPCCPU *cpu,
>  void spapr_register_nested_papr(void);
>  void spapr_nested_init(SpaprMachineState *spapr);
>  void spapr_nested_gsb_init(void);
> +SpaprMachineStateNestedGuest *spapr_get_nested_guest(SpaprMachineState *spapr,
> +                                                     target_ulong lpid);
> +
>  #endif /* HW_SPAPR_NESTED_H */




* Re: [PATCH v2 13/14] spapr: nested: keep nested-hv exit code restricted to its API.
  2023-10-12 10:49 ` [PATCH v2 13/14] spapr: nested: keep nested-hv exit code restricted to its API Harsh Prateek Bora
@ 2023-11-29  4:16   ` Nicholas Piggin
  0 siblings, 0 replies; 26+ messages in thread
From: Nicholas Piggin @ 2023-11-29  4:16 UTC (permalink / raw)
  To: Harsh Prateek Bora, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

On Thu Oct 12, 2023 at 8:49 PM AEST, Harsh Prateek Bora wrote:
> spapr_exit_nested gets triggered on a nested guest exit and currently
> being used only for nested-hv API. Isolating code flows based on API
> helps extending it to be used with nested PAPR API as well.
>

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>

> Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
> ---
>  hw/ppc/spapr_nested.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/hw/ppc/spapr_nested.c b/hw/ppc/spapr_nested.c
> index 607fb7ead2..e2d0cb5559 100644
> --- a/hw/ppc/spapr_nested.c
> +++ b/hw/ppc/spapr_nested.c
> @@ -325,7 +325,7 @@ static target_ulong h_enter_nested(PowerPCCPU *cpu,
>      return env->gpr[3];
>  }
>  
> -void spapr_exit_nested(PowerPCCPU *cpu, int excp)
> +static void spapr_exit_nested_hv(PowerPCCPU *cpu, int excp)
>  {
>      CPUPPCState *env = &cpu->env;
>      SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
> @@ -337,8 +337,6 @@ void spapr_exit_nested(PowerPCCPU *cpu, int excp)
>      struct kvmppc_pt_regs *regs;
>      hwaddr len;
>  
> -    assert(spapr_cpu->in_nested);
> -
>      nested_save_state(&l2_state, cpu);
>      hsrr0 = env->spr[SPR_HSRR0];
>      hsrr1 = env->spr[SPR_HSRR1];
> @@ -428,6 +426,17 @@ void spapr_exit_nested(PowerPCCPU *cpu, int excp)
>      address_space_unmap(CPU(cpu)->as, regs, len, len, true);
>  }
>  
> +void spapr_exit_nested(PowerPCCPU *cpu, int excp)
> +{
> +    SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> +    SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
> +
> +    assert(spapr_cpu->in_nested);
> +    if (spapr->nested.api == NESTED_API_KVM_HV) {
> +        spapr_exit_nested_hv(cpu, excp);
> +    }
> +}
> +
>  SpaprMachineStateNestedGuest *spapr_get_nested_guest(SpaprMachineState *spapr,
>                                                       target_ulong guestid)
>  {




* Re: [PATCH v2 14/14] spapr: nested: Introduce H_GUEST_RUN_VCPU hcall.
  2023-10-12 10:49 ` [PATCH v2 14/14] spapr: nested: Introduce H_GUEST_RUN_VCPU hcall Harsh Prateek Bora
@ 2023-11-29  4:58   ` Nicholas Piggin
  0 siblings, 0 replies; 26+ messages in thread
From: Nicholas Piggin @ 2023-11-29  4:58 UTC (permalink / raw)
  To: Harsh Prateek Bora, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

On Thu Oct 12, 2023 at 8:49 PM AEST, Harsh Prateek Bora wrote:
> The H_GUEST_RUN_VCPU hcall is used to start execution of a Guest VCPU.
> The Hypervisor will update the state of the Guest VCPU based on the
> input buffer, restore the saved Guest VCPU state, and start its execution.
>
> The Guest VCPU can stop running for numerous reasons including HCALLs,
> hypervisor exceptions, or an outstanding Host Partition Interrupt.
> The reason that the Guest VCPU stopped running is communicated through
> R4 and the output buffer will be filled in with any relevant state.
>
> Signed-off-by: Michael Neuling <mikey@neuling.org>
> Signed-off-by: Kautuk Consul <kconsul@linux.vnet.ibm.com>
> Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
> ---
>  hw/ppc/spapr_nested.c           | 308 ++++++++++++++++++++++++++++++--
>  include/hw/ppc/spapr.h          |   1 +
>  include/hw/ppc/spapr_cpu_core.h |   7 +-
>  3 files changed, 302 insertions(+), 14 deletions(-)
>
> diff --git a/hw/ppc/spapr_nested.c b/hw/ppc/spapr_nested.c
> index e2d0cb5559..d3e7629f63 100644
> --- a/hw/ppc/spapr_nested.c
> +++ b/hw/ppc/spapr_nested.c
> @@ -141,6 +141,15 @@ static void nested_save_state(struct nested_ppc_state *save, PowerPCCPU *cpu)
>      save->tb_offset = env->tb_env->tb_offset;
>  }
>  
> +static void nested_post_state_update(CPUPPCState *env, CPUState *cs)
> +{
> +    hreg_compute_hflags(env);
> +    ppc_maybe_interrupt(env);
> +    tlb_flush(cs);
> +    env->reserve_addr = -1; /* Reset the reservation */
> +

Extra newline. And don't strip out comments please.

> +}
> +
>  static void nested_load_state(PowerPCCPU *cpu, struct nested_ppc_state *load)
>  {
>      CPUState *cs = CPU(cpu);
> @@ -172,19 +181,7 @@ static void nested_load_state(PowerPCCPU *cpu, struct nested_ppc_state *load)
>      env->spr[SPR_PPR] = load->ppr;
>  
>      env->tb_env->tb_offset = load->tb_offset;
> -
> -    /*
> -     * MSR updated, compute hflags and possible interrupts.
> -     */
> -    hreg_compute_hflags(env);
> -    ppc_maybe_interrupt(env);
> -
> -    /*
> -     * Nested HV does not tag TLB entries between L1 and L2, so must
> -     * flush on transition.
> -     */
> -    tlb_flush(cs);
> -    env->reserve_addr = -1; /* Reset the reservation */
> +    nested_post_state_update(env, cs);
>  }
>  
>  /*
> @@ -426,6 +423,9 @@ static void spapr_exit_nested_hv(PowerPCCPU *cpu, int excp)
>      address_space_unmap(CPU(cpu)->as, regs, len, len, true);
>  }
>  
> +static
> +void spapr_exit_nested_papr(SpaprMachineState *spapr, PowerPCCPU *cpu, int excp);

It would be nicer if the implementations went above and the API entry
points below, so you don't need this forward declaration.

> +
>  void spapr_exit_nested(PowerPCCPU *cpu, int excp)
>  {
>      SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> @@ -434,6 +434,10 @@ void spapr_exit_nested(PowerPCCPU *cpu, int excp)
>      assert(spapr_cpu->in_nested);
>      if (spapr->nested.api == NESTED_API_KVM_HV) {
>          spapr_exit_nested_hv(cpu, excp);
> +    } else if (spapr->nested.api == NESTED_API_PAPR) {
> +        spapr_exit_nested_papr(spapr, cpu, excp);
> +    } else {
> +        g_assert_not_reached();

The assert leg should have been introduced in the previous patch.

>      }
>  }
>  
> @@ -1455,6 +1459,283 @@ static target_ulong h_guest_get_state(PowerPCCPU *cpu,
>      return h_guest_getset_state(cpu, spapr, args, false);
>  }
>  
> +static void restore_common_regs(CPUPPCState *dst, CPUPPCState *src)
> +{
> +    memcpy(dst->gpr, src->gpr, sizeof(dst->gpr));
> +    memcpy(dst->crf, src->crf, sizeof(dst->crf));
> +    memcpy(dst->vsr, src->vsr, sizeof(dst->vsr));
> +    dst->nip = src->nip;
> +    dst->msr = src->msr;
> +    dst->lr  = src->lr;
> +    dst->ctr = src->ctr;
> +    dst->cfar = src->cfar;
> +    cpu_write_xer(dst, src->xer);
> +    ppc_store_vscr(dst, ppc_get_vscr(src));
> +    ppc_store_fpscr(dst, src->fpscr);

Still don't like having a CPUPPCState to save into. It does give you
a bunch of the variables you need, but you still need a lot of bespoke
code for FPSCR, XER, NIP, etc., so it's not really buying you
anything.

If there were a general ppc_cpu_state_save/restore() function then
sure, but I don't think we have it.

Now that I think about it, we could probably use the same struct and
mostly the same code for both APIs. They both do the same thing
conceptually; the difference is just that the PAPR API saves a few
more registers.
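
i.e., extend the existing nested_ppc_state rather than carrying a
whole CPUPPCState, roughly (the field list here is illustrative, not
what the final struct would need):

struct nested_ppc_state {
    /* state both APIs save/load across guest entry/exit */
    uint64_t gpr[32];
    uint64_t lr;
    uint64_t ctr;
    uint64_t cfar;
    uint64_t msr;
    uint64_t nip;
    uint64_t tb_offset;

    /* extra state only the PAPR API transfers via the GSB */
    uint64_t dec_expiry_tb;
    uint64_t hdecr_expiry_tb;
};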

> +    memcpy(dst->spr, src->spr, sizeof(dst->spr));

It's not really clear we can do this, since some SPRs have special
implementations. And AFAIKS the current code already has at least
one bug at a glance because of this, and that's DPDES.

Fortunately Linux doesn't really use it in !SMT and we don't support
SMT + nested at the moment. The problem just gets worse by doing a
wholesale memcpy of SPRs.
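
If you do need a pile of SPRs, an explicit list would at least make
the special cases auditable, e.g. (sketch; the SPR subset is
illustrative only):

static void copy_nested_sprs(CPUPPCState *dst, const CPUPPCState *src)
{
    /* each entry here can be audited for special-case handling,
     * DPDES being the known offender */
    static const int sprs[] = { SPR_LPCR, SPR_PCR, SPR_DPDES };
    int i;

    for (i = 0; i < ARRAY_SIZE(sprs); i++) {
        dst->spr[sprs[i]] = src->spr[sprs[i]];
    }
}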

> +}

> +
> +static void exit_nested_restore_vcpu(PowerPCCPU *cpu, int excp,
> +                                     SpaprMachineStateNestedGuestVcpu *vcpu)

The "restore" here and  in nested_papr_restore_l2_state confuse me
I think. This isn't "restoring" the nested vcpu is it? It's saving
it. And nested_papr_restore_l2_state is *loading* the l2 state.

Restore is not a great word, it's a bit ambigous. But in general
this is all to be read from the point of view of the L1 (host), so
"restore" is more suitable for restoring back to host state, but
IMO it should be avoided if possible. "save" can refer to saving
current machine state to somewhere, "load" loading machine state
from somewhere.
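
i.e., names more like this, perhaps (prototypes only, names
illustrative):

/* save the current (L2) machine state into the vcpu area */
static void nested_papr_save_l2_state(PowerPCCPU *cpu, int excp,
                                      SpaprMachineStateNestedGuestVcpu *vcpu);

/* load L2 state from the vcpu area into the machine */
static void nested_papr_load_l2_state(PowerPCCPU *cpu,
                                      SpaprMachineStateNestedGuestVcpu *vcpu);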

> +{
> +    CPUPPCState *env = &cpu->env;
> +    target_ulong now, hdar, hdsisr, asdr;
> +
> +    assert(sizeof(env->gpr) == sizeof(vcpu->env.gpr)); /* sanity check */
> +
> +    now = cpu_ppc_load_tbl(env); /* L2 timebase */
> +    now -= vcpu->tb_offset; /* L1 timebase */
> +    vcpu->dec_expiry_tb = now - cpu_ppc_load_decr(env);
> +    /* backup hdar, hdsisr, asdr if reqd later below */
> +    hdar   = vcpu->env.spr[SPR_HDAR];
> +    hdsisr = vcpu->env.spr[SPR_HDSISR];
> +    asdr   = vcpu->env.spr[SPR_ASDR];
> +
> +    restore_common_regs(&vcpu->env, env);
> +
> +    if (excp == POWERPC_EXCP_MCHECK ||
> +        excp == POWERPC_EXCP_RESET ||
> +        excp == POWERPC_EXCP_SYSCALL) {
> +        vcpu->env.nip = env->spr[SPR_SRR0];
> +        vcpu->env.msr = env->spr[SPR_SRR1] & env->msr_mask;
> +    } else {
> +        vcpu->env.nip = env->spr[SPR_HSRR0];
> +        vcpu->env.msr = env->spr[SPR_HSRR1] & env->msr_mask;
> +    }
> +
> +    /* hdar, hdsisr, asdr should be retained unless certain exceptions */
> +    if ((excp != POWERPC_EXCP_HDSI) && (excp != POWERPC_EXCP_HISI)) {
> +        vcpu->env.spr[SPR_ASDR] = asdr;
> +    } else if (excp != POWERPC_EXCP_HDSI) {
> +        vcpu->env.spr[SPR_HDAR]   = hdar;
> +        vcpu->env.spr[SPR_HDSISR] = hdsisr;
> +    }
> +}
> +
> +static int get_exit_ids(uint64_t srr0, uint16_t ids[16])
> +{
> +    int nr;
> +
> +    switch (srr0) {
> +    case 0xc00:
> +        nr = 10;
> +        ids[0] = GSB_VCPU_GPR3;
> +        ids[1] = GSB_VCPU_GPR4;
> +        ids[2] = GSB_VCPU_GPR5;
> +        ids[3] = GSB_VCPU_GPR6;
> +        ids[4] = GSB_VCPU_GPR7;
> +        ids[5] = GSB_VCPU_GPR8;
> +        ids[6] = GSB_VCPU_GPR9;
> +        ids[7] = GSB_VCPU_GPR10;
> +        ids[8] = GSB_VCPU_GPR11;
> +        ids[9] = GSB_VCPU_GPR12;
> +        break;
> +    case 0xe00:
> +        nr = 5;
> +        ids[0] = GSB_VCPU_SPR_HDAR;
> +        ids[1] = GSB_VCPU_SPR_HDSISR;
> +        ids[2] = GSB_VCPU_SPR_ASDR;
> +        ids[3] = GSB_VCPU_SPR_NIA;
> +        ids[4] = GSB_VCPU_SPR_MSR;
> +        break;
> +    case 0xe20:
> +        nr = 4;
> +        ids[0] = GSB_VCPU_SPR_HDAR;
> +        ids[1] = GSB_VCPU_SPR_ASDR;
> +        ids[2] = GSB_VCPU_SPR_NIA;
> +        ids[3] = GSB_VCPU_SPR_MSR;
> +        break;
> +    case 0xe40:
> +        nr = 3;
> +        ids[0] = GSB_VCPU_SPR_HEIR;
> +        ids[1] = GSB_VCPU_SPR_NIA;
> +        ids[2] = GSB_VCPU_SPR_MSR;
> +        break;
> +    case 0xf80:
> +        nr = 3;
> +        ids[0] = GSB_VCPU_SPR_HFSCR;
> +        ids[1] = GSB_VCPU_SPR_NIA;
> +        ids[2] = GSB_VCPU_SPR_MSR;
> +        break;
> +    default:
> +        nr = 0;
> +        break;
> +    }
> +
> +    return nr;
> +}
> +
> +static void exit_process_output_buffer(PowerPCCPU *cpu,
> +                                      SpaprMachineStateNestedGuest *guest,
> +                                      target_ulong vcpuid,
> +                                      target_ulong *r3)
> +{
> +    SpaprMachineStateNestedGuestVcpu *vcpu = &guest->vcpu[vcpuid];
> +    struct guest_state_request gsr;
> +    struct guest_state_buffer *gsb;
> +    struct guest_state_element *element;
> +    struct guest_state_element_type *type;
> +    int exit_id_count = 0;
> +    uint16_t exit_cause_ids[16];
> +    hwaddr len;
> +
> +    len = vcpu->runbufout.size;
> +    gsb = address_space_map(CPU(cpu)->as, vcpu->runbufout.addr, &len, true,
> +                            MEMTXATTRS_UNSPECIFIED);
> +    if (!gsb || len != vcpu->runbufout.size) {
> +        address_space_unmap(CPU(cpu)->as, gsb, len, true, len);
> +        *r3 = H_P2;
> +        return;
> +    }
> +
> +    exit_id_count = get_exit_ids(*r3, exit_cause_ids);
> +
> +    /* Create a buffer of elements to send back */
> +    gsb->num_elements = cpu_to_be32(exit_id_count);
> +    element = gsb->elements;
> +    for (int i = 0; i < exit_id_count; i++) {
> +        type = guest_state_element_type_find(exit_cause_ids[i]);
> +        assert(type);
> +        element->id = cpu_to_be16(exit_cause_ids[i]);
> +        element->size = cpu_to_be16(type->size);
> +        element = guest_state_element_next(element, NULL, NULL);
> +    }
> +    gsr.gsb = gsb;
> +    gsr.len = VCPU_OUT_BUF_MIN_SZ;
> +    gsr.flags = 0; /* get + never guest wide */
> +    getset_state(guest, vcpuid, &gsr);
> +
> +    address_space_unmap(CPU(cpu)->as, gsb, len, true, len);
> +    return;
> +}
> +
> +static
> +void spapr_exit_nested_papr(SpaprMachineState *spapr, PowerPCCPU *cpu, int excp)
> +{
> +    CPUState *cs = CPU(cpu);
> +    CPUPPCState *env = &cpu->env;
> +    SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
> +    target_ulong r3_return = env->excp_vectors[excp]; /* hcall return value */
> +    target_ulong lpid = 0, vcpuid = 0;
> +    struct SpaprMachineStateNestedGuestVcpu *vcpu = NULL;
> +    struct SpaprMachineStateNestedGuest *guest = NULL;
> +
> +    lpid = spapr_cpu->nested_papr_host->gpr[5];
> +    vcpuid = spapr_cpu->nested_papr_host->gpr[6];
> +    guest = spapr_get_nested_guest(spapr, lpid);
> +    assert(guest);
> +    spapr_nested_vcpu_check(guest, vcpuid, false);
> +    vcpu = &guest->vcpu[vcpuid];
> +
> +    exit_nested_restore_vcpu(cpu, excp, vcpu);
> +    /* do the output buffer for run_vcpu*/
> +    exit_process_output_buffer(cpu, guest, vcpuid, &r3_return);
> +
> +    assert(env->spr[SPR_LPIDR] != 0);
> +    restore_common_regs(env, spapr_cpu->nested_papr_host);

I need to take a closer look, but AFAIKS you aren't loading the L1
decr when exiting to the host...
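
Presumably it wants the same treatment as the L2 decr, something like
(sketch; assumes the saved host state grows a dec_expiry_tb field like
the vcpu one):

/* on entry, with 'now' in the L1 timebase */
host->dec_expiry_tb = now - cpu_ppc_load_decr(env);

/* on exit, after switching back to the L1 timebase */
cpu_ppc_store_decr(env, host->dec_expiry_tb - now);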

> +    env->tb_env->tb_offset -= vcpu->tb_offset;

This is nasty, and I can say that because I did it in the
original code. We should call a timebase helper to do this
adjustment for us.
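
Something like (sketch; helper name made up):

static void nested_tb_offset_apply(CPUPPCState *env, int64_t offset)
{
    /* keep all timebase offset adjustments funnelled through here */
    env->tb_env->tb_offset += offset;
}

Entry would then do nested_tb_offset_apply(env, vcpu->tb_offset) and
exit nested_tb_offset_apply(env, -vcpu->tb_offset), so the pairing is
obvious.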

> +    env->gpr[3] = H_SUCCESS;
> +    env->gpr[4] = r3_return;
> +    nested_post_state_update(env, cs);
> +    cpu_ppc_hdecr_exit(env);
> +
> +    spapr_cpu->in_nested = false;
> +    g_free(spapr_cpu->nested_papr_host);
> +    spapr_cpu->nested_papr_host = NULL;
> +}
> +
> +static void nested_papr_restore_l2_state(PowerPCCPU *cpu,
> +                                         CPUPPCState *env,
> +                                         SpaprMachineStateNestedGuestVcpu *vcpu,
> +                                         target_ulong now)
> +{
> +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
> +    target_ulong lpcr, lpcr_mask, hdec;
> +    lpcr_mask = LPCR_DPFD | LPCR_ILE | LPCR_AIL | LPCR_LD | LPCR_MER;
> +
> +    assert(vcpu);
> +    assert(sizeof(env->gpr) == sizeof(vcpu->env.gpr));
> +    restore_common_regs(env, &vcpu->env);
> +    lpcr = (env->spr[SPR_LPCR] & ~lpcr_mask) |
> +           (vcpu->env.spr[SPR_LPCR] & lpcr_mask);
> +    lpcr |= LPCR_HR | LPCR_UPRT | LPCR_GTSE | LPCR_HVICE | LPCR_HDICE;
> +    lpcr &= ~LPCR_LPES0;
> +    env->spr[SPR_LPCR] = lpcr & pcc->lpcr_mask;
> +
> +    hdec = vcpu->hdecr_expiry_tb - now;
> +    cpu_ppc_store_decr(env, vcpu->dec_expiry_tb - now);
> +    cpu_ppc_hdecr_init(env);
> +    cpu_ppc_store_hdecr(env, hdec);
> +
> +    env->tb_env->tb_offset += vcpu->tb_offset;
> +}
> +
> +static void nested_papr_run_vcpu(PowerPCCPU *cpu,
> +                                 uint64_t lpid,
> +                                 SpaprMachineStateNestedGuestVcpu *vcpu)
> +{
> +    SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> +    CPUState *cs = CPU(cpu);
> +    CPUPPCState *env = &cpu->env;
> +    SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
> +    target_ulong now = cpu_ppc_load_tbl(env);
> +
> +    assert(env->spr[SPR_LPIDR] == 0);
> +    assert(spapr->nested.api); /* ensure API version is initialized */
> +    spapr_cpu->nested_papr_host = g_try_new(CPUPPCState, 1);
> +    assert(spapr_cpu->nested_papr_host);
> +    memcpy(spapr_cpu->nested_papr_host, env, sizeof(CPUPPCState));
> +
> +    nested_papr_restore_l2_state(cpu, env, vcpu, now);
> +    env->spr[SPR_LPIDR] = lpid; /* post restore l2 state */
> +
> +    spapr_cpu->in_nested = true;
> +
> +    nested_post_state_update(env, cs);
> +}
> +
> +static target_ulong h_guest_run_vcpu(PowerPCCPU *cpu,
> +                                     SpaprMachineState *spapr,
> +                                     target_ulong opcode,
> +                                     target_ulong *args)
> +{
> +    CPUPPCState *env = &cpu->env;
> +    target_ulong flags = args[0];
> +    target_ulong lpid = args[1];
> +    target_ulong vcpuid = args[2];
> +    struct SpaprMachineStateNestedGuestVcpu *vcpu;
> +    struct guest_state_request gsr;
> +    SpaprMachineStateNestedGuest *guest;
> +
> +    if (flags) /* don't handle any flags for now */
> +        return H_PARAMETER;
> +
> +    guest = spapr_get_nested_guest(spapr, lpid);
> +    if (!guest) {
> +        return H_P2;
> +    }
> +    if (!spapr_nested_vcpu_check(guest, vcpuid, true)) {
> +        return H_P3;
> +    }
> +
> +    if (guest->parttbl[0] == 0) {
> +        /* At least need a partition scoped radix tree */
> +        return H_NOT_AVAILABLE;
> +    }
> +
> +    vcpu = &guest->vcpu[vcpuid];
> +
> +    /* Read run_vcpu input buffer to update state */
> +    gsr.buf = vcpu->runbufin.addr;
> +    gsr.len = vcpu->runbufin.size;
> +    gsr.flags = GUEST_STATE_REQUEST_SET; /* Thread wide + writing */
> +    if (!map_and_getset_state(cpu, guest, vcpuid, &gsr)) {
> +        nested_papr_run_vcpu(cpu, lpid, vcpu);
> +    }

If there is an error from map_and_getset_state, does it set gpr[3] to
an error code? IMO it may be nicer if map_and_getset_state returned
the error itself, so this caller can set it in r3.
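
i.e., something like (sketch; assumes map_and_getset_state is changed
to return an H_* code):

    target_ulong rc;

    rc = map_and_getset_state(cpu, guest, vcpuid, &gsr);
    if (rc != H_SUCCESS) {
        return rc; /* the hcall return value ends up in r3 */
    }
    nested_papr_run_vcpu(cpu, lpid, vcpu);

    return env->gpr[3];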

Thanks,
Nick

> +
> +    return env->gpr[3];
> +}
> +
>  void spapr_register_nested(void)
>  {
>      spapr_register_hypercall(KVMPPC_H_SET_PARTITION_TABLE, h_set_ptbl);
> @@ -1473,6 +1754,7 @@ void spapr_register_nested_papr(void)
>      spapr_register_hypercall(H_GUEST_CREATE_VCPU     , h_guest_create_vcpu);
>      spapr_register_hypercall(H_GUEST_SET_STATE       , h_guest_set_state);
>      spapr_register_hypercall(H_GUEST_GET_STATE       , h_guest_get_state);
> +    spapr_register_hypercall(H_GUEST_RUN_VCPU        , h_guest_run_vcpu);
>  }
>  #else
>  void spapr_exit_nested(PowerPCCPU *cpu, int excp)
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index b9a67895bb..e278ddc7cf 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -594,6 +594,7 @@ struct SpaprMachineState {
>  #define H_GUEST_CREATE_VCPU      0x474
>  #define H_GUEST_GET_STATE        0x478
>  #define H_GUEST_SET_STATE        0x47C
> +#define H_GUEST_RUN_VCPU         0x480
>  #define H_GUEST_DELETE           0x488
>  
>  #define MAX_HCALL_OPCODE         H_GUEST_DELETE
> diff --git a/include/hw/ppc/spapr_cpu_core.h b/include/hw/ppc/spapr_cpu_core.h
> index 9c8c59f173..a9749a2df1 100644
> --- a/include/hw/ppc/spapr_cpu_core.h
> +++ b/include/hw/ppc/spapr_cpu_core.h
> @@ -53,7 +53,12 @@ typedef struct SpaprCpuState {
>  
>      /* Fields for nested-HV support */
>      bool in_nested; /* true while the L2 is executing */
> -    struct nested_ppc_state *nested_hv_host; /* holds the L1 state while L2 executes */
> +    union {
> +        /* holds the L1 state while L2 executes */
> +        struct nested_ppc_state *nested_hv_host;
> +        CPUPPCState             *nested_papr_host;
> +    };
> +
>  } SpaprCpuState;
>  
>  static inline SpaprCpuState *spapr_cpu_state(PowerPCCPU *cpu)




* Re: [PATCH v2 04/14] spapr: nested: Introduce cap-nested-papr for Nested PAPR API
  2023-11-29  4:01   ` Nicholas Piggin
@ 2023-11-30  6:19     ` Harsh Prateek Bora
  2023-11-30 11:11       ` Nicholas Piggin
  0 siblings, 1 reply; 26+ messages in thread
From: Harsh Prateek Bora @ 2023-11-30  6:19 UTC (permalink / raw)
  To: Nicholas Piggin, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413



On 11/29/23 09:31, Nicholas Piggin wrote:
> On Thu Oct 12, 2023 at 8:49 PM AEST, Harsh Prateek Bora wrote:
>> Introduce a SPAPR capability cap-nested-papr which provides a nested
>> HV facility to the guest. This is similar to cap-nested-hv, but uses
>> a different (incompatible) API and so they are mutually exclusive.
>> This new API is to enable support for KVM on PowerVM and recently the
>> Linux kernel side patches have been accepted upstream as well [1].
>> Support for related hcalls is being added in next set of patches.
> 
> We do want to be able to support both APIs on a per-guest basis. It
> doesn't look like the vmstate bits will be a problem, both could be
> enabled if the logic permitted it and that wouldn't cause a
> compatibility problem I think?
> 

I am not sure if it makes sense to have both APIs working in parallel
for a nested guest. The former uses h_enter_nested and expects L1 to
store most of the registers, and has no concept like the GSB, where
communication between L1 and L0 takes place in a standard format that
is also used at nested guest exit. Here, we have separate hcalls for
guest/vcpu creation and then a run_vcpu for a specific vcpu. So we
can't really use both APIs interchangeably while running a nested
guest. BTW, the L1 kernel uses only one of the APIs at a time,
preferring this one if supported.

> And it's a bit of a nitpick, but the capability should not be permitted
> before the actual APIs are supported IMO. You could split this into
> adding .api first, so the implementation can test it, and add the spapr
> caps at the end.
> 

Agree, I shall update as suggested.

regards,
Harsh

> Thanks,
> Nick



* Re: [PATCH v2 04/14] spapr: nested: Introduce cap-nested-papr for Nested PAPR API
  2023-11-30  6:19     ` Harsh Prateek Bora
@ 2023-11-30 11:11       ` Nicholas Piggin
  2023-12-01  5:34         ` Harsh Prateek Bora
  0 siblings, 1 reply; 26+ messages in thread
From: Nicholas Piggin @ 2023-11-30 11:11 UTC (permalink / raw)
  To: Harsh Prateek Bora, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413

On Thu Nov 30, 2023 at 4:19 PM AEST, Harsh Prateek Bora wrote:
>
>
> On 11/29/23 09:31, Nicholas Piggin wrote:
> > On Thu Oct 12, 2023 at 8:49 PM AEST, Harsh Prateek Bora wrote:
> >> Introduce a SPAPR capability cap-nested-papr which provides a nested
> >> HV facility to the guest. This is similar to cap-nested-hv, but uses
> >> a different (incompatible) API and so they are mutually exclusive.
> >> This new API is to enable support for KVM on PowerVM and recently the
> >> Linux kernel side patches have been accepted upstream as well [1].
> >> Support for related hcalls is being added in next set of patches.
> > 
> > We do want to be able to support both APIs on a per-guest basis. It
> > doesn't look like the vmstate bits will be a problem, both could be
> > enabled if the logic permitted it and that wouldn't cause a
> > compatibility problem I think?
> > 
>
> I am not sure if it makes sense to have both APIs working in parallel 
> for a nested guest.

Not for the nested guest, but for the nested KVM host (i.e., the direct
pseries guest running QEMU). QEMU doesn't know ahead of time which API
might be used by the OS.

> The former uses h_enter_nested and expects L1 to store most of the
> registers, and has no concept like the GSB, where communication
> between L1 and L0 takes place in a standard format that is also used
> at nested guest exit. Here, we have separate hcalls for guest/vcpu
> creation and then a run_vcpu for a specific vcpu. So we can't really
> use both APIs interchangeably while running a nested guest. BTW, the
> L1 kernel uses only one of the APIs at a time, preferring this one if
> supported.

Yeah not on the same guest. And it's less about running two different
APIs on different guests with the same L1 simultaneously (although we
could probably change KVM to support that fairly easily, and we might
want to for testing purposes), but more about compatibility. What if
we boot or exec into an old kernel that doesn't support the new API?
>
> > And it's a bit of a nitpick, but the capability should not be permitted
> > before the actual APIs are supported IMO. You could split this into
> > adding .api first, so the implementation can test it, and add the spapr
> > caps at the end.
> > 
>
> Agree, I shall update as suggested.

Thanks,
Nick



* Re: [PATCH v2 04/14] spapr: nested: Introduce cap-nested-papr for Nested PAPR API
  2023-11-30 11:11       ` Nicholas Piggin
@ 2023-12-01  5:34         ` Harsh Prateek Bora
  0 siblings, 0 replies; 26+ messages in thread
From: Harsh Prateek Bora @ 2023-12-01  5:34 UTC (permalink / raw)
  To: Nicholas Piggin, qemu-ppc
  Cc: clegoate, qemu-devel, mikey, vaibhav, jniethe5, sbhat, kconsul,
	danielhb413



On 11/30/23 16:41, Nicholas Piggin wrote:
> On Thu Nov 30, 2023 at 4:19 PM AEST, Harsh Prateek Bora wrote:
>>
>>
>> On 11/29/23 09:31, Nicholas Piggin wrote:
>>> On Thu Oct 12, 2023 at 8:49 PM AEST, Harsh Prateek Bora wrote:
>>>> Introduce a SPAPR capability cap-nested-papr which provides a nested
>>>> HV facility to the guest. This is similar to cap-nested-hv, but uses
>>>> a different (incompatible) API and so they are mutually exclusive.
>>>> This new API is to enable support for KVM on PowerVM and recently the
>>>> Linux kernel side patches have been accepted upstream as well [1].
>>>> Support for related hcalls is being added in next set of patches.
>>>
>>> We do want to be able to support both APIs on a per-guest basis. It
>>> doesn't look like the vmstate bits will be a problem, both could be
>>> enabled if the logic permitted it and that wouldn't cause a
>>> compatibility problem I think?
>>>
>>
>> I am not sure if it makes sense to have both APIs working in parallel
>> for a nested guest.
> 
> Not for the nested guest, but for the nested KVM host (i.e., the direct
> pseries guest running QEMU). QEMU doesn't know ahead of time which API
> might be used by the OS.
> 
>> The former uses h_enter_nested and expects L1 to store most of the
>> registers, and has no concept like the GSB, where communication
>> between L1 and L0 takes place in a standard format that is also used
>> at nested guest exit. Here, we have separate hcalls for guest/vcpu
>> creation and then a run_vcpu for a specific vcpu. So we can't really
>> use both APIs interchangeably while running a nested guest. BTW, the
>> L1 kernel uses only one of the APIs at a time, preferring this one if
>> supported.
> 
> Yeah not on the same guest. And it's less about running two different
> APIs on different guests with the same L1 simultaneously (although we
> could probably change KVM to support that fairly easily, and we might
> want to for testing purposes), but more about compatibility. What if
> we boot or exec into an old kernel that doesn't support the new API?

Hmm, ok, that's a possible use case, will drop the mutual exclusion in v3.

regards,
Harsh

>>
>>> And it's a bit of a nitpick, but the capability should not be permitted
>>> before the actual APIs are supported IMO. You could split this into
>>> adding .api first, so the implementation can test it, and add the spapr
>>> caps at the end.
>>>
>>
>> Agree, I shall update as suggested.
> 
> Thanks,
> Nick



Thread overview: 26+ messages
2023-10-12 10:49 [PATCH v2 00/14] Nested PAPR API (KVM on PowerVM) Harsh Prateek Bora
2023-10-12 10:49 ` [PATCH v2 01/14] spapr: nested: move nested part of spapr_get_pate into spapr_nested.c Harsh Prateek Bora
2023-11-29  3:47   ` Nicholas Piggin
2023-10-12 10:49 ` [PATCH v2 02/14] spapr: nested: Introduce SpaprMachineStateNested to store related info Harsh Prateek Bora
2023-11-29  3:48   ` Nicholas Piggin
2023-10-12 10:49 ` [PATCH v2 03/14] spapr: nested: Document Nested PAPR API Harsh Prateek Bora
2023-10-12 10:49 ` [PATCH v2 04/14] spapr: nested: Introduce cap-nested-papr for " Harsh Prateek Bora
2023-11-29  4:01   ` Nicholas Piggin
2023-11-30  6:19     ` Harsh Prateek Bora
2023-11-30 11:11       ` Nicholas Piggin
2023-12-01  5:34         ` Harsh Prateek Bora
2023-10-12 10:49 ` [PATCH v2 05/14] spapr: nested: register nested-hv api hcalls only for cap-nested-hv Harsh Prateek Bora
2023-11-29  4:03   ` Nicholas Piggin
2023-10-12 10:49 ` [PATCH v2 06/14] spapr: nested: Introduce H_GUEST_[GET|SET]_CAPABILITIES hcalls Harsh Prateek Bora
2023-10-12 10:49 ` [PATCH v2 07/14] spapr: nested: Introduce H_GUEST_[CREATE|DELETE] hcalls Harsh Prateek Bora
2023-10-12 10:49 ` [PATCH v2 08/14] spapr: nested: Introduce H_GUEST_CREATE_VPCU hcall Harsh Prateek Bora
2023-10-12 10:49 ` [PATCH v2 09/14] spapr: nested: Initialize the GSB elements lookup table Harsh Prateek Bora
2023-10-12 10:49 ` [PATCH v2 10/14] spapr: nested: Introduce H_GUEST_[GET|SET]_STATE hcalls Harsh Prateek Bora
2023-10-12 10:49 ` [PATCH v2 11/14] spapr: nested: Use correct source for parttbl info for nested PAPR API Harsh Prateek Bora
2023-11-29  4:15   ` Nicholas Piggin
2023-10-12 10:49 ` [PATCH v2 12/14] spapr: nested: rename nested_host_state to nested_hv_host Harsh Prateek Bora
2023-10-12 10:49 ` [PATCH v2 13/14] spapr: nested: keep nested-hv exit code restricted to its API Harsh Prateek Bora
2023-11-29  4:16   ` Nicholas Piggin
2023-10-12 10:49 ` [PATCH v2 14/14] spapr: nested: Introduce H_GUEST_RUN_VCPU hcall Harsh Prateek Bora
2023-11-29  4:58   ` Nicholas Piggin
2023-11-13 13:15 ` [PATCH v2 00/14] Nested PAPR API (KVM on PowerVM) Nicholas Piggin
