* [RFC PATCH 01/16] x86/msr: Introduce SYSCFG_MEM_ENCRYPT MSR.
2025-05-16 9:31 [RFC PATCH 00/16] Confidential computing and AMD SEV support Teddy Astie
@ 2025-05-16 9:31 ` Teddy Astie
2025-07-31 15:45 ` Jan Beulich
2025-05-16 9:31 ` [RFC PATCH 04/16] x86/public: Expose physaddr_abi through Xen HVM CPUID leaf Teddy Astie
` (15 subsequent siblings)
16 siblings, 1 reply; 22+ messages in thread
From: Teddy Astie @ 2025-05-16 9:31 UTC (permalink / raw)
To: xen-devel
Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné,
Andrei Semenov
From: Andrei Semenov <andrei.semenov@vates.tech>
SYSCFG_MEM_ENCRYPT is the AMD SME MSR used to enable SME and AMD SEV.
Signed-off-by: Andrei Semenov <andrei.semenov@vates.tech>
---
xen/arch/x86/include/asm/msr-index.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/xen/arch/x86/include/asm/msr-index.h b/xen/arch/x86/include/asm/msr-index.h
index 22d9e76e55..7620c4cd2e 100644
--- a/xen/arch/x86/include/asm/msr-index.h
+++ b/xen/arch/x86/include/asm/msr-index.h
@@ -221,6 +221,7 @@
#define SYSCFG_MTRR_VAR_DRAM_EN (_AC(1, ULL) << 20)
#define SYSCFG_MTRR_TOM2_EN (_AC(1, ULL) << 21)
#define SYSCFG_TOM2_FORCE_WB (_AC(1, ULL) << 22)
+#define SYSCFG_MEM_ENCRYPT (_AC(1, ULL) << 23)
#define MSR_K8_IORR_BASE0 _AC(0xc0010016, U)
#define MSR_K8_IORR_MASK0 _AC(0xc0010017, U)
--
2.49.0
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
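As a rough illustration of how a single-bit mask like the one added above is typically consumed, here is a minimal standalone sketch. The two helper names are hypothetical and not part of the patch, and the raw SYSCFG value is assumed to have been read by the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same bit position as the define added by the patch, restated so the
 * sketch compiles on its own. */
#define SYSCFG_MEM_ENCRYPT (1ULL << 23)

/* Hypothetical helper: is the memory encryption bit set in a raw SYSCFG value? */
static bool syscfg_mem_encrypt_enabled(uint64_t syscfg)
{
    return (syscfg & SYSCFG_MEM_ENCRYPT) != 0;
}

/* Hypothetical helper: return the value with the bit set, other bits untouched. */
static uint64_t syscfg_set_mem_encrypt(uint64_t syscfg)
{
    return syscfg | SYSCFG_MEM_ENCRYPT;
}
```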
* Re: [RFC PATCH 01/16] x86/msr: Introduce SYSCFG_MEM_ENCRYPT MSR.
2025-05-16 9:31 ` [RFC PATCH 01/16] x86/msr: Introduce SYSCFG_MEM_ENCRYPT MSR Teddy Astie
@ 2025-07-31 15:45 ` Jan Beulich
0 siblings, 0 replies; 22+ messages in thread
From: Jan Beulich @ 2025-07-31 15:45 UTC (permalink / raw)
To: Teddy Astie
Cc: Andrew Cooper, Roger Pau Monné, Andrei Semenov, xen-devel
On 16.05.2025 11:31, Teddy Astie wrote:
> From: Andrei Semenov <andrei.semenov@vates.tech>
>
> SYSCFG_MEM_ENCRYPT is the AMD SME MSR used to enable SME and AMD SEV.
>
> Signed-off-by: Andrei Semenov <andrei.semenov@vates.tech>
Title and description talk of an entire MSR, yet then ...
> --- a/xen/arch/x86/include/asm/msr-index.h
> +++ b/xen/arch/x86/include/asm/msr-index.h
> @@ -221,6 +221,7 @@
> #define SYSCFG_MTRR_VAR_DRAM_EN (_AC(1, ULL) << 20)
> #define SYSCFG_MTRR_TOM2_EN (_AC(1, ULL) << 21)
> #define SYSCFG_TOM2_FORCE_WB (_AC(1, ULL) << 22)
> +#define SYSCFG_MEM_ENCRYPT (_AC(1, ULL) << 23)
... it's only a single bit. Such an addition is in principle okay to go
in at about any time, but content and description need to match.
Jan
* [RFC PATCH 04/16] x86/public: Expose physaddr_abi through Xen HVM CPUID leaf
2025-05-16 9:31 [RFC PATCH 00/16] Confidential computing and AMD SEV support Teddy Astie
2025-05-16 9:31 ` [RFC PATCH 01/16] x86/msr: Introduce SYSCFG_MEM_ENCRYPT MSR Teddy Astie
@ 2025-05-16 9:31 ` Teddy Astie
2025-05-16 9:31 ` [RFC PATCH 02/16] x86/svm: Move svm_domain structure to svm.h Teddy Astie
` (14 subsequent siblings)
16 siblings, 0 replies; 22+ messages in thread
From: Teddy Astie @ 2025-05-16 9:31 UTC (permalink / raw)
To: xen-devel; +Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
xen/arch/x86/cpuid.c | 2 ++
xen/include/public/arch-x86/cpuid.h | 2 ++
2 files changed, 4 insertions(+)
diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 8dc68945f7..e2d94619c2 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -153,6 +153,8 @@ static void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
*/
res->a |= XEN_HVM_CPUID_UPCALL_VECTOR;
+ /* Indicate that the guest can use the physical address hypercall ABI. */
+ res->a |= XEN_HVM_CPUID_PHYS_ADDR_ABI;
break;
case 5: /* PV-specific parameters */
diff --git a/xen/include/public/arch-x86/cpuid.h b/xen/include/public/arch-x86/cpuid.h
index 3bb0dd249f..5405bf6fbd 100644
--- a/xen/include/public/arch-x86/cpuid.h
+++ b/xen/include/public/arch-x86/cpuid.h
@@ -106,6 +106,8 @@
* bound to event channels.
*/
#define XEN_HVM_CPUID_UPCALL_VECTOR (1u << 6)
+/* Hypercalls can use physical addresses instead of linear ones. */
+#define XEN_HVM_CPUID_PHYS_ADDR_ABI (1u << 7)
/*
* Leaf 6 (0x40000x05)
--
2.49.0
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
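A guest would presumably test the new flag before relying on the ABI. The sketch below assumes the caller has already read EAX from the Xen HVM feature leaf (taken here to be the leaf at offset 4 from the Xen CPUID base); the helper name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as they appear in the patch, restated for a standalone build. */
#define XEN_HVM_CPUID_UPCALL_VECTOR (1u << 6)
#define XEN_HVM_CPUID_PHYS_ADDR_ABI (1u << 7)

/*
 * Hypothetical helper. leaf_eax is the EAX value the guest obtained from
 * the Xen HVM feature leaf; reading CPUID itself is left to the caller.
 */
static bool xen_hvm_has_phys_addr_abi(uint32_t leaf_eax)
{
    return (leaf_eax & XEN_HVM_CPUID_PHYS_ADDR_ABI) != 0;
}
```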
* [RFC PATCH 02/16] x86/svm: Move svm_domain structure to svm.h
2025-05-16 9:31 [RFC PATCH 00/16] Confidential computing and AMD SEV support Teddy Astie
2025-05-16 9:31 ` [RFC PATCH 01/16] x86/msr: Introduce SYSCFG_MEM_ENCRYPT MSR Teddy Astie
2025-05-16 9:31 ` [RFC PATCH 04/16] x86/public: Expose physaddr_abi through Xen HVM CPUID leaf Teddy Astie
@ 2025-05-16 9:31 ` Teddy Astie
2025-07-31 15:48 ` Jan Beulich
2025-05-16 9:31 ` [RFC PATCH 03/16] x86/hvm: Add support for physical address ABI Teddy Astie
` (13 subsequent siblings)
16 siblings, 1 reply; 22+ messages in thread
From: Teddy Astie @ 2025-05-16 9:31 UTC (permalink / raw)
To: xen-devel; +Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné
struct svm_domain lives in vmcb.h, which is meant for VMCB-specific
operations and values. Move it to svm.h, where it belongs.
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
xen/arch/x86/include/asm/hvm/domain.h | 1 +
xen/arch/x86/include/asm/hvm/svm/svm.h | 11 +++++++++++
xen/arch/x86/include/asm/hvm/svm/vmcb.h | 11 -----------
3 files changed, 12 insertions(+), 11 deletions(-)
diff --git a/xen/arch/x86/include/asm/hvm/domain.h b/xen/arch/x86/include/asm/hvm/domain.h
index 333501d5f2..2608bcfad2 100644
--- a/xen/arch/x86/include/asm/hvm/domain.h
+++ b/xen/arch/x86/include/asm/hvm/domain.h
@@ -16,6 +16,7 @@
#include <asm/hvm/io.h>
#include <asm/hvm/vmx/vmcs.h>
#include <asm/hvm/svm/vmcb.h>
+#include <asm/hvm/svm/svm.h>
#ifdef CONFIG_MEM_SHARING
struct mem_sharing_domain
diff --git a/xen/arch/x86/include/asm/hvm/svm/svm.h b/xen/arch/x86/include/asm/hvm/svm/svm.h
index 4eeeb25da9..32f6e48e30 100644
--- a/xen/arch/x86/include/asm/hvm/svm/svm.h
+++ b/xen/arch/x86/include/asm/hvm/svm/svm.h
@@ -21,6 +21,17 @@ bool svm_load_segs(unsigned int ldt_ents, unsigned long ldt_base,
unsigned long fs_base, unsigned long gs_base,
unsigned long gs_shadow);
+struct svm_domain {
+ /* OSVW MSRs */
+ union {
+ uint64_t raw[2];
+ struct {
+ uint64_t length;
+ uint64_t status;
+ };
+ } osvw;
+};
+
extern u32 svm_feature_flags;
#define SVM_FEATURE_NPT 0 /* Nested page table support */
diff --git a/xen/arch/x86/include/asm/hvm/svm/vmcb.h b/xen/arch/x86/include/asm/hvm/svm/vmcb.h
index 28f715e376..3d871b6135 100644
--- a/xen/arch/x86/include/asm/hvm/svm/vmcb.h
+++ b/xen/arch/x86/include/asm/hvm/svm/vmcb.h
@@ -548,17 +548,6 @@ struct vmcb_struct {
u64 res18[291];
};
-struct svm_domain {
- /* OSVW MSRs */
- union {
- uint64_t raw[2];
- struct {
- uint64_t length;
- uint64_t status;
- };
- } osvw;
-};
-
/*
* VMRUN doesn't switch fs/gs/tr/ldtr and SHADOWGS/SYSCALL/SYSENTER state.
* Therefore, guest state is in the hardware registers when servicing a
--
2.49.0
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
* [RFC PATCH 03/16] x86/hvm: Add support for physical address ABI
2025-05-16 9:31 [RFC PATCH 00/16] Confidential computing and AMD SEV support Teddy Astie
` (2 preceding siblings ...)
2025-05-16 9:31 ` [RFC PATCH 02/16] x86/svm: Move svm_domain structure to svm.h Teddy Astie
@ 2025-05-16 9:31 ` Teddy Astie
2025-05-16 9:31 ` [RFC PATCH 05/16] docs/x86: Document HVM Physical Address ABI Teddy Astie
` (12 subsequent siblings)
16 siblings, 0 replies; 22+ messages in thread
From: Teddy Astie @ 2025-05-16 9:31 UTC (permalink / raw)
To: xen-devel
Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné,
Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini
Guests can tag a hypercall index with 0x40000000 to opt in to an
alternative ABI that takes physical addresses instead of linear ones.
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
This one is based on the "HVMv2 ABI" RFC, but reworked to be more
compatible with existing guests (a guest needs to opt in to the ABI for each
hypercall).
Andrew has plans for a better HVM ABI covering this, but this is a
starting point for this RFC.
---
xen/arch/x86/hvm/hvm.c | 17 ++++++++++++++---
xen/arch/x86/hvm/hypercall.c | 17 +++++++++++++----
xen/include/xen/sched.h | 2 ++
3 files changed, 29 insertions(+), 7 deletions(-)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 4cb2e13046..0e7c453b24 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3497,7 +3497,11 @@ unsigned int copy_to_user_hvm(void *to, const void *from, unsigned int len)
return 0;
}
- rc = hvm_copy_to_guest_linear((unsigned long)to, from, len, 0, NULL);
+ if ( evaluate_nospec(current->hcall_physaddr) )
+ rc = hvm_copy_to_guest_phys((unsigned long)to, from, len, current);
+ else
+ rc = hvm_copy_to_guest_linear((unsigned long)to, from, len, 0, NULL);
+
return rc ? len : 0; /* fake a copy_to_user() return code */
}
@@ -3511,7 +3515,10 @@ unsigned int clear_user_hvm(void *to, unsigned int len)
return 0;
}
- rc = hvm_copy_to_guest_linear((unsigned long)to, NULL, len, 0, NULL);
+ if ( evaluate_nospec(current->hcall_physaddr) )
+ rc = hvm_copy_to_guest_phys((unsigned long)to, NULL, len, current);
+ else
+ rc = hvm_copy_to_guest_linear((unsigned long)to, NULL, len, 0, NULL);
return rc ? len : 0; /* fake a clear_user() return code */
}
@@ -3526,7 +3533,11 @@ unsigned int copy_from_user_hvm(void *to, const void *from, unsigned int len)
return 0;
}
- rc = hvm_copy_from_guest_linear(to, (unsigned long)from, len, 0, NULL);
+ if ( evaluate_nospec(current->hcall_physaddr) )
+ rc = hvm_copy_from_guest_phys(to, (unsigned long)from, len);
+ else
+ rc = hvm_copy_from_guest_linear(to, (unsigned long)from, len, 0, NULL);
+
return rc ? len : 0; /* fake a copy_from_user() return code */
}
diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index 6f8dfdff4a..b891089cda 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -160,8 +160,13 @@ int hvm_hypercall(struct cpu_user_regs *regs)
HVM_DBG_LOG(DBG_LEVEL_HCALL, "hcall%lu(%lx, %lx, %lx, %lx, %lx)",
eax, regs->rdi, regs->rsi, regs->rdx, regs->r10, regs->r8);
- call_handlers_hvm64(eax, regs->rax, regs->rdi, regs->rsi, regs->rdx,
- regs->r10, regs->r8);
+ if ( eax & 0x40000000U )
+ curr->hcall_physaddr = true;
+
+ call_handlers_hvm64(eax & ~0x40000000U, regs->rax, regs->rdi, regs->rsi,
+ regs->rdx, regs->r10, regs->r8);
+
+ curr->hcall_physaddr = false;
if ( !curr->hcall_preempted && regs->rax != -ENOSYS )
clobber_regs(regs, eax, hvm, 64);
@@ -172,9 +177,13 @@ int hvm_hypercall(struct cpu_user_regs *regs)
regs->ebx, regs->ecx, regs->edx, regs->esi, regs->edi);
curr->hcall_compat = true;
- call_handlers_hvm32(eax, regs->eax, regs->ebx, regs->ecx, regs->edx,
- regs->esi, regs->edi);
+ if ( eax & 0x40000000U )
+ curr->hcall_physaddr = true;
+
+ call_handlers_hvm32(eax & ~0x40000000U, regs->eax, regs->ebx, regs->ecx,
+ regs->edx, regs->esi, regs->edi);
curr->hcall_compat = false;
+ curr->hcall_physaddr = false;
if ( !curr->hcall_preempted && regs->eax != -ENOSYS )
clobber_regs(regs, eax, hvm, 32);
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 559d201e0c..4ce9253284 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -240,6 +240,8 @@ struct vcpu
bool hcall_compat;
/* Physical runstate area registered via compat ABI? */
bool runstate_guest_area_compat;
+ /* A hypercall is using the physical address ABI? */
+ bool hcall_physaddr;
#endif
#ifdef CONFIG_IOREQ_SERVER
--
2.49.0
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
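The tagging scheme in this patch can be sketched in isolation as follows. The tag value comes from the patch; the macro name, the struct and both helpers are hypothetical and only illustrate how the flag and the real hypercall index are separated before dispatch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Tag value from the patch; the macro name is made up for this sketch. */
#define HCALL_PHYSADDR_TAG 0x40000000U

/* Hypothetical decoded form of the value the guest passes in EAX. */
struct hcall_request {
    uint32_t index;    /* hypercall index with the tag stripped */
    bool     physaddr; /* true if the guest opted in to physical addresses */
};

/* Split the raw EAX value into the real index and the ABI flag. */
static struct hcall_request hcall_decode(uint32_t eax)
{
    struct hcall_request req = {
        .index    = eax & ~HCALL_PHYSADDR_TAG,
        .physaddr = (eax & HCALL_PHYSADDR_TAG) != 0,
    };
    return req;
}

/* Guest side: tag an index to request the physical address ABI. */
static uint32_t hcall_encode_physaddr(uint32_t index)
{
    return index | HCALL_PHYSADDR_TAG;
}
```

Because the flag is carried per call, an untagged index keeps its existing linear-address meaning, which is what makes the scheme opt-in.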
* [RFC PATCH 05/16] docs/x86: Document HVM Physical Address ABI
2025-05-16 9:31 [RFC PATCH 00/16] Confidential computing and AMD SEV support Teddy Astie
` (3 preceding siblings ...)
2025-05-16 9:31 ` [RFC PATCH 03/16] x86/hvm: Add support for physical address ABI Teddy Astie
@ 2025-05-16 9:31 ` Teddy Astie
2025-05-16 9:31 ` [RFC PATCH 06/16] vmx: Introduce vcpu single context VPID invalidation Teddy Astie
` (11 subsequent siblings)
16 siblings, 0 replies; 22+ messages in thread
From: Teddy Astie @ 2025-05-16 9:31 UTC (permalink / raw)
To: xen-devel
Cc: Teddy Astie, Andrew Cooper, Anthony PERARD, Michal Orzel,
Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
docs/guest-guide/x86/hypercall-abi.rst | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/docs/guest-guide/x86/hypercall-abi.rst b/docs/guest-guide/x86/hypercall-abi.rst
index e52ed453bc..710a02895b 100644
--- a/docs/guest-guide/x86/hypercall-abi.rst
+++ b/docs/guest-guide/x86/hypercall-abi.rst
@@ -35,6 +35,10 @@ The registers used for hypercalls depends on the operating mode of the guest.
HVM guest depends on whether the vCPU is operating in a 64bit segment or not
[#mode]_.
+If ``XEN_HVM_CPUID_PHYS_ADDR_ABI`` is supported, HVM guests can use an
+alternative ABI where hypercall parameters are physical addresses instead of
+linear addresses. A guest selects this ABI per hypercall by tagging the
+hypercall index with 0x40000000.
Parameters
----------
--
2.49.0
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
* [RFC PATCH 06/16] vmx: Introduce vcpu single context VPID invalidation
2025-05-16 9:31 [RFC PATCH 00/16] Confidential computing and AMD SEV support Teddy Astie
` (4 preceding siblings ...)
2025-05-16 9:31 ` [RFC PATCH 05/16] docs/x86: Document HVM Physical Address ABI Teddy Astie
@ 2025-05-16 9:31 ` Teddy Astie
2025-05-16 10:22 ` [RFC PATCH 07/16] x86/hvm: Introduce Xen-wide ASID allocator Teddy Astie
` (10 subsequent siblings)
16 siblings, 0 replies; 22+ messages in thread
From: Teddy Astie @ 2025-05-16 9:31 UTC (permalink / raw)
To: xen-devel; +Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné
Introduce vpid_sync_vcpu_context to do a single-context invalidation
on the VPID attached to the vcpu, as an alternative to per-gva and
all-context invalidations.
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
This will be used on Intel platforms for the ASID management rework.
---
xen/arch/x86/include/asm/hvm/vmx/vmx.h | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/xen/arch/x86/include/asm/hvm/vmx/vmx.h b/xen/arch/x86/include/asm/hvm/vmx/vmx.h
index d85b52b9d5..a55a31b42d 100644
--- a/xen/arch/x86/include/asm/hvm/vmx/vmx.h
+++ b/xen/arch/x86/include/asm/hvm/vmx/vmx.h
@@ -451,6 +451,27 @@ static inline void ept_sync_all(void)
void ept_sync_domain(struct p2m_domain *p2m);
+static inline void vpid_sync_vcpu_context(struct vcpu *v)
+{
+ int type = INVVPID_SINGLE_CONTEXT;
+
+ /*
+ * Prefer single context invalidation, which only targets the VPID
+ * attached to this vcpu.
+ */
+ if ( likely(cpu_has_vmx_vpid_invvpid_single_context) )
+ goto execute_invvpid;
+
+ /*
+ * If single context invalidation is not supported, we escalate to
+ * use all context invalidation.
+ */
+ type = INVVPID_ALL_CONTEXT;
+
+execute_invvpid:
+ __invvpid(type, v->arch.hvm.n1asid.asid, 0);
+}
+
static inline void vpid_sync_vcpu_gva(struct vcpu *v, unsigned long gva)
{
int type = INVVPID_INDIVIDUAL_ADDR;
--
2.49.0
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
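The fallback logic of this helper can be expressed without the goto. The sketch below is standalone: the enum restates the INVVPID type numbers from the Intel SDM under local names, and the function name and its boolean parameter (standing in for the capability check) are hypothetical.

```c
#include <stdbool.h>

/* INVVPID invalidation types (numbering per the Intel SDM); local names. */
enum invvpid_type {
    SKETCH_INVVPID_INDIVIDUAL_ADDR = 0,
    SKETCH_INVVPID_SINGLE_CONTEXT  = 1,
    SKETCH_INVVPID_ALL_CONTEXT     = 2,
};

/*
 * Hypothetical helper: choose the narrowest invalidation available for a
 * per-vcpu flush. Without single-context support, escalate to flushing
 * every VPID.
 */
static enum invvpid_type pick_vcpu_invvpid_type(bool single_context_supported)
{
    return single_context_supported ? SKETCH_INVVPID_SINGLE_CONTEXT
                                    : SKETCH_INVVPID_ALL_CONTEXT;
}
```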
* [RFC PATCH 07/16] x86/hvm: Introduce Xen-wide ASID allocator
2025-05-16 9:31 [RFC PATCH 00/16] Confidential computing and AMD SEV support Teddy Astie
` (5 preceding siblings ...)
2025-05-16 9:31 ` [RFC PATCH 06/16] vmx: Introduce vcpu single context VPID invalidation Teddy Astie
@ 2025-05-16 10:22 ` Teddy Astie
2025-05-16 10:22 ` [RFC PATCH 08/16] x86/crypto: Introduce AMD PSP driver for SEV Teddy Astie
` (9 subsequent siblings)
16 siblings, 0 replies; 22+ messages in thread
From: Teddy Astie @ 2025-05-16 10:22 UTC (permalink / raw)
To: xen-devel
Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné,
Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini,
Tim Deegan, Vaishali Thakkar
From: Vaishali Thakkar <vaishali.thakkar@suse.com>
Currently ASID generation and management is done per-PCPU. This
scheme is incompatible with SEV technologies, as SEV VMs need a
fixed ASID associated with all vcpus of the VM throughout
its lifetime.
This commit introduces a Xen-wide allocator which initializes
the ASIDs at Xen start-up and gives each domain a fixed ASID
throughout its lifecycle. Having a fixed ASID
for non-SEV domains also presents us with the opportunity to
make further use of AMD instructions like TLBSYNC and INVLPGB
for broadcasting TLB invalidations.
Introduce the vcpu->needs_tlb_flush attribute to schedule a guest TLB
flush for the next VMRUN/VMENTER. This will later be done using
either the TLB_CONTROL field (AMD) or INVEPT (Intel). This flush method
is used in place of the current ASID swapping logic.
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
Signed-off-by: Vaishali Thakkar <vaishali.thakkar@suse.com>
---
TODO:
- Intel: Don't assign the VPID at each VMENTER, though we need
to rethink how we manage VMCS with nested virtualization / altp2m
for changing this behavior.
- AMD: Consider hot-plug of CPU with ERRATA_170. (is it possible ?)
- Consider cases where we don't have enough ASIDs (e.g Xen as nested guest)
- Nested virtualization ASID management
This patch currently breaks shadow paging, yet I haven't managed to diagnose
exactly what is happening.
Original changelog :
Changes since v3:
- Simplified asid bitmap management
It is only called once per domain, so it doesn't need to have
a complicated logic.
- Drop hvm_asid_data structure which doesn't serve a purpose anymore.
- Introduce and use vcpu->needs_tlb_flush to indicate that a guest TLB
flush is needed before waking the vcpu. It is used to set
TLB_CONTROL (AMD) field properly or issue an appropriate INVEPT (Intel).
- Only assign ASID once (see TODO for Intel side)
- Check the ERRATA_170 for each CPU present.
- add asid_alloc_range() for allocating within a specific
range (e.g SEV ASID ranges)
Changes since v2:
- Moved hvm_asid_domain_create to hvm_domain_initialise
- Added __ro_after_init for bitmaps
- Make hvm_asid_init unsigned int __init
- Remove functions hvm_asid_flush_domain_asid and hvm_asid_flush_vcpu
- Mark ASID 0 as permanently used
- Remove the irrelevant tracking of generation
- Add hvm_domain_asid_destroy to avoid layering violation
- Remove unnecessary fixups touching the same code
- Add a logic to move asids from reclaim_bitmap->asid_bitmap
- Misc styling fixes - remove unnecessary trailing spaces/printks
Changes since v1:
- Introduce hvm_asid_bitmap as discussed at Xen-summit
- Introduce hvm_reclaim_bitmap for reusing ASIDs
- Assign the asid to the domain at the domain creation via
hvm_asid_domain_create
- Corrected the use of CPUID in the svm_asid_init function
- Adjusted the code in nested virtualization related files
to use new scheme. As discussed at the Xen-summit, this
is not tested.
- Addressed Jan's comments about using uniform style for
accessing domains via v->domain
- Allow to flush at the vcpu level in HAP code
- Documented the sketch of implementation for the new scheme
- Remove min_asid as, for this patch, we are not demonstrating
its use case
- Arrange includes in multiple files as per Jan's feedback
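The allocation scheme described in the commit message can be sketched as a small standalone bitmap pool. All names below are hypothetical and the capacity is arbitrary. The sketch is single-threaded; a real allocator would hold a lock around the search and must release it on every exit path, including the out-of-ASIDs one.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define ASID_POOL_WORDS 4u /* arbitrary capacity for the sketch: 256 ASIDs */

struct asid_pool {
    uint64_t bitmap[ASID_POOL_WORDS]; /* bit i set => ASID i is in use */
    unsigned int count;               /* number of ASIDs, including ASID 0 */
};

static int pool_test(const struct asid_pool *p, unsigned int i)
{
    return (int)((p->bitmap[i / 64] >> (i % 64)) & 1);
}

static void pool_set(struct asid_pool *p, unsigned int i)
{
    p->bitmap[i / 64] |= UINT64_C(1) << (i % 64);
}

static void pool_clear(struct asid_pool *p, unsigned int i)
{
    p->bitmap[i / 64] &= ~(UINT64_C(1) << (i % 64));
}

/* Initialise the pool and mark the reserved ASID 0 as permanently used. */
static int asid_pool_init(struct asid_pool *p, unsigned int count)
{
    if (count == 0 || count > ASID_POOL_WORDS * 64)
        return -EINVAL;
    memset(p->bitmap, 0, sizeof(p->bitmap));
    p->count = count;
    pool_set(p, 0);
    return 0;
}

/* Allocate the lowest free ASID in [min, max]; returns it or a negative errno. */
static int asid_pool_alloc_range(struct asid_pool *p, unsigned int min,
                                 unsigned int max)
{
    if (min > max)
        return -EINVAL;
    /* Valid ASIDs are 0 .. count - 1, so the bound check is strict. */
    for (unsigned int i = min; i <= max && i < p->count; i++) {
        if (!pool_test(p, i)) {
            pool_set(p, i);
            return (int)i;
        }
    }
    return -ENOSPC;
}

/* Allocate the lowest free ASID anywhere in the pool. */
static int asid_pool_alloc(struct asid_pool *p)
{
    return asid_pool_alloc_range(p, 0, p->count);
}

/* Release an ASID; the reserved ASID 0 and out-of-range values are ignored. */
static void asid_pool_free(struct asid_pool *p, unsigned int asid)
{
    if (asid == 0 || asid >= p->count)
        return;
    pool_clear(p, asid);
}
```

With a pool of four ASIDs, three allocations succeed (1, 2, 3) and the fourth reports -ENOSPC, which corresponds to the "not enough ASIDs" case listed in the TODO.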
---
xen/arch/x86/flushtlb.c | 7 +-
xen/arch/x86/hvm/asid.c | 170 +++++++++++--------------
xen/arch/x86/hvm/emulate.c | 2 +-
xen/arch/x86/hvm/hvm.c | 14 +-
xen/arch/x86/hvm/nestedhvm.c | 7 +-
xen/arch/x86/hvm/svm/asid.c | 77 +++++++----
xen/arch/x86/hvm/svm/nestedsvm.c | 2 +-
xen/arch/x86/hvm/svm/svm.c | 37 +++---
xen/arch/x86/hvm/svm/svm.h | 4 -
xen/arch/x86/hvm/vmx/vmcs.c | 6 +-
xen/arch/x86/hvm/vmx/vmx.c | 68 +++++-----
xen/arch/x86/hvm/vmx/vvmx.c | 5 +-
xen/arch/x86/include/asm/hvm/asid.h | 26 ++--
xen/arch/x86/include/asm/hvm/domain.h | 1 +
xen/arch/x86/include/asm/hvm/hvm.h | 15 +--
xen/arch/x86/include/asm/hvm/svm/svm.h | 5 +
xen/arch/x86/include/asm/hvm/vcpu.h | 10 +-
xen/arch/x86/include/asm/hvm/vmx/vmx.h | 8 +-
xen/arch/x86/mm/hap/hap.c | 7 +-
xen/arch/x86/mm/p2m.c | 7 +-
xen/arch/x86/mm/paging.c | 2 +-
xen/arch/x86/mm/shadow/hvm.c | 1 +
xen/arch/x86/mm/shadow/multi.c | 1 +
xen/include/xen/sched.h | 2 +
24 files changed, 238 insertions(+), 246 deletions(-)
diff --git a/xen/arch/x86/flushtlb.c b/xen/arch/x86/flushtlb.c
index 1e0011d5b1..9cae828b34 100644
--- a/xen/arch/x86/flushtlb.c
+++ b/xen/arch/x86/flushtlb.c
@@ -13,6 +13,7 @@
#include <xen/softirq.h>
#include <asm/cache.h>
#include <asm/flushtlb.h>
+#include <asm/hvm/hvm.h>
#include <asm/invpcid.h>
#include <asm/nops.h>
#include <asm/page.h>
@@ -124,7 +125,6 @@ void switch_cr3_cr4(unsigned long cr3, unsigned long cr4)
if ( tlb_clk_enabled )
t = pre_flush();
- hvm_flush_guest_tlbs();
old_cr4 = read_cr4();
ASSERT(!(old_cr4 & X86_CR4_PCIDE) || !(old_cr4 & X86_CR4_PGE));
@@ -229,8 +229,9 @@ unsigned int flush_area_local(const void *va, unsigned int flags)
do_tlb_flush();
}
- if ( flags & FLUSH_HVM_ASID_CORE )
- hvm_flush_guest_tlbs();
+ //if ( flags & FLUSH_HVM_ASID_CORE )
+ // // Needed ?
+ // hvm_flush_tlb(NULL);
if ( flags & FLUSH_CACHE )
{
diff --git a/xen/arch/x86/hvm/asid.c b/xen/arch/x86/hvm/asid.c
index 8d27b7dba1..91f8e44210 100644
--- a/xen/arch/x86/hvm/asid.c
+++ b/xen/arch/x86/hvm/asid.c
@@ -8,133 +8,107 @@
#include <xen/init.h>
#include <xen/lib.h>
#include <xen/param.h>
-#include <xen/sched.h>
-#include <xen/smp.h>
-#include <xen/percpu.h>
+#include <xen/spinlock.h>
+#include <xen/xvmalloc.h>
+
#include <asm/hvm/asid.h>
+#include <asm/bitops.h>
/* Xen command-line option to enable ASIDs */
static bool __read_mostly opt_asid_enabled = true;
boolean_param("asid", opt_asid_enabled);
+bool __read_mostly asid_enabled = false;
+static unsigned long __ro_after_init *asid_bitmap;
+static unsigned long __ro_after_init asid_count;
+static DEFINE_SPINLOCK(asid_lock);
+
/*
- * ASIDs partition the physical TLB. In the current implementation ASIDs are
- * introduced to reduce the number of TLB flushes. Each time the guest's
- * virtual address space changes (e.g. due to an INVLPG, MOV-TO-{CR3, CR4}
- * operation), instead of flushing the TLB, a new ASID is assigned. This
- * reduces the number of TLB flushes to at most 1/#ASIDs. The biggest
- * advantage is that hot parts of the hypervisor's code and data retain in
- * the TLB.
- *
* Sketch of the Implementation:
- *
- * ASIDs are a CPU-local resource. As preemption of ASIDs is not possible,
- * ASIDs are assigned in a round-robin scheme. To minimize the overhead of
- * ASID invalidation, at the time of a TLB flush, ASIDs are tagged with a
- * 64-bit generation. Only on a generation overflow the code needs to
- * invalidate all ASID information stored at the VCPUs with are run on the
- * specific physical processor. This overflow appears after about 2^80
- * host processor cycles, so we do not optimize this case, but simply disable
- * ASID useage to retain correctness.
+ * ASIDs are assigned uniquely per domain and don't change during
+ * the lifecycle of the domain. Once vcpus are initialized and are up,
+ * we assign the same ASID to all vcpus of that domain at the first VMRUN.
+ * In order to process a TLB flush on a vcpu, we set needs_tlb_flush
+ * to schedule a TLB flush for the next VMRUN (e.g using tlb control field
+ * of VMCB).
*/
-/* Per-CPU ASID management. */
-struct hvm_asid_data {
- uint64_t core_asid_generation;
- uint32_t next_asid;
- uint32_t max_asid;
- bool disabled;
-};
-
-static DEFINE_PER_CPU(struct hvm_asid_data, hvm_asid_data);
-
-void hvm_asid_init(int nasids)
+int __init hvm_asid_init(unsigned long nasids)
{
- static int8_t g_disabled = -1;
- struct hvm_asid_data *data = &this_cpu(hvm_asid_data);
+ ASSERT(nasids);
- data->max_asid = nasids - 1;
- data->disabled = !opt_asid_enabled || (nasids <= 1);
-
- if ( g_disabled != data->disabled )
- {
- printk("HVM: ASIDs %sabled.\n", data->disabled ? "dis" : "en");
- if ( g_disabled < 0 )
- g_disabled = data->disabled;
- }
+ asid_count = nasids;
+ asid_enabled = opt_asid_enabled && (nasids > 1);
- /* Zero indicates 'invalid generation', so we start the count at one. */
- data->core_asid_generation = 1;
+ asid_bitmap = xvzalloc_array(unsigned long, BITS_TO_LONGS(asid_count));
+ if ( !asid_bitmap )
+ return -ENOMEM;
- /* Zero indicates 'ASIDs disabled', so we start the count at one. */
- data->next_asid = 1;
-}
+ printk("HVM: ASIDs %sabled (count=%lu)\n", asid_enabled ? "en" : "dis", asid_count);
-void hvm_asid_flush_vcpu_asid(struct hvm_vcpu_asid *asid)
-{
- write_atomic(&asid->generation, 0);
-}
+ /* ASID 0 is reserved, mark it as permanently used */
+ set_bit(0, asid_bitmap);
-void hvm_asid_flush_vcpu(struct vcpu *v)
-{
- hvm_asid_flush_vcpu_asid(&v->arch.hvm.n1asid);
- hvm_asid_flush_vcpu_asid(&vcpu_nestedhvm(v).nv_n2asid);
+ return 0;
}
-void hvm_asid_flush_core(void)
+int hvm_asid_alloc(struct hvm_asid *asid)
{
- struct hvm_asid_data *data = &this_cpu(hvm_asid_data);
+ unsigned long new_asid;
+
+ if ( !asid_enabled )
+ {
+ asid->asid = 1;
+ return 0;
+ }
- if ( data->disabled )
- return;
+ spin_lock(&asid_lock);
+ new_asid = find_first_zero_bit(asid_bitmap, asid_count);
+ if ( new_asid >= asid_count )
+ {
+ spin_unlock(&asid_lock);
+ return -ENOSPC;
+ }
- if ( likely(++data->core_asid_generation != 0) )
- return;
+ set_bit(new_asid, asid_bitmap);
- /*
- * ASID generations are 64 bit. Overflow of generations never happens.
- * For safety, we simply disable ASIDs, so correctness is established; it
- * only runs a bit slower.
- */
- printk("HVM: ASID generation overrun. Disabling ASIDs.\n");
- data->disabled = 1;
+ asid->asid = new_asid;
+ spin_unlock(&asid_lock);
+ return 0;
}
-bool hvm_asid_handle_vmenter(struct hvm_vcpu_asid *asid)
+int hvm_asid_alloc_range(struct hvm_asid *asid, unsigned long min, unsigned long max)
{
- struct hvm_asid_data *data = &this_cpu(hvm_asid_data);
-
- /* On erratum #170 systems we must flush the TLB.
- * Generation overruns are taken here, too. */
- if ( data->disabled )
- goto disabled;
-
- /* Test if VCPU has valid ASID. */
- if ( read_atomic(&asid->generation) == data->core_asid_generation )
- return 0;
+ unsigned long new_asid;
+
+ if ( WARN_ON(min >= asid_count) )
+ return -EINVAL;
+
+ if ( !asid_enabled )
+ return -EOPNOTSUPP;
+
+ spin_lock(&asid_lock);
+ new_asid = find_next_zero_bit(asid_bitmap, asid_count, min);
+ if ( new_asid > max || new_asid >= asid_count )
+ {
+ spin_unlock(&asid_lock);
+ return -ENOSPC;
+ }
+
+ set_bit(new_asid, asid_bitmap);
+
+ asid->asid = new_asid;
+ spin_unlock(&asid_lock);
+ return 0;
+}
- /* If there are no free ASIDs, need to go to a new generation */
- if ( unlikely(data->next_asid > data->max_asid) )
- {
- hvm_asid_flush_core();
- data->next_asid = 1;
- if ( data->disabled )
- goto disabled;
- }
+void hvm_asid_free(struct hvm_asid *asid)
+{
+ ASSERT( asid->asid );
- /* Now guaranteed to be a free ASID. */
- asid->asid = data->next_asid++;
- write_atomic(&asid->generation, data->core_asid_generation);
+ if ( !asid_enabled )
+ return;
- /*
- * When we assign ASID 1, flush all TLB entries as we are starting a new
- * generation, and all old ASID allocations are now stale.
- */
- return (asid->asid == 1);
+ ASSERT( asid->asid < asid_count );
- disabled:
- asid->asid = 0;
- return 0;
+ spin_lock(&asid_lock);
+ WARN_ON(!test_bit(asid->asid, asid_bitmap));
+ clear_bit(asid->asid, asid_bitmap);
+ spin_unlock(&asid_lock);
}
/*
diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 91f004d233..6ed8e03475 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -2666,7 +2666,7 @@ static int cf_check hvmemul_tlb_op(
case x86emul_invpcid:
if ( x86emul_invpcid_type(aux) != X86_INVPCID_INDIV_ADDR )
{
- hvm_asid_flush_vcpu(current);
+ current->needs_tlb_flush = true;
break;
}
aux = x86emul_invpcid_pcid(aux);
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 0e7c453b24..625ae2098b 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -702,6 +702,10 @@ int hvm_domain_initialise(struct domain *d,
if ( rc )
goto fail2;
+ rc = hvm_asid_alloc(&d->arch.hvm.asid);
+ if ( rc )
+ goto fail2;
+
rc = alternative_call(hvm_funcs.domain_initialise, d);
if ( rc != 0 )
goto fail2;
@@ -782,8 +786,9 @@ void hvm_domain_destroy(struct domain *d)
list_del(&ioport->list);
xfree(ioport);
}
-
+ hvm_asid_free(&d->arch.hvm.asid);
destroy_vpci_mmcfg(d);
+
}
static int cf_check hvm_save_tsc_adjust(struct vcpu *v, hvm_domain_context_t *h)
@@ -1603,7 +1608,7 @@ int hvm_vcpu_initialise(struct vcpu *v)
int rc;
struct domain *d = v->domain;
- hvm_asid_flush_vcpu(v);
+ v->needs_tlb_flush = true;
spin_lock_init(&v->arch.hvm.tm_lock);
INIT_LIST_HEAD(&v->arch.hvm.tm_list);
@@ -4145,6 +4150,11 @@ static void hvm_s3_resume(struct domain *d)
}
}
+int hvm_flush_tlb(const unsigned long *vcpu_bitmap)
+{
+ return current->domain->arch.paging.flush_tlb(vcpu_bitmap);
+}
+
static int hvmop_flush_tlb_all(void)
{
if ( !is_hvm_domain(current->domain) )
diff --git a/xen/arch/x86/hvm/nestedhvm.c b/xen/arch/x86/hvm/nestedhvm.c
index bddd77d810..61e866b771 100644
--- a/xen/arch/x86/hvm/nestedhvm.c
+++ b/xen/arch/x86/hvm/nestedhvm.c
@@ -12,6 +12,7 @@
#include <asm/hvm/nestedhvm.h>
#include <asm/event.h> /* for local_event_delivery_(en|dis)able */
#include <asm/paging.h> /* for paging_mode_hap() */
+#include <asm/hvm/asid.h>
static unsigned long *shadow_io_bitmap[3];
@@ -36,13 +37,11 @@ nestedhvm_vcpu_reset(struct vcpu *v)
hvm_unmap_guest_frame(nv->nv_vvmcx, 1);
nv->nv_vvmcx = NULL;
nv->nv_vvmcxaddr = INVALID_PADDR;
- nv->nv_flushp2m = 0;
+ nv->nv_flushp2m = true;
nv->nv_p2m = NULL;
nv->stale_np2m = false;
nv->np2m_generation = 0;
- hvm_asid_flush_vcpu_asid(&nv->nv_n2asid);
-
alternative_vcall(hvm_funcs.nhvm_vcpu_reset, v);
/* vcpu is in host mode */
@@ -86,7 +85,7 @@ static void cf_check nestedhvm_flushtlb_ipi(void *info)
* This is cheaper than flush_tlb_local() and has
* the same desired effect.
*/
- hvm_asid_flush_core();
+ WARN_ON(hvm_flush_tlb(NULL));
vcpu_nestedhvm(v).nv_p2m = NULL;
vcpu_nestedhvm(v).stale_np2m = true;
}
diff --git a/xen/arch/x86/hvm/svm/asid.c b/xen/arch/x86/hvm/svm/asid.c
index 7977a8e86b..1b6def4a4c 100644
--- a/xen/arch/x86/hvm/svm/asid.c
+++ b/xen/arch/x86/hvm/svm/asid.c
@@ -1,56 +1,77 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
- * asid.c: handling ASIDs in SVM.
+ * asid.c: handling ASIDs/VPIDs.
* Copyright (c) 2007, Advanced Micro Devices, Inc.
*/
+#include <xen/cpumask.h>
+
#include <asm/amd.h>
#include <asm/hvm/nestedhvm.h>
#include <asm/hvm/svm/svm.h>
+#include <asm/hvm/svm/vmcb.h>
+#include <asm/processor.h>
#include "svm.h"
-void svm_asid_init(const struct cpuinfo_x86 *c)
+void __init svm_asid_init(void)
{
- int nasids = 0;
+ unsigned int cpu;
+ int nasids = cpuid_ebx(0x8000000aU);
+
+ if ( !nasids )
+ nasids = 1;
- /* Check for erratum #170, and leave ASIDs disabled if it's present. */
- if ( !cpu_has_amd_erratum(c, AMD_ERRATUM_170) )
- nasids = cpuid_ebx(0x8000000aU);
+ for_each_present_cpu(cpu)
+ {
+ /* Check for erratum #170, and leave ASIDs disabled if it's present. */
+ if ( cpu_has_amd_erratum(&cpu_data[cpu], AMD_ERRATUM_170) )
+ {
+ printk(XENLOG_WARNING "Disabling ASID due to erratum 170 on CPU%u\n", cpu);
+ nasids = 1;
+ }
+ }
- hvm_asid_init(nasids);
+ BUG_ON(hvm_asid_init(nasids));
}
/*
- * Called directly before VMRUN. Checks if the VCPU needs a new ASID,
- * assigns it, and if required, issues required TLB flushes.
+ * Called directly at the first VMRUN/VMENTER of a vcpu to assign the ASID/VPID.
*/
-void svm_asid_handle_vmrun(void)
+void svm_vcpu_assign_asid(struct vcpu *v)
{
- struct vcpu *curr = current;
- struct vmcb_struct *vmcb = curr->arch.hvm.svm.vmcb;
- struct hvm_vcpu_asid *p_asid =
- nestedhvm_vcpu_in_guestmode(curr)
- ? &vcpu_nestedhvm(curr).nv_n2asid : &curr->arch.hvm.n1asid;
- bool need_flush = hvm_asid_handle_vmenter(p_asid);
-
- /* ASID 0 indicates that ASIDs are disabled. */
- if ( p_asid->asid == 0 )
- {
- vmcb_set_asid(vmcb, true);
- vmcb->tlb_control =
- cpu_has_svm_flushbyasid ? TLB_CTRL_FLUSH_ASID : TLB_CTRL_FLUSH_ALL;
- return;
- }
+ struct vmcb_struct *vmcb = v->arch.hvm.svm.vmcb;
+ struct hvm_asid *p_asid = &v->domain->arch.hvm.asid;
+
+ ASSERT(p_asid->asid);
- if ( vmcb_get_asid(vmcb) != p_asid->asid )
- vmcb_set_asid(vmcb, p_asid->asid);
+ /* ASID 0 is reserved for the host, so fall back to ASID 1 when ASIDs are disabled. */
+ vmcb_set_asid(vmcb, asid_enabled ? p_asid->asid : 1);
+}
+
+/* Request a TLB flush at the next VMRUN. */
+void svm_vcpu_set_tlb_control(struct vcpu *v)
+{
+ struct vmcb_struct *vmcb = v->arch.hvm.svm.vmcb;
+
+ /*
+ * If the vcpu is already running, the tlb control flag may not be
+ * processed and will be cleared at the next VMEXIT, which will undo
+ * what we are trying to do.
+ */
+ WARN_ON(v != current && v->is_running);
vmcb->tlb_control =
- !need_flush ? TLB_CTRL_NO_FLUSH :
cpu_has_svm_flushbyasid ? TLB_CTRL_FLUSH_ASID : TLB_CTRL_FLUSH_ALL;
}
+void svm_vcpu_clear_tlb_control(struct vcpu *v)
+{
+ struct vmcb_struct *vmcb = v->arch.hvm.svm.vmcb;
+
+ vmcb->tlb_control = TLB_CTRL_NO_FLUSH;
+}
+
/*
* Local variables:
* mode: C
diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index dc2b6a4253..6e5dc2624f 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -5,6 +5,7 @@
*
*/
+#include <asm/hvm/asid.h>
#include <asm/hvm/support.h>
#include <asm/hvm/svm/svm.h>
#include <asm/hvm/svm/vmcb.h>
@@ -699,7 +700,6 @@ nsvm_vcpu_vmentry(struct vcpu *v, struct cpu_user_regs *regs,
if ( svm->ns_asid != vmcb_get_asid(ns_vmcb))
{
nv->nv_flushp2m = 1;
- hvm_asid_flush_vcpu_asid(&vcpu_nestedhvm(v).nv_n2asid);
svm->ns_asid = vmcb_get_asid(ns_vmcb);
}
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index e33a38c1e4..cc19d80fe1 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -26,6 +26,7 @@
#include <asm/hvm/monitor.h>
#include <asm/hvm/nestedhvm.h>
#include <asm/hvm/support.h>
+#include <asm/hvm/asid.h>
#include <asm/hvm/svm/svm.h>
#include <asm/hvm/svm/svmdebug.h>
#include <asm/hvm/svm/vmcb.h>
@@ -183,14 +184,17 @@ static void cf_check svm_update_guest_cr(
if ( !nestedhvm_enabled(v->domain) )
{
if ( !(flags & HVM_UPDATE_GUEST_CR3_NOFLUSH) )
- hvm_asid_flush_vcpu(v);
+ v->needs_tlb_flush = true;
}
else if ( nestedhvm_vmswitch_in_progress(v) )
; /* CR3 switches during VMRUN/VMEXIT do not flush the TLB. */
else if ( !(flags & HVM_UPDATE_GUEST_CR3_NOFLUSH) )
- hvm_asid_flush_vcpu_asid(
- nestedhvm_vcpu_in_guestmode(v)
- ? &vcpu_nestedhvm(v).nv_n2asid : &v->arch.hvm.n1asid);
+ {
+ if ( nestedhvm_vcpu_in_guestmode(v) )
+ vcpu_nestedhvm(v).nv_flushp2m = true;
+ else
+ v->needs_tlb_flush = true;
+ }
break;
case 4:
value = HVM_CR4_HOST_MASK;
@@ -991,8 +995,7 @@ static void noreturn cf_check svm_do_resume(void)
v->arch.hvm.svm.launch_core = smp_processor_id();
hvm_migrate_timers(v);
hvm_migrate_pirqs(v);
- /* Migrating to another ASID domain. Request a new ASID. */
- hvm_asid_flush_vcpu(v);
+ v->needs_tlb_flush = true;
}
if ( !vcpu_guestmode && !vlapic_hw_disabled(vlapic) )
@@ -1019,13 +1022,14 @@ void asmlinkage svm_vmenter_helper(void)
ASSERT(hvmemul_cache_disabled(curr));
- svm_asid_handle_vmrun();
-
TRACE_TIME(TRC_HVM_VMENTRY |
(nestedhvm_vcpu_in_guestmode(curr) ? TRC_HVM_NESTEDFLAG : 0));
svm_sync_vmcb(curr, vmcb_needs_vmsave);
+ if ( test_and_clear_bool(curr->needs_tlb_flush) )
+ svm_vcpu_set_tlb_control(curr);
+
vmcb->rax = regs->rax;
vmcb->rip = regs->rip;
vmcb->rsp = regs->rsp;
@@ -1146,6 +1150,8 @@ static int cf_check svm_vcpu_initialise(struct vcpu *v)
return rc;
}
+ svm_vcpu_assign_asid(v);
+
return 0;
}
@@ -1572,9 +1578,6 @@ static int _svm_cpu_up(bool bsp)
/* check for erratum 383 */
svm_init_erratum_383(c);
- /* Initialize core's ASID handling. */
- svm_asid_init(c);
-
/* Initialize OSVW bits to be used by guests */
svm_host_osvw_init();
@@ -2338,7 +2341,7 @@ static void svm_invlpga_intercept(
{
svm_invlpga(linear,
(asid == 0)
- ? v->arch.hvm.n1asid.asid
+ ? v->domain->arch.hvm.asid.asid
: vcpu_nestedhvm(v).nv_n2asid.asid);
}
@@ -2360,8 +2363,8 @@ static bool cf_check is_invlpg(
static void cf_check svm_invlpg(struct vcpu *v, unsigned long linear)
{
- /* Safe fallback. Take a new ASID. */
- hvm_asid_flush_vcpu(v);
+ /* Schedule a tlb flush on the VCPU. */
+ v->needs_tlb_flush = true;
}
static bool cf_check svm_get_pending_event(
@@ -2528,6 +2531,8 @@ const struct hvm_function_table * __init start_svm(void)
svm_function_table.caps.hap_superpage_2mb = true;
svm_function_table.caps.hap_superpage_1gb = cpu_has_page1gb;
+ svm_asid_init();
+
return &svm_function_table;
}
@@ -2584,6 +2589,8 @@ void asmlinkage svm_vmexit_handler(void)
(vlapic_get_reg(vlapic, APIC_TASKPRI) & 0x0F));
}
+ svm_vcpu_clear_tlb_control(v);
+
exit_reason = vmcb->exitcode;
if ( hvm_long_mode_active(v) )
@@ -2659,7 +2666,7 @@ void asmlinkage svm_vmexit_handler(void)
}
}
- if ( unlikely(exit_reason == VMEXIT_INVALID) )
+ if ( unlikely(exit_reason == VMEXIT_INVALID || exit_reason == (uint32_t)VMEXIT_INVALID) )
{
gdprintk(XENLOG_ERR, "invalid VMCB state:\n");
svm_vmcb_dump(__func__, vmcb);
diff --git a/xen/arch/x86/hvm/svm/svm.h b/xen/arch/x86/hvm/svm/svm.h
index f5b0312d2d..92145c6d7b 100644
--- a/xen/arch/x86/hvm/svm/svm.h
+++ b/xen/arch/x86/hvm/svm/svm.h
@@ -12,12 +12,8 @@
#include <xen/types.h>
struct cpu_user_regs;
-struct cpuinfo_x86;
struct vcpu;
-void svm_asid_init(const struct cpuinfo_x86 *c);
-void svm_asid_handle_vmrun(void);
-
unsigned long *svm_msrbit(unsigned long *msr_bitmap, uint32_t msr);
void __update_guest_eip(struct cpu_user_regs *regs, unsigned int inst_len);
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index a44475ae15..b8a28e2cf8 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -20,6 +20,7 @@
#include <asm/current.h>
#include <asm/flushtlb.h>
#include <asm/hvm/hvm.h>
+#include <asm/hvm/asid.h>
#include <asm/hvm/io.h>
#include <asm/hvm/nestedhvm.h>
#include <asm/hvm/vmx/vmcs.h>
@@ -759,8 +760,6 @@ static int _vmx_cpu_up(bool bsp)
this_cpu(vmxon) = 1;
- hvm_asid_init(cpu_has_vmx_vpid ? (1u << VMCS_VPID_WIDTH) : 0);
-
if ( cpu_has_vmx_ept )
ept_sync_all();
@@ -1941,7 +1940,7 @@ void cf_check vmx_do_resume(void)
*/
v->arch.hvm.vmx.hostenv_migrated = 1;
- hvm_asid_flush_vcpu(v);
+ v->needs_tlb_flush = true;
}
debug_state = v->domain->debugger_attached
@@ -2154,7 +2153,6 @@ void vmcs_dump_vcpu(struct vcpu *v)
(SECONDARY_EXEC_ENABLE_VPID | SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
printk("Virtual processor ID = 0x%04x VMfunc controls = %016lx\n",
vmr16(VIRTUAL_PROCESSOR_ID), vmr(VM_FUNCTION_CONTROL));
-
vmx_vmcs_exit(v);
}
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 827db6bdd8..8859ec4b38 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -37,6 +37,7 @@
#include <asm/x86_emulate.h>
#include <asm/hvm/vpt.h>
#include <public/hvm/save.h>
+#include <asm/hvm/asid.h>
#include <asm/hvm/monitor.h>
#include <asm/xenoprof.h>
#include <asm/gdbsx.h>
@@ -824,6 +825,18 @@ static void cf_check vmx_cpuid_policy_changed(struct vcpu *v)
vmx_update_secondary_exec_control(v);
}
+ if ( asid_enabled )
+ {
+ v->arch.hvm.vmx.secondary_exec_control |= SECONDARY_EXEC_ENABLE_VPID;
+ vmx_update_secondary_exec_control(v);
+ }
+ else
+ {
+ v->arch.hvm.vmx.secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VPID;
+ vmx_update_secondary_exec_control(v);
+ }
+
+
/*
* We can safely pass MSR_SPEC_CTRL through to the guest, even if STIBP
* isn't enumerated in hardware, as SPEC_CTRL_STIBP is ignored.
@@ -1477,7 +1490,7 @@ static void cf_check vmx_handle_cd(struct vcpu *v, unsigned long value)
vmx_set_msr_intercept(v, MSR_IA32_CR_PAT, VMX_MSR_RW);
wbinvd(); /* flush possibly polluted cache */
- hvm_asid_flush_vcpu(v); /* invalidate memory type cached in TLB */
+ v->needs_tlb_flush = true; /* invalidate memory type cached in TLB */
v->arch.hvm.cache_mode = NO_FILL_CACHE_MODE;
}
else
@@ -1486,7 +1499,7 @@ static void cf_check vmx_handle_cd(struct vcpu *v, unsigned long value)
vmx_set_guest_pat(v, *pat);
if ( !is_iommu_enabled(v->domain) || iommu_snoop )
vmx_clear_msr_intercept(v, MSR_IA32_CR_PAT, VMX_MSR_RW);
- hvm_asid_flush_vcpu(v); /* no need to flush cache */
+ v->needs_tlb_flush = true;
}
}
}
@@ -1847,7 +1860,7 @@ static void cf_check vmx_update_guest_cr(
__vmwrite(GUEST_CR3, v->arch.hvm.hw_cr[3]);
if ( !(flags & HVM_UPDATE_GUEST_CR3_NOFLUSH) )
- hvm_asid_flush_vcpu(v);
+ v->needs_tlb_flush = true;
break;
default:
@@ -3128,6 +3141,8 @@ const struct hvm_function_table * __init start_vmx(void)
lbr_tsx_fixup_check();
ler_to_fixup_check();
+ BUG_ON(hvm_asid_init(cpu_has_vmx_vpid ? (1u << VMCS_VPID_WIDTH) : 1));
+
return &vmx_function_table;
}
@@ -4901,9 +4916,7 @@ bool asmlinkage vmx_vmenter_helper(const struct cpu_user_regs *regs)
{
struct vcpu *curr = current;
struct domain *currd = curr->domain;
- u32 new_asid, old_asid;
- struct hvm_vcpu_asid *p_asid;
- bool need_flush;
+ struct hvm_asid *p_asid;
ASSERT(hvmemul_cache_disabled(curr));
@@ -4914,38 +4927,14 @@ bool asmlinkage vmx_vmenter_helper(const struct cpu_user_regs *regs)
if ( curr->domain->arch.hvm.pi_ops.vcpu_block )
vmx_pi_do_resume(curr);
- if ( !cpu_has_vmx_vpid )
+ if ( !asid_enabled )
goto out;
if ( nestedhvm_vcpu_in_guestmode(curr) )
p_asid = &vcpu_nestedhvm(curr).nv_n2asid;
else
- p_asid = &curr->arch.hvm.n1asid;
-
- old_asid = p_asid->asid;
- need_flush = hvm_asid_handle_vmenter(p_asid);
- new_asid = p_asid->asid;
-
- if ( unlikely(new_asid != old_asid) )
- {
- __vmwrite(VIRTUAL_PROCESSOR_ID, new_asid);
- if ( !old_asid && new_asid )
- {
- /* VPID was disabled: now enabled. */
- curr->arch.hvm.vmx.secondary_exec_control |=
- SECONDARY_EXEC_ENABLE_VPID;
- vmx_update_secondary_exec_control(curr);
- }
- else if ( old_asid && !new_asid )
- {
- /* VPID was enabled: now disabled. */
- curr->arch.hvm.vmx.secondary_exec_control &=
- ~SECONDARY_EXEC_ENABLE_VPID;
- vmx_update_secondary_exec_control(curr);
- }
- }
+ p_asid = &currd->arch.hvm.asid;
- if ( unlikely(need_flush) )
- vpid_sync_all();
+ __vmwrite(VIRTUAL_PROCESSOR_ID, p_asid->asid);
if ( paging_mode_hap(curr->domain) )
{
@@ -4954,12 +4943,18 @@ bool asmlinkage vmx_vmenter_helper(const struct cpu_user_regs *regs)
unsigned int inv = 0; /* None => Single => All */
struct ept_data *single = NULL; /* Single eptp, iff inv == 1 */
+ if ( test_and_clear_bool(curr->needs_tlb_flush) )
+ {
+ inv = 1;
+ single = ept;
+ }
+
if ( cpumask_test_cpu(cpu, ept->invalidate) )
{
cpumask_clear_cpu(cpu, ept->invalidate);
/* Automatically invalidate all contexts if nested. */
- inv += 1 + nestedhvm_enabled(currd);
+ inv = 1 + nestedhvm_enabled(currd);
single = ept;
}
@@ -4986,6 +4981,11 @@ bool asmlinkage vmx_vmenter_helper(const struct cpu_user_regs *regs)
__invept(inv == 1 ? INVEPT_SINGLE_CONTEXT : INVEPT_ALL_CONTEXT,
inv == 1 ? single->eptp : 0);
}
+ else /* Shadow paging */
+ {
+ if ( test_and_clear_bool(curr->needs_tlb_flush) )
+ vpid_sync_vcpu_context(curr);
+ }
out:
if ( unlikely(curr->arch.hvm.vmx.lbr_flags & LBR_FIXUP_MASK) )
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index ceb5e5a322..fa84ee4e8f 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -12,6 +12,7 @@
#include <asm/mtrr.h>
#include <asm/p2m.h>
+#include <asm/hvm/hvm.h>
#include <asm/hvm/support.h>
#include <asm/hvm/vmx/vmx.h>
#include <asm/hvm/vmx/vvmx.h>
@@ -1254,7 +1255,7 @@ static void virtual_vmentry(struct cpu_user_regs *regs)
if ( nvmx->guest_vpid != new_vpid )
{
- hvm_asid_flush_vcpu_asid(&vcpu_nestedhvm(v).nv_n2asid);
+ v->needs_tlb_flush = true;
nvmx->guest_vpid = new_vpid;
}
}
@@ -2055,7 +2056,7 @@ static int nvmx_handle_invvpid(struct cpu_user_regs *regs)
case INVVPID_INDIVIDUAL_ADDR:
case INVVPID_SINGLE_CONTEXT:
case INVVPID_ALL_CONTEXT:
- hvm_asid_flush_vcpu_asid(&vcpu_nestedhvm(current).nv_n2asid);
+ hvm_flush_tlb(NULL);
break;
default:
vmfail(regs, VMX_INSN_INVEPT_INVVPID_INVALID_OP);
diff --git a/xen/arch/x86/include/asm/hvm/asid.h b/xen/arch/x86/include/asm/hvm/asid.h
index 17c58353d1..13ea357f70 100644
--- a/xen/arch/x86/include/asm/hvm/asid.h
+++ b/xen/arch/x86/include/asm/hvm/asid.h
@@ -8,25 +8,21 @@
#ifndef __ASM_X86_HVM_ASID_H__
#define __ASM_X86_HVM_ASID_H__
+#include <xen/stdbool.h>
+#include <xen/stdint.h>
-struct vcpu;
-struct hvm_vcpu_asid;
+struct hvm_asid {
+ uint32_t asid;
+};
-/* Initialise ASID management for the current physical CPU. */
-void hvm_asid_init(int nasids);
+extern bool asid_enabled;
-/* Invalidate a particular ASID allocation: forces re-allocation. */
-void hvm_asid_flush_vcpu_asid(struct hvm_vcpu_asid *asid);
+/* Initialise ASID management distributed across all CPUs. */
+int hvm_asid_init(unsigned long nasids);
-/* Invalidate all ASID allocations for specified VCPU: forces re-allocation. */
-void hvm_asid_flush_vcpu(struct vcpu *v);
-
-/* Flush all ASIDs on this processor core. */
-void hvm_asid_flush_core(void);
-
-/* Called before entry to guest context. Checks ASID allocation, returns a
- * boolean indicating whether all ASIDs must be flushed. */
-bool hvm_asid_handle_vmenter(struct hvm_vcpu_asid *asid);
+int hvm_asid_alloc(struct hvm_asid *asid);
+int hvm_asid_alloc_range(struct hvm_asid *asid, unsigned long min, unsigned long max);
+void hvm_asid_free(struct hvm_asid *asid);
#endif /* __ASM_X86_HVM_ASID_H__ */
diff --git a/xen/arch/x86/include/asm/hvm/domain.h b/xen/arch/x86/include/asm/hvm/domain.h
index 2608bcfad2..5fb37d342b 100644
--- a/xen/arch/x86/include/asm/hvm/domain.h
+++ b/xen/arch/x86/include/asm/hvm/domain.h
@@ -141,6 +141,7 @@ struct hvm_domain {
} write_map;
struct hvm_pi_ops pi_ops;
+ struct hvm_asid asid;
union {
struct vmx_domain vmx;
diff --git a/xen/arch/x86/include/asm/hvm/hvm.h b/xen/arch/x86/include/asm/hvm/hvm.h
index bf8bc2e100..7af111cb39 100644
--- a/xen/arch/x86/include/asm/hvm/hvm.h
+++ b/xen/arch/x86/include/asm/hvm/hvm.h
@@ -268,6 +268,8 @@ int hvm_domain_initialise(struct domain *d,
void hvm_domain_relinquish_resources(struct domain *d);
void hvm_domain_destroy(struct domain *d);
+int hvm_flush_tlb(const unsigned long *vcpu_bitmap);
+
int hvm_vcpu_initialise(struct vcpu *v);
void hvm_vcpu_destroy(struct vcpu *v);
void hvm_vcpu_down(struct vcpu *v);
@@ -483,17 +485,6 @@ static inline void hvm_set_tsc_offset(struct vcpu *v, uint64_t offset,
alternative_vcall(hvm_funcs.set_tsc_offset, v, offset, at_tsc);
}
-/*
- * Called to ensure than all guest-specific mappings in a tagged TLB are
- * flushed; does *not* flush Xen's TLB entries, and on processors without a
- * tagged TLB it will be a noop.
- */
-static inline void hvm_flush_guest_tlbs(void)
-{
- if ( hvm_enabled )
- hvm_asid_flush_core();
-}
-
static inline unsigned int
hvm_get_cpl(struct vcpu *v)
{
@@ -881,8 +872,6 @@ static inline int hvm_cpu_up(void)
static inline void hvm_cpu_down(void) {}
-static inline void hvm_flush_guest_tlbs(void) {}
-
static inline void hvm_invlpg(const struct vcpu *v, unsigned long linear)
{
ASSERT_UNREACHABLE();
diff --git a/xen/arch/x86/include/asm/hvm/svm/svm.h b/xen/arch/x86/include/asm/hvm/svm/svm.h
index 32f6e48e30..1254e5f3ee 100644
--- a/xen/arch/x86/include/asm/hvm/svm/svm.h
+++ b/xen/arch/x86/include/asm/hvm/svm/svm.h
@@ -9,6 +9,11 @@
#ifndef __ASM_X86_HVM_SVM_H__
#define __ASM_X86_HVM_SVM_H__
+void svm_asid_init(void);
+void svm_vcpu_assign_asid(struct vcpu *v);
+void svm_vcpu_set_tlb_control(struct vcpu *v);
+void svm_vcpu_clear_tlb_control(struct vcpu *v);
+
/*
* PV context switch helpers. Prefetching the VMCB area itself has been shown
* to be useful for performance.
diff --git a/xen/arch/x86/include/asm/hvm/vcpu.h b/xen/arch/x86/include/asm/hvm/vcpu.h
index 196fed6d5d..960bea6734 100644
--- a/xen/arch/x86/include/asm/hvm/vcpu.h
+++ b/xen/arch/x86/include/asm/hvm/vcpu.h
@@ -9,6 +9,7 @@
#define __ASM_X86_HVM_VCPU_H__
#include <xen/tasklet.h>
+#include <asm/hvm/asid.h>
#include <asm/hvm/vlapic.h>
#include <asm/hvm/vmx/vmcs.h>
#include <asm/hvm/vmx/vvmx.h>
@@ -17,11 +18,6 @@
#include <asm/mtrr.h>
#include <public/hvm/ioreq.h>
-struct hvm_vcpu_asid {
- uint64_t generation;
- uint32_t asid;
-};
-
struct hvm_vcpu_io {
/*
* HVM emulation:
@@ -79,7 +75,7 @@ struct nestedvcpu {
bool stale_np2m; /* True when p2m_base in VMCx02 is no longer valid */
uint64_t np2m_generation;
- struct hvm_vcpu_asid nv_n2asid;
+ struct hvm_asid nv_n2asid;
bool nv_vmentry_pending;
bool nv_vmexit_pending;
@@ -141,8 +137,6 @@ struct hvm_vcpu {
/* (MFN) hypervisor page table */
pagetable_t monitor_table;
- struct hvm_vcpu_asid n1asid;
-
u64 msr_tsc_adjust;
union {
diff --git a/xen/arch/x86/include/asm/hvm/vmx/vmx.h b/xen/arch/x86/include/asm/hvm/vmx/vmx.h
index a55a31b42d..cae3613a61 100644
--- a/xen/arch/x86/include/asm/hvm/vmx/vmx.h
+++ b/xen/arch/x86/include/asm/hvm/vmx/vmx.h
@@ -462,14 +462,10 @@ static inline void vpid_sync_vcpu_context(struct vcpu *v)
if ( likely(cpu_has_vmx_vpid_invvpid_single_context) )
goto execute_invvpid;
- /*
- * If single context invalidation is not supported, we escalate to
- * use all context invalidation.
- */
type = INVVPID_ALL_CONTEXT;
execute_invvpid:
- __invvpid(type, v->arch.hvm.n1asid.asid, (u64)gva);
+ __invvpid(type, v->domain->arch.hvm.asid.asid, 0);
}
static inline void vpid_sync_vcpu_gva(struct vcpu *v, unsigned long gva)
@@ -493,7 +489,7 @@ static inline void vpid_sync_vcpu_gva(struct vcpu *v, unsigned long gva)
type = INVVPID_ALL_CONTEXT;
execute_invvpid:
- __invvpid(type, v->arch.hvm.n1asid.asid, (u64)gva);
+ __invvpid(type, v->domain->arch.hvm.asid.asid, (u64)gva);
}
static inline void vpid_sync_all(void)
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index ec5043a8aa..cf7ca1702a 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -27,6 +27,7 @@
#include <asm/p2m.h>
#include <asm/domain.h>
#include <xen/numa.h>
+#include <asm/hvm/asid.h>
#include <asm/hvm/nestedhvm.h>
#include <public/sched.h>
@@ -739,7 +740,7 @@ static bool cf_check flush_tlb(const unsigned long *vcpu_bitmap)
if ( !flush_vcpu(v, vcpu_bitmap) )
continue;
- hvm_asid_flush_vcpu(v);
+ v->needs_tlb_flush = true;
cpu = read_atomic(&v->dirty_cpu);
if ( cpu != this_cpu && is_vcpu_dirty_cpu(cpu) && v->is_running )
@@ -748,9 +749,7 @@ static bool cf_check flush_tlb(const unsigned long *vcpu_bitmap)
/*
* Trigger a vmexit on all pCPUs with dirty vCPU state in order to force an
- * ASID/VPID change and hence accomplish a guest TLB flush. Note that vCPUs
- * not currently running will already be flushed when scheduled because of
- * the ASID tickle done in the loop above.
+ * ASID/VPID flush and hence accomplish a guest TLB flush.
*/
on_selected_cpus(mask, NULL, NULL, 0);
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 3a39b5d124..04b41cee12 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -25,6 +25,7 @@
#include <asm/p2m.h>
#include <asm/mem_sharing.h>
#include <asm/hvm/nestedhvm.h>
+#include <asm/hvm/vcpu.h>
#include <asm/altp2m.h>
#include <asm/vm_event.h>
#include <xsm/xsm.h>
@@ -1405,7 +1406,7 @@ p2m_flush(struct vcpu *v, struct p2m_domain *p2m)
ASSERT(v->domain == p2m->domain);
vcpu_nestedhvm(v).nv_p2m = NULL;
p2m_flush_table(p2m);
- hvm_asid_flush_vcpu(v);
+ v->needs_tlb_flush = true;
}
void
@@ -1464,7 +1465,7 @@ static void assign_np2m(struct vcpu *v, struct p2m_domain *p2m)
static void nvcpu_flush(struct vcpu *v)
{
- hvm_asid_flush_vcpu(v);
+ v->needs_tlb_flush = true;
vcpu_nestedhvm(v).stale_np2m = true;
}
@@ -1584,7 +1585,7 @@ void np2m_schedule(int dir)
if ( !np2m_valid )
{
/* This vCPU's np2m was flushed while it was not runnable */
- hvm_asid_flush_core();
+ curr->needs_tlb_flush = true; /* TODO: Is this OK? */
vcpu_nestedhvm(curr).nv_p2m = NULL;
}
else
diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
index c77f4c1dac..26b6ce9e9b 100644
--- a/xen/arch/x86/mm/paging.c
+++ b/xen/arch/x86/mm/paging.c
@@ -964,7 +964,7 @@ void paging_update_nestedmode(struct vcpu *v)
else
/* TODO: shadow-on-shadow */
v->arch.paging.nestedmode = NULL;
- hvm_asid_flush_vcpu(v);
+ v->needs_tlb_flush = true;
}
int __init paging_set_allocation(struct domain *d, unsigned int pages,
diff --git a/xen/arch/x86/mm/shadow/hvm.c b/xen/arch/x86/mm/shadow/hvm.c
index 114957a3e1..f98591f976 100644
--- a/xen/arch/x86/mm/shadow/hvm.c
+++ b/xen/arch/x86/mm/shadow/hvm.c
@@ -737,6 +737,7 @@ bool cf_check shadow_flush_tlb(const unsigned long *vcpu_bitmap)
continue;
paging_update_cr3(v, false);
+ v->needs_tlb_flush = true;
cpu = read_atomic(&v->dirty_cpu);
if ( is_vcpu_dirty_cpu(cpu) )
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 7be9c180ec..0e2865286e 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3144,6 +3144,7 @@ sh_update_linear_entries(struct vcpu *v)
* without this change, it would fetch the wrong value due to a stale TLB.
*/
sh_flush_local(d);
+ v->needs_tlb_flush = true;
}
static pagetable_t cf_check sh_update_cr3(struct vcpu *v, bool noflush)
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 4ce9253284..f2f5a98534 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -228,6 +228,8 @@ struct vcpu
bool defer_shutdown;
/* VCPU is paused following shutdown request (d->is_shutting_down)? */
bool paused_for_shutdown;
+ /* VCPU needs its TLB flushed before waking. */
+ bool needs_tlb_flush;
/* VCPU need affinity restored */
uint8_t affinity_broken;
#define VCPU_AFFINITY_OVERRIDE 0x01
--
2.49.0
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
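For readers following the thread, the deferred-flush pattern used in patch 07 (set `needs_tlb_flush` anywhere, consume it with a test-and-clear right before VM entry, reset the TLB control field on VM exit) can be modelled outside Xen roughly as below. This is a standalone sketch under stated assumptions, not Xen code: `toy_vcpu`, `toy_test_and_clear`, `toy_vmenter`, `toy_vmexit` and the `TOY_TLB_*` values are hypothetical names invented for illustration, and the real VMCB handling is more involved.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the VMCB TLB control values. */
enum toy_tlb_ctrl { TOY_TLB_NO_FLUSH, TOY_TLB_FLUSH_ASID };

/* Hypothetical, minimal model of the per-vCPU state involved. */
struct toy_vcpu {
    bool needs_tlb_flush;          /* set by any path that wants a flush */
    enum toy_tlb_ctrl tlb_control; /* consumed by "hardware" at VM entry */
    unsigned int flushes;          /* counts flushes actually requested */
};

/* Return the old value of the flag and clear it (non-atomic in this toy). */
static bool toy_test_and_clear(bool *flag)
{
    bool old = *flag;

    *flag = false;
    return old;
}

/* Just before VM entry: turn a pending request into a TLB control setting. */
static void toy_vmenter(struct toy_vcpu *v)
{
    if ( toy_test_and_clear(&v->needs_tlb_flush) )
    {
        v->tlb_control = TOY_TLB_FLUSH_ASID;
        v->flushes++;
    }
}

/* On VM exit: reset the control so the next entry does not flush again. */
static void toy_vmexit(struct toy_vcpu *v)
{
    v->tlb_control = TOY_TLB_NO_FLUSH;
}
```

In this model, several requests made between two entries collapse into a single flush, and an entry with no pending request leaves the control at `TOY_TLB_NO_FLUSH`.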
* [RFC PATCH 08/16] x86/crypto: Introduce AMD PSP driver for SEV
2025-05-16 9:31 [RFC PATCH 00/16] Confidential computing and AMD SEV support Teddy Astie
` (6 preceding siblings ...)
2025-05-16 10:22 ` [RFC PATCH 07/16] x86/hvm: Introduce Xen-wide ASID allocator Teddy Astie
@ 2025-05-16 10:22 ` Teddy Astie
2025-05-16 10:23 ` [RFC PATCH 09/16] common: Introduce confidential computing infrastructure Teddy Astie
` (8 subsequent siblings)
16 siblings, 0 replies; 22+ messages in thread
From: Teddy Astie @ 2025-05-16 10:22 UTC (permalink / raw)
To: xen-devel
Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné,
Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini,
Andrei Semenov
From: Andrei Semenov <andrei.semenov@vates.tech>
Introduce a basic PSP driver with focus on SEV commands.
Signed-off-by: Andrei Semenov <andrei.semenov@vates.tech>
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
xen/arch/x86/include/asm/psp-sev.h | 655 +++++++++++++++++++++++
xen/drivers/Kconfig | 2 +
xen/drivers/Makefile | 1 +
xen/drivers/crypto/Kconfig | 10 +
xen/drivers/crypto/Makefile | 1 +
xen/drivers/crypto/asp.c | 830 +++++++++++++++++++++++++++++
6 files changed, 1499 insertions(+)
create mode 100644 xen/arch/x86/include/asm/psp-sev.h
create mode 100644 xen/drivers/crypto/Kconfig
create mode 100644 xen/drivers/crypto/Makefile
create mode 100644 xen/drivers/crypto/asp.c
diff --git a/xen/arch/x86/include/asm/psp-sev.h b/xen/arch/x86/include/asm/psp-sev.h
new file mode 100644
index 0000000000..5bbe1ed2c0
--- /dev/null
+++ b/xen/arch/x86/include/asm/psp-sev.h
@@ -0,0 +1,655 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * AMD Secure Encrypted Virtualization (SEV) driver interface
+ *
+ * Copyright (C) 2016-2017 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * SEV API spec is available at https://developer.amd.com/sev
+ */
+
+#ifndef __PSP_SEV_H__
+#define __PSP_SEV_H__
+
+#include <xen/types.h>
+
+/**
+ * SEV platform and guest management commands
+ */
+enum sev_cmd {
+ /* Platform commands */
+ SEV_CMD_INIT = 0x001,
+ SEV_CMD_SHUTDOWN = 0x002,
+ SEV_CMD_FACTORY_RESET = 0x003,
+ SEV_CMD_PLATFORM_STATUS = 0x004,
+ SEV_CMD_PEK_GEN = 0x005,
+ SEV_CMD_PEK_CSR = 0x006,
+ SEV_CMD_PEK_CERT_IMPORT = 0x007,
+ SEV_CMD_PDH_CERT_EXPORT = 0x008,
+ SEV_CMD_PDH_GEN = 0x009,
+ SEV_CMD_DF_FLUSH = 0x00A,
+ SEV_CMD_DOWNLOAD_FIRMWARE = 0x00B,
+ SEV_CMD_GET_ID = 0x00C,
+ SEV_CMD_INIT_EX = 0x00D,
+
+ /* Guest commands */
+ SEV_CMD_DECOMMISSION = 0x020,
+ SEV_CMD_ACTIVATE = 0x021,
+ SEV_CMD_DEACTIVATE = 0x022,
+ SEV_CMD_GUEST_STATUS = 0x023,
+
+ /* Guest launch commands */
+ SEV_CMD_LAUNCH_START = 0x030,
+ SEV_CMD_LAUNCH_UPDATE_DATA = 0x031,
+ SEV_CMD_LAUNCH_UPDATE_VMSA = 0x032,
+ SEV_CMD_LAUNCH_MEASURE = 0x033,
+ SEV_CMD_LAUNCH_UPDATE_SECRET = 0x034,
+ SEV_CMD_LAUNCH_FINISH = 0x035,
+ SEV_CMD_ATTESTATION_REPORT = 0x036,
+
+ /* Guest migration commands (outgoing) */
+ SEV_CMD_SEND_START = 0x040,
+ SEV_CMD_SEND_UPDATE_DATA = 0x041,
+ SEV_CMD_SEND_UPDATE_VMSA = 0x042,
+ SEV_CMD_SEND_FINISH = 0x043,
+ SEV_CMD_SEND_CANCEL = 0x044,
+
+ /* Guest migration commands (incoming) */
+ SEV_CMD_RECEIVE_START = 0x050,
+ SEV_CMD_RECEIVE_UPDATE_DATA = 0x051,
+ SEV_CMD_RECEIVE_UPDATE_VMSA = 0x052,
+ SEV_CMD_RECEIVE_FINISH = 0x053,
+
+ /* Guest debug commands */
+ SEV_CMD_DBG_DECRYPT = 0x060,
+ SEV_CMD_DBG_ENCRYPT = 0x061,
+
+ SEV_CMD_MAX,
+};
+
+/**
+ * struct sev_data_init - INIT command parameters
+ *
+ * @flags: processing flags
+ * @tmr_address: system physical address used for SEV-ES
+ * @tmr_len: len of tmr_address
+ */
+struct sev_data_init {
+ uint32_t flags; /* In */
+ uint32_t reserved; /* In */
+ uint64_t tmr_address; /* In */
+ uint32_t tmr_len; /* In */
+} __packed;
+
+/**
+ * struct sev_data_init_ex - INIT_EX command parameters
+ *
+ * @length: len of the command buffer read by the PSP
+ * @flags: processing flags
+ * @tmr_address: system physical address used for SEV-ES
+ * @tmr_len: len of tmr_address
+ * @nv_address: system physical address used for PSP NV storage
+ * @nv_len: len of nv_address
+ */
+struct sev_data_init_ex {
+ uint32_t length; /* In */
+ uint32_t flags; /* In */
+ uint64_t tmr_address; /* In */
+ uint32_t tmr_len; /* In */
+ uint32_t reserved; /* In */
+ uint64_t nv_address; /* In/Out */
+ uint32_t nv_len; /* In */
+} __packed;
+
+#define SEV_INIT_FLAGS_SEV_ES 0x01
+
+/**
+ * struct sev_data_pek_csr - PEK_CSR command parameters
+ *
+ * @address: PEK certificate chain
+ * @len: len of certificate
+ */
+struct sev_data_pek_csr {
+ uint64_t address; /* In */
+ uint32_t len; /* In/Out */
+} __packed;
+
+/**
+ * struct sev_data_pek_cert_import - PEK_CERT_IMPORT command parameters
+ *
+ * @pek_cert_address: PEK certificate chain
+ * @pek_cert_len: len of PEK certificate
+ * @oca_cert_address: OCA certificate chain
+ * @oca_cert_len: len of OCA certificate
+ */
+struct sev_data_pek_cert_import {
+ uint64_t pek_cert_address; /* In */
+ uint32_t pek_cert_len; /* In */
+ uint32_t reserved; /* In */
+ uint64_t oca_cert_address; /* In */
+ uint32_t oca_cert_len; /* In */
+} __packed;
+
+/**
+ * struct sev_data_download_firmware - DOWNLOAD_FIRMWARE command parameters
+ *
+ * @address: physical address of firmware image
+ * @len: len of the firmware image
+ */
+struct sev_data_download_firmware {
+ uint64_t address; /* In */
+ uint32_t len; /* In */
+} __packed;
+
+/**
+ * struct sev_data_get_id - GET_ID command parameters
+ *
+ * @address: physical address of region to place unique CPU ID(s)
+ * @len: len of the region
+ */
+struct sev_data_get_id {
+ uint64_t address; /* In */
+ uint32_t len; /* In/Out */
+} __packed;
+/**
+ * struct sev_data_pdh_cert_export - PDH_CERT_EXPORT command parameters
+ *
+ * @pdh_cert_address: PDH certificate address
+ * @pdh_cert_len: len of PDH certificate
+ * @cert_chain_address: PDH certificate chain
+ * @cert_chain_len: len of PDH certificate chain
+ */
+struct sev_data_pdh_cert_export {
+ uint64_t pdh_cert_address; /* In */
+ uint32_t pdh_cert_len; /* In/Out */
+ uint32_t reserved; /* In */
+ uint64_t cert_chain_address; /* In */
+ uint32_t cert_chain_len; /* In/Out */
+} __packed;
+
+/**
+ * struct sev_data_decommission - DECOMMISSION command parameters
+ *
+ * @handle: handle of the VM to decommission
+ */
+struct sev_data_decommission {
+ uint32_t handle; /* In */
+} __packed;
+
+/**
+ * struct sev_data_activate - ACTIVATE command parameters
+ *
+ * @handle: handle of the VM to activate
+ * @asid: asid assigned to the VM
+ */
+struct sev_data_activate {
+ uint32_t handle; /* In */
+ uint32_t asid; /* In */
+} __packed;
+
+/**
+ * struct sev_data_deactivate - DEACTIVATE command parameters
+ *
+ * @handle: handle of the VM to deactivate
+ */
+struct sev_data_deactivate {
+ uint32_t handle; /* In */
+} __packed;
+
+/**
+ * struct sev_data_guest_status - SEV GUEST_STATUS command parameters
+ *
+ * @handle: handle of the VM to retrieve status
+ * @policy: policy information for the VM
+ * @asid: current ASID of the VM
+ * @state: current state of the VM
+ */
+struct sev_data_guest_status {
+ uint32_t handle; /* In */
+ uint32_t policy; /* Out */
+ uint32_t asid; /* Out */
+ uint8_t state; /* Out */
+} __packed;
+
+/**
+ * struct sev_data_launch_start - LAUNCH_START command parameters
+ *
+ * @handle: handle assigned to the VM
+ * @policy: guest launch policy
+ * @dh_cert_address: physical address of DH certificate blob
+ * @dh_cert_len: len of DH certificate blob
+ * @session_address: physical address of session parameters
+ * @session_len: len of session parameters
+ */
+struct sev_data_launch_start {
+ uint32_t handle; /* In/Out */
+ uint32_t policy; /* In */
+ uint64_t dh_cert_address; /* In */
+ uint32_t dh_cert_len; /* In */
+ uint32_t reserved; /* In */
+ uint64_t session_address; /* In */
+ uint32_t session_len; /* In */
+} __packed;
+
+/**
+ * struct sev_data_launch_update_data - LAUNCH_UPDATE_DATA command parameters
+ *
+ * @handle: handle of the VM to update
+ * @len: len of memory to be encrypted
+ * @address: physical address of memory region to encrypt
+ */
+struct sev_data_launch_update_data {
+ uint32_t handle; /* In */
+ uint32_t reserved;
+ uint64_t address; /* In */
+ uint32_t len; /* In */
+} __packed;
+
+/**
+ * struct sev_data_launch_update_vmsa - LAUNCH_UPDATE_VMSA command
+ *
+ * @handle: handle of the VM
+ * @address: physical address of memory region to encrypt
+ * @len: len of memory region to encrypt
+ */
+struct sev_data_launch_update_vmsa {
+ uint32_t handle; /* In */
+ uint32_t reserved;
+ uint64_t address; /* In */
+ uint32_t len; /* In */
+} __packed;
+
+/**
+ * struct sev_data_launch_measure - LAUNCH_MEASURE command parameters
+ *
+ * @handle: handle of the VM to process
+ * @address: physical address containing the measurement blob
+ * @len: len of measurement blob
+ */
+struct sev_data_launch_measure {
+ uint32_t handle; /* In */
+ uint32_t reserved;
+ uint64_t address; /* In */
+ uint32_t len; /* In/Out */
+} __packed;
+
+/**
+ * struct sev_data_launch_secret - LAUNCH_SECRET command parameters
+ *
+ * @handle: handle of the VM to process
+ * @hdr_address: physical address containing the packet header
+ * @hdr_len: len of packet header
+ * @guest_address: system physical address of guest memory region
+ * @guest_len: len of guest_address
+ * @trans_address: physical address of transport memory buffer
+ * @trans_len: len of transport memory buffer
+ */
+struct sev_data_launch_secret {
+ uint32_t handle; /* In */
+ uint32_t reserved1;
+ uint64_t hdr_address; /* In */
+ uint32_t hdr_len; /* In */
+ uint32_t reserved2;
+ uint64_t guest_address; /* In */
+ uint32_t guest_len; /* In */
+ uint32_t reserved3;
+ uint64_t trans_address; /* In */
+ uint32_t trans_len; /* In */
+} __packed;
+
+/**
+ * struct sev_data_launch_finish - LAUNCH_FINISH command parameters
+ *
+ * @handle: handle of the VM to process
+ */
+struct sev_data_launch_finish {
+ uint32_t handle; /* In */
+} __packed;
+
+/**
+ * struct sev_data_send_start - SEND_START command parameters
+ *
+ * @handle: handle of the VM to process
+ * @policy: policy information for the VM
+ * @pdh_cert_address: physical address containing PDH certificate
+ * @pdh_cert_len: len of PDH certificate
+ * @plat_certs_address: physical address containing platform certificate
+ * @plat_certs_len: len of platform certificate
+ * @amd_certs_address: physical address containing AMD certificate
+ * @amd_certs_len: len of AMD certificate
+ * @session_address: physical address containing Session data
+ * @session_len: len of session data
+ */
+struct sev_data_send_start {
+ uint32_t handle; /* In */
+ uint32_t policy; /* Out */
+ uint64_t pdh_cert_address; /* In */
+ uint32_t pdh_cert_len; /* In */
+ uint32_t reserved1;
+ uint64_t plat_certs_address; /* In */
+ uint32_t plat_certs_len; /* In */
+ uint32_t reserved2;
+ uint64_t amd_certs_address; /* In */
+ uint32_t amd_certs_len; /* In */
+ uint32_t reserved3;
+ uint64_t session_address; /* In */
+ uint32_t session_len; /* In/Out */
+} __packed;
+
+/**
+ * struct sev_data_send_update_data - SEND_UPDATE_DATA command
+ *
+ * @handle: handle of the VM to process
+ * @hdr_address: physical address containing packet header
+ * @hdr_len: len of packet header
+ * @guest_address: physical address of guest memory region to send
+ * @guest_len: len of guest memory region to send
+ * @trans_address: physical address of host memory region
+ * @trans_len: len of host memory region
+ */
+struct sev_data_send_update_data {
+ uint32_t handle; /* In */
+ uint32_t reserved1;
+ uint64_t hdr_address; /* In */
+ uint32_t hdr_len; /* In/Out */
+ uint32_t reserved2;
+ uint64_t guest_address; /* In */
+ uint32_t guest_len; /* In */
+ uint32_t reserved3;
+ uint64_t trans_address; /* In */
+ uint32_t trans_len; /* In */
+} __packed;
+
+/**
+ * struct sev_data_send_update_vmsa - SEND_UPDATE_VMSA command
+ *
+ * @handle: handle of the VM to process
+ * @hdr_address: physical address containing packet header
+ * @hdr_len: len of packet header
+ * @guest_address: physical address of guest memory region to send
+ * @guest_len: len of guest memory region to send
+ * @trans_address: physical address of host memory region
+ * @trans_len: len of host memory region
+ */
+struct sev_data_send_update_vmsa {
+ uint32_t handle; /* In */
+ uint32_t reserved1;
+ uint64_t hdr_address; /* In */
+ uint32_t hdr_len; /* In/Out */
+ uint32_t reserved2;
+ uint64_t guest_address; /* In */
+ uint32_t guest_len; /* In */
+ uint32_t reserved3;
+ uint64_t trans_address; /* In */
+ uint32_t trans_len; /* In */
+} __packed;
+
+/**
+ * struct sev_data_send_finish - SEND_FINISH command parameters
+ *
+ * @handle: handle of the VM to process
+ */
+struct sev_data_send_finish {
+ uint32_t handle; /* In */
+} __packed;
+
+/**
+ * struct sev_data_send_cancel - SEND_CANCEL command parameters
+ *
+ * @handle: handle of the VM to process
+ */
+struct sev_data_send_cancel {
+ uint32_t handle; /* In */
+} __packed;
+
+/**
+ * struct sev_data_receive_start - RECEIVE_START command parameters
+ *
+ * @handle: handle of the VM to perform receive operation
+ * @pdh_cert_address: system physical address containing PDH certificate blob
+ * @pdh_cert_len: len of PDH certificate blob
+ * @session_address: system physical address containing session blob
+ * @session_len: len of session blob
+ */
+struct sev_data_receive_start {
+ uint32_t handle; /* In/Out */
+ uint32_t policy; /* In */
+ uint64_t pdh_cert_address; /* In */
+ uint32_t pdh_cert_len; /* In */
+ uint32_t reserved1;
+ uint64_t session_address; /* In */
+ uint32_t session_len; /* In */
+} __packed;
+
+/**
+ * struct sev_data_receive_update_data - RECEIVE_UPDATE_DATA command parameters
+ *
+ * @handle: handle of the VM to update
+ * @hdr_address: physical address containing packet header blob
+ * @hdr_len: len of packet header
+ * @guest_address: system physical address of guest memory region
+ * @guest_len: len of guest memory region
+ * @trans_address: system physical address of transport buffer
+ * @trans_len: len of transport buffer
+ */
+struct sev_data_receive_update_data {
+ uint32_t handle; /* In */
+ uint32_t reserved1;
+ uint64_t hdr_address; /* In */
+ uint32_t hdr_len; /* In */
+ uint32_t reserved2;
+ uint64_t guest_address; /* In */
+ uint32_t guest_len; /* In */
+ uint32_t reserved3;
+ uint64_t trans_address; /* In */
+ uint32_t trans_len; /* In */
+} __packed;
+
+/**
+ * struct sev_data_receive_update_vmsa - RECEIVE_UPDATE_VMSA command parameters
+ *
+ * @handle: handle of the VM to update
+ * @hdr_address: physical address containing packet header blob
+ * @hdr_len: len of packet header
+ * @guest_address: system physical address of guest memory region
+ * @guest_len: len of guest memory region
+ * @trans_address: system physical address of transport buffer
+ * @trans_len: len of transport buffer
+ */
+struct sev_data_receive_update_vmsa {
+ uint32_t handle; /* In */
+ uint32_t reserved1;
+ uint64_t hdr_address; /* In */
+ uint32_t hdr_len; /* In */
+ uint32_t reserved2;
+ uint64_t guest_address; /* In */
+ uint32_t guest_len; /* In */
+ uint32_t reserved3;
+ uint64_t trans_address; /* In */
+ uint32_t trans_len; /* In */
+} __packed;
+
+/**
+ * struct sev_data_receive_finish - RECEIVE_FINISH command parameters
+ *
+ * @handle: handle of the VM to finish
+ */
+struct sev_data_receive_finish {
+ uint32_t handle; /* In */
+} __packed;
+
+/**
+ * struct sev_data_dbg - DBG_ENCRYPT/DBG_DECRYPT command parameters
+ *
+ * @handle: handle of the VM to perform debug operation
+ * @src_addr: source address of data to operate on
+ * @dst_addr: destination address of data to operate on
+ * @len: len of data to operate on
+ */
+struct sev_data_dbg {
+ uint32_t handle; /* In */
+ uint32_t reserved;
+ uint64_t src_addr; /* In */
+ uint64_t dst_addr; /* In */
+ uint32_t len; /* In */
+} __packed;
+
+/**
+ * struct sev_data_attestation_report - SEV_ATTESTATION_REPORT command parameters
+ *
+ * @handle: handle of the VM
+ * @mnonce: a random nonce that will be included in the report.
+ * @address: physical address where the report will be copied.
+ * @len: length of the physical buffer.
+ */
+struct sev_data_attestation_report {
+ uint32_t handle; /* In */
+ uint32_t reserved;
+ uint64_t address; /* In */
+ uint8_t mnonce[16]; /* In */
+ uint32_t len; /* In/Out */
+} __packed;
+
+
+/**
+ * SEV platform commands
+ */
+enum {
+ SEV_FACTORY_RESET = 0,
+ SEV_PLATFORM_STATUS,
+ SEV_PEK_GEN,
+ SEV_PEK_CSR,
+ SEV_PDH_GEN,
+ SEV_PDH_CERT_EXPORT,
+ SEV_PEK_CERT_IMPORT,
+ SEV_GET_ID, /* This command is deprecated, use SEV_GET_ID2 */
+ SEV_GET_ID2,
+
+ SEV_MAX,
+};
+
+/**
+ * SEV Firmware status code
+ */
+typedef enum {
+ /*
+ * This error code is not in the SEV spec. Its purpose is to convey that
+ * there was an error that prevented the SEV firmware from being called.
+ * The SEV API error codes are 16 bits, so the -1 value will not overlap
+ * with possible values from the specification.
+ */
+ SEV_RET_NO_FW_CALL = -1,
+ SEV_RET_SUCCESS = 0,
+ SEV_RET_INVALID_PLATFORM_STATE,
+ SEV_RET_INVALID_GUEST_STATE,
+ SEV_RET_INAVLID_CONFIG,
+ SEV_RET_INVALID_LEN,
+ SEV_RET_ALREADY_OWNED,
+ SEV_RET_INVALID_CERTIFICATE,
+ SEV_RET_POLICY_FAILURE,
+ SEV_RET_INACTIVE,
+ SEV_RET_INVALID_ADDRESS,
+ SEV_RET_BAD_SIGNATURE,
+ SEV_RET_BAD_MEASUREMENT,
+ SEV_RET_ASID_OWNED,
+ SEV_RET_INVALID_ASID,
+ SEV_RET_WBINVD_REQUIRED,
+ SEV_RET_DFFLUSH_REQUIRED,
+ SEV_RET_INVALID_GUEST,
+ SEV_RET_INVALID_COMMAND,
+ SEV_RET_ACTIVE,
+ SEV_RET_HWSEV_RET_PLATFORM,
+ SEV_RET_HWSEV_RET_UNSAFE,
+ SEV_RET_UNSUPPORTED,
+ SEV_RET_INVALID_PARAM,
+ SEV_RET_RESOURCE_LIMIT,
+ SEV_RET_SECURE_DATA_INVALID,
+ SEV_RET_MAX,
+} sev_ret_code;
+
+/**
+ * struct sev_user_data_status - PLATFORM_STATUS command parameters
+ *
+ * @api_major: major API version
+ * @api_minor: minor API version
+ * @state: platform state
+ * @flags: platform config flags
+ * @build: firmware build id for API version
+ * @guest_count: number of active guests
+ */
+struct sev_user_data_status {
+ uint8_t api_major; /* Out */
+ uint8_t api_minor; /* Out */
+ uint8_t state; /* Out */
+ uint32_t flags; /* Out */
+ uint8_t build; /* Out */
+ uint32_t guest_count; /* Out */
+} __packed;
+
+#define SEV_STATUS_FLAGS_CONFIG_ES 0x0100
+
+/**
+ * struct sev_user_data_pek_csr - PEK_CSR command parameters
+ *
+ * @address: PEK certificate chain
+ * @length: length of certificate
+ */
+struct sev_user_data_pek_csr {
+ uint64_t address; /* In */
+ uint32_t length; /* In/Out */
+} __packed;
+
+/**
+ * struct sev_user_data_pek_cert_import - PEK_CERT_IMPORT command parameters
+ *
+ * @pek_cert_address: PEK certificate chain
+ * @pek_cert_len: length of PEK certificate
+ * @oca_cert_address: OCA certificate chain
+ * @oca_cert_len: length of OCA certificate
+ */
+struct sev_user_data_pek_cert_import {
+ uint64_t pek_cert_address; /* In */
+ uint32_t pek_cert_len; /* In */
+ uint64_t oca_cert_address; /* In */
+ uint32_t oca_cert_len; /* In */
+} __packed;
+
+/**
+ * struct sev_user_data_pdh_cert_export - PDH_CERT_EXPORT command parameters
+ *
+ * @pdh_cert_address: PDH certificate address
+ * @pdh_cert_len: length of PDH certificate
+ * @cert_chain_address: PDH certificate chain
+ * @cert_chain_len: length of PDH certificate chain
+ */
+struct sev_user_data_pdh_cert_export {
+ uint64_t pdh_cert_address; /* In */
+ uint32_t pdh_cert_len; /* In/Out */
+ uint64_t cert_chain_address; /* In */
+ uint32_t cert_chain_len; /* In/Out */
+} __packed;
+
+/**
+ * struct sev_user_data_get_id - GET_ID command parameters (deprecated)
+ *
+ * @socket1: Buffer to pass unique ID of first socket
+ * @socket2: Buffer to pass unique ID of second socket
+ */
+struct sev_user_data_get_id {
+ uint8_t socket1[64]; /* Out */
+ uint8_t socket2[64]; /* Out */
+} __packed;
+
+/**
+ * struct sev_user_data_get_id2 - GET_ID2 command parameters
+ * @address: Buffer to store unique ID
+ * @length: length of the unique ID
+ */
+struct sev_user_data_get_id2 {
+ uint64_t address; /* In */
+ uint32_t length; /* In/Out */
+} __packed;
+
+extern int sev_do_cmd(int cmd, void *data, int *psp_ret, bool poll);
+
+#endif /* __PSP_SEV_H__ */
diff --git a/xen/drivers/Kconfig b/xen/drivers/Kconfig
index 20050e9bb8..bed7d5a3e2 100644
--- a/xen/drivers/Kconfig
+++ b/xen/drivers/Kconfig
@@ -12,6 +12,8 @@ source "drivers/pci/Kconfig"
source "drivers/video/Kconfig"
+source "drivers/crypto/Kconfig"
+
config HAS_VPCI
bool
diff --git a/xen/drivers/Makefile b/xen/drivers/Makefile
index 2a1ae8ad13..f24f788fde 100644
--- a/xen/drivers/Makefile
+++ b/xen/drivers/Makefile
@@ -5,3 +5,4 @@ obj-$(CONFIG_HAS_VPCI) += vpci/
obj-$(CONFIG_HAS_PASSTHROUGH) += passthrough/
obj-$(CONFIG_ACPI) += acpi/
obj-$(CONFIG_VIDEO) += video/
+obj-y += crypto/
diff --git a/xen/drivers/crypto/Kconfig b/xen/drivers/crypto/Kconfig
new file mode 100644
index 0000000000..ca3d3f5f1b
--- /dev/null
+++ b/xen/drivers/crypto/Kconfig
@@ -0,0 +1,10 @@
+config AMD_SP
+ bool "AMD Secure Processor (UNSUPPORTED)" if UNSUPPORTED
+ depends on X86
+ default n
+ help
+ Enables AMD Secure Processor.
+
+ If your platform includes AMD Secure Processor devices and you
+ intend to use AMD Secure Encrypted Virtualization technology, say Y.
+ If in doubt, say N.
diff --git a/xen/drivers/crypto/Makefile b/xen/drivers/crypto/Makefile
new file mode 100644
index 0000000000..ff283c2cb5
--- /dev/null
+++ b/xen/drivers/crypto/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_AMD_SP) += asp.o
diff --git a/xen/drivers/crypto/asp.c b/xen/drivers/crypto/asp.c
new file mode 100644
index 0000000000..834e5d3b9b
--- /dev/null
+++ b/xen/drivers/crypto/asp.c
@@ -0,0 +1,830 @@
+#include <xen/init.h>
+#include <xen/pci.h>
+#include <xen/list.h>
+#include <xen/tasklet.h>
+#include <xen/pci_ids.h>
+#include <xen/delay.h>
+#include <xen/timer.h>
+#include <xen/wait.h>
+#include <xen/smp.h>
+#include <asm/msi.h>
+#include <asm/system.h>
+#include <asm/psp-sev.h>
+
+/*
+TODO:
+- GLOBAL:
+ - add command line params for tunables
+ - INTERRUPT MODE:
+ - CET shadow stack: adapt #CP handler???
+ - Serialization: must be done by the client? adapt spinlock?
+ */
+
+#define PSP_CAPABILITY_SEV (1 << 0)
+#define PSP_CAPABILITY_TEE (1 << 1)
+#define PSP_CAPABILITY_PSP_SECURITY_REPORTING (1 << 7)
+#define PSP_CAPABILITY_PSP_SECURITY_OFFSET 8
+
+#define PSP_INTSTS_CMD_COMPLETE (1 << 1)
+
+#define SEV_CMDRESP_CMD_MASK 0x7ff0000
+#define SEV_CMDRESP_CMD_SHIFT 16
+#define SEV_CMDRESP_CMD(cmd) ((cmd) << SEV_CMDRESP_CMD_SHIFT)
+#define SEV_CMDRESP_STS_MASK 0xffff
+#define SEV_CMDRESP_STS(x) ((x) & SEV_CMDRESP_STS_MASK)
+#define SEV_CMDRESP_RESP (1U << 31)
+#define SEV_CMDRESP_IOC (1 << 0)
+
+#define ASP_CMD_BUFF_SIZE 0x1000
+#define SEV_FW_BLOB_MAX_SIZE 0x4000
+
+/*
+ * SEV platform state
+ */
+enum sev_state {
+ SEV_STATE_UNINIT = 0x0,
+ SEV_STATE_INIT = 0x1,
+ SEV_STATE_WORKING = 0x2,
+ SEV_STATE_MAX
+};
+
+struct sev_vdata {
+ const unsigned int cmdresp_reg;
+ const unsigned int cmdbuff_addr_lo_reg;
+ const unsigned int cmdbuff_addr_hi_reg;
+};
+
+struct psp_vdata {
+ const unsigned short base_offset;
+ const struct sev_vdata *sev;
+ const unsigned int feature_reg;
+ const unsigned int inten_reg;
+ const unsigned int intsts_reg;
+ const char* name;
+};
+
+static struct sev_vdata sevv1 = {
+ .cmdresp_reg = 0x10580, /* C2PMSG_32 */
+ .cmdbuff_addr_lo_reg = 0x105e0, /* C2PMSG_56 */
+ .cmdbuff_addr_hi_reg = 0x105e4, /* C2PMSG_57 */
+};
+
+static struct sev_vdata sevv2 = {
+ .cmdresp_reg = 0x10980, /* C2PMSG_32 */
+ .cmdbuff_addr_lo_reg = 0x109e0, /* C2PMSG_56 */
+ .cmdbuff_addr_hi_reg = 0x109e4, /* C2PMSG_57 */
+};
+
+static struct psp_vdata pspv1 = {
+ .base_offset = PCI_BASE_ADDRESS_2,
+ .sev = &sevv1,
+ .feature_reg = 0x105fc, /* C2PMSG_63 */
+ .inten_reg = 0x10610, /* P2CMSG_INTEN */
+ .intsts_reg = 0x10614, /* P2CMSG_INTSTS */
+ .name = "pspv1",
+};
+
+static struct psp_vdata pspv2 = {
+ .base_offset = PCI_BASE_ADDRESS_2,
+ .sev = &sevv2,
+ .feature_reg = 0x109fc, /* C2PMSG_63 */
+ .inten_reg = 0x10690, /* P2CMSG_INTEN */
+ .intsts_reg = 0x10694, /* P2CMSG_INTSTS */
+ .name = "pspv2",
+};
+
+static struct psp_vdata pspv4 = {
+ .base_offset = PCI_BASE_ADDRESS_2,
+ .sev = &sevv2,
+ .feature_reg = 0x109fc, /* C2PMSG_63 */
+ .inten_reg = 0x10690, /* P2CMSG_INTEN */
+ .intsts_reg = 0x10694, /* P2CMSG_INTSTS */
+ .name = "pspv4",
+};
+
+static struct psp_vdata pspv6 = {
+ .base_offset = PCI_BASE_ADDRESS_2,
+ .sev = &sevv2,
+ .feature_reg = 0x109fc, /* C2PMSG_63 */
+ .inten_reg = 0x10510, /* P2CMSG_INTEN */
+ .intsts_reg = 0x10514, /* P2CMSG_INTSTS */
+ .name = "pspv6",
+};
+
+struct amd_sp_dev
+{
+ struct list_head list;
+ struct pci_dev *pdev;
+ struct psp_vdata *vdata;
+ void *io_base;
+ paddr_t io_pbase;
+ size_t io_size;
+ int irq;
+ int state;
+ void* cmd_buff;
+ uint32_t cbuff_pa_low;
+ uint32_t cbuff_pa_high;
+ unsigned int capability;
+ uint8_t api_major;
+ uint8_t api_minor;
+ uint8_t build;
+ int intr_rcvd;
+ int cmd_timeout;
+ struct timer cmd_timer;
+ struct waitqueue_head cmd_in_progress;
+};
+
+LIST_HEAD(amd_sp_units);
+#define for_each_sp_unit(sp) \
+ list_for_each_entry(sp, &amd_sp_units, list)
+
+static spinlock_t _sp_cmd_lock = SPIN_LOCK_UNLOCKED;
+
+static struct amd_sp_dev *amd_sp_master;
+
+static void do_sp_irq(void *data);
+static DECLARE_SOFTIRQ_TASKLET(sp_irq_tasklet, do_sp_irq, NULL);
+
+static bool force_sync = false;
+static unsigned int asp_timeout_val = 30000;
+static unsigned long long asp_sync_delay = 100ULL;
+static int asp_sync_tries = 10;
+
+static void sp_cmd_lock(void)
+{
+ spin_lock(&_sp_cmd_lock);
+}
+
+static void sp_cmd_unlock(void)
+{
+ spin_unlock(&_sp_cmd_lock);
+}
+
+static int sev_cmd_buffer_len(int cmd)
+{
+ switch (cmd) {
+ case SEV_CMD_INIT:
+ return sizeof(struct sev_data_init);
+ case SEV_CMD_INIT_EX:
+ return sizeof(struct sev_data_init_ex);
+ case SEV_CMD_PLATFORM_STATUS:
+ return sizeof(struct sev_user_data_status);
+ case SEV_CMD_PEK_CSR:
+ return sizeof(struct sev_data_pek_csr);
+ case SEV_CMD_PEK_CERT_IMPORT:
+ return sizeof(struct sev_data_pek_cert_import);
+ case SEV_CMD_PDH_CERT_EXPORT:
+ return sizeof(struct sev_data_pdh_cert_export);
+ case SEV_CMD_LAUNCH_START:
+ return sizeof(struct sev_data_launch_start);
+ case SEV_CMD_LAUNCH_UPDATE_DATA:
+ return sizeof(struct sev_data_launch_update_data);
+ case SEV_CMD_LAUNCH_UPDATE_VMSA:
+ return sizeof(struct sev_data_launch_update_vmsa);
+ case SEV_CMD_LAUNCH_FINISH:
+ return sizeof(struct sev_data_launch_finish);
+ case SEV_CMD_LAUNCH_MEASURE:
+ return sizeof(struct sev_data_launch_measure);
+ case SEV_CMD_ACTIVATE:
+ return sizeof(struct sev_data_activate);
+ case SEV_CMD_DEACTIVATE:
+ return sizeof(struct sev_data_deactivate);
+ case SEV_CMD_DECOMMISSION:
+ return sizeof(struct sev_data_decommission);
+ case SEV_CMD_GUEST_STATUS:
+ return sizeof(struct sev_data_guest_status);
+ case SEV_CMD_DBG_DECRYPT:
+ return sizeof(struct sev_data_dbg);
+ case SEV_CMD_DBG_ENCRYPT:
+ return sizeof(struct sev_data_dbg);
+ case SEV_CMD_SEND_START:
+ return sizeof(struct sev_data_send_start);
+ case SEV_CMD_SEND_UPDATE_DATA:
+ return sizeof(struct sev_data_send_update_data);
+ case SEV_CMD_SEND_UPDATE_VMSA:
+ return sizeof(struct sev_data_send_update_vmsa);
+ case SEV_CMD_SEND_FINISH:
+ return sizeof(struct sev_data_send_finish);
+ case SEV_CMD_RECEIVE_START:
+ return sizeof(struct sev_data_receive_start);
+ case SEV_CMD_RECEIVE_FINISH:
+ return sizeof(struct sev_data_receive_finish);
+ case SEV_CMD_RECEIVE_UPDATE_DATA:
+ return sizeof(struct sev_data_receive_update_data);
+ case SEV_CMD_RECEIVE_UPDATE_VMSA:
+ return sizeof(struct sev_data_receive_update_vmsa);
+ case SEV_CMD_LAUNCH_UPDATE_SECRET:
+ return sizeof(struct sev_data_launch_secret);
+ case SEV_CMD_DOWNLOAD_FIRMWARE:
+ return sizeof(struct sev_data_download_firmware);
+ case SEV_CMD_GET_ID:
+ return sizeof(struct sev_data_get_id);
+ case SEV_CMD_ATTESTATION_REPORT:
+ return sizeof(struct sev_data_attestation_report);
+ case SEV_CMD_SEND_CANCEL:
+ return sizeof(struct sev_data_send_cancel);
+ default:
+ return 0;
+ }
+}
+
+static void invalidate_cache(void *unused)
+{
+ wbinvd();
+}
+
+static int _sev_do_cmd(struct amd_sp_dev *sp, int cmd, void *data, int *psp_ret)
+{
+ unsigned int cbuff_pa_low, cbuff_pa_high, cmd_val;
+ int buf_len, cmdresp, rc;
+
+ buf_len = sev_cmd_buffer_len(cmd);
+
+ if ( data )
+ memcpy(sp->cmd_buff, data, buf_len);
+
+ cbuff_pa_low = data ? sp->cbuff_pa_low : 0;
+ cbuff_pa_high = data ? sp->cbuff_pa_high : 0;
+
+ writel(cbuff_pa_low, sp->io_base + sp->vdata->sev->cmdbuff_addr_lo_reg);
+ writel(cbuff_pa_high, sp->io_base + sp->vdata->sev->cmdbuff_addr_hi_reg);
+
+ cmd_val = SEV_CMDRESP_CMD(cmd) | SEV_CMDRESP_IOC;
+
+ sp->cmd_timeout = 0;
+ sp->intr_rcvd = 0;
+
+ writel(cmd_val, sp->io_base + sp->vdata->sev->cmdresp_reg);
+
+ set_timer(&sp->cmd_timer, NOW() + MILLISECS(asp_timeout_val));
+
+ /* FIXME: If the timer triggers here the device will be set offline */
+
+ wait_event(sp->cmd_in_progress, sp->cmd_timeout || sp->intr_rcvd);
+
+ stop_timer(&sp->cmd_timer);
+
+ if ( sp->intr_rcvd )
+ {
+ cmdresp = readl(sp->io_base + sp->vdata->sev->cmdresp_reg);
+
+ ASSERT(cmdresp & SEV_CMDRESP_RESP);
+
+ rc = SEV_CMDRESP_STS(cmdresp) ? -EFAULT : 0;
+
+ if ( rc && psp_ret )
+ *psp_ret = SEV_CMDRESP_STS(cmdresp);
+
+ if ( data && (!rc) )
+ memcpy(data, sp->cmd_buff, buf_len);
+ }
+ else
+ {
+ ASSERT(sp->cmd_timeout);
+
+ sp->state = SEV_STATE_UNINIT;
+
+ writel(0, sp->io_base + sp->vdata->inten_reg);
+
+ rc = -EIO;
+ }
+ return rc;
+}
+
+static int _sev_do_cmd_sync(struct amd_sp_dev *sp, int cmd, void *data, int *psp_ret)
+{
+ unsigned int cbuff_pa_low, cbuff_pa_high, cmd_val;
+ int buf_len, cmdresp, rc, i;
+
+ buf_len = sev_cmd_buffer_len(cmd);
+
+ if ( data )
+ memcpy(sp->cmd_buff, data, buf_len);
+
+ cbuff_pa_low = data ? sp->cbuff_pa_low : 0;
+ cbuff_pa_high = data ? sp->cbuff_pa_high : 0;
+
+ writel(cbuff_pa_low, sp->io_base + sp->vdata->sev->cmdbuff_addr_lo_reg);
+ writel(cbuff_pa_high, sp->io_base + sp->vdata->sev->cmdbuff_addr_hi_reg);
+
+ cmd_val = SEV_CMDRESP_CMD(cmd);
+
+ writel(cmd_val, sp->io_base + sp->vdata->sev->cmdresp_reg);
+
+ for (rc = -EIO, i = asp_sync_tries; i; i-- )
+ {
+ mdelay(asp_sync_delay);
+
+ cmdresp = readl(sp->io_base + sp->vdata->sev->cmdresp_reg);
+ if ( cmdresp & SEV_CMDRESP_RESP )
+ {
+ rc = 0;
+ break;
+ }
+ }
+
+ if ( !rc && SEV_CMDRESP_STS(cmdresp) )
+ rc = -EFAULT;
+
+ if ( rc && psp_ret )
+ *psp_ret = SEV_CMDRESP_STS(cmdresp);
+
+ if ( data && (!rc) )
+ memcpy(data, sp->cmd_buff, buf_len);
+
+ return rc;
+}
+
+int sev_do_cmd(int cmd, void *data, int *psp_ret, bool poll)
+{
+ struct amd_sp_dev *sp = amd_sp_master;
+ int buf_len, rc;
+
+ if ( !sp )
+ return -ENODEV;
+
+ if ( sp->state < SEV_STATE_INIT )
+ return -ENODEV;
+
+ if ( cmd < 0 || cmd >= SEV_CMD_MAX )
+ return -EINVAL;
+
+ buf_len = sev_cmd_buffer_len(cmd);
+
+ if ( !data != !buf_len )
+ return -EINVAL;
+
+ if ( force_sync || poll )
+ {
+ sp_cmd_lock();
+ rc = _sev_do_cmd_sync(sp, cmd, data, psp_ret);
+ sp_cmd_unlock();
+ }
+ else
+ {
+ rc = _sev_do_cmd(sp, cmd, data, psp_ret);
+ }
+
+ return rc;
+}
+
+static void do_sp_cmd_timer(void *data)
+{
+ struct amd_sp_dev *sp = (struct amd_sp_dev*)data;
+
+ sp->cmd_timeout = 1;
+ wake_up_nr(&sp->cmd_in_progress, 1);
+}
+
+static void do_sp_irq(void *data)
+{
+ struct amd_sp_dev *sp;
+
+ for_each_sp_unit(sp)
+ {
+ uint32_t cmdresp = readl(sp->io_base + sp->vdata->sev->cmdresp_reg);
+ if ( cmdresp & SEV_CMDRESP_RESP )
+ {
+ sp->intr_rcvd = 1;
+ wake_up_nr(&sp->cmd_in_progress, 1);
+ }
+ }
+}
+
+static void sp_interrupt_handler(int irq, void *dev_id)
+{
+ struct amd_sp_dev *sp = (struct amd_sp_dev*)dev_id;
+ uint32_t status;
+
+ status = readl(sp->io_base + sp->vdata->intsts_reg);
+ writel(status, sp->io_base + sp->vdata->intsts_reg);
+
+ if ( status & PSP_INTSTS_CMD_COMPLETE )
+ tasklet_schedule(&sp_irq_tasklet);
+}
+
+static int __init sp_get_capability(struct amd_sp_dev *sp)
+{
+ uint32_t val = readl(sp->io_base + sp->vdata->feature_reg);
+
+ if ( (val == 0xffffffff) || (!(val & PSP_CAPABILITY_SEV)) )
+ return -ENODEV;
+
+ sp->capability = val;
+
+ return 0;
+}
+
+static int __init sp_get_state(struct amd_sp_dev *sp, int *state, int *err)
+{
+ struct sev_user_data_status status;
+ int rc;
+
+ rc = _sev_do_cmd_sync(sp, SEV_CMD_PLATFORM_STATUS, &status, err);
+ if ( rc )
+ return rc;
+
+ *state = status.state;
+
+ return 0;
+}
+
+static int __init sp_get_api_version(struct amd_sp_dev *sp)
+{
+ struct sev_user_data_status status;
+ int err, rc;
+
+ rc = _sev_do_cmd_sync(sp, SEV_CMD_PLATFORM_STATUS, &status, &err);
+ if ( rc )
+ {
+ dprintk(XENLOG_ERR, "asp-%pp: can't get API version (%d 0x%x)\n",
+ &sp->pdev->sbdf, rc, err);
+ return rc;
+ }
+
+ sp->api_major = status.api_major;
+ sp->api_minor = status.api_minor;
+ sp->state = status.state;
+
+ return 0;
+}
+
+static int __init sp_update_firmware(struct amd_sp_dev *sp)
+{
+ /*
+ * FIXME: nothing to do for now
+ */
+ return 0;
+}
+
+static int __init sp_alloc_special_regions(struct amd_sp_dev *sp)
+{
+ /*
+ * FIXME: allocate TMP memory area for SEV-ES
+ */
+ return 0;
+}
+
+static int __init sp_do_init(struct amd_sp_dev *sp)
+{
+ struct sev_data_init data;
+ int err, rc;
+
+ if ( sp->state == SEV_STATE_INIT )
+ return 0;
+
+ memset(&data, 0, sizeof(data));
+
+ rc = _sev_do_cmd_sync(sp, SEV_CMD_INIT, &data, &err);
+ if ( rc )
+ dprintk(XENLOG_ERR, "asp-%pp: can't init device: (%d 0x%x)\n", &sp->pdev->sbdf, rc, err);
+
+ return rc;
+}
+
+static int __init sp_df_flush(struct amd_sp_dev *sp)
+{
+ int rc, err;
+
+ rc = _sev_do_cmd_sync(sp, SEV_CMD_DF_FLUSH, NULL, &err);
+ if ( rc )
+ dprintk(XENLOG_ERR, "asp-%pp: can't flush device: (%d 0x%x)\n", &sp->pdev->sbdf, rc, err);
+
+ return rc;
+}
+
+static int __init sp_dev_init(struct amd_sp_dev *sp)
+{
+ int err, rc;
+
+ rc = sp_get_capability(sp);
+ if ( rc )
+ {
+ dprintk(XENLOG_ERR, "asp-%pp: capability is broken %d\n",
+ &sp->pdev->sbdf, rc);
+ return rc;
+ }
+
+ rc = sp_get_api_version(sp);
+ if ( rc )
+ {
+ dprintk(XENLOG_ERR, "asp-%pp: can't get API version %d\n",
+ &sp->pdev->sbdf, rc);
+ return rc;
+ }
+
+ rc = sp_update_firmware(sp);
+ if ( rc )
+ {
+ dprintk(XENLOG_ERR, "asp-%pp: can't update firmware %d\n",
+ &sp->pdev->sbdf, rc);
+ return rc;
+ }
+
+ rc = sp_alloc_special_regions(sp);
+ if ( rc )
+ {
+ dprintk(XENLOG_ERR, "asp-%pp: can't alloc special regions %d\n",
+ &sp->pdev->sbdf, rc);
+ return rc;
+ }
+
+ rc = sp_do_init(sp);
+ if ( rc )
+ {
+ dprintk(XENLOG_ERR, "asp-%pp: can't init device %d\n", &sp->pdev->sbdf,
+ rc);
+ return rc;
+ }
+
+ on_each_cpu(invalidate_cache, NULL, 1);
+
+ rc = sp_df_flush(sp);
+ if ( rc )
+ {
+ dprintk(XENLOG_ERR, "asp-%pp: can't flush %d\n", &sp->pdev->sbdf, rc);
+ return rc;
+ }
+
+ rc = sp_get_state(sp, &sp->state, &err);
+ if ( rc )
+ dprintk(XENLOG_ERR, "asp-%pp: can't get state %d\n", &sp->pdev->sbdf, rc);
+
+
+ if ( sp->state != SEV_STATE_INIT )
+ {
+ dprintk(XENLOG_ERR, "asp-%pp: device is not inited 0x%x\n",
+ &sp->pdev->sbdf, sp->state);
+ return rc ?: -ENODEV;
+ }
+
+ printk(XENLOG_INFO "inited asp-%pp device\n", &sp->pdev->sbdf);
+ return 0;
+}
+
+static int __init sp_init_irq(struct amd_sp_dev *sp)
+{
+ int irq, rc;
+ struct msi_info minfo;
+ struct msi_desc *mdesc;
+
+ /* Disable and clear interrupts until ready */
+ writel(0, sp->io_base + sp->vdata->inten_reg);
+ writel(-1, sp->io_base + sp->vdata->intsts_reg);
+
+ irq = create_irq(0, false);
+ if ( irq <= 0 )
+ {
+ dprintk(XENLOG_ERR, "asp-%pp: can't create interrupt\n", &sp->pdev->sbdf);
+ return -EBUSY;
+ }
+
+ minfo.sbdf = sp->pdev->sbdf;
+ minfo.irq = irq;
+ minfo.entry_nr = 1;
+ if ( pci_find_cap_offset(sp->pdev->sbdf, PCI_CAP_ID_MSI) )
+ minfo.table_base = 0;
+ else
+ {
+ dprintk(XENLOG_ERR, "asp-%pp: only MSI is handled\n", &sp->pdev->sbdf);
+ return -EINVAL;
+ }
+
+ mdesc = NULL;
+
+ pcidevs_lock();
+
+ rc = pci_enable_msi(sp->pdev, &minfo, &mdesc);
+ if ( !rc )
+ {
+ struct irq_desc *idesc = irq_to_desc(irq);
+ unsigned long flags;
+
+ spin_lock_irqsave(&idesc->lock, flags);
+ rc = setup_msi_irq(idesc, mdesc);
+ spin_unlock_irqrestore(&idesc->lock, flags);
+
+ if ( rc )
+ {
+ pci_disable_msi(mdesc);
+ dprintk(XENLOG_ERR, "asp-%pp: can't setup msi %d\n", &sp->pdev->sbdf, rc);
+ }
+ }
+
+ pcidevs_unlock();
+
+ if ( rc )
+ {
+ if ( mdesc )
+ msi_free_irq(mdesc);
+ else
+ destroy_irq(irq);
+ return rc;
+
+ }
+
+ rc = request_irq(irq, 0, sp_interrupt_handler, "amd_sp", sp);
+
+ if ( rc )
+ {
+ dprintk(XENLOG_ERR, "asp-%pp: can't request interrupt %d\n", &sp->pdev->sbdf, rc);
+ return rc;
+ }
+
+ sp->irq = irq;
+
+ /* Enable interrupts */
+ writel(-1, sp->io_base + sp->vdata->inten_reg);
+
+ return 0;
+}
+
+static int __init sp_map_iomem(struct amd_sp_dev *sp)
+{
+ uint32_t base_low;
+ uint32_t base_high;
+ uint16_t cmd;
+ size_t size;
+ bool high_space;
+
+ base_low = pci_conf_read32(sp->pdev->sbdf, sp->vdata->base_offset);
+
+ if ( (base_low & PCI_BASE_ADDRESS_SPACE) != PCI_BASE_ADDRESS_SPACE_MEMORY )
+ return -EINVAL;
+
+ if ( (base_low & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == PCI_BASE_ADDRESS_MEM_TYPE_64 )
+ {
+ base_high = pci_conf_read32(sp->pdev->sbdf, sp->vdata->base_offset + 4);
+ high_space = true;
+ } else {
+ base_high = 0;
+ high_space = false;
+ }
+
+ sp->io_pbase = ((paddr_t)base_high << 32) | (base_low & PCI_BASE_ADDRESS_MEM_MASK);
+ ASSERT(sp->io_pbase);
+
+ pci_conf_write32(sp->pdev->sbdf, sp->vdata->base_offset, 0xFFFFFFFF);
+
+ if ( high_space ) {
+ pci_conf_write32(sp->pdev->sbdf, sp->vdata->base_offset + 4, 0xFFFFFFFF);
+ size = (size_t)pci_conf_read32(sp->pdev->sbdf, sp->vdata->base_offset + 4) << 32;
+ } else
+ size = ~0xffffffffUL;
+
+ size |= pci_conf_read32(sp->pdev->sbdf, sp->vdata->base_offset);
+ sp->io_size = ~(size & PCI_BASE_ADDRESS_MEM_MASK) + 1;
+
+ pci_conf_write32(sp->pdev->sbdf, sp->vdata->base_offset, base_low);
+
+ if ( high_space )
+ pci_conf_write32(sp->pdev->sbdf, sp->vdata->base_offset + 4, base_high);
+
+ cmd = pci_conf_read16(sp->pdev->sbdf, PCI_COMMAND);
+ pci_conf_write16(sp->pdev->sbdf, PCI_COMMAND, cmd | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER);
+
+ sp->io_base = ioremap(sp->io_pbase, sp->io_size);
+ if ( !sp->io_base )
+ return -EFAULT;
+
+ if ( pci_ro_device(0, sp->pdev->bus, sp->pdev->devfn) )
+ {
+ dprintk(XENLOG_ERR, "asp-%pp: can't hide PCI device\n", &sp->pdev->sbdf);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+static int __init sp_dev_create(struct pci_dev *pdev, struct psp_vdata *vdata)
+{
+ struct amd_sp_dev *sp;
+ int rc;
+
+ printk(XENLOG_INFO "asp: discovered asp-%pp device\n", &pdev->sbdf);
+
+ sp = xzalloc(struct amd_sp_dev);
+ if ( !sp )
+ return -ENOMEM;
+
+ sp->pdev = pdev;
+ sp->vdata = vdata;
+ sp->state = SEV_STATE_UNINIT;
+
+ init_timer(&sp->cmd_timer, do_sp_cmd_timer, (void*)sp, 0);
+
+ init_waitqueue_head(&sp->cmd_in_progress);
+
+ rc = sp_map_iomem(sp);
+ if ( rc )
+ {
+ dprintk(XENLOG_ERR, "asp-%pp: can't map iomem %d\n", &sp->pdev->sbdf, rc);
+ return rc;
+ }
+
+ rc = sp_init_irq(sp);
+ if ( rc )
+ {
+ dprintk(XENLOG_ERR, "asp-%pp: can't init irq %d\n", &sp->pdev->sbdf, rc);
+ return rc;
+ }
+
+ sp->cmd_buff = alloc_xenheap_pages(get_order_from_bytes(ASP_CMD_BUFF_SIZE), 0);
+ if ( !sp->cmd_buff )
+ {
+ dprintk(XENLOG_ERR, "asp-%pp: can't allocate cmd buffer\n", &sp->pdev->sbdf);
+ return -ENOMEM;
+ }
+
+ sp->cbuff_pa_low = (uint32_t)(__pa(sp->cmd_buff));
+ sp->cbuff_pa_high = (uint32_t)(__pa(sp->cmd_buff) >> 32);
+
+ list_add(&sp->list, &amd_sp_units);
+
+ amd_sp_master = sp;
+
+ return 0;
+}
+
+static void sp_dev_destroy(struct amd_sp_dev* sp)
+{
+ if( sp->io_base )
+ writel(0, sp->io_base + sp->vdata->inten_reg);
+
+ if ( sp->cmd_buff )
+ free_xenheap_pages(sp->cmd_buff, get_order_from_bytes(ASP_CMD_BUFF_SIZE));
+
+ xfree(sp);
+}
+
+static void sp_devs_destroy(void)
+{
+ struct amd_sp_dev *sp, *next;
+
+ list_for_each_entry_safe ( sp, next, &amd_sp_units, list)
+ {
+ list_del(&sp->list);
+ sp_dev_destroy(sp);
+ }
+}
+
+static int __init amd_sp_probe(void)
+{
+ int bus = 0, devfn = 0, rc;
+ struct amd_sp_dev *sp;
+
+ if ( boot_cpu_has(X86_FEATURE_XEN_SHSTK) )
+ {
+ force_sync = true;
+ printk(XENLOG_INFO "asp: CET-SS detected - sync mode forced\n");
+ }
+
+ for ( bus = 0; bus < 256; ++bus )
+ for ( devfn = 0; devfn < 256; ++devfn )
+ {
+ struct pci_dev *pdev;
+ pcidevs_lock();
+ pdev = pci_get_pdev(NULL, PCI_SBDF(0, bus, devfn));
+ pcidevs_unlock();
+
+ if ( !pdev || pci_conf_read16(pdev->sbdf, PCI_VENDOR_ID) !=
+ PCI_VENDOR_ID_AMD )
+ continue;
+
+ switch ( pci_conf_read16(pdev->sbdf, PCI_DEVICE_ID) )
+ {
+ case 0x1456:
+ rc = sp_dev_create(pdev, &pspv1);
+ break;
+ case 0x1486:
+ rc = sp_dev_create(pdev, &pspv2);
+ break;
+ case 0x14CA:
+ rc = sp_dev_create(pdev, &pspv4);
+ break;
+ case 0x156E:
+ rc = sp_dev_create(pdev, &pspv6);
+ break;
+ default:
+ rc = 0;
+ break;
+ }
+ if ( rc )
+ goto err;
+ }
+
+ for_each_sp_unit(sp)
+ {
+ rc = sp_dev_init(sp);
+ if ( rc )
+ goto err;
+ }
+
+ return 0;
+
+ err:
+ sp_devs_destroy();
+ return rc;
+}
+
+__initcall(amd_sp_probe);
--
2.49.0
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
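For readers following the command path in asp.c, here is a minimal standalone sketch of how the CMDRESP register value is composed and decoded. The bit positions are taken from the SEV_CMDRESP_* macros in the patch; the helper names (cmdresp_encode, cmdresp_done, cmdresp_status) are hypothetical and do not exist in the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit layout mirrored from the SEV_CMDRESP_* macros in asp.c. */
#define SEV_CMDRESP_CMD_SHIFT 16
#define SEV_CMDRESP_CMD_MASK  0x7ff0000u
#define SEV_CMDRESP_STS_MASK  0xffffu
#define SEV_CMDRESP_RESP      (1u << 31)
#define SEV_CMDRESP_IOC       (1u << 0)

/* Hypothetical helper: build the value written to the CMDRESP register. */
static inline uint32_t cmdresp_encode(unsigned int cmd, bool irq_on_completion)
{
    uint32_t val = ((uint32_t)cmd << SEV_CMDRESP_CMD_SHIFT) & SEV_CMDRESP_CMD_MASK;

    /* Interrupt mode sets IOC; polling mode leaves it clear. */
    if ( irq_on_completion )
        val |= SEV_CMDRESP_IOC;

    return val;
}

/* Hypothetical helper: true once the firmware has posted a response. */
static inline bool cmdresp_done(uint32_t cmdresp)
{
    return (cmdresp & SEV_CMDRESP_RESP) != 0;
}

/* Hypothetical helper: firmware status code, 0 meaning success. */
static inline unsigned int cmdresp_status(uint32_t cmdresp)
{
    return cmdresp & SEV_CMDRESP_STS_MASK;
}
```

In the patch, the interrupt-driven path writes the command with IOC set and sleeps on the wait queue, while the polling path writes it without IOC and re-reads the register until the RESP bit appears.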
* [RFC PATCH 09/16] common: Introduce confidential computing infrastructure
2025-05-16 9:31 [RFC PATCH 00/16] Confidential computing and AMD SEV support Teddy Astie
` (7 preceding siblings ...)
2025-05-16 10:22 ` [RFC PATCH 08/16] x86/crypto: Introduce AMD PSP driver for SEV Teddy Astie
@ 2025-05-16 10:23 ` Teddy Astie
2025-05-16 10:23 ` [RFC PATCH 10/16] xl/coco: Introduce confidential computing support Teddy Astie
` (7 subsequent siblings)
16 siblings, 0 replies; 22+ messages in thread
From: Teddy Astie @ 2025-05-16 10:23 UTC (permalink / raw)
To: xen-devel
Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné,
Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini
Introduce a subsystem intended for future confidential computing
platforms. It provides hooks for domain management and exposes
platform information to the toolstack (COCO platform, supported
features, ...).
Add a domain creation flag to build a confidential computing guest.
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
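Illustrative note on the hook model described above: a minimal sketch of
per-domain optional hooks, assuming a cut-down ops table. All demo_* names
are hypothetical stand-ins for struct domain and struct coco_domain_ops; the
point is only that a missing ops table or a missing hook is treated as a
successful no-op.

```c
#include <stddef.h>

/* Hypothetical stand-ins for Xen's struct domain and struct coco_domain_ops. */
struct demo_domain;

struct demo_domain_ops {
    int (*domain_initialise)(struct demo_domain *d);   /* optional hook */
    void (*domain_destroy)(struct demo_domain *d);     /* optional hook */
};

struct demo_domain {
    const struct demo_domain_ops *ops;  /* NULL for non-confidential guests */
    int initialised;
};

/* Call the hook only if a platform installed one; otherwise do nothing. */
static int demo_domain_initialise(struct demo_domain *d)
{
    if ( d->ops && d->ops->domain_initialise )
        return d->ops->domain_initialise(d);

    return 0;
}

/* Example platform hook: record that it ran. */
static int demo_sev_initialise(struct demo_domain *d)
{
    d->initialised = 1;
    return 0;
}

static const struct demo_domain_ops demo_sev_ops = {
    .domain_initialise = demo_sev_initialise,
    /* .domain_destroy intentionally left NULL */
};
```

With this shape, common code can call the wrapper unconditionally, and a
platform only fills in the hooks it needs.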
xen/arch/x86/domain.c | 4 +
xen/arch/x86/hvm/hvm.c | 10 ++-
xen/common/Kconfig | 5 ++
xen/common/Makefile | 1 +
xen/common/coco.c | 134 ++++++++++++++++++++++++++++++++++
xen/common/domain.c | 41 ++++++++++-
xen/include/hypercall-defs.c | 2 +
xen/include/public/domctl.h | 5 +-
xen/include/public/hvm/coco.h | 65 +++++++++++++++++
xen/include/public/xen.h | 1 +
xen/include/xen/coco.h | 88 ++++++++++++++++++++++
xen/include/xen/sched.h | 10 +++
12 files changed, 363 insertions(+), 3 deletions(-)
create mode 100644 xen/common/coco.c
create mode 100644 xen/include/public/hvm/coco.h
create mode 100644 xen/include/xen/coco.h
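Illustrative note on the status flags exposed to the toolstack: a sketch of
how a caller might classify the "flags" field of struct coco_platform_status.
The two flag values are copied from the public header added by this patch;
the demo_* enum and function are hypothetical and only restate the rules
given in the header comment.

```c
#include <stdint.h>

/* Values copied from public/hvm/coco.h as added by this patch. */
#define COCO_STATUS_FLAG_supported (1 << 0)
#define COCO_STATUS_FLAG_unsafe    (1 << 1)

/* Hypothetical classification, not part of the patch. */
enum demo_coco_state {
    DEMO_COCO_UNAVAILABLE,   /* neither flag set */
    DEMO_COCO_DISABLED,      /* unsafe set, supported clear */
    DEMO_COCO_USABLE_UNSAFE, /* works, but not security-supported */
    DEMO_COCO_USABLE,        /* works and security-supported */
};

static enum demo_coco_state demo_classify(uint32_t flags)
{
    int supported = !!(flags & COCO_STATUS_FLAG_supported);
    int unsafe = !!(flags & COCO_STATUS_FLAG_unsafe);

    if ( supported )
        return unsafe ? DEMO_COCO_USABLE_UNSAFE : DEMO_COCO_USABLE;

    return unsafe ? DEMO_COCO_DISABLED : DEMO_COCO_UNAVAILABLE;
}
```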
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index f197dad4c0..a5783154ad 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -12,6 +12,7 @@
*/
#include <xen/acpi.h>
+#include <xen/coco.h>
#include <xen/compat.h>
#include <xen/console.h>
#include <xen/cpu.h>
@@ -948,6 +949,9 @@ void arch_domain_destroy(struct domain *d)
free_xenheap_page(d->shared_info);
cleanup_domain_irq_mapping(d);
+ if ( is_coco_domain(d) )
+ coco_domain_destroy(d);
+
psr_domain_free(d);
}
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 625ae2098b..e1bcf8e086 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -15,6 +15,7 @@
#include <xen/sched.h>
#include <xen/irq.h>
#include <xen/softirq.h>
+#include <xen/coco.h>
#include <xen/domain.h>
#include <xen/domain_page.h>
#include <xen/hypercall.h>
@@ -702,7 +703,11 @@ int hvm_domain_initialise(struct domain *d,
if ( rc )
goto fail2;
- rc = hvm_asid_alloc(&d->arch.hvm.asid);
+ if ( is_coco_domain(d) && d->coco_ops && d->coco_ops->asid_alloc )
+ rc = d->coco_ops->asid_alloc(d, &d->arch.hvm.asid);
+ else
+ rc = hvm_asid_alloc(&d->arch.hvm.asid);
+
if ( rc )
goto fail2;
@@ -710,6 +715,9 @@ int hvm_domain_initialise(struct domain *d,
if ( rc != 0 )
goto fail2;
+ if ( is_coco_domain(d) )
+ coco_domain_initialise(d);
+
return 0;
fail2:
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index 6d43be2e6e..1ddb73e707 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -576,4 +576,9 @@ config BUDDY_ALLOCATOR_SIZE
Amount of memory reserved for the buddy allocator to serve Xen heap,
working alongside the colored one.
+config COCO
+ bool "Enable COnfidential COmputing support for guests" if EXPERT
+ default n
+ help
+ Allow guests to run in a private, encrypted memory space.
endmenu
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 98f0873056..4409510fc0 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -5,6 +5,7 @@ obj-$(CONFIG_GENERIC_BUG_FRAME) += bug.o
obj-$(CONFIG_HYPFS_CONFIG) += config_data.o
obj-$(CONFIG_CORE_PARKING) += core_parking.o
obj-y += cpu.o
+obj-$(CONFIG_COCO) += coco.o
obj-$(CONFIG_DEBUG_TRACE) += debugtrace.o
obj-$(CONFIG_HAS_DEVICE_TREE) += device.o
obj-$(filter-out $(CONFIG_X86),$(CONFIG_ACPI)) += device.o
diff --git a/xen/common/coco.c b/xen/common/coco.c
new file mode 100644
index 0000000000..d9bd17628d
--- /dev/null
+++ b/xen/common/coco.c
@@ -0,0 +1,134 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * General confidential computing functions.
+ */
+
+#include <xen/coco.h>
+#include <xen/errno.h>
+#include <xen/domain.h>
+#include <xen/domain_page.h>
+#include <xen/guest_access.h>
+#include <xen/hypercall.h>
+#include <xen/sched.h>
+#include <xen/sections.h>
+#include <xen/types.h>
+
+#include <asm/p2m.h>
+
+#include <public/hvm/coco.h>
+
+static __ro_after_init struct coco_ops *coco_ops;
+__read_mostly struct coco_platform_status platform_status;
+
+void __init coco_register_ops(struct coco_ops *ops)
+{
+ coco_ops = ops;
+}
+
+int __init coco_init(void)
+{
+ int rc = 0;
+
+ if ( coco_ops )
+ printk("coco: Using '%s'\n", coco_ops->name);
+ else
+ {
+ printk("coco: No platform found\n");
+ return 0;
+ }
+
+ if ( coco_ops->init )
+ {
+ rc = coco_ops->init();
+
+ if ( rc )
+ {
+ printk("coco: Unable to initialize coco platform (%d)\n", rc);
+ goto err;
+ }
+ }
+
+ rc = coco_ops->get_platform_status(&platform_status);
+ if ( rc )
+ {
+ printk("coco: Unable to get platform status\n");
+ goto err;
+ }
+
+ return 0;
+
+err:
+ /* Disable confidential computing if initialization failed. */
+ coco_ops = NULL;
+ return rc;
+}
+
+void coco_set_domain_ops(struct domain *d)
+{
+ ASSERT(is_coco_domain(d));
+
+ d->coco_ops = coco_ops->get_domain_ops(d);
+}
+
+int coco_prepare_initial_memory(struct domain *d, gfn_t gfn, size_t page_count)
+{
+ /* TODO: Check prepare_initial_memory constraints (no dangling mapping). */
+
+ if ( d->coco_ops->prepare_initial_mem )
+ return d->coco_ops->prepare_initial_mem(d, gfn, page_count);
+
+ return 0;
+}
+
+long coco_op_prepare_initial_mem(struct coco_prepare_initial_mem arg)
+{
+ long rc = 0;
+ struct domain *d = get_domain_by_id(arg.domid);
+
+ if ( !d )
+ return -ENOENT;
+
+ if ( !is_coco_domain(d) )
+ {
+ rc = -EOPNOTSUPP;
+ goto out;
+ }
+
+ rc = coco_prepare_initial_memory(d, arg.gfn, arg.count);
+
+out:
+ put_domain(d);
+ return rc;
+}
+
+long do_coco_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
+{
+ if ( !is_hardware_domain(current->domain) )
+ return -EPERM;
+
+ switch ( cmd )
+ {
+ case XEN_COCO_platform_status:
+ {
+ if ( copy_to_guest(arg, &platform_status, 1) )
+ return -EFAULT;
+
+ return 0;
+ }
+
+ case XEN_COCO_prepare_initial_mem:
+ {
+ struct coco_prepare_initial_mem prepare_initial_mem;
+
+ if ( copy_from_guest(&prepare_initial_mem, arg, 1) )
+ return -EFAULT;
+
+ return coco_op_prepare_initial_mem(prepare_initial_mem);
+ }
+
+ default:
+ return -ENOSYS;
+ }
+}
+
+__initcall(coco_init);
\ No newline at end of file
diff --git a/xen/common/domain.c b/xen/common/domain.c
index abf1969e60..c29d6efd29 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -4,6 +4,7 @@
* Generic domain-handling functions.
*/
+#include <xen/coco.h>
#include <xen/compat.h>
#include <xen/init.h>
#include <xen/lib.h>
@@ -716,17 +717,51 @@ static int sanitise_domain_config(struct xen_domctl_createdomain *config)
bool hap = config->flags & XEN_DOMCTL_CDF_hap;
bool iommu = config->flags & XEN_DOMCTL_CDF_iommu;
bool vpmu = config->flags & XEN_DOMCTL_CDF_vpmu;
+ bool coco = config->flags & XEN_DOMCTL_CDF_coco;
if ( config->flags &
~(XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
XEN_DOMCTL_CDF_s3_integrity | XEN_DOMCTL_CDF_oos_off |
XEN_DOMCTL_CDF_xs_domain | XEN_DOMCTL_CDF_iommu |
- XEN_DOMCTL_CDF_nested_virt | XEN_DOMCTL_CDF_vpmu) )
+ XEN_DOMCTL_CDF_nested_virt | XEN_DOMCTL_CDF_vpmu | XEN_DOMCTL_CDF_coco) )
{
dprintk(XENLOG_INFO, "Unknown CDF flags %#x\n", config->flags);
return -EINVAL;
}
+ if ( coco )
+ {
+ if ( !IS_ENABLED(CONFIG_COCO) )
+ {
+ dprintk(XENLOG_INFO, "COCO support is compiled out\n");
+ return -EINVAL;
+ }
+
+ if ( !coco_is_supported() )
+ {
+ dprintk(XENLOG_INFO, "COCO is not available\n");
+ return -EINVAL;
+ }
+
+ if ( !hvm )
+ {
+ dprintk(XENLOG_INFO, "COCO requested for non-HVM guest\n");
+ return -EINVAL;
+ }
+
+ if ( !hap )
+ {
+ dprintk(XENLOG_INFO, "COCO cannot work without HAP\n");
+ return -EINVAL;
+ }
+
+ if ( config->flags & XEN_DOMCTL_CDF_nested_virt )
+ {
+ dprintk(XENLOG_INFO, "Nested virtualization isn't supported with COCO\n");
+ return -EINVAL;
+ }
+ }
+
if ( config->grant_opts & ~XEN_DOMCTL_GRANT_version_mask )
{
dprintk(XENLOG_INFO, "Unknown grant options %#x\n", config->grant_opts);
@@ -836,6 +871,9 @@ struct domain *domain_create(domid_t domid,
/* Holding CDF_* internal flags. */
d->cdf = flags;
+ if ( is_coco_domain(d) )
+ coco_set_domain_ops(d);
+
TRACE_TIME(TRC_DOM0_DOM_ADD, d->domain_id);
lock_profile_register_struct(LOCKPROF_TYPE_PERDOM, d, domid);
@@ -1617,6 +1655,7 @@ int domain_unpause_by_systemcontroller(struct domain *d)
{
d->creation_finished = true;
arch_domain_creation_finished(d);
+ coco_domain_creation_finished(d); /* TODO: or before arch_* ? */
}
domain_unpause(d);
diff --git a/xen/include/hypercall-defs.c b/xen/include/hypercall-defs.c
index 7720a29ade..6c01a9e395 100644
--- a/xen/include/hypercall-defs.c
+++ b/xen/include/hypercall-defs.c
@@ -209,6 +209,7 @@ hypfs_op(unsigned int cmd, const char *arg1, unsigned long arg2, void *arg3, uns
#ifdef CONFIG_X86
xenpmu_op(unsigned int op, xen_pmu_params_t *arg)
#endif
+coco_op(unsigned int cmd, void *arg)
#ifdef CONFIG_PV
caller: pv64
@@ -295,5 +296,6 @@ mca do do - - -
#ifndef CONFIG_PV_SHIM_EXCLUSIVE
paging_domctl_cont do do do do -
#endif
+coco_op do do do do do
#endif /* !CPPCHECK */
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 5b2063eed9..f4f69556b6 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -67,8 +67,11 @@ struct xen_domctl_createdomain {
/* Should we expose the vPMU to the guest? */
#define XEN_DOMCTL_CDF_vpmu (1U << 7)
+#define _XEN_DOMCTL_CDF_coco 8
+#define XEN_DOMCTL_CDF_coco (1U << _XEN_DOMCTL_CDF_coco)
+
/* Max XEN_DOMCTL_CDF_* constant. Used for ABI checking. */
-#define XEN_DOMCTL_CDF_MAX XEN_DOMCTL_CDF_vpmu
+#define XEN_DOMCTL_CDF_MAX XEN_DOMCTL_CDF_coco
uint32_t flags;
diff --git a/xen/include/public/hvm/coco.h b/xen/include/public/hvm/coco.h
new file mode 100644
index 0000000000..2e23d91e12
--- /dev/null
+++ b/xen/include/public/hvm/coco.h
@@ -0,0 +1,65 @@
+/* SPDX-License-Identifier: MIT */
+#ifndef __XEN_PUBLIC_HVM_COCO_H__
+#define __XEN_PUBLIC_HVM_COCO_H__
+
+#include "../xen.h"
+
+#define XEN_COCO_platform_status 0
+
+/**
+ * XEN_COCO_platform_status: Get the status of confidential computing platform.
+ *
+ * Query information about the current confidential computing platform.
+ *
+ * Confidential computing is expected to work as long as the
+ * COCO_STATUS_FLAG_supported bit is set, and is additionally security-supported
+ * only if the COCO_STATUS_FLAG_unsafe bit is clear.
+ *
+ * If COCO_STATUS_FLAG_unsafe is set but COCO_STATUS_FLAG_supported is not,
+ * then confidential computing is present but intentionally disabled
+ * or forbidden by policy.
+ */
+struct coco_platform_status {
+#define COCO_PLATFORM_none 0 /* None */
+#define COCO_PLATFORM_amd_sev 1 /* AMD Secure Encrypted Virtualization */
+#define COCO_PLATFORM_intel_tdx 2 /* Intel Trust Domain Extensions */
+#define COCO_PLATFORM_arm_rme 3 /* ARM Realm Management Extension */
+ uint32_t platform; /* OUT */
+
+#define COCO_PLATFORM_FLAG_sev_es (1 << 0) /* AMD SEV Encrypted State */
+#define COCO_PLATFORM_FLAG_sev_snp (1 << 1) /* AMD SEV Secure Nested Paging */
+#define COCO_PLATFORM_FLAG_sev_tio (1 << 2) /* AMD SEV Trusted I/O */
+ uint32_t platform_flags; /* OUT */
+
+#define COCO_STATUS_FLAG_supported (1 << 0) /* Confidential computing is supported and usable */
+#define COCO_STATUS_FLAG_unsafe (1 << 1) /* Confidential computing is unsafe (e.g. debug mode) */
+ uint32_t flags; /* OUT */
+ uint32_t features; /* OUT */
+
+ uint32_t version_major; /* OUT */
+ uint32_t version_minor; /* OUT */
+};
+typedef struct coco_platform_status coco_platform_status_t;
+DEFINE_XEN_GUEST_HANDLE(coco_platform_status_t);
+
+#define XEN_COCO_prepare_initial_mem 1
+
+/**
+ * XEN_COCO_prepare_initial_mem: Prepare early memory pages of a guest
+ *
+ * During guest construction, the confidential computing platform may require memory
+ * to be prepared (e.g., encrypted) before the guest is started.
+ *
+ * After preparation, any further access to these pages is invalid, as they may be
+ * encrypted, sealed, or tracked by the platform.
+ */
+struct coco_prepare_initial_mem {
+ domid_t domid; /* IN */
+ uint16_t _rsvd[3]; /* ZERO */
+ uint64_t gfn; /* IN */
+ uint64_t count; /* IN */
+};
+typedef struct coco_prepare_initial_mem coco_prepare_initial_mem_t;
+DEFINE_XEN_GUEST_HANDLE(coco_prepare_initial_mem_t);
+
+#endif /* __XEN_PUBLIC_HVM_COCO_H__ */
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index 82b9c05a76..e656d6f617 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -118,6 +118,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
#define __HYPERVISOR_xenpmu_op 40
#define __HYPERVISOR_dm_op 41
#define __HYPERVISOR_hypfs_op 42
+#define __HYPERVISOR_coco_op 43
/* Architecture-specific hypercall definitions. */
#define __HYPERVISOR_arch_0 48
diff --git a/xen/include/xen/coco.h b/xen/include/xen/coco.h
new file mode 100644
index 0000000000..2ae43995ec
--- /dev/null
+++ b/xen/include/xen/coco.h
@@ -0,0 +1,88 @@
+#ifndef _XEN_COCO_H
+#define _XEN_COCO_H
+
+#include <asm/nospec.h>
+
+#include <xen/stdint.h>
+#include <xen/sched.h>
+
+#include <public/hvm/coco.h>
+
+extern __read_mostly struct coco_platform_status platform_status;
+
+struct coco_domain_ops {
+ int (*prepare_initial_mem)(struct domain *d, gfn_t gfn, size_t page_count);
+ /* domain_creation_finished, ... */
+
+ /* HVM domain hooks */
+ int (*domain_initialise)(struct domain *d);
+ int (*domain_creation_finished)(struct domain *d);
+ void (*domain_destroy)(struct domain *d);
+
+#ifdef CONFIG_X86
+ /* COCO-specific ASID allocation logic */
+ int (*asid_alloc)(struct domain *d, struct hvm_asid *asid);
+#endif
+};
+
+struct coco_ops {
+ const char *name;
+
+ int (*init)(void);
+ int (*get_platform_status)(coco_platform_status_t *status);
+ struct coco_domain_ops *(*get_domain_ops)(struct domain *d);
+};
+
+void __init coco_register_ops(struct coco_ops *ops);
+int __init coco_init(void);
+void coco_set_domain_ops(struct domain *d);
+
+#ifdef CONFIG_COCO
+static inline bool coco_is_supported(void)
+{
+ return evaluate_nospec(platform_status.flags & COCO_STATUS_FLAG_supported);
+}
+
+static inline int coco_domain_initialise(struct domain *d)
+{
+ if ( d->coco_ops && d->coco_ops->domain_initialise )
+ return d->coco_ops->domain_initialise(d);
+
+ return 0;
+}
+
+static inline int coco_domain_creation_finished(struct domain *d)
+{
+ if ( d->coco_ops && d->coco_ops->domain_creation_finished )
+ return d->coco_ops->domain_creation_finished(d);
+
+ return 0;
+}
+
+static inline void coco_domain_destroy(struct domain *d)
+{
+ if ( d->coco_ops && d->coco_ops->domain_destroy )
+ d->coco_ops->domain_destroy(d);
+}
+#else
+static inline bool coco_is_supported(void)
+{
+ return false;
+}
+
+static inline int coco_domain_initialise(struct domain *d)
+{
+ return 0;
+}
+
+static inline int coco_domain_creation_finished(struct domain *d)
+{
+ return 0;
+}
+
+static inline void coco_domain_destroy(struct domain *d)
+{
+}
+#endif
+
+#endif /* _XEN_COCO_H */
\ No newline at end of file
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index f2f5a98534..c57bedc30a 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -630,6 +630,10 @@ struct domain
struct argo_domain *argo;
#endif
+#ifdef CONFIG_COCO
+ struct coco_domain_ops *coco_ops;
+#endif
+
/*
* Continuation information for domain_teardown(). All fields entirely
* private.
@@ -1198,6 +1202,12 @@ static always_inline bool is_hvm_vcpu(const struct vcpu *v)
return is_hvm_domain(v->domain);
}
+static always_inline bool is_coco_domain(const struct domain *d)
+{
+ return IS_ENABLED(CONFIG_COCO) &&
+ evaluate_nospec(d->options & XEN_DOMCTL_CDF_coco);
+}
+
static always_inline bool hap_enabled(const struct domain *d)
{
/* sanitise_domain_config() rejects HAP && !HVM */
--
2.49.0
* [RFC PATCH 10/16] xl/coco: Introduce confidential computing support
2025-05-16 9:31 [RFC PATCH 00/16] Confidential computing and AMD SEV support Teddy Astie
` (8 preceding siblings ...)
2025-05-16 10:23 ` [RFC PATCH 09/16] common: Introduce confidential computing infrastructure Teddy Astie
@ 2025-05-16 10:23 ` Teddy Astie
2025-05-16 10:24 ` [RFC PATCH 11/16] x86/svm: Introduce NPCTRL VMCB bits Teddy Astie
` (6 subsequent siblings)
16 siblings, 0 replies; 22+ messages in thread
From: Teddy Astie @ 2025-05-16 10:23 UTC (permalink / raw)
To: xen-devel
Cc: Teddy Astie, Anthony PERARD, Juergen Gross, Jan Beulich,
Andrew Cooper, Roger Pau Monné, Christian Lindig,
David Scott, Vaishali Thakkar
From: Vaishali Thakkar <vaishali.thakkar@suse.com>
Signed-off-by: Vaishali Thakkar <vaishali.thakkar@suse.com>
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
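Illustrative note on the "if ( (rc = f()) != 0 )" idiom used by the domain
builder: where the closing parenthesis sits decides whether rc carries the
callee's error code or just a truth value. This is a minimal sketch;
demo_step() is a hypothetical helper that always fails with -22.

```c
/* Hypothetical helper standing in for a call that fails with -22. */
static int demo_step(void)
{
    return -22;
}

/* Comparison binds tighter than assignment: rc receives (demo_step() != 0). */
static int demo_loses_code(void)
{
    int rc;

    if ( (rc = demo_step() != 0) )
        return rc;   /* 1, not -22 */

    return 0;
}

/* Parenthesise the assignment first so rc keeps the callee's value. */
static int demo_keeps_code(void)
{
    int rc;

    if ( (rc = demo_step()) != 0 )
        return rc;   /* -22 */

    return 0;
}
```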
tools/include/libxl.h | 5 ++++
tools/include/xenctrl.h | 4 ++++
tools/include/xenguest.h | 1 +
tools/libs/ctrl/xc_domain.c | 36 +++++++++++++++++++++++++++++
tools/libs/guest/Makefile.common | 2 ++
tools/libs/guest/xg_dom_boot.c | 33 +++++++++++++++++++++++++++
tools/libs/guest/xg_dom_coco.c | 35 ++++++++++++++++++++++++++++
tools/libs/guest/xg_dom_coco.h | 39 ++++++++++++++++++++++++++++++++
tools/libs/guest/xg_dom_x86.c | 1 +
tools/libs/light/libxl_cpuid.c | 1 +
tools/libs/light/libxl_create.c | 4 ++++
tools/libs/light/libxl_dom.c | 1 +
tools/libs/light/libxl_types.idl | 1 +
tools/libs/util/libxlu_disk_l.c | 13 ++++-------
tools/libs/util/libxlu_disk_l.h | 7 ++----
tools/misc/xen-cpuid.c | 1 +
tools/ocaml/libs/xc/xenctrl.ml | 1 +
tools/ocaml/libs/xc/xenctrl.mli | 1 +
tools/xl/xl_parse.c | 2 ++
19 files changed, 175 insertions(+), 13 deletions(-)
create mode 100644 tools/libs/guest/xg_dom_coco.c
create mode 100644 tools/libs/guest/xg_dom_coco.h
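Illustrative note on building a prepare_initial_mem request from a byte
range: a sketch assuming 4 KiB pages and a page-aligned start address. The
struct mirrors the layout of struct coco_prepare_initial_mem from the
hypervisor patch; the demo_* names are hypothetical and not part of libxc.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SHIFT 12   /* assumption: 4 KiB pages */
#define DEMO_PAGE_SIZE  (UINT64_C(1) << DEMO_PAGE_SHIFT)

/* Layout mirrors struct coco_prepare_initial_mem from the hypervisor patch. */
struct demo_prepare_initial_mem {
    uint16_t domid;     /* IN */
    uint16_t _rsvd[3];  /* ZERO */
    uint64_t gfn;       /* IN */
    uint64_t count;     /* IN */
};

/* Number of pages needed to cover len bytes, rounding a partial page up. */
static uint64_t demo_bytes_to_pages(uint64_t len)
{
    return (len + DEMO_PAGE_SIZE - 1) >> DEMO_PAGE_SHIFT;
}

/* Fill a request for [addr, addr + len); addr is assumed page-aligned. */
static void demo_fill_request(struct demo_prepare_initial_mem *cmd,
                              uint16_t domid, uint64_t addr, uint64_t len)
{
    memset(cmd, 0, sizeof(*cmd));   /* keeps _rsvd zero, as the ABI asks */
    cmd->domid = domid;
    cmd->gfn = addr >> DEMO_PAGE_SHIFT;
    cmd->count = demo_bytes_to_pages(len);
}
```

Zeroing the whole structure first means the reserved fields stay valid even
if the layout gains members later.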
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index b7ad7735ca..e75179b604 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -178,6 +178,11 @@
*/
#define LIBXL_HAVE_BUILDINFO_EVENT_CHANNELS 1
+/*
+ * libxl_domain_build_info has the coco field.
+ */
+#define LIBXL_HAVE_BUILDINFO_COCO 1
+
/*
* libxl_domain_build_info has the u.hvm.ms_vm_genid field.
*/
diff --git a/tools/include/xenctrl.h b/tools/include/xenctrl.h
index 4955981231..aae228da44 100644
--- a/tools/include/xenctrl.h
+++ b/tools/include/xenctrl.h
@@ -46,6 +46,7 @@
#include <xen/xsm/flask_op.h>
#include <xen/kexec.h>
#include <xen/platform.h>
+#include <xen/hvm/coco.h>
#include "xentoollog.h"
#include "xen-barrier.h"
@@ -1682,6 +1683,9 @@ int xc_hvm_param_get(xc_interface *handle, uint32_t dom, uint32_t param, uint64_
int xc_set_hvm_param(xc_interface *handle, uint32_t dom, int param, unsigned long value);
int xc_get_hvm_param(xc_interface *handle, uint32_t dom, int param, unsigned long *value);
+int xc_coco_platform_status(xc_interface *handle, coco_platform_status_t *status);
+int xc_coco_prepare_initial_mem(xc_interface *handle, coco_prepare_initial_mem_t *cmd);
+
/* HVM guest pass-through */
int xc_assign_device(xc_interface *xch,
uint32_t domid,
diff --git a/tools/include/xenguest.h b/tools/include/xenguest.h
index e01f494b77..9d36fa5665 100644
--- a/tools/include/xenguest.h
+++ b/tools/include/xenguest.h
@@ -219,6 +219,7 @@ struct xc_dom_image {
xen_paddr_t lowmem_end;
xen_paddr_t highmem_end;
xen_pfn_t vga_hole_size;
+ bool coco; /* true if this is a confidential computing guest */
/* If unset disables the setup of the IOREQ pages. */
bool device_model;
diff --git a/tools/libs/ctrl/xc_domain.c b/tools/libs/ctrl/xc_domain.c
index 2ddc3f4f42..66b6c146f4 100644
--- a/tools/libs/ctrl/xc_domain.c
+++ b/tools/libs/ctrl/xc_domain.c
@@ -20,8 +20,10 @@
*/
#include "xc_private.h"
+#include "xenctrl.h"
#include <xen/memory.h>
#include <xen/hvm/hvm_op.h>
+#include <xen/hvm/coco.h>
int xc_domain_create(xc_interface *xch, uint32_t *pdomid,
struct xen_domctl_createdomain *config)
@@ -1496,6 +1498,40 @@ int xc_get_hvm_param(xc_interface *handle, uint32_t dom, int param, unsigned lon
return 0;
}
+int xc_coco_platform_status(xc_interface *handle, coco_platform_status_t *status)
+{
+ DECLARE_HYPERCALL_BUFFER(coco_platform_status_t, arg);
+ int rc;
+
+ arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+ if ( arg == NULL )
+ return -1;
+
+ rc = xencall2(handle->xcall, __HYPERVISOR_coco_op, XEN_COCO_platform_status,
+ HYPERCALL_BUFFER_AS_ARG(arg));
+ if ( rc == 0 ) /* All fields are OUT: copy the result back to the caller. */
+ memcpy(status, arg, sizeof(*status));
+ xc_hypercall_buffer_free(handle, arg);
+ return rc;
+}
+
+int xc_coco_prepare_initial_mem(xc_interface *handle, coco_prepare_initial_mem_t *cmd)
+{
+ DECLARE_HYPERCALL_BUFFER(coco_prepare_initial_mem_t, arg);
+ int rc;
+
+ arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+ if ( arg == NULL )
+ return -1;
+ memcpy(arg, cmd, sizeof(coco_prepare_initial_mem_t));
+
+ rc = xencall2(handle->xcall, __HYPERVISOR_coco_op, XEN_COCO_prepare_initial_mem,
+ HYPERCALL_BUFFER_AS_ARG(arg));
+
+ xc_hypercall_buffer_free(handle, arg);
+ return rc;
+}
+
int xc_domain_setdebugging(xc_interface *xch,
uint32_t domid,
unsigned int enable)
diff --git a/tools/libs/guest/Makefile.common b/tools/libs/guest/Makefile.common
index a026a2f662..64ede46a05 100644
--- a/tools/libs/guest/Makefile.common
+++ b/tools/libs/guest/Makefile.common
@@ -41,6 +41,8 @@ endif
# new domain builder
OBJS-y += xg_dom_core.o
OBJS-y += xg_dom_boot.o
+# TODO: add something like CONFIG_COCO ?
+OBJS-y += xg_dom_coco.o
OBJS-y += xg_dom_elfloader.o
OBJS-$(CONFIG_X86) += xg_dom_bzimageloader.o
OBJS-$(CONFIG_X86) += xg_dom_decompress_lz4.o
diff --git a/tools/libs/guest/xg_dom_boot.c b/tools/libs/guest/xg_dom_boot.c
index 5c7e12221d..6566784161 100644
--- a/tools/libs/guest/xg_dom_boot.c
+++ b/tools/libs/guest/xg_dom_boot.c
@@ -32,9 +32,13 @@
#include "xg_private.h"
#include "xg_core.h"
+#include "xg_dom_coco.h"
#include <xen/hvm/params.h>
#include <xen/grant_table.h>
+#define round_pgup(_p) (((_p)+(PAGE_SIZE_X86-1))&PAGE_MASK_X86)
+#define round_pgdown(_p) ((_p)&PAGE_MASK_X86)
+
/* ------------------------------------------------------------------------ */
static int setup_hypercall_page(struct xc_dom_image *dom)
@@ -201,6 +205,35 @@ int xc_dom_boot_image(struct xc_dom_image *dom)
if ( (rc = dom->arch_hooks->bootlate(dom)) != 0 )
return rc;
+ /* Encrypt domain pages. */
+ if ( dom->coco )
+ {
+ struct xc_dom_seg initrd_seg = {
+ .pfn = dom->initrd_start >> XC_DOM_PAGE_SHIFT(dom),
+ .pages = dom->initrd_len >> XC_DOM_PAGE_SHIFT(dom)
+ };
+
+ if ( (rc = xg_dom_coco_encrypt_seg(dom->xch, dom, dom->kernel_seg, "kernel")) != 0 )
+ return rc;
+ if ( initrd_seg.pages && (rc = xg_dom_coco_encrypt_seg(dom->xch, dom, initrd_seg, "ramdisk")) != 0 )
+ return rc;
+ if ( (rc = xg_dom_coco_encrypt_seg(dom->xch, dom, dom->start_info_seg, "start_info")) != 0 )
+ return rc;
+
+ for ( int i = 0; i < MAX_ACPI_MODULES; i++ )
+ {
+ struct xc_dom_seg seg;
+ seg.pfn = dom->acpi_modules[i].guest_addr_out >> XC_DOM_PAGE_SHIFT(dom);
+ seg.pages = round_pgup(dom->acpi_modules[i].length) >> XC_DOM_PAGE_SHIFT(dom);
+
+ if ( !seg.pfn || !seg.pages )
+ continue;
+
+ if ( (rc = xg_dom_coco_encrypt_seg(dom->xch, dom, seg, "acpi module")) != 0 )
+ return rc;
+ }
+ }
+
/* let the vm run */
if ( (rc = dom->arch_hooks->vcpu(dom)) != 0 )
return rc;
diff --git a/tools/libs/guest/xg_dom_coco.c b/tools/libs/guest/xg_dom_coco.c
new file mode 100644
index 0000000000..f47b59fa49
--- /dev/null
+++ b/tools/libs/guest/xg_dom_coco.c
@@ -0,0 +1,35 @@
+/*
+ * Confidential computing support.
+ * Copyright (c) 2024 Teddy Astie <teddy.astie@vates.tech>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation;
+ * version 2.1 of the License.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "xg_private.h"
+#include "xenctrl.h"
+#include "xg_dom_coco.h"
+
+int xg_dom_coco_encrypt_seg(xc_interface *xch, struct xc_dom_image *dom,
+ struct xc_dom_seg seg, const char *name)
+{
+ coco_prepare_initial_mem_t cmd = { 0 }; /* _rsvd must be zero */
+ DPRINTF("coco: Encrypting pfn:[%"PRI_xen_pfn"-%"PRI_xen_pfn"] (%s)\n",
+ seg.pfn, seg.pfn + seg.pages, name);
+
+ cmd.domid = dom->guest_domid;
+ cmd.gfn = seg.pfn;
+ cmd.count = seg.pages;
+
+ return xc_coco_prepare_initial_mem(xch, &cmd);
+}
\ No newline at end of file
diff --git a/tools/libs/guest/xg_dom_coco.h b/tools/libs/guest/xg_dom_coco.h
new file mode 100644
index 0000000000..eac0fa66e3
--- /dev/null
+++ b/tools/libs/guest/xg_dom_coco.h
@@ -0,0 +1,39 @@
+/*
+ * Copyright (c) 2006 Isaku Yamahata <yamahata at valinux co jp>
+ * VA Linux Systems Japan K.K.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; If not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#ifndef XC_DOM_COCO_H
+#define XC_DOM_COCO_H
+
+#include "xg_private.h"
+#include "xenctrl.h"
+
+int xg_dom_coco_encrypt_seg(xc_interface *xch, struct xc_dom_image *dom,
+ struct xc_dom_seg seg, const char *name);
+
+#endif /* XC_DOM_COCO_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/tools/libs/guest/xg_dom_x86.c b/tools/libs/guest/xg_dom_x86.c
index cba01384ae..93407bf192 100644
--- a/tools/libs/guest/xg_dom_x86.c
+++ b/tools/libs/guest/xg_dom_x86.c
@@ -103,6 +103,7 @@ struct xc_dom_image_x86 {
#define MAPPING_MAX 2
struct xc_dom_x86_mapping maps[MAPPING_MAX];
const struct xc_dom_params *params;
+ bool coco;
/* PV: Pointer to the in-guest P2M. */
void *p2m_guest;
diff --git a/tools/libs/light/libxl_cpuid.c b/tools/libs/light/libxl_cpuid.c
index 063fe86eb7..9891c42a5b 100644
--- a/tools/libs/light/libxl_cpuid.c
+++ b/tools/libs/light/libxl_cpuid.c
@@ -342,6 +342,7 @@ int libxl_cpuid_parse_config(libxl_cpuid_policy_list *policy, const char* str)
CPUID_ENTRY(0x00000007, 1, CPUID_REG_EDX),
MSR_ENTRY(0x10a, CPUID_REG_EAX),
MSR_ENTRY(0x10a, CPUID_REG_EDX),
+ CPUID_ENTRY(0x8000001f, NA, CPUID_REG_EAX),
#undef MSR_ENTRY
#undef CPUID_ENTRY
};
diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
index e03599ea99..185f7946f4 100644
--- a/tools/libs/light/libxl_create.c
+++ b/tools/libs/light/libxl_create.c
@@ -93,6 +93,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
libxl_defbool_setdefault(&b_info->device_model_stubdomain, false);
libxl_defbool_setdefault(&b_info->vpmu, false);
+ libxl_defbool_setdefault(&b_info->coco, false);
if (libxl_defbool_val(b_info->device_model_stubdomain) &&
!b_info->device_model_ssidref)
@@ -667,6 +668,9 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config,
if (libxl_defbool_val(b_info->vpmu))
create.flags |= XEN_DOMCTL_CDF_vpmu;
+ if (libxl_defbool_val(b_info->coco))
+ create.flags |= XEN_DOMCTL_CDF_coco;
+
assert(info->passthrough != LIBXL_PASSTHROUGH_DEFAULT);
LOG(DETAIL, "passthrough: %s",
libxl_passthrough_to_string(info->passthrough));
diff --git a/tools/libs/light/libxl_dom.c b/tools/libs/light/libxl_dom.c
index 94fef37401..778dac2286 100644
--- a/tools/libs/light/libxl_dom.c
+++ b/tools/libs/light/libxl_dom.c
@@ -1081,6 +1081,7 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
}
dom->container_type = XC_DOM_HVM_CONTAINER;
+ dom->coco = libxl_defbool_val(info->coco);
/* The params from the configuration file are in Mb, which are then
* multiplied by 1 Kb. This was then divided off when calling
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 9bb2969931..bb27e27148 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -637,6 +637,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
("nested_hvm", libxl_defbool),
("apic", libxl_defbool),
("dm_restrict", libxl_defbool),
+ ("coco", libxl_defbool),
("tee", libxl_tee_type),
("u", KeyedUnion(None, libxl_domain_type, "type",
[("hvm", Struct(None, [("firmware", string),
diff --git a/tools/libs/util/libxlu_disk_l.c b/tools/libs/util/libxlu_disk_l.c
index 0c180fff52..4924162a51 100644
--- a/tools/libs/util/libxlu_disk_l.c
+++ b/tools/libs/util/libxlu_disk_l.c
@@ -1,10 +1,7 @@
#line 1 "libxlu_disk_l.c"
-#line 31 "libxlu_disk_l.l"
#define _GNU_SOURCE
-
-
-#line 7 "libxlu_disk_l.c"
+#line 4 "libxlu_disk_l.c"
#define YY_INT_ALIGNED short int
@@ -1257,9 +1254,9 @@ static int vdev_and_devtype(DiskParseContext *dpc, char *str) {
#undef DPC /* needs to be defined differently the actual lexer */
#define DPC ((DiskParseContext*)yyextra)
-#line 1260 "libxlu_disk_l.c"
+#line 1257 "libxlu_disk_l.c"
-#line 1262 "libxlu_disk_l.c"
+#line 1259 "libxlu_disk_l.c"
#define INITIAL 0
#define LEXERR 1
@@ -1541,7 +1538,7 @@ YY_DECL
#line 188 "libxlu_disk_l.l"
/*----- the scanner rules which do the parsing -----*/
-#line 1544 "libxlu_disk_l.c"
+#line 1541 "libxlu_disk_l.c"
while ( /*CONSTCOND*/1 ) /* loops until end-of-file is reached */
{
@@ -1920,7 +1917,7 @@ YY_RULE_SETUP
#line 306 "libxlu_disk_l.l"
YY_FATAL_ERROR( "flex scanner jammed" );
YY_BREAK
-#line 1923 "libxlu_disk_l.c"
+#line 1920 "libxlu_disk_l.c"
case YY_STATE_EOF(INITIAL):
case YY_STATE_EOF(LEXERR):
yyterminate();
diff --git a/tools/libs/util/libxlu_disk_l.h b/tools/libs/util/libxlu_disk_l.h
index c868422568..027fd96c49 100644
--- a/tools/libs/util/libxlu_disk_l.h
+++ b/tools/libs/util/libxlu_disk_l.h
@@ -3,12 +3,9 @@
#define xlu__disk_yyIN_HEADER 1
#line 5 "libxlu_disk_l.h"
-#line 31 "libxlu_disk_l.l"
#define _GNU_SOURCE
-
-
-#line 11 "libxlu_disk_l.h"
+#line 8 "libxlu_disk_l.h"
#define YY_INT_ALIGNED short int
@@ -699,6 +696,6 @@ extern int yylex (yyscan_t yyscanner);
#line 306 "libxlu_disk_l.l"
-#line 702 "libxlu_disk_l.h"
+#line 699 "libxlu_disk_l.h"
#undef xlu__disk_yyIN_HEADER
#endif /* xlu__disk_yyHEADER_H */
diff --git a/tools/misc/xen-cpuid.c b/tools/misc/xen-cpuid.c
index 4c4593528d..10a2e603e9 100644
--- a/tools/misc/xen-cpuid.c
+++ b/tools/misc/xen-cpuid.c
@@ -37,6 +37,7 @@ static const struct {
{ "CPUID 0x00000007:1.edx", "7d1" },
{ "MSR_ARCH_CAPS.lo", "m10Al" },
{ "MSR_ARCH_CAPS.hi", "m10Ah" },
+ { "CPUID 0x8000001f.eax", "e1fa" },
};
#define COL_ALIGN "24"
diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
index 2690f9a923..256adf0054 100644
--- a/tools/ocaml/libs/xc/xenctrl.ml
+++ b/tools/ocaml/libs/xc/xenctrl.ml
@@ -70,6 +70,7 @@ type domain_create_flag =
| CDF_IOMMU
| CDF_NESTED_VIRT
| CDF_VPMU
+ | CDF_COCO
type domain_create_iommu_opts =
| IOMMU_NO_SHAREPT
diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
index febbe1f6ae..9ca55af05a 100644
--- a/tools/ocaml/libs/xc/xenctrl.mli
+++ b/tools/ocaml/libs/xc/xenctrl.mli
@@ -63,6 +63,7 @@ type domain_create_flag =
| CDF_IOMMU
| CDF_NESTED_VIRT
| CDF_VPMU
+ | CDF_COCO
type domain_create_iommu_opts =
| IOMMU_NO_SHAREPT
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index 089a88935a..0ddec0815b 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -2993,6 +2993,8 @@ skip_usbdev:
xlu_cfg_get_defbool(config, "vpmu", &b_info->vpmu, 0);
+ xlu_cfg_get_defbool(config, "coco", &b_info->coco, 0);
+
xlu_cfg_destroy(config);
}
--
2.49.0
* [RFC PATCH 11/16] x86/svm: Introduce NPCTRL VMCB bits
2025-05-16 9:31 [RFC PATCH 00/16] Confidential computing and AMD SEV support Teddy Astie
` (9 preceding siblings ...)
2025-05-16 10:23 ` [RFC PATCH 10/16] xl/coco: Introduce confidential computing support Teddy Astie
@ 2025-05-16 10:24 ` Teddy Astie
2025-05-16 10:24 ` [RFC PATCH 12/16] x86/cpufeature: Introduce SME and SEV-related CPU features Teddy Astie
` (5 subsequent siblings)
16 siblings, 0 replies; 22+ messages in thread
From: Teddy Astie @ 2025-05-16 10:24 UTC (permalink / raw)
To: xen-devel
Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné,
Andrei Semenov
These bits are used to enable SEV-related features in the VMCB.
Signed-off-by: Andrei Semenov <andrei.semenov@vates.tech>
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
xen/arch/x86/include/asm/hvm/svm/vmcb.h | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/xen/arch/x86/include/asm/hvm/svm/vmcb.h b/xen/arch/x86/include/asm/hvm/svm/vmcb.h
index 3d871b6135..fd166498f2 100644
--- a/xen/arch/x86/include/asm/hvm/svm/vmcb.h
+++ b/xen/arch/x86/include/asm/hvm/svm/vmcb.h
@@ -143,6 +143,17 @@ enum DRInterceptBits
DR_INTERCEPT_DR15_WRITE = 1u << 31,
};
+/* Miscellaneous controls in _np_ctrl */
+enum NpCtrlBits
+{
+ NPCTRL_NP_ENABLE = 1 << 0,
+ NPCTRL_SEV_ENABLE = 1 << 1,
+ NPCTRL_SEVES_ENABLE = 1 << 2,
+ NPCTRL_GMET_ENABLE = 1 << 3,
+ NPCTRL_NPSSS_ENABLE = 1 << 4,
+ NPCTRL_VTE_ENABLE = 1 << 5,
+};
+
enum VMEXIT_EXITCODE
{
/* control register read exitcodes */
--
2.49.0
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
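The way these bits are meant to be combined can be sketched outside of Xen. The bit positions below are copied from the NpCtrlBits enum in the patch; the helper names np_ctrl_enable_sev() and np_ctrl_sev_enabled() are hypothetical and exist only for this illustration, they are not part of the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions copied from the NpCtrlBits enum in the patch. */
#define NPCTRL_NP_ENABLE    (UINT64_C(1) << 0)
#define NPCTRL_SEV_ENABLE   (UINT64_C(1) << 1)
#define NPCTRL_SEVES_ENABLE (UINT64_C(1) << 2)

/* Hypothetical helper: set the SEV bit, keeping every other bit as-is. */
static uint64_t np_ctrl_enable_sev(uint64_t np_ctrl)
{
    return np_ctrl | NPCTRL_SEV_ENABLE;
}

/* Hypothetical helper: report whether the SEV bit is set. */
static bool np_ctrl_sev_enabled(uint64_t np_ctrl)
{
    return (np_ctrl & NPCTRL_SEV_ENABLE) != 0;
}
```

A later patch in the series performs the same read-modify-write on the VMCB through vmcb_get_np_ctrl() and vmcb_set_np_ctrl().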
* [RFC PATCH 12/16] x86/cpufeature: Introduce SME and SEV-related CPU features
2025-05-16 9:31 [RFC PATCH 00/16] Confidential computing and AMD SEV support Teddy Astie
` (10 preceding siblings ...)
2025-05-16 10:24 ` [RFC PATCH 11/16] x86/svm: Introduce NPCTRL VMCB bits Teddy Astie
@ 2025-05-16 10:24 ` Teddy Astie
2026-03-25 8:52 ` Jan Beulich
2025-05-16 10:24 ` [RFC PATCH 13/16] x86/coco: Introduce AMD-SEV support Teddy Astie
` (4 subsequent siblings)
16 siblings, 1 reply; 22+ messages in thread
From: Teddy Astie @ 2025-05-16 10:24 UTC (permalink / raw)
To: xen-devel; +Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
xen/arch/x86/cpu/common.c | 2 ++
xen/arch/x86/include/asm/cpufeature.h | 4 ++++
xen/include/public/arch-x86/cpufeatureset.h | 5 +++++
xen/include/xen/lib/x86/cpu-policy.h | 9 ++++++++-
4 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index e8d4ca3203..a610b0f513 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -481,6 +481,8 @@ static void generic_identify(struct cpuinfo_x86 *c)
c->x86_capability[FEATURESET_e8b] = cpuid_ebx(0x80000008);
if (c->extended_cpuid_level >= 0x80000021)
c->x86_capability[FEATURESET_e21a] = cpuid_eax(0x80000021);
+ if (c->extended_cpuid_level >= 0x8000001f)
+ c->x86_capability[FEATURESET_e1fa] = cpuid_eax(0x8000001f);
/* Intel-defined flags: level 0x00000007 */
if (c->cpuid_level >= 7) {
diff --git a/xen/arch/x86/include/asm/cpufeature.h b/xen/arch/x86/include/asm/cpufeature.h
index 397a04af41..bded70231c 100644
--- a/xen/arch/x86/include/asm/cpufeature.h
+++ b/xen/arch/x86/include/asm/cpufeature.h
@@ -233,6 +233,10 @@ static inline bool boot_cpu_has(unsigned int feat)
#define cpu_has_msr_tsc_aux (cpu_has_rdtscp || cpu_has_rdpid)
+#define cpu_has_sme boot_cpu_has(X86_FEATURE_SME)
+#define cpu_has_sev boot_cpu_has(X86_FEATURE_SEV)
+#define cpu_has_sev_es boot_cpu_has(X86_FEATURE_SEV_ES)
+
/* Bugs. */
#define cpu_bug_fpu_ptrs boot_cpu_has(X86_BUG_FPU_PTRS)
#define cpu_bug_null_seg boot_cpu_has(X86_BUG_NULL_SEG)
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index a6d4a0cba7..2a67bcc6a4 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -394,6 +394,11 @@ XEN_CPUFEATURE(MON_UMON_MITG, 16*32+30) /* MCU_OPT_CTRL.MON_UMON_MITG */
/* Intel-defined CPU features, MSR_ARCH_CAPS 0x10a.edx, word 17 (express in terms of word 16) */
XEN_CPUFEATURE(ITS_NO, 16*32+62) /*!A No Indirect Target Selection */
+/* AMD-defined CPU features, CPUID level 0x8000001f.eax, word 18 */
+XEN_CPUFEATURE(SME, 18*32+ 0) /* Secure Memory Encryption */
+XEN_CPUFEATURE(SEV, 18*32+ 1) /* Secure Encrypted Virtualization */
+XEN_CPUFEATURE(SEV_ES, 18*32+ 3) /* SEV Encrypted State */
+
#endif /* XEN_CPUFEATURE */
/* Clean up from a default include. Close the enum (for C). */
diff --git a/xen/include/xen/lib/x86/cpu-policy.h b/xen/include/xen/lib/x86/cpu-policy.h
index f43e1a3b21..a5b22b34d8 100644
--- a/xen/include/xen/lib/x86/cpu-policy.h
+++ b/xen/include/xen/lib/x86/cpu-policy.h
@@ -22,6 +22,7 @@
#define FEATURESET_7d1 15 /* 0x00000007:1.edx */
#define FEATURESET_m10Al 16 /* 0x0000010a.eax */
#define FEATURESET_m10Ah 17 /* 0x0000010a.edx */
+#define FEATURESET_e1fa 18 /* 0x8000001f.eax */
struct cpuid_leaf
{
@@ -317,7 +318,13 @@ struct cpu_policy
uint64_t :64, :64; /* Leaf 0x8000001c. */
uint64_t :64, :64; /* Leaf 0x8000001d - Cache properties. */
uint64_t :64, :64; /* Leaf 0x8000001e - Extd APIC/Core/Node IDs. */
- uint64_t :64, :64; /* Leaf 0x8000001f - AMD Secure Encryption. */
+ /* Leaf 0x8000001f - AMD Secure Memory Encryption. */
+ union {
+ uint32_t e1fa;
+ struct { DECL_BITFIELD(e1fa); };
+ };
+ uint32_t c_bit_pos:6, physaddr_red:6, num_vmpl:4, :16;
+ uint32_t max_sev_guests:32, min_no_es_asid;
uint64_t :64, :64; /* Leaf 0x80000020 - Platform QoS. */
/* Leaf 0x80000021 - Extended Feature 2 */
--
2.49.0
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
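To make the new leaf 0x8000001f fields easier to follow, here is a minimal standalone sketch of how the raw registers map onto them. The field widths follow the bitfield layout added to struct cpu_policy in the patch (EBX carries c_bit_pos, physaddr_red and num_vmpl, ECX carries max_sev_guests, EDX carries min_no_es_asid). The struct and function names are hypothetical, and the caller is assumed to have already read the registers.

```c
#include <stdint.h>

/* Hypothetical container mirroring the fields added to struct cpu_policy. */
struct sev_leaf_info {
    unsigned int c_bit_pos;     /* EBX[5:0]   */
    unsigned int physaddr_red;  /* EBX[11:6]  */
    unsigned int num_vmpl;      /* EBX[15:12] */
    uint32_t max_sev_guests;    /* ECX        */
    uint32_t min_no_es_asid;    /* EDX        */
};

/* Split raw EBX/ECX/EDX of leaf 0x8000001f into named fields. */
static struct sev_leaf_info decode_sev_leaf(uint32_t ebx, uint32_t ecx,
                                            uint32_t edx)
{
    struct sev_leaf_info info = {
        .c_bit_pos      = ebx & 0x3f,
        .physaddr_red   = (ebx >> 6) & 0x3f,
        .num_vmpl       = (ebx >> 12) & 0xf,
        .max_sev_guests = ecx,
        .min_no_es_asid = edx,
    };

    return info;
}

/* Page-table mask corresponding to the reported C-bit position. */
static uint64_t c_bit_mask(const struct sev_leaf_info *info)
{
    return UINT64_C(1) << info->c_bit_pos;
}
```

For example, an illustrative (made-up) EBX value of 0x16f decodes to c_bit_pos 47 and physaddr_red 5.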
* Re: [RFC PATCH 12/16] x86/cpufeature: Introduce SME and SEV-related CPU features
2025-05-16 10:24 ` [RFC PATCH 12/16] x86/cpufeature: Introduce SME and SEV-related CPU features Teddy Astie
@ 2026-03-25 8:52 ` Jan Beulich
0 siblings, 0 replies; 22+ messages in thread
From: Jan Beulich @ 2026-03-25 8:52 UTC (permalink / raw)
To: Teddy Astie; +Cc: Andrew Cooper, Roger Pau Monné, xen-devel
[-- Attachment #1: Type: text/plain, Size: 1724 bytes --]
On 16.05.2025 12:24, Teddy Astie wrote:
> Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
> ---
> xen/arch/x86/cpu/common.c | 2 ++
> xen/arch/x86/include/asm/cpufeature.h | 4 ++++
> xen/include/public/arch-x86/cpufeatureset.h | 5 +++++
> xen/include/xen/lib/x86/cpu-policy.h | 9 ++++++++-
> 4 files changed, 19 insertions(+), 1 deletion(-)
As I happened to look at this patch, there are pieces missing here. There
likely are existing commits or pending patches I could point you at, but I
think the not-yet-posted patch new in v4 of the AVX10 series is the best
reference (for having pretty few things beyond what you need to pay
attention to). See attached.
> --- a/xen/arch/x86/cpu/common.c
> +++ b/xen/arch/x86/cpu/common.c
> @@ -481,6 +481,8 @@ static void generic_identify(struct cpuinfo_x86 *c)
> c->x86_capability[FEATURESET_e8b] = cpuid_ebx(0x80000008);
> if (c->extended_cpuid_level >= 0x80000021)
> c->x86_capability[FEATURESET_e21a] = cpuid_eax(0x80000021);
> + if (c->extended_cpuid_level >= 0x8000001f)
> + c->x86_capability[FEATURESET_e1fa] = cpuid_eax(0x8000001f);
This would be nice to be kept in numerical order.
> --- a/xen/include/xen/lib/x86/cpu-policy.h
> +++ b/xen/include/xen/lib/x86/cpu-policy.h
> @@ -22,6 +22,7 @@
> #define FEATURESET_7d1 15 /* 0x00000007:1.edx */
> #define FEATURESET_m10Al 16 /* 0x0000010a.eax */
> #define FEATURESET_m10Ah 17 /* 0x0000010a.edx */
> +#define FEATURESET_e1fa 18 /* 0x8000001f.eax */
I think this wants to be FEATURESET_e1Fa, much like it's FEATURESET_Da1
and (going to be; another yet to be posted patch that I have been
carrying for far too long) FEATURESET_1Ea1.
Jan
[-- Attachment #2: x86-CPUID-AVX10-2.patch --]
[-- Type: text/plain, Size: 5946 bytes --]
x86/CPUID: enable AVX10.2 sub-leaf
The logic is modeled as closely as possible after that of leaf 7
sub-leaf handling.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
While the "AVX10" infix is necessary everywhere, the "avx10" prefix on
the bitfield name is redundant with the containing structure's field
name (see "x86emul: support AVX10.2 media insns" for what this looks like
in actual use). Do we want to special-case this in gen-cpuid.py?
---
v4: New.
--- unstable.orig/tools/libs/light/libxl_cpuid.c 2025-10-14 19:31:43.000000000 +0200
+++ unstable/tools/libs/light/libxl_cpuid.c 2025-07-23 10:05:01.000000000 +0200
@@ -343,6 +343,7 @@ int libxl_cpuid_parse_config(libxl_cpuid
MSR_ENTRY(0x10a, CPUID_REG_EAX),
MSR_ENTRY(0x10a, CPUID_REG_EDX),
CPUID_ENTRY(0x80000021, NA, CPUID_REG_ECX),
+ CPUID_ENTRY(0x00000024, 1, CPUID_REG_ECX),
#undef MSR_ENTRY
#undef CPUID_ENTRY
};
--- unstable.orig/tools/misc/xen-cpuid.c 2025-07-22 16:21:18.000000000 +0200
+++ unstable/tools/misc/xen-cpuid.c 2025-07-23 10:04:36.000000000 +0200
@@ -38,6 +38,7 @@ static const struct {
{ "MSR_ARCH_CAPS.lo", "m10Al" },
{ "MSR_ARCH_CAPS.hi", "m10Ah" },
{ "CPUID 0x80000021.ecx", "e21c" },
+ { "CPUID 0x00000024:1.ecx", "24c1" },
};
#define COL_ALIGN "24"
--- unstable.orig/xen/arch/x86/cpu/common.c 2023-11-12 14:12:21.000000000 +0100
+++ unstable/xen/arch/x86/cpu/common.c 2025-11-12 14:20:53.000000000 +0100
@@ -552,6 +552,17 @@ static void generic_identify(struct cpui
&c->x86_capability[FEATURESET_Da1],
&tmp, &tmp, &tmp);
+ if (cpu_has(c, X86_FEATURE_AVX10) && c->cpuid_level >= 0x24) {
+ uint32_t max_subleaf;
+
+ cpuid_count(0x24, 0, &max_subleaf, &tmp, &tmp, &tmp);
+ if (max_subleaf >= 1)
+ cpuid_count(0x24, 1,
+ &tmp, &tmp,
+ &c->x86_capability[FEATURESET_24c1],
+ &tmp);
+ }
+
if (test_bit(X86_FEATURE_ARCH_CAPS, c->x86_capability)) {
val = rdmsr(MSR_ARCH_CAPABILITIES);
c->x86_capability[FEATURESET_m10Al] = val;
--- unstable.orig/xen/arch/x86/cpu-policy.c 2025-07-24 12:27:24.795193021 +0200
+++ unstable/xen/arch/x86/cpu-policy.c 2025-07-01 14:01:41.000000000 +0200
@@ -277,6 +277,9 @@ static void recalculate_misc(struct cpu_
p->avx10.raw[0].b &= 0x000700ff;
p->avx10.raw[0].c = 0;
p->avx10.raw[0].d = 0;
+ p->avx10.raw[1].a = 0;
+ p->avx10.raw[1].b = 0;
+ p->avx10.raw[1].d = 0;
if ( !p->feat.avx10 || !p->avx10.version ||
!p->avx10.vsz512 || !p->avx10.vsz256 || !p->avx10.vsz128 )
{
--- unstable.orig/xen/include/public/arch-x86/cpufeatureset.h 2025-06-03 12:35:53.000000000 +0200
+++ unstable/xen/include/public/arch-x86/cpufeatureset.h 2026-03-12 10:26:14.000000000 +0100
@@ -409,6 +409,9 @@ XEN_CPUFEATURE(ITS_NO, 16*32
XEN_CPUFEATURE(TSA_SQ_NO, 18*32+ 1) /*A No Store Queue Transitive Scheduler Attacks */
XEN_CPUFEATURE(TSA_L1_NO, 18*32+ 2) /*A No L1D Transitive Scheduler Attacks */
+/* Intel-defined CPU features, CPUID level 0x00000024:1.ecx, word 19 */
+XEN_CPUFEATURE(AVX10_V1_AUX, 19*32+ 2) /* AVX10 V1 Auxiliary Instructions */
+
#endif /* XEN_CPUFEATURE */
/* Clean up from a default include. Close the enum (for C). */
--- unstable.orig/xen/include/xen/lib/x86/cpu-policy.h 2025-06-17 13:51:25.290993409 +0200
+++ unstable/xen/include/xen/lib/x86/cpu-policy.h 2025-08-28 14:54:22.000000000 +0200
@@ -23,6 +23,7 @@
#define FEATURESET_m10Al 16 /* 0x0000010a.eax */
#define FEATURESET_m10Ah 17 /* 0x0000010a.edx */
#define FEATURESET_e21c 18 /* 0x80000021.ecx */
+#define FEATURESET_24c1 19 /* 0x00000024:1.ecx */
struct cpuid_leaf
{
@@ -64,7 +65,7 @@ const char *x86_cpuid_vendor_to_str(unsi
#define CPUID_GUEST_NR_FEAT (2u + 1)
#define CPUID_GUEST_NR_TOPO (1u + 1)
#define CPUID_GUEST_NR_XSTATE (62u + 1)
-#define CPUID_GUEST_NR_AVX10 (0u + 1)
+#define CPUID_GUEST_NR_AVX10 (1u + 1)
#define CPUID_GUEST_NR_EXTD_INTEL (0x8u + 1)
#define CPUID_GUEST_NR_EXTD_AMD (0x21u + 1)
#define CPUID_GUEST_NR_EXTD MAX(CPUID_GUEST_NR_EXTD_INTEL, \
@@ -275,6 +276,14 @@ struct cpu_policy
bool vsz128:1, vsz256:1, vsz512:1;
uint32_t :13;
uint32_t /* c */:32, /* d */:32;
+
+ /* Subleaf 1. */
+ uint32_t /* a */:32, /* b */:32;
+ union {
+ uint32_t _24c1;
+ struct { DECL_BITFIELD(24c1); };
+ };
+ uint32_t /* d */:32;
};
} avx10;
--- unstable.orig/xen/arch/x86/lib/cpu-policy/cpuid.c 2023-10-19 15:22:20.000000000 +0200
+++ unstable/xen/arch/x86/lib/cpu-policy/cpuid.c 2025-07-23 09:58:26.000000000 +0200
@@ -82,6 +82,7 @@ void x86_cpu_policy_to_featureset(
fs[FEATURESET_m10Al] = p->arch_caps.lo;
fs[FEATURESET_m10Ah] = p->arch_caps.hi;
fs[FEATURESET_e21c] = p->extd.e21c;
+ fs[FEATURESET_24c1] = p->avx10._24c1;
}
void x86_cpu_featureset_to_policy(
@@ -106,6 +107,7 @@ void x86_cpu_featureset_to_policy(
p->arch_caps.lo = fs[FEATURESET_m10Al];
p->arch_caps.hi = fs[FEATURESET_m10Ah];
p->extd.e21c = fs[FEATURESET_e21c];
+ p->avx10._24c1 = fs[FEATURESET_24c1];
}
void x86_cpu_policy_recalc_synth(struct cpu_policy *p)
--- unstable.orig/xen/tools/gen-cpuid.py 2023-11-22 08:11:29.000000000 +0100
+++ unstable/xen/tools/gen-cpuid.py 2026-03-12 10:21:30.000000000 +0100
@@ -310,6 +310,9 @@ def crunch_numbers(state):
AVX512BW: [AVX512_VBMI, AVX512_VBMI2, AVX512_BITALG, AVX512_BF16,
AVX512_FP16, AVX512_BMM],
+ # AVX10 discrete features.
+ AVX10: [AVX10_V1_AUX],
+
# Extensions with VEX/EVEX encodings keyed to a separate feature
# flag are made dependents of their respective legacy feature.
PCLMULQDQ: [VPCLMULQDQ],
* [RFC PATCH 13/16] x86/coco: Introduce AMD-SEV support
2025-05-16 9:31 [RFC PATCH 00/16] Confidential computing and AMD SEV support Teddy Astie
` (11 preceding siblings ...)
2025-05-16 10:24 ` [RFC PATCH 12/16] x86/cpufeature: Introduce SME and SEV-related CPU features Teddy Astie
@ 2025-05-16 10:24 ` Teddy Astie
2025-05-16 10:24 ` [RFC PATCH 14/16] sev/emulate: Handle some non-emulable HVM paths Teddy Astie
` (3 subsequent siblings)
16 siblings, 0 replies; 22+ messages in thread
From: Teddy Astie @ 2025-05-16 10:24 UTC (permalink / raw)
To: xen-devel
Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné,
Andrei Semenov
From: Andrei Semenov <andrei.semenov@vates.tech>
AMD SEV is AMD's implementation of confidential computing.
This patch introduces SEV initialization and HVM enablement logic.
Signed-off-by: Andrei Semenov <andrei.semenov@vates.tech>
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
A possible improvement would be to slightly change the ASID allocation
logic when SEV is supported and usable:
- non-SEV guest: use an ASID > NumSevGuests if possible
- SEV guest: use an ASID in the SEV range
so that SEV-capable ASIDs are not wasted.
This currently lacks DF_FLUSH support, so the ASIDs of destroyed SEV-enabled
guests cannot be reused. This is currently worked around by "coco: Leak ASID for coco guests".
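The ASID split suggested in this note could look roughly like the sketch below. It is not the allocator used by the series: struct asid_pool, MAX_ASID and both functions are hypothetical, the SEV-ES sub-range (min_no_es_asid) is ignored, and ASID 0 is treated as reserved.

```c
#include <stdbool.h>

#define MAX_ASID 64 /* illustrative pool size, not a hardware value */

/* Hypothetical pool: ASIDs 1..max_sev_asid are usable by SEV guests. */
struct asid_pool {
    bool used[MAX_ASID + 1];
    unsigned int max_sev_asid;
};

/* Take the first free ASID in [lo, hi], or return -1 if there is none. */
static int alloc_in_range(struct asid_pool *p, unsigned int lo,
                          unsigned int hi)
{
    for ( unsigned int a = lo; a <= hi && a <= MAX_ASID; a++ )
    {
        if ( !p->used[a] )
        {
            p->used[a] = true;
            return (int)a;
        }
    }

    return -1;
}

/*
 * SEV guests must stay inside the SEV range.  Other guests prefer ASIDs
 * above it and only fall back to the SEV range when nothing else is free.
 */
static int asid_alloc(struct asid_pool *p, bool sev_guest)
{
    int asid;

    if ( sev_guest )
        return alloc_in_range(p, 1, p->max_sev_asid);

    asid = alloc_in_range(p, p->max_sev_asid + 1, MAX_ASID);
    if ( asid < 0 )
        asid = alloc_in_range(p, 1, p->max_sev_asid);

    return asid;
}
```

With max_sev_asid set to 2, a non-SEV guest gets ASID 3 first, leaving ASIDs 1 and 2 available for SEV guests.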
---
xen/arch/x86/Makefile | 1 +
xen/arch/x86/coco/Makefile | 1 +
xen/arch/x86/coco/sev.c | 262 +++++++++++++++++++++++++
xen/arch/x86/cpu/amd.c | 10 +
xen/arch/x86/cpuid.c | 5 +
xen/arch/x86/hvm/Kconfig | 10 +
xen/arch/x86/hvm/svm/svm.c | 6 +
xen/arch/x86/hvm/svm/vmcb.c | 17 +-
xen/arch/x86/include/asm/coco.h | 8 +
xen/arch/x86/include/asm/hvm/svm/sev.h | 14 ++
xen/arch/x86/include/asm/hvm/svm/svm.h | 16 ++
11 files changed, 344 insertions(+), 6 deletions(-)
create mode 100644 xen/arch/x86/coco/Makefile
create mode 100644 xen/arch/x86/coco/sev.c
create mode 100644 xen/arch/x86/include/asm/coco.h
create mode 100644 xen/arch/x86/include/asm/hvm/svm/sev.h
diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index bedb97cbee..220bff5e0a 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -1,5 +1,6 @@
obj-y += acpi/
obj-y += boot/
+obj-$(CONFIG_COCO) += coco/
obj-y += cpu/
obj-y += efi/
obj-y += genapic/
diff --git a/xen/arch/x86/coco/Makefile b/xen/arch/x86/coco/Makefile
new file mode 100644
index 0000000000..59ab1c075f
--- /dev/null
+++ b/xen/arch/x86/coco/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_COCO_AMD_SEV) += sev.o
\ No newline at end of file
diff --git a/xen/arch/x86/coco/sev.c b/xen/arch/x86/coco/sev.c
new file mode 100644
index 0000000000..366ce42baa
--- /dev/null
+++ b/xen/arch/x86/coco/sev.c
@@ -0,0 +1,262 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * coco/sev.c: AMD SEV support
+ * Copyright (c) Vates SAS
+ */
+
+#include <asm/cpu-policy.h>
+#include <asm/cpufeature.h>
+#include <asm/p2m.h>
+#include <asm/hvm/asid.h>
+
+#include <public/hvm/coco.h>
+
+#include <xen/config.h>
+#include <xen/coco.h>
+#include <asm/psp-sev.h>
+
+static int sev_domain_initialise(struct domain *d)
+{
+ struct sev_data_launch_start sd_ls;
+ struct sev_data_activate sd_a;
+ int psp_ret;
+ long rc = 0;
+
+ sd_ls.handle = 0; /* generate new one */
+ sd_ls.policy = 0; /* NOKS policy */
+ sd_ls.dh_cert_address = 0; /* do not DH stuff */
+
+ rc = sev_do_cmd(SEV_CMD_LAUNCH_START, (void *)(&sd_ls), &psp_ret, true);
+ if ( rc )
+ {
+ printk(XENLOG_ERR "asp: failed to LAUNCH_START domain(%d): psp_ret %d\n",
+ d->domain_id, psp_ret);
+ return rc;
+ }
+
+ sd_a.handle = sd_ls.handle;
+ sd_a.asid = d->arch.hvm.asid.asid;
+
+ rc = sev_do_cmd(SEV_CMD_ACTIVATE, (void *)(&sd_a), &psp_ret, true);
+ if ( rc )
+ {
+ printk(XENLOG_ERR "asp: failed to ACTIVATE domain(%d): psp_ret %d\n",
+ d->domain_id, psp_ret);
+ return rc;
+ }
+
+ d->arch.hvm.svm.sev.asp_handle = sd_ls.handle;
+ d->arch.hvm.svm.sev.asp_policy = 0;
+
+ return 0;
+}
+
+static int sev_domain_prepare_initial_mem(struct domain *d, gfn_t gfn, size_t count)
+{
+ struct page_info *page;
+ int rc, psp_ret;
+ struct sev_data_launch_update_data sd_lud;
+
+ mfn_t mfn, mfn_base = INVALID_MFN;
+ size_t segment_size = 0;
+
+ do {
+ page = get_page_from_gfn(d, gfn_x(gfn), NULL, P2M_ALLOC);
+ if ( unlikely(!page) )
+ return -EINVAL;
+
+ mfn = page_to_mfn(page);
+ put_page(page);
+
+ if ( !mfn_valid(mfn_base) )
+ mfn_base = mfn;
+ else
+ {
+ // Check for a break.
+ if (mfn_x(mfn_base) + segment_size != mfn_x(mfn) || segment_size == 512)
+ {
+ // Make launch update data.
+ printk(XENLOG_DEBUG
+ "asp: LAUNCH_UPDATE_DATA d%d: base=%"PRI_xen_pfn", size=%zx\n",
+ d->domain_id, mfn_x(mfn_base), segment_size);
+
+ sd_lud.reserved = 0;
+ sd_lud.handle = d->arch.hvm.svm.sev.asp_handle;
+ sd_lud.address = mfn_x(mfn_base) << PAGE_SHIFT;
+ sd_lud.len = segment_size * PAGE_SIZE;
+ rc = sev_do_cmd(SEV_CMD_LAUNCH_UPDATE_DATA, (void *)(&sd_lud),
+ &psp_ret, true);
+ if (rc)
+ {
+ printk(XENLOG_ERR
+ "asp: failed to LAUNCH_UPDATE_DATA dom(%d): err %d\n",
+ d->domain_id, psp_ret);
+ return rc;
+ }
+
+ mfn_base = mfn;
+ segment_size = 0;
+ }
+ }
+
+ gfn = gfn_add(gfn, 1);
+ segment_size++;
+ count--;
+ } while ( count );
+
+ // Last launch update data.
+ if ( segment_size )
+ {
+ sd_lud.reserved = 0;
+ sd_lud.handle = d->arch.hvm.svm.sev.asp_handle;
+ sd_lud.address = mfn_x(mfn_base) << PAGE_SHIFT;
+ sd_lud.len = segment_size * PAGE_SIZE;
+ rc = sev_do_cmd(SEV_CMD_LAUNCH_UPDATE_DATA, (void *)(&sd_lud),
+ &psp_ret, true);
+
+ if ( rc )
+ printk(XENLOG_ERR "asp: failed to LAUNCH_UPDATE_DATA dom(%d): err %d\n",
+ d->domain_id, psp_ret);
+ }
+
+ return rc;
+}
+
+static int sev_domain_creation_finished(struct domain *d)
+{
+ struct sev_data_launch_measure sd_lm;
+ struct sev_data_launch_finish sd_lf;
+ int psp_ret;
+ long rc = 0;
+
+ sd_lm.handle = d->arch.hvm.svm.sev.asp_handle;
+ sd_lm.address = virt_to_maddr(d->arch.hvm.svm.sev.measure);
+ sd_lm.len = sizeof(d->arch.hvm.svm.sev.measure);
+ sd_lm.reserved = 0;
+
+ rc = sev_do_cmd(SEV_CMD_LAUNCH_MEASURE, (void *)(&sd_lm), &psp_ret, true);
+ if ( rc )
+ {
+ printk(XENLOG_ERR "asp: failed to LAUNCH_MEASURE for d%hu: psp_ret %d, rc %ld\n",
+ d->domain_id, psp_ret, rc);
+
+ if (psp_ret == SEV_RET_INVALID_LEN)
+ printk(XENLOG_ERR "asp: Expected %"PRIu32" bytes\n", sd_lm.len);
+ return rc;
+ }
+
+ sd_lf.handle = d->arch.hvm.svm.sev.asp_handle;
+
+ rc = sev_do_cmd(SEV_CMD_LAUNCH_FINISH, (void *)(&sd_lf), &psp_ret, true);
+ if ( rc )
+ {
+ printk(XENLOG_ERR "asp: failed to LAUNCH_FINISH for d%hu: psp_ret %d, rc %ld\n",
+ d->domain_id, psp_ret, rc);
+ return rc;
+ }
+
+ d->arch.hvm.svm.sev.measure_len = sd_lm.len;
+ return 0;
+}
+
+static void sev_domain_destroy(struct domain *d)
+{
+ struct sev_data_deactivate sd_da;
+ struct sev_data_decommission sd_de;
+ int psp_ret;
+ long rc = 0;
+
+ sd_da.handle = d->arch.hvm.svm.sev.asp_handle;
+
+ rc = sev_do_cmd(SEV_CMD_DEACTIVATE, (void *)(&sd_da), &psp_ret, true);
+ if (rc)
+ {
+ printk(XENLOG_ERR "asp: failed to DEACTIVATE for d%hu: psp_ret %d\n",
+ d->domain_id, psp_ret);
+ return;
+ }
+
+ sd_de.handle = d->arch.hvm.svm.sev.asp_handle;
+
+ rc = sev_do_cmd(SEV_CMD_DECOMMISSION, (void *)(&sd_de), &psp_ret, true);
+ if (rc)
+ {
+ printk(XENLOG_ERR "asp: failed to DECOMMISSION for d%hu: psp_ret %d\n",
+ d->domain_id, psp_ret);
+ return;
+ }
+
+ d->arch.hvm.svm.sev.asp_handle = 0;
+}
+
+static int sev_asid_alloc(struct domain *d, struct hvm_asid *asid)
+{
+ /* TODO: SEV-ES/SNP */
+ unsigned long asid_min = raw_cpu_policy.extd.min_no_es_asid;
+ unsigned long asid_max = raw_cpu_policy.extd.max_sev_guests;
+
+ return hvm_asid_alloc_range(asid, asid_min, asid_max);
+}
+
+static struct coco_domain_ops sev_domain_ops = {
+ .prepare_initial_mem = sev_domain_prepare_initial_mem,
+ .domain_initialise = sev_domain_initialise,
+ .domain_creation_finished = sev_domain_creation_finished,
+ .domain_destroy = sev_domain_destroy,
+ .asid_alloc = sev_asid_alloc,
+};
+
+static int sev_init(void)
+{
+ unsigned long syscfg;
+
+ if ( WARN_ON(!cpu_has_sev) )
+ return -ENOSYS;
+
+ ASSERT(raw_cpu_policy.extd.c_bit_pos > 0);
+ ASSERT(raw_cpu_policy.extd.max_sev_guests > 0);
+
+ printk(XENLOG_INFO "sev: C-bit is %"PRIu32"\n", raw_cpu_policy.extd.c_bit_pos);
+ printk(XENLOG_INFO "sev: Supports up to %"PRIu32" guests\n",
+ raw_cpu_policy.extd.max_sev_guests);
+
+ /* Enable AMD SME */
+ rdmsrl(MSR_K8_SYSCFG, syscfg);
+
+ if ( !(syscfg & SYSCFG_MEM_ENCRYPT) )
+ {
+ syscfg |= SYSCFG_MEM_ENCRYPT;
+ wrmsrl(MSR_K8_SYSCFG, syscfg);
+
+ printk(XENLOG_INFO "sev: Enabled AMD SME\n");
+ }
+
+ return 0;
+}
+
+static int sev_get_platform_status(struct coco_platform_status *status)
+{
+ status->platform = COCO_PLATFORM_amd_sev;
+
+ // if ( cpu_has_sev_es )
+ // status->platform_flags |= COCO_PLATFORM_FLAG_sev_es;
+
+ status->flags = COCO_STATUS_FLAG_supported;
+
+ return 0;
+}
+
+static struct coco_domain_ops *sev_get_domain_ops(struct domain *d)
+{
+ // TODO: SEV-ES and SEV-SNP support
+ return &sev_domain_ops;
+}
+
+struct coco_ops sev_coco_ops = {
+ .name = "SEV",
+ .init = sev_init,
+ .get_platform_status = sev_get_platform_status,
+ .get_domain_ops = sev_get_domain_ops,
+};
+
+
diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index 37d67dd15c..28b5a0420d 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -1,4 +1,5 @@
#include <xen/cpu.h>
+#include <asm/cpu-policy.h>
#include <xen/init.h>
#include <xen/bitops.h>
#include <xen/mm.h>
@@ -19,6 +20,10 @@
#include "cpu.h"
+#ifdef CONFIG_COCO
+#include <asm/coco.h>
+#endif
+
/*
* Pre-canned values for overriding the CPUID features
* and extended features masks.
@@ -1333,6 +1338,11 @@ static void cf_check init_amd(struct cpuinfo_x86 *c)
check_syscfg_dram_mod_en();
amd_log_freq(c);
+
+#ifdef CONFIG_COCO_AMD_SEV
+ if ( cpu_has_sev )
+ coco_register_ops(&sev_coco_ops);
+#endif
}
const struct cpu_dev __initconst_cf_clobber amd_cpu_dev = {
diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index e2d94619c2..e1d6db4ad8 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -8,6 +8,7 @@
#include <asm/cpu-policy.h>
#include <asm/cpuid.h>
#include <asm/hvm/viridian.h>
+#include <asm/hvm/svm/sev.h>
#include <asm/xstate.h>
#define EMPTY_LEAF ((struct cpuid_leaf){})
@@ -250,6 +251,10 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
return;
*res = array_access_nospec(p->extd.raw, leaf & 0xffff);
+
+ /* For a SEV guest, passthrough the host SEV leaf. */
+ if ( is_sev_domain(d) && leaf == 0x8000001fU )
+ *res = raw_cpu_policy.extd.raw[0x1f];
break;
default:
diff --git a/xen/arch/x86/hvm/Kconfig b/xen/arch/x86/hvm/Kconfig
index 2def0f98e2..a9332ab8ce 100644
--- a/xen/arch/x86/hvm/Kconfig
+++ b/xen/arch/x86/hvm/Kconfig
@@ -25,6 +25,16 @@ config AMD_SVM
If your system includes a processor with AMD-V support, say Y.
If in doubt, say Y.
+config COCO_AMD_SEV
+ bool "AMD SEV (UNSUPPORTED)" if AMD && AMD_SVM && COCO && UNSUPPORTED
+ default y
+ select AMD_SP
+ help
+ Enables support for AMD Secure Encrypted Virtualization technology.
+ This option is needed if you want to run confidential guests on an
+ AMD platform that supports it.
+ If in doubt, say N.
+
config INTEL_VMX
bool "Intel VT-x" if INTEL && EXPERT
default y
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index cc19d80fe1..63889bf803 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -27,6 +27,7 @@
#include <asm/hvm/nestedhvm.h>
#include <asm/hvm/support.h>
#include <asm/hvm/asid.h>
+#include <asm/hvm/svm/sev.h>
#include <asm/hvm/svm/svm.h>
#include <asm/hvm/svm/svmdebug.h>
#include <asm/hvm/svm/vmcb.h>
@@ -1865,6 +1866,11 @@ static int cf_check svm_msr_read_intercept(
break;
case MSR_K8_SYSCFG:
+ if ( is_sev_domain(d) )
+ {
+ *msr_content = SYSCFG_MEM_ENCRYPT;
+ break;
+ }
case MSR_K8_TOP_MEM1:
case MSR_K8_TOP_MEM2:
case MSR_K8_VM_CR:
diff --git a/xen/arch/x86/hvm/svm/vmcb.c b/xen/arch/x86/hvm/svm/vmcb.c
index 4e1f61dbe0..5157afe733 100644
--- a/xen/arch/x86/hvm/svm/vmcb.c
+++ b/xen/arch/x86/hvm/svm/vmcb.c
@@ -15,6 +15,7 @@
#include <asm/hvm/svm/vmcb.h>
#include <asm/msr-index.h>
#include <asm/p2m.h>
+#include <asm/hvm/svm/sev.h>
#include <asm/hvm/svm/svm.h>
#include <asm/hvm/svm/svmdebug.h>
#include <asm/spec_ctrl.h>
@@ -192,15 +193,19 @@ int svm_create_vmcb(struct vcpu *v)
svm->vmcb = nv->nv_n1vmcx;
rc = construct_vmcb(v);
if ( rc != 0 )
- {
- free_vmcb(nv->nv_n1vmcx);
- nv->nv_n1vmcx = NULL;
- svm->vmcb = NULL;
- return rc;
- }
+ goto err;
+
+ if ( is_sev_domain(v->domain) )
+ vmcb_set_np_ctrl(svm->vmcb, vmcb_get_np_ctrl(svm->vmcb) | NPCTRL_SEV_ENABLE);
svm->vmcb_pa = nv->nv_n1vmcx_pa = virt_to_maddr(svm->vmcb);
return 0;
+
+err:
+ free_vmcb(nv->nv_n1vmcx);
+ nv->nv_n1vmcx = NULL;
+ svm->vmcb = NULL;
+ return rc;
}
void svm_destroy_vmcb(struct vcpu *v)
diff --git a/xen/arch/x86/include/asm/coco.h b/xen/arch/x86/include/asm/coco.h
new file mode 100644
index 0000000000..874ef56327
--- /dev/null
+++ b/xen/arch/x86/include/asm/coco.h
@@ -0,0 +1,8 @@
+#ifndef __ARCH_X86_COCO_H
+#define __ARCH_X86_COCO_H
+
+#include <xen/coco.h>
+
+extern struct coco_ops sev_coco_ops;
+
+#endif /* __ARCH_X86_COCO_H */
\ No newline at end of file
diff --git a/xen/arch/x86/include/asm/hvm/svm/sev.h b/xen/arch/x86/include/asm/hvm/svm/sev.h
new file mode 100644
index 0000000000..b7b5ab5591
--- /dev/null
+++ b/xen/arch/x86/include/asm/hvm/svm/sev.h
@@ -0,0 +1,14 @@
+#ifndef __XEN_HVM_SEV_H__
+#define __XEN_HVM_SEV_H__
+
+#include <asm/nospec.h>
+#include <asm/cpufeature.h>
+
+#include <xen/sched.h>
+
+static always_inline bool is_sev_domain(const struct domain *d)
+{
+ return cpu_has_sev && evaluate_nospec(d->options & XEN_DOMCTL_CDF_coco);
+}
+
+#endif /* __XEN_HVM_SEV_H__ */
diff --git a/xen/arch/x86/include/asm/hvm/svm/svm.h b/xen/arch/x86/include/asm/hvm/svm/svm.h
index 1254e5f3ee..efd54511aa 100644
--- a/xen/arch/x86/include/asm/hvm/svm/svm.h
+++ b/xen/arch/x86/include/asm/hvm/svm/svm.h
@@ -9,6 +9,8 @@
#ifndef __ASM_X86_HVM_SVM_H__
#define __ASM_X86_HVM_SVM_H__
+#include <xen/stdint.h>
+
void svm_asid_init(void);
void svm_vcpu_assign_asid(struct vcpu *v);
void svm_vcpu_set_tlb_control(struct vcpu *v);
@@ -26,6 +28,16 @@ bool svm_load_segs(unsigned int ldt_ents, unsigned long ldt_base,
unsigned long fs_base, unsigned long gs_base,
unsigned long gs_shadow);
+struct sev_state {
+ uint32_t asp_handle;
+ uint32_t asp_policy;
+ uint8_t measure[96];
+ uint32_t measure_len; /* 96 bytes */
+ uint8_t state;
+
+ unsigned long flags;
+};
+
struct svm_domain {
/* OSVW MSRs */
union {
@@ -35,6 +47,10 @@ struct svm_domain {
uint64_t status;
};
} osvw;
+
+ #ifdef CONFIG_COCO_AMD_SEV
+ struct sev_state sev;
+ #endif
};
extern u32 svm_feature_flags;
--
2.49.0
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
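The intent of the loop in sev_domain_prepare_initial_mem() above is to merge physically contiguous frames into segments of at most 512 pages, one LAUNCH_UPDATE_DATA command per segment. That coalescing step can be sketched in isolation as follows. This is a simplified model, not the patch's code: struct frame_run and coalesce_runs() are hypothetical names, and the frame numbers are assumed to be already collected in an array.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_RUN_FRAMES 512 /* segment cap used by the patch */

/* Hypothetical descriptor for one physically contiguous segment. */
struct frame_run {
    uint64_t base; /* first frame number of the segment */
    size_t nr;     /* number of frames in the segment */
};

/*
 * Merge consecutive frame numbers into runs of at most MAX_RUN_FRAMES.
 * Returns the number of runs written to out (never more than max_out).
 */
static size_t coalesce_runs(const uint64_t *mfns, size_t count,
                            struct frame_run *out, size_t max_out)
{
    size_t n = 0, i = 0;

    while ( i < count && n < max_out )
    {
        size_t len = 1;

        /* Extend the run while the next frame directly follows. */
        while ( i + len < count && len < MAX_RUN_FRAMES &&
                mfns[i + len] == mfns[i] + len )
            len++;

        out[n].base = mfns[i];
        out[n].nr = len;
        n++;
        i += len;
    }

    return n;
}
```

Each resulting run corresponds to one address/length pair that would be handed to the firmware.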
* [RFC PATCH 14/16] sev/emulate: Handle some non-emulable HVM paths
2025-05-16 9:31 [RFC PATCH 00/16] Confidential computing and AMD SEV support Teddy Astie
` (12 preceding siblings ...)
2025-05-16 10:24 ` [RFC PATCH 13/16] x86/coco: Introduce AMD-SEV support Teddy Astie
@ 2025-05-16 10:24 ` Teddy Astie
2025-05-16 10:24 ` [RFC PATCH 15/16] HACK: coco: Leak ASID for coco guests Teddy Astie
` (2 subsequent siblings)
16 siblings, 0 replies; 22+ messages in thread
From: Teddy Astie @ 2025-05-16 10:24 UTC (permalink / raw)
To: xen-devel
Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné,
Andrei Semenov
From: Andrei Semenov <andrei.semenov@vates.tech>
Some code paths are not emulable under SEV or need special handling.
Signed-off-by: Andrei Semenov <andrei.semenov@vates.tech>
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
xen/arch/x86/hvm/emulate.c | 137 ++++++++++++++++++++++++++++++++-----
xen/arch/x86/hvm/hvm.c | 13 ++++
2 files changed, 133 insertions(+), 17 deletions(-)
diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 6ed8e03475..7ac3be2d59 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -26,6 +26,7 @@
#include <asm/hvm/hvm.h>
#include <asm/hvm/monitor.h>
#include <asm/hvm/support.h>
+#include <asm/hvm/svm/sev.h>
#include <asm/iocap.h>
#include <asm/vm_event.h>
@@ -689,6 +690,9 @@ static void *hvmemul_map_linear_addr(
goto unhandleable;
}
+ if ( is_sev_domain(curr->domain) && (nr_frames > 1) )
+ goto unhandleable;
+
for ( i = 0; i < nr_frames; i++ )
{
enum hvm_translation_result res;
@@ -703,8 +707,16 @@ static void *hvmemul_map_linear_addr(
/* Error checking. Confirm that the current slot is clean. */
ASSERT(mfn_x(*mfn) == 0);
- res = hvm_translate_get_page(curr, addr, true, pfec,
+ if ( is_sev_domain(curr->domain) )
+ {
+ struct hvm_vcpu_io *hvio = &curr->arch.hvm.hvm_io;
+ unsigned long gpa = pfn_to_paddr(hvio->mmio_gpfn) | (addr & ~PAGE_MASK);
+ res = hvm_translate_get_page(curr, gpa, false, pfec,
&pfinfo, &page, &gfn, &p2mt);
+ }
+ else
+ res = hvm_translate_get_page(curr, addr, true, pfec,
+ &pfinfo, &page, &gfn, &p2mt);
switch ( res )
{
@@ -1173,6 +1185,7 @@ static int hvmemul_linear_mmio_access(
dir, buffer_offset);
paddr_t gpa;
unsigned long one_rep = 1;
+ unsigned int chunk;
int rc;
if ( cache == NULL )
@@ -1183,21 +1196,50 @@ static int hvmemul_linear_mmio_access(
ASSERT_UNREACHABLE();
return X86EMUL_UNHANDLEABLE;
}
+
+ chunk = min_t(unsigned int, size, PAGE_SIZE - offset);
if ( known_gpfn )
gpa = pfn_to_paddr(hvio->mmio_gpfn) | offset;
else
{
- rc = hvmemul_linear_to_phys(gla, &gpa, size, &one_rep, pfec,
+ if ( is_sev_domain(current->domain) )
+ gpa = pfn_to_paddr(hvio->mmio_gpfn) | offset;
+ else
+ {
+ rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
+ hvmemul_ctxt);
+ if ( rc != X86EMUL_OKAY )
+ return rc;
+ }
+
+ latch_linear_to_phys(hvio, gla, gpa, dir == IOREQ_WRITE);
+ }
+
+ for ( ;; )
+ {
+ rc = hvmemul_phys_mmio_access(cache, gpa, chunk, dir, buffer, buffer_offset);
+ if ( rc != X86EMUL_OKAY )
+ break;
+
+ gla += chunk;
+ buffer_offset += chunk;
+ size -= chunk;
+
+ if ( size == 0 )
+ break;
+
+ if ( is_sev_domain(current->domain) )
+ return X86EMUL_UNHANDLEABLE;
+
+ chunk = min_t(unsigned int, size, PAGE_SIZE);
+ rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
hvmemul_ctxt);
if ( rc != X86EMUL_OKAY )
return rc;
-
- latch_linear_to_phys(hvio, gla, gpa, dir == IOREQ_WRITE);
}
- return hvmemul_phys_mmio_access(cache, gpa, size, dir, buffer,
- buffer_offset);
+ return rc;
}
static inline int hvmemul_linear_mmio_read(
@@ -1254,6 +1296,9 @@ static int linear_read(unsigned long addr, unsigned int bytes, void *p_data,
{
unsigned int part1 = PAGE_SIZE - offset;
+ if ( is_sev_domain(current->domain) )
+ return X86EMUL_UNHANDLEABLE;
+
/* Split the access at the page boundary. */
rc = linear_read(addr, part1, p_data, pfec, hvmemul_ctxt);
if ( rc != X86EMUL_OKAY )
@@ -1278,11 +1323,25 @@ static int linear_read(unsigned long addr, unsigned int bytes, void *p_data,
* upon replay) the RAM access for anything that's ahead of or past MMIO,
* i.e. in RAM.
*/
- cache = hvmemul_find_mmio_cache(hvio, start, IOREQ_READ, ~0);
- if ( !cache ||
- addr + bytes <= start + cache->skip ||
- addr >= start + cache->size )
- rc = hvm_copy_from_guest_linear(p_data, addr, bytes, pfec, &pfinfo);
+ cache = hvmemul_find_mmio_cache(hvio, start, IOREQ_READ, ~0);
+ if ( !cache ||
+ addr + bytes <= start + cache->skip ||
+ addr >= start + cache->size )
+ {
+ if ( is_sev_domain(current->domain) )
+ {
+ if ( hvio->mmio_gpfn )
+ {
+ paddr_t gpa;
+ gpa = pfn_to_paddr(hvio->mmio_gpfn) | (addr & ~PAGE_MASK);
+ rc = hvm_copy_from_guest_phys(p_data, gpa, bytes);
+ }
+ else
+ return X86EMUL_UNHANDLEABLE;
+ }
+ else
+ rc = hvm_copy_from_guest_linear(p_data, addr, bytes, pfec, &pfinfo);
+ }
switch ( rc )
{
@@ -1325,6 +1384,9 @@ static int linear_write(unsigned long addr, unsigned int bytes, void *p_data,
{
unsigned int part1 = PAGE_SIZE - offset;
+ if ( is_sev_domain(current->domain) )
+ return X86EMUL_UNHANDLEABLE;
+
/* Split the access at the page boundary. */
rc = linear_write(addr, part1, p_data, pfec, hvmemul_ctxt);
if ( rc != X86EMUL_OKAY )
@@ -1340,9 +1402,23 @@ static int linear_write(unsigned long addr, unsigned int bytes, void *p_data,
/* See commentary in linear_read(). */
cache = hvmemul_find_mmio_cache(hvio, start, IOREQ_WRITE, ~0);
if ( !cache ||
- addr + bytes <= start + cache->skip ||
- addr >= start + cache->size )
- rc = hvm_copy_to_guest_linear(addr, p_data, bytes, pfec, &pfinfo);
+ addr + bytes <= start + cache->skip ||
+ addr >= start + cache->size )
+ {
+ if ( is_sev_domain(current->domain) )
+ {
+ if ( hvio->mmio_gpfn )
+ {
+ paddr_t gpa;
+ gpa = pfn_to_paddr(hvio->mmio_gpfn) | (addr & ~PAGE_MASK);
+ rc = hvm_copy_to_guest_phys(gpa, p_data, bytes, current);
+ }
+ else
+ return X86EMUL_UNHANDLEABLE;
+ }
+ else
+ rc = hvm_copy_to_guest_linear(addr, p_data, bytes, pfec, &pfinfo);
+ }
switch ( rc )
{
@@ -1430,7 +1506,12 @@ int cf_check hvmemul_insn_fetch(
if ( !bytes ||
unlikely((insn_off + bytes) > hvmemul_ctxt->insn_buf_bytes) )
{
- int rc = __hvmemul_read(x86_seg_cs, offset, p_data, bytes,
+ int rc;
+
+ if ( is_sev_domain(current->domain) )
+ return X86EMUL_UNHANDLEABLE;
+
+ rc = __hvmemul_read(x86_seg_cs, offset, p_data, bytes,
hvm_access_insn_fetch, hvmemul_ctxt);
if ( rc == X86EMUL_OKAY && bytes )
@@ -1485,6 +1566,7 @@ static int cf_check hvmemul_write(
if ( !known_gla(addr, bytes, pfec) )
{
mapping = hvmemul_map_linear_addr(addr, bytes, pfec, hvmemul_ctxt);
+
if ( IS_ERR(mapping) )
return ~PTR_ERR(mapping);
}
@@ -1719,6 +1801,9 @@ static int cf_check hvmemul_cmpxchg(
int rc;
void *mapping = NULL;
+ if ( is_sev_domain(current->domain) )
+ return X86EMUL_UNHANDLEABLE;
+
rc = hvmemul_virtual_to_linear(
seg, offset, bytes, NULL, hvm_access_write, hvmemul_ctxt, &addr);
if ( rc != X86EMUL_OKAY )
@@ -1821,6 +1906,9 @@ static int cf_check hvmemul_rep_ins(
p2m_type_t p2mt;
int rc;
+ if ( is_sev_domain(current->domain) )
+ return X86EMUL_UNHANDLEABLE;
+
rc = hvmemul_virtual_to_linear(
dst_seg, dst_offset, bytes_per_rep, reps, hvm_access_write,
hvmemul_ctxt, &addr);
@@ -1899,6 +1987,9 @@ static int cf_check hvmemul_rep_outs(
p2m_type_t p2mt;
int rc;
+ if ( is_sev_domain(current->domain) )
+ return X86EMUL_UNHANDLEABLE;
+
if ( unlikely(hvmemul_ctxt->set_context) )
return hvmemul_rep_outs_set_context(dst_port, bytes_per_rep, reps);
@@ -1944,6 +2035,9 @@ static int cf_check hvmemul_rep_movs(
int rc, df = !!(ctxt->regs->eflags & X86_EFLAGS_DF);
char *buf;
+ if ( is_sev_domain(current->domain) )
+ return X86EMUL_UNHANDLEABLE;
+
rc = hvmemul_virtual_to_linear(
src_seg, src_offset, bytes_per_rep, reps, hvm_access_read,
hvmemul_ctxt, &saddr);
@@ -2109,9 +2203,13 @@ static int cf_check hvmemul_rep_stos(
paddr_t gpa;
p2m_type_t p2mt;
bool df = ctxt->regs->eflags & X86_EFLAGS_DF;
- int rc = hvmemul_virtual_to_linear(seg, offset, bytes_per_rep, reps,
- hvm_access_write, hvmemul_ctxt, &addr);
+ int rc;
+
+ if ( is_sev_domain(current->domain) )
+ return X86EMUL_UNHANDLEABLE;
+ rc = hvmemul_virtual_to_linear(seg, offset, bytes_per_rep, reps,
+ hvm_access_write, hvmemul_ctxt, &addr);
if ( rc != X86EMUL_OKAY )
return rc;
@@ -2770,6 +2868,7 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
struct vcpu *curr = current;
uint32_t new_intr_shadow;
struct hvm_vcpu_io *hvio = &curr->arch.hvm.hvm_io;
+
int rc;
/*
@@ -2983,6 +3082,9 @@ void hvm_emulate_init_per_insn(
unsigned int pfec = PFEC_page_present | PFEC_insn_fetch;
unsigned long addr;
+ if ( is_sev_domain(current->domain) )
+ goto out;
+
if ( hvmemul_ctxt->seg_reg[x86_seg_ss].dpl == 3 )
pfec |= PFEC_user_mode;
@@ -3000,6 +3102,7 @@ void hvm_emulate_init_per_insn(
sizeof(hvmemul_ctxt->insn_buf) : 0;
}
+out:
hvmemul_ctxt->is_mem_access = false;
}
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index e1bcf8e086..d3060329fb 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -56,6 +56,7 @@
#include <asm/hvm/monitor.h>
#include <asm/hvm/viridian.h>
#include <asm/hvm/vm_event.h>
+#include <asm/hvm/svm/sev.h>
#include <asm/altp2m.h>
#include <asm/mtrr.h>
#include <asm/apic.h>
@@ -3477,6 +3478,9 @@ enum hvm_translation_result hvm_copy_to_guest_linear(
unsigned long addr, const void *buf, unsigned int size, uint32_t pfec,
pagefault_info_t *pfinfo)
{
+ if ( is_sev_domain(current->domain) )
+ return HVMTRANS_unhandleable;
+
return __hvm_copy((void *)buf /* HVMCOPY_to_guest doesn't modify */,
addr, size, current, HVMCOPY_to_guest | HVMCOPY_linear,
PFEC_page_present | PFEC_write_access | pfec, pfinfo);
@@ -3486,6 +3490,9 @@ enum hvm_translation_result hvm_copy_from_guest_linear(
void *buf, unsigned long addr, unsigned int size, uint32_t pfec,
pagefault_info_t *pfinfo)
{
+ if ( is_sev_domain(current->domain) )
+ return HVMTRANS_unhandleable;
+
return __hvm_copy(buf, addr, size, current,
HVMCOPY_from_guest | HVMCOPY_linear,
PFEC_page_present | pfec, pfinfo);
@@ -3495,6 +3502,9 @@ enum hvm_translation_result hvm_copy_from_vcpu_linear(
void *buf, unsigned long addr, unsigned int size, struct vcpu *v,
unsigned int pfec)
{
+ if ( is_sev_domain(v->domain) )
+ return HVMTRANS_unhandleable;
+
return __hvm_copy(buf, addr, size, v,
HVMCOPY_from_guest | HVMCOPY_linear,
PFEC_page_present | pfec, NULL);
@@ -3522,6 +3532,9 @@ unsigned int clear_user_hvm(void *to, unsigned int len)
{
int rc;
+ if ( is_sev_domain(current->domain) )
+ return len; /* Nothing was cleared; HVMTRANS_* is not this function's return type. */
+
if ( current->hcall_compat && is_compat_arg_xlat_range(to, len) )
{
memset(to, 0x00, len);
--
2.49.0
* [RFC PATCH 15/16] HACK: coco: Leak ASID for coco guests
2025-05-16 9:31 [RFC PATCH 00/16] Confidential computing and AMD SEV support Teddy Astie
` (13 preceding siblings ...)
2025-05-16 10:24 ` [RFC PATCH 14/16] sev/emulate: Handle some non-emulable HVM paths Teddy Astie
@ 2025-05-16 10:24 ` Teddy Astie
2025-05-16 10:24 ` [RFC PATCH 16/16] HACK: Add sev_console hypercall Teddy Astie
2025-05-16 10:52 ` [RFC PATCH 00/16] Confidential computing and AMD SEV support Jürgen Groß
16 siblings, 0 replies; 22+ messages in thread
From: Teddy Astie @ 2025-05-16 10:24 UTC (permalink / raw)
To: xen-devel; +Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné
In order to reuse an ASID in a SEV guest, we need to perform a
WBINVD on all pCPUs that ran the guest, then a DF_FLUSH on the PSP.
Just leak the ASID for now.
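
The ordering described above can be sketched as follows. This is a minimal, self-contained model rather than Xen code: `wbinvd_on_cpu()`, `psp_df_flush()` and `asid_release()` are hypothetical stand-ins that only record the order of operations, and the `ran_on` bitmask of pCPUs is an assumed bookkeeping structure that the series does not define.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical log of the steps performed. A real implementation would
 * execute WBINVD on each pCPU (e.g. via IPI) and send DF_FLUSH to the PSP.
 */
static char steps[128];

static void record(const char *s)
{
    strncat(steps, s, sizeof(steps) - strlen(steps) - 1);
}

static void wbinvd_on_cpu(unsigned int cpu)
{
    char b[16];

    snprintf(b, sizeof(b), "W%u ", cpu);
    record(b);
}

static void psp_df_flush(void)
{
    record("DF ");
}

static void asid_release(unsigned int asid)
{
    char b[16];

    snprintf(b, sizeof(b), "F%u", asid);
    record(b);
}

/* ran_on: bit N is set if the guest ever ran on pCPU N (assumed tracking). */
static void sev_asid_recycle(unsigned int asid, uint64_t ran_on)
{
    /* 1. Write back and invalidate caches on every pCPU that ran the guest. */
    for ( unsigned int cpu = 0; cpu < 64; cpu++ )
        if ( ran_on & (UINT64_C(1) << cpu) )
            wbinvd_on_cpu(cpu);

    /* 2. Only then ask the PSP to flush its data fabric state. */
    psp_df_flush();

    /* 3. The ASID may now be handed back to the allocator. */
    asid_release(asid);
}
```

Until something along these lines exists, skipping step 3 (as this patch does) avoids handing out an ASID whose cache lines may still be live.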
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
xen/arch/x86/hvm/hvm.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index d3060329fb..ced58ccf4b 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -795,7 +795,10 @@ void hvm_domain_destroy(struct domain *d)
list_del(&ioport->list);
xfree(ioport);
}
- hvm_asid_free(&d->arch.hvm.asid);
+ if ( !is_coco_domain(d) )
+ hvm_asid_free(&d->arch.hvm.asid);
+ else
+ printk("coco: Leaking ASID %x: TODO (DF_FLUSH handling)\n", d->arch.hvm.asid.asid);
destroy_vpci_mmcfg(d);
}
--
2.49.0
* [RFC PATCH 16/16] HACK: Add sev_console hypercall
2025-05-16 9:31 [RFC PATCH 00/16] Confidential computing and AMD SEV support Teddy Astie
` (14 preceding siblings ...)
2025-05-16 10:24 ` [RFC PATCH 15/16] HACK: coco: Leak ASID for coco guests Teddy Astie
@ 2025-05-16 10:24 ` Teddy Astie
2025-05-16 10:52 ` [RFC PATCH 00/16] Confidential computing and AMD SEV support Jürgen Groß
16 siblings, 0 replies; 22+ messages in thread
From: Teddy Astie @ 2025-05-16 10:24 UTC (permalink / raw)
To: xen-devel
Cc: Teddy Astie, Andrew Cooper, Anthony PERARD, Michal Orzel,
Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini
Introduce a basic console hypercall for debugging under SEV, for use
when the PV console is not yet usable. It is later used by the
earlyprintk of the experimental SEV Linux branch.
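
For illustration, a guest-side caller might look roughly like the sketch below. It is not taken from the Linux branch mentioned above: `hypercall1()` is a hypothetical stand-in for the guest's hypercall primitive (here it appends to a local buffer so the sketch runs outside a guest), and `sev_early_puts()` is an assumed helper name. Only the hypercall number and the one-character-per-call convention come from this patch.

```c
#include <stddef.h>

#define __HYPERVISOR_sev_console_op 45  /* value introduced by this patch */

/* Hypothetical capture buffer standing in for the hypervisor console. */
static char console_buf[64];
static size_t console_len;

/* Hypothetical stand-in for a real one-argument hypercall stub. */
static long hypercall1(unsigned int nr, unsigned long arg)
{
    if ( nr != __HYPERVISOR_sev_console_op )
        return -1;

    /* Mirror do_sev_console_op(): only the low byte is printed. */
    if ( console_len < sizeof(console_buf) - 1 )
        console_buf[console_len++] = (char)(unsigned char)arg;

    return 0;
}

/* One hypercall per character, matching do_sev_console_op(unsigned long c). */
static void sev_early_puts(const char *s)
{
    for ( ; *s; s++ )
        hypercall1(__HYPERVISOR_sev_console_op, (unsigned char)*s);
}
```

Passing the character by value in a register means the hypervisor never has to read guest memory to print it.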
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
xen/common/coco.c | 6 ++++++
xen/include/hypercall-defs.c | 2 ++
xen/include/public/xen.h | 1 +
3 files changed, 9 insertions(+)
diff --git a/xen/common/coco.c b/xen/common/coco.c
index d9bd17628d..23c0da6281 100644
--- a/xen/common/coco.c
+++ b/xen/common/coco.c
@@ -131,4 +131,10 @@ long do_coco_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
}
}
+long do_sev_console_op(unsigned long c)
+{
+ printk("%c", (unsigned char)c);
+ return 0;
+}
+
__initcall(coco_init);
\ No newline at end of file
diff --git a/xen/include/hypercall-defs.c b/xen/include/hypercall-defs.c
index 6c01a9e395..19f40f0b38 100644
--- a/xen/include/hypercall-defs.c
+++ b/xen/include/hypercall-defs.c
@@ -210,6 +210,7 @@ hypfs_op(unsigned int cmd, const char *arg1, unsigned long arg2, void *arg3, uns
xenpmu_op(unsigned int op, xen_pmu_params_t *arg)
#endif
coco_op(unsigned int cmd, void *arg)
+sev_console_op(unsigned long c)
#ifdef CONFIG_PV
caller: pv64
@@ -297,5 +298,6 @@ mca do do - - -
paging_domctl_cont do do do do -
#endif
coco_op do do do do do
+sev_console_op do do do do -
#endif /* !CPPCHECK */
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index e656d6f617..04fc891353 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -119,6 +119,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
#define __HYPERVISOR_dm_op 41
#define __HYPERVISOR_hypfs_op 42
#define __HYPERVISOR_coco_op 43
+#define __HYPERVISOR_sev_console_op 45
/* Architecture-specific hypercall definitions. */
#define __HYPERVISOR_arch_0 48
--
2.49.0
* Re: [RFC PATCH 00/16] Confidential computing and AMD SEV support
2025-05-16 9:31 [RFC PATCH 00/16] Confidential computing and AMD SEV support Teddy Astie
` (15 preceding siblings ...)
2025-05-16 10:24 ` [RFC PATCH 16/16] HACK: Add sev_console hypercall Teddy Astie
@ 2025-05-16 10:52 ` Jürgen Groß
2025-05-16 12:36 ` Teddy Astie
16 siblings, 1 reply; 22+ messages in thread
From: Jürgen Groß @ 2025-05-16 10:52 UTC (permalink / raw)
To: Teddy Astie, xen-devel
Cc: Jan Beulich, Andrew Cooper, Roger Pau Monné, Anthony PERARD,
Michal Orzel, Julien Grall, Stefano Stabellini, Tim Deegan,
Christian Lindig, David Scott
On 16.05.25 11:31, Teddy Astie wrote:
> Hello,
>
> This series introduces support for confidential computing along with an
> AMD SEV implementation. It also bundles some of the functional
> requirements (ASID scheme, ABI, ...) which could be separated if needed.
>
> (I bundled everything in this series to have a complete, coherent series)
>
> This work receives funding from the Hyper Open X consortium (France 2030).
>
> # Concepts
>
> A confidential guest is a bit special as:
> - its memory is by default encrypted or not directly accessible by the
> hypervisor, and thus by other domains/dom0 as well; it must be explicitly
> shared by the guest itself
> - so its page-tables are also not accessible
>
> # Implementation
>
> Confidential computing is exposed in a uniform way regardless of actual
> implementation (SEV, TDX, RME, ...) through the coco_op hypercall (mostly
> for use by the Dom0 toolstack). This interface provides a way to query
> information on the coco platform (support status, features (un)safety,
> ...), and prepare initial guest memory.
> Only HVM domains have support for confidential computing.
> (in the future, we may want to have attestation support)
>
> In order to create a confidential computing domain, the process is as follows:
> - create a HVM/PVH domain with XEN_DOMCTL_CDF_coco
> - populate initial memory as usual
> - apply coco_prepare_initial_mem on all initial pages
> (under SEV, this will encrypt memory)
>
> Under xl, it is exposed through the `coco` parameter ("coco = 1").
Wouldn't it make sense to allow specifying the kind of domain
(SEV, SEV-ES, SEV-SNP, TDX) like KVM does?
It might not be needed right now, but in future this could be needed
(e.g. when allowing migration between hosts with different SEV
features).
I don't think this is important during RFC phase, but the final
configuration and hypervisor interfaces of this series should allow
that.
Juergen
* Re: [RFC PATCH 00/16] Confidential computing and AMD SEV support
2025-05-16 10:52 ` [RFC PATCH 00/16] Confidential computing and AMD SEV support Jürgen Groß
@ 2025-05-16 12:36 ` Teddy Astie
0 siblings, 0 replies; 22+ messages in thread
From: Teddy Astie @ 2025-05-16 12:36 UTC (permalink / raw)
To: Jürgen Groß, xen-devel
Cc: Jan Beulich, Andrew Cooper, Roger Pau Monné, Anthony PERARD,
Michal Orzel, Julien Grall, Stefano Stabellini, Tim Deegan,
Christian Lindig, David Scott
On 16/05/2025 at 12:54, Jürgen Groß wrote:
> On 16.05.25 11:31, Teddy Astie wrote:
>>
>> In order to create a confidential computing domain, the process is
>> as follows:
>> - create a HVM/PVH domain with XEN_DOMCTL_CDF_coco
>> - populate initial memory as usual
>> - apply coco_prepare_initial_mem on all initial pages
>> (under SEV, this will encrypt memory)
>>
>> Under xl, it is exposed through the `coco` parameter ("coco = 1").
>
> Wouldn't it make sense to allow specifying the kind of domain
> (SEV, SEV-ES, SEV-SNP, TDX) like KVM does?
>
Yes, I was thinking of exposing it through an optional arch-specific
parameter for specifying SEV-specific settings (enable SNP, ...),
and by default relying on what the platform provides with a "best default"
configuration.
(AFAICT it's not possible to have both SEV (AMD-specific) and TDX
(Intel-specific), or at least not yet)
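
To make the idea concrete, here is a rough sketch of how an optional kind with a "best default" fallback could be resolved. Every name in it (`enum coco_kind`, `struct coco_platform_caps`, `coco_resolve_kind()`) is hypothetical and not part of this series; the preference order SNP > ES > plain SEV is also just an assumption.

```c
#include <stdbool.h>

/* Hypothetical: what the toolstack asked for ("default" = let Xen choose). */
enum coco_kind {
    COCO_KIND_DEFAULT,
    COCO_KIND_SEV,
    COCO_KIND_SEV_ES,
    COCO_KIND_SEV_SNP,
};

/* Hypothetical: what the platform reports as usable. */
struct coco_platform_caps {
    bool sev, sev_es, sev_snp;
};

/* Returns the kind to use, or -1 if the request cannot be satisfied. */
static int coco_resolve_kind(enum coco_kind requested,
                             const struct coco_platform_caps *caps)
{
    switch ( requested )
    {
    case COCO_KIND_DEFAULT:
        /* "Best default": pick the strongest mode the platform offers. */
        if ( caps->sev_snp )
            return COCO_KIND_SEV_SNP;
        if ( caps->sev_es )
            return COCO_KIND_SEV_ES;
        if ( caps->sev )
            return COCO_KIND_SEV;
        return -1;

    case COCO_KIND_SEV:
        return caps->sev ? COCO_KIND_SEV : -1;

    case COCO_KIND_SEV_ES:
        return caps->sev_es ? COCO_KIND_SEV_ES : -1;

    case COCO_KIND_SEV_SNP:
        return caps->sev_snp ? COCO_KIND_SEV_SNP : -1;
    }

    return -1;
}
```

An explicit request fails rather than silently downgrading, which is the behaviour one would probably want for the migration case raised above.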
> It might not be needed right now, but in future this could be needed
> (e.g. when allowing migration between hosts with different SEV
> features).
>
> I don't think this is important during RFC phase, but the final
> configuration and hypervisor interfaces of this series should allow
> that.
>
>
> Juergen
Teddy