[PATCH 00/10] KVM: SEV: allow customizing VMSA features

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 00/10] KVM: SEV: allow customizing VMSA features
@ 2024-02-09 18:37 Paolo Bonzini
  2024-02-09 18:37 ` [PATCH 01/10] KVM: SEV: fix compat ABI for KVM_MEMORY_ENCRYPT_OP Paolo Bonzini
                   ` (11 more replies)
  0 siblings, 12 replies; 35+ messages in thread
From: Paolo Bonzini @ 2024-02-09 18:37 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, michael.roth, aik, isaku.yamahata

The idea that no parameter would ever be necessary when enabling SEV or
SEV-ES for a VM was decidedly optimistic.  The first source of variability
that was encountered is the desired set of VMSA features, as that affects
the measurement of the VM's initial state and cannot be changed
arbitrarily by the hypervisor.

This series adds all the APIs that are needed to customize the features,
with room for future enhancements:

- a new /dev/kvm device attribute to retrieve the set of supported
  features (right now, only debug swap)

- a new sub-operation for KVM_MEM_ENCRYPT_OP that can take a struct,
  replacing the existing KVM_SEV_INIT and KVM_SEV_ES_INIT

It then puts the new op to work by including the VMSA features as a field
of the The existing KVM_SEV_INIT and KVM_SEV_ES_INIT use the full set of
supported VMSA features for backwards compatibility; but I am considering
also making them use zero as the feature mask, and will gladly adjust the
patches if so requested.

In order to avoid creating *two* new KVM_MEM_ENCRYPT_OPs, I decided that
I could as well make SEV and SEV-ES use VM types.  And then, why not make
a SEV-ES VM, when created with the new VM type instead of KVM_SEV_ES_INIT,
reject KVM_GET_REGS/KVM_SET_REGS and friends on the vCPU file descriptor
once the VMSA has been encrypted...  Which is how the API should have
always behaved.

The series is defined as follows:

- patches 1 and 2 are unrelated fixes and improvements for the SEV API

- patches 3 to 5 introduce the new device attribute to retrieve supported
  VMSA features

- patches 6 to 7 introduce new infrastructure for VM types, partly lifted
  out of the TDX patches

- patches 8 and 9 introduce respectively the new VM types for SEV and
  SEV-ES, and KVM_SEV_INIT2 as a new sub-operation for KVM_MEM_ENCRYPT_OP.

- patches 10 and 11 are tests.  The last patch is not intended to be applied
  in order to keep some coverage of KVM_SEV_INIT and KVM_SEV_ES_INIT in
  self tests; but it is there as "proof" that migration can be made to
  work with the new API as well.


The idea is that SEV SNP will only ever support KVM_SEV_INIT2.  I have
patches in progress for QEMU to support this new API.

Thanks,

Paolo

Isaku Yamahata (1):
  KVM: x86: Add is_vm_type_supported callback

Paolo Bonzini (10):
  KVM: x86: define standard behavior for bits 0/1 of VM type
  KVM: SEV: fix compat ABI for KVM_MEMORY_ENCRYPT_OP
  KVM: introduce new vendor op for KVM_GET_DEVICE_ATTR
  Documentation: kvm/sev: separate description of firmware
  KVM: SEV: publish supported VMSA features
  KVM: SEV: store VMSA features in kvm_sev_info
  KVM: SEV: define VM types for SEV and SEV-ES
  KVM: SEV: introduce KVM_SEV_INIT2 operation
  selftests: kvm: add tests for KVM_SEV_INIT2
  selftests: kvm: switch sev_migrate_tests to KVM_SEV_INIT2

 Documentation/virt/kvm/api.rst                |   2 +
 .../virt/kvm/x86/amd-memory-encryption.rst    |  81 +++++++--
 arch/x86/include/asm/kvm-x86-ops.h            |   2 +
 arch/x86/include/asm/kvm_host.h               |   4 +-
 arch/x86/include/uapi/asm/kvm.h               |  42 ++++-
 arch/x86/kvm/svm/sev.c                        | 104 +++++++++++-
 arch/x86/kvm/svm/svm.c                        |  14 +-
 arch/x86/kvm/svm/svm.h                        |   6 +-
 arch/x86/kvm/x86.c                            | 158 ++++++++++++++----
 arch/x86/kvm/x86.h                            |   2 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/include/kvm_util_base.h     |   6 +-
 .../selftests/kvm/set_memory_region_test.c    |   8 +-
 .../selftests/kvm/x86_64/sev_init2_tests.c    | 147 ++++++++++++++++
 .../selftests/kvm/x86_64/sev_migrate_tests.c  |  45 ++---
 15 files changed, 530 insertions(+), 92 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86_64/sev_init2_tests.c

-- 
2.39.0



^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 01/10] KVM: SEV: fix compat ABI for KVM_MEMORY_ENCRYPT_OP
  2024-02-09 18:37 [PATCH 00/10] KVM: SEV: allow customizing VMSA features Paolo Bonzini
@ 2024-02-09 18:37 ` Paolo Bonzini
  2024-02-14 22:50   ` Michael Roth
  2024-02-09 18:37 ` [PATCH 02/10] KVM: introduce new vendor op for KVM_GET_DEVICE_ATTR Paolo Bonzini
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 35+ messages in thread
From: Paolo Bonzini @ 2024-02-09 18:37 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, michael.roth, aik, isaku.yamahata

The data structs for KVM_MEMORY_ENCRYPT_OP have different sizes for 32- and 64-bit
kernels, but they do not make any attempt to convert from one ABI to the other.
Fix this by adding the appropriate padding.

No functional change intended for 64-bit userspace.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/uapi/asm/kvm.h | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 0ad6bda1fc39..b305daff056e 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -687,6 +687,7 @@ enum sev_cmd_id {
 
 struct kvm_sev_cmd {
 	__u32 id;
+	__u32 pad0;
 	__u64 data;
 	__u32 error;
 	__u32 sev_fd;
@@ -697,28 +698,35 @@ struct kvm_sev_launch_start {
 	__u32 policy;
 	__u64 dh_uaddr;
 	__u32 dh_len;
+	__u32 pad0;
 	__u64 session_uaddr;
 	__u32 session_len;
+	__u32 pad1;
 };
 
 struct kvm_sev_launch_update_data {
 	__u64 uaddr;
 	__u32 len;
+	__u32 pad0;
 };
 
 
 struct kvm_sev_launch_secret {
 	__u64 hdr_uaddr;
 	__u32 hdr_len;
+	__u32 pad0;
 	__u64 guest_uaddr;
 	__u32 guest_len;
+	__u32 pad1;
 	__u64 trans_uaddr;
 	__u32 trans_len;
+	__u32 pad2;
 };
 
 struct kvm_sev_launch_measure {
 	__u64 uaddr;
 	__u32 len;
+	__u32 pad0;
 };
 
 struct kvm_sev_guest_status {
@@ -731,33 +739,43 @@ struct kvm_sev_dbg {
 	__u64 src_uaddr;
 	__u64 dst_uaddr;
 	__u32 len;
+	__u32 pad0;
 };
 
 struct kvm_sev_attestation_report {
 	__u8 mnonce[16];
 	__u64 uaddr;
 	__u32 len;
+	__u32 pad0;
 };
 
 struct kvm_sev_send_start {
 	__u32 policy;
+	__u32 pad0;
 	__u64 pdh_cert_uaddr;
 	__u32 pdh_cert_len;
+	__u32 pad1;
 	__u64 plat_certs_uaddr;
 	__u32 plat_certs_len;
+	__u32 pad2;
 	__u64 amd_certs_uaddr;
 	__u32 amd_certs_len;
+	__u32 pad3;
 	__u64 session_uaddr;
 	__u32 session_len;
+	__u32 pad4;
 };
 
 struct kvm_sev_send_update_data {
 	__u64 hdr_uaddr;
 	__u32 hdr_len;
+	__u32 pad0;
 	__u64 guest_uaddr;
 	__u32 guest_len;
+	__u32 pad1;
 	__u64 trans_uaddr;
 	__u32 trans_len;
+	__u32 pad2;
 };
 
 struct kvm_sev_receive_start {
@@ -765,17 +783,22 @@ struct kvm_sev_receive_start {
 	__u32 policy;
 	__u64 pdh_uaddr;
 	__u32 pdh_len;
+	__u32 pad0;
 	__u64 session_uaddr;
 	__u32 session_len;
+	__u32 pad1;
 };
 
 struct kvm_sev_receive_update_data {
 	__u64 hdr_uaddr;
 	__u32 hdr_len;
+	__u32 pad0;
 	__u64 guest_uaddr;
 	__u32 guest_len;
+	__u32 pad1;
 	__u64 trans_uaddr;
 	__u32 trans_len;
+	__u32 pad2;
 };
 
 #define KVM_X2APIC_API_USE_32BIT_IDS            (1ULL << 0)
-- 
2.39.0



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 02/10] KVM: introduce new vendor op for KVM_GET_DEVICE_ATTR
  2024-02-09 18:37 [PATCH 00/10] KVM: SEV: allow customizing VMSA features Paolo Bonzini
  2024-02-09 18:37 ` [PATCH 01/10] KVM: SEV: fix compat ABI for KVM_MEMORY_ENCRYPT_OP Paolo Bonzini
@ 2024-02-09 18:37 ` Paolo Bonzini
  2024-02-14 22:57   ` Michael Roth
  2024-02-09 18:37 ` [PATCH 03/10] Documentation: kvm/sev: separate description of firmware Paolo Bonzini
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 35+ messages in thread
From: Paolo Bonzini @ 2024-02-09 18:37 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, michael.roth, aik, isaku.yamahata

Allow vendor modules to provide their own attributes on /dev/kvm.
To avoid proliferation of vendor ops, implement KVM_HAS_DEVICE_ATTR
and KVM_GET_DEVICE_ATTR in terms of the same function.  You're not
supposed to use KVM_GET_DEVICE_ATTR to do complicated computations,
especially on /dev/kvm.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  1 +
 arch/x86/kvm/x86.c                 | 52 +++++++++++++++++++-----------
 3 files changed, 36 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 378ed944b849..ac8b7614e79d 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -122,6 +122,7 @@ KVM_X86_OP(enter_smm)
 KVM_X86_OP(leave_smm)
 KVM_X86_OP(enable_smi_window)
 #endif
+KVM_X86_OP_OPTIONAL(dev_get_attr)
 KVM_X86_OP_OPTIONAL(mem_enc_ioctl)
 KVM_X86_OP_OPTIONAL(mem_enc_register_region)
 KVM_X86_OP_OPTIONAL(mem_enc_unregister_region)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d271ba20a0b2..0bcd9ae16097 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1769,6 +1769,7 @@ struct kvm_x86_ops {
 	void (*enable_smi_window)(struct kvm_vcpu *vcpu);
 #endif
 
+	int (*dev_get_attr)(u64 attr, u64 *val);
 	int (*mem_enc_ioctl)(struct kvm *kvm, void __user *argp);
 	int (*mem_enc_register_region)(struct kvm *kvm, struct kvm_enc_region *argp);
 	int (*mem_enc_unregister_region)(struct kvm *kvm, struct kvm_enc_region *argp);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index bf10a9073a09..8746530930d5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4804,37 +4804,53 @@ static inline void __user *kvm_get_attr_addr(struct kvm_device_attr *attr)
 	return uaddr;
 }
 
-static int kvm_x86_dev_get_attr(struct kvm_device_attr *attr)
+static int __kvm_x86_dev_get_attr(struct kvm_device_attr *attr, u64 *val)
 {
-	u64 __user *uaddr = kvm_get_attr_addr(attr);
+	int r;
 
 	if (attr->group)
 		return -ENXIO;
 
+	switch (attr->attr) {
+	case KVM_X86_XCOMP_GUEST_SUPP:
+		r = 0;
+		*val = kvm_caps.supported_xcr0;
+		break;
+	default:
+		r = -ENXIO;
+		if (kvm_x86_ops.dev_get_attr)
+			r = kvm_x86_ops.dev_get_attr(attr->attr, val);
+		break;
+	}
+
+	return r;
+}
+
+static int kvm_x86_dev_get_attr(struct kvm_device_attr *attr)
+{
+	u64 __user *uaddr;
+	int r;
+	u64 val;
+
+	r = __kvm_x86_dev_get_attr(attr, &val);
+	if (r < 0)
+		return r;
+
+	uaddr = kvm_get_attr_addr(attr);
 	if (IS_ERR(uaddr))
 		return PTR_ERR(uaddr);
 
-	switch (attr->attr) {
-	case KVM_X86_XCOMP_GUEST_SUPP:
-		if (put_user(kvm_caps.supported_xcr0, uaddr))
-			return -EFAULT;
-		return 0;
-	default:
-		return -ENXIO;
-	}
+	if (put_user(val, uaddr))
+		return -EFAULT;
+
+	return 0;
 }
 
 static int kvm_x86_dev_has_attr(struct kvm_device_attr *attr)
 {
-	if (attr->group)
-		return -ENXIO;
+	u64 val;
 
-	switch (attr->attr) {
-	case KVM_X86_XCOMP_GUEST_SUPP:
-		return 0;
-	default:
-		return -ENXIO;
-	}
+	return __kvm_x86_dev_get_attr(attr, &val);
 }
 
 long kvm_arch_dev_ioctl(struct file *filp,
-- 
2.39.0



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 03/10] Documentation: kvm/sev: separate description of firmware
  2024-02-09 18:37 [PATCH 00/10] KVM: SEV: allow customizing VMSA features Paolo Bonzini
  2024-02-09 18:37 ` [PATCH 01/10] KVM: SEV: fix compat ABI for KVM_MEMORY_ENCRYPT_OP Paolo Bonzini
  2024-02-09 18:37 ` [PATCH 02/10] KVM: introduce new vendor op for KVM_GET_DEVICE_ATTR Paolo Bonzini
@ 2024-02-09 18:37 ` Paolo Bonzini
  2024-02-14 23:23   ` Michael Roth
  2024-02-09 18:37 ` [PATCH 04/10] KVM: SEV: publish supported VMSA features Paolo Bonzini
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 35+ messages in thread
From: Paolo Bonzini @ 2024-02-09 18:37 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, michael.roth, aik, isaku.yamahata

The description of firmware is included part under the "SEV Key Management"
header, part under the KVM_SEV_INIT ioctl.  Put these two bits together and
and rename "SEV Key Management" to what it actually is, namely a description
of the KVM_MEMORY_ENCRYPT_OP API.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 .../virt/kvm/x86/amd-memory-encryption.rst    | 29 +++++++++++--------
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
index 995780088eb2..37c5c37f4f6e 100644
--- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
@@ -46,14 +46,8 @@ SEV hardware uses ASIDs to associate a memory encryption key with a VM.
 Hence, the ASID for the SEV-enabled guests must be from 1 to a maximum value
 defined in the CPUID 0x8000001f[ecx] field.
 
-SEV Key Management
-==================
-
-The SEV guest key management is handled by a separate processor called the AMD
-Secure Processor (AMD-SP). Firmware running inside the AMD-SP provides a secure
-key management interface to perform common hypervisor activities such as
-encrypting bootstrap code, snapshot, migrating and debugging the guest. For more
-information, see the SEV Key Management spec [api-spec]_
+``KVM_MEMORY_ENCRYPT_OP`` API
+=============================
 
 The main ioctl to access SEV is KVM_MEMORY_ENCRYPT_OP.  If the argument
 to KVM_MEMORY_ENCRYPT_OP is NULL, the ioctl returns 0 if SEV is enabled
@@ -87,10 +81,6 @@ guests, such as launching, running, snapshotting, migrating and decommissioning.
 The KVM_SEV_INIT command is used by the hypervisor to initialize the SEV platform
 context. In a typical workflow, this command should be the first command issued.
 
-The firmware can be initialized either by using its own non-volatile storage or
-the OS can manage the NV storage for the firmware using the module parameter
-``init_ex_path``. If the file specified by ``init_ex_path`` does not exist or
-is invalid, the OS will create or override the file with output from PSP.
 
 Returns: 0 on success, -negative on error
 
@@ -434,6 +424,21 @@ issued by the hypervisor to make the guest ready for execution.
 
 Returns: 0 on success, -negative on error
 
+Firmware Management
+===================
+
+The SEV guest key management is handled by a separate processor called the AMD
+Secure Processor (AMD-SP). Firmware running inside the AMD-SP provides a secure
+key management interface to perform common hypervisor activities such as
+encrypting bootstrap code, snapshot, migrating and debugging the guest. For more
+information, see the SEV Key Management spec [api-spec]_
+
+The AMD-SP firmware can be initialized either by using its own non-volatile
+storage or the OS can manage the NV storage for the firmware using
+parameter ``init_ex_path`` of the ``ccp`` module. If the file specified
+by ``init_ex_path`` does not exist or is invalid, the OS will create or
+override the file with PSP non-volatile storage.
+
 References
 ==========
 
-- 
2.39.0



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 04/10] KVM: SEV: publish supported VMSA features
  2024-02-09 18:37 [PATCH 00/10] KVM: SEV: allow customizing VMSA features Paolo Bonzini
                   ` (2 preceding siblings ...)
  2024-02-09 18:37 ` [PATCH 03/10] Documentation: kvm/sev: separate description of firmware Paolo Bonzini
@ 2024-02-09 18:37 ` Paolo Bonzini
  2024-02-14 23:49   ` Michael Roth
  2024-02-09 18:37 ` [PATCH 05/10] KVM: SEV: store VMSA features in kvm_sev_info Paolo Bonzini
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 35+ messages in thread
From: Paolo Bonzini @ 2024-02-09 18:37 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, michael.roth, aik, isaku.yamahata

Compute the set of features to be stored in the VMSA when KVM is
initialized; move it from there into kvm_sev_info when SEV is initialized,
and then into the initial VMSA.

The new variable can then be used to return the set of supported features
to userspace, via the KVM_GET_DEVICE_ATTR ioctl.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 .../virt/kvm/x86/amd-memory-encryption.rst    | 12 ++++++++++
 arch/x86/include/uapi/asm/kvm.h               |  1 +
 arch/x86/kvm/svm/sev.c                        | 22 +++++++++++++++++--
 arch/x86/kvm/svm/svm.c                        |  1 +
 arch/x86/kvm/svm/svm.h                        |  1 +
 5 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
index 37c5c37f4f6e..5ed11bc16b96 100644
--- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
@@ -424,6 +424,18 @@ issued by the hypervisor to make the guest ready for execution.
 
 Returns: 0 on success, -negative on error
 
+Device attribute API
+====================
+
+Attributes of the SEV implementation can be retrieved through the
+``KVM_HAS_DEVICE_ATTR`` and ``KVM_GET_DEVICE_ATTR`` ioctls on the ``/dev/kvm``
+device node.
+
+Currently only one attribute is implemented:
+
+* group 0, attribute ``KVM_X86_SEV_VMSA_FEATURES``: return the set of all
+  bits that are accepted in the ``vmsa_features`` of ``KVM_SEV_INIT2``.
+
 Firmware Management
 ===================
 
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index b305daff056e..cccaa5ff6d01 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -459,6 +459,7 @@ struct kvm_sync_regs {
 
 /* attributes for system fd (group 0) */
 #define KVM_X86_XCOMP_GUEST_SUPP	0
+#define KVM_X86_SEV_VMSA_FEATURES	1
 
 struct kvm_vmx_nested_state_data {
 	__u8 vmcs12[KVM_STATE_NESTED_VMX_VMCS_SIZE];
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index f760106c31f8..2e558f7538c2 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -65,6 +65,7 @@ module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0444);
 #define sev_es_debug_swap_enabled false
 #endif /* CONFIG_KVM_AMD_SEV */
 
+static u64 sev_supported_vmsa_features;
 static u8 sev_enc_bit;
 static DECLARE_RWSEM(sev_deactivate_lock);
 static DEFINE_MUTEX(sev_bitmap_lock);
@@ -612,8 +613,7 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
 	save->xss  = svm->vcpu.arch.ia32_xss;
 	save->dr6  = svm->vcpu.arch.dr6;
 
-	if (sev_es_debug_swap_enabled)
-		save->sev_features |= SVM_SEV_FEAT_DEBUG_SWAP;
+	save->sev_features = sev_supported_vmsa_features;
 
 	pr_debug("Virtual Machine Save Area (VMSA):\n");
 	print_hex_dump_debug("", DUMP_PREFIX_NONE, 16, 1, save, sizeof(*save), false);
@@ -1849,6 +1849,20 @@ int sev_vm_move_enc_context_from(struct kvm *kvm, unsigned int source_fd)
 	return ret;
 }
 
+int sev_dev_get_attr(u64 attr, u64 *val)
+{
+	switch (attr) {
+#ifdef CONFIG_KVM_AMD_SEV
+	case KVM_X86_SEV_VMSA_FEATURES:
+		*val = sev_supported_vmsa_features;
+		return 0;
+#endif
+
+	default:
+		return -ENXIO;
+	}
+}
+
 int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -2276,6 +2290,10 @@ void __init sev_hardware_setup(void)
 	if (!sev_es_enabled || !cpu_feature_enabled(X86_FEATURE_DEBUG_SWAP) ||
 	    !cpu_feature_enabled(X86_FEATURE_NO_NESTED_DATA_BP))
 		sev_es_debug_swap_enabled = false;
+
+	sev_supported_vmsa_features = 0;
+	if (sev_es_debug_swap_enabled)
+		sev_supported_vmsa_features |= SVM_SEV_FEAT_DEBUG_SWAP;
 #endif
 }
 
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e90b429c84f1..aa1792f402ab 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5014,6 +5014,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.enable_smi_window = svm_enable_smi_window,
 #endif
 
+	.dev_get_attr = sev_dev_get_attr,
 	.mem_enc_ioctl = sev_mem_enc_ioctl,
 	.mem_enc_register_region = sev_mem_enc_register_region,
 	.mem_enc_unregister_region = sev_mem_enc_unregister_region,
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 8ef95139cd24..d630026b23b0 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -671,6 +671,7 @@ void avic_refresh_virtual_apic_mode(struct kvm_vcpu *vcpu);
 extern unsigned int max_sev_asid;
 
 void sev_vm_destroy(struct kvm *kvm);
+int sev_dev_get_attr(u64 attr, u64 *val);
 int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp);
 int sev_mem_enc_register_region(struct kvm *kvm,
 				struct kvm_enc_region *range);
-- 
2.39.0



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 05/10] KVM: SEV: store VMSA features in kvm_sev_info
  2024-02-09 18:37 [PATCH 00/10] KVM: SEV: allow customizing VMSA features Paolo Bonzini
                   ` (3 preceding siblings ...)
  2024-02-09 18:37 ` [PATCH 04/10] KVM: SEV: publish supported VMSA features Paolo Bonzini
@ 2024-02-09 18:37 ` Paolo Bonzini
  2024-02-15  0:03   ` Michael Roth
  2024-02-09 18:37 ` [PATCH 06/10] KVM: x86: define standard behavior for bits 0/1 of VM type Paolo Bonzini
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 35+ messages in thread
From: Paolo Bonzini @ 2024-02-09 18:37 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, michael.roth, aik, isaku.yamahata

Right now, the set of features that are stored in the VMSA upon
initialization is fixed and depends on the module parameters for
kvm-amd.ko.  However, the hypervisor cannot really change it at will
because the feature word has to match between the hypervisor and whatever
computes a measurement of the VMSA for attestation purposes.

Add a field to kvm_set_info that holds the set of features to be stored
in the VMSA; and query it instead of referring to the module parameters.

Because KVM_SEV_INIT and KVM_SEV_ES_INIT accept no parameters, this
does not yet introduce any functional change, but it paves the way for
an API that allows customization of the features per-VM.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 22 ++++++++++++++++++----
 arch/x86/kvm/svm/svm.c |  2 +-
 arch/x86/kvm/svm/svm.h |  3 ++-
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 2e558f7538c2..712bfbc0028a 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -116,6 +116,14 @@ static inline bool is_mirroring_enc_context(struct kvm *kvm)
 	return !!to_kvm_svm(kvm)->sev_info.enc_context_owner;
 }
 
+static bool sev_vcpu_has_debug_swap(struct vcpu_svm *svm)
+{
+	struct kvm_vcpu *vcpu = &svm->vcpu;
+	struct kvm_sev_info *sev = &to_kvm_svm(vcpu->kvm)->sev_info;
+
+	return sev->vmsa_features & SVM_SEV_FEAT_DEBUG_SWAP;
+}
+
 /* Must be called with the sev_bitmap_lock held */
 static bool __sev_recycle_asids(int min_asid, int max_asid)
 {
@@ -258,6 +266,8 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
 
 	sev->active = true;
 	sev->es_active = argp->id == KVM_SEV_ES_INIT;
+	sev->vmsa_features = sev_supported_vmsa_features;
+
 	asid = sev_asid_new(sev);
 	if (asid < 0)
 		goto e_no_asid;
@@ -278,6 +288,7 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	sev_asid_free(sev);
 	sev->asid = 0;
 e_no_asid:
+	sev->vmsa_features = 0;
 	sev->es_active = false;
 	sev->active = false;
 	return ret;
@@ -572,6 +583,8 @@ static int sev_launch_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
 
 static int sev_es_sync_vmsa(struct vcpu_svm *svm)
 {
+	struct kvm_vcpu *vcpu = &svm->vcpu;
+	struct kvm_sev_info *sev = &to_kvm_svm(vcpu->kvm)->sev_info;
 	struct sev_es_save_area *save = svm->sev_es.vmsa;
 
 	/* Check some debug related fields before encrypting the VMSA */
@@ -613,7 +626,7 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
 	save->xss  = svm->vcpu.arch.ia32_xss;
 	save->dr6  = svm->vcpu.arch.dr6;
 
-	save->sev_features = sev_supported_vmsa_features;
+	save->sev_features = sev->vmsa_features;
 
 	pr_debug("Virtual Machine Save Area (VMSA):\n");
 	print_hex_dump_debug("", DUMP_PREFIX_NONE, 16, 1, save, sizeof(*save), false);
@@ -1693,6 +1706,7 @@ static void sev_migrate_from(struct kvm *dst_kvm, struct kvm *src_kvm)
 	dst->pages_locked = src->pages_locked;
 	dst->enc_context_owner = src->enc_context_owner;
 	dst->es_active = src->es_active;
+	dst->vmsa_features = src->vmsa_features;
 
 	src->asid = 0;
 	src->active = false;
@@ -3063,7 +3077,7 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm)
 	svm_set_intercept(svm, TRAP_CR8_WRITE);
 
 	vmcb->control.intercepts[INTERCEPT_DR] = 0;
-	if (!sev_es_debug_swap_enabled) {
+	if (!sev_vcpu_has_debug_swap(svm)) {
 		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
 		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
 		recalc_intercepts(svm);
@@ -3118,7 +3132,7 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm)
 					    sev_enc_bit));
 }
 
-void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa)
+void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_area *hostsa)
 {
 	/*
 	 * All host state for SEV-ES guests is categorized into three swap types
@@ -3146,7 +3160,7 @@ void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa)
 	 * the CPU (Type-B). If DebugSwap is disabled/unsupported, the CPU both
 	 * saves and loads debug registers (Type-A).
 	 */
-	if (sev_es_debug_swap_enabled) {
+	if (sev_vcpu_has_debug_swap(svm)) {
 		hostsa->dr0 = native_get_debugreg(0);
 		hostsa->dr1 = native_get_debugreg(1);
 		hostsa->dr2 = native_get_debugreg(2);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index aa1792f402ab..392b9c2e2ce1 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1523,7 +1523,7 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 		struct sev_es_save_area *hostsa;
 		hostsa = (struct sev_es_save_area *)(page_address(sd->save_area) + 0x400);
 
-		sev_es_prepare_switch_to_guest(hostsa);
+		sev_es_prepare_switch_to_guest(svm, hostsa);
 	}
 
 	if (tsc_scaling)
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index d630026b23b0..864c782eaa58 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -85,6 +85,7 @@ struct kvm_sev_info {
 	unsigned long pages_locked; /* Number of pages locked */
 	struct list_head regions_list;  /* List of registered regions */
 	u64 ap_jump_table;	/* SEV-ES AP Jump Table address */
+	u64 vmsa_features;
 	struct kvm *enc_context_owner; /* Owner of copied encryption context */
 	struct list_head mirror_vms; /* List of VMs mirroring */
 	struct list_head mirror_entry; /* Use as a list entry of mirrors */
@@ -693,7 +694,7 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu);
 int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in);
 void sev_es_vcpu_reset(struct vcpu_svm *svm);
 void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
-void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa);
+void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_area *hostsa);
 void sev_es_unmap_ghcb(struct vcpu_svm *svm);
 
 /* vmenter.S */
-- 
2.39.0



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 06/10] KVM: x86: define standard behavior for bits 0/1 of VM type
  2024-02-09 18:37 [PATCH 00/10] KVM: SEV: allow customizing VMSA features Paolo Bonzini
                   ` (4 preceding siblings ...)
  2024-02-09 18:37 ` [PATCH 05/10] KVM: SEV: store VMSA features in kvm_sev_info Paolo Bonzini
@ 2024-02-09 18:37 ` Paolo Bonzini
  2024-02-09 18:37 ` [PATCH 07/10] KVM: x86: Add is_vm_type_supported callback Paolo Bonzini
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2024-02-09 18:37 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, michael.roth, aik, isaku.yamahata

Some VM types have characteristics in common; in fact, the only use
of VM types right now is kvm_arch_has_private_mem and it assumes that
_all_ VM types have private memory.

So, let the low bits specify the characteristics of the VM type.  As
of we have two special things: whether memory is private, and whether
guest state is protected.  The latter is similar to
kvm->arch.guest_state_protected, but the latter is only set on a fully
initialized VM.  If both are set, ioctls to set registers will cause
an error---SEV-ES did not do so, which is a problematic API.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
	The plan is to reserve VM type 19 for TDX (16 for Intel,
	+1 for private memory, +2 for encrypted state).

 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/include/uapi/asm/kvm.h |  6 ++-
 arch/x86/kvm/x86.c              | 93 +++++++++++++++++++++++++++------
 3 files changed, 83 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 0bcd9ae16097..b7d33205d49d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -2136,7 +2136,7 @@ void kvm_configure_mmu(bool enable_tdp, int tdp_forced_root_level,
 		       int tdp_max_root_level, int tdp_huge_page_level);
 
 #ifdef CONFIG_KVM_PRIVATE_MEM
-#define kvm_arch_has_private_mem(kvm) ((kvm)->arch.vm_type != KVM_X86_DEFAULT_VM)
+#define kvm_arch_has_private_mem(kvm) ((kvm)->arch.vm_type & __KVM_X86_PRIVATE_MEM_TYPE)
 #else
 #define kvm_arch_has_private_mem(kvm) false
 #endif
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index cccaa5ff6d01..6c74db23257e 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -848,7 +848,11 @@ struct kvm_hyperv_eventfd {
 /* x86-specific KVM_EXIT_HYPERCALL flags. */
 #define KVM_EXIT_HYPERCALL_LONG_MODE	_BITULL(0)
 
+/* Low bits of VM types provide confidential computing capabilities.  */
+#define __KVM_X86_PRIVATE_MEM_TYPE	1
+#define __KVM_X86_PROTECTED_STATE_TYPE	2
+
 #define KVM_X86_DEFAULT_VM	0
-#define KVM_X86_SW_PROTECTED_VM	1
+#define KVM_X86_SW_PROTECTED_VM	(KVM_X86_DEFAULT_VM | __KVM_X86_PRIVATE_MEM_TYPE)
 
 #endif /* _ASM_X86_KVM_H */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8746530930d5..e634e5b67516 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5526,21 +5526,30 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
 	return 0;
 }
 
-static void kvm_vcpu_ioctl_x86_get_debugregs(struct kvm_vcpu *vcpu,
-					     struct kvm_debugregs *dbgregs)
+static int kvm_vcpu_ioctl_x86_get_debugregs(struct kvm_vcpu *vcpu,
+					    struct kvm_debugregs *dbgregs)
 {
 	unsigned long val;
 
+	if ((vcpu->kvm->arch.vm_type & __KVM_X86_PROTECTED_STATE_TYPE) &&
+	    vcpu->arch.guest_state_protected)
+		return -EINVAL;
+
 	memset(dbgregs, 0, sizeof(*dbgregs));
 	memcpy(dbgregs->db, vcpu->arch.db, sizeof(vcpu->arch.db));
 	kvm_get_dr(vcpu, 6, &val);
 	dbgregs->dr6 = val;
 	dbgregs->dr7 = vcpu->arch.dr7;
+	return 0;
 }
 
 static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu,
 					    struct kvm_debugregs *dbgregs)
 {
+	if ((vcpu->kvm->arch.vm_type & __KVM_X86_PROTECTED_STATE_TYPE) &&
+	    vcpu->arch.guest_state_protected)
+		return -EINVAL;
+
 	if (dbgregs->flags)
 		return -EINVAL;
 
@@ -5559,9 +5568,13 @@ static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu,
 }
 
 
-static void kvm_vcpu_ioctl_x86_get_xsave2(struct kvm_vcpu *vcpu,
-					  u8 *state, unsigned int size)
+static int kvm_vcpu_ioctl_x86_get_xsave2(struct kvm_vcpu *vcpu,
+					 u8 *state, unsigned int size)
 {
+	if ((vcpu->kvm->arch.vm_type & __KVM_X86_PROTECTED_STATE_TYPE) &&
+	    fpstate_is_confidential(&vcpu->arch.guest_fpu))
+		return -EINVAL;
+
 	/*
 	 * Only copy state for features that are enabled for the guest.  The
 	 * state itself isn't problematic, but setting bits in the header for
@@ -5578,22 +5591,27 @@ static void kvm_vcpu_ioctl_x86_get_xsave2(struct kvm_vcpu *vcpu,
 			     XFEATURE_MASK_FPSSE;
 
 	if (fpstate_is_confidential(&vcpu->arch.guest_fpu))
-		return;
+		return 0;
 
 	fpu_copy_guest_fpstate_to_uabi(&vcpu->arch.guest_fpu, state, size,
 				       supported_xcr0, vcpu->arch.pkru);
+	return 0;
 }
 
-static void kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu,
-					 struct kvm_xsave *guest_xsave)
+static int kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu,
+					struct kvm_xsave *guest_xsave)
 {
-	kvm_vcpu_ioctl_x86_get_xsave2(vcpu, (void *)guest_xsave->region,
-				      sizeof(guest_xsave->region));
+	return kvm_vcpu_ioctl_x86_get_xsave2(vcpu, (void *)guest_xsave->region,
+					     sizeof(guest_xsave->region));
 }
 
 static int kvm_vcpu_ioctl_x86_set_xsave(struct kvm_vcpu *vcpu,
 					struct kvm_xsave *guest_xsave)
 {
+	if ((vcpu->kvm->arch.vm_type & __KVM_X86_PROTECTED_STATE_TYPE) &&
+	    fpstate_is_confidential(&vcpu->arch.guest_fpu))
+		return -EINVAL;
+
 	if (fpstate_is_confidential(&vcpu->arch.guest_fpu))
 		return 0;
 
@@ -5603,18 +5621,23 @@ static int kvm_vcpu_ioctl_x86_set_xsave(struct kvm_vcpu *vcpu,
 					      &vcpu->arch.pkru);
 }
 
-static void kvm_vcpu_ioctl_x86_get_xcrs(struct kvm_vcpu *vcpu,
-					struct kvm_xcrs *guest_xcrs)
+static int kvm_vcpu_ioctl_x86_get_xcrs(struct kvm_vcpu *vcpu,
+				       struct kvm_xcrs *guest_xcrs)
 {
+	if ((vcpu->kvm->arch.vm_type & __KVM_X86_PROTECTED_STATE_TYPE) &&
+	    vcpu->arch.guest_state_protected)
+		return -EINVAL;
+
 	if (!boot_cpu_has(X86_FEATURE_XSAVE)) {
 		guest_xcrs->nr_xcrs = 0;
-		return;
+		return 0;
 	}
 
 	guest_xcrs->nr_xcrs = 1;
 	guest_xcrs->flags = 0;
 	guest_xcrs->xcrs[0].xcr = XCR_XFEATURE_ENABLED_MASK;
 	guest_xcrs->xcrs[0].value = vcpu->arch.xcr0;
+	return 0;
 }
 
 static int kvm_vcpu_ioctl_x86_set_xcrs(struct kvm_vcpu *vcpu,
@@ -5622,6 +5645,10 @@ static int kvm_vcpu_ioctl_x86_set_xcrs(struct kvm_vcpu *vcpu,
 {
 	int i, r = 0;
 
+	if ((vcpu->kvm->arch.vm_type & __KVM_X86_PROTECTED_STATE_TYPE) &&
+	    vcpu->arch.guest_state_protected)
+		return -EINVAL;
+
 	if (!boot_cpu_has(X86_FEATURE_XSAVE))
 		return -EINVAL;
 
@@ -6010,7 +6037,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 	case KVM_GET_DEBUGREGS: {
 		struct kvm_debugregs dbgregs;
 
-		kvm_vcpu_ioctl_x86_get_debugregs(vcpu, &dbgregs);
+		r = kvm_vcpu_ioctl_x86_get_debugregs(vcpu, &dbgregs);
+		if (r < 0)
+			break;
 
 		r = -EFAULT;
 		if (copy_to_user(argp, &dbgregs,
@@ -6040,7 +6069,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		if (!u.xsave)
 			break;
 
-		kvm_vcpu_ioctl_x86_get_xsave(vcpu, u.xsave);
+		r = kvm_vcpu_ioctl_x86_get_xsave(vcpu, u.xsave);
+		if (r < 0)
+			break;
 
 		r = -EFAULT;
 		if (copy_to_user(argp, u.xsave, sizeof(struct kvm_xsave)))
@@ -6069,7 +6100,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		if (!u.xsave)
 			break;
 
-		kvm_vcpu_ioctl_x86_get_xsave2(vcpu, u.buffer, size);
+		r = kvm_vcpu_ioctl_x86_get_xsave2(vcpu, u.buffer, size);
+		if (r < 0)
+			break;
 
 		r = -EFAULT;
 		if (copy_to_user(argp, u.xsave, size))
@@ -6085,7 +6118,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		if (!u.xcrs)
 			break;
 
-		kvm_vcpu_ioctl_x86_get_xcrs(vcpu, u.xcrs);
+		r = kvm_vcpu_ioctl_x86_get_xcrs(vcpu, u.xcrs);
+		if (r < 0)
+			break;
 
 		r = -EFAULT;
 		if (copy_to_user(argp, u.xcrs,
@@ -6229,6 +6264,11 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 	}
 #endif
 	case KVM_GET_SREGS2: {
+		r = -EINVAL;
+		if ((vcpu->kvm->arch.vm_type & __KVM_X86_PROTECTED_STATE_TYPE) &&
+		    vcpu->arch.guest_state_protected)
+			goto out;
+
 		u.sregs2 = kzalloc(sizeof(struct kvm_sregs2), GFP_KERNEL);
 		r = -ENOMEM;
 		if (!u.sregs2)
@@ -6241,6 +6281,11 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		break;
 	}
 	case KVM_SET_SREGS2: {
+		r = -EINVAL;
+		if ((vcpu->kvm->arch.vm_type & __KVM_X86_PROTECTED_STATE_TYPE) &&
+		    vcpu->arch.guest_state_protected)
+			goto out;
+
 		u.sregs2 = memdup_user(argp, sizeof(struct kvm_sregs2));
 		if (IS_ERR(u.sregs2)) {
 			r = PTR_ERR(u.sregs2);
@@ -11466,6 +11511,10 @@ static void __get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 
 int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 {
+	if ((vcpu->kvm->arch.vm_type & __KVM_X86_PROTECTED_STATE_TYPE) &&
+	    vcpu->arch.guest_state_protected)
+		return -EINVAL;
+
 	vcpu_load(vcpu);
 	__get_regs(vcpu, regs);
 	vcpu_put(vcpu);
@@ -11507,6 +11556,10 @@ static void __set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 
 int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 {
+	if ((vcpu->kvm->arch.vm_type & __KVM_X86_PROTECTED_STATE_TYPE) &&
+	    vcpu->arch.guest_state_protected)
+		return -EINVAL;
+
 	vcpu_load(vcpu);
 	__set_regs(vcpu, regs);
 	vcpu_put(vcpu);
@@ -11579,6 +11632,10 @@ static void __get_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2)
 int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
 				  struct kvm_sregs *sregs)
 {
+	if ((vcpu->kvm->arch.vm_type & __KVM_X86_PROTECTED_STATE_TYPE) &&
+	    vcpu->arch.guest_state_protected)
+		return -EINVAL;
+
 	vcpu_load(vcpu);
 	__get_sregs(vcpu, sregs);
 	vcpu_put(vcpu);
@@ -11846,6 +11903,10 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
 {
 	int ret;
 
+	if ((vcpu->kvm->arch.vm_type & __KVM_X86_PROTECTED_STATE_TYPE) &&
+	    vcpu->arch.guest_state_protected)
+		return -EINVAL;
+
 	vcpu_load(vcpu);
 	ret = __set_sregs(vcpu, sregs);
 	vcpu_put(vcpu);
-- 
2.39.0



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 07/10] KVM: x86: Add is_vm_type_supported callback
  2024-02-09 18:37 [PATCH 00/10] KVM: SEV: allow customizing VMSA features Paolo Bonzini
                   ` (5 preceding siblings ...)
  2024-02-09 18:37 ` [PATCH 06/10] KVM: x86: define standard behavior for bits 0/1 of VM type Paolo Bonzini
@ 2024-02-09 18:37 ` Paolo Bonzini
  2024-02-15  0:33   ` Michael Roth
  2024-02-09 18:37 ` [PATCH 08/10] KVM: SEV: define VM types for SEV and SEV-ES Paolo Bonzini
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 35+ messages in thread
From: Paolo Bonzini @ 2024-02-09 18:37 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, michael.roth, aik, isaku.yamahata

From: Isaku Yamahata <isaku.yamahata@intel.com>

Allow the backend to specify which VM types are supported.

Based on a patch by Isaku Yamahata <isaku.yamahata@intel.com>.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/kvm-x86-ops.h | 1 +
 arch/x86/include/asm/kvm_host.h    | 1 +
 arch/x86/kvm/x86.c                 | 9 ++++++++-
 arch/x86/kvm/x86.h                 | 2 ++
 4 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index ac8b7614e79d..8694a6e3d012 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -20,6 +20,7 @@ KVM_X86_OP(hardware_disable)
 KVM_X86_OP(hardware_unsetup)
 KVM_X86_OP(has_emulated_msr)
 KVM_X86_OP(vcpu_after_set_cpuid)
+KVM_X86_OP_OPTIONAL_RET0(is_vm_type_supported)
 KVM_X86_OP(vm_init)
 KVM_X86_OP_OPTIONAL(vm_destroy)
 KVM_X86_OP_OPTIONAL_RET0(vcpu_precreate)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index b7d33205d49d..9636a43ccdc4 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1601,6 +1601,7 @@ struct kvm_x86_ops {
 	bool (*has_emulated_msr)(struct kvm *kvm, u32 index);
 	void (*vcpu_after_set_cpuid)(struct kvm_vcpu *vcpu);
 
+	bool (*is_vm_type_supported)(unsigned long vm_type);
 	unsigned int vm_size;
 	int (*vm_init)(struct kvm *kvm);
 	void (*vm_destroy)(struct kvm *kvm);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e634e5b67516..c89ddaa1e09f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4581,12 +4581,19 @@ static int kvm_ioctl_get_supported_hv_cpuid(struct kvm_vcpu *vcpu,
 }
 #endif
 
-static bool kvm_is_vm_type_supported(unsigned long type)
+bool __kvm_is_vm_type_supported(unsigned long type)
 {
 	return type == KVM_X86_DEFAULT_VM ||
 	       (type == KVM_X86_SW_PROTECTED_VM &&
 		IS_ENABLED(CONFIG_KVM_SW_PROTECTED_VM) && tdp_enabled);
 }
+EXPORT_SYMBOL_GPL(__kvm_is_vm_type_supported);
+
+static bool kvm_is_vm_type_supported(unsigned long type)
+{
+	return __kvm_is_vm_type_supported(type) ||
+	       static_call(kvm_x86_is_vm_type_supported)(type);
+}
 
 int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 {
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 2f7e19166658..4e40c23d66ed 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -9,6 +9,8 @@
 #include "kvm_cache_regs.h"
 #include "kvm_emulate.h"
 
+bool __kvm_is_vm_type_supported(unsigned long type);
+
 struct kvm_caps {
 	/* control of guest tsc rate supported? */
 	bool has_tsc_control;
-- 
2.39.0



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 08/10] KVM: SEV: define VM types for SEV and SEV-ES
  2024-02-09 18:37 [PATCH 00/10] KVM: SEV: allow customizing VMSA features Paolo Bonzini
                   ` (6 preceding siblings ...)
  2024-02-09 18:37 ` [PATCH 07/10] KVM: x86: Add is_vm_type_supported callback Paolo Bonzini
@ 2024-02-09 18:37 ` Paolo Bonzini
  2024-02-15  1:19   ` Michael Roth
  2024-02-09 18:37 ` [PATCH 09/10] KVM: SEV: introduce KVM_SEV_INIT2 operation Paolo Bonzini
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 35+ messages in thread
From: Paolo Bonzini @ 2024-02-09 18:37 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, michael.roth, aik, isaku.yamahata

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 Documentation/virt/kvm/api.rst  |  2 ++
 arch/x86/include/uapi/asm/kvm.h |  2 ++
 arch/x86/kvm/svm/sev.c          | 18 +++++++++++++++++-
 arch/x86/kvm/svm/svm.c          | 11 +++++++++++
 arch/x86/kvm/svm/svm.h          |  2 ++
 arch/x86/kvm/x86.c              |  4 ++++
 6 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 3ec0b7a455a0..bf957bb70e4b 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -8790,6 +8790,8 @@ means the VM type with value @n is supported.  Possible values of @n are::
 
   #define KVM_X86_DEFAULT_VM	0
   #define KVM_X86_SW_PROTECTED_VM	1
+  #define KVM_X86_SEV_VM	8
+  #define KVM_X86_SEV_ES_VM	10
 
 9. Known KVM API problems
 =========================
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 6c74db23257e..7c46e96cfe62 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -854,5 +854,7 @@ struct kvm_hyperv_eventfd {
 
 #define KVM_X86_DEFAULT_VM	0
 #define KVM_X86_SW_PROTECTED_VM	(KVM_X86_DEFAULT_VM | __KVM_X86_PRIVATE_MEM_TYPE)
+#define KVM_X86_SEV_VM		8
+#define KVM_X86_SEV_ES_VM	(KVM_X86_SEV_VM | __KVM_X86_PROTECTED_STATE_TYPE)
 
 #endif /* _ASM_X86_KVM_H */
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 712bfbc0028a..acf5c45ef14e 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -260,6 +260,9 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	if (kvm->created_vcpus)
 		return -EINVAL;
 
+	if (kvm->arch.vm_type != KVM_X86_DEFAULT_VM)
+		return -EINVAL;
+
 	ret = -EBUSY;
 	if (unlikely(sev->active))
 		return ret;
@@ -279,6 +282,7 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
 
 	INIT_LIST_HEAD(&sev->regions_list);
 	INIT_LIST_HEAD(&sev->mirror_vms);
+	sev->need_init = false;
 
 	kvm_set_apicv_inhibit(kvm, APICV_INHIBIT_REASON_SEV);
 
@@ -1814,7 +1818,8 @@ int sev_vm_move_enc_context_from(struct kvm *kvm, unsigned int source_fd)
 	if (ret)
 		goto out_fput;
 
-	if (sev_guest(kvm) || !sev_guest(source_kvm)) {
+	if (kvm->arch.vm_type != source_kvm->arch.vm_type ||
+	    sev_guest(kvm) || !sev_guest(source_kvm)) {
 		ret = -EINVAL;
 		goto out_unlock;
 	}
@@ -2135,6 +2140,7 @@ int sev_vm_copy_enc_context_from(struct kvm *kvm, unsigned int source_fd)
 	mirror_sev->asid = source_sev->asid;
 	mirror_sev->fd = source_sev->fd;
 	mirror_sev->es_active = source_sev->es_active;
+	mirror_sev->need_init = false;
 	mirror_sev->handle = source_sev->handle;
 	INIT_LIST_HEAD(&mirror_sev->regions_list);
 	INIT_LIST_HEAD(&mirror_sev->mirror_vms);
@@ -3192,3 +3198,13 @@ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
 
 	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, 1);
 }
+
+bool sev_is_vm_type_supported(unsigned long type)
+{
+	if (type == KVM_X86_SEV_VM)
+		return sev_enabled;
+	if (type == KVM_X86_SEV_ES_VM)
+		return sev_es_enabled;
+
+	return false;
+}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 392b9c2e2ce1..87541c84d07e 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4087,6 +4087,11 @@ static void svm_cancel_injection(struct kvm_vcpu *vcpu)
 
 static int svm_vcpu_pre_run(struct kvm_vcpu *vcpu)
 {
+	struct kvm_sev_info *sev = &to_kvm_svm(vcpu->kvm)->sev_info;
+
+	if (sev->need_init)
+		return -EINVAL;
+
 	return 1;
 }
 
@@ -4888,6 +4893,11 @@ static void svm_vm_destroy(struct kvm *kvm)
 
 static int svm_vm_init(struct kvm *kvm)
 {
+	if (kvm->arch.vm_type) {
+		struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+		sev->need_init = true;
+	}
+
 	if (!pause_filter_count || !pause_filter_thresh)
 		kvm->arch.pause_in_guest = true;
 
@@ -4914,6 +4924,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.vcpu_free = svm_vcpu_free,
 	.vcpu_reset = svm_vcpu_reset,
 
+	.is_vm_type_supported = sev_is_vm_type_supported,
 	.vm_size = sizeof(struct kvm_svm),
 	.vm_init = svm_vm_init,
 	.vm_destroy = svm_vm_destroy,
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 864c782eaa58..63be26d4a024 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -79,6 +79,7 @@ enum {
 struct kvm_sev_info {
 	bool active;		/* SEV enabled guest */
 	bool es_active;		/* SEV-ES enabled guest */
+	bool need_init;		/* waiting for SEV_INIT2 */
 	unsigned int asid;	/* ASID used for this guest */
 	unsigned int handle;	/* SEV firmware handle */
 	int fd;			/* SEV device fd */
@@ -696,6 +697,7 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm);
 void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
 void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_area *hostsa);
 void sev_es_unmap_ghcb(struct vcpu_svm *svm);
+bool sev_is_vm_type_supported(unsigned long type);
 
 /* vmenter.S */
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c89ddaa1e09f..dfc66ee091a1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4795,6 +4795,10 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		r = BIT(KVM_X86_DEFAULT_VM);
 		if (kvm_is_vm_type_supported(KVM_X86_SW_PROTECTED_VM))
 			r |= BIT(KVM_X86_SW_PROTECTED_VM);
+		if (kvm_is_vm_type_supported(KVM_X86_SEV_VM))
+			r |= BIT(KVM_X86_SEV_VM);
+		if (kvm_is_vm_type_supported(KVM_X86_SEV_ES_VM))
+			r |= BIT(KVM_X86_SEV_ES_VM);
 		break;
 	default:
 		break;
-- 
2.39.0



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 09/10] KVM: SEV: introduce KVM_SEV_INIT2 operation
  2024-02-09 18:37 [PATCH 00/10] KVM: SEV: allow customizing VMSA features Paolo Bonzini
                   ` (7 preceding siblings ...)
  2024-02-09 18:37 ` [PATCH 08/10] KVM: SEV: define VM types for SEV and SEV-ES Paolo Bonzini
@ 2024-02-09 18:37 ` Paolo Bonzini
  2024-02-15  1:34   ` Michael Roth
                     ` (2 more replies)
  2024-02-09 18:37 ` [PATCH 10/10] selftests: kvm: add tests for KVM_SEV_INIT2 Paolo Bonzini
                   ` (2 subsequent siblings)
  11 siblings, 3 replies; 35+ messages in thread
From: Paolo Bonzini @ 2024-02-09 18:37 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, michael.roth, aik, isaku.yamahata

The idea that no parameter would ever be necessary when enabling SEV or
SEV-ES for a VM was decidedly optimistic.  In fact, in some sense it's
already a parameter whether SEV or SEV-ES is desired.  Another possible
source of variability is the desired set of VMSA features, as that affects
the measurement of the VM's initial state and cannot be changed
arbitrarily by the hypervisor.

Create a new sub-operation for KVM_MEM_ENCRYPT_OP that can take a struct,
and put the new op to work by including the VMSA features as a field of the
struct.  The existing KVM_SEV_INIT and KVM_SEV_ES_INIT use the full set of
supported VMSA features for backwards compatibility.

The struct also includes the usual bells and whistles for future
extensibility: a flags field that must be zero for now, and some padding
at the end.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 .../virt/kvm/x86/amd-memory-encryption.rst    | 41 ++++++++++++++--
 arch/x86/include/uapi/asm/kvm.h               | 10 ++++
 arch/x86/kvm/svm/sev.c                        | 48 +++++++++++++++++--
 3 files changed, 92 insertions(+), 7 deletions(-)

diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
index 5ed11bc16b96..a4291e7bd8ed 100644
--- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
@@ -75,15 +75,50 @@ are defined in ``<linux/psp-dev.h>``.
 KVM implements the following commands to support common lifecycle events of SEV
 guests, such as launching, running, snapshotting, migrating and decommissioning.
 
-1. KVM_SEV_INIT
----------------
+1. KVM_SEV_INIT2
+----------------
 
-The KVM_SEV_INIT command is used by the hypervisor to initialize the SEV platform
+The KVM_SEV_INIT2 command is used by the hypervisor to initialize the SEV platform
 context. In a typical workflow, this command should be the first command issued.
 
+For this command to be accepted, either KVM_X86_SEV_VM or KVM_X86_SEV_ES_VM
+must have been passed to the KVM_CREATE_VM ioctl.  A virtual machine created
+with those machine types in turn cannot be run until KVM_SEV_INIT2 is invoked.
+
+Parameters: struct kvm_sev_init (in)
 
 Returns: 0 on success, -negative on error
 
+::
+
+        struct struct kvm_sev_init {
+                __u32 flags;          /* must be 0 */
+                __u64 vmsa_features;  /* initial value of features field in VMSA */
+                __u32 pad[8];
+        };
+
+It is an error if the hypervisor does not support any of the bits that
+are set in ``flags`` or ``vmsa_features``.
+
+This command replaces the deprecated KVM_SEV_INIT and KVM_SEV_ES_INIT commands.
+The commands did not have any parameters (the ```data``` field was unused) and
+only work for the KVM_X86_DEFAULT_VM machine type (0).
+
+They behave as if:
+
+* the VM type is KVM_X86_SEV_VM for KVM_SEV_INIT, or KVM_X86_SEV_ES_VM for
+  KVM_SEV_ES_INIT
+
+* the ``flags`` field of ``struct kvm_sev_init`` is set to zero
+
+* the ``vmsa_features`` field of ``struct kvm_sev_init`` is set to all features
+  supported by the hypervisor (as returned by ``KVM_GET_DEVICE_ATTR`` when
+  passed group 0 and attribute id ``KVM_X86_SEV_VMSA_FEATURES``).
+
+If the ``KVM_X86_SEV_VMSA_FEATURES`` attribute does not exist, the hypervisor only
+supports KVM_SEV_INIT and KVM_SEV_ES_INIT.  In that case the set of VMSA features is
+undefined.
+
 2. KVM_SEV_LAUNCH_START
 -----------------------
 
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 7c46e96cfe62..6baf18335c7b 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -683,6 +683,9 @@ enum sev_cmd_id {
 	/* Guest Migration Extension */
 	KVM_SEV_SEND_CANCEL,
 
+	/* Second time is the charm; improved versions of the above ioctls.  */
+	KVM_SEV_INIT2,
+
 	KVM_SEV_NR_MAX,
 };
 
@@ -694,6 +697,13 @@ struct kvm_sev_cmd {
 	__u32 sev_fd;
 };
 
+struct kvm_sev_init {
+	__u32 vm_type;
+	__u32 flags;
+	__u64 vmsa_features;
+	__u32 pad[8];
+};
+
 struct kvm_sev_launch_start {
 	__u32 handle;
 	__u32 policy;
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index acf5c45ef14e..78c52764453f 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -252,7 +252,9 @@ static void sev_unbind_asid(struct kvm *kvm, unsigned int handle)
 	sev_decommission(handle);
 }
 
-static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
+static int __sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp,
+			    struct kvm_sev_init *data,
+			    unsigned long vm_type)
 {
 	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
 	int asid, ret;
@@ -260,7 +262,10 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	if (kvm->created_vcpus)
 		return -EINVAL;
 
-	if (kvm->arch.vm_type != KVM_X86_DEFAULT_VM)
+	if (data->flags)
+		return -EINVAL;
+
+	if (data->vmsa_features & ~sev_supported_vmsa_features)
 		return -EINVAL;
 
 	ret = -EBUSY;
@@ -268,8 +273,8 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
 		return ret;
 
 	sev->active = true;
-	sev->es_active = argp->id == KVM_SEV_ES_INIT;
-	sev->vmsa_features = sev_supported_vmsa_features;
+	sev->es_active = (vm_type & __KVM_X86_PROTECTED_STATE_TYPE) != 0;
+	sev->vmsa_features = data->vmsa_features;
 
 	asid = sev_asid_new(sev);
 	if (asid < 0)
@@ -298,6 +303,38 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_init data = {
+		.vmsa_features = sev_supported_vmsa_features,
+	};
+	unsigned long vm_type;
+
+	if (kvm->arch.vm_type != KVM_X86_DEFAULT_VM)
+		return -EINVAL;
+
+	vm_type = (argp->id == KVM_SEV_INIT ? KVM_X86_SEV_VM : KVM_X86_SEV_ES_VM);
+	return __sev_guest_init(kvm, argp, &data, vm_type);
+}
+
+static int sev_guest_init2(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct kvm_sev_init data;
+
+	if (!sev->need_init)
+		return -EINVAL;
+
+	if (kvm->arch.vm_type != KVM_X86_SEV_VM &&
+	    kvm->arch.vm_type != KVM_X86_SEV_ES_VM)
+		return -EINVAL;
+
+	if (copy_from_user(&data, (void __user *)(uintptr_t)argp->data, sizeof(data)))
+		return -EFAULT;
+
+	return __sev_guest_init(kvm, argp, &data, kvm->arch.vm_type);
+}
+
 static int sev_bind_asid(struct kvm *kvm, unsigned int handle, int *error)
 {
 	struct sev_data_activate activate;
@@ -1915,6 +1952,9 @@ int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 	case KVM_SEV_INIT:
 		r = sev_guest_init(kvm, &sev_cmd);
 		break;
+	case KVM_SEV_INIT2:
+		r = sev_guest_init2(kvm, &sev_cmd);
+		break;
 	case KVM_SEV_LAUNCH_START:
 		r = sev_launch_start(kvm, &sev_cmd);
 		break;
-- 
2.39.0



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 10/10] selftests: kvm: add tests for KVM_SEV_INIT2
  2024-02-09 18:37 [PATCH 00/10] KVM: SEV: allow customizing VMSA features Paolo Bonzini
                   ` (8 preceding siblings ...)
  2024-02-09 18:37 ` [PATCH 09/10] KVM: SEV: introduce KVM_SEV_INIT2 operation Paolo Bonzini
@ 2024-02-09 18:37 ` Paolo Bonzini
  2024-02-09 18:37 ` [PATCH 11/10] selftests: kvm: switch sev_migrate_tests to KVM_SEV_INIT2 Paolo Bonzini
  2024-02-09 19:40 ` [PATCH 00/10] KVM: SEV: allow customizing VMSA features Sean Christopherson
  11 siblings, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2024-02-09 18:37 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, michael.roth, aik, isaku.yamahata

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/include/kvm_util_base.h     |   6 +-
 .../selftests/kvm/set_memory_region_test.c    |   8 +-
 .../selftests/kvm/x86_64/sev_init2_tests.c    | 147 ++++++++++++++++++
 4 files changed, 154 insertions(+), 8 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86_64/sev_init2_tests.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 492e937fab00..a81a89b1dc2a 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -116,6 +116,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/tsc_msrs_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_pmu_caps_test
 TEST_GEN_PROGS_x86_64 += x86_64/xen_shinfo_test
 TEST_GEN_PROGS_x86_64 += x86_64/xen_vmcall_test
+TEST_GEN_PROGS_x86_64 += x86_64/sev_init2_tests
 TEST_GEN_PROGS_x86_64 += x86_64/sev_migrate_tests
 TEST_GEN_PROGS_x86_64 += x86_64/amx_test
 TEST_GEN_PROGS_x86_64 += x86_64/max_vcpuid_cap_test
diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 9e5afc472c14..44b6ea56a205 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -846,17 +846,15 @@ static inline struct kvm_vm *vm_create_barebones(void)
 	return ____vm_create(VM_SHAPE_DEFAULT);
 }
 
-#ifdef __x86_64__
-static inline struct kvm_vm *vm_create_barebones_protected_vm(void)
+static inline struct kvm_vm *vm_create_barebones_type(unsigned long type)
 {
 	const struct vm_shape shape = {
 		.mode = VM_MODE_DEFAULT,
-		.type = KVM_X86_SW_PROTECTED_VM,
+		.type = type,
 	};
 
 	return ____vm_create(shape);
 }
-#endif
 
 static inline struct kvm_vm *vm_create(uint32_t nr_runnable_vcpus)
 {
diff --git a/tools/testing/selftests/kvm/set_memory_region_test.c b/tools/testing/selftests/kvm/set_memory_region_test.c
index 075b80dbe237..0ac6c21d7251 100644
--- a/tools/testing/selftests/kvm/set_memory_region_test.c
+++ b/tools/testing/selftests/kvm/set_memory_region_test.c
@@ -339,7 +339,7 @@ static void test_invalid_memory_region_flags(void)
 
 #ifdef __x86_64__
 	if (kvm_check_cap(KVM_CAP_VM_TYPES) & BIT(KVM_X86_SW_PROTECTED_VM))
-		vm = vm_create_barebones_protected_vm();
+		vm = vm_create_barebones_type(KVM_X86_SW_PROTECTED_VM);
 	else
 #endif
 		vm = vm_create_barebones();
@@ -452,7 +452,7 @@ static void test_add_private_memory_region(void)
 
 	pr_info("Testing ADD of KVM_MEM_GUEST_MEMFD memory regions\n");
 
-	vm = vm_create_barebones_protected_vm();
+	vm = vm_create_barebones_type(KVM_X86_SW_PROTECTED_VM);
 
 	test_invalid_guest_memfd(vm, vm->kvm_fd, 0, "KVM fd should fail");
 	test_invalid_guest_memfd(vm, vm->fd, 0, "VM's fd should fail");
@@ -461,7 +461,7 @@ static void test_add_private_memory_region(void)
 	test_invalid_guest_memfd(vm, memfd, 0, "Regular memfd() should fail");
 	close(memfd);
 
-	vm2 = vm_create_barebones_protected_vm();
+	vm2 = vm_create_barebones_type(KVM_X86_SW_PROTECTED_VM);
 	memfd = vm_create_guest_memfd(vm2, MEM_REGION_SIZE, 0);
 	test_invalid_guest_memfd(vm, memfd, 0, "Other VM's guest_memfd() should fail");
 
@@ -489,7 +489,7 @@ static void test_add_overlapping_private_memory_regions(void)
 
 	pr_info("Testing ADD of overlapping KVM_MEM_GUEST_MEMFD memory regions\n");
 
-	vm = vm_create_barebones_protected_vm();
+	vm = vm_create_barebones_type(KVM_X86_SW_PROTECTED_VM);
 
 	memfd = vm_create_guest_memfd(vm, MEM_REGION_SIZE * 4, 0);
 
diff --git a/tools/testing/selftests/kvm/x86_64/sev_init2_tests.c b/tools/testing/selftests/kvm/x86_64/sev_init2_tests.c
new file mode 100644
index 000000000000..639863978405
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/sev_init2_tests.c
@@ -0,0 +1,147 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <linux/kvm.h>
+#include <linux/psp-sev.h>
+#include <stdio.h>
+#include <sys/ioctl.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <pthread.h>
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "processor.h"
+#include "svm_util.h"
+#include "kselftest.h"
+
+#define SVM_SEV_FEAT_DEBUG_SWAP 32u
+
+/*
+ * Some features may have hidden dependencies, or may only work
+ * for certain VM types.  Err on the side of safety and don't
+ * expect that all supported features can be passed one by one
+ * to KVM_SEV_INIT2.
+ *
+ * (Well, right now there's only one...)
+ */
+#define KNOWN_FEATURES SVM_SEV_FEAT_DEBUG_SWAP
+
+int kvm_fd;
+u64 supported_features;
+bool have_sev_es;
+
+static int __sev_ioctl(int vm_fd, int cmd_id, void *data)
+{
+	struct kvm_sev_cmd cmd = {
+		.id = cmd_id,
+		.data = (uint64_t)data,
+		.sev_fd = open_sev_dev_path_or_exit(),
+	};
+	int ret;
+
+	ret = ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);
+	TEST_ASSERT(ret < 0 || cmd.error == SEV_RET_SUCCESS,
+		    "%d failed: fw error: %d\n",
+		    cmd_id, cmd.error);
+
+	return ret;
+}
+
+static void test_init2(unsigned long vm_type, struct kvm_sev_init *init)
+{
+	struct kvm_vm *vm;
+	int ret;
+
+	vm = vm_create_barebones_type(vm_type);
+	ret = __sev_ioctl(vm->fd, KVM_SEV_INIT2, init);
+	TEST_ASSERT(ret == 0,
+		    "KVM_SEV_INIT2 return code is %d (expected 0), errno: %d",
+		    ret, errno);
+	kvm_vm_free(vm);
+}
+
+static void test_init2_invalid(unsigned long vm_type, struct kvm_sev_init *init)
+{
+	struct kvm_vm *vm;
+	int ret;
+
+	vm = vm_create_barebones_type(vm_type);
+	ret = __sev_ioctl(vm->fd, KVM_SEV_INIT2, init);
+	TEST_ASSERT(ret == -1 && errno == EINVAL,
+		    "KVM_SEV_INIT2 return code %d, errno: %d (expected EINVAL)",
+		    ret, errno);
+	kvm_vm_free(vm);
+}
+
+void test_vm_types(void)
+{
+	test_init2(KVM_X86_SEV_VM, &(struct kvm_sev_init){});
+
+	if (have_sev_es)
+		test_init2(KVM_X86_SEV_ES_VM, &(struct kvm_sev_init){});
+	else
+		test_init2_invalid(KVM_X86_SEV_ES_VM, &(struct kvm_sev_init){});
+
+	test_init2_invalid(0, &(struct kvm_sev_init){});
+	if (kvm_check_cap(KVM_CAP_VM_TYPES) & BIT(KVM_X86_SW_PROTECTED_VM))
+		test_init2_invalid(KVM_X86_SW_PROTECTED_VM, &(struct kvm_sev_init){});
+}
+
+void test_flags(uint32_t vm_type)
+{
+	int i;
+
+	for (i = 0; i < 32; i++)
+		test_init2_invalid(vm_type, &(struct kvm_sev_init){
+			.flags = 1u << i,
+		});
+}
+
+void test_features(uint32_t vm_type)
+{
+	int i;
+
+	for (i = 0; i < 64; i++) {
+		if (!(supported_features & (1u << i)))
+			test_init2_invalid(vm_type, &(struct kvm_sev_init){
+				.vm_type = vm_type,
+				.vmsa_features = 1u << i,
+			});
+		else if (KNOWN_FEATURES & (1u << i))
+			test_init2(vm_type, &(struct kvm_sev_init){
+				.vmsa_features = 1u << i,
+			});
+	}
+}
+
+int main(int argc, char *argv[])
+{
+	int kvm_fd = open_kvm_dev_path_or_exit();
+	bool have_sev;
+
+	TEST_REQUIRE(__kvm_has_device_attr(kvm_fd, 0, KVM_X86_SEV_VMSA_FEATURES) == 0);
+	kvm_device_attr_get(kvm_fd, 0, KVM_X86_SEV_VMSA_FEATURES, &supported_features );
+
+	have_sev = kvm_cpu_has(X86_FEATURE_SEV);
+	TEST_ASSERT(have_sev == !!(kvm_check_cap(KVM_CAP_VM_TYPES) & BIT(KVM_X86_SEV_VM)),
+		    "sev: KVM_CAP_VM_TYPES (%x) does not match cpuid (checking %x)",
+		    kvm_check_cap(KVM_CAP_VM_TYPES), 1 << KVM_X86_SEV_VM);
+
+	TEST_REQUIRE(kvm_check_cap(KVM_CAP_VM_TYPES) & BIT(KVM_X86_SEV_VM));
+	have_sev_es = kvm_cpu_has(X86_FEATURE_SEV_ES);
+
+	TEST_ASSERT(have_sev_es == !!(kvm_check_cap(KVM_CAP_VM_TYPES) & BIT(KVM_X86_SEV_ES_VM)),
+		    "sev-es: KVM_CAP_VM_TYPES (%x) does not match cpuid (checking %x)",
+		    kvm_check_cap(KVM_CAP_VM_TYPES), 1 << KVM_X86_SEV_ES_VM);
+
+	test_vm_types();
+
+	test_flags(KVM_X86_SEV_VM);
+	if (have_sev_es)
+		test_flags(KVM_X86_SEV_ES_VM);
+
+	test_features(KVM_X86_SEV_VM);
+	if (have_sev_es)
+		test_features(KVM_X86_SEV_ES_VM);
+
+	return 0;
+}
-- 
2.39.0



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 11/10] selftests: kvm: switch sev_migrate_tests to KVM_SEV_INIT2
  2024-02-09 18:37 [PATCH 00/10] KVM: SEV: allow customizing VMSA features Paolo Bonzini
                   ` (9 preceding siblings ...)
  2024-02-09 18:37 ` [PATCH 10/10] selftests: kvm: add tests for KVM_SEV_INIT2 Paolo Bonzini
@ 2024-02-09 18:37 ` Paolo Bonzini
  2024-02-09 19:40 ` [PATCH 00/10] KVM: SEV: allow customizing VMSA features Sean Christopherson
  11 siblings, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2024-02-09 18:37 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, michael.roth, aik, isaku.yamahata

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 .../selftests/kvm/x86_64/sev_migrate_tests.c  | 45 ++++++++++---------
 1 file changed, 23 insertions(+), 22 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/sev_migrate_tests.c b/tools/testing/selftests/kvm/x86_64/sev_migrate_tests.c
index c7ef97561038..301f7083cad0 100644
--- a/tools/testing/selftests/kvm/x86_64/sev_migrate_tests.c
+++ b/tools/testing/selftests/kvm/x86_64/sev_migrate_tests.c
@@ -50,11 +50,12 @@ static void sev_ioctl(int vm_fd, int cmd_id, void *data)
 static struct kvm_vm *sev_vm_create(bool es)
 {
 	struct kvm_vm *vm;
+	struct kvm_sev_init init = { 0 };
 	struct kvm_sev_launch_start start = { 0 };
 	int i;
 
-	vm = vm_create_barebones();
-	sev_ioctl(vm->fd, es ? KVM_SEV_ES_INIT : KVM_SEV_INIT, NULL);
+	vm = vm_create_barebones_type(es ? KVM_X86_SEV_ES_VM : KVM_X86_SEV_VM);
+	sev_ioctl(vm->fd, KVM_SEV_INIT2, &init);
 	for (i = 0; i < NR_MIGRATE_TEST_VCPUS; ++i)
 		__vm_vcpu_add(vm, i);
 	if (es)
@@ -65,12 +66,12 @@ static struct kvm_vm *sev_vm_create(bool es)
 	return vm;
 }
 
-static struct kvm_vm *aux_vm_create(bool with_vcpus)
+static struct kvm_vm *aux_vm_create(bool es, bool with_vcpus)
 {
 	struct kvm_vm *vm;
 	int i;
 
-	vm = vm_create_barebones();
+	vm = vm_create_barebones_type(es ? KVM_X86_SEV_ES_VM : KVM_X86_SEV_VM);
 	if (!with_vcpus)
 		return vm;
 
@@ -102,7 +103,7 @@ static void test_sev_migrate_from(bool es)
 
 	src_vm = sev_vm_create(es);
 	for (i = 0; i < NR_MIGRATE_TEST_VMS; ++i)
-		dst_vms[i] = aux_vm_create(true);
+		dst_vms[i] = aux_vm_create(es, true);
 
 	/* Initial migration from the src to the first dst. */
 	sev_migrate_from(dst_vms[0], src_vm);
@@ -164,16 +165,17 @@ static void test_sev_migrate_locking(void)
 
 static void test_sev_migrate_parameters(void)
 {
-	struct kvm_vm *sev_vm, *sev_es_vm, *vm_no_vcpu, *vm_no_sev,
+	struct kvm_vm *sev_vm, *sev_es_vm, *vm_no_vcpu,
 		*sev_es_vm_no_vmsa;
 	int ret;
 
 	vm_no_vcpu = vm_create_barebones();
-	vm_no_sev = aux_vm_create(true);
-	ret = __sev_migrate_from(vm_no_vcpu, vm_no_sev);
+	sev_vm = aux_vm_create(false, true);
+	ret = __sev_migrate_from(vm_no_vcpu, sev_vm);
 	TEST_ASSERT(ret == -1 && errno == EINVAL,
 		    "Migrations require SEV enabled. ret %d, errno: %d\n", ret,
 		    errno);
+	kvm_vm_free(sev_vm);
 
 	if (!have_sev_es)
 		goto out;
@@ -213,7 +215,6 @@ static void test_sev_migrate_parameters(void)
 	kvm_vm_free(sev_es_vm_no_vmsa);
 out:
 	kvm_vm_free(vm_no_vcpu);
-	kvm_vm_free(vm_no_sev);
 }
 
 static int __sev_mirror_create(struct kvm_vm *dst, struct kvm_vm *src)
@@ -272,7 +273,7 @@ static void test_sev_mirror(bool es)
 	int i;
 
 	src_vm = sev_vm_create(es);
-	dst_vm = aux_vm_create(false);
+	dst_vm = aux_vm_create(es, false);
 
 	sev_mirror_create(dst_vm, src_vm);
 
@@ -295,8 +296,8 @@ static void test_sev_mirror_parameters(void)
 	int ret;
 
 	sev_vm = sev_vm_create(/* es= */ false);
-	vm_with_vcpu = aux_vm_create(true);
-	vm_no_vcpu = aux_vm_create(false);
+	vm_with_vcpu = aux_vm_create(false, true);
+	vm_no_vcpu = aux_vm_create(false, false);
 
 	ret = __sev_mirror_create(sev_vm, sev_vm);
 	TEST_ASSERT(
@@ -345,13 +346,13 @@ static void test_sev_move_copy(void)
 		      *dst_mirror_vm, *dst2_mirror_vm, *dst3_mirror_vm;
 
 	sev_vm = sev_vm_create(/* es= */ false);
-	dst_vm = aux_vm_create(true);
-	dst2_vm = aux_vm_create(true);
-	dst3_vm = aux_vm_create(true);
-	mirror_vm = aux_vm_create(false);
-	dst_mirror_vm = aux_vm_create(false);
-	dst2_mirror_vm = aux_vm_create(false);
-	dst3_mirror_vm = aux_vm_create(false);
+	dst_vm = aux_vm_create(false, true);
+	dst2_vm = aux_vm_create(false, true);
+	dst3_vm = aux_vm_create(false, true);
+	mirror_vm = aux_vm_create(false, false);
+	dst_mirror_vm = aux_vm_create(false, false);
+	dst2_mirror_vm = aux_vm_create(false, false);
+	dst3_mirror_vm = aux_vm_create(false, false);
 
 	sev_mirror_create(mirror_vm, sev_vm);
 
@@ -378,9 +379,9 @@ static void test_sev_move_copy(void)
 	 * destruction is done safely.
 	 */
 	sev_vm = sev_vm_create(/* es= */ false);
-	dst_vm = aux_vm_create(true);
-	mirror_vm = aux_vm_create(false);
-	dst_mirror_vm = aux_vm_create(false);
+	dst_vm = aux_vm_create(false, true);
+	mirror_vm = aux_vm_create(false, false);
+	dst_mirror_vm = aux_vm_create(false, false);
 
 	sev_mirror_create(mirror_vm, sev_vm);
 
-- 
2.39.0


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH 00/10] KVM: SEV: allow customizing VMSA features
  2024-02-09 18:37 [PATCH 00/10] KVM: SEV: allow customizing VMSA features Paolo Bonzini
                   ` (10 preceding siblings ...)
  2024-02-09 18:37 ` [PATCH 11/10] selftests: kvm: switch sev_migrate_tests to KVM_SEV_INIT2 Paolo Bonzini
@ 2024-02-09 19:40 ` Sean Christopherson
  2024-02-09 22:40   ` Paolo Bonzini
  11 siblings, 1 reply; 35+ messages in thread
From: Sean Christopherson @ 2024-02-09 19:40 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, michael.roth, aik, isaku.yamahata

On Fri, Feb 09, 2024, Paolo Bonzini wrote:
> The idea that no parameter would ever be necessary when enabling SEV or
> SEV-ES for a VM was decidedly optimistic.

That implies there was a conscious decision regarding the uAPI.  AFAICT, all of
the SEV uAPIs are direct reflections of the PSP invocations.  Which is why I'm
being so draconian about the SNP uAPIs; this time around, we need to actually
design something.

> The first source of variability that was encountered is the desired set of
> VMSA features, as that affects the measurement of the VM's initial state and
> cannot be changed arbitrarily by the hypervisor.
> 
> This series adds all the APIs that are needed to customize the features,
> with room for future enhancements:
> 
> - a new /dev/kvm device attribute to retrieve the set of supported
>   features (right now, only debug swap)
> 
> - a new sub-operation for KVM_MEM_ENCRYPT_OP that can take a struct,
>   replacing the existing KVM_SEV_INIT and KVM_SEV_ES_INIT
> 
> It then puts the new op to work by including the VMSA features as a field
> of the The existing KVM_SEV_INIT and KVM_SEV_ES_INIT use the full set of
> supported VMSA features for backwards compatibility; but I am considering
> also making them use zero as the feature mask, and will gladly adjust the
> patches if so requested.

Rather than add a new KVM_MEMORY_ENCRYPT_OP, I think we should go for broke and
start building the generic set of "protected VM" APIs.  E.g. TDX wants to add
KVM_TDX_INIT_VM, and I'm guessing ARM needs similar functionality.  And AFAIK,
every technology follows an INIT => ADD (MEASURE) * N => FINALIZE type sequence.

If need be, I would rather have a massive union, a la kvm_run, to hold the vendor
specific bits than end up with sub-sub-ioctls and every vendor implementation
reinventing the wheel.

If it's sane and feasible for userspace, maybe even KVM_CREATE_VM2?

> In order to avoid creating *two* new KVM_MEM_ENCRYPT_OPs, I decided that
> I could as well make SEV and SEV-ES use VM types.  And then, why not make
> a SEV-ES VM, when created with the new VM type instead of KVM_SEV_ES_INIT,
> reject KVM_GET_REGS/KVM_SET_REGS and friends on the vCPU file descriptor
> once the VMSA has been encrypted...  Which is how the API should have
> always behaved.

+1000

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 00/10] KVM: SEV: allow customizing VMSA features
  2024-02-09 19:40 ` [PATCH 00/10] KVM: SEV: allow customizing VMSA features Sean Christopherson
@ 2024-02-09 22:40   ` Paolo Bonzini
  2024-02-13  2:46     ` Sean Christopherson
  0 siblings, 1 reply; 35+ messages in thread
From: Paolo Bonzini @ 2024-02-09 22:40 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: linux-kernel, kvm, michael.roth, aik, isaku.yamahata

On Fri, Feb 9, 2024 at 8:40 PM Sean Christopherson <seanjc@google.com> wrote:
> On Fri, Feb 09, 2024, Paolo Bonzini wrote:
> > The idea that no parameter would ever be necessary when enabling SEV or
> > SEV-ES for a VM was decidedly optimistic.
>
> That implies there was a conscious decision regarding the uAPI.  AFAICT, all of
> the SEV uAPIs are direct reflections of the PSP invocations.  Which is why I'm
> being so draconian about the SNP uAPIs; this time around, we need to actually
> design something.

You liked that word, heh? :) The part that I am less sure about, is
that it's actually _possible_ to design something.

If you end up with a KVM_CREATE_VM2 whose arguments are

   uint32_t flags;
   uint32_t vm_type;
   uint64_t arch_mishmash_0; /* Intel only */
   uint64_t arch_mishmash_1; /* AMD only */
   uint64_t arch_mishmash_2; /* Intel only */
   uint64_t arch_mishmash_3; /* AMD only */

and half of the flags are Intel only, the other half are AMD only...
do you actually gain anything over a vendor-specific ioctl?

Case in point being that the SEV VMSA features would be one of the
fields above, and they would obviously not be available for TDX.

kvm_run is a different story because it's the result of mmap, and not
a ioctl. But in this case we have:

- pretty generic APIs like UPDATE_DATA and MEASURE that should just be
renamed to remove SEV references. Even DBG_DECRYPT and DBG_ENCRYPT
fall in this category

- APIs that seem okay but may depend on specific initialization flows,
for example LAUNCH_UPDATE_VMSA. One example of the problems with
initialization flows is LAUNCH_FINISH, which seems pretty tame but is
different between SEV{,-ES} and SNP. Another example could be CPUID
which is slightly different between vendors.

- APIs that are currently vendor-specific, but where a second version
has a chance of being cross-vendor, for example LAUNCH_SECRET or
GET_ATTESTATION_REPORT. Or maybe not.

- others that have no hope, because they include so many pieces of
vendor-specific data that there's hardly anything to share. INIT is
one of them. I guess you could fit the Intel CPUID square hole into
AMD's CPUID round peg or vice versa, but there's really little in
common between the two.

I think we should try to go for the first three, but realistically
will have to stop at the first one in most cases. Honestly, this
unified API effort should have started years ago if we wanted to go
there. I see where you're coming from, but the benefits are marginal
(like the amount of userspace code that could be reused) and the
effort huge in comparison. And especially, it's much worse to get an
attempt at a unified API wrong, than to have N different APIs.

This is not a free-for-all, there are definitely some
KVM_MEMORY_ENCRYPT_OP that can be shared between SEV and TDX. The
series also adds VM type support for SEV which fixes a poor past
choice. I don't think there's much to gain from sharing the whole INIT
phase though.

Paolo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 00/10] KVM: SEV: allow customizing VMSA features
  2024-02-09 22:40   ` Paolo Bonzini
@ 2024-02-13  2:46     ` Sean Christopherson
  2024-02-13 14:44       ` Paolo Bonzini
  0 siblings, 1 reply; 35+ messages in thread
From: Sean Christopherson @ 2024-02-13  2:46 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, michael.roth, aik, isaku.yamahata

On Fri, Feb 09, 2024, Paolo Bonzini wrote:
> On Fri, Feb 9, 2024 at 8:40 PM Sean Christopherson <seanjc@google.com> wrote:
> > On Fri, Feb 09, 2024, Paolo Bonzini wrote:
> > > The idea that no parameter would ever be necessary when enabling SEV or
> > > SEV-ES for a VM was decidedly optimistic.
> >
> > That implies there was a conscious decision regarding the uAPI.  AFAICT, all of
> > the SEV uAPIs are direct reflections of the PSP invocations.  Which is why I'm
> > being so draconian about the SNP uAPIs; this time around, we need to actually
> > design something.
> 
> You liked that word, heh? :) The part that I am less sure about, is
> that it's actually _possible_ to design something.
> 
> If you end up with a KVM_CREATE_VM2 whose arguments are
> 
>    uint32_t flags;
>    uint32_t vm_type;
>    uint64_t arch_mishmash_0; /* Intel only */
>    uint64_t arch_mishmash_1; /* AMD only */
>    uint64_t arch_mishmash_2; /* Intel only */
>    uint64_t arch_mishmash_3; /* AMD only */
> 
> and half of the flags are Intel only, the other half are AMD only...
> do you actually gain anything over a vendor-specific ioctl?

Sane, generic names.  I agree the code gains are likely negligible, but for me
at least, having KVM-friendly names for the commands would be immensely helpful.
E.g. for KVM_CREATE_VM2, I was thinking more like:

  __u32 flags;
  __u32 vm_type;
  union {
	struct tdx;
	struct sev;
	struct sev_es;
	struct sev_snp;
	__u8 pad[<big size>]
  };

Rinse and repeat for APIs that have a common purpose, but different payloads.

Similar to KVM_{SET,GET}_NESTED_STATE, where the data is wildly different, and
there's very little overlap between {svm,vmx}_set_nested_state(), I find it quite
valuable to have a single set of APIs.  E.g. I don't have to translate between
VMX and SVM terminology when thinking about the APIs, when discussing them, etc.

That's especially true for all this CoCo goo, where the names are ridiculously
divergent, and often not exactly intuitive.  E.g. LAUNCH_MEASURE reads like
"measure the launch", but surprise, it's "get the measurement".

> Case in point being that the SEV VMSA features would be one of the
> fields above, and they would obviously not be available for TDX.
> 
> kvm_run is a different story because it's the result of mmap, and not
> a ioctl. But in this case we have:
> 
> - pretty generic APIs like UPDATE_DATA and MEASURE that should just be
> renamed to remove SEV references. Even DBG_DECRYPT and DBG_ENCRYPT
> fall in this category
> 
> - APIs that seem okay but may depend on specific initialization flows,
> for example LAUNCH_UPDATE_VMSA. One example of the problems with
> initialization flows is LAUNCH_FINISH, which seems pretty tame but is
> different between SEV{,-ES} and SNP. Another example could be CPUID
> which is slightly different between vendors.
> 
> - APIs that are currently vendor-specific, but where a second version
> has a chance of being cross-vendor, for example LAUNCH_SECRET or
> GET_ATTESTATION_REPORT. Or maybe not.

AFAICT, LAUNCH_SECRET (a.k.a. LAUNCH_UPDATE_SECRET) and GET_ATTESTATION_REPORT
shouldn't even be a KVM APIs.  Ditto for LAUNCH_MEASURE, and probably other PSP
commands.  IIUC, userspace has the VM's firmware handle, I don't see why KVM
needs to be a middle-man.

> - others that have no hope, because they include so many pieces of
> vendor-specific data that there's hardly anything to share. INIT is
> one of them. I guess you could fit the Intel CPUID square hole into
> AMD's CPUID round peg or vice versa, but there's really little in
> common between the two.
> 
> I think we should try to go for the first three, but realistically
> will have to stop at the first one in most cases. Honestly, this
> unified API effort should have started years ago if we wanted to go
> there. I see where you're coming from, but the benefits are marginal
> (like the amount of userspace code that could be reused) and the
> effort huge in comparison.

The effort doesn't seem huge, so long as we don't try to make the parameters
common across vendor code.  The list of APIs doesn't seem insurmountable (note,
I'm not entirely sure these are correct mappings):

  create
  init VM   (LAUNCH_START / TDH.MNG.INIT)
  update    (LAUNCH_UPDATE_DATA / TDH.MEM.PAGE.ADD+TDH.MR.EXTEND)
  init vCPU (LAUNCH_UPDATE_VMSA / TDH.VP.INIT)
  finalize  (LAUNCH_FINISH / TDH.MR.FINALIZE)

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 00/10] KVM: SEV: allow customizing VMSA features
  2024-02-13  2:46     ` Sean Christopherson
@ 2024-02-13 14:44       ` Paolo Bonzini
  2024-02-17  1:40         ` Sean Christopherson
  0 siblings, 1 reply; 35+ messages in thread
From: Paolo Bonzini @ 2024-02-13 14:44 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: linux-kernel, kvm, michael.roth, aik, isaku.yamahata

On Tue, Feb 13, 2024 at 3:46 AM Sean Christopherson <seanjc@google.com> wrote:
>   __u32 flags;
>   __u32 vm_type;
>   union {
>         struct tdx;
>         struct sev;
>         struct sev_es;
>         struct sev_snp;
>         __u8 pad[<big size>]
>   };
>
> Rinse and repeat for APIs that have a common purpose, but different payloads.
>
> Similar to KVM_{SET,GET}_NESTED_STATE, where the data is wildly different, and
> there's very little overlap between {svm,vmx}_set_nested_state(), I find it quite
> valuable to have a single set of APIs.  E.g. I don't have to translate between
> VMX and SVM terminology when thinking about the APIs, when discussing them, etc.
>
> That's especially true for all this CoCo goo, where the names are ridiculously
> divergent, and often not exactly intuitive.  E.g. LAUNCH_MEASURE reads like
> "measure the launch", but surprise, it's "get the measurement".

I agree, but then you'd have to do things like "CPUID data is passed
via UPDATE_DATA for SEV and INIT_VM for TDX (and probably not at all
for pKVM)". And in one case the firmware may prefer to encrypt in
place, in the other you cannot do that at all.

There was a reason why SVM support was not added from the beginning.
Before adding nested get/set support for SVM, the whole nested
virtualization was made as similar as possible in design and
functionality to VMX. Of course it cannot be entirely the same, but
for example they share the overall idea that pending events and L2
state are taken from vCPU state; kvm_nested_state only stores global
processor state (VMXON/VMCS pointers on VMX, and GIF on SVM) and,
while in guest mode, L1 state and control bits. This ensures that the
same userspace flow can work for both VMX and SVM. However, in this
case we can't really control what is done in firmware.

> The effort doesn't seem huge, so long as we don't try to make the parameters
> common across vendor code.  The list of APIs doesn't seem insurmountable (note,
> I'm not entirely sure these are correct mappings):

While the effort isn't huge, the benefit is also pretty small, which
comes to a second big difference with GET/SET_NESTED_STATE: because
there is a GET ioctl, we have the possibility of retrieving the "black
box" and passing it back. With CoCo it's anyway userspace's task to
fill in the parameter structs. I just don't see the possibility of
sharing any code except the final ioctl, which to be honest is not
much to show. And the higher price might be in re-reviewing code that
has already been reviewed, both in KVM and in userspace.

Paolo

>   create
>   init VM   (LAUNCH_START / TDH.MNG.INIT)
>   update    (LAUNCH_UPDATE_DATA / TDH.MEM.PAGE.ADD+TDH.MR.EXTEND)
>   init vCPU (LAUNCH_UPDATE_VMSA / TDH.VP.INIT)
>   finalize  (LAUNCH_FINISH / TDH.MR.FINALIZE)

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 01/10] KVM: SEV: fix compat ABI for KVM_MEMORY_ENCRYPT_OP
  2024-02-09 18:37 ` [PATCH 01/10] KVM: SEV: fix compat ABI for KVM_MEMORY_ENCRYPT_OP Paolo Bonzini
@ 2024-02-14 22:50   ` Michael Roth
  0 siblings, 0 replies; 35+ messages in thread
From: Michael Roth @ 2024-02-14 22:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, aik, isaku.yamahata

On Fri, Feb 09, 2024 at 01:37:33PM -0500, Paolo Bonzini wrote:
> The data structs for KVM_MEMORY_ENCRYPT_OP have different sizes for 32- and 64-bit
> kernels, but they do not make any attempt to convert from one ABI to the other.
> Fix this by adding the appropriate padding.
> 
> No functional change intended for 64-bit userspace.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Michael Roth <michael.roth@amd.com>

> ---
>  arch/x86/include/uapi/asm/kvm.h | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
> 
> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> index 0ad6bda1fc39..b305daff056e 100644
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -687,6 +687,7 @@ enum sev_cmd_id {
>  
>  struct kvm_sev_cmd {
>  	__u32 id;
> +	__u32 pad0;
>  	__u64 data;
>  	__u32 error;
>  	__u32 sev_fd;
> @@ -697,28 +698,35 @@ struct kvm_sev_launch_start {
>  	__u32 policy;
>  	__u64 dh_uaddr;
>  	__u32 dh_len;
> +	__u32 pad0;
>  	__u64 session_uaddr;
>  	__u32 session_len;
> +	__u32 pad1;
>  };
>  
>  struct kvm_sev_launch_update_data {
>  	__u64 uaddr;
>  	__u32 len;
> +	__u32 pad0;
>  };
>  
>  
>  struct kvm_sev_launch_secret {
>  	__u64 hdr_uaddr;
>  	__u32 hdr_len;
> +	__u32 pad0;
>  	__u64 guest_uaddr;
>  	__u32 guest_len;
> +	__u32 pad1;
>  	__u64 trans_uaddr;
>  	__u32 trans_len;
> +	__u32 pad2;
>  };
>  
>  struct kvm_sev_launch_measure {
>  	__u64 uaddr;
>  	__u32 len;
> +	__u32 pad0;
>  };
>  
>  struct kvm_sev_guest_status {
> @@ -731,33 +739,43 @@ struct kvm_sev_dbg {
>  	__u64 src_uaddr;
>  	__u64 dst_uaddr;
>  	__u32 len;
> +	__u32 pad0;
>  };
>  
>  struct kvm_sev_attestation_report {
>  	__u8 mnonce[16];
>  	__u64 uaddr;
>  	__u32 len;
> +	__u32 pad0;
>  };
>  
>  struct kvm_sev_send_start {
>  	__u32 policy;
> +	__u32 pad0;
>  	__u64 pdh_cert_uaddr;
>  	__u32 pdh_cert_len;
> +	__u32 pad1;
>  	__u64 plat_certs_uaddr;
>  	__u32 plat_certs_len;
> +	__u32 pad2;
>  	__u64 amd_certs_uaddr;
>  	__u32 amd_certs_len;
> +	__u32 pad3;
>  	__u64 session_uaddr;
>  	__u32 session_len;
> +	__u32 pad4;
>  };
>  
>  struct kvm_sev_send_update_data {
>  	__u64 hdr_uaddr;
>  	__u32 hdr_len;
> +	__u32 pad0;
>  	__u64 guest_uaddr;
>  	__u32 guest_len;
> +	__u32 pad1;
>  	__u64 trans_uaddr;
>  	__u32 trans_len;
> +	__u32 pad2;
>  };
>  
>  struct kvm_sev_receive_start {
> @@ -765,17 +783,22 @@ struct kvm_sev_receive_start {
>  	__u32 policy;
>  	__u64 pdh_uaddr;
>  	__u32 pdh_len;
> +	__u32 pad0;
>  	__u64 session_uaddr;
>  	__u32 session_len;
> +	__u32 pad1;
>  };
>  
>  struct kvm_sev_receive_update_data {
>  	__u64 hdr_uaddr;
>  	__u32 hdr_len;
> +	__u32 pad0;
>  	__u64 guest_uaddr;
>  	__u32 guest_len;
> +	__u32 pad1;
>  	__u64 trans_uaddr;
>  	__u32 trans_len;
> +	__u32 pad2;
>  };
>  
>  #define KVM_X2APIC_API_USE_32BIT_IDS            (1ULL << 0)
> -- 
> 2.39.0
> 
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 02/10] KVM: introduce new vendor op for KVM_GET_DEVICE_ATTR
  2024-02-09 18:37 ` [PATCH 02/10] KVM: introduce new vendor op for KVM_GET_DEVICE_ATTR Paolo Bonzini
@ 2024-02-14 22:57   ` Michael Roth
  0 siblings, 0 replies; 35+ messages in thread
From: Michael Roth @ 2024-02-14 22:57 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, aik, isaku.yamahata

On Fri, Feb 09, 2024 at 01:37:34PM -0500, Paolo Bonzini wrote:
> Allow vendor modules to provide their own attributes on /dev/kvm.
> To avoid proliferation of vendor ops, implement KVM_HAS_DEVICE_ATTR
> and KVM_GET_DEVICE_ATTR in terms of the same function.  You're not
> supposed to use KVM_GET_DEVICE_ATTR to do complicated computations,
> especially on /dev/kvm.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Michael Roth <michael.roth@amd.com>

> ---
>  arch/x86/include/asm/kvm-x86-ops.h |  1 +
>  arch/x86/include/asm/kvm_host.h    |  1 +
>  arch/x86/kvm/x86.c                 | 52 +++++++++++++++++++-----------
>  3 files changed, 36 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> index 378ed944b849..ac8b7614e79d 100644
> --- a/arch/x86/include/asm/kvm-x86-ops.h
> +++ b/arch/x86/include/asm/kvm-x86-ops.h
> @@ -122,6 +122,7 @@ KVM_X86_OP(enter_smm)
>  KVM_X86_OP(leave_smm)
>  KVM_X86_OP(enable_smi_window)
>  #endif
> +KVM_X86_OP_OPTIONAL(dev_get_attr)
>  KVM_X86_OP_OPTIONAL(mem_enc_ioctl)
>  KVM_X86_OP_OPTIONAL(mem_enc_register_region)
>  KVM_X86_OP_OPTIONAL(mem_enc_unregister_region)
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index d271ba20a0b2..0bcd9ae16097 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1769,6 +1769,7 @@ struct kvm_x86_ops {
>  	void (*enable_smi_window)(struct kvm_vcpu *vcpu);
>  #endif
>  
> +	int (*dev_get_attr)(u64 attr, u64 *val);
>  	int (*mem_enc_ioctl)(struct kvm *kvm, void __user *argp);
>  	int (*mem_enc_register_region)(struct kvm *kvm, struct kvm_enc_region *argp);
>  	int (*mem_enc_unregister_region)(struct kvm *kvm, struct kvm_enc_region *argp);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index bf10a9073a09..8746530930d5 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4804,37 +4804,53 @@ static inline void __user *kvm_get_attr_addr(struct kvm_device_attr *attr)
>  	return uaddr;
>  }
>  
> -static int kvm_x86_dev_get_attr(struct kvm_device_attr *attr)
> +static int __kvm_x86_dev_get_attr(struct kvm_device_attr *attr, u64 *val)
>  {
> -	u64 __user *uaddr = kvm_get_attr_addr(attr);
> +	int r;
>  
>  	if (attr->group)
>  		return -ENXIO;
>  
> +	switch (attr->attr) {
> +	case KVM_X86_XCOMP_GUEST_SUPP:
> +		r = 0;
> +		*val = kvm_caps.supported_xcr0;
> +		break;
> +	default:
> +		r = -ENXIO;
> +		if (kvm_x86_ops.dev_get_attr)
> +			r = kvm_x86_ops.dev_get_attr(attr->attr, val);
> +		break;
> +	}
> +
> +	return r;
> +}
> +
> +static int kvm_x86_dev_get_attr(struct kvm_device_attr *attr)
> +{
> +	u64 __user *uaddr;
> +	int r;
> +	u64 val;
> +
> +	r = __kvm_x86_dev_get_attr(attr, &val);
> +	if (r < 0)
> +		return r;
> +
> +	uaddr = kvm_get_attr_addr(attr);
>  	if (IS_ERR(uaddr))
>  		return PTR_ERR(uaddr);
>  
> -	switch (attr->attr) {
> -	case KVM_X86_XCOMP_GUEST_SUPP:
> -		if (put_user(kvm_caps.supported_xcr0, uaddr))
> -			return -EFAULT;
> -		return 0;
> -	default:
> -		return -ENXIO;
> -	}
> +	if (put_user(val, uaddr))
> +		return -EFAULT;
> +
> +	return 0;
>  }
>  
>  static int kvm_x86_dev_has_attr(struct kvm_device_attr *attr)
>  {
> -	if (attr->group)
> -		return -ENXIO;
> +	u64 val;
>  
> -	switch (attr->attr) {
> -	case KVM_X86_XCOMP_GUEST_SUPP:
> -		return 0;
> -	default:
> -		return -ENXIO;
> -	}
> +	return __kvm_x86_dev_get_attr(attr, &val);
>  }
>  
>  long kvm_arch_dev_ioctl(struct file *filp,
> -- 
> 2.39.0
> 
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 03/10] Documentation: kvm/sev: separate description of firmware
  2024-02-09 18:37 ` [PATCH 03/10] Documentation: kvm/sev: separate description of firmware Paolo Bonzini
@ 2024-02-14 23:23   ` Michael Roth
  0 siblings, 0 replies; 35+ messages in thread
From: Michael Roth @ 2024-02-14 23:23 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, aik, isaku.yamahata

On Fri, Feb 09, 2024 at 01:37:35PM -0500, Paolo Bonzini wrote:
> The description of firmware is included part under the "SEV Key Management"
> header, part under the KVM_SEV_INIT ioctl.  Put these two bits together and
> and rename "SEV Key Management" to what it actually is, namely a description
> of the KVM_MEMORY_ENCRYPT_OP API.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Michael Roth <michael.roth@amd.com>

> ---
>  .../virt/kvm/x86/amd-memory-encryption.rst    | 29 +++++++++++--------
>  1 file changed, 17 insertions(+), 12 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> index 995780088eb2..37c5c37f4f6e 100644
> --- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> +++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> @@ -46,14 +46,8 @@ SEV hardware uses ASIDs to associate a memory encryption key with a VM.
>  Hence, the ASID for the SEV-enabled guests must be from 1 to a maximum value
>  defined in the CPUID 0x8000001f[ecx] field.
>  
> -SEV Key Management
> -==================
> -
> -The SEV guest key management is handled by a separate processor called the AMD
> -Secure Processor (AMD-SP). Firmware running inside the AMD-SP provides a secure
> -key management interface to perform common hypervisor activities such as
> -encrypting bootstrap code, snapshot, migrating and debugging the guest. For more
> -information, see the SEV Key Management spec [api-spec]_
> +``KVM_MEMORY_ENCRYPT_OP`` API
> +=============================
>  
>  The main ioctl to access SEV is KVM_MEMORY_ENCRYPT_OP.  If the argument
>  to KVM_MEMORY_ENCRYPT_OP is NULL, the ioctl returns 0 if SEV is enabled
> @@ -87,10 +81,6 @@ guests, such as launching, running, snapshotting, migrating and decommissioning.
>  The KVM_SEV_INIT command is used by the hypervisor to initialize the SEV platform
>  context. In a typical workflow, this command should be the first command issued.
>  
> -The firmware can be initialized either by using its own non-volatile storage or
> -the OS can manage the NV storage for the firmware using the module parameter
> -``init_ex_path``. If the file specified by ``init_ex_path`` does not exist or
> -is invalid, the OS will create or override the file with output from PSP.
>  
>  Returns: 0 on success, -negative on error
>  
> @@ -434,6 +424,21 @@ issued by the hypervisor to make the guest ready for execution.
>  
>  Returns: 0 on success, -negative on error
>  
> +Firmware Management
> +===================
> +
> +The SEV guest key management is handled by a separate processor called the AMD
> +Secure Processor (AMD-SP). Firmware running inside the AMD-SP provides a secure
> +key management interface to perform common hypervisor activities such as
> +encrypting bootstrap code, snapshot, migrating and debugging the guest. For more
> +information, see the SEV Key Management spec [api-spec]_
> +
> +The AMD-SP firmware can be initialized either by using its own non-volatile
> +storage or the OS can manage the NV storage for the firmware using
> +parameter ``init_ex_path`` of the ``ccp`` module. If the file specified
> +by ``init_ex_path`` does not exist or is invalid, the OS will create or
> +override the file with PSP non-volatile storage.
> +
>  References
>  ==========
>  
> -- 
> 2.39.0
> 
> 
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 04/10] KVM: SEV: publish supported VMSA features
  2024-02-09 18:37 ` [PATCH 04/10] KVM: SEV: publish supported VMSA features Paolo Bonzini
@ 2024-02-14 23:49   ` Michael Roth
  0 siblings, 0 replies; 35+ messages in thread
From: Michael Roth @ 2024-02-14 23:49 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, aik, isaku.yamahata

On Fri, Feb 09, 2024 at 01:37:36PM -0500, Paolo Bonzini wrote:
> Compute the set of features to be stored in the VMSA when KVM is
> initialized; move it from there into kvm_sev_info when SEV is initialized,
> and then into the initial VMSA.
> 
> The new variable can then be used to return the set of supported features
> to userspace, via the KVM_GET_DEVICE_ATTR ioctl.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  .../virt/kvm/x86/amd-memory-encryption.rst    | 12 ++++++++++
>  arch/x86/include/uapi/asm/kvm.h               |  1 +
>  arch/x86/kvm/svm/sev.c                        | 22 +++++++++++++++++--
>  arch/x86/kvm/svm/svm.c                        |  1 +
>  arch/x86/kvm/svm/svm.h                        |  1 +
>  5 files changed, 35 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> index 37c5c37f4f6e..5ed11bc16b96 100644
> --- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> +++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> @@ -424,6 +424,18 @@ issued by the hypervisor to make the guest ready for execution.
>  
>  Returns: 0 on success, -negative on error
>  
> +Device attribute API
> +====================
> +
> +Attributes of the SEV implementation can be retrieved through the
> +``KVM_HAS_DEVICE_ATTR`` and ``KVM_GET_DEVICE_ATTR`` ioctls on the ``/dev/kvm``
> +device node.
> +
> +Currently only one attribute is implemented:
> +
> +* group 0, attribute ``KVM_X86_SEV_VMSA_FEATURES``: return the set of all
> +  bits that are accepted in the ``vmsa_features`` of ``KVM_SEV_INIT2``.
> +
>  Firmware Management
>  ===================
>  
> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> index b305daff056e..cccaa5ff6d01 100644
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -459,6 +459,7 @@ struct kvm_sync_regs {
>  
>  /* attributes for system fd (group 0) */
>  #define KVM_X86_XCOMP_GUEST_SUPP	0
> +#define KVM_X86_SEV_VMSA_FEATURES	1
>  
>  struct kvm_vmx_nested_state_data {
>  	__u8 vmcs12[KVM_STATE_NESTED_VMX_VMCS_SIZE];
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index f760106c31f8..2e558f7538c2 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -65,6 +65,7 @@ module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0444);
>  #define sev_es_debug_swap_enabled false
>  #endif /* CONFIG_KVM_AMD_SEV */
>  
> +static u64 sev_supported_vmsa_features;
>  static u8 sev_enc_bit;
>  static DECLARE_RWSEM(sev_deactivate_lock);
>  static DEFINE_MUTEX(sev_bitmap_lock);

...

> @@ -2276,6 +2290,10 @@ void __init sev_hardware_setup(void)
>  	if (!sev_es_enabled || !cpu_feature_enabled(X86_FEATURE_DEBUG_SWAP) ||
>  	    !cpu_feature_enabled(X86_FEATURE_NO_NESTED_DATA_BP))
>  		sev_es_debug_swap_enabled = false;
> +
> +	sev_supported_vmsa_features = 0;

This ^ seems unecessary. Otherwise:

Reviewed-by: Michael Roth <michael.roth@amd.com>

> +	if (sev_es_debug_swap_enabled)
> +		sev_supported_vmsa_features |= SVM_SEV_FEAT_DEBUG_SWAP;
>  #endif
>  }

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 05/10] KVM: SEV: store VMSA features in kvm_sev_info
  2024-02-09 18:37 ` [PATCH 05/10] KVM: SEV: store VMSA features in kvm_sev_info Paolo Bonzini
@ 2024-02-15  0:03   ` Michael Roth
  0 siblings, 0 replies; 35+ messages in thread
From: Michael Roth @ 2024-02-15  0:03 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, aik, isaku.yamahata

On Fri, Feb 09, 2024 at 01:37:37PM -0500, Paolo Bonzini wrote:
> Right now, the set of features that are stored in the VMSA upon
> initialization is fixed and depends on the module parameters for
> kvm-amd.ko.  However, the hypervisor cannot really change it at will
> because the feature word has to match between the hypervisor and whatever
> computes a measurement of the VMSA for attestation purposes.
> 
> Add a field to kvm_set_info that holds the set of features to be stored

s/kvm_set_info_/kvm_sev_info/

Otherwise:

Reviewed-by: Michael Roth <michael.roth@amd.com>

> in the VMSA; and query it instead of referring to the module parameters.
> 
> Because KVM_SEV_INIT and KVM_SEV_ES_INIT accept no parameters, this
> does not yet introduce any functional change, but it paves the way for
> an API that allows customization of the features per-VM.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/svm/sev.c | 22 ++++++++++++++++++----
>  arch/x86/kvm/svm/svm.c |  2 +-
>  arch/x86/kvm/svm/svm.h |  3 ++-
>  3 files changed, 21 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 2e558f7538c2..712bfbc0028a 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -116,6 +116,14 @@ static inline bool is_mirroring_enc_context(struct kvm *kvm)
>  	return !!to_kvm_svm(kvm)->sev_info.enc_context_owner;
>  }
> 
> +static bool sev_vcpu_has_debug_swap(struct vcpu_svm *svm)
> +{
> +	struct kvm_vcpu *vcpu = &svm->vcpu;
> +	struct kvm_sev_info *sev = &to_kvm_svm(vcpu->kvm)->sev_info;
> +
> +	return sev->vmsa_features & SVM_SEV_FEAT_DEBUG_SWAP;
> +}
> +
>  /* Must be called with the sev_bitmap_lock held */
>  static bool __sev_recycle_asids(int min_asid, int max_asid)
>  {
> @@ -258,6 +266,8 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
> 
>  	sev->active = true;
>  	sev->es_active = argp->id == KVM_SEV_ES_INIT;
> +	sev->vmsa_features = sev_supported_vmsa_features;
> +
>  	asid = sev_asid_new(sev);
>  	if (asid < 0)
>  		goto e_no_asid;
> @@ -278,6 +288,7 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
>  	sev_asid_free(sev);
>  	sev->asid = 0;
>  e_no_asid:
> +	sev->vmsa_features = 0;
>  	sev->es_active = false;
>  	sev->active = false;
>  	return ret;
> @@ -572,6 +583,8 @@ static int sev_launch_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
> 
>  static int sev_es_sync_vmsa(struct vcpu_svm *svm)
>  {
> +	struct kvm_vcpu *vcpu = &svm->vcpu;
> +	struct kvm_sev_info *sev = &to_kvm_svm(vcpu->kvm)->sev_info;
>  	struct sev_es_save_area *save = svm->sev_es.vmsa;
> 
>  	/* Check some debug related fields before encrypting the VMSA */
> @@ -613,7 +626,7 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
>  	save->xss  = svm->vcpu.arch.ia32_xss;
>  	save->dr6  = svm->vcpu.arch.dr6;
> 
> -	save->sev_features = sev_supported_vmsa_features;
> +	save->sev_features = sev->vmsa_features;
> 
>  	pr_debug("Virtual Machine Save Area (VMSA):\n");
>  	print_hex_dump_debug("", DUMP_PREFIX_NONE, 16, 1, save, sizeof(*save), false);
> @@ -1693,6 +1706,7 @@ static void sev_migrate_from(struct kvm *dst_kvm, struct kvm *src_kvm)
>  	dst->pages_locked = src->pages_locked;
>  	dst->enc_context_owner = src->enc_context_owner;
>  	dst->es_active = src->es_active;
> +	dst->vmsa_features = src->vmsa_features;
> 
>  	src->asid = 0;
>  	src->active = false;
> @@ -3063,7 +3077,7 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm)
>  	svm_set_intercept(svm, TRAP_CR8_WRITE);
> 
>  	vmcb->control.intercepts[INTERCEPT_DR] = 0;
> -	if (!sev_es_debug_swap_enabled) {
> +	if (!sev_vcpu_has_debug_swap(svm)) {
>  		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
>  		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
>  		recalc_intercepts(svm);
> @@ -3118,7 +3132,7 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm)
>  					    sev_enc_bit));
>  }
> 
> -void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa)
> +void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_area *hostsa)
>  {
>  	/*
>  	 * All host state for SEV-ES guests is categorized into three swap types
> @@ -3146,7 +3160,7 @@ void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa)
>  	 * the CPU (Type-B). If DebugSwap is disabled/unsupported, the CPU both
>  	 * saves and loads debug registers (Type-A).
>  	 */
> -	if (sev_es_debug_swap_enabled) {
> +	if (sev_vcpu_has_debug_swap(svm)) {
>  		hostsa->dr0 = native_get_debugreg(0);
>  		hostsa->dr1 = native_get_debugreg(1);
>  		hostsa->dr2 = native_get_debugreg(2);
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index aa1792f402ab..392b9c2e2ce1 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -1523,7 +1523,7 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
>  		struct sev_es_save_area *hostsa;
>  		hostsa = (struct sev_es_save_area *)(page_address(sd->save_area) + 0x400);
> 
> -		sev_es_prepare_switch_to_guest(hostsa);
> +		sev_es_prepare_switch_to_guest(svm, hostsa);
>  	}
> 
>  	if (tsc_scaling)
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index d630026b23b0..864c782eaa58 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -85,6 +85,7 @@ struct kvm_sev_info {
>  	unsigned long pages_locked; /* Number of pages locked */
>  	struct list_head regions_list;  /* List of registered regions */
>  	u64 ap_jump_table;	/* SEV-ES AP Jump Table address */
> +	u64 vmsa_features;
>  	struct kvm *enc_context_owner; /* Owner of copied encryption context */
>  	struct list_head mirror_vms; /* List of VMs mirroring */
>  	struct list_head mirror_entry; /* Use as a list entry of mirrors */
> @@ -693,7 +694,7 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu);
>  int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in);
>  void sev_es_vcpu_reset(struct vcpu_svm *svm);
>  void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
> -void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa);
> +void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_area *hostsa);
>  void sev_es_unmap_ghcb(struct vcpu_svm *svm);
> 
>  /* vmenter.S */
> -- 
> 2.39.0
> 
> 
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 07/10] KVM: x86: Add is_vm_type_supported callback
  2024-02-09 18:37 ` [PATCH 07/10] KVM: x86: Add is_vm_type_supported callback Paolo Bonzini
@ 2024-02-15  0:33   ` Michael Roth
  2024-02-15 13:35     ` Paolo Bonzini
  0 siblings, 1 reply; 35+ messages in thread
From: Michael Roth @ 2024-02-15  0:33 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, aik, isaku.yamahata

On Fri, Feb 09, 2024 at 01:37:39PM -0500, Paolo Bonzini wrote:
> From: Isaku Yamahata <isaku.yamahata@intel.com>
> 
> Allow the backend to specify which VM types are supported.
> 
> Based on a patch by Isaku Yamahata <isaku.yamahata@intel.com>.
> 

I think this needs Isaku's SoB since he's still listed as the author.

> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/include/asm/kvm-x86-ops.h | 1 +
>  arch/x86/include/asm/kvm_host.h    | 1 +
>  arch/x86/kvm/x86.c                 | 9 ++++++++-
>  arch/x86/kvm/x86.h                 | 2 ++
>  4 files changed, 12 insertions(+), 1 deletion(-)
> 

...

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e634e5b67516..c89ddaa1e09f 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4581,12 +4581,19 @@ static int kvm_ioctl_get_supported_hv_cpuid(struct kvm_vcpu *vcpu,
>  }
>  #endif
>  
> -static bool kvm_is_vm_type_supported(unsigned long type)
> +bool __kvm_is_vm_type_supported(unsigned long type)
>  {
>  	return type == KVM_X86_DEFAULT_VM ||
>  	       (type == KVM_X86_SW_PROTECTED_VM &&
>  		IS_ENABLED(CONFIG_KVM_SW_PROTECTED_VM) && tdp_enabled);
>  }
> +EXPORT_SYMBOL_GPL(__kvm_is_vm_type_supported);

...

> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 2f7e19166658..4e40c23d66ed 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -9,6 +9,8 @@
>  #include "kvm_cache_regs.h"
>  #include "kvm_emulate.h"
>  
> +bool __kvm_is_vm_type_supported(unsigned long type);

It's not really clear from this patch/commit message why the export is
needed at this stage.

-Mike

> -- 
> 2.39.0
> 
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 08/10] KVM: SEV: define VM types for SEV and SEV-ES
  2024-02-09 18:37 ` [PATCH 08/10] KVM: SEV: define VM types for SEV and SEV-ES Paolo Bonzini
@ 2024-02-15  1:19   ` Michael Roth
  2024-02-15 13:40     ` Paolo Bonzini
  0 siblings, 1 reply; 35+ messages in thread
From: Michael Roth @ 2024-02-15  1:19 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, aik, isaku.yamahata

On Fri, Feb 09, 2024 at 01:37:40PM -0500, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  Documentation/virt/kvm/api.rst  |  2 ++
>  arch/x86/include/uapi/asm/kvm.h |  2 ++
>  arch/x86/kvm/svm/sev.c          | 18 +++++++++++++++++-
>  arch/x86/kvm/svm/svm.c          | 11 +++++++++++
>  arch/x86/kvm/svm/svm.h          |  2 ++
>  arch/x86/kvm/x86.c              |  4 ++++
>  6 files changed, 38 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 3ec0b7a455a0..bf957bb70e4b 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -8790,6 +8790,8 @@ means the VM type with value @n is supported.  Possible values of @n are::
>  
>    #define KVM_X86_DEFAULT_VM	0
>    #define KVM_X86_SW_PROTECTED_VM	1
> +  #define KVM_X86_SEV_VM	8
> +  #define KVM_X86_SEV_ES_VM	10
>  
>  9. Known KVM API problems
>  =========================
> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> index 6c74db23257e..7c46e96cfe62 100644
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -854,5 +854,7 @@ struct kvm_hyperv_eventfd {
>  
>  #define KVM_X86_DEFAULT_VM	0
>  #define KVM_X86_SW_PROTECTED_VM	(KVM_X86_DEFAULT_VM | __KVM_X86_PRIVATE_MEM_TYPE)
> +#define KVM_X86_SEV_VM		8

Hmm... would it make sense to decouple the VM types and their associated
capabilities? Only bit 2 is left in the lower range after this, and using any
bits beyond TDX's bit 4 risks overflowing check_extension ioctl's 32-bit return
value. Maybe a separate lookup table instead?

> +#define KVM_X86_SEV_ES_VM	(KVM_X86_SEV_VM | __KVM_X86_PROTECTED_STATE_TYPE)
>  
>  #endif /* _ASM_X86_KVM_H */
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 712bfbc0028a..acf5c45ef14e 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -260,6 +260,9 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
>  	if (kvm->created_vcpus)
>  		return -EINVAL;
>  
> +	if (kvm->arch.vm_type != KVM_X86_DEFAULT_VM)
> +		return -EINVAL;
> +
>  	ret = -EBUSY;
>  	if (unlikely(sev->active))
>  		return ret;
> @@ -279,6 +282,7 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
>  
>  	INIT_LIST_HEAD(&sev->regions_list);
>  	INIT_LIST_HEAD(&sev->mirror_vms);
> +	sev->need_init = false;
>  
>  	kvm_set_apicv_inhibit(kvm, APICV_INHIBIT_REASON_SEV);
>  
> @@ -1814,7 +1818,8 @@ int sev_vm_move_enc_context_from(struct kvm *kvm, unsigned int source_fd)
>  	if (ret)
>  		goto out_fput;
>  
> -	if (sev_guest(kvm) || !sev_guest(source_kvm)) {
> +	if (kvm->arch.vm_type != source_kvm->arch.vm_type ||
> +	    sev_guest(kvm) || !sev_guest(source_kvm)) {
>  		ret = -EINVAL;
>  		goto out_unlock;
>  	}
> @@ -2135,6 +2140,7 @@ int sev_vm_copy_enc_context_from(struct kvm *kvm, unsigned int source_fd)
>  	mirror_sev->asid = source_sev->asid;
>  	mirror_sev->fd = source_sev->fd;
>  	mirror_sev->es_active = source_sev->es_active;
> +	mirror_sev->need_init = false;
>  	mirror_sev->handle = source_sev->handle;
>  	INIT_LIST_HEAD(&mirror_sev->regions_list);
>  	INIT_LIST_HEAD(&mirror_sev->mirror_vms);
> @@ -3192,3 +3198,13 @@ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
>  
>  	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, 1);
>  }
> +
> +bool sev_is_vm_type_supported(unsigned long type)
> +{
> +	if (type == KVM_X86_SEV_VM)
> +		return sev_enabled;
> +	if (type == KVM_X86_SEV_ES_VM)
> +		return sev_es_enabled;
> +
> +	return false;
> +}
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 392b9c2e2ce1..87541c84d07e 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4087,6 +4087,11 @@ static void svm_cancel_injection(struct kvm_vcpu *vcpu)
>  
>  static int svm_vcpu_pre_run(struct kvm_vcpu *vcpu)
>  {
> +	struct kvm_sev_info *sev = &to_kvm_svm(vcpu->kvm)->sev_info;
> +
> +	if (sev->need_init)
> +		return -EINVAL;
> +
>  	return 1;
>  }
>  
> @@ -4888,6 +4893,11 @@ static void svm_vm_destroy(struct kvm *kvm)
>  
>  static int svm_vm_init(struct kvm *kvm)
>  {
> +	if (kvm->arch.vm_type) {
> +		struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +		sev->need_init = true;
> +	}
> +
>  	if (!pause_filter_count || !pause_filter_thresh)
>  		kvm->arch.pause_in_guest = true;
>  
> @@ -4914,6 +4924,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
>  	.vcpu_free = svm_vcpu_free,
>  	.vcpu_reset = svm_vcpu_reset,
>  
> +	.is_vm_type_supported = sev_is_vm_type_supported,
>  	.vm_size = sizeof(struct kvm_svm),
>  	.vm_init = svm_vm_init,
>  	.vm_destroy = svm_vm_destroy,
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 864c782eaa58..63be26d4a024 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -79,6 +79,7 @@ enum {
>  struct kvm_sev_info {
>  	bool active;		/* SEV enabled guest */
>  	bool es_active;		/* SEV-ES enabled guest */
> +	bool need_init;		/* waiting for SEV_INIT2 */

Seems like this should be a separate patch.

-Mike

>  	unsigned int asid;	/* ASID used for this guest */
>  	unsigned int handle;	/* SEV firmware handle */
>  	int fd;			/* SEV device fd */
> @@ -696,6 +697,7 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm);
>  void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
>  void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_area *hostsa);
>  void sev_es_unmap_ghcb(struct vcpu_svm *svm);
> +bool sev_is_vm_type_supported(unsigned long type);
>  
>  /* vmenter.S */
>  
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c89ddaa1e09f..dfc66ee091a1 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4795,6 +4795,10 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  		r = BIT(KVM_X86_DEFAULT_VM);
>  		if (kvm_is_vm_type_supported(KVM_X86_SW_PROTECTED_VM))
>  			r |= BIT(KVM_X86_SW_PROTECTED_VM);
> +		if (kvm_is_vm_type_supported(KVM_X86_SEV_VM))
> +			r |= BIT(KVM_X86_SEV_VM);
> +		if (kvm_is_vm_type_supported(KVM_X86_SEV_ES_VM))
> +			r |= BIT(KVM_X86_SEV_ES_VM);
>  		break;
>  	default:
>  		break;
> -- 
> 2.39.0
> 
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/10] KVM: SEV: introduce KVM_SEV_INIT2 operation
  2024-02-09 18:37 ` [PATCH 09/10] KVM: SEV: introduce KVM_SEV_INIT2 operation Paolo Bonzini
@ 2024-02-15  1:34   ` Michael Roth
  2024-02-15 13:44     ` Paolo Bonzini
  2024-02-15 11:07   ` Alexey Kardashevskiy
  2024-02-15 21:14   ` Tom Lendacky
  2 siblings, 1 reply; 35+ messages in thread
From: Michael Roth @ 2024-02-15  1:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, aik, isaku.yamahata

On Fri, Feb 09, 2024 at 01:37:41PM -0500, Paolo Bonzini wrote:
> The idea that no parameter would ever be necessary when enabling SEV or
> SEV-ES for a VM was decidedly optimistic.  In fact, in some sense it's
> already a parameter whether SEV or SEV-ES is desired.  Another possible
> source of variability is the desired set of VMSA features, as that affects
> the measurement of the VM's initial state and cannot be changed
> arbitrarily by the hypervisor.
> 
> Create a new sub-operation for KVM_MEM_ENCRYPT_OP that can take a struct,
> and put the new op to work by including the VMSA features as a field of the
> struct.  The existing KVM_SEV_INIT and KVM_SEV_ES_INIT use the full set of
> supported VMSA features for backwards compatibility.
> 
> The struct also includes the usual bells and whistles for future
> extensibility: a flags field that must be zero for now, and some padding
> at the end.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  .../virt/kvm/x86/amd-memory-encryption.rst    | 41 ++++++++++++++--
>  arch/x86/include/uapi/asm/kvm.h               | 10 ++++
>  arch/x86/kvm/svm/sev.c                        | 48 +++++++++++++++++--
>  3 files changed, 92 insertions(+), 7 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> index 5ed11bc16b96..a4291e7bd8ed 100644
> --- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> +++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> @@ -75,15 +75,50 @@ are defined in ``<linux/psp-dev.h>``.
>  KVM implements the following commands to support common lifecycle events of SEV
>  guests, such as launching, running, snapshotting, migrating and decommissioning.
>  
> -1. KVM_SEV_INIT
> ----------------
> +1. KVM_SEV_INIT2
> +----------------
>  
> -The KVM_SEV_INIT command is used by the hypervisor to initialize the SEV platform
> +The KVM_SEV_INIT2 command is used by the hypervisor to initialize the SEV platform
>  context. In a typical workflow, this command should be the first command issued.
>  
> +For this command to be accepted, either KVM_X86_SEV_VM or KVM_X86_SEV_ES_VM
> +must have been passed to the KVM_CREATE_VM ioctl.  A virtual machine created
> +with those machine types in turn cannot be run until KVM_SEV_INIT2 is invoked.
> +
> +Parameters: struct kvm_sev_init (in)
>  
>  Returns: 0 on success, -negative on error
>  
> +::
> +
> +        struct struct kvm_sev_init {

Missing the vm_type param here.

> +                __u32 flags;          /* must be 0 */
> +                __u64 vmsa_features;  /* initial value of features field in VMSA */
> +                __u32 pad[8];
> +        };
> +
> +It is an error if the hypervisor does not support any of the bits that
> +are set in ``flags`` or ``vmsa_features``.
> +
> +This command replaces the deprecated KVM_SEV_INIT and KVM_SEV_ES_INIT commands.
> +The commands did not have any parameters (the ```data``` field was unused) and
> +only work for the KVM_X86_DEFAULT_VM machine type (0).
> +
> +They behave as if:
> +
> +* the VM type is KVM_X86_SEV_VM for KVM_SEV_INIT, or KVM_X86_SEV_ES_VM for
> +  KVM_SEV_ES_INIT
> +
> +* the ``flags`` field of ``struct kvm_sev_init`` is set to zero
> +
> +* the ``vmsa_features`` field of ``struct kvm_sev_init`` is set to all features
> +  supported by the hypervisor (as returned by ``KVM_GET_DEVICE_ATTR`` when
> +  passed group 0 and attribute id ``KVM_X86_SEV_VMSA_FEATURES``).
> +
> +If the ``KVM_X86_SEV_VMSA_FEATURES`` attribute does not exist, the hypervisor only
> +supports KVM_SEV_INIT and KVM_SEV_ES_INIT.  In that case the set of VMSA features is
> +undefined.

It's hard to imagine userspace implementation support for querying
KVM_X86_SEV_VMSA_FEATURES but still insisting on KVM_SEV_INIT. Maybe it
would be better to just lock in that VMSA_FEATURES at what is currently
supported: DEBUG_SWAP=on/off depending on the kvm_amd module param, and
then for all other features require opt-in via KVM_SEV_INIT2, and then
bake that into the documentation. That way way they could still reference
this documentation to properly calculate measurements for older/existing
VMM implementations.

-Mike

> +
>  2. KVM_SEV_LAUNCH_START
>  -----------------------
>  
> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> index 7c46e96cfe62..6baf18335c7b 100644
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -683,6 +683,9 @@ enum sev_cmd_id {
>  	/* Guest Migration Extension */
>  	KVM_SEV_SEND_CANCEL,
>  
> +	/* Second time is the charm; improved versions of the above ioctls.  */
> +	KVM_SEV_INIT2,
> +
>  	KVM_SEV_NR_MAX,
>  };
>  
> @@ -694,6 +697,13 @@ struct kvm_sev_cmd {
>  	__u32 sev_fd;
>  };
>  
> +struct kvm_sev_init {
> +	__u32 vm_type;
> +	__u32 flags;
> +	__u64 vmsa_features;
> +	__u32 pad[8];
> +};
> +
>  struct kvm_sev_launch_start {
>  	__u32 handle;
>  	__u32 policy;
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index acf5c45ef14e..78c52764453f 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -252,7 +252,9 @@ static void sev_unbind_asid(struct kvm *kvm, unsigned int handle)
>  	sev_decommission(handle);
>  }
>  
> -static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +static int __sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp,
> +			    struct kvm_sev_init *data,
> +			    unsigned long vm_type)
>  {
>  	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
>  	int asid, ret;
> @@ -260,7 +262,10 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
>  	if (kvm->created_vcpus)
>  		return -EINVAL;
>  
> -	if (kvm->arch.vm_type != KVM_X86_DEFAULT_VM)
> +	if (data->flags)
> +		return -EINVAL;
> +
> +	if (data->vmsa_features & ~sev_supported_vmsa_features)
>  		return -EINVAL;
>  
>  	ret = -EBUSY;
> @@ -268,8 +273,8 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
>  		return ret;
>  
>  	sev->active = true;
> -	sev->es_active = argp->id == KVM_SEV_ES_INIT;
> -	sev->vmsa_features = sev_supported_vmsa_features;
> +	sev->es_active = (vm_type & __KVM_X86_PROTECTED_STATE_TYPE) != 0;
> +	sev->vmsa_features = data->vmsa_features;
>  
>  	asid = sev_asid_new(sev);
>  	if (asid < 0)
> @@ -298,6 +303,38 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
>  	return ret;
>  }
>  
> +static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +{
> +	struct kvm_sev_init data = {
> +		.vmsa_features = sev_supported_vmsa_features,
> +	};
> +	unsigned long vm_type;
> +
> +	if (kvm->arch.vm_type != KVM_X86_DEFAULT_VM)
> +		return -EINVAL;
> +
> +	vm_type = (argp->id == KVM_SEV_INIT ? KVM_X86_SEV_VM : KVM_X86_SEV_ES_VM);
> +	return __sev_guest_init(kvm, argp, &data, vm_type);
> +}
> +
> +static int sev_guest_init2(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +{
> +	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +	struct kvm_sev_init data;
> +
> +	if (!sev->need_init)
> +		return -EINVAL;
> +
> +	if (kvm->arch.vm_type != KVM_X86_SEV_VM &&
> +	    kvm->arch.vm_type != KVM_X86_SEV_ES_VM)
> +		return -EINVAL;
> +
> +	if (copy_from_user(&data, (void __user *)(uintptr_t)argp->data, sizeof(data)))
> +		return -EFAULT;
> +
> +	return __sev_guest_init(kvm, argp, &data, kvm->arch.vm_type);
> +}
> +
>  static int sev_bind_asid(struct kvm *kvm, unsigned int handle, int *error)
>  {
>  	struct sev_data_activate activate;
> @@ -1915,6 +1952,9 @@ int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
>  	case KVM_SEV_INIT:
>  		r = sev_guest_init(kvm, &sev_cmd);
>  		break;
> +	case KVM_SEV_INIT2:
> +		r = sev_guest_init2(kvm, &sev_cmd);
> +		break;
>  	case KVM_SEV_LAUNCH_START:
>  		r = sev_launch_start(kvm, &sev_cmd);
>  		break;
> -- 
> 2.39.0
> 
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/10] KVM: SEV: introduce KVM_SEV_INIT2 operation
  2024-02-09 18:37 ` [PATCH 09/10] KVM: SEV: introduce KVM_SEV_INIT2 operation Paolo Bonzini
  2024-02-15  1:34   ` Michael Roth
@ 2024-02-15 11:07   ` Alexey Kardashevskiy
  2024-02-15 21:14   ` Tom Lendacky
  2 siblings, 0 replies; 35+ messages in thread
From: Alexey Kardashevskiy @ 2024-02-15 11:07 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm; +Cc: seanjc, michael.roth, isaku.yamahata



On 10/2/24 05:37, Paolo Bonzini wrote:
> The idea that no parameter would ever be necessary when enabling SEV or
> SEV-ES for a VM was decidedly optimistic.  In fact, in some sense it's
> already a parameter whether SEV or SEV-ES is desired.  Another possible
> source of variability is the desired set of VMSA features, as that affects
> the measurement of the VM's initial state and cannot be changed
> arbitrarily by the hypervisor.
> 
> Create a new sub-operation for KVM_MEM_ENCRYPT_OP that can take a struct,

a typo here: KVM_MEMORY_ENCRYPT_OP.


-- 
Alexey


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 07/10] KVM: x86: Add is_vm_type_supported callback
  2024-02-15  0:33   ` Michael Roth
@ 2024-02-15 13:35     ` Paolo Bonzini
  0 siblings, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2024-02-15 13:35 UTC (permalink / raw)
  To: Michael Roth; +Cc: linux-kernel, kvm, seanjc, aik, isaku.yamahata

On 2/15/24 01:33, Michael Roth wrote:
>> +bool __kvm_is_vm_type_supported(unsigned long type);
> It's not really clear from this patch/commit message why the export is
> needed at this stage.

Yep, I'll remove it.

Paolo


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 08/10] KVM: SEV: define VM types for SEV and SEV-ES
  2024-02-15  1:19   ` Michael Roth
@ 2024-02-15 13:40     ` Paolo Bonzini
  0 siblings, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2024-02-15 13:40 UTC (permalink / raw)
  To: Michael Roth; +Cc: linux-kernel, kvm, seanjc, aik, isaku.yamahata

On 2/15/24 02:19, Michael Roth wrote:
>>   #define KVM_X86_DEFAULT_VM	0
>>   #define KVM_X86_SW_PROTECTED_VM	(KVM_X86_DEFAULT_VM | __KVM_X86_PRIVATE_MEM_TYPE)
>> +#define KVM_X86_SEV_VM		8
> Hmm... would it make sense to decouple the VM types and their associated
> capabilities? Only bit 2 is left in the lower range after this, and using any
> bits beyond TDX's bit 4 risks overflowing check_extension ioctl's 32-bit return
> value.

Yes, the idea was to leave 0..7 for vendor independent types (with 0 and 
1 in use), 8..15 for AMD (3 of them being reserved already for 
SEV/SEV-ES/SEV-SNP), 16..23 for Intel.

> Maybe a separate lookup table instead?

The mask was nice because it can be used in relatively hot paths... 
I'll keep them but move the constants away from uapi/ headers.

Paolo


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/10] KVM: SEV: introduce KVM_SEV_INIT2 operation
  2024-02-15  1:34   ` Michael Roth
@ 2024-02-15 13:44     ` Paolo Bonzini
  2024-02-15 14:44       ` Michael Roth
  0 siblings, 1 reply; 35+ messages in thread
From: Paolo Bonzini @ 2024-02-15 13:44 UTC (permalink / raw)
  To: Michael Roth; +Cc: linux-kernel, kvm, seanjc, aik, isaku.yamahata

On 2/15/24 02:34, Michael Roth wrote:
>> +        struct struct kvm_sev_init {
> Missing the vm_type param here.

It can go away in the struct actually.  Also, "struct struct".

>> +If the ``KVM_X86_SEV_VMSA_FEATURES`` attribute does not exist, the hypervisor only
>> +supports KVM_SEV_INIT and KVM_SEV_ES_INIT.  In that case the set of VMSA features is
>> +undefined.
> 
> It's hard to imagine userspace implementation support for querying
> KVM_X86_SEV_VMSA_FEATURES but still insisting on KVM_SEV_INIT.

... except for backwards compatibility with old kernels.  For example, 
the VMM could first invoke HAS_DEVICE_ATTR, and then fall back to 
KVM_SEV_INIT after checking that the user did not explicitly request or 
forbid one or more VMSA features.

> Maybe it
> would be better to just lock in that VMSA_FEATURES at what is currently
> supported: DEBUG_SWAP=on/off depending on the kvm_amd module param, and
> then for all other features require opt-in via KVM_SEV_INIT2, and then
> bake that into the documentation. That way way they could still reference
> this documentation to properly calculate measurements for older/existing
> VMM implementations.

Thinking more about it, I think all features including debug_swap should 
be disabled with the old SEV_INIT/SEV_ES_INIT.  Because the features 
need to match between the VMM and the measurement calculation, they need 
to be added explicitly on e.g. the QEMU command line.

Paolo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/10] KVM: SEV: introduce KVM_SEV_INIT2 operation
  2024-02-15 13:44     ` Paolo Bonzini
@ 2024-02-15 14:44       ` Michael Roth
  2024-02-15 17:28         ` Paolo Bonzini
  0 siblings, 1 reply; 35+ messages in thread
From: Michael Roth @ 2024-02-15 14:44 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, seanjc, aik, isaku.yamahata, thomas.lendacky

On Thu, Feb 15, 2024 at 02:44:47PM +0100, Paolo Bonzini wrote:
> On 2/15/24 02:34, Michael Roth wrote:
> > > +        struct struct kvm_sev_init {
> > Missing the vm_type param here.
> 
> It can go away in the struct actually.  Also, "struct struct".
> 
> > > +If the ``KVM_X86_SEV_VMSA_FEATURES`` attribute does not exist, the hypervisor only
> > > +supports KVM_SEV_INIT and KVM_SEV_ES_INIT.  In that case the set of VMSA features is
> > > +undefined.
> > 
> > It's hard to imagine userspace implementation support for querying
> > KVM_X86_SEV_VMSA_FEATURES but still insisting on KVM_SEV_INIT.
> 
> ... except for backwards compatibility with old kernels.  For example, the
> VMM could first invoke HAS_DEVICE_ATTR, and then fall back to KVM_SEV_INIT
> after checking that the user did not explicitly request or forbid one or
> more VMSA features.

What I mean is that if userspace is modified for these checks, it's
reasonable to also inform them that only VMSA features present in
those older kernels (i.e. debug-swap) will be available via KVM_SEV_INIT,
and for anything else they will need to use KVM_SEV_INIT.

That way we can provide clear documentation on what to expect regarding
VMSA features for KVM_SEV_INIT and not have to have the "undefined"
wording: it'll never use anything other than debug-swap depending on the
module param setting.

> 
> > Maybe it
> > would be better to just lock in that VMSA_FEATURES at what is currently
> > supported: DEBUG_SWAP=on/off depending on the kvm_amd module param, and
> > then for all other features require opt-in via KVM_SEV_INIT2, and then
> > bake that into the documentation. That way way they could still reference
> > this documentation to properly calculate measurements for older/existing
> > VMM implementations.
> 
> Thinking more about it, I think all features including debug_swap should be
> disabled with the old SEV_INIT/SEV_ES_INIT.  Because the features need to
> match between the VMM and the measurement calculation, they need to be added
> explicitly on e.g. the QEMU command line.

That seems reasonable, but the main thing I was hoping to avoid was
another round of VMSA features changing out from underneath the covers
again. The module param setting is something we've needed to convey
internally/externally a good bit due to the fallout and making this
change would lead to another repeat. Not the end of the world but would
be nice to avoid if possible.

-Mike

> 
> Paolo
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/10] KVM: SEV: introduce KVM_SEV_INIT2 operation
  2024-02-15 14:44       ` Michael Roth
@ 2024-02-15 17:28         ` Paolo Bonzini
  2024-02-15 17:54           ` Michael Roth
  0 siblings, 1 reply; 35+ messages in thread
From: Paolo Bonzini @ 2024-02-15 17:28 UTC (permalink / raw)
  To: Michael Roth
  Cc: linux-kernel, kvm, seanjc, aik, isaku.yamahata, thomas.lendacky

On Thu, Feb 15, 2024 at 3:44 PM Michael Roth <michael.roth@amd.com> wrote:
> What I mean is that if userspace is modified for these checks, it's
> reasonable to also inform them that only VMSA features present in
> those older kernels (i.e. debug-swap) will be available via KVM_SEV_INIT,
> and for anything else they will need to use KVM_SEV_INIT.
>
> That way we can provide clear documentation on what to expect regarding
> VMSA features for KVM_SEV_INIT and not have to have the "undefined"
> wording: it'll never use anything other than debug-swap depending on the
> module param setting.

Ah, I agree.

> That seems reasonable, but the main thing I was hoping to avoid was
> another round of VMSA features changing out from underneath the covers
> again. The module param setting is something we've needed to convey
> internally/externally a good bit due to the fallout and making this
> change would lead to another repeat. Not the end of the world but would
> be nice to avoid if possible.

The fallout was caused by old kernels not supporting debug-swap and
now by failing measurements. As far as I know there is no downside of
leaving it disabled by default, and it will fix booting old guest
kernels.

Paolo


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/10] KVM: SEV: introduce KVM_SEV_INIT2 operation
  2024-02-15 17:28         ` Paolo Bonzini
@ 2024-02-15 17:54           ` Michael Roth
  2024-02-15 18:08             ` Paolo Bonzini
  0 siblings, 1 reply; 35+ messages in thread
From: Michael Roth @ 2024-02-15 17:54 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, seanjc, aik, isaku.yamahata, thomas.lendacky

On Thu, Feb 15, 2024 at 06:28:18PM +0100, Paolo Bonzini wrote:
> On Thu, Feb 15, 2024 at 3:44 PM Michael Roth <michael.roth@amd.com> wrote:
> > What I mean is that if userspace is modified for these checks, it's
> > reasonable to also inform them that only VMSA features present in
> > those older kernels (i.e. debug-swap) will be available via KVM_SEV_INIT,
> > and for anything else they will need to use KVM_SEV_INIT.
> >
> > That way we can provide clear documentation on what to expect regarding
> > VMSA features for KVM_SEV_INIT and not have to have the "undefined"
> > wording: it'll never use anything other than debug-swap depending on the
> > module param setting.
> 
> Ah, I agree.
> 
> > That seems reasonable, but the main thing I was hoping to avoid was
> > another round of VMSA features changing out from underneath the covers
> > again. The module param setting is something we've needed to convey
> > internally/externally a good bit due to the fallout and making this
> > change would lead to another repeat. Not the end of the world but would
> > be nice to avoid if possible.
> 
> The fallout was caused by old kernels not supporting debug-swap and
> now by failing measurements. As far as I know there is no downside of
> leaving it disabled by default, and it will fix booting old guest
> kernels.

Yah, agreed on older guest kernels, but it's the measurement side of things
where we'd expect some additional fallout. The guidance was essentially that
if you run a newer host kernel with debug-swap support, you need either need
to:

  a) update your measurements to account for the additional VMSA feature
  b) disable debug-swap param to maintain previous behavior/measurement

So those who'd taken approach a) would see another unexpected measurement
change when they eventually update to a newer kernel.

-Mike

> 
> Paolo
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/10] KVM: SEV: introduce KVM_SEV_INIT2 operation
  2024-02-15 17:54           ` Michael Roth
@ 2024-02-15 18:08             ` Paolo Bonzini
  2024-02-15 20:44               ` Michael Roth
  0 siblings, 1 reply; 35+ messages in thread
From: Paolo Bonzini @ 2024-02-15 18:08 UTC (permalink / raw)
  To: Michael Roth
  Cc: linux-kernel, kvm, seanjc, aik, isaku.yamahata, thomas.lendacky

On Thu, Feb 15, 2024 at 6:55 PM Michael Roth <michael.roth@amd.com> wrote:
> > The fallout was caused by old kernels not supporting debug-swap and
> > now by failing measurements. As far as I know there is no downside of
> > leaving it disabled by default, and it will fix booting old guest
> > kernels.
>
> Yah, agreed on older guest kernels, but it's the measurement side of things
> where we'd expect some additional fallout. The guidance was essentially that
> if you run a newer host kernel with debug-swap support, you need either need
> to:
>
>   a) update your measurements to account for the additional VMSA feature
>   b) disable debug-swap param to maintain previous behavior/measurement

Out of curiosity, where was this documented? While debug-swap was a
pretty obvious culprit of the failed measurement, I didn't see any
mention to it anywhere (and also didn't see any mention that old
kernels would fail to boot in the KVM patches---which would have been
a pretty clear indication that something like these patches was
needed).

> So those who'd taken approach a) would see another unexpected measurement
> change when they eventually update to a newer kernel.

But they'd see it anyway if userspace starts disabling it by default.
In general, enabling _anything_ by default is a mistake in either KVM
or userspace if you care about guest ABI (which you obviously do in
the case of confidential computing).

Paolo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/10] KVM: SEV: introduce KVM_SEV_INIT2 operation
  2024-02-15 18:08             ` Paolo Bonzini
@ 2024-02-15 20:44               ` Michael Roth
  0 siblings, 0 replies; 35+ messages in thread
From: Michael Roth @ 2024-02-15 20:44 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, seanjc, aik, isaku.yamahata, thomas.lendacky,
	Larry.Dewey

On Thu, Feb 15, 2024 at 07:08:06PM +0100, Paolo Bonzini wrote:
> On Thu, Feb 15, 2024 at 6:55 PM Michael Roth <michael.roth@amd.com> wrote:
> > > The fallout was caused by old kernels not supporting debug-swap and
> > > now by failing measurements. As far as I know there is no downside of
> > > leaving it disabled by default, and it will fix booting old guest
> > > kernels.
> >
> > Yah, agreed on older guest kernels, but it's the measurement side of things
> > where we'd expect some additional fallout. The guidance was essentially that
> > if you run a newer host kernel with debug-swap support, you need either need
> > to:
> >
> >   a) update your measurements to account for the additional VMSA feature
> >   b) disable debug-swap param to maintain previous behavior/measurement
> 
> Out of curiosity, where was this documented? While debug-swap was a
> pretty obvious culprit of the failed measurement, I didn't see any
> mention to it anywhere (and also didn't see any mention that old
> kernels would fail to boot in the KVM patches---which would have been
> a pretty clear indication that something like these patches was
> needed).

Yes, this was reactive rather than proactive guidance unfortunately,
resulting from various internal/external bug reports where we needed to
suggest the above-mentioned options.

In retrospect, I think we would've handled things differently as well.
Which is why I'm hoping it's possible to ease the pain of another
potential measurement change for those who've since incorporated
debug-swap into their measurement workflow. But maybe it's not
realistic...

> 
> > So those who'd taken approach a) would see another unexpected measurement
> > change when they eventually update to a newer kernel.
> 
> But they'd see it anyway if userspace starts disabling it by default.

My thinking was that this wording would be specific to KVM_SEV_INIT, as
opposed to KVM_SEV_INIT2 where disabling all features by default should
absolutely be the way to go.

But realistically, it's not easy for a user to tell whether their VMM is
using KVM_SEV_INIT vs KVM_SEV_INIT2, and it does seem possible that
having the defaults be different between the 2 would also cause some
pain down the road since even in looking at the documentation it
wouldn't be immediately clear whether or not debug-swap would be enabled.

So maybe your approach of always disabling by default and requiring
userspace to opt-in would be better in the long-run since this behavior
is fairly new from a distro perspective and it's likely only
developers/early adopters that we'd be sticking the genie back in the
bottle for.

-Mike

> In general, enabling _anything_ by default is a mistake in either KVM
> or userspace if you care about guest ABI (which you obviously do in
> the case of confidential computing).
> 
> Paolo
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/10] KVM: SEV: introduce KVM_SEV_INIT2 operation
  2024-02-09 18:37 ` [PATCH 09/10] KVM: SEV: introduce KVM_SEV_INIT2 operation Paolo Bonzini
  2024-02-15  1:34   ` Michael Roth
  2024-02-15 11:07   ` Alexey Kardashevskiy
@ 2024-02-15 21:14   ` Tom Lendacky
  2 siblings, 0 replies; 35+ messages in thread
From: Tom Lendacky @ 2024-02-15 21:14 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: seanjc, michael.roth, aik, isaku.yamahata

On 2/9/24 12:37, Paolo Bonzini wrote:
> The idea that no parameter would ever be necessary when enabling SEV or
> SEV-ES for a VM was decidedly optimistic.  In fact, in some sense it's
> already a parameter whether SEV or SEV-ES is desired.  Another possible
> source of variability is the desired set of VMSA features, as that affects
> the measurement of the VM's initial state and cannot be changed
> arbitrarily by the hypervisor.
> 
> Create a new sub-operation for KVM_MEM_ENCRYPT_OP that can take a struct,
> and put the new op to work by including the VMSA features as a field of the
> struct.  The existing KVM_SEV_INIT and KVM_SEV_ES_INIT use the full set of
> supported VMSA features for backwards compatibility.
> 
> The struct also includes the usual bells and whistles for future
> extensibility: a flags field that must be zero for now, and some padding
> at the end.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   .../virt/kvm/x86/amd-memory-encryption.rst    | 41 ++++++++++++++--
>   arch/x86/include/uapi/asm/kvm.h               | 10 ++++
>   arch/x86/kvm/svm/sev.c                        | 48 +++++++++++++++++--
>   3 files changed, 92 insertions(+), 7 deletions(-)
> 

...

> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index acf5c45ef14e..78c52764453f 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -252,7 +252,9 @@ static void sev_unbind_asid(struct kvm *kvm, unsigned int handle)
>   	sev_decommission(handle);
>   }
>   
> -static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +static int __sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp,
> +			    struct kvm_sev_init *data,
> +			    unsigned long vm_type)
>   {
>   	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
>   	int asid, ret;
> @@ -260,7 +262,10 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
>   	if (kvm->created_vcpus)
>   		return -EINVAL;
>   
> -	if (kvm->arch.vm_type != KVM_X86_DEFAULT_VM)
> +	if (data->flags)
> +		return -EINVAL;
> +
> +	if (data->vmsa_features & ~sev_supported_vmsa_features)

An SEV guest doesn't have protected state and so it doesn't have a VMSA to 
which you can apply features. So this should be:

	if (vm_type != KVM_X86_SEV_VM &&
	    (data->vmsa_features & ~sev_supported_vmsa_features))

Thanks,
Tom

>   		return -EINVAL;
>   
>   	ret = -EBUSY;
> @@ -268,8 +273,8 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
>   		return ret;
>   
>   	sev->active = true;
> -	sev->es_active = argp->id == KVM_SEV_ES_INIT;
> -	sev->vmsa_features = sev_supported_vmsa_features;
> +	sev->es_active = (vm_type & __KVM_X86_PROTECTED_STATE_TYPE) != 0;
> +	sev->vmsa_features = data->vmsa_features;
>   
>   	asid = sev_asid_new(sev);
>   	if (asid < 0)
> @@ -298,6 +303,38 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
>   	return ret;
>   }
>   
> +static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +{
> +	struct kvm_sev_init data = {
> +		.vmsa_features = sev_supported_vmsa_features,
> +	};
> +	unsigned long vm_type;
> +
> +	if (kvm->arch.vm_type != KVM_X86_DEFAULT_VM)
> +		return -EINVAL;
> +
> +	vm_type = (argp->id == KVM_SEV_INIT ? KVM_X86_SEV_VM : KVM_X86_SEV_ES_VM);
> +	return __sev_guest_init(kvm, argp, &data, vm_type);
> +}
> +
> +static int sev_guest_init2(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +{
> +	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +	struct kvm_sev_init data;
> +
> +	if (!sev->need_init)
> +		return -EINVAL;
> +
> +	if (kvm->arch.vm_type != KVM_X86_SEV_VM &&
> +	    kvm->arch.vm_type != KVM_X86_SEV_ES_VM)
> +		return -EINVAL;
> +
> +	if (copy_from_user(&data, (void __user *)(uintptr_t)argp->data, sizeof(data)))
> +		return -EFAULT;
> +
> +	return __sev_guest_init(kvm, argp, &data, kvm->arch.vm_type);
> +}
> +
>   static int sev_bind_asid(struct kvm *kvm, unsigned int handle, int *error)
>   {
>   	struct sev_data_activate activate;
> @@ -1915,6 +1952,9 @@ int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
>   	case KVM_SEV_INIT:
>   		r = sev_guest_init(kvm, &sev_cmd);
>   		break;
> +	case KVM_SEV_INIT2:
> +		r = sev_guest_init2(kvm, &sev_cmd);
> +		break;
>   	case KVM_SEV_LAUNCH_START:
>   		r = sev_launch_start(kvm, &sev_cmd);
>   		break;

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 00/10] KVM: SEV: allow customizing VMSA features
  2024-02-13 14:44       ` Paolo Bonzini
@ 2024-02-17  1:40         ` Sean Christopherson
  0 siblings, 0 replies; 35+ messages in thread
From: Sean Christopherson @ 2024-02-17  1:40 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, michael.roth, aik, isaku.yamahata

On Tue, Feb 13, 2024, Paolo Bonzini wrote:
> On Tue, Feb 13, 2024 at 3:46 AM Sean Christopherson <seanjc@google.com> wrote:
> >   __u32 flags;
> >   __u32 vm_type;
> >   union {
> >         struct tdx;
> >         struct sev;
> >         struct sev_es;
> >         struct sev_snp;
> >         __u8 pad[<big size>]
> >   };
> >
> > Rinse and repeat for APIs that have a common purpose, but different payloads.
> >
> > Similar to KVM_{SET,GET}_NESTED_STATE, where the data is wildly different, and
> > there's very little overlap between {svm,vmx}_set_nested_state(), I find it quite
> > valuable to have a single set of APIs.  E.g. I don't have to translate between
> > VMX and SVM terminology when thinking about the APIs, when discussing them, etc.
> >
> > That's especially true for all this CoCo goo, where the names are ridiculously
> > divergent, and often not exactly intuitive.  E.g. LAUNCH_MEASURE reads like
> > "measure the launch", but surprise, it's "get the measurement".
> 
> I agree, but then you'd have to do things like "CPUID data is passed
> via UPDATE_DATA for SEV and INIT_VM for TDX (and probably not at all
> for pKVM)". And in one case the firmware may prefer to encrypt in
> place, in the other you cannot do that at all.
> 
> There was a reason why SVM support was not added from the beginning.
> Before adding nested get/set support for SVM, the whole nested
> virtualization was made as similar as possible in design and
> functionality to VMX. Of course it cannot be entirely the same, but
> for example they share the overall idea that pending events and L2
> state are taken from vCPU state; kvm_nested_state only stores global
> processor state (VMXON/VMCS pointers on VMX, and GIF on SVM) and,
> while in guest mode, L1 state and control bits. This ensures that the
> same userspace flow can work for both VMX and SVM. However, in this
> case we can't really control what is done in firmware.
> 
> > The effort doesn't seem huge, so long as we don't try to make the parameters
> > common across vendor code.  The list of APIs doesn't seem insurmountable (note,
> > I'm not entirely sure these are correct mappings):
> 
> While the effort isn't huge, the benefit is also pretty small, which
> comes to a second big difference with GET/SET_NESTED_STATE: because
> there is a GET ioctl, we have the possibility of retrieving the "black
> box" and passing it back. With CoCo it's anyway userspace's task to
> fill in the parameter structs. I just don't see the possibility of
> sharing any code except the final ioctl, which to be honest is not
> much to show. And the higher price might be in re-reviewing code that
> has already been reviewed, both in KVM and in userspace.

Yeah, I realize I'm probably grasping at straws.  *sigh*

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2024-02-17  1:40 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-02-09 18:37 [PATCH 00/10] KVM: SEV: allow customizing VMSA features Paolo Bonzini
2024-02-09 18:37 ` [PATCH 01/10] KVM: SEV: fix compat ABI for KVM_MEMORY_ENCRYPT_OP Paolo Bonzini
2024-02-14 22:50   ` Michael Roth
2024-02-09 18:37 ` [PATCH 02/10] KVM: introduce new vendor op for KVM_GET_DEVICE_ATTR Paolo Bonzini
2024-02-14 22:57   ` Michael Roth
2024-02-09 18:37 ` [PATCH 03/10] Documentation: kvm/sev: separate description of firmware Paolo Bonzini
2024-02-14 23:23   ` Michael Roth
2024-02-09 18:37 ` [PATCH 04/10] KVM: SEV: publish supported VMSA features Paolo Bonzini
2024-02-14 23:49   ` Michael Roth
2024-02-09 18:37 ` [PATCH 05/10] KVM: SEV: store VMSA features in kvm_sev_info Paolo Bonzini
2024-02-15  0:03   ` Michael Roth
2024-02-09 18:37 ` [PATCH 06/10] KVM: x86: define standard behavior for bits 0/1 of VM type Paolo Bonzini
2024-02-09 18:37 ` [PATCH 07/10] KVM: x86: Add is_vm_type_supported callback Paolo Bonzini
2024-02-15  0:33   ` Michael Roth
2024-02-15 13:35     ` Paolo Bonzini
2024-02-09 18:37 ` [PATCH 08/10] KVM: SEV: define VM types for SEV and SEV-ES Paolo Bonzini
2024-02-15  1:19   ` Michael Roth
2024-02-15 13:40     ` Paolo Bonzini
2024-02-09 18:37 ` [PATCH 09/10] KVM: SEV: introduce KVM_SEV_INIT2 operation Paolo Bonzini
2024-02-15  1:34   ` Michael Roth
2024-02-15 13:44     ` Paolo Bonzini
2024-02-15 14:44       ` Michael Roth
2024-02-15 17:28         ` Paolo Bonzini
2024-02-15 17:54           ` Michael Roth
2024-02-15 18:08             ` Paolo Bonzini
2024-02-15 20:44               ` Michael Roth
2024-02-15 11:07   ` Alexey Kardashevskiy
2024-02-15 21:14   ` Tom Lendacky
2024-02-09 18:37 ` [PATCH 10/10] selftests: kvm: add tests for KVM_SEV_INIT2 Paolo Bonzini
2024-02-09 18:37 ` [PATCH 11/10] selftests: kvm: switch sev_migrate_tests to KVM_SEV_INIT2 Paolo Bonzini
2024-02-09 19:40 ` [PATCH 00/10] KVM: SEV: allow customizing VMSA features Sean Christopherson
2024-02-09 22:40   ` Paolo Bonzini
2024-02-13  2:46     ` Sean Christopherson
2024-02-13 14:44       ` Paolo Bonzini
2024-02-17  1:40         ` Sean Christopherson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).