* [PATCH v11 00/20] Add Secure TSC support for SNP guests
@ 2024-07-31 15:07 Nikunj A Dadhania
From: Nikunj A Dadhania @ 2024-07-31 15:07 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj
This patchset is also available at:
https://github.com/AMDESE/linux-kvm/tree/sectsc-guest-latest
and is based on v6.11-rc1
Overview
--------
Secure TSC allows guests to use the RDTSC and RDTSCP instructions securely, as
the parameters being used cannot be changed by the hypervisor once the guest is
launched. For more details, see the AMD64 APM Vol 2, Section "Secure TSC".
In order to enable Secure TSC, SEV-SNP guests need to send a TSC_INFO guest
message before the APs are booted. Details from the TSC_INFO response will
then be used to program the VMSA before the APs are brought up. See the "SEV
Secure Nested Paging Firmware ABI Specification" document (currently at
https://www.amd.com/system/files/TechDocs/56860.pdf), section "TSC Info".
The sev-guest driver implements the communication between the guest and the
AMD Security Processor. As the TSC_INFO message needs to be sent during
early boot before the APs are started, move the guest messaging code from the
sev-guest driver to sev/core.c and provide well-defined APIs to the
sev-guest driver.
Patches:
01-04: sev-guest driver cleanup and enhancements
05: Use AES GCM library
06-07: SNP init error handling and cache secrets page address
08-10: Preparatory patches for code movement
11-12: Patches moving SNP guest messaging code from SEV guest driver to
SEV common code
13-20: SecureTSC enablement patches.
Testing SecureTSC
-----------------
SecureTSC hypervisor patches, based on top of the SEV-SNP Guest MEMFD series:
https://github.com/AMDESE/linux-kvm/tree/sectsc-host-latest
QEMU changes:
https://github.com/nikunjad/qemu/tree/snp-securetsc-latest
QEMU command line for SEV-SNP with SecureTSC:
qemu-system-x86_64 -cpu EPYC-Milan-v2 -smp 4 \
-object memory-backend-memfd,id=ram1,size=1G,share=true,prealloc=false,reserve=false \
-object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1,secure-tsc=on \
-machine q35,confidential-guest-support=sev0,memory-backend=ram1 \
...
Changelog:
----------
v11:
* Rebased on top of v6.11-rc1
* Added Acked-by/Reviewed-by
* Moved the SEV guest driver cleanups to the beginning of the series
* Commit message updates
* Enforced PAGE_SIZE constraints for snp_guest_msg
* After offline discussion with Boris, redesigned and exported the
SEV guest messaging APIs to the sev-guest driver
* Dropped VMPCK rework patches
* Make sure the movement of the SEV core routines does not break the SEV guest
driver midway through the series
v10: https://lore.kernel.org/lkml/20240621123903.2411843-1-nikunj@amd.com/
* Rebased on top of tip/x86/sev
* Added Reviewed-by from Tom
* Commit message updates
* Change the condition for better readability in get_vmpck()
* Make vmpck_id a u32 again and use VMPCK_MAX_NUM as the default value
Nikunj A Dadhania (20):
virt: sev-guest: Replace dev_dbg with pr_debug
virt: sev-guest: Rename local guest message variables
virt: sev-guest: Fix user-visible strings
virt: sev-guest: Ensure the SNP guest messages do not exceed a page
virt: sev-guest: Use AES GCM crypto library
x86/sev: Handle failures from snp_init()
x86/sev: Cache the secrets page address
virt: sev-guest: Consolidate SNP guest messaging parameters to a
struct
virt: sev-guest: Reduce the scope of SNP command mutex
virt: sev-guest: Carve out SNP message context structure
x86/sev: Carve out and export SNP guest messaging init routines
x86/sev: Relocate SNP guest messaging routines to common code
x86/cc: Add CC_ATTR_GUEST_SECURE_TSC
x86/sev: Add Secure TSC support for SNP guests
x86/sev: Change TSC MSR behavior for Secure TSC enabled guests
x86/sev: Prevent RDTSC/RDTSCP interception for Secure TSC enabled
guests
x86/sev: Allow Secure TSC feature for SNP guests
x86/sev: Mark Secure TSC as reliable clocksource
x86/kvmclock: Skip kvmclock when Secure TSC is available
x86/cpu/amd: Do not print FW_BUG for Secure TSC
arch/x86/include/asm/sev-common.h | 1 +
arch/x86/include/asm/sev.h | 166 +++++-
arch/x86/include/asm/svm.h | 6 +-
include/linux/cc_platform.h | 8 +
arch/x86/boot/compressed/sev.c | 3 +-
arch/x86/coco/core.c | 3 +
arch/x86/coco/sev/core.c | 590 ++++++++++++++++++--
arch/x86/coco/sev/shared.c | 10 +
arch/x86/kernel/cpu/amd.c | 3 +-
arch/x86/kernel/kvmclock.c | 2 +-
arch/x86/mm/mem_encrypt.c | 4 +
arch/x86/mm/mem_encrypt_amd.c | 4 +
arch/x86/mm/mem_encrypt_identity.c | 7 +
drivers/virt/coco/sev-guest/sev-guest.c | 695 +++---------------------
arch/x86/Kconfig | 1 +
drivers/virt/coco/sev-guest/Kconfig | 3 -
16 files changed, 820 insertions(+), 686 deletions(-)
base-commit: 8400291e289ee6b2bf9779ff1c83a291501f017b
--
2.34.1
* [PATCH v11 01/20] virt: sev-guest: Replace dev_dbg with pr_debug
From: Nikunj A Dadhania @ 2024-07-31 15:07 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj
In preparation for moving the code to arch/x86/coco/sev/core.c,
replace dev_dbg() with pr_debug().
No functional change.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Peter Gonda <pgonda@google.com>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
---
drivers/virt/coco/sev-guest/sev-guest.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index 6fc7884ea0a1..7d343f2c6ef8 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -296,8 +296,9 @@ static int verify_and_dec_payload(struct snp_guest_dev *snp_dev, void *payload,
struct snp_guest_msg_hdr *req_hdr = &req->hdr;
struct snp_guest_msg_hdr *resp_hdr = &resp->hdr;
- dev_dbg(snp_dev->dev, "response [seqno %lld type %d version %d sz %d]\n",
- resp_hdr->msg_seqno, resp_hdr->msg_type, resp_hdr->msg_version, resp_hdr->msg_sz);
+ pr_debug("response [seqno %lld type %d version %d sz %d]\n",
+ resp_hdr->msg_seqno, resp_hdr->msg_type, resp_hdr->msg_version,
+ resp_hdr->msg_sz);
/* Copy response from shared memory to encrypted memory. */
memcpy(resp, snp_dev->response, sizeof(*resp));
@@ -343,8 +344,8 @@ static int enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, int version, u8
if (!hdr->msg_seqno)
return -ENOSR;
- dev_dbg(snp_dev->dev, "request [seqno %lld type %d version %d sz %d]\n",
- hdr->msg_seqno, hdr->msg_type, hdr->msg_version, hdr->msg_sz);
+ pr_debug("request [seqno %lld type %d version %d sz %d]\n",
+ hdr->msg_seqno, hdr->msg_type, hdr->msg_version, hdr->msg_sz);
return __enc_payload(snp_dev, req, payload, sz);
}
--
2.34.1
* [PATCH v11 02/20] virt: sev-guest: Rename local guest message variables
From: Nikunj A Dadhania @ 2024-07-31 15:07 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj
Rename the local guest message variables for more clarity.
No functional change.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
drivers/virt/coco/sev-guest/sev-guest.c | 117 ++++++++++++------------
1 file changed, 59 insertions(+), 58 deletions(-)
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index 7d343f2c6ef8..a72fe1e959c2 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -291,45 +291,45 @@ static int dec_payload(struct snp_guest_dev *snp_dev, struct snp_guest_msg *msg,
static int verify_and_dec_payload(struct snp_guest_dev *snp_dev, void *payload, u32 sz)
{
struct snp_guest_crypto *crypto = snp_dev->crypto;
- struct snp_guest_msg *resp = &snp_dev->secret_response;
- struct snp_guest_msg *req = &snp_dev->secret_request;
- struct snp_guest_msg_hdr *req_hdr = &req->hdr;
- struct snp_guest_msg_hdr *resp_hdr = &resp->hdr;
+ struct snp_guest_msg *resp_msg = &snp_dev->secret_response;
+ struct snp_guest_msg *req_msg = &snp_dev->secret_request;
+ struct snp_guest_msg_hdr *req_msg_hdr = &req_msg->hdr;
+ struct snp_guest_msg_hdr *resp_msg_hdr = &resp_msg->hdr;
pr_debug("response [seqno %lld type %d version %d sz %d]\n",
- resp_hdr->msg_seqno, resp_hdr->msg_type, resp_hdr->msg_version,
- resp_hdr->msg_sz);
+ resp_msg_hdr->msg_seqno, resp_msg_hdr->msg_type, resp_msg_hdr->msg_version,
+ resp_msg_hdr->msg_sz);
/* Copy response from shared memory to encrypted memory. */
- memcpy(resp, snp_dev->response, sizeof(*resp));
+ memcpy(resp_msg, snp_dev->response, sizeof(*resp_msg));
/* Verify that the sequence counter is incremented by 1 */
- if (unlikely(resp_hdr->msg_seqno != (req_hdr->msg_seqno + 1)))
+ if (unlikely(resp_msg_hdr->msg_seqno != (req_msg_hdr->msg_seqno + 1)))
return -EBADMSG;
/* Verify response message type and version number. */
- if (resp_hdr->msg_type != (req_hdr->msg_type + 1) ||
- resp_hdr->msg_version != req_hdr->msg_version)
+ if (resp_msg_hdr->msg_type != (req_msg_hdr->msg_type + 1) ||
+ resp_msg_hdr->msg_version != req_msg_hdr->msg_version)
return -EBADMSG;
/*
* If the message size is greater than our buffer length then return
* an error.
*/
- if (unlikely((resp_hdr->msg_sz + crypto->a_len) > sz))
+ if (unlikely((resp_msg_hdr->msg_sz + crypto->a_len) > sz))
return -EBADMSG;
/* Decrypt the payload */
- return dec_payload(snp_dev, resp, payload, resp_hdr->msg_sz + crypto->a_len);
+ return dec_payload(snp_dev, resp_msg, payload, resp_msg_hdr->msg_sz + crypto->a_len);
}
static int enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, int version, u8 type,
void *payload, size_t sz)
{
- struct snp_guest_msg *req = &snp_dev->secret_request;
- struct snp_guest_msg_hdr *hdr = &req->hdr;
+ struct snp_guest_msg *msg = &snp_dev->secret_request;
+ struct snp_guest_msg_hdr *hdr = &msg->hdr;
- memset(req, 0, sizeof(*req));
+ memset(msg, 0, sizeof(*msg));
hdr->algo = SNP_AEAD_AES_256_GCM;
hdr->hdr_version = MSG_HDR_VER;
@@ -347,7 +347,7 @@ static int enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, int version, u8
pr_debug("request [seqno %lld type %d version %d sz %d]\n",
hdr->msg_seqno, hdr->msg_type, hdr->msg_version, hdr->msg_sz);
- return __enc_payload(snp_dev, req, payload, sz);
+ return __enc_payload(snp_dev, msg, payload, sz);
}
static int __handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
@@ -496,8 +496,8 @@ struct snp_req_resp {
static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
{
struct snp_guest_crypto *crypto = snp_dev->crypto;
- struct snp_report_req *req = &snp_dev->req.report;
- struct snp_report_resp *resp;
+ struct snp_report_req *report_req = &snp_dev->req.report;
+ struct snp_report_resp *report_resp;
int rc, resp_len;
lockdep_assert_held(&snp_cmd_mutex);
@@ -505,7 +505,7 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_io
if (!arg->req_data || !arg->resp_data)
return -EINVAL;
- if (copy_from_user(req, (void __user *)arg->req_data, sizeof(*req)))
+ if (copy_from_user(report_req, (void __user *)arg->req_data, sizeof(*report_req)))
return -EFAULT;
/*
@@ -513,30 +513,29 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_io
* response payload. Make sure that it has enough space to cover the
* authtag.
*/
- resp_len = sizeof(resp->data) + crypto->a_len;
- resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
- if (!resp)
+ resp_len = sizeof(report_resp->data) + crypto->a_len;
+ report_resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
+ if (!report_resp)
return -ENOMEM;
- rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg,
- SNP_MSG_REPORT_REQ, req, sizeof(*req), resp->data,
- resp_len);
+ rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg, SNP_MSG_REPORT_REQ,
+ report_req, sizeof(*report_req), report_resp->data, resp_len);
if (rc)
goto e_free;
- if (copy_to_user((void __user *)arg->resp_data, resp, sizeof(*resp)))
+ if (copy_to_user((void __user *)arg->resp_data, report_resp, sizeof(*report_resp)))
rc = -EFAULT;
e_free:
- kfree(resp);
+ kfree(report_resp);
return rc;
}
static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
{
- struct snp_derived_key_req *req = &snp_dev->req.derived_key;
+ struct snp_derived_key_req *derived_key_req = &snp_dev->req.derived_key;
struct snp_guest_crypto *crypto = snp_dev->crypto;
- struct snp_derived_key_resp resp = {0};
+ struct snp_derived_key_resp derived_key_resp = {0};
int rc, resp_len;
/* Response data is 64 bytes and max authsize for GCM is 16 bytes. */
u8 buf[64 + 16];
@@ -551,25 +550,27 @@ static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_reque
* response payload. Make sure that it has enough space to cover the
* authtag.
*/
- resp_len = sizeof(resp.data) + crypto->a_len;
+ resp_len = sizeof(derived_key_resp.data) + crypto->a_len;
if (sizeof(buf) < resp_len)
return -ENOMEM;
- if (copy_from_user(req, (void __user *)arg->req_data, sizeof(*req)))
+ if (copy_from_user(derived_key_req, (void __user *)arg->req_data,
+ sizeof(*derived_key_req)))
return -EFAULT;
- rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg,
- SNP_MSG_KEY_REQ, req, sizeof(*req), buf, resp_len);
+ rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg, SNP_MSG_KEY_REQ,
+ derived_key_req, sizeof(*derived_key_req), buf, resp_len);
if (rc)
return rc;
- memcpy(resp.data, buf, sizeof(resp.data));
- if (copy_to_user((void __user *)arg->resp_data, &resp, sizeof(resp)))
+ memcpy(derived_key_resp.data, buf, sizeof(derived_key_resp.data));
+ if (copy_to_user((void __user *)arg->resp_data, &derived_key_resp,
+ sizeof(derived_key_resp)))
rc = -EFAULT;
/* The response buffer contains the sensitive data, explicitly clear it. */
memzero_explicit(buf, sizeof(buf));
- memzero_explicit(&resp, sizeof(resp));
+ memzero_explicit(&derived_key_resp, sizeof(derived_key_resp));
return rc;
}
@@ -577,9 +578,9 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
struct snp_req_resp *io)
{
- struct snp_ext_report_req *req = &snp_dev->req.ext_report;
+ struct snp_ext_report_req *report_req = &snp_dev->req.ext_report;
struct snp_guest_crypto *crypto = snp_dev->crypto;
- struct snp_report_resp *resp;
+ struct snp_report_resp *report_resp;
int ret, npages = 0, resp_len;
sockptr_t certs_address;
@@ -588,22 +589,22 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
if (sockptr_is_null(io->req_data) || sockptr_is_null(io->resp_data))
return -EINVAL;
- if (copy_from_sockptr(req, io->req_data, sizeof(*req)))
+ if (copy_from_sockptr(report_req, io->req_data, sizeof(*report_req)))
return -EFAULT;
/* caller does not want certificate data */
- if (!req->certs_len || !req->certs_address)
+ if (!report_req->certs_len || !report_req->certs_address)
goto cmd;
- if (req->certs_len > SEV_FW_BLOB_MAX_SIZE ||
- !IS_ALIGNED(req->certs_len, PAGE_SIZE))
+ if (report_req->certs_len > SEV_FW_BLOB_MAX_SIZE ||
+ !IS_ALIGNED(report_req->certs_len, PAGE_SIZE))
return -EINVAL;
if (sockptr_is_kernel(io->resp_data)) {
- certs_address = KERNEL_SOCKPTR((void *)req->certs_address);
+ certs_address = KERNEL_SOCKPTR((void *)report_req->certs_address);
} else {
- certs_address = USER_SOCKPTR((void __user *)req->certs_address);
- if (!access_ok(certs_address.user, req->certs_len))
+ certs_address = USER_SOCKPTR((void __user *)report_req->certs_address);
+ if (!access_ok(certs_address.user, report_req->certs_len))
return -EFAULT;
}
@@ -613,45 +614,45 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
* the host. If host does not supply any certs in it, then copy
* zeros to indicate that certificate data was not provided.
*/
- memset(snp_dev->certs_data, 0, req->certs_len);
- npages = req->certs_len >> PAGE_SHIFT;
+ memset(snp_dev->certs_data, 0, report_req->certs_len);
+ npages = report_req->certs_len >> PAGE_SHIFT;
cmd:
/*
* The intermediate response buffer is used while decrypting the
* response payload. Make sure that it has enough space to cover the
* authtag.
*/
- resp_len = sizeof(resp->data) + crypto->a_len;
- resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
- if (!resp)
+ resp_len = sizeof(report_resp->data) + crypto->a_len;
+ report_resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
+ if (!report_resp)
return -ENOMEM;
snp_dev->input.data_npages = npages;
- ret = handle_guest_request(snp_dev, SVM_VMGEXIT_EXT_GUEST_REQUEST, arg,
- SNP_MSG_REPORT_REQ, &req->data,
- sizeof(req->data), resp->data, resp_len);
+ ret = handle_guest_request(snp_dev, SVM_VMGEXIT_EXT_GUEST_REQUEST, arg, SNP_MSG_REPORT_REQ,
+ &report_req->data, sizeof(report_req->data),
+ report_resp->data, resp_len);
/* If certs length is invalid then copy the returned length */
if (arg->vmm_error == SNP_GUEST_VMM_ERR_INVALID_LEN) {
- req->certs_len = snp_dev->input.data_npages << PAGE_SHIFT;
+ report_req->certs_len = snp_dev->input.data_npages << PAGE_SHIFT;
- if (copy_to_sockptr(io->req_data, req, sizeof(*req)))
+ if (copy_to_sockptr(io->req_data, report_req, sizeof(*report_req)))
ret = -EFAULT;
}
if (ret)
goto e_free;
- if (npages && copy_to_sockptr(certs_address, snp_dev->certs_data, req->certs_len)) {
+ if (npages && copy_to_sockptr(certs_address, snp_dev->certs_data, report_req->certs_len)) {
ret = -EFAULT;
goto e_free;
}
- if (copy_to_sockptr(io->resp_data, resp, sizeof(*resp)))
+ if (copy_to_sockptr(io->resp_data, report_resp, sizeof(*report_resp)))
ret = -EFAULT;
e_free:
- kfree(resp);
+ kfree(report_resp);
return ret;
}
--
2.34.1
* [PATCH v11 03/20] virt: sev-guest: Fix user-visible strings
From: Nikunj A Dadhania @ 2024-07-31 15:07 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj
User-visible abbreviations should be in capitals. Ensure that the messages
are readable and clear.
No functional change.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
drivers/virt/coco/sev-guest/sev-guest.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index a72fe1e959c2..3b76cbf78f41 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -114,7 +114,7 @@ static bool is_vmpck_empty(struct snp_guest_dev *snp_dev)
*/
static void snp_disable_vmpck(struct snp_guest_dev *snp_dev)
{
- dev_alert(snp_dev->dev, "Disabling vmpck_id %d to prevent IV reuse.\n",
+ dev_alert(snp_dev->dev, "Disabling VMPCK%d communication key to prevent IV reuse.\n",
vmpck_id);
memzero_explicit(snp_dev->vmpck, VMPCK_KEY_LEN);
snp_dev->vmpck = NULL;
@@ -1117,13 +1117,13 @@ static int __init sev_guest_probe(struct platform_device *pdev)
ret = -EINVAL;
snp_dev->vmpck = get_vmpck(vmpck_id, secrets, &snp_dev->os_area_msg_seqno);
if (!snp_dev->vmpck) {
- dev_err(dev, "invalid vmpck id %d\n", vmpck_id);
+ dev_err(dev, "Invalid VMPCK%d communication key\n", vmpck_id);
goto e_unmap;
}
/* Verify that VMPCK is not zero. */
if (is_vmpck_empty(snp_dev)) {
- dev_err(dev, "vmpck id %d is null\n", vmpck_id);
+ dev_err(dev, "Empty VMPCK%d communication key\n", vmpck_id);
goto e_unmap;
}
@@ -1174,7 +1174,7 @@ static int __init sev_guest_probe(struct platform_device *pdev)
if (ret)
goto e_free_cert_data;
- dev_info(dev, "Initialized SEV guest driver (using vmpck_id %d)\n", vmpck_id);
+ dev_info(dev, "Initialized SEV guest driver (using VMPCK%d communication key)\n", vmpck_id);
return 0;
e_free_cert_data:
--
2.34.1
* [PATCH v11 04/20] virt: sev-guest: Ensure the SNP guest messages do not exceed a page
From: Nikunj A Dadhania @ 2024-07-31 15:07 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj
Currently, snp_guest_msg includes a message header (96 bytes) and a
payload (4000 bytes). There is an implicit assumption here that the SNP
message header will always be 96 bytes, and with that assumption the
payload array size has been set to the magic number 4000. If any new
member is added to the SNP message header, the SNP guest message will span
more than a page.
Instead of using the magic number 4000 for the payload size, declare
snp_guest_msg in a way that the payload plus the message header do not
exceed a page.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
---
arch/x86/include/asm/sev.h | 2 +-
drivers/virt/coco/sev-guest/sev-guest.c | 3 +++
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 79bbe2be900e..ee34ab00a8d6 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -164,7 +164,7 @@ struct snp_guest_msg_hdr {
struct snp_guest_msg {
struct snp_guest_msg_hdr hdr;
- u8 payload[4000];
+ u8 payload[PAGE_SIZE - sizeof(struct snp_guest_msg_hdr)];
} __packed;
struct sev_guest_platform_data {
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index 3b76cbf78f41..0b950069bfcb 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -1131,6 +1131,9 @@ static int __init sev_guest_probe(struct platform_device *pdev)
snp_dev->dev = dev;
snp_dev->secrets = secrets;
+ /* Ensure SNP guest messages do not span more than a page */
+ BUILD_BUG_ON(sizeof(struct snp_guest_msg) > PAGE_SIZE);
+
/* Allocate the shared page used for the request and response message. */
snp_dev->request = alloc_shared_pages(dev, sizeof(struct snp_guest_msg));
if (!snp_dev->request)
--
2.34.1
* [PATCH v11 05/20] virt: sev-guest: Use AES GCM crypto library
From: Nikunj A Dadhania @ 2024-07-31 15:07 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj
The sev-guest driver encryption code uses the crypto API for SNP guest
messaging with the AMD Security Processor. In order to enable Secure TSC,
SEV-SNP guests need to send a TSC_INFO message before the APs are
booted. Details from the TSC_INFO response will then be used to program the
VMSA before the APs are brought up.
However, the crypto API is not available this early in the boot process.
In preparation for moving the encryption code out of sev-guest to support
secure TSC and to ease review, switch to using the AES GCM library
implementation instead.
Drop the __enc_payload() and dec_payload() helpers, as both are small and can
be moved to their respective callers.
CC: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Peter Gonda <pgonda@google.com>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
---
arch/x86/include/asm/sev.h | 3 +
drivers/virt/coco/sev-guest/sev-guest.c | 175 ++++++------------------
drivers/virt/coco/sev-guest/Kconfig | 4 +-
3 files changed, 43 insertions(+), 139 deletions(-)
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index ee34ab00a8d6..e7977f76d77e 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -120,6 +120,9 @@ struct snp_req_data {
};
#define MAX_AUTHTAG_LEN 32
+#define AUTHTAG_LEN 16
+#define AAD_LEN 48
+#define MSG_HDR_VER 1
/* See SNP spec SNP_GUEST_REQUEST section for the structure */
enum msg_type {
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index 0b950069bfcb..39d90dd0b012 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -17,8 +17,7 @@
#include <linux/set_memory.h>
#include <linux/fs.h>
#include <linux/tsm.h>
-#include <crypto/aead.h>
-#include <linux/scatterlist.h>
+#include <crypto/gcm.h>
#include <linux/psp-sev.h>
#include <linux/sockptr.h>
#include <linux/cleanup.h>
@@ -31,26 +30,18 @@
#include <asm/sev.h>
#define DEVICE_NAME "sev-guest"
-#define AAD_LEN 48
-#define MSG_HDR_VER 1
#define SNP_REQ_MAX_RETRY_DURATION (60*HZ)
#define SNP_REQ_RETRY_DELAY (2*HZ)
#define SVSM_MAX_RETRIES 3
-struct snp_guest_crypto {
- struct crypto_aead *tfm;
- u8 *iv, *authtag;
- int iv_len, a_len;
-};
-
struct snp_guest_dev {
struct device *dev;
struct miscdevice misc;
void *certs_data;
- struct snp_guest_crypto *crypto;
+ struct aesgcm_ctx *ctx;
/* request and response are in unencrypted memory */
struct snp_guest_msg *request, *response;
@@ -169,132 +160,31 @@ static inline struct snp_guest_dev *to_snp_dev(struct file *file)
return container_of(dev, struct snp_guest_dev, misc);
}
-static struct snp_guest_crypto *init_crypto(struct snp_guest_dev *snp_dev, u8 *key, size_t keylen)
+static struct aesgcm_ctx *snp_init_crypto(u8 *key, size_t keylen)
{
- struct snp_guest_crypto *crypto;
+ struct aesgcm_ctx *ctx;
- crypto = kzalloc(sizeof(*crypto), GFP_KERNEL_ACCOUNT);
- if (!crypto)
+ ctx = kzalloc(sizeof(*ctx), GFP_KERNEL_ACCOUNT);
+ if (!ctx)
return NULL;
- crypto->tfm = crypto_alloc_aead("gcm(aes)", 0, 0);
- if (IS_ERR(crypto->tfm))
- goto e_free;
-
- if (crypto_aead_setkey(crypto->tfm, key, keylen))
- goto e_free_crypto;
-
- crypto->iv_len = crypto_aead_ivsize(crypto->tfm);
- crypto->iv = kmalloc(crypto->iv_len, GFP_KERNEL_ACCOUNT);
- if (!crypto->iv)
- goto e_free_crypto;
-
- if (crypto_aead_authsize(crypto->tfm) > MAX_AUTHTAG_LEN) {
- if (crypto_aead_setauthsize(crypto->tfm, MAX_AUTHTAG_LEN)) {
- dev_err(snp_dev->dev, "failed to set authsize to %d\n", MAX_AUTHTAG_LEN);
- goto e_free_iv;
- }
+ if (aesgcm_expandkey(ctx, key, keylen, AUTHTAG_LEN)) {
+ pr_err("Crypto context initialization failed\n");
+ kfree(ctx);
+ return NULL;
}
- crypto->a_len = crypto_aead_authsize(crypto->tfm);
- crypto->authtag = kmalloc(crypto->a_len, GFP_KERNEL_ACCOUNT);
- if (!crypto->authtag)
- goto e_free_iv;
-
- return crypto;
-
-e_free_iv:
- kfree(crypto->iv);
-e_free_crypto:
- crypto_free_aead(crypto->tfm);
-e_free:
- kfree(crypto);
-
- return NULL;
-}
-
-static void deinit_crypto(struct snp_guest_crypto *crypto)
-{
- crypto_free_aead(crypto->tfm);
- kfree(crypto->iv);
- kfree(crypto->authtag);
- kfree(crypto);
-}
-
-static int enc_dec_message(struct snp_guest_crypto *crypto, struct snp_guest_msg *msg,
- u8 *src_buf, u8 *dst_buf, size_t len, bool enc)
-{
- struct snp_guest_msg_hdr *hdr = &msg->hdr;
- struct scatterlist src[3], dst[3];
- DECLARE_CRYPTO_WAIT(wait);
- struct aead_request *req;
- int ret;
-
- req = aead_request_alloc(crypto->tfm, GFP_KERNEL);
- if (!req)
- return -ENOMEM;
-
- /*
- * AEAD memory operations:
- * +------ AAD -------+------- DATA -----+---- AUTHTAG----+
- * | msg header | plaintext | hdr->authtag |
- * | bytes 30h - 5Fh | or | |
- * | | cipher | |
- * +------------------+------------------+----------------+
- */
- sg_init_table(src, 3);
- sg_set_buf(&src[0], &hdr->algo, AAD_LEN);
- sg_set_buf(&src[1], src_buf, hdr->msg_sz);
- sg_set_buf(&src[2], hdr->authtag, crypto->a_len);
-
- sg_init_table(dst, 3);
- sg_set_buf(&dst[0], &hdr->algo, AAD_LEN);
- sg_set_buf(&dst[1], dst_buf, hdr->msg_sz);
- sg_set_buf(&dst[2], hdr->authtag, crypto->a_len);
-
- aead_request_set_ad(req, AAD_LEN);
- aead_request_set_tfm(req, crypto->tfm);
- aead_request_set_callback(req, 0, crypto_req_done, &wait);
-
- aead_request_set_crypt(req, src, dst, len, crypto->iv);
- ret = crypto_wait_req(enc ? crypto_aead_encrypt(req) : crypto_aead_decrypt(req), &wait);
-
- aead_request_free(req);
- return ret;
-}
-
-static int __enc_payload(struct snp_guest_dev *snp_dev, struct snp_guest_msg *msg,
- void *plaintext, size_t len)
-{
- struct snp_guest_crypto *crypto = snp_dev->crypto;
- struct snp_guest_msg_hdr *hdr = &msg->hdr;
-
- memset(crypto->iv, 0, crypto->iv_len);
- memcpy(crypto->iv, &hdr->msg_seqno, sizeof(hdr->msg_seqno));
-
- return enc_dec_message(crypto, msg, plaintext, msg->payload, len, true);
-}
-
-static int dec_payload(struct snp_guest_dev *snp_dev, struct snp_guest_msg *msg,
- void *plaintext, size_t len)
-{
- struct snp_guest_crypto *crypto = snp_dev->crypto;
- struct snp_guest_msg_hdr *hdr = &msg->hdr;
-
- /* Build IV with response buffer sequence number */
- memset(crypto->iv, 0, crypto->iv_len);
- memcpy(crypto->iv, &hdr->msg_seqno, sizeof(hdr->msg_seqno));
-
- return enc_dec_message(crypto, msg, msg->payload, plaintext, len, false);
+ return ctx;
}
static int verify_and_dec_payload(struct snp_guest_dev *snp_dev, void *payload, u32 sz)
{
- struct snp_guest_crypto *crypto = snp_dev->crypto;
struct snp_guest_msg *resp_msg = &snp_dev->secret_response;
struct snp_guest_msg *req_msg = &snp_dev->secret_request;
struct snp_guest_msg_hdr *req_msg_hdr = &req_msg->hdr;
struct snp_guest_msg_hdr *resp_msg_hdr = &resp_msg->hdr;
+ struct aesgcm_ctx *ctx = snp_dev->ctx;
+ u8 iv[GCM_AES_IV_SIZE] = {};
pr_debug("response [seqno %lld type %d version %d sz %d]\n",
resp_msg_hdr->msg_seqno, resp_msg_hdr->msg_type, resp_msg_hdr->msg_version,
@@ -316,11 +206,16 @@ static int verify_and_dec_payload(struct snp_guest_dev *snp_dev, void *payload,
* If the message size is greater than our buffer length then return
* an error.
*/
- if (unlikely((resp_msg_hdr->msg_sz + crypto->a_len) > sz))
+ if (unlikely((resp_msg_hdr->msg_sz + ctx->authsize) > sz))
return -EBADMSG;
/* Decrypt the payload */
- return dec_payload(snp_dev, resp_msg, payload, resp_msg_hdr->msg_sz + crypto->a_len);
+ memcpy(iv, &resp_msg_hdr->msg_seqno, min(sizeof(iv), sizeof(resp_msg_hdr->msg_seqno)));
+ if (!aesgcm_decrypt(ctx, payload, resp_msg->payload, resp_msg_hdr->msg_sz,
+ &resp_msg_hdr->algo, AAD_LEN, iv, resp_msg_hdr->authtag))
+ return -EBADMSG;
+
+ return 0;
}
static int enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, int version, u8 type,
@@ -328,6 +223,8 @@ static int enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, int version, u8
{
struct snp_guest_msg *msg = &snp_dev->secret_request;
struct snp_guest_msg_hdr *hdr = &msg->hdr;
+ struct aesgcm_ctx *ctx = snp_dev->ctx;
+ u8 iv[GCM_AES_IV_SIZE] = {};
memset(msg, 0, sizeof(*msg));
@@ -347,7 +244,14 @@ static int enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, int version, u8
pr_debug("request [seqno %lld type %d version %d sz %d]\n",
hdr->msg_seqno, hdr->msg_type, hdr->msg_version, hdr->msg_sz);
- return __enc_payload(snp_dev, msg, payload, sz);
+ if (WARN_ON((sz + ctx->authsize) > sizeof(msg->payload)))
+ return -EBADMSG;
+
+ memcpy(iv, &hdr->msg_seqno, min(sizeof(iv), sizeof(hdr->msg_seqno)));
+ aesgcm_encrypt(ctx, msg->payload, payload, sz, &hdr->algo, AAD_LEN,
+ iv, hdr->authtag);
+
+ return 0;
}
static int __handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
@@ -495,7 +399,6 @@ struct snp_req_resp {
static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
{
- struct snp_guest_crypto *crypto = snp_dev->crypto;
struct snp_report_req *report_req = &snp_dev->req.report;
struct snp_report_resp *report_resp;
int rc, resp_len;
@@ -513,7 +416,7 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_io
* response payload. Make sure that it has enough space to cover the
* authtag.
*/
- resp_len = sizeof(report_resp->data) + crypto->a_len;
+ resp_len = sizeof(report_resp->data) + snp_dev->ctx->authsize;
report_resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
if (!report_resp)
return -ENOMEM;
@@ -534,7 +437,6 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_io
static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
{
struct snp_derived_key_req *derived_key_req = &snp_dev->req.derived_key;
- struct snp_guest_crypto *crypto = snp_dev->crypto;
struct snp_derived_key_resp derived_key_resp = {0};
int rc, resp_len;
/* Response data is 64 bytes and max authsize for GCM is 16 bytes. */
@@ -550,7 +452,7 @@ static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_reque
* response payload. Make sure that it has enough space to cover the
* authtag.
*/
- resp_len = sizeof(derived_key_resp.data) + crypto->a_len;
+ resp_len = sizeof(derived_key_resp.data) + snp_dev->ctx->authsize;
if (sizeof(buf) < resp_len)
return -ENOMEM;
@@ -579,7 +481,6 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
{
struct snp_ext_report_req *report_req = &snp_dev->req.ext_report;
- struct snp_guest_crypto *crypto = snp_dev->crypto;
struct snp_report_resp *report_resp;
int ret, npages = 0, resp_len;
sockptr_t certs_address;
@@ -622,7 +523,7 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
* response payload. Make sure that it has enough space to cover the
* authtag.
*/
- resp_len = sizeof(report_resp->data) + crypto->a_len;
+ resp_len = sizeof(report_resp->data) + snp_dev->ctx->authsize;
report_resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
if (!report_resp)
return -ENOMEM;
@@ -1148,8 +1049,8 @@ static int __init sev_guest_probe(struct platform_device *pdev)
goto e_free_response;
ret = -EIO;
- snp_dev->crypto = init_crypto(snp_dev, snp_dev->vmpck, VMPCK_KEY_LEN);
- if (!snp_dev->crypto)
+ snp_dev->ctx = snp_init_crypto(snp_dev->vmpck, VMPCK_KEY_LEN);
+ if (!snp_dev->ctx)
goto e_free_cert_data;
misc = &snp_dev->misc;
@@ -1175,11 +1076,13 @@ static int __init sev_guest_probe(struct platform_device *pdev)
ret = misc_register(misc);
if (ret)
- goto e_free_cert_data;
+ goto e_free_ctx;
dev_info(dev, "Initialized SEV guest driver (using VMPCK%d communication key)\n", vmpck_id);
return 0;
+e_free_ctx:
+ kfree(snp_dev->ctx);
e_free_cert_data:
free_shared_pages(snp_dev->certs_data, SEV_FW_BLOB_MAX_SIZE);
e_free_response:
@@ -1198,7 +1101,7 @@ static void __exit sev_guest_remove(struct platform_device *pdev)
free_shared_pages(snp_dev->certs_data, SEV_FW_BLOB_MAX_SIZE);
free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
- deinit_crypto(snp_dev->crypto);
+ kfree(snp_dev->ctx);
misc_deregister(&snp_dev->misc);
}
diff --git a/drivers/virt/coco/sev-guest/Kconfig b/drivers/virt/coco/sev-guest/Kconfig
index 1cffc72c41cb..0b772bd921d8 100644
--- a/drivers/virt/coco/sev-guest/Kconfig
+++ b/drivers/virt/coco/sev-guest/Kconfig
@@ -2,9 +2,7 @@ config SEV_GUEST
tristate "AMD SEV Guest driver"
default m
depends on AMD_MEM_ENCRYPT
- select CRYPTO
- select CRYPTO_AEAD2
- select CRYPTO_GCM
+ select CRYPTO_LIB_AESGCM
select TSM_REPORTS
help
SEV-SNP firmware provides the guest a mechanism to communicate with
--
2.34.1
^ permalink raw reply related [flat|nested] 66+ messages in thread
* [PATCH v11 06/20] x86/sev: Handle failures from snp_init()
2024-07-31 15:07 [PATCH v11 00/20] Add Secure TSC support for SNP guests Nikunj A Dadhania
` (4 preceding siblings ...)
2024-07-31 15:07 ` [PATCH v11 05/20] virt: sev-guest: Use AES GCM crypto library Nikunj A Dadhania
@ 2024-07-31 15:07 ` Nikunj A Dadhania
2024-08-27 11:32 ` Borislav Petkov
2024-07-31 15:07 ` [PATCH v11 07/20] x86/sev: Cache the secrets page address Nikunj A Dadhania
` (14 subsequent siblings)
20 siblings, 1 reply; 66+ messages in thread
From: Nikunj A Dadhania @ 2024-07-31 15:07 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj
sme_enable() currently ignores failures from snp_init(). Add error handling
for the cases where snp_init() fails to retrieve the SEV-SNP CC blob or to
parse it, so that SNP guests error out early instead of deferring the error
report or running into undefined behavior.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
arch/x86/mm/mem_encrypt_identity.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index ac33b2263a43..e83b363c5e68 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -535,6 +535,13 @@ void __head sme_enable(struct boot_params *bp)
if (snp && !(msr & MSR_AMD64_SEV_SNP_ENABLED))
snp_abort();
+ /*
+ * The SEV-SNP CC blob should be present and parsing CC blob should
+ * succeed when SEV-SNP is enabled.
+ */
+ if (!snp && (msr & MSR_AMD64_SEV_SNP_ENABLED))
+ snp_abort();
+
/* Check if memory encryption is enabled */
if (feature_mask == AMD_SME_BIT) {
if (!(bp->hdr.xloadflags & XLF_MEM_ENCRYPTION))
--
2.34.1
* [PATCH v11 07/20] x86/sev: Cache the secrets page address
2024-07-31 15:07 [PATCH v11 00/20] Add Secure TSC support for SNP guests Nikunj A Dadhania
` (5 preceding siblings ...)
2024-07-31 15:07 ` [PATCH v11 06/20] x86/sev: Handle failures from snp_init() Nikunj A Dadhania
@ 2024-07-31 15:07 ` Nikunj A Dadhania
2024-07-31 15:07 ` [PATCH v11 08/20] virt: sev-guest: Consolidate SNP guest messaging parameters to a struct Nikunj A Dadhania
` (13 subsequent siblings)
20 siblings, 0 replies; 66+ messages in thread
From: Nikunj A Dadhania @ 2024-07-31 15:07 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj
Instead of calling get_secrets_page(), which parses the CC blob on every
call to obtain the secrets page physical address, save that address
(secrets_pa) from the CC blob once during snp_init(). With no remaining
users, remove get_secrets_page().
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
---
arch/x86/coco/sev/core.c | 51 +++++++++-------------------------------
1 file changed, 11 insertions(+), 40 deletions(-)
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index 082d61d85dfc..a0b64cfd4b8e 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -92,6 +92,9 @@ static struct ghcb *boot_ghcb __section(".data");
/* Bitmap of SEV features supported by the hypervisor */
static u64 sev_hv_features __ro_after_init;
+/* Secrets page physical address from the CC blob */
+static u64 secrets_pa __ro_after_init;
+
/* #VC handler runtime per-CPU data */
struct sev_es_runtime_data {
struct ghcb ghcb_page;
@@ -722,45 +725,13 @@ void noinstr __sev_es_nmi_complete(void)
__sev_put_ghcb(&state);
}
-static u64 __init get_secrets_page(void)
-{
- u64 pa_data = boot_params.cc_blob_address;
- struct cc_blob_sev_info info;
- void *map;
-
- /*
- * The CC blob contains the address of the secrets page, check if the
- * blob is present.
- */
- if (!pa_data)
- return 0;
-
- map = early_memremap(pa_data, sizeof(info));
- if (!map) {
- pr_err("Unable to locate SNP secrets page: failed to map the Confidential Computing blob.\n");
- return 0;
- }
- memcpy(&info, map, sizeof(info));
- early_memunmap(map, sizeof(info));
-
- /* smoke-test the secrets page passed */
- if (!info.secrets_phys || info.secrets_len != PAGE_SIZE)
- return 0;
-
- return info.secrets_phys;
-}
-
static u64 __init get_snp_jump_table_addr(void)
{
struct snp_secrets_page *secrets;
void __iomem *mem;
- u64 pa, addr;
-
- pa = get_secrets_page();
- if (!pa)
- return 0;
+ u64 addr;
- mem = ioremap_encrypted(pa, PAGE_SIZE);
+ mem = ioremap_encrypted(secrets_pa, PAGE_SIZE);
if (!mem) {
pr_err("Unable to locate AP jump table address: failed to map the SNP secrets page.\n");
return 0;
@@ -2300,6 +2271,11 @@ bool __head snp_init(struct boot_params *bp)
if (!cc_info)
return false;
+ if (cc_info->secrets_phys && cc_info->secrets_len == PAGE_SIZE)
+ secrets_pa = cc_info->secrets_phys;
+ else
+ return false;
+
setup_cpuid_table(cc_info);
svsm_setup(cc_info);
@@ -2513,16 +2489,11 @@ static struct platform_device sev_guest_device = {
static int __init snp_init_platform_device(void)
{
struct sev_guest_platform_data data;
- u64 gpa;
if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
return -ENODEV;
- gpa = get_secrets_page();
- if (!gpa)
- return -ENODEV;
-
- data.secrets_gpa = gpa;
+ data.secrets_gpa = secrets_pa;
if (platform_device_add_data(&sev_guest_device, &data, sizeof(data)))
return -ENODEV;
--
2.34.1
* [PATCH v11 08/20] virt: sev-guest: Consolidate SNP guest messaging parameters to a struct
2024-07-31 15:07 [PATCH v11 00/20] Add Secure TSC support for SNP guests Nikunj A Dadhania
` (6 preceding siblings ...)
2024-07-31 15:07 ` [PATCH v11 07/20] x86/sev: Cache the secrets page address Nikunj A Dadhania
@ 2024-07-31 15:07 ` Nikunj A Dadhania
2024-09-04 14:31 ` Borislav Petkov
2024-07-31 15:08 ` [PATCH v11 09/20] virt: sev-guest: Reduce the scope of SNP command mutex Nikunj A Dadhania
` (12 subsequent siblings)
20 siblings, 1 reply; 66+ messages in thread
From: Nikunj A Dadhania @ 2024-07-31 15:07 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj
Add a snp_guest_req structure to eliminate the need to pass a long list of
parameters. This structure will be used when calling the SNP guest message
request API, simplifying the function arguments.
Update the snp_issue_guest_request() prototype to take the new guest
request structure.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
---
arch/x86/include/asm/sev.h | 19 +++++++-
arch/x86/coco/sev/core.c | 9 ++--
drivers/virt/coco/sev-guest/sev-guest.c | 62 ++++++++++++++++---------
3 files changed, 61 insertions(+), 29 deletions(-)
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index e7977f76d77e..27fa1c9c3465 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -174,6 +174,19 @@ struct sev_guest_platform_data {
u64 secrets_gpa;
};
+struct snp_guest_req {
+ void *req_buf;
+ size_t req_sz;
+
+ void *resp_buf;
+ size_t resp_sz;
+
+ u64 exit_code;
+ unsigned int vmpck_id;
+ u8 msg_version;
+ u8 msg_type;
+};
+
/*
* The secrets page contains 96-bytes of reserved field that can be used by
* the guest OS. The guest OS uses the area to save the message sequence
@@ -395,7 +408,8 @@ void snp_set_wakeup_secondary_cpu(void);
bool snp_init(struct boot_params *bp);
void __noreturn snp_abort(void);
void snp_dmi_setup(void);
-int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, struct snp_guest_request_ioctl *rio);
+int snp_issue_guest_request(struct snp_guest_req *req, struct snp_req_data *input,
+ struct snp_guest_request_ioctl *rio);
int snp_issue_svsm_attest_req(u64 call_id, struct svsm_call *call, struct svsm_attest_call *input);
void snp_accept_memory(phys_addr_t start, phys_addr_t end);
u64 snp_get_unsupported_features(u64 status);
@@ -425,7 +439,8 @@ static inline void snp_set_wakeup_secondary_cpu(void) { }
static inline bool snp_init(struct boot_params *bp) { return false; }
static inline void snp_abort(void) { }
static inline void snp_dmi_setup(void) { }
-static inline int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, struct snp_guest_request_ioctl *rio)
+static inline int snp_issue_guest_request(struct snp_guest_req *req, struct snp_req_data *input,
+ struct snp_guest_request_ioctl *rio)
{
return -ENOTTY;
}
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index a0b64cfd4b8e..e6a3f3df4637 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -2417,7 +2417,8 @@ int snp_issue_svsm_attest_req(u64 call_id, struct svsm_call *call,
}
EXPORT_SYMBOL_GPL(snp_issue_svsm_attest_req);
-int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, struct snp_guest_request_ioctl *rio)
+int snp_issue_guest_request(struct snp_guest_req *req, struct snp_req_data *input,
+ struct snp_guest_request_ioctl *rio)
{
struct ghcb_state state;
struct es_em_ctxt ctxt;
@@ -2441,12 +2442,12 @@ int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, struct sn
vc_ghcb_invalidate(ghcb);
- if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST) {
+ if (req->exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST) {
ghcb_set_rax(ghcb, input->data_gpa);
ghcb_set_rbx(ghcb, input->data_npages);
}
- ret = sev_es_ghcb_hv_call(ghcb, &ctxt, exit_code, input->req_gpa, input->resp_gpa);
+ ret = sev_es_ghcb_hv_call(ghcb, &ctxt, req->exit_code, input->req_gpa, input->resp_gpa);
if (ret)
goto e_put;
@@ -2461,7 +2462,7 @@ int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, struct sn
case SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN):
/* Number of expected pages are returned in RBX */
- if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST) {
+ if (req->exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST) {
input->data_npages = ghcb_get_rbx(ghcb);
ret = -ENOSPC;
break;
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index 39d90dd0b012..92734a2345a6 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -177,7 +177,7 @@ static struct aesgcm_ctx *snp_init_crypto(u8 *key, size_t keylen)
return ctx;
}
-static int verify_and_dec_payload(struct snp_guest_dev *snp_dev, void *payload, u32 sz)
+static int verify_and_dec_payload(struct snp_guest_dev *snp_dev, struct snp_guest_req *req)
{
struct snp_guest_msg *resp_msg = &snp_dev->secret_response;
struct snp_guest_msg *req_msg = &snp_dev->secret_request;
@@ -206,20 +206,19 @@ static int verify_and_dec_payload(struct snp_guest_dev *snp_dev, void *payload,
* If the message size is greater than our buffer length then return
* an error.
*/
- if (unlikely((resp_msg_hdr->msg_sz + ctx->authsize) > sz))
+ if (unlikely((resp_msg_hdr->msg_sz + ctx->authsize) > req->resp_sz))
return -EBADMSG;
/* Decrypt the payload */
memcpy(iv, &resp_msg_hdr->msg_seqno, min(sizeof(iv), sizeof(resp_msg_hdr->msg_seqno)));
- if (!aesgcm_decrypt(ctx, payload, resp_msg->payload, resp_msg_hdr->msg_sz,
+ if (!aesgcm_decrypt(ctx, req->resp_buf, resp_msg->payload, resp_msg_hdr->msg_sz,
&resp_msg_hdr->algo, AAD_LEN, iv, resp_msg_hdr->authtag))
return -EBADMSG;
return 0;
}
-static int enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, int version, u8 type,
- void *payload, size_t sz)
+static int enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, struct snp_guest_req *req)
{
struct snp_guest_msg *msg = &snp_dev->secret_request;
struct snp_guest_msg_hdr *hdr = &msg->hdr;
@@ -231,11 +230,11 @@ static int enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, int version, u8
hdr->algo = SNP_AEAD_AES_256_GCM;
hdr->hdr_version = MSG_HDR_VER;
hdr->hdr_sz = sizeof(*hdr);
- hdr->msg_type = type;
- hdr->msg_version = version;
+ hdr->msg_type = req->msg_type;
+ hdr->msg_version = req->msg_version;
hdr->msg_seqno = seqno;
- hdr->msg_vmpck = vmpck_id;
- hdr->msg_sz = sz;
+ hdr->msg_vmpck = req->vmpck_id;
+ hdr->msg_sz = req->req_sz;
/* Verify the sequence number is non-zero */
if (!hdr->msg_seqno)
@@ -244,17 +243,17 @@ static int enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, int version, u8
pr_debug("request [seqno %lld type %d version %d sz %d]\n",
hdr->msg_seqno, hdr->msg_type, hdr->msg_version, hdr->msg_sz);
- if (WARN_ON((sz + ctx->authsize) > sizeof(msg->payload)))
+ if (WARN_ON((req->req_sz + ctx->authsize) > sizeof(msg->payload)))
return -EBADMSG;
memcpy(iv, &hdr->msg_seqno, min(sizeof(iv), sizeof(hdr->msg_seqno)));
- aesgcm_encrypt(ctx, msg->payload, payload, sz, &hdr->algo, AAD_LEN,
- iv, hdr->authtag);
+ aesgcm_encrypt(ctx, msg->payload, req->req_buf, req->req_sz, &hdr->algo,
+ AAD_LEN, iv, hdr->authtag);
return 0;
}
-static int __handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
+static int __handle_guest_request(struct snp_guest_dev *snp_dev, struct snp_guest_req *req,
struct snp_guest_request_ioctl *rio)
{
unsigned long req_start = jiffies;
@@ -269,7 +268,7 @@ static int __handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
* sequence number must be incremented or the VMPCK must be deleted to
* prevent reuse of the IV.
*/
- rc = snp_issue_guest_request(exit_code, &snp_dev->input, rio);
+ rc = snp_issue_guest_request(req, &snp_dev->input, rio);
switch (rc) {
case -ENOSPC:
/*
@@ -280,7 +279,7 @@ static int __handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
* IV reuse.
*/
override_npages = snp_dev->input.data_npages;
- exit_code = SVM_VMGEXIT_GUEST_REQUEST;
+ req->exit_code = SVM_VMGEXIT_GUEST_REQUEST;
/*
* Override the error to inform callers the given extended
@@ -340,10 +339,8 @@ static int __handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
return rc;
}
-static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
- struct snp_guest_request_ioctl *rio, u8 type,
- void *req_buf, size_t req_sz, void *resp_buf,
- u32 resp_sz)
+static int snp_send_guest_request(struct snp_guest_dev *snp_dev, struct snp_guest_req *req,
+ struct snp_guest_request_ioctl *rio)
{
u64 seqno;
int rc;
@@ -357,7 +354,7 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
memset(snp_dev->response, 0, sizeof(struct snp_guest_msg));
/* Encrypt the userspace provided payload in snp_dev->secret_request. */
- rc = enc_payload(snp_dev, seqno, rio->msg_version, type, req_buf, req_sz);
+ rc = enc_payload(snp_dev, seqno, req);
if (rc)
return rc;
@@ -368,7 +365,7 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
memcpy(snp_dev->request, &snp_dev->secret_request,
sizeof(snp_dev->secret_request));
- rc = __handle_guest_request(snp_dev, exit_code, rio);
+ rc = __handle_guest_request(snp_dev, req, rio);
if (rc) {
if (rc == -EIO &&
rio->exitinfo2 == SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN))
@@ -382,7 +379,7 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
return rc;
}
- rc = verify_and_dec_payload(snp_dev, resp_buf, resp_sz);
+ rc = verify_and_dec_payload(snp_dev, req);
if (rc) {
dev_alert(snp_dev->dev, "Detected unexpected decode failure from ASP. rc: %d\n", rc);
snp_disable_vmpck(snp_dev);
@@ -392,6 +389,25 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
return 0;
}
+static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
+ struct snp_guest_request_ioctl *rio, u8 type,
+ void *req_buf, size_t req_sz, void *resp_buf,
+ u32 resp_sz)
+{
+ struct snp_guest_req req = {
+ .msg_version = rio->msg_version,
+ .msg_type = type,
+ .vmpck_id = vmpck_id,
+ .req_buf = req_buf,
+ .req_sz = req_sz,
+ .resp_buf = resp_buf,
+ .resp_sz = resp_sz,
+ .exit_code = exit_code,
+ };
+
+ return snp_send_guest_request(snp_dev, &req, rio);
+}
+
struct snp_req_resp {
sockptr_t req_data;
sockptr_t resp_data;
@@ -1058,7 +1074,7 @@ static int __init sev_guest_probe(struct platform_device *pdev)
misc->name = DEVICE_NAME;
misc->fops = &snp_guest_fops;
- /* initial the input address for guest request */
+ /* Initialize the input addresses for guest request */
snp_dev->input.req_gpa = __pa(snp_dev->request);
snp_dev->input.resp_gpa = __pa(snp_dev->response);
snp_dev->input.data_gpa = __pa(snp_dev->certs_data);
--
2.34.1
* [PATCH v11 09/20] virt: sev-guest: Reduce the scope of SNP command mutex
2024-07-31 15:07 [PATCH v11 00/20] Add Secure TSC support for SNP guests Nikunj A Dadhania
` (7 preceding siblings ...)
2024-07-31 15:07 ` [PATCH v11 08/20] virt: sev-guest: Consolidate SNP guest messaging parameters to a struct Nikunj A Dadhania
@ 2024-07-31 15:08 ` Nikunj A Dadhania
2024-09-12 21:54 ` Tom Lendacky
2024-07-31 15:08 ` [PATCH v11 10/20] virt: sev-guest: Carve out SNP message context structure Nikunj A Dadhania
` (11 subsequent siblings)
20 siblings, 1 reply; 66+ messages in thread
From: Nikunj A Dadhania @ 2024-07-31 15:08 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj
The SNP command mutex serializes access to the shared buffer, command
handling, and the message sequence number.
All of these updates happen within snp_send_guest_request(), so take the
mutex in that function instead; the critical section is preserved.
Since the mutex is now acquired later in the call chain, drop the
lockdep_assert_held() checks that would otherwise run before the mutex is
taken.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
drivers/virt/coco/sev-guest/sev-guest.c | 17 ++---------------
1 file changed, 2 insertions(+), 15 deletions(-)
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index 92734a2345a6..42f7126f1718 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -345,6 +345,8 @@ static int snp_send_guest_request(struct snp_guest_dev *snp_dev, struct snp_gues
u64 seqno;
int rc;
+ guard(mutex)(&snp_cmd_mutex);
+
/* Get message sequence and verify that its a non-zero */
seqno = snp_get_msg_seqno(snp_dev);
if (!seqno)
@@ -419,8 +421,6 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_io
struct snp_report_resp *report_resp;
int rc, resp_len;
- lockdep_assert_held(&snp_cmd_mutex);
-
if (!arg->req_data || !arg->resp_data)
return -EINVAL;
@@ -458,8 +458,6 @@ static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_reque
/* Response data is 64 bytes and max authsize for GCM is 16 bytes. */
u8 buf[64 + 16];
- lockdep_assert_held(&snp_cmd_mutex);
-
if (!arg->req_data || !arg->resp_data)
return -EINVAL;
@@ -501,8 +499,6 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
int ret, npages = 0, resp_len;
sockptr_t certs_address;
- lockdep_assert_held(&snp_cmd_mutex);
-
if (sockptr_is_null(io->req_data) || sockptr_is_null(io->resp_data))
return -EINVAL;
@@ -590,12 +586,9 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
if (!input.msg_version)
return -EINVAL;
- mutex_lock(&snp_cmd_mutex);
-
/* Check if the VMPCK is not empty */
if (is_vmpck_empty(snp_dev)) {
dev_err_ratelimited(snp_dev->dev, "VMPCK is disabled\n");
- mutex_unlock(&snp_cmd_mutex);
return -ENOTTY;
}
@@ -620,8 +613,6 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
break;
}
- mutex_unlock(&snp_cmd_mutex);
-
if (input.exitinfo2 && copy_to_user(argp, &input, sizeof(input)))
return -EFAULT;
@@ -736,8 +727,6 @@ static int sev_svsm_report_new(struct tsm_report *report, void *data)
man_len = SZ_4K;
certs_len = SEV_FW_BLOB_MAX_SIZE;
- guard(mutex)(&snp_cmd_mutex);
-
if (guid_is_null(&desc->service_guid)) {
call_id = SVSM_ATTEST_CALL(SVSM_ATTEST_SERVICES);
} else {
@@ -872,8 +861,6 @@ static int sev_report_new(struct tsm_report *report, void *data)
if (!buf)
return -ENOMEM;
- guard(mutex)(&snp_cmd_mutex);
-
/* Check if the VMPCK is not empty */
if (is_vmpck_empty(snp_dev)) {
dev_err_ratelimited(snp_dev->dev, "VMPCK is disabled\n");
--
2.34.1
* [PATCH v11 10/20] virt: sev-guest: Carve out SNP message context structure
2024-07-31 15:07 [PATCH v11 00/20] Add Secure TSC support for SNP guests Nikunj A Dadhania
` (8 preceding siblings ...)
2024-07-31 15:08 ` [PATCH v11 09/20] virt: sev-guest: Reduce the scope of SNP command mutex Nikunj A Dadhania
@ 2024-07-31 15:08 ` Nikunj A Dadhania
2024-09-13 15:52 ` Tom Lendacky
2024-07-31 15:08 ` [PATCH v11 11/20] x86/sev: Carve out and export SNP guest messaging init routines Nikunj A Dadhania
` (10 subsequent siblings)
20 siblings, 1 reply; 66+ messages in thread
From: Nikunj A Dadhania @ 2024-07-31 15:08 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj
Currently, the sev-guest driver is the only user of SNP guest messaging.
The snp_guest_dev structure holds all the allocated buffers, the secrets
page, and the VMPCK details. In preparation for adding messaging allocation
and initialization APIs, decouple snp_guest_dev from messaging-related
information by carving out a guest message context structure
(snp_msg_desc).
Incorporate this newly added context into snp_send_guest_request() and all
related functions, replacing the use of snp_guest_dev.
No functional change.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
arch/x86/include/asm/sev.h | 21 +++
drivers/virt/coco/sev-guest/sev-guest.c | 183 ++++++++++++------------
2 files changed, 111 insertions(+), 93 deletions(-)
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 27fa1c9c3465..2e49c4a9e7fe 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -234,6 +234,27 @@ struct snp_secrets_page {
u8 rsvd4[3744];
} __packed;
+struct snp_msg_desc {
+ /* request and response are in unencrypted memory */
+ struct snp_guest_msg *request, *response;
+
+ /*
+ * Avoid information leakage by double-buffering shared messages
+ * in fields that are in regular encrypted memory.
+ */
+ struct snp_guest_msg secret_request, secret_response;
+
+ struct snp_secrets_page *secrets;
+ struct snp_req_data input;
+
+ void *certs_data;
+
+ struct aesgcm_ctx *ctx;
+
+ u32 *os_area_msg_seqno;
+ u8 *vmpck;
+};
+
/*
* The SVSM Calling Area (CA) related structures.
*/
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index 42f7126f1718..38ddabcd7ba3 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -40,26 +40,13 @@ struct snp_guest_dev {
struct device *dev;
struct miscdevice misc;
- void *certs_data;
- struct aesgcm_ctx *ctx;
- /* request and response are in unencrypted memory */
- struct snp_guest_msg *request, *response;
-
- /*
- * Avoid information leakage by double-buffering shared messages
- * in fields that are in regular encrypted memory.
- */
- struct snp_guest_msg secret_request, secret_response;
+ struct snp_msg_desc *msg_desc;
- struct snp_secrets_page *secrets;
- struct snp_req_data input;
union {
struct snp_report_req report;
struct snp_derived_key_req derived_key;
struct snp_ext_report_req ext_report;
} req;
- u32 *os_area_msg_seqno;
- u8 *vmpck;
};
/*
@@ -76,12 +63,12 @@ MODULE_PARM_DESC(vmpck_id, "The VMPCK ID to use when communicating with the PSP.
/* Mutex to serialize the shared buffer access and command handling. */
static DEFINE_MUTEX(snp_cmd_mutex);
-static bool is_vmpck_empty(struct snp_guest_dev *snp_dev)
+static bool is_vmpck_empty(struct snp_msg_desc *mdesc)
{
char zero_key[VMPCK_KEY_LEN] = {0};
- if (snp_dev->vmpck)
- return !memcmp(snp_dev->vmpck, zero_key, VMPCK_KEY_LEN);
+ if (mdesc->vmpck)
+ return !memcmp(mdesc->vmpck, zero_key, VMPCK_KEY_LEN);
return true;
}
@@ -103,30 +90,30 @@ static bool is_vmpck_empty(struct snp_guest_dev *snp_dev)
* vulnerable. If the sequence number were incremented for a fresh IV the ASP
* will reject the request.
*/
-static void snp_disable_vmpck(struct snp_guest_dev *snp_dev)
+static void snp_disable_vmpck(struct snp_msg_desc *mdesc)
{
- dev_alert(snp_dev->dev, "Disabling VMPCK%d communication key to prevent IV reuse.\n",
+ pr_alert("Disabling VMPCK%d communication key to prevent IV reuse.\n",
vmpck_id);
- memzero_explicit(snp_dev->vmpck, VMPCK_KEY_LEN);
- snp_dev->vmpck = NULL;
+ memzero_explicit(mdesc->vmpck, VMPCK_KEY_LEN);
+ mdesc->vmpck = NULL;
}
-static inline u64 __snp_get_msg_seqno(struct snp_guest_dev *snp_dev)
+static inline u64 __snp_get_msg_seqno(struct snp_msg_desc *mdesc)
{
u64 count;
lockdep_assert_held(&snp_cmd_mutex);
/* Read the current message sequence counter from secrets pages */
- count = *snp_dev->os_area_msg_seqno;
+ count = *mdesc->os_area_msg_seqno;
return count + 1;
}
/* Return a non-zero on success */
-static u64 snp_get_msg_seqno(struct snp_guest_dev *snp_dev)
+static u64 snp_get_msg_seqno(struct snp_msg_desc *mdesc)
{
- u64 count = __snp_get_msg_seqno(snp_dev);
+ u64 count = __snp_get_msg_seqno(mdesc);
/*
* The message sequence counter for the SNP guest request is a 64-bit
@@ -137,20 +124,20 @@ static u64 snp_get_msg_seqno(struct snp_guest_dev *snp_dev)
* invalid number and will fail the message request.
*/
if (count >= UINT_MAX) {
- dev_err(snp_dev->dev, "request message sequence counter overflow\n");
+ pr_err("request message sequence counter overflow\n");
return 0;
}
return count;
}
-static void snp_inc_msg_seqno(struct snp_guest_dev *snp_dev)
+static void snp_inc_msg_seqno(struct snp_msg_desc *mdesc)
{
/*
* The counter is also incremented by the PSP, so increment it by 2
* and save in secrets page.
*/
- *snp_dev->os_area_msg_seqno += 2;
+ *mdesc->os_area_msg_seqno += 2;
}
static inline struct snp_guest_dev *to_snp_dev(struct file *file)
@@ -177,13 +164,13 @@ static struct aesgcm_ctx *snp_init_crypto(u8 *key, size_t keylen)
return ctx;
}
-static int verify_and_dec_payload(struct snp_guest_dev *snp_dev, struct snp_guest_req *req)
+static int verify_and_dec_payload(struct snp_msg_desc *mdesc, struct snp_guest_req *req)
{
- struct snp_guest_msg *resp_msg = &snp_dev->secret_response;
- struct snp_guest_msg *req_msg = &snp_dev->secret_request;
+ struct snp_guest_msg *resp_msg = &mdesc->secret_response;
+ struct snp_guest_msg *req_msg = &mdesc->secret_request;
struct snp_guest_msg_hdr *req_msg_hdr = &req_msg->hdr;
struct snp_guest_msg_hdr *resp_msg_hdr = &resp_msg->hdr;
- struct aesgcm_ctx *ctx = snp_dev->ctx;
+ struct aesgcm_ctx *ctx = mdesc->ctx;
u8 iv[GCM_AES_IV_SIZE] = {};
pr_debug("response [seqno %lld type %d version %d sz %d]\n",
@@ -191,7 +178,7 @@ static int verify_and_dec_payload(struct snp_guest_dev *snp_dev, struct snp_gues
resp_msg_hdr->msg_sz);
/* Copy response from shared memory to encrypted memory. */
- memcpy(resp_msg, snp_dev->response, sizeof(*resp_msg));
+ memcpy(resp_msg, mdesc->response, sizeof(*resp_msg));
/* Verify that the sequence counter is incremented by 1 */
if (unlikely(resp_msg_hdr->msg_seqno != (req_msg_hdr->msg_seqno + 1)))
@@ -218,11 +205,11 @@ static int verify_and_dec_payload(struct snp_guest_dev *snp_dev, struct snp_gues
return 0;
}
-static int enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, struct snp_guest_req *req)
+static int enc_payload(struct snp_msg_desc *mdesc, u64 seqno, struct snp_guest_req *req)
{
- struct snp_guest_msg *msg = &snp_dev->secret_request;
+ struct snp_guest_msg *msg = &mdesc->secret_request;
struct snp_guest_msg_hdr *hdr = &msg->hdr;
- struct aesgcm_ctx *ctx = snp_dev->ctx;
+ struct aesgcm_ctx *ctx = mdesc->ctx;
u8 iv[GCM_AES_IV_SIZE] = {};
memset(msg, 0, sizeof(*msg));
@@ -253,7 +240,7 @@ static int enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, struct snp_gues
return 0;
}
-static int __handle_guest_request(struct snp_guest_dev *snp_dev, struct snp_guest_req *req,
+static int __handle_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
struct snp_guest_request_ioctl *rio)
{
unsigned long req_start = jiffies;
@@ -268,7 +255,7 @@ static int __handle_guest_request(struct snp_guest_dev *snp_dev, struct snp_gues
* sequence number must be incremented or the VMPCK must be deleted to
* prevent reuse of the IV.
*/
- rc = snp_issue_guest_request(req, &snp_dev->input, rio);
+ rc = snp_issue_guest_request(req, &mdesc->input, rio);
switch (rc) {
case -ENOSPC:
/*
@@ -278,7 +265,7 @@ static int __handle_guest_request(struct snp_guest_dev *snp_dev, struct snp_gues
* order to increment the sequence number and thus avoid
* IV reuse.
*/
- override_npages = snp_dev->input.data_npages;
+ override_npages = mdesc->input.data_npages;
req->exit_code = SVM_VMGEXIT_GUEST_REQUEST;
/*
@@ -318,7 +305,7 @@ static int __handle_guest_request(struct snp_guest_dev *snp_dev, struct snp_gues
* structure and any failure will wipe the VMPCK, preventing further
* use anyway.
*/
- snp_inc_msg_seqno(snp_dev);
+ snp_inc_msg_seqno(mdesc);
if (override_err) {
rio->exitinfo2 = override_err;
@@ -334,12 +321,12 @@ static int __handle_guest_request(struct snp_guest_dev *snp_dev, struct snp_gues
}
if (override_npages)
- snp_dev->input.data_npages = override_npages;
+ mdesc->input.data_npages = override_npages;
return rc;
}
-static int snp_send_guest_request(struct snp_guest_dev *snp_dev, struct snp_guest_req *req,
+static int snp_send_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
struct snp_guest_request_ioctl *rio)
{
u64 seqno;
@@ -348,15 +335,15 @@ static int snp_send_guest_request(struct snp_guest_dev *snp_dev, struct snp_gues
guard(mutex)(&snp_cmd_mutex);
/* Get the message sequence counter and verify that it is non-zero */
- seqno = snp_get_msg_seqno(snp_dev);
+ seqno = snp_get_msg_seqno(mdesc);
if (!seqno)
return -EIO;
/* Clear shared memory's response for the host to populate. */
- memset(snp_dev->response, 0, sizeof(struct snp_guest_msg));
+ memset(mdesc->response, 0, sizeof(struct snp_guest_msg));
- /* Encrypt the userspace provided payload in snp_dev->secret_request. */
- rc = enc_payload(snp_dev, seqno, req);
+ /* Encrypt the userspace provided payload in mdesc->secret_request. */
+ rc = enc_payload(mdesc, seqno, req);
if (rc)
return rc;
@@ -364,34 +351,33 @@ static int snp_send_guest_request(struct snp_guest_dev *snp_dev, struct snp_gues
* Write the fully encrypted request to the shared unencrypted
* request page.
*/
- memcpy(snp_dev->request, &snp_dev->secret_request,
- sizeof(snp_dev->secret_request));
+ memcpy(mdesc->request, &mdesc->secret_request,
+ sizeof(mdesc->secret_request));
- rc = __handle_guest_request(snp_dev, req, rio);
+ rc = __handle_guest_request(mdesc, req, rio);
if (rc) {
if (rc == -EIO &&
rio->exitinfo2 == SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN))
return rc;
- dev_alert(snp_dev->dev,
- "Detected error from ASP request. rc: %d, exitinfo2: 0x%llx\n",
- rc, rio->exitinfo2);
+ pr_alert("Detected error from ASP request. rc: %d, exitinfo2: 0x%llx\n",
+ rc, rio->exitinfo2);
- snp_disable_vmpck(snp_dev);
+ snp_disable_vmpck(mdesc);
return rc;
}
- rc = verify_and_dec_payload(snp_dev, req);
+ rc = verify_and_dec_payload(mdesc, req);
if (rc) {
- dev_alert(snp_dev->dev, "Detected unexpected decode failure from ASP. rc: %d\n", rc);
- snp_disable_vmpck(snp_dev);
+ pr_alert("Detected unexpected decode failure from ASP. rc: %d\n", rc);
+ snp_disable_vmpck(mdesc);
return rc;
}
return 0;
}
-static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
+static int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code,
struct snp_guest_request_ioctl *rio, u8 type,
void *req_buf, size_t req_sz, void *resp_buf,
u32 resp_sz)
@@ -407,7 +393,7 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
.exit_code = exit_code,
};
- return snp_send_guest_request(snp_dev, &req, rio);
+ return snp_send_guest_request(mdesc, &req, rio);
}
struct snp_req_resp {
@@ -418,6 +404,7 @@ struct snp_req_resp {
static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
{
struct snp_report_req *report_req = &snp_dev->req.report;
+ struct snp_msg_desc *mdesc = snp_dev->msg_desc;
struct snp_report_resp *report_resp;
int rc, resp_len;
@@ -432,12 +419,12 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_io
* response payload. Make sure that it has enough space to cover the
* authtag.
*/
- resp_len = sizeof(report_resp->data) + snp_dev->ctx->authsize;
+ resp_len = sizeof(report_resp->data) + mdesc->ctx->authsize;
report_resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
if (!report_resp)
return -ENOMEM;
- rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg, SNP_MSG_REPORT_REQ,
+ rc = handle_guest_request(mdesc, SVM_VMGEXIT_GUEST_REQUEST, arg, SNP_MSG_REPORT_REQ,
report_req, sizeof(*report_req), report_resp->data, resp_len);
if (rc)
goto e_free;
@@ -454,6 +441,7 @@ static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_reque
{
struct snp_derived_key_req *derived_key_req = &snp_dev->req.derived_key;
struct snp_derived_key_resp derived_key_resp = {0};
+ struct snp_msg_desc *mdesc = snp_dev->msg_desc;
int rc, resp_len;
/* Response data is 64 bytes and max authsize for GCM is 16 bytes. */
u8 buf[64 + 16];
@@ -466,7 +454,7 @@ static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_reque
* response payload. Make sure that it has enough space to cover the
* authtag.
*/
- resp_len = sizeof(derived_key_resp.data) + snp_dev->ctx->authsize;
+ resp_len = sizeof(derived_key_resp.data) + mdesc->ctx->authsize;
if (sizeof(buf) < resp_len)
return -ENOMEM;
@@ -474,7 +462,7 @@ static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_reque
sizeof(*derived_key_req)))
return -EFAULT;
- rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg, SNP_MSG_KEY_REQ,
+ rc = handle_guest_request(mdesc, SVM_VMGEXIT_GUEST_REQUEST, arg, SNP_MSG_KEY_REQ,
derived_key_req, sizeof(*derived_key_req), buf, resp_len);
if (rc)
return rc;
@@ -495,6 +483,7 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
{
struct snp_ext_report_req *report_req = &snp_dev->req.ext_report;
+ struct snp_msg_desc *mdesc = snp_dev->msg_desc;
struct snp_report_resp *report_resp;
int ret, npages = 0, resp_len;
sockptr_t certs_address;
@@ -527,7 +516,7 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
* the host. If host does not supply any certs in it, then copy
* zeros to indicate that certificate data was not provided.
*/
- memset(snp_dev->certs_data, 0, report_req->certs_len);
+ memset(mdesc->certs_data, 0, report_req->certs_len);
npages = report_req->certs_len >> PAGE_SHIFT;
cmd:
/*
@@ -535,19 +524,19 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
* response payload. Make sure that it has enough space to cover the
* authtag.
*/
- resp_len = sizeof(report_resp->data) + snp_dev->ctx->authsize;
+ resp_len = sizeof(report_resp->data) + mdesc->ctx->authsize;
report_resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
if (!report_resp)
return -ENOMEM;
- snp_dev->input.data_npages = npages;
- ret = handle_guest_request(snp_dev, SVM_VMGEXIT_EXT_GUEST_REQUEST, arg, SNP_MSG_REPORT_REQ,
+ mdesc->input.data_npages = npages;
+ ret = handle_guest_request(mdesc, SVM_VMGEXIT_EXT_GUEST_REQUEST, arg, SNP_MSG_REPORT_REQ,
&report_req->data, sizeof(report_req->data),
report_resp->data, resp_len);
/* If certs length is invalid then copy the returned length */
if (arg->vmm_error == SNP_GUEST_VMM_ERR_INVALID_LEN) {
- report_req->certs_len = snp_dev->input.data_npages << PAGE_SHIFT;
+ report_req->certs_len = mdesc->input.data_npages << PAGE_SHIFT;
if (copy_to_sockptr(io->req_data, report_req, sizeof(*report_req)))
ret = -EFAULT;
@@ -556,7 +545,7 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
if (ret)
goto e_free;
- if (npages && copy_to_sockptr(certs_address, snp_dev->certs_data, report_req->certs_len)) {
+ if (npages && copy_to_sockptr(certs_address, mdesc->certs_data, report_req->certs_len)) {
ret = -EFAULT;
goto e_free;
}
@@ -572,6 +561,7 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
{
struct snp_guest_dev *snp_dev = to_snp_dev(file);
+ struct snp_msg_desc *mdesc = snp_dev->msg_desc;
void __user *argp = (void __user *)arg;
struct snp_guest_request_ioctl input;
struct snp_req_resp io;
@@ -587,7 +577,7 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
return -EINVAL;
/* Check if the VMPCK is not empty */
- if (is_vmpck_empty(snp_dev)) {
+ if (is_vmpck_empty(mdesc)) {
dev_err_ratelimited(snp_dev->dev, "VMPCK is disabled\n");
return -ENOTTY;
}
@@ -862,7 +852,7 @@ static int sev_report_new(struct tsm_report *report, void *data)
return -ENOMEM;
/* Check if the VMPCK is not empty */
- if (is_vmpck_empty(snp_dev)) {
+ if (is_vmpck_empty(snp_dev->msg_desc)) {
dev_err_ratelimited(snp_dev->dev, "VMPCK is disabled\n");
return -ENOTTY;
}
@@ -992,6 +982,7 @@ static int __init sev_guest_probe(struct platform_device *pdev)
struct snp_secrets_page *secrets;
struct device *dev = &pdev->dev;
struct snp_guest_dev *snp_dev;
+ struct snp_msg_desc *mdesc;
struct miscdevice *misc;
void __iomem *mapping;
int ret;
@@ -1014,46 +1005,50 @@ static int __init sev_guest_probe(struct platform_device *pdev)
if (!snp_dev)
goto e_unmap;
+ mdesc = devm_kzalloc(&pdev->dev, sizeof(struct snp_msg_desc), GFP_KERNEL);
+ if (!mdesc)
+ goto e_unmap;
+
/* Adjust the default VMPCK key based on the executing VMPL level */
if (vmpck_id == -1)
vmpck_id = snp_vmpl;
ret = -EINVAL;
- snp_dev->vmpck = get_vmpck(vmpck_id, secrets, &snp_dev->os_area_msg_seqno);
- if (!snp_dev->vmpck) {
+ mdesc->vmpck = get_vmpck(vmpck_id, secrets, &mdesc->os_area_msg_seqno);
+ if (!mdesc->vmpck) {
dev_err(dev, "Invalid VMPCK%d communication key\n", vmpck_id);
goto e_unmap;
}
/* Verify that VMPCK is not zero. */
- if (is_vmpck_empty(snp_dev)) {
+ if (is_vmpck_empty(mdesc)) {
dev_err(dev, "Empty VMPCK%d communication key\n", vmpck_id);
goto e_unmap;
}
platform_set_drvdata(pdev, snp_dev);
snp_dev->dev = dev;
- snp_dev->secrets = secrets;
+ mdesc->secrets = secrets;
/* Ensure SNP guest messages do not span more than a page */
BUILD_BUG_ON(sizeof(struct snp_guest_msg) > PAGE_SIZE);
/* Allocate the shared page used for the request and response message. */
- snp_dev->request = alloc_shared_pages(dev, sizeof(struct snp_guest_msg));
- if (!snp_dev->request)
+ mdesc->request = alloc_shared_pages(dev, sizeof(struct snp_guest_msg));
+ if (!mdesc->request)
goto e_unmap;
- snp_dev->response = alloc_shared_pages(dev, sizeof(struct snp_guest_msg));
- if (!snp_dev->response)
+ mdesc->response = alloc_shared_pages(dev, sizeof(struct snp_guest_msg));
+ if (!mdesc->response)
goto e_free_request;
- snp_dev->certs_data = alloc_shared_pages(dev, SEV_FW_BLOB_MAX_SIZE);
- if (!snp_dev->certs_data)
+ mdesc->certs_data = alloc_shared_pages(dev, SEV_FW_BLOB_MAX_SIZE);
+ if (!mdesc->certs_data)
goto e_free_response;
ret = -EIO;
- snp_dev->ctx = snp_init_crypto(snp_dev->vmpck, VMPCK_KEY_LEN);
- if (!snp_dev->ctx)
+ mdesc->ctx = snp_init_crypto(mdesc->vmpck, VMPCK_KEY_LEN);
+ if (!mdesc->ctx)
goto e_free_cert_data;
misc = &snp_dev->misc;
@@ -1062,9 +1057,9 @@ static int __init sev_guest_probe(struct platform_device *pdev)
misc->fops = &snp_guest_fops;
/* Initialize the input addresses for guest request */
- snp_dev->input.req_gpa = __pa(snp_dev->request);
- snp_dev->input.resp_gpa = __pa(snp_dev->response);
- snp_dev->input.data_gpa = __pa(snp_dev->certs_data);
+ mdesc->input.req_gpa = __pa(mdesc->request);
+ mdesc->input.resp_gpa = __pa(mdesc->response);
+ mdesc->input.data_gpa = __pa(mdesc->certs_data);
/* Set the privlevel_floor attribute based on the vmpck_id */
sev_tsm_ops.privlevel_floor = vmpck_id;
@@ -1081,17 +1076,18 @@ static int __init sev_guest_probe(struct platform_device *pdev)
if (ret)
goto e_free_ctx;
+ snp_dev->msg_desc = mdesc;
dev_info(dev, "Initialized SEV guest driver (using VMPCK%d communication key)\n", vmpck_id);
return 0;
e_free_ctx:
- kfree(snp_dev->ctx);
+ kfree(mdesc->ctx);
e_free_cert_data:
- free_shared_pages(snp_dev->certs_data, SEV_FW_BLOB_MAX_SIZE);
+ free_shared_pages(mdesc->certs_data, SEV_FW_BLOB_MAX_SIZE);
e_free_response:
- free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
+ free_shared_pages(mdesc->response, sizeof(struct snp_guest_msg));
e_free_request:
- free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
+ free_shared_pages(mdesc->request, sizeof(struct snp_guest_msg));
e_unmap:
iounmap(mapping);
return ret;
@@ -1100,11 +1096,12 @@ static int __init sev_guest_probe(struct platform_device *pdev)
static void __exit sev_guest_remove(struct platform_device *pdev)
{
struct snp_guest_dev *snp_dev = platform_get_drvdata(pdev);
+ struct snp_msg_desc *mdesc = snp_dev->msg_desc;
- free_shared_pages(snp_dev->certs_data, SEV_FW_BLOB_MAX_SIZE);
- free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
- free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
- kfree(snp_dev->ctx);
+ free_shared_pages(mdesc->certs_data, SEV_FW_BLOB_MAX_SIZE);
+ free_shared_pages(mdesc->response, sizeof(struct snp_guest_msg));
+ free_shared_pages(mdesc->request, sizeof(struct snp_guest_msg));
+ kfree(mdesc->ctx);
misc_deregister(&snp_dev->misc);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 66+ messages in thread
* [PATCH v11 11/20] x86/sev: Carve out and export SNP guest messaging init routines
2024-07-31 15:07 [PATCH v11 00/20] Add Secure TSC support for SNP guests Nikunj A Dadhania
` (9 preceding siblings ...)
2024-07-31 15:08 ` [PATCH v11 10/20] virt: sev-guest: Carve out SNP message context structure Nikunj A Dadhania
@ 2024-07-31 15:08 ` Nikunj A Dadhania
2024-09-13 15:53 ` Tom Lendacky
2024-07-31 15:08 ` [PATCH v11 12/20] x86/sev: Relocate SNP guest messaging routines to common code Nikunj A Dadhania
` (9 subsequent siblings)
20 siblings, 1 reply; 66+ messages in thread
From: Nikunj A Dadhania @ 2024-07-31 15:08 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj
Currently, the SEV guest driver is the only user of SNP guest messaging.
All routines for initializing SNP guest messaging are implemented within
the SEV guest driver. To add Secure TSC guest support, these initialization
routines need to be available during early boot.
Carve out common SNP guest messaging buffer allocations and message
initialization routines to core/sev.c and export them. These newly added
APIs set up the SNP message context (snp_msg_desc), which contains all the
necessary details for sending SNP guest messages.
At present, the SEV guest platform data structure is used to pass the
secrets page physical address to the SEV guest driver. Since the secrets
page address is locally available to the initialization routine, use the
cached address. Remove the now-unused SEV guest platform data structure.
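The key-selection and empty-key checks that snp_msg_init() performs can be
sketched as a small userspace C model. This is illustrative only: the
toy_* names and the toy_secrets layout are stand-ins, not the kernel's
struct snp_secrets_page or its APIs.

```c
#include <stdint.h>
#include <string.h>

#define VMPCK_KEY_LEN 32

/*
 * Toy stand-in for the secrets-page layout consulted by get_vmpck();
 * the field names are illustrative, not the real struct snp_secrets_page.
 */
struct toy_secrets {
	uint8_t vmpck[4][VMPCK_KEY_LEN];
	uint32_t msg_seqno[4];
};

/* Mirrors get_vmpck(): select a key slot and its per-VMPL sequence counter. */
static uint8_t *toy_get_vmpck(int id, struct toy_secrets *s, uint32_t **seqno)
{
	if (id < 0 || id > 3)
		return NULL;

	*seqno = &s->msg_seqno[id];
	return s->vmpck[id];
}

/* Mirrors is_vmpck_empty(): an all-zero key means the channel is unusable. */
static int toy_vmpck_empty(const uint8_t *vmpck)
{
	static const uint8_t zero_key[VMPCK_KEY_LEN];

	return !vmpck || !memcmp(vmpck, zero_key, VMPCK_KEY_LEN);
}
```

As in the kernel code, an out-of-range VMPCK ID fails the lookup, and a
successfully selected but all-zero key must still be rejected before any
message is encrypted with it.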
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
arch/x86/include/asm/sev.h | 71 ++++++++-
arch/x86/coco/sev/core.c | 133 +++++++++++++++-
drivers/virt/coco/sev-guest/sev-guest.c | 194 +++---------------------
3 files changed, 213 insertions(+), 185 deletions(-)
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 2e49c4a9e7fe..3812692ba3fe 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -14,6 +14,7 @@
#include <asm/insn.h>
#include <asm/sev-common.h>
#include <asm/coco.h>
+#include <asm/set_memory.h>
#define GHCB_PROTOCOL_MIN 1ULL
#define GHCB_PROTOCOL_MAX 2ULL
@@ -170,10 +171,6 @@ struct snp_guest_msg {
u8 payload[PAGE_SIZE - sizeof(struct snp_guest_msg_hdr)];
} __packed;
-struct sev_guest_platform_data {
- u64 secrets_gpa;
-};
-
struct snp_guest_req {
void *req_buf;
size_t req_sz;
@@ -253,6 +250,7 @@ struct snp_msg_desc {
u32 *os_area_msg_seqno;
u8 *vmpck;
+ int vmpck_id;
};
/*
@@ -438,6 +436,63 @@ u64 sev_get_status(void);
void sev_show_status(void);
void snp_update_svsm_ca(void);
+static inline void free_shared_pages(void *buf, size_t sz)
+{
+ unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+ int ret;
+
+ if (!buf)
+ return;
+
+ ret = set_memory_encrypted((unsigned long)buf, npages);
+ if (ret) {
+ WARN_ONCE(ret, "failed to restore encryption mask (leak it)\n");
+ return;
+ }
+
+ __free_pages(virt_to_page(buf), get_order(sz));
+}
+
+static inline void *alloc_shared_pages(size_t sz)
+{
+ unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+ struct page *page;
+ int ret;
+
+ page = alloc_pages(GFP_KERNEL_ACCOUNT, get_order(sz));
+ if (!page)
+ return NULL;
+
+ ret = set_memory_decrypted((unsigned long)page_address(page), npages);
+ if (ret) {
+ pr_err("failed to mark page shared, ret=%d\n", ret);
+ __free_pages(page, get_order(sz));
+ return NULL;
+ }
+
+ return page_address(page);
+}
+
+static inline bool is_vmpck_empty(struct snp_msg_desc *mdesc)
+{
+ char zero_key[VMPCK_KEY_LEN] = {0};
+
+ if (mdesc->vmpck)
+ return !memcmp(mdesc->vmpck, zero_key, VMPCK_KEY_LEN);
+
+ return true;
+}
+
+int snp_msg_init(struct snp_msg_desc *mdesc, int vmpck_id);
+struct snp_msg_desc *snp_msg_alloc(void);
+
+static inline void snp_msg_cleanup(struct snp_msg_desc *mdesc)
+{
+ mdesc->vmpck = NULL;
+ mdesc->os_area_msg_seqno = NULL;
+ kfree(mdesc->ctx);
+}
+
#else /* !CONFIG_AMD_MEM_ENCRYPT */
#define snp_vmpl 0
@@ -474,6 +529,14 @@ static inline u64 snp_get_unsupported_features(u64 status) { return 0; }
static inline u64 sev_get_status(void) { return 0; }
static inline void sev_show_status(void) { }
static inline void snp_update_svsm_ca(void) { }
+static inline void free_shared_pages(void *buf, size_t sz) { }
+static inline void *alloc_shared_pages(size_t sz) { return NULL; }
+static inline bool is_vmpck_empty(struct snp_msg_desc *mdesc) { return false; }
+
+static inline int snp_msg_init(struct snp_msg_desc *mdesc, int vmpck_id) { return -1; }
+static inline struct snp_msg_desc *snp_msg_alloc(void) { return NULL; }
+
+static inline void snp_msg_cleanup(struct snp_msg_desc *mdesc) { }
#endif /* CONFIG_AMD_MEM_ENCRYPT */
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index e6a3f3df4637..6787d0972a45 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -25,6 +25,7 @@
#include <linux/psp-sev.h>
#include <linux/dmi.h>
#include <uapi/linux/sev-guest.h>
+#include <crypto/gcm.h>
#include <asm/init.h>
#include <asm/cpu_entry_area.h>
@@ -95,6 +96,8 @@ static u64 sev_hv_features __ro_after_init;
/* Secrets page physical address from the CC blob */
static u64 secrets_pa __ro_after_init;
+static struct snp_msg_desc *snp_mdesc;
+
/* #VC handler runtime per-CPU data */
struct sev_es_runtime_data {
struct ghcb ghcb_page;
@@ -2489,15 +2492,9 @@ static struct platform_device sev_guest_device = {
static int __init snp_init_platform_device(void)
{
- struct sev_guest_platform_data data;
-
if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
return -ENODEV;
- data.secrets_gpa = secrets_pa;
- if (platform_device_add_data(&sev_guest_device, &data, sizeof(data)))
- return -ENODEV;
-
if (platform_device_register(&sev_guest_device))
return -ENODEV;
@@ -2576,3 +2573,127 @@ static int __init sev_sysfs_init(void)
}
arch_initcall(sev_sysfs_init);
#endif // CONFIG_SYSFS
+
+static u8 *get_vmpck(int id, struct snp_secrets_page *secrets, u32 **seqno)
+{
+ u8 *key = NULL;
+
+ switch (id) {
+ case 0:
+ *seqno = &secrets->os_area.msg_seqno_0;
+ key = secrets->vmpck0;
+ break;
+ case 1:
+ *seqno = &secrets->os_area.msg_seqno_1;
+ key = secrets->vmpck1;
+ break;
+ case 2:
+ *seqno = &secrets->os_area.msg_seqno_2;
+ key = secrets->vmpck2;
+ break;
+ case 3:
+ *seqno = &secrets->os_area.msg_seqno_3;
+ key = secrets->vmpck3;
+ break;
+ default:
+ break;
+ }
+
+ return key;
+}
+
+static struct aesgcm_ctx *snp_init_crypto(u8 *key, size_t keylen)
+{
+ struct aesgcm_ctx *ctx;
+
+ ctx = kzalloc(sizeof(*ctx), GFP_KERNEL_ACCOUNT);
+ if (!ctx)
+ return NULL;
+
+ if (aesgcm_expandkey(ctx, key, keylen, AUTHTAG_LEN)) {
+ pr_err("Crypto context initialization failed\n");
+ kfree(ctx);
+ return NULL;
+ }
+
+ return ctx;
+}
+
+int snp_msg_init(struct snp_msg_desc *mdesc, int vmpck_id)
+{
+ /* Adjust the default VMPCK key based on the executing VMPL level */
+ if (vmpck_id == -1)
+ vmpck_id = snp_vmpl;
+
+ mdesc->vmpck = get_vmpck(vmpck_id, mdesc->secrets, &mdesc->os_area_msg_seqno);
+ if (!mdesc->vmpck) {
+ pr_err("Invalid VMPCK%d communication key\n", vmpck_id);
+ return -EINVAL;
+ }
+
+ /* Verify that VMPCK is not zero. */
+ if (is_vmpck_empty(mdesc)) {
+ pr_err("Empty VMPCK%d communication key\n", vmpck_id);
+ return -EINVAL;
+ }
+
+ mdesc->vmpck_id = vmpck_id;
+
+ mdesc->ctx = snp_init_crypto(mdesc->vmpck, VMPCK_KEY_LEN);
+ if (!mdesc->ctx)
+ return -ENOMEM;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(snp_msg_init);
+
+struct snp_msg_desc *snp_msg_alloc(void)
+{
+ struct snp_msg_desc *mdesc;
+
+ if (snp_mdesc)
+ return snp_mdesc;
+
+ mdesc = kzalloc(sizeof(struct snp_msg_desc), GFP_KERNEL);
+ if (!mdesc)
+ return ERR_PTR(-ENOMEM);
+
+ mdesc->secrets = ioremap_encrypted(secrets_pa, PAGE_SIZE);
+ if (!mdesc->secrets)
+ return ERR_PTR(-ENODEV);
+
+ /* Ensure SNP guest messages do not span more than a page */
+ BUILD_BUG_ON(sizeof(struct snp_guest_msg) > PAGE_SIZE);
+
+ /* Allocate the shared page used for the request and response message. */
+ mdesc->request = alloc_shared_pages(sizeof(struct snp_guest_msg));
+ if (!mdesc->request)
+ goto e_unmap;
+
+ mdesc->response = alloc_shared_pages(sizeof(struct snp_guest_msg));
+ if (!mdesc->response)
+ goto e_free_request;
+
+ mdesc->certs_data = alloc_shared_pages(SEV_FW_BLOB_MAX_SIZE);
+ if (!mdesc->certs_data)
+ goto e_free_response;
+
+ /* Initialize the input addresses for guest request */
+ mdesc->input.req_gpa = __pa(mdesc->request);
+ mdesc->input.resp_gpa = __pa(mdesc->response);
+ mdesc->input.data_gpa = __pa(mdesc->certs_data);
+
+ snp_mdesc = mdesc;
+
+ return mdesc;
+
+e_free_response:
+ free_shared_pages(mdesc->response, sizeof(struct snp_guest_msg));
+e_free_request:
+ free_shared_pages(mdesc->request, sizeof(struct snp_guest_msg));
+e_unmap:
+ iounmap(mdesc->secrets);
+
+ return ERR_PTR(-ENOMEM);
+}
+EXPORT_SYMBOL_GPL(snp_msg_alloc);
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index 38ddabcd7ba3..40509fe18658 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -63,16 +63,6 @@ MODULE_PARM_DESC(vmpck_id, "The VMPCK ID to use when communicating with the PSP.
/* Mutex to serialize the shared buffer access and command handling. */
static DEFINE_MUTEX(snp_cmd_mutex);
-static bool is_vmpck_empty(struct snp_msg_desc *mdesc)
-{
- char zero_key[VMPCK_KEY_LEN] = {0};
-
- if (mdesc->vmpck)
- return !memcmp(mdesc->vmpck, zero_key, VMPCK_KEY_LEN);
-
- return true;
-}
-
/*
* If an error is received from the host or AMD Secure Processor (ASP) there
* are two options. Either retry the exact same encrypted request or discontinue
@@ -93,7 +83,7 @@ static bool is_vmpck_empty(struct snp_msg_desc *mdesc)
static void snp_disable_vmpck(struct snp_msg_desc *mdesc)
{
pr_alert("Disabling VMPCK%d communication key to prevent IV reuse.\n",
- vmpck_id);
+ mdesc->vmpck_id);
memzero_explicit(mdesc->vmpck, VMPCK_KEY_LEN);
mdesc->vmpck = NULL;
}
@@ -147,23 +137,6 @@ static inline struct snp_guest_dev *to_snp_dev(struct file *file)
return container_of(dev, struct snp_guest_dev, misc);
}
-static struct aesgcm_ctx *snp_init_crypto(u8 *key, size_t keylen)
-{
- struct aesgcm_ctx *ctx;
-
- ctx = kzalloc(sizeof(*ctx), GFP_KERNEL_ACCOUNT);
- if (!ctx)
- return NULL;
-
- if (aesgcm_expandkey(ctx, key, keylen, AUTHTAG_LEN)) {
- pr_err("Crypto context initialization failed\n");
- kfree(ctx);
- return NULL;
- }
-
- return ctx;
-}
-
static int verify_and_dec_payload(struct snp_msg_desc *mdesc, struct snp_guest_req *req)
{
struct snp_guest_msg *resp_msg = &mdesc->secret_response;
@@ -385,7 +358,7 @@ static int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code,
struct snp_guest_req req = {
.msg_version = rio->msg_version,
.msg_type = type,
- .vmpck_id = vmpck_id,
+ .vmpck_id = mdesc->vmpck_id,
.req_buf = req_buf,
.req_sz = req_sz,
.resp_buf = resp_buf,
@@ -609,76 +582,11 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
return ret;
}
-static void free_shared_pages(void *buf, size_t sz)
-{
- unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
- int ret;
-
- if (!buf)
- return;
-
- ret = set_memory_encrypted((unsigned long)buf, npages);
- if (ret) {
- WARN_ONCE(ret, "failed to restore encryption mask (leak it)\n");
- return;
- }
-
- __free_pages(virt_to_page(buf), get_order(sz));
-}
-
-static void *alloc_shared_pages(struct device *dev, size_t sz)
-{
- unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
- struct page *page;
- int ret;
-
- page = alloc_pages(GFP_KERNEL_ACCOUNT, get_order(sz));
- if (!page)
- return NULL;
-
- ret = set_memory_decrypted((unsigned long)page_address(page), npages);
- if (ret) {
- dev_err(dev, "failed to mark page shared, ret=%d\n", ret);
- __free_pages(page, get_order(sz));
- return NULL;
- }
-
- return page_address(page);
-}
-
static const struct file_operations snp_guest_fops = {
.owner = THIS_MODULE,
.unlocked_ioctl = snp_guest_ioctl,
};
-static u8 *get_vmpck(int id, struct snp_secrets_page *secrets, u32 **seqno)
-{
- u8 *key = NULL;
-
- switch (id) {
- case 0:
- *seqno = &secrets->os_area.msg_seqno_0;
- key = secrets->vmpck0;
- break;
- case 1:
- *seqno = &secrets->os_area.msg_seqno_1;
- key = secrets->vmpck1;
- break;
- case 2:
- *seqno = &secrets->os_area.msg_seqno_2;
- key = secrets->vmpck2;
- break;
- case 3:
- *seqno = &secrets->os_area.msg_seqno_3;
- key = secrets->vmpck3;
- break;
- default:
- break;
- }
-
- return key;
-}
-
struct snp_msg_report_resp_hdr {
u32 status;
u32 report_size;
@@ -978,130 +886,66 @@ static void unregister_sev_tsm(void *data)
static int __init sev_guest_probe(struct platform_device *pdev)
{
- struct sev_guest_platform_data *data;
- struct snp_secrets_page *secrets;
struct device *dev = &pdev->dev;
struct snp_guest_dev *snp_dev;
struct snp_msg_desc *mdesc;
struct miscdevice *misc;
- void __iomem *mapping;
int ret;
if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
return -ENODEV;
- if (!dev->platform_data)
- return -ENODEV;
-
- data = (struct sev_guest_platform_data *)dev->platform_data;
- mapping = ioremap_encrypted(data->secrets_gpa, PAGE_SIZE);
- if (!mapping)
- return -ENODEV;
-
- secrets = (__force void *)mapping;
-
- ret = -ENOMEM;
snp_dev = devm_kzalloc(&pdev->dev, sizeof(struct snp_guest_dev), GFP_KERNEL);
if (!snp_dev)
- goto e_unmap;
-
- mdesc = devm_kzalloc(&pdev->dev, sizeof(struct snp_msg_desc), GFP_KERNEL);
- if (!mdesc)
- goto e_unmap;
-
- /* Adjust the default VMPCK key based on the executing VMPL level */
- if (vmpck_id == -1)
- vmpck_id = snp_vmpl;
+ return -ENOMEM;
- ret = -EINVAL;
- mdesc->vmpck = get_vmpck(vmpck_id, secrets, &mdesc->os_area_msg_seqno);
- if (!mdesc->vmpck) {
- dev_err(dev, "Invalid VMPCK%d communication key\n", vmpck_id);
- goto e_unmap;
- }
+ mdesc = snp_msg_alloc();
+ if (IS_ERR_OR_NULL(mdesc))
+ return -ENOMEM;
- /* Verify that VMPCK is not zero. */
- if (is_vmpck_empty(mdesc)) {
- dev_err(dev, "Empty VMPCK%d communication key\n", vmpck_id);
- goto e_unmap;
- }
+ ret = snp_msg_init(mdesc, vmpck_id);
+ if (ret)
+ return -EIO;
platform_set_drvdata(pdev, snp_dev);
snp_dev->dev = dev;
- mdesc->secrets = secrets;
-
- /* Ensure SNP guest messages do not span more than a page */
- BUILD_BUG_ON(sizeof(struct snp_guest_msg) > PAGE_SIZE);
-
- /* Allocate the shared page used for the request and response message. */
- mdesc->request = alloc_shared_pages(dev, sizeof(struct snp_guest_msg));
- if (!mdesc->request)
- goto e_unmap;
-
- mdesc->response = alloc_shared_pages(dev, sizeof(struct snp_guest_msg));
- if (!mdesc->response)
- goto e_free_request;
-
- mdesc->certs_data = alloc_shared_pages(dev, SEV_FW_BLOB_MAX_SIZE);
- if (!mdesc->certs_data)
- goto e_free_response;
-
- ret = -EIO;
- mdesc->ctx = snp_init_crypto(mdesc->vmpck, VMPCK_KEY_LEN);
- if (!mdesc->ctx)
- goto e_free_cert_data;
misc = &snp_dev->misc;
misc->minor = MISC_DYNAMIC_MINOR;
misc->name = DEVICE_NAME;
misc->fops = &snp_guest_fops;
- /* Initialize the input addresses for guest request */
- mdesc->input.req_gpa = __pa(mdesc->request);
- mdesc->input.resp_gpa = __pa(mdesc->response);
- mdesc->input.data_gpa = __pa(mdesc->certs_data);
-
/* Set the privlevel_floor attribute based on the vmpck_id */
- sev_tsm_ops.privlevel_floor = vmpck_id;
+ sev_tsm_ops.privlevel_floor = mdesc->vmpck_id;
ret = tsm_register(&sev_tsm_ops, snp_dev);
if (ret)
- goto e_free_cert_data;
+ goto e_msg_init;
ret = devm_add_action_or_reset(&pdev->dev, unregister_sev_tsm, NULL);
if (ret)
- goto e_free_cert_data;
+ goto e_msg_init;
ret = misc_register(misc);
if (ret)
- goto e_free_ctx;
+ goto e_msg_init;
snp_dev->msg_desc = mdesc;
- dev_info(dev, "Initialized SEV guest driver (using VMPCK%d communication key)\n", vmpck_id);
+ dev_info(dev, "Initialized SEV guest driver (using VMPCK%d communication key)\n",
+ mdesc->vmpck_id);
return 0;
-e_free_ctx:
- kfree(mdesc->ctx);
-e_free_cert_data:
- free_shared_pages(mdesc->certs_data, SEV_FW_BLOB_MAX_SIZE);
-e_free_response:
- free_shared_pages(mdesc->response, sizeof(struct snp_guest_msg));
-e_free_request:
- free_shared_pages(mdesc->request, sizeof(struct snp_guest_msg));
-e_unmap:
- iounmap(mapping);
+e_msg_init:
+ snp_msg_cleanup(mdesc);
+
return ret;
}
static void __exit sev_guest_remove(struct platform_device *pdev)
{
struct snp_guest_dev *snp_dev = platform_get_drvdata(pdev);
- struct snp_msg_desc *mdesc = snp_dev->msg_desc;
- free_shared_pages(mdesc->certs_data, SEV_FW_BLOB_MAX_SIZE);
- free_shared_pages(mdesc->response, sizeof(struct snp_guest_msg));
- free_shared_pages(mdesc->request, sizeof(struct snp_guest_msg));
- kfree(mdesc->ctx);
+ snp_msg_cleanup(snp_dev->msg_desc);
misc_deregister(&snp_dev->misc);
}
--
2.34.1
* [PATCH v11 12/20] x86/sev: Relocate SNP guest messaging routines to common code
2024-07-31 15:07 [PATCH v11 00/20] Add Secure TSC support for SNP guests Nikunj A Dadhania
` (10 preceding siblings ...)
2024-07-31 15:08 ` [PATCH v11 11/20] x86/sev: Carve out and export SNP guest messaging init routines Nikunj A Dadhania
@ 2024-07-31 15:08 ` Nikunj A Dadhania
2024-09-13 16:27 ` Tom Lendacky
2024-07-31 15:08 ` [PATCH v11 13/20] x86/cc: Add CC_ATTR_GUEST_SECURE_TSC Nikunj A Dadhania
` (8 subsequent siblings)
20 siblings, 1 reply; 66+ messages in thread
From: Nikunj A Dadhania @ 2024-07-31 15:08 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj
At present, the SEV guest driver exclusively handles SNP guest messaging.
All routines for sending guest messages are embedded within the guest
driver. To support Secure TSC, SEV-SNP guests must communicate with the AMD
Security Processor during early boot. However, these guest messaging
functions are not accessible during early boot since they are currently
part of the guest driver.
Hence, relocate the core SNP guest messaging functions to SEV common code
and provide an API for sending SNP guest messages.
No functional change intended; the only addition is an exported symbol.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
arch/x86/include/asm/sev.h | 8 +
arch/x86/coco/sev/core.c | 284 +++++++++++++++++++++++
drivers/virt/coco/sev-guest/sev-guest.c | 286 ------------------------
arch/x86/Kconfig | 1 +
drivers/virt/coco/sev-guest/Kconfig | 1 -
5 files changed, 293 insertions(+), 287 deletions(-)
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 3812692ba3fe..eda435eba53e 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -125,6 +125,9 @@ struct snp_req_data {
#define AAD_LEN 48
#define MSG_HDR_VER 1
+#define SNP_REQ_MAX_RETRY_DURATION (60*HZ)
+#define SNP_REQ_RETRY_DELAY (2*HZ)
+
/* See SNP spec SNP_GUEST_REQUEST section for the structure */
enum msg_type {
SNP_MSG_TYPE_INVALID = 0,
@@ -493,6 +496,9 @@ static inline void snp_msg_cleanup(struct snp_msg_desc *mdesc)
kfree(mdesc->ctx);
}
+int snp_send_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
+ struct snp_guest_request_ioctl *rio);
+
#else /* !CONFIG_AMD_MEM_ENCRYPT */
#define snp_vmpl 0
@@ -537,6 +543,8 @@ static inline int snp_msg_init(struct snp_msg_desc *mdesc, int vmpck_id) { retur
static inline struct snp_msg_desc *snp_msg_alloc(void) { return NULL; }
static inline void snp_msg_cleanup(struct snp_msg_desc *mdesc) { }
+static inline int snp_send_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
+ struct snp_guest_request_ioctl *rio) { return -ENODEV; }
#endif /* CONFIG_AMD_MEM_ENCRYPT */
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index 6787d0972a45..aaeeca938265 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -2697,3 +2697,287 @@ struct snp_msg_desc *snp_msg_alloc(void)
return ERR_PTR(-ENOMEM);
}
EXPORT_SYMBOL_GPL(snp_msg_alloc);
+
+/* Mutex to serialize the shared buffer access and command handling. */
+static DEFINE_MUTEX(snp_cmd_mutex);
+
+/*
+ * If an error is received from the host or AMD Secure Processor (ASP) there
+ * are two options. Either retry the exact same encrypted request or discontinue
+ * using the VMPCK.
+ *
+ * This is because in the current encryption scheme GHCB v2 uses AES-GCM to
+ * encrypt the requests. The IV for this scheme is the sequence number. GCM
+ * cannot tolerate IV reuse.
+ *
+ * The ASP FW v1.51 only increments the sequence numbers on a successful
+ * guest<->ASP back and forth and only accepts messages at its exact sequence
+ * number.
+ *
+ * So if the sequence number were to be reused the encryption scheme is
+ * vulnerable. If the sequence number were incremented for a fresh IV the ASP
+ * will reject the request.
+ */
+static void snp_disable_vmpck(struct snp_msg_desc *mdesc)
+{
+ pr_alert("Disabling VMPCK%d communication key to prevent IV reuse.\n",
+ mdesc->vmpck_id);
+ memzero_explicit(mdesc->vmpck, VMPCK_KEY_LEN);
+ mdesc->vmpck = NULL;
+}
+
+static inline u64 __snp_get_msg_seqno(struct snp_msg_desc *mdesc)
+{
+ u64 count;
+
+ lockdep_assert_held(&snp_cmd_mutex);
+
+ /* Read the current message sequence counter from secrets pages */
+ count = *mdesc->os_area_msg_seqno;
+
+ return count + 1;
+}
+
+/* Return a non-zero on success */
+static u64 snp_get_msg_seqno(struct snp_msg_desc *mdesc)
+{
+ u64 count = __snp_get_msg_seqno(mdesc);
+
+ /*
+ * The message sequence counter for the SNP guest request is a 64-bit
+ * value but the version 2 of GHCB specification defines a 32-bit storage
+ * for it. If the counter exceeds the 32-bit value then return zero.
+ * The caller should check the return value, but if the caller happens to
+ * not check the value and use it, then the firmware treats zero as an
+ * invalid number and will fail the message request.
+ */
+ if (count >= UINT_MAX) {
+ pr_err("request message sequence counter overflow\n");
+ return 0;
+ }
+
+ return count;
+}
+
+static void snp_inc_msg_seqno(struct snp_msg_desc *mdesc)
+{
+ /*
+ * The counter is also incremented by the PSP, so increment it by 2
+ * and save in secrets page.
+ */
+ *mdesc->os_area_msg_seqno += 2;
+}
+
+static int verify_and_dec_payload(struct snp_msg_desc *mdesc, struct snp_guest_req *req)
+{
+ struct snp_guest_msg *resp_msg = &mdesc->secret_response;
+ struct snp_guest_msg *req_msg = &mdesc->secret_request;
+ struct snp_guest_msg_hdr *req_msg_hdr = &req_msg->hdr;
+ struct snp_guest_msg_hdr *resp_msg_hdr = &resp_msg->hdr;
+ struct aesgcm_ctx *ctx = mdesc->ctx;
+ u8 iv[GCM_AES_IV_SIZE] = {};
+
+ pr_debug("response [seqno %lld type %d version %d sz %d]\n",
+ resp_msg_hdr->msg_seqno, resp_msg_hdr->msg_type, resp_msg_hdr->msg_version,
+ resp_msg_hdr->msg_sz);
+
+ /* Copy response from shared memory to encrypted memory. */
+ memcpy(resp_msg, mdesc->response, sizeof(*resp_msg));
+
+ /* Verify that the sequence counter is incremented by 1 */
+ if (unlikely(resp_msg_hdr->msg_seqno != (req_msg_hdr->msg_seqno + 1)))
+ return -EBADMSG;
+
+ /* Verify response message type and version number. */
+ if (resp_msg_hdr->msg_type != (req_msg_hdr->msg_type + 1) ||
+ resp_msg_hdr->msg_version != req_msg_hdr->msg_version)
+ return -EBADMSG;
+
+ /*
+ * If the message size is greater than our buffer length then return
+ * an error.
+ */
+ if (unlikely((resp_msg_hdr->msg_sz + ctx->authsize) > req->resp_sz))
+ return -EBADMSG;
+
+ /* Decrypt the payload */
+ memcpy(iv, &resp_msg_hdr->msg_seqno, min(sizeof(iv), sizeof(resp_msg_hdr->msg_seqno)));
+ if (!aesgcm_decrypt(ctx, req->resp_buf, resp_msg->payload, resp_msg_hdr->msg_sz,
+ &resp_msg_hdr->algo, AAD_LEN, iv, resp_msg_hdr->authtag))
+ return -EBADMSG;
+
+ return 0;
+}
+
+static int enc_payload(struct snp_msg_desc *mdesc, u64 seqno, struct snp_guest_req *req)
+{
+ struct snp_guest_msg *msg = &mdesc->secret_request;
+ struct snp_guest_msg_hdr *hdr = &msg->hdr;
+ struct aesgcm_ctx *ctx = mdesc->ctx;
+ u8 iv[GCM_AES_IV_SIZE] = {};
+
+ memset(msg, 0, sizeof(*msg));
+
+ hdr->algo = SNP_AEAD_AES_256_GCM;
+ hdr->hdr_version = MSG_HDR_VER;
+ hdr->hdr_sz = sizeof(*hdr);
+ hdr->msg_type = req->msg_type;
+ hdr->msg_version = req->msg_version;
+ hdr->msg_seqno = seqno;
+ hdr->msg_vmpck = req->vmpck_id;
+ hdr->msg_sz = req->req_sz;
+
+ /* Verify the sequence number is non-zero */
+ if (!hdr->msg_seqno)
+ return -ENOSR;
+
+ pr_debug("request [seqno %lld type %d version %d sz %d]\n",
+ hdr->msg_seqno, hdr->msg_type, hdr->msg_version, hdr->msg_sz);
+
+ if (WARN_ON((req->req_sz + ctx->authsize) > sizeof(msg->payload)))
+ return -EBADMSG;
+
+ memcpy(iv, &hdr->msg_seqno, min(sizeof(iv), sizeof(hdr->msg_seqno)));
+ aesgcm_encrypt(ctx, msg->payload, req->req_buf, req->req_sz, &hdr->algo,
+ AAD_LEN, iv, hdr->authtag);
+
+ return 0;
+}
+
+static int __handle_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
+ struct snp_guest_request_ioctl *rio)
+{
+ unsigned long req_start = jiffies;
+ unsigned int override_npages = 0;
+ u64 override_err = 0;
+ int rc;
+
+retry_request:
+ /*
+ * Call firmware to process the request. In this function the encrypted
+ * message enters shared memory with the host. So after this call the
+ * sequence number must be incremented or the VMPCK must be deleted to
+ * prevent reuse of the IV.
+ */
+ rc = snp_issue_guest_request(req, &mdesc->input, rio);
+ switch (rc) {
+ case -ENOSPC:
+ /*
+ * If the extended guest request fails due to having too
+ * small of a certificate data buffer, retry the same
+ * guest request without the extended data request in
+ * order to increment the sequence number and thus avoid
+ * IV reuse.
+ */
+ override_npages = mdesc->input.data_npages;
+ req->exit_code = SVM_VMGEXIT_GUEST_REQUEST;
+
+ /*
+ * Override the error to inform callers the given extended
+ * request buffer size was too small and give the caller the
+ * required buffer size.
+ */
+ override_err = SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN);
+
+ /*
+ * If this call to the firmware succeeds, the sequence number can
+ * be incremented allowing for continued use of the VMPCK. If
+ * there is an error reflected in the return value, this value
+ * is checked further down and the result will be the deletion
+ * of the VMPCK and the error code being propagated back to the
+ * user as an ioctl() return code.
+ */
+ goto retry_request;
+
+ /*
+ * The host may return SNP_GUEST_VMM_ERR_BUSY if the request has been
+ * throttled. Retry in the driver to avoid returning and reusing the
+ * message sequence number on a different message.
+ */
+ case -EAGAIN:
+ if (jiffies - req_start > SNP_REQ_MAX_RETRY_DURATION) {
+ rc = -ETIMEDOUT;
+ break;
+ }
+ schedule_timeout_killable(SNP_REQ_RETRY_DELAY);
+ goto retry_request;
+ }
+
+ /*
+ * Increment the message sequence number. There is no harm in doing
+ * this now because decryption uses the value stored in the response
+ * structure and any failure will wipe the VMPCK, preventing further
+ * use anyway.
+ */
+ snp_inc_msg_seqno(mdesc);
+
+ if (override_err) {
+ rio->exitinfo2 = override_err;
+
+ /*
+ * If an extended guest request was issued and the supplied certificate
+ * buffer was not large enough, a standard guest request was issued to
+ * prevent IV reuse. If the standard request was successful, return -EIO
+ * back to the caller as would have originally been returned.
+ */
+ if (!rc && override_err == SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN))
+ rc = -EIO;
+ }
+
+ if (override_npages)
+ mdesc->input.data_npages = override_npages;
+
+ return rc;
+}
+
+int snp_send_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
+ struct snp_guest_request_ioctl *rio)
+{
+ u64 seqno;
+ int rc;
+
+ guard(mutex)(&snp_cmd_mutex);
+
+ /* Get message sequence and verify that its a non-zero */
+ seqno = snp_get_msg_seqno(mdesc);
+ if (!seqno)
+ return -EIO;
+
+ /* Clear shared memory's response for the host to populate. */
+ memset(mdesc->response, 0, sizeof(struct snp_guest_msg));
+
+ /* Encrypt the userspace provided payload in mdesc->secret_request. */
+ rc = enc_payload(mdesc, seqno, req);
+ if (rc)
+ return rc;
+
+ /*
+ * Write the fully encrypted request to the shared unencrypted
+ * request page.
+ */
+ memcpy(mdesc->request, &mdesc->secret_request,
+ sizeof(mdesc->secret_request));
+
+ rc = __handle_guest_request(mdesc, req, rio);
+ if (rc) {
+ if (rc == -EIO &&
+ rio->exitinfo2 == SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN))
+ return rc;
+
+ pr_alert("Detected error from ASP request. rc: %d, exitinfo2: 0x%llx\n",
+ rc, rio->exitinfo2);
+
+ snp_disable_vmpck(mdesc);
+ return rc;
+ }
+
+ rc = verify_and_dec_payload(mdesc, req);
+ if (rc) {
+ pr_alert("Detected unexpected decode failure from ASP. rc: %d\n", rc);
+ snp_disable_vmpck(mdesc);
+ return rc;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(snp_send_guest_request);
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index 40509fe18658..019eca753f85 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -31,9 +31,6 @@
#define DEVICE_NAME "sev-guest"
-#define SNP_REQ_MAX_RETRY_DURATION (60*HZ)
-#define SNP_REQ_RETRY_DELAY (2*HZ)
-
#define SVSM_MAX_RETRIES 3
struct snp_guest_dev {
@@ -60,76 +57,6 @@ static int vmpck_id = -1;
module_param(vmpck_id, int, 0444);
MODULE_PARM_DESC(vmpck_id, "The VMPCK ID to use when communicating with the PSP.");
-/* Mutex to serialize the shared buffer access and command handling. */
-static DEFINE_MUTEX(snp_cmd_mutex);
-
-/*
- * If an error is received from the host or AMD Secure Processor (ASP) there
- * are two options. Either retry the exact same encrypted request or discontinue
- * using the VMPCK.
- *
- * This is because in the current encryption scheme GHCB v2 uses AES-GCM to
- * encrypt the requests. The IV for this scheme is the sequence number. GCM
- * cannot tolerate IV reuse.
- *
- * The ASP FW v1.51 only increments the sequence numbers on a successful
- * guest<->ASP back and forth and only accepts messages at its exact sequence
- * number.
- *
- * So if the sequence number were to be reused the encryption scheme is
- * vulnerable. If the sequence number were incremented for a fresh IV the ASP
- * will reject the request.
- */
-static void snp_disable_vmpck(struct snp_msg_desc *mdesc)
-{
- pr_alert("Disabling VMPCK%d communication key to prevent IV reuse.\n",
- mdesc->vmpck_id);
- memzero_explicit(mdesc->vmpck, VMPCK_KEY_LEN);
- mdesc->vmpck = NULL;
-}
-
-static inline u64 __snp_get_msg_seqno(struct snp_msg_desc *mdesc)
-{
- u64 count;
-
- lockdep_assert_held(&snp_cmd_mutex);
-
- /* Read the current message sequence counter from secrets pages */
- count = *mdesc->os_area_msg_seqno;
-
- return count + 1;
-}
-
-/* Return a non-zero on success */
-static u64 snp_get_msg_seqno(struct snp_msg_desc *mdesc)
-{
- u64 count = __snp_get_msg_seqno(mdesc);
-
- /*
- * The message sequence counter for the SNP guest request is a 64-bit
- * value but the version 2 of GHCB specification defines a 32-bit storage
- * for it. If the counter exceeds the 32-bit value then return zero.
- * The caller should check the return value, but if the caller happens to
- * not check the value and use it, then the firmware treats zero as an
- * invalid number and will fail the message request.
- */
- if (count >= UINT_MAX) {
- pr_err("request message sequence counter overflow\n");
- return 0;
- }
-
- return count;
-}
-
-static void snp_inc_msg_seqno(struct snp_msg_desc *mdesc)
-{
- /*
- * The counter is also incremented by the PSP, so increment it by 2
- * and save in secrets page.
- */
- *mdesc->os_area_msg_seqno += 2;
-}
-
static inline struct snp_guest_dev *to_snp_dev(struct file *file)
{
struct miscdevice *dev = file->private_data;
@@ -137,219 +64,6 @@ static inline struct snp_guest_dev *to_snp_dev(struct file *file)
return container_of(dev, struct snp_guest_dev, misc);
}
-static int verify_and_dec_payload(struct snp_msg_desc *mdesc, struct snp_guest_req *req)
-{
- struct snp_guest_msg *resp_msg = &mdesc->secret_response;
- struct snp_guest_msg *req_msg = &mdesc->secret_request;
- struct snp_guest_msg_hdr *req_msg_hdr = &req_msg->hdr;
- struct snp_guest_msg_hdr *resp_msg_hdr = &resp_msg->hdr;
- struct aesgcm_ctx *ctx = mdesc->ctx;
- u8 iv[GCM_AES_IV_SIZE] = {};
-
- pr_debug("response [seqno %lld type %d version %d sz %d]\n",
- resp_msg_hdr->msg_seqno, resp_msg_hdr->msg_type, resp_msg_hdr->msg_version,
- resp_msg_hdr->msg_sz);
-
- /* Copy response from shared memory to encrypted memory. */
- memcpy(resp_msg, mdesc->response, sizeof(*resp_msg));
-
- /* Verify that the sequence counter is incremented by 1 */
- if (unlikely(resp_msg_hdr->msg_seqno != (req_msg_hdr->msg_seqno + 1)))
- return -EBADMSG;
-
- /* Verify response message type and version number. */
- if (resp_msg_hdr->msg_type != (req_msg_hdr->msg_type + 1) ||
- resp_msg_hdr->msg_version != req_msg_hdr->msg_version)
- return -EBADMSG;
-
- /*
- * If the message size is greater than our buffer length then return
- * an error.
- */
- if (unlikely((resp_msg_hdr->msg_sz + ctx->authsize) > req->resp_sz))
- return -EBADMSG;
-
- /* Decrypt the payload */
- memcpy(iv, &resp_msg_hdr->msg_seqno, min(sizeof(iv), sizeof(resp_msg_hdr->msg_seqno)));
- if (!aesgcm_decrypt(ctx, req->resp_buf, resp_msg->payload, resp_msg_hdr->msg_sz,
- &resp_msg_hdr->algo, AAD_LEN, iv, resp_msg_hdr->authtag))
- return -EBADMSG;
-
- return 0;
-}
-
-static int enc_payload(struct snp_msg_desc *mdesc, u64 seqno, struct snp_guest_req *req)
-{
- struct snp_guest_msg *msg = &mdesc->secret_request;
- struct snp_guest_msg_hdr *hdr = &msg->hdr;
- struct aesgcm_ctx *ctx = mdesc->ctx;
- u8 iv[GCM_AES_IV_SIZE] = {};
-
- memset(msg, 0, sizeof(*msg));
-
- hdr->algo = SNP_AEAD_AES_256_GCM;
- hdr->hdr_version = MSG_HDR_VER;
- hdr->hdr_sz = sizeof(*hdr);
- hdr->msg_type = req->msg_type;
- hdr->msg_version = req->msg_version;
- hdr->msg_seqno = seqno;
- hdr->msg_vmpck = req->vmpck_id;
- hdr->msg_sz = req->req_sz;
-
- /* Verify the sequence number is non-zero */
- if (!hdr->msg_seqno)
- return -ENOSR;
-
- pr_debug("request [seqno %lld type %d version %d sz %d]\n",
- hdr->msg_seqno, hdr->msg_type, hdr->msg_version, hdr->msg_sz);
-
- if (WARN_ON((req->req_sz + ctx->authsize) > sizeof(msg->payload)))
- return -EBADMSG;
-
- memcpy(iv, &hdr->msg_seqno, min(sizeof(iv), sizeof(hdr->msg_seqno)));
- aesgcm_encrypt(ctx, msg->payload, req->req_buf, req->req_sz, &hdr->algo,
- AAD_LEN, iv, hdr->authtag);
-
- return 0;
-}
-
-static int __handle_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
- struct snp_guest_request_ioctl *rio)
-{
- unsigned long req_start = jiffies;
- unsigned int override_npages = 0;
- u64 override_err = 0;
- int rc;
-
-retry_request:
- /*
- * Call firmware to process the request. In this function the encrypted
- * message enters shared memory with the host. So after this call the
- * sequence number must be incremented or the VMPCK must be deleted to
- * prevent reuse of the IV.
- */
- rc = snp_issue_guest_request(req, &mdesc->input, rio);
- switch (rc) {
- case -ENOSPC:
- /*
- * If the extended guest request fails due to having too
- * small of a certificate data buffer, retry the same
- * guest request without the extended data request in
- * order to increment the sequence number and thus avoid
- * IV reuse.
- */
- override_npages = mdesc->input.data_npages;
- req->exit_code = SVM_VMGEXIT_GUEST_REQUEST;
-
- /*
- * Override the error to inform callers the given extended
- * request buffer size was too small and give the caller the
- * required buffer size.
- */
- override_err = SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN);
-
- /*
- * If this call to the firmware succeeds, the sequence number can
- * be incremented allowing for continued use of the VMPCK. If
- * there is an error reflected in the return value, this value
- * is checked further down and the result will be the deletion
- * of the VMPCK and the error code being propagated back to the
- * user as an ioctl() return code.
- */
- goto retry_request;
-
- /*
- * The host may return SNP_GUEST_VMM_ERR_BUSY if the request has been
- * throttled. Retry in the driver to avoid returning and reusing the
- * message sequence number on a different message.
- */
- case -EAGAIN:
- if (jiffies - req_start > SNP_REQ_MAX_RETRY_DURATION) {
- rc = -ETIMEDOUT;
- break;
- }
- schedule_timeout_killable(SNP_REQ_RETRY_DELAY);
- goto retry_request;
- }
-
- /*
- * Increment the message sequence number. There is no harm in doing
- * this now because decryption uses the value stored in the response
- * structure and any failure will wipe the VMPCK, preventing further
- * use anyway.
- */
- snp_inc_msg_seqno(mdesc);
-
- if (override_err) {
- rio->exitinfo2 = override_err;
-
- /*
- * If an extended guest request was issued and the supplied certificate
- * buffer was not large enough, a standard guest request was issued to
- * prevent IV reuse. If the standard request was successful, return -EIO
- * back to the caller as would have originally been returned.
- */
- if (!rc && override_err == SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN))
- rc = -EIO;
- }
-
- if (override_npages)
- mdesc->input.data_npages = override_npages;
-
- return rc;
-}
-
-static int snp_send_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
- struct snp_guest_request_ioctl *rio)
-{
- u64 seqno;
- int rc;
-
- guard(mutex)(&snp_cmd_mutex);
-
- /* Get message sequence and verify that its a non-zero */
- seqno = snp_get_msg_seqno(mdesc);
- if (!seqno)
- return -EIO;
-
- /* Clear shared memory's response for the host to populate. */
- memset(mdesc->response, 0, sizeof(struct snp_guest_msg));
-
- /* Encrypt the userspace provided payload in mdesc->secret_request. */
- rc = enc_payload(mdesc, seqno, req);
- if (rc)
- return rc;
-
- /*
- * Write the fully encrypted request to the shared unencrypted
- * request page.
- */
- memcpy(mdesc->request, &mdesc->secret_request,
- sizeof(mdesc->secret_request));
-
- rc = __handle_guest_request(mdesc, req, rio);
- if (rc) {
- if (rc == -EIO &&
- rio->exitinfo2 == SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN))
- return rc;
-
- pr_alert("Detected error from ASP request. rc: %d, exitinfo2: 0x%llx\n",
- rc, rio->exitinfo2);
-
- snp_disable_vmpck(mdesc);
- return rc;
- }
-
- rc = verify_and_dec_payload(mdesc, req);
- if (rc) {
- pr_alert("Detected unexpected decode failure from ASP. rc: %d\n", rc);
- snp_disable_vmpck(mdesc);
- return rc;
- }
-
- return 0;
-}
-
static int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code,
struct snp_guest_request_ioctl *rio, u8 type,
void *req_buf, size_t req_sz, void *resp_buf,
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 007bab9f2a0e..45060e7cea48 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1552,6 +1552,7 @@ config AMD_MEM_ENCRYPT
select ARCH_HAS_CC_PLATFORM
select X86_MEM_ENCRYPT
select UNACCEPTED_MEMORY
+ select CRYPTO_LIB_AESGCM
help
Say yes to enable support for the encryption of system memory.
This requires an AMD processor that supports Secure Memory
diff --git a/drivers/virt/coco/sev-guest/Kconfig b/drivers/virt/coco/sev-guest/Kconfig
index 0b772bd921d8..a6405ab6c2c3 100644
--- a/drivers/virt/coco/sev-guest/Kconfig
+++ b/drivers/virt/coco/sev-guest/Kconfig
@@ -2,7 +2,6 @@ config SEV_GUEST
tristate "AMD SEV Guest driver"
default m
depends on AMD_MEM_ENCRYPT
- select CRYPTO_LIB_AESGCM
select TSM_REPORTS
help
SEV-SNP firmware provides the guest a mechanism to communicate with
--
2.34.1
* [PATCH v11 13/20] x86/cc: Add CC_ATTR_GUEST_SECURE_TSC
2024-07-31 15:07 [PATCH v11 00/20] Add Secure TSC support for SNP guests Nikunj A Dadhania
` (11 preceding siblings ...)
2024-07-31 15:08 ` [PATCH v11 12/20] x86/sev: Relocate SNP guest messaging routines to common code Nikunj A Dadhania
@ 2024-07-31 15:08 ` Nikunj A Dadhania
2024-09-13 15:21 ` Tom Lendacky
2024-07-31 15:08 ` [PATCH v11 14/20] x86/sev: Add Secure TSC support for SNP guests Nikunj A Dadhania
` (7 subsequent siblings)
20 siblings, 1 reply; 66+ messages in thread
From: Nikunj A Dadhania @ 2024-07-31 15:08 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj
Add confidential compute platform attribute CC_ATTR_GUEST_SECURE_TSC that
can be used by the guest to query whether the Secure TSC feature is active.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
include/linux/cc_platform.h | 8 ++++++++
arch/x86/coco/core.c | 3 +++
2 files changed, 11 insertions(+)
diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
index caa4b4430634..96dc61846c9d 100644
--- a/include/linux/cc_platform.h
+++ b/include/linux/cc_platform.h
@@ -88,6 +88,14 @@ enum cc_attr {
* enabled to run SEV-SNP guests.
*/
CC_ATTR_HOST_SEV_SNP,
+
+ /**
+ * @CC_ATTR_GUEST_SECURE_TSC: Secure TSC is active.
+ *
+ * The platform/OS is running as a guest/virtual machine and actively
+ * using AMD SEV-SNP Secure TSC feature.
+ */
+ CC_ATTR_GUEST_SECURE_TSC,
};
#ifdef CONFIG_ARCH_HAS_CC_PLATFORM
diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
index 0f81f70aca82..00df00e2cb4a 100644
--- a/arch/x86/coco/core.c
+++ b/arch/x86/coco/core.c
@@ -100,6 +100,9 @@ static bool noinstr amd_cc_platform_has(enum cc_attr attr)
case CC_ATTR_HOST_SEV_SNP:
return cc_flags.host_sev_snp;
+ case CC_ATTR_GUEST_SECURE_TSC:
+ return sev_status & MSR_AMD64_SNP_SECURE_TSC;
+
default:
return false;
}
--
2.34.1
* [PATCH v11 14/20] x86/sev: Add Secure TSC support for SNP guests
2024-07-31 15:07 [PATCH v11 00/20] Add Secure TSC support for SNP guests Nikunj A Dadhania
` (12 preceding siblings ...)
2024-07-31 15:08 ` [PATCH v11 13/20] x86/cc: Add CC_ATTR_GUEST_SECURE_TSC Nikunj A Dadhania
@ 2024-07-31 15:08 ` Nikunj A Dadhania
2024-09-13 16:29 ` Tom Lendacky
2024-07-31 15:08 ` [PATCH v11 15/20] x86/sev: Change TSC MSR behavior for Secure TSC enabled guests Nikunj A Dadhania
` (6 subsequent siblings)
20 siblings, 1 reply; 66+ messages in thread
From: Nikunj A Dadhania @ 2024-07-31 15:08 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj
Add support for Secure TSC in SNP-enabled guests. Secure TSC allows guests
to securely use RDTSC/RDTSCP instructions, ensuring that the parameters
used cannot be altered by the hypervisor once the guest is launched.
Secure TSC-enabled guests need to query TSC information from the AMD
Security Processor. This communication channel is encrypted between the AMD
Security Processor and the guest, with the hypervisor acting merely as a
conduit to deliver the guest messages to the AMD Security Processor. Each
message is protected with AEAD (AES-256 GCM). Use a minimal AES GCM library
to encrypt and decrypt SNP guest messages for communication with the PSP.
Use mem_encrypt_init() to fetch SNP TSC information from the AMD Security
Processor and initialize snp_tsc_scale and snp_tsc_offset. During secondary
CPU initialization, set the VMSA fields GUEST_TSC_SCALE (offset 2F0h) and
GUEST_TSC_OFFSET (offset 2F8h) with snp_tsc_scale and snp_tsc_offset,
respectively.
Since handle_guest_request() is a common routine used by both the SEV guest
driver and the Secure TSC code, move it to the SEV header file.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Tested-by: Peter Gonda <pgonda@google.com>
---
arch/x86/include/asm/sev-common.h | 1 +
arch/x86/include/asm/sev.h | 46 +++++++++++++
arch/x86/include/asm/svm.h | 6 +-
arch/x86/coco/sev/core.c | 91 +++++++++++++++++++++++++
arch/x86/mm/mem_encrypt.c | 4 ++
drivers/virt/coco/sev-guest/sev-guest.c | 19 ------
6 files changed, 146 insertions(+), 21 deletions(-)
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 98726c2b04f8..655eb0ac5032 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -206,6 +206,7 @@ struct snp_psc_desc {
#define GHCB_TERM_NO_SVSM 7 /* SVSM is not advertised in the secrets page */
#define GHCB_TERM_SVSM_VMPL0 8 /* SVSM is present but has set VMPL to 0 */
#define GHCB_TERM_SVSM_CAA 9 /* SVSM is present but CAA is not page aligned */
+#define GHCB_TERM_SECURE_TSC 10 /* Secure TSC initialization failed */
#define GHCB_RESP_CODE(v) ((v) & GHCB_MSR_INFO_MASK)
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index eda435eba53e..f95fa64bf480 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -146,6 +146,9 @@ enum msg_type {
SNP_MSG_VMRK_REQ,
SNP_MSG_VMRK_RSP,
+ SNP_MSG_TSC_INFO_REQ = 17,
+ SNP_MSG_TSC_INFO_RSP,
+
SNP_MSG_TYPE_MAX
};
@@ -174,6 +177,22 @@ struct snp_guest_msg {
u8 payload[PAGE_SIZE - sizeof(struct snp_guest_msg_hdr)];
} __packed;
+#define SNP_TSC_INFO_REQ_SZ 128
+#define SNP_TSC_INFO_RESP_SZ 128
+
+struct snp_tsc_info_req {
+ u8 rsvd[SNP_TSC_INFO_REQ_SZ];
+} __packed;
+
+struct snp_tsc_info_resp {
+ u32 status;
+ u32 rsvd1;
+ u64 tsc_scale;
+ u64 tsc_offset;
+ u32 tsc_factor;
+ u8 rsvd2[100];
+} __packed;
+
struct snp_guest_req {
void *req_buf;
size_t req_sz;
@@ -499,6 +518,27 @@ static inline void snp_msg_cleanup(struct snp_msg_desc *mdesc)
int snp_send_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
struct snp_guest_request_ioctl *rio);
+static inline int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code,
+ struct snp_guest_request_ioctl *rio, u8 type,
+ void *req_buf, size_t req_sz, void *resp_buf,
+ u32 resp_sz)
+{
+ struct snp_guest_req req = {
+ .msg_version = rio->msg_version,
+ .msg_type = type,
+ .vmpck_id = mdesc->vmpck_id,
+ .req_buf = req_buf,
+ .req_sz = req_sz,
+ .resp_buf = resp_buf,
+ .resp_sz = resp_sz,
+ .exit_code = exit_code,
+ };
+
+ return snp_send_guest_request(mdesc, &req, rio);
+}
+
+void __init snp_secure_tsc_prepare(void);
+
#else /* !CONFIG_AMD_MEM_ENCRYPT */
#define snp_vmpl 0
@@ -545,6 +585,12 @@ static inline struct snp_msg_desc *snp_msg_alloc(void) { return NULL; }
static inline void snp_msg_cleanup(struct snp_msg_desc *mdesc) { }
static inline int snp_send_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
struct snp_guest_request_ioctl *rio) { return -ENODEV; }
+static inline int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code,
+ struct snp_guest_request_ioctl *rio, u8 type,
+ void *req_buf, size_t req_sz, void *resp_buf,
+ u32 resp_sz) { return -ENODEV; }
+
+static inline void __init snp_secure_tsc_prepare(void) { }
#endif /* CONFIG_AMD_MEM_ENCRYPT */
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index f0dea3750ca9..1695a933106b 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -417,7 +417,9 @@ struct sev_es_save_area {
u8 reserved_0x298[80];
u32 pkru;
u32 tsc_aux;
- u8 reserved_0x2f0[24];
+ u64 tsc_scale;
+ u64 tsc_offset;
+ u8 reserved_0x300[8];
u64 rcx;
u64 rdx;
u64 rbx;
@@ -549,7 +551,7 @@ static inline void __unused_size_checks(void)
BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x1c0);
BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x248);
BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x298);
- BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x2f0);
+ BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x300);
BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x320);
BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x380);
BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x3f0);
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index aaeeca938265..9815aa419978 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -98,6 +98,10 @@ static u64 secrets_pa __ro_after_init;
static struct snp_msg_desc *snp_mdesc;
+/* Secure TSC values read using TSC_INFO SNP Guest request */
+static u64 snp_tsc_scale __ro_after_init;
+static u64 snp_tsc_offset __ro_after_init;
+
/* #VC handler runtime per-CPU data */
struct sev_es_runtime_data {
struct ghcb ghcb_page;
@@ -1175,6 +1179,12 @@ static int wakeup_cpu_via_vmgexit(u32 apic_id, unsigned long start_ip)
vmsa->vmpl = snp_vmpl;
vmsa->sev_features = sev_status >> 2;
+ /* Set Secure TSC parameters */
+ if (cc_platform_has(CC_ATTR_GUEST_SECURE_TSC)) {
+ vmsa->tsc_scale = snp_tsc_scale;
+ vmsa->tsc_offset = snp_tsc_offset;
+ }
+
/* Switch the page over to a VMSA page now that it is initialized */
ret = snp_set_vmsa(vmsa, caa, apic_id, true);
if (ret) {
@@ -2981,3 +2991,84 @@ int snp_send_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req
return 0;
}
EXPORT_SYMBOL_GPL(snp_send_guest_request);
+
+static int __init snp_get_tsc_info(void)
+{
+ static u8 buf[SNP_TSC_INFO_RESP_SZ + AUTHTAG_LEN];
+ struct snp_guest_request_ioctl rio;
+ struct snp_tsc_info_resp tsc_resp;
+ struct snp_tsc_info_req *tsc_req;
+ struct snp_msg_desc *mdesc;
+ struct snp_guest_req req;
+ int rc;
+
+ /*
+ * The intermediate response buffer is used while decrypting the
+ * response payload. Make sure that it has enough space to cover the
+ * authtag.
+ */
+ BUILD_BUG_ON(sizeof(buf) < (sizeof(tsc_resp) + AUTHTAG_LEN));
+
+ mdesc = snp_msg_alloc();
+ if (IS_ERR_OR_NULL(mdesc))
+ return -ENOMEM;
+
+ rc = snp_msg_init(mdesc, snp_vmpl);
+ if (rc)
+ return rc;
+
+ tsc_req = kzalloc(sizeof(struct snp_tsc_info_req), GFP_KERNEL);
+ if (!tsc_req)
+ return -ENOMEM;
+
+ memset(&req, 0, sizeof(req));
+ memset(&rio, 0, sizeof(rio));
+ memset(buf, 0, sizeof(buf));
+
+ req.msg_version = MSG_HDR_VER;
+ req.msg_type = SNP_MSG_TSC_INFO_REQ;
+ req.vmpck_id = snp_vmpl;
+ req.req_buf = tsc_req;
+ req.req_sz = sizeof(*tsc_req);
+ req.resp_buf = buf;
+ req.resp_sz = sizeof(tsc_resp) + AUTHTAG_LEN;
+ req.exit_code = SVM_VMGEXIT_GUEST_REQUEST;
+
+ rc = snp_send_guest_request(mdesc, &req, &rio);
+ if (rc)
+ goto err_req;
+
+ memcpy(&tsc_resp, buf, sizeof(tsc_resp));
+ pr_debug("%s: response status %x scale %llx offset %llx factor %x\n",
+ __func__, tsc_resp.status, tsc_resp.tsc_scale, tsc_resp.tsc_offset,
+ tsc_resp.tsc_factor);
+
+ if (tsc_resp.status == 0) {
+ snp_tsc_scale = tsc_resp.tsc_scale;
+ snp_tsc_offset = tsc_resp.tsc_offset;
+ } else {
+ pr_err("Failed to get TSC info, response status %x\n", tsc_resp.status);
+ rc = -EIO;
+ }
+
+err_req:
+ /* The response buffer contains sensitive data, explicitly clear it. */
+ memzero_explicit(buf, sizeof(buf));
+ memzero_explicit(&tsc_resp, sizeof(tsc_resp));
+ memzero_explicit(&req, sizeof(req));
+
+ kfree(tsc_req);
+
+ return rc;
+}
+
+void __init snp_secure_tsc_prepare(void)
+{
+ if (!cc_platform_has(CC_ATTR_GUEST_SECURE_TSC))
+ return;
+
+ if (snp_get_tsc_info()) {
+ pr_alert("Unable to retrieve Secure TSC info from ASP\n");
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_SECURE_TSC);
+ }
+
+ pr_debug("SecureTSC enabled\n");
+}
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 0a120d85d7bb..996ca27f0b72 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -94,6 +94,10 @@ void __init mem_encrypt_init(void)
/* Call into SWIOTLB to update the SWIOTLB DMA buffers */
swiotlb_update_mem_attributes();
+ /* Initialize SNP Secure TSC */
+ if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ snp_secure_tsc_prepare();
+
print_mem_encrypt_feature_info();
}
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index 019eca753f85..db1b00db624d 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -64,25 +64,6 @@ static inline struct snp_guest_dev *to_snp_dev(struct file *file)
return container_of(dev, struct snp_guest_dev, misc);
}
-static int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code,
- struct snp_guest_request_ioctl *rio, u8 type,
- void *req_buf, size_t req_sz, void *resp_buf,
- u32 resp_sz)
-{
- struct snp_guest_req req = {
- .msg_version = rio->msg_version,
- .msg_type = type,
- .vmpck_id = mdesc->vmpck_id,
- .req_buf = req_buf,
- .req_sz = req_sz,
- .resp_buf = resp_buf,
- .resp_sz = resp_sz,
- .exit_code = exit_code,
- };
-
- return snp_send_guest_request(mdesc, &req, rio);
-}
-
struct snp_req_resp {
sockptr_t req_data;
sockptr_t resp_data;
--
2.34.1
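The intermediate-buffer pattern used by snp_get_tsc_info() above (decrypt into a scratch buffer sized for the typed response plus the AES-GCM authtag, copy the typed struct out, then wipe the scratch) can be sketched in plain C. The struct layout and AUTHTAG_LEN value below are illustrative assumptions mirroring the patch, not the kernel's exact definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define AUTHTAG_LEN 16 /* assumption: AES-GCM tag length used by the transport */

/* Illustrative response layout; the real snp_tsc_info_resp also carries
 * reserved fields. */
struct tsc_info_resp {
	uint32_t status;
	uint32_t rsvd;
	uint64_t tsc_scale;
	uint64_t tsc_offset;
	uint32_t tsc_factor;
};

/* Copy the typed response out of the scratch buffer, refusing buffers too
 * small to also hold the authtag (mirrors the BUILD_BUG_ON() in the patch). */
static int extract_tsc_resp(uint8_t *buf, size_t buf_sz,
			    struct tsc_info_resp *out)
{
	if (buf_sz < sizeof(*out) + AUTHTAG_LEN)
		return -1;
	memcpy(out, buf, sizeof(*out));
	memset(buf, 0, buf_sz); /* scratch held sensitive data: wipe it */
	return 0;
}
```

The same size check appears twice in the patch, once at compile time (BUILD_BUG_ON) and once implicitly via req.resp_sz, so the decryption path can never scribble past the scratch buffer.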
^ permalink raw reply related [flat|nested] 66+ messages in thread
* [PATCH v11 15/20] x86/sev: Change TSC MSR behavior for Secure TSC enabled guests
2024-07-31 15:07 [PATCH v11 00/20] Add Secure TSC support for SNP guests Nikunj A Dadhania
` (13 preceding siblings ...)
2024-07-31 15:08 ` [PATCH v11 14/20] x86/sev: Add Secure TSC support for SNP guests Nikunj A Dadhania
@ 2024-07-31 15:08 ` Nikunj A Dadhania
2024-07-31 15:08 ` [PATCH v11 16/20] x86/sev: Prevent RDTSC/RDTSCP interception " Nikunj A Dadhania
` (5 subsequent siblings)
20 siblings, 0 replies; 66+ messages in thread
From: Nikunj A Dadhania @ 2024-07-31 15:08 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj
Secure TSC enabled guests should not write to the MSR_IA32_TSC (10H) register, as subsequent TSC value reads would then be undefined. MSR_IA32_TSC read/write
accesses should not exit to the hypervisor for such guests.
Accesses to MSR_IA32_TSC therefore need special handling in the #VC handler
for guests with Secure TSC enabled: writes to MSR_IA32_TSC are ignored, and
reads of MSR_IA32_TSC return the result of the RDTSC instruction.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Peter Gonda <pgonda@google.com>
---
arch/x86/coco/sev/core.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index 9815aa419978..a4c737afce50 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -1335,6 +1335,30 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
return ES_OK;
}
+ /*
+ * TSC related accesses should not exit to the hypervisor when a
+ * guest is executing with SecureTSC enabled, so special handling
+ * is required for accesses of MSR_IA32_TSC:
+ *
+ * Writes: Writing to MSR_IA32_TSC can cause subsequent reads
+ * of the TSC to return undefined values, so ignore all
+ * writes.
+ * Reads: Reads of MSR_IA32_TSC should return the current TSC
+ * value, use the value returned by RDTSC.
+ */
+ if (regs->cx == MSR_IA32_TSC && cc_platform_has(CC_ATTR_GUEST_SECURE_TSC)) {
+ u64 tsc;
+
+ if (exit_info_1)
+ return ES_OK;
+
+ tsc = rdtsc();
+ regs->ax = UINT_MAX & tsc;
+ regs->dx = UINT_MAX & (tsc >> 32);
+
+ return ES_OK;
+ }
+
ghcb_set_rcx(ghcb, regs->cx);
if (exit_info_1) {
ghcb_set_rax(ghcb, regs->ax);
--
2.34.1
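The read path above returns the 64-bit RDTSC result through the guest's EDX:EAX register pair. A minimal sketch of that split (the helper name is hypothetical; the kernel code masks with UINT_MAX inline on regs->ax and regs->dx):

```c
#include <assert.h>
#include <stdint.h>

/* Split a 64-bit TSC value into the two 32-bit halves that the #VC
 * handler writes back into regs->ax (low) and regs->dx (high). */
static void split_tsc(uint64_t tsc, uint32_t *eax, uint32_t *edx)
{
	*eax = (uint32_t)(tsc & 0xffffffffu); /* low 32 bits -> EAX */
	*edx = (uint32_t)(tsc >> 32);         /* high 32 bits -> EDX */
}
```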
* [PATCH v11 16/20] x86/sev: Prevent RDTSC/RDTSCP interception for Secure TSC enabled guests
2024-07-31 15:07 [PATCH v11 00/20] Add Secure TSC support for SNP guests Nikunj A Dadhania
` (14 preceding siblings ...)
2024-07-31 15:08 ` [PATCH v11 15/20] x86/sev: Change TSC MSR behavior for Secure TSC enabled guests Nikunj A Dadhania
@ 2024-07-31 15:08 ` Nikunj A Dadhania
2024-09-13 16:49 ` Tom Lendacky
2024-07-31 15:08 ` [PATCH v11 17/20] x86/sev: Allow Secure TSC feature for SNP guests Nikunj A Dadhania
` (4 subsequent siblings)
20 siblings, 1 reply; 66+ messages in thread
From: Nikunj A Dadhania @ 2024-07-31 15:08 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj
The hypervisor should not intercept RDTSC/RDTSCP when Secure TSC is enabled;
a #VC exception is generated if these instructions are intercepted anyway. If
that occurs while Secure TSC is enabled, terminate guest execution.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Tested-by: Peter Gonda <pgonda@google.com>
---
arch/x86/coco/sev/shared.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/arch/x86/coco/sev/shared.c b/arch/x86/coco/sev/shared.c
index 71de53194089..c2a9e2ada659 100644
--- a/arch/x86/coco/sev/shared.c
+++ b/arch/x86/coco/sev/shared.c
@@ -1140,6 +1140,16 @@ static enum es_result vc_handle_rdtsc(struct ghcb *ghcb,
bool rdtscp = (exit_code == SVM_EXIT_RDTSCP);
enum es_result ret;
+ /*
+ * RDTSC and RDTSCP should not be intercepted when Secure TSC is
+ * enabled. Terminate the SNP guest when the interception is enabled.
+ * This file is included from kernel/sev.c and boot/compressed/sev.c,
+ * use sev_status here as cc_platform_has() is not available when
+ * compiling boot/compressed/sev.c.
+ */
+ if (sev_status & MSR_AMD64_SNP_SECURE_TSC)
+ return ES_VMM_ERROR;
+
ret = sev_es_ghcb_hv_call(ghcb, ctxt, exit_code, 0, 0);
if (ret != ES_OK)
return ret;
--
2.34.1
* [PATCH v11 17/20] x86/sev: Allow Secure TSC feature for SNP guests
2024-07-31 15:07 [PATCH v11 00/20] Add Secure TSC support for SNP guests Nikunj A Dadhania
` (15 preceding siblings ...)
2024-07-31 15:08 ` [PATCH v11 16/20] x86/sev: Prevent RDTSC/RDTSCP interception " Nikunj A Dadhania
@ 2024-07-31 15:08 ` Nikunj A Dadhania
2024-09-13 16:53 ` Tom Lendacky
2024-07-31 15:08 ` [PATCH v11 18/20] x86/sev: Mark Secure TSC as reliable clocksource Nikunj A Dadhania
` (3 subsequent siblings)
20 siblings, 1 reply; 66+ messages in thread
From: Nikunj A Dadhania @ 2024-07-31 15:08 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj
Now that all the required plumbing for enabling the SNP Secure TSC feature is
in place, add Secure TSC to the SNP features present list.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Tested-by: Peter Gonda <pgonda@google.com>
---
arch/x86/boot/compressed/sev.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index cd44e120fe53..bb55934c1cee 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -401,7 +401,8 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
* by the guest kernel. As and when a new feature is implemented in the
* guest kernel, a corresponding bit should be added to the mask.
*/
-#define SNP_FEATURES_PRESENT MSR_AMD64_SNP_DEBUG_SWAP
+#define SNP_FEATURES_PRESENT (MSR_AMD64_SNP_DEBUG_SWAP | \
+ MSR_AMD64_SNP_SECURE_TSC)
u64 snp_get_unsupported_features(u64 status)
{
--
2.34.1
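The effect of widening SNP_FEATURES_PRESENT can be sketched with a simplified mask check: any SEV_STATUS feature bit not covered by the "present" mask must be reported as unsupported, since the guest kernel cannot honor it. The bit positions and the helper below are assumptions for illustration, not the kernel's exact definitions:

```c
#include <assert.h>
#include <stdint.h>

#define SNP_DEBUG_SWAP	(1ULL << 7)	/* assumed MSR_AMD64_SNP_DEBUG_SWAP */
#define SNP_SECURE_TSC	(1ULL << 11)	/* assumed MSR_AMD64_SNP_SECURE_TSC */

/* Feature bits enabled in SEV_STATUS but absent from the guest's
 * "features present" mask are unsupported and cause a rejection. */
static uint64_t unsupported_features(uint64_t sev_status, uint64_t present)
{
	return sev_status & ~present;
}
```

Before this patch, a Secure TSC guest would trip this check at boot; afterwards the SECURE_TSC bit is filtered out by the mask.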
* [PATCH v11 18/20] x86/sev: Mark Secure TSC as reliable clocksource
2024-07-31 15:07 [PATCH v11 00/20] Add Secure TSC support for SNP guests Nikunj A Dadhania
` (16 preceding siblings ...)
2024-07-31 15:08 ` [PATCH v11 17/20] x86/sev: Allow Secure TSC feature for SNP guests Nikunj A Dadhania
@ 2024-07-31 15:08 ` Nikunj A Dadhania
2024-09-13 16:59 ` Tom Lendacky
2024-07-31 15:08 ` [PATCH v11 19/20] x86/kvmclock: Skip kvmclock when Secure TSC is available Nikunj A Dadhania
` (2 subsequent siblings)
20 siblings, 1 reply; 66+ messages in thread
From: Nikunj A Dadhania @ 2024-07-31 15:08 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj
In an SNP guest environment with Secure TSC enabled, unlike other clock
sources (such as HPET, ACPI timer, APIC, etc.), the RDTSC instruction is
handled without causing a VM exit, resulting in minimal overhead and
jitter. Hence, mark Secure TSC as the only reliable clock source,
bypassing unstable calibration.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Tested-by: Peter Gonda <pgonda@google.com>
---
arch/x86/mm/mem_encrypt_amd.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
index 86a476a426c2..e9fb5f24703a 100644
--- a/arch/x86/mm/mem_encrypt_amd.c
+++ b/arch/x86/mm/mem_encrypt_amd.c
@@ -516,6 +516,10 @@ void __init sme_early_init(void)
* kernel mapped.
*/
snp_update_svsm_ca();
+
+ /* Mark the TSC as reliable when Secure TSC is enabled */
+ if (sev_status & MSR_AMD64_SNP_SECURE_TSC)
+ setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
}
void __init mem_encrypt_free_decrypted_mem(void)
--
2.34.1
* [PATCH v11 19/20] x86/kvmclock: Skip kvmclock when Secure TSC is available
2024-07-31 15:07 [PATCH v11 00/20] Add Secure TSC support for SNP guests Nikunj A Dadhania
` (17 preceding siblings ...)
2024-07-31 15:08 ` [PATCH v11 18/20] x86/sev: Mark Secure TSC as reliable clocksource Nikunj A Dadhania
@ 2024-07-31 15:08 ` Nikunj A Dadhania
2024-09-13 17:19 ` Tom Lendacky
2024-09-13 17:30 ` Sean Christopherson
2024-07-31 15:08 ` [PATCH v11 20/20] x86/cpu/amd: Do not print FW_BUG for Secure TSC Nikunj A Dadhania
2024-08-14 4:14 ` [PATCH v11 00/20] Add Secure TSC support for SNP guests Nikunj A. Dadhania
20 siblings, 2 replies; 66+ messages in thread
From: Nikunj A Dadhania @ 2024-07-31 15:08 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj
For AMD SNP guests with Secure TSC enabled, kvm-clock is momentarily picked
up as the clocksource instead of the more stable TSC:
[ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[ 0.000001] kvm-clock: using sched offset of 1799357702246960 cycles
[ 0.001493] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[ 0.006289] tsc: Detected 1996.249 MHz processor
[ 0.305123] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x398cadd9d93, max_idle_ns: 881590552906 ns
[ 1.045759] clocksource: Switched to clocksource kvm-clock
[ 1.141326] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x398cadd9d93, max_idle_ns: 881590552906 ns
[ 1.144634] clocksource: Switched to clocksource tsc
When Secure TSC is enabled, skip using kvmclock. The guest kernel will
fall back to the Secure TSC based clocksource.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Tested-by: Peter Gonda <pgonda@google.com>
---
arch/x86/kernel/kvmclock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 5b2c15214a6b..3d03b4c937b9 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -289,7 +289,7 @@ void __init kvmclock_init(void)
{
u8 flags;
- if (!kvm_para_available() || !kvmclock)
+ if (!kvm_para_available() || !kvmclock || cc_platform_has(CC_ATTR_GUEST_SECURE_TSC))
return;
if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE2)) {
--
2.34.1
* [PATCH v11 20/20] x86/cpu/amd: Do not print FW_BUG for Secure TSC
2024-07-31 15:07 [PATCH v11 00/20] Add Secure TSC support for SNP guests Nikunj A Dadhania
` (18 preceding siblings ...)
2024-07-31 15:08 ` [PATCH v11 19/20] x86/kvmclock: Skip kvmclock when Secure TSC is available Nikunj A Dadhania
@ 2024-07-31 15:08 ` Nikunj A Dadhania
2024-09-13 17:21 ` Tom Lendacky
2024-09-13 17:42 ` Jim Mattson
2024-08-14 4:14 ` [PATCH v11 00/20] Add Secure TSC support for SNP guests Nikunj A. Dadhania
20 siblings, 2 replies; 66+ messages in thread
From: Nikunj A Dadhania @ 2024-07-31 15:08 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj
When Secure TSC is enabled and TscInvariant (bit 8) in CPUID_8000_0007_edx
is set, the kernel complains with the below firmware bug:
[Firmware Bug]: TSC doesn't count with P0 frequency!
Secure TSC does not need to run at P0 frequency; the TSC frequency is set
by the VMM as part of the SNP_LAUNCH_START command. Skip this check when
Secure TSC is enabled.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Tested-by: Peter Gonda <pgonda@google.com>
---
arch/x86/kernel/cpu/amd.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index be5889bded49..87b55d2183a0 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -370,7 +370,8 @@ static void bsp_determine_snp(struct cpuinfo_x86 *c)
static void bsp_init_amd(struct cpuinfo_x86 *c)
{
- if (cpu_has(c, X86_FEATURE_CONSTANT_TSC)) {
+ if (cpu_has(c, X86_FEATURE_CONSTANT_TSC) &&
+ !cc_platform_has(CC_ATTR_GUEST_SECURE_TSC)) {
if (c->x86 > 0x10 ||
(c->x86 == 0x10 && c->x86_model >= 0x2)) {
--
2.34.1
* Re: [PATCH v11 00/20] Add Secure TSC support for SNP guests
2024-07-31 15:07 [PATCH v11 00/20] Add Secure TSC support for SNP guests Nikunj A Dadhania
` (19 preceding siblings ...)
2024-07-31 15:08 ` [PATCH v11 20/20] x86/cpu/amd: Do not print FW_BUG for Secure TSC Nikunj A Dadhania
@ 2024-08-14 4:14 ` Nikunj A. Dadhania
20 siblings, 0 replies; 66+ messages in thread
From: Nikunj A. Dadhania @ 2024-08-14 4:14 UTC (permalink / raw)
To: linux-kernel, thomas.lendacky, bp, x86
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, kvm
On 7/31/2024 8:37 PM, Nikunj A Dadhania wrote:
> This patchset is also available at:
>
> https://github.com/AMDESE/linux-kvm/tree/sectsc-guest-latest
>
> and is based on v6.11-rc1
>
> Overview
> --------
>
> Secure TSC allows guests to securely use RDTSC/RDTSCP instructions as the
> parameters being used cannot be changed by hypervisor once the guest is
> launched. More details in the AMD64 APM Vol 2, Section "Secure TSC".
>
> In order to enable secure TSC, SEV-SNP guests need to send TSC_INFO guest
> message before the APs are booted. Details from the TSC_INFO response will
> then be used to program the VMSA before the APs are brought up. See "SEV
> Secure Nested Paging Firmware ABI Specification" document (currently at
> https://www.amd.com/system/files/TechDocs/56860.pdf) section "TSC Info"
>
> SEV-guest driver has the implementation for guest and AMD Security
> Processor communication. As the TSC_INFO needs to be initialized during
> early boot before APs are started, move the guest messaging code from
> sev-guest driver to sev/core.c and provide well defined APIs to the
> sev-guest driver.
>
> Patches:
> 01-04: sev-guest driver cleanup and enhancements
> 05: Use AES GCM library
> 06-07: SNP init error handling and cache secrets page address
> 08-10: Preparatory patches for code movement
> 11-12: Patches moving SNP guest messaging code from SEV guest driver to
> SEV common code
> 13-20: SecureTSC enablement patches.
>
> Testing SecureTSC
> -----------------
>
> SecureTSC hypervisor patches based on top of SEV-SNP Guest MEMFD series:
> https://github.com/AMDESE/linux-kvm/tree/sectsc-host-latest
>
> QEMU changes:
> https://github.com/nikunjad/qemu/tree/snp-securetsc-latest
>
> QEMU commandline SEV-SNP with SecureTSC:
>
> qemu-system-x86_64 -cpu EPYC-Milan-v2 -smp 4 \
> -object memory-backend-memfd,id=ram1,size=1G,share=true,prealloc=false,reserve=false \
> -object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1,secure-tsc=on \
> -machine q35,confidential-guest-support=sev0,memory-backend=ram1 \
> ...
>
> Changelog:
> ----------
> v11:
> * Rebased on top of v6.11-rc1
> * Added Acked-by/Reviewed-by
> * Moved SEV Guest driver cleanups in the beginning of the series
> * Commit message updates
> * Enforced PAGE_SIZE constraints for snp_guest_msg
> * After offline discussion with Boris, redesigned and exported
> SEV guest messaging APIs to sev-guest driver
> * Dropped VMPCK rework patches
> * Make sure movement of SEV core routines does not break the SEV Guest
> driver midway of the series.
>
A gentle reminder.
Regards
Nikunj
* [tip: x86/sev] virt: sev-guest: Ensure the SNP guest messages do not exceed a page
2024-07-31 15:07 ` [PATCH v11 04/20] virt: sev-guest: Ensure the SNP guest messages do not exceed a page Nikunj A Dadhania
@ 2024-08-27 8:48 ` tip-bot2 for Nikunj A Dadhania
0 siblings, 0 replies; 66+ messages in thread
From: tip-bot2 for Nikunj A Dadhania @ 2024-08-27 8:48 UTC (permalink / raw)
To: linux-tip-commits
Cc: Tom Lendacky, Nikunj A Dadhania, Borislav Petkov (AMD), x86,
linux-kernel
The following commit has been merged into the x86/sev branch of tip:
Commit-ID: 2b9ac0b84c2cae91bbaceab62df4de6d503421ec
Gitweb: https://git.kernel.org/tip/2b9ac0b84c2cae91bbaceab62df4de6d503421ec
Author: Nikunj A Dadhania <nikunj@amd.com>
AuthorDate: Wed, 31 Jul 2024 20:37:55 +05:30
Committer: Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Tue, 27 Aug 2024 10:35:38 +02:00
virt: sev-guest: Ensure the SNP guest messages do not exceed a page
Currently, struct snp_guest_msg includes a message header (96 bytes) and
a payload (4000 bytes). There is an implicit assumption here that the
SNP message header will always be 96 bytes, and with that assumption the
payload array size has been set to 4000 bytes - a magic number. If any
new member is added to the SNP message header, the SNP guest message
will span more than a page.
Instead of using a magic number for the payload, declare struct
snp_guest_msg in a way that payload plus the message header do not
exceed a page.
[ bp: Massage. ]
Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240731150811.156771-5-nikunj@amd.com
---
arch/x86/include/asm/sev.h | 2 +-
drivers/virt/coco/sev-guest/sev-guest.c | 2 ++
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 79bbe2b..ee34ab0 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -164,7 +164,7 @@ struct snp_guest_msg_hdr {
struct snp_guest_msg {
struct snp_guest_msg_hdr hdr;
- u8 payload[4000];
+ u8 payload[PAGE_SIZE - sizeof(struct snp_guest_msg_hdr)];
} __packed;
struct sev_guest_platform_data {
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index 3b76cbf..89754b0 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -1092,6 +1092,8 @@ static int __init sev_guest_probe(struct platform_device *pdev)
void __iomem *mapping;
int ret;
+ BUILD_BUG_ON(sizeof(struct snp_guest_msg) > PAGE_SIZE);
+
if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
return -ENODEV;
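The sizing rule this patch introduces, deriving the payload size from the page size rather than hard-coding 4000, can be checked at compile time in a standalone sketch. The 96-byte header size is taken from the commit message; the structs below are stand-ins, not the kernel's snp_guest_msg definitions:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SZ 4096

/* Stand-in for the 96-byte snp_guest_msg_hdr described in the commit. */
struct guest_msg_hdr {
	uint8_t bytes[96];
};

/* Payload sized so that header + payload exactly fill one page; adding a
 * member to the header now shrinks the payload instead of overflowing. */
struct guest_msg {
	struct guest_msg_hdr hdr;
	uint8_t payload[PAGE_SZ - sizeof(struct guest_msg_hdr)];
};

/* Compile-time equivalent of the BUILD_BUG_ON() added in sev_guest_probe(). */
_Static_assert(sizeof(struct guest_msg) == PAGE_SZ,
	       "guest message must fit exactly in a page");
```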
* [tip: x86/sev] virt: sev-guest: Fix user-visible strings
2024-07-31 15:07 ` [PATCH v11 03/20] virt: sev-guest: Fix user-visible strings Nikunj A Dadhania
@ 2024-08-27 8:48 ` tip-bot2 for Nikunj A Dadhania
2024-09-13 17:26 ` [PATCH v11 03/20] " Tom Lendacky
1 sibling, 0 replies; 66+ messages in thread
From: tip-bot2 for Nikunj A Dadhania @ 2024-08-27 8:48 UTC (permalink / raw)
To: linux-tip-commits
Cc: Nikunj A Dadhania, Borislav Petkov (AMD), x86, linux-kernel
The following commit has been merged into the x86/sev branch of tip:
Commit-ID: 5f7c38f81df206b370d97a827251bd4bc50ff46b
Gitweb: https://git.kernel.org/tip/5f7c38f81df206b370d97a827251bd4bc50ff46b
Author: Nikunj A Dadhania <nikunj@amd.com>
AuthorDate: Wed, 31 Jul 2024 20:37:54 +05:30
Committer: Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Tue, 27 Aug 2024 10:35:06 +02:00
virt: sev-guest: Fix user-visible strings
User-visible abbreviations should be in capitals; ensure messages are
readable and clear.
No functional change.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240731150811.156771-4-nikunj@amd.com
---
drivers/virt/coco/sev-guest/sev-guest.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index a72fe1e..3b76cbf 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -114,7 +114,7 @@ static bool is_vmpck_empty(struct snp_guest_dev *snp_dev)
*/
static void snp_disable_vmpck(struct snp_guest_dev *snp_dev)
{
- dev_alert(snp_dev->dev, "Disabling vmpck_id %d to prevent IV reuse.\n",
+ dev_alert(snp_dev->dev, "Disabling VMPCK%d communication key to prevent IV reuse.\n",
vmpck_id);
memzero_explicit(snp_dev->vmpck, VMPCK_KEY_LEN);
snp_dev->vmpck = NULL;
@@ -1117,13 +1117,13 @@ static int __init sev_guest_probe(struct platform_device *pdev)
ret = -EINVAL;
snp_dev->vmpck = get_vmpck(vmpck_id, secrets, &snp_dev->os_area_msg_seqno);
if (!snp_dev->vmpck) {
- dev_err(dev, "invalid vmpck id %d\n", vmpck_id);
+ dev_err(dev, "Invalid VMPCK%d communication key\n", vmpck_id);
goto e_unmap;
}
/* Verify that VMPCK is not zero. */
if (is_vmpck_empty(snp_dev)) {
- dev_err(dev, "vmpck id %d is null\n", vmpck_id);
+ dev_err(dev, "Empty VMPCK%d communication key\n", vmpck_id);
goto e_unmap;
}
@@ -1174,7 +1174,7 @@ static int __init sev_guest_probe(struct platform_device *pdev)
if (ret)
goto e_free_cert_data;
- dev_info(dev, "Initialized SEV guest driver (using vmpck_id %d)\n", vmpck_id);
+ dev_info(dev, "Initialized SEV guest driver (using VMPCK%d communication key)\n", vmpck_id);
return 0;
e_free_cert_data:
* [tip: x86/sev] virt: sev-guest: Rename local guest message variables
2024-07-31 15:07 ` [PATCH v11 02/20] virt: sev-guest: Rename local guest message variables Nikunj A Dadhania
@ 2024-08-27 8:48 ` tip-bot2 for Nikunj A Dadhania
2024-09-13 17:22 ` [PATCH v11 02/20] " Tom Lendacky
1 sibling, 0 replies; 66+ messages in thread
From: tip-bot2 for Nikunj A Dadhania @ 2024-08-27 8:48 UTC (permalink / raw)
To: linux-tip-commits
Cc: Nikunj A Dadhania, Borislav Petkov (AMD), x86, linux-kernel
The following commit has been merged into the x86/sev branch of tip:
Commit-ID: a1bbb2236bb97c0afee4cdf8fd732ff5f9cd60ac
Gitweb: https://git.kernel.org/tip/a1bbb2236bb97c0afee4cdf8fd732ff5f9cd60ac
Author: Nikunj A Dadhania <nikunj@amd.com>
AuthorDate: Wed, 31 Jul 2024 20:37:53 +05:30
Committer: Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Tue, 27 Aug 2024 10:34:41 +02:00
virt: sev-guest: Rename local guest message variables
Rename local guest message variables for more clarity.
No functional change.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240731150811.156771-3-nikunj@amd.com
---
drivers/virt/coco/sev-guest/sev-guest.c | 117 +++++++++++------------
1 file changed, 59 insertions(+), 58 deletions(-)
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index 7d343f2..a72fe1e 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -291,45 +291,45 @@ static int dec_payload(struct snp_guest_dev *snp_dev, struct snp_guest_msg *msg,
static int verify_and_dec_payload(struct snp_guest_dev *snp_dev, void *payload, u32 sz)
{
struct snp_guest_crypto *crypto = snp_dev->crypto;
- struct snp_guest_msg *resp = &snp_dev->secret_response;
- struct snp_guest_msg *req = &snp_dev->secret_request;
- struct snp_guest_msg_hdr *req_hdr = &req->hdr;
- struct snp_guest_msg_hdr *resp_hdr = &resp->hdr;
+ struct snp_guest_msg *resp_msg = &snp_dev->secret_response;
+ struct snp_guest_msg *req_msg = &snp_dev->secret_request;
+ struct snp_guest_msg_hdr *req_msg_hdr = &req_msg->hdr;
+ struct snp_guest_msg_hdr *resp_msg_hdr = &resp_msg->hdr;
pr_debug("response [seqno %lld type %d version %d sz %d]\n",
- resp_hdr->msg_seqno, resp_hdr->msg_type, resp_hdr->msg_version,
- resp_hdr->msg_sz);
+ resp_msg_hdr->msg_seqno, resp_msg_hdr->msg_type, resp_msg_hdr->msg_version,
+ resp_msg_hdr->msg_sz);
/* Copy response from shared memory to encrypted memory. */
- memcpy(resp, snp_dev->response, sizeof(*resp));
+ memcpy(resp_msg, snp_dev->response, sizeof(*resp_msg));
/* Verify that the sequence counter is incremented by 1 */
- if (unlikely(resp_hdr->msg_seqno != (req_hdr->msg_seqno + 1)))
+ if (unlikely(resp_msg_hdr->msg_seqno != (req_msg_hdr->msg_seqno + 1)))
return -EBADMSG;
/* Verify response message type and version number. */
- if (resp_hdr->msg_type != (req_hdr->msg_type + 1) ||
- resp_hdr->msg_version != req_hdr->msg_version)
+ if (resp_msg_hdr->msg_type != (req_msg_hdr->msg_type + 1) ||
+ resp_msg_hdr->msg_version != req_msg_hdr->msg_version)
return -EBADMSG;
/*
* If the message size is greater than our buffer length then return
* an error.
*/
- if (unlikely((resp_hdr->msg_sz + crypto->a_len) > sz))
+ if (unlikely((resp_msg_hdr->msg_sz + crypto->a_len) > sz))
return -EBADMSG;
/* Decrypt the payload */
- return dec_payload(snp_dev, resp, payload, resp_hdr->msg_sz + crypto->a_len);
+ return dec_payload(snp_dev, resp_msg, payload, resp_msg_hdr->msg_sz + crypto->a_len);
}
static int enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, int version, u8 type,
void *payload, size_t sz)
{
- struct snp_guest_msg *req = &snp_dev->secret_request;
- struct snp_guest_msg_hdr *hdr = &req->hdr;
+ struct snp_guest_msg *msg = &snp_dev->secret_request;
+ struct snp_guest_msg_hdr *hdr = &msg->hdr;
- memset(req, 0, sizeof(*req));
+ memset(msg, 0, sizeof(*msg));
hdr->algo = SNP_AEAD_AES_256_GCM;
hdr->hdr_version = MSG_HDR_VER;
@@ -347,7 +347,7 @@ static int enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, int version, u8
pr_debug("request [seqno %lld type %d version %d sz %d]\n",
hdr->msg_seqno, hdr->msg_type, hdr->msg_version, hdr->msg_sz);
- return __enc_payload(snp_dev, req, payload, sz);
+ return __enc_payload(snp_dev, msg, payload, sz);
}
static int __handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
@@ -496,8 +496,8 @@ struct snp_req_resp {
static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
{
struct snp_guest_crypto *crypto = snp_dev->crypto;
- struct snp_report_req *req = &snp_dev->req.report;
- struct snp_report_resp *resp;
+ struct snp_report_req *report_req = &snp_dev->req.report;
+ struct snp_report_resp *report_resp;
int rc, resp_len;
lockdep_assert_held(&snp_cmd_mutex);
@@ -505,7 +505,7 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_io
if (!arg->req_data || !arg->resp_data)
return -EINVAL;
- if (copy_from_user(req, (void __user *)arg->req_data, sizeof(*req)))
+ if (copy_from_user(report_req, (void __user *)arg->req_data, sizeof(*report_req)))
return -EFAULT;
/*
@@ -513,30 +513,29 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_io
* response payload. Make sure that it has enough space to cover the
* authtag.
*/
- resp_len = sizeof(resp->data) + crypto->a_len;
- resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
- if (!resp)
+ resp_len = sizeof(report_resp->data) + crypto->a_len;
+ report_resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
+ if (!report_resp)
return -ENOMEM;
- rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg,
- SNP_MSG_REPORT_REQ, req, sizeof(*req), resp->data,
- resp_len);
+ rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg, SNP_MSG_REPORT_REQ,
+ report_req, sizeof(*report_req), report_resp->data, resp_len);
if (rc)
goto e_free;
- if (copy_to_user((void __user *)arg->resp_data, resp, sizeof(*resp)))
+ if (copy_to_user((void __user *)arg->resp_data, report_resp, sizeof(*report_resp)))
rc = -EFAULT;
e_free:
- kfree(resp);
+ kfree(report_resp);
return rc;
}
static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
{
- struct snp_derived_key_req *req = &snp_dev->req.derived_key;
+ struct snp_derived_key_req *derived_key_req = &snp_dev->req.derived_key;
struct snp_guest_crypto *crypto = snp_dev->crypto;
- struct snp_derived_key_resp resp = {0};
+ struct snp_derived_key_resp derived_key_resp = {0};
int rc, resp_len;
/* Response data is 64 bytes and max authsize for GCM is 16 bytes. */
u8 buf[64 + 16];
@@ -551,25 +550,27 @@ static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_reque
* response payload. Make sure that it has enough space to cover the
* authtag.
*/
- resp_len = sizeof(resp.data) + crypto->a_len;
+ resp_len = sizeof(derived_key_resp.data) + crypto->a_len;
if (sizeof(buf) < resp_len)
return -ENOMEM;
- if (copy_from_user(req, (void __user *)arg->req_data, sizeof(*req)))
+ if (copy_from_user(derived_key_req, (void __user *)arg->req_data,
+ sizeof(*derived_key_req)))
return -EFAULT;
- rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg,
- SNP_MSG_KEY_REQ, req, sizeof(*req), buf, resp_len);
+ rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg, SNP_MSG_KEY_REQ,
+ derived_key_req, sizeof(*derived_key_req), buf, resp_len);
if (rc)
return rc;
- memcpy(resp.data, buf, sizeof(resp.data));
- if (copy_to_user((void __user *)arg->resp_data, &resp, sizeof(resp)))
+ memcpy(derived_key_resp.data, buf, sizeof(derived_key_resp.data));
+ if (copy_to_user((void __user *)arg->resp_data, &derived_key_resp,
+ sizeof(derived_key_resp)))
rc = -EFAULT;
/* The response buffer contains the sensitive data, explicitly clear it. */
memzero_explicit(buf, sizeof(buf));
- memzero_explicit(&resp, sizeof(resp));
+ memzero_explicit(&derived_key_resp, sizeof(derived_key_resp));
return rc;
}
@@ -577,9 +578,9 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
struct snp_req_resp *io)
{
- struct snp_ext_report_req *req = &snp_dev->req.ext_report;
+ struct snp_ext_report_req *report_req = &snp_dev->req.ext_report;
struct snp_guest_crypto *crypto = snp_dev->crypto;
- struct snp_report_resp *resp;
+ struct snp_report_resp *report_resp;
int ret, npages = 0, resp_len;
sockptr_t certs_address;
@@ -588,22 +589,22 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
if (sockptr_is_null(io->req_data) || sockptr_is_null(io->resp_data))
return -EINVAL;
- if (copy_from_sockptr(req, io->req_data, sizeof(*req)))
+ if (copy_from_sockptr(report_req, io->req_data, sizeof(*report_req)))
return -EFAULT;
/* caller does not want certificate data */
- if (!req->certs_len || !req->certs_address)
+ if (!report_req->certs_len || !report_req->certs_address)
goto cmd;
- if (req->certs_len > SEV_FW_BLOB_MAX_SIZE ||
- !IS_ALIGNED(req->certs_len, PAGE_SIZE))
+ if (report_req->certs_len > SEV_FW_BLOB_MAX_SIZE ||
+ !IS_ALIGNED(report_req->certs_len, PAGE_SIZE))
return -EINVAL;
if (sockptr_is_kernel(io->resp_data)) {
- certs_address = KERNEL_SOCKPTR((void *)req->certs_address);
+ certs_address = KERNEL_SOCKPTR((void *)report_req->certs_address);
} else {
- certs_address = USER_SOCKPTR((void __user *)req->certs_address);
- if (!access_ok(certs_address.user, req->certs_len))
+ certs_address = USER_SOCKPTR((void __user *)report_req->certs_address);
+ if (!access_ok(certs_address.user, report_req->certs_len))
return -EFAULT;
}
@@ -613,45 +614,45 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
* the host. If host does not supply any certs in it, then copy
* zeros to indicate that certificate data was not provided.
*/
- memset(snp_dev->certs_data, 0, req->certs_len);
- npages = req->certs_len >> PAGE_SHIFT;
+ memset(snp_dev->certs_data, 0, report_req->certs_len);
+ npages = report_req->certs_len >> PAGE_SHIFT;
cmd:
/*
* The intermediate response buffer is used while decrypting the
* response payload. Make sure that it has enough space to cover the
* authtag.
*/
- resp_len = sizeof(resp->data) + crypto->a_len;
- resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
- if (!resp)
+ resp_len = sizeof(report_resp->data) + crypto->a_len;
+ report_resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
+ if (!report_resp)
return -ENOMEM;
snp_dev->input.data_npages = npages;
- ret = handle_guest_request(snp_dev, SVM_VMGEXIT_EXT_GUEST_REQUEST, arg,
- SNP_MSG_REPORT_REQ, &req->data,
- sizeof(req->data), resp->data, resp_len);
+ ret = handle_guest_request(snp_dev, SVM_VMGEXIT_EXT_GUEST_REQUEST, arg, SNP_MSG_REPORT_REQ,
+ &report_req->data, sizeof(report_req->data),
+ report_resp->data, resp_len);
/* If certs length is invalid then copy the returned length */
if (arg->vmm_error == SNP_GUEST_VMM_ERR_INVALID_LEN) {
- req->certs_len = snp_dev->input.data_npages << PAGE_SHIFT;
+ report_req->certs_len = snp_dev->input.data_npages << PAGE_SHIFT;
- if (copy_to_sockptr(io->req_data, req, sizeof(*req)))
+ if (copy_to_sockptr(io->req_data, report_req, sizeof(*report_req)))
ret = -EFAULT;
}
if (ret)
goto e_free;
- if (npages && copy_to_sockptr(certs_address, snp_dev->certs_data, req->certs_len)) {
+ if (npages && copy_to_sockptr(certs_address, snp_dev->certs_data, report_req->certs_len)) {
ret = -EFAULT;
goto e_free;
}
- if (copy_to_sockptr(io->resp_data, resp, sizeof(*resp)))
+ if (copy_to_sockptr(io->resp_data, report_resp, sizeof(*report_resp)))
ret = -EFAULT;
e_free:
- kfree(resp);
+ kfree(report_resp);
return ret;
}
^ permalink raw reply related [flat|nested] 66+ messages in thread
* [tip: x86/sev] virt: sev-guest: Replace dev_dbg() with pr_debug()
2024-07-31 15:07 ` [PATCH v11 01/20] virt: sev-guest: Replace dev_dbg with pr_debug Nikunj A Dadhania
@ 2024-08-27 8:48 ` tip-bot2 for Nikunj A Dadhania
0 siblings, 0 replies; 66+ messages in thread
From: tip-bot2 for Nikunj A Dadhania @ 2024-08-27 8:48 UTC (permalink / raw)
To: linux-tip-commits
Cc: Nikunj A Dadhania, Borislav Petkov (AMD), Tom Lendacky,
Peter Gonda, x86, linux-kernel
The following commit has been merged into the x86/sev branch of tip:
Commit-ID: dc6d20b900b72bea89ebd8154ba9bde1029f330b
Gitweb: https://git.kernel.org/tip/dc6d20b900b72bea89ebd8154ba9bde1029f330b
Author: Nikunj A Dadhania <nikunj@amd.com>
AuthorDate: Wed, 31 Jul 2024 20:37:52 +05:30
Committer: Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Tue, 27 Aug 2024 10:22:20 +02:00
virt: sev-guest: Replace dev_dbg() with pr_debug()
In preparation for moving code to arch/x86/coco/sev/core.c, replace
dev_dbg with pr_debug.
No functional change.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Peter Gonda <pgonda@google.com>
Link: https://lore.kernel.org/r/20240731150811.156771-2-nikunj@amd.com
---
drivers/virt/coco/sev-guest/sev-guest.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index 6fc7884..7d343f2 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -296,8 +296,9 @@ static int verify_and_dec_payload(struct snp_guest_dev *snp_dev, void *payload,
struct snp_guest_msg_hdr *req_hdr = &req->hdr;
struct snp_guest_msg_hdr *resp_hdr = &resp->hdr;
- dev_dbg(snp_dev->dev, "response [seqno %lld type %d version %d sz %d]\n",
- resp_hdr->msg_seqno, resp_hdr->msg_type, resp_hdr->msg_version, resp_hdr->msg_sz);
+ pr_debug("response [seqno %lld type %d version %d sz %d]\n",
+ resp_hdr->msg_seqno, resp_hdr->msg_type, resp_hdr->msg_version,
+ resp_hdr->msg_sz);
/* Copy response from shared memory to encrypted memory. */
memcpy(resp, snp_dev->response, sizeof(*resp));
@@ -343,8 +344,8 @@ static int enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, int version, u8
if (!hdr->msg_seqno)
return -ENOSR;
- dev_dbg(snp_dev->dev, "request [seqno %lld type %d version %d sz %d]\n",
- hdr->msg_seqno, hdr->msg_type, hdr->msg_version, hdr->msg_sz);
+ pr_debug("request [seqno %lld type %d version %d sz %d]\n",
+ hdr->msg_seqno, hdr->msg_type, hdr->msg_version, hdr->msg_sz);
return __enc_payload(snp_dev, req, payload, sz);
}
* Re: [PATCH v11 06/20] x86/sev: Handle failures from snp_init()
2024-07-31 15:07 ` [PATCH v11 06/20] x86/sev: Handle failures from snp_init() Nikunj A Dadhania
@ 2024-08-27 11:32 ` Borislav Petkov
2024-08-28 4:47 ` Nikunj A. Dadhania
0 siblings, 1 reply; 66+ messages in thread
From: Borislav Petkov @ 2024-08-27 11:32 UTC (permalink / raw)
To: Nikunj A Dadhania
Cc: linux-kernel, thomas.lendacky, x86, kvm, mingo, tglx, dave.hansen,
pgonda, seanjc, pbonzini
On Wed, Jul 31, 2024 at 08:37:57PM +0530, Nikunj A Dadhania wrote:
> Address the ignored failures from snp_init() in sme_enable(). Add error
> handling for scenarios where snp_init() fails to retrieve the SEV-SNP CC
> blob or encounters issues while parsing the CC blob.
Is this a real issue you've encountered or?
> This change ensures
Avoid having "This patch" or "This commit" or "This <whatever>" in the commit
message. It is tautologically useless.
Also, do
$ git grep 'This patch' Documentation/process
for more details.
> that SNP guests will error out early, preventing delayed error reporting or
> undefined behavior.
>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> ---
> arch/x86/mm/mem_encrypt_identity.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
> index ac33b2263a43..e83b363c5e68 100644
> --- a/arch/x86/mm/mem_encrypt_identity.c
> +++ b/arch/x86/mm/mem_encrypt_identity.c
> @@ -535,6 +535,13 @@ void __head sme_enable(struct boot_params *bp)
> if (snp && !(msr & MSR_AMD64_SEV_SNP_ENABLED))
> snp_abort();
>
> + /*
> + * The SEV-SNP CC blob should be present and parsing CC blob should
> + * succeed when SEV-SNP is enabled.
> + */
> + if (!snp && (msr & MSR_AMD64_SEV_SNP_ENABLED))
> + snp_abort();
Any chance you could combine the above and this test?
Perhaps look around at the code before adding your check - there might be some
opportunity for aggregation and improvement...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v11 06/20] x86/sev: Handle failures from snp_init()
2024-08-27 11:32 ` Borislav Petkov
@ 2024-08-28 4:47 ` Nikunj A. Dadhania
2024-08-28 9:49 ` Borislav Petkov
0 siblings, 1 reply; 66+ messages in thread
From: Nikunj A. Dadhania @ 2024-08-28 4:47 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-kernel, thomas.lendacky, x86, kvm, mingo, tglx, dave.hansen,
pgonda, seanjc, pbonzini
Hi Boris,
On 8/27/2024 5:02 PM, Borislav Petkov wrote:
> On Wed, Jul 31, 2024 at 08:37:57PM +0530, Nikunj A Dadhania wrote:
>> Address the ignored failures from snp_init() in sme_enable(). Add error
>> handling for scenarios where snp_init() fails to retrieve the SEV-SNP CC
>> blob or encounters issues while parsing the CC blob.
>
> Is this a real issue you've encountered or?
As per your comment [1], you suggested erroring out early in snp_init()
instead of waiting until snp_init_platform_device(). As snp_init() was
ignoring the failure case, I have added this patch. The following patch adds
secrets page parsing from the CC blob; when the parsing fails, snp_init() will
return failure.
>
>> This change ensures
>
> Avoid having "This patch" or "This commit" or "This <whatever>" in the commit
> message. It is tautologically useless.
Sure, will do.
>> diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
>> index ac33b2263a43..e83b363c5e68 100644
>> --- a/arch/x86/mm/mem_encrypt_identity.c
>> +++ b/arch/x86/mm/mem_encrypt_identity.c
>> @@ -535,6 +535,13 @@ void __head sme_enable(struct boot_params *bp)
>> if (snp && !(msr & MSR_AMD64_SEV_SNP_ENABLED))
>> snp_abort();
>>
>> + /*
>> + * The SEV-SNP CC blob should be present and parsing CC blob should
>> + * succeed when SEV-SNP is enabled.
>> + */
>> + if (!snp && (msr & MSR_AMD64_SEV_SNP_ENABLED))
>> + snp_abort();
>
> Any chance you could combine the above and this test?
>
> Perhaps look around at the code before adding your check - there might be some
> opportunity for aggregation and improvement...
Sure, how about the below patch?
From: Nikunj A Dadhania <nikunj@amd.com>
Date: Wed, 22 May 2024 12:43:42 +0530
Subject: [PATCH] x86/sev: Handle failures from snp_init()
Address the ignored failures from snp_init() in sme_enable(). Add error
handling for scenarios where snp_init() fails to retrieve the SEV-SNP CC
blob or encounters issues while parsing the CC blob. Ensure that SNP guests
will error out early, preventing delayed error reporting or undefined
behavior.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
arch/x86/mm/mem_encrypt_identity.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index ac33b2263a43..a0124a479972 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -495,7 +495,7 @@ void __head sme_enable(struct boot_params *bp)
unsigned int eax, ebx, ecx, edx;
unsigned long feature_mask;
unsigned long me_mask;
- bool snp;
+ bool snp, snp_enabled;
u64 msr;
snp = snp_init(bp);
@@ -529,10 +529,17 @@ void __head sme_enable(struct boot_params *bp)
/* Check the SEV MSR whether SEV or SME is enabled */
RIP_REL_REF(sev_status) = msr = __rdmsr(MSR_AMD64_SEV);
- feature_mask = (msr & MSR_AMD64_SEV_ENABLED) ? AMD_SEV_BIT : AMD_SME_BIT;
+ snp_enabled = msr & MSR_AMD64_SEV_SNP_ENABLED;
+ feature_mask = snp_enabled ? AMD_SEV_BIT : AMD_SME_BIT;
- /* The SEV-SNP CC blob should never be present unless SEV-SNP is enabled. */
- if (snp && !(msr & MSR_AMD64_SEV_SNP_ENABLED))
+ /*
+ * The SEV-SNP CC blob should never be present unless SEV-SNP is enabled.
+ *
+ * The SEV-SNP CC blob should be present and parsing CC blob should
+ * succeed when SEV-SNP is enabled.
+ */
+ if ((snp && !snp_enabled) ||
+ (!snp && snp_enabled))
snp_abort();
/* Check if memory encryption is enabled */
--
2.34.1
1. https://lore.kernel.org/lkml/20240416144542.GFZh6PFjPNT9Zt3iUl@fat_crate.local/
* Re: [PATCH v11 06/20] x86/sev: Handle failures from snp_init()
2024-08-28 4:47 ` Nikunj A. Dadhania
@ 2024-08-28 9:49 ` Borislav Petkov
2024-08-28 10:16 ` Nikunj A. Dadhania
0 siblings, 1 reply; 66+ messages in thread
From: Borislav Petkov @ 2024-08-28 9:49 UTC (permalink / raw)
To: Nikunj A. Dadhania
Cc: linux-kernel, thomas.lendacky, x86, kvm, mingo, tglx, dave.hansen,
pgonda, seanjc, pbonzini
On Wed, Aug 28, 2024 at 10:17:57AM +0530, Nikunj A. Dadhania wrote:
> + if ((snp && !snp_enabled) ||
> + (!snp && snp_enabled))
> snp_abort();
And which boolean function is that?
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index e83b363c5e68..706cb59851b0 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -495,10 +495,10 @@ void __head sme_enable(struct boot_params *bp)
unsigned int eax, ebx, ecx, edx;
unsigned long feature_mask;
unsigned long me_mask;
- bool snp;
+ bool snp_en;
u64 msr;
- snp = snp_init(bp);
+ snp_en = snp_init(bp);
/* Check for the SME/SEV support leaf */
eax = 0x80000000;
@@ -531,15 +531,11 @@ void __head sme_enable(struct boot_params *bp)
RIP_REL_REF(sev_status) = msr = __rdmsr(MSR_AMD64_SEV);
feature_mask = (msr & MSR_AMD64_SEV_ENABLED) ? AMD_SEV_BIT : AMD_SME_BIT;
- /* The SEV-SNP CC blob should never be present unless SEV-SNP is enabled. */
- if (snp && !(msr & MSR_AMD64_SEV_SNP_ENABLED))
- snp_abort();
-
/*
- * The SEV-SNP CC blob should be present and parsing CC blob should
- * succeed when SEV-SNP is enabled.
+ * Any discrepancies between the presence of a CC blob and SNP
+ * enablement abort the guest.
*/
- if (!snp && (msr & MSR_AMD64_SEV_SNP_ENABLED))
+ if (snp_en ^ (msr & MSR_AMD64_SEV_SNP_ENABLED))
snp_abort();
/* Check if memory encryption is enabled */
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v11 06/20] x86/sev: Handle failures from snp_init()
2024-08-28 9:49 ` Borislav Petkov
@ 2024-08-28 10:16 ` Nikunj A. Dadhania
2024-08-28 10:23 ` Borislav Petkov
0 siblings, 1 reply; 66+ messages in thread
From: Nikunj A. Dadhania @ 2024-08-28 10:16 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-kernel, thomas.lendacky, x86, kvm, mingo, tglx, dave.hansen,
pgonda, seanjc, pbonzini
On 8/28/2024 3:19 PM, Borislav Petkov wrote:
> On Wed, Aug 28, 2024 at 10:17:57AM +0530, Nikunj A. Dadhania wrote:
>> + if ((snp && !snp_enabled) ||
>> + (!snp && snp_enabled))
>> snp_abort();
>
> And which boolean function is that?
Ah.. missed that.
> /*
> - * The SEV-SNP CC blob should be present and parsing CC blob should
> - * succeed when SEV-SNP is enabled.
> + * Any discrepancies between the presence of a CC blob and SNP
> + * enablement abort the guest.
> */
> - if (!snp && (msr & MSR_AMD64_SEV_SNP_ENABLED))
> + if (snp_en ^ (msr & MSR_AMD64_SEV_SNP_ENABLED))
> snp_abort();
>
> /* Check if memory encryption is enabled */
>
Do you want me to send the patch again with above change?
Regards
Nikunj
* Re: [PATCH v11 06/20] x86/sev: Handle failures from snp_init()
2024-08-28 10:16 ` Nikunj A. Dadhania
@ 2024-08-28 10:23 ` Borislav Petkov
0 siblings, 0 replies; 66+ messages in thread
From: Borislav Petkov @ 2024-08-28 10:23 UTC (permalink / raw)
To: Nikunj A. Dadhania
Cc: linux-kernel, thomas.lendacky, x86, kvm, mingo, tglx, dave.hansen,
pgonda, seanjc, pbonzini
On Wed, Aug 28, 2024 at 03:46:23PM +0530, Nikunj A. Dadhania wrote:
> Do you want me to send the patch again with above change?
After I've gone through the whole set, sure.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v11 08/20] virt: sev-guest: Consolidate SNP guest messaging parameters to a struct
2024-07-31 15:07 ` [PATCH v11 08/20] virt: sev-guest: Consolidate SNP guest messaging parameters to a struct Nikunj A Dadhania
@ 2024-09-04 14:31 ` Borislav Petkov
2024-09-05 4:35 ` Nikunj A. Dadhania
0 siblings, 1 reply; 66+ messages in thread
From: Borislav Petkov @ 2024-09-04 14:31 UTC (permalink / raw)
To: Nikunj A Dadhania
Cc: linux-kernel, thomas.lendacky, x86, kvm, mingo, tglx, dave.hansen,
pgonda, seanjc, pbonzini
On Wed, Jul 31, 2024 at 08:37:59PM +0530, Nikunj A Dadhania wrote:
> +static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
> + struct snp_guest_request_ioctl *rio, u8 type,
> + void *req_buf, size_t req_sz, void *resp_buf,
> + u32 resp_sz)
> +{
> + struct snp_guest_req req = {
> + .msg_version = rio->msg_version,
> + .msg_type = type,
> + .vmpck_id = vmpck_id,
> + .req_buf = req_buf,
> + .req_sz = req_sz,
> + .resp_buf = resp_buf,
> + .resp_sz = resp_sz,
> + .exit_code = exit_code,
> + };
> +
> + return snp_send_guest_request(snp_dev, &req, rio);
> +}
Right, except you don't need that silly routine copying stuff around either
but simply do the right thing at each call site from the get-go.
using the following coding pattern:
struct snp_guest_req req = { };
/* assign all members required for the respective call: */
req.<member> = ...;
...
err = snp_send_guest_request(snp_dev, &req, rio);
if (err)
...
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v11 08/20] virt: sev-guest: Consolidate SNP guest messaging parameters to a struct
2024-09-04 14:31 ` Borislav Petkov
@ 2024-09-05 4:35 ` Nikunj A. Dadhania
0 siblings, 0 replies; 66+ messages in thread
From: Nikunj A. Dadhania @ 2024-09-05 4:35 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-kernel, thomas.lendacky, x86, kvm, mingo, tglx, dave.hansen,
pgonda, seanjc, pbonzini
On 9/4/2024 8:01 PM, Borislav Petkov wrote:
> On Wed, Jul 31, 2024 at 08:37:59PM +0530, Nikunj A Dadhania wrote:
>> +static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
>> + struct snp_guest_request_ioctl *rio, u8 type,
>> + void *req_buf, size_t req_sz, void *resp_buf,
>> + u32 resp_sz)
>> +{
>> + struct snp_guest_req req = {
>> + .msg_version = rio->msg_version,
>> + .msg_type = type,
>> + .vmpck_id = vmpck_id,
>> + .req_buf = req_buf,
>> + .req_sz = req_sz,
>> + .resp_buf = resp_buf,
>> + .resp_sz = resp_sz,
>> + .exit_code = exit_code,
>> + };
>> +
>> + return snp_send_guest_request(snp_dev, &req, rio);
>> +}
>
> Right, except you don't need that silly routine copying stuff around either
> but simply do the right thing at each call site from the get-go.
>
> using the following coding pattern:
>
> struct snp_guest_req req = { };
>
> /* assign all members required for the respective call: */
> req.<member> = ...;
> ...
>
> err = snp_send_guest_request(snp_dev, &req, rio);
> if (err)
> ...
Sure, will update all the call sites.
Regards
Nikunj
* Re: [PATCH v11 09/20] virt: sev-guest: Reduce the scope of SNP command mutex
2024-07-31 15:08 ` [PATCH v11 09/20] virt: sev-guest: Reduce the scope of SNP command mutex Nikunj A Dadhania
@ 2024-09-12 21:54 ` Tom Lendacky
2024-09-13 4:26 ` Nikunj A. Dadhania
0 siblings, 1 reply; 66+ messages in thread
From: Tom Lendacky @ 2024-09-12 21:54 UTC (permalink / raw)
To: Nikunj A Dadhania, linux-kernel, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini
On 7/31/24 10:08, Nikunj A Dadhania wrote:
> The SNP command mutex is used to serialize access to the shared buffer,
> command handling, and message sequence number.
>
> All shared buffer, command handling, and message sequence updates are done
> within snp_send_guest_request(), so moving the mutex to this function is
> appropriate and maintains the critical section.
>
> Since the mutex is now taken at a later point in time, remove the lockdep
> checks that occur before taking the mutex.
>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> ---
> drivers/virt/coco/sev-guest/sev-guest.c | 17 ++---------------
> 1 file changed, 2 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
> index 92734a2345a6..42f7126f1718 100644
> --- a/drivers/virt/coco/sev-guest/sev-guest.c
> +++ b/drivers/virt/coco/sev-guest/sev-guest.c
> @@ -345,6 +345,8 @@ static int snp_send_guest_request(struct snp_guest_dev *snp_dev, struct snp_gues
> u64 seqno;
> int rc;
>
> + guard(mutex)(&snp_cmd_mutex);
> +
> /* Get message sequence and verify that its a non-zero */
> seqno = snp_get_msg_seqno(snp_dev);
> if (!seqno)
> @@ -419,8 +421,6 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_io
> struct snp_report_resp *report_resp;
> int rc, resp_len;
>
> - lockdep_assert_held(&snp_cmd_mutex);
> -
> if (!arg->req_data || !arg->resp_data)
> return -EINVAL;
>
> @@ -458,8 +458,6 @@ static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_reque
> /* Response data is 64 bytes and max authsize for GCM is 16 bytes. */
> u8 buf[64 + 16];
>
> - lockdep_assert_held(&snp_cmd_mutex);
> -
> if (!arg->req_data || !arg->resp_data)
> return -EINVAL;
>
> @@ -501,8 +499,6 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
> int ret, npages = 0, resp_len;
> sockptr_t certs_address;
>
> - lockdep_assert_held(&snp_cmd_mutex);
> -
> if (sockptr_is_null(io->req_data) || sockptr_is_null(io->resp_data))
> return -EINVAL;
>
> @@ -590,12 +586,9 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
> if (!input.msg_version)
> return -EINVAL;
>
> - mutex_lock(&snp_cmd_mutex);
> -
> /* Check if the VMPCK is not empty */
> if (is_vmpck_empty(snp_dev)) {
Are we ok with this being outside of the lock now?
I believe is_vmpck_empty() can get a false and then be waiting on the
mutex while snp_disable_vmpck() is called. Suddenly the code thinks the
VMPCK is valid when it isn't anymore. Not sure if that matters, as the
guest request will fail anyway?
Thanks,
Tom
> dev_err_ratelimited(snp_dev->dev, "VMPCK is disabled\n");
> - mutex_unlock(&snp_cmd_mutex);
> return -ENOTTY;
> }
>
> @@ -620,8 +613,6 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
> break;
> }
>
> - mutex_unlock(&snp_cmd_mutex);
> -
> if (input.exitinfo2 && copy_to_user(argp, &input, sizeof(input)))
> return -EFAULT;
>
> @@ -736,8 +727,6 @@ static int sev_svsm_report_new(struct tsm_report *report, void *data)
> man_len = SZ_4K;
> certs_len = SEV_FW_BLOB_MAX_SIZE;
>
> - guard(mutex)(&snp_cmd_mutex);
> -
> if (guid_is_null(&desc->service_guid)) {
> call_id = SVSM_ATTEST_CALL(SVSM_ATTEST_SERVICES);
> } else {
> @@ -872,8 +861,6 @@ static int sev_report_new(struct tsm_report *report, void *data)
> if (!buf)
> return -ENOMEM;
>
> - guard(mutex)(&snp_cmd_mutex);
> -
> /* Check if the VMPCK is not empty */
> if (is_vmpck_empty(snp_dev)) {
> dev_err_ratelimited(snp_dev->dev, "VMPCK is disabled\n");
* Re: [PATCH v11 09/20] virt: sev-guest: Reduce the scope of SNP command mutex
2024-09-12 21:54 ` Tom Lendacky
@ 2024-09-13 4:26 ` Nikunj A. Dadhania
2024-09-13 14:06 ` Tom Lendacky
0 siblings, 1 reply; 66+ messages in thread
From: Nikunj A. Dadhania @ 2024-09-13 4:26 UTC (permalink / raw)
To: Tom Lendacky, linux-kernel, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini
Hi Tom,
On 9/13/2024 3:24 AM, Tom Lendacky wrote:
> On 7/31/24 10:08, Nikunj A Dadhania wrote:
>> @@ -590,12 +586,9 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
>> if (!input.msg_version)
>> return -EINVAL;
>>
>> - mutex_lock(&snp_cmd_mutex);
>> -
>> /* Check if the VMPCK is not empty */
>> if (is_vmpck_empty(snp_dev)) {
>
> Are we ok with this being outside of the lock now?
We can move the check inside the lock; get_* will try to prepare
the message, and after grabbing the lock, if the VMPCK is empty we
would fail. Something like below:
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index 8a2d0d751685..537f59358090 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -347,6 +347,12 @@ static int snp_send_guest_request(struct snp_guest_dev *snp_dev, struct snp_gues
guard(mutex)(&snp_cmd_mutex);
+ /* Check if the VMPCK is not empty */
+ if (is_vmpck_empty(snp_dev)) {
+ dev_err_ratelimited(snp_dev->dev, "VMPCK is disabled\n");
+ return -ENOTTY;
+ }
+
/* Get message sequence and verify that its a non-zero */
seqno = snp_get_msg_seqno(snp_dev);
if (!seqno)
@@ -594,12 +600,6 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
if (!input.msg_version)
return -EINVAL;
- /* Check if the VMPCK is not empty */
- if (is_vmpck_empty(snp_dev)) {
- dev_err_ratelimited(snp_dev->dev, "VMPCK is disabled\n");
- return -ENOTTY;
- }
-
switch (ioctl) {
case SNP_GET_REPORT:
ret = get_report(snp_dev, &input);
@@ -869,12 +869,6 @@ static int sev_report_new(struct tsm_report *report, void *data)
if (!buf)
return -ENOMEM;
- /* Check if the VMPCK is not empty */
- if (is_vmpck_empty(snp_dev)) {
- dev_err_ratelimited(snp_dev->dev, "VMPCK is disabled\n");
- return -ENOTTY;
- }
-
cert_table = buf + report_size;
struct snp_ext_report_req ext_req = {
.data = { .vmpl = desc->privlevel },
> I believe is_vmpck_empty() can get a false and then be waiting on the
> mutex while snp_disable_vmpck() is called. Suddenly the code thinks the
> VMPCK is valid when it isn't anymore. Not sure if that matters, as the
> guest request will fail anyway?
The above code will fail early.
>
> Thanks,
> Tom
>
Regards
Nikunj
* Re: [PATCH v11 09/20] virt: sev-guest: Reduce the scope of SNP command mutex
2024-09-13 4:26 ` Nikunj A. Dadhania
@ 2024-09-13 14:06 ` Tom Lendacky
0 siblings, 0 replies; 66+ messages in thread
From: Tom Lendacky @ 2024-09-13 14:06 UTC (permalink / raw)
To: Nikunj A. Dadhania, linux-kernel, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini
On 9/12/24 23:26, Nikunj A. Dadhania wrote:
> Hi Tom,
>
> On 9/13/2024 3:24 AM, Tom Lendacky wrote:
>> On 7/31/24 10:08, Nikunj A Dadhania wrote:
>>> @@ -590,12 +586,9 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
>>> if (!input.msg_version)
>>> return -EINVAL;
>>>
>>> - mutex_lock(&snp_cmd_mutex);
>>> -
>>> /* Check if the VMPCK is not empty */
>>> if (is_vmpck_empty(snp_dev)) {
>>
>> Are we ok with this being outside of the lock now?
>
> We can move the check inside the lock; get_* will try to prepare
> the message, and after grabbing the lock, if the VMPCK is empty we
> would fail. Something like below:
Yep, works for me.
Thanks,
Tom
>
> diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
> index 8a2d0d751685..537f59358090 100644
> --- a/drivers/virt/coco/sev-guest/sev-guest.c
> +++ b/drivers/virt/coco/sev-guest/sev-guest.c
> @@ -347,6 +347,12 @@ static int snp_send_guest_request(struct snp_guest_dev *snp_dev, struct snp_gues
>
> guard(mutex)(&snp_cmd_mutex);
>
> + /* Check if the VMPCK is not empty */
> + if (is_vmpck_empty(snp_dev)) {
> + dev_err_ratelimited(snp_dev->dev, "VMPCK is disabled\n");
> + return -ENOTTY;
> + }
> +
> /* Get message sequence and verify that its a non-zero */
> seqno = snp_get_msg_seqno(snp_dev);
> if (!seqno)
> @@ -594,12 +600,6 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
> if (!input.msg_version)
> return -EINVAL;
>
> - /* Check if the VMPCK is not empty */
> - if (is_vmpck_empty(snp_dev)) {
> - dev_err_ratelimited(snp_dev->dev, "VMPCK is disabled\n");
> - return -ENOTTY;
> - }
> -
> switch (ioctl) {
> case SNP_GET_REPORT:
> ret = get_report(snp_dev, &input);
> @@ -869,12 +869,6 @@ static int sev_report_new(struct tsm_report *report, void *data)
> if (!buf)
> return -ENOMEM;
>
> - /* Check if the VMPCK is not empty */
> - if (is_vmpck_empty(snp_dev)) {
> - dev_err_ratelimited(snp_dev->dev, "VMPCK is disabled\n");
> - return -ENOTTY;
> - }
> -
> cert_table = buf + report_size;
> struct snp_ext_report_req ext_req = {
> .data = { .vmpl = desc->privlevel },
>
>
>> I believe is_vmpck_empty() can get a false and then be waiting on the
>> mutex while snp_disable_vmpck() is called. Suddenly the code thinks the
>> VMPCK is valid when it isn't anymore. Not sure if that matters, as the
>> guest request will fail anyway?
>
> The above code will fail early.
>
>>
>> Thanks,
>> Tom
>>
>
> Regards
> Nikunj
>
* Re: [PATCH v11 13/20] x86/cc: Add CC_ATTR_GUEST_SECURE_TSC
2024-07-31 15:08 ` [PATCH v11 13/20] x86/cc: Add CC_ATTR_GUEST_SECURE_TSC Nikunj A Dadhania
@ 2024-09-13 15:21 ` Tom Lendacky
2024-09-16 4:53 ` Nikunj A. Dadhania
0 siblings, 1 reply; 66+ messages in thread
From: Tom Lendacky @ 2024-09-13 15:21 UTC (permalink / raw)
To: Nikunj A Dadhania, linux-kernel, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini
On 7/31/24 10:08, Nikunj A Dadhania wrote:
> Add confidential compute platform attribute CC_ATTR_GUEST_SECURE_TSC that
> can be used by the guest to query whether the Secure TSC feature is active.
>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> ---
> include/linux/cc_platform.h | 8 ++++++++
> arch/x86/coco/core.c | 3 +++
> 2 files changed, 11 insertions(+)
>
> diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
> index caa4b4430634..96dc61846c9d 100644
> --- a/include/linux/cc_platform.h
> +++ b/include/linux/cc_platform.h
> @@ -88,6 +88,14 @@ enum cc_attr {
> * enabled to run SEV-SNP guests.
> */
> CC_ATTR_HOST_SEV_SNP,
> +
> + /**
> + * @CC_ATTR_GUEST_SECURE_TSC: Secure TSC is active.
> + *
> + * The platform/OS is running as a guest/virtual machine and actively
> + * using AMD SEV-SNP Secure TSC feature.
> + */
> + CC_ATTR_GUEST_SECURE_TSC,
If this is specifically used for the AMD feature, as opposed to a generic
"does your system have a secure TSC", then it should probably be
CC_ATTR_GUEST_SNP_SECURE_TSC or CC_ATTR_GUEST_SEV_SNP_SECURE_TSC.
Thanks,
Tom
> };
>
> #ifdef CONFIG_ARCH_HAS_CC_PLATFORM
> diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
> index 0f81f70aca82..00df00e2cb4a 100644
> --- a/arch/x86/coco/core.c
> +++ b/arch/x86/coco/core.c
> @@ -100,6 +100,9 @@ static bool noinstr amd_cc_platform_has(enum cc_attr attr)
> case CC_ATTR_HOST_SEV_SNP:
> return cc_flags.host_sev_snp;
>
> + case CC_ATTR_GUEST_SECURE_TSC:
> + return sev_status & MSR_AMD64_SNP_SECURE_TSC;
> +
> default:
> return false;
> }
* Re: [PATCH v11 10/20] virt: sev-guest: Carve out SNP message context structure
2024-07-31 15:08 ` [PATCH v11 10/20] virt: sev-guest: Carve out SNP message context structure Nikunj A Dadhania
@ 2024-09-13 15:52 ` Tom Lendacky
0 siblings, 0 replies; 66+ messages in thread
From: Tom Lendacky @ 2024-09-13 15:52 UTC (permalink / raw)
To: Nikunj A Dadhania, linux-kernel, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini
On 7/31/24 10:08, Nikunj A Dadhania wrote:
> Currently, the sev-guest driver is the only user of SNP guest messaging.
> snp_guest_dev structure holds all the allocated buffers, secrets page and
The snp_guest_dev structure...
> VMPCK details. In preparation of adding messaging allocation and
s/of/for/
> initialization APIs, decouple snp_guest_dev from messaging-related
> information by carving out guest message context structure(snp_msg_desc).
s/out guest/out the guest/
>
> Incorporate this newly added context into snp_send_guest_request() and all
> related functions, replacing the use of the snp_guest_dev.
>
> No functional change.
>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
> arch/x86/include/asm/sev.h | 21 +++
> drivers/virt/coco/sev-guest/sev-guest.c | 183 ++++++++++++------------
> 2 files changed, 111 insertions(+), 93 deletions(-)
>
> diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> index 27fa1c9c3465..2e49c4a9e7fe 100644
> --- a/arch/x86/include/asm/sev.h
> +++ b/arch/x86/include/asm/sev.h
> @@ -234,6 +234,27 @@ struct snp_secrets_page {
> u8 rsvd4[3744];
> } __packed;
>
> +struct snp_msg_desc {
> + /* request and response are in unencrypted memory */
> + struct snp_guest_msg *request, *response;
> +
> + /*
> + * Avoid information leakage by double-buffering shared messages
> + * in fields that are in regular encrypted memory.
> + */
> + struct snp_guest_msg secret_request, secret_response;
> +
> + struct snp_secrets_page *secrets;
> + struct snp_req_data input;
> +
> + void *certs_data;
> +
> + struct aesgcm_ctx *ctx;
> +
> + u32 *os_area_msg_seqno;
> + u8 *vmpck;
> +};
> +
> /*
> * The SVSM Calling Area (CA) related structures.
> */
> diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
> index 42f7126f1718..38ddabcd7ba3 100644
> --- a/drivers/virt/coco/sev-guest/sev-guest.c
> +++ b/drivers/virt/coco/sev-guest/sev-guest.c
> @@ -40,26 +40,13 @@ struct snp_guest_dev {
> struct device *dev;
> struct miscdevice misc;
>
> - void *certs_data;
> - struct aesgcm_ctx *ctx;
> - /* request and response are in unencrypted memory */
> - struct snp_guest_msg *request, *response;
> -
> - /*
> - * Avoid information leakage by double-buffering shared messages
> - * in fields that are in regular encrypted memory.
> - */
> - struct snp_guest_msg secret_request, secret_response;
> + struct snp_msg_desc *msg_desc;
>
> - struct snp_secrets_page *secrets;
> - struct snp_req_data input;
> union {
> struct snp_report_req report;
> struct snp_derived_key_req derived_key;
> struct snp_ext_report_req ext_report;
> } req;
> - u32 *os_area_msg_seqno;
> - u8 *vmpck;
> };
>
> /*
> @@ -76,12 +63,12 @@ MODULE_PARM_DESC(vmpck_id, "The VMPCK ID to use when communicating with the PSP.
> /* Mutex to serialize the shared buffer access and command handling. */
> static DEFINE_MUTEX(snp_cmd_mutex);
>
> -static bool is_vmpck_empty(struct snp_guest_dev *snp_dev)
> +static bool is_vmpck_empty(struct snp_msg_desc *mdesc)
> {
> char zero_key[VMPCK_KEY_LEN] = {0};
>
> - if (snp_dev->vmpck)
> - return !memcmp(snp_dev->vmpck, zero_key, VMPCK_KEY_LEN);
> + if (mdesc->vmpck)
> + return !memcmp(mdesc->vmpck, zero_key, VMPCK_KEY_LEN);
>
> return true;
> }
> @@ -103,30 +90,30 @@ static bool is_vmpck_empty(struct snp_guest_dev *snp_dev)
> * vulnerable. If the sequence number were incremented for a fresh IV the ASP
> * will reject the request.
> */
> -static void snp_disable_vmpck(struct snp_guest_dev *snp_dev)
> +static void snp_disable_vmpck(struct snp_msg_desc *mdesc)
> {
> - dev_alert(snp_dev->dev, "Disabling VMPCK%d communication key to prevent IV reuse.\n",
> + pr_alert("Disabling VMPCK%d communication key to prevent IV reuse.\n",
> vmpck_id);
> - memzero_explicit(snp_dev->vmpck, VMPCK_KEY_LEN);
> - snp_dev->vmpck = NULL;
> + memzero_explicit(mdesc->vmpck, VMPCK_KEY_LEN);
> + mdesc->vmpck = NULL;
> }
>
> -static inline u64 __snp_get_msg_seqno(struct snp_guest_dev *snp_dev)
> +static inline u64 __snp_get_msg_seqno(struct snp_msg_desc *mdesc)
> {
> u64 count;
>
> lockdep_assert_held(&snp_cmd_mutex);
>
> /* Read the current message sequence counter from secrets pages */
> - count = *snp_dev->os_area_msg_seqno;
> + count = *mdesc->os_area_msg_seqno;
>
> return count + 1;
> }
>
> /* Return a non-zero on success */
> -static u64 snp_get_msg_seqno(struct snp_guest_dev *snp_dev)
> +static u64 snp_get_msg_seqno(struct snp_msg_desc *mdesc)
> {
> - u64 count = __snp_get_msg_seqno(snp_dev);
> + u64 count = __snp_get_msg_seqno(mdesc);
>
> /*
> * The message sequence counter for the SNP guest request is a 64-bit
> @@ -137,20 +124,20 @@ static u64 snp_get_msg_seqno(struct snp_guest_dev *snp_dev)
> * invalid number and will fail the message request.
> */
> if (count >= UINT_MAX) {
> - dev_err(snp_dev->dev, "request message sequence counter overflow\n");
> + pr_err("request message sequence counter overflow\n");
> return 0;
> }
>
> return count;
> }
>
> -static void snp_inc_msg_seqno(struct snp_guest_dev *snp_dev)
> +static void snp_inc_msg_seqno(struct snp_msg_desc *mdesc)
> {
> /*
> * The counter is also incremented by the PSP, so increment it by 2
> * and save in secrets page.
> */
> - *snp_dev->os_area_msg_seqno += 2;
> + *mdesc->os_area_msg_seqno += 2;
> }
>
> static inline struct snp_guest_dev *to_snp_dev(struct file *file)
> @@ -177,13 +164,13 @@ static struct aesgcm_ctx *snp_init_crypto(u8 *key, size_t keylen)
> return ctx;
> }
>
> -static int verify_and_dec_payload(struct snp_guest_dev *snp_dev, struct snp_guest_req *req)
> +static int verify_and_dec_payload(struct snp_msg_desc *mdesc, struct snp_guest_req *req)
> {
> - struct snp_guest_msg *resp_msg = &snp_dev->secret_response;
> - struct snp_guest_msg *req_msg = &snp_dev->secret_request;
> + struct snp_guest_msg *resp_msg = &mdesc->secret_response;
> + struct snp_guest_msg *req_msg = &mdesc->secret_request;
> struct snp_guest_msg_hdr *req_msg_hdr = &req_msg->hdr;
> struct snp_guest_msg_hdr *resp_msg_hdr = &resp_msg->hdr;
> - struct aesgcm_ctx *ctx = snp_dev->ctx;
> + struct aesgcm_ctx *ctx = mdesc->ctx;
> u8 iv[GCM_AES_IV_SIZE] = {};
>
> pr_debug("response [seqno %lld type %d version %d sz %d]\n",
> @@ -191,7 +178,7 @@ static int verify_and_dec_payload(struct snp_guest_dev *snp_dev, struct snp_gues
> resp_msg_hdr->msg_sz);
>
> /* Copy response from shared memory to encrypted memory. */
> - memcpy(resp_msg, snp_dev->response, sizeof(*resp_msg));
> + memcpy(resp_msg, mdesc->response, sizeof(*resp_msg));
>
> /* Verify that the sequence counter is incremented by 1 */
> if (unlikely(resp_msg_hdr->msg_seqno != (req_msg_hdr->msg_seqno + 1)))
> @@ -218,11 +205,11 @@ static int verify_and_dec_payload(struct snp_guest_dev *snp_dev, struct snp_gues
> return 0;
> }
>
> -static int enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, struct snp_guest_req *req)
> +static int enc_payload(struct snp_msg_desc *mdesc, u64 seqno, struct snp_guest_req *req)
> {
> - struct snp_guest_msg *msg = &snp_dev->secret_request;
> + struct snp_guest_msg *msg = &mdesc->secret_request;
> struct snp_guest_msg_hdr *hdr = &msg->hdr;
> - struct aesgcm_ctx *ctx = snp_dev->ctx;
> + struct aesgcm_ctx *ctx = mdesc->ctx;
> u8 iv[GCM_AES_IV_SIZE] = {};
>
> memset(msg, 0, sizeof(*msg));
> @@ -253,7 +240,7 @@ static int enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, struct snp_gues
> return 0;
> }
>
> -static int __handle_guest_request(struct snp_guest_dev *snp_dev, struct snp_guest_req *req,
> +static int __handle_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
> struct snp_guest_request_ioctl *rio)
> {
> unsigned long req_start = jiffies;
> @@ -268,7 +255,7 @@ static int __handle_guest_request(struct snp_guest_dev *snp_dev, struct snp_gues
> * sequence number must be incremented or the VMPCK must be deleted to
> * prevent reuse of the IV.
> */
> - rc = snp_issue_guest_request(req, &snp_dev->input, rio);
> + rc = snp_issue_guest_request(req, &mdesc->input, rio);
> switch (rc) {
> case -ENOSPC:
> /*
> @@ -278,7 +265,7 @@ static int __handle_guest_request(struct snp_guest_dev *snp_dev, struct snp_gues
> * order to increment the sequence number and thus avoid
> * IV reuse.
> */
> - override_npages = snp_dev->input.data_npages;
> + override_npages = mdesc->input.data_npages;
> req->exit_code = SVM_VMGEXIT_GUEST_REQUEST;
>
> /*
> @@ -318,7 +305,7 @@ static int __handle_guest_request(struct snp_guest_dev *snp_dev, struct snp_gues
> * structure and any failure will wipe the VMPCK, preventing further
> * use anyway.
> */
> - snp_inc_msg_seqno(snp_dev);
> + snp_inc_msg_seqno(mdesc);
>
> if (override_err) {
> rio->exitinfo2 = override_err;
> @@ -334,12 +321,12 @@ static int __handle_guest_request(struct snp_guest_dev *snp_dev, struct snp_gues
> }
>
> if (override_npages)
> - snp_dev->input.data_npages = override_npages;
> + mdesc->input.data_npages = override_npages;
>
> return rc;
> }
>
> -static int snp_send_guest_request(struct snp_guest_dev *snp_dev, struct snp_guest_req *req,
> +static int snp_send_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
> struct snp_guest_request_ioctl *rio)
> {
> u64 seqno;
> @@ -348,15 +335,15 @@ static int snp_send_guest_request(struct snp_guest_dev *snp_dev, struct snp_gues
> guard(mutex)(&snp_cmd_mutex);
>
> /* Get message sequence and verify that its a non-zero */
> - seqno = snp_get_msg_seqno(snp_dev);
> + seqno = snp_get_msg_seqno(mdesc);
> if (!seqno)
> return -EIO;
>
> /* Clear shared memory's response for the host to populate. */
> - memset(snp_dev->response, 0, sizeof(struct snp_guest_msg));
> + memset(mdesc->response, 0, sizeof(struct snp_guest_msg));
>
> - /* Encrypt the userspace provided payload in snp_dev->secret_request. */
> - rc = enc_payload(snp_dev, seqno, req);
> + /* Encrypt the userspace provided payload in mdesc->secret_request. */
> + rc = enc_payload(mdesc, seqno, req);
> if (rc)
> return rc;
>
> @@ -364,34 +351,33 @@ static int snp_send_guest_request(struct snp_guest_dev *snp_dev, struct snp_gues
> * Write the fully encrypted request to the shared unencrypted
> * request page.
> */
> - memcpy(snp_dev->request, &snp_dev->secret_request,
> - sizeof(snp_dev->secret_request));
> + memcpy(mdesc->request, &mdesc->secret_request,
> + sizeof(mdesc->secret_request));
>
> - rc = __handle_guest_request(snp_dev, req, rio);
> + rc = __handle_guest_request(mdesc, req, rio);
> if (rc) {
> if (rc == -EIO &&
> rio->exitinfo2 == SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN))
> return rc;
>
> - dev_alert(snp_dev->dev,
> - "Detected error from ASP request. rc: %d, exitinfo2: 0x%llx\n",
> - rc, rio->exitinfo2);
> + pr_alert("Detected error from ASP request. rc: %d, exitinfo2: 0x%llx\n",
> + rc, rio->exitinfo2);
>
> - snp_disable_vmpck(snp_dev);
> + snp_disable_vmpck(mdesc);
> return rc;
> }
>
> - rc = verify_and_dec_payload(snp_dev, req);
> + rc = verify_and_dec_payload(mdesc, req);
> if (rc) {
> - dev_alert(snp_dev->dev, "Detected unexpected decode failure from ASP. rc: %d\n", rc);
> - snp_disable_vmpck(snp_dev);
> + pr_alert("Detected unexpected decode failure from ASP. rc: %d\n", rc);
> + snp_disable_vmpck(mdesc);
> return rc;
> }
>
> return 0;
> }
>
> -static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
> +static int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code,
> struct snp_guest_request_ioctl *rio, u8 type,
> void *req_buf, size_t req_sz, void *resp_buf,
> u32 resp_sz)
> @@ -407,7 +393,7 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
> .exit_code = exit_code,
> };
>
> - return snp_send_guest_request(snp_dev, &req, rio);
> + return snp_send_guest_request(mdesc, &req, rio);
> }
>
> struct snp_req_resp {
> @@ -418,6 +404,7 @@ struct snp_req_resp {
> static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
> {
> struct snp_report_req *report_req = &snp_dev->req.report;
> + struct snp_msg_desc *mdesc = snp_dev->msg_desc;
> struct snp_report_resp *report_resp;
> int rc, resp_len;
>
> @@ -432,12 +419,12 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_io
> * response payload. Make sure that it has enough space to cover the
> * authtag.
> */
> - resp_len = sizeof(report_resp->data) + snp_dev->ctx->authsize;
> + resp_len = sizeof(report_resp->data) + mdesc->ctx->authsize;
> report_resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
> if (!report_resp)
> return -ENOMEM;
>
> - rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg, SNP_MSG_REPORT_REQ,
> + rc = handle_guest_request(mdesc, SVM_VMGEXIT_GUEST_REQUEST, arg, SNP_MSG_REPORT_REQ,
> report_req, sizeof(*report_req), report_resp->data, resp_len);
> if (rc)
> goto e_free;
> @@ -454,6 +441,7 @@ static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_reque
> {
> struct snp_derived_key_req *derived_key_req = &snp_dev->req.derived_key;
> struct snp_derived_key_resp derived_key_resp = {0};
> + struct snp_msg_desc *mdesc = snp_dev->msg_desc;
> int rc, resp_len;
> /* Response data is 64 bytes and max authsize for GCM is 16 bytes. */
> u8 buf[64 + 16];
> @@ -466,7 +454,7 @@ static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_reque
> * response payload. Make sure that it has enough space to cover the
> * authtag.
> */
> - resp_len = sizeof(derived_key_resp.data) + snp_dev->ctx->authsize;
> + resp_len = sizeof(derived_key_resp.data) + mdesc->ctx->authsize;
> if (sizeof(buf) < resp_len)
> return -ENOMEM;
>
> @@ -474,7 +462,7 @@ static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_reque
> sizeof(*derived_key_req)))
> return -EFAULT;
>
> - rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg, SNP_MSG_KEY_REQ,
> + rc = handle_guest_request(mdesc, SVM_VMGEXIT_GUEST_REQUEST, arg, SNP_MSG_KEY_REQ,
> derived_key_req, sizeof(*derived_key_req), buf, resp_len);
> if (rc)
> return rc;
> @@ -495,6 +483,7 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
>
> {
> struct snp_ext_report_req *report_req = &snp_dev->req.ext_report;
> + struct snp_msg_desc *mdesc = snp_dev->msg_desc;
> struct snp_report_resp *report_resp;
> int ret, npages = 0, resp_len;
> sockptr_t certs_address;
> @@ -527,7 +516,7 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
> * the host. If host does not supply any certs in it, then copy
> * zeros to indicate that certificate data was not provided.
> */
> - memset(snp_dev->certs_data, 0, report_req->certs_len);
> + memset(mdesc->certs_data, 0, report_req->certs_len);
> npages = report_req->certs_len >> PAGE_SHIFT;
> cmd:
> /*
> @@ -535,19 +524,19 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
> * response payload. Make sure that it has enough space to cover the
> * authtag.
> */
> - resp_len = sizeof(report_resp->data) + snp_dev->ctx->authsize;
> + resp_len = sizeof(report_resp->data) + mdesc->ctx->authsize;
> report_resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
> if (!report_resp)
> return -ENOMEM;
>
> - snp_dev->input.data_npages = npages;
> - ret = handle_guest_request(snp_dev, SVM_VMGEXIT_EXT_GUEST_REQUEST, arg, SNP_MSG_REPORT_REQ,
> + mdesc->input.data_npages = npages;
> + ret = handle_guest_request(mdesc, SVM_VMGEXIT_EXT_GUEST_REQUEST, arg, SNP_MSG_REPORT_REQ,
> &report_req->data, sizeof(report_req->data),
> report_resp->data, resp_len);
>
> /* If certs length is invalid then copy the returned length */
> if (arg->vmm_error == SNP_GUEST_VMM_ERR_INVALID_LEN) {
> - report_req->certs_len = snp_dev->input.data_npages << PAGE_SHIFT;
> + report_req->certs_len = mdesc->input.data_npages << PAGE_SHIFT;
>
> if (copy_to_sockptr(io->req_data, report_req, sizeof(*report_req)))
> ret = -EFAULT;
> @@ -556,7 +545,7 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
> if (ret)
> goto e_free;
>
> - if (npages && copy_to_sockptr(certs_address, snp_dev->certs_data, report_req->certs_len)) {
> + if (npages && copy_to_sockptr(certs_address, mdesc->certs_data, report_req->certs_len)) {
> ret = -EFAULT;
> goto e_free;
> }
> @@ -572,6 +561,7 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
> static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
> {
> struct snp_guest_dev *snp_dev = to_snp_dev(file);
> + struct snp_msg_desc *mdesc = snp_dev->msg_desc;
> void __user *argp = (void __user *)arg;
> struct snp_guest_request_ioctl input;
> struct snp_req_resp io;
> @@ -587,7 +577,7 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
> return -EINVAL;
>
> /* Check if the VMPCK is not empty */
> - if (is_vmpck_empty(snp_dev)) {
> + if (is_vmpck_empty(mdesc)) {
> dev_err_ratelimited(snp_dev->dev, "VMPCK is disabled\n");
> return -ENOTTY;
> }
> @@ -862,7 +852,7 @@ static int sev_report_new(struct tsm_report *report, void *data)
> return -ENOMEM;
>
> /* Check if the VMPCK is not empty */
> - if (is_vmpck_empty(snp_dev)) {
> + if (is_vmpck_empty(snp_dev->msg_desc)) {
> dev_err_ratelimited(snp_dev->dev, "VMPCK is disabled\n");
> return -ENOTTY;
> }
> @@ -992,6 +982,7 @@ static int __init sev_guest_probe(struct platform_device *pdev)
> struct snp_secrets_page *secrets;
> struct device *dev = &pdev->dev;
> struct snp_guest_dev *snp_dev;
> + struct snp_msg_desc *mdesc;
> struct miscdevice *misc;
> void __iomem *mapping;
> int ret;
> @@ -1014,46 +1005,50 @@ static int __init sev_guest_probe(struct platform_device *pdev)
> if (!snp_dev)
> goto e_unmap;
>
> + mdesc = devm_kzalloc(&pdev->dev, sizeof(struct snp_msg_desc), GFP_KERNEL);
> + if (!mdesc)
> + goto e_unmap;
> +
> /* Adjust the default VMPCK key based on the executing VMPL level */
> if (vmpck_id == -1)
> vmpck_id = snp_vmpl;
>
> ret = -EINVAL;
> - snp_dev->vmpck = get_vmpck(vmpck_id, secrets, &snp_dev->os_area_msg_seqno);
> - if (!snp_dev->vmpck) {
> + mdesc->vmpck = get_vmpck(vmpck_id, secrets, &mdesc->os_area_msg_seqno);
> + if (!mdesc->vmpck) {
> dev_err(dev, "Invalid VMPCK%d communication key\n", vmpck_id);
> goto e_unmap;
> }
>
> /* Verify that VMPCK is not zero. */
> - if (is_vmpck_empty(snp_dev)) {
> + if (is_vmpck_empty(mdesc)) {
> dev_err(dev, "Empty VMPCK%d communication key\n", vmpck_id);
> goto e_unmap;
> }
>
> platform_set_drvdata(pdev, snp_dev);
> snp_dev->dev = dev;
> - snp_dev->secrets = secrets;
> + mdesc->secrets = secrets;
>
> /* Ensure SNP guest messages do not span more than a page */
> BUILD_BUG_ON(sizeof(struct snp_guest_msg) > PAGE_SIZE);
>
> /* Allocate the shared page used for the request and response message. */
> - snp_dev->request = alloc_shared_pages(dev, sizeof(struct snp_guest_msg));
> - if (!snp_dev->request)
> + mdesc->request = alloc_shared_pages(dev, sizeof(struct snp_guest_msg));
> + if (!mdesc->request)
> goto e_unmap;
>
> - snp_dev->response = alloc_shared_pages(dev, sizeof(struct snp_guest_msg));
> - if (!snp_dev->response)
> + mdesc->response = alloc_shared_pages(dev, sizeof(struct snp_guest_msg));
> + if (!mdesc->response)
> goto e_free_request;
>
> - snp_dev->certs_data = alloc_shared_pages(dev, SEV_FW_BLOB_MAX_SIZE);
> - if (!snp_dev->certs_data)
> + mdesc->certs_data = alloc_shared_pages(dev, SEV_FW_BLOB_MAX_SIZE);
> + if (!mdesc->certs_data)
> goto e_free_response;
>
> ret = -EIO;
> - snp_dev->ctx = snp_init_crypto(snp_dev->vmpck, VMPCK_KEY_LEN);
> - if (!snp_dev->ctx)
> + mdesc->ctx = snp_init_crypto(mdesc->vmpck, VMPCK_KEY_LEN);
> + if (!mdesc->ctx)
> goto e_free_cert_data;
>
> misc = &snp_dev->misc;
> @@ -1062,9 +1057,9 @@ static int __init sev_guest_probe(struct platform_device *pdev)
> misc->fops = &snp_guest_fops;
>
> /* Initialize the input addresses for guest request */
> - snp_dev->input.req_gpa = __pa(snp_dev->request);
> - snp_dev->input.resp_gpa = __pa(snp_dev->response);
> - snp_dev->input.data_gpa = __pa(snp_dev->certs_data);
> + mdesc->input.req_gpa = __pa(mdesc->request);
> + mdesc->input.resp_gpa = __pa(mdesc->response);
> + mdesc->input.data_gpa = __pa(mdesc->certs_data);
>
> /* Set the privlevel_floor attribute based on the vmpck_id */
> sev_tsm_ops.privlevel_floor = vmpck_id;
> @@ -1081,17 +1076,18 @@ static int __init sev_guest_probe(struct platform_device *pdev)
> if (ret)
> goto e_free_ctx;
>
> + snp_dev->msg_desc = mdesc;
> dev_info(dev, "Initialized SEV guest driver (using VMPCK%d communication key)\n", vmpck_id);
> return 0;
>
> e_free_ctx:
> - kfree(snp_dev->ctx);
> + kfree(mdesc->ctx);
> e_free_cert_data:
> - free_shared_pages(snp_dev->certs_data, SEV_FW_BLOB_MAX_SIZE);
> + free_shared_pages(mdesc->certs_data, SEV_FW_BLOB_MAX_SIZE);
> e_free_response:
> - free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
> + free_shared_pages(mdesc->response, sizeof(struct snp_guest_msg));
> e_free_request:
> - free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
> + free_shared_pages(mdesc->request, sizeof(struct snp_guest_msg));
> e_unmap:
> iounmap(mapping);
> return ret;
> @@ -1100,11 +1096,12 @@ static int __init sev_guest_probe(struct platform_device *pdev)
> static void __exit sev_guest_remove(struct platform_device *pdev)
> {
> struct snp_guest_dev *snp_dev = platform_get_drvdata(pdev);
> + struct snp_msg_desc *mdesc = snp_dev->msg_desc;
>
> - free_shared_pages(snp_dev->certs_data, SEV_FW_BLOB_MAX_SIZE);
> - free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
> - free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
> - kfree(snp_dev->ctx);
> + free_shared_pages(mdesc->certs_data, SEV_FW_BLOB_MAX_SIZE);
> + free_shared_pages(mdesc->response, sizeof(struct snp_guest_msg));
> + free_shared_pages(mdesc->request, sizeof(struct snp_guest_msg));
> + kfree(mdesc->ctx);
> misc_deregister(&snp_dev->misc);
> }
>
* Re: [PATCH v11 11/20] x86/sev: Carve out and export SNP guest messaging init routines
2024-07-31 15:08 ` [PATCH v11 11/20] x86/sev: Carve out and export SNP guest messaging init routines Nikunj A Dadhania
@ 2024-09-13 15:53 ` Tom Lendacky
0 siblings, 0 replies; 66+ messages in thread
From: Tom Lendacky @ 2024-09-13 15:53 UTC (permalink / raw)
To: Nikunj A Dadhania, linux-kernel, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini
On 7/31/24 10:08, Nikunj A Dadhania wrote:
> Currently, the SEV guest driver is the only user of SNP guest messaging.
> All routines for initializing SNP guest messaging are implemented within
> the SEV guest driver. To add Secure TSC guest support, these initialization
> routines need to be available during early boot.
>
> Carve out common SNP guest messaging buffer allocations and message
> initialization routines to core/sev.c and export them. These newly added
> APIs set up the SNP message context (snp_msg_desc), which contains all the
> necessary details for sending SNP guest messages.
>
> At present, the SEV guest platform data structure is used to pass the
> secrets page physical address to the SEV guest driver. Since the secrets page
> address is locally available to the initialization routine, use the cached
> address. Remove the unused SEV guest platform data structure.
>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
> arch/x86/include/asm/sev.h | 71 ++++++++-
> arch/x86/coco/sev/core.c | 133 +++++++++++++++-
> drivers/virt/coco/sev-guest/sev-guest.c | 194 +++---------------------
> 3 files changed, 213 insertions(+), 185 deletions(-)
>
* Re: [PATCH v11 12/20] x86/sev: Relocate SNP guest messaging routines to common code
2024-07-31 15:08 ` [PATCH v11 12/20] x86/sev: Relocate SNP guest messaging routines to common code Nikunj A Dadhania
@ 2024-09-13 16:27 ` Tom Lendacky
2024-09-16 4:42 ` Nikunj A. Dadhania
0 siblings, 1 reply; 66+ messages in thread
From: Tom Lendacky @ 2024-09-13 16:27 UTC (permalink / raw)
To: Nikunj A Dadhania, linux-kernel, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini
On 7/31/24 10:08, Nikunj A Dadhania wrote:
> At present, the SEV guest driver exclusively handles SNP guest messaging.
> All routines for sending guest messages are embedded within the guest
> driver. To support Secure TSC, SEV-SNP guests must communicate with the AMD
> Security Processor during early boot. However, these guest messaging
> functions are not accessible during early boot since they are currently
> part of the guest driver.
>
> Hence, relocate the core SNP guest messaging functions to SEV common code
> and provide an API for sending SNP guest messages.
>
> No functional change, but just an export symbol.
That means we can drop the export symbol on snp_issue_guest_request() and
make it static, right?
>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
> arch/x86/include/asm/sev.h | 8 +
> arch/x86/coco/sev/core.c | 284 +++++++++++++++++++++++
> drivers/virt/coco/sev-guest/sev-guest.c | 286 ------------------------
> arch/x86/Kconfig | 1 +
> drivers/virt/coco/sev-guest/Kconfig | 1 -
> 5 files changed, 293 insertions(+), 287 deletions(-)
* Re: [PATCH v11 14/20] x86/sev: Add Secure TSC support for SNP guests
2024-07-31 15:08 ` [PATCH v11 14/20] x86/sev: Add Secure TSC support for SNP guests Nikunj A Dadhania
@ 2024-09-13 16:29 ` Tom Lendacky
0 siblings, 0 replies; 66+ messages in thread
From: Tom Lendacky @ 2024-09-13 16:29 UTC (permalink / raw)
To: Nikunj A Dadhania, linux-kernel, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini
On 7/31/24 10:08, Nikunj A Dadhania wrote:
> Add support for Secure TSC in SNP-enabled guests. Secure TSC allows guests
> to securely use RDTSC/RDTSCP instructions, ensuring that the parameters
> used cannot be altered by the hypervisor once the guest is launched.
>
> Secure TSC-enabled guests need to query TSC information from the AMD
> Security Processor. This communication channel is encrypted between the AMD
> Security Processor and the guest, with the hypervisor acting merely as a
> conduit to deliver the guest messages to the AMD Security Processor. Each
> message is protected with AEAD (AES-256 GCM). Use a minimal AES GCM library
> to encrypt and decrypt SNP guest messages for communication with the PSP.
>
> Use mem_encrypt_init() to fetch SNP TSC information from the AMD Security
> Processor and initialize snp_tsc_scale and snp_tsc_offset. During secondary
> CPU initialization, set the VMSA fields GUEST_TSC_SCALE (offset 2F0h) and
> GUEST_TSC_OFFSET (offset 2F8h) with snp_tsc_scale and snp_tsc_offset,
> respectively.
>
> Since handle_guest_request() is a common routine used by both the SEV guest
> driver and the Secure TSC code, move it to the SEV header file.
>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> Tested-by: Peter Gonda <pgonda@google.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
> arch/x86/include/asm/sev-common.h | 1 +
> arch/x86/include/asm/sev.h | 46 +++++++++++++
> arch/x86/include/asm/svm.h | 6 +-
> arch/x86/coco/sev/core.c | 91 +++++++++++++++++++++++++
> arch/x86/mm/mem_encrypt.c | 4 ++
> drivers/virt/coco/sev-guest/sev-guest.c | 19 ------
> 6 files changed, 146 insertions(+), 21 deletions(-)
>
* Re: [PATCH v11 16/20] x86/sev: Prevent RDTSC/RDTSCP interception for Secure TSC enabled guests
2024-07-31 15:08 ` [PATCH v11 16/20] x86/sev: Prevent RDTSC/RDTSCP interception " Nikunj A Dadhania
@ 2024-09-13 16:49 ` Tom Lendacky
0 siblings, 0 replies; 66+ messages in thread
From: Tom Lendacky @ 2024-09-13 16:49 UTC (permalink / raw)
To: Nikunj A Dadhania, linux-kernel, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini
On 7/31/24 10:08, Nikunj A Dadhania wrote:
> The hypervisor should not be intercepting RDTSC/RDTSCP when Secure TSC is
> enabled. A #VC exception will be generated if the RDTSC/RDTSCP instructions
> are being intercepted. If this should occur and Secure TSC is enabled,
> terminate guest execution.
>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> Tested-by: Peter Gonda <pgonda@google.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
> arch/x86/coco/sev/shared.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/arch/x86/coco/sev/shared.c b/arch/x86/coco/sev/shared.c
> index 71de53194089..c2a9e2ada659 100644
> --- a/arch/x86/coco/sev/shared.c
> +++ b/arch/x86/coco/sev/shared.c
> @@ -1140,6 +1140,16 @@ static enum es_result vc_handle_rdtsc(struct ghcb *ghcb,
> bool rdtscp = (exit_code == SVM_EXIT_RDTSCP);
> enum es_result ret;
>
> + /*
> + * RDTSC and RDTSCP should not be intercepted when Secure TSC is
> + * enabled. Terminate the SNP guest when the interception is enabled.
> + * This file is included from kernel/sev.c and boot/compressed/sev.c,
> + * use sev_status here as cc_platform_has() is not available when
> + * compiling boot/compressed/sev.c.
> + */
> + if (sev_status & MSR_AMD64_SNP_SECURE_TSC)
> + return ES_VMM_ERROR;
> +
> ret = sev_es_ghcb_hv_call(ghcb, ctxt, exit_code, 0, 0);
> if (ret != ES_OK)
> return ret;
* Re: [PATCH v11 17/20] x86/sev: Allow Secure TSC feature for SNP guests
2024-07-31 15:08 ` [PATCH v11 17/20] x86/sev: Allow Secure TSC feature for SNP guests Nikunj A Dadhania
@ 2024-09-13 16:53 ` Tom Lendacky
2024-09-16 6:23 ` Nikunj A. Dadhania
0 siblings, 1 reply; 66+ messages in thread
From: Tom Lendacky @ 2024-09-13 16:53 UTC (permalink / raw)
To: Nikunj A Dadhania, linux-kernel, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini
On 7/31/24 10:08, Nikunj A Dadhania wrote:
> Now that all the required plumbing is done for enabling SNP Secure TSC
> feature, add Secure TSC to SNP features present list.
So I think this should be the last patch in the series after the TSC is
marked reliable, kvmclock is bypassed, etc. This way everything is in
place when the guest is allowed to run with Secure TSC.
>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> Tested-by: Peter Gonda <pgonda@google.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
> arch/x86/boot/compressed/sev.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
> index cd44e120fe53..bb55934c1cee 100644
> --- a/arch/x86/boot/compressed/sev.c
> +++ b/arch/x86/boot/compressed/sev.c
> @@ -401,7 +401,8 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
> * by the guest kernel. As and when a new feature is implemented in the
> * guest kernel, a corresponding bit should be added to the mask.
> */
> -#define SNP_FEATURES_PRESENT MSR_AMD64_SNP_DEBUG_SWAP
> +#define SNP_FEATURES_PRESENT (MSR_AMD64_SNP_DEBUG_SWAP | \
> + MSR_AMD64_SNP_SECURE_TSC)
>
> u64 snp_get_unsupported_features(u64 status)
> {
* Re: [PATCH v11 18/20] x86/sev: Mark Secure TSC as reliable clocksource
2024-07-31 15:08 ` [PATCH v11 18/20] x86/sev: Mark Secure TSC as reliable clocksource Nikunj A Dadhania
@ 2024-09-13 16:59 ` Tom Lendacky
0 siblings, 0 replies; 66+ messages in thread
From: Tom Lendacky @ 2024-09-13 16:59 UTC (permalink / raw)
To: Nikunj A Dadhania, linux-kernel, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini
On 7/31/24 10:08, Nikunj A Dadhania wrote:
> In an SNP guest environment with Secure TSC enabled, the RDTSC instruction,
> unlike other clock sources (such as HPET, ACPI timer, APIC, etc.), is
> handled without causing a VM exit, resulting in minimal overhead and
> jitter. Hence, mark Secure TSC as the only reliable clock source,
> bypassing unstable calibration.
>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> Tested-by: Peter Gonda <pgonda@google.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
> arch/x86/mm/mem_encrypt_amd.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
> index 86a476a426c2..e9fb5f24703a 100644
> --- a/arch/x86/mm/mem_encrypt_amd.c
> +++ b/arch/x86/mm/mem_encrypt_amd.c
> @@ -516,6 +516,10 @@ void __init sme_early_init(void)
> * kernel mapped.
> */
> snp_update_svsm_ca();
> +
> + /* Mark the TSC as reliable when Secure TSC is enabled */
> + if (sev_status & MSR_AMD64_SNP_SECURE_TSC)
> + setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
> }
>
> void __init mem_encrypt_free_decrypted_mem(void)
* Re: [PATCH v11 19/20] x86/kvmclock: Skip kvmclock when Secure TSC is available
2024-07-31 15:08 ` [PATCH v11 19/20] x86/kvmclock: Skip kvmclock when Secure TSC is available Nikunj A Dadhania
@ 2024-09-13 17:19 ` Tom Lendacky
2024-09-13 17:30 ` Sean Christopherson
1 sibling, 0 replies; 66+ messages in thread
From: Tom Lendacky @ 2024-09-13 17:19 UTC (permalink / raw)
To: Nikunj A Dadhania, linux-kernel, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini
On 7/31/24 10:08, Nikunj A Dadhania wrote:
> For AMD SNP guests with Secure TSC enabled, kvm-clock is momentarily picked
> up instead of the more stable TSC clocksource.
>
> [ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
> [ 0.000001] kvm-clock: using sched offset of 1799357702246960 cycles
> [ 0.001493] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
> [ 0.006289] tsc: Detected 1996.249 MHz processor
> [ 0.305123] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x398cadd9d93, max_idle_ns: 881590552906 ns
> [ 1.045759] clocksource: Switched to clocksource kvm-clock
> [ 1.141326] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x398cadd9d93, max_idle_ns: 881590552906 ns
> [ 1.144634] clocksource: Switched to clocksource tsc
>
> When Secure TSC is enabled, skip using kvmclock. The guest kernel will
> fall back to the Secure TSC based clocksource.
>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> Tested-by: Peter Gonda <pgonda@google.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
> arch/x86/kernel/kvmclock.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> index 5b2c15214a6b..3d03b4c937b9 100644
> --- a/arch/x86/kernel/kvmclock.c
> +++ b/arch/x86/kernel/kvmclock.c
> @@ -289,7 +289,7 @@ void __init kvmclock_init(void)
> {
> u8 flags;
>
> - if (!kvm_para_available() || !kvmclock)
> + if (!kvm_para_available() || !kvmclock || cc_platform_has(CC_ATTR_GUEST_SECURE_TSC))
> return;
>
> if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE2)) {
* Re: [PATCH v11 20/20] x86/cpu/amd: Do not print FW_BUG for Secure TSC
2024-07-31 15:08 ` [PATCH v11 20/20] x86/cpu/amd: Do not print FW_BUG for Secure TSC Nikunj A Dadhania
@ 2024-09-13 17:21 ` Tom Lendacky
2024-09-13 17:42 ` Jim Mattson
1 sibling, 0 replies; 66+ messages in thread
From: Tom Lendacky @ 2024-09-13 17:21 UTC (permalink / raw)
To: Nikunj A Dadhania, linux-kernel, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini
On 7/31/24 10:08, Nikunj A Dadhania wrote:
> When Secure TSC is enabled and TscInvariant (bit 8) in CPUID_8000_0007_edx
> is set, the kernel complains with the below firmware bug:
>
> [Firmware Bug]: TSC doesn't count with P0 frequency!
>
> Secure TSC does not need to run at P0 frequency; the TSC frequency is set
> by the VMM as part of the SNP_LAUNCH_START command. Skip this check when
> Secure TSC is enabled.
>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> Tested-by: Peter Gonda <pgonda@google.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
> arch/x86/kernel/cpu/amd.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index be5889bded49..87b55d2183a0 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -370,7 +370,8 @@ static void bsp_determine_snp(struct cpuinfo_x86 *c)
>
> static void bsp_init_amd(struct cpuinfo_x86 *c)
> {
> - if (cpu_has(c, X86_FEATURE_CONSTANT_TSC)) {
> + if (cpu_has(c, X86_FEATURE_CONSTANT_TSC) &&
> + !cc_platform_has(CC_ATTR_GUEST_SECURE_TSC)) {
>
> if (c->x86 > 0x10 ||
> (c->x86 == 0x10 && c->x86_model >= 0x2)) {
* Re: [PATCH v11 02/20] virt: sev-guest: Rename local guest message variables
2024-07-31 15:07 ` [PATCH v11 02/20] virt: sev-guest: Rename local guest message variables Nikunj A Dadhania
2024-08-27 8:48 ` [tip: x86/sev] " tip-bot2 for Nikunj A Dadhania
@ 2024-09-13 17:22 ` Tom Lendacky
1 sibling, 0 replies; 66+ messages in thread
From: Tom Lendacky @ 2024-09-13 17:22 UTC (permalink / raw)
To: Nikunj A Dadhania, linux-kernel, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini
On 7/31/24 10:07, Nikunj A Dadhania wrote:
> Rename local guest message variables for more clarity.
>
> No functional change.
>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
> drivers/virt/coco/sev-guest/sev-guest.c | 117 ++++++++++++------------
> 1 file changed, 59 insertions(+), 58 deletions(-)
>
* Re: [PATCH v11 03/20] virt: sev-guest: Fix user-visible strings
2024-07-31 15:07 ` [PATCH v11 03/20] virt: sev-guest: Fix user-visible strings Nikunj A Dadhania
2024-08-27 8:48 ` [tip: x86/sev] " tip-bot2 for Nikunj A Dadhania
@ 2024-09-13 17:26 ` Tom Lendacky
1 sibling, 0 replies; 66+ messages in thread
From: Tom Lendacky @ 2024-09-13 17:26 UTC (permalink / raw)
To: Nikunj A Dadhania, linux-kernel, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini
On 7/31/24 10:07, Nikunj A Dadhania wrote:
> User-visible abbreviations should be in capitals; ensure messages are
> readable and clear.
>
> No functional change.
>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
> drivers/virt/coco/sev-guest/sev-guest.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
* Re: [PATCH v11 19/20] x86/kvmclock: Skip kvmclock when Secure TSC is available
2024-07-31 15:08 ` [PATCH v11 19/20] x86/kvmclock: Skip kvmclock when Secure TSC is available Nikunj A Dadhania
2024-09-13 17:19 ` Tom Lendacky
@ 2024-09-13 17:30 ` Sean Christopherson
2024-09-16 15:20 ` Nikunj A. Dadhania
1 sibling, 1 reply; 66+ messages in thread
From: Sean Christopherson @ 2024-09-13 17:30 UTC (permalink / raw)
To: Nikunj A Dadhania
Cc: linux-kernel, thomas.lendacky, bp, x86, kvm, mingo, tglx,
dave.hansen, pgonda, pbonzini
On Wed, Jul 31, 2024, Nikunj A Dadhania wrote:
> For AMD SNP guests with Secure TSC enabled, kvm-clock is momentarily picked
> up instead of the more stable TSC clocksource.
>
> [ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
> [ 0.000001] kvm-clock: using sched offset of 1799357702246960 cycles
> [ 0.001493] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
> [ 0.006289] tsc: Detected 1996.249 MHz processor
> [ 0.305123] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x398cadd9d93, max_idle_ns: 881590552906 ns
> [ 1.045759] clocksource: Switched to clocksource kvm-clock
> [ 1.141326] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x398cadd9d93, max_idle_ns: 881590552906 ns
> [ 1.144634] clocksource: Switched to clocksource tsc
>
> When Secure TSC is enabled, skip using kvmclock. The guest kernel will
> fall back to the Secure TSC based clocksource.
>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> Tested-by: Peter Gonda <pgonda@google.com>
> ---
> arch/x86/kernel/kvmclock.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> index 5b2c15214a6b..3d03b4c937b9 100644
> --- a/arch/x86/kernel/kvmclock.c
> +++ b/arch/x86/kernel/kvmclock.c
> @@ -289,7 +289,7 @@ void __init kvmclock_init(void)
> {
> u8 flags;
>
> - if (!kvm_para_available() || !kvmclock)
> + if (!kvm_para_available() || !kvmclock || cc_platform_has(CC_ATTR_GUEST_SECURE_TSC))
I would much prefer we solve the kvmclock vs. TSC fight in a generic way. Unless
I've missed something, the fact that the TSC is more trusted in the SNP/TDX world
is simply what's forcing the issue, but it's not actually the reason why Linux
should prefer the TSC over kvmclock. The underlying reason is that platforms that
support SNP/TDX are guaranteed to have a stable, always running TSC, i.e. that the
TSC is a superior timesource purely from a functionality perspective. That it's
more secure is icing on the cake.
> return;
>
> if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE2)) {
> --
> 2.34.1
>
* Re: [PATCH v11 20/20] x86/cpu/amd: Do not print FW_BUG for Secure TSC
2024-07-31 15:08 ` [PATCH v11 20/20] x86/cpu/amd: Do not print FW_BUG for Secure TSC Nikunj A Dadhania
2024-09-13 17:21 ` Tom Lendacky
@ 2024-09-13 17:42 ` Jim Mattson
2024-09-16 11:40 ` Nikunj A. Dadhania
1 sibling, 1 reply; 66+ messages in thread
From: Jim Mattson @ 2024-09-13 17:42 UTC (permalink / raw)
To: Nikunj A Dadhania
Cc: linux-kernel, thomas.lendacky, bp, x86, kvm, mingo, tglx,
dave.hansen, pgonda, seanjc, pbonzini
On Wed, Jul 31, 2024 at 8:16 AM Nikunj A Dadhania <nikunj@amd.com> wrote:
>
> When Secure TSC is enabled and TscInvariant (bit 8) in CPUID_8000_0007_edx
> is set, the kernel complains with the below firmware bug:
>
> [Firmware Bug]: TSC doesn't count with P0 frequency!
>
> Secure TSC does not need to run at P0 frequency; the TSC frequency is set
> by the VMM as part of the SNP_LAUNCH_START command. Skip this check when
> Secure TSC is enabled.
>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> Tested-by: Peter Gonda <pgonda@google.com>
> ---
> arch/x86/kernel/cpu/amd.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index be5889bded49..87b55d2183a0 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -370,7 +370,8 @@ static void bsp_determine_snp(struct cpuinfo_x86 *c)
>
> static void bsp_init_amd(struct cpuinfo_x86 *c)
> {
> - if (cpu_has(c, X86_FEATURE_CONSTANT_TSC)) {
> + if (cpu_has(c, X86_FEATURE_CONSTANT_TSC) &&
> + !cc_platform_has(CC_ATTR_GUEST_SECURE_TSC)) {
Could we extend this to never complain in a virtual machine? i.e.
...
- if (cpu_has(c, X86_FEATURE_CONSTANT_TSC)) {
+ if (cpu_has(c, X86_FEATURE_CONSTANT_TSC) &&
+ !cpu_has(c, X86_FEATURE_HYPERVISOR)) {
...
* Re: [PATCH v11 12/20] x86/sev: Relocate SNP guest messaging routines to common code
2024-09-13 16:27 ` Tom Lendacky
@ 2024-09-16 4:42 ` Nikunj A. Dadhania
0 siblings, 0 replies; 66+ messages in thread
From: Nikunj A. Dadhania @ 2024-09-16 4:42 UTC (permalink / raw)
To: Tom Lendacky, linux-kernel, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini
On 9/13/2024 9:57 PM, Tom Lendacky wrote:
> On 7/31/24 10:08, Nikunj A Dadhania wrote:
>> At present, the SEV guest driver exclusively handles SNP guest messaging.
>> All routines for sending guest messages are embedded within the guest
>> driver. To support Secure TSC, SEV-SNP guests must communicate with the AMD
>> Security Processor during early boot. However, these guest messaging
>> functions are not accessible during early boot since they are currently
>> part of the guest driver.
>>
>> Hence, relocate the core SNP guest messaging functions to SEV common code
>> and provide an API for sending SNP guest messages.
>>
>> No functional change apart from a newly exported symbol.
>
> That means we can drop the export symbol on snp_issue_guest_request() and
> make it static, right?
Yes, let me remove that.
>
>>
>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>
> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
>
>> ---
>> arch/x86/include/asm/sev.h | 8 +
>> arch/x86/coco/sev/core.c | 284 +++++++++++++++++++++++
>> drivers/virt/coco/sev-guest/sev-guest.c | 286 ------------------------
>> arch/x86/Kconfig | 1 +
>> drivers/virt/coco/sev-guest/Kconfig | 1 -
>> 5 files changed, 293 insertions(+), 287 deletions(-)
Regards
Nikunj
* Re: [PATCH v11 13/20] x86/cc: Add CC_ATTR_GUEST_SECURE_TSC
2024-09-13 15:21 ` Tom Lendacky
@ 2024-09-16 4:53 ` Nikunj A. Dadhania
0 siblings, 0 replies; 66+ messages in thread
From: Nikunj A. Dadhania @ 2024-09-16 4:53 UTC (permalink / raw)
To: Tom Lendacky, linux-kernel, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini
On 9/13/2024 8:51 PM, Tom Lendacky wrote:
> On 7/31/24 10:08, Nikunj A Dadhania wrote:
>> @@ -88,6 +88,14 @@ enum cc_attr {
>> * enabled to run SEV-SNP guests.
>> */
>> CC_ATTR_HOST_SEV_SNP,
>> +
>> + /**
>> + * @CC_ATTR_GUEST_SECURE_TSC: Secure TSC is active.
>> + *
>> + * The platform/OS is running as a guest/virtual machine and actively
>> + * using AMD SEV-SNP Secure TSC feature.
>> + */
>> + CC_ATTR_GUEST_SECURE_TSC,
>
> If this is specifically used for the AMD feature, as opposed to a generic
> "does your system have a secure TSC", then it should probably be
> CC_ATTR_GUEST_SNP_SECURE_TSC or CC_ATTR_GUEST_SEV_SNP_SECURE_TSC.
Sure, let me rename it to CC_ATTR_GUEST_SNP_SECURE_TSC.
Regards
Nikunj
* Re: [PATCH v11 17/20] x86/sev: Allow Secure TSC feature for SNP guests
2024-09-13 16:53 ` Tom Lendacky
@ 2024-09-16 6:23 ` Nikunj A. Dadhania
0 siblings, 0 replies; 66+ messages in thread
From: Nikunj A. Dadhania @ 2024-09-16 6:23 UTC (permalink / raw)
To: Tom Lendacky, linux-kernel, bp, x86, kvm
Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini
On 9/13/2024 10:23 PM, Tom Lendacky wrote:
> On 7/31/24 10:08, Nikunj A Dadhania wrote:
>> Now that all the required plumbing is done for enabling SNP Secure TSC
>> feature, add Secure TSC to SNP features present list.
>
> So I think this should be the last patch in the series after the TSC is
> marked reliable, kvmclock is bypassed, etc. This way everything is in
> place when the guest is allowed to run with Secure TSC.
Sure, I can re-arrange that. I had arranged it in this order so that the
TSC-related problems become visible after Secure TSC is enabled and can then
be fixed.
>
>>
>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>> Tested-by: Peter Gonda <pgonda@google.com>
>
> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Regards
Nikunj
* Re: [PATCH v11 20/20] x86/cpu/amd: Do not print FW_BUG for Secure TSC
2024-09-13 17:42 ` Jim Mattson
@ 2024-09-16 11:40 ` Nikunj A. Dadhania
2024-09-16 20:21 ` Jim Mattson
0 siblings, 1 reply; 66+ messages in thread
From: Nikunj A. Dadhania @ 2024-09-16 11:40 UTC (permalink / raw)
To: Jim Mattson
Cc: linux-kernel, thomas.lendacky, bp, x86, kvm, mingo, tglx,
dave.hansen, pgonda, seanjc, pbonzini
On 9/13/2024 11:12 PM, Jim Mattson wrote:
> On Wed, Jul 31, 2024 at 8:16 AM Nikunj A Dadhania <nikunj@amd.com> wrote:
>>
>> When Secure TSC is enabled and TscInvariant (bit 8) in CPUID_8000_0007_edx
>> is set, the kernel complains with the below firmware bug:
>>
>> [Firmware Bug]: TSC doesn't count with P0 frequency!
>>
>> Secure TSC does not need to run at P0 frequency; the TSC frequency is set
>> by the VMM as part of the SNP_LAUNCH_START command. Skip this check when
>> Secure TSC is enabled.
>>
>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>> Tested-by: Peter Gonda <pgonda@google.com>
>> ---
>> arch/x86/kernel/cpu/amd.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
>> index be5889bded49..87b55d2183a0 100644
>> --- a/arch/x86/kernel/cpu/amd.c
>> +++ b/arch/x86/kernel/cpu/amd.c
>> @@ -370,7 +370,8 @@ static void bsp_determine_snp(struct cpuinfo_x86 *c)
>>
>> static void bsp_init_amd(struct cpuinfo_x86 *c)
>> {
>> - if (cpu_has(c, X86_FEATURE_CONSTANT_TSC)) {
>> + if (cpu_has(c, X86_FEATURE_CONSTANT_TSC) &&
>> + !cc_platform_has(CC_ATTR_GUEST_SECURE_TSC)) {
>
> Could we extend this to never complain in a virtual machine? i.e.
Let me get more clarity on the suggestion below and your commit [1].
> ...
> - if (cpu_has(c, X86_FEATURE_CONSTANT_TSC)) {
> + if (cpu_has(c, X86_FEATURE_CONSTANT_TSC) &&
> + !cpu_has(c, X86_FEATURE_HYPERVISOR)) {
> ...
Or do this for Family 15h and above?
Regards
Nikunj
1. https://github.com/torvalds/linux/commit/8b0e00fba934
* Re: [PATCH v11 19/20] x86/kvmclock: Skip kvmclock when Secure TSC is available
2024-09-13 17:30 ` Sean Christopherson
@ 2024-09-16 15:20 ` Nikunj A. Dadhania
2024-09-18 12:07 ` Sean Christopherson
0 siblings, 1 reply; 66+ messages in thread
From: Nikunj A. Dadhania @ 2024-09-16 15:20 UTC (permalink / raw)
To: Sean Christopherson
Cc: linux-kernel, thomas.lendacky, bp, x86, kvm, mingo, tglx,
dave.hansen, pgonda, pbonzini
On 9/13/2024 11:00 PM, Sean Christopherson wrote:
> On Wed, Jul 31, 2024, Nikunj A Dadhania wrote:
>> For AMD SNP guests with Secure TSC enabled, kvm-clock is momentarily picked
>> up instead of the more stable TSC clocksource.
>>
>> [ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
>> [ 0.000001] kvm-clock: using sched offset of 1799357702246960 cycles
>> [ 0.001493] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
>> [ 0.006289] tsc: Detected 1996.249 MHz processor
>> [ 0.305123] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x398cadd9d93, max_idle_ns: 881590552906 ns
>> [ 1.045759] clocksource: Switched to clocksource kvm-clock
>> [ 1.141326] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x398cadd9d93, max_idle_ns: 881590552906 ns
>> [ 1.144634] clocksource: Switched to clocksource tsc
>>
>> When Secure TSC is enabled, skip using kvmclock. The guest kernel will
>> fall back to the Secure TSC based clocksource.
>>
>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>> Tested-by: Peter Gonda <pgonda@google.com>
>> ---
>> arch/x86/kernel/kvmclock.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
>> index 5b2c15214a6b..3d03b4c937b9 100644
>> --- a/arch/x86/kernel/kvmclock.c
>> +++ b/arch/x86/kernel/kvmclock.c
>> @@ -289,7 +289,7 @@ void __init kvmclock_init(void)
>> {
>> u8 flags;
>>
>> - if (!kvm_para_available() || !kvmclock)
>> + if (!kvm_para_available() || !kvmclock || cc_platform_has(CC_ATTR_GUEST_SECURE_TSC))
>
> I would much prefer we solve the kvmclock vs. TSC fight in a generic way. Unless
> I've missed something, the fact that the TSC is more trusted in the SNP/TDX world
> is simply what's forcing the issue, but it's not actually the reason why Linux
> should prefer the TSC over kvmclock. The underlying reason is that platforms that
> support SNP/TDX are guaranteed to have a stable, always running TSC, i.e. that the
> TSC is a superior timesource purely from a functionality perspective. That it's
> more secure is icing on the cake.
Are you suggesting that whenever the guest is either SNP or TDX, kvmclock should be
disabled, assuming that the timesource is stable and always running?
Regards
Nikunj
* Re: [PATCH v11 20/20] x86/cpu/amd: Do not print FW_BUG for Secure TSC
2024-09-16 11:40 ` Nikunj A. Dadhania
@ 2024-09-16 20:21 ` Jim Mattson
0 siblings, 0 replies; 66+ messages in thread
From: Jim Mattson @ 2024-09-16 20:21 UTC (permalink / raw)
To: Nikunj A. Dadhania
Cc: linux-kernel, thomas.lendacky, bp, x86, kvm, mingo, tglx,
dave.hansen, pgonda, seanjc, pbonzini
On Mon, Sep 16, 2024 at 4:41 AM Nikunj A. Dadhania <nikunj@amd.com> wrote:
>
>
>
> On 9/13/2024 11:12 PM, Jim Mattson wrote:
> > On Wed, Jul 31, 2024 at 8:16 AM Nikunj A Dadhania <nikunj@amd.com> wrote:
> >>
> >> When Secure TSC is enabled and TscInvariant (bit 8) in CPUID_8000_0007_edx
> >> is set, the kernel complains with the below firmware bug:
> >>
> >> [Firmware Bug]: TSC doesn't count with P0 frequency!
> >>
> >> Secure TSC does not need to run at P0 frequency; the TSC frequency is set
> >> by the VMM as part of the SNP_LAUNCH_START command. Skip this check when
> >> Secure TSC is enabled.
> >>
> >> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> >> Tested-by: Peter Gonda <pgonda@google.com>
> >> ---
> >> arch/x86/kernel/cpu/amd.c | 3 ++-
> >> 1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> >> index be5889bded49..87b55d2183a0 100644
> >> --- a/arch/x86/kernel/cpu/amd.c
> >> +++ b/arch/x86/kernel/cpu/amd.c
> >> @@ -370,7 +370,8 @@ static void bsp_determine_snp(struct cpuinfo_x86 *c)
> >>
> >> static void bsp_init_amd(struct cpuinfo_x86 *c)
> >> {
> >> - if (cpu_has(c, X86_FEATURE_CONSTANT_TSC)) {
> >> + if (cpu_has(c, X86_FEATURE_CONSTANT_TSC) &&
> >> + !cc_platform_has(CC_ATTR_GUEST_SECURE_TSC)) {
> >
> > Could we extend this to never complain in a virtual machine? i.e.
>
> Let me get more clarity on the suggestion below and your commit [1].
>
> > ...
> > - if (cpu_has(c, X86_FEATURE_CONSTANT_TSC)) {
> > + if (cpu_has(c, X86_FEATURE_CONSTANT_TSC) &&
> > + !cpu_has(c, X86_FEATURE_HYPERVISOR)) {
> > ...
>
> Or do this for Family 15h and above?
I don't think there exists a virtual firmware that sets this bit on
older CPU families. In fact, before my referenced commit, it wasn't
possible.
> Regards
> Nikunj
>
> 1. https://github.com/torvalds/linux/commit/8b0e00fba934
Something like this is necessary for existing versions of Linux. I
would like to have set HW_CR.TscFreqSel[bit 24] at VCPU creation, but
Sean would not let me. So, now userspace has to do it right after VCPU
creation. I don't have any intention of adding the code to qemu, but
maybe someone will.
* Re: [PATCH v11 19/20] x86/kvmclock: Skip kvmclock when Secure TSC is available
2024-09-16 15:20 ` Nikunj A. Dadhania
@ 2024-09-18 12:07 ` Sean Christopherson
2024-09-20 5:15 ` Nikunj A. Dadhania
0 siblings, 1 reply; 66+ messages in thread
From: Sean Christopherson @ 2024-09-18 12:07 UTC (permalink / raw)
To: Nikunj A. Dadhania
Cc: linux-kernel, thomas.lendacky, bp, x86, kvm, mingo, tglx,
dave.hansen, pgonda, pbonzini
On Mon, Sep 16, 2024, Nikunj A. Dadhania wrote:
> On 9/13/2024 11:00 PM, Sean Christopherson wrote:
> >> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> >> Tested-by: Peter Gonda <pgonda@google.com>
> >> ---
> >> arch/x86/kernel/kvmclock.c | 2 +-
> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> >> index 5b2c15214a6b..3d03b4c937b9 100644
> >> --- a/arch/x86/kernel/kvmclock.c
> >> +++ b/arch/x86/kernel/kvmclock.c
> >> @@ -289,7 +289,7 @@ void __init kvmclock_init(void)
> >> {
> >> u8 flags;
> >>
> >> - if (!kvm_para_available() || !kvmclock)
> >> + if (!kvm_para_available() || !kvmclock || cc_platform_has(CC_ATTR_GUEST_SECURE_TSC))
> >
> > I would much prefer we solve the kvmclock vs. TSC fight in a generic way. Unless
> > I've missed something, the fact that the TSC is more trusted in the SNP/TDX world
> > is simply what's forcing the issue, but it's not actually the reason why Linux
> > should prefer the TSC over kvmclock. The underlying reason is that platforms that
> > support SNP/TDX are guaranteed to have a stable, always running TSC, i.e. that the
> > TSC is a superior timesource purely from a functionality perspective. That it's
> > more secure is icing on the cake.
>
> Are you suggesting that whenever the guest is either SNP or TDX, kvmclock
> should be disabled, assuming that the timesource is stable and always running?
No, I'm saying that the guest should prefer the raw TSC over kvmclock if the TSC
is stable, irrespective of SNP or TDX. This is effectively already done for the
timekeeping base (see commit 7539b174aef4 ("x86: kvmguest: use TSC clocksource if
invariant TSC is exposed")), but the scheduler still uses kvmclock thanks to the
kvm_sched_clock_init() code.
The other aspect of this to consider is wallclock. If I'm reading the code
correctly, _completely_ disabling kvmclock will cause the kernel to keep using the
RTC for wallclock. Using the RTC is an amusingly bad decision for SNP and TDX
(and regular VMs), as the RTC is a slooow emulation path and it's still very much
controlled by the untrusted host.
Unless you have a better idea for what to do with wallclock, I think the right
approach is to come up with a cleaner way to prefer TSC over kvmclock for timekeeping
and the scheduler, but leave wallclock as-is. And then for SNP and TDX, "assert"
that the TSC is being used instead of kvmclock. Presumably, all SNP and TDX
hosts provide a stable TSC, so there's probably no reason for the guest to even
care if the TSC is "secure".
Note, past me missed the wallclock side of things[*], so I don't think hiding
kvmclock entirely is the best solution.
[*] https://lore.kernel.org/all/ZOjF2DMBgW%2FzVvL3@google.com
* Re: [PATCH v11 19/20] x86/kvmclock: Skip kvmclock when Secure TSC is available
2024-09-18 12:07 ` Sean Christopherson
@ 2024-09-20 5:15 ` Nikunj A. Dadhania
2024-09-20 7:21 ` Sean Christopherson
0 siblings, 1 reply; 66+ messages in thread
From: Nikunj A. Dadhania @ 2024-09-20 5:15 UTC (permalink / raw)
To: Sean Christopherson
Cc: linux-kernel, thomas.lendacky, bp, x86, kvm, mingo, tglx,
dave.hansen, pgonda, pbonzini
On 9/18/2024 5:37 PM, Sean Christopherson wrote:
> On Mon, Sep 16, 2024, Nikunj A. Dadhania wrote:
>> On 9/13/2024 11:00 PM, Sean Christopherson wrote:
>>>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>>>> Tested-by: Peter Gonda <pgonda@google.com>
>>>> ---
>>>> arch/x86/kernel/kvmclock.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
>>>> index 5b2c15214a6b..3d03b4c937b9 100644
>>>> --- a/arch/x86/kernel/kvmclock.c
>>>> +++ b/arch/x86/kernel/kvmclock.c
>>>> @@ -289,7 +289,7 @@ void __init kvmclock_init(void)
>>>> {
>>>> u8 flags;
>>>>
>>>> - if (!kvm_para_available() || !kvmclock)
>>>> + if (!kvm_para_available() || !kvmclock || cc_platform_has(CC_ATTR_GUEST_SECURE_TSC))
>>>
>>> I would much prefer we solve the kvmclock vs. TSC fight in a generic way. Unless
>>> I've missed something, the fact that the TSC is more trusted in the SNP/TDX world
>>> is simply what's forcing the issue, but it's not actually the reason why Linux
>>> should prefer the TSC over kvmclock. The underlying reason is that platforms that
>>> support SNP/TDX are guaranteed to have a stable, always running TSC, i.e. that the
>>> TSC is a superior timesource purely from a functionality perspective. That it's
>>> more secure is icing on the cake.
>>
>> Are you suggesting that whenever the guest is either SNP or TDX, kvmclock
>> should be disabled, assuming that the timesource is stable and always running?
>
> No, I'm saying that the guest should prefer the raw TSC over kvmclock if the TSC
> is stable, irrespective of SNP or TDX. This is effectively already done for the
> timekeeping base (see commit 7539b174aef4 ("x86: kvmguest: use TSC clocksource if
> invariant TSC is exposed")), but the scheduler still uses kvmclock thanks to the
> kvm_sched_clock_init() code.
The kvm-clock and tsc-early clocksources both have a rating of 299. As they
have the same rating, kvm-clock is picked first.
Is it fine to drop the clock rating of kvmclock to 298? With that, tsc-early
will be picked instead.
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index fafdbf813ae3..1982cee74354 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -289,7 +289,7 @@ void __init kvmclock_init(void)
{
u8 flags;
- if (!kvm_para_available() || !kvmclock || cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC))
+ if (!kvm_para_available() || !kvmclock)
return;
if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE2)) {
@@ -342,7 +342,7 @@ void __init kvmclock_init(void)
if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
!check_tsc_unstable())
- kvm_clock.rating = 299;
+ kvm_clock.rating = 298;
clocksource_register_hz(&kvm_clock, NSEC_PER_SEC);
pv_info.name = "KVM";
[ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[ 0.000001] kvm-clock: using sched offset of 6630881179920185 cycles
[ 0.001266] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[ 0.005347] tsc: Detected 1996.247 MHz processor
[ 0.263100] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x398caa9ddcb, max_idle_ns: 881590739785 ns
[ 0.980456] clocksource: Switched to clocksource tsc-early
[ 1.059332] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x398caa9ddcb, max_idle_ns: 881590739785 ns
[ 1.062094] clocksource: Switched to clocksource tsc
> The other aspect of this to consider is wallclock. If I'm reading the code
> correctly, _completely_ disabling kvmclock will cause the kernel to keep using the
> RTC for wallclock. Using the RTC is an amusingly bad decision for SNP and TDX
> (and regular VMs), as the RTC is a slooow emulation path and it's still very much
> controlled by the untrusted host.
Right, this is not expected.
> Unless you have a better idea for what to do with wallclock, I think the right
> approach is to come up with a cleaner way to prefer TSC over kvmclock for timekeeping
> and the scheduler, but leave wallclock as-is. And then for SNP and TDX, "assert"
> that the TSC is being used instead of kvmclock. Presumably, all SNP and TDX
> hosts provide a stable TSC, so there's probably no reason for the guest to even
> care if the TSC is "secure".
>
> Note, past me missed the wallclock side of things[*], so I don't think hiding
> kvmclock entirely is the best solution.
>
> [*] https://lore.kernel.org/all/ZOjF2DMBgW%2FzVvL3@google.com
Regards
Nikunj
* Re: [PATCH v11 19/20] x86/kvmclock: Skip kvmclock when Secure TSC is available
2024-09-20 5:15 ` Nikunj A. Dadhania
@ 2024-09-20 7:21 ` Sean Christopherson
2024-09-20 8:54 ` Nikunj A. Dadhania
0 siblings, 1 reply; 66+ messages in thread
From: Sean Christopherson @ 2024-09-20 7:21 UTC (permalink / raw)
To: Nikunj A. Dadhania
Cc: linux-kernel, thomas.lendacky, bp, x86, kvm, mingo, tglx,
dave.hansen, pgonda, pbonzini
On Fri, Sep 20, 2024, Nikunj A. Dadhania wrote:
> On 9/18/2024 5:37 PM, Sean Christopherson wrote:
> > On Mon, Sep 16, 2024, Nikunj A. Dadhania wrote:
> >> On 9/13/2024 11:00 PM, Sean Christopherson wrote:
> >>>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> >>>> Tested-by: Peter Gonda <pgonda@google.com>
> >>>> ---
> >>>> arch/x86/kernel/kvmclock.c | 2 +-
> >>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> >>>> index 5b2c15214a6b..3d03b4c937b9 100644
> >>>> --- a/arch/x86/kernel/kvmclock.c
> >>>> +++ b/arch/x86/kernel/kvmclock.c
> >>>> @@ -289,7 +289,7 @@ void __init kvmclock_init(void)
> >>>> {
> >>>> u8 flags;
> >>>>
> >>>> - if (!kvm_para_available() || !kvmclock)
> >>>> + if (!kvm_para_available() || !kvmclock || cc_platform_has(CC_ATTR_GUEST_SECURE_TSC))
> >>>
> >>> I would much prefer we solve the kvmclock vs. TSC fight in a generic way. Unless
> >>> I've missed something, the fact that the TSC is more trusted in the SNP/TDX world
> >>> is simply what's forcing the issue, but it's not actually the reason why Linux
> >>> should prefer the TSC over kvmclock. The underlying reason is that platforms that
> >>> support SNP/TDX are guaranteed to have a stable, always running TSC, i.e. that the
> >>> TSC is a superior timesource purely from a functionality perspective. That it's
> >>> more secure is icing on the cake.
> >>
> >> Are you suggesting that whenever the guest is either SNP or TDX, kvmclock
> >> should be disabled assuming that timesource is stable and always running?
> >
> > No, I'm saying that the guest should prefer the raw TSC over kvmclock if the TSC
> > is stable, irrespective of SNP or TDX. This is effectively already done for the
> > timekeeping base (see commit 7539b174aef4 ("x86: kvmguest: use TSC clocksource if
> > invariant TSC is exposed")), but the scheduler still uses kvmclock thanks to the
> > kvm_sched_clock_init() code.
>
> The kvm-clock and tsc-early both are having the rating of 299. As they are of
> same rating, kvm-clock is being picked up first.
>
> Is it fine to drop the clock rating of kvmclock to 298 ? With this tsc-early will
> be picked up instead.
IMO, it's ugly, but that's as much a problem with the rating system as anything.
But the kernel will still be using kvmclock for the scheduler clock, which is
undesirable.
* Re: [PATCH v11 19/20] x86/kvmclock: Skip kvmclock when Secure TSC is available
2024-09-20 7:21 ` Sean Christopherson
@ 2024-09-20 8:54 ` Nikunj A. Dadhania
2024-09-25 8:53 ` Nikunj A. Dadhania
0 siblings, 1 reply; 66+ messages in thread
From: Nikunj A. Dadhania @ 2024-09-20 8:54 UTC (permalink / raw)
To: Sean Christopherson
Cc: linux-kernel, thomas.lendacky, bp, x86, kvm, mingo, tglx,
dave.hansen, pgonda, pbonzini
On 9/20/2024 12:51 PM, Sean Christopherson wrote:
> On Fri, Sep 20, 2024, Nikunj A. Dadhania wrote:
>> On 9/18/2024 5:37 PM, Sean Christopherson wrote:
>>> On Mon, Sep 16, 2024, Nikunj A. Dadhania wrote:
>>>> On 9/13/2024 11:00 PM, Sean Christopherson wrote:
>>>>>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>>>>>> Tested-by: Peter Gonda <pgonda@google.com>
>>>>>> ---
>>>>>> arch/x86/kernel/kvmclock.c | 2 +-
>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
>>>>>> index 5b2c15214a6b..3d03b4c937b9 100644
>>>>>> --- a/arch/x86/kernel/kvmclock.c
>>>>>> +++ b/arch/x86/kernel/kvmclock.c
>>>>>> @@ -289,7 +289,7 @@ void __init kvmclock_init(void)
>>>>>> {
>>>>>> u8 flags;
>>>>>>
>>>>>> - if (!kvm_para_available() || !kvmclock)
>>>>>> + if (!kvm_para_available() || !kvmclock || cc_platform_has(CC_ATTR_GUEST_SECURE_TSC))
>>>>>
>>>>> I would much prefer we solve the kvmclock vs. TSC fight in a generic way. Unless
>>>>> I've missed something, the fact that the TSC is more trusted in the SNP/TDX world
>>>>> is simply what's forcing the issue, but it's not actually the reason why Linux
>>>>> should prefer the TSC over kvmclock. The underlying reason is that platforms that
>>>>> support SNP/TDX are guaranteed to have a stable, always running TSC, i.e. that the
>>>>> TSC is a superior timesource purely from a functionality perspective. That it's
>>>>> more secure is icing on the cake.
>>>>
>>>> Are you suggesting that whenever the guest is either SNP or TDX, kvmclock
>>>> should be disabled assuming that timesource is stable and always running?
>>>
>>> No, I'm saying that the guest should prefer the raw TSC over kvmclock if the TSC
>>> is stable, irrespective of SNP or TDX. This is effectively already done for the
>>> timekeeping base (see commit 7539b174aef4 ("x86: kvmguest: use TSC clocksource if
>>> invariant TSC is exposed")), but the scheduler still uses kvmclock thanks to the
>>> kvm_sched_clock_init() code.
>>
>> The kvm-clock and tsc-early both are having the rating of 299. As they are of
>> same rating, kvm-clock is being picked up first.
>>
>> Is it fine to drop the clock rating of kvmclock to 298 ? With this tsc-early will
>> be picked up instead.
>
> IMO, it's ugly, but that's a problem with the rating system inasmuch as anything.
>
> But the kernel will still be using kvmclock for the scheduler clock, which is
> undesirable.
Agreed, kvm_sched_clock_init() is still being called. The above hunk was meant to
use tsc-early/tsc as the clocksource instead of kvm-clock.
Regards,
Nikunj
* Re: [PATCH v11 19/20] x86/kvmclock: Skip kvmclock when Secure TSC is available
2024-09-20 8:54 ` Nikunj A. Dadhania
@ 2024-09-25 8:53 ` Nikunj A. Dadhania
2024-09-25 12:55 ` Sean Christopherson
0 siblings, 1 reply; 66+ messages in thread
From: Nikunj A. Dadhania @ 2024-09-25 8:53 UTC (permalink / raw)
To: Sean Christopherson
Cc: linux-kernel, thomas.lendacky, bp, x86, kvm, mingo, tglx,
dave.hansen, pgonda, pbonzini, peterz, gautham.shenoy
On 9/20/2024 2:24 PM, Nikunj A. Dadhania wrote:
> On 9/20/2024 12:51 PM, Sean Christopherson wrote:
>> On Fri, Sep 20, 2024, Nikunj A. Dadhania wrote:
>>> On 9/18/2024 5:37 PM, Sean Christopherson wrote:
>>>> On Mon, Sep 16, 2024, Nikunj A. Dadhania wrote:
>>>>> On 9/13/2024 11:00 PM, Sean Christopherson wrote:
>>>>>>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>>>>>>> Tested-by: Peter Gonda <pgonda@google.com>
>>>>>>> ---
>>>>>>> arch/x86/kernel/kvmclock.c | 2 +-
>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
>>>>>>> index 5b2c15214a6b..3d03b4c937b9 100644
>>>>>>> --- a/arch/x86/kernel/kvmclock.c
>>>>>>> +++ b/arch/x86/kernel/kvmclock.c
>>>>>>> @@ -289,7 +289,7 @@ void __init kvmclock_init(void)
>>>>>>> {
>>>>>>> u8 flags;
>>>>>>>
>>>>>>> - if (!kvm_para_available() || !kvmclock)
>>>>>>> + if (!kvm_para_available() || !kvmclock || cc_platform_has(CC_ATTR_GUEST_SECURE_TSC))
>>>>>>
>>>>>> I would much prefer we solve the kvmclock vs. TSC fight in a generic way. Unless
>>>>>> I've missed something, the fact that the TSC is more trusted in the SNP/TDX world
>>>>>> is simply what's forcing the issue, but it's not actually the reason why Linux
>>>>>> should prefer the TSC over kvmclock. The underlying reason is that platforms that
>>>>>> support SNP/TDX are guaranteed to have a stable, always running TSC, i.e. that the
>>>>>> TSC is a superior timesource purely from a functionality perspective. That it's
>>>>>> more secure is icing on the cake.
>>>>>
>>>>> Are you suggesting that whenever the guest is either SNP or TDX, kvmclock
>>>>> should be disabled assuming that timesource is stable and always running?
>>>>
>>>> No, I'm saying that the guest should prefer the raw TSC over kvmclock if the TSC
>>>> is stable, irrespective of SNP or TDX. This is effectively already done for the
>>>> timekeeping base (see commit 7539b174aef4 ("x86: kvmguest: use TSC clocksource if
>>>> invariant TSC is exposed")), but the scheduler still uses kvmclock thanks to the
>>>> kvm_sched_clock_init() code.
>>>
>>> The kvm-clock and tsc-early both are having the rating of 299. As they are of
>>> same rating, kvm-clock is being picked up first.
>>>
>>> Is it fine to drop the clock rating of kvmclock to 298 ? With this tsc-early will
>>> be picked up instead.
>>
>> IMO, it's ugly, but that's a problem with the rating system inasmuch as anything.
>>
>> But the kernel will still be using kvmclock for the scheduler clock, which is
>> undesirable.
>
> Agree, kvm_sched_clock_init() is still being called. The above hunk was to use
> tsc-early/tsc as the clocksource and not kvm-clock.
How about the below patch:
From: Nikunj A Dadhania <nikunj@amd.com>
Date: Tue, 28 Nov 2023 18:29:56 +0530
Subject: [RFC PATCH] x86/kvmclock: Prefer invariant TSC as the clocksource and
scheduler clock
For platforms with a stable and always-running TSC, the kvm-clock rating is
dropped to 299 to prefer the TSC, yet the guest scheduler clock still keeps
using kvm-clock, which is undesirable. Moreover, as the kvm-clock and
tsc-early clocksources are both registered with a rating of 299, kvm-clock is
momentarily picked up instead of the more stable tsc-early clocksource.
kvm-clock: Using msrs 4b564d01 and 4b564d00
kvm-clock: using sched offset of 1799357702246960 cycles
clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
tsc: Detected 1996.249 MHz processor
clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x398cadd9d93, max_idle_ns: 881590552906 ns
clocksource: Switched to clocksource kvm-clock
clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x398cadd9d93, max_idle_ns: 881590552906 ns
clocksource: Switched to clocksource tsc
Drop the kvm-clock rating to 298 so that tsc-early is picked up before
kvm-clock, and use the TSC for the scheduler clock as well when the TSC is
invariant and stable.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
The issue we see here is that on bare-metal if the TSC is marked unstable,
then the sched-clock will fall back to jiffies. In the virtualization case,
do we want to fall back to kvm-clock when TSC is marked unstable?
---
arch/x86/kernel/kvmclock.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 5b2c15214a6b..c997b2628c4b 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -317,9 +317,6 @@ void __init kvmclock_init(void)
if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE_STABLE_BIT))
pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT);
- flags = pvclock_read_flags(&hv_clock_boot[0].pvti);
- kvm_sched_clock_init(flags & PVCLOCK_TSC_STABLE_BIT);
-
x86_platform.calibrate_tsc = kvm_get_tsc_khz;
x86_platform.calibrate_cpu = kvm_get_tsc_khz;
x86_platform.get_wallclock = kvm_get_wallclock;
@@ -341,8 +338,12 @@ void __init kvmclock_init(void)
*/
if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
- !check_tsc_unstable())
- kvm_clock.rating = 299;
+ !check_tsc_unstable()) {
+ kvm_clock.rating = 298;
+ } else {
+ flags = pvclock_read_flags(&hv_clock_boot[0].pvti);
+ kvm_sched_clock_init(flags & PVCLOCK_TSC_STABLE_BIT);
+ }
clocksource_register_hz(&kvm_clock, NSEC_PER_SEC);
pv_info.name = "KVM";
--
2.34.1
* Re: [PATCH v11 19/20] x86/kvmclock: Skip kvmclock when Secure TSC is available
2024-09-25 8:53 ` Nikunj A. Dadhania
@ 2024-09-25 12:55 ` Sean Christopherson
2024-09-30 6:27 ` Nikunj A. Dadhania
0 siblings, 1 reply; 66+ messages in thread
From: Sean Christopherson @ 2024-09-25 12:55 UTC (permalink / raw)
To: Nikunj A. Dadhania
Cc: linux-kernel, thomas.lendacky, bp, x86, kvm, mingo, tglx,
dave.hansen, pgonda, pbonzini, peterz, gautham.shenoy
On Wed, Sep 25, 2024, Nikunj A. Dadhania wrote:
> >>>>> Are you suggesting that whenever the guest is either SNP or TDX, kvmclock
> >>>>> should be disabled assuming that timesource is stable and always running?
> >>>>
> >>>> No, I'm saying that the guest should prefer the raw TSC over kvmclock if the TSC
> >>>> is stable, irrespective of SNP or TDX. This is effectively already done for the
> >>>> timekeeping base (see commit 7539b174aef4 ("x86: kvmguest: use TSC clocksource if
> >>>> invariant TSC is exposed")), but the scheduler still uses kvmclock thanks to the
> >>>> kvm_sched_clock_init() code.
> >>>
> >>> The kvm-clock and tsc-early both are having the rating of 299. As they are of
> >>> same rating, kvm-clock is being picked up first.
> >>>
> >>> Is it fine to drop the clock rating of kvmclock to 298 ? With this tsc-early will
> >>> be picked up instead.
> >>
> >> IMO, it's ugly, but that's a problem with the rating system inasmuch as anything.
> >>
> >> But the kernel will still be using kvmclock for the scheduler clock, which is
> >> undesirable.
> >
> > Agree, kvm_sched_clock_init() is still being called. The above hunk was to use
> > tsc-early/tsc as the clocksource and not kvm-clock.
>
> How about the below patch:
>
> From: Nikunj A Dadhania <nikunj@amd.com>
> Date: Tue, 28 Nov 2023 18:29:56 +0530
> Subject: [RFC PATCH] x86/kvmclock: Prefer invariant TSC as the clocksource and
> scheduler clock
>
> For platforms that support stable and always running TSC, although the
> kvm-clock rating is dropped to 299 to prefer TSC, the guest scheduler clock
> still keeps on using the kvm-clock which is undesirable. Moreover, as the
> kvm-clock and early-tsc clocksource are both registered with 299 rating,
> kvm-clock is being picked up momentarily instead of selecting more stable
> tsc-early clocksource.
>
> kvm-clock: Using msrs 4b564d01 and 4b564d00
> kvm-clock: using sched offset of 1799357702246960 cycles
> clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
> tsc: Detected 1996.249 MHz processor
> clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x398cadd9d93, max_idle_ns: 881590552906 ns
> clocksource: Switched to clocksource kvm-clock
> clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x398cadd9d93, max_idle_ns: 881590552906 ns
> clocksource: Switched to clocksource tsc
>
> Drop the kvm-clock rating to 298, so that tsc-early is picked up before
> kvm-clock and use TSC for scheduler clock as well when the TSC is invariant
> and stable.
>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>
> ---
>
> The issue we see here is that on bare-metal if the TSC is marked unstable,
> then the sched-clock will fall back to jiffies. In the virtualization case,
> do we want to fall back to kvm-clock when TSC is marked unstable?
In the general case, yes. Though that might be a WARN-able offense if the TSC
is allegedly constant+nonstop. And for SNP and TDX, it might be a "panic and do
not boot" offense, since using kvmclock undermines the security of the guest.
> ---
> arch/x86/kernel/kvmclock.c | 11 ++++++-----
> 1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> index 5b2c15214a6b..c997b2628c4b 100644
> --- a/arch/x86/kernel/kvmclock.c
> +++ b/arch/x86/kernel/kvmclock.c
> @@ -317,9 +317,6 @@ void __init kvmclock_init(void)
> if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE_STABLE_BIT))
> pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT);
>
> - flags = pvclock_read_flags(&hv_clock_boot[0].pvti);
> - kvm_sched_clock_init(flags & PVCLOCK_TSC_STABLE_BIT);
> -
> x86_platform.calibrate_tsc = kvm_get_tsc_khz;
> x86_platform.calibrate_cpu = kvm_get_tsc_khz;
> x86_platform.get_wallclock = kvm_get_wallclock;
> @@ -341,8 +338,12 @@ void __init kvmclock_init(void)
> */
> if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
> boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
> - !check_tsc_unstable())
> - kvm_clock.rating = 299;
> + !check_tsc_unstable()) {
> + kvm_clock.rating = 298;
> + } else {
> + flags = pvclock_read_flags(&hv_clock_boot[0].pvti);
> + kvm_sched_clock_init(flags & PVCLOCK_TSC_STABLE_BIT);
> + }
I would really, really like to fix this in a centralized location, not by having
each PV clocksource muck with their clock's rating. I'm not even sure the existing
code is entirely correct, as kvmclock_init() runs _before_ tsc_early_init(). Which
is desirable in the legacy case, as it allows calibrating the TSC using kvmclock,
x86_platform.calibrate_tsc = kvm_get_tsc_khz;
but on modern setups that's definitely undesirable, as it means the kernel won't
use CPUID.0x15, which very explicitly tells software the frequency of the TSC.
And I don't think we want to simply point at native_calibrate_tsc(), because that
thing is not at all correct for a VM, where checking x86_vendor and x86_vfm is at
best sketchy. E.g. I would think it's in AMD's interest for Secure TSC to define
the TSC frequency using CPUID.0x15, even if AMD CPUs don't (yet) natively support
CPUID.0x15.
In other words, I think we need to overhaul the PV clock vs. TSC logic so that it
makes sense for modern CPUs+VMs, not just keep hacking away at kvmclock. I don't
expect the code would be all that complex in the end, the hardest part is likely
just figuring out (and agreeing on) what exactly the kernel should be doing.
> clocksource_register_hz(&kvm_clock, NSEC_PER_SEC);
> pv_info.name = "KVM";
> --
> 2.34.1
>
* Re: [PATCH v11 19/20] x86/kvmclock: Skip kvmclock when Secure TSC is available
2024-09-25 12:55 ` Sean Christopherson
@ 2024-09-30 6:27 ` Nikunj A. Dadhania
2024-09-30 21:20 ` Thomas Gleixner
0 siblings, 1 reply; 66+ messages in thread
From: Nikunj A. Dadhania @ 2024-09-30 6:27 UTC (permalink / raw)
To: Sean Christopherson
Cc: linux-kernel, thomas.lendacky, bp, x86, kvm, mingo, tglx,
dave.hansen, pgonda, pbonzini, peterz, gautham.shenoy
On 9/25/2024 6:25 PM, Sean Christopherson wrote:
> On Wed, Sep 25, 2024, Nikunj A. Dadhania wrote:
>>>>>>> Are you suggesting that whenever the guest is either SNP or TDX, kvmclock
>>>>>>> should be disabled assuming that timesource is stable and always running?
>>>>>>
>>>>>> No, I'm saying that the guest should prefer the raw TSC over kvmclock if the TSC
>>>>>> is stable, irrespective of SNP or TDX. This is effectively already done for the
>>>>>> timekeeping base (see commit 7539b174aef4 ("x86: kvmguest: use TSC clocksource if
>>>>>> invariant TSC is exposed")), but the scheduler still uses kvmclock thanks to the
>>>>>> kvm_sched_clock_init() code.
>>>>>
>>>>> The kvm-clock and tsc-early both are having the rating of 299. As they are of
>>>>> same rating, kvm-clock is being picked up first.
>>>>>
>>>>> Is it fine to drop the clock rating of kvmclock to 298 ? With this tsc-early will
>>>>> be picked up instead.
>>>>
>>>> IMO, it's ugly, but that's a problem with the rating system inasmuch as anything.
>>>>
>>>> But the kernel will still be using kvmclock for the scheduler clock, which is
>>>> undesirable.
>>>
>>> Agree, kvm_sched_clock_init() is still being called. The above hunk was to use
>>> tsc-early/tsc as the clocksource and not kvm-clock.
>>
>> How about the below patch:
>>
>> From: Nikunj A Dadhania <nikunj@amd.com>
>> Date: Tue, 28 Nov 2023 18:29:56 +0530
>> Subject: [RFC PATCH] x86/kvmclock: Prefer invariant TSC as the clocksource and
>> scheduler clock
>>
>> For platforms that support stable and always running TSC, although the
>> kvm-clock rating is dropped to 299 to prefer TSC, the guest scheduler clock
>> still keeps on using the kvm-clock which is undesirable. Moreover, as the
>> kvm-clock and early-tsc clocksource are both registered with 299 rating,
>> kvm-clock is being picked up momentarily instead of selecting more stable
>> tsc-early clocksource.
>>
>> kvm-clock: Using msrs 4b564d01 and 4b564d00
>> kvm-clock: using sched offset of 1799357702246960 cycles
>> clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
>> tsc: Detected 1996.249 MHz processor
>> clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x398cadd9d93, max_idle_ns: 881590552906 ns
>> clocksource: Switched to clocksource kvm-clock
>> clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x398cadd9d93, max_idle_ns: 881590552906 ns
>> clocksource: Switched to clocksource tsc
>>
>> Drop the kvm-clock rating to 298, so that tsc-early is picked up before
>> kvm-clock and use TSC for scheduler clock as well when the TSC is invariant
>> and stable.
>>
>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>>
>> ---
>>
>> The issue we see here is that on bare-metal if the TSC is marked unstable,
>> then the sched-clock will fall back to jiffies. In the virtualization case,
>> do we want to fall back to kvm-clock when TSC is marked unstable?
>
> In the general case, yes. Though that might be a WARN-able offense if the TSC
> is allegedly constant+nonstop. And for SNP and TDX, it might be a "panic and do
> not boot" offense, since using kvmclock undermines the security of the guest.
>
>> ---
>> arch/x86/kernel/kvmclock.c | 11 ++++++-----
>> 1 file changed, 6 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
>> index 5b2c15214a6b..c997b2628c4b 100644
>> --- a/arch/x86/kernel/kvmclock.c
>> +++ b/arch/x86/kernel/kvmclock.c
>> @@ -317,9 +317,6 @@ void __init kvmclock_init(void)
>> if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE_STABLE_BIT))
>> pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT);
>>
>> - flags = pvclock_read_flags(&hv_clock_boot[0].pvti);
>> - kvm_sched_clock_init(flags & PVCLOCK_TSC_STABLE_BIT);
>> -
>> x86_platform.calibrate_tsc = kvm_get_tsc_khz;
>> x86_platform.calibrate_cpu = kvm_get_tsc_khz;
>> x86_platform.get_wallclock = kvm_get_wallclock;
>> @@ -341,8 +338,12 @@ void __init kvmclock_init(void)
>> */
>> if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
>> boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
>> - !check_tsc_unstable())
>> - kvm_clock.rating = 299;
>> + !check_tsc_unstable()) {
>> + kvm_clock.rating = 298;
>> + } else {
>> + flags = pvclock_read_flags(&hv_clock_boot[0].pvti);
>> + kvm_sched_clock_init(flags & PVCLOCK_TSC_STABLE_BIT);
>> + }
>
> I would really, really like to fix this in a centralized location, not by having
> each PV clocksource muck with their clock's rating.
TSC Clock Rating Adjustment:
* During TSC initialization, downgrade the TSC clock rating to 200 if TSC is not
constant/reliable, placing it below HPET.
* Ensure the kvm-clock rating is set to 299 by default in the
struct clocksource kvm_clock.
* Avoid changing the kvm clock rating based on the availability of reliable
clock sources. Let the TSC clock source determine and downgrade itself.
The above will make sure that the PV clocksource rating remains unaffected.
Clock source selection order when the ratings match:
* Currently, clocks are registered and enqueued based on their rating.
* When clock ratings are tied, use the advertised clock frequency (freq_khz) as a
secondary key to favor the higher-frequency clock.
This approach improves the selection process by considering both rating and
frequency. Something like below:
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index d0538a75f4c6..591451ccc0fa 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -1098,6 +1098,9 @@ static void clocksource_enqueue(struct clocksource *cs)
/* Keep track of the place, where to insert */
if (tmp->rating < cs->rating)
break;
+ if (tmp->rating == cs->rating && tmp->freq_khz < cs->freq_khz)
+ break;
+
entry = &tmp->list;
}
list_add(&cs->list, entry);
@@ -1133,6 +1136,9 @@ void __clocksource_update_freq_scale(struct clocksource *cs, u32 scale, u32 freq
* clocksource with mask >= 40-bit and f >= 4GHz. That maps to
* ~ 0.06ppm granularity for NTP.
*/
+
+ cs->freq_khz = freq * scale / 1000;
+
sec = cs->mask;
do_div(sec, freq);
do_div(sec, scale);
> I'm not even sure the existing
> code is entirely correct, as kvmclock_init() runs _before_ tsc_early_init(). Which
> is desirable in the legacy case, as it allows calibrating the TSC using kvmclock,
>
> x86_platform.calibrate_tsc = kvm_get_tsc_khz;
>
> but on modern setups that's definitely undesirable, as it means the kernel won't
> use CPUID.0x15, which very explicitly tells software the frequency of the TSC.
>
> And I don't think we want to simply point at native_calibrate_tsc(), because that
> thing is not at all correct for a VM, where checking x86_vendor and x86_vfm is at
> best sketchy.
>
> E.g. I would think it's in AMD's interest for Secure TSC to define
> the TSC frequency using CPUID.0x15, even if AMD CPUs don't (yet) natively support
> CPUID.0x15.
For SecureTSC: GUEST_TSC_FREQ MSR (C001_0134h) provides the TSC frequency.
> In other words, I think we need to overhaul the PV clock vs. TSC logic so that it
> makes sense for modern CPUs+VMs, not just keep hacking away at kvmclock. I don't
> expect the code would be all that complex in the end, the hardest part is likely
> just figuring out (and agreeing on) what exactly the kernel should be doing.
To summarise this thread with respect to TSC vs KVM clock, there are three key questions:
1) When should kvmclock init be done?
2) How should the TSC frequency be discovered?
3) What should be the sched clock source and how should it be selected in a generic way?
○ Legacy CPU/VMs: VMs running on platforms without non-stop/constant TSC
+ kvm-clock should be registered before tsc-early/tsc
+ Need to calibrate TSC frequency
+ Use kvmclock wallclock
+ Use kvmclock for the sched clock, selected dynamically
(using clocksource enable()/disable() callback)
○ Modern CPU/VMs: VMs running on platforms supporting constant, non-stop and reliable TSC
+ kvm-clock should be registered before tsc-early/tsc
+ TSC Frequency:
Intel: TSC frequency using CPUID 0x15H/0x16H?
For SecureTSC: GUEST_TSC_FREQ MSR (C001_0134h) provides the TSC frequency, other
AMD guests need to calibrate the TSC frequency.
+ Use kvmclock wallclock
+ Use TSC for sched clock
After reviewing the code, the current init sequence looks correct for both legacy
and modern VMs/CPUs. Let kvmclock go ahead and register itself as a clocksource,
but defer registration of the sched clock until kvm-clock is actually selected as
the clocksource, and restore the old sched clock when kvm-clock is disabled.
Something like the patch below, lightly tested:
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 5b2c15214a6b..7167caa3348d 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -21,6 +21,7 @@
#include <asm/hypervisor.h>
#include <asm/x86_init.h>
#include <asm/kvmclock.h>
+#include <asm/timer.h>
static int kvmclock __initdata = 1;
static int kvmclock_vsyscall __initdata = 1;
@@ -148,12 +149,41 @@ bool kvm_check_and_clear_guest_paused(void)
return ret;
}
+static u64 (*old_pv_sched_clock)(void);
+
+static void enable_kvm_schedclock_work(struct work_struct *work)
+{
+ u8 flags;
+
+ old_pv_sched_clock = static_call_query(pv_sched_clock);
+ flags = pvclock_read_flags(&hv_clock_boot[0].pvti);
+ kvm_sched_clock_init(flags & PVCLOCK_TSC_STABLE_BIT);
+}
+
+static DECLARE_DELAYED_WORK(enable_kvm_sc, enable_kvm_schedclock_work);
+
+static void disable_kvm_schedclock_work(struct work_struct *work)
+{
+ if (old_pv_sched_clock)
+ paravirt_set_sched_clock(old_pv_sched_clock);
+}
+static DECLARE_DELAYED_WORK(disable_kvm_sc, disable_kvm_schedclock_work);
+
static int kvm_cs_enable(struct clocksource *cs)
{
+ u8 flags;
+
vclocks_set_used(VDSO_CLOCKMODE_PVCLOCK);
+ schedule_delayed_work(&enable_kvm_sc, 0);
+
return 0;
}
+static void kvm_cs_disable(struct clocksource *cs)
+{
+ schedule_delayed_work(&disable_kvm_sc, 0);
+}
+
static struct clocksource kvm_clock = {
.name = "kvm-clock",
.read = kvm_clock_get_cycles,
@@ -162,6 +192,7 @@ static struct clocksource kvm_clock = {
.flags = CLOCK_SOURCE_IS_CONTINUOUS,
.id = CSID_X86_KVM_CLK,
.enable = kvm_cs_enable,
+ .disable = kvm_cs_disable,
};
static void kvm_register_clock(char *txt)
@@ -287,8 +318,6 @@ static int kvmclock_setup_percpu(unsigned int cpu)
void __init kvmclock_init(void)
{
- u8 flags;
-
if (!kvm_para_available() || !kvmclock)
return;
@@ -317,9 +346,6 @@ void __init kvmclock_init(void)
if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE_STABLE_BIT))
pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT);
- flags = pvclock_read_flags(&hv_clock_boot[0].pvti);
- kvm_sched_clock_init(flags & PVCLOCK_TSC_STABLE_BIT);
-
x86_platform.calibrate_tsc = kvm_get_tsc_khz;
x86_platform.calibrate_cpu = kvm_get_tsc_khz;
x86_platform.get_wallclock = kvm_get_wallclock;
Regards
Nikunj
* Re: [PATCH v11 19/20] x86/kvmclock: Skip kvmclock when Secure TSC is available
2024-09-30 6:27 ` Nikunj A. Dadhania
@ 2024-09-30 21:20 ` Thomas Gleixner
2024-10-01 4:26 ` Nikunj A. Dadhania
0 siblings, 1 reply; 66+ messages in thread
From: Thomas Gleixner @ 2024-09-30 21:20 UTC (permalink / raw)
To: Nikunj A. Dadhania, Sean Christopherson
Cc: linux-kernel, thomas.lendacky, bp, x86, kvm, mingo, dave.hansen,
pgonda, pbonzini, peterz, gautham.shenoy
On Mon, Sep 30 2024 at 11:57, Nikunj A. Dadhania wrote:
> TSC Clock Rating Adjustment:
> * During TSC initialization, downgrade the TSC clock rating to 200 if TSC is not
> constant/reliable, placing it below HPET.
Downgrading a constant TSC is a bad idea. Reliable just means that it
does not need a watchdog clocksource. If it's non-constant it's
downgraded anyway.
> * Ensure the kvm-clock rating is set to 299 by default in the
> struct clocksource kvm_clock.
> * Avoid changing the kvm clock rating based on the availability of reliable
> clock sources. Let the TSC clock source determine and downgrade itself.
Why downgrade? If it's the best one you want to upgrade it so it's
preferred over the others.
> The above will make sure that the PV clocksource rating remain
> unaffected.
>
> Clock source selection order when the ratings match:
> * Currently, clocks are registered and enqueued based on their rating.
> * When clock ratings are tied, use the advertised clock frequency (freq_khz) as a
> secondary key to favor the higher-frequency clock.
>
> This approach improves the selection process by considering both rating and
> frequency. Something like below:
What does the frequency tell us? Not really anything. It's not
necessarily the better clocksource.
Higher frequency gives you a slightly better resolution, but as all of
this is usually sub-nanosecond resolution already that's not making a
difference in practice.
So if you know you want TSC to be selected, then upgrade the rating of
both the early and the regular TSC clocksource and be done with it.
Thanks,
tglx
* Re: [PATCH v11 19/20] x86/kvmclock: Skip kvmclock when Secure TSC is available
2024-09-30 21:20 ` Thomas Gleixner
@ 2024-10-01 4:26 ` Nikunj A. Dadhania
2024-10-01 14:36 ` Nikunj A. Dadhania
0 siblings, 1 reply; 66+ messages in thread
From: Nikunj A. Dadhania @ 2024-10-01 4:26 UTC (permalink / raw)
To: Thomas Gleixner, Sean Christopherson
Cc: linux-kernel, thomas.lendacky, bp, x86, kvm, mingo, dave.hansen,
pgonda, pbonzini, peterz, gautham.shenoy
On 10/1/2024 2:50 AM, Thomas Gleixner wrote:
> On Mon, Sep 30 2024 at 11:57, Nikunj A. Dadhania wrote:
>> TSC Clock Rating Adjustment:
>> * During TSC initialization, downgrade the TSC clock rating to 200 if TSC is not
>> constant/reliable, placing it below HPET.
>
> Downgrading a constant TSC is a bad idea. Reliable just means that it
> does not need a watchdog clocksource. If it's non-constant it's
> downgraded anyway.
>
>> * Ensure the kvm-clock rating is set to 299 by default in the
>> struct clocksource kvm_clock.
>> * Avoid changing the kvm clock rating based on the availability of reliable
>> clock sources. Let the TSC clock source determine and downgrade itself.
>
> Why downgrade? If it's the best one you want to upgrade it so it's
> preferred over the others.
Thanks for confirming that upgrading the TSC rating is fine.
>> The above will make sure that the PV clocksource rating remains
>> unaffected.
>>
>> Clock source selection order when the ratings match:
>> * Currently, clocks are registered and enqueued based on their rating.
>> * When clock ratings are tied, use the advertised clock frequency (freq_khz) as a
>> secondary key to favor the higher-frequency clock.
>>
>> This approach improves the selection process by considering both rating and
>> frequency. Something like below:
>
> What does the frequency tell us? Not really anything. It's not
> necessarily the better clocksource.
>
> Higher frequency gives you a slightly better resolution, but as all of
> this is usually sub-nanosecond resolution already that's not making a
> difference in practice.
>
> So if you know you want TSC to be selected, then upgrade the rating of
> both the early and the regular TSC clocksource and be done with it.
Sure Thomas, I will modify the patch accordingly and send an RFC.
Also, I realized that for guests we should be calling rdtsc_ordered() instead
of rdtsc(), to make sure that time moves forward even when vCPUs are
migrated.
Thanks,
Nikunj
* Re: [PATCH v11 19/20] x86/kvmclock: Skip kvmclock when Secure TSC is available
2024-10-01 4:26 ` Nikunj A. Dadhania
@ 2024-10-01 14:36 ` Nikunj A. Dadhania
0 siblings, 0 replies; 66+ messages in thread
From: Nikunj A. Dadhania @ 2024-10-01 14:36 UTC (permalink / raw)
To: Thomas Gleixner, Sean Christopherson
Cc: linux-kernel, thomas.lendacky, bp, x86, kvm, mingo, dave.hansen,
pgonda, pbonzini, peterz, gautham.shenoy
On 10/1/2024 9:56 AM, Nikunj A. Dadhania wrote:
> Also I realized that, for the guests, instead of rdtsc(), we should be
> calling rdtsc_ordered() to make sure that time moves forward even when
> vCPUs are migrated.
The above is with reference to native_sched_clock() being used for the VM
instead of kvm_sched_clock_read() when using the TSC.
Regards
Nikunj
Thread overview: 66+ messages
2024-07-31 15:07 [PATCH v11 00/20] Add Secure TSC support for SNP guests Nikunj A Dadhania
2024-07-31 15:07 ` [PATCH v11 01/20] virt: sev-guest: Replace dev_dbg with pr_debug Nikunj A Dadhania
2024-08-27 8:48 ` [tip: x86/sev] virt: sev-guest: Replace dev_dbg() with pr_debug() tip-bot2 for Nikunj A Dadhania
2024-07-31 15:07 ` [PATCH v11 02/20] virt: sev-guest: Rename local guest message variables Nikunj A Dadhania
2024-08-27 8:48 ` [tip: x86/sev] " tip-bot2 for Nikunj A Dadhania
2024-09-13 17:22 ` [PATCH v11 02/20] " Tom Lendacky
2024-07-31 15:07 ` [PATCH v11 03/20] virt: sev-guest: Fix user-visible strings Nikunj A Dadhania
2024-08-27 8:48 ` [tip: x86/sev] " tip-bot2 for Nikunj A Dadhania
2024-09-13 17:26 ` [PATCH v11 03/20] " Tom Lendacky
2024-07-31 15:07 ` [PATCH v11 04/20] virt: sev-guest: Ensure the SNP guest messages do not exceed a page Nikunj A Dadhania
2024-08-27 8:48 ` [tip: x86/sev] " tip-bot2 for Nikunj A Dadhania
2024-07-31 15:07 ` [PATCH v11 05/20] virt: sev-guest: Use AES GCM crypto library Nikunj A Dadhania
2024-07-31 15:07 ` [PATCH v11 06/20] x86/sev: Handle failures from snp_init() Nikunj A Dadhania
2024-08-27 11:32 ` Borislav Petkov
2024-08-28 4:47 ` Nikunj A. Dadhania
2024-08-28 9:49 ` Borislav Petkov
2024-08-28 10:16 ` Nikunj A. Dadhania
2024-08-28 10:23 ` Borislav Petkov
2024-07-31 15:07 ` [PATCH v11 07/20] x86/sev: Cache the secrets page address Nikunj A Dadhania
2024-07-31 15:07 ` [PATCH v11 08/20] virt: sev-guest: Consolidate SNP guest messaging parameters to a struct Nikunj A Dadhania
2024-09-04 14:31 ` Borislav Petkov
2024-09-05 4:35 ` Nikunj A. Dadhania
2024-07-31 15:08 ` [PATCH v11 09/20] virt: sev-guest: Reduce the scope of SNP command mutex Nikunj A Dadhania
2024-09-12 21:54 ` Tom Lendacky
2024-09-13 4:26 ` Nikunj A. Dadhania
2024-09-13 14:06 ` Tom Lendacky
2024-07-31 15:08 ` [PATCH v11 10/20] virt: sev-guest: Carve out SNP message context structure Nikunj A Dadhania
2024-09-13 15:52 ` Tom Lendacky
2024-07-31 15:08 ` [PATCH v11 11/20] x86/sev: Carve out and export SNP guest messaging init routines Nikunj A Dadhania
2024-09-13 15:53 ` Tom Lendacky
2024-07-31 15:08 ` [PATCH v11 12/20] x86/sev: Relocate SNP guest messaging routines to common code Nikunj A Dadhania
2024-09-13 16:27 ` Tom Lendacky
2024-09-16 4:42 ` Nikunj A. Dadhania
2024-07-31 15:08 ` [PATCH v11 13/20] x86/cc: Add CC_ATTR_GUEST_SECURE_TSC Nikunj A Dadhania
2024-09-13 15:21 ` Tom Lendacky
2024-09-16 4:53 ` Nikunj A. Dadhania
2024-07-31 15:08 ` [PATCH v11 14/20] x86/sev: Add Secure TSC support for SNP guests Nikunj A Dadhania
2024-09-13 16:29 ` Tom Lendacky
2024-07-31 15:08 ` [PATCH v11 15/20] x86/sev: Change TSC MSR behavior for Secure TSC enabled guests Nikunj A Dadhania
2024-07-31 15:08 ` [PATCH v11 16/20] x86/sev: Prevent RDTSC/RDTSCP interception " Nikunj A Dadhania
2024-09-13 16:49 ` Tom Lendacky
2024-07-31 15:08 ` [PATCH v11 17/20] x86/sev: Allow Secure TSC feature for SNP guests Nikunj A Dadhania
2024-09-13 16:53 ` Tom Lendacky
2024-09-16 6:23 ` Nikunj A. Dadhania
2024-07-31 15:08 ` [PATCH v11 18/20] x86/sev: Mark Secure TSC as reliable clocksource Nikunj A Dadhania
2024-09-13 16:59 ` Tom Lendacky
2024-07-31 15:08 ` [PATCH v11 19/20] x86/kvmclock: Skip kvmclock when Secure TSC is available Nikunj A Dadhania
2024-09-13 17:19 ` Tom Lendacky
2024-09-13 17:30 ` Sean Christopherson
2024-09-16 15:20 ` Nikunj A. Dadhania
2024-09-18 12:07 ` Sean Christopherson
2024-09-20 5:15 ` Nikunj A. Dadhania
2024-09-20 7:21 ` Sean Christopherson
2024-09-20 8:54 ` Nikunj A. Dadhania
2024-09-25 8:53 ` Nikunj A. Dadhania
2024-09-25 12:55 ` Sean Christopherson
2024-09-30 6:27 ` Nikunj A. Dadhania
2024-09-30 21:20 ` Thomas Gleixner
2024-10-01 4:26 ` Nikunj A. Dadhania
2024-10-01 14:36 ` Nikunj A. Dadhania
2024-07-31 15:08 ` [PATCH v11 20/20] x86/cpu/amd: Do not print FW_BUG for Secure TSC Nikunj A Dadhania
2024-09-13 17:21 ` Tom Lendacky
2024-09-13 17:42 ` Jim Mattson
2024-09-16 11:40 ` Nikunj A. Dadhania
2024-09-16 20:21 ` Jim Mattson
2024-08-14 4:14 ` [PATCH v11 00/20] Add Secure TSC support for SNP guests Nikunj A. Dadhania