public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v13 00/13] Add Secure TSC support for SNP guests
@ 2024-10-21  5:51 Nikunj A Dadhania
  2024-10-21  5:51 ` [PATCH v13 01/13] x86/sev: Carve out and export SNP guest messaging init routines Nikunj A Dadhania
                   ` (12 more replies)
  0 siblings, 13 replies; 35+ messages in thread
From: Nikunj A Dadhania @ 2024-10-21  5:51 UTC (permalink / raw)
  To: linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj

This patchset is also available at:

  https://github.com/AMDESE/linux-kvm/tree/sectsc-guest-latest

and is based on tip/x86/sev

Overview
--------

Secure TSC allows guests to securely use RDTSC/RDTSCP instructions as the
parameters being used cannot be changed by hypervisor once the guest is
launched. More details in the AMD64 APM Vol 2, Section "Secure TSC".

In order to enable secure TSC, SEV-SNP guests need to send a TSC_INFO guest
message before the APs are booted. Details from the TSC_INFO response will
then be used to program the VMSA before the APs are brought up. See "SEV
Secure Nested Paging Firmware ABI Specification" document (currently at
https://www.amd.com/system/files/TechDocs/56860.pdf) section "TSC Info"

SEV-guest driver has the implementation for guest and AMD Security
Processor communication. As the TSC_INFO needs to be initialized during
early boot before APs are started, move the guest messaging code from
sev-guest driver to sev/core.c and provide well defined APIs to the
sev-guest driver.

Patches:
01-02: Patches moving SNP guest messaging code from SEV guest driver to
       SEV common code
03-09: SecureTSC enablement patches
10-11: Generic TSC/kvmclock improvements
12-13: SecureTSC enablement patches.

Testing SecureTSC
-----------------

SecureTSC hypervisor patches based on top of SEV-SNP Guest MEMFD series:
https://github.com/AMDESE/linux-kvm/tree/sectsc-host-latest

QEMU changes:
https://github.com/nikunjad/qemu/tree/snp-securetsc-latest

QEMU commandline SEV-SNP with SecureTSC:

  qemu-system-x86_64 -cpu EPYC-Milan-v2 -smp 4 \
    -object memory-backend-memfd,id=ram1,size=1G,share=true,prealloc=false,reserve=false \
    -object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1,secure-tsc=on \
    -machine q35,confidential-guest-support=sev0,memory-backend=ram1 \
    ...

Changelog:
----------
v13:
* Rebased on top of tip/x86/sev
* Squashed CC_ATTR_GUEST_SNP_SNP_SECURE_TSC change to patch where it used. (Tom)
* Fixed sparse warnings (kernel test robot)
* Added patch to prevent GUEST_TSC_FREQ MSR interception(Tom)
* Fix sched_clock override in common code (Sean)
* Added GHCB_TERM_SECURE_TSC_KVMCLOCK reason code and use sev_es_terminate() to
  inform the hypervisor that KVMCLOCK was being instead of Secure TSC.

v12: https://lore.kernel.org/lkml/20241009092850.197575-1-nikunj@amd.com/
* Rebased on top of v6.12-rc2
* Collected Reviewed-by (Tom)
* Improve error handling in sme_enable() (Boris)
* Drop handle_guest_request() copying routine (Boris)
* Move VMPCK empty check inside the lock (Tom)
* Drop export symbol for snp_issue_guest_request() (Tom)
* Rename CC_ATTR_GUEST_SECURE_TSC as CC_ATTR_GUEST_SNP_SECURE_TSC (Tom)
* Upgrade the tsc early and regular clock rating when TSC is invariant,
  non-stop and stable (tglx)
* Initialize kvm sched clock only when the kvmclock source is selected (Sean)
* Abort SecureTSC enabled guests when kvmclock is selected (Sean)
* Added patch to use TSC frequency using GUEST_TSC_FREQ MSR (Sean)

Nikunj A Dadhania (13):
  x86/sev: Carve out and export SNP guest messaging init routines
  x86/sev: Relocate SNP guest messaging routines to common code
  x86/sev: Add Secure TSC support for SNP guests
  x86/sev: Change TSC MSR behavior for Secure TSC enabled guests
  x86/sev: Prevent RDTSC/RDTSCP interception for Secure TSC enabled
    guests
  x86/sev: Prevent GUEST_TSC_FREQ MSR interception for Secure TSC
    enabled guests
  x86/sev: Mark Secure TSC as reliable clocksource
  x86/cpu/amd: Do not print FW_BUG for Secure TSC
  tsc: Use the GUEST_TSC_FREQ MSR for discovering TSC frequency
  tsc: Upgrade TSC clocksource rating
  tsc: Switch to native sched clock
  x86/kvmclock: Abort SecureTSC enabled guest when kvmclock is selected
  x86/sev: Allow Secure TSC feature for SNP guests

 arch/x86/include/asm/msr-index.h        |   1 +
 arch/x86/include/asm/sev-common.h       |   2 +
 arch/x86/include/asm/sev.h              | 134 +++++-
 arch/x86/include/asm/svm.h              |   6 +-
 include/linux/cc_platform.h             |   8 +
 arch/x86/boot/compressed/sev.c          |   3 +-
 arch/x86/coco/core.c                    |   3 +
 arch/x86/coco/sev/core.c                | 567 +++++++++++++++++++++++-
 arch/x86/coco/sev/shared.c              |  13 +-
 arch/x86/kernel/cpu/amd.c               |   3 +-
 arch/x86/kernel/kvmclock.c              |   9 +
 arch/x86/kernel/tsc.c                   |  41 ++
 arch/x86/mm/mem_encrypt.c               |   4 +
 arch/x86/mm/mem_encrypt_amd.c           |   4 +
 drivers/virt/coco/sev-guest/sev-guest.c | 485 +-------------------
 arch/x86/Kconfig                        |   1 +
 drivers/virt/coco/sev-guest/Kconfig     |   1 -
 17 files changed, 794 insertions(+), 491 deletions(-)


base-commit: 0a895c0d9b73d934de95aa0dd4e631c394bdd25d
-- 
2.34.1


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v13 01/13] x86/sev: Carve out and export SNP guest messaging init routines
  2024-10-21  5:51 [PATCH v13 00/13] Add Secure TSC support for SNP guests Nikunj A Dadhania
@ 2024-10-21  5:51 ` Nikunj A Dadhania
  2024-10-21  7:48   ` Christophe JAILLET
  2024-10-21  5:51 ` [PATCH v13 02/13] x86/sev: Relocate SNP guest messaging routines to common code Nikunj A Dadhania
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 35+ messages in thread
From: Nikunj A Dadhania @ 2024-10-21  5:51 UTC (permalink / raw)
  To: linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj

Currently, the SEV guest driver is the only user of SNP guest messaging.
All routines for initializing SNP guest messaging are implemented within
the SEV guest driver. To add Secure TSC guest support, these initialization
routines need to be available during early boot.

Carve out common SNP guest messaging buffer allocations and message
initialization routines to core/sev.c and export them. These newly added
APIs set up the SNP message context (snp_msg_desc), which contains all the
necessary details for sending SNP guest messages.

At present, the SEV guest platform data structure is used to pass the
secrets page physical address to SEV guest driver. Since the secrets page
address is locally available to the initialization routine, use the cached
address. Remove the unused SEV guest platform data structure.

Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/include/asm/sev.h              |  71 ++++++++-
 arch/x86/coco/sev/core.c                | 133 +++++++++++++++-
 drivers/virt/coco/sev-guest/sev-guest.c | 195 +++---------------------
 3 files changed, 215 insertions(+), 184 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 2e49c4a9e7fe..3812692ba3fe 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -14,6 +14,7 @@
 #include <asm/insn.h>
 #include <asm/sev-common.h>
 #include <asm/coco.h>
+#include <asm/set_memory.h>
 
 #define GHCB_PROTOCOL_MIN	1ULL
 #define GHCB_PROTOCOL_MAX	2ULL
@@ -170,10 +171,6 @@ struct snp_guest_msg {
 	u8 payload[PAGE_SIZE - sizeof(struct snp_guest_msg_hdr)];
 } __packed;
 
-struct sev_guest_platform_data {
-	u64 secrets_gpa;
-};
-
 struct snp_guest_req {
 	void *req_buf;
 	size_t req_sz;
@@ -253,6 +250,7 @@ struct snp_msg_desc {
 
 	u32 *os_area_msg_seqno;
 	u8 *vmpck;
+	int vmpck_id;
 };
 
 /*
@@ -438,6 +436,63 @@ u64 sev_get_status(void);
 void sev_show_status(void);
 void snp_update_svsm_ca(void);
 
+static inline void free_shared_pages(void *buf, size_t sz)
+{
+	unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+	int ret;
+
+	if (!buf)
+		return;
+
+	ret = set_memory_encrypted((unsigned long)buf, npages);
+	if (ret) {
+		WARN_ONCE(ret, "failed to restore encryption mask (leak it)\n");
+		return;
+	}
+
+	__free_pages(virt_to_page(buf), get_order(sz));
+}
+
+static inline void *alloc_shared_pages(size_t sz)
+{
+	unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+	struct page *page;
+	int ret;
+
+	page = alloc_pages(GFP_KERNEL_ACCOUNT, get_order(sz));
+	if (!page)
+		return NULL;
+
+	ret = set_memory_decrypted((unsigned long)page_address(page), npages);
+	if (ret) {
+		pr_err("failed to mark page shared, ret=%d\n", ret);
+		__free_pages(page, get_order(sz));
+		return NULL;
+	}
+
+	return page_address(page);
+}
+
+static inline bool is_vmpck_empty(struct snp_msg_desc *mdesc)
+{
+	char zero_key[VMPCK_KEY_LEN] = {0};
+
+	if (mdesc->vmpck)
+		return !memcmp(mdesc->vmpck, zero_key, VMPCK_KEY_LEN);
+
+	return true;
+}
+
+int snp_msg_init(struct snp_msg_desc *mdesc, int vmpck_id);
+struct snp_msg_desc *snp_msg_alloc(void);
+
+static inline void snp_msg_cleanup(struct snp_msg_desc *mdesc)
+{
+	mdesc->vmpck = NULL;
+	mdesc->os_area_msg_seqno = NULL;
+	kfree(mdesc->ctx);
+}
+
 #else	/* !CONFIG_AMD_MEM_ENCRYPT */
 
 #define snp_vmpl 0
@@ -474,6 +529,14 @@ static inline u64 snp_get_unsupported_features(u64 status) { return 0; }
 static inline u64 sev_get_status(void) { return 0; }
 static inline void sev_show_status(void) { }
 static inline void snp_update_svsm_ca(void) { }
+static inline void free_shared_pages(void *buf, size_t sz) { }
+static inline void *alloc_shared_pages(size_t sz) { return NULL; }
+static inline bool is_vmpck_empty(struct snp_msg_desc *mdesc) { return false; }
+
+static inline int snp_msg_init(struct snp_msg_desc *mdesc, int vmpck_id) { return -1; }
+static inline struct snp_msg_desc *snp_msg_alloc(void) { return NULL; }
+
+static inline void snp_msg_cleanup(struct snp_msg_desc *mdesc) { }
 
 #endif	/* CONFIG_AMD_MEM_ENCRYPT */
 
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index c7b4270d0e18..6ba13ae0b153 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -25,6 +25,7 @@
 #include <linux/psp-sev.h>
 #include <linux/dmi.h>
 #include <uapi/linux/sev-guest.h>
+#include <crypto/gcm.h>
 
 #include <asm/init.h>
 #include <asm/cpu_entry_area.h>
@@ -95,6 +96,8 @@ static u64 sev_hv_features __ro_after_init;
 /* Secrets page physical address from the CC blob */
 static u64 secrets_pa __ro_after_init;
 
+static struct snp_msg_desc *snp_mdesc;
+
 /* #VC handler runtime per-CPU data */
 struct sev_es_runtime_data {
 	struct ghcb ghcb_page;
@@ -2445,15 +2448,9 @@ static struct platform_device sev_guest_device = {
 
 static int __init snp_init_platform_device(void)
 {
-	struct sev_guest_platform_data data;
-
 	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
 		return -ENODEV;
 
-	data.secrets_gpa = secrets_pa;
-	if (platform_device_add_data(&sev_guest_device, &data, sizeof(data)))
-		return -ENODEV;
-
 	if (platform_device_register(&sev_guest_device))
 		return -ENODEV;
 
@@ -2532,3 +2529,127 @@ static int __init sev_sysfs_init(void)
 }
 arch_initcall(sev_sysfs_init);
 #endif // CONFIG_SYSFS
+
+static u8 *get_vmpck(int id, struct snp_secrets_page *secrets, u32 **seqno)
+{
+	u8 *key = NULL;
+
+	switch (id) {
+	case 0:
+		*seqno = &secrets->os_area.msg_seqno_0;
+		key = secrets->vmpck0;
+		break;
+	case 1:
+		*seqno = &secrets->os_area.msg_seqno_1;
+		key = secrets->vmpck1;
+		break;
+	case 2:
+		*seqno = &secrets->os_area.msg_seqno_2;
+		key = secrets->vmpck2;
+		break;
+	case 3:
+		*seqno = &secrets->os_area.msg_seqno_3;
+		key = secrets->vmpck3;
+		break;
+	default:
+		break;
+	}
+
+	return key;
+}
+
+static struct aesgcm_ctx *snp_init_crypto(u8 *key, size_t keylen)
+{
+	struct aesgcm_ctx *ctx;
+
+	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL_ACCOUNT);
+	if (!ctx)
+		return NULL;
+
+	if (aesgcm_expandkey(ctx, key, keylen, AUTHTAG_LEN)) {
+		pr_err("Crypto context initialization failed\n");
+		kfree(ctx);
+		return NULL;
+	}
+
+	return ctx;
+}
+
+int snp_msg_init(struct snp_msg_desc *mdesc, int vmpck_id)
+{
+	/* Adjust the default VMPCK key based on the executing VMPL level */
+	if (vmpck_id == -1)
+		vmpck_id = snp_vmpl;
+
+	mdesc->vmpck = get_vmpck(vmpck_id, mdesc->secrets, &mdesc->os_area_msg_seqno);
+	if (!mdesc->vmpck) {
+		pr_err("Invalid VMPCK%d communication key\n", vmpck_id);
+		return -EINVAL;
+	}
+
+	/* Verify that VMPCK is not zero. */
+	if (is_vmpck_empty(mdesc)) {
+		pr_err("Empty VMPCK%d communication key\n", vmpck_id);
+		return -EINVAL;
+	}
+
+	mdesc->vmpck_id = vmpck_id;
+
+	mdesc->ctx = snp_init_crypto(mdesc->vmpck, VMPCK_KEY_LEN);
+	if (!mdesc->ctx)
+		return -ENOMEM;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(snp_msg_init);
+
+struct snp_msg_desc *snp_msg_alloc(void)
+{
+	struct snp_msg_desc *mdesc;
+
+	BUILD_BUG_ON(sizeof(struct snp_guest_msg) > PAGE_SIZE);
+
+	if (snp_mdesc)
+		return snp_mdesc;
+
+	mdesc = kzalloc(sizeof(struct snp_msg_desc), GFP_KERNEL);
+	if (!mdesc)
+		return ERR_PTR(-ENOMEM);
+
+	mdesc->secrets = (__force struct snp_secrets_page *)ioremap_encrypted(secrets_pa,
+									      PAGE_SIZE);
+	if (!mdesc->secrets)
+		return ERR_PTR(-ENODEV);
+
+	/* Allocate the shared page used for the request and response message. */
+	mdesc->request = alloc_shared_pages(sizeof(struct snp_guest_msg));
+	if (!mdesc->request)
+		goto e_unmap;
+
+	mdesc->response = alloc_shared_pages(sizeof(struct snp_guest_msg));
+	if (!mdesc->response)
+		goto e_free_request;
+
+	mdesc->certs_data = alloc_shared_pages(SEV_FW_BLOB_MAX_SIZE);
+	if (!mdesc->certs_data)
+		goto e_free_response;
+
+	/* initial the input address for guest request */
+	mdesc->input.req_gpa = __pa(mdesc->request);
+	mdesc->input.resp_gpa = __pa(mdesc->response);
+	mdesc->input.data_gpa = __pa(mdesc->certs_data);
+
+	snp_mdesc = mdesc;
+
+	return mdesc;
+
+e_free_response:
+	free_shared_pages(mdesc->response, sizeof(struct snp_guest_msg));
+e_free_request:
+	free_shared_pages(mdesc->request, sizeof(struct snp_guest_msg));
+e_unmap:
+	iounmap((__force void __iomem *)mdesc->secrets);
+
+	return ERR_PTR(-ENOMEM);
+}
+EXPORT_SYMBOL_GPL(snp_msg_alloc);
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index fca5c45ed5cd..862fc74452ac 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -63,16 +63,6 @@ MODULE_PARM_DESC(vmpck_id, "The VMPCK ID to use when communicating with the PSP.
 /* Mutex to serialize the shared buffer access and command handling. */
 static DEFINE_MUTEX(snp_cmd_mutex);
 
-static bool is_vmpck_empty(struct snp_msg_desc *mdesc)
-{
-	char zero_key[VMPCK_KEY_LEN] = {0};
-
-	if (mdesc->vmpck)
-		return !memcmp(mdesc->vmpck, zero_key, VMPCK_KEY_LEN);
-
-	return true;
-}
-
 /*
  * If an error is received from the host or AMD Secure Processor (ASP) there
  * are two options. Either retry the exact same encrypted request or discontinue
@@ -93,7 +83,7 @@ static bool is_vmpck_empty(struct snp_msg_desc *mdesc)
 static void snp_disable_vmpck(struct snp_msg_desc *mdesc)
 {
 	pr_alert("Disabling VMPCK%d communication key to prevent IV reuse.\n",
-		  vmpck_id);
+		  mdesc->vmpck_id);
 	memzero_explicit(mdesc->vmpck, VMPCK_KEY_LEN);
 	mdesc->vmpck = NULL;
 }
@@ -147,23 +137,6 @@ static inline struct snp_guest_dev *to_snp_dev(struct file *file)
 	return container_of(dev, struct snp_guest_dev, misc);
 }
 
-static struct aesgcm_ctx *snp_init_crypto(u8 *key, size_t keylen)
-{
-	struct aesgcm_ctx *ctx;
-
-	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL_ACCOUNT);
-	if (!ctx)
-		return NULL;
-
-	if (aesgcm_expandkey(ctx, key, keylen, AUTHTAG_LEN)) {
-		pr_err("Crypto context initialization failed\n");
-		kfree(ctx);
-		return NULL;
-	}
-
-	return ctx;
-}
-
 static int verify_and_dec_payload(struct snp_msg_desc *mdesc, struct snp_guest_req *req)
 {
 	struct snp_guest_msg *resp_msg = &mdesc->secret_response;
@@ -414,7 +387,7 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_io
 
 	req.msg_version = arg->msg_version;
 	req.msg_type = SNP_MSG_REPORT_REQ;
-	req.vmpck_id = vmpck_id;
+	req.vmpck_id = mdesc->vmpck_id;
 	req.req_buf = report_req;
 	req.req_sz = sizeof(*report_req);
 	req.resp_buf = report_resp->data;
@@ -461,7 +434,7 @@ static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_reque
 
 	req.msg_version = arg->msg_version;
 	req.msg_type = SNP_MSG_KEY_REQ;
-	req.vmpck_id = vmpck_id;
+	req.vmpck_id = mdesc->vmpck_id;
 	req.req_buf = derived_key_req;
 	req.req_sz = sizeof(*derived_key_req);
 	req.resp_buf = buf;
@@ -539,7 +512,7 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
 
 	req.msg_version = arg->msg_version;
 	req.msg_type = SNP_MSG_REPORT_REQ;
-	req.vmpck_id = vmpck_id;
+	req.vmpck_id = mdesc->vmpck_id;
 	req.req_buf = &report_req->data;
 	req.req_sz = sizeof(report_req->data);
 	req.resp_buf = report_resp->data;
@@ -616,76 +589,11 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
 	return ret;
 }
 
-static void free_shared_pages(void *buf, size_t sz)
-{
-	unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
-	int ret;
-
-	if (!buf)
-		return;
-
-	ret = set_memory_encrypted((unsigned long)buf, npages);
-	if (ret) {
-		WARN_ONCE(ret, "failed to restore encryption mask (leak it)\n");
-		return;
-	}
-
-	__free_pages(virt_to_page(buf), get_order(sz));
-}
-
-static void *alloc_shared_pages(struct device *dev, size_t sz)
-{
-	unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
-	struct page *page;
-	int ret;
-
-	page = alloc_pages(GFP_KERNEL_ACCOUNT, get_order(sz));
-	if (!page)
-		return NULL;
-
-	ret = set_memory_decrypted((unsigned long)page_address(page), npages);
-	if (ret) {
-		dev_err(dev, "failed to mark page shared, ret=%d\n", ret);
-		__free_pages(page, get_order(sz));
-		return NULL;
-	}
-
-	return page_address(page);
-}
-
 static const struct file_operations snp_guest_fops = {
 	.owner	= THIS_MODULE,
 	.unlocked_ioctl = snp_guest_ioctl,
 };
 
-static u8 *get_vmpck(int id, struct snp_secrets_page *secrets, u32 **seqno)
-{
-	u8 *key = NULL;
-
-	switch (id) {
-	case 0:
-		*seqno = &secrets->os_area.msg_seqno_0;
-		key = secrets->vmpck0;
-		break;
-	case 1:
-		*seqno = &secrets->os_area.msg_seqno_1;
-		key = secrets->vmpck1;
-		break;
-	case 2:
-		*seqno = &secrets->os_area.msg_seqno_2;
-		key = secrets->vmpck2;
-		break;
-	case 3:
-		*seqno = &secrets->os_area.msg_seqno_3;
-		key = secrets->vmpck3;
-		break;
-	default:
-		break;
-	}
-
-	return key;
-}
-
 struct snp_msg_report_resp_hdr {
 	u32 status;
 	u32 report_size;
@@ -979,13 +887,10 @@ static void unregister_sev_tsm(void *data)
 
 static int __init sev_guest_probe(struct platform_device *pdev)
 {
-	struct sev_guest_platform_data *data;
-	struct snp_secrets_page *secrets;
 	struct device *dev = &pdev->dev;
 	struct snp_guest_dev *snp_dev;
 	struct snp_msg_desc *mdesc;
 	struct miscdevice *misc;
-	void __iomem *mapping;
 	int ret;
 
 	BUILD_BUG_ON(sizeof(struct snp_guest_msg) > PAGE_SIZE);
@@ -993,115 +898,57 @@ static int __init sev_guest_probe(struct platform_device *pdev)
 	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
 		return -ENODEV;
 
-	if (!dev->platform_data)
-		return -ENODEV;
-
-	data = (struct sev_guest_platform_data *)dev->platform_data;
-	mapping = ioremap_encrypted(data->secrets_gpa, PAGE_SIZE);
-	if (!mapping)
-		return -ENODEV;
-
-	secrets = (__force void *)mapping;
-
-	ret = -ENOMEM;
 	snp_dev = devm_kzalloc(&pdev->dev, sizeof(struct snp_guest_dev), GFP_KERNEL);
 	if (!snp_dev)
-		goto e_unmap;
-
-	mdesc = devm_kzalloc(&pdev->dev, sizeof(struct snp_msg_desc), GFP_KERNEL);
-	if (!mdesc)
-		goto e_unmap;
-
-	/* Adjust the default VMPCK key based on the executing VMPL level */
-	if (vmpck_id == -1)
-		vmpck_id = snp_vmpl;
+		return -ENOMEM;
 
-	ret = -EINVAL;
-	mdesc->vmpck = get_vmpck(vmpck_id, secrets, &mdesc->os_area_msg_seqno);
-	if (!mdesc->vmpck) {
-		dev_err(dev, "Invalid VMPCK%d communication key\n", vmpck_id);
-		goto e_unmap;
-	}
+	mdesc = snp_msg_alloc();
+	if (IS_ERR_OR_NULL(mdesc))
+		return -ENOMEM;
 
-	/* Verify that VMPCK is not zero. */
-	if (is_vmpck_empty(mdesc)) {
-		dev_err(dev, "Empty VMPCK%d communication key\n", vmpck_id);
-		goto e_unmap;
-	}
+	ret = snp_msg_init(mdesc, vmpck_id);
+	if (ret)
+		return -EIO;
 
 	platform_set_drvdata(pdev, snp_dev);
 	snp_dev->dev = dev;
-	mdesc->secrets = secrets;
-
-	/* Allocate the shared page used for the request and response message. */
-	mdesc->request = alloc_shared_pages(dev, sizeof(struct snp_guest_msg));
-	if (!mdesc->request)
-		goto e_unmap;
-
-	mdesc->response = alloc_shared_pages(dev, sizeof(struct snp_guest_msg));
-	if (!mdesc->response)
-		goto e_free_request;
-
-	mdesc->certs_data = alloc_shared_pages(dev, SEV_FW_BLOB_MAX_SIZE);
-	if (!mdesc->certs_data)
-		goto e_free_response;
-
-	ret = -EIO;
-	mdesc->ctx = snp_init_crypto(mdesc->vmpck, VMPCK_KEY_LEN);
-	if (!mdesc->ctx)
-		goto e_free_cert_data;
 
 	misc = &snp_dev->misc;
 	misc->minor = MISC_DYNAMIC_MINOR;
 	misc->name = DEVICE_NAME;
 	misc->fops = &snp_guest_fops;
 
-	/* Initialize the input addresses for guest request */
-	mdesc->input.req_gpa = __pa(mdesc->request);
-	mdesc->input.resp_gpa = __pa(mdesc->response);
-	mdesc->input.data_gpa = __pa(mdesc->certs_data);
-
 	/* Set the privlevel_floor attribute based on the vmpck_id */
-	sev_tsm_ops.privlevel_floor = vmpck_id;
+	sev_tsm_ops.privlevel_floor = mdesc->vmpck_id;
 
 	ret = tsm_register(&sev_tsm_ops, snp_dev);
 	if (ret)
-		goto e_free_cert_data;
+		goto e_msg_init;
 
 	ret = devm_add_action_or_reset(&pdev->dev, unregister_sev_tsm, NULL);
 	if (ret)
-		goto e_free_cert_data;
+		goto e_msg_init;
 
 	ret =  misc_register(misc);
 	if (ret)
-		goto e_free_ctx;
+		goto e_msg_init;
 
 	snp_dev->msg_desc = mdesc;
-	dev_info(dev, "Initialized SEV guest driver (using VMPCK%d communication key)\n", vmpck_id);
+	dev_info(dev, "Initialized SEV guest driver (using VMPCK%d communication key)\n",
+		 mdesc->vmpck_id);
 	return 0;
 
-e_free_ctx:
-	kfree(mdesc->ctx);
-e_free_cert_data:
-	free_shared_pages(mdesc->certs_data, SEV_FW_BLOB_MAX_SIZE);
-e_free_response:
-	free_shared_pages(mdesc->response, sizeof(struct snp_guest_msg));
-e_free_request:
-	free_shared_pages(mdesc->request, sizeof(struct snp_guest_msg));
-e_unmap:
-	iounmap(mapping);
+e_msg_init:
+	snp_msg_cleanup(mdesc);
+
 	return ret;
 }
 
 static void __exit sev_guest_remove(struct platform_device *pdev)
 {
 	struct snp_guest_dev *snp_dev = platform_get_drvdata(pdev);
-	struct snp_msg_desc *mdesc = snp_dev->msg_desc;
 
-	free_shared_pages(mdesc->certs_data, SEV_FW_BLOB_MAX_SIZE);
-	free_shared_pages(mdesc->response, sizeof(struct snp_guest_msg));
-	free_shared_pages(mdesc->request, sizeof(struct snp_guest_msg));
-	kfree(mdesc->ctx);
+	snp_msg_cleanup(snp_dev->msg_desc);
 	misc_deregister(&snp_dev->misc);
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v13 02/13] x86/sev: Relocate SNP guest messaging routines to common code
  2024-10-21  5:51 [PATCH v13 00/13] Add Secure TSC support for SNP guests Nikunj A Dadhania
  2024-10-21  5:51 ` [PATCH v13 01/13] x86/sev: Carve out and export SNP guest messaging init routines Nikunj A Dadhania
@ 2024-10-21  5:51 ` Nikunj A Dadhania
  2024-10-21  5:51 ` [PATCH v13 03/13] x86/sev: Add Secure TSC support for SNP guests Nikunj A Dadhania
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 35+ messages in thread
From: Nikunj A Dadhania @ 2024-10-21  5:51 UTC (permalink / raw)
  To: linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj

At present, the SEV guest driver exclusively handles SNP guest messaging.
All routines for sending guest messages are embedded within the guest
driver. To support Secure TSC, SEV-SNP guests must communicate with the AMD
Security Processor during early boot. However, these guest messaging
functions are not accessible during early boot since they are currently
part of the guest driver.

Hence, relocate the core SNP guest messaging functions to SEV common code
and provide an API for sending SNP guest messages.

No functional change, but just an export symbol added for
sev_send_geust_message() and dropped the export symbol on
snp_issue_guest_request() and made it static.

Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/include/asm/sev.h              |  15 +-
 arch/x86/coco/sev/core.c                | 295 +++++++++++++++++++++++-
 drivers/virt/coco/sev-guest/sev-guest.c | 292 -----------------------
 arch/x86/Kconfig                        |   1 +
 drivers/virt/coco/sev-guest/Kconfig     |   1 -
 5 files changed, 301 insertions(+), 303 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 3812692ba3fe..d6ad5f6b1ff3 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -125,6 +125,9 @@ struct snp_req_data {
 #define AAD_LEN			48
 #define MSG_HDR_VER		1
 
+#define SNP_REQ_MAX_RETRY_DURATION      (60*HZ)
+#define SNP_REQ_RETRY_DELAY             (2*HZ)
+
 /* See SNP spec SNP_GUEST_REQUEST section for the structure */
 enum msg_type {
 	SNP_MSG_TYPE_INVALID = 0,
@@ -427,8 +430,6 @@ void snp_set_wakeup_secondary_cpu(void);
 bool snp_init(struct boot_params *bp);
 void __noreturn snp_abort(void);
 void snp_dmi_setup(void);
-int snp_issue_guest_request(struct snp_guest_req *req, struct snp_req_data *input,
-			    struct snp_guest_request_ioctl *rio);
 int snp_issue_svsm_attest_req(u64 call_id, struct svsm_call *call, struct svsm_attest_call *input);
 void snp_accept_memory(phys_addr_t start, phys_addr_t end);
 u64 snp_get_unsupported_features(u64 status);
@@ -493,6 +494,9 @@ static inline void snp_msg_cleanup(struct snp_msg_desc *mdesc)
 	kfree(mdesc->ctx);
 }
 
+int snp_send_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
+			   struct snp_guest_request_ioctl *rio);
+
 #else	/* !CONFIG_AMD_MEM_ENCRYPT */
 
 #define snp_vmpl 0
@@ -515,11 +519,6 @@ static inline void snp_set_wakeup_secondary_cpu(void) { }
 static inline bool snp_init(struct boot_params *bp) { return false; }
 static inline void snp_abort(void) { }
 static inline void snp_dmi_setup(void) { }
-static inline int snp_issue_guest_request(struct snp_guest_req *req, struct snp_req_data *input,
-					  struct snp_guest_request_ioctl *rio)
-{
-	return -ENOTTY;
-}
 static inline int snp_issue_svsm_attest_req(u64 call_id, struct svsm_call *call, struct svsm_attest_call *input)
 {
 	return -ENOTTY;
@@ -537,6 +536,8 @@ static inline int snp_msg_init(struct snp_msg_desc *mdesc, int vmpck_id) { retur
 static inline struct snp_msg_desc *snp_msg_alloc(void) { return NULL; }
 
 static inline void snp_msg_cleanup(struct snp_msg_desc *mdesc) { }
+static inline int snp_send_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
+					 struct snp_guest_request_ioctl *rio) { return -ENODEV; }
 
 #endif	/* CONFIG_AMD_MEM_ENCRYPT */
 
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index 6ba13ae0b153..c96b742789c5 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -2376,8 +2376,8 @@ int snp_issue_svsm_attest_req(u64 call_id, struct svsm_call *call,
 }
 EXPORT_SYMBOL_GPL(snp_issue_svsm_attest_req);
 
-int snp_issue_guest_request(struct snp_guest_req *req, struct snp_req_data *input,
-			    struct snp_guest_request_ioctl *rio)
+static int snp_issue_guest_request(struct snp_guest_req *req, struct snp_req_data *input,
+				   struct snp_guest_request_ioctl *rio)
 {
 	struct ghcb_state state;
 	struct es_em_ctxt ctxt;
@@ -2439,7 +2439,6 @@ int snp_issue_guest_request(struct snp_guest_req *req, struct snp_req_data *inpu
 
 	return ret;
 }
-EXPORT_SYMBOL_GPL(snp_issue_guest_request);
 
 static struct platform_device sev_guest_device = {
 	.name		= "sev-guest",
@@ -2653,3 +2652,293 @@ struct snp_msg_desc *snp_msg_alloc(void)
 	return ERR_PTR(-ENOMEM);
 }
 EXPORT_SYMBOL_GPL(snp_msg_alloc);
+
+/* Mutex to serialize the shared buffer access and command handling. */
+static DEFINE_MUTEX(snp_cmd_mutex);
+
+/*
+ * If an error is received from the host or AMD Secure Processor (ASP) there
+ * are two options. Either retry the exact same encrypted request or discontinue
+ * using the VMPCK.
+ *
+ * This is because in the current encryption scheme GHCB v2 uses AES-GCM to
+ * encrypt the requests. The IV for this scheme is the sequence number. GCM
+ * cannot tolerate IV reuse.
+ *
+ * The ASP FW v1.51 only increments the sequence numbers on a successful
+ * guest<->ASP back and forth and only accepts messages at its exact sequence
+ * number.
+ *
+ * So if the sequence number were to be reused the encryption scheme is
+ * vulnerable. If the sequence number were incremented for a fresh IV the ASP
+ * will reject the request.
+ */
+static void snp_disable_vmpck(struct snp_msg_desc *mdesc)
+{
+	pr_alert("Disabling VMPCK%d communication key to prevent IV reuse.\n",
+		  mdesc->vmpck_id);
+	memzero_explicit(mdesc->vmpck, VMPCK_KEY_LEN);
+	mdesc->vmpck = NULL;
+}
+
+static inline u64 __snp_get_msg_seqno(struct snp_msg_desc *mdesc)
+{
+	u64 count;
+
+	lockdep_assert_held(&snp_cmd_mutex);
+
+	/* Read the current message sequence counter from secrets pages */
+	count = *mdesc->os_area_msg_seqno;
+
+	return count + 1;
+}
+
+/* Return a non-zero on success */
+static u64 snp_get_msg_seqno(struct snp_msg_desc *mdesc)
+{
+	u64 count = __snp_get_msg_seqno(mdesc);
+
+	/*
+	 * The message sequence counter for the SNP guest request is a  64-bit
+	 * value but the version 2 of GHCB specification defines a 32-bit storage
+	 * for it. If the counter exceeds the 32-bit value then return zero.
+	 * The caller should check the return value, but if the caller happens to
+	 * not check the value and use it, then the firmware treats zero as an
+	 * invalid number and will fail the  message request.
+	 */
+	if (count >= UINT_MAX) {
+		pr_err("request message sequence counter overflow\n");
+		return 0;
+	}
+
+	return count;
+}
+
+static void snp_inc_msg_seqno(struct snp_msg_desc *mdesc)
+{
+	/*
+	 * The counter is also incremented by the PSP, so increment it by 2
+	 * and save in secrets page.
+	 */
+	*mdesc->os_area_msg_seqno += 2;
+}
+
+static int verify_and_dec_payload(struct snp_msg_desc *mdesc, struct snp_guest_req *req)
+{
+	struct snp_guest_msg *resp_msg = &mdesc->secret_response;
+	struct snp_guest_msg *req_msg = &mdesc->secret_request;
+	struct snp_guest_msg_hdr *req_msg_hdr = &req_msg->hdr;
+	struct snp_guest_msg_hdr *resp_msg_hdr = &resp_msg->hdr;
+	struct aesgcm_ctx *ctx = mdesc->ctx;
+	u8 iv[GCM_AES_IV_SIZE] = {};
+
+	pr_debug("response [seqno %lld type %d version %d sz %d]\n",
+		 resp_msg_hdr->msg_seqno, resp_msg_hdr->msg_type, resp_msg_hdr->msg_version,
+		 resp_msg_hdr->msg_sz);
+
+	/* Copy response from shared memory to encrypted memory. */
+	memcpy(resp_msg, mdesc->response, sizeof(*resp_msg));
+
+	/* Verify that the sequence counter is incremented by 1 */
+	if (unlikely(resp_msg_hdr->msg_seqno != (req_msg_hdr->msg_seqno + 1)))
+		return -EBADMSG;
+
+	/* Verify response message type and version number. */
+	if (resp_msg_hdr->msg_type != (req_msg_hdr->msg_type + 1) ||
+	    resp_msg_hdr->msg_version != req_msg_hdr->msg_version)
+		return -EBADMSG;
+
+	/*
+	 * If the message size is greater than our buffer length then return
+	 * an error.
+	 */
+	if (unlikely((resp_msg_hdr->msg_sz + ctx->authsize) > req->resp_sz))
+		return -EBADMSG;
+
+	/* Decrypt the payload */
+	memcpy(iv, &resp_msg_hdr->msg_seqno, min(sizeof(iv), sizeof(resp_msg_hdr->msg_seqno)));
+	if (!aesgcm_decrypt(ctx, req->resp_buf, resp_msg->payload, resp_msg_hdr->msg_sz,
+			    &resp_msg_hdr->algo, AAD_LEN, iv, resp_msg_hdr->authtag))
+		return -EBADMSG;
+
+	return 0;
+}
+
+static int enc_payload(struct snp_msg_desc *mdesc, u64 seqno, struct snp_guest_req *req)
+{
+	struct snp_guest_msg *msg = &mdesc->secret_request;
+	struct snp_guest_msg_hdr *hdr = &msg->hdr;
+	struct aesgcm_ctx *ctx = mdesc->ctx;
+	u8 iv[GCM_AES_IV_SIZE] = {};
+
+	memset(msg, 0, sizeof(*msg));
+
+	hdr->algo = SNP_AEAD_AES_256_GCM;
+	hdr->hdr_version = MSG_HDR_VER;
+	hdr->hdr_sz = sizeof(*hdr);
+	hdr->msg_type = req->msg_type;
+	hdr->msg_version = req->msg_version;
+	hdr->msg_seqno = seqno;
+	hdr->msg_vmpck = req->vmpck_id;
+	hdr->msg_sz = req->req_sz;
+
+	/* Verify the sequence number is non-zero */
+	if (!hdr->msg_seqno)
+		return -ENOSR;
+
+	pr_debug("request [seqno %lld type %d version %d sz %d]\n",
+		 hdr->msg_seqno, hdr->msg_type, hdr->msg_version, hdr->msg_sz);
+
+	if (WARN_ON((req->req_sz + ctx->authsize) > sizeof(msg->payload)))
+		return -EBADMSG;
+
+	memcpy(iv, &hdr->msg_seqno, min(sizeof(iv), sizeof(hdr->msg_seqno)));
+	aesgcm_encrypt(ctx, msg->payload, req->req_buf, req->req_sz, &hdr->algo,
+		       AAD_LEN, iv, hdr->authtag);
+
+	return 0;
+}
+
+static int __handle_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
+				  struct snp_guest_request_ioctl *rio)
+{
+	unsigned long req_start = jiffies;
+	unsigned int override_npages = 0;
+	u64 override_err = 0;
+	int rc;
+
+retry_request:
+	/*
+	 * Call firmware to process the request. In this function the encrypted
+	 * message enters shared memory with the host. So after this call the
+	 * sequence number must be incremented or the VMPCK must be deleted to
+	 * prevent reuse of the IV.
+	 */
+	rc = snp_issue_guest_request(req, &mdesc->input, rio);
+	switch (rc) {
+	case -ENOSPC:
+		/*
+		 * If the extended guest request fails due to having too
+		 * small of a certificate data buffer, retry the same
+		 * guest request without the extended data request in
+		 * order to increment the sequence number and thus avoid
+		 * IV reuse.
+		 */
+		override_npages = mdesc->input.data_npages;
+		req->exit_code	= SVM_VMGEXIT_GUEST_REQUEST;
+
+		/*
+		 * Override the error to inform callers the given extended
+		 * request buffer size was too small and give the caller the
+		 * required buffer size.
+		 */
+		override_err = SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN);
+
+		/*
+		 * If this call to the firmware succeeds, the sequence number can
+		 * be incremented allowing for continued use of the VMPCK. If
+		 * there is an error reflected in the return value, this value
+		 * is checked further down and the result will be the deletion
+		 * of the VMPCK and the error code being propagated back to the
+		 * user as an ioctl() return code.
+		 */
+		goto retry_request;
+
+	/*
+	 * The host may return SNP_GUEST_VMM_ERR_BUSY if the request has been
+	 * throttled. Retry in the driver to avoid returning and reusing the
+	 * message sequence number on a different message.
+	 */
+	case -EAGAIN:
+		if (jiffies - req_start > SNP_REQ_MAX_RETRY_DURATION) {
+			rc = -ETIMEDOUT;
+			break;
+		}
+		schedule_timeout_killable(SNP_REQ_RETRY_DELAY);
+		goto retry_request;
+	}
+
+	/*
+	 * Increment the message sequence number. There is no harm in doing
+	 * this now because decryption uses the value stored in the response
+	 * structure and any failure will wipe the VMPCK, preventing further
+	 * use anyway.
+	 */
+	snp_inc_msg_seqno(mdesc);
+
+	if (override_err) {
+		rio->exitinfo2 = override_err;
+
+		/*
+		 * If an extended guest request was issued and the supplied certificate
+		 * buffer was not large enough, a standard guest request was issued to
+		 * prevent IV reuse. If the standard request was successful, return -EIO
+		 * back to the caller as would have originally been returned.
+		 */
+		if (!rc && override_err == SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN))
+			rc = -EIO;
+	}
+
+	if (override_npages)
+		mdesc->input.data_npages = override_npages;
+
+	return rc;
+}
+
+int snp_send_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
+			   struct snp_guest_request_ioctl *rio)
+{
+	u64 seqno;
+	int rc;
+
+	guard(mutex)(&snp_cmd_mutex);
+
+	/* Check if the VMPCK is not empty */
+	if (is_vmpck_empty(mdesc)) {
+		pr_err_ratelimited("VMPCK is disabled\n");
+		return -ENOTTY;
+	}
+
+	/* Get message sequence and verify that its a non-zero */
+	seqno = snp_get_msg_seqno(mdesc);
+	if (!seqno)
+		return -EIO;
+
+	/* Clear shared memory's response for the host to populate. */
+	memset(mdesc->response, 0, sizeof(struct snp_guest_msg));
+
+	/* Encrypt the userspace provided payload in mdesc->secret_request. */
+	rc = enc_payload(mdesc, seqno, req);
+	if (rc)
+		return rc;
+
+	/*
+	 * Write the fully encrypted request to the shared unencrypted
+	 * request page.
+	 */
+	memcpy(mdesc->request, &mdesc->secret_request,
+	       sizeof(mdesc->secret_request));
+
+	rc = __handle_guest_request(mdesc, req, rio);
+	if (rc) {
+		if (rc == -EIO &&
+		    rio->exitinfo2 == SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN))
+			return rc;
+
+		pr_alert("Detected error from ASP request. rc: %d, exitinfo2: 0x%llx\n",
+			 rc, rio->exitinfo2);
+
+		snp_disable_vmpck(mdesc);
+		return rc;
+	}
+
+	rc = verify_and_dec_payload(mdesc, req);
+	if (rc) {
+		pr_alert("Detected unexpected decode failure from ASP. rc: %d\n", rc);
+		snp_disable_vmpck(mdesc);
+		return rc;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(snp_send_guest_request);
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index 862fc74452ac..d64efc489686 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -31,9 +31,6 @@
 
 #define DEVICE_NAME	"sev-guest"
 
-#define SNP_REQ_MAX_RETRY_DURATION	(60*HZ)
-#define SNP_REQ_RETRY_DELAY		(2*HZ)
-
 #define SVSM_MAX_RETRIES		3
 
 struct snp_guest_dev {
@@ -60,76 +57,6 @@ static int vmpck_id = -1;
 module_param(vmpck_id, int, 0444);
 MODULE_PARM_DESC(vmpck_id, "The VMPCK ID to use when communicating with the PSP.");
 
-/* Mutex to serialize the shared buffer access and command handling. */
-static DEFINE_MUTEX(snp_cmd_mutex);
-
-/*
- * If an error is received from the host or AMD Secure Processor (ASP) there
- * are two options. Either retry the exact same encrypted request or discontinue
- * using the VMPCK.
- *
- * This is because in the current encryption scheme GHCB v2 uses AES-GCM to
- * encrypt the requests. The IV for this scheme is the sequence number. GCM
- * cannot tolerate IV reuse.
- *
- * The ASP FW v1.51 only increments the sequence numbers on a successful
- * guest<->ASP back and forth and only accepts messages at its exact sequence
- * number.
- *
- * So if the sequence number were to be reused the encryption scheme is
- * vulnerable. If the sequence number were incremented for a fresh IV the ASP
- * will reject the request.
- */
-static void snp_disable_vmpck(struct snp_msg_desc *mdesc)
-{
-	pr_alert("Disabling VMPCK%d communication key to prevent IV reuse.\n",
-		  mdesc->vmpck_id);
-	memzero_explicit(mdesc->vmpck, VMPCK_KEY_LEN);
-	mdesc->vmpck = NULL;
-}
-
-static inline u64 __snp_get_msg_seqno(struct snp_msg_desc *mdesc)
-{
-	u64 count;
-
-	lockdep_assert_held(&snp_cmd_mutex);
-
-	/* Read the current message sequence counter from secrets pages */
-	count = *mdesc->os_area_msg_seqno;
-
-	return count + 1;
-}
-
-/* Return a non-zero on success */
-static u64 snp_get_msg_seqno(struct snp_msg_desc *mdesc)
-{
-	u64 count = __snp_get_msg_seqno(mdesc);
-
-	/*
-	 * The message sequence counter for the SNP guest request is a  64-bit
-	 * value but the version 2 of GHCB specification defines a 32-bit storage
-	 * for it. If the counter exceeds the 32-bit value then return zero.
-	 * The caller should check the return value, but if the caller happens to
-	 * not check the value and use it, then the firmware treats zero as an
-	 * invalid number and will fail the  message request.
-	 */
-	if (count >= UINT_MAX) {
-		pr_err("request message sequence counter overflow\n");
-		return 0;
-	}
-
-	return count;
-}
-
-static void snp_inc_msg_seqno(struct snp_msg_desc *mdesc)
-{
-	/*
-	 * The counter is also incremented by the PSP, so increment it by 2
-	 * and save in secrets page.
-	 */
-	*mdesc->os_area_msg_seqno += 2;
-}
-
 static inline struct snp_guest_dev *to_snp_dev(struct file *file)
 {
 	struct miscdevice *dev = file->private_data;
@@ -137,225 +64,6 @@ static inline struct snp_guest_dev *to_snp_dev(struct file *file)
 	return container_of(dev, struct snp_guest_dev, misc);
 }
 
-static int verify_and_dec_payload(struct snp_msg_desc *mdesc, struct snp_guest_req *req)
-{
-	struct snp_guest_msg *resp_msg = &mdesc->secret_response;
-	struct snp_guest_msg *req_msg = &mdesc->secret_request;
-	struct snp_guest_msg_hdr *req_msg_hdr = &req_msg->hdr;
-	struct snp_guest_msg_hdr *resp_msg_hdr = &resp_msg->hdr;
-	struct aesgcm_ctx *ctx = mdesc->ctx;
-	u8 iv[GCM_AES_IV_SIZE] = {};
-
-	pr_debug("response [seqno %lld type %d version %d sz %d]\n",
-		 resp_msg_hdr->msg_seqno, resp_msg_hdr->msg_type, resp_msg_hdr->msg_version,
-		 resp_msg_hdr->msg_sz);
-
-	/* Copy response from shared memory to encrypted memory. */
-	memcpy(resp_msg, mdesc->response, sizeof(*resp_msg));
-
-	/* Verify that the sequence counter is incremented by 1 */
-	if (unlikely(resp_msg_hdr->msg_seqno != (req_msg_hdr->msg_seqno + 1)))
-		return -EBADMSG;
-
-	/* Verify response message type and version number. */
-	if (resp_msg_hdr->msg_type != (req_msg_hdr->msg_type + 1) ||
-	    resp_msg_hdr->msg_version != req_msg_hdr->msg_version)
-		return -EBADMSG;
-
-	/*
-	 * If the message size is greater than our buffer length then return
-	 * an error.
-	 */
-	if (unlikely((resp_msg_hdr->msg_sz + ctx->authsize) > req->resp_sz))
-		return -EBADMSG;
-
-	/* Decrypt the payload */
-	memcpy(iv, &resp_msg_hdr->msg_seqno, min(sizeof(iv), sizeof(resp_msg_hdr->msg_seqno)));
-	if (!aesgcm_decrypt(ctx, req->resp_buf, resp_msg->payload, resp_msg_hdr->msg_sz,
-			    &resp_msg_hdr->algo, AAD_LEN, iv, resp_msg_hdr->authtag))
-		return -EBADMSG;
-
-	return 0;
-}
-
-static int enc_payload(struct snp_msg_desc *mdesc, u64 seqno, struct snp_guest_req *req)
-{
-	struct snp_guest_msg *msg = &mdesc->secret_request;
-	struct snp_guest_msg_hdr *hdr = &msg->hdr;
-	struct aesgcm_ctx *ctx = mdesc->ctx;
-	u8 iv[GCM_AES_IV_SIZE] = {};
-
-	memset(msg, 0, sizeof(*msg));
-
-	hdr->algo = SNP_AEAD_AES_256_GCM;
-	hdr->hdr_version = MSG_HDR_VER;
-	hdr->hdr_sz = sizeof(*hdr);
-	hdr->msg_type = req->msg_type;
-	hdr->msg_version = req->msg_version;
-	hdr->msg_seqno = seqno;
-	hdr->msg_vmpck = req->vmpck_id;
-	hdr->msg_sz = req->req_sz;
-
-	/* Verify the sequence number is non-zero */
-	if (!hdr->msg_seqno)
-		return -ENOSR;
-
-	pr_debug("request [seqno %lld type %d version %d sz %d]\n",
-		 hdr->msg_seqno, hdr->msg_type, hdr->msg_version, hdr->msg_sz);
-
-	if (WARN_ON((req->req_sz + ctx->authsize) > sizeof(msg->payload)))
-		return -EBADMSG;
-
-	memcpy(iv, &hdr->msg_seqno, min(sizeof(iv), sizeof(hdr->msg_seqno)));
-	aesgcm_encrypt(ctx, msg->payload, req->req_buf, req->req_sz, &hdr->algo,
-		       AAD_LEN, iv, hdr->authtag);
-
-	return 0;
-}
-
-static int __handle_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
-				  struct snp_guest_request_ioctl *rio)
-{
-	unsigned long req_start = jiffies;
-	unsigned int override_npages = 0;
-	u64 override_err = 0;
-	int rc;
-
-retry_request:
-	/*
-	 * Call firmware to process the request. In this function the encrypted
-	 * message enters shared memory with the host. So after this call the
-	 * sequence number must be incremented or the VMPCK must be deleted to
-	 * prevent reuse of the IV.
-	 */
-	rc = snp_issue_guest_request(req, &mdesc->input, rio);
-	switch (rc) {
-	case -ENOSPC:
-		/*
-		 * If the extended guest request fails due to having too
-		 * small of a certificate data buffer, retry the same
-		 * guest request without the extended data request in
-		 * order to increment the sequence number and thus avoid
-		 * IV reuse.
-		 */
-		override_npages = mdesc->input.data_npages;
-		req->exit_code	= SVM_VMGEXIT_GUEST_REQUEST;
-
-		/*
-		 * Override the error to inform callers the given extended
-		 * request buffer size was too small and give the caller the
-		 * required buffer size.
-		 */
-		override_err = SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN);
-
-		/*
-		 * If this call to the firmware succeeds, the sequence number can
-		 * be incremented allowing for continued use of the VMPCK. If
-		 * there is an error reflected in the return value, this value
-		 * is checked further down and the result will be the deletion
-		 * of the VMPCK and the error code being propagated back to the
-		 * user as an ioctl() return code.
-		 */
-		goto retry_request;
-
-	/*
-	 * The host may return SNP_GUEST_VMM_ERR_BUSY if the request has been
-	 * throttled. Retry in the driver to avoid returning and reusing the
-	 * message sequence number on a different message.
-	 */
-	case -EAGAIN:
-		if (jiffies - req_start > SNP_REQ_MAX_RETRY_DURATION) {
-			rc = -ETIMEDOUT;
-			break;
-		}
-		schedule_timeout_killable(SNP_REQ_RETRY_DELAY);
-		goto retry_request;
-	}
-
-	/*
-	 * Increment the message sequence number. There is no harm in doing
-	 * this now because decryption uses the value stored in the response
-	 * structure and any failure will wipe the VMPCK, preventing further
-	 * use anyway.
-	 */
-	snp_inc_msg_seqno(mdesc);
-
-	if (override_err) {
-		rio->exitinfo2 = override_err;
-
-		/*
-		 * If an extended guest request was issued and the supplied certificate
-		 * buffer was not large enough, a standard guest request was issued to
-		 * prevent IV reuse. If the standard request was successful, return -EIO
-		 * back to the caller as would have originally been returned.
-		 */
-		if (!rc && override_err == SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN))
-			rc = -EIO;
-	}
-
-	if (override_npages)
-		mdesc->input.data_npages = override_npages;
-
-	return rc;
-}
-
-static int snp_send_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
-				  struct snp_guest_request_ioctl *rio)
-{
-	u64 seqno;
-	int rc;
-
-	guard(mutex)(&snp_cmd_mutex);
-
-	/* Check if the VMPCK is not empty */
-	if (is_vmpck_empty(mdesc)) {
-		pr_err_ratelimited("VMPCK is disabled\n");
-		return -ENOTTY;
-	}
-
-	/* Get message sequence and verify that its a non-zero */
-	seqno = snp_get_msg_seqno(mdesc);
-	if (!seqno)
-		return -EIO;
-
-	/* Clear shared memory's response for the host to populate. */
-	memset(mdesc->response, 0, sizeof(struct snp_guest_msg));
-
-	/* Encrypt the userspace provided payload in mdesc->secret_request. */
-	rc = enc_payload(mdesc, seqno, req);
-	if (rc)
-		return rc;
-
-	/*
-	 * Write the fully encrypted request to the shared unencrypted
-	 * request page.
-	 */
-	memcpy(mdesc->request, &mdesc->secret_request,
-	       sizeof(mdesc->secret_request));
-
-	rc = __handle_guest_request(mdesc, req, rio);
-	if (rc) {
-		if (rc == -EIO &&
-		    rio->exitinfo2 == SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN))
-			return rc;
-
-		pr_alert("Detected error from ASP request. rc: %d, exitinfo2: 0x%llx\n",
-			 rc, rio->exitinfo2);
-
-		snp_disable_vmpck(mdesc);
-		return rc;
-	}
-
-	rc = verify_and_dec_payload(mdesc, req);
-	if (rc) {
-		pr_alert("Detected unexpected decode failure from ASP. rc: %d\n", rc);
-		snp_disable_vmpck(mdesc);
-		return rc;
-	}
-
-	return 0;
-}
-
 struct snp_req_resp {
 	sockptr_t req_data;
 	sockptr_t resp_data;
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 2852fcd82cbd..6426b6d469a4 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1556,6 +1556,7 @@ config AMD_MEM_ENCRYPT
 	select ARCH_HAS_CC_PLATFORM
 	select X86_MEM_ENCRYPT
 	select UNACCEPTED_MEMORY
+	select CRYPTO_LIB_AESGCM
 	help
 	  Say yes to enable support for the encryption of system memory.
 	  This requires an AMD processor that supports Secure Memory
diff --git a/drivers/virt/coco/sev-guest/Kconfig b/drivers/virt/coco/sev-guest/Kconfig
index 0b772bd921d8..a6405ab6c2c3 100644
--- a/drivers/virt/coco/sev-guest/Kconfig
+++ b/drivers/virt/coco/sev-guest/Kconfig
@@ -2,7 +2,6 @@ config SEV_GUEST
 	tristate "AMD SEV Guest driver"
 	default m
 	depends on AMD_MEM_ENCRYPT
-	select CRYPTO_LIB_AESGCM
 	select TSM_REPORTS
 	help
 	  SEV-SNP firmware provides the guest a mechanism to communicate with
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v13 03/13] x86/sev: Add Secure TSC support for SNP guests
  2024-10-21  5:51 [PATCH v13 00/13] Add Secure TSC support for SNP guests Nikunj A Dadhania
  2024-10-21  5:51 ` [PATCH v13 01/13] x86/sev: Carve out and export SNP guest messaging init routines Nikunj A Dadhania
  2024-10-21  5:51 ` [PATCH v13 02/13] x86/sev: Relocate SNP guest messaging routines to common code Nikunj A Dadhania
@ 2024-10-21  5:51 ` Nikunj A Dadhania
  2024-10-21  8:12   ` Christophe JAILLET
  2024-10-21  5:51 ` [PATCH v13 04/13] x86/sev: Change TSC MSR behavior for Secure TSC enabled guests Nikunj A Dadhania
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 35+ messages in thread
From: Nikunj A Dadhania @ 2024-10-21  5:51 UTC (permalink / raw)
  To: linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj

Add support for Secure TSC in SNP-enabled guests. Secure TSC allows guests
to securely use RDTSC/RDTSCP instructions, ensuring that the parameters
used cannot be altered by the hypervisor once the guest is launched.

Secure TSC-enabled guests need to query TSC information from the AMD
Security Processor. This communication channel is encrypted between the AMD
Security Processor and the guest, with the hypervisor acting merely as a
conduit to deliver the guest messages to the AMD Security Processor. Each
message is protected with AEAD (AES-256 GCM). Use a minimal AES GCM library
to encrypt and decrypt SNP guest messages for communication with the PSP.

Use mem_encrypt_init() to fetch SNP TSC information from the AMD Security
Processor and initialize snp_tsc_scale and snp_tsc_offset. During secondary
CPU initialization, set the VMSA fields GUEST_TSC_SCALE (offset 2F0h) and
GUEST_TSC_OFFSET (offset 2F8h) with snp_tsc_scale and snp_tsc_offset,
respectively.

Add confidential compute platform attribute CC_ATTR_GUEST_SNP_SECURE_TSC
that can be used by the guest to query whether the Secure TSC feature is
active.

Since handle_guest_request() is common routine used by both the SEV guest
driver and Secure TSC code, move it to the SEV header file.

Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Tested-by: Peter Gonda <pgonda@google.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/include/asm/sev-common.h |  1 +
 arch/x86/include/asm/sev.h        | 46 ++++++++++++++++
 arch/x86/include/asm/svm.h        |  6 +-
 include/linux/cc_platform.h       |  8 +++
 arch/x86/coco/core.c              |  3 +
 arch/x86/coco/sev/core.c          | 91 +++++++++++++++++++++++++++++++
 arch/x86/mm/mem_encrypt.c         |  4 ++
 7 files changed, 157 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 50f5666938c0..6ef92432a5ce 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -206,6 +206,7 @@ struct snp_psc_desc {
 #define GHCB_TERM_NO_SVSM		7	/* SVSM is not advertised in the secrets page */
 #define GHCB_TERM_SVSM_VMPL0		8	/* SVSM is present but has set VMPL to 0 */
 #define GHCB_TERM_SVSM_CAA		9	/* SVSM is present but CAA is not page aligned */
+#define GHCB_TERM_SECURE_TSC		10	/* Secure TSC initialization failed */
 
 #define GHCB_RESP_CODE(v)		((v) & GHCB_MSR_INFO_MASK)
 
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index d6ad5f6b1ff3..9169b18eeb78 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -146,6 +146,9 @@ enum msg_type {
 	SNP_MSG_VMRK_REQ,
 	SNP_MSG_VMRK_RSP,
 
+	SNP_MSG_TSC_INFO_REQ = 17,
+	SNP_MSG_TSC_INFO_RSP,
+
 	SNP_MSG_TYPE_MAX
 };
 
@@ -174,6 +177,22 @@ struct snp_guest_msg {
 	u8 payload[PAGE_SIZE - sizeof(struct snp_guest_msg_hdr)];
 } __packed;
 
+#define SNP_TSC_INFO_REQ_SZ	128
+#define SNP_TSC_INFO_RESP_SZ	128
+
+struct snp_tsc_info_req {
+	u8 rsvd[SNP_TSC_INFO_REQ_SZ];
+} __packed;
+
+struct snp_tsc_info_resp {
+	u32 status;
+	u32 rsvd1;
+	u64 tsc_scale;
+	u64 tsc_offset;
+	u32 tsc_factor;
+	u8 rsvd2[100];
+} __packed;
+
 struct snp_guest_req {
 	void *req_buf;
 	size_t req_sz;
@@ -497,6 +516,27 @@ static inline void snp_msg_cleanup(struct snp_msg_desc *mdesc)
 int snp_send_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
 			   struct snp_guest_request_ioctl *rio);
 
+static inline int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code,
+				       struct snp_guest_request_ioctl *rio, u8 type,
+				       void *req_buf, size_t req_sz, void *resp_buf,
+				       u32 resp_sz)
+{
+	struct snp_guest_req req = {
+		.msg_version	= rio->msg_version,
+		.msg_type	= type,
+		.vmpck_id	= mdesc->vmpck_id,
+		.req_buf	= req_buf,
+		.req_sz		= req_sz,
+		.resp_buf	= resp_buf,
+		.resp_sz	= resp_sz,
+		.exit_code	= exit_code,
+	};
+
+	return snp_send_guest_request(mdesc, &req, rio);
+}
+
+void __init snp_secure_tsc_prepare(void);
+
 #else	/* !CONFIG_AMD_MEM_ENCRYPT */
 
 #define snp_vmpl 0
@@ -538,6 +578,12 @@ static inline struct snp_msg_desc *snp_msg_alloc(void) { return NULL; }
 static inline void snp_msg_cleanup(struct snp_msg_desc *mdesc) { }
 static inline int snp_send_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
 					 struct snp_guest_request_ioctl *rio) { return -ENODEV; }
+static inline int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code,
+				       struct snp_guest_request_ioctl *rio, u8 type,
+				       void *req_buf, size_t req_sz, void *resp_buf,
+				       u32 resp_sz) { return -ENODEV; }
+
+static inline void __init snp_secure_tsc_prepare(void) { }
 
 #endif	/* CONFIG_AMD_MEM_ENCRYPT */
 
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 2b59b9951c90..92e18798f197 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -417,7 +417,9 @@ struct sev_es_save_area {
 	u8 reserved_0x298[80];
 	u32 pkru;
 	u32 tsc_aux;
-	u8 reserved_0x2f0[24];
+	u64 tsc_scale;
+	u64 tsc_offset;
+	u8 reserved_0x300[8];
 	u64 rcx;
 	u64 rdx;
 	u64 rbx;
@@ -564,7 +566,7 @@ static inline void __unused_size_checks(void)
 	BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x1c0);
 	BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x248);
 	BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x298);
-	BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x2f0);
+	BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x300);
 	BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x320);
 	BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x380);
 	BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x3f0);
diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
index caa4b4430634..cb7103dc124f 100644
--- a/include/linux/cc_platform.h
+++ b/include/linux/cc_platform.h
@@ -88,6 +88,14 @@ enum cc_attr {
 	 * enabled to run SEV-SNP guests.
 	 */
 	CC_ATTR_HOST_SEV_SNP,
+
+	/**
+	 * @CC_ATTR_GUEST_SNP_SECURE_TSC: SNP Secure TSC is active.
+	 *
+	 * The platform/OS is running as a guest/virtual machine and actively
+	 * using AMD SEV-SNP Secure TSC feature.
+	 */
+	CC_ATTR_GUEST_SNP_SECURE_TSC,
 };
 
 #ifdef CONFIG_ARCH_HAS_CC_PLATFORM
diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
index 0f81f70aca82..5b9a358a3254 100644
--- a/arch/x86/coco/core.c
+++ b/arch/x86/coco/core.c
@@ -100,6 +100,9 @@ static bool noinstr amd_cc_platform_has(enum cc_attr attr)
 	case CC_ATTR_HOST_SEV_SNP:
 		return cc_flags.host_sev_snp;
 
+	case CC_ATTR_GUEST_SNP_SECURE_TSC:
+		return sev_status & MSR_AMD64_SNP_SECURE_TSC;
+
 	default:
 		return false;
 	}
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index c96b742789c5..965209067f03 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -98,6 +98,10 @@ static u64 secrets_pa __ro_after_init;
 
 static struct snp_msg_desc *snp_mdesc;
 
+/* Secure TSC values read using TSC_INFO SNP Guest request */
+static u64 snp_tsc_scale __ro_after_init;
+static u64 snp_tsc_offset __ro_after_init;
+
 /* #VC handler runtime per-CPU data */
 struct sev_es_runtime_data {
 	struct ghcb ghcb_page;
@@ -1148,6 +1152,12 @@ static int wakeup_cpu_via_vmgexit(u32 apic_id, unsigned long start_ip)
 	vmsa->vmpl		= snp_vmpl;
 	vmsa->sev_features	= sev_status >> 2;
 
+	/* Set Secure TSC parameters */
+	if (cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC)) {
+		vmsa->tsc_scale = snp_tsc_scale;
+		vmsa->tsc_offset = snp_tsc_offset;
+	}
+
 	/* Switch the page over to a VMSA page now that it is initialized */
 	ret = snp_set_vmsa(vmsa, caa, apic_id, true);
 	if (ret) {
@@ -2942,3 +2952,84 @@ int snp_send_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req
 	return 0;
 }
 EXPORT_SYMBOL_GPL(snp_send_guest_request);
+
+static int __init snp_get_tsc_info(void)
+{
+	static u8 buf[SNP_TSC_INFO_RESP_SZ + AUTHTAG_LEN];
+	struct snp_guest_request_ioctl rio;
+	struct snp_tsc_info_resp tsc_resp;
+	struct snp_tsc_info_req *tsc_req;
+	struct snp_msg_desc *mdesc;
+	struct snp_guest_req req;
+	int rc;
+
+	/*
+	 * The intermediate response buffer is used while decrypting the
+	 * response payload. Make sure that it has enough space to cover the
+	 * authtag.
+	 */
+	BUILD_BUG_ON(sizeof(buf) < (sizeof(tsc_resp) + AUTHTAG_LEN));
+
+	mdesc = snp_msg_alloc();
+	if (IS_ERR_OR_NULL(mdesc))
+		return -ENOMEM;
+
+	rc = snp_msg_init(mdesc, snp_vmpl);
+	if (rc)
+		return rc;
+
+	tsc_req = kzalloc(sizeof(struct snp_tsc_info_req), GFP_KERNEL);
+	if (!tsc_req)
+		return -ENOMEM;
+
+	memset(&req, 0, sizeof(req));
+	memset(&rio, 0, sizeof(rio));
+	memset(buf, 0, sizeof(buf));
+
+	req.msg_version = MSG_HDR_VER;
+	req.msg_type = SNP_MSG_TSC_INFO_REQ;
+	req.vmpck_id = snp_vmpl;
+	req.req_buf = tsc_req;
+	req.req_sz = sizeof(*tsc_req);
+	req.resp_buf = buf;
+	req.resp_sz = sizeof(tsc_resp) + AUTHTAG_LEN;
+	req.exit_code = SVM_VMGEXIT_GUEST_REQUEST;
+
+	rc = snp_send_guest_request(mdesc, &req, &rio);
+	if (rc)
+		goto err_req;
+
+	memcpy(&tsc_resp, buf, sizeof(tsc_resp));
+	pr_debug("%s: response status %x scale %llx offset %llx factor %x\n",
+		 __func__, tsc_resp.status, tsc_resp.tsc_scale, tsc_resp.tsc_offset,
+		 tsc_resp.tsc_factor);
+
+	if (tsc_resp.status == 0) {
+		snp_tsc_scale = tsc_resp.tsc_scale;
+		snp_tsc_offset = tsc_resp.tsc_offset;
+	} else {
+		pr_err("Failed to get TSC info, response status %x\n", tsc_resp.status);
+		rc = -EIO;
+	}
+
+err_req:
+	/* The response buffer contains the sensitive data, explicitly clear it. */
+	memzero_explicit(buf, sizeof(buf));
+	memzero_explicit(&tsc_resp, sizeof(tsc_resp));
+	memzero_explicit(&req, sizeof(req));
+
+	return rc;
+}
+
+void __init snp_secure_tsc_prepare(void)
+{
+	if (!cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC))
+		return;
+
+	if (snp_get_tsc_info()) {
+		pr_alert("Unable to retrieve Secure TSC info from ASP\n");
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_SECURE_TSC);
+	}
+
+	pr_debug("SecureTSC enabled");
+}
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 0a120d85d7bb..996ca27f0b72 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -94,6 +94,10 @@ void __init mem_encrypt_init(void)
 	/* Call into SWIOTLB to update the SWIOTLB DMA buffers */
 	swiotlb_update_mem_attributes();
 
+	/* Initialize SNP Secure TSC */
+	if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+		snp_secure_tsc_prepare();
+
 	print_mem_encrypt_feature_info();
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v13 04/13] x86/sev: Change TSC MSR behavior for Secure TSC enabled guests
  2024-10-21  5:51 [PATCH v13 00/13] Add Secure TSC support for SNP guests Nikunj A Dadhania
                   ` (2 preceding siblings ...)
  2024-10-21  5:51 ` [PATCH v13 03/13] x86/sev: Add Secure TSC support for SNP guests Nikunj A Dadhania
@ 2024-10-21  5:51 ` Nikunj A Dadhania
  2024-10-23  3:25   ` Xiaoyao Li
  2024-10-21  5:51 ` [PATCH v13 05/13] x86/sev: Prevent RDTSC/RDTSCP interception " Nikunj A Dadhania
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 35+ messages in thread
From: Nikunj A Dadhania @ 2024-10-21  5:51 UTC (permalink / raw)
  To: linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj

Secure TSC enabled guests should not write to MSR_IA32_TSC(10H) register as
the subsequent TSC value reads are undefined. MSR_IA32_TSC read/write
accesses should not exit to the hypervisor for such guests.

Accesses to MSR_IA32_TSC needs special handling in the #VC handler for the
guests with Secure TSC enabled. Writes to MSR_IA32_TSC should be ignored,
and reads of MSR_IA32_TSC should return the result of the RDTSC
instruction.

Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Peter Gonda <pgonda@google.com>
---
 arch/x86/coco/sev/core.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index 965209067f03..2ad7773458c0 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -1308,6 +1308,30 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
 		return ES_OK;
 	}
 
+	/*
+	 * TSC related accesses should not exit to the hypervisor when a
+	 * guest is executing with SecureTSC enabled, so special handling
+	 * is required for accesses of MSR_IA32_TSC:
+	 *
+	 * Writes: Writing to MSR_IA32_TSC can cause subsequent reads
+	 *         of the TSC to return undefined values, so ignore all
+	 *         writes.
+	 * Reads:  Reads of MSR_IA32_TSC should return the current TSC
+	 *         value, use the value returned by RDTSC.
+	 */
+	if (regs->cx == MSR_IA32_TSC && cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC)) {
+		u64 tsc;
+
+		if (exit_info_1)
+			return ES_OK;
+
+		tsc = rdtsc();
+		regs->ax = UINT_MAX & tsc;
+		regs->dx = UINT_MAX & (tsc >> 32);
+
+		return ES_OK;
+	}
+
 	ghcb_set_rcx(ghcb, regs->cx);
 	if (exit_info_1) {
 		ghcb_set_rax(ghcb, regs->ax);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v13 05/13] x86/sev: Prevent RDTSC/RDTSCP interception for Secure TSC enabled guests
  2024-10-21  5:51 [PATCH v13 00/13] Add Secure TSC support for SNP guests Nikunj A Dadhania
                   ` (3 preceding siblings ...)
  2024-10-21  5:51 ` [PATCH v13 04/13] x86/sev: Change TSC MSR behavior for Secure TSC enabled guests Nikunj A Dadhania
@ 2024-10-21  5:51 ` Nikunj A Dadhania
  2024-10-24  7:56   ` Xiaoyao Li
  2024-10-21  5:51 ` [PATCH v13 06/13] x86/sev: Prevent GUEST_TSC_FREQ MSR " Nikunj A Dadhania
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 35+ messages in thread
From: Nikunj A Dadhania @ 2024-10-21  5:51 UTC (permalink / raw)
  To: linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj

The hypervisor should not be intercepting RDTSC/RDTSCP when Secure TSC is
enabled. A #VC exception will be generated if the RDTSC/RDTSCP instructions
are being intercepted. If this should occur and Secure TSC is enabled,
terminate guest execution.

Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Tested-by: Peter Gonda <pgonda@google.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/coco/sev/shared.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/coco/sev/shared.c b/arch/x86/coco/sev/shared.c
index 71de53194089..c2a9e2ada659 100644
--- a/arch/x86/coco/sev/shared.c
+++ b/arch/x86/coco/sev/shared.c
@@ -1140,6 +1140,16 @@ static enum es_result vc_handle_rdtsc(struct ghcb *ghcb,
 	bool rdtscp = (exit_code == SVM_EXIT_RDTSCP);
 	enum es_result ret;
 
+	/*
+	 * RDTSC and RDTSCP should not be intercepted when Secure TSC is
+	 * enabled. Terminate the SNP guest when the interception is enabled.
+	 * This file is included from kernel/sev.c and boot/compressed/sev.c,
+	 * use sev_status here as cc_platform_has() is not available when
+	 * compiling boot/compressed/sev.c.
+	 */
+	if (sev_status & MSR_AMD64_SNP_SECURE_TSC)
+		return ES_VMM_ERROR;
+
 	ret = sev_es_ghcb_hv_call(ghcb, ctxt, exit_code, 0, 0);
 	if (ret != ES_OK)
 		return ret;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v13 06/13] x86/sev: Prevent GUEST_TSC_FREQ MSR interception for Secure TSC enabled guests
  2024-10-21  5:51 [PATCH v13 00/13] Add Secure TSC support for SNP guests Nikunj A Dadhania
                   ` (4 preceding siblings ...)
  2024-10-21  5:51 ` [PATCH v13 05/13] x86/sev: Prevent RDTSC/RDTSCP interception " Nikunj A Dadhania
@ 2024-10-21  5:51 ` Nikunj A Dadhania
  2024-10-21 14:08   ` Tom Lendacky
  2024-10-21  5:51 ` [PATCH v13 07/13] x86/sev: Mark Secure TSC as reliable clocksource Nikunj A Dadhania
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 35+ messages in thread
From: Nikunj A Dadhania @ 2024-10-21  5:51 UTC (permalink / raw)
  To: linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj

The hypervisor should not be intercepting GUEST_TSC_FREQ MSR(0xcOO10134)
when Secure TSC is enabled. A #VC exception will be generated if the
GUEST_TSC_FREQ MSR is being intercepted. If this should occur and SecureTSC
is enabled, terminate guest execution.

Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
 arch/x86/include/asm/msr-index.h | 1 +
 arch/x86/coco/sev/core.c         | 8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 3ae84c3b8e6d..233be13cc21f 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -608,6 +608,7 @@
 #define MSR_AMD_PERF_CTL		0xc0010062
 #define MSR_AMD_PERF_STATUS		0xc0010063
 #define MSR_AMD_PSTATE_DEF_BASE		0xc0010064
+#define MSR_AMD64_GUEST_TSC_FREQ	0xc0010134
 #define MSR_AMD64_OSVW_ID_LENGTH	0xc0010140
 #define MSR_AMD64_OSVW_STATUS		0xc0010141
 #define MSR_AMD_PPIN_CTL		0xc00102f0
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index 2ad7773458c0..4e9b1cc1f26b 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -1332,6 +1332,14 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
 		return ES_OK;
 	}
 
+	/*
+	 * GUEST_TSC_FREQ should not be intercepted when Secure TSC is
+	 * enabled. Terminate the SNP guest when the interception is enabled.
+	 */
+	if (regs->cx == MSR_AMD64_GUEST_TSC_FREQ && cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC))
+		return ES_VMM_ERROR;
+
+
 	ghcb_set_rcx(ghcb, regs->cx);
 	if (exit_info_1) {
 		ghcb_set_rax(ghcb, regs->ax);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v13 07/13] x86/sev: Mark Secure TSC as reliable clocksource
  2024-10-21  5:51 [PATCH v13 00/13] Add Secure TSC support for SNP guests Nikunj A Dadhania
                   ` (5 preceding siblings ...)
  2024-10-21  5:51 ` [PATCH v13 06/13] x86/sev: Prevent GUEST_TSC_FREQ MSR " Nikunj A Dadhania
@ 2024-10-21  5:51 ` Nikunj A Dadhania
  2024-10-21  5:51 ` [PATCH v13 08/13] x86/cpu/amd: Do not print FW_BUG for Secure TSC Nikunj A Dadhania
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 35+ messages in thread
From: Nikunj A Dadhania @ 2024-10-21  5:51 UTC (permalink / raw)
  To: linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj

In SNP guest environment with Secure TSC enabled, unlike other clock
sources (such as HPET, ACPI timer, APIC, etc.), the RDTSC instruction is
handled without causing a VM exit, resulting in minimal overhead and
jitters. Hence, mark Secure TSC as the only reliable clock source,
bypassing unstable calibration.

Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Tested-by: Peter Gonda <pgonda@google.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/mm/mem_encrypt_amd.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
index 86a476a426c2..e9fb5f24703a 100644
--- a/arch/x86/mm/mem_encrypt_amd.c
+++ b/arch/x86/mm/mem_encrypt_amd.c
@@ -516,6 +516,10 @@ void __init sme_early_init(void)
 	 * kernel mapped.
 	 */
 	snp_update_svsm_ca();
+
+	/* Mark the TSC as reliable when Secure TSC is enabled */
+	if (sev_status & MSR_AMD64_SNP_SECURE_TSC)
+		setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
 }
 
 void __init mem_encrypt_free_decrypted_mem(void)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v13 08/13] x86/cpu/amd: Do not print FW_BUG for Secure TSC
  2024-10-21  5:51 [PATCH v13 00/13] Add Secure TSC support for SNP guests Nikunj A Dadhania
                   ` (6 preceding siblings ...)
  2024-10-21  5:51 ` [PATCH v13 07/13] x86/sev: Mark Secure TSC as reliable clocksource Nikunj A Dadhania
@ 2024-10-21  5:51 ` Nikunj A Dadhania
  2024-10-21  5:51 ` [PATCH v13 09/13] tsc: Use the GUEST_TSC_FREQ MSR for discovering TSC frequency Nikunj A Dadhania
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 35+ messages in thread
From: Nikunj A Dadhania @ 2024-10-21  5:51 UTC (permalink / raw)
  To: linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj

When Secure TSC is enabled and TscInvariant (bit 8) in CPUID_8000_0007_edx
is set, the kernel complains with the below firmware bug:

[Firmware Bug]: TSC doesn't count with P0 frequency!

Secure TSC does not need to run at P0 frequency; the TSC frequency is set
by the VMM as part of the SNP_LAUNCH_START command. Skip this check when
Secure TSC is enabled

Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Tested-by: Peter Gonda <pgonda@google.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/kernel/cpu/amd.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 015971adadfc..4769c10cba04 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -370,7 +370,8 @@ static void bsp_determine_snp(struct cpuinfo_x86 *c)
 
 static void bsp_init_amd(struct cpuinfo_x86 *c)
 {
-	if (cpu_has(c, X86_FEATURE_CONSTANT_TSC)) {
+	if (cpu_has(c, X86_FEATURE_CONSTANT_TSC) &&
+	    !cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC)) {
 
 		if (c->x86 > 0x10 ||
 		    (c->x86 == 0x10 && c->x86_model >= 0x2)) {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v13 09/13] tsc: Use the GUEST_TSC_FREQ MSR for discovering TSC frequency
  2024-10-21  5:51 [PATCH v13 00/13] Add Secure TSC support for SNP guests Nikunj A Dadhania
                   ` (7 preceding siblings ...)
  2024-10-21  5:51 ` [PATCH v13 08/13] x86/cpu/amd: Do not print FW_BUG for Secure TSC Nikunj A Dadhania
@ 2024-10-21  5:51 ` Nikunj A Dadhania
  2024-10-21 14:33   ` Tom Lendacky
  2024-10-21  5:51 ` [PATCH v13 10/13] tsc: Upgrade TSC clocksource rating Nikunj A Dadhania
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 35+ messages in thread
From: Nikunj A Dadhania @ 2024-10-21  5:51 UTC (permalink / raw)
  To: linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj

Calibrating the TSC frequency using the kvmclock is not correct for
SecureTSC enabled guests. Use the platform provided TSC frequency via the
GUEST_TSC_FREQ MSR (C001_0134h).

Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
 arch/x86/include/asm/sev.h |  2 ++
 arch/x86/coco/sev/core.c   | 16 ++++++++++++++++
 arch/x86/kernel/tsc.c      |  5 +++++
 3 files changed, 23 insertions(+)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 9169b18eeb78..34f7b9fc363b 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -536,6 +536,7 @@ static inline int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code
 }
 
 void __init snp_secure_tsc_prepare(void);
+void __init securetsc_init(void);
 
 #else	/* !CONFIG_AMD_MEM_ENCRYPT */
 
@@ -584,6 +585,7 @@ static inline int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code
 				       u32 resp_sz) { return -ENODEV; }
 
 static inline void __init snp_secure_tsc_prepare(void) { }
+static inline void __init securetsc_init(void) { }
 
 #endif	/* CONFIG_AMD_MEM_ENCRYPT */
 
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index 4e9b1cc1f26b..154d568c59cf 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -3065,3 +3065,19 @@ void __init snp_secure_tsc_prepare(void)
 
 	pr_debug("SecureTSC enabled");
 }
+
+static unsigned long securetsc_get_tsc_khz(void)
+{
+	unsigned long long tsc_freq_mhz;
+
+	setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
+	rdmsrl(MSR_AMD64_GUEST_TSC_FREQ, tsc_freq_mhz);
+
+	return (unsigned long)(tsc_freq_mhz * 1000);
+}
+
+void __init securetsc_init(void)
+{
+	x86_platform.calibrate_cpu = securetsc_get_tsc_khz;
+	x86_platform.calibrate_tsc = securetsc_get_tsc_khz;
+}
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index dfe6847fd99e..c83f1091bb4f 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -30,6 +30,7 @@
 #include <asm/i8259.h>
 #include <asm/topology.h>
 #include <asm/uv/uv.h>
+#include <asm/sev.h>
 
 unsigned int __read_mostly cpu_khz;	/* TSC clocks / usec, not used here */
 EXPORT_SYMBOL(cpu_khz);
@@ -1514,6 +1515,10 @@ void __init tsc_early_init(void)
 	/* Don't change UV TSC multi-chassis synchronization */
 	if (is_early_uv_system())
 		return;
+
+	if (cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC))
+		securetsc_init();
+
 	if (!determine_cpu_tsc_frequencies(true))
 		return;
 	tsc_enable_sched_clock();
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v13 10/13] tsc: Upgrade TSC clocksource rating
  2024-10-21  5:51 [PATCH v13 00/13] Add Secure TSC support for SNP guests Nikunj A Dadhania
                   ` (8 preceding siblings ...)
  2024-10-21  5:51 ` [PATCH v13 09/13] tsc: Use the GUEST_TSC_FREQ MSR for discovering TSC frequency Nikunj A Dadhania
@ 2024-10-21  5:51 ` Nikunj A Dadhania
  2024-10-21  5:51 ` [PATCH v13 11/13] tsc: Switch to native sched clock Nikunj A Dadhania
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 35+ messages in thread
From: Nikunj A Dadhania @ 2024-10-21  5:51 UTC (permalink / raw)
  To: linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj,
	Alexey Makhalov, Juergen Gross, Boris Ostrovsky

In virtualized environments running on modern CPUs, the underlying
platforms guarantees to have a stable, always running TSC, i.e. that the
TSC is a superior timesource as compared to other clock sources (such as
kvmclock, HPET, ACPI timer, APIC, etc.).

Upgrade the rating of the early and regular clock source to prefer TSC over
other clock sources when TSC is invariant, non-stop and stable.

Cc: Alexey Makhalov <alexey.makhalov@broadcom.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
 arch/x86/kernel/tsc.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index c83f1091bb4f..27faf121fb78 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1264,6 +1264,21 @@ static void __init check_system_tsc_reliable(void)
 		tsc_disable_clocksource_watchdog();
 }
 
+static void __init upgrade_clock_rating(struct clocksource *tsc_early,
+					struct clocksource *tsc)
+{
+	/*
+	 * Upgrade the clock rating for TSC early and regular clocksource when
+	 * the underlying platform provides non-stop, invaraint and stable TSC.
+	 */
+	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
+	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
+	    !tsc_unstable) {
+		tsc_early->rating = 449;
+		tsc->rating = 450;
+	}
+}
+
 /*
  * Make an educated guess if the TSC is trustworthy and synchronized
  * over all CPUs.
@@ -1565,6 +1580,8 @@ void __init tsc_init(void)
 	if (tsc_clocksource_reliable || no_tsc_watchdog)
 		tsc_disable_clocksource_watchdog();
 
+	upgrade_clock_rating(&clocksource_tsc_early, &clocksource_tsc);
+
 	clocksource_register_khz(&clocksource_tsc_early, tsc_khz);
 	detect_art();
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v13 11/13] tsc: Switch to native sched clock
  2024-10-21  5:51 [PATCH v13 00/13] Add Secure TSC support for SNP guests Nikunj A Dadhania
                   ` (9 preceding siblings ...)
  2024-10-21  5:51 ` [PATCH v13 10/13] tsc: Upgrade TSC clocksource rating Nikunj A Dadhania
@ 2024-10-21  5:51 ` Nikunj A Dadhania
  2024-10-21 14:37   ` Tom Lendacky
                     ` (2 more replies)
  2024-10-21  5:51 ` [PATCH v13 12/13] x86/kvmclock: Abort SecureTSC enabled guest when kvmclock is selected Nikunj A Dadhania
  2024-10-21  5:51 ` [PATCH v13 13/13] x86/sev: Allow Secure TSC feature for SNP guests Nikunj A Dadhania
  12 siblings, 3 replies; 35+ messages in thread
From: Nikunj A Dadhania @ 2024-10-21  5:51 UTC (permalink / raw)
  To: linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj,
	Alexey Makhalov, Juergen Gross, Boris Ostrovsky

Although the kernel switches over to stable TSC clocksource instead of PV
clocksource, the scheduler still keeps on using PV clocks as the sched
clock source. This is because the following KVM, Xen and VMWare, switches
the paravirt sched clock handler in their init routines. The HyperV is the
only PV clock source that checks if the platform provides invariant TSC and
does not switch to PV sched clock.

When switching back to stable TSC, restore the scheduler clock to
native_sched_clock().

As the clock selection happens in the stop machine context, schedule
delayed work to update the static_call()

Cc: Alexey Makhalov <alexey.makhalov@broadcom.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
 arch/x86/kernel/tsc.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 27faf121fb78..38e35cac6c42 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -272,10 +272,25 @@ bool using_native_sched_clock(void)
 {
 	return static_call_query(pv_sched_clock) == native_sched_clock;
 }
+
+static void enable_native_sc_work(struct work_struct *work)
+{
+	pr_info("using native sched clock\n");
+	paravirt_set_sched_clock(native_sched_clock);
+}
+static DECLARE_DELAYED_WORK(enable_native_sc, enable_native_sc_work);
+
+static void enable_native_sched_clock(void)
+{
+	if (!using_native_sched_clock())
+		schedule_delayed_work(&enable_native_sc, 0);
+}
 #else
 u64 sched_clock_noinstr(void) __attribute__((alias("native_sched_clock")));
 
 bool using_native_sched_clock(void) { return true; }
+
+void enable_native_sched_clock(void) { }
 #endif
 
 notrace u64 sched_clock(void)
@@ -1157,6 +1172,10 @@ static void tsc_cs_tick_stable(struct clocksource *cs)
 static int tsc_cs_enable(struct clocksource *cs)
 {
 	vclocks_set_used(VDSO_CLOCKMODE_TSC);
+
+	/* Restore native_sched_clock() when switching to TSC */
+	enable_native_sched_clock();
+
 	return 0;
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v13 12/13] x86/kvmclock: Abort SecureTSC enabled guest when kvmclock is selected
  2024-10-21  5:51 [PATCH v13 00/13] Add Secure TSC support for SNP guests Nikunj A Dadhania
                   ` (10 preceding siblings ...)
  2024-10-21  5:51 ` [PATCH v13 11/13] tsc: Switch to native sched clock Nikunj A Dadhania
@ 2024-10-21  5:51 ` Nikunj A Dadhania
  2024-10-21 15:00   ` Tom Lendacky
  2024-10-21  5:51 ` [PATCH v13 13/13] x86/sev: Allow Secure TSC feature for SNP guests Nikunj A Dadhania
  12 siblings, 1 reply; 35+ messages in thread
From: Nikunj A Dadhania @ 2024-10-21  5:51 UTC (permalink / raw)
  To: linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj

SecureTSC enabled guests should use TSC as the only clock source, terminate
the guest with appropriate code when clock source switches to hypervisor
controlled kvmclock.

Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
 arch/x86/include/asm/sev-common.h | 1 +
 arch/x86/include/asm/sev.h        | 2 ++
 arch/x86/coco/sev/shared.c        | 3 +--
 arch/x86/kernel/kvmclock.c        | 9 +++++++++
 4 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 6ef92432a5ce..ad0743800b0e 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -207,6 +207,7 @@ struct snp_psc_desc {
 #define GHCB_TERM_SVSM_VMPL0		8	/* SVSM is present but has set VMPL to 0 */
 #define GHCB_TERM_SVSM_CAA		9	/* SVSM is present but CAA is not page aligned */
 #define GHCB_TERM_SECURE_TSC		10	/* Secure TSC initialization failed */
+#define GHCB_TERM_SECURE_TSC_KVMCLOCK	11	/* KVM clock selected instead of Secure TSC */
 
 #define GHCB_RESP_CODE(v)		((v) & GHCB_MSR_INFO_MASK)
 
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 34f7b9fc363b..783dc57f73c3 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -537,6 +537,7 @@ static inline int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code
 
 void __init snp_secure_tsc_prepare(void);
 void __init securetsc_init(void);
+void __noreturn sev_es_terminate(unsigned int set, unsigned int reason);
 
 #else	/* !CONFIG_AMD_MEM_ENCRYPT */
 
@@ -586,6 +587,7 @@ static inline int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code
 
 static inline void __init snp_secure_tsc_prepare(void) { }
 static inline void __init securetsc_init(void) { }
+static inline void sev_es_terminate(unsigned int set, unsigned int reason) { }
 
 #endif	/* CONFIG_AMD_MEM_ENCRYPT */
 
diff --git a/arch/x86/coco/sev/shared.c b/arch/x86/coco/sev/shared.c
index c2a9e2ada659..d202790e1385 100644
--- a/arch/x86/coco/sev/shared.c
+++ b/arch/x86/coco/sev/shared.c
@@ -117,8 +117,7 @@ static bool __init sev_es_check_cpu_features(void)
 	return true;
 }
 
-static void __head __noreturn
-sev_es_terminate(unsigned int set, unsigned int reason)
+void __head __noreturn sev_es_terminate(unsigned int set, unsigned int reason)
 {
 	u64 val = GHCB_MSR_TERM_REQ;
 
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 5b2c15214a6b..b135044f3c7b 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -21,6 +21,7 @@
 #include <asm/hypervisor.h>
 #include <asm/x86_init.h>
 #include <asm/kvmclock.h>
+#include <asm/sev.h>
 
 static int kvmclock __initdata = 1;
 static int kvmclock_vsyscall __initdata = 1;
@@ -150,6 +151,14 @@ bool kvm_check_and_clear_guest_paused(void)
 
 static int kvm_cs_enable(struct clocksource *cs)
 {
+	/*
+	 * For a guest with SecureTSC enabled, the TSC should be the only clock
+	 * source. Abort the guest when kvmclock is selected as the clock
+	 * source.
+	 */
+	if (cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC))
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_SECURE_TSC_KVMCLOCK);
+
 	vclocks_set_used(VDSO_CLOCKMODE_PVCLOCK);
 	return 0;
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v13 13/13] x86/sev: Allow Secure TSC feature for SNP guests
  2024-10-21  5:51 [PATCH v13 00/13] Add Secure TSC support for SNP guests Nikunj A Dadhania
                   ` (11 preceding siblings ...)
  2024-10-21  5:51 ` [PATCH v13 12/13] x86/kvmclock: Abort SecureTSC enabled guest when kvmclock is selected Nikunj A Dadhania
@ 2024-10-21  5:51 ` Nikunj A Dadhania
  12 siblings, 0 replies; 35+ messages in thread
From: Nikunj A Dadhania @ 2024-10-21  5:51 UTC (permalink / raw)
  To: linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini, nikunj

Now that all the required plumbing is done for enabling SNP Secure TSC
feature, add Secure TSC to SNP features present list.

Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Tested-by: Peter Gonda <pgonda@google.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/boot/compressed/sev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index cd44e120fe53..bb55934c1cee 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -401,7 +401,8 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
  * by the guest kernel. As and when a new feature is implemented in the
  * guest kernel, a corresponding bit should be added to the mask.
  */
-#define SNP_FEATURES_PRESENT	MSR_AMD64_SNP_DEBUG_SWAP
+#define SNP_FEATURES_PRESENT	(MSR_AMD64_SNP_DEBUG_SWAP |	\
+				 MSR_AMD64_SNP_SECURE_TSC)
 
 u64 snp_get_unsupported_features(u64 status)
 {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v13 01/13] x86/sev: Carve out and export SNP guest messaging init routines
  2024-10-21  5:51 ` [PATCH v13 01/13] x86/sev: Carve out and export SNP guest messaging init routines Nikunj A Dadhania
@ 2024-10-21  7:48   ` Christophe JAILLET
  2024-10-22  4:12     ` Nikunj A. Dadhania
  0 siblings, 1 reply; 35+ messages in thread
From: Christophe JAILLET @ 2024-10-21  7:48 UTC (permalink / raw)
  To: Nikunj A Dadhania, linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini

Le 21/10/2024 à 07:51, Nikunj A Dadhania a écrit :
> Currently, the SEV guest driver is the only user of SNP guest messaging.
> All routines for initializing SNP guest messaging are implemented within
> the SEV guest driver. To add Secure TSC guest support, these initialization
> routines need to be available during early boot.
> 
> Carve out common SNP guest messaging buffer allocations and message
> initialization routines to core/sev.c and export them. These newly added
> APIs set up the SNP message context (snp_msg_desc), which contains all the
> necessary details for sending SNP guest messages.
> 
> At present, the SEV guest platform data structure is used to pass the
> secrets page physical address to SEV guest driver. Since the secrets page
> address is locally available to the initialization routine, use the cached
> address. Remove the unused SEV guest platform data structure.
> 
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---

...

> +static inline bool is_vmpck_empty(struct snp_msg_desc *mdesc)
> +{
> +	char zero_key[VMPCK_KEY_LEN] = {0};

Nitpick: I think this could be a static const.

> +
> +	if (mdesc->vmpck)
> +		return !memcmp(mdesc->vmpck, zero_key, VMPCK_KEY_LEN);
> +
> +	return true;
> +}

...

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v13 03/13] x86/sev: Add Secure TSC support for SNP guests
  2024-10-21  5:51 ` [PATCH v13 03/13] x86/sev: Add Secure TSC support for SNP guests Nikunj A Dadhania
@ 2024-10-21  8:12   ` Christophe JAILLET
  2024-10-22  4:15     ` Nikunj A. Dadhania
  0 siblings, 1 reply; 35+ messages in thread
From: Christophe JAILLET @ 2024-10-21  8:12 UTC (permalink / raw)
  To: Nikunj A Dadhania, linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini

Le 21/10/2024 à 07:51, Nikunj A Dadhania a écrit :
> Add support for Secure TSC in SNP-enabled guests. Secure TSC allows guests
> to securely use RDTSC/RDTSCP instructions, ensuring that the parameters
> used cannot be altered by the hypervisor once the guest is launched.
> 
> Secure TSC-enabled guests need to query TSC information from the AMD
> Security Processor. This communication channel is encrypted between the AMD
> Security Processor and the guest, with the hypervisor acting merely as a
> conduit to deliver the guest messages to the AMD Security Processor. Each
> message is protected with AEAD (AES-256 GCM). Use a minimal AES GCM library
> to encrypt and decrypt SNP guest messages for communication with the PSP.
> 
> Use mem_encrypt_init() to fetch SNP TSC information from the AMD Security
> Processor and initialize snp_tsc_scale and snp_tsc_offset. During secondary
> CPU initialization, set the VMSA fields GUEST_TSC_SCALE (offset 2F0h) and
> GUEST_TSC_OFFSET (offset 2F8h) with snp_tsc_scale and snp_tsc_offset,
> respectively.
> 
> Add confidential compute platform attribute CC_ATTR_GUEST_SNP_SECURE_TSC
> that can be used by the guest to query whether the Secure TSC feature is
> active.
> 
> Since handle_guest_request() is common routine used by both the SEV guest
> driver and Secure TSC code, move it to the SEV header file.
> 
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> Tested-by: Peter Gonda <pgonda@google.com>
> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---

..

> +static int __init snp_get_tsc_info(void)
> +{
> +	static u8 buf[SNP_TSC_INFO_RESP_SZ + AUTHTAG_LEN];
> +	struct snp_guest_request_ioctl rio;
> +	struct snp_tsc_info_resp tsc_resp;
> +	struct snp_tsc_info_req *tsc_req;
> +	struct snp_msg_desc *mdesc;
> +	struct snp_guest_req req;
> +	int rc;
> +
> +	/*
> +	 * The intermediate response buffer is used while decrypting the
> +	 * response payload. Make sure that it has enough space to cover the
> +	 * authtag.
> +	 */
> +	BUILD_BUG_ON(sizeof(buf) < (sizeof(tsc_resp) + AUTHTAG_LEN));
> +
> +	mdesc = snp_msg_alloc();
> +	if (IS_ERR_OR_NULL(mdesc))
> +		return -ENOMEM;
> +
> +	rc = snp_msg_init(mdesc, snp_vmpl);
> +	if (rc)
> +		return rc;
> +
> +	tsc_req = kzalloc(sizeof(struct snp_tsc_info_req), GFP_KERNEL);
> +	if (!tsc_req)
> +		return -ENOMEM;
> +
> +	memset(&req, 0, sizeof(req));
> +	memset(&rio, 0, sizeof(rio));
> +	memset(buf, 0, sizeof(buf));
> +
> +	req.msg_version = MSG_HDR_VER;
> +	req.msg_type = SNP_MSG_TSC_INFO_REQ;
> +	req.vmpck_id = snp_vmpl;
> +	req.req_buf = tsc_req;
> +	req.req_sz = sizeof(*tsc_req);
> +	req.resp_buf = buf;
> +	req.resp_sz = sizeof(tsc_resp) + AUTHTAG_LEN;
> +	req.exit_code = SVM_VMGEXIT_GUEST_REQUEST;
> +
> +	rc = snp_send_guest_request(mdesc, &req, &rio);
> +	if (rc)
> +		goto err_req;
> +
> +	memcpy(&tsc_resp, buf, sizeof(tsc_resp));
> +	pr_debug("%s: response status %x scale %llx offset %llx factor %x\n",
> +		 __func__, tsc_resp.status, tsc_resp.tsc_scale, tsc_resp.tsc_offset,
> +		 tsc_resp.tsc_factor);
> +
> +	if (tsc_resp.status == 0) {
> +		snp_tsc_scale = tsc_resp.tsc_scale;
> +		snp_tsc_offset = tsc_resp.tsc_offset;
> +	} else {
> +		pr_err("Failed to get TSC info, response status %x\n", tsc_resp.status);
> +		rc = -EIO;
> +	}
> +
> +err_req:
> +	/* The response buffer contains the sensitive data, explicitly clear it. */
> +	memzero_explicit(buf, sizeof(buf));
> +	memzero_explicit(&tsc_resp, sizeof(tsc_resp));
> +	memzero_explicit(&req, sizeof(req));

req does not seem to hold sensitive data.
Is it needed, or maybe should it be tsc_req?

> +
> +	return rc;
> +}

...

CJ

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v13 06/13] x86/sev: Prevent GUEST_TSC_FREQ MSR interception for Secure TSC enabled guests
  2024-10-21  5:51 ` [PATCH v13 06/13] x86/sev: Prevent GUEST_TSC_FREQ MSR " Nikunj A Dadhania
@ 2024-10-21 14:08   ` Tom Lendacky
  2024-10-22  4:20     ` Nikunj A. Dadhania
  0 siblings, 1 reply; 35+ messages in thread
From: Tom Lendacky @ 2024-10-21 14:08 UTC (permalink / raw)
  To: Nikunj A Dadhania, linux-kernel, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini

On 10/21/24 00:51, Nikunj A Dadhania wrote:
> The hypervisor should not be intercepting GUEST_TSC_FREQ MSR(0xcOO10134)
> when Secure TSC is enabled. A #VC exception will be generated if the
> GUEST_TSC_FREQ MSR is being intercepted. If this should occur and SecureTSC
> is enabled, terminate guest execution.
> 
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>

Just a minor comment/question below.

> ---
>  arch/x86/include/asm/msr-index.h | 1 +
>  arch/x86/coco/sev/core.c         | 8 ++++++++
>  2 files changed, 9 insertions(+)
> 
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 3ae84c3b8e6d..233be13cc21f 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -608,6 +608,7 @@
>  #define MSR_AMD_PERF_CTL		0xc0010062
>  #define MSR_AMD_PERF_STATUS		0xc0010063
>  #define MSR_AMD_PSTATE_DEF_BASE		0xc0010064
> +#define MSR_AMD64_GUEST_TSC_FREQ	0xc0010134
>  #define MSR_AMD64_OSVW_ID_LENGTH	0xc0010140
>  #define MSR_AMD64_OSVW_STATUS		0xc0010141
>  #define MSR_AMD_PPIN_CTL		0xc00102f0
> diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
> index 2ad7773458c0..4e9b1cc1f26b 100644
> --- a/arch/x86/coco/sev/core.c
> +++ b/arch/x86/coco/sev/core.c
> @@ -1332,6 +1332,14 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
>  		return ES_OK;
>  	}
>  
> +	/*
> +	 * GUEST_TSC_FREQ should not be intercepted when Secure TSC is
> +	 * enabled. Terminate the SNP guest when the interception is enabled.
> +	 */
> +	if (regs->cx == MSR_AMD64_GUEST_TSC_FREQ && cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC))

Should the cc_platform_has() check be changed into a check against
sev_status directly (similar to the DEBUG_SWAP support)? Just in case
this handler ends up getting used in early code where cc_platform_has()
can't be used.

Thanks,
Tom

> +		return ES_VMM_ERROR;
> +
> +
>  	ghcb_set_rcx(ghcb, regs->cx);
>  	if (exit_info_1) {
>  		ghcb_set_rax(ghcb, regs->ax);

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v13 09/13] tsc: Use the GUEST_TSC_FREQ MSR for discovering TSC frequency
  2024-10-21  5:51 ` [PATCH v13 09/13] tsc: Use the GUEST_TSC_FREQ MSR for discovering TSC frequency Nikunj A Dadhania
@ 2024-10-21 14:33   ` Tom Lendacky
  2024-10-22  4:24     ` Nikunj A. Dadhania
  0 siblings, 1 reply; 35+ messages in thread
From: Tom Lendacky @ 2024-10-21 14:33 UTC (permalink / raw)
  To: Nikunj A Dadhania, linux-kernel, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini

On 10/21/24 00:51, Nikunj A Dadhania wrote:
> Calibrating the TSC frequency using the kvmclock is not correct for
> SecureTSC enabled guests. Use the platform provided TSC frequency via the
> GUEST_TSC_FREQ MSR (C001_0134h).
> 
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> ---
>  arch/x86/include/asm/sev.h |  2 ++
>  arch/x86/coco/sev/core.c   | 16 ++++++++++++++++
>  arch/x86/kernel/tsc.c      |  5 +++++
>  3 files changed, 23 insertions(+)
> 
> diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> index 9169b18eeb78..34f7b9fc363b 100644
> --- a/arch/x86/include/asm/sev.h
> +++ b/arch/x86/include/asm/sev.h
> @@ -536,6 +536,7 @@ static inline int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code
>  }
>  
>  void __init snp_secure_tsc_prepare(void);
> +void __init securetsc_init(void);
>  
>  #else	/* !CONFIG_AMD_MEM_ENCRYPT */
>  
> @@ -584,6 +585,7 @@ static inline int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code
>  				       u32 resp_sz) { return -ENODEV; }
>  
>  static inline void __init snp_secure_tsc_prepare(void) { }
> +static inline void __init securetsc_init(void) { }

This should probably be named snp_securetsc_init() or
snp_secure_tsc_init() (to be consistent with the function above it) so
that it is in the snp namespace.

>  
>  #endif	/* CONFIG_AMD_MEM_ENCRYPT */
>  
> diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
> index 4e9b1cc1f26b..154d568c59cf 100644
> --- a/arch/x86/coco/sev/core.c
> +++ b/arch/x86/coco/sev/core.c
> @@ -3065,3 +3065,19 @@ void __init snp_secure_tsc_prepare(void)
>  
>  	pr_debug("SecureTSC enabled");
>  }
> +
> +static unsigned long securetsc_get_tsc_khz(void)
> +{
> +	unsigned long long tsc_freq_mhz;
> +
> +	setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
> +	rdmsrl(MSR_AMD64_GUEST_TSC_FREQ, tsc_freq_mhz);
> +
> +	return (unsigned long)(tsc_freq_mhz * 1000);
> +}
> +
> +void __init securetsc_init(void)
> +{
> +	x86_platform.calibrate_cpu = securetsc_get_tsc_khz;
> +	x86_platform.calibrate_tsc = securetsc_get_tsc_khz;
> +}
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index dfe6847fd99e..c83f1091bb4f 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -30,6 +30,7 @@
>  #include <asm/i8259.h>
>  #include <asm/topology.h>
>  #include <asm/uv/uv.h>
> +#include <asm/sev.h>
>  
>  unsigned int __read_mostly cpu_khz;	/* TSC clocks / usec, not used here */
>  EXPORT_SYMBOL(cpu_khz);
> @@ -1514,6 +1515,10 @@ void __init tsc_early_init(void)
>  	/* Don't change UV TSC multi-chassis synchronization */
>  	if (is_early_uv_system())
>  		return;
> +
> +	if (cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC))
> +		securetsc_init();

Would this call be better in kvm_init_platform() or kvmclock_init()? Any
reason it has to be here?

Thanks,
Tom

> +
>  	if (!determine_cpu_tsc_frequencies(true))
>  		return;
>  	tsc_enable_sched_clock();

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v13 11/13] tsc: Switch to native sched clock
  2024-10-21  5:51 ` [PATCH v13 11/13] tsc: Switch to native sched clock Nikunj A Dadhania
@ 2024-10-21 14:37   ` Tom Lendacky
  2024-10-22  4:26     ` Nikunj A. Dadhania
  2024-10-21 21:30   ` kernel test robot
  2024-10-21 23:03   ` kernel test robot
  2 siblings, 1 reply; 35+ messages in thread
From: Tom Lendacky @ 2024-10-21 14:37 UTC (permalink / raw)
  To: Nikunj A Dadhania, linux-kernel, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini,
	Alexey Makhalov, Juergen Gross, Boris Ostrovsky

On 10/21/24 00:51, Nikunj A Dadhania wrote:
> Although the kernel switches over to stable TSC clocksource instead of PV
> clocksource, the scheduler still keeps on using PV clocks as the sched
> clock source. This is because the following KVM, Xen and VMWare, switches

s/the following//
s/switches/switch/

> the paravirt sched clock handler in their init routines. The HyperV is the

s/The HyperV/HyperV/

> only PV clock source that checks if the platform provides invariant TSC and

s/provides invariant/provides an invariant/

Thanks,
Tom

> does not switch to PV sched clock.
> 
> When switching back to stable TSC, restore the scheduler clock to
> native_sched_clock().
> 
> As the clock selection happens in the stop machine context, schedule
> delayed work to update the static_call()
> 
> Cc: Alexey Makhalov <alexey.makhalov@broadcom.com>
> Cc: Juergen Gross <jgross@suse.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> ---
>  arch/x86/kernel/tsc.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index 27faf121fb78..38e35cac6c42 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -272,10 +272,25 @@ bool using_native_sched_clock(void)
>  {
>  	return static_call_query(pv_sched_clock) == native_sched_clock;
>  }
> +
> +static void enable_native_sc_work(struct work_struct *work)
> +{
> +	pr_info("using native sched clock\n");
> +	paravirt_set_sched_clock(native_sched_clock);
> +}
> +static DECLARE_DELAYED_WORK(enable_native_sc, enable_native_sc_work);
> +
> +static void enable_native_sched_clock(void)
> +{
> +	if (!using_native_sched_clock())
> +		schedule_delayed_work(&enable_native_sc, 0);
> +}
>  #else
>  u64 sched_clock_noinstr(void) __attribute__((alias("native_sched_clock")));
>  
>  bool using_native_sched_clock(void) { return true; }
> +
> +void enable_native_sched_clock(void) { }
>  #endif
>  
>  notrace u64 sched_clock(void)
> @@ -1157,6 +1172,10 @@ static void tsc_cs_tick_stable(struct clocksource *cs)
>  static int tsc_cs_enable(struct clocksource *cs)
>  {
>  	vclocks_set_used(VDSO_CLOCKMODE_TSC);
> +
> +	/* Restore native_sched_clock() when switching to TSC */
> +	enable_native_sched_clock();
> +
>  	return 0;
>  }
>  

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v13 12/13] x86/kvmclock: Abort SecureTSC enabled guest when kvmclock is selected
  2024-10-21  5:51 ` [PATCH v13 12/13] x86/kvmclock: Abort SecureTSC enabled guest when kvmclock is selected Nikunj A Dadhania
@ 2024-10-21 15:00   ` Tom Lendacky
  2024-10-22  4:28     ` Nikunj A. Dadhania
  0 siblings, 1 reply; 35+ messages in thread
From: Tom Lendacky @ 2024-10-21 15:00 UTC (permalink / raw)
  To: Nikunj A Dadhania, linux-kernel, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini

On 10/21/24 00:51, Nikunj A Dadhania wrote:
> SecureTSC enabled guests should use TSC as the only clock source, terminate
> the guest with appropriate code when clock source switches to hypervisor
> controlled kvmclock.
> 
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> ---
>  arch/x86/include/asm/sev-common.h | 1 +
>  arch/x86/include/asm/sev.h        | 2 ++
>  arch/x86/coco/sev/shared.c        | 3 +--
>  arch/x86/kernel/kvmclock.c        | 9 +++++++++
>  4 files changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
> index 6ef92432a5ce..ad0743800b0e 100644
> --- a/arch/x86/include/asm/sev-common.h
> +++ b/arch/x86/include/asm/sev-common.h
> @@ -207,6 +207,7 @@ struct snp_psc_desc {
>  #define GHCB_TERM_SVSM_VMPL0		8	/* SVSM is present but has set VMPL to 0 */
>  #define GHCB_TERM_SVSM_CAA		9	/* SVSM is present but CAA is not page aligned */
>  #define GHCB_TERM_SECURE_TSC		10	/* Secure TSC initialization failed */
> +#define GHCB_TERM_SECURE_TSC_KVMCLOCK	11	/* KVM clock selected instead of Secure TSC */
>  
>  #define GHCB_RESP_CODE(v)		((v) & GHCB_MSR_INFO_MASK)
>  
> diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> index 34f7b9fc363b..783dc57f73c3 100644
> --- a/arch/x86/include/asm/sev.h
> +++ b/arch/x86/include/asm/sev.h
> @@ -537,6 +537,7 @@ static inline int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code
>  
>  void __init snp_secure_tsc_prepare(void);
>  void __init securetsc_init(void);
> +void __noreturn sev_es_terminate(unsigned int set, unsigned int reason);
>  
>  #else	/* !CONFIG_AMD_MEM_ENCRYPT */
>  
> @@ -586,6 +587,7 @@ static inline int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code
>  
>  static inline void __init snp_secure_tsc_prepare(void) { }
>  static inline void __init securetsc_init(void) { }
> +static inline void sev_es_terminate(unsigned int set, unsigned int reason) { }
>  
>  #endif	/* CONFIG_AMD_MEM_ENCRYPT */
>  
> diff --git a/arch/x86/coco/sev/shared.c b/arch/x86/coco/sev/shared.c
> index c2a9e2ada659..d202790e1385 100644
> --- a/arch/x86/coco/sev/shared.c
> +++ b/arch/x86/coco/sev/shared.c
> @@ -117,8 +117,7 @@ static bool __init sev_es_check_cpu_features(void)
>  	return true;
>  }
>  
> -static void __head __noreturn
> -sev_es_terminate(unsigned int set, unsigned int reason)
> +void __head __noreturn sev_es_terminate(unsigned int set, unsigned int reason)
>  {
>  	u64 val = GHCB_MSR_TERM_REQ;
>  
> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> index 5b2c15214a6b..b135044f3c7b 100644
> --- a/arch/x86/kernel/kvmclock.c
> +++ b/arch/x86/kernel/kvmclock.c
> @@ -21,6 +21,7 @@
>  #include <asm/hypervisor.h>
>  #include <asm/x86_init.h>
>  #include <asm/kvmclock.h>
> +#include <asm/sev.h>
>  
>  static int kvmclock __initdata = 1;
>  static int kvmclock_vsyscall __initdata = 1;
> @@ -150,6 +151,14 @@ bool kvm_check_and_clear_guest_paused(void)
>  
>  static int kvm_cs_enable(struct clocksource *cs)
>  {
> +	/*
> +	 * For a guest with SecureTSC enabled, the TSC should be the only clock
> +	 * source. Abort the guest when kvmclock is selected as the clock
> +	 * source.
> +	 */
> +	if (cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC))

if (WARN_ON(cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC)))

so that the guest sees something as well?

Thanks,
Tom

> +		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_SECURE_TSC_KVMCLOCK);
> +
>  	vclocks_set_used(VDSO_CLOCKMODE_PVCLOCK);
>  	return 0;
>  }

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v13 11/13] tsc: Switch to native sched clock
  2024-10-21  5:51 ` [PATCH v13 11/13] tsc: Switch to native sched clock Nikunj A Dadhania
  2024-10-21 14:37   ` Tom Lendacky
@ 2024-10-21 21:30   ` kernel test robot
  2024-10-21 23:03   ` kernel test robot
  2 siblings, 0 replies; 35+ messages in thread
From: kernel test robot @ 2024-10-21 21:30 UTC (permalink / raw)
  To: Nikunj A Dadhania, linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: llvm, oe-kbuild-all, mingo, tglx, dave.hansen, pgonda, seanjc,
	pbonzini, nikunj, Alexey Makhalov, Juergen Gross, Boris Ostrovsky

Hi Nikunj,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 0a895c0d9b73d934de95aa0dd4e631c394bdd25d]

url:    https://github.com/intel-lab-lkp/linux/commits/Nikunj-A-Dadhania/x86-sev-Carve-out-and-export-SNP-guest-messaging-init-routines/20241021-140125
base:   0a895c0d9b73d934de95aa0dd4e631c394bdd25d
patch link:    https://lore.kernel.org/r/20241021055156.2342564-12-nikunj%40amd.com
patch subject: [PATCH v13 11/13] tsc: Switch to native sched clock
config: x86_64-allnoconfig (https://download.01.org/0day-ci/archive/20241022/202410220542.6y7YYM8W-lkp@intel.com/config)
compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241022/202410220542.6y7YYM8W-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202410220542.6y7YYM8W-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> arch/x86/kernel/tsc.c:293:6: warning: no previous prototype for function 'enable_native_sched_clock' [-Wmissing-prototypes]
     293 | void enable_native_sched_clock(void) { }
         |      ^
   arch/x86/kernel/tsc.c:293:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
     293 | void enable_native_sched_clock(void) { }
         | ^
         | static 
   1 warning generated.


vim +/enable_native_sched_clock +293 arch/x86/kernel/tsc.c

   292	
 > 293	void enable_native_sched_clock(void) { }
   294	#endif
   295	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v13 11/13] tsc: Switch to native sched clock
  2024-10-21  5:51 ` [PATCH v13 11/13] tsc: Switch to native sched clock Nikunj A Dadhania
  2024-10-21 14:37   ` Tom Lendacky
  2024-10-21 21:30   ` kernel test robot
@ 2024-10-21 23:03   ` kernel test robot
  2 siblings, 0 replies; 35+ messages in thread
From: kernel test robot @ 2024-10-21 23:03 UTC (permalink / raw)
  To: Nikunj A Dadhania, linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: oe-kbuild-all, mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini,
	nikunj, Alexey Makhalov, Juergen Gross, Boris Ostrovsky

Hi Nikunj,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 0a895c0d9b73d934de95aa0dd4e631c394bdd25d]

url:    https://github.com/intel-lab-lkp/linux/commits/Nikunj-A-Dadhania/x86-sev-Carve-out-and-export-SNP-guest-messaging-init-routines/20241021-140125
base:   0a895c0d9b73d934de95aa0dd4e631c394bdd25d
patch link:    https://lore.kernel.org/r/20241021055156.2342564-12-nikunj%40amd.com
patch subject: [PATCH v13 11/13] tsc: Switch to native sched clock
config: i386-buildonly-randconfig-002-20241022 (https://download.01.org/0day-ci/archive/20241022/202410220640.W8Ynjt5W-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241022/202410220640.W8Ynjt5W-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202410220640.W8Ynjt5W-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> arch/x86/kernel/tsc.c:293:6: warning: no previous prototype for 'enable_native_sched_clock' [-Wmissing-prototypes]
     293 | void enable_native_sched_clock(void) { }
         |      ^~~~~~~~~~~~~~~~~~~~~~~~~


vim +/enable_native_sched_clock +293 arch/x86/kernel/tsc.c

   292	
 > 293	void enable_native_sched_clock(void) { }
   294	#endif
   295	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v13 01/13] x86/sev: Carve out and export SNP guest messaging init routines
  2024-10-21  7:48   ` Christophe JAILLET
@ 2024-10-22  4:12     ` Nikunj A. Dadhania
  0 siblings, 0 replies; 35+ messages in thread
From: Nikunj A. Dadhania @ 2024-10-22  4:12 UTC (permalink / raw)
  To: Christophe JAILLET, linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini

On 10/21/2024 1:18 PM, Christophe JAILLET wrote:
> Le 21/10/2024 à 07:51, Nikunj A Dadhania a écrit :
>> Currently, the SEV guest driver is the only user of SNP guest messaging.
>> All routines for initializing SNP guest messaging are implemented within
>> the SEV guest driver. To add Secure TSC guest support, these initialization
>> routines need to be available during early boot.
>>
>> Carve out common SNP guest messaging buffer allocations and message
>> initialization routines to core/sev.c and export them. These newly added
>> APIs set up the SNP message context (snp_msg_desc), which contains all the
>> necessary details for sending SNP guest messages.
>>
>> At present, the SEV guest platform data structure is used to pass the
>> secrets page physical address to SEV guest driver. Since the secrets page
>> address is locally available to the initialization routine, use the cached
>> address. Remove the unused SEV guest platform data structure.
>>
>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
>> ---
> 
> ...
> 
>> +static inline bool is_vmpck_empty(struct snp_msg_desc *mdesc)
>> +{
>> +    char zero_key[VMPCK_KEY_LEN] = {0};
> 
> Nitpick: I think this could be a static const.
> 

As the primary objective of this patch is to move the code, I have retained the
previous implementation.

Regards
Nikunj

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v13 03/13] x86/sev: Add Secure TSC support for SNP guests
  2024-10-21  8:12   ` Christophe JAILLET
@ 2024-10-22  4:15     ` Nikunj A. Dadhania
  0 siblings, 0 replies; 35+ messages in thread
From: Nikunj A. Dadhania @ 2024-10-22  4:15 UTC (permalink / raw)
  To: Christophe JAILLET, linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini

On 10/21/2024 1:42 PM, Christophe JAILLET wrote:
> Le 21/10/2024 à 07:51, Nikunj A Dadhania a écrit :
> 
> ..
> 
>> +static int __init snp_get_tsc_info(void)
>> +{
>> +    static u8 buf[SNP_TSC_INFO_RESP_SZ + AUTHTAG_LEN];
>> +    struct snp_guest_request_ioctl rio;
>> +    struct snp_tsc_info_resp tsc_resp;
>> +    struct snp_tsc_info_req *tsc_req;
>> +    struct snp_msg_desc *mdesc;
>> +    struct snp_guest_req req;
>> +    int rc;
>> +
>> +    /*
>> +     * The intermediate response buffer is used while decrypting the
>> +     * response payload. Make sure that it has enough space to cover the
>> +     * authtag.
>> +     */
>> +    BUILD_BUG_ON(sizeof(buf) < (sizeof(tsc_resp) + AUTHTAG_LEN));
>> +
>> +    mdesc = snp_msg_alloc();
>> +    if (IS_ERR_OR_NULL(mdesc))
>> +        return -ENOMEM;
>> +
>> +    rc = snp_msg_init(mdesc, snp_vmpl);
>> +    if (rc)
>> +        return rc;
>> +
>> +    tsc_req = kzalloc(sizeof(struct snp_tsc_info_req), GFP_KERNEL);
>> +    if (!tsc_req)
>> +        return -ENOMEM;
>> +
>> +    memset(&req, 0, sizeof(req));
>> +    memset(&rio, 0, sizeof(rio));
>> +    memset(buf, 0, sizeof(buf));
>> +
>> +    req.msg_version = MSG_HDR_VER;
>> +    req.msg_type = SNP_MSG_TSC_INFO_REQ;
>> +    req.vmpck_id = snp_vmpl;
>> +    req.req_buf = tsc_req;
>> +    req.req_sz = sizeof(*tsc_req);
>> +    req.resp_buf = buf;
>> +    req.resp_sz = sizeof(tsc_resp) + AUTHTAG_LEN;
>> +    req.exit_code = SVM_VMGEXIT_GUEST_REQUEST;
>> +
>> +    rc = snp_send_guest_request(mdesc, &req, &rio);
>> +    if (rc)
>> +        goto err_req;
>> +
>> +    memcpy(&tsc_resp, buf, sizeof(tsc_resp));
>> +    pr_debug("%s: response status %x scale %llx offset %llx factor %x\n",
>> +         __func__, tsc_resp.status, tsc_resp.tsc_scale, tsc_resp.tsc_offset,
>> +         tsc_resp.tsc_factor);
>> +
>> +    if (tsc_resp.status == 0) {
>> +        snp_tsc_scale = tsc_resp.tsc_scale;
>> +        snp_tsc_offset = tsc_resp.tsc_offset;
>> +    } else {
>> +        pr_err("Failed to get TSC info, response status %x\n", tsc_resp.status);
>> +        rc = -EIO;
>> +    }
>> +
>> +err_req:
>> +    /* The response buffer contains the sensitive data, explicitly clear it. */
>> +    memzero_explicit(buf, sizeof(buf));
>> +    memzero_explicit(&tsc_resp, sizeof(tsc_resp));
>> +    memzero_explicit(&req, sizeof(req));
> 
> req does not seem to hold sensitive data.

That is correct, I will remove that.

> Is it needed, or maybe should it be tsc_req?

No, and tsc_req is zeroed by the caller and is not updated by AMD
security processor.

Regards
Nikunj

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v13 06/13] x86/sev: Prevent GUEST_TSC_FREQ MSR interception for Secure TSC enabled guests
  2024-10-21 14:08   ` Tom Lendacky
@ 2024-10-22  4:20     ` Nikunj A. Dadhania
  0 siblings, 0 replies; 35+ messages in thread
From: Nikunj A. Dadhania @ 2024-10-22  4:20 UTC (permalink / raw)
  To: Tom Lendacky, linux-kernel, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini

On 10/21/2024 7:38 PM, Tom Lendacky wrote:
> On 10/21/24 00:51, Nikunj A Dadhania wrote:
>> The hypervisor should not be intercepting GUEST_TSC_FREQ MSR(0xcOO10134)
>> when Secure TSC is enabled. A #VC exception will be generated if the
>> GUEST_TSC_FREQ MSR is being intercepted. If this should occur and SecureTSC
>> is enabled, terminate guest execution.
>>
>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> 
> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> 
> Just a minor comment/question below.
> 
..
>> diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
>> index 2ad7773458c0..4e9b1cc1f26b 100644
>> --- a/arch/x86/coco/sev/core.c
>> +++ b/arch/x86/coco/sev/core.c
>> @@ -1332,6 +1332,14 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
>>  		return ES_OK;
>>  	}
>>  
>> +	/*
>> +	 * GUEST_TSC_FREQ should not be intercepted when Secure TSC is
>> +	 * enabled. Terminate the SNP guest when the interception is enabled.
>> +	 */
>> +	if (regs->cx == MSR_AMD64_GUEST_TSC_FREQ && cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC))
> 
> Should the cc_platform_has() check be changed into a check against
> sev_status directly (similar to the DEBUG_SWAP support)? Just in case
> this handler ends up getting used in early code where cc_platform_has()
> can't be used.

Sure, that makes sense. I will change it in my next version.

Regards
Nikunj

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v13 09/13] tsc: Use the GUEST_TSC_FREQ MSR for discovering TSC frequency
  2024-10-21 14:33   ` Tom Lendacky
@ 2024-10-22  4:24     ` Nikunj A. Dadhania
  0 siblings, 0 replies; 35+ messages in thread
From: Nikunj A. Dadhania @ 2024-10-22  4:24 UTC (permalink / raw)
  To: Tom Lendacky, linux-kernel, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini

On 10/21/2024 8:03 PM, Tom Lendacky wrote:
> On 10/21/24 00:51, Nikunj A Dadhania wrote:
>> Calibrating the TSC frequency using the kvmclock is not correct for
>> SecureTSC enabled guests. Use the platform provided TSC frequency via the
>> GUEST_TSC_FREQ MSR (C001_0134h).
>>
>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>> ---
>>  arch/x86/include/asm/sev.h |  2 ++
>>  arch/x86/coco/sev/core.c   | 16 ++++++++++++++++
>>  arch/x86/kernel/tsc.c      |  5 +++++
>>  3 files changed, 23 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
>> index 9169b18eeb78..34f7b9fc363b 100644
>> --- a/arch/x86/include/asm/sev.h
>> +++ b/arch/x86/include/asm/sev.h
>> @@ -536,6 +536,7 @@ static inline int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code
>>  }
>>  
>>  void __init snp_secure_tsc_prepare(void);
>> +void __init securetsc_init(void);
>>  
>>  #else	/* !CONFIG_AMD_MEM_ENCRYPT */
>>  
>> @@ -584,6 +585,7 @@ static inline int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code
>>  				       u32 resp_sz) { return -ENODEV; }
>>  
>>  static inline void __init snp_secure_tsc_prepare(void) { }
>> +static inline void __init securetsc_init(void) { }
> 
> This should probably be named snp_securetsc_init() or
> snp_secure_tsc_init() (to be consistent with the function above it) so
> that it is in the snp namespace.

Ack.

>> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
>> index dfe6847fd99e..c83f1091bb4f 100644
>> --- a/arch/x86/kernel/tsc.c
>> +++ b/arch/x86/kernel/tsc.c
>> @@ -30,6 +30,7 @@
>>  #include <asm/i8259.h>
>>  #include <asm/topology.h>
>>  #include <asm/uv/uv.h>
>> +#include <asm/sev.h>
>>  
>>  unsigned int __read_mostly cpu_khz;	/* TSC clocks / usec, not used here */
>>  EXPORT_SYMBOL(cpu_khz);
>> @@ -1514,6 +1515,10 @@ void __init tsc_early_init(void)
>>  	/* Don't change UV TSC multi-chassis synchronization */
>>  	if (is_early_uv_system())
>>  		return;
>> +
>> +	if (cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC))
>> +		securetsc_init();
> 
> Would this call be better in kvm_init_platform() or kvmclock_init()? Any
> reason it has to be here?

Added here to make sure that initialization happens on all the hypervisor,
not just for KVM. Moreover, kvmclock can be disabled from the command line.

Regards,
Nikunj

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v13 11/13] tsc: Switch to native sched clock
  2024-10-21 14:37   ` Tom Lendacky
@ 2024-10-22  4:26     ` Nikunj A. Dadhania
  0 siblings, 0 replies; 35+ messages in thread
From: Nikunj A. Dadhania @ 2024-10-22  4:26 UTC (permalink / raw)
  To: Tom Lendacky, linux-kernel, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini,
	Alexey Makhalov, Juergen Gross, Boris Ostrovsky

On 10/21/2024 8:07 PM, Tom Lendacky wrote:
> On 10/21/24 00:51, Nikunj A Dadhania wrote:
>> Although the kernel switches over to stable TSC clocksource instead of PV
>> clocksource, the scheduler still keeps on using PV clocks as the sched
>> clock source. This is because the following KVM, Xen and VMWare, switches
> 
> s/the following//
> s/switches/switch/
> 
>> the paravirt sched clock handler in their init routines. The HyperV is the
> 
> s/The HyperV/HyperV/
> 
>> only PV clock source that checks if the platform provides invariant TSC and
> 
> s/provides invariant/provides an invariant/

Sure, I will update the message.

Regards
Nikunj

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v13 12/13] x86/kvmclock: Abort SecureTSC enabled guest when kvmclock is selected
  2024-10-21 15:00   ` Tom Lendacky
@ 2024-10-22  4:28     ` Nikunj A. Dadhania
  0 siblings, 0 replies; 35+ messages in thread
From: Nikunj A. Dadhania @ 2024-10-22  4:28 UTC (permalink / raw)
  To: Tom Lendacky, linux-kernel, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini

On 10/21/2024 8:30 PM, Tom Lendacky wrote:
> On 10/21/24 00:51, Nikunj A Dadhania wrote:
>> SecureTSC enabled guests should use TSC as the only clock source, terminate
>> the guest with appropriate code when clock source switches to hypervisor
>> controlled kvmclock.
>>
>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>> ---
>>  arch/x86/include/asm/sev-common.h | 1 +
>>  arch/x86/include/asm/sev.h        | 2 ++
>>  arch/x86/coco/sev/shared.c        | 3 +--
>>  arch/x86/kernel/kvmclock.c        | 9 +++++++++
>>  4 files changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
>> index 6ef92432a5ce..ad0743800b0e 100644
>> --- a/arch/x86/include/asm/sev-common.h
>> +++ b/arch/x86/include/asm/sev-common.h
>> @@ -207,6 +207,7 @@ struct snp_psc_desc {
>>  #define GHCB_TERM_SVSM_VMPL0		8	/* SVSM is present but has set VMPL to 0 */
>>  #define GHCB_TERM_SVSM_CAA		9	/* SVSM is present but CAA is not page aligned */
>>  #define GHCB_TERM_SECURE_TSC		10	/* Secure TSC initialization failed */
>> +#define GHCB_TERM_SECURE_TSC_KVMCLOCK	11	/* KVM clock selected instead of Secure TSC */
>>  
>>  #define GHCB_RESP_CODE(v)		((v) & GHCB_MSR_INFO_MASK)
>>  
>> diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
>> index 34f7b9fc363b..783dc57f73c3 100644
>> --- a/arch/x86/include/asm/sev.h
>> +++ b/arch/x86/include/asm/sev.h
>> @@ -537,6 +537,7 @@ static inline int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code
>>  
>>  void __init snp_secure_tsc_prepare(void);
>>  void __init securetsc_init(void);
>> +void __noreturn sev_es_terminate(unsigned int set, unsigned int reason);
>>  
>>  #else	/* !CONFIG_AMD_MEM_ENCRYPT */
>>  
>> @@ -586,6 +587,7 @@ static inline int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code
>>  
>>  static inline void __init snp_secure_tsc_prepare(void) { }
>>  static inline void __init securetsc_init(void) { }
>> +static inline void sev_es_terminate(unsigned int set, unsigned int reason) { }
>>  
>>  #endif	/* CONFIG_AMD_MEM_ENCRYPT */
>>  
>> diff --git a/arch/x86/coco/sev/shared.c b/arch/x86/coco/sev/shared.c
>> index c2a9e2ada659..d202790e1385 100644
>> --- a/arch/x86/coco/sev/shared.c
>> +++ b/arch/x86/coco/sev/shared.c
>> @@ -117,8 +117,7 @@ static bool __init sev_es_check_cpu_features(void)
>>  	return true;
>>  }
>>  
>> -static void __head __noreturn
>> -sev_es_terminate(unsigned int set, unsigned int reason)
>> +void __head __noreturn sev_es_terminate(unsigned int set, unsigned int reason)
>>  {
>>  	u64 val = GHCB_MSR_TERM_REQ;
>>  
>> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
>> index 5b2c15214a6b..b135044f3c7b 100644
>> --- a/arch/x86/kernel/kvmclock.c
>> +++ b/arch/x86/kernel/kvmclock.c
>> @@ -21,6 +21,7 @@
>>  #include <asm/hypervisor.h>
>>  #include <asm/x86_init.h>
>>  #include <asm/kvmclock.h>
>> +#include <asm/sev.h>
>>  
>>  static int kvmclock __initdata = 1;
>>  static int kvmclock_vsyscall __initdata = 1;
>> @@ -150,6 +151,14 @@ bool kvm_check_and_clear_guest_paused(void)
>>  
>>  static int kvm_cs_enable(struct clocksource *cs)
>>  {
>> +	/*
>> +	 * For a guest with SecureTSC enabled, the TSC should be the only clock
>> +	 * source. Abort the guest when kvmclock is selected as the clock
>> +	 * source.
>> +	 */
>> +	if (cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC))
> 
> if (WARN_ON(cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC)))
> 
> so that the guest sees something as well?

Sure, will add.

Regards
Nikunj

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v13 04/13] x86/sev: Change TSC MSR behavior for Secure TSC enabled guests
  2024-10-21  5:51 ` [PATCH v13 04/13] x86/sev: Change TSC MSR behavior for Secure TSC enabled guests Nikunj A Dadhania
@ 2024-10-23  3:25   ` Xiaoyao Li
  2024-10-24  6:24     ` Nikunj A. Dadhania
  0 siblings, 1 reply; 35+ messages in thread
From: Xiaoyao Li @ 2024-10-23  3:25 UTC (permalink / raw)
  To: Nikunj A Dadhania, linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini

On 10/21/2024 1:51 PM, Nikunj A Dadhania wrote:
> Secure TSC enabled guests should not write to MSR_IA32_TSC(10H) register as
> the subsequent TSC value reads are undefined. MSR_IA32_TSC read/write
> accesses should not exit to the hypervisor for such guests.
> 
> Accesses to MSR_IA32_TSC needs special handling in the #VC handler for the
> guests with Secure TSC enabled. Writes to MSR_IA32_TSC should be ignored,
> and reads of MSR_IA32_TSC should return the result of the RDTSC
> instruction.
> 
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> Tested-by: Peter Gonda <pgonda@google.com>
> ---
>   arch/x86/coco/sev/core.c | 24 ++++++++++++++++++++++++
>   1 file changed, 24 insertions(+)
> 
> diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
> index 965209067f03..2ad7773458c0 100644
> --- a/arch/x86/coco/sev/core.c
> +++ b/arch/x86/coco/sev/core.c
> @@ -1308,6 +1308,30 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
>   		return ES_OK;
>   	}
>   
> +	/*
> +	 * TSC related accesses should not exit to the hypervisor when a
> +	 * guest is executing with SecureTSC enabled, so special handling
> +	 * is required for accesses of MSR_IA32_TSC:
> +	 *
> +	 * Writes: Writing to MSR_IA32_TSC can cause subsequent reads
> +	 *         of the TSC to return undefined values, so ignore all
> +	 *         writes.
> +	 * Reads:  Reads of MSR_IA32_TSC should return the current TSC
> +	 *         value, use the value returned by RDTSC.
> +	 */

Why doesn't handle it by returning ES_VMM_ERROR when hypervisor 
intercepts RD/WR of MSR_IA32_TSC? With SECURE_TSC enabled, it seems not 
need to be intercepted.

I think the reason is that SNP guest relies on interception to do the 
ignore behavior for WRMSR in #VC handler because the writing leads to 
undefined result. Then the question is what if the hypervisor doesn't 
intercept write to MSR_IA32_TSC in the first place?

> +	if (regs->cx == MSR_IA32_TSC && cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC)) {
> +		u64 tsc;
> +
> +		if (exit_info_1)
> +			return ES_OK;
> +
> +		tsc = rdtsc();
> +		regs->ax = UINT_MAX & tsc;
> +		regs->dx = UINT_MAX & (tsc >> 32);
> +
> +		return ES_OK;
> +	}
> +
>   	ghcb_set_rcx(ghcb, regs->cx);
>   	if (exit_info_1) {
>   		ghcb_set_rax(ghcb, regs->ax);


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v13 04/13] x86/sev: Change TSC MSR behavior for Secure TSC enabled guests
  2024-10-23  3:25   ` Xiaoyao Li
@ 2024-10-24  6:24     ` Nikunj A. Dadhania
  2024-10-24  7:31       ` Xiaoyao Li
  0 siblings, 1 reply; 35+ messages in thread
From: Nikunj A. Dadhania @ 2024-10-24  6:24 UTC (permalink / raw)
  To: Xiaoyao Li, linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini



On 10/23/2024 8:55 AM, Xiaoyao Li wrote:
> On 10/21/2024 1:51 PM, Nikunj A Dadhania wrote:
>> Secure TSC enabled guests should not write to MSR_IA32_TSC(10H) register as
>> the subsequent TSC value reads are undefined. MSR_IA32_TSC read/write
>> accesses should not exit to the hypervisor for such guests.
>>
>> Accesses to MSR_IA32_TSC needs special handling in the #VC handler for the
>> guests with Secure TSC enabled. Writes to MSR_IA32_TSC should be ignored,
>> and reads of MSR_IA32_TSC should return the result of the RDTSC
>> instruction.
>>
>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
>> Tested-by: Peter Gonda <pgonda@google.com>
>> ---
>>   arch/x86/coco/sev/core.c | 24 ++++++++++++++++++++++++
>>   1 file changed, 24 insertions(+)
>>
>> diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
>> index 965209067f03..2ad7773458c0 100644
>> --- a/arch/x86/coco/sev/core.c
>> +++ b/arch/x86/coco/sev/core.c
>> @@ -1308,6 +1308,30 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
>>           return ES_OK;
>>       }
>>   +    /*
>> +     * TSC related accesses should not exit to the hypervisor when a
>> +     * guest is executing with SecureTSC enabled, so special handling
>> +     * is required for accesses of MSR_IA32_TSC:
>> +     *
>> +     * Writes: Writing to MSR_IA32_TSC can cause subsequent reads
>> +     *         of the TSC to return undefined values, so ignore all
>> +     *         writes.
>> +     * Reads:  Reads of MSR_IA32_TSC should return the current TSC
>> +     *         value, use the value returned by RDTSC.
>> +     */
> 
> Why doesn't handle it by returning ES_VMM_ERROR when hypervisor intercepts
> RD/WR of MSR_IA32_TSC? With SECURE_TSC enabled, it seems not need to be
> intercepted.

ES_VMM_ERROR will terminate the guest, which is not the expected behaviour. As
documented, writes to the MSR is ignored and reads are done using RDTSC.

> I think the reason is that SNP guest relies on interception to do the ignore
> behavior for WRMSR in #VC handler because the writing leads to undefined
> result.

For legacy and secure guests MSR_IA32_TSC is always intercepted(for both RD/WR).
Moreover, this is a legacy MSR, RDTSC and RDTSCP is the what modern OSes should
use. The idea is the catch any writes to TSC MSR and handle them gracefully.

> Then the question is what if the hypervisor doesn't intercept write to
> MSR_IA32_TSC in the first place?

I have tried to disable interception of MSR_IA32_TSC, and writes are ignored by
the HW as well. I would like to continue the current documented HW as per the APM.

Regards,
Nikunj

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v13 04/13] x86/sev: Change TSC MSR behavior for Secure TSC enabled guests
  2024-10-24  6:24     ` Nikunj A. Dadhania
@ 2024-10-24  7:31       ` Xiaoyao Li
  2024-10-24 10:16         ` Nikunj A. Dadhania
  0 siblings, 1 reply; 35+ messages in thread
From: Xiaoyao Li @ 2024-10-24  7:31 UTC (permalink / raw)
  To: Nikunj A. Dadhania, linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini

On 10/24/2024 2:24 PM, Nikunj A. Dadhania wrote:
> 
> 
> On 10/23/2024 8:55 AM, Xiaoyao Li wrote:
>> On 10/21/2024 1:51 PM, Nikunj A Dadhania wrote:
>>> Secure TSC enabled guests should not write to MSR_IA32_TSC(10H) register as
>>> the subsequent TSC value reads are undefined. MSR_IA32_TSC read/write
>>> accesses should not exit to the hypervisor for such guests.
>>>
>>> Accesses to MSR_IA32_TSC needs special handling in the #VC handler for the
>>> guests with Secure TSC enabled. Writes to MSR_IA32_TSC should be ignored,
>>> and reads of MSR_IA32_TSC should return the result of the RDTSC
>>> instruction.
>>>
>>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>>> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
>>> Tested-by: Peter Gonda <pgonda@google.com>
>>> ---
>>>    arch/x86/coco/sev/core.c | 24 ++++++++++++++++++++++++
>>>    1 file changed, 24 insertions(+)
>>>
>>> diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
>>> index 965209067f03..2ad7773458c0 100644
>>> --- a/arch/x86/coco/sev/core.c
>>> +++ b/arch/x86/coco/sev/core.c
>>> @@ -1308,6 +1308,30 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
>>>            return ES_OK;
>>>        }
>>>    +    /*
>>> +     * TSC related accesses should not exit to the hypervisor when a
>>> +     * guest is executing with SecureTSC enabled, so special handling
>>> +     * is required for accesses of MSR_IA32_TSC:
>>> +     *
>>> +     * Writes: Writing to MSR_IA32_TSC can cause subsequent reads
>>> +     *         of the TSC to return undefined values, so ignore all
>>> +     *         writes.
>>> +     * Reads:  Reads of MSR_IA32_TSC should return the current TSC
>>> +     *         value, use the value returned by RDTSC.
>>> +     */
>>
>> Why doesn't handle it by returning ES_VMM_ERROR when hypervisor intercepts
>> RD/WR of MSR_IA32_TSC? With SECURE_TSC enabled, it seems not need to be
>> intercepted.
> 
> ES_VMM_ERROR will terminate the guest, which is not the expected behaviour. As
> documented, writes to the MSR is ignored and reads are done using RDTSC.
> 
>> I think the reason is that SNP guest relies on interception to do the ignore
>> behavior for WRMSR in #VC handler because the writing leads to undefined
>> result.
> 
> For legacy and secure guests MSR_IA32_TSC is always intercepted(for both RD/WR).

We cannot make such assumption unless it's enforced by AMD HW.

> Moreover, this is a legacy MSR, RDTSC and RDTSCP is the what modern OSes should
> use. 

Again, this is your assumption and expectation only.

> The idea is the catch any writes to TSC MSR and handle them gracefully.

If SNP guest requires MSR_IA32_TSC being intercepted by hypervisor. It 
should come with a solution that guest kernel can check it certainly, 
just like the patch 5 and patch 6, that they can check the behavior of 
hypervisor.

If there is no clean way for guest to ensure MSR_IA32_TSC is intercepted 
by hypervisor, we at least need add some comment to call out that these 
code replies on the assumption that hypervisor intercepts MSR_IA32_TSC.

>> Then the question is what if the hypervisor doesn't intercept write to
>> MSR_IA32_TSC in the first place?
> 
> I have tried to disable interception of MSR_IA32_TSC, and writes are ignored by
> the HW as well. I would like to continue the current documented HW as per the APM.

I only means the writes are ignored in your testing HW. We don't know 
the result on other SNP-capable HW or future HW, unless it's documented 
in APM.

Current documented behavior is that write leads to a undefined value of 
subsequent read. So we need to avoid write. One solution is to intercept 
write and ignore it, but it depends on hypervisor to intercept it. 
Anther solution would be we fix all the place of writing MSR_IA32_TSC 
for SNP guest in linux.

> Regards,
> Nikunj


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v13 05/13] x86/sev: Prevent RDTSC/RDTSCP interception for Secure TSC enabled guests
  2024-10-21  5:51 ` [PATCH v13 05/13] x86/sev: Prevent RDTSC/RDTSCP interception " Nikunj A Dadhania
@ 2024-10-24  7:56   ` Xiaoyao Li
  2024-10-24  8:44     ` Nikunj A. Dadhania
  0 siblings, 1 reply; 35+ messages in thread
From: Xiaoyao Li @ 2024-10-24  7:56 UTC (permalink / raw)
  To: Nikunj A Dadhania, linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini

On 10/21/2024 1:51 PM, Nikunj A Dadhania wrote:
> The hypervisor should not be intercepting RDTSC/RDTSCP when Secure TSC is
> enabled. A #VC exception will be generated if the RDTSC/RDTSCP instructions
> are being intercepted. If this should occur and Secure TSC is enabled,
> terminate guest execution.

There is another option to ignore the interception and just return back 
to guest execution. I think it better to add some justification on why 
make it fatal and terminate the guest is better than ignoring the 
interception.

> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> Tested-by: Peter Gonda <pgonda@google.com>
> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
>   arch/x86/coco/sev/shared.c | 10 ++++++++++
>   1 file changed, 10 insertions(+)
> 
> diff --git a/arch/x86/coco/sev/shared.c b/arch/x86/coco/sev/shared.c
> index 71de53194089..c2a9e2ada659 100644
> --- a/arch/x86/coco/sev/shared.c
> +++ b/arch/x86/coco/sev/shared.c
> @@ -1140,6 +1140,16 @@ static enum es_result vc_handle_rdtsc(struct ghcb *ghcb,
>   	bool rdtscp = (exit_code == SVM_EXIT_RDTSCP);
>   	enum es_result ret;
>   
> +	/*
> +	 * RDTSC and RDTSCP should not be intercepted when Secure TSC is
> +	 * enabled. Terminate the SNP guest when the interception is enabled.
> +	 * This file is included from kernel/sev.c and boot/compressed/sev.c,
> +	 * use sev_status here as cc_platform_has() is not available when
> +	 * compiling boot/compressed/sev.c.
> +	 */
> +	if (sev_status & MSR_AMD64_SNP_SECURE_TSC)
> +		return ES_VMM_ERROR;
> +
>   	ret = sev_es_ghcb_hv_call(ghcb, ctxt, exit_code, 0, 0);
>   	if (ret != ES_OK)
>   		return ret;


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v13 05/13] x86/sev: Prevent RDTSC/RDTSCP interception for Secure TSC enabled guests
  2024-10-24  7:56   ` Xiaoyao Li
@ 2024-10-24  8:44     ` Nikunj A. Dadhania
  2024-10-24 15:34       ` Xiaoyao Li
  0 siblings, 1 reply; 35+ messages in thread
From: Nikunj A. Dadhania @ 2024-10-24  8:44 UTC (permalink / raw)
  To: Xiaoyao Li, linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini



On 10/24/2024 1:26 PM, Xiaoyao Li wrote:
> On 10/21/2024 1:51 PM, Nikunj A Dadhania wrote:
>> The hypervisor should not be intercepting RDTSC/RDTSCP when Secure TSC is
>> enabled. A #VC exception will be generated if the RDTSC/RDTSCP instructions
>> are being intercepted. If this should occur and Secure TSC is enabled,
>> terminate guest execution.
> 
> There is another option to ignore the interception and just return back to
> guest execution. 

That is not correct, RDTSC/RDTSCP should return the timestamp counter value
computed using the GUEST_TSC_SCALE and GUEST_TSC_OFFSET part of VMSA.
 
> I think it better to add some justification on why make it> fatal and terminate the guest is better than ignoring the interception.

How about the below updated commit message:

The hypervisor should not be intercepting RDTSC/RDTSCP when Secure TSC is
enabled. A #VC exception will be generated if the RDTSC/RDTSCP instructions
are being intercepted. If this should occur and Secure TSC is enabled,
terminate guest execution as the guest cannot rely on the TSC value provided 
by the hypervisor.

Regards
Nikunj

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v13 04/13] x86/sev: Change TSC MSR behavior for Secure TSC enabled guests
  2024-10-24  7:31       ` Xiaoyao Li
@ 2024-10-24 10:16         ` Nikunj A. Dadhania
  0 siblings, 0 replies; 35+ messages in thread
From: Nikunj A. Dadhania @ 2024-10-24 10:16 UTC (permalink / raw)
  To: Xiaoyao Li, linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini



On 10/24/2024 1:01 PM, Xiaoyao Li wrote:
> On 10/24/2024 2:24 PM, Nikunj A. Dadhania wrote:
>>
>>
>> On 10/23/2024 8:55 AM, Xiaoyao Li wrote:
>>> On 10/21/2024 1:51 PM, Nikunj A Dadhania wrote:
>>>> diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
>>>> index 965209067f03..2ad7773458c0 100644
>>>> --- a/arch/x86/coco/sev/core.c
>>>> +++ b/arch/x86/coco/sev/core.c
>>>> @@ -1308,6 +1308,30 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
>>>>            return ES_OK;
>>>>        }
>>>>    +    /*
>>>> +     * TSC related accesses should not exit to the hypervisor when a
>>>> +     * guest is executing with SecureTSC enabled, so special handling
>>>> +     * is required for accesses of MSR_IA32_TSC:
>>>> +     *
>>>> +     * Writes: Writing to MSR_IA32_TSC can cause subsequent reads
>>>> +     *         of the TSC to return undefined values, so ignore all
>>>> +     *         writes.
>>>> +     * Reads:  Reads of MSR_IA32_TSC should return the current TSC
>>>> +     *         value, use the value returned by RDTSC.
>>>> +     */
>>>
>>>
>>> Why doesn't handle it by returning ES_VMM_ERROR when hypervisor
>>> intercepts RD/WR of MSR_IA32_TSC? With SECURE_TSC enabled, it seems
>>> not need to be intercepted.
>> 
>> ES_VMM_ERROR will terminate the guest, which is not the expected
>> behaviour. As documented, writes to the MSR is ignored and reads are
>> done using RDTSC.
>> 
>>> I think the reason is that SNP guest relies on interception to do
>>> the ignore behavior for WRMSR in #VC handler because the writing
>>> leads to undefined result.
>> 
>> For legacy and secure guests MSR_IA32_TSC is always intercepted(for
>> both RD/WR).
>
> We cannot make such assumption unless it's enforced by AMD HW.
>
>> Moreover, this is a legacy MSR, RDTSC and RDTSCP is the what modern
>> OSes should use.
>
> Again, this is your assumption and expectation only.

Not my assumption, I could not find any reference to MSR_IA32_TSC 
read/write in the kernel code. 

>
>> The idea is the catch any writes to TSC MSR and handle them
>> gracefully.
>
> If SNP guest requires MSR_IA32_TSC being intercepted by hypervisor. It
> should come with a solution that guest kernel can check it certainly,
> just like the patch 5 and patch 6, that they can check the behavior of
> hypervisor.
>
> If there is no clean way for guest to ensure MSR_IA32_TSC is
> intercepted by hypervisor, 

Yes, that is my understanding as well.

> we at least need add some comment to call
> out that these code replies on the assumption that hypervisor
> intercepts MSR_IA32_TSC.

Sure, I will add a comment.

>>> Then the question is what if the hypervisor doesn't intercept write
>>> to MSR_IA32_TSC in the first place?
>> 
>> I have tried to disable interception of MSR_IA32_TSC, and writes are
>> ignored by the HW as well. I would like to continue the current
>> documented HW as per the APM.
>
> I only means the writes are ignored in your testing HW. We don't know
> the result on other SNP-capable HW or future HW, unless it's
> documented in APM.
>
> Current documented behavior is that write leads to a undefined value
> of subsequent read. So we need to avoid write. One solution is to
> intercept write and ignore it, but it depends on hypervisor to
> intercept it. 
> Anther solution would be we fix all the place of writing
> MSR_IA32_TSC for SNP guest in linux.

There is no MSR_IA32_TSC write in the kernel code, IMHO, so second
solution is taken care.

Regards,
Nikunj

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v13 05/13] x86/sev: Prevent RDTSC/RDTSCP interception for Secure TSC enabled guests
  2024-10-24  8:44     ` Nikunj A. Dadhania
@ 2024-10-24 15:34       ` Xiaoyao Li
  0 siblings, 0 replies; 35+ messages in thread
From: Xiaoyao Li @ 2024-10-24 15:34 UTC (permalink / raw)
  To: Nikunj A. Dadhania, linux-kernel, thomas.lendacky, bp, x86, kvm
  Cc: mingo, tglx, dave.hansen, pgonda, seanjc, pbonzini

On 10/24/2024 4:44 PM, Nikunj A. Dadhania wrote:
> 
> 
> On 10/24/2024 1:26 PM, Xiaoyao Li wrote:
>> On 10/21/2024 1:51 PM, Nikunj A Dadhania wrote:
>>> The hypervisor should not be intercepting RDTSC/RDTSCP when Secure TSC is
>>> enabled. A #VC exception will be generated if the RDTSC/RDTSCP instructions
>>> are being intercepted. If this should occur and Secure TSC is enabled,
>>> terminate guest execution.
>>
>> There is another option to ignore the interception and just return back to
>> guest execution.
> 
> That is not correct, RDTSC/RDTSCP should return the timestamp counter value
> computed using the GUEST_TSC_SCALE and GUEST_TSC_OFFSET part of VMSA.

Ah, I missed this. Yes, if ignore the interception, guest needs to do 
TSC scale itself with GUEST_TSC_SCALE and GUEST_TSC_OFFSET to get the 
correct TSC. It's complicating things while making not intercepting 
RDTSC/RDTSP a hard requirement is much simple.

I think it's worth adding it as the justification.

>> I think it better to add some justification on why make it> fatal and terminate the guest is better than ignoring the interception.
> 
> How about the below updated commit message:
> 
> The hypervisor should not be intercepting RDTSC/RDTSCP when Secure TSC is
> enabled. A #VC exception will be generated if the RDTSC/RDTSCP instructions
> are being intercepted. If this should occur and Secure TSC is enabled,
> terminate guest execution as the guest cannot rely on the TSC value provided
> by the hypervisor.
> 
> Regards
> Nikunj
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2024-10-24 15:34 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-21  5:51 [PATCH v13 00/13] Add Secure TSC support for SNP guests Nikunj A Dadhania
2024-10-21  5:51 ` [PATCH v13 01/13] x86/sev: Carve out and export SNP guest messaging init routines Nikunj A Dadhania
2024-10-21  7:48   ` Christophe JAILLET
2024-10-22  4:12     ` Nikunj A. Dadhania
2024-10-21  5:51 ` [PATCH v13 02/13] x86/sev: Relocate SNP guest messaging routines to common code Nikunj A Dadhania
2024-10-21  5:51 ` [PATCH v13 03/13] x86/sev: Add Secure TSC support for SNP guests Nikunj A Dadhania
2024-10-21  8:12   ` Christophe JAILLET
2024-10-22  4:15     ` Nikunj A. Dadhania
2024-10-21  5:51 ` [PATCH v13 04/13] x86/sev: Change TSC MSR behavior for Secure TSC enabled guests Nikunj A Dadhania
2024-10-23  3:25   ` Xiaoyao Li
2024-10-24  6:24     ` Nikunj A. Dadhania
2024-10-24  7:31       ` Xiaoyao Li
2024-10-24 10:16         ` Nikunj A. Dadhania
2024-10-21  5:51 ` [PATCH v13 05/13] x86/sev: Prevent RDTSC/RDTSCP interception " Nikunj A Dadhania
2024-10-24  7:56   ` Xiaoyao Li
2024-10-24  8:44     ` Nikunj A. Dadhania
2024-10-24 15:34       ` Xiaoyao Li
2024-10-21  5:51 ` [PATCH v13 06/13] x86/sev: Prevent GUEST_TSC_FREQ MSR " Nikunj A Dadhania
2024-10-21 14:08   ` Tom Lendacky
2024-10-22  4:20     ` Nikunj A. Dadhania
2024-10-21  5:51 ` [PATCH v13 07/13] x86/sev: Mark Secure TSC as reliable clocksource Nikunj A Dadhania
2024-10-21  5:51 ` [PATCH v13 08/13] x86/cpu/amd: Do not print FW_BUG for Secure TSC Nikunj A Dadhania
2024-10-21  5:51 ` [PATCH v13 09/13] tsc: Use the GUEST_TSC_FREQ MSR for discovering TSC frequency Nikunj A Dadhania
2024-10-21 14:33   ` Tom Lendacky
2024-10-22  4:24     ` Nikunj A. Dadhania
2024-10-21  5:51 ` [PATCH v13 10/13] tsc: Upgrade TSC clocksource rating Nikunj A Dadhania
2024-10-21  5:51 ` [PATCH v13 11/13] tsc: Switch to native sched clock Nikunj A Dadhania
2024-10-21 14:37   ` Tom Lendacky
2024-10-22  4:26     ` Nikunj A. Dadhania
2024-10-21 21:30   ` kernel test robot
2024-10-21 23:03   ` kernel test robot
2024-10-21  5:51 ` [PATCH v13 12/13] x86/kvmclock: Abort SecureTSC enabled guest when kvmclock is selected Nikunj A Dadhania
2024-10-21 15:00   ` Tom Lendacky
2024-10-22  4:28     ` Nikunj A. Dadhania
2024-10-21  5:51 ` [PATCH v13 13/13] x86/sev: Allow Secure TSC feature for SNP guests Nikunj A Dadhania

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox