* [RFC 00/14] AMD: Add Secure AVIC Guest Support
@ 2024-09-13 11:36 Neeraj Upadhyay
2024-09-13 11:36 ` [RFC 01/14] x86/apic: Add new driver for Secure AVIC Neeraj Upadhyay
` (14 more replies)
0 siblings, 15 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-09-13 11:36 UTC (permalink / raw)
To: linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
Introduction
------------
Secure AVIC is a new hardware feature in the AMD64 architecture to
allow SEV-SNP guests to prevent hypervisor from generating unexpected
interrupts to a vCPU or otherwise violate architectural assumptions
around APIC behavior.
One of the significant differences from AVIC or emulated x2APIC is that
Secure AVIC uses a guest-owned and managed APIC backing page. It also
introduces additional fields in both the VMCB and the Secure AVIC backing
page to aid the guest in limiting which interrupt vectors can be injected
into the guest.
Guest APIC Backing Page
-----------------------
Each vCPU has a guest-allocated APIC backing page of size 4K, which
maintains APIC state for that vCPU. The x2APIC MSRs are mapped at
their corresposing x2APIC MMIO offset within the guest APIC backing
page. All x2APIC accesses by guest or Secure AVIC hardware operate
on this backing page. The backing page should be pinned and NPT entry
for it should be always mapped while the corresponding vCPU is running.
MSR Accesses
------------
Secure AVIC only supports x2APIC MSR accesses. xAPIC MMIO offset based
accesses are not supported.
Some of the MSR accesses such as ICR writes (with shorthand equal to
self), SELF_IPI, EOI, TPR writes are accelerated by Secure AVIC
hardware. Other MSR accesses generate a #VC exception. The #VC
exception handler reads/writes to the guest APIC backing page.
As guest APIC backing page is accessible to the guest, the Secure
AVIC driver code optimizes APIC register access by directly
reading/writing to the guest APIC backing page (instead of taking
the #VC exception route).
In addition to the architected MSRs, following new fields are added to
the guest APIC backing page which can be modified directly by the
guest:
a. ALLOWED_IRR
ALLOWED_IRR vector indicates the interrupt vectors which the guest
allows the hypervisor to send. The combination of host-controlled
REQUESTED_IRR vectors (part of VMCB) and ALLOWED_IRR is used by
hardware to update the IRR vectors of the Guest APIC backing page.
#Offset #bits Description
204h 31:0 Guest allowed vectors 0-31
214h 31:0 Guest allowed vectors 32-63
...
274h 31:0 Guest allowed vectors 224-255
ALLOWED_IRR is meant to be used specifically for vectors that the
hypervisor is allowed to inject, such as device interrupts. Interrupt
vectors used exclusively by the guest itself (like IPI vectors) should
not be allowed to be injected into the guest for security reasons.
b. NMI Request
#Offset #bits Description
278h 0 Set by Guest to request Virtual NMI
LAPIC Timer Support
-------------------
LAPIC timer is emulated by hypervisor. So, APIC_LVTT, APIC_TMICT and
APIC_TDCR, APIC_TMCCT APIC registers are not read/written to the guest
APIC backing page and are communicated to the hypervisor using SVM_EXIT_MSR
VMGEXIT.
IPI Support
-----------
Only SELF_IPI is accelerated by Secure AVIC hardware. Other IPIs require
writing (from the Secure AVIC driver) to the IRR vector of the target CPU
backing page and then issuing VMGEXIT for the hypervisor to notify the
target vCPU.
Driver Implementation Open Points
---------------------------------
The Secure AVIC driver only supports physical destination mode. If
logical destination mode need to be supported, then a separate x2apic
driver would be required for supporting logical destination mode.
Setting of ALLOWED_IRR vectors is done from vector.c for IOAPIC and MSI
interrupts. ALLOWED_IRR vector is not cleared when an interrupt vector
migrates to different CPU. Using a cleaner approach to manage and
configure allowed vectors needs more work.
Testing
-------
This series is based on top of commit 196145c606d0 "Merge
tag 'clk-fixes-for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux."
Host Secure AVIC support patch series is at [1].
Following tests are done:
1) Boot to Prompt using initramfs and ubuntu fs.
2) Verified timer and IPI as part of the guest bootup.
3) Verified long run SCF TORTURE IPI test.
4) Verified FIO test with NVME passthrough.
[1] https://github.com/AMDESE/linux-kvm/tree/savic-host
Kishon Vijay Abraham I (11):
x86/apic: Add new driver for Secure AVIC
x86/apic: Initialize Secure AVIC APIC backing page
x86/apic: Initialize APIC backing page for Secure AVIC
x86/apic: Add update_vector callback for Secure AVIC
x86/apic: Add support to send IPI for Secure AVIC
x86/apic: Support LAPIC timer for Secure AVIC
x86/sev: Initialize VGIF for secondary VCPUs for Secure AVIC
x86/apic: Add support to send NMI IPI for Secure AVIC
x86/apic: Allow NMI to be injected from hypervisor for Secure AVIC
x86/sev: Enable NMI support for Secure AVIC
x86/sev: Indicate SEV-SNP guest supports Secure AVIC
Neeraj Upadhyay (3):
x86/apic: Populate .read()/.write() callbacks of Secure AVIC driver
x86/apic: Initialize APIC ID for Secure AVIC
x86/apic: Enable Secure AVIC in Control MSR
arch/x86/Kconfig | 12 +
arch/x86/boot/compressed/sev.c | 3 +-
arch/x86/coco/core.c | 3 +
arch/x86/coco/sev/core.c | 91 +++++-
arch/x86/include/asm/apic.h | 3 +
arch/x86/include/asm/apicdef.h | 2 +
arch/x86/include/asm/msr-index.h | 9 +-
arch/x86/include/asm/sev.h | 6 +
arch/x86/include/uapi/asm/svm.h | 1 +
arch/x86/kernel/apic/Makefile | 1 +
arch/x86/kernel/apic/apic.c | 4 +
arch/x86/kernel/apic/vector.c | 8 +
arch/x86/kernel/apic/x2apic_savic.c | 480 ++++++++++++++++++++++++++++
include/linux/cc_platform.h | 8 +
14 files changed, 621 insertions(+), 10 deletions(-)
create mode 100644 arch/x86/kernel/apic/x2apic_savic.c
--
2.34.1
^ permalink raw reply [flat|nested] 81+ messages in thread
* [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-09-13 11:36 [RFC 00/14] AMD: Add Secure AVIC Guest Support Neeraj Upadhyay
@ 2024-09-13 11:36 ` Neeraj Upadhyay
2024-10-08 19:15 ` Borislav Petkov
` (2 more replies)
2024-09-13 11:36 ` [RFC 02/14] x86/apic: Initialize Secure AVIC APIC backing page Neeraj Upadhyay
` (13 subsequent siblings)
14 siblings, 3 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-09-13 11:36 UTC (permalink / raw)
To: linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
From: Kishon Vijay Abraham I <kvijayab@amd.com>
The Secure AVIC feature provides SEV-SNP guests hardware acceleration
for performance sensitive APIC accesses while securely managing the
guest-owned APIC state through the use of a private APIC backing page.
This helps prevent malicious hypervisor from generating unexpected
interrupts for a vCPU or otherwise violate architectural assumptions
around APIC behavior.
Add a new x2APIC driver that will serve as the base of the Secure AVIC
support. It is initially the same as the x2APIC phys driver, but will be
modified as features of Secure AVIC are implemented.
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Co-developed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/Kconfig | 12 +++
arch/x86/boot/compressed/sev.c | 1 +
arch/x86/coco/core.c | 3 +
arch/x86/include/asm/msr-index.h | 4 +-
arch/x86/kernel/apic/Makefile | 1 +
arch/x86/kernel/apic/x2apic_savic.c | 112 ++++++++++++++++++++++++++++
include/linux/cc_platform.h | 8 ++
7 files changed, 140 insertions(+), 1 deletion(-)
create mode 100644 arch/x86/kernel/apic/x2apic_savic.c
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 007bab9f2a0e..b05b4e9d2e49 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -469,6 +469,18 @@ config X86_X2APIC
If you don't know what to do here, say N.
+config AMD_SECURE_AVIC
+ bool "AMD Secure AVIC"
+ depends on X86_X2APIC && AMD_MEM_ENCRYPT
+ help
+ This enables AMD Secure AVIC support on guests that have this feature.
+
+ AMD Secure AVIC provides hardware acceleration for performance sensitive
+ APIC accesses and support for managing guest owned APIC state for SEV-SNP
+ guests.
+
+ If you don't know what to do here, say N.
+
config X86_POSTED_MSI
bool "Enable MSI and MSI-x delivery by posted interrupts"
depends on X86_64 && IRQ_REMAP
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index cd44e120fe53..ec038be0a048 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -394,6 +394,7 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
MSR_AMD64_SNP_VMSA_REG_PROT | \
MSR_AMD64_SNP_RESERVED_BIT13 | \
MSR_AMD64_SNP_RESERVED_BIT15 | \
+ MSR_AMD64_SNP_SECURE_AVIC_ENABLED | \
MSR_AMD64_SNP_RESERVED_MASK)
/*
diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
index 0f81f70aca82..4c3bc031e9a9 100644
--- a/arch/x86/coco/core.c
+++ b/arch/x86/coco/core.c
@@ -100,6 +100,9 @@ static bool noinstr amd_cc_platform_has(enum cc_attr attr)
case CC_ATTR_HOST_SEV_SNP:
return cc_flags.host_sev_snp;
+ case CC_ATTR_SNP_SECURE_AVIC:
+ return sev_status & MSR_AMD64_SNP_SECURE_AVIC_ENABLED;
+
default:
return false;
}
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 82c6a4d350e0..d0583619c978 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -658,7 +658,9 @@
#define MSR_AMD64_SNP_VMSA_REG_PROT BIT_ULL(MSR_AMD64_SNP_VMSA_REG_PROT_BIT)
#define MSR_AMD64_SNP_SMT_PROT_BIT 17
#define MSR_AMD64_SNP_SMT_PROT BIT_ULL(MSR_AMD64_SNP_SMT_PROT_BIT)
-#define MSR_AMD64_SNP_RESV_BIT 18
+#define MSR_AMD64_SNP_SECURE_AVIC_BIT 18
+#define MSR_AMD64_SNP_SECURE_AVIC_ENABLED BIT_ULL(MSR_AMD64_SNP_SECURE_AVIC_BIT)
+#define MSR_AMD64_SNP_RESV_BIT 19
#define MSR_AMD64_SNP_RESERVED_MASK GENMASK_ULL(63, MSR_AMD64_SNP_RESV_BIT)
#define MSR_AMD64_VIRT_SPEC_CTRL 0xc001011f
diff --git a/arch/x86/kernel/apic/Makefile b/arch/x86/kernel/apic/Makefile
index 3bf0487cf3b7..12153993c12b 100644
--- a/arch/x86/kernel/apic/Makefile
+++ b/arch/x86/kernel/apic/Makefile
@@ -18,6 +18,7 @@ ifeq ($(CONFIG_X86_64),y)
# APIC probe will depend on the listing order here
obj-$(CONFIG_X86_NUMACHIP) += apic_numachip.o
obj-$(CONFIG_X86_UV) += x2apic_uv_x.o
+obj-$(CONFIG_AMD_SECURE_AVIC) += x2apic_savic.o
obj-$(CONFIG_X86_X2APIC) += x2apic_phys.o
obj-$(CONFIG_X86_X2APIC) += x2apic_cluster.o
obj-y += apic_flat_64.o
diff --git a/arch/x86/kernel/apic/x2apic_savic.c b/arch/x86/kernel/apic/x2apic_savic.c
new file mode 100644
index 000000000000..97dac09a7f42
--- /dev/null
+++ b/arch/x86/kernel/apic/x2apic_savic.c
@@ -0,0 +1,112 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * AMD Secure AVIC Support (SEV-SNP Guests)
+ *
+ * Copyright (C) 2024 Advanced Micro Devices, Inc.
+ *
+ * Author: Kishon Vijay Abraham I <kvijayab@amd.com>
+ */
+
+#include <linux/cpumask.h>
+#include <linux/cc_platform.h>
+
+#include <asm/apic.h>
+#include <asm/sev.h>
+
+#include "local.h"
+
+static int x2apic_savic_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
+{
+ return x2apic_enabled() && cc_platform_has(CC_ATTR_SNP_SECURE_AVIC);
+}
+
+static void x2apic_savic_send_IPI(int cpu, int vector)
+{
+ u32 dest = per_cpu(x86_cpu_to_apicid, cpu);
+
+ /* x2apic MSRs are special and need a special fence: */
+ weak_wrmsr_fence();
+ __x2apic_send_IPI_dest(dest, vector, APIC_DEST_PHYSICAL);
+}
+
+static void
+__send_IPI_mask(const struct cpumask *mask, int vector, int apic_dest)
+{
+ unsigned long query_cpu;
+ unsigned long this_cpu;
+ unsigned long flags;
+
+ /* x2apic MSRs are special and need a special fence: */
+ weak_wrmsr_fence();
+
+ local_irq_save(flags);
+
+ this_cpu = smp_processor_id();
+ for_each_cpu(query_cpu, mask) {
+ if (apic_dest == APIC_DEST_ALLBUT && this_cpu == query_cpu)
+ continue;
+ __x2apic_send_IPI_dest(per_cpu(x86_cpu_to_apicid, query_cpu),
+ vector, APIC_DEST_PHYSICAL);
+ }
+ local_irq_restore(flags);
+}
+
+static void x2apic_savic_send_IPI_mask(const struct cpumask *mask, int vector)
+{
+ __send_IPI_mask(mask, vector, APIC_DEST_ALLINC);
+}
+
+static void x2apic_savic_send_IPI_mask_allbutself(const struct cpumask *mask, int vector)
+{
+ __send_IPI_mask(mask, vector, APIC_DEST_ALLBUT);
+}
+
+static int x2apic_savic_probe(void)
+{
+ if (!cc_platform_has(CC_ATTR_SNP_SECURE_AVIC))
+ return 0;
+
+ if (!x2apic_mode) {
+ pr_err("Secure AVIC enabled in non x2APIC mode\n");
+ snp_abort();
+ }
+
+ pr_info("Secure AVIC Enabled\n");
+
+ return 1;
+}
+
+static struct apic apic_x2apic_savic __ro_after_init = {
+
+ .name = "secure avic x2apic",
+ .probe = x2apic_savic_probe,
+ .acpi_madt_oem_check = x2apic_savic_acpi_madt_oem_check,
+
+ .dest_mode_logical = false,
+
+ .disable_esr = 0,
+
+ .cpu_present_to_apicid = default_cpu_present_to_apicid,
+
+ .max_apic_id = UINT_MAX,
+ .x2apic_set_max_apicid = true,
+ .get_apic_id = x2apic_get_apic_id,
+
+ .calc_dest_apicid = apic_default_calc_apicid,
+
+ .send_IPI = x2apic_savic_send_IPI,
+ .send_IPI_mask = x2apic_savic_send_IPI_mask,
+ .send_IPI_mask_allbutself = x2apic_savic_send_IPI_mask_allbutself,
+ .send_IPI_allbutself = x2apic_send_IPI_allbutself,
+ .send_IPI_all = x2apic_send_IPI_all,
+ .send_IPI_self = x2apic_send_IPI_self,
+ .nmi_to_offline_cpu = true,
+
+ .read = native_apic_msr_read,
+ .write = native_apic_msr_write,
+ .eoi = native_apic_msr_eoi,
+ .icr_read = native_x2apic_icr_read,
+ .icr_write = native_x2apic_icr_write,
+};
+
+apic_driver(apic_x2apic_savic);
diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
index caa4b4430634..801208678450 100644
--- a/include/linux/cc_platform.h
+++ b/include/linux/cc_platform.h
@@ -88,6 +88,14 @@ enum cc_attr {
* enabled to run SEV-SNP guests.
*/
CC_ATTR_HOST_SEV_SNP,
+
+ /**
+ * @CC_ATTR_SNP_SECURE_AVIC: Secure AVIC mode is active.
+ *
+ * The host kernel is running with the necessary features enabled
+ * to run SEV-SNP guests with full Secure AVIC capabilities.
+ */
+ CC_ATTR_SNP_SECURE_AVIC,
};
#ifdef CONFIG_ARCH_HAS_CC_PLATFORM
--
2.34.1
^ permalink raw reply related [flat|nested] 81+ messages in thread
* [RFC 02/14] x86/apic: Initialize Secure AVIC APIC backing page
2024-09-13 11:36 [RFC 00/14] AMD: Add Secure AVIC Guest Support Neeraj Upadhyay
2024-09-13 11:36 ` [RFC 01/14] x86/apic: Add new driver for Secure AVIC Neeraj Upadhyay
@ 2024-09-13 11:36 ` Neeraj Upadhyay
2024-10-09 15:27 ` Dave Hansen
2024-10-23 16:36 ` Borislav Petkov
2024-09-13 11:36 ` [RFC 03/14] x86/apic: Populate .read()/.write() callbacks of Secure AVIC driver Neeraj Upadhyay
` (12 subsequent siblings)
14 siblings, 2 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-09-13 11:36 UTC (permalink / raw)
To: linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
From: Kishon Vijay Abraham I <kvijayab@amd.com>
With Secure AVIC, the APIC backing page is owned and managed by guest.
Allocate APIC backing page for all guest CPUs. In addition, add a
setup() APIC callback. This callback is used by Secure AVIC driver to
initialize APIC backing page area for each CPU.
Allocate APIC backing page memory area in chunks of 2M, so that
backing page memory is mapped using full huge pages. Without this,
if there are private to shared page state conversions for any
non-backing-page allocation which is part of the same huge page as the
one containing a backing page, hypervisor splits the huge page into 4K
pages. Splitting of APIC backing page area into individual 4K pages can
result in performance impact, due to TLB pressure.
Secure AVIC requires that vCPU's APIC backing page's NPT entry is always
present while that vCPU is running. If APIC backing page's NPT entry is
not present, a VMEXIT_BUSY is returned on VMRUN and the vCPU cannot
be resumed after that point. To handle this, invoke sev_notify_savic_gpa()
in Secure AVIC driver's setup() callback. This triggers SVM_VMGEXIT_SECURE_
AVIC_GPA exit for the hypervisor to note GPA of the vCPU's APIC
backing page. Hypervisor uses this information to ensure that the APIC
backing page is mapped in NPT before invoking VMRUN.
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Co-developed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
GHCB spec update for SVM_VMGEXIT_SECURE_AVIC_GPA NAE event is
part of the draft spec:
https://lore.kernel.org/linux-coco/3453675d-ca29-4715-9c17-10b56b3af17e@amd.com/T/#u
arch/x86/coco/sev/core.c | 22 +++++++++++++++++
arch/x86/include/asm/apic.h | 1 +
arch/x86/include/asm/sev.h | 2 ++
arch/x86/include/uapi/asm/svm.h | 1 +
arch/x86/kernel/apic/apic.c | 2 ++
arch/x86/kernel/apic/x2apic_savic.c | 38 +++++++++++++++++++++++++++++
6 files changed, 66 insertions(+)
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index de1df0cb45da..93470538af5e 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -1367,6 +1367,28 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
return ret;
}
+enum es_result sev_notify_savic_gpa(u64 gpa)
+{
+ struct ghcb_state state;
+ struct es_em_ctxt ctxt;
+ unsigned long flags;
+ struct ghcb *ghcb;
+ int ret = 0;
+
+ local_irq_save(flags);
+
+ ghcb = __sev_get_ghcb(&state);
+
+ vc_ghcb_invalidate(ghcb);
+
+ ret = sev_es_ghcb_hv_call(ghcb, &ctxt, SVM_VMGEXIT_SECURE_AVIC_GPA, gpa, 0);
+
+ __sev_put_ghcb(&state);
+
+ local_irq_restore(flags);
+ return ret;
+}
+
static void snp_register_per_cpu_ghcb(void)
{
struct sev_es_runtime_data *data;
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 9327eb00e96d..ca682c1e8748 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -302,6 +302,7 @@ struct apic {
/* Probe, setup and smpboot functions */
int (*probe)(void);
+ void (*setup)(void);
int (*acpi_madt_oem_check)(char *oem_id, char *oem_table_id);
void (*init_apic_ldr)(void);
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 79bbe2be900e..e84fc7fcc32a 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -399,6 +399,7 @@ u64 snp_get_unsupported_features(u64 status);
u64 sev_get_status(void);
void sev_show_status(void);
void snp_update_svsm_ca(void);
+enum es_result sev_notify_savic_gpa(u64 gpa);
#else /* !CONFIG_AMD_MEM_ENCRYPT */
@@ -435,6 +436,7 @@ static inline u64 snp_get_unsupported_features(u64 status) { return 0; }
static inline u64 sev_get_status(void) { return 0; }
static inline void sev_show_status(void) { }
static inline void snp_update_svsm_ca(void) { }
+static inline enum es_result sev_notify_savic_gpa(u64 gpa) { return ES_UNSUPPORTED; }
#endif /* CONFIG_AMD_MEM_ENCRYPT */
diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index 1814b413fd57..0f21cea6d21c 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -116,6 +116,7 @@
#define SVM_VMGEXIT_AP_CREATE 1
#define SVM_VMGEXIT_AP_DESTROY 2
#define SVM_VMGEXIT_SNP_RUN_VMPL 0x80000018
+#define SVM_VMGEXIT_SECURE_AVIC_GPA 0x8000001a
#define SVM_VMGEXIT_HV_FEATURES 0x8000fffd
#define SVM_VMGEXIT_TERM_REQUEST 0x8000fffe
#define SVM_VMGEXIT_TERM_REASON(reason_set, reason_code) \
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 373638691cd4..b47d1dc854c3 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1499,6 +1499,8 @@ static void setup_local_APIC(void)
return;
}
+ if (apic->setup)
+ apic->setup();
/*
* If this comes from kexec/kcrash the APIC might be enabled in
* SPIV. Soft disable it before doing further initialization.
diff --git a/arch/x86/kernel/apic/x2apic_savic.c b/arch/x86/kernel/apic/x2apic_savic.c
index 97dac09a7f42..d903c35b8b64 100644
--- a/arch/x86/kernel/apic/x2apic_savic.c
+++ b/arch/x86/kernel/apic/x2apic_savic.c
@@ -9,12 +9,16 @@
#include <linux/cpumask.h>
#include <linux/cc_platform.h>
+#include <linux/percpu-defs.h>
#include <asm/apic.h>
#include <asm/sev.h>
#include "local.h"
+static DEFINE_PER_CPU(void *, apic_backing_page);
+static DEFINE_PER_CPU(bool, savic_setup_done);
+
static int x2apic_savic_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
{
return x2apic_enabled() && cc_platform_has(CC_ATTR_SNP_SECURE_AVIC);
@@ -61,8 +65,30 @@ static void x2apic_savic_send_IPI_mask_allbutself(const struct cpumask *mask, in
__send_IPI_mask(mask, vector, APIC_DEST_ALLBUT);
}
+static void x2apic_savic_setup(void)
+{
+ void *backing_page;
+ enum es_result ret;
+ unsigned long gpa;
+
+ if (this_cpu_read(savic_setup_done))
+ return;
+
+ backing_page = this_cpu_read(apic_backing_page);
+ gpa = __pa(backing_page);
+ ret = sev_notify_savic_gpa(gpa);
+ if (ret != ES_OK)
+ snp_abort();
+ this_cpu_write(savic_setup_done, true);
+}
+
static int x2apic_savic_probe(void)
{
+ void *backing_pages;
+ unsigned int cpu;
+ size_t sz;
+ int i;
+
if (!cc_platform_has(CC_ATTR_SNP_SECURE_AVIC))
return 0;
@@ -71,6 +97,17 @@ static int x2apic_savic_probe(void)
snp_abort();
}
+ sz = ALIGN(num_possible_cpus() * SZ_4K, SZ_2M);
+ backing_pages = kzalloc(sz, GFP_ATOMIC);
+ if (!backing_pages)
+ snp_abort();
+
+ i = 0;
+ for_each_possible_cpu(cpu) {
+ per_cpu(apic_backing_page, cpu) = backing_pages + i * SZ_4K;
+ i++;
+ }
+
pr_info("Secure AVIC Enabled\n");
return 1;
@@ -81,6 +118,7 @@ static struct apic apic_x2apic_savic __ro_after_init = {
.name = "secure avic x2apic",
.probe = x2apic_savic_probe,
.acpi_madt_oem_check = x2apic_savic_acpi_madt_oem_check,
+ .setup = x2apic_savic_setup,
.dest_mode_logical = false,
--
2.34.1
^ permalink raw reply related [flat|nested] 81+ messages in thread
* [RFC 03/14] x86/apic: Populate .read()/.write() callbacks of Secure AVIC driver
2024-09-13 11:36 [RFC 00/14] AMD: Add Secure AVIC Guest Support Neeraj Upadhyay
2024-09-13 11:36 ` [RFC 01/14] x86/apic: Add new driver for Secure AVIC Neeraj Upadhyay
2024-09-13 11:36 ` [RFC 02/14] x86/apic: Initialize Secure AVIC APIC backing page Neeraj Upadhyay
@ 2024-09-13 11:36 ` Neeraj Upadhyay
2024-11-06 18:16 ` Borislav Petkov
2024-11-06 19:20 ` Melody (Huibo) Wang
2024-09-13 11:36 ` [RFC 04/14] x86/apic: Initialize APIC backing page for Secure AVIC Neeraj Upadhyay
` (11 subsequent siblings)
14 siblings, 2 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-09-13 11:36 UTC (permalink / raw)
To: linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
The x2APIC registers are mapped at an offset within the guest APIC
backing page which is same as their x2APIC MMIO offset. Secure AVIC
adds new registers such as ALLOWED_IRRs (which are at 4-byte offset
within the IRR register offset range) and NMI_REQ to the APIC register
space. In addition, the APIC_ID register is writable and configured by
guest.
Add read() and write() APIC callback functions to read and write x2APIC
registers directly from the guest APIC backing page.
The default .read()/.write() callbacks of x2APIC drivers perform
a rdmsr/wrmsr of the x2APIC registers. When Secure AVIC is enabled,
these would result in #VC exception (for non-accelerated register
accesses). The #VC exception handler reads/write the x2APIC register
in the guest APIC backing page. Since this would increase the latency
of accessing x2APIC registers, the read() and write() callbacks of
Secure AVIC driver directly reads/writes to the guest APIC backing page.
Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/include/asm/apicdef.h | 2 +
arch/x86/kernel/apic/x2apic_savic.c | 107 +++++++++++++++++++++++++++-
2 files changed, 107 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/apicdef.h b/arch/x86/include/asm/apicdef.h
index 094106b6a538..be39a543fbe5 100644
--- a/arch/x86/include/asm/apicdef.h
+++ b/arch/x86/include/asm/apicdef.h
@@ -135,6 +135,8 @@
#define APIC_TDR_DIV_128 0xA
#define APIC_EFEAT 0x400
#define APIC_ECTRL 0x410
+#define APIC_SEOI 0x420
+#define APIC_IER 0x480
#define APIC_EILVTn(n) (0x500 + 0x10 * n)
#define APIC_EILVT_NR_AMD_K8 1 /* # of extended interrupts */
#define APIC_EILVT_NR_AMD_10H 4
diff --git a/arch/x86/kernel/apic/x2apic_savic.c b/arch/x86/kernel/apic/x2apic_savic.c
index d903c35b8b64..6a471bbc3dba 100644
--- a/arch/x86/kernel/apic/x2apic_savic.c
+++ b/arch/x86/kernel/apic/x2apic_savic.c
@@ -10,6 +10,7 @@
#include <linux/cpumask.h>
#include <linux/cc_platform.h>
#include <linux/percpu-defs.h>
+#include <linux/align.h>
#include <asm/apic.h>
#include <asm/sev.h>
@@ -24,6 +25,108 @@ static int x2apic_savic_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
return x2apic_enabled() && cc_platform_has(CC_ATTR_SNP_SECURE_AVIC);
}
+static inline u32 get_reg(char *page, int reg_off)
+{
+ return READ_ONCE(*((u32 *)(page + reg_off)));
+}
+
+static inline void set_reg(char *page, int reg_off, u32 val)
+{
+ WRITE_ONCE(*((u32 *)(page + reg_off)), val);
+}
+
+#define SAVIC_ALLOWED_IRR_OFFSET 0x204
+
+static u32 x2apic_savic_read(u32 reg)
+{
+ void *backing_page = this_cpu_read(apic_backing_page);
+
+ switch (reg) {
+ case APIC_LVTT:
+ case APIC_TMICT:
+ case APIC_TMCCT:
+ case APIC_TDCR:
+ case APIC_ID:
+ case APIC_LVR:
+ case APIC_TASKPRI:
+ case APIC_ARBPRI:
+ case APIC_PROCPRI:
+ case APIC_LDR:
+ case APIC_SPIV:
+ case APIC_ESR:
+ case APIC_ICR:
+ case APIC_LVTTHMR:
+ case APIC_LVTPC:
+ case APIC_LVT0:
+ case APIC_LVT1:
+ case APIC_LVTERR:
+ case APIC_EFEAT:
+ case APIC_ECTRL:
+ case APIC_SEOI:
+ case APIC_IER:
+ case APIC_EILVTn(0) ... APIC_EILVTn(3):
+ return get_reg(backing_page, reg);
+ case APIC_ISR ... APIC_ISR + 0x70:
+ case APIC_TMR ... APIC_TMR + 0x70:
+ WARN_ONCE(!IS_ALIGNED(reg, 16), "Reg offset %#x not aligned at 16 bytes", reg);
+ return get_reg(backing_page, reg);
+ /* IRR and ALLOWED_IRR offset range */
+ case APIC_IRR ... APIC_IRR + 0x74:
+ /*
+ * Either aligned at 16 bytes for valid IRR reg offset or a
+ * valid Secure AVIC ALLOWED_IRR offset.
+ */
+ WARN_ONCE(!(IS_ALIGNED(reg, 16) || IS_ALIGNED(reg - SAVIC_ALLOWED_IRR_OFFSET, 16)),
+ "Misaligned IRR/ALLOWED_IRR reg offset %#x", reg);
+ return get_reg(backing_page, reg);
+ default:
+ pr_err("Permission denied: read of Secure AVIC reg offset %#x\n", reg);
+ return 0;
+ }
+}
+
+#define SAVIC_NMI_REQ_OFFSET 0x278
+
+static void x2apic_savic_write(u32 reg, u32 data)
+{
+ void *backing_page = this_cpu_read(apic_backing_page);
+
+ switch (reg) {
+ case APIC_LVTT:
+ case APIC_LVT0:
+ case APIC_LVT1:
+ case APIC_TMICT:
+ case APIC_TDCR:
+ case APIC_SELF_IPI:
+ /* APIC_ID is writable and configured by guest for Secure AVIC */
+ case APIC_ID:
+ case APIC_TASKPRI:
+ case APIC_EOI:
+ case APIC_SPIV:
+ case SAVIC_NMI_REQ_OFFSET:
+ case APIC_ESR:
+ case APIC_ICR:
+ case APIC_LVTTHMR:
+ case APIC_LVTPC:
+ case APIC_LVTERR:
+ case APIC_ECTRL:
+ case APIC_SEOI:
+ case APIC_IER:
+ case APIC_EILVTn(0) ... APIC_EILVTn(3):
+ set_reg(backing_page, reg, data);
+ break;
+ /* ALLOWED_IRR offsets are writable */
+ case SAVIC_ALLOWED_IRR_OFFSET ... SAVIC_ALLOWED_IRR_OFFSET + 0x70:
+ if (IS_ALIGNED(reg - SAVIC_ALLOWED_IRR_OFFSET, 16)) {
+ set_reg(backing_page, reg, data);
+ break;
+ }
+ fallthrough;
+ default:
+ pr_err("Permission denied: write to Secure AVIC reg offset %#x\n", reg);
+ }
+}
+
static void x2apic_savic_send_IPI(int cpu, int vector)
{
u32 dest = per_cpu(x86_cpu_to_apicid, cpu);
@@ -140,8 +243,8 @@ static struct apic apic_x2apic_savic __ro_after_init = {
.send_IPI_self = x2apic_send_IPI_self,
.nmi_to_offline_cpu = true,
- .read = native_apic_msr_read,
- .write = native_apic_msr_write,
+ .read = x2apic_savic_read,
+ .write = x2apic_savic_write,
.eoi = native_apic_msr_eoi,
.icr_read = native_x2apic_icr_read,
.icr_write = native_x2apic_icr_write,
--
2.34.1
^ permalink raw reply related [flat|nested] 81+ messages in thread
* [RFC 04/14] x86/apic: Initialize APIC backing page for Secure AVIC
2024-09-13 11:36 [RFC 00/14] AMD: Add Secure AVIC Guest Support Neeraj Upadhyay
` (2 preceding siblings ...)
2024-09-13 11:36 ` [RFC 03/14] x86/apic: Populate .read()/.write() callbacks of Secure AVIC driver Neeraj Upadhyay
@ 2024-09-13 11:36 ` Neeraj Upadhyay
2024-11-07 15:28 ` Borislav Petkov
2024-11-11 22:43 ` [sos-linux-ext-patches] " Melody (Huibo) Wang
2024-09-13 11:36 ` [RFC 05/14] x86/apic: Initialize APIC ID " Neeraj Upadhyay
` (10 subsequent siblings)
14 siblings, 2 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-09-13 11:36 UTC (permalink / raw)
To: linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
From: Kishon Vijay Abraham I <kvijayab@amd.com>
Secure AVIC lets guest manage the APIC backing page (unlike emulated
x2APIC or x2AVIC where the hypervisor manages the APIC backing page).
However the introduced Secure AVIC Linux design still maintains the
APIC backing page in the hypervisor to shadow the APIC backing page
maintained by guest (It should be noted only subset of the registers
are shadowed for specific usecases and registers like APIC_IRR,
APIC_ISR are not shadowed).
Add sev_ghcb_msr_read() to invoke "SVM_EXIT_MSR" VMGEXIT to read
MSRs from hypervisor. Initialize the Secure AVIC's APIC backing
page by copying the initial state of shadow APIC backing page in
the hypervisor to the guest APIC backing page. Specifically copy
APIC_LVR, APIC_LDR, and APIC_LVT MSRs from the shadow APIC backing
page.
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Co-developed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/coco/sev/core.c | 41 ++++++++++++++++-----
arch/x86/include/asm/sev.h | 2 ++
arch/x86/kernel/apic/x2apic_savic.c | 55 +++++++++++++++++++++++++++++
3 files changed, 90 insertions(+), 8 deletions(-)
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index 93470538af5e..0e140f92cfef 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -1331,18 +1331,15 @@ int __init sev_es_efi_map_ghcbs(pgd_t *pgd)
return 0;
}
-static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
+static enum es_result __vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt, bool write)
{
struct pt_regs *regs = ctxt->regs;
+ u64 exit_info_1 = write ? 1 : 0;
enum es_result ret;
- u64 exit_info_1;
-
- /* Is it a WRMSR? */
- exit_info_1 = (ctxt->insn.opcode.bytes[1] == 0x30) ? 1 : 0;
if (regs->cx == MSR_SVSM_CAA) {
/* Writes to the SVSM CAA msr are ignored */
- if (exit_info_1)
+ if (write)
return ES_OK;
regs->ax = lower_32_bits(this_cpu_read(svsm_caa_pa));
@@ -1352,14 +1349,14 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
}
ghcb_set_rcx(ghcb, regs->cx);
- if (exit_info_1) {
+ if (write) {
ghcb_set_rax(ghcb, regs->ax);
ghcb_set_rdx(ghcb, regs->dx);
}
ret = sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_MSR, exit_info_1, 0);
- if ((ret == ES_OK) && (!exit_info_1)) {
+ if (ret == ES_OK && !write) {
regs->ax = ghcb->save.rax;
regs->dx = ghcb->save.rdx;
}
@@ -1367,6 +1364,34 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
return ret;
}
+static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
+{
+ return __vc_handle_msr(ghcb, ctxt, ctxt->insn.opcode.bytes[1] == 0x30);
+}
+
+enum es_result sev_ghcb_msr_read(u64 msr, u64 *value)
+{
+ struct pt_regs regs = { .cx = msr };
+ struct es_em_ctxt ctxt = { .regs = ®s };
+ struct ghcb_state state;
+ unsigned long flags;
+ enum es_result ret;
+ struct ghcb *ghcb;
+
+ local_irq_save(flags);
+ ghcb = __sev_get_ghcb(&state);
+ vc_ghcb_invalidate(ghcb);
+
+ ret = __vc_handle_msr(ghcb, &ctxt, false);
+ if (ret == ES_OK)
+ *value = regs.ax | regs.dx << 32;
+
+ __sev_put_ghcb(&state);
+ local_irq_restore(flags);
+
+ return ret;
+}
+
enum es_result sev_notify_savic_gpa(u64 gpa)
{
struct ghcb_state state;
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index e84fc7fcc32a..5e6385bfb85a 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -400,6 +400,7 @@ u64 sev_get_status(void);
void sev_show_status(void);
void snp_update_svsm_ca(void);
enum es_result sev_notify_savic_gpa(u64 gpa);
+enum es_result sev_ghcb_msr_read(u64 msr, u64 *value);
#else /* !CONFIG_AMD_MEM_ENCRYPT */
@@ -437,6 +438,7 @@ static inline u64 sev_get_status(void) { return 0; }
static inline void sev_show_status(void) { }
static inline void snp_update_svsm_ca(void) { }
static inline enum es_result sev_notify_savic_gpa(u64 gpa) { return ES_UNSUPPORTED; }
+static inline enum es_result sev_ghcb_msr_read(u64 msr, u64 *value) { return ES_UNSUPPORTED; }
#endif /* CONFIG_AMD_MEM_ENCRYPT */
diff --git a/arch/x86/kernel/apic/x2apic_savic.c b/arch/x86/kernel/apic/x2apic_savic.c
index 6a471bbc3dba..99151be4e173 100644
--- a/arch/x86/kernel/apic/x2apic_savic.c
+++ b/arch/x86/kernel/apic/x2apic_savic.c
@@ -11,6 +11,7 @@
#include <linux/cc_platform.h>
#include <linux/percpu-defs.h>
#include <linux/align.h>
+#include <linux/sizes.h>
#include <asm/apic.h>
#include <asm/sev.h>
@@ -20,6 +21,19 @@
static DEFINE_PER_CPU(void *, apic_backing_page);
static DEFINE_PER_CPU(bool, savic_setup_done);
+enum lapic_lvt_entry {
+ LVT_TIMER,
+ LVT_THERMAL_MONITOR,
+ LVT_PERFORMANCE_COUNTER,
+ LVT_LINT0,
+ LVT_LINT1,
+ LVT_ERROR,
+
+ APIC_MAX_NR_LVT_ENTRIES,
+};
+
+#define APIC_LVTx(x) (APIC_LVTT + 0x10 * (x))
+
static int x2apic_savic_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
{
return x2apic_enabled() && cc_platform_has(CC_ATTR_SNP_SECURE_AVIC);
@@ -35,6 +49,22 @@ static inline void set_reg(char *page, int reg_off, u32 val)
WRITE_ONCE(*((u32 *)(page + reg_off)), val);
}
+static u32 read_msr_from_hv(u32 reg)
+{
+ u64 data, msr;
+ int ret;
+
+ msr = APIC_BASE_MSR + (reg >> 4);
+ ret = sev_ghcb_msr_read(msr, &data);
+ if (ret != ES_OK) {
+ pr_err("Secure AVIC msr (%#llx) read returned error (%d)\n", msr, ret);
+ /* MSR read failures are treated as fatal errors */
+ snp_abort();
+ }
+
+ return lower_32_bits(data);
+}
+
#define SAVIC_ALLOWED_IRR_OFFSET 0x204
static u32 x2apic_savic_read(u32 reg)
@@ -168,6 +198,30 @@ static void x2apic_savic_send_IPI_mask_allbutself(const struct cpumask *mask, in
__send_IPI_mask(mask, vector, APIC_DEST_ALLBUT);
}
+static void init_backing_page(void *backing_page)
+{
+ u32 val;
+ int i;
+
+ val = read_msr_from_hv(APIC_LVR);
+ set_reg(backing_page, APIC_LVR, val);
+
+ /*
+ * Hypervisor is used for all timer related functions,
+ * so don't copy those values.
+ */
+ for (i = LVT_THERMAL_MONITOR; i < APIC_MAX_NR_LVT_ENTRIES; i++) {
+ val = read_msr_from_hv(APIC_LVTx(i));
+ set_reg(backing_page, APIC_LVTx(i), val);
+ }
+
+ val = read_msr_from_hv(APIC_LVT0);
+ set_reg(backing_page, APIC_LVT0, val);
+
+ val = read_msr_from_hv(APIC_LDR);
+ set_reg(backing_page, APIC_LDR, val);
+}
+
static void x2apic_savic_setup(void)
{
void *backing_page;
@@ -178,6 +232,7 @@ static void x2apic_savic_setup(void)
return;
backing_page = this_cpu_read(apic_backing_page);
+ init_backing_page(backing_page);
gpa = __pa(backing_page);
ret = sev_notify_savic_gpa(gpa);
if (ret != ES_OK)
--
2.34.1
^ permalink raw reply related [flat|nested] 81+ messages in thread
* [RFC 05/14] x86/apic: Initialize APIC ID for Secure AVIC
2024-09-13 11:36 [RFC 00/14] AMD: Add Secure AVIC Guest Support Neeraj Upadhyay
` (3 preceding siblings ...)
2024-09-13 11:36 ` [RFC 04/14] x86/apic: Initialize APIC backing page for Secure AVIC Neeraj Upadhyay
@ 2024-09-13 11:36 ` Neeraj Upadhyay
2024-11-09 20:13 ` [sos-linux-ext-patches] " Melody (Huibo) Wang
2024-09-13 11:36 ` [RFC 06/14] x86/apic: Add update_vector callback " Neeraj Upadhyay
` (9 subsequent siblings)
14 siblings, 1 reply; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-09-13 11:36 UTC (permalink / raw)
To: linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
Initialize the APIC ID in the APIC backing page with the
CPUID function 0000_000bh_EDX (Extended Topology Enumeration),
and ensure that APIC ID msr read from hypervisor is consistent
with the value read from CPUID.
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/kernel/apic/x2apic_savic.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/arch/x86/kernel/apic/x2apic_savic.c b/arch/x86/kernel/apic/x2apic_savic.c
index 99151be4e173..09fbc1857bf3 100644
--- a/arch/x86/kernel/apic/x2apic_savic.c
+++ b/arch/x86/kernel/apic/x2apic_savic.c
@@ -14,6 +14,7 @@
#include <linux/sizes.h>
#include <asm/apic.h>
+#include <asm/cpuid.h>
#include <asm/sev.h>
#include "local.h"
@@ -200,6 +201,8 @@ static void x2apic_savic_send_IPI_mask_allbutself(const struct cpumask *mask, in
static void init_backing_page(void *backing_page)
{
+ u32 hv_apic_id;
+ u32 apic_id;
u32 val;
int i;
@@ -220,6 +223,13 @@ static void init_backing_page(void *backing_page)
val = read_msr_from_hv(APIC_LDR);
set_reg(backing_page, APIC_LDR, val);
+
+ /* Read APIC ID from Extended Topology Enumeration CPUID */
+ apic_id = cpuid_edx(0x0000000b);
+ hv_apic_id = read_msr_from_hv(APIC_ID);
+ WARN_ONCE(hv_apic_id != apic_id, "Inconsistent APIC_ID values: %d (cpuid), %d (msr)",
+ apic_id, hv_apic_id);
+ set_reg(backing_page, APIC_ID, apic_id);
}
static void x2apic_savic_setup(void)
--
2.34.1
^ permalink raw reply related [flat|nested] 81+ messages in thread
* [RFC 06/14] x86/apic: Add update_vector callback for Secure AVIC
2024-09-13 11:36 [RFC 00/14] AMD: Add Secure AVIC Guest Support Neeraj Upadhyay
` (4 preceding siblings ...)
2024-09-13 11:36 ` [RFC 05/14] x86/apic: Initialize APIC ID " Neeraj Upadhyay
@ 2024-09-13 11:36 ` Neeraj Upadhyay
2024-09-13 11:36 ` [RFC 07/14] x86/apic: Add support to send IPI " Neeraj Upadhyay
` (8 subsequent siblings)
14 siblings, 0 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-09-13 11:36 UTC (permalink / raw)
To: linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
From: Kishon Vijay Abraham I <kvijayab@amd.com>
Add update_vector callback to set/clear ALLOWED_IRR field in
the APIC backing page. The allowed IRR vector indicates the
interrupt vectors which the guest allows the hypervisor to
send (typically for emulated devices). ALLOWED_IRR is meant
to be used specifically for vectors that the hypervisor is
allowed to inject, such as device interrupts. Interrupt
vectors used exclusively by the guest itself (like IPI vectors)
should not be allowed to be injected into the guest for security
reasons.
The update_vector callback is invoked from APIC vector domain
whenever a vector is allocated, freed or moved.
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Co-developed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/include/asm/apic.h | 2 ++
arch/x86/kernel/apic/vector.c | 8 ++++++++
arch/x86/kernel/apic/x2apic_savic.c | 21 +++++++++++++++++++++
3 files changed, 31 insertions(+)
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index ca682c1e8748..2d5400372470 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -315,6 +315,8 @@ struct apic {
/* wakeup secondary CPU using 64-bit wakeup point */
int (*wakeup_secondary_cpu_64)(u32 apicid, unsigned long start_eip);
+ void (*update_vector)(unsigned int cpu, unsigned int vector, bool set);
+
char *name;
};
diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index 557318145038..5aa65a732b05 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -174,6 +174,8 @@ static void apic_update_vector(struct irq_data *irqd, unsigned int newvec,
apicd->prev_cpu = apicd->cpu;
WARN_ON_ONCE(apicd->cpu == newcpu);
} else {
+ if (apic->update_vector)
+ apic->update_vector(apicd->cpu, apicd->vector, false);
irq_matrix_free(vector_matrix, apicd->cpu, apicd->vector,
managed);
}
@@ -183,6 +185,8 @@ static void apic_update_vector(struct irq_data *irqd, unsigned int newvec,
apicd->cpu = newcpu;
BUG_ON(!IS_ERR_OR_NULL(per_cpu(vector_irq, newcpu)[newvec]));
per_cpu(vector_irq, newcpu)[newvec] = desc;
+ if (apic->update_vector)
+ apic->update_vector(apicd->cpu, apicd->vector, true);
}
static void vector_assign_managed_shutdown(struct irq_data *irqd)
@@ -528,11 +532,15 @@ static bool vector_configure_legacy(unsigned int virq, struct irq_data *irqd,
if (irqd_is_activated(irqd)) {
trace_vector_setup(virq, true, 0);
apic_update_irq_cfg(irqd, apicd->vector, apicd->cpu);
+ if (apic->update_vector)
+ apic->update_vector(apicd->cpu, apicd->vector, true);
} else {
/* Release the vector */
apicd->can_reserve = true;
irqd_set_can_reserve(irqd);
clear_irq_vector(irqd);
+ if (apic->update_vector)
+ apic->update_vector(apicd->cpu, apicd->vector, false);
realloc = true;
}
raw_spin_unlock_irqrestore(&vector_lock, flags);
diff --git a/arch/x86/kernel/apic/x2apic_savic.c b/arch/x86/kernel/apic/x2apic_savic.c
index 09fbc1857bf3..a9e54c1c6446 100644
--- a/arch/x86/kernel/apic/x2apic_savic.c
+++ b/arch/x86/kernel/apic/x2apic_savic.c
@@ -19,6 +19,9 @@
#include "local.h"
+#define VEC_POS(v) ((v) & (32 - 1))
+#define REG_POS(v) (((v) >> 5) << 4)
+
static DEFINE_PER_CPU(void *, apic_backing_page);
static DEFINE_PER_CPU(bool, savic_setup_done);
@@ -199,6 +202,22 @@ static void x2apic_savic_send_IPI_mask_allbutself(const struct cpumask *mask, in
__send_IPI_mask(mask, vector, APIC_DEST_ALLBUT);
}
+static void x2apic_savic_update_vector(unsigned int cpu, unsigned int vector, bool set)
+{
+ void *backing_page;
+ unsigned long *reg;
+ int reg_off;
+
+ backing_page = per_cpu(apic_backing_page, cpu);
+ reg_off = SAVIC_ALLOWED_IRR_OFFSET + REG_POS(vector);
+ reg = (unsigned long *)((char *)backing_page + reg_off);
+
+ if (set)
+ test_and_set_bit(VEC_POS(vector), reg);
+ else
+ test_and_clear_bit(VEC_POS(vector), reg);
+}
+
static void init_backing_page(void *backing_page)
{
u32 hv_apic_id;
@@ -313,6 +332,8 @@ static struct apic apic_x2apic_savic __ro_after_init = {
.eoi = native_apic_msr_eoi,
.icr_read = native_x2apic_icr_read,
.icr_write = native_x2apic_icr_write,
+
+ .update_vector = x2apic_savic_update_vector,
};
apic_driver(apic_x2apic_savic);
--
2.34.1
^ permalink raw reply related [flat|nested] 81+ messages in thread
* [RFC 07/14] x86/apic: Add support to send IPI for Secure AVIC
2024-09-13 11:36 [RFC 00/14] AMD: Add Secure AVIC Guest Support Neeraj Upadhyay
` (5 preceding siblings ...)
2024-09-13 11:36 ` [RFC 06/14] x86/apic: Add update_vector callback " Neeraj Upadhyay
@ 2024-09-13 11:36 ` Neeraj Upadhyay
2024-09-13 11:36 ` [RFC 08/14] x86/apic: Support LAPIC timer " Neeraj Upadhyay
` (7 subsequent siblings)
14 siblings, 0 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-09-13 11:36 UTC (permalink / raw)
To: linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
From: Kishon Vijay Abraham I <kvijayab@amd.com>
With Secure AVIC only Self-IPI is accelerated. To handle all the
other IPIs, add new callbacks for sending IPI, which write to the
IRR of the target guest APIC backing page (after decoding the ICR
register) and then issue VMGEXIT for the hypervisor to notify the
target vCPU.
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Co-developed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/coco/sev/core.c | 25 +++++
arch/x86/include/asm/sev.h | 2 +
arch/x86/kernel/apic/x2apic_savic.c | 152 +++++++++++++++++++++++++---
3 files changed, 166 insertions(+), 13 deletions(-)
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index 0e140f92cfef..63ecab60cab7 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -1392,6 +1392,31 @@ enum es_result sev_ghcb_msr_read(u64 msr, u64 *value)
return ret;
}
+enum es_result sev_ghcb_msr_write(u64 msr, u64 value)
+{
+ struct pt_regs regs = {
+ .cx = msr,
+ .ax = lower_32_bits(value),
+ .dx = upper_32_bits(value)
+ };
+ struct es_em_ctxt ctxt = { .regs = ®s };
+ struct ghcb_state state;
+ unsigned long flags;
+ enum es_result ret;
+ struct ghcb *ghcb;
+
+ local_irq_save(flags);
+ ghcb = __sev_get_ghcb(&state);
+ vc_ghcb_invalidate(ghcb);
+
+ ret = __vc_handle_msr(ghcb, &ctxt, true);
+
+ __sev_put_ghcb(&state);
+ local_irq_restore(flags);
+
+ return ret;
+}
+
enum es_result sev_notify_savic_gpa(u64 gpa)
{
struct ghcb_state state;
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 5e6385bfb85a..1e55e3f1b7da 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -401,6 +401,7 @@ void sev_show_status(void);
void snp_update_svsm_ca(void);
enum es_result sev_notify_savic_gpa(u64 gpa);
enum es_result sev_ghcb_msr_read(u64 msr, u64 *value);
+enum es_result sev_ghcb_msr_write(u64 msr, u64 value);
#else /* !CONFIG_AMD_MEM_ENCRYPT */
@@ -439,6 +440,7 @@ static inline void sev_show_status(void) { }
static inline void snp_update_svsm_ca(void) { }
static inline enum es_result sev_notify_savic_gpa(u64 gpa) { return ES_UNSUPPORTED; }
static inline enum es_result sev_ghcb_msr_read(u64 msr, u64 *value) { return ES_UNSUPPORTED; }
+static inline enum es_result sev_ghcb_msr_write(u64 msr, u64 value) { return ES_UNSUPPORTED; }
#endif /* CONFIG_AMD_MEM_ENCRYPT */
diff --git a/arch/x86/kernel/apic/x2apic_savic.c b/arch/x86/kernel/apic/x2apic_savic.c
index a9e54c1c6446..30a24b70e5cb 100644
--- a/arch/x86/kernel/apic/x2apic_savic.c
+++ b/arch/x86/kernel/apic/x2apic_savic.c
@@ -69,6 +69,20 @@ static u32 read_msr_from_hv(u32 reg)
return lower_32_bits(data);
}
+static void write_msr_to_hv(u32 reg, u64 data)
+{
+ u64 msr;
+ int ret;
+
+ msr = APIC_BASE_MSR + (reg >> 4);
+ ret = sev_ghcb_msr_write(msr, data);
+ if (ret != ES_OK) {
+ pr_err("Secure AVIC msr (%#llx) write returned error (%d)\n", msr, ret);
+ /* MSR writes should never fail. Any failure is fatal error for SNP guest */
+ snp_abort();
+ }
+}
+
#define SAVIC_ALLOWED_IRR_OFFSET 0x204
static u32 x2apic_savic_read(u32 reg)
@@ -124,6 +138,7 @@ static u32 x2apic_savic_read(u32 reg)
static void x2apic_savic_write(u32 reg, u32 data)
{
void *backing_page = this_cpu_read(apic_backing_page);
+ unsigned int cfg;
switch (reg) {
case APIC_LVTT:
@@ -131,7 +146,6 @@ static void x2apic_savic_write(u32 reg, u32 data)
case APIC_LVT1:
case APIC_TMICT:
case APIC_TDCR:
- case APIC_SELF_IPI:
/* APIC_ID is writable and configured by guest for Secure AVIC */
case APIC_ID:
case APIC_TASKPRI:
@@ -149,6 +163,11 @@ static void x2apic_savic_write(u32 reg, u32 data)
case APIC_EILVTn(0) ... APIC_EILVTn(3):
set_reg(backing_page, reg, data);
break;
+ /* Self IPIs are accelerated by hardware, use wrmsr */
+ case APIC_SELF_IPI:
+ cfg = __prepare_ICR(APIC_DEST_SELF, data, 0);
+ native_x2apic_icr_write(cfg, 0);
+ break;
/* ALLOWED_IRR offsets are writable */
case SAVIC_ALLOWED_IRR_OFFSET ... SAVIC_ALLOWED_IRR_OFFSET + 0x70:
if (IS_ALIGNED(reg - SAVIC_ALLOWED_IRR_OFFSET, 16)) {
@@ -161,13 +180,100 @@ static void x2apic_savic_write(u32 reg, u32 data)
}
}
+static void send_ipi(int cpu, int vector)
+{
+ void *backing_page;
+ int reg_off;
+
+ backing_page = per_cpu(apic_backing_page, cpu);
+ reg_off = APIC_IRR + REG_POS(vector);
+ /*
+ * Use test_and_set_bit() to ensure that IRR updates are atomic w.r.t. other
+ * IRR updates such as during VMRUN and during CPU interrupt handling flow.
+ */
+ test_and_set_bit(VEC_POS(vector), (unsigned long *)((char *)backing_page + reg_off));
+}
+
+static void send_ipi_dest(u64 icr_data)
+{
+ int vector, cpu;
+
+ vector = icr_data & APIC_VECTOR_MASK;
+ cpu = icr_data >> 32;
+
+ send_ipi(cpu, vector);
+}
+
+static void send_ipi_target(u64 icr_data)
+{
+ if (icr_data & APIC_DEST_LOGICAL) {
+ pr_err("IPI target should be of PHYSICAL type\n");
+ return;
+ }
+
+ send_ipi_dest(icr_data);
+}
+
+static void send_ipi_allbut(u64 icr_data)
+{
+ const struct cpumask *self_cpu_mask = get_cpu_mask(smp_processor_id());
+ unsigned long flags;
+ int vector, cpu;
+
+ vector = icr_data & APIC_VECTOR_MASK;
+ local_irq_save(flags);
+ for_each_cpu_andnot(cpu, cpu_present_mask, self_cpu_mask)
+ send_ipi(cpu, vector);
+ write_msr_to_hv(APIC_ICR, icr_data);
+ local_irq_restore(flags);
+}
+
+static void send_ipi_allinc(u64 icr_data)
+{
+ int vector;
+
+ send_ipi_allbut(icr_data);
+ vector = icr_data & APIC_VECTOR_MASK;
+ native_x2apic_icr_write(APIC_DEST_SELF | vector, 0);
+}
+
+static void x2apic_savic_icr_write(u32 icr_low, u32 icr_high)
+{
+ int dsh, vector;
+ u64 icr_data;
+
+ icr_data = ((u64)icr_high) << 32 | icr_low;
+ dsh = icr_low & APIC_DEST_ALLBUT;
+
+ switch (dsh) {
+ case APIC_DEST_SELF:
+ vector = icr_data & APIC_VECTOR_MASK;
+ x2apic_savic_write(APIC_SELF_IPI, vector);
+ break;
+ case APIC_DEST_ALLINC:
+ send_ipi_allinc(icr_data);
+ break;
+ case APIC_DEST_ALLBUT:
+ send_ipi_allbut(icr_data);
+ break;
+ default:
+ send_ipi_target(icr_data);
+ write_msr_to_hv(APIC_ICR, icr_data);
+ }
+}
+
+static void __send_IPI_dest(unsigned int apicid, int vector, unsigned int dest)
+{
+ unsigned int cfg = __prepare_ICR(0, vector, dest);
+
+ x2apic_savic_icr_write(cfg, apicid);
+}
+
static void x2apic_savic_send_IPI(int cpu, int vector)
{
u32 dest = per_cpu(x86_cpu_to_apicid, cpu);
- /* x2apic MSRs are special and need a special fence: */
- weak_wrmsr_fence();
- __x2apic_send_IPI_dest(dest, vector, APIC_DEST_PHYSICAL);
+ __send_IPI_dest(dest, vector, APIC_DEST_PHYSICAL);
}
static void
@@ -177,18 +283,16 @@ __send_IPI_mask(const struct cpumask *mask, int vector, int apic_dest)
unsigned long this_cpu;
unsigned long flags;
- /* x2apic MSRs are special and need a special fence: */
- weak_wrmsr_fence();
-
local_irq_save(flags);
this_cpu = smp_processor_id();
for_each_cpu(query_cpu, mask) {
if (apic_dest == APIC_DEST_ALLBUT && this_cpu == query_cpu)
continue;
- __x2apic_send_IPI_dest(per_cpu(x86_cpu_to_apicid, query_cpu),
- vector, APIC_DEST_PHYSICAL);
+ __send_IPI_dest(per_cpu(x86_cpu_to_apicid, query_cpu), vector,
+ APIC_DEST_PHYSICAL);
}
+
local_irq_restore(flags);
}
@@ -202,6 +306,28 @@ static void x2apic_savic_send_IPI_mask_allbutself(const struct cpumask *mask, in
__send_IPI_mask(mask, vector, APIC_DEST_ALLBUT);
}
+static void __send_IPI_shorthand(int vector, u32 which)
+{
+ unsigned int cfg = __prepare_ICR(which, vector, 0);
+
+ x2apic_savic_icr_write(cfg, 0);
+}
+
+static void x2apic_savic_send_IPI_allbutself(int vector)
+{
+ __send_IPI_shorthand(vector, APIC_DEST_ALLBUT);
+}
+
+static void x2apic_savic_send_IPI_all(int vector)
+{
+ __send_IPI_shorthand(vector, APIC_DEST_ALLINC);
+}
+
+static void x2apic_savic_send_IPI_self(int vector)
+{
+ __send_IPI_shorthand(vector, APIC_DEST_SELF);
+}
+
static void x2apic_savic_update_vector(unsigned int cpu, unsigned int vector, bool set)
{
void *backing_page;
@@ -322,16 +448,16 @@ static struct apic apic_x2apic_savic __ro_after_init = {
.send_IPI = x2apic_savic_send_IPI,
.send_IPI_mask = x2apic_savic_send_IPI_mask,
.send_IPI_mask_allbutself = x2apic_savic_send_IPI_mask_allbutself,
- .send_IPI_allbutself = x2apic_send_IPI_allbutself,
- .send_IPI_all = x2apic_send_IPI_all,
- .send_IPI_self = x2apic_send_IPI_self,
+ .send_IPI_allbutself = x2apic_savic_send_IPI_allbutself,
+ .send_IPI_all = x2apic_savic_send_IPI_all,
+ .send_IPI_self = x2apic_savic_send_IPI_self,
.nmi_to_offline_cpu = true,
.read = x2apic_savic_read,
.write = x2apic_savic_write,
.eoi = native_apic_msr_eoi,
.icr_read = native_x2apic_icr_read,
- .icr_write = native_x2apic_icr_write,
+ .icr_write = x2apic_savic_icr_write,
.update_vector = x2apic_savic_update_vector,
};
--
2.34.1
^ permalink raw reply related [flat|nested] 81+ messages in thread
* [RFC 08/14] x86/apic: Support LAPIC timer for Secure AVIC
2024-09-13 11:36 [RFC 00/14] AMD: Add Secure AVIC Guest Support Neeraj Upadhyay
` (6 preceding siblings ...)
2024-09-13 11:36 ` [RFC 07/14] x86/apic: Add support to send IPI " Neeraj Upadhyay
@ 2024-09-13 11:36 ` Neeraj Upadhyay
2024-09-13 11:37 ` [RFC 09/14] x86/sev: Initialize VGIF for secondary VCPUs " Neeraj Upadhyay
` (6 subsequent siblings)
14 siblings, 0 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-09-13 11:36 UTC (permalink / raw)
To: linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
From: Kishon Vijay Abraham I <kvijayab@amd.com>
Secure AVIC requires LAPIC timer to be emulated by hypervisor. KVM
already supports emulating LAPIC timer using hrtimers. In order
to emulate LAPIC timer, APIC_LVTT, APIC_TMICT and APIC_TDCR register
values need to be propagated to the hypervisor for arming the timer.
APIC_TMCCT register value has to be read from the hypervisor, which
is required for calibrating the APIC timer. So, read/write all APIC
timer registers from/to the hypervisor.
In addition, configure APIC_ALLOWED_IRR for the hypervisor to inject
timer interrupt using LOCAL_TIMER_VECTOR.
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Co-developed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/kernel/apic/apic.c | 2 ++
arch/x86/kernel/apic/x2apic_savic.c | 7 +++++--
2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index b47d1dc854c3..aeda74bf15e6 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -579,6 +579,8 @@ static void setup_APIC_timer(void)
0xF, ~0UL);
} else
clockevents_register_device(levt);
+
+ apic->update_vector(smp_processor_id(), LOCAL_TIMER_VECTOR, true);
}
/*
diff --git a/arch/x86/kernel/apic/x2apic_savic.c b/arch/x86/kernel/apic/x2apic_savic.c
index 30a24b70e5cb..2eab9a773005 100644
--- a/arch/x86/kernel/apic/x2apic_savic.c
+++ b/arch/x86/kernel/apic/x2apic_savic.c
@@ -94,6 +94,7 @@ static u32 x2apic_savic_read(u32 reg)
case APIC_TMICT:
case APIC_TMCCT:
case APIC_TDCR:
+ return read_msr_from_hv(reg);
case APIC_ID:
case APIC_LVR:
case APIC_TASKPRI:
@@ -142,10 +143,12 @@ static void x2apic_savic_write(u32 reg, u32 data)
switch (reg) {
case APIC_LVTT:
- case APIC_LVT0:
- case APIC_LVT1:
case APIC_TMICT:
case APIC_TDCR:
+ write_msr_to_hv(reg, data);
+ break;
+ case APIC_LVT0:
+ case APIC_LVT1:
/* APIC_ID is writable and configured by guest for Secure AVIC */
case APIC_ID:
case APIC_TASKPRI:
--
2.34.1
^ permalink raw reply related [flat|nested] 81+ messages in thread
* [RFC 09/14] x86/sev: Initialize VGIF for secondary VCPUs for Secure AVIC
2024-09-13 11:36 [RFC 00/14] AMD: Add Secure AVIC Guest Support Neeraj Upadhyay
` (7 preceding siblings ...)
2024-09-13 11:36 ` [RFC 08/14] x86/apic: Support LAPIC timer " Neeraj Upadhyay
@ 2024-09-13 11:37 ` Neeraj Upadhyay
2024-09-13 11:37 ` [RFC 10/14] x86/apic: Add support to send NMI IPI " Neeraj Upadhyay
` (5 subsequent siblings)
14 siblings, 0 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-09-13 11:37 UTC (permalink / raw)
To: linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
From: Kishon Vijay Abraham I <kvijayab@amd.com>
VINTR_CTRL in VMSA should be configured for Secure AVIC. Configure
for secondary vCPUs (the configuration for boot CPU is done in
hypervisor).
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/coco/sev/core.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index 63ecab60cab7..3c832c9befab 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -1190,6 +1190,9 @@ static int wakeup_cpu_via_vmgexit(u32 apic_id, unsigned long start_ip)
vmsa->x87_ftw = AP_INIT_X87_FTW_DEFAULT;
vmsa->x87_fcw = AP_INIT_X87_FCW_DEFAULT;
+ if (cc_platform_has(CC_ATTR_SNP_SECURE_AVIC))
+ vmsa->vintr_ctrl |= V_GIF_MASK;
+
/* SVME must be set. */
vmsa->efer = EFER_SVME;
--
2.34.1
^ permalink raw reply related [flat|nested] 81+ messages in thread
* [RFC 10/14] x86/apic: Add support to send NMI IPI for Secure AVIC
2024-09-13 11:36 [RFC 00/14] AMD: Add Secure AVIC Guest Support Neeraj Upadhyay
` (8 preceding siblings ...)
2024-09-13 11:37 ` [RFC 09/14] x86/sev: Initialize VGIF for secondary VCPUs " Neeraj Upadhyay
@ 2024-09-13 11:37 ` Neeraj Upadhyay
2024-09-13 11:37 ` [RFC 11/14] x86/apic: Allow NMI to be injected from hypervisor " Neeraj Upadhyay
` (4 subsequent siblings)
14 siblings, 0 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-09-13 11:37 UTC (permalink / raw)
To: linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
From: Kishon Vijay Abraham I <kvijayab@amd.com>
Secure AVIC has introduced a new field in the APIC backing page
"NmiReq" that has to be set by the guest to request a NMI IPI.
Add support to set NmiReq appropriately to send NMI IPI.
This also requires Virtual NMI feature to be enabled in VINTRL_CTRL
field in the VMSA. However this would be added by a later commit
after adding support for injecting NMI from the hypervisor.
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/kernel/apic/x2apic_savic.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/apic/x2apic_savic.c b/arch/x86/kernel/apic/x2apic_savic.c
index 2eab9a773005..5502a828a795 100644
--- a/arch/x86/kernel/apic/x2apic_savic.c
+++ b/arch/x86/kernel/apic/x2apic_savic.c
@@ -183,7 +183,7 @@ static void x2apic_savic_write(u32 reg, u32 data)
}
}
-static void send_ipi(int cpu, int vector)
+static void send_ipi(int cpu, int vector, bool nmi)
{
void *backing_page;
int reg_off;
@@ -195,16 +195,20 @@ static void send_ipi(int cpu, int vector)
* IRR updates such as during VMRUN and during CPU interrupt handling flow.
*/
test_and_set_bit(VEC_POS(vector), (unsigned long *)((char *)backing_page + reg_off));
+ if (nmi)
+ set_reg(backing_page, SAVIC_NMI_REQ_OFFSET, nmi);
}
static void send_ipi_dest(u64 icr_data)
{
int vector, cpu;
+ bool nmi;
vector = icr_data & APIC_VECTOR_MASK;
cpu = icr_data >> 32;
+ nmi = ((icr_data & APIC_DM_FIXED_MASK) == APIC_DM_NMI);
- send_ipi(cpu, vector);
+ send_ipi(cpu, vector, nmi);
}
static void send_ipi_target(u64 icr_data)
@@ -222,11 +226,13 @@ static void send_ipi_allbut(u64 icr_data)
const struct cpumask *self_cpu_mask = get_cpu_mask(smp_processor_id());
unsigned long flags;
int vector, cpu;
+ bool nmi;
vector = icr_data & APIC_VECTOR_MASK;
+ nmi = ((icr_data & APIC_DM_FIXED_MASK) == APIC_DM_NMI);
local_irq_save(flags);
for_each_cpu_andnot(cpu, cpu_present_mask, self_cpu_mask)
- send_ipi(cpu, vector);
+ send_ipi(cpu, vector, nmi);
write_msr_to_hv(APIC_ICR, icr_data);
local_irq_restore(flags);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 81+ messages in thread
* [RFC 11/14] x86/apic: Allow NMI to be injected from hypervisor for Secure AVIC
2024-09-13 11:36 [RFC 00/14] AMD: Add Secure AVIC Guest Support Neeraj Upadhyay
` (9 preceding siblings ...)
2024-09-13 11:37 ` [RFC 10/14] x86/apic: Add support to send NMI IPI " Neeraj Upadhyay
@ 2024-09-13 11:37 ` Neeraj Upadhyay
2024-09-13 11:37 ` [RFC 12/14] x86/sev: Enable NMI support " Neeraj Upadhyay
` (3 subsequent siblings)
14 siblings, 0 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-09-13 11:37 UTC (permalink / raw)
To: linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
From: Kishon Vijay Abraham I <kvijayab@amd.com>
Secure AVIC requires "AllowedNmi" bit in the Secure AVIC Control MSR
to be set for NMI to be injected from hypervisor.
Set "AllowedNmi" bit in Secure AVIC Control MSR here to allow NMI
interrupts to be injected from hypervisor. While at that, also propagate
APIC_LVT0 and APIC_LVT1 register values to the hypervisor required for
injecting NMI interrupts from hypervisor.
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/include/asm/msr-index.h | 5 +++++
arch/x86/kernel/apic/x2apic_savic.c | 10 ++++++++--
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index d0583619c978..0b7454ed7b39 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -662,6 +662,11 @@
#define MSR_AMD64_SNP_SECURE_AVIC_ENABLED BIT_ULL(MSR_AMD64_SNP_SECURE_AVIC_BIT)
#define MSR_AMD64_SNP_RESV_BIT 19
#define MSR_AMD64_SNP_RESERVED_MASK GENMASK_ULL(63, MSR_AMD64_SNP_RESV_BIT)
+#define MSR_AMD64_SECURE_AVIC_CONTROL 0xc0010138
+#define MSR_AMD64_SECURE_AVIC_EN_BIT 0
+#define MSR_AMD64_SECURE_AVIC_EN BIT_ULL(MSR_AMD64_SECURE_AVIC_EN_BIT)
+#define MSR_AMD64_SECURE_AVIC_ALLOWEDNMI_BIT 1
+#define MSR_AMD64_SECURE_AVIC_ALLOWEDNMI BIT_ULL(MSR_AMD64_SECURE_AVIC_ALLOWEDNMI_BIT)
#define MSR_AMD64_VIRT_SPEC_CTRL 0xc001011f
diff --git a/arch/x86/kernel/apic/x2apic_savic.c b/arch/x86/kernel/apic/x2apic_savic.c
index 5502a828a795..321b3678e26f 100644
--- a/arch/x86/kernel/apic/x2apic_savic.c
+++ b/arch/x86/kernel/apic/x2apic_savic.c
@@ -38,6 +38,11 @@ enum lapic_lvt_entry {
#define APIC_LVTx(x) (APIC_LVTT + 0x10 * (x))
+static inline void savic_wr_control_msr(u64 val)
+{
+ native_wrmsr(MSR_AMD64_SECURE_AVIC_CONTROL, lower_32_bits(val), upper_32_bits(val));
+}
+
static int x2apic_savic_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
{
return x2apic_enabled() && cc_platform_has(CC_ATTR_SNP_SECURE_AVIC);
@@ -143,12 +148,12 @@ static void x2apic_savic_write(u32 reg, u32 data)
switch (reg) {
case APIC_LVTT:
+ case APIC_LVT0:
+ case APIC_LVT1:
case APIC_TMICT:
case APIC_TDCR:
write_msr_to_hv(reg, data);
break;
- case APIC_LVT0:
- case APIC_LVT1:
/* APIC_ID is writable and configured by guest for Secure AVIC */
case APIC_ID:
case APIC_TASKPRI:
@@ -401,6 +406,7 @@ static void x2apic_savic_setup(void)
ret = sev_notify_savic_gpa(gpa);
if (ret != ES_OK)
snp_abort();
+ savic_wr_control_msr(gpa | MSR_AMD64_SECURE_AVIC_ALLOWEDNMI);
this_cpu_write(savic_setup_done, true);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 81+ messages in thread
* [RFC 12/14] x86/sev: Enable NMI support for Secure AVIC
2024-09-13 11:36 [RFC 00/14] AMD: Add Secure AVIC Guest Support Neeraj Upadhyay
` (10 preceding siblings ...)
2024-09-13 11:37 ` [RFC 11/14] x86/apic: Allow NMI to be injected from hypervisor " Neeraj Upadhyay
@ 2024-09-13 11:37 ` Neeraj Upadhyay
2024-09-13 11:37 ` [RFC 13/14] x86/apic: Enable Secure AVIC in Control MSR Neeraj Upadhyay
` (2 subsequent siblings)
14 siblings, 0 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-09-13 11:37 UTC (permalink / raw)
To: linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
From: Kishon Vijay Abraham I <kvijayab@amd.com>
Now that support to send NMI IPI and support to inject NMI from
hypervisor has been added, set V_NMI_ENABLE in VINTR_CTRL field of
VMSA to enable NMI.
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/coco/sev/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index 3c832c9befab..d0057a2a7d4a 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -1191,7 +1191,7 @@ static int wakeup_cpu_via_vmgexit(u32 apic_id, unsigned long start_ip)
vmsa->x87_fcw = AP_INIT_X87_FCW_DEFAULT;
if (cc_platform_has(CC_ATTR_SNP_SECURE_AVIC))
- vmsa->vintr_ctrl |= V_GIF_MASK;
+ vmsa->vintr_ctrl |= (V_GIF_MASK | V_NMI_ENABLE_MASK);
/* SVME must be set. */
vmsa->efer = EFER_SVME;
--
2.34.1
^ permalink raw reply related [flat|nested] 81+ messages in thread
* [RFC 13/14] x86/apic: Enable Secure AVIC in Control MSR
2024-09-13 11:36 [RFC 00/14] AMD: Add Secure AVIC Guest Support Neeraj Upadhyay
` (11 preceding siblings ...)
2024-09-13 11:37 ` [RFC 12/14] x86/sev: Enable NMI support " Neeraj Upadhyay
@ 2024-09-13 11:37 ` Neeraj Upadhyay
2024-09-13 11:37 ` [RFC 14/14] x86/sev: Indicate SEV-SNP guest supports Secure AVIC Neeraj Upadhyay
2024-10-17 8:23 ` [RFC 00/14] AMD: Add Secure AVIC Guest Support Kirill A. Shutemov
14 siblings, 0 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-09-13 11:37 UTC (permalink / raw)
To: linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
With all the pieces in place now, enable Secure AVIC in Secure
AVIC Control MSR. Any access to x2APIC MSRs are emulated by
hypervisor before Secure AVIC is enabled in the Control MSR.
Post Secure AVIC enablement, all x2APIC MSR accesses (whether
accelerated by AVIC hardware or trapped as #VC exception) operate
on guest APIC backing page.
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/kernel/apic/x2apic_savic.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/apic/x2apic_savic.c b/arch/x86/kernel/apic/x2apic_savic.c
index 321b3678e26f..a3f0ddc6b5b6 100644
--- a/arch/x86/kernel/apic/x2apic_savic.c
+++ b/arch/x86/kernel/apic/x2apic_savic.c
@@ -406,7 +406,7 @@ static void x2apic_savic_setup(void)
ret = sev_notify_savic_gpa(gpa);
if (ret != ES_OK)
snp_abort();
- savic_wr_control_msr(gpa | MSR_AMD64_SECURE_AVIC_ALLOWEDNMI);
+ savic_wr_control_msr(gpa | MSR_AMD64_SECURE_AVIC_EN | MSR_AMD64_SECURE_AVIC_ALLOWEDNMI);
this_cpu_write(savic_setup_done, true);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 81+ messages in thread
* [RFC 14/14] x86/sev: Indicate SEV-SNP guest supports Secure AVIC
2024-09-13 11:36 [RFC 00/14] AMD: Add Secure AVIC Guest Support Neeraj Upadhyay
` (12 preceding siblings ...)
2024-09-13 11:37 ` [RFC 13/14] x86/apic: Enable Secure AVIC in Control MSR Neeraj Upadhyay
@ 2024-09-13 11:37 ` Neeraj Upadhyay
2024-10-17 8:23 ` [RFC 00/14] AMD: Add Secure AVIC Guest Support Kirill A. Shutemov
14 siblings, 0 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-09-13 11:37 UTC (permalink / raw)
To: linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
From: Kishon Vijay Abraham I <kvijayab@amd.com>
Now that Secure AVIC support is added in the guest, indicate SEV-SNP
guest supports Secure AVIC.
Without this, the guest terminates booting with Non-Automatic Exit(NAE)
termination request event.
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/boot/compressed/sev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index ec038be0a048..fa5f1dc94e2b 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -402,7 +402,7 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
* by the guest kernel. As and when a new feature is implemented in the
* guest kernel, a corresponding bit should be added to the mask.
*/
-#define SNP_FEATURES_PRESENT MSR_AMD64_SNP_DEBUG_SWAP
+#define SNP_FEATURES_PRESENT (MSR_AMD64_SNP_DEBUG_SWAP | MSR_AMD64_SNP_SECURE_AVIC_ENABLED)
u64 snp_get_unsupported_features(u64 status)
{
--
2.34.1
^ permalink raw reply related [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-09-13 11:36 ` [RFC 01/14] x86/apic: Add new driver for Secure AVIC Neeraj Upadhyay
@ 2024-10-08 19:15 ` Borislav Petkov
2024-10-09 1:56 ` Neeraj Upadhyay
2024-10-09 10:10 ` Kirill A. Shutemov
2024-11-18 21:45 ` Melody (Huibo) Wang
2 siblings, 1 reply; 81+ messages in thread
From: Borislav Petkov @ 2024-10-08 19:15 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
On Fri, Sep 13, 2024 at 05:06:52PM +0530, Neeraj Upadhyay wrote:
> diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
> index cd44e120fe53..ec038be0a048 100644
> --- a/arch/x86/boot/compressed/sev.c
> +++ b/arch/x86/boot/compressed/sev.c
> @@ -394,6 +394,7 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
> MSR_AMD64_SNP_VMSA_REG_PROT | \
> MSR_AMD64_SNP_RESERVED_BIT13 | \
> MSR_AMD64_SNP_RESERVED_BIT15 | \
> + MSR_AMD64_SNP_SECURE_AVIC_ENABLED | \
> MSR_AMD64_SNP_RESERVED_MASK)
>
> /*
Shouldn't this hunk be in the last patch of the series, after all the sAVIC
enablement has happened?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-10-08 19:15 ` Borislav Petkov
@ 2024-10-09 1:56 ` Neeraj Upadhyay
2024-10-09 5:23 ` Borislav Petkov
0 siblings, 1 reply; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-10-09 1:56 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
On 10/9/2024 12:45 AM, Borislav Petkov wrote:
> On Fri, Sep 13, 2024 at 05:06:52PM +0530, Neeraj Upadhyay wrote:
>> diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
>> index cd44e120fe53..ec038be0a048 100644
>> --- a/arch/x86/boot/compressed/sev.c
>> +++ b/arch/x86/boot/compressed/sev.c
>> @@ -394,6 +394,7 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
>> MSR_AMD64_SNP_VMSA_REG_PROT | \
>> MSR_AMD64_SNP_RESERVED_BIT13 | \
>> MSR_AMD64_SNP_RESERVED_BIT15 | \
>> + MSR_AMD64_SNP_SECURE_AVIC_ENABLED | \
>> MSR_AMD64_SNP_RESERVED_MASK)
>>
>> /*
>
> Shouldn't this hunk be in the last patch of the series, after all the sAVIC
> enablement has happened?
>
As SECURE_AVIC feature is not supported (as reported by snp_get_unsupported_features())
by guest at this patch in the series, it is added to SNP_FEATURES_IMPL_REQ here. The bit
value within SNP_FEATURES_IMPL_REQ hasn't changed with this change as the same bit pos
was part of MSR_AMD64_SNP_RESERVED_MASK before this patch. In patch 14 SECURE_AVIC guest
support is indicated by guest.
- Neeraj
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-10-09 1:56 ` Neeraj Upadhyay
@ 2024-10-09 5:23 ` Borislav Petkov
2024-10-09 6:01 ` Neeraj Upadhyay
0 siblings, 1 reply; 81+ messages in thread
From: Borislav Petkov @ 2024-10-09 5:23 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
On Wed, Oct 09, 2024 at 07:26:55AM +0530, Neeraj Upadhyay wrote:
> As SECURE_AVIC feature is not supported (as reported by snp_get_unsupported_features())
> by guest at this patch in the series, it is added to SNP_FEATURES_IMPL_REQ here. The bit
> value within SNP_FEATURES_IMPL_REQ hasn't changed with this change as the same bit pos
> was part of MSR_AMD64_SNP_RESERVED_MASK before this patch. In patch 14 SECURE_AVIC guest
> support is indicated by guest.
So what's the point of adding it to SNP_FEATURES_IMPL_REQ here? What does that
do at all in this patch alone? Why is this change needed in here?
IOW, why don't you do all the feature bit handling in the last patch, where it
all belongs logically?
In the last patch you can start *testing* for
MSR_AMD64_SNP_SECURE_AVIC_ENABLED *and* enforce it with SNP_FEATURES_PRESENT.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-10-09 5:23 ` Borislav Petkov
@ 2024-10-09 6:01 ` Neeraj Upadhyay
2024-10-09 10:38 ` Borislav Petkov
2024-10-09 13:15 ` Tom Lendacky
0 siblings, 2 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-10-09 6:01 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
On 10/9/2024 10:53 AM, Borislav Petkov wrote:
> On Wed, Oct 09, 2024 at 07:26:55AM +0530, Neeraj Upadhyay wrote:
>> As SECURE_AVIC feature is not supported (as reported by snp_get_unsupported_features())
>> by guest at this patch in the series, it is added to SNP_FEATURES_IMPL_REQ here. The bit
>> value within SNP_FEATURES_IMPL_REQ hasn't changed with this change as the same bit pos
>> was part of MSR_AMD64_SNP_RESERVED_MASK before this patch. In patch 14 SECURE_AVIC guest
>> support is indicated by guest.
>
> So what's the point of adding it to SNP_FEATURES_IMPL_REQ here? What does that
> do at all in this patch alone? Why is this change needed in here?
>
Before this patch, if hypervisor enables Secure AVIC (reported in sev_status), guest would
terminate in snp_check_features(). The reason for this is, SNP_FEATURES_IMPL_REQ had the Secure
AVIC bit set before this patch, as that bit was part of MSR_AMD64_SNP_RESERVED_MASK
GENMASK_ULL(63, 18).
#define SNP_FEATURES_IMPL_REQ (MSR_AMD64_SNP_VTOM | \
...
MSR_AMD64_SNP_RESERVED_MASK)
Adding MSR_AMD64_SNP_SECURE_AVIC_BIT (bit 18) to SNP_FEATURES_IMPL_REQ in this patch
keeps that behavior intact as now with this change MSR_AMD64_SNP_RESERVED_MASK becomes
GENMASK_ULL(63, 19).
> IOW, why don't you do all the feature bit handling in the last patch, where it
> all belongs logically?
>
If we do that, then hypervisor could have enabled Secure AVIC support and the guest
code at this patch won't catch the missing guest-support early and it can result in some
unknown failures at later point during guest boot.
- Neeraj
> In the last patch you can start *testing* for
> MSR_AMD64_SNP_SECURE_AVIC_ENABLED *and* enforce it with SNP_FEATURES_PRESENT.
>
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-09-13 11:36 ` [RFC 01/14] x86/apic: Add new driver for Secure AVIC Neeraj Upadhyay
2024-10-08 19:15 ` Borislav Petkov
@ 2024-10-09 10:10 ` Kirill A. Shutemov
2024-10-09 10:42 ` Borislav Petkov
2024-11-18 21:45 ` Melody (Huibo) Wang
2 siblings, 1 reply; 81+ messages in thread
From: Kirill A. Shutemov @ 2024-10-09 10:10 UTC (permalink / raw)
To: Neeraj Upadhyay, bp
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
On Fri, Sep 13, 2024 at 05:06:52PM +0530, Neeraj Upadhyay wrote:
> diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
> index caa4b4430634..801208678450 100644
> --- a/include/linux/cc_platform.h
> +++ b/include/linux/cc_platform.h
> @@ -88,6 +88,14 @@ enum cc_attr {
> * enabled to run SEV-SNP guests.
> */
> CC_ATTR_HOST_SEV_SNP,
> +
> + /**
> + * @CC_ATTR_SNP_SECURE_AVIC: Secure AVIC mode is active.
> + *
> + * The host kernel is running with the necessary features enabled
> + * to run SEV-SNP guests with full Secure AVIC capabilities.
> + */
> + CC_ATTR_SNP_SECURE_AVIC,
I don't think CC attributes is the right way to track this kind of
features. My understanding of cc_platform interface is that it has to be
used to advertise some kind of property of the platform that generic code
and be interested in, not a specific implementation.
For the same reason, I think CC_ATTR_GUEST/HOST_SEV_SNP is also a bad use
of the interface.
Borislav, I know we had different view on this. What is your criteria on
what should and shouldn't be a CC attribute? I don't think we want a
parallel X86_FEATURE_*.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-10-09 6:01 ` Neeraj Upadhyay
@ 2024-10-09 10:38 ` Borislav Petkov
2024-10-09 11:00 ` Neeraj Upadhyay
2024-10-09 11:02 ` Borislav Petkov
2024-10-09 13:15 ` Tom Lendacky
1 sibling, 2 replies; 81+ messages in thread
From: Borislav Petkov @ 2024-10-09 10:38 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
On Wed, Oct 09, 2024 at 11:31:07AM +0530, Neeraj Upadhyay wrote:
> Before this patch, if hypervisor enables Secure AVIC (reported in sev_status), guest would
> terminate in snp_check_features().
We want the guest to terminate at this patch too.
The only case where the guest should not terminate is when the *full* sAVIC
support is in. I.e., at patch 14.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-10-09 10:10 ` Kirill A. Shutemov
@ 2024-10-09 10:42 ` Borislav Petkov
2024-10-09 11:03 ` Kirill A. Shutemov
0 siblings, 1 reply; 81+ messages in thread
From: Borislav Petkov @ 2024-10-09 10:42 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Neeraj Upadhyay, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On Wed, Oct 09, 2024 at 01:10:58PM +0300, Kirill A. Shutemov wrote:
> I don't think CC attributes is the right way to track this kind of
> features. My understanding of cc_platform interface is that it has to be
> used to advertise some kind of property of the platform that generic code
> and be interested in, not a specific implementation.
Yes.
>
> For the same reason, I think CC_ATTR_GUEST/HOST_SEV_SNP is also a bad use
> of the interface.
>
> Borislav, I know we had different view on this. What is your criteria on
> what should and shouldn't be a CC attribute? I don't think we want a
> parallel X86_FEATURE_*.
Yes, we don't.
Do you have a better idea which is cleaner than what we do now?
Yes yes, cc_platform reports aspects of the coco platform to generic code but
nothing stops the x86 code from calling those interfaces too, for simplicity
reasons.
Because the coco platform being a SNP guest or having an SAVIC are also two
aspects of that same confidential computing platform. So we might as well use
it this way too.
I'd say.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-10-09 10:38 ` Borislav Petkov
@ 2024-10-09 11:00 ` Neeraj Upadhyay
2024-10-09 11:02 ` Borislav Petkov
1 sibling, 0 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-10-09 11:00 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
On 10/9/2024 4:08 PM, Borislav Petkov wrote:
> On Wed, Oct 09, 2024 at 11:31:07AM +0530, Neeraj Upadhyay wrote:
>> Before this patch, if hypervisor enables Secure AVIC (reported in sev_status), guest would
>> terminate in snp_check_features().
>
> We want the guest to terminate at this patch too.
>
If I understand it correctly, you are fine with adding MSR_AMD64_SNP_SECURE_AVIC_ENABLED
to SNP_FEATURES_IMPL_REQ in this patch.
> The only case where the guest should not terminate is when the *full* sAVIC
> support is in. I.e., at patch 14.
>
Got it. This version of the patch series is following that.
- Neeraj
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-10-09 10:38 ` Borislav Petkov
2024-10-09 11:00 ` Neeraj Upadhyay
@ 2024-10-09 11:02 ` Borislav Petkov
2024-10-09 12:38 ` Neeraj Upadhyay
1 sibling, 1 reply; 81+ messages in thread
From: Borislav Petkov @ 2024-10-09 11:02 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
On Wed, Oct 09, 2024 at 12:38:21PM +0200, Borislav Petkov wrote:
> On Wed, Oct 09, 2024 at 11:31:07AM +0530, Neeraj Upadhyay wrote:
> > Before this patch, if hypervisor enables Secure AVIC (reported in sev_status), guest would
> > terminate in snp_check_features().
>
> We want the guest to terminate at this patch too.
>
> The only case where the guest should not terminate is when the *full* sAVIC
> support is in. I.e., at patch 14.
I went and did a small C program doing all that - I see what you mean now
- you want to *enforce* the guest termination when SAVIC bit is not in
SNP_FEATURES_PRESENT.
Basically what I want you do to.
Please call that out in the commit message as it is important.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-10-09 10:42 ` Borislav Petkov
@ 2024-10-09 11:03 ` Kirill A. Shutemov
2024-10-09 11:22 ` Borislav Petkov
0 siblings, 1 reply; 81+ messages in thread
From: Kirill A. Shutemov @ 2024-10-09 11:03 UTC (permalink / raw)
To: Borislav Petkov
Cc: Neeraj Upadhyay, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On Wed, Oct 09, 2024 at 12:42:34PM +0200, Borislav Petkov wrote:
> Do you have a better idea which is cleaner than what we do now?
I would rather convert these three attributes to synthetic X86_FEATUREs
next to X86_FEATURE_TDX_GUEST. I suggested it once.
> Yes yes, cc_platform reports aspects of the coco platform to generic code but
> nothing stops the x86 code from calling those interfaces too, for simplicity
> reasons.
I don't see why it is any simpler than having a synthetic X86_FEATURE.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-10-09 11:03 ` Kirill A. Shutemov
@ 2024-10-09 11:22 ` Borislav Petkov
2024-10-09 12:12 ` Kirill A. Shutemov
0 siblings, 1 reply; 81+ messages in thread
From: Borislav Petkov @ 2024-10-09 11:22 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Neeraj Upadhyay, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On Wed, Oct 09, 2024 at 02:03:48PM +0300, Kirill A. Shutemov wrote:
> I would rather convert these three attributes to synthetic X86_FEATUREs
> next to X86_FEATURE_TDX_GUEST. I suggested it once.
And back then I answered that splitting the coco checks between a X86_FEATURE
and a cc_platform ones is confusing. Which ones do I use, X86_FEATURE or
cc_platform?
Oh, for SNP or TDX I use cpu_feature_enabled() but in generic code I use
cc_platform.
Sounds confusing to me.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-10-09 11:22 ` Borislav Petkov
@ 2024-10-09 12:12 ` Kirill A. Shutemov
2024-10-09 13:53 ` Borislav Petkov
0 siblings, 1 reply; 81+ messages in thread
From: Kirill A. Shutemov @ 2024-10-09 12:12 UTC (permalink / raw)
To: Borislav Petkov
Cc: Neeraj Upadhyay, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On Wed, Oct 09, 2024 at 01:22:16PM +0200, Borislav Petkov wrote:
> On Wed, Oct 09, 2024 at 02:03:48PM +0300, Kirill A. Shutemov wrote:
> > I would rather convert these three attributes to synthetic X86_FEATUREs
> > next to X86_FEATURE_TDX_GUEST. I suggested it once.
>
> And back then I answered that splitting the coco checks between a X86_FEATURE
> and a cc_platform ones is confusing. Which ones do I use, X86_FEATURE or
> cc_platform?
>
> Oh, for SNP or TDX I use cpu_feature_enabled() but in generic code I use
> cc_platform.
>
> Sounds confusing to me.
If you use SNP or TDX check in generic code something is wrong. Abstraction
is broken somewhere. Generic code doesn't need to know concrete implementation.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-10-09 11:02 ` Borislav Petkov
@ 2024-10-09 12:38 ` Neeraj Upadhyay
0 siblings, 0 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-10-09 12:38 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
On 10/9/2024 4:32 PM, Borislav Petkov wrote:
> On Wed, Oct 09, 2024 at 12:38:21PM +0200, Borislav Petkov wrote:
>> On Wed, Oct 09, 2024 at 11:31:07AM +0530, Neeraj Upadhyay wrote:
>>> Before this patch, if hypervisor enables Secure AVIC (reported in sev_status), guest would
>>> terminate in snp_check_features().
>>
>> We want the guest to terminate at this patch too.
>>
>> The only case where the guest should not terminate is when the *full* sAVIC
>> support is in. I.e., at patch 14.
>
> I went and did a small C program doing all that - I see what you mean now
> - you want to *enforce* the guest termination when SAVIC bit is not in
> SNP_FEATURES_PRESENT.
>
Yes.
> Basically what I want you do to.
>
Cool!
> Please call that out in the commit message as it is important.
>
Sure, will update in next version. Thanks!
- Neeraj
> Thx.
>
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-10-09 6:01 ` Neeraj Upadhyay
2024-10-09 10:38 ` Borislav Petkov
@ 2024-10-09 13:15 ` Tom Lendacky
2024-10-09 13:50 ` Neeraj Upadhyay
1 sibling, 1 reply; 81+ messages in thread
From: Tom Lendacky @ 2024-10-09 13:15 UTC (permalink / raw)
To: Neeraj Upadhyay, Borislav Petkov
Cc: linux-kernel, tglx, mingo, dave.hansen, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
On 10/9/24 01:01, Neeraj Upadhyay wrote:
>
>
> On 10/9/2024 10:53 AM, Borislav Petkov wrote:
>> On Wed, Oct 09, 2024 at 07:26:55AM +0530, Neeraj Upadhyay wrote:
>>> As SECURE_AVIC feature is not supported (as reported by snp_get_unsupported_features())
>>> by guest at this patch in the series, it is added to SNP_FEATURES_IMPL_REQ here. The bit
>>> value within SNP_FEATURES_IMPL_REQ hasn't changed with this change as the same bit pos
>>> was part of MSR_AMD64_SNP_RESERVED_MASK before this patch. In patch 14 SECURE_AVIC guest
>>> support is indicated by guest.
>>
>> So what's the point of adding it to SNP_FEATURES_IMPL_REQ here? What does that
>> do at all in this patch alone? Why is this change needed in here?
>>
>
> Before this patch, if hypervisor enables Secure AVIC (reported in sev_status), guest would
> terminate in snp_check_features(). The reason for this is, SNP_FEATURES_IMPL_REQ had the Secure
> AVIC bit set before this patch, as that bit was part of MSR_AMD64_SNP_RESERVED_MASK
> GENMASK_ULL(63, 18).
>
> #define SNP_FEATURES_IMPL_REQ (MSR_AMD64_SNP_VTOM | \
> ...
> MSR_AMD64_SNP_RESERVED_MASK)
>
>
>
> Adding MSR_AMD64_SNP_SECURE_AVIC_BIT (bit 18) to SNP_FEATURES_IMPL_REQ in this patch
> keeps that behavior intact as now with this change MSR_AMD64_SNP_RESERVED_MASK becomes
> GENMASK_ULL(63, 19).
>
>
>> IOW, why don't you do all the feature bit handling in the last patch, where it
>> all belongs logically?
>>
>
> If we do that, then hypervisor could have enabled Secure AVIC support and the guest
> code at this patch won't catch the missing guest-support early and it can result in some
> unknown failures at later point during guest boot.
Won't the SNP_RESERVED_MASK catch it? You are just renaming the bit
position value, right? It was a 1 before and is still a 1. So the guest
will terminate if the hypervisor sets the Secure AVIC bit both before
and after this patch, right?
Thanks,
Tom
>
>
> - Neeraj
>
>> In the last patch you can start *testing* for
>> MSR_AMD64_SNP_SECURE_AVIC_ENABLED *and* enforce it with SNP_FEATURES_PRESENT.
>>
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-10-09 13:15 ` Tom Lendacky
@ 2024-10-09 13:50 ` Neeraj Upadhyay
0 siblings, 0 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-10-09 13:50 UTC (permalink / raw)
To: Tom Lendacky, Borislav Petkov
Cc: linux-kernel, tglx, mingo, dave.hansen, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
On 10/9/2024 6:45 PM, Tom Lendacky wrote:
> On 10/9/24 01:01, Neeraj Upadhyay wrote:
>>
>>
>> On 10/9/2024 10:53 AM, Borislav Petkov wrote:
>>> On Wed, Oct 09, 2024 at 07:26:55AM +0530, Neeraj Upadhyay wrote:
>>>> As SECURE_AVIC feature is not supported (as reported by snp_get_unsupported_features())
>>>> by guest at this patch in the series, it is added to SNP_FEATURES_IMPL_REQ here. The bit
>>>> value within SNP_FEATURES_IMPL_REQ hasn't changed with this change as the same bit pos
>>>> was part of MSR_AMD64_SNP_RESERVED_MASK before this patch. In patch 14 SECURE_AVIC guest
>>>> support is indicated by guest.
>>>
>>> So what's the point of adding it to SNP_FEATURES_IMPL_REQ here? What does that
>>> do at all in this patch alone? Why is this change needed in here?
>>>
>>
>> Before this patch, if hypervisor enables Secure AVIC (reported in sev_status), guest would
>> terminate in snp_check_features(). The reason for this is, SNP_FEATURES_IMPL_REQ had the Secure
>> AVIC bit set before this patch, as that bit was part of MSR_AMD64_SNP_RESERVED_MASK
>> GENMASK_ULL(63, 18).
>>
>> #define SNP_FEATURES_IMPL_REQ (MSR_AMD64_SNP_VTOM | \
>> ...
>> MSR_AMD64_SNP_RESERVED_MASK)
>>
>>
>>
>> Adding MSR_AMD64_SNP_SECURE_AVIC_BIT (bit 18) to SNP_FEATURES_IMPL_REQ in this patch
>> keeps that behavior intact as now with this change MSR_AMD64_SNP_RESERVED_MASK becomes
>> GENMASK_ULL(63, 19).
>>
>>
>>> IOW, why don't you do all the feature bit handling in the last patch, where it
>>> all belongs logically?
>>>
>>
>> If we do that, then hypervisor could have enabled Secure AVIC support and the guest
>> code at this patch won't catch the missing guest-support early and it can result in some
>> unknown failures at later point during guest boot.
>
> Won't the SNP_RESERVED_MASK catch it? You are just renaming the bit
> position value, right? It was a 1 before and is still a 1. So the guest
> will terminate if the hypervisor sets the Secure AVIC bit both before
> and after this patch, right?
>
Yes that is right. SNP_RESERVED_MASK catches it before this patch. My reply to Boris
above was for the case if we move setting of MSR_AMD64_SNP_SECURE_AVIC_ENABLED in
SNP_FEATURES_IMPL_REQ from this patch to patch 14.
- Neeraj
> Thanks,
> Tom
>
>>
>>
>> - Neeraj
>>
>>> In the last patch you can start *testing* for
>>> MSR_AMD64_SNP_SECURE_AVIC_ENABLED *and* enforce it with SNP_FEATURES_PRESENT.
>>>
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-10-09 12:12 ` Kirill A. Shutemov
@ 2024-10-09 13:53 ` Borislav Petkov
2024-10-11 7:29 ` Kirill A. Shutemov
0 siblings, 1 reply; 81+ messages in thread
From: Borislav Petkov @ 2024-10-09 13:53 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Neeraj Upadhyay, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On Wed, Oct 09, 2024 at 03:12:41PM +0300, Kirill A. Shutemov wrote:
> If you use SNP or TDX check in generic code something is wrong. Abstraction
> is broken somewhere. Generic code doesn't need to know concrete
> implementation.
That's perhaps because you're thinking that the *actual* coco implementation type
should be hidden away from generic code. But SNP and TDX are pretty different
so we might as well ask for them by their name.
But I can see why you'd think there might be some abstraction violation there.
My goal here - even though there might be some bad taste of abstraction
violation here - is simplicity. As expressed a bunch of times already, having
cc_platform *and* X86_FEATURE* things used in relation to coco code can be
confusing. So I'd prefer to avoid that confusion.
Nothing says anywhere that arch code cannot use cc_platform interfaces.
Absolutely nothing. So for the sake of KISS I'm going in that direction.
If it turns out later that this was a bad idea and we need to change it, we
can always can. As we do for other interfaces in the kernel.
If you're still not convinced, I already asked you:
"Do you have a better idea which is cleaner than what we do now?"
Your turn.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 02/14] x86/apic: Initialize Secure AVIC APIC backing page
2024-09-13 11:36 ` [RFC 02/14] x86/apic: Initialize Secure AVIC APIC backing page Neeraj Upadhyay
@ 2024-10-09 15:27 ` Dave Hansen
2024-10-09 16:31 ` Neeraj Upadhyay
2024-10-23 16:36 ` Borislav Petkov
1 sibling, 1 reply; 81+ messages in thread
From: Dave Hansen @ 2024-10-09 15:27 UTC (permalink / raw)
To: Neeraj Upadhyay, linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
On 9/13/24 04:36, Neeraj Upadhyay wrote:
> + sz = ALIGN(num_possible_cpus() * SZ_4K, SZ_2M);
> + backing_pages = kzalloc(sz, GFP_ATOMIC);
> + if (!backing_pages)
> + snp_abort();
Is this in an atomic context? If not, why the GFP_ATOMIC?
Second, this looks to be allocating a potentially large physically
contiguous chunk of memory, then handing it out 4k at a time. The loop is:
buf = alloc(NR_CPUS * PAGE_SIZE);
for (i = 0; i < NR_CPUS; i++)
foo[i] = buf + i * PAGE_SIZE;
but could be:
for (i = 0; i < NR_CPUS; i++)
foo[i] = alloc(PAGE_SIZE);
right?
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 02/14] x86/apic: Initialize Secure AVIC APIC backing page
2024-10-09 15:27 ` Dave Hansen
@ 2024-10-09 16:31 ` Neeraj Upadhyay
2024-10-09 17:03 ` Dave Hansen
0 siblings, 1 reply; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-10-09 16:31 UTC (permalink / raw)
To: Dave Hansen, linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
On 10/9/2024 8:57 PM, Dave Hansen wrote:
> On 9/13/24 04:36, Neeraj Upadhyay wrote:
>> + sz = ALIGN(num_possible_cpus() * SZ_4K, SZ_2M);
>> + backing_pages = kzalloc(sz, GFP_ATOMIC);
>> + if (!backing_pages)
>> + snp_abort();
>
> Is this in an atomic context? If not, why the GFP_ATOMIC?
>
No. I think GFP_ATOMIC is not required. I will change it to GFP_KERNEL.
> Second, this looks to be allocating a potentially large physically
> contiguous chunk of memory, then handing it out 4k at a time. The loop is:
>
> buf = alloc(NR_CPUS * PAGE_SIZE);
> for (i = 0; i < NR_CPUS; i++)
> foo[i] = buf + i * PAGE_SIZE;
>
> but could be:
>
> for (i = 0; i < NR_CPUS; i++)
> foo[i] = alloc(PAGE_SIZE);
>
> right?
Single contiguous allocation is done here to avoid TLB impact due to backing page
accesses (e.g. sending ipi requires writing to target CPU's backing page).
I can change it to allocation in chunks of size 2M instead of one big allocation.
Is that fine? Also, as described in commit message, reserving entire 2M chunk
for backing pages also prevents splitting of NPT entries into individual 4K entries.
This can happen if part of a 2M page is not allocated for backing pages by guest
and page state change (from private to shared) is done for that part.
- Neeraj
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 02/14] x86/apic: Initialize Secure AVIC APIC backing page
2024-10-09 16:31 ` Neeraj Upadhyay
@ 2024-10-09 17:03 ` Dave Hansen
2024-10-09 17:52 ` Neeraj Upadhyay
0 siblings, 1 reply; 81+ messages in thread
From: Dave Hansen @ 2024-10-09 17:03 UTC (permalink / raw)
To: Neeraj Upadhyay, linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
On 10/9/24 09:31, Neeraj Upadhyay wrote:
>> Second, this looks to be allocating a potentially large physically
>> contiguous chunk of memory, then handing it out 4k at a time. The loop is:
>>
>> buf = alloc(NR_CPUS * PAGE_SIZE);
>> for (i = 0; i < NR_CPUS; i++)
>> foo[i] = buf + i * PAGE_SIZE;
>>
>> but could be:
>>
>> for (i = 0; i < NR_CPUS; i++)
>> foo[i] = alloc(PAGE_SIZE);
>>
>> right?
>
> Single contiguous allocation is done here to avoid TLB impact due to backing page
> accesses (e.g. sending ipi requires writing to target CPU's backing page).
> I can change it to allocation in chunks of size 2M instead of one big allocation.
> Is that fine? Also, as described in commit message, reserving entire 2M chunk
> for backing pages also prevents splitting of NPT entries into individual 4K entries.
> This can happen if part of a 2M page is not allocated for backing pages by guest
> and page state change (from private to shared) is done for that part.
Ick.
First, this needs to be thoroughly commented, not in the changelogs.
Second, this is premature optimization at its finest. Just imagine if
_every_ site that needed 16k or 32k of shared memory decided to allocate
a 2M chunk for this _and_ used it sparsely. What's the average number
of vCPUs in a guest. 4? 8?
The absolute minimum that we can do here is some stupid infrastructure
that you call for allocating shared pages, or for things that _will_ be
converted to shared so they get packed.
But hacking uncommented 2M allocations into every site seems like
insanity to me.
IMNHO, you can either invest the time to put the infrastructure in place
and get 2M pages, or you can live with the suboptimal performance of 4k.
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 02/14] x86/apic: Initialize Secure AVIC APIC backing page
2024-10-09 17:03 ` Dave Hansen
@ 2024-10-09 17:52 ` Neeraj Upadhyay
2024-10-23 16:30 ` Borislav Petkov
0 siblings, 1 reply; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-10-09 17:52 UTC (permalink / raw)
To: Dave Hansen, linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
On 10/9/2024 10:33 PM, Dave Hansen wrote:
> On 10/9/24 09:31, Neeraj Upadhyay wrote:
>>> Second, this looks to be allocating a potentially large physically
>>> contiguous chunk of memory, then handing it out 4k at a time. The loop is:
>>>
>>> buf = alloc(NR_CPUS * PAGE_SIZE);
>>> for (i = 0; i < NR_CPUS; i++)
>>> foo[i] = buf + i * PAGE_SIZE;
>>>
>>> but could be:
>>>
>>> for (i = 0; i < NR_CPUS; i++)
>>> foo[i] = alloc(PAGE_SIZE);
>>>
>>> right?
>>
>> Single contiguous allocation is done here to avoid TLB impact due to backing page
>> accesses (e.g. sending ipi requires writing to target CPU's backing page).
>> I can change it to allocation in chunks of size 2M instead of one big allocation.
>> Is that fine? Also, as described in commit message, reserving entire 2M chunk
>> for backing pages also prevents splitting of NPT entries into individual 4K entries.
>> This can happen if part of a 2M page is not allocated for backing pages by guest
>> and page state change (from private to shared) is done for that part.
>
> Ick.
>
> First, this needs to be thoroughly commented, not in the changelogs.
>
Ok.
> Second, this is premature optimization at its finest. Just imagine if
> _every_ site that needed 16k or 32k of shared memory decided to allocate
> a 2M chunk for this _and_ used it sparsely. What's the average number
> of vCPUs in a guest. 4? 8?
>
Got it.
> The absolute minimum that we can do here is some stupid infrastructure
> that you call for allocating shared pages, or for things that _will_ be
> converted to shared so they get packed.
>
> But hacking uncommented 2M allocations into every site seems like
> insanity to me.
>
> IMNHO, you can either invest the time to put the infrastructure in place
> and get 2M pages, or you can live with the suboptimal performance of 4k.
I will start with 4K. For later, I will get the performance numbers to propose
a change in allocation scheme - for ex, allocating a bigger contiguous
batch from the total allocation required for backing pages (num_possible_cpus() * 4K)
without doing 2M reservation.
- Neeraj
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-10-09 13:53 ` Borislav Petkov
@ 2024-10-11 7:29 ` Kirill A. Shutemov
0 siblings, 0 replies; 81+ messages in thread
From: Kirill A. Shutemov @ 2024-10-11 7:29 UTC (permalink / raw)
To: Borislav Petkov
Cc: Neeraj Upadhyay, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On Wed, Oct 09, 2024 at 03:53:35PM +0200, Borislav Petkov wrote:
> On Wed, Oct 09, 2024 at 03:12:41PM +0300, Kirill A. Shutemov wrote:
> > If you use SNP or TDX check in generic code something is wrong. Abstraction
> > is broken somewhere. Generic code doesn't need to know concrete
> > implementation.
>
> That's perhaps because you're thinking that the *actual* coco implementation type
> should be hidden away from generic code. But SNP and TDX are pretty different
> so we might as well ask for them by their name.
>
> But I can see why you'd think there might be some abstraction violation there.
>
> My goal here - even though there might be some bad taste of abstraction
> violation here - is simplicity. As expressed a bunch of times already, having
> cc_platform *and* X86_FEATURE* things used in relation to coco code can be
> confusing. So I'd prefer to avoid that confusion.
>
> Nothing says anywhere that arch code cannot use cc_platform interfaces.
> Absolutely nothing. So for the sake of KISS I'm going in that direction.
>
> If it turns out later that this was a bad idea and we need to change it, we
> can always can. As we do for other interfaces in the kernel.
>
> If you're still not convinced, I already asked you:
>
> "Do you have a better idea which is cleaner than what we do now?"
>
> Your turn.
Okay, I've got your point. It is not what I would do, but I don't have
sufficient argument to change what is already there.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 00/14] AMD: Add Secure AVIC Guest Support
2024-09-13 11:36 [RFC 00/14] AMD: Add Secure AVIC Guest Support Neeraj Upadhyay
` (13 preceding siblings ...)
2024-09-13 11:37 ` [RFC 14/14] x86/sev: Indicate SEV-SNP guest supports Secure AVIC Neeraj Upadhyay
@ 2024-10-17 8:23 ` Kirill A. Shutemov
2024-10-18 2:33 ` Neeraj Upadhyay
14 siblings, 1 reply; 81+ messages in thread
From: Kirill A. Shutemov @ 2024-10-17 8:23 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, bp,
David.Kaplan, x86, hpa, peterz, seanjc, pbonzini, kvm
On Fri, Sep 13, 2024 at 05:06:51PM +0530, Neeraj Upadhyay wrote:
> Introduction
> ------------
>
> Secure AVIC is a new hardware feature in the AMD64 architecture to
> allow SEV-SNP guests to prevent hypervisor from generating unexpected
> interrupts to a vCPU or otherwise violate architectural assumptions
> around APIC behavior.
>
> One of the significant differences from AVIC or emulated x2APIC is that
> Secure AVIC uses a guest-owned and managed APIC backing page. It also
> introduces additional fields in both the VMCB and the Secure AVIC backing
> page to aid the guest in limiting which interrupt vectors can be injected
> into the guest.
>
> Guest APIC Backing Page
> -----------------------
> Each vCPU has a guest-allocated APIC backing page of size 4K, which
> maintains APIC state for that vCPU. The x2APIC MSRs are mapped at
> their corresposing x2APIC MMIO offset within the guest APIC backing
> page. All x2APIC accesses by guest or Secure AVIC hardware operate
> on this backing page. The backing page should be pinned and NPT entry
> for it should be always mapped while the corresponding vCPU is running.
>
>
> MSR Accesses
> ------------
> Secure AVIC only supports x2APIC MSR accesses. xAPIC MMIO offset based
> accesses are not supported.
>
> Some of the MSR accesses such as ICR writes (with shorthand equal to
> self), SELF_IPI, EOI, TPR writes are accelerated by Secure AVIC
> hardware. Other MSR accesses generate a #VC exception. The #VC
> exception handler reads/writes to the guest APIC backing page.
> As guest APIC backing page is accessible to the guest, the Secure
> AVIC driver code optimizes APIC register access by directly
> reading/writing to the guest APIC backing page (instead of taking
> the #VC exception route).
>
> In addition to the architected MSRs, following new fields are added to
> the guest APIC backing page which can be modified directly by the
> guest:
>
> a. ALLOWED_IRR
>
> ALLOWED_IRR vector indicates the interrupt vectors which the guest
> allows the hypervisor to send. The combination of host-controlled
> REQUESTED_IRR vectors (part of VMCB) and ALLOWED_IRR is used by
> hardware to update the IRR vectors of the Guest APIC backing page.
>
> #Offset #bits Description
> 204h 31:0 Guest allowed vectors 0-31
> 214h 31:0 Guest allowed vectors 32-63
> ...
> 274h 31:0 Guest allowed vectors 224-255
>
> ALLOWED_IRR is meant to be used specifically for vectors that the
> hypervisor is allowed to inject, such as device interrupts. Interrupt
> vectors used exclusively by the guest itself (like IPI vectors) should
> not be allowed to be injected into the guest for security reasons.
>
> b. NMI Request
>
> #Offset #bits Description
> 278h 0 Set by Guest to request Virtual NMI
>
>
> LAPIC Timer Support
> -------------------
> LAPIC timer is emulated by hypervisor. So, APIC_LVTT, APIC_TMICT and
> APIC_TDCR, APIC_TMCCT APIC registers are not read/written to the guest
> APIC backing page and are communicated to the hypervisor using SVM_EXIT_MSR
> VMGEXIT.
>
> IPI Support
> -----------
> Only SELF_IPI is accelerated by Secure AVIC hardware. Other IPIs require
> writing (from the Secure AVIC driver) to the IRR vector of the target CPU
> backing page and then issuing VMGEXIT for the hypervisor to notify the
> target vCPU.
>
> Driver Implementation Open Points
> ---------------------------------
>
> The Secure AVIC driver only supports physical destination mode. If
> logical destination mode need to be supported, then a separate x2apic
> driver would be required for supporting logical destination mode.
>
> Setting of ALLOWED_IRR vectors is done from vector.c for IOAPIC and MSI
> interrupts. ALLOWED_IRR vector is not cleared when an interrupt vector
> migrates to different CPU. Using a cleaner approach to manage and
> configure allowed vectors needs more work.
>
>
> Testing
> -------
>
> This series is based on top of commit 196145c606d0 "Merge
> tag 'clk-fixes-for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux."
>
> Host Secure AVIC support patch series is at [1].
>
> Following tests are done:
>
> 1) Boot to Prompt using initramfs and ubuntu fs.
> 2) Verified timer and IPI as part of the guest bootup.
> 3) Verified long run SCF TORTURE IPI test.
> 4) Verified FIO test with NVME passthrough.
One case that is missing is kexec.
If the first kernel set ALLOWED_IRR, but the target kernel doesn't know
anything about Secure AVIC, there are going to be a problem I assume.
I think we need ->setup() counterpart (->teardown() ?) to get
configuration back to the boot state. And get it called from kexec path.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 00/14] AMD: Add Secure AVIC Guest Support
2024-10-17 8:23 ` [RFC 00/14] AMD: Add Secure AVIC Guest Support Kirill A. Shutemov
@ 2024-10-18 2:33 ` Neeraj Upadhyay
2024-10-18 7:54 ` Kirill A. Shutemov
0 siblings, 1 reply; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-10-18 2:33 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, bp,
David.Kaplan, x86, hpa, peterz, seanjc, pbonzini, kvm
Hi Kirill,
On 10/17/2024 1:53 PM, Kirill A. Shutemov wrote:
> On Fri, Sep 13, 2024 at 05:06:51PM +0530, Neeraj Upadhyay wrote:
>> Introduction
>> ------------
>>
>> Secure AVIC is a new hardware feature in the AMD64 architecture to
>> allow SEV-SNP guests to prevent hypervisor from generating unexpected
>> interrupts to a vCPU or otherwise violate architectural assumptions
>> around APIC behavior.
>>
>> One of the significant differences from AVIC or emulated x2APIC is that
>> Secure AVIC uses a guest-owned and managed APIC backing page. It also
>> introduces additional fields in both the VMCB and the Secure AVIC backing
>> page to aid the guest in limiting which interrupt vectors can be injected
>> into the guest.
>>
>> Guest APIC Backing Page
>> -----------------------
>> Each vCPU has a guest-allocated APIC backing page of size 4K, which
>> maintains APIC state for that vCPU. The x2APIC MSRs are mapped at
>> their corresposing x2APIC MMIO offset within the guest APIC backing
>> page. All x2APIC accesses by guest or Secure AVIC hardware operate
>> on this backing page. The backing page should be pinned and NPT entry
>> for it should be always mapped while the corresponding vCPU is running.
>>
>>
>> MSR Accesses
>> ------------
>> Secure AVIC only supports x2APIC MSR accesses. xAPIC MMIO offset based
>> accesses are not supported.
>>
>> Some of the MSR accesses such as ICR writes (with shorthand equal to
>> self), SELF_IPI, EOI, TPR writes are accelerated by Secure AVIC
>> hardware. Other MSR accesses generate a #VC exception. The #VC
>> exception handler reads/writes to the guest APIC backing page.
>> As guest APIC backing page is accessible to the guest, the Secure
>> AVIC driver code optimizes APIC register access by directly
>> reading/writing to the guest APIC backing page (instead of taking
>> the #VC exception route).
>>
>> In addition to the architected MSRs, following new fields are added to
>> the guest APIC backing page which can be modified directly by the
>> guest:
>>
>> a. ALLOWED_IRR
>>
>> ALLOWED_IRR vector indicates the interrupt vectors which the guest
>> allows the hypervisor to send. The combination of host-controlled
>> REQUESTED_IRR vectors (part of VMCB) and ALLOWED_IRR is used by
>> hardware to update the IRR vectors of the Guest APIC backing page.
>>
>> #Offset #bits Description
>> 204h 31:0 Guest allowed vectors 0-31
>> 214h 31:0 Guest allowed vectors 32-63
>> ...
>> 274h 31:0 Guest allowed vectors 224-255
>>
>> ALLOWED_IRR is meant to be used specifically for vectors that the
>> hypervisor is allowed to inject, such as device interrupts. Interrupt
>> vectors used exclusively by the guest itself (like IPI vectors) should
>> not be allowed to be injected into the guest for security reasons.
>>
>> b. NMI Request
>>
>> #Offset #bits Description
>> 278h 0 Set by Guest to request Virtual NMI
>>
>>
>> LAPIC Timer Support
>> -------------------
>> LAPIC timer is emulated by hypervisor. So, APIC_LVTT, APIC_TMICT and
>> APIC_TDCR, APIC_TMCCT APIC registers are not read/written to the guest
>> APIC backing page and are communicated to the hypervisor using SVM_EXIT_MSR
>> VMGEXIT.
>>
>> IPI Support
>> -----------
>> Only SELF_IPI is accelerated by Secure AVIC hardware. Other IPIs require
>> writing (from the Secure AVIC driver) to the IRR vector of the target CPU
>> backing page and then issuing VMGEXIT for the hypervisor to notify the
>> target vCPU.
>>
>> Driver Implementation Open Points
>> ---------------------------------
>>
>> The Secure AVIC driver only supports physical destination mode. If
>> logical destination mode need to be supported, then a separate x2apic
>> driver would be required for supporting logical destination mode.
>>
>> Setting of ALLOWED_IRR vectors is done from vector.c for IOAPIC and MSI
>> interrupts. ALLOWED_IRR vector is not cleared when an interrupt vector
>> migrates to different CPU. Using a cleaner approach to manage and
>> configure allowed vectors needs more work.
>>
>>
>> Testing
>> -------
>>
>> This series is based on top of commit 196145c606d0 "Merge
>> tag 'clk-fixes-for-linus' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux."
>>
>> Host Secure AVIC support patch series is at [1].
>>
>> Following tests are done:
>>
>> 1) Boot to Prompt using initramfs and ubuntu fs.
>> 2) Verified timer and IPI as part of the guest bootup.
>> 3) Verified long run SCF TORTURE IPI test.
>> 4) Verified FIO test with NVME passthrough.
>
> One case that is missing is kexec.
>
> If the first kernel set ALLOWED_IRR, but the target kernel doesn't know
> anything about Secure AVIC, there are going to be a problem I assume.
>
> I think we need ->setup() counterpart (->teardown() ?) to get
> configuration back to the boot state. And get it called from kexec path.
>
Agree, I haven't fully investigated the changes required to support kexec.
Yes, teardown step might be required to disable Secure AVIC in control msr
and possibly resetting other Secure AVIC configuration.
Thanks for pointing it out! I will update the details with kexec support
being missing in this series.
- Neeraj
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 00/14] AMD: Add Secure AVIC Guest Support
2024-10-18 2:33 ` Neeraj Upadhyay
@ 2024-10-18 7:54 ` Kirill A. Shutemov
2024-10-29 9:47 ` Borislav Petkov
0 siblings, 1 reply; 81+ messages in thread
From: Kirill A. Shutemov @ 2024-10-18 7:54 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, bp,
David.Kaplan, x86, hpa, peterz, seanjc, pbonzini, kvm
On Fri, Oct 18, 2024 at 08:03:20AM +0530, Neeraj Upadhyay wrote:
> Hi Kirill,
>
> On 10/17/2024 1:53 PM, Kirill A. Shutemov wrote:
> > On Fri, Sep 13, 2024 at 05:06:51PM +0530, Neeraj Upadhyay wrote:
> >> Introduction
> >> ------------
> >>
> >> Secure AVIC is a new hardware feature in the AMD64 architecture to
> >> allow SEV-SNP guests to prevent hypervisor from generating unexpected
> >> interrupts to a vCPU or otherwise violate architectural assumptions
> >> around APIC behavior.
> >>
> >> One of the significant differences from AVIC or emulated x2APIC is that
> >> Secure AVIC uses a guest-owned and managed APIC backing page. It also
> >> introduces additional fields in both the VMCB and the Secure AVIC backing
> >> page to aid the guest in limiting which interrupt vectors can be injected
> >> into the guest.
> >>
> >> Guest APIC Backing Page
> >> -----------------------
> >> Each vCPU has a guest-allocated APIC backing page of size 4K, which
> >> maintains APIC state for that vCPU. The x2APIC MSRs are mapped at
> >> their corresposing x2APIC MMIO offset within the guest APIC backing
> >> page. All x2APIC accesses by guest or Secure AVIC hardware operate
> >> on this backing page. The backing page should be pinned and NPT entry
> >> for it should be always mapped while the corresponding vCPU is running.
> >>
> >>
> >> MSR Accesses
> >> ------------
> >> Secure AVIC only supports x2APIC MSR accesses. xAPIC MMIO offset based
> >> accesses are not supported.
> >>
> >> Some of the MSR accesses such as ICR writes (with shorthand equal to
> >> self), SELF_IPI, EOI, TPR writes are accelerated by Secure AVIC
> >> hardware. Other MSR accesses generate a #VC exception. The #VC
> >> exception handler reads/writes to the guest APIC backing page.
> >> As guest APIC backing page is accessible to the guest, the Secure
> >> AVIC driver code optimizes APIC register access by directly
> >> reading/writing to the guest APIC backing page (instead of taking
> >> the #VC exception route).
> >>
> >> In addition to the architected MSRs, following new fields are added to
> >> the guest APIC backing page which can be modified directly by the
> >> guest:
> >>
> >> a. ALLOWED_IRR
> >>
> >> ALLOWED_IRR vector indicates the interrupt vectors which the guest
> >> allows the hypervisor to send. The combination of host-controlled
> >> REQUESTED_IRR vectors (part of VMCB) and ALLOWED_IRR is used by
> >> hardware to update the IRR vectors of the Guest APIC backing page.
> >>
> >> #Offset #bits Description
> >> 204h 31:0 Guest allowed vectors 0-31
> >> 214h 31:0 Guest allowed vectors 32-63
> >> ...
> >> 274h 31:0 Guest allowed vectors 224-255
> >>
> >> ALLOWED_IRR is meant to be used specifically for vectors that the
> >> hypervisor is allowed to inject, such as device interrupts. Interrupt
> >> vectors used exclusively by the guest itself (like IPI vectors) should
> >> not be allowed to be injected into the guest for security reasons.
> >>
> >> b. NMI Request
> >>
> >> #Offset #bits Description
> >> 278h 0 Set by Guest to request Virtual NMI
> >>
> >>
> >> LAPIC Timer Support
> >> -------------------
> >> LAPIC timer is emulated by hypervisor. So, APIC_LVTT, APIC_TMICT and
> >> APIC_TDCR, APIC_TMCCT APIC registers are not read/written to the guest
> >> APIC backing page and are communicated to the hypervisor using SVM_EXIT_MSR
> >> VMGEXIT.
> >>
> >> IPI Support
> >> -----------
> >> Only SELF_IPI is accelerated by Secure AVIC hardware. Other IPIs require
> >> writing (from the Secure AVIC driver) to the IRR vector of the target CPU
> >> backing page and then issuing VMGEXIT for the hypervisor to notify the
> >> target vCPU.
> >>
> >> Driver Implementation Open Points
> >> ---------------------------------
> >>
> >> The Secure AVIC driver only supports physical destination mode. If
> >> logical destination mode need to be supported, then a separate x2apic
> >> driver would be required for supporting logical destination mode.
> >>
> >> Setting of ALLOWED_IRR vectors is done from vector.c for IOAPIC and MSI
> >> interrupts. ALLOWED_IRR vector is not cleared when an interrupt vector
> >> migrates to different CPU. Using a cleaner approach to manage and
> >> configure allowed vectors needs more work.
> >>
> >>
> >> Testing
> >> -------
> >>
> >> This series is based on top of commit 196145c606d0 "Merge
> >> tag 'clk-fixes-for-linus' of
> >> git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux."
> >>
> >> Host Secure AVIC support patch series is at [1].
> >>
> >> Following tests are done:
> >>
> >> 1) Boot to Prompt using initramfs and ubuntu fs.
> >> 2) Verified timer and IPI as part of the guest bootup.
> >> 3) Verified long run SCF TORTURE IPI test.
> >> 4) Verified FIO test with NVME passthrough.
> >
> > One case that is missing is kexec.
> >
> > If the first kernel set ALLOWED_IRR, but the target kernel doesn't know
> > anything about Secure AVIC, there are going to be a problem I assume.
> >
> > I think we need ->setup() counterpart (->teardown() ?) to get
> > configuration back to the boot state. And get it called from kexec path.
> >
>
> Agree, I haven't fully investigated the changes required to support kexec.
> Yes, teardown step might be required to disable Secure AVIC in control msr
> and possibly resetting other Secure AVIC configuration.
>
> Thanks for pointing it out! I will update the details with kexec support
> being missing in this series.
I think it has to be addressed before it got merged. Or we will get a
regression.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 02/14] x86/apic: Initialize Secure AVIC APIC backing page
2024-10-09 17:52 ` Neeraj Upadhyay
@ 2024-10-23 16:30 ` Borislav Petkov
2024-10-24 4:01 ` Neeraj Upadhyay
0 siblings, 1 reply; 81+ messages in thread
From: Borislav Petkov @ 2024-10-23 16:30 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: Dave Hansen, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On Wed, Oct 09, 2024 at 11:22:58PM +0530, Neeraj Upadhyay wrote:
> I will start with 4K. For later, I will get the performance numbers to propose
> a change in allocation scheme - for ex, allocating a bigger contiguous
> batch from the total allocation required for backing pages (num_possible_cpus() * 4K)
> without doing 2M reservation.
Why does performance matter here if you're going to allocate simply a 4K page
per vCPU and set them all up in the APIC setup path? And then you can do the
page conversion to guest-owned as part of the guest vCPU init path?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 02/14] x86/apic: Initialize Secure AVIC APIC backing page
2024-09-13 11:36 ` [RFC 02/14] x86/apic: Initialize Secure AVIC APIC backing page Neeraj Upadhyay
2024-10-09 15:27 ` Dave Hansen
@ 2024-10-23 16:36 ` Borislav Petkov
2024-10-24 3:24 ` Neeraj Upadhyay
1 sibling, 1 reply; 81+ messages in thread
From: Borislav Petkov @ 2024-10-23 16:36 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
On Fri, Sep 13, 2024 at 05:06:53PM +0530, Neeraj Upadhyay wrote:
> @@ -61,8 +65,30 @@ static void x2apic_savic_send_IPI_mask_allbutself(const struct cpumask *mask, in
> __send_IPI_mask(mask, vector, APIC_DEST_ALLBUT);
> }
>
> +static void x2apic_savic_setup(void)
> +{
> + void *backing_page;
> + enum es_result ret;
> + unsigned long gpa;
> +
> + if (this_cpu_read(savic_setup_done))
I'm sure you can get rid of that silly bool. Like check the apic_backing_page
pointer instead and use that pointer to verify whether the per-CPU setup has
been done successfully.
> + return;
> +
> + backing_page = this_cpu_read(apic_backing_page);
> + gpa = __pa(backing_page);
> + ret = sev_notify_savic_gpa(gpa);
> + if (ret != ES_OK)
> + snp_abort();
> + this_cpu_write(savic_setup_done, true);
> +}
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 02/14] x86/apic: Initialize Secure AVIC APIC backing page
2024-10-23 16:36 ` Borislav Petkov
@ 2024-10-24 3:24 ` Neeraj Upadhyay
0 siblings, 0 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-10-24 3:24 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
On 10/23/2024 10:06 PM, Borislav Petkov wrote:
> On Fri, Sep 13, 2024 at 05:06:53PM +0530, Neeraj Upadhyay wrote:
>> @@ -61,8 +65,30 @@ static void x2apic_savic_send_IPI_mask_allbutself(const struct cpumask *mask, in
>> __send_IPI_mask(mask, vector, APIC_DEST_ALLBUT);
>> }
>>
>> +static void x2apic_savic_setup(void)
>> +{
>> + void *backing_page;
>> + enum es_result ret;
>> + unsigned long gpa;
>> +
>> + if (this_cpu_read(savic_setup_done))
>
> I'm sure you can get rid of that silly bool. Like check the apic_backing_page
> pointer instead and use that pointer to verify whether the per-CPU setup has
> been done successfully.
>
Ok agree. In this patch version, APIC backing page allocation for all CPUs is done in
x2apic_savic_probe(). This is done to group the allocation together, so that these
backing pages are mapped using single 2M NPT and RMP entry.
I will move the APIC backing page setup to per-CPU setup (x2apic_savic_setup()) and
use that pointer to do the check.
- Neeraj
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 02/14] x86/apic: Initialize Secure AVIC APIC backing page
2024-10-23 16:30 ` Borislav Petkov
@ 2024-10-24 4:01 ` Neeraj Upadhyay
2024-10-24 11:49 ` Borislav Petkov
0 siblings, 1 reply; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-10-24 4:01 UTC (permalink / raw)
To: Borislav Petkov
Cc: Dave Hansen, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On 10/23/2024 10:00 PM, Borislav Petkov wrote:
> On Wed, Oct 09, 2024 at 11:22:58PM +0530, Neeraj Upadhyay wrote:
>> I will start with 4K. For later, I will get the performance numbers to propose
>> a change in allocation scheme - for ex, allocating a bigger contiguous
>> batch from the total allocation required for backing pages (num_possible_cpus() * 4K)
>> without doing 2M reservation.
>
> Why does performance matter here if you're going to allocate simply a 4K page
> per vCPU and set them all up in the APIC setup path? And then you can do the
> page conversion to guest-owned as part of the guest vCPU init path?
>
Please let me know if I didn't understand your questions correctly. The performance
concerns here are w.r.t. these backing page allocations being part of a single
hugepage.
Grouping of allocation together allows these pages to be part of the same 2M NPT
and RMP table entry, which can provide better performance compared to having
separate 4K entries for each backing page. For example, to send IPI to target CPUs,
->send_IPI callback (executing on source CPU) in Secure AVIC driver writes to the
backing page of target CPU. Having these backing pages as part of the single
2M entry could provide better caching of the translation and require single entry
in TLB at the source CPU.
- Neeraj
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 02/14] x86/apic: Initialize Secure AVIC APIC backing page
2024-10-24 4:01 ` Neeraj Upadhyay
@ 2024-10-24 11:49 ` Borislav Petkov
2024-10-24 12:31 ` Neeraj Upadhyay
0 siblings, 1 reply; 81+ messages in thread
From: Borislav Petkov @ 2024-10-24 11:49 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: Dave Hansen, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On Thu, Oct 24, 2024 at 09:31:01AM +0530, Neeraj Upadhyay wrote:
> Please let me know if I didn't understand your questions correctly. The performance
> concerns here are w.r.t. these backing page allocations being part of a single
> hugepage.
>
> Grouping of allocation together allows these pages to be part of the same 2M NPT
> and RMP table entry, which can provide better performance compared to having
> separate 4K entries for each backing page. For example, to send IPI to target CPUs,
> ->send_IPI callback (executing on source CPU) in Secure AVIC driver writes to the
> backing page of target CPU. Having these backing pages as part of the single
> 2M entry could provide better caching of the translation and require single entry
> in TLB at the source CPU.
Lemme see if I understand you correctly: you want a single 2M page to contain
*all* backing pages so that when the HV wants to send IPIs etc, the first vCPU
will load the page translation into the TLB and the following ones will have
it already?
Versus having separate 4K pages which would mean that everytime a vCPU's backing
page is needed, every vCPU would have to do a TLB walk and pull it in so that
the mapping's there?
Am I close?
If so, what's the problem with loading that backing page each time you VMRUN
the vCPU?
IOW, how noticeable would that be?
And what guarantees that the 2M page containing the backing pages would always
remain in the TLB?
Hmmm.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 02/14] x86/apic: Initialize Secure AVIC APIC backing page
2024-10-24 11:49 ` Borislav Petkov
@ 2024-10-24 12:31 ` Neeraj Upadhyay
2024-10-24 12:59 ` Borislav Petkov
0 siblings, 1 reply; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-10-24 12:31 UTC (permalink / raw)
To: Borislav Petkov
Cc: Dave Hansen, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On 10/24/2024 5:19 PM, Borislav Petkov wrote:
> On Thu, Oct 24, 2024 at 09:31:01AM +0530, Neeraj Upadhyay wrote:
>> Please let me know if I didn't understand your questions correctly. The performance
>> concerns here are w.r.t. these backing page allocations being part of a single
>> hugepage.
>>
>> Grouping of allocation together allows these pages to be part of the same 2M NPT
>> and RMP table entry, which can provide better performance compared to having
>> separate 4K entries for each backing page. For example, to send IPI to target CPUs,
>> ->send_IPI callback (executing on source CPU) in Secure AVIC driver writes to the
>> backing page of target CPU. Having these backing pages as part of the single
>> 2M entry could provide better caching of the translation and require single entry
>> in TLB at the source CPU.
>
> Lemme see if I understand you correctly: you want a single 2M page to contain
> *all* backing pages so that when the HV wants to send IPIs etc, the first vCPU
With Secure AVIC enabled, source vCPU directly writes to the Interrupt Request Register
(IRR) offset in the target CPU's backing page. So, the IPI is directly requested in
target vCPU's backing page by source vCPU context and not by HV.
> will load the page translation into the TLB and the following ones will have
> it already?
>
Yes, but the following ones will be already present in source vCPU's CPU TLB.
> Versus having separate 4K pages which would mean that everytime a vCPU's backing
> page is needed, every vCPU would have to do a TLB walk and pull it in so that
> the mapping's there?
>
The walk is done by source CPU here, as it is the one which is writing to the
backing page of target vCPUs.
> Am I close?
>
I have clarified some parts above. Basically, source vCPU is directly writing to
remote backing pages.
> If so, what's the problem with loading that backing page each time you VMRUN
> the vCPU?
>
As I clarified above, it's the source vCPU which need to load each backing page.
> IOW, how noticeable would that be?
>
I don't have the data at this point. That is the reason I will send this contiguous
allocation as a separate patch (if required) when I can get data on some workloads
which are impacted by this.
> And what guarantees that the 2M page containing the backing pages would always
> remain in the TLB?
>
For smp_call_function_many(), where a source CPU sends IPI to multiple CPUs,
source CPU writes to backing pages of different target CPUs within this function.
So, accesses have temporal locality. For other use cases, I need to enable
perf with Secure AVIC to collect the TLB misses on a IPI benchmark and get
back with the numbers.
- Neeraj
> Hmmm.
>
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 02/14] x86/apic: Initialize Secure AVIC APIC backing page
2024-10-24 12:31 ` Neeraj Upadhyay
@ 2024-10-24 12:59 ` Borislav Petkov
0 siblings, 0 replies; 81+ messages in thread
From: Borislav Petkov @ 2024-10-24 12:59 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: Dave Hansen, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On Thu, Oct 24, 2024 at 06:01:16PM +0530, Neeraj Upadhyay wrote:
> With Secure AVIC enabled, source vCPU directly writes to the Interrupt
> Request Register (IRR) offset in the target CPU's backing page. So, the IPI
> is directly requested in target vCPU's backing page by source vCPU context
> and not by HV.
So the source vCPU will fault in the target vCPU's backing page if it is not
there anymore. And if it is part of a 2M translation, the likelihood that it
is there is higher.
> As I clarified above, it's the source vCPU which need to load each backing
> page.
So if we have 4K backing pages, the source vCPU will fault-in the target's
respective backing page into its TLB and send the IPI. And if it is an IPI to
multiple vCPUs, then it will have to fault in each vCPU's backing page in
succession.
However, when the target vCPU gets to VMRUN, the backing page will have to be
faulted in into the target vCPU's TLB too.
And this is the same with a 2M backing page - the target vCPUs will have to
fault that 2M page translation too.
But then if the target vCPU wants to send IPIs itself, the 2M backing pages
will be there already. Hmmm.
> I don't have the data at this point. That is the reason I will send this
> contiguous allocation as a separate patch (if required) when I can get data
> on some workloads which are impacted by this.
Yes, that would clarify whether something more involved than simply using 4K
pages is needed.
> For smp_call_function_many(), where a source CPU sends IPI to multiple CPUs,
> source CPU writes to backing pages of different target CPUs within this function.
> So, accesses have temporal locality. For other use cases, I need to enable
> perf with Secure AVIC to collect the TLB misses on a IPI benchmark and get
> back with the numbers.
Right, I can see some TLB walks getting avoided if you have a single 2M page
but without actually measuring it, I don't know. If I had to venture a guess,
it probably won't show any difference but who knows...
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 00/14] AMD: Add Secure AVIC Guest Support
2024-10-18 7:54 ` Kirill A. Shutemov
@ 2024-10-29 9:47 ` Borislav Petkov
2024-10-29 10:24 ` Neeraj Upadhyay
0 siblings, 1 reply; 81+ messages in thread
From: Borislav Petkov @ 2024-10-29 9:47 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Neeraj Upadhyay, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On Fri, Oct 18, 2024 at 10:54:21AM +0300, Kirill A. Shutemov wrote:
> I think it has to be addressed before it got merged. Or we will get a
> regression.
... or temporarily disable kexec when SAVIC is present.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 00/14] AMD: Add Secure AVIC Guest Support
2024-10-29 9:47 ` Borislav Petkov
@ 2024-10-29 10:24 ` Neeraj Upadhyay
2024-10-29 10:54 ` Borislav Petkov
2024-10-29 11:51 ` Kirill A. Shutemov
0 siblings, 2 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-10-29 10:24 UTC (permalink / raw)
To: Borislav Petkov, Kirill A. Shutemov
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
On 10/29/2024 3:17 PM, Borislav Petkov wrote:
> On Fri, Oct 18, 2024 at 10:54:21AM +0300, Kirill A. Shutemov wrote:
>> I think it has to be addressed before it got merged. Or we will get a
>> regression.
>
> ... or temporarily disable kexec when SAVIC is present.
>
Thanks! I plan to do something like below patch for the next version.
Verified Secure AVIC guest kexec with this.
- Neeraj
-----------------------------------------------------------------------
From 80a4901644fa8a9ed2c6f690fbba4b8a6176b215 Mon Sep 17 00:00:00 2001
From: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
Date: Tue, 29 Oct 2024 15:38:21 +0530
Subject: [RFC 15/14] x86/apic: Add kexec support for Secure AVIC
Add a ->teardown callback to disable Secure AVIC before
rebooting into the new kernel. This ensures that the new
kernel does not access the old APIC backing page which was
allocated by the previous kernel. This can happen if there
are any APIC accesses done during guest boot before Secure
AVIC driver probe is done by the new kernel (as Secure AVIC
remained enabled in control msr).
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
This is dependent on SNP guest supports patches [1]
[1] https://lore.kernel.org/lkml/cover.1722520012.git.ashish.kalra@amd.com/
arch/x86/include/asm/apic.h | 1 +
arch/x86/kernel/apic/apic.c | 3 +++
arch/x86/kernel/apic/x2apic_savic.c | 7 +++++++
3 files changed, 11 insertions(+)
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 2d5400372470..ec332afd0277 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -303,6 +303,7 @@ struct apic {
/* Probe, setup and smpboot functions */
int (*probe)(void);
void (*setup)(void);
+ void (*teardown)(void);
int (*acpi_madt_oem_check)(char *oem_id, char *oem_table_id);
void (*init_apic_ldr)(void);
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index aeda74bf15e6..08156ac4ec6c 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1163,6 +1163,9 @@ void disable_local_APIC(void)
if (!apic_accessible())
return;
+ if (apic->teardown)
+ apic->teardown();
+
apic_soft_disable();
#ifdef CONFIG_X86_32
diff --git a/arch/x86/kernel/apic/x2apic_savic.c b/arch/x86/kernel/apic/x2apic_savic.c
index a3f0ddc6b5b6..bb7a28f9646a 100644
--- a/arch/x86/kernel/apic/x2apic_savic.c
+++ b/arch/x86/kernel/apic/x2apic_savic.c
@@ -391,6 +391,12 @@ static void init_backing_page(void *backing_page)
set_reg(backing_page, APIC_ID, apic_id);
}
+static void x2apic_savic_teardown(void)
+{
+ /* Disable Secure AVIC */
+ native_wrmsr(MSR_AMD64_SECURE_AVIC_CONTROL, 0, 0);
+}
+
static void x2apic_savic_setup(void)
{
void *backing_page;
@@ -447,6 +453,7 @@ static struct apic apic_x2apic_savic __ro_after_init = {
.probe = x2apic_savic_probe,
.acpi_madt_oem_check = x2apic_savic_acpi_madt_oem_check,
.setup = x2apic_savic_setup,
+ .teardown = x2apic_savic_teardown,
.dest_mode_logical = false,
--
^ permalink raw reply related [flat|nested] 81+ messages in thread
* Re: [RFC 00/14] AMD: Add Secure AVIC Guest Support
2024-10-29 10:24 ` Neeraj Upadhyay
@ 2024-10-29 10:54 ` Borislav Petkov
2024-10-29 11:51 ` Kirill A. Shutemov
1 sibling, 0 replies; 81+ messages in thread
From: Borislav Petkov @ 2024-10-29 10:54 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: Kirill A. Shutemov, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On Tue, Oct 29, 2024 at 03:54:24PM +0530, Neeraj Upadhyay wrote:
> Thanks! I plan to do something like below patch for the next version.
> Verified Secure AVIC guest kexec with this.
Sure, if you're adding a ->setup anyway, then it better have a counterpart.
:-)
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 00/14] AMD: Add Secure AVIC Guest Support
2024-10-29 10:24 ` Neeraj Upadhyay
2024-10-29 10:54 ` Borislav Petkov
@ 2024-10-29 11:51 ` Kirill A. Shutemov
2024-10-29 12:15 ` Neeraj Upadhyay
1 sibling, 1 reply; 81+ messages in thread
From: Kirill A. Shutemov @ 2024-10-29 11:51 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: Borislav Petkov, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On Tue, Oct 29, 2024 at 03:54:24PM +0530, Neeraj Upadhyay wrote:
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index aeda74bf15e6..08156ac4ec6c 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -1163,6 +1163,9 @@ void disable_local_APIC(void)
> if (!apic_accessible())
> return;
>
> + if (apic->teardown)
> + apic->teardown();
> +
> apic_soft_disable();
>
> #ifdef CONFIG_X86_32
Hm. I think it will call apic->teardown() for all but the one CPU that
does kexec. I believe we need to disable SAVIC for all CPUs.
Have you tested the case when the target kernel doesn't support SAVIC and
tries to use a new interrupt vector on the boot CPU? I think it will
break.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 00/14] AMD: Add Secure AVIC Guest Support
2024-10-29 11:51 ` Kirill A. Shutemov
@ 2024-10-29 12:15 ` Neeraj Upadhyay
2024-10-29 14:36 ` Kirill A. Shutemov
0 siblings, 1 reply; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-10-29 12:15 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Borislav Petkov, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On 10/29/2024 5:21 PM, Kirill A. Shutemov wrote:
> On Tue, Oct 29, 2024 at 03:54:24PM +0530, Neeraj Upadhyay wrote:
>> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
>> index aeda74bf15e6..08156ac4ec6c 100644
>> --- a/arch/x86/kernel/apic/apic.c
>> +++ b/arch/x86/kernel/apic/apic.c
>> @@ -1163,6 +1163,9 @@ void disable_local_APIC(void)
>> if (!apic_accessible())
>> return;
>>
>> + if (apic->teardown)
>> + apic->teardown();
>> +
>> apic_soft_disable();
>>
>> #ifdef CONFIG_X86_32
>
> Hm. I think it will call apic->teardown() for all but the one CPU that
> does kexec. I believe we need to disable SAVIC for all CPUs.
>
I see it being called for all CPUs.
For the CPU doing kexec, I see below backtrace, which lands into disable_local_APIC()
disable_local_APIC
native_stop_other_cpus
native_machine_shutdown
machine_shutdown
kernel_kexec
For the other CPUs, it is below:
disable_local_APIC
stop_this_cpu
__sysvec_reboot
sysvec_reboot
> Have you tested the case when the target kernel doesn't support SAVIC and
> tries to use a new interrupt vector on the boot CPU? I think it will
> break.
>
For a VM launched with VMSA feature containing Secure AVIC, the target
kernel also is required to support Secure AVIC. Otherwise, guest bootup
would fail. I will capture this information in the documentation.
So, as far as I understand, SAVIC kernel kexecing into a non-SAVIC kernel
is not a valid use case.
- Neeraj
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 00/14] AMD: Add Secure AVIC Guest Support
2024-10-29 12:15 ` Neeraj Upadhyay
@ 2024-10-29 14:36 ` Kirill A. Shutemov
2024-10-29 15:28 ` Neeraj Upadhyay
0 siblings, 1 reply; 81+ messages in thread
From: Kirill A. Shutemov @ 2024-10-29 14:36 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: Borislav Petkov, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On Tue, Oct 29, 2024 at 05:45:23PM +0530, Neeraj Upadhyay wrote:
>
>
> On 10/29/2024 5:21 PM, Kirill A. Shutemov wrote:
> > On Tue, Oct 29, 2024 at 03:54:24PM +0530, Neeraj Upadhyay wrote:
> >> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> >> index aeda74bf15e6..08156ac4ec6c 100644
> >> --- a/arch/x86/kernel/apic/apic.c
> >> +++ b/arch/x86/kernel/apic/apic.c
> >> @@ -1163,6 +1163,9 @@ void disable_local_APIC(void)
> >> if (!apic_accessible())
> >> return;
> >>
> >> + if (apic->teardown)
> >> + apic->teardown();
> >> +
> >> apic_soft_disable();
> >>
> >> #ifdef CONFIG_X86_32
> >
> > Hm. I think it will call apic->teardown() for all but the one CPU that
> > does kexec. I believe we need to disable SAVIC for all CPUs.
> >
>
> I see it being called for all CPUs.
>
> For the CPU doing kexec, I see below backtrace, which lands into disable_local_APIC()
>
> disable_local_APIC
> native_stop_other_cpus
> native_machine_shutdown
> machine_shutdown
> kernel_kexec
>
> For the other CPUs, it is below:
>
> disable_local_APIC
> stop_this_cpu
> __sysvec_reboot
> sysvec_reboot
Backtraces are backwards, but, yeah, I missed reboot path.
> > Have you tested the case when the target kernel doesn't support SAVIC and
> > tries to use a new interrupt vector on the boot CPU? I think it will
> > break.
> >
>
> For a VM launched with VMSA feature containing Secure AVIC, the target
> kernel also is required to support Secure AVIC. Otherwise, guest bootup
> would fail. I will capture this information in the documentation.
> So, as far as I understand, SAVIC kernel kexecing into a non-SAVIC kernel
> is not a valid use case.
Hm. I thought if SAVIC is not enabled by the guest the guest would boot
without the secure feature, no?
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 00/14] AMD: Add Secure AVIC Guest Support
2024-10-29 14:36 ` Kirill A. Shutemov
@ 2024-10-29 15:28 ` Neeraj Upadhyay
0 siblings, 0 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-10-29 15:28 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Borislav Petkov, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
>>> Have you tested the case when the target kernel doesn't support SAVIC and
>>> tries to use a new interrupt vector on the boot CPU? I think it will
>>> break.
>>>
>>
>> For a VM launched with VMSA feature containing Secure AVIC, the target
>> kernel also is required to support Secure AVIC. Otherwise, guest bootup
>> would fail. I will capture this information in the documentation.
>> So, as far as I understand, SAVIC kernel kexecing into a non-SAVIC kernel
>> is not a valid use case.
>
> Hm. I thought if SAVIC is not enabled by the guest the guest would boot
> without the secure feature, no?
>
Actually no. The guest VM which is launched by VMM with Secure AVIC enabled
would have SecureAVIC reported in sev_status MSR. Secure AVIC is part of
SNP_FEATURES_IMPL_REQ and guest boot would terminate due to snp_get_unsupported_features()
check in arch/x86/boot/compressed/sev.c if secure avic is not enabled (having said that,
I need to update config rules to select CONFIG_AMD_SECURE_AVIC if CONFIG_AMD_MEM_ENCRYPT
is enabled).
- Neeraj
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 03/14] x86/apic: Populate .read()/.write() callbacks of Secure AVIC driver
2024-09-13 11:36 ` [RFC 03/14] x86/apic: Populate .read()/.write() callbacks of Secure AVIC driver Neeraj Upadhyay
@ 2024-11-06 18:16 ` Borislav Petkov
2024-11-07 3:32 ` Neeraj Upadhyay
2024-11-06 19:20 ` Melody (Huibo) Wang
1 sibling, 1 reply; 81+ messages in thread
From: Borislav Petkov @ 2024-11-06 18:16 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
On Fri, Sep 13, 2024 at 05:06:54PM +0530, Neeraj Upadhyay wrote:
> @@ -24,6 +25,108 @@ static int x2apic_savic_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
> return x2apic_enabled() && cc_platform_has(CC_ATTR_SNP_SECURE_AVIC);
> }
>
> +static inline u32 get_reg(char *page, int reg_off)
Just "reg" like the other APICs.
> +{
> + return READ_ONCE(*((u32 *)(page + reg_off)));
> +}
> +
> +static inline void set_reg(char *page, int reg_off, u32 val)
> +{
> + WRITE_ONCE(*((u32 *)(page + reg_off)), val);
> +}
> +
> +#define SAVIC_ALLOWED_IRR_OFFSET 0x204
> +
> +static u32 x2apic_savic_read(u32 reg)
> +{
> + void *backing_page = this_cpu_read(apic_backing_page);
> +
> + switch (reg) {
> + case APIC_LVTT:
> + case APIC_TMICT:
> + case APIC_TMCCT:
> + case APIC_TDCR:
> + case APIC_ID:
> + case APIC_LVR:
> + case APIC_TASKPRI:
> + case APIC_ARBPRI:
> + case APIC_PROCPRI:
> + case APIC_LDR:
> + case APIC_SPIV:
> + case APIC_ESR:
> + case APIC_ICR:
> + case APIC_LVTTHMR:
> + case APIC_LVTPC:
> + case APIC_LVT0:
> + case APIC_LVT1:
> + case APIC_LVTERR:
> + case APIC_EFEAT:
> + case APIC_ECTRL:
> + case APIC_SEOI:
> + case APIC_IER:
I'm sure those can be turned into ranges instead of enumerating every single
APIC register...
> + case APIC_EILVTn(0) ... APIC_EILVTn(3):
Like here.
> + return get_reg(backing_page, reg);
> + case APIC_ISR ... APIC_ISR + 0x70:
> + case APIC_TMR ... APIC_TMR + 0x70:
> + WARN_ONCE(!IS_ALIGNED(reg, 16), "Reg offset %#x not aligned at 16 bytes", reg);
What's the point of a WARN...
> + return get_reg(backing_page, reg);
... and then allowing the register access anyway?
> + /* IRR and ALLOWED_IRR offset range */
> + case APIC_IRR ... APIC_IRR + 0x74:
> + /*
> + * Either aligned at 16 bytes for valid IRR reg offset or a
> + * valid Secure AVIC ALLOWED_IRR offset.
> + */
> + WARN_ONCE(!(IS_ALIGNED(reg, 16) || IS_ALIGNED(reg - SAVIC_ALLOWED_IRR_OFFSET, 16)),
> + "Misaligned IRR/ALLOWED_IRR reg offset %#x", reg);
> + return get_reg(backing_page, reg);
Ditto.
And below too.
> + default:
> + pr_err("Permission denied: read of Secure AVIC reg offset %#x\n", reg);
> + return 0;
> + }
> +}
> +
> +#define SAVIC_NMI_REQ_OFFSET 0x278
> +
> +static void x2apic_savic_write(u32 reg, u32 data)
> +{
> + void *backing_page = this_cpu_read(apic_backing_page);
> +
> + switch (reg) {
> + case APIC_LVTT:
> + case APIC_LVT0:
> + case APIC_LVT1:
> + case APIC_TMICT:
> + case APIC_TDCR:
> + case APIC_SELF_IPI:
> + /* APIC_ID is writable and configured by guest for Secure AVIC */
> + case APIC_ID:
> + case APIC_TASKPRI:
> + case APIC_EOI:
> + case APIC_SPIV:
> + case SAVIC_NMI_REQ_OFFSET:
> + case APIC_ESR:
> + case APIC_ICR:
> + case APIC_LVTTHMR:
> + case APIC_LVTPC:
> + case APIC_LVTERR:
> + case APIC_ECTRL:
> + case APIC_SEOI:
> + case APIC_IER:
> + case APIC_EILVTn(0) ... APIC_EILVTn(3):
> + set_reg(backing_page, reg, data);
> + break;
> + /* ALLOWED_IRR offsets are writable */
> + case SAVIC_ALLOWED_IRR_OFFSET ... SAVIC_ALLOWED_IRR_OFFSET + 0x70:
> + if (IS_ALIGNED(reg - SAVIC_ALLOWED_IRR_OFFSET, 16)) {
> + set_reg(backing_page, reg, data);
> + break;
> + }
> + fallthrough;
> + default:
> + pr_err("Permission denied: write to Secure AVIC reg offset %#x\n", reg);
> + }
> +}
> +
> static void x2apic_savic_send_IPI(int cpu, int vector)
> {
> u32 dest = per_cpu(x86_cpu_to_apicid, cpu);
> @@ -140,8 +243,8 @@ static struct apic apic_x2apic_savic __ro_after_init = {
> .send_IPI_self = x2apic_send_IPI_self,
> .nmi_to_offline_cpu = true,
>
> - .read = native_apic_msr_read,
> - .write = native_apic_msr_write,
> + .read = x2apic_savic_read,
> + .write = x2apic_savic_write,
> .eoi = native_apic_msr_eoi,
> .icr_read = native_x2apic_icr_read,
> .icr_write = native_x2apic_icr_write,
> --
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 03/14] x86/apic: Populate .read()/.write() callbacks of Secure AVIC driver
2024-09-13 11:36 ` [RFC 03/14] x86/apic: Populate .read()/.write() callbacks of Secure AVIC driver Neeraj Upadhyay
2024-11-06 18:16 ` Borislav Petkov
@ 2024-11-06 19:20 ` Melody (Huibo) Wang
2024-11-07 3:33 ` Neeraj Upadhyay
1 sibling, 1 reply; 81+ messages in thread
From: Melody (Huibo) Wang @ 2024-11-06 19:20 UTC (permalink / raw)
To: Neeraj Upadhyay, linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
Hi Neeraj,
On 9/13/2024 4:36 AM, Neeraj Upadhyay wrote:
> The x2APIC registers are mapped at an offset within the guest APIC
> backing page which is same as their x2APIC MMIO offset. Secure AVIC
> adds new registers such as ALLOWED_IRRs (which are at 4-byte offset
> within the IRR register offset range) and NMI_REQ to the APIC register
> space. In addition, the APIC_ID register is writable and configured by
> guest.
>
> Add read() and write() APIC callback functions to read and write x2APIC
> registers directly from the guest APIC backing page.
>
> The default .read()/.write() callbacks of x2APIC drivers perform
> a rdmsr/wrmsr of the x2APIC registers. When Secure AVIC is enabled,
> these would result in #VC exception (for non-accelerated register
> accesses). The #VC exception handler reads/write the x2APIC register
> in the guest APIC backing page. Since this would increase the latency
> of accessing x2APIC registers, the read() and write() callbacks of
> Secure AVIC driver directly reads/writes to the guest APIC backing page.
>
I think this is important non-obvious information which should be in a comment in the code
itself, not just in the commit message.
Thanks,
Melody
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 03/14] x86/apic: Populate .read()/.write() callbacks of Secure AVIC driver
2024-11-06 18:16 ` Borislav Petkov
@ 2024-11-07 3:32 ` Neeraj Upadhyay
2024-11-07 14:28 ` Borislav Petkov
0 siblings, 1 reply; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-11-07 3:32 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
On 11/6/2024 11:46 PM, Borislav Petkov wrote:
> On Fri, Sep 13, 2024 at 05:06:54PM +0530, Neeraj Upadhyay wrote:
>> @@ -24,6 +25,108 @@ static int x2apic_savic_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
>> return x2apic_enabled() && cc_platform_has(CC_ATTR_SNP_SECURE_AVIC);
>> }
>>
>> +static inline u32 get_reg(char *page, int reg_off)
>
> Just "reg" like the other APICs.
>
Ok sure.
>> +static u32 x2apic_savic_read(u32 reg)
>> +{
>> + void *backing_page = this_cpu_read(apic_backing_page);
>> +
>> + switch (reg) {
>> + case APIC_LVTT:
>> + case APIC_TMICT:
>> + case APIC_TMCCT:
>> + case APIC_TDCR:
>> + case APIC_ID:
>> + case APIC_LVR:
>> + case APIC_TASKPRI:
>> + case APIC_ARBPRI:
>> + case APIC_PROCPRI:
>> + case APIC_LDR:
>> + case APIC_SPIV:
>> + case APIC_ESR:
>> + case APIC_ICR:
>> + case APIC_LVTTHMR:
>> + case APIC_LVTPC:
>> + case APIC_LVT0:
>> + case APIC_LVT1:
>> + case APIC_LVTERR:
>> + case APIC_EFEAT:
>> + case APIC_ECTRL:
>> + case APIC_SEOI:
>> + case APIC_IER:
>
> I'm sure those can be turned into ranges instead of enumerating every single
> APIC register...
>
Below are the offset of these, as per "Table 16-6. x2APIC Register" in
APM vol2:
#Reg #offset
APIC_LVTT - 0x320
APIC_TMICT - 0x380
APIC_TMCCT - 0x390
APIC_TDCR - 0x3E0
Above timer related registers are read from HV when we reach the end of this patch
series.
APIC_ID - 20h
APIC_LVR - 30h
APIC_TASKPRI - 80h
APIC_ARBPRI - 90h
APIC_PROCPRI - A0h
APIC_LDR - D0h
APIC_SPIV - F0h
APIC_ESR - 280h
APIC_ICR - 300h
APIC_LVTTHMR - 330h
APIC_LVTPC - 340h
APIC_LVT0 - 350h
APIC_LVT1 - 360h
APIC_LVTERR - 370h
APIC_EFEAT - 0x400h
APIC_ECTRL - 0x410h
APIC_SEOI - 0x420h
APIC_IER - 0x480h
These are few registers like part of LVT (APIC_LVTTHMR ... APIC_LVTERR) ,
priority (APIC_TASKPRI ... APIC_PROCPRI), extended APIC
(APIC_EFEAT ... APIC_ECTRL) which can be grouped.
Intention of doing per reg is to be explicit about which registers
are accessed from backing page, which from hv and which are not allowed
access. As access (and their perms) are per-reg and not range-based, this
made sense to me. Also, if ranges are used, I think 16-byte aligned
checks are needed for the range. If using ranges looks more logical grouping
here, I can update it as per the above range groupings.
>> + case APIC_EILVTn(0) ... APIC_EILVTn(3):
>
> Like here.
>
As this case is for EILVT register range, these registers were grouped.
(I need to add a 16-byte alignment check here).
>> + return get_reg(backing_page, reg);
>> + case APIC_ISR ... APIC_ISR + 0x70:
>> + case APIC_TMR ... APIC_TMR + 0x70:
>> + WARN_ONCE(!IS_ALIGNED(reg, 16), "Reg offset %#x not aligned at 16 bytes", reg);
>
> What's the point of a WARN...
>
>> + return get_reg(backing_page, reg);
>
> ... and then allowing the register access anyway?
>
I will skip access for non-aligned case.
>> + /* IRR and ALLOWED_IRR offset range */
>> + case APIC_IRR ... APIC_IRR + 0x74:
>> + /*
>> + * Either aligned at 16 bytes for valid IRR reg offset or a
>> + * valid Secure AVIC ALLOWED_IRR offset.
>> + */
>> + WARN_ONCE(!(IS_ALIGNED(reg, 16) || IS_ALIGNED(reg - SAVIC_ALLOWED_IRR_OFFSET, 16)),
>> + "Misaligned IRR/ALLOWED_IRR reg offset %#x", reg);
>> + return get_reg(backing_page, reg);
>
> Ditto.
>
> And below too.
>
Same reply as above.
- Neeraj
>> + default:
>> + pr_err("Permission denied: read of Secure AVIC reg offset %#x\n", reg);
>> + return 0;
>> + }
>> +}
>> +
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 03/14] x86/apic: Populate .read()/.write() callbacks of Secure AVIC driver
2024-11-06 19:20 ` Melody (Huibo) Wang
@ 2024-11-07 3:33 ` Neeraj Upadhyay
0 siblings, 0 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-11-07 3:33 UTC (permalink / raw)
To: Melody (Huibo) Wang, linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
On 11/7/2024 12:50 AM, Melody (Huibo) Wang wrote:
> Hi Neeraj,
>
> On 9/13/2024 4:36 AM, Neeraj Upadhyay wrote:
>> The x2APIC registers are mapped at an offset within the guest APIC
>> backing page which is same as their x2APIC MMIO offset. Secure AVIC
>> adds new registers such as ALLOWED_IRRs (which are at 4-byte offset
>> within the IRR register offset range) and NMI_REQ to the APIC register
>> space. In addition, the APIC_ID register is writable and configured by
>> guest.
>>
>> Add read() and write() APIC callback functions to read and write x2APIC
>> registers directly from the guest APIC backing page.
>>
>> The default .read()/.write() callbacks of x2APIC drivers perform
>> a rdmsr/wrmsr of the x2APIC registers. When Secure AVIC is enabled,
>> these would result in #VC exception (for non-accelerated register
>> accesses). The #VC exception handler reads/write the x2APIC register
>> in the guest APIC backing page. Since this would increase the latency
>> of accessing x2APIC registers, the read() and write() callbacks of
>> Secure AVIC driver directly reads/writes to the guest APIC backing page.
>>
> I think this is important non-obvious information which should be in a comment in the code
> itself, not just in the commit message.
>
Sure, I will add some of this information in the comments. Thanks for the review!
- Neeraj
> Thanks,
> Melody
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 03/14] x86/apic: Populate .read()/.write() callbacks of Secure AVIC driver
2024-11-07 3:32 ` Neeraj Upadhyay
@ 2024-11-07 14:28 ` Borislav Petkov
2024-11-08 8:59 ` Neeraj Upadhyay
0 siblings, 1 reply; 81+ messages in thread
From: Borislav Petkov @ 2024-11-07 14:28 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
On Thu, Nov 07, 2024 at 09:02:16AM +0530, Neeraj Upadhyay wrote:
> Intention of doing per reg is to be explicit about which registers
> are accessed from backing page, which from hv and which are not allowed
> access. As access (and their perms) are per-reg and not range-based, this
> made sense to me. Also, if ranges are used, I think 16-byte aligned
> checks are needed for the range. If using ranges looks more logical grouping
> here, I can update it as per the above range groupings.
Is this list of registers going to remain or are we going to keep adding to
it so that the ranges become contiguous?
And yes, there is some merit to explicitly naming them but you can also put
that in a comment once above those functions too.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 04/14] x86/apic: Initialize APIC backing page for Secure AVIC
2024-09-13 11:36 ` [RFC 04/14] x86/apic: Initialize APIC backing page for Secure AVIC Neeraj Upadhyay
@ 2024-11-07 15:28 ` Borislav Petkov
2024-11-08 18:08 ` Neeraj Upadhyay
2024-11-11 22:43 ` [sos-linux-ext-patches] " Melody (Huibo) Wang
1 sibling, 1 reply; 81+ messages in thread
From: Borislav Petkov @ 2024-11-07 15:28 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
On Fri, Sep 13, 2024 at 05:06:55PM +0530, Neeraj Upadhyay wrote:
> From: Kishon Vijay Abraham I <kvijayab@amd.com>
>
> Secure AVIC lets guest manage the APIC backing page (unlike emulated
> x2APIC or x2AVIC where the hypervisor manages the APIC backing page).
>
> However the introduced Secure AVIC Linux design still maintains the
> APIC backing page in the hypervisor to shadow the APIC backing page
> maintained by guest (It should be noted only subset of the registers
> are shadowed for specific usecases and registers like APIC_IRR,
> APIC_ISR are not shadowed).
>
> Add sev_ghcb_msr_read() to invoke "SVM_EXIT_MSR" VMGEXIT to read
> MSRs from hypervisor. Initialize the Secure AVIC's APIC backing
> page by copying the initial state of shadow APIC backing page in
> the hypervisor to the guest APIC backing page. Specifically copy
> APIC_LVR, APIC_LDR, and APIC_LVT MSRs from the shadow APIC backing
> page.
You don't have to explain what the patch does - rather, why the patch exists
in the first place and perhaps mention some non-obvious stuff why the code
does what it does.
Check your whole set pls.
> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
> Co-developed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> ---
> arch/x86/coco/sev/core.c | 41 ++++++++++++++++-----
> arch/x86/include/asm/sev.h | 2 ++
> arch/x86/kernel/apic/x2apic_savic.c | 55 +++++++++++++++++++++++++++++
> 3 files changed, 90 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
> index 93470538af5e..0e140f92cfef 100644
> --- a/arch/x86/coco/sev/core.c
> +++ b/arch/x86/coco/sev/core.c
> @@ -1331,18 +1331,15 @@ int __init sev_es_efi_map_ghcbs(pgd_t *pgd)
> return 0;
> }
>
> -static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
> +static enum es_result __vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt, bool write)
Yeah, this one was bugging me already during Nikunj's set so I cleaned it up
a bit differently:
https://git.kernel.org/tip/8bca85cc1eb72e21a3544ab32e546a819d8674ca
> +enum es_result sev_ghcb_msr_read(u64 msr, u64 *value)
Why is this a separate function if it is called only once from x2avic_savic.c?
I think you should merge it with read_msr_from_hv(), rename latter to
x2avic_read_msr_from_hv()
and leave it here in sev/core.c.
> +{
> + struct pt_regs regs = { .cx = msr };
> + struct es_em_ctxt ctxt = { .regs = ®s };
> + struct ghcb_state state;
> + unsigned long flags;
> + enum es_result ret;
> + struct ghcb *ghcb;
> +
> + local_irq_save(flags);
> + ghcb = __sev_get_ghcb(&state);
> + vc_ghcb_invalidate(ghcb);
> +
> + ret = __vc_handle_msr(ghcb, &ctxt, false);
> + if (ret == ES_OK)
> + *value = regs.ax | regs.dx << 32;
> +
> + __sev_put_ghcb(&state);
> + local_irq_restore(flags);
> +
> + return ret;
> +}
...
> diff --git a/arch/x86/kernel/apic/x2apic_savic.c b/arch/x86/kernel/apic/x2apic_savic.c
> index 6a471bbc3dba..99151be4e173 100644
> --- a/arch/x86/kernel/apic/x2apic_savic.c
> +++ b/arch/x86/kernel/apic/x2apic_savic.c
> @@ -11,6 +11,7 @@
> #include <linux/cc_platform.h>
> #include <linux/percpu-defs.h>
> #include <linux/align.h>
> +#include <linux/sizes.h>
>
> #include <asm/apic.h>
> #include <asm/sev.h>
> @@ -20,6 +21,19 @@
> static DEFINE_PER_CPU(void *, apic_backing_page);
> static DEFINE_PER_CPU(bool, savic_setup_done);
>
> +enum lapic_lvt_entry {
What's that enum for?
Oh, you want to use it below but you don't. Why?
> + LVT_TIMER,
> + LVT_THERMAL_MONITOR,
> + LVT_PERFORMANCE_COUNTER,
> + LVT_LINT0,
> + LVT_LINT1,
> + LVT_ERROR,
> +
> + APIC_MAX_NR_LVT_ENTRIES,
> +};
> +
> +#define APIC_LVTx(x) (APIC_LVTT + 0x10 * (x))
> +
> static int x2apic_savic_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
> {
> return x2apic_enabled() && cc_platform_has(CC_ATTR_SNP_SECURE_AVIC);
> @@ -35,6 +49,22 @@ static inline void set_reg(char *page, int reg_off, u32 val)
> WRITE_ONCE(*((u32 *)(page + reg_off)), val);
> }
>
> +static u32 read_msr_from_hv(u32 reg)
A MSR's contents is u64. Make this function generic enough and have the
callsite select only the lower dword.
> +{
> + u64 data, msr;
> + int ret;
> +
> + msr = APIC_BASE_MSR + (reg >> 4);
> + ret = sev_ghcb_msr_read(msr, &data);
> + if (ret != ES_OK) {
> + pr_err("Secure AVIC msr (%#llx) read returned error (%d)\n", msr, ret);
Prepend "0x" to the format specifier.
> + /* MSR read failures are treated as fatal errors */
> + snp_abort();
> + }
> +
> + return lower_32_bits(data);
> +}
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 03/14] x86/apic: Populate .read()/.write() callbacks of Secure AVIC driver
2024-11-07 14:28 ` Borislav Petkov
@ 2024-11-08 8:59 ` Neeraj Upadhyay
2024-11-08 10:48 ` Borislav Petkov
0 siblings, 1 reply; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-11-08 8:59 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
On 11/7/2024 7:58 PM, Borislav Petkov wrote:
> On Thu, Nov 07, 2024 at 09:02:16AM +0530, Neeraj Upadhyay wrote:
>> Intention of doing per reg is to be explicit about which registers
>> are accessed from backing page, which from hv and which are not allowed
>> access. As access (and their perms) are per-reg and not range-based, this
>> made sense to me. Also, if ranges are used, I think 16-byte aligned
>> checks are needed for the range. If using ranges looks more logical grouping
>> here, I can update it as per the above range groupings.
>
> Is this list of registers going to remain or are we going to keep adding to
> it so that the ranges become contiguous?
>
From the APIC architecture details in APM and SDM, I see these gaps are reserved
ranges which are present for xapic also. x2apic keeps the register layout consistent.
So, these gaps looks to have have remained reserved for long. I don't have information
on the uarch reasons (if any) for the reserved space layout.
> And yes, there is some merit to explicitly naming them but you can also put
> that in a comment once above those functions too.
>
I understand your point but, for this specific case, to me, each register as a separate
"switch case" looks clearer and self-describing than keeping a range of different
registers and putting comment about which registers it covers.
In addition, while each APIC register is at 16-byte alignment, they are accessed
using dword size reads/writes. So, as mentioned in previous reply, using ranges
requires alignment checks.
One hypothetical example where using range checks could become klugy is when
the unused 12 bytes of 16 byte of a register is repurposed for implementation-
specific features and the read/write permissions are different for the architecture
register and the implementation-defined one. Secure AVIC uses (IRRn + 4) address
for ALLOWED_IRRn. However, the r/w permissions are same for IRRn and ALLOWED_IRRn.
So, using ranges for IRR register space works fine. However, it may not work
if similar register-space-repurposing happens for other features in future. I
understand this could be considered as speculative and hand-wavy reasoning. So,
I would ask, does above reasoning convince you with the current switch-case layout
or you want it to be range-based?
- Neeraj
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 03/14] x86/apic: Populate .read()/.write() callbacks of Secure AVIC driver
2024-11-08 8:59 ` Neeraj Upadhyay
@ 2024-11-08 10:48 ` Borislav Petkov
2024-11-08 16:14 ` Neeraj Upadhyay
0 siblings, 1 reply; 81+ messages in thread
From: Borislav Petkov @ 2024-11-08 10:48 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
On Fri, Nov 08, 2024 at 02:29:03PM +0530, Neeraj Upadhyay wrote:
> From the APIC architecture details in APM and SDM, I see these gaps are reserved
What I actually meant here is whether SAVIC enablement is going to keep adding
more and more entries here so that it becomes practically *all* possible, each
spelled out explicitly.
But I went further in your patchset and it doesn't look like it so meh, ok.
> I would ask, does above reasoning convince you with the current switch-case layout
> or you want it to be range-based?
That's fine, let's keep them like they are now and we can always revisit if
the list grows too ugly.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 03/14] x86/apic: Populate .read()/.write() callbacks of Secure AVIC driver
2024-11-08 10:48 ` Borislav Petkov
@ 2024-11-08 16:14 ` Neeraj Upadhyay
0 siblings, 0 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-11-08 16:14 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
On 11/8/2024 4:18 PM, Borislav Petkov wrote:
> On Fri, Nov 08, 2024 at 02:29:03PM +0530, Neeraj Upadhyay wrote:
>> From the APIC architecture details in APM and SDM, I see these gaps are reserved
>
> What I actually meant here is whether SAVIC enablement is going to keep adding
> more and more entries here so that it becomes practically *all* possible, each
> spelled out explicitly.
>
Ah ok. My bad, I completely misread it and went too far.
> But I went further in your patchset and it doesn't look like it so meh, ok.
>
Yes, the offsets layout I described was with respect to the complete series. Thanks
for checking it.
>> I would ask, does above reasoning convince you with the current switch-case layout
>> or you want it to be range-based?
>
> That's fine, let's keep them like they are now and we can always revisit if
> the list grows too ugly.
>
Sure, thanks!
- Neeraj
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 04/14] x86/apic: Initialize APIC backing page for Secure AVIC
2024-11-07 15:28 ` Borislav Petkov
@ 2024-11-08 18:08 ` Neeraj Upadhyay
2024-11-09 16:27 ` Borislav Petkov
0 siblings, 1 reply; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-11-08 18:08 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
On 11/7/2024 8:58 PM, Borislav Petkov wrote:
> On Fri, Sep 13, 2024 at 05:06:55PM +0530, Neeraj Upadhyay wrote:
>> From: Kishon Vijay Abraham I <kvijayab@amd.com>
>>
>> Secure AVIC lets guest manage the APIC backing page (unlike emulated
>> x2APIC or x2AVIC where the hypervisor manages the APIC backing page).
>>
>> However the introduced Secure AVIC Linux design still maintains the
>> APIC backing page in the hypervisor to shadow the APIC backing page
>> maintained by guest (It should be noted only subset of the registers
>> are shadowed for specific usecases and registers like APIC_IRR,
>> APIC_ISR are not shadowed).
>>
>> Add sev_ghcb_msr_read() to invoke "SVM_EXIT_MSR" VMGEXIT to read
>> MSRs from hypervisor. Initialize the Secure AVIC's APIC backing
>> page by copying the initial state of shadow APIC backing page in
>> the hypervisor to the guest APIC backing page. Specifically copy
>> APIC_LVR, APIC_LDR, and APIC_LVT MSRs from the shadow APIC backing
>> page.
>
> You don't have to explain what the patch does - rather, why the patch exists
> in the first place and perhaps mention some non-obvious stuff why the code
> does what it does.
>
> Check your whole set pls.
I will improve on this in the next version.
>> -static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
>> +static enum es_result __vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt, bool write)
>
> Yeah, this one was bugging me already during Nikunj's set so I cleaned it up
> a bit differently:
>
> https://git.kernel.org/tip/8bca85cc1eb72e21a3544ab32e546a819d8674ca
>
Ok nice! I will rebase.
>> +enum es_result sev_ghcb_msr_read(u64 msr, u64 *value)
>
> Why is this a separate function if it is called only once from x2avic_savic.c?
>
As sev_ghcb_msr_read() work with any msr and is not limited to reading
x2apic msrs, I created a global sev function for it.
> I think you should merge it with read_msr_from_hv(), rename latter to
>
> x2avic_read_msr_from_hv()
>
> and leave it here in sev/core.c.
>
Ok sure, I will leave generalizing this to future use cases (if/when they come up)
and provide a secure avic specific function here (will do the same for
sev_ghcb_msr_write(), which comes later in this series).
"x2avic" terminology is not used in guest code. As this function only has secure
avic user, does secure_avic_ghcb_msr_read() work?
>> +enum lapic_lvt_entry {
>
> What's that enum for?
It's used in init_backing_page()
for (i = LVT_THERMAL_MONITOR; i < APIC_MAX_NR_LVT_ENTRIES; i++) {
val = read_msr_from_hv(APIC_LVTx(i));
set_reg(backing_page, APIC_LVTx(i), val);
}
>
> Oh, you want to use it below but you don't. Why?
>
As LVT_TIMER is unused, I will remove it:
enum lapic_lvt_entry {
LVT_THERMAL_MONITOR = 1,
LVT_PERFORMANCE_COUNTER,
LVT_LINT0,
LVT_LINT1,
LVT_ERROR,
APIC_MAX_NR_LVT_ENTRIES,
};
>> + LVT_TIMER,
>> + LVT_THERMAL_MONITOR,
>> + LVT_PERFORMANCE_COUNTER,
>> + LVT_LINT0,
>> + LVT_LINT1,
>> + LVT_ERROR,
>> +
>> + APIC_MAX_NR_LVT_ENTRIES,
>> +};
>> +
>> +#define APIC_LVTx(x) (APIC_LVTT + 0x10 * (x))
>> +
>> static int x2apic_savic_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
>> {
>> return x2apic_enabled() && cc_platform_has(CC_ATTR_SNP_SECURE_AVIC);
>> @@ -35,6 +49,22 @@ static inline void set_reg(char *page, int reg_off, u32 val)
>> WRITE_ONCE(*((u32 *)(page + reg_off)), val);
>> }
>>
>> +static u32 read_msr_from_hv(u32 reg)
>
> A MSR's contents is u64. Make this function generic enough and have the
> callsite select only the lower dword.
>
Ok sure, will update.
>> +{
>> + u64 data, msr;
>> + int ret;
>> +
>> + msr = APIC_BASE_MSR + (reg >> 4);
>> + ret = sev_ghcb_msr_read(msr, &data);
>> + if (ret != ES_OK) {
>> + pr_err("Secure AVIC msr (%#llx) read returned error (%d)\n", msr, ret);
>
> Prepend "0x" to the format specifier.
>
Using '#' prepends "0x". Am I missing something here?
- Neeraj
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 04/14] x86/apic: Initialize APIC backing page for Secure AVIC
2024-11-08 18:08 ` Neeraj Upadhyay
@ 2024-11-09 16:27 ` Borislav Petkov
2024-11-09 16:51 ` Neeraj Upadhyay
0 siblings, 1 reply; 81+ messages in thread
From: Borislav Petkov @ 2024-11-09 16:27 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
On Fri, Nov 08, 2024 at 11:38:15PM +0530, Neeraj Upadhyay wrote:
> As sev_ghcb_msr_read() work with any msr and is not limited to reading
> x2apic msrs, I created a global sev function for it.
That can happen when the functionality is needed somewhere else too.
> Ok sure, I will leave generalizing this to future use cases (if/when they come up)
Yap.
> and provide a secure avic specific function here (will do the same for
> sev_ghcb_msr_write(), which comes later in this series).
>
> "x2avic" terminology is not used in guest code. As this function only has secure
> avic user, does secure_avic_ghcb_msr_read() work?
That or "savic_..."
> >> +enum lapic_lvt_entry {
> >
> > What's that enum for?
>
> It's used in init_backing_page()
Then use it properly:
diff --git a/arch/x86/kernel/apic/x2apic_savic.c b/arch/x86/kernel/apic/x2apic_savic.c
index 99151be4e173..1753c4a71b50 100644
--- a/arch/x86/kernel/apic/x2apic_savic.c
+++ b/arch/x86/kernel/apic/x2apic_savic.c
@@ -200,8 +200,8 @@ static void x2apic_savic_send_IPI_mask_allbutself(const struct cpumask *mask, in
static void init_backing_page(void *backing_page)
{
+ enum lapic_lvt_entry i;
u32 val;
- int i;
val = read_msr_from_hv(APIC_LVR);
set_reg(backing_page, APIC_LVR, val);
> >> + pr_err("Secure AVIC msr (%#llx) read returned error (%d)\n", msr, ret);
> >
> > Prepend "0x" to the format specifier.
> >
>
> Using '#' prepends "0x". Am I missing something here?
Let's use the common thing pls even if the alternate form works too:
$ git grep "\"%#" | wc -l
411
$ git grep "\"0x" | wc -l
112823
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply related [flat|nested] 81+ messages in thread
* Re: [RFC 04/14] x86/apic: Initialize APIC backing page for Secure AVIC
2024-11-09 16:27 ` Borislav Petkov
@ 2024-11-09 16:51 ` Neeraj Upadhyay
0 siblings, 0 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-11-09 16:51 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-kernel, tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj,
Santosh.Shukla, Vasant.Hegde, Suravee.Suthikulpanit, David.Kaplan,
x86, hpa, peterz, seanjc, pbonzini, kvm
>> and provide a secure avic specific function here (will do the same for
>> sev_ghcb_msr_write(), which comes later in this series).
>>
>> "x2avic" terminology is not used in guest code. As this function only has secure
>> avic user, does secure_avic_ghcb_msr_read() work?
>
> That or "savic_..."
>
Ok sure.
>>>> +enum lapic_lvt_entry {
>>>
>>> What's that enum for?
>>
>> It's used in init_backing_page()
>
> Then use it properly:
>
I will update it.
> diff --git a/arch/x86/kernel/apic/x2apic_savic.c b/arch/x86/kernel/apic/x2apic_savic.c
> index 99151be4e173..1753c4a71b50 100644
> --- a/arch/x86/kernel/apic/x2apic_savic.c
> +++ b/arch/x86/kernel/apic/x2apic_savic.c
> @@ -200,8 +200,8 @@ static void x2apic_savic_send_IPI_mask_allbutself(const struct cpumask *mask, in
>
> static void init_backing_page(void *backing_page)
> {
> + enum lapic_lvt_entry i;
> u32 val;
> - int i;
>
> val = read_msr_from_hv(APIC_LVR);
> set_reg(backing_page, APIC_LVR, val);
>
>>>> + pr_err("Secure AVIC msr (%#llx) read returned error (%d)\n", msr, ret);
>>>
>>> Prepend "0x" to the format specifier.
>>>
>>
>> Using '#' prepends "0x". Am I missing something here?
>
> Let's use the common thing pls even if the alternate form works too:
>
Ok sure, will change to "0x".
- Neeraj
> $ git grep "\"%#" | wc -l
> 411
> $ git grep "\"0x" | wc -l
> 112823
>
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [sos-linux-ext-patches] [RFC 05/14] x86/apic: Initialize APIC ID for Secure AVIC
2024-09-13 11:36 ` [RFC 05/14] x86/apic: Initialize APIC ID " Neeraj Upadhyay
@ 2024-11-09 20:13 ` Melody (Huibo) Wang
2024-11-10 3:55 ` Neeraj Upadhyay
0 siblings, 1 reply; 81+ messages in thread
From: Melody (Huibo) Wang @ 2024-11-09 20:13 UTC (permalink / raw)
To: Neeraj Upadhyay, linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
Hi Neeraj,
On 9/13/2024 4:36 AM, Neeraj Upadhyay wrote:
> Initialize the APIC ID in the APIC backing page with the
> CPUID function 0000_000bh_EDX (Extended Topology Enumeration),
> and ensure that APIC ID msr read from hypervisor is consistent
> with the value read from CPUID.
>
> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> ---
> arch/x86/kernel/apic/x2apic_savic.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/arch/x86/kernel/apic/x2apic_savic.c b/arch/x86/kernel/apic/x2apic_savic.c
> index 99151be4e173..09fbc1857bf3 100644
> --- a/arch/x86/kernel/apic/x2apic_savic.c
> +++ b/arch/x86/kernel/apic/x2apic_savic.c
> @@ -14,6 +14,7 @@
> #include <linux/sizes.h>
>
> #include <asm/apic.h>
> +#include <asm/cpuid.h>
> #include <asm/sev.h>
>
> #include "local.h"
> @@ -200,6 +201,8 @@ static void x2apic_savic_send_IPI_mask_allbutself(const struct cpumask *mask, in
>
> static void init_backing_page(void *backing_page)
> {
> + u32 hv_apic_id;
> + u32 apic_id;
> u32 val;
> int i;
>
> @@ -220,6 +223,13 @@ static void init_backing_page(void *backing_page)
>
> val = read_msr_from_hv(APIC_LDR);
> set_reg(backing_page, APIC_LDR, val);
> +
> + /* Read APIC ID from Extended Topology Enumeration CPUID */
> + apic_id = cpuid_edx(0x0000000b);
> + hv_apic_id = read_msr_from_hv(APIC_ID);
> + WARN_ONCE(hv_apic_id != apic_id, "Inconsistent APIC_ID values: %d (cpuid), %d (msr)",
> + apic_id, hv_apic_id);
> + set_reg(backing_page, APIC_ID, apic_id);
> }
>
With this warning that hv_apic_id and apic_id is different, do you still want to set_reg after that? If so, wonder why we have this warning?
Thanks,
Melody
> static void x2apic_savic_setup(void)
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [sos-linux-ext-patches] [RFC 05/14] x86/apic: Initialize APIC ID for Secure AVIC
2024-11-09 20:13 ` [sos-linux-ext-patches] " Melody (Huibo) Wang
@ 2024-11-10 3:55 ` Neeraj Upadhyay
2024-11-10 12:12 ` Borislav Petkov
0 siblings, 1 reply; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-11-10 3:55 UTC (permalink / raw)
To: Melody (Huibo) Wang, linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
>> static void init_backing_page(void *backing_page)
>> {
>> + u32 hv_apic_id;
>> + u32 apic_id;
>> u32 val;
>> int i;
>>
>> @@ -220,6 +223,13 @@ static void init_backing_page(void *backing_page)
>>
>> val = read_msr_from_hv(APIC_LDR);
>> set_reg(backing_page, APIC_LDR, val);
>> +
>> + /* Read APIC ID from Extended Topology Enumeration CPUID */
>> + apic_id = cpuid_edx(0x0000000b);
>> + hv_apic_id = read_msr_from_hv(APIC_ID);
>> + WARN_ONCE(hv_apic_id != apic_id, "Inconsistent APIC_ID values: %d (cpuid), %d (msr)",
>> + apic_id, hv_apic_id);
>> + set_reg(backing_page, APIC_ID, apic_id);
>> }
>>
> With this warning that hv_apic_id and apic_id is different, do you still want to set_reg after that? If so, wonder why we have this warning?
>
"apic_id" as read from cpuid is the source of truth for guest and is the one
guest would be using for its interrupt/IPI flow.
Guest IPI flow does below:
1. Source vCPU updates the IRR bit in the destination vCPU's backing page.
2. Source vCPU takes an Automatic Exit to hv by doing ICR wrmsr operation.
The destination APIC ID in ICR write data contains "apic_id".
3. Hv uses "apic_id" to either kick the corresponding destination vCPU (
if not running) or write to AVIC doorbell to notify the running
destination vCPU about the new interrupt.
Given that in step 3, hv uses "apic_id" (provided by guest) to find the
corresponding vCPU information, "apic_id" and "hv_apic_id" need to match.
Mismatch is not considered as a fatal event for guest (snp_abort() is not
triggered) and a warning is raise, as even if hv fails to kick or notify
the target vCPU, the IPI (though delayed) will get handled the next time
target vCPU vmrun happens.
I will include this information in commit message and change WARN_ONCE() to
pr_warn() (while at it, will change the format specifiers from %d to %u).
- Neeraj
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [sos-linux-ext-patches] [RFC 05/14] x86/apic: Initialize APIC ID for Secure AVIC
2024-11-10 3:55 ` Neeraj Upadhyay
@ 2024-11-10 12:12 ` Borislav Petkov
2024-11-10 15:22 ` Neeraj Upadhyay
0 siblings, 1 reply; 81+ messages in thread
From: Borislav Petkov @ 2024-11-10 12:12 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: Melody (Huibo) Wang, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On Sun, Nov 10, 2024 at 09:25:34AM +0530, Neeraj Upadhyay wrote:
> Given that in step 3, hv uses "apic_id" (provided by guest) to find the
> corresponding vCPU information, "apic_id" and "hv_apic_id" need to match.
> Mismatch is not considered as a fatal event for guest (snp_abort() is not
> triggered) and a warning is raise,
What is it considered then and why does the warning even exist?
What can anyone do about it?
If you don't kill the guest, what should the guest owner do if she sees that
warning?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [sos-linux-ext-patches] [RFC 05/14] x86/apic: Initialize APIC ID for Secure AVIC
2024-11-10 12:12 ` Borislav Petkov
@ 2024-11-10 15:22 ` Neeraj Upadhyay
2024-11-10 16:34 ` Borislav Petkov
0 siblings, 1 reply; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-11-10 15:22 UTC (permalink / raw)
To: Borislav Petkov
Cc: Melody (Huibo) Wang, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On 11/10/2024 5:42 PM, Borislav Petkov wrote:
> On Sun, Nov 10, 2024 at 09:25:34AM +0530, Neeraj Upadhyay wrote:
>> Given that in step 3, hv uses "apic_id" (provided by guest) to find the
>> corresponding vCPU information, "apic_id" and "hv_apic_id" need to match.
>> Mismatch is not considered as a fatal event for guest (snp_abort() is not
>> triggered) and a warning is raise,
>
> What is it considered then and why does the warning even exist?
>
APIC ID mismatch can delay IPI handling, which can result in slow guest by
delaying activities like scheduling of tasks within guest.
> What can anyone do about it?
>
The misconfiguration would require fixing the vCPUs' APIC ID in the host.
> If you don't kill the guest, what should the guest owner do if she sees that
> warning?
>
If I get your point, unless a corrective action is possible without
hard reboot of the guest, doing a snp_abort() on detecting mismatch works better
here. That way, the issue can be caught early, and it does not disrupt the running
applications on a limping guest (which happens for the case where we only emit
a warning). So, thinking more, snp_abort() looks more apt here.
- Neeraj
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [sos-linux-ext-patches] [RFC 05/14] x86/apic: Initialize APIC ID for Secure AVIC
2024-11-10 15:22 ` Neeraj Upadhyay
@ 2024-11-10 16:34 ` Borislav Petkov
2024-11-11 3:45 ` Neeraj Upadhyay
0 siblings, 1 reply; 81+ messages in thread
From: Borislav Petkov @ 2024-11-10 16:34 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: Melody (Huibo) Wang, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On Sun, Nov 10, 2024 at 08:52:03PM +0530, Neeraj Upadhyay wrote:
> If I get your point, unless a corrective action is possible without
> hard reboot of the guest, doing a snp_abort() on detecting mismatch works better
> here. That way, the issue can be caught early, and it does not disrupt the running
> applications on a limping guest (which happens for the case where we only emit
> a warning). So, thinking more, snp_abort() looks more apt here.
Well, sometimes you have no influence on the HV (public cloud, for example).
So WARN_ONCE was on the right track but the error message should be more
user-friendly:
WARN_ONCE(hv_apic_id != apic_id,
"APIC IDs mismatch: %d (HV: %d). IPI handling will suffer!",
apic_id, hv_apic_id);
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 05/14] x86/apic: Initialize APIC ID for Secure AVIC
2024-11-10 16:34 ` Borislav Petkov
@ 2024-11-11 3:45 ` Neeraj Upadhyay
0 siblings, 0 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-11-11 3:45 UTC (permalink / raw)
To: Borislav Petkov
Cc: Melody (Huibo) Wang, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On 11/10/2024 10:04 PM, Borislav Petkov wrote:
> On Sun, Nov 10, 2024 at 08:52:03PM +0530, Neeraj Upadhyay wrote:
>> If I get your point, unless a corrective action is possible without
>> hard reboot of the guest, doing a snp_abort() on detecting mismatch works better
>> here. That way, the issue can be caught early, and it does not disrupt the running
>> applications on a limping guest (which happens for the case where we only emit
>> a warning). So, thinking more, snp_abort() looks more apt here.
>
> Well, sometimes you have no influence on the HV (public cloud, for example).
>
I see. Ok.
> So WARN_ONCE was on the right track but the error message should be more
> user-friendly:
>
> WARN_ONCE(hv_apic_id != apic_id,
> "APIC IDs mismatch: %d (HV: %d). IPI handling will suffer!",
> apic_id, hv_apic_id);
>
Ok sure, I will update it.
- Neeraj
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [sos-linux-ext-patches] [RFC 04/14] x86/apic: Initialize APIC backing page for Secure AVIC
2024-09-13 11:36 ` [RFC 04/14] x86/apic: Initialize APIC backing page for Secure AVIC Neeraj Upadhyay
2024-11-07 15:28 ` Borislav Petkov
@ 2024-11-11 22:43 ` Melody (Huibo) Wang
2024-11-12 3:01 ` Neeraj Upadhyay
1 sibling, 1 reply; 81+ messages in thread
From: Melody (Huibo) Wang @ 2024-11-11 22:43 UTC (permalink / raw)
To: Neeraj Upadhyay, linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
Hi Neeraj,
On 9/13/2024 4:36 AM, Neeraj Upadhyay wrote:
> +static void init_backing_page(void *backing_page)
> +{
> + u32 val;
> + int i;
> +
> + val = read_msr_from_hv(APIC_LVR);
> + set_reg(backing_page, APIC_LVR, val);
> +
When you read the register from hypervisor, there is certain value defined in APM Table 16-2. APIC Registers, says APIC_LVR has value 80??0010h out of reset.
More specifically, Bit 31 is set which means the presence of extended APIC registers, and Bit 4 is set which is part of version number: "The local APIC implementation is identified with a value=1Xh (20h-FFh are
reserved)".
I think you should verify those values instead of just reading from the hypervisor. Also, I think you probably should verify all of registers you read from the hypervisor before you use them in the guest. In other words, sanitize the inputs from the hypervisor.
Thanks,
Melody
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 04/14] x86/apic: Initialize APIC backing page for Secure AVIC
2024-11-11 22:43 ` [sos-linux-ext-patches] " Melody (Huibo) Wang
@ 2024-11-12 3:01 ` Neeraj Upadhyay
0 siblings, 0 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-11-12 3:01 UTC (permalink / raw)
To: Melody (Huibo) Wang, linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
On 11/12/2024 4:13 AM, Melody (Huibo) Wang wrote:
> Hi Neeraj,
>
> On 9/13/2024 4:36 AM, Neeraj Upadhyay wrote:
>
>> +static void init_backing_page(void *backing_page)
>> +{
>> + u32 val;
>> + int i;
>> +
>> + val = read_msr_from_hv(APIC_LVR);
>> + set_reg(backing_page, APIC_LVR, val);
>> +
>
> When you read the register from hypervisor, there is certain value defined in APM Table 16-2. APIC Registers, says APIC_LVR has value 80??0010h out of reset.
> > More specifically, Bit 31 is set which means the presence of extended APIC registers, and Bit 4 is set which is part of version number: "The local APIC implementation is identified with a value=1Xh (20h-FFh are
> reserved)".
>
> I think you should verify those values instead of just reading from the hypervisor. Also, I think you probably should verify all of registers you read from the hypervisor before you use them in the guest. In other words, sanitize the inputs from the hypervisor.
>
Ok, I will add this verification of hv read data (wherever applicable) as incremental patches.
- Neeraj
> Thanks,
> Melody
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-09-13 11:36 ` [RFC 01/14] x86/apic: Add new driver for Secure AVIC Neeraj Upadhyay
2024-10-08 19:15 ` Borislav Petkov
2024-10-09 10:10 ` Kirill A. Shutemov
@ 2024-11-18 21:45 ` Melody (Huibo) Wang
2024-11-21 5:05 ` Neeraj Upadhyay
2 siblings, 1 reply; 81+ messages in thread
From: Melody (Huibo) Wang @ 2024-11-18 21:45 UTC (permalink / raw)
To: Neeraj Upadhyay, linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
Hi Neeraj,
On 9/13/2024 4:36 AM, Neeraj Upadhyay wrote:
> From: Kishon Vijay Abraham I <kvijayab@amd.com>
>
> The Secure AVIC feature provides SEV-SNP guests hardware acceleration
> for performance sensitive APIC accesses while securely managing the
> guest-owned APIC state through the use of a private APIC backing page.
> This helps prevent malicious hypervisor from generating unexpected
> interrupts for a vCPU or otherwise violate architectural assumptions
> around APIC behavior.
>
> Add a new x2APIC driver that will serve as the base of the Secure AVIC
> support. It is initially the same as the x2APIC phys driver, but will be
> modified as features of Secure AVIC are implemented.
>
> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
> Co-developed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> ---
> arch/x86/Kconfig | 12 +++
> arch/x86/boot/compressed/sev.c | 1 +
> arch/x86/coco/core.c | 3 +
> arch/x86/include/asm/msr-index.h | 4 +-
> arch/x86/kernel/apic/Makefile | 1 +
> arch/x86/kernel/apic/x2apic_savic.c | 112 ++++++++++++++++++++++++++++
> include/linux/cc_platform.h | 8 ++
> 7 files changed, 140 insertions(+), 1 deletion(-)
> create mode 100644 arch/x86/kernel/apic/x2apic_savic.c
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 007bab9f2a0e..b05b4e9d2e49 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -469,6 +469,18 @@ config X86_X2APIC
>
> If you don't know what to do here, say N.
>
> +config AMD_SECURE_AVIC
> + bool "AMD Secure AVIC"
> + depends on X86_X2APIC && AMD_MEM_ENCRYPT
If we remove the dependency on X2APIC, there are only 3 X2APIC functions which you call from this driver. Can we just expose them in the header, and then simply remove the dependency on X2APIC?
Thanks
Melody
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-11-18 21:45 ` Melody (Huibo) Wang
@ 2024-11-21 5:05 ` Neeraj Upadhyay
2024-11-21 5:41 ` Borislav Petkov
0 siblings, 1 reply; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-11-21 5:05 UTC (permalink / raw)
To: Melody (Huibo) Wang, linux-kernel
Cc: tglx, mingo, dave.hansen, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, x86, hpa,
peterz, seanjc, pbonzini, kvm
On 11/19/2024 3:15 AM, Melody (Huibo) Wang wrote:
> Hi Neeraj,
>
> On 9/13/2024 4:36 AM, Neeraj Upadhyay wrote:
>> From: Kishon Vijay Abraham I <kvijayab@amd.com>
>>
>> The Secure AVIC feature provides SEV-SNP guests hardware acceleration
>> for performance sensitive APIC accesses while securely managing the
>> guest-owned APIC state through the use of a private APIC backing page.
>> This helps prevent malicious hypervisor from generating unexpected
>> interrupts for a vCPU or otherwise violate architectural assumptions
>> around APIC behavior.
>>
>> Add a new x2APIC driver that will serve as the base of the Secure AVIC
>> support. It is initially the same as the x2APIC phys driver, but will be
>> modified as features of Secure AVIC are implemented.
>>
>> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
>> Co-developed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
>> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
>> ---
>> arch/x86/Kconfig | 12 +++
>> arch/x86/boot/compressed/sev.c | 1 +
>> arch/x86/coco/core.c | 3 +
>> arch/x86/include/asm/msr-index.h | 4 +-
>> arch/x86/kernel/apic/Makefile | 1 +
>> arch/x86/kernel/apic/x2apic_savic.c | 112 ++++++++++++++++++++++++++++
>> include/linux/cc_platform.h | 8 ++
>> 7 files changed, 140 insertions(+), 1 deletion(-)
>> create mode 100644 arch/x86/kernel/apic/x2apic_savic.c
>>
>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>> index 007bab9f2a0e..b05b4e9d2e49 100644
>> --- a/arch/x86/Kconfig
>> +++ b/arch/x86/Kconfig
>> @@ -469,6 +469,18 @@ config X86_X2APIC
>>
>> If you don't know what to do here, say N.
>>
>> +config AMD_SECURE_AVIC
>> + bool "AMD Secure AVIC"
>> + depends on X86_X2APIC && AMD_MEM_ENCRYPT
>
> If we remove the dependency on X2APIC, there are only 3 X2APIC functions which you call from this driver. Can we just expose them in the header, and then simply remove the dependency on X2APIC?
>
APIC common code (arch/x86/kernel/apic/apic.c) and other parts of the
x86 code use X86_X2APIC config to enable x2apic related initialization
and functionality. So, dependency on X2APIC need to be there.
- Neeraj
> Thanks
> Melody
>
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-11-21 5:05 ` Neeraj Upadhyay
@ 2024-11-21 5:41 ` Borislav Petkov
2024-11-21 8:03 ` Neeraj Upadhyay
0 siblings, 1 reply; 81+ messages in thread
From: Borislav Petkov @ 2024-11-21 5:41 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: Melody (Huibo) Wang, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On Thu, Nov 21, 2024 at 10:35:35AM +0530, Neeraj Upadhyay wrote:
> APIC common code (arch/x86/kernel/apic/apic.c) and other parts of the
> x86 code use X86_X2APIC config to enable x2apic related initialization
> and functionality. So, dependency on X2APIC need to be there.
Have you actually tried to remove the dependency and see how it looks?
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-11-21 5:41 ` Borislav Petkov
@ 2024-11-21 8:03 ` Neeraj Upadhyay
2024-11-21 10:53 ` Borislav Petkov
0 siblings, 1 reply; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-11-21 8:03 UTC (permalink / raw)
To: Borislav Petkov
Cc: Melody (Huibo) Wang, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On 11/21/2024 11:11 AM, Borislav Petkov wrote:
> On Thu, Nov 21, 2024 at 10:35:35AM +0530, Neeraj Upadhyay wrote:
>> APIC common code (arch/x86/kernel/apic/apic.c) and other parts of the
>> x86 code use X86_X2APIC config to enable x2apic related initialization
>> and functionality. So, dependency on X2APIC need to be there.
>
> Have you actually tried to remove the dependency and see how it looks?
>
No, I didn't try it previously, as based on checking the code below
is what I understand how the code is layered:
- Common x2APIC code in arch/x86/... initializes the x2APIC
architecture sequence and other parts of common x2apic initialization:
* Disable and enable x2apic (...kernel/apic/apic.c).
* max_apicid setting in (...kernel/apic/init.c)
* acpi_parse_x2apic() registration of APIC ID in early
topo maps (kernel/acpi/boot.c)
* Enable x2apic in startup code (...kernel/head_64.S).
- Each x2apic driver in arch/x86/kernel/apic provides callbacks for implementation
specific (x2apic_uv_x.c, apic_numachip.c) or a particular mode
specific (x2apic_phys.c, x2apic_cluster.c) functionality.
As SAVIC's guest APIC register accesses match x2avic (which uses x2APIC MSR
interface in guest), the x2apic common flow need to be executed in the
guest.
- Neeraj
> Thx.
>
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-11-21 8:03 ` Neeraj Upadhyay
@ 2024-11-21 10:53 ` Borislav Petkov
2024-11-25 7:21 ` Neeraj Upadhyay
0 siblings, 1 reply; 81+ messages in thread
From: Borislav Petkov @ 2024-11-21 10:53 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: Melody (Huibo) Wang, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On Thu, Nov 21, 2024 at 01:33:29PM +0530, Neeraj Upadhyay wrote:
> As SAVIC's guest APIC register accesses match x2avic (which uses x2APIC MSR
> interface in guest), the x2apic common flow need to be executed in the
> guest.
How much of that "common flow" is actually needed by SAVIC?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-11-21 10:53 ` Borislav Petkov
@ 2024-11-25 7:21 ` Neeraj Upadhyay
2024-11-25 10:08 ` Borislav Petkov
0 siblings, 1 reply; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-11-25 7:21 UTC (permalink / raw)
To: Borislav Petkov
Cc: Melody (Huibo) Wang, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On 11/21/2024 4:23 PM, Borislav Petkov wrote:
> On Thu, Nov 21, 2024 at 01:33:29PM +0530, Neeraj Upadhyay wrote:
>> As SAVIC's guest APIC register accesses match x2avic (which uses x2APIC MSR
>> interface in guest), the x2apic common flow need to be executed in the
>> guest.
>
> How much of that "common flow" is actually needed by SAVIC?
>
I see most of that flow required. By removing dependency on CONFIG_X86_X2APIC
and enabling SAVIC, I see below boot issues:
- Crash in register_lapic_address() in below path:
register_lapic_address+0x82/0xe0
early_acpi_boot_init+0xc7/0x160
setup_arch+0x9b2/0xec0
The issue happens as register_lapic_address() tries to setup APIC MMIO,
which applies to XAPIC and not to X2APIC. As SAVIC only supports X2APIC
msr interface, APIC MMIO setup fails.
void __init register_lapic_address(unsigned long address)
{
/* This should only happen once */
WARN_ON_ONCE(mp_lapic_addr);
mp_lapic_addr = address;
if (!x2apic_mode)
apic_set_fixmap(true);
}
- x2apic_enable() (which enables X2APIC in APIC base reg) not being called causes
read_msr_from_hv() to return below error:
Secure AVIC msr (0x803) read returned error (4)
KVM: unknown exit reason 24
- x2apic_set_max_apicid() not being called causes below BUG_ON to happen:
kernel BUG at arch/x86/kernel/apic/io_apic.c:2292!
void __init setup_IO_APIC(void)
{
...
for_each_ioapic(ioapic)
BUG_ON(mp_irqdomain_create(ioapic));
...
}
- Neeraj
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-11-25 7:21 ` Neeraj Upadhyay
@ 2024-11-25 10:08 ` Borislav Petkov
2024-11-25 11:16 ` Neeraj Upadhyay
0 siblings, 1 reply; 81+ messages in thread
From: Borislav Petkov @ 2024-11-25 10:08 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: Melody (Huibo) Wang, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On Mon, Nov 25, 2024 at 12:51:36PM +0530, Neeraj Upadhyay wrote:
> I see most of that flow required. By removing dependency on CONFIG_X86_X2APIC
> and enabling SAVIC, I see below boot issues:
Ok, then please extend the Kconfig help text so that it explicitly calls out
the fact that the CONFIG_X86_X2APIC dependency is not only build but
functional one too and that SAVIC relies on X2APIC machinery being present in
the guest.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [RFC 01/14] x86/apic: Add new driver for Secure AVIC
2024-11-25 10:08 ` Borislav Petkov
@ 2024-11-25 11:16 ` Neeraj Upadhyay
0 siblings, 0 replies; 81+ messages in thread
From: Neeraj Upadhyay @ 2024-11-25 11:16 UTC (permalink / raw)
To: Borislav Petkov
Cc: Melody (Huibo) Wang, linux-kernel, tglx, mingo, dave.hansen,
Thomas.Lendacky, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, David.Kaplan, x86, hpa, peterz, seanjc,
pbonzini, kvm
On 11/25/2024 3:38 PM, Borislav Petkov wrote:
> On Mon, Nov 25, 2024 at 12:51:36PM +0530, Neeraj Upadhyay wrote:
>> I see most of that flow required. By removing dependency on CONFIG_X86_X2APIC
>> and enabling SAVIC, I see below boot issues:
>
> Ok, then please extend the Kconfig help text so that it explicitly calls out
> the fact that the CONFIG_X86_X2APIC dependency is not only build but
> functional one too and that SAVIC relies on X2APIC machinery being present in
> the guest.
>
Ok, I will update.
- Neeraj
> Thx.
>
^ permalink raw reply [flat|nested] 81+ messages in thread
end of thread, other threads:[~2024-11-25 11:16 UTC | newest]
Thread overview: 81+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-13 11:36 [RFC 00/14] AMD: Add Secure AVIC Guest Support Neeraj Upadhyay
2024-09-13 11:36 ` [RFC 01/14] x86/apic: Add new driver for Secure AVIC Neeraj Upadhyay
2024-10-08 19:15 ` Borislav Petkov
2024-10-09 1:56 ` Neeraj Upadhyay
2024-10-09 5:23 ` Borislav Petkov
2024-10-09 6:01 ` Neeraj Upadhyay
2024-10-09 10:38 ` Borislav Petkov
2024-10-09 11:00 ` Neeraj Upadhyay
2024-10-09 11:02 ` Borislav Petkov
2024-10-09 12:38 ` Neeraj Upadhyay
2024-10-09 13:15 ` Tom Lendacky
2024-10-09 13:50 ` Neeraj Upadhyay
2024-10-09 10:10 ` Kirill A. Shutemov
2024-10-09 10:42 ` Borislav Petkov
2024-10-09 11:03 ` Kirill A. Shutemov
2024-10-09 11:22 ` Borislav Petkov
2024-10-09 12:12 ` Kirill A. Shutemov
2024-10-09 13:53 ` Borislav Petkov
2024-10-11 7:29 ` Kirill A. Shutemov
2024-11-18 21:45 ` Melody (Huibo) Wang
2024-11-21 5:05 ` Neeraj Upadhyay
2024-11-21 5:41 ` Borislav Petkov
2024-11-21 8:03 ` Neeraj Upadhyay
2024-11-21 10:53 ` Borislav Petkov
2024-11-25 7:21 ` Neeraj Upadhyay
2024-11-25 10:08 ` Borislav Petkov
2024-11-25 11:16 ` Neeraj Upadhyay
2024-09-13 11:36 ` [RFC 02/14] x86/apic: Initialize Secure AVIC APIC backing page Neeraj Upadhyay
2024-10-09 15:27 ` Dave Hansen
2024-10-09 16:31 ` Neeraj Upadhyay
2024-10-09 17:03 ` Dave Hansen
2024-10-09 17:52 ` Neeraj Upadhyay
2024-10-23 16:30 ` Borislav Petkov
2024-10-24 4:01 ` Neeraj Upadhyay
2024-10-24 11:49 ` Borislav Petkov
2024-10-24 12:31 ` Neeraj Upadhyay
2024-10-24 12:59 ` Borislav Petkov
2024-10-23 16:36 ` Borislav Petkov
2024-10-24 3:24 ` Neeraj Upadhyay
2024-09-13 11:36 ` [RFC 03/14] x86/apic: Populate .read()/.write() callbacks of Secure AVIC driver Neeraj Upadhyay
2024-11-06 18:16 ` Borislav Petkov
2024-11-07 3:32 ` Neeraj Upadhyay
2024-11-07 14:28 ` Borislav Petkov
2024-11-08 8:59 ` Neeraj Upadhyay
2024-11-08 10:48 ` Borislav Petkov
2024-11-08 16:14 ` Neeraj Upadhyay
2024-11-06 19:20 ` Melody (Huibo) Wang
2024-11-07 3:33 ` Neeraj Upadhyay
2024-09-13 11:36 ` [RFC 04/14] x86/apic: Initialize APIC backing page for Secure AVIC Neeraj Upadhyay
2024-11-07 15:28 ` Borislav Petkov
2024-11-08 18:08 ` Neeraj Upadhyay
2024-11-09 16:27 ` Borislav Petkov
2024-11-09 16:51 ` Neeraj Upadhyay
2024-11-11 22:43 ` [sos-linux-ext-patches] " Melody (Huibo) Wang
2024-11-12 3:01 ` Neeraj Upadhyay
2024-09-13 11:36 ` [RFC 05/14] x86/apic: Initialize APIC ID " Neeraj Upadhyay
2024-11-09 20:13 ` [sos-linux-ext-patches] " Melody (Huibo) Wang
2024-11-10 3:55 ` Neeraj Upadhyay
2024-11-10 12:12 ` Borislav Petkov
2024-11-10 15:22 ` Neeraj Upadhyay
2024-11-10 16:34 ` Borislav Petkov
2024-11-11 3:45 ` Neeraj Upadhyay
2024-09-13 11:36 ` [RFC 06/14] x86/apic: Add update_vector callback " Neeraj Upadhyay
2024-09-13 11:36 ` [RFC 07/14] x86/apic: Add support to send IPI " Neeraj Upadhyay
2024-09-13 11:36 ` [RFC 08/14] x86/apic: Support LAPIC timer " Neeraj Upadhyay
2024-09-13 11:37 ` [RFC 09/14] x86/sev: Initialize VGIF for secondary VCPUs " Neeraj Upadhyay
2024-09-13 11:37 ` [RFC 10/14] x86/apic: Add support to send NMI IPI " Neeraj Upadhyay
2024-09-13 11:37 ` [RFC 11/14] x86/apic: Allow NMI to be injected from hypervisor " Neeraj Upadhyay
2024-09-13 11:37 ` [RFC 12/14] x86/sev: Enable NMI support " Neeraj Upadhyay
2024-09-13 11:37 ` [RFC 13/14] x86/apic: Enable Secure AVIC in Control MSR Neeraj Upadhyay
2024-09-13 11:37 ` [RFC 14/14] x86/sev: Indicate SEV-SNP guest supports Secure AVIC Neeraj Upadhyay
2024-10-17 8:23 ` [RFC 00/14] AMD: Add Secure AVIC Guest Support Kirill A. Shutemov
2024-10-18 2:33 ` Neeraj Upadhyay
2024-10-18 7:54 ` Kirill A. Shutemov
2024-10-29 9:47 ` Borislav Petkov
2024-10-29 10:24 ` Neeraj Upadhyay
2024-10-29 10:54 ` Borislav Petkov
2024-10-29 11:51 ` Kirill A. Shutemov
2024-10-29 12:15 ` Neeraj Upadhyay
2024-10-29 14:36 ` Kirill A. Shutemov
2024-10-29 15:28 ` Neeraj Upadhyay
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox