From: Ashish Kalra <Ashish.Kalra@amd.com>
To: <dave.hansen@linux.intel.com>, <tglx@linutronix.de>,
<mingo@redhat.com>, <bp@alien8.de>, <x86@kernel.org>
Cc: <hpa@zytor.com>, <rafael@kernel.org>, <peterz@infradead.org>,
<adrian.hunter@intel.com>,
<sathyanarayanan.kuppuswamy@linux.intel.com>,
<jun.nakajima@intel.com>, <kirill.shutemov@linux.intel.com>,
<rick.p.edgecombe@intel.com>, <linux-kernel@vger.kernel.org>,
<thomas.lendacky@amd.com>, <michael.roth@amd.com>,
<seanjc@google.com>, <kai.huang@intel.com>, <bhe@redhat.com>,
<bdas@redhat.com>, <vkuznets@redhat.com>,
<dionnaglaze@google.com>, <anisinha@redhat.com>,
<ardb@kernel.org>, <dyoung@redhat.com>,
<kexec@lists.infradead.org>, <linux-coco@lists.linux.dev>,
<jroedel@suse.de>
Subject: [PATCH v12 3/3] x86/snp: Convert shared memory back to private on kexec
Date: Tue, 30 Jul 2024 19:22:06 +0000
Message-ID: <aecbf9d9fe183d4819f4c9a80bb28bf8a34bc113.1722366144.git.ashish.kalra@amd.com>
In-Reply-To: <cover.1722366144.git.ashish.kalra@amd.com>
From: Ashish Kalra <ashish.kalra@amd.com>

SNP guests allocate shared buffers to perform I/O. This is done by
allocating pages normally from the buddy allocator and converting them
to shared with set_memory_decrypted().

The second, kexec-ed kernel has no idea which memory was converted this
way. It only sees E820_TYPE_RAM.

Accessing shared memory via a private mapping causes unrecoverable RMP
page faults.

On kexec, walk the direct mapping and convert all shared memory back to
private. This makes all RAM private again so that the second kernel can
use it normally. Additionally, for SNP guests, convert all pages of the
bss decrypted section back to private.
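
The walk described above can be illustrated with a small userspace model.
This is only a sketch under stated assumptions: struct map_entry and
unshare_all() are hypothetical names invented for illustration and are not
kernel interfaces; the real code looks up page-table entries with
lookup_address() and advances by page_level_size().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one mapping entry; not a kernel structure. */
struct map_entry {
	size_t npages;	/* pages covered: 1 for 4K, 512 for 2M, ... */
	bool present;	/* false models a none/absent entry */
	bool shared;	/* true models a decrypted (shared) entry */
};

/*
 * Walk a contiguous mapping made of variable-sized entries and flip every
 * present, shared entry back to private. Returns the number of pages
 * converted. Each entry is handled in one step regardless of its size.
 */
static size_t unshare_all(struct map_entry *map, size_t nentries)
{
	size_t converted = 0;

	for (size_t i = 0; i < nentries; i++) {
		struct map_entry *e = &map[i];

		assert(e->npages > 0);

		/* Absent or already-private entries need no work. */
		if (!e->present || !e->shared)
			continue;

		e->shared = false;
		converted += e->npages;
	}

	return converted;
}
```

Running the walk a second time converts nothing, which mirrors the goal
that all RAM is private once the walk has completed.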

The conversion occurs in two steps: stopping new conversions and
unsharing all memory. For a normal kexec, conversions are stopped while
scheduling is still functioning, which allows waiting until any ongoing
conversions have finished. The second step is carried out when all CPUs
except one are inactive and interrupts are disabled, which prevents
conflicts with code that may access shared memory.
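
The two-step ordering can be sketched as a tiny single-threaded state
model. Everything here is an assumption made for illustration: struct
conv_state and the conv_*/kexec_* helpers are hypothetical and do not
correspond to kernel functions; the patch itself relies on
set_memory_enc_stop_conversion() for the first step.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state; the names below are illustrative, not kernel API. */
struct conv_state {
	bool stopped;		/* no new conversions may start */
	int in_flight;		/* conversions currently running */
	int shared_pages;	/* pages currently shared with the host */
};

/* A conversion to shared may only start before the stop step. */
static bool conv_share_begin(struct conv_state *s)
{
	if (s->stopped)
		return false;
	s->in_flight++;
	return true;
}

/* Complete a conversion that was started with conv_share_begin(). */
static void conv_share_end(struct conv_state *s)
{
	assert(s->in_flight > 0);
	s->in_flight--;
	s->shared_pages++;
}

/*
 * Step 1: refuse new conversions. Returns true if nothing is in flight.
 * A caller that can sleep could retry until this returns true, while a
 * crash path would only warn and continue.
 */
static bool kexec_begin(struct conv_state *s)
{
	s->stopped = true;
	return s->in_flight == 0;
}

/* Step 2: with everything quiesced, make all shared pages private. */
static int kexec_finish(struct conv_state *s)
{
	int n = s->shared_pages;

	s->shared_pages = 0;
	return n;
}
```

In this model a conversion that is already running when kexec_begin() is
called may still complete, but no new one can start afterwards.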
Co-developed-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
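Note: the unsharing loop skips any mapping entry that contains a per-CPU
GHCB page so that the GHCBs are converted last. A minimal sketch of a
containment test over the half-open range [start, start + size) follows;
range_contains() is a hypothetical helper written for illustration and is
not part of this patch or of the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if address p lies inside the half-open range
 * [start, start + size). The end address itself belongs to the next
 * entry, so it must not match. Comparing p - start against size avoids
 * overflow in start + size.
 */
static bool range_contains(uint64_t start, uint64_t size, uint64_t p)
{
	return start <= p && p - start < size;
}
```

With a 4K entry at 0x1000, the address 0x2000 is the start of the
following entry and is therefore reported as outside the range.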
 arch/x86/coco/sev/core.c      | 132 ++++++++++++++++++++++++++++++++++
 arch/x86/include/asm/sev.h    |   4 ++
 arch/x86/mm/mem_encrypt_amd.c |   2 +
 3 files changed, 138 insertions(+)
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index de1df0cb45da..4278cdbee3a5 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -1010,6 +1010,138 @@ void snp_accept_memory(phys_addr_t start, phys_addr_t end)
 	set_pages_state(vaddr, npages, SNP_PAGE_STATE_PRIVATE);
 }
 
+static void set_pte_enc(pte_t *kpte, int level, void *va)
+{
+	struct pte_enc_desc d = {
+		.kpte = kpte,
+		.pte_level = level,
+		.va = va,
+		.encrypt = true
+	};
+
+	prepare_pte_enc(&d);
+	set_pte_enc_mask(kpte, d.pfn, d.new_pgprot);
+}
+
+static void unshare_all_memory(void)
+{
+	unsigned long addr, end, size, ghcb;
+	struct sev_es_runtime_data *data;
+	unsigned int npages, level;
+	bool skipped_addr;
+	pte_t *pte;
+	int cpu;
+
+	/* Unshare the direct mapping. */
+	addr = PAGE_OFFSET;
+	end = PAGE_OFFSET + get_max_mapped();
+
+	while (addr < end) {
+		pte = lookup_address(addr, &level);
+		size = page_level_size(level);
+		npages = size / PAGE_SIZE;
+		skipped_addr = false;
+
+		if (!pte || !pte_decrypted(*pte) || pte_none(*pte)) {
+			addr += size;
+			continue;
+		}
+
+		/*
+		 * Ensure that all the per-cpu GHCBs are made private at the
+		 * end of unsharing loop so that the switch to the slower MSR
+		 * protocol happens last.
+		 */
+		for_each_possible_cpu(cpu) {
+			data = per_cpu(runtime_data, cpu);
+			ghcb = (unsigned long)&data->ghcb_page;
+
+			if (addr <= ghcb && ghcb < addr + size) {
+				skipped_addr = true;
+				break;
+			}
+		}
+
+		if (!skipped_addr) {
+			set_pte_enc(pte, level, (void *)addr);
+			snp_set_memory_private(addr, npages);
+		}
+		addr += size;
+	}
+
+	/* Unshare all bss decrypted memory. */
+	addr = (unsigned long)__start_bss_decrypted;
+	end = (unsigned long)__start_bss_decrypted_unused;
+	npages = (end - addr) >> PAGE_SHIFT;
+
+	for (; addr < end; addr += PAGE_SIZE) {
+		pte = lookup_address(addr, &level);
+		if (!pte || !pte_decrypted(*pte) || pte_none(*pte))
+			continue;
+
+		set_pte_enc(pte, level, (void *)addr);
+	}
+	addr = (unsigned long)__start_bss_decrypted;
+	snp_set_memory_private(addr, npages);
+
+	__flush_tlb_all();
+}
+
+/* Stop new private<->shared conversions */
+void snp_kexec_begin(void)
+{
+	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+		return;
+
+	if (!IS_ENABLED(CONFIG_KEXEC_CORE))
+		return;
+
+	/*
+	 * Crash kernel ends up here with interrupts disabled: can't wait for
+	 * conversions to finish.
+	 *
+	 * If race happened, just report and proceed.
+	 */
+	if (!set_memory_enc_stop_conversion())
+		pr_warn("Failed to stop shared<->private conversions\n");
+}
+
+void snp_kexec_finish(void)
+{
+	struct sev_es_runtime_data *data;
+	unsigned int level, cpu;
+	unsigned long size;
+	struct ghcb *ghcb;
+	pte_t *pte;
+
+	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+		return;
+
+	if (!IS_ENABLED(CONFIG_KEXEC_CORE))
+		return;
+
+	unshare_all_memory();
+
+	/*
+	 * Switch to using the MSR protocol to change per-cpu
+	 * GHCBs to private.
+	 * Once the per-cpu GHCBs are switched back to private,
+	 * no more GHCB calls to the hypervisor are possible
+	 * until the kexec kernel starts running.
+	 */
+	boot_ghcb = NULL;
+	sev_cfg.ghcbs_initialized = false;
+
+	for_each_possible_cpu(cpu) {
+		data = per_cpu(runtime_data, cpu);
+		ghcb = &data->ghcb_page;
+		pte = lookup_address((unsigned long)ghcb, &level);
+		size = page_level_size(level);
+		set_pte_enc(pte, level, (void *)ghcb);
+		snp_set_memory_private((unsigned long)ghcb, (size / PAGE_SIZE));
+	}
+}
+
 static int snp_set_vmsa(void *va, void *caa, int apic_id, bool make_vmsa)
 {
 	int ret;
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index fd19a8f413d0..4876ab4c7043 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -417,6 +417,8 @@ void sev_show_status(void);
 void snp_update_svsm_ca(void);
 int prepare_pte_enc(struct pte_enc_desc *d);
 void set_pte_enc_mask(pte_t *kpte, unsigned long pfn, pgprot_t new_prot);
+void snp_kexec_finish(void);
+void snp_kexec_begin(void);
 
 #else /* !CONFIG_AMD_MEM_ENCRYPT */
 
@@ -455,6 +457,8 @@ static inline void sev_show_status(void) { }
 static inline void snp_update_svsm_ca(void) { }
 static inline int prepare_pte_enc(struct pte_enc_desc *d) { }
 static inline void set_pte_enc_mask(pte_t *kpte, unsigned long pfn, pgprot_t new_prot) { }
+static inline void snp_kexec_finish(void) { }
+static inline void snp_kexec_begin(void) { }
 
 #endif /* CONFIG_AMD_MEM_ENCRYPT */
 
diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
index f4be81db72ee..774f9677458f 100644
--- a/arch/x86/mm/mem_encrypt_amd.c
+++ b/arch/x86/mm/mem_encrypt_amd.c
@@ -490,6 +490,8 @@ void __init sme_early_init(void)
 	x86_platform.guest.enc_status_change_finish = amd_enc_status_change_finish;
 	x86_platform.guest.enc_tlb_flush_required = amd_enc_tlb_flush_required;
 	x86_platform.guest.enc_cache_flush_required = amd_enc_cache_flush_required;
+	x86_platform.guest.enc_kexec_begin = snp_kexec_begin;
+	x86_platform.guest.enc_kexec_finish = snp_kexec_finish;
 
 	/*
 	 * AMD-SEV-ES intercepts the RDMSR to read the X2APIC ID in the
--
2.34.1