* [PATCHv11 00/19] x86/tdx: Add kexec support
@ 2024-05-28 9:55 Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 01/19] x86/acpi: Extract ACPI MADT wakeup code into a separate file Kirill A. Shutemov
` (20 more replies)
0 siblings, 21 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-05-28 9:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
Kirill A. Shutemov, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel
The patchset adds bits and pieces to get kexec (and crashkernel) work on
TDX guest.
The last patch implements CPU offlining according to the approved ACPI
spec change poposal[1]. It unlocks kexec with all CPUs visible in the target
kernel. It requires BIOS-side enabling. If it missing we fallback to booting
2nd kernel with single CPU.
Please review. I would be glad for any feedback.
[1] https://lore.kernel.org/all/13356251.uLZWGnKmhe@kreacher
v11:
- Rebased onto current tip/master;
- Rename CONFIG_X86_ACPI_MADT_WAKEUP to CONFIG_ACPI_MADT_WAKEUP;
- Drop CC_ATTR_GUEST_MEM_ENCRYPT checks around x86_platform.guest.enc_kexec_*
callbacks;
- Rename x86_platform.guest.enc_kexec_* callbacks;
- Report error code in case of vmm call fail in __set_memory_enc_pgtable();
- Update commit messages and comments;
- Add Reviewed-bys;
v10:
- Rebased to current tip/master;
- Preserve CR4.MCE instead of setting it unconditionally;
- Fix build error in Hyper-V code after rebase;
- Include Ashish's patch for real;
v9:
- Rebased;
- Keep page tables that maps E820_TYPE_ACPI (Ashish);
- Ack/Reviewed/Tested-bys from Sathya, Kai, Tao;
- Minor printk() message adjustments;
v8:
- Rework serialization of around conversion memory back to private;
- Print ACPI_MADT_TYPE_MULTIPROC_WAKEUP in acpi_table_print_madt_entry();
- Drop debugfs interface to dump info on shared memory;
- Adjust comments and commit messages;
- Reviewed-bys by Baoquan, Dave and Thomas;
v7:
- Call enc_kexec_stop_conversion() and enc_kexec_unshare_mem() after shutting
down IO-APIC, lapic and hpet. It meets AMD requirements.
- Minor style changes;
- Add Acked/Reviewed-bys;
v6:
- Rebased to v6.8-rc1;
- Provide default noop callbacks from .enc_kexec_stop_conversion and
.enc_kexec_unshare_mem;
- Split off patch that introduces .enc_kexec_* callbacks;
- asm_acpi_mp_play_dead(): program CR3 directly from RSI, no MOV to RAX
required;
- Restructure how smp_ops.stop_this_cpu() hooked up in crash_nmi_callback();
- kvmclock patch got merged via KVM tree;
v5:
- Rename smp_ops.crash_play_dead to smp_ops.stop_this_cpu and use it in
stop_this_cpu();
- Split off enc_kexec_stop_conversion() from enc_kexec_unshare_mem();
- Introduce kernel_ident_mapping_free();
- Add explicit include for alternatives and stringify.
- Add barrier() after setting conversion_allowed to false;
- Mark cpu_hotplug_offline_disabled __ro_after_init;
- Print error if failed to hand over CPU to BIOS;
- Update comments and commit messages;
v4:
- Fix build for !KEXEC_CORE;
- Cleaner ATLERNATIVE use;
- Update commit messages and comments;
- Add Reviewed-bys;
v3:
- Rework acpi_mp_crash_stop_other_cpus() to avoid invoking hotplug state
machine;
- Free page tables if reset vector setup failed;
- Change asm_acpi_mp_play_dead() to pass reset vector and PGD as arguments;
- Mark acpi_mp_* variables as static and __ro_after_init;
- Use u32 for apicid;
- Disable CPU offlining if reset vector setup failed;
- Rename madt.S -> madt_playdead.S;
- Mark tdx_kexec_unshare_mem() as static;
- Rebase onto up-to-date tip/master;
- Whitespace fixes;
- Reorder patches;
- Add Reviewed-bys;
- Update comments and commit messages;
v2:
- Rework how unsharing hook ups into kexec codepath;
- Rework kvmclock_disable() fix based on Sean's;
- s/cpu_hotplug_not_supported()/cpu_hotplug_disable_offlining()/;
- use play_dead_common() to implement acpi_mp_play_dead();
- cond_resched() in tdx_shared_memory_show();
- s/target kernel/second kernel/;
- Update commit messages and comments;
Ashish Kalra (1):
x86/mm: Do not zap page table entries mapping unaccepted memory table
during kdump.
Borislav Petkov (1):
x86/relocate_kernel: Use named labels for less confusion
Kirill A. Shutemov (17):
x86/acpi: Extract ACPI MADT wakeup code into a separate file
x86/apic: Mark acpi_mp_wake_* variables as __ro_after_init
cpu/hotplug: Add support for declaring CPU offlining not supported
cpu/hotplug, x86/acpi: Disable CPU offlining for ACPI MADT wakeup
x86/kexec: Keep CR4.MCE set during kexec for TDX guest
x86/mm: Make x86_platform.guest.enc_status_change_*() return errno
x86/mm: Return correct level from lookup_address() if pte is none
x86/tdx: Account shared memory
x86/mm: Add callbacks to prepare encrypted memory for kexec
x86/tdx: Convert shared memory back to private on kexec
x86/mm: Make e820__end_ram_pfn() cover E820_TYPE_ACPI ranges
x86/acpi: Rename fields in acpi_madt_multiproc_wakeup structure
x86/acpi: Do not attempt to bring up secondary CPUs in kexec case
x86/smp: Add smp_ops.stop_this_cpu() callback
x86/mm: Introduce kernel_ident_mapping_free()
x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method
ACPI: tables: Print MULTIPROC_WAKEUP when MADT is parsed
arch/x86/Kconfig | 7 +
arch/x86/coco/core.c | 1 -
arch/x86/coco/tdx/tdx.c | 96 ++++++++-
arch/x86/hyperv/ivm.c | 22 +-
arch/x86/include/asm/acpi.h | 7 +
arch/x86/include/asm/init.h | 3 +
arch/x86/include/asm/pgtable.h | 5 +
arch/x86/include/asm/pgtable_types.h | 1 +
arch/x86/include/asm/set_memory.h | 3 +
arch/x86/include/asm/smp.h | 1 +
arch/x86/include/asm/x86_init.h | 13 +-
arch/x86/kernel/acpi/Makefile | 1 +
arch/x86/kernel/acpi/boot.c | 86 +-------
arch/x86/kernel/acpi/madt_playdead.S | 28 +++
arch/x86/kernel/acpi/madt_wakeup.c | 292 +++++++++++++++++++++++++++
arch/x86/kernel/crash.c | 12 ++
arch/x86/kernel/e820.c | 9 +-
arch/x86/kernel/process.c | 7 +
arch/x86/kernel/reboot.c | 18 ++
arch/x86/kernel/relocate_kernel_64.S | 25 ++-
arch/x86/kernel/x86_init.c | 8 +-
arch/x86/mm/ident_map.c | 73 +++++++
arch/x86/mm/init_64.c | 16 +-
arch/x86/mm/mem_encrypt_amd.c | 8 +-
arch/x86/mm/pat/set_memory.c | 74 +++++--
drivers/acpi/tables.c | 14 ++
include/acpi/actbl2.h | 19 +-
include/linux/cc_platform.h | 10 -
include/linux/cpu.h | 2 +
kernel/cpu.c | 12 +-
30 files changed, 707 insertions(+), 166 deletions(-)
create mode 100644 arch/x86/kernel/acpi/madt_playdead.S
create mode 100644 arch/x86/kernel/acpi/madt_wakeup.c
--
2.43.0
^ permalink raw reply [flat|nested] 115+ messages in thread
* [PATCHv11 01/19] x86/acpi: Extract ACPI MADT wakeup code into a separate file
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
@ 2024-05-28 9:55 ` Kirill A. Shutemov
2024-05-28 13:47 ` Borislav Petkov
2024-05-28 9:55 ` [PATCHv11 02/19] x86/apic: Mark acpi_mp_wake_* variables as __ro_after_init Kirill A. Shutemov
` (19 subsequent siblings)
20 siblings, 1 reply; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-05-28 9:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
Kirill A. Shutemov, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel, Tao Liu
In order to prepare for the expansion of support for the ACPI MADT
wakeup method, move the relevant code into a separate file.
Introduce a new configuration option to clearly indicate dependencies
without the use of ifdefs.
There have been no functional changes.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Tao Liu <ltao@redhat.com>
---
arch/x86/Kconfig | 7 +++
arch/x86/include/asm/acpi.h | 5 ++
arch/x86/kernel/acpi/Makefile | 1 +
arch/x86/kernel/acpi/boot.c | 86 +-----------------------------
arch/x86/kernel/acpi/madt_wakeup.c | 82 ++++++++++++++++++++++++++++
5 files changed, 96 insertions(+), 85 deletions(-)
create mode 100644 arch/x86/kernel/acpi/madt_wakeup.c
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e8837116704c..e30ea4129d2c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1118,6 +1118,13 @@ config X86_LOCAL_APIC
depends on X86_64 || SMP || X86_32_NON_STANDARD || X86_UP_APIC || PCI_MSI
select IRQ_DOMAIN_HIERARCHY
+config ACPI_MADT_WAKEUP
+ def_bool y
+ depends on X86_64
+ depends on ACPI
+ depends on SMP
+ depends on X86_LOCAL_APIC
+
config X86_IO_APIC
def_bool y
depends on X86_LOCAL_APIC || X86_UP_IOAPIC
diff --git a/arch/x86/include/asm/acpi.h b/arch/x86/include/asm/acpi.h
index 5af926c050f0..ceacac2b335d 100644
--- a/arch/x86/include/asm/acpi.h
+++ b/arch/x86/include/asm/acpi.h
@@ -78,6 +78,11 @@ static inline bool acpi_skip_set_wakeup_address(void)
#define acpi_skip_set_wakeup_address acpi_skip_set_wakeup_address
+union acpi_subtable_headers;
+
+int __init acpi_parse_mp_wake(union acpi_subtable_headers *header,
+ const unsigned long end);
+
/*
* Check if the CPU can handle C2 and deeper
*/
diff --git a/arch/x86/kernel/acpi/Makefile b/arch/x86/kernel/acpi/Makefile
index fc17b3f136fe..2feba7257665 100644
--- a/arch/x86/kernel/acpi/Makefile
+++ b/arch/x86/kernel/acpi/Makefile
@@ -4,6 +4,7 @@ obj-$(CONFIG_ACPI) += boot.o
obj-$(CONFIG_ACPI_SLEEP) += sleep.o wakeup_$(BITS).o
obj-$(CONFIG_ACPI_APEI) += apei.o
obj-$(CONFIG_ACPI_CPPC_LIB) += cppc.o
+obj-$(CONFIG_ACPI_MADT_WAKEUP) += madt_wakeup.o
ifneq ($(CONFIG_ACPI_PROCESSOR),)
obj-y += cstate.o
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 4bf82dbd2a6b..9f4618dcd704 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -67,13 +67,6 @@ static bool has_lapic_cpus __initdata;
static bool acpi_support_online_capable;
#endif
-#ifdef CONFIG_X86_64
-/* Physical address of the Multiprocessor Wakeup Structure mailbox */
-static u64 acpi_mp_wake_mailbox_paddr;
-/* Virtual address of the Multiprocessor Wakeup Structure mailbox */
-static struct acpi_madt_multiproc_wakeup_mailbox *acpi_mp_wake_mailbox;
-#endif
-
#ifdef CONFIG_X86_IO_APIC
/*
* Locks related to IOAPIC hotplug
@@ -341,60 +334,6 @@ acpi_parse_lapic_nmi(union acpi_subtable_headers * header, const unsigned long e
return 0;
}
-
-#ifdef CONFIG_X86_64
-static int acpi_wakeup_cpu(u32 apicid, unsigned long start_ip)
-{
- /*
- * Remap mailbox memory only for the first call to acpi_wakeup_cpu().
- *
- * Wakeup of secondary CPUs is fully serialized in the core code.
- * No need to protect acpi_mp_wake_mailbox from concurrent accesses.
- */
- if (!acpi_mp_wake_mailbox) {
- acpi_mp_wake_mailbox = memremap(acpi_mp_wake_mailbox_paddr,
- sizeof(*acpi_mp_wake_mailbox),
- MEMREMAP_WB);
- }
-
- /*
- * Mailbox memory is shared between the firmware and OS. Firmware will
- * listen on mailbox command address, and once it receives the wakeup
- * command, the CPU associated with the given apicid will be booted.
- *
- * The value of 'apic_id' and 'wakeup_vector' must be visible to the
- * firmware before the wakeup command is visible. smp_store_release()
- * ensures ordering and visibility.
- */
- acpi_mp_wake_mailbox->apic_id = apicid;
- acpi_mp_wake_mailbox->wakeup_vector = start_ip;
- smp_store_release(&acpi_mp_wake_mailbox->command,
- ACPI_MP_WAKE_COMMAND_WAKEUP);
-
- /*
- * Wait for the CPU to wake up.
- *
- * The CPU being woken up is essentially in a spin loop waiting to be
- * woken up. It should not take long for it wake up and acknowledge by
- * zeroing out ->command.
- *
- * ACPI specification doesn't provide any guidance on how long kernel
- * has to wait for a wake up acknowledgement. It also doesn't provide
- * a way to cancel a wake up request if it takes too long.
- *
- * In TDX environment, the VMM has control over how long it takes to
- * wake up secondary. It can postpone scheduling secondary vCPU
- * indefinitely. Giving up on wake up request and reporting error opens
- * possible attack vector for VMM: it can wake up a secondary CPU when
- * kernel doesn't expect it. Wait until positive result of the wake up
- * request.
- */
- while (READ_ONCE(acpi_mp_wake_mailbox->command))
- cpu_relax();
-
- return 0;
-}
-#endif /* CONFIG_X86_64 */
#endif /* CONFIG_X86_LOCAL_APIC */
#ifdef CONFIG_X86_IO_APIC
@@ -1124,29 +1063,6 @@ static int __init acpi_parse_madt_lapic_entries(void)
}
return 0;
}
-
-#ifdef CONFIG_X86_64
-static int __init acpi_parse_mp_wake(union acpi_subtable_headers *header,
- const unsigned long end)
-{
- struct acpi_madt_multiproc_wakeup *mp_wake;
-
- if (!IS_ENABLED(CONFIG_SMP))
- return -ENODEV;
-
- mp_wake = (struct acpi_madt_multiproc_wakeup *)header;
- if (BAD_MADT_ENTRY(mp_wake, end))
- return -EINVAL;
-
- acpi_table_print_madt_entry(&header->common);
-
- acpi_mp_wake_mailbox_paddr = mp_wake->base_address;
-
- apic_update_callback(wakeup_secondary_cpu_64, acpi_wakeup_cpu);
-
- return 0;
-}
-#endif /* CONFIG_X86_64 */
#endif /* CONFIG_X86_LOCAL_APIC */
#ifdef CONFIG_X86_IO_APIC
@@ -1343,7 +1259,7 @@ static void __init acpi_process_madt(void)
smp_found_config = 1;
}
-#ifdef CONFIG_X86_64
+#ifdef CONFIG_ACPI_MADT_WAKEUP
/*
* Parse MADT MP Wake entry.
*/
diff --git a/arch/x86/kernel/acpi/madt_wakeup.c b/arch/x86/kernel/acpi/madt_wakeup.c
new file mode 100644
index 000000000000..7f164d38bd0b
--- /dev/null
+++ b/arch/x86/kernel/acpi/madt_wakeup.c
@@ -0,0 +1,82 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <linux/acpi.h>
+#include <linux/io.h>
+#include <asm/apic.h>
+#include <asm/barrier.h>
+#include <asm/processor.h>
+
+/* Physical address of the Multiprocessor Wakeup Structure mailbox */
+static u64 acpi_mp_wake_mailbox_paddr;
+
+/* Virtual address of the Multiprocessor Wakeup Structure mailbox */
+static struct acpi_madt_multiproc_wakeup_mailbox *acpi_mp_wake_mailbox;
+
+static int acpi_wakeup_cpu(u32 apicid, unsigned long start_ip)
+{
+ /*
+ * Remap mailbox memory only for the first call to acpi_wakeup_cpu().
+ *
+ * Wakeup of secondary CPUs is fully serialized in the core code.
+ * No need to protect acpi_mp_wake_mailbox from concurrent accesses.
+ */
+ if (!acpi_mp_wake_mailbox) {
+ acpi_mp_wake_mailbox = memremap(acpi_mp_wake_mailbox_paddr,
+ sizeof(*acpi_mp_wake_mailbox),
+ MEMREMAP_WB);
+ }
+
+ /*
+ * Mailbox memory is shared between the firmware and OS. Firmware will
+ * listen on mailbox command address, and once it receives the wakeup
+ * command, the CPU associated with the given apicid will be booted.
+ *
+ * The value of 'apic_id' and 'wakeup_vector' must be visible to the
+ * firmware before the wakeup command is visible. smp_store_release()
+ * ensures ordering and visibility.
+ */
+ acpi_mp_wake_mailbox->apic_id = apicid;
+ acpi_mp_wake_mailbox->wakeup_vector = start_ip;
+ smp_store_release(&acpi_mp_wake_mailbox->command,
+ ACPI_MP_WAKE_COMMAND_WAKEUP);
+
+ /*
+ * Wait for the CPU to wake up.
+ *
+ * The CPU being woken up is essentially in a spin loop waiting to be
+ * woken up. It should not take long for it wake up and acknowledge by
+ * zeroing out ->command.
+ *
+ * ACPI specification doesn't provide any guidance on how long kernel
+ * has to wait for a wake up acknowledgment. It also doesn't provide
+ * a way to cancel a wake up request if it takes too long.
+ *
+ * In TDX environment, the VMM has control over how long it takes to
+ * wake up secondary. It can postpone scheduling secondary vCPU
+ * indefinitely. Giving up on wake up request and reporting error opens
+ * possible attack vector for VMM: it can wake up a secondary CPU when
+ * kernel doesn't expect it. Wait until positive result of the wake up
+ * request.
+ */
+ while (READ_ONCE(acpi_mp_wake_mailbox->command))
+ cpu_relax();
+
+ return 0;
+}
+
+int __init acpi_parse_mp_wake(union acpi_subtable_headers *header,
+ const unsigned long end)
+{
+ struct acpi_madt_multiproc_wakeup *mp_wake;
+
+ mp_wake = (struct acpi_madt_multiproc_wakeup *)header;
+ if (BAD_MADT_ENTRY(mp_wake, end))
+ return -EINVAL;
+
+ acpi_table_print_madt_entry(&header->common);
+
+ acpi_mp_wake_mailbox_paddr = mp_wake->base_address;
+
+ apic_update_callback(wakeup_secondary_cpu_64, acpi_wakeup_cpu);
+
+ return 0;
+}
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* [PATCHv11 02/19] x86/apic: Mark acpi_mp_wake_* variables as __ro_after_init
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 01/19] x86/acpi: Extract ACPI MADT wakeup code into a separate file Kirill A. Shutemov
@ 2024-05-28 9:55 ` Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 03/19] cpu/hotplug: Add support for declaring CPU offlining not supported Kirill A. Shutemov
` (18 subsequent siblings)
20 siblings, 0 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-05-28 9:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
Kirill A. Shutemov, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel, Tao Liu
acpi_mp_wake_mailbox_paddr and acpi_mp_wake_mailbox initialized once
during ACPI MADT init and never changed.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Tao Liu <ltao@redhat.com>
---
arch/x86/kernel/acpi/madt_wakeup.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/acpi/madt_wakeup.c b/arch/x86/kernel/acpi/madt_wakeup.c
index 7f164d38bd0b..cf79ea6f3007 100644
--- a/arch/x86/kernel/acpi/madt_wakeup.c
+++ b/arch/x86/kernel/acpi/madt_wakeup.c
@@ -6,10 +6,10 @@
#include <asm/processor.h>
/* Physical address of the Multiprocessor Wakeup Structure mailbox */
-static u64 acpi_mp_wake_mailbox_paddr;
+static u64 acpi_mp_wake_mailbox_paddr __ro_after_init;
/* Virtual address of the Multiprocessor Wakeup Structure mailbox */
-static struct acpi_madt_multiproc_wakeup_mailbox *acpi_mp_wake_mailbox;
+static struct acpi_madt_multiproc_wakeup_mailbox *acpi_mp_wake_mailbox __ro_after_init;
static int acpi_wakeup_cpu(u32 apicid, unsigned long start_ip)
{
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* [PATCHv11 03/19] cpu/hotplug: Add support for declaring CPU offlining not supported
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 01/19] x86/acpi: Extract ACPI MADT wakeup code into a separate file Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 02/19] x86/apic: Mark acpi_mp_wake_* variables as __ro_after_init Kirill A. Shutemov
@ 2024-05-28 9:55 ` Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 04/19] cpu/hotplug, x86/acpi: Disable CPU offlining for ACPI MADT wakeup Kirill A. Shutemov
` (17 subsequent siblings)
20 siblings, 0 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-05-28 9:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
Kirill A. Shutemov, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel, Tao Liu
The ACPI MADT mailbox wakeup method doesn't allow to offline CPU after
it got woke up.
Currently offlining hotplug is prevented based on the confidential
computing attribute which is set for Intel TDX. But TDX is not
the only possible user of the wake up method. The MADT wakeup can be
implemented outside of a confidential computing environment. Offline
support is a property of the wakeup method, not the CoCo implementation.
Introduce cpu_hotplug_disable_offlining() that can be called to indicate
that CPU offlining should be disabled.
This function is going to replace CC_ATTR_HOTPLUG_DISABLED for ACPI
MADT wakeup method.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Tao Liu <ltao@redhat.com>
---
include/linux/cpu.h | 2 ++
kernel/cpu.c | 13 ++++++++++++-
2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 861c3bfc5f17..6e265b085f95 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -141,6 +141,7 @@ extern void cpus_read_lock(void);
extern void cpus_read_unlock(void);
extern int cpus_read_trylock(void);
extern void lockdep_assert_cpus_held(void);
+extern void cpu_hotplug_disable_offlining(void);
extern void cpu_hotplug_disable(void);
extern void cpu_hotplug_enable(void);
void clear_tasks_mm_cpumask(int cpu);
@@ -156,6 +157,7 @@ static inline void cpus_read_lock(void) { }
static inline void cpus_read_unlock(void) { }
static inline int cpus_read_trylock(void) { return true; }
static inline void lockdep_assert_cpus_held(void) { }
+static inline void cpu_hotplug_disable_offlining(void) { }
static inline void cpu_hotplug_disable(void) { }
static inline void cpu_hotplug_enable(void) { }
static inline int remove_cpu(unsigned int cpu) { return -EPERM; }
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 563877d6c28b..4c15b478e2bc 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -483,6 +483,8 @@ static int cpu_hotplug_disabled;
DEFINE_STATIC_PERCPU_RWSEM(cpu_hotplug_lock);
+static bool cpu_hotplug_offline_disabled __ro_after_init;
+
void cpus_read_lock(void)
{
percpu_down_read(&cpu_hotplug_lock);
@@ -542,6 +544,14 @@ static void lockdep_release_cpus_lock(void)
rwsem_release(&cpu_hotplug_lock.dep_map, _THIS_IP_);
}
+/* Declare CPU offlining not supported */
+void cpu_hotplug_disable_offlining(void)
+{
+ cpu_maps_update_begin();
+ cpu_hotplug_offline_disabled = true;
+ cpu_maps_update_done();
+}
+
/*
* Wait for currently running CPU hotplug operations to complete (if any) and
* disable future CPU hotplug (from sysfs). The 'cpu_add_remove_lock' protects
@@ -1471,7 +1481,8 @@ static int cpu_down_maps_locked(unsigned int cpu, enum cpuhp_state target)
* If the platform does not support hotplug, report it explicitly to
* differentiate it from a transient offlining failure.
*/
- if (cc_platform_has(CC_ATTR_HOTPLUG_DISABLED))
+ if (cc_platform_has(CC_ATTR_HOTPLUG_DISABLED) ||
+ cpu_hotplug_offline_disabled)
return -EOPNOTSUPP;
if (cpu_hotplug_disabled)
return -EBUSY;
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* [PATCHv11 04/19] cpu/hotplug, x86/acpi: Disable CPU offlining for ACPI MADT wakeup
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
` (2 preceding siblings ...)
2024-05-28 9:55 ` [PATCHv11 03/19] cpu/hotplug: Add support for declaring CPU offlining not supported Kirill A. Shutemov
@ 2024-05-28 9:55 ` Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion Kirill A. Shutemov
` (16 subsequent siblings)
20 siblings, 0 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-05-28 9:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
Kirill A. Shutemov, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel, Tao Liu
ACPI MADT doesn't allow to offline CPU after it got woke up.
Currently CPU hotplug is prevented based on the confidential computing
attribute which is set for Intel TDX. But TDX is not the only possible
user of the wake up method. Any platform that uses ACPI MADT wakeup
method cannot offline CPU.
Disable CPU offlining on ACPI MADT wakeup enumeration.
The change has no visible effects for users: currently, TDX guest is the
only platform that uses the ACPI MADT wakeup method.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Tao Liu <ltao@redhat.com>
---
arch/x86/coco/core.c | 1 -
arch/x86/kernel/acpi/madt_wakeup.c | 3 +++
include/linux/cc_platform.h | 10 ----------
kernel/cpu.c | 3 +--
4 files changed, 4 insertions(+), 13 deletions(-)
diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
index b31ef2424d19..0f81f70aca82 100644
--- a/arch/x86/coco/core.c
+++ b/arch/x86/coco/core.c
@@ -29,7 +29,6 @@ static bool noinstr intel_cc_platform_has(enum cc_attr attr)
{
switch (attr) {
case CC_ATTR_GUEST_UNROLL_STRING_IO:
- case CC_ATTR_HOTPLUG_DISABLED:
case CC_ATTR_GUEST_MEM_ENCRYPT:
case CC_ATTR_MEM_ENCRYPT:
return true;
diff --git a/arch/x86/kernel/acpi/madt_wakeup.c b/arch/x86/kernel/acpi/madt_wakeup.c
index cf79ea6f3007..d222be8d7a07 100644
--- a/arch/x86/kernel/acpi/madt_wakeup.c
+++ b/arch/x86/kernel/acpi/madt_wakeup.c
@@ -1,5 +1,6 @@
// SPDX-License-Identifier: GPL-2.0-or-later
#include <linux/acpi.h>
+#include <linux/cpu.h>
#include <linux/io.h>
#include <asm/apic.h>
#include <asm/barrier.h>
@@ -76,6 +77,8 @@ int __init acpi_parse_mp_wake(union acpi_subtable_headers *header,
acpi_mp_wake_mailbox_paddr = mp_wake->base_address;
+ cpu_hotplug_disable_offlining();
+
apic_update_callback(wakeup_secondary_cpu_64, acpi_wakeup_cpu);
return 0;
diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
index 60693a145894..caa4b4430634 100644
--- a/include/linux/cc_platform.h
+++ b/include/linux/cc_platform.h
@@ -81,16 +81,6 @@ enum cc_attr {
*/
CC_ATTR_GUEST_SEV_SNP,
- /**
- * @CC_ATTR_HOTPLUG_DISABLED: Hotplug is not supported or disabled.
- *
- * The platform/OS is running as a guest/virtual machine does not
- * support CPU hotplug feature.
- *
- * Examples include TDX Guest.
- */
- CC_ATTR_HOTPLUG_DISABLED,
-
/**
* @CC_ATTR_HOST_SEV_SNP: AMD SNP enabled on the host.
*
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 4c15b478e2bc..a609385c7f99 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1481,8 +1481,7 @@ static int cpu_down_maps_locked(unsigned int cpu, enum cpuhp_state target)
* If the platform does not support hotplug, report it explicitly to
* differentiate it from a transient offlining failure.
*/
- if (cc_platform_has(CC_ATTR_HOTPLUG_DISABLED) ||
- cpu_hotplug_offline_disabled)
+ if (cpu_hotplug_offline_disabled)
return -EOPNOTSUPP;
if (cpu_hotplug_disabled)
return -EBUSY;
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
` (3 preceding siblings ...)
2024-05-28 9:55 ` [PATCHv11 04/19] cpu/hotplug, x86/acpi: Disable CPU offlining for ACPI MADT wakeup Kirill A. Shutemov
@ 2024-05-28 9:55 ` Kirill A. Shutemov
2024-05-29 10:47 ` Nikolay Borisov
2024-05-28 9:55 ` [PATCHv11 06/19] x86/kexec: Keep CR4.MCE set during kexec for TDX guest Kirill A. Shutemov
` (15 subsequent siblings)
20 siblings, 1 reply; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-05-28 9:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
Kirill A. Shutemov, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel
From: Borislav Petkov <bp@alien8.de>
That identity_mapped() functions was loving that "1" label to the point
of completely confusing its readers.
Use named labels in each place for clarity.
No functional changes.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/kernel/relocate_kernel_64.S | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 56cab1bb25f5..085eef5c3904 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -148,9 +148,10 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
*/
movl $X86_CR4_PAE, %eax
testq $X86_CR4_LA57, %r13
- jz 1f
+ jz .Lno_la57
orl $X86_CR4_LA57, %eax
-1:
+.Lno_la57:
+
movq %rax, %cr4
jmp 1f
@@ -165,9 +166,9 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
* used by kexec. Flush the caches before copying the kernel.
*/
testq %r12, %r12
- jz 1f
+ jz .Lsme_off
wbinvd
-1:
+.Lsme_off:
movq %rcx, %r11
call swap_pages
@@ -187,7 +188,7 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
*/
testq %r11, %r11
- jnz 1f
+ jnz .Lrelocate
xorl %eax, %eax
xorl %ebx, %ebx
xorl %ecx, %ecx
@@ -208,7 +209,7 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
ret
int3
-1:
+.Lrelocate:
popq %rdx
leaq PAGE_SIZE(%r10), %rsp
ANNOTATE_RETPOLINE_SAFE
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* [PATCHv11 06/19] x86/kexec: Keep CR4.MCE set during kexec for TDX guest
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
` (4 preceding siblings ...)
2024-05-28 9:55 ` [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion Kirill A. Shutemov
@ 2024-05-28 9:55 ` Kirill A. Shutemov
2024-05-28 11:12 ` Huang, Kai
2024-05-29 11:39 ` Nikolay Borisov
2024-05-28 9:55 ` [PATCHv11 07/19] x86/mm: Make x86_platform.guest.enc_status_change_*() return errno Kirill A. Shutemov
` (14 subsequent siblings)
20 siblings, 2 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-05-28 9:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
Kirill A. Shutemov, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel
TDX guests run with MCA enabled (CR4.MCE=1b) from the very start. If
that bit is cleared during CR4 register reprogramming during boot or
kexec flows, a #VE exception will be raised which the guest kernel
cannot handle it.
Therefore, make sure the CR4.MCE setting is preserved over kexec too and
avoid raising any #VEs.
The change doesn't affect non-TDX-guest environments.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/kernel/relocate_kernel_64.S | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 085eef5c3904..b668a6be4f6f 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -5,6 +5,8 @@
*/
#include <linux/linkage.h>
+#include <linux/stringify.h>
+#include <asm/alternative.h>
#include <asm/page_types.h>
#include <asm/kexec.h>
#include <asm/processor-flags.h>
@@ -143,15 +145,17 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
/*
* Set cr4 to a known state:
- * - physical address extension enabled
* - 5-level paging, if it was enabled before
+ * - Machine check exception on TDX guest, if it was enabled before.
+ * Clearing MCE might not be allowed in TDX guests, depending on setup.
+ * - physical address extension enabled
*/
- movl $X86_CR4_PAE, %eax
- testq $X86_CR4_LA57, %r13
- jz .Lno_la57
- orl $X86_CR4_LA57, %eax
-.Lno_la57:
+ movl $X86_CR4_LA57, %eax
+ ALTERNATIVE "", __stringify(orl $X86_CR4_MCE, %eax), X86_FEATURE_TDX_GUEST
+ /* R13 contains the original CR4 value, read in relocate_kernel() */
+ andl %r13d, %eax
+ orl $X86_CR4_PAE, %eax
movq %rax, %cr4
jmp 1f
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* [PATCHv11 07/19] x86/mm: Make x86_platform.guest.enc_status_change_*() return errno
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
` (5 preceding siblings ...)
2024-05-28 9:55 ` [PATCHv11 06/19] x86/kexec: Keep CR4.MCE set during kexec for TDX guest Kirill A. Shutemov
@ 2024-05-28 9:55 ` Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 08/19] x86/mm: Return correct level from lookup_address() if pte is none Kirill A. Shutemov
` (13 subsequent siblings)
20 siblings, 0 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-05-28 9:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
Kirill A. Shutemov, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel, Dave Hansen,
Tao Liu, Michael Kelley
TDX is going to have more than one reason to fail
enc_status_change_prepare().
Change the callback to return errno instead of assuming -EIO;
enc_status_change_finish() changed too to keep the interface symmetric.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Tested-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
---
arch/x86/coco/tdx/tdx.c | 20 +++++++++++---------
arch/x86/hyperv/ivm.c | 22 ++++++++++------------
arch/x86/include/asm/x86_init.h | 4 ++--
arch/x86/kernel/x86_init.c | 4 ++--
arch/x86/mm/mem_encrypt_amd.c | 8 ++++----
arch/x86/mm/pat/set_memory.c | 12 +++++++-----
6 files changed, 36 insertions(+), 34 deletions(-)
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index c1cb90369915..26fa47db5782 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -798,28 +798,30 @@ static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
return true;
}
-static bool tdx_enc_status_change_prepare(unsigned long vaddr, int numpages,
- bool enc)
+static int tdx_enc_status_change_prepare(unsigned long vaddr, int numpages,
+ bool enc)
{
/*
* Only handle shared->private conversion here.
* See the comment in tdx_early_init().
*/
- if (enc)
- return tdx_enc_status_changed(vaddr, numpages, enc);
- return true;
+ if (enc && !tdx_enc_status_changed(vaddr, numpages, enc))
+ return -EIO;
+
+ return 0;
}
-static bool tdx_enc_status_change_finish(unsigned long vaddr, int numpages,
+static int tdx_enc_status_change_finish(unsigned long vaddr, int numpages,
bool enc)
{
/*
* Only handle private->shared conversion here.
* See the comment in tdx_early_init().
*/
- if (!enc)
- return tdx_enc_status_changed(vaddr, numpages, enc);
- return true;
+ if (!enc && !tdx_enc_status_changed(vaddr, numpages, enc))
+ return -EIO;
+
+ return 0;
}
void __init tdx_early_init(void)
diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c
index 768d73de0d09..b4a851d27c7c 100644
--- a/arch/x86/hyperv/ivm.c
+++ b/arch/x86/hyperv/ivm.c
@@ -523,9 +523,9 @@ static int hv_mark_gpa_visibility(u16 count, const u64 pfn[],
* transition is complete, hv_vtom_set_host_visibility() marks the pages
* as "present" again.
*/
-static bool hv_vtom_clear_present(unsigned long kbuffer, int pagecount, bool enc)
+static int hv_vtom_clear_present(unsigned long kbuffer, int pagecount, bool enc)
{
- return !set_memory_np(kbuffer, pagecount);
+ return set_memory_np(kbuffer, pagecount);
}
/*
@@ -536,20 +536,19 @@ static bool hv_vtom_clear_present(unsigned long kbuffer, int pagecount, bool enc
* with host. This function works as wrap of hv_mark_gpa_visibility()
* with memory base and size.
*/
-static bool hv_vtom_set_host_visibility(unsigned long kbuffer, int pagecount, bool enc)
+static int hv_vtom_set_host_visibility(unsigned long kbuffer, int pagecount, bool enc)
{
enum hv_mem_host_visibility visibility = enc ?
VMBUS_PAGE_NOT_VISIBLE : VMBUS_PAGE_VISIBLE_READ_WRITE;
u64 *pfn_array;
phys_addr_t paddr;
+ int i, pfn, err;
void *vaddr;
int ret = 0;
- bool result = true;
- int i, pfn;
pfn_array = kmalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL);
if (!pfn_array) {
- result = false;
+ ret = -ENOMEM;
goto err_set_memory_p;
}
@@ -568,10 +567,8 @@ static bool hv_vtom_set_host_visibility(unsigned long kbuffer, int pagecount, bo
if (pfn == HV_MAX_MODIFY_GPA_REP_COUNT || i == pagecount - 1) {
ret = hv_mark_gpa_visibility(pfn, pfn_array,
visibility);
- if (ret) {
- result = false;
+ if (ret)
goto err_free_pfn_array;
- }
pfn = 0;
}
}
@@ -586,10 +583,11 @@ static bool hv_vtom_set_host_visibility(unsigned long kbuffer, int pagecount, bo
* order to avoid leaving the memory range in a "broken" state. Setting
* the PRESENT bits shouldn't fail, but return an error if it does.
*/
- if (set_memory_p(kbuffer, pagecount))
- result = false;
+ err = set_memory_p(kbuffer, pagecount);
+ if (err && !ret)
+ ret = err;
- return result;
+ return ret;
}
static bool hv_vtom_tlb_flush_required(bool private)
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index 6149eabe200f..28ac3cb9b987 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -151,8 +151,8 @@ struct x86_init_acpi {
* @enc_cache_flush_required Returns true if a cache flush is needed before changing page encryption status
*/
struct x86_guest {
- bool (*enc_status_change_prepare)(unsigned long vaddr, int npages, bool enc);
- bool (*enc_status_change_finish)(unsigned long vaddr, int npages, bool enc);
+ int (*enc_status_change_prepare)(unsigned long vaddr, int npages, bool enc);
+ int (*enc_status_change_finish)(unsigned long vaddr, int npages, bool enc);
bool (*enc_tlb_flush_required)(bool enc);
bool (*enc_cache_flush_required)(void);
};
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index d5dc5a92635a..a7143bb7dd93 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -134,8 +134,8 @@ struct x86_cpuinit_ops x86_cpuinit = {
static void default_nmi_init(void) { };
-static bool enc_status_change_prepare_noop(unsigned long vaddr, int npages, bool enc) { return true; }
-static bool enc_status_change_finish_noop(unsigned long vaddr, int npages, bool enc) { return true; }
+static int enc_status_change_prepare_noop(unsigned long vaddr, int npages, bool enc) { return 0; }
+static int enc_status_change_finish_noop(unsigned long vaddr, int npages, bool enc) { return 0; }
static bool enc_tlb_flush_required_noop(bool enc) { return false; }
static bool enc_cache_flush_required_noop(void) { return false; }
static bool is_private_mmio_noop(u64 addr) {return false; }
diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
index 422602f6039b..e7b67519ddb5 100644
--- a/arch/x86/mm/mem_encrypt_amd.c
+++ b/arch/x86/mm/mem_encrypt_amd.c
@@ -283,7 +283,7 @@ static void enc_dec_hypercall(unsigned long vaddr, unsigned long size, bool enc)
#endif
}
-static bool amd_enc_status_change_prepare(unsigned long vaddr, int npages, bool enc)
+static int amd_enc_status_change_prepare(unsigned long vaddr, int npages, bool enc)
{
/*
* To maintain the security guarantees of SEV-SNP guests, make sure
@@ -292,11 +292,11 @@ static bool amd_enc_status_change_prepare(unsigned long vaddr, int npages, bool
if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && !enc)
snp_set_memory_shared(vaddr, npages);
- return true;
+ return 0;
}
/* Return true unconditionally: return value doesn't matter for the SEV side */
-static bool amd_enc_status_change_finish(unsigned long vaddr, int npages, bool enc)
+static int amd_enc_status_change_finish(unsigned long vaddr, int npages, bool enc)
{
/*
* After memory is mapped encrypted in the page table, validate it
@@ -308,7 +308,7 @@ static bool amd_enc_status_change_finish(unsigned long vaddr, int npages, bool e
if (!cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
enc_dec_hypercall(vaddr, npages << PAGE_SHIFT, enc);
- return true;
+ return 0;
}
static void __init __set_clr_pte_enc(pte_t *kpte, int level, bool enc)
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 19fdfbb171ed..498812f067cd 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2196,7 +2196,8 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
cpa_flush(&cpa, x86_platform.guest.enc_cache_flush_required());
/* Notify hypervisor that we are about to set/clr encryption attribute. */
- if (!x86_platform.guest.enc_status_change_prepare(addr, numpages, enc))
+ ret = x86_platform.guest.enc_status_change_prepare(addr, numpages, enc);
+ if (ret)
goto vmm_fail;
ret = __change_page_attr_set_clr(&cpa, 1);
@@ -2214,16 +2215,17 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
return ret;
/* Notify hypervisor that we have successfully set/clr encryption attribute. */
- if (!x86_platform.guest.enc_status_change_finish(addr, numpages, enc))
+ ret = x86_platform.guest.enc_status_change_finish(addr, numpages, enc);
+ if (ret)
goto vmm_fail;
return 0;
vmm_fail:
- WARN_ONCE(1, "CPA VMM failure to convert memory (addr=%p, numpages=%d) to %s.\n",
- (void *)addr, numpages, enc ? "private" : "shared");
+ WARN_ONCE(1, "CPA VMM failure to convert memory (addr=%p, numpages=%d) to %s: %d\n",
+ (void *)addr, numpages, enc ? "private" : "shared", ret);
- return -EIO;
+ return ret;
}
static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* [PATCHv11 08/19] x86/mm: Return correct level from lookup_address() if pte is none
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
` (6 preceding siblings ...)
2024-05-28 9:55 ` [PATCHv11 07/19] x86/mm: Make x86_platform.guest.enc_status_change_*() return errno Kirill A. Shutemov
@ 2024-05-28 9:55 ` Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 09/19] x86/tdx: Account shared memory Kirill A. Shutemov
` (12 subsequent siblings)
20 siblings, 0 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-05-28 9:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
Kirill A. Shutemov, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel, Dave Hansen,
Tao Liu
Currently, lookup_address() returns two things:
1. A "pte_t" (which might be a p[g4um]d_t)
2. The 'level' of the page tables where the "pte_t" was found
(returned via a pointer)
If no pte_t is found, 'level' is essentially garbage.
Always fill out the level. For NULL "pte_t"s, fill in the level where
the p*d_none() entry was found mirroring the "found" behavior.
Always filling out the level allows using lookup_address() to precisely
skip over holes when walking kernel page tables.
Add one more entry into enum pg_level to indicate the size of the VA
covered by one PGD entry in 5-level paging mode.
Update comments for lookup_address() and lookup_address_in_pgd() to
reflect changes in the interface.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Tested-by: Tao Liu <ltao@redhat.com>
---
arch/x86/include/asm/pgtable_types.h | 1 +
arch/x86/mm/pat/set_memory.c | 21 ++++++++++-----------
2 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index b78644962626..2f321137736c 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -549,6 +549,7 @@ enum pg_level {
PG_LEVEL_2M,
PG_LEVEL_1G,
PG_LEVEL_512G,
+ PG_LEVEL_256T,
PG_LEVEL_NUM
};
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 498812f067cd..a7a7a6c6a3fb 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -662,8 +662,9 @@ static inline pgprot_t verify_rwx(pgprot_t old, pgprot_t new, unsigned long star
/*
* Lookup the page table entry for a virtual address in a specific pgd.
- * Return a pointer to the entry, the level of the mapping, and the effective
- * NX and RW bits of all page table levels.
+ * Return a pointer to the entry (or NULL if the entry does not exist),
+ * the level of the entry, and the effective NX and RW bits of all
+ * page table levels.
*/
pte_t *lookup_address_in_pgd_attr(pgd_t *pgd, unsigned long address,
unsigned int *level, bool *nx, bool *rw)
@@ -672,13 +673,14 @@ pte_t *lookup_address_in_pgd_attr(pgd_t *pgd, unsigned long address,
pud_t *pud;
pmd_t *pmd;
- *level = PG_LEVEL_NONE;
+ *level = PG_LEVEL_256T;
*nx = false;
*rw = true;
if (pgd_none(*pgd))
return NULL;
+ *level = PG_LEVEL_512G;
*nx |= pgd_flags(*pgd) & _PAGE_NX;
*rw &= pgd_flags(*pgd) & _PAGE_RW;
@@ -686,10 +688,10 @@ pte_t *lookup_address_in_pgd_attr(pgd_t *pgd, unsigned long address,
if (p4d_none(*p4d))
return NULL;
- *level = PG_LEVEL_512G;
if (p4d_leaf(*p4d) || !p4d_present(*p4d))
return (pte_t *)p4d;
+ *level = PG_LEVEL_1G;
*nx |= p4d_flags(*p4d) & _PAGE_NX;
*rw &= p4d_flags(*p4d) & _PAGE_RW;
@@ -697,10 +699,10 @@ pte_t *lookup_address_in_pgd_attr(pgd_t *pgd, unsigned long address,
if (pud_none(*pud))
return NULL;
- *level = PG_LEVEL_1G;
if (pud_leaf(*pud) || !pud_present(*pud))
return (pte_t *)pud;
+ *level = PG_LEVEL_2M;
*nx |= pud_flags(*pud) & _PAGE_NX;
*rw &= pud_flags(*pud) & _PAGE_RW;
@@ -708,15 +710,13 @@ pte_t *lookup_address_in_pgd_attr(pgd_t *pgd, unsigned long address,
if (pmd_none(*pmd))
return NULL;
- *level = PG_LEVEL_2M;
if (pmd_leaf(*pmd) || !pmd_present(*pmd))
return (pte_t *)pmd;
+ *level = PG_LEVEL_4K;
*nx |= pmd_flags(*pmd) & _PAGE_NX;
*rw &= pmd_flags(*pmd) & _PAGE_RW;
- *level = PG_LEVEL_4K;
-
return pte_offset_kernel(pmd, address);
}
@@ -736,9 +736,8 @@ pte_t *lookup_address_in_pgd(pgd_t *pgd, unsigned long address,
* Lookup the page table entry for a virtual address. Return a pointer
* to the entry and the level of the mapping.
*
- * Note: We return pud and pmd either when the entry is marked large
- * or when the present bit is not set. Otherwise we would return a
- * pointer to a nonexisting mapping.
+ * Note: the function returns p4d, pud or pmd either when the entry is marked
+ * large or when the present bit is not set. Otherwise it returns NULL.
*/
pte_t *lookup_address(unsigned long address, unsigned int *level)
{
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* [PATCHv11 09/19] x86/tdx: Account shared memory
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
` (7 preceding siblings ...)
2024-05-28 9:55 ` [PATCHv11 08/19] x86/mm: Return correct level from lookup_address() if pte is none Kirill A. Shutemov
@ 2024-05-28 9:55 ` Kirill A. Shutemov
2024-06-04 16:08 ` Dave Hansen
2024-05-28 9:55 ` [PATCHv11 10/19] x86/mm: Add callbacks to prepare encrypted memory for kexec Kirill A. Shutemov
` (11 subsequent siblings)
20 siblings, 1 reply; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-05-28 9:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
Kirill A. Shutemov, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel, Tao Liu
The kernel will convert all shared memory back to private during kexec.
The direct mapping page tables will provide information on which memory
is shared.
It is extremely important to convert all shared memory. If a page is
missed, it will cause the second kernel to crash when it accesses it.
Keep track of the number of shared pages. This will allow for
cross-checking against the shared information in the direct mapping and
reporting if the shared bit is lost.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Tested-by: Tao Liu <ltao@redhat.com>
---
arch/x86/coco/tdx/tdx.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 26fa47db5782..979891e97d83 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -38,6 +38,8 @@
#define TDREPORT_SUBTYPE_0 0
+static atomic_long_t nr_shared;
+
/* Called from __tdx_hypercall() for unrecoverable failure */
noinstr void __noreturn __tdx_hypercall_failed(void)
{
@@ -821,6 +823,11 @@ static int tdx_enc_status_change_finish(unsigned long vaddr, int numpages,
if (!enc && !tdx_enc_status_changed(vaddr, numpages, enc))
return -EIO;
+ if (enc)
+ atomic_long_sub(numpages, &nr_shared);
+ else
+ atomic_long_add(numpages, &nr_shared);
+
return 0;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* [PATCHv11 10/19] x86/mm: Add callbacks to prepare encrypted memory for kexec
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
` (8 preceding siblings ...)
2024-05-28 9:55 ` [PATCHv11 09/19] x86/tdx: Account shared memory Kirill A. Shutemov
@ 2024-05-28 9:55 ` Kirill A. Shutemov
2024-05-29 10:42 ` Borislav Petkov
2024-06-04 16:16 ` [PATCHv11 " Dave Hansen
2024-05-28 9:55 ` [PATCHv11 11/19] x86/tdx: Convert shared memory back to private on kexec Kirill A. Shutemov
` (10 subsequent siblings)
20 siblings, 2 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-05-28 9:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
Kirill A. Shutemov, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel,
Nikolay Borisov, Tao Liu
AMD SEV and Intel TDX guests allocate shared buffers for performing I/O.
This is done by allocating pages normally from the buddy allocator and
then converting them to shared using set_memory_decrypted().
On kexec, the second kernel is unaware of which memory has been
converted in this manner. It only sees E820_TYPE_RAM. Accessing shared
memory as private is fatal.
Therefore, the memory state must be reset to its original state before
starting the new kernel with kexec.
The process of converting shared memory back to private occurs in two
steps:
- enc_kexec_begin() stops new conversions.
- enc_kexec_finish() unshares all existing shared memory, reverting it
back to private.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Tested-by: Tao Liu <ltao@redhat.com>
---
arch/x86/include/asm/x86_init.h | 9 +++++++++
arch/x86/kernel/crash.c | 12 ++++++++++++
arch/x86/kernel/reboot.c | 12 ++++++++++++
arch/x86/kernel/x86_init.c | 4 ++++
4 files changed, 37 insertions(+)
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index 28ac3cb9b987..6cade48811cc 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -149,12 +149,21 @@ struct x86_init_acpi {
* @enc_status_change_finish Notify HV after the encryption status of a range is changed
* @enc_tlb_flush_required Returns true if a TLB flush is needed before changing page encryption status
* @enc_cache_flush_required Returns true if a cache flush is needed before changing page encryption status
+ * @enc_kexec_begin Begin the two-step process of conversion shared memory back
+ * to private. It stops the new conversions from being started
+ * and waits in-flight conversions to finish, if possible.
+ * @enc_kexec_finish Finish the two-step process of conversion shared memory to
+ * private. All memory is private after the call.
+ * It called with all CPUs but one shutdown and interrupts
+ * disabled.
*/
struct x86_guest {
int (*enc_status_change_prepare)(unsigned long vaddr, int npages, bool enc);
int (*enc_status_change_finish)(unsigned long vaddr, int npages, bool enc);
bool (*enc_tlb_flush_required)(bool enc);
bool (*enc_cache_flush_required)(void);
+ void (*enc_kexec_begin)(bool crash);
+ void (*enc_kexec_finish)(void);
};
/**
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index f06501445cd9..74f6305eb9ec 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -128,6 +128,18 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
#ifdef CONFIG_HPET_TIMER
hpet_disable();
#endif
+
+ /*
+ * Non-crash kexec calls enc_kexec_begin() while scheduling is still
+ * active. This allows the callback to wait until all in-flight
+ * shared<->private conversions are complete. In a crash scenario,
+ * enc_kexec_begin() get call after all but one CPU has been shut down
+ * and interrupts have been disabled. This only allows the callback to
+ * detect a race with the conversion and report it.
+ */
+ x86_platform.guest.enc_kexec_begin(true);
+ x86_platform.guest.enc_kexec_finish();
+
crash_save_cpu(regs, safe_smp_processor_id());
}
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index f3130f762784..097313147ad3 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -12,6 +12,7 @@
#include <linux/delay.h>
#include <linux/objtool.h>
#include <linux/pgtable.h>
+#include <linux/kexec.h>
#include <acpi/reboot.h>
#include <asm/io.h>
#include <asm/apic.h>
@@ -716,6 +717,14 @@ static void native_machine_emergency_restart(void)
void native_machine_shutdown(void)
{
+ /*
+ * Call enc_kexec_begin() while all CPUs are still active and
+ * interrupts are enabled. This will allow all in-flight memory
+ * conversions to finish cleanly.
+ */
+ if (kexec_in_progress)
+ x86_platform.guest.enc_kexec_begin(false);
+
/* Stop the cpus and apics */
#ifdef CONFIG_X86_IO_APIC
/*
@@ -752,6 +761,9 @@ void native_machine_shutdown(void)
#ifdef CONFIG_X86_64
x86_platform.iommu_shutdown();
#endif
+
+ if (kexec_in_progress)
+ x86_platform.guest.enc_kexec_finish();
}
static void __machine_emergency_restart(int emergency)
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index a7143bb7dd93..8a79fb505303 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -138,6 +138,8 @@ static int enc_status_change_prepare_noop(unsigned long vaddr, int npages, bool
static int enc_status_change_finish_noop(unsigned long vaddr, int npages, bool enc) { return 0; }
static bool enc_tlb_flush_required_noop(bool enc) { return false; }
static bool enc_cache_flush_required_noop(void) { return false; }
+static void enc_kexec_begin_noop(bool crash) {}
+static void enc_kexec_finish_noop(void) {}
static bool is_private_mmio_noop(u64 addr) {return false; }
struct x86_platform_ops x86_platform __ro_after_init = {
@@ -161,6 +163,8 @@ struct x86_platform_ops x86_platform __ro_after_init = {
.enc_status_change_finish = enc_status_change_finish_noop,
.enc_tlb_flush_required = enc_tlb_flush_required_noop,
.enc_cache_flush_required = enc_cache_flush_required_noop,
+ .enc_kexec_begin = enc_kexec_begin_noop,
+ .enc_kexec_finish = enc_kexec_finish_noop,
},
};
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* [PATCHv11 11/19] x86/tdx: Convert shared memory back to private on kexec
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
` (9 preceding siblings ...)
2024-05-28 9:55 ` [PATCHv11 10/19] x86/mm: Add callbacks to prepare encrypted memory for kexec Kirill A. Shutemov
@ 2024-05-28 9:55 ` Kirill A. Shutemov
2024-05-31 15:14 ` Borislav Petkov
2024-06-04 16:27 ` [PATCHv11 " Dave Hansen
2024-05-28 9:55 ` [PATCHv11 12/19] x86/mm: Make e820__end_ram_pfn() cover E820_TYPE_ACPI ranges Kirill A. Shutemov
` (9 subsequent siblings)
20 siblings, 2 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-05-28 9:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
Kirill A. Shutemov, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel, Tao Liu
TDX guests allocate shared buffers to perform I/O. It is done by
allocating pages normally from the buddy allocator and converting them
to shared with set_memory_decrypted().
The second, kexec-ed kernel has no idea what memory is converted this
way. It only sees E820_TYPE_RAM.
Accessing shared memory via private mapping is fatal. It leads to
unrecoverable TD exit.
On kexec walk direct mapping and convert all shared memory back to
private. It makes all RAM private again and second kernel may use it
normally.
The conversion occurs in two steps: stopping new conversions and
unsharing all memory. In the case of normal kexec, the stopping of
conversions takes place while scheduling is still functioning. This
allows for waiting until any ongoing conversions are finished. The
second step is carried out when all CPUs except one are inactive and
interrupts are disabled. This prevents any conflicts with code that may
access shared memory.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Tested-by: Tao Liu <ltao@redhat.com>
---
arch/x86/coco/tdx/tdx.c | 69 +++++++++++++++++++++++++++++++
arch/x86/include/asm/pgtable.h | 5 +++
arch/x86/include/asm/set_memory.h | 3 ++
arch/x86/mm/pat/set_memory.c | 41 ++++++++++++++++--
4 files changed, 115 insertions(+), 3 deletions(-)
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 979891e97d83..c0a651fa8963 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -7,6 +7,7 @@
#include <linux/cpufeature.h>
#include <linux/export.h>
#include <linux/io.h>
+#include <linux/kexec.h>
#include <asm/coco.h>
#include <asm/tdx.h>
#include <asm/vmx.h>
@@ -14,6 +15,7 @@
#include <asm/insn.h>
#include <asm/insn-eval.h>
#include <asm/pgtable.h>
+#include <asm/set_memory.h>
/* MMIO direction */
#define EPT_READ 0
@@ -831,6 +833,70 @@ static int tdx_enc_status_change_finish(unsigned long vaddr, int numpages,
return 0;
}
+/* Stop new private<->shared conversions */
+static void tdx_kexec_begin(bool crash)
+{
+ /*
+ * Crash kernel reaches here with interrupts disabled: can't wait for
+ * conversions to finish.
+ *
+ * If race happened, just report and proceed.
+ */
+ if (!set_memory_enc_stop_conversion(!crash))
+ pr_warn("Failed to stop shared<->private conversions\n");
+}
+
+/* Walk direct mapping and convert all shared memory back to private */
+static void tdx_kexec_finish(void)
+{
+ unsigned long addr, end;
+ long found = 0, shared;
+
+ lockdep_assert_irqs_disabled();
+
+ addr = PAGE_OFFSET;
+ end = PAGE_OFFSET + get_max_mapped();
+
+ while (addr < end) {
+ unsigned long size;
+ unsigned int level;
+ pte_t *pte;
+
+ pte = lookup_address(addr, &level);
+ size = page_level_size(level);
+
+ if (pte && pte_decrypted(*pte)) {
+ int pages = size / PAGE_SIZE;
+
+ /*
+ * Touching memory with shared bit set triggers implicit
+ * conversion to shared.
+ *
+ * Make sure nobody touches the shared range from
+ * now on.
+ */
+ set_pte(pte, __pte(0));
+
+ if (!tdx_enc_status_changed(addr, pages, true)) {
+ pr_err("Failed to unshare range %#lx-%#lx\n",
+ addr, addr + size);
+ }
+
+ found += pages;
+ }
+
+ addr += size;
+ }
+
+ __flush_tlb_all();
+
+ shared = atomic_long_read(&nr_shared);
+ if (shared != found) {
+ pr_err("shared page accounting is off\n");
+ pr_err("nr_shared = %ld, nr_found = %ld\n", shared, found);
+ }
+}
+
void __init tdx_early_init(void)
{
struct tdx_module_args args = {
@@ -890,6 +956,9 @@ void __init tdx_early_init(void)
x86_platform.guest.enc_cache_flush_required = tdx_cache_flush_required;
x86_platform.guest.enc_tlb_flush_required = tdx_tlb_flush_required;
+ x86_platform.guest.enc_kexec_begin = tdx_kexec_begin;
+ x86_platform.guest.enc_kexec_finish = tdx_kexec_finish;
+
/*
* TDX intercepts the RDMSR to read the X2APIC ID in the parallel
* bringup low level code. That raises #VE which cannot be handled
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 65b8e5bb902c..e39311a89bf4 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -140,6 +140,11 @@ static inline int pte_young(pte_t pte)
return pte_flags(pte) & _PAGE_ACCESSED;
}
+static inline bool pte_decrypted(pte_t pte)
+{
+ return cc_mkdec(pte_val(pte)) == pte_val(pte);
+}
+
#define pmd_dirty pmd_dirty
static inline bool pmd_dirty(pmd_t pmd)
{
diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h
index 9aee31862b4a..d490db38db9e 100644
--- a/arch/x86/include/asm/set_memory.h
+++ b/arch/x86/include/asm/set_memory.h
@@ -49,8 +49,11 @@ int set_memory_wb(unsigned long addr, int numpages);
int set_memory_np(unsigned long addr, int numpages);
int set_memory_p(unsigned long addr, int numpages);
int set_memory_4k(unsigned long addr, int numpages);
+
+bool set_memory_enc_stop_conversion(bool wait);
int set_memory_encrypted(unsigned long addr, int numpages);
int set_memory_decrypted(unsigned long addr, int numpages);
+
int set_memory_np_noalias(unsigned long addr, int numpages);
int set_memory_nonglobal(unsigned long addr, int numpages);
int set_memory_global(unsigned long addr, int numpages);
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index a7a7a6c6a3fb..2a548b65ef5f 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2227,12 +2227,47 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
return ret;
}
+/*
+ * The lock serializes conversions between private and shared memory.
+ *
+ * It is taken for read on conversion. A write lock guarantees that no
+ * concurrent conversions are in progress.
+ */
+static DECLARE_RWSEM(mem_enc_lock);
+
+/*
+ * Stop new private<->shared conversions.
+ *
+ * Taking the exclusive mem_enc_lock waits for in-flight conversions to complete.
+ * The lock is not released to prevent new conversions from being started.
+ *
+ * If sleep is not allowed, as in a crash scenario, try to take the lock.
+ * Failure indicates that there is a race with the conversion.
+ */
+bool set_memory_enc_stop_conversion(bool wait)
+{
+ if (!wait)
+ return down_write_trylock(&mem_enc_lock);
+
+ down_write(&mem_enc_lock);
+
+ return true;
+}
+
static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
{
- if (cc_platform_has(CC_ATTR_MEM_ENCRYPT))
- return __set_memory_enc_pgtable(addr, numpages, enc);
+ int ret = 0;
- return 0;
+ if (cc_platform_has(CC_ATTR_MEM_ENCRYPT)) {
+ if (!down_read_trylock(&mem_enc_lock))
+ return -EBUSY;
+
+ ret = __set_memory_enc_pgtable(addr, numpages, enc);
+
+ up_read(&mem_enc_lock);
+ }
+
+ return ret;
}
int set_memory_encrypted(unsigned long addr, int numpages)
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* [PATCHv11 12/19] x86/mm: Make e820__end_ram_pfn() cover E820_TYPE_ACPI ranges
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
` (10 preceding siblings ...)
2024-05-28 9:55 ` [PATCHv11 11/19] x86/tdx: Convert shared memory back to private on kexec Kirill A. Shutemov
@ 2024-05-28 9:55 ` Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 13/19] x86/mm: Do not zap page table entries mapping unaccepted memory table during kdump Kirill A. Shutemov
` (8 subsequent siblings)
20 siblings, 0 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-05-28 9:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
Kirill A. Shutemov, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel, Dave Hansen,
Tao Liu
e820__end_of_ram_pfn() is used to calculate max_pfn which, among other
things, guides where direct mapping ends. Any memory above max_pfn is
not going to be present in the direct mapping.
e820__end_of_ram_pfn() finds the end of the RAM based on the highest
E820_TYPE_RAM range. But it doesn't includes E820_TYPE_ACPI ranges into
calculation.
Despite the name, E820_TYPE_ACPI covers not only ACPI data, but also EFI
tables and might be required by kernel to function properly.
Usually the problem is hidden because there is some E820_TYPE_RAM memory
above E820_TYPE_ACPI. But crashkernel only presents pre-allocated crash
memory as E820_TYPE_RAM on boot. If the preallocated range is small, it
can fit under the last E820_TYPE_ACPI range.
Modify e820__end_of_ram_pfn() and e820__end_of_low_ram_pfn() to cover
E820_TYPE_ACPI memory.
The problem was discovered during debugging kexec for TDX guest. TDX
guest uses E820_TYPE_ACPI to store the unaccepted memory bitmap and pass
it between the kernels on kexec.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Tested-by: Tao Liu <ltao@redhat.com>
---
arch/x86/kernel/e820.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 68b09f718f10..4893d30ce438 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -828,7 +828,7 @@ u64 __init e820__memblock_alloc_reserved(u64 size, u64 align)
/*
* Find the highest page frame number we have available
*/
-static unsigned long __init e820_end_pfn(unsigned long limit_pfn, enum e820_type type)
+static unsigned long __init e820__end_ram_pfn(unsigned long limit_pfn)
{
int i;
unsigned long last_pfn = 0;
@@ -839,7 +839,8 @@ static unsigned long __init e820_end_pfn(unsigned long limit_pfn, enum e820_type
unsigned long start_pfn;
unsigned long end_pfn;
- if (entry->type != type)
+ if (entry->type != E820_TYPE_RAM &&
+ entry->type != E820_TYPE_ACPI)
continue;
start_pfn = entry->addr >> PAGE_SHIFT;
@@ -865,12 +866,12 @@ static unsigned long __init e820_end_pfn(unsigned long limit_pfn, enum e820_type
unsigned long __init e820__end_of_ram_pfn(void)
{
- return e820_end_pfn(MAX_ARCH_PFN, E820_TYPE_RAM);
+ return e820__end_ram_pfn(MAX_ARCH_PFN);
}
unsigned long __init e820__end_of_low_ram_pfn(void)
{
- return e820_end_pfn(1UL << (32 - PAGE_SHIFT), E820_TYPE_RAM);
+ return e820__end_ram_pfn(1UL << (32 - PAGE_SHIFT));
}
static void __init early_panic(char *msg)
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* [PATCHv11 13/19] x86/mm: Do not zap page table entries mapping unaccepted memory table during kdump.
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
` (11 preceding siblings ...)
2024-05-28 9:55 ` [PATCHv11 12/19] x86/mm: Make e820__end_ram_pfn() cover E820_TYPE_ACPI ranges Kirill A. Shutemov
@ 2024-05-28 9:55 ` Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 14/19] x86/acpi: Rename fields in acpi_madt_multiproc_wakeup structure Kirill A. Shutemov
` (7 subsequent siblings)
20 siblings, 0 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-05-28 9:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
Kirill A. Shutemov, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel
From: Ashish Kalra <ashish.kalra@amd.com>
During crashkernel boot only pre-allocated crash memory is presented as
E820_TYPE_RAM. This can cause page table entries mapping unaccepted memory
table to be zapped during phys_pte_init(), phys_pmd_init(), phys_pud_init()
and phys_p4d_init() as SNP/TDX guest use E820_TYPE_ACPI to store the
unaccepted memory table and pass it between the kernels on
kexec/kdump.
E820_TYPE_ACPI covers not only ACPI data, but also EFI tables and might
be required by kernel to function properly.
The problem was discovered during debugging kdump for SNP guest. The
unaccepted memory table stored with E820_TYPE_ACPI and passed between
the kernels on kdump was getting zapped as the PMD entry mapping this
is above the E820_TYPE_RAM range for the reserved crashkernel memory.
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/mm/init_64.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 7e177856ee4f..28002cc7a37d 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -469,7 +469,9 @@ phys_pte_init(pte_t *pte_page, unsigned long paddr, unsigned long paddr_end,
!e820__mapped_any(paddr & PAGE_MASK, paddr_next,
E820_TYPE_RAM) &&
!e820__mapped_any(paddr & PAGE_MASK, paddr_next,
- E820_TYPE_RESERVED_KERN))
+ E820_TYPE_RESERVED_KERN) &&
+ !e820__mapped_any(paddr & PAGE_MASK, paddr_next,
+ E820_TYPE_ACPI))
set_pte_init(pte, __pte(0), init);
continue;
}
@@ -524,7 +526,9 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long paddr, unsigned long paddr_end,
!e820__mapped_any(paddr & PMD_MASK, paddr_next,
E820_TYPE_RAM) &&
!e820__mapped_any(paddr & PMD_MASK, paddr_next,
- E820_TYPE_RESERVED_KERN))
+ E820_TYPE_RESERVED_KERN) &&
+ !e820__mapped_any(paddr & PMD_MASK, paddr_next,
+ E820_TYPE_ACPI))
set_pmd_init(pmd, __pmd(0), init);
continue;
}
@@ -611,7 +615,9 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
!e820__mapped_any(paddr & PUD_MASK, paddr_next,
E820_TYPE_RAM) &&
!e820__mapped_any(paddr & PUD_MASK, paddr_next,
- E820_TYPE_RESERVED_KERN))
+ E820_TYPE_RESERVED_KERN) &&
+ !e820__mapped_any(paddr & PUD_MASK, paddr_next,
+ E820_TYPE_ACPI))
set_pud_init(pud, __pud(0), init);
continue;
}
@@ -698,7 +704,9 @@ phys_p4d_init(p4d_t *p4d_page, unsigned long paddr, unsigned long paddr_end,
!e820__mapped_any(paddr & P4D_MASK, paddr_next,
E820_TYPE_RAM) &&
!e820__mapped_any(paddr & P4D_MASK, paddr_next,
- E820_TYPE_RESERVED_KERN))
+ E820_TYPE_RESERVED_KERN) &&
+ !e820__mapped_any(paddr & P4D_MASK, paddr_next,
+ E820_TYPE_ACPI))
set_p4d_init(p4d, __p4d(0), init);
continue;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* [PATCHv11 14/19] x86/acpi: Rename fields in acpi_madt_multiproc_wakeup structure
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
` (12 preceding siblings ...)
2024-05-28 9:55 ` [PATCHv11 13/19] x86/mm: Do not zap page table entries mapping unaccepted memory table during kdump Kirill A. Shutemov
@ 2024-05-28 9:55 ` Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 15/19] x86/acpi: Do not attempt to bring up secondary CPUs in kexec case Kirill A. Shutemov
` (6 subsequent siblings)
20 siblings, 0 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-05-28 9:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
Kirill A. Shutemov, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel, Tao Liu
In order to support MADT wakeup structure version 1, provide more
appropriate names for the fields in the structure.
Rename 'mailbox_version' to 'version'. This field signifies the version
of the structure and the related protocols, rather than the version of
the mailbox. This field has not been utilized in the code thus far.
Rename 'base_address' to 'mailbox_address' to clarify the kind of
address it represents. In version 1, the structure includes the reset
vector address. Clear and distinct naming helps to prevent any
confusion.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Tao Liu <ltao@redhat.com>
---
arch/x86/kernel/acpi/madt_wakeup.c | 2 +-
include/acpi/actbl2.h | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/acpi/madt_wakeup.c b/arch/x86/kernel/acpi/madt_wakeup.c
index d222be8d7a07..004801b9b151 100644
--- a/arch/x86/kernel/acpi/madt_wakeup.c
+++ b/arch/x86/kernel/acpi/madt_wakeup.c
@@ -75,7 +75,7 @@ int __init acpi_parse_mp_wake(union acpi_subtable_headers *header,
acpi_table_print_madt_entry(&header->common);
- acpi_mp_wake_mailbox_paddr = mp_wake->base_address;
+ acpi_mp_wake_mailbox_paddr = mp_wake->mailbox_address;
cpu_hotplug_disable_offlining();
diff --git a/include/acpi/actbl2.h b/include/acpi/actbl2.h
index ae747c89d92c..fa63362469aa 100644
--- a/include/acpi/actbl2.h
+++ b/include/acpi/actbl2.h
@@ -1194,9 +1194,9 @@ struct acpi_madt_generic_translator {
struct acpi_madt_multiproc_wakeup {
struct acpi_subtable_header header;
- u16 mailbox_version;
+ u16 version;
u32 reserved; /* reserved - must be zero */
- u64 base_address;
+ u64 mailbox_address;
};
#define ACPI_MULTIPROC_WAKEUP_MB_OS_SIZE 2032
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* [PATCHv11 15/19] x86/acpi: Do not attempt to bring up secondary CPUs in kexec case
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
` (13 preceding siblings ...)
2024-05-28 9:55 ` [PATCHv11 14/19] x86/acpi: Rename fields in acpi_madt_multiproc_wakeup structure Kirill A. Shutemov
@ 2024-05-28 9:55 ` Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 16/19] x86/smp: Add smp_ops.stop_this_cpu() callback Kirill A. Shutemov
` (5 subsequent siblings)
20 siblings, 0 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-05-28 9:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
Kirill A. Shutemov, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel, Tao Liu
ACPI MADT doesn't allow to offline a CPU after it was onlined. This
limits kexec: the second kernel won't be able to use more than one CPU.
To prevent a kexec kernel from onlining secondary CPUs invalidate the
mailbox address in the ACPI MADT wakeup structure which prevents a
kexec kernel to use it.
This is safe as the booting kernel has the mailbox address cached
already and acpi_wakeup_cpu() uses the cached value to bring up the
secondary CPUs.
Note: This is a Linux specific convention and not covered by the
ACPI specification.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Tao Liu <ltao@redhat.com>
---
arch/x86/kernel/acpi/madt_wakeup.c | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/acpi/madt_wakeup.c b/arch/x86/kernel/acpi/madt_wakeup.c
index 004801b9b151..30820f9de5af 100644
--- a/arch/x86/kernel/acpi/madt_wakeup.c
+++ b/arch/x86/kernel/acpi/madt_wakeup.c
@@ -14,6 +14,11 @@ static struct acpi_madt_multiproc_wakeup_mailbox *acpi_mp_wake_mailbox __ro_afte
static int acpi_wakeup_cpu(u32 apicid, unsigned long start_ip)
{
+ if (!acpi_mp_wake_mailbox_paddr) {
+ pr_warn_once("No MADT mailbox: cannot bringup secondary CPUs. Booting with kexec?\n");
+ return -EOPNOTSUPP;
+ }
+
/*
* Remap mailbox memory only for the first call to acpi_wakeup_cpu().
*
@@ -64,6 +69,28 @@ static int acpi_wakeup_cpu(u32 apicid, unsigned long start_ip)
return 0;
}
+static void acpi_mp_disable_offlining(struct acpi_madt_multiproc_wakeup *mp_wake)
+{
+ cpu_hotplug_disable_offlining();
+
+ /*
+ * ACPI MADT doesn't allow to offline a CPU after it was onlined. This
+ * limits kexec: the second kernel won't be able to use more than one CPU.
+ *
+ * To prevent a kexec kernel from onlining secondary CPUs invalidate the
+ * mailbox address in the ACPI MADT wakeup structure which prevents a
+ * kexec kernel to use it.
+ *
+ * This is safe as the booting kernel has the mailbox address cached
+ * already and acpi_wakeup_cpu() uses the cached value to bring up the
+ * secondary CPUs.
+ *
+ * Note: This is a Linux specific convention and not covered by the
+ * ACPI specification.
+ */
+ mp_wake->mailbox_address = 0;
+}
+
int __init acpi_parse_mp_wake(union acpi_subtable_headers *header,
const unsigned long end)
{
@@ -77,7 +104,7 @@ int __init acpi_parse_mp_wake(union acpi_subtable_headers *header,
acpi_mp_wake_mailbox_paddr = mp_wake->mailbox_address;
- cpu_hotplug_disable_offlining();
+ acpi_mp_disable_offlining(mp_wake);
apic_update_callback(wakeup_secondary_cpu_64, acpi_wakeup_cpu);
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* [PATCHv11 16/19] x86/smp: Add smp_ops.stop_this_cpu() callback
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
` (14 preceding siblings ...)
2024-05-28 9:55 ` [PATCHv11 15/19] x86/acpi: Do not attempt to bring up secondary CPUs in kexec case Kirill A. Shutemov
@ 2024-05-28 9:55 ` Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 17/19] x86/mm: Introduce kernel_ident_mapping_free() Kirill A. Shutemov
` (4 subsequent siblings)
20 siblings, 0 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-05-28 9:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
Kirill A. Shutemov, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel, Tao Liu
If the helper is defined, it is called instead of halt() to stop the CPU
at the end of stop_this_cpu() and on crash CPU shutdown.
ACPI MADT will use it to hand over the CPU to BIOS in order to be able
to wake it up again after kexec.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Tao Liu <ltao@redhat.com>
---
arch/x86/include/asm/smp.h | 1 +
arch/x86/kernel/process.c | 7 +++++++
arch/x86/kernel/reboot.c | 6 ++++++
3 files changed, 14 insertions(+)
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index a35936b512fe..ca073f40698f 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -35,6 +35,7 @@ struct smp_ops {
int (*cpu_disable)(void);
void (*cpu_die)(unsigned int cpu);
void (*play_dead)(void);
+ void (*stop_this_cpu)(void);
void (*send_call_func_ipi)(const struct cpumask *mask);
void (*send_call_func_single_ipi)(int cpu);
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index b8441147eb5e..f63f8fd00a91 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -835,6 +835,13 @@ void __noreturn stop_this_cpu(void *dummy)
*/
cpumask_clear_cpu(cpu, &cpus_stop_mask);
+#ifdef CONFIG_SMP
+ if (smp_ops.stop_this_cpu) {
+ smp_ops.stop_this_cpu();
+ unreachable();
+ }
+#endif
+
for (;;) {
/*
* Use native_halt() so that memory contents don't change
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 097313147ad3..513809b5b27c 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -880,6 +880,12 @@ static int crash_nmi_callback(unsigned int val, struct pt_regs *regs)
cpu_emergency_disable_virtualization();
atomic_dec(&waiting_for_crash_ipi);
+
+ if (smp_ops.stop_this_cpu) {
+ smp_ops.stop_this_cpu();
+ unreachable();
+ }
+
/* Assume hlt works */
halt();
for (;;)
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* [PATCHv11 17/19] x86/mm: Introduce kernel_ident_mapping_free()
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
` (15 preceding siblings ...)
2024-05-28 9:55 ` [PATCHv11 16/19] x86/smp: Add smp_ops.stop_this_cpu() callback Kirill A. Shutemov
@ 2024-05-28 9:55 ` Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 18/19] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method Kirill A. Shutemov
` (3 subsequent siblings)
20 siblings, 0 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-05-28 9:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
Kirill A. Shutemov, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel, Tao Liu
The helper complements kernel_ident_mapping_init(): it frees the
identity mapping that was previously allocated. It will be used in the
error path to free a partially allocated mapping or if the mapping is no
longer needed.
The caller provides a struct x86_mapping_info with the free_pgd_page()
callback hooked up and the pgd_t to free.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Tested-by: Tao Liu <ltao@redhat.com>
---
arch/x86/include/asm/init.h | 3 ++
arch/x86/mm/ident_map.c | 73 +++++++++++++++++++++++++++++++++++++
2 files changed, 76 insertions(+)
diff --git a/arch/x86/include/asm/init.h b/arch/x86/include/asm/init.h
index cc9ccf61b6bd..14d72727d7ee 100644
--- a/arch/x86/include/asm/init.h
+++ b/arch/x86/include/asm/init.h
@@ -6,6 +6,7 @@
struct x86_mapping_info {
void *(*alloc_pgt_page)(void *); /* allocate buf for page table */
+ void (*free_pgt_page)(void *, void *); /* free buf for page table */
void *context; /* context for alloc_pgt_page */
unsigned long page_flag; /* page flag for PMD or PUD entry */
unsigned long offset; /* ident mapping offset */
@@ -16,4 +17,6 @@ struct x86_mapping_info {
int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page,
unsigned long pstart, unsigned long pend);
+void kernel_ident_mapping_free(struct x86_mapping_info *info, pgd_t *pgd);
+
#endif /* _ASM_X86_INIT_H */
diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c
index 968d7005f4a7..3996af7b4abf 100644
--- a/arch/x86/mm/ident_map.c
+++ b/arch/x86/mm/ident_map.c
@@ -4,6 +4,79 @@
* included by both the compressed kernel and the regular kernel.
*/
+static void free_pte(struct x86_mapping_info *info, pmd_t *pmd)
+{
+ pte_t *pte = pte_offset_kernel(pmd, 0);
+
+ info->free_pgt_page(pte, info->context);
+}
+
+static void free_pmd(struct x86_mapping_info *info, pud_t *pud)
+{
+ pmd_t *pmd = pmd_offset(pud, 0);
+ int i;
+
+ for (i = 0; i < PTRS_PER_PMD; i++) {
+ if (!pmd_present(pmd[i]))
+ continue;
+
+ if (pmd_leaf(pmd[i]))
+ continue;
+
+ free_pte(info, &pmd[i]);
+ }
+
+ info->free_pgt_page(pmd, info->context);
+}
+
+static void free_pud(struct x86_mapping_info *info, p4d_t *p4d)
+{
+ pud_t *pud = pud_offset(p4d, 0);
+ int i;
+
+ for (i = 0; i < PTRS_PER_PUD; i++) {
+ if (!pud_present(pud[i]))
+ continue;
+
+ if (pud_leaf(pud[i]))
+ continue;
+
+ free_pmd(info, &pud[i]);
+ }
+
+ info->free_pgt_page(pud, info->context);
+}
+
+static void free_p4d(struct x86_mapping_info *info, pgd_t *pgd)
+{
+ p4d_t *p4d = p4d_offset(pgd, 0);
+ int i;
+
+ for (i = 0; i < PTRS_PER_P4D; i++) {
+ if (!p4d_present(p4d[i]))
+ continue;
+
+ free_pud(info, &p4d[i]);
+ }
+
+ if (pgtable_l5_enabled())
+ info->free_pgt_page(pgd, info->context);
+}
+
+void kernel_ident_mapping_free(struct x86_mapping_info *info, pgd_t *pgd)
+{
+ int i;
+
+ for (i = 0; i < PTRS_PER_PGD; i++) {
+ if (!pgd_present(pgd[i]))
+ continue;
+
+ free_p4d(info, &pgd[i]);
+ }
+
+ info->free_pgt_page(pgd, info->context);
+}
+
static void ident_pmd_init(struct x86_mapping_info *info, pmd_t *pmd_page,
unsigned long addr, unsigned long end)
{
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* [PATCHv11 18/19] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
` (16 preceding siblings ...)
2024-05-28 9:55 ` [PATCHv11 17/19] x86/mm: Introduce kernel_ident_mapping_free() Kirill A. Shutemov
@ 2024-05-28 9:55 ` Kirill A. Shutemov
2024-06-03 8:39 ` Borislav Petkov
2024-05-28 9:55 ` [PATCHv11 19/19] ACPI: tables: Print MULTIPROC_WAKEUP when MADT is parsed Kirill A. Shutemov
` (2 subsequent siblings)
20 siblings, 1 reply; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-05-28 9:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
Kirill A. Shutemov, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel, Tao Liu
MADT Multiprocessor Wakeup structure version 1 brings support of CPU
offlining: BIOS provides a reset vector where the CPU has to jump to
for offlining itself. The new TEST mailbox command can be used to test
whether the CPU offlined itself which means the BIOS has control over
the CPU and can online it again via the ACPI MADT wakeup method.
Add CPU offling support for the ACPI MADT wakeup method by implementing
custom cpu_die(), play_dead() and stop_this_cpu() SMP operations.
CPU offlining makes is possible to hand over secondary CPUs over kexec,
not limiting the second kernel to a single CPU.
The change conforms to the approved ACPI spec change proposal. See the
Link.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: https://lore.kernel.org/all/13356251.uLZWGnKmhe@kreacher
Acked-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Tao Liu <ltao@redhat.com>
---
arch/x86/include/asm/acpi.h | 2 +
arch/x86/kernel/acpi/Makefile | 2 +-
arch/x86/kernel/acpi/madt_playdead.S | 28 ++++
arch/x86/kernel/acpi/madt_wakeup.c | 184 ++++++++++++++++++++++++++-
include/acpi/actbl2.h | 15 ++-
5 files changed, 227 insertions(+), 4 deletions(-)
create mode 100644 arch/x86/kernel/acpi/madt_playdead.S
diff --git a/arch/x86/include/asm/acpi.h b/arch/x86/include/asm/acpi.h
index ceacac2b335d..21bc53f5ed0c 100644
--- a/arch/x86/include/asm/acpi.h
+++ b/arch/x86/include/asm/acpi.h
@@ -83,6 +83,8 @@ union acpi_subtable_headers;
int __init acpi_parse_mp_wake(union acpi_subtable_headers *header,
const unsigned long end);
+void asm_acpi_mp_play_dead(u64 reset_vector, u64 pgd_pa);
+
/*
* Check if the CPU can handle C2 and deeper
*/
diff --git a/arch/x86/kernel/acpi/Makefile b/arch/x86/kernel/acpi/Makefile
index 2feba7257665..842a5f449404 100644
--- a/arch/x86/kernel/acpi/Makefile
+++ b/arch/x86/kernel/acpi/Makefile
@@ -4,7 +4,7 @@ obj-$(CONFIG_ACPI) += boot.o
obj-$(CONFIG_ACPI_SLEEP) += sleep.o wakeup_$(BITS).o
obj-$(CONFIG_ACPI_APEI) += apei.o
obj-$(CONFIG_ACPI_CPPC_LIB) += cppc.o
-obj-$(CONFIG_ACPI_MADT_WAKEUP) += madt_wakeup.o
+obj-$(CONFIG_ACPI_MADT_WAKEUP) += madt_wakeup.o madt_playdead.o
ifneq ($(CONFIG_ACPI_PROCESSOR),)
obj-y += cstate.o
diff --git a/arch/x86/kernel/acpi/madt_playdead.S b/arch/x86/kernel/acpi/madt_playdead.S
new file mode 100644
index 000000000000..4e498d28cdc8
--- /dev/null
+++ b/arch/x86/kernel/acpi/madt_playdead.S
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#include <linux/linkage.h>
+#include <asm/nospec-branch.h>
+#include <asm/page_types.h>
+#include <asm/processor-flags.h>
+
+ .text
+ .align PAGE_SIZE
+
+/*
+ * asm_acpi_mp_play_dead() - Hand over control of the CPU to the BIOS
+ *
+ * rdi: Address of the ACPI MADT MPWK ResetVector
+ * rsi: PGD of the identity mapping
+ */
+SYM_FUNC_START(asm_acpi_mp_play_dead)
+ /* Turn off global entries. Following CR3 write will flush them. */
+ movq %cr4, %rdx
+ andq $~(X86_CR4_PGE), %rdx
+ movq %rdx, %cr4
+
+ /* Switch to identity mapping */
+ movq %rsi, %cr3
+
+ /* Jump to reset vector */
+ ANNOTATE_RETPOLINE_SAFE
+ jmp *%rdi
+SYM_FUNC_END(asm_acpi_mp_play_dead)
diff --git a/arch/x86/kernel/acpi/madt_wakeup.c b/arch/x86/kernel/acpi/madt_wakeup.c
index 30820f9de5af..6cfe762be28b 100644
--- a/arch/x86/kernel/acpi/madt_wakeup.c
+++ b/arch/x86/kernel/acpi/madt_wakeup.c
@@ -1,10 +1,19 @@
// SPDX-License-Identifier: GPL-2.0-or-later
#include <linux/acpi.h>
#include <linux/cpu.h>
+#include <linux/delay.h>
#include <linux/io.h>
+#include <linux/kexec.h>
+#include <linux/memblock.h>
+#include <linux/pgtable.h>
+#include <linux/sched/hotplug.h>
#include <asm/apic.h>
#include <asm/barrier.h>
+#include <asm/init.h>
+#include <asm/intel_pt.h>
+#include <asm/nmi.h>
#include <asm/processor.h>
+#include <asm/reboot.h>
/* Physical address of the Multiprocessor Wakeup Structure mailbox */
static u64 acpi_mp_wake_mailbox_paddr __ro_after_init;
@@ -12,6 +21,154 @@ static u64 acpi_mp_wake_mailbox_paddr __ro_after_init;
/* Virtual address of the Multiprocessor Wakeup Structure mailbox */
static struct acpi_madt_multiproc_wakeup_mailbox *acpi_mp_wake_mailbox __ro_after_init;
+static u64 acpi_mp_pgd __ro_after_init;
+static u64 acpi_mp_reset_vector_paddr __ro_after_init;
+
+static void acpi_mp_stop_this_cpu(void)
+{
+ asm_acpi_mp_play_dead(acpi_mp_reset_vector_paddr, acpi_mp_pgd);
+}
+
+static void acpi_mp_play_dead(void)
+{
+ play_dead_common();
+ asm_acpi_mp_play_dead(acpi_mp_reset_vector_paddr, acpi_mp_pgd);
+}
+
+static void acpi_mp_cpu_die(unsigned int cpu)
+{
+ u32 apicid = per_cpu(x86_cpu_to_apicid, cpu);
+ unsigned long timeout;
+
+ /*
+ * Use TEST mailbox command to prove that BIOS got control over
+ * the CPU before declaring it dead.
+ *
+ * BIOS has to clear 'command' field of the mailbox.
+ */
+ acpi_mp_wake_mailbox->apic_id = apicid;
+ smp_store_release(&acpi_mp_wake_mailbox->command,
+ ACPI_MP_WAKE_COMMAND_TEST);
+
+ /* Don't wait longer than a second. */
+ timeout = USEC_PER_SEC;
+ while (READ_ONCE(acpi_mp_wake_mailbox->command) && --timeout)
+ udelay(1);
+
+ if (!timeout)
+ pr_err("Failed to hand over CPU %d to BIOS\n", cpu);
+}
+
+/* The argument is required to match type of x86_mapping_info::alloc_pgt_page */
+static void __init *alloc_pgt_page(void *dummy)
+{
+ return memblock_alloc(PAGE_SIZE, PAGE_SIZE);
+}
+
+static void __init free_pgt_page(void *pgt, void *dummy)
+{
+ return memblock_free(pgt, PAGE_SIZE);
+}
+
+/*
+ * Make sure asm_acpi_mp_play_dead() is present in the identity mapping at
+ * the same place as in the kernel page tables. asm_acpi_mp_play_dead() switches
+ * to the identity mapping and the function has be present at the same spot in
+ * the virtual address space before and after switching page tables.
+ */
+static int __init init_transition_pgtable(pgd_t *pgd)
+{
+ pgprot_t prot = PAGE_KERNEL_EXEC_NOENC;
+ unsigned long vaddr, paddr;
+ p4d_t *p4d;
+ pud_t *pud;
+ pmd_t *pmd;
+ pte_t *pte;
+
+ vaddr = (unsigned long)asm_acpi_mp_play_dead;
+ pgd += pgd_index(vaddr);
+ if (!pgd_present(*pgd)) {
+ p4d = (p4d_t *)alloc_pgt_page(NULL);
+ if (!p4d)
+ return -ENOMEM;
+ set_pgd(pgd, __pgd(__pa(p4d) | _KERNPG_TABLE));
+ }
+ p4d = p4d_offset(pgd, vaddr);
+ if (!p4d_present(*p4d)) {
+ pud = (pud_t *)alloc_pgt_page(NULL);
+ if (!pud)
+ return -ENOMEM;
+ set_p4d(p4d, __p4d(__pa(pud) | _KERNPG_TABLE));
+ }
+ pud = pud_offset(p4d, vaddr);
+ if (!pud_present(*pud)) {
+ pmd = (pmd_t *)alloc_pgt_page(NULL);
+ if (!pmd)
+ return -ENOMEM;
+ set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE));
+ }
+ pmd = pmd_offset(pud, vaddr);
+ if (!pmd_present(*pmd)) {
+ pte = (pte_t *)alloc_pgt_page(NULL);
+ if (!pte)
+ return -ENOMEM;
+ set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE));
+ }
+ pte = pte_offset_kernel(pmd, vaddr);
+
+ paddr = __pa(vaddr);
+ set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, prot));
+
+ return 0;
+}
+
+static int __init acpi_mp_setup_reset(u64 reset_vector)
+{
+ struct x86_mapping_info info = {
+ .alloc_pgt_page = alloc_pgt_page,
+ .free_pgt_page = free_pgt_page,
+ .page_flag = __PAGE_KERNEL_LARGE_EXEC,
+ .kernpg_flag = _KERNPG_TABLE_NOENC,
+ };
+ pgd_t *pgd;
+
+ pgd = alloc_pgt_page(NULL);
+ if (!pgd)
+ return -ENOMEM;
+
+ for (int i = 0; i < nr_pfn_mapped; i++) {
+ unsigned long mstart, mend;
+
+ mstart = pfn_mapped[i].start << PAGE_SHIFT;
+ mend = pfn_mapped[i].end << PAGE_SHIFT;
+ if (kernel_ident_mapping_init(&info, pgd, mstart, mend)) {
+ kernel_ident_mapping_free(&info, pgd);
+ return -ENOMEM;
+ }
+ }
+
+ if (kernel_ident_mapping_init(&info, pgd,
+ PAGE_ALIGN_DOWN(reset_vector),
+ PAGE_ALIGN(reset_vector + 1))) {
+ kernel_ident_mapping_free(&info, pgd);
+ return -ENOMEM;
+ }
+
+ if (init_transition_pgtable(pgd)) {
+ kernel_ident_mapping_free(&info, pgd);
+ return -ENOMEM;
+ }
+
+ smp_ops.play_dead = acpi_mp_play_dead;
+ smp_ops.stop_this_cpu = acpi_mp_stop_this_cpu;
+ smp_ops.cpu_die = acpi_mp_cpu_die;
+
+ acpi_mp_reset_vector_paddr = reset_vector;
+ acpi_mp_pgd = __pa(pgd);
+
+ return 0;
+}
+
static int acpi_wakeup_cpu(u32 apicid, unsigned long start_ip)
{
if (!acpi_mp_wake_mailbox_paddr) {
@@ -97,14 +254,37 @@ int __init acpi_parse_mp_wake(union acpi_subtable_headers *header,
struct acpi_madt_multiproc_wakeup *mp_wake;
mp_wake = (struct acpi_madt_multiproc_wakeup *)header;
- if (BAD_MADT_ENTRY(mp_wake, end))
+
+ /*
+ * Cannot use the standard BAD_MADT_ENTRY() to sanity check the @mp_wake
+ * entry. 'sizeof (struct acpi_madt_multiproc_wakeup)' can be larger
+ * than the actual size of the MP wakeup entry in ACPI table because the
+ * 'reset_vector' is only available in the V1 MP wakeup structure.
+ */
+ if (!mp_wake)
+ return -EINVAL;
+ if (end - (unsigned long)mp_wake < ACPI_MADT_MP_WAKEUP_SIZE_V0)
+ return -EINVAL;
+ if (mp_wake->header.length < ACPI_MADT_MP_WAKEUP_SIZE_V0)
return -EINVAL;
acpi_table_print_madt_entry(&header->common);
acpi_mp_wake_mailbox_paddr = mp_wake->mailbox_address;
- acpi_mp_disable_offlining(mp_wake);
+ if (mp_wake->version >= ACPI_MADT_MP_WAKEUP_VERSION_V1 &&
+ mp_wake->header.length >= ACPI_MADT_MP_WAKEUP_SIZE_V1) {
+ if (acpi_mp_setup_reset(mp_wake->reset_vector)) {
+ pr_warn("Failed to setup MADT reset vector\n");
+ acpi_mp_disable_offlining(mp_wake);
+ }
+ } else {
+ /*
+ * CPU offlining requires version 1 of the ACPI MADT wakeup
+ * structure.
+ */
+ acpi_mp_disable_offlining(mp_wake);
+ }
apic_update_callback(wakeup_secondary_cpu_64, acpi_wakeup_cpu);
diff --git a/include/acpi/actbl2.h b/include/acpi/actbl2.h
index fa63362469aa..e27958ef8264 100644
--- a/include/acpi/actbl2.h
+++ b/include/acpi/actbl2.h
@@ -1197,8 +1197,20 @@ struct acpi_madt_multiproc_wakeup {
u16 version;
u32 reserved; /* reserved - must be zero */
u64 mailbox_address;
+ u64 reset_vector;
};
+/* Values for Version field above */
+
+enum acpi_madt_multiproc_wakeup_version {
+ ACPI_MADT_MP_WAKEUP_VERSION_NONE = 0,
+ ACPI_MADT_MP_WAKEUP_VERSION_V1 = 1,
+ ACPI_MADT_MP_WAKEUP_VERSION_RESERVED = 2, /* 2 and greater are reserved */
+};
+
+#define ACPI_MADT_MP_WAKEUP_SIZE_V0 16
+#define ACPI_MADT_MP_WAKEUP_SIZE_V1 24
+
#define ACPI_MULTIPROC_WAKEUP_MB_OS_SIZE 2032
#define ACPI_MULTIPROC_WAKEUP_MB_FIRMWARE_SIZE 2048
@@ -1211,7 +1223,8 @@ struct acpi_madt_multiproc_wakeup_mailbox {
u8 reserved_firmware[ACPI_MULTIPROC_WAKEUP_MB_FIRMWARE_SIZE]; /* reserved for firmware use */
};
-#define ACPI_MP_WAKE_COMMAND_WAKEUP 1
+#define ACPI_MP_WAKE_COMMAND_WAKEUP 1
+#define ACPI_MP_WAKE_COMMAND_TEST 2
/* 17: CPU Core Interrupt Controller (ACPI 6.5) */
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* [PATCHv11 19/19] ACPI: tables: Print MULTIPROC_WAKEUP when MADT is parsed
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
` (17 preceding siblings ...)
2024-05-28 9:55 ` [PATCHv11 18/19] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method Kirill A. Shutemov
@ 2024-05-28 9:55 ` Kirill A. Shutemov
2024-05-28 10:01 ` [PATCHv11 00/19] x86/tdx: Add kexec support Rafael J. Wysocki
2024-05-30 23:36 ` [PATCH v7 0/3] x86/snp: " Ashish Kalra
20 siblings, 0 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-05-28 9:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
Kirill A. Shutemov, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel, Tao Liu
When MADT is parsed, print MULTIPROC_WAKEUP information:
ACPI: MP Wakeup (version[1], mailbox[0x7fffd000], reset[0x7fffe068])
This debug information will be very helpful during bring up.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Tested-by: Tao Liu <ltao@redhat.com>
---
drivers/acpi/tables.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
index b976e5fc3fbc..9e1b01c35070 100644
--- a/drivers/acpi/tables.c
+++ b/drivers/acpi/tables.c
@@ -198,6 +198,20 @@ void acpi_table_print_madt_entry(struct acpi_subtable_header *header)
}
break;
+ case ACPI_MADT_TYPE_MULTIPROC_WAKEUP:
+ {
+ struct acpi_madt_multiproc_wakeup *p =
+ (struct acpi_madt_multiproc_wakeup *)header;
+ u64 reset_vector = 0;
+
+ if (p->version >= ACPI_MADT_MP_WAKEUP_VERSION_V1)
+ reset_vector = p->reset_vector;
+
+ pr_debug("MP Wakeup (version[%d], mailbox[%#llx], reset[%#llx])\n",
+ p->version, p->mailbox_address, reset_vector);
+ }
+ break;
+
case ACPI_MADT_TYPE_CORE_PIC:
{
struct acpi_madt_core_pic *p = (struct acpi_madt_core_pic *)header;
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* Re: [PATCHv11 00/19] x86/tdx: Add kexec support
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
` (18 preceding siblings ...)
2024-05-28 9:55 ` [PATCHv11 19/19] ACPI: tables: Print MULTIPROC_WAKEUP when MADT is parsed Kirill A. Shutemov
@ 2024-05-28 10:01 ` Rafael J. Wysocki
2024-05-30 23:36 ` [PATCH v7 0/3] x86/snp: " Ashish Kalra
20 siblings, 0 replies; 115+ messages in thread
From: Rafael J. Wysocki @ 2024-05-28 10:01 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
K. Y. Srinivasan, Haiyang Zhang, kexec, linux-hyperv, linux-acpi,
linux-coco, linux-kernel
On Tue, May 28, 2024 at 11:55 AM Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
>
> The patchset adds bits and pieces to get kexec (and crashkernel) work on
> TDX guest.
>
> The last patch implements CPU offlining according to the approved ACPI
> spec change poposal[1]. It unlocks kexec with all CPUs visible in the target
> kernel. It requires BIOS-side enabling. If it missing we fallback to booting
> 2nd kernel with single CPU.
>
> Please review. I would be glad for any feedback.
>
> [1] https://lore.kernel.org/all/13356251.uLZWGnKmhe@kreacher
For the ACPI-related changes in the series
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 06/19] x86/kexec: Keep CR4.MCE set during kexec for TDX guest
2024-05-28 9:55 ` [PATCHv11 06/19] x86/kexec: Keep CR4.MCE set during kexec for TDX guest Kirill A. Shutemov
@ 2024-05-28 11:12 ` Huang, Kai
2024-05-29 11:39 ` Nikolay Borisov
1 sibling, 0 replies; 115+ messages in thread
From: Huang, Kai @ 2024-05-28 11:12 UTC (permalink / raw)
To: kirill.shutemov@linux.intel.com, tglx@linutronix.de,
mingo@redhat.com, x86@kernel.org, bp@alien8.de,
dave.hansen@linux.intel.com
Cc: kexec@lists.infradead.org, ardb@kernel.org,
linux-coco@lists.linux.dev, ashish.kalra@amd.com,
thomas.lendacky@amd.com, Hunter, Adrian, Reshetova, Elena,
linux-kernel@vger.kernel.org, haiyangz@microsoft.com,
seanjc@google.com, kys@microsoft.com, bhe@redhat.com,
Nakajima, Jun, hpa@zytor.com, peterz@infradead.org,
linux-hyperv@vger.kernel.org, Edgecombe, Rick P,
rafael@kernel.org, sathyanarayanan.kuppuswamy@linux.intel.com,
linux-acpi@vger.kernel.org
On Tue, 2024-05-28 at 12:55 +0300, Kirill A. Shutemov wrote:
> TDX guests run with MCA enabled (CR4.MCE=1b) from the very start. If
> that bit is cleared during CR4 register reprogramming during boot or
> kexec flows, a #VE exception will be raised which the guest kernel
> cannot handle it.
Nit: the ending "it" isn't needed.
>
> Therefore, make sure the CR4.MCE setting is preserved over kexec too and
> avoid raising any #VEs.
>
> The change doesn't affect non-TDX-guest environments.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
> ---
> arch/x86/kernel/relocate_kernel_64.S | 16 ++++++++++------
> 1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
> index 085eef5c3904..b668a6be4f6f 100644
> --- a/arch/x86/kernel/relocate_kernel_64.S
> +++ b/arch/x86/kernel/relocate_kernel_64.S
> @@ -5,6 +5,8 @@
> */
>
> #include <linux/linkage.h>
> +#include <linux/stringify.h>
> +#include <asm/alternative.h>
> #include <asm/page_types.h>
> #include <asm/kexec.h>
> #include <asm/processor-flags.h>
> @@ -143,15 +145,17 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
>
> /*
> * Set cr4 to a known state:
> - * - physical address extension enabled
> * - 5-level paging, if it was enabled before
> + * - Machine check exception on TDX guest, if it was enabled before.
> + * Clearing MCE might not be allowed in TDX guests, depending on setup.
> + * - physical address extension enabled
> */
> - movl $X86_CR4_PAE, %eax
> - testq $X86_CR4_LA57, %r13
> - jz .Lno_la57
> - orl $X86_CR4_LA57, %eax
> -.Lno_la57:
> + movl $X86_CR4_LA57, %eax
> + ALTERNATIVE "", __stringify(orl $X86_CR4_MCE, %eax), X86_FEATURE_TDX_GUEST
>
> + /* R13 contains the original CR4 value, read in relocate_kernel() */
> + andl %r13d, %eax
> + orl $X86_CR4_PAE, %eax
> movq %rax, %cr4
>
> jmp 1f
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 01/19] x86/acpi: Extract ACPI MADT wakeup code into a separate file
2024-05-28 9:55 ` [PATCHv11 01/19] x86/acpi: Extract ACPI MADT wakeup code into a separate file Kirill A. Shutemov
@ 2024-05-28 13:47 ` Borislav Petkov
0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2024-05-28 13:47 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, Rafael J. Wysocki,
Peter Zijlstra, Adrian Hunter, Kuppuswamy Sathyanarayanan,
Elena Reshetova, Jun Nakajima, Rick Edgecombe, Tom Lendacky,
Kalra, Ashish, Sean Christopherson, Huang, Kai, Ard Biesheuvel,
Baoquan He, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
kexec, linux-hyperv, linux-acpi, linux-coco, linux-kernel,
Tao Liu
On Tue, May 28, 2024 at 12:55:04PM +0300, Kirill A. Shutemov wrote:
> In order to prepare for the expansion of support for the ACPI MADT
> wakeup method, move the relevant code into a separate file.
>
> Introduce a new configuration option to clearly indicate dependencies
> without the use of ifdefs.
>
> There have been no functional changes.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> Acked-by: Kai Huang <kai.huang@intel.com>
> Reviewed-by: Baoquan He <bhe@redhat.com>
> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
> Tested-by: Tao Liu <ltao@redhat.com>
> ---
> arch/x86/Kconfig | 7 +++
> arch/x86/include/asm/acpi.h | 5 ++
> arch/x86/kernel/acpi/Makefile | 1 +
> arch/x86/kernel/acpi/boot.c | 86 +-----------------------------
> arch/x86/kernel/acpi/madt_wakeup.c | 82 ++++++++++++++++++++++++++++
> 5 files changed, 96 insertions(+), 85 deletions(-)
> create mode 100644 arch/x86/kernel/acpi/madt_wakeup.c
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 10/19] x86/mm: Add callbacks to prepare encrypted memory for kexec
2024-05-28 9:55 ` [PATCHv11 10/19] x86/mm: Add callbacks to prepare encrypted memory for kexec Kirill A. Shutemov
@ 2024-05-29 10:42 ` Borislav Petkov
2024-06-02 12:39 ` [PATCHv11.1 " Kirill A. Shutemov
2024-06-02 12:44 ` [PATCHv11.2 " Kirill A. Shutemov
2024-06-04 16:16 ` [PATCHv11 " Dave Hansen
1 sibling, 2 replies; 115+ messages in thread
From: Borislav Petkov @ 2024-05-29 10:42 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, Rafael J. Wysocki,
Peter Zijlstra, Adrian Hunter, Kuppuswamy Sathyanarayanan,
Elena Reshetova, Jun Nakajima, Rick Edgecombe, Tom Lendacky,
Kalra, Ashish, Sean Christopherson, Huang, Kai, Ard Biesheuvel,
Baoquan He, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
kexec, linux-hyperv, linux-acpi, linux-coco, linux-kernel,
Nikolay Borisov, Tao Liu
On Tue, May 28, 2024 at 12:55:13PM +0300, Kirill A. Shutemov wrote:
> diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
> index 28ac3cb9b987..6cade48811cc 100644
> --- a/arch/x86/include/asm/x86_init.h
> +++ b/arch/x86/include/asm/x86_init.h
> @@ -149,12 +149,21 @@ struct x86_init_acpi {
> * @enc_status_change_finish Notify HV after the encryption status of a range is changed
> * @enc_tlb_flush_required Returns true if a TLB flush is needed before changing page encryption status
> * @enc_cache_flush_required Returns true if a cache flush is needed before changing page encryption status
> + * @enc_kexec_begin Begin the two-step process of conversion shared memory back
s/conversion/converting/
> + * to private. It stops the new conversions from being started
> + * and waits in-flight conversions to finish, if possible.
Good.
Now add "The @crash parameter denotes whether the function is being
called in the crash shutdown path."
> + * @enc_kexec_finish Finish the two-step process of conversion shared memory to
s/conversion/converting/
> + * private. All memory is private after the call.
"... when the function returns."
> + * It called with all CPUs but one shutdown and interrupts
> + * disabled.
"It is called on only one CPU while the others are shut down and with
interrupts disabled."
> */
> struct x86_guest {
> int (*enc_status_change_prepare)(unsigned long vaddr, int npages, bool enc);
> int (*enc_status_change_finish)(unsigned long vaddr, int npages, bool enc);
> bool (*enc_tlb_flush_required)(bool enc);
> bool (*enc_cache_flush_required)(void);
> + void (*enc_kexec_begin)(bool crash);
> + void (*enc_kexec_finish)(void);
> };
>
> /**
> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> index f06501445cd9..74f6305eb9ec 100644
> --- a/arch/x86/kernel/crash.c
> +++ b/arch/x86/kernel/crash.c
> @@ -128,6 +128,18 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
> #ifdef CONFIG_HPET_TIMER
> hpet_disable();
> #endif
> +
> + /*
> + * Non-crash kexec calls enc_kexec_begin() while scheduling is still
> + * active. This allows the callback to wait until all in-flight
> + * shared<->private conversions are complete. In a crash scenario,
> + * enc_kexec_begin() get call after all but one CPU has been shut down
"gets called" ... "have been shut down"
> + * and interrupts have been disabled. This only allows the callback to
only?
> + * detect a race with the conversion and report it.
> + */
> + x86_platform.guest.enc_kexec_begin(true);
> + x86_platform.guest.enc_kexec_finish();
> +
...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion
2024-05-28 9:55 ` [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion Kirill A. Shutemov
@ 2024-05-29 10:47 ` Nikolay Borisov
2024-05-29 11:17 ` Kirill A. Shutemov
` (2 more replies)
0 siblings, 3 replies; 115+ messages in thread
From: Nikolay Borisov @ 2024-05-29 10:47 UTC (permalink / raw)
To: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
K. Y. Srinivasan, Haiyang Zhang, kexec, linux-hyperv, linux-acpi,
linux-coco, linux-kernel
On 28.05.24 г. 12:55 ч., Kirill A. Shutemov wrote:
> From: Borislav Petkov <bp@alien8.de>
>
> That identity_mapped() functions was loving that "1" label to the point
> of completely confusing its readers.
>
> Use named labels in each place for clarity.
>
> No functional changes.
>
> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
> arch/x86/kernel/relocate_kernel_64.S | 13 +++++++------
> 1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
> index 56cab1bb25f5..085eef5c3904 100644
> --- a/arch/x86/kernel/relocate_kernel_64.S
> +++ b/arch/x86/kernel/relocate_kernel_64.S
> @@ -148,9 +148,10 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
> */
> movl $X86_CR4_PAE, %eax
> testq $X86_CR4_LA57, %r13
> - jz 1f
> + jz .Lno_la57
> orl $X86_CR4_LA57, %eax
> -1:
> +.Lno_la57:
> +
> movq %rax, %cr4
>
> jmp 1f
That jmp 1f becomes redundant now as it simply jumps 1 line below.
> @@ -165,9 +166,9 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
> * used by kexec. Flush the caches before copying the kernel.
> */
> testq %r12, %r12
> - jz 1f
> + jz .Lsme_off
> wbinvd
> -1:
> +.Lsme_off:
>
> movq %rcx, %r11
> call swap_pages
> @@ -187,7 +188,7 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
> */
>
> testq %r11, %r11
> - jnz 1f
> + jnz .Lrelocate
> xorl %eax, %eax
> xorl %ebx, %ebx
> xorl %ecx, %ecx
> @@ -208,7 +209,7 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
> ret
> int3
>
> -1:
> +.Lrelocate:
> popq %rdx
> leaq PAGE_SIZE(%r10), %rsp
> ANNOTATE_RETPOLINE_SAFE
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion
2024-05-29 10:47 ` Nikolay Borisov
@ 2024-05-29 11:17 ` Kirill A. Shutemov
2024-05-29 11:28 ` Borislav Petkov
2024-06-04 0:24 ` H. Peter Anvin
2024-06-03 14:43 ` H. Peter Anvin
2024-06-03 22:43 ` H. Peter Anvin
2 siblings, 2 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-05-29 11:17 UTC (permalink / raw)
To: Nikolay Borisov
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
K. Y. Srinivasan, Haiyang Zhang, kexec, linux-hyperv, linux-acpi,
linux-coco, linux-kernel
On Wed, May 29, 2024 at 01:47:50PM +0300, Nikolay Borisov wrote:
>
>
> On 28.05.24 г. 12:55 ч., Kirill A. Shutemov wrote:
> > From: Borislav Petkov <bp@alien8.de>
> >
> > That identity_mapped() functions was loving that "1" label to the point
> > of completely confusing its readers.
> >
> > Use named labels in each place for clarity.
> >
> > No functional changes.
> >
> > Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > ---
> > arch/x86/kernel/relocate_kernel_64.S | 13 +++++++------
> > 1 file changed, 7 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
> > index 56cab1bb25f5..085eef5c3904 100644
> > --- a/arch/x86/kernel/relocate_kernel_64.S
> > +++ b/arch/x86/kernel/relocate_kernel_64.S
> > @@ -148,9 +148,10 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
> > */
> > movl $X86_CR4_PAE, %eax
> > testq $X86_CR4_LA57, %r13
> > - jz 1f
> > + jz .Lno_la57
> > orl $X86_CR4_LA57, %eax
> > -1:
> > +.Lno_la57:
> > +
> > movq %rax, %cr4
> > jmp 1f
>
> That jmp 1f becomes redundant now as it simply jumps 1 line below.
>
Nothing changed wrt this jump. It dates back to initial kexec
implementation.
See 5234f5eb04ab ("[PATCH] kexec: x86_64 kexec implementation").
But I don't see functional need in it.
Anyway, it is outside of the scope of the patch.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion
2024-05-29 11:17 ` Kirill A. Shutemov
@ 2024-05-29 11:28 ` Borislav Petkov
2024-05-29 12:33 ` Andrew Cooper
2024-06-04 0:24 ` H. Peter Anvin
1 sibling, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2024-05-29 11:28 UTC (permalink / raw)
To: Kirill A. Shutemov, Andrew Cooper
Cc: Nikolay Borisov, Thomas Gleixner, Ingo Molnar, Dave Hansen, x86,
Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
K. Y. Srinivasan, Haiyang Zhang, kexec, linux-hyperv, linux-acpi,
linux-coco, linux-kernel
On Wed, May 29, 2024 at 02:17:29PM +0300, Kirill A. Shutemov wrote:
> > That jmp 1f becomes redundant now as it simply jumps 1 line below.
> >
>
> Nothing changed wrt this jump. It dates back to initial kexec
> implementation.
>
> See 5234f5eb04ab ("[PATCH] kexec: x86_64 kexec implementation").
>
> But I don't see functional need in it.
>
> Anyway, it is outside of the scope of the patch.
Yap, Kirill did what Nikolay should've done - git archeology. Please
don't forget to do that next time.
And back in the day they didn't comment non-obvious things because
commenting is for losers. :-\
So that unconditional forward jump either flushes branch prediction on
some old uarch or something else weird, uarch-special.
I doubt we can remove it just like that.
Lemme add Andy - he should know.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 06/19] x86/kexec: Keep CR4.MCE set during kexec for TDX guest
2024-05-28 9:55 ` [PATCHv11 06/19] x86/kexec: Keep CR4.MCE set during kexec for TDX guest Kirill A. Shutemov
2024-05-28 11:12 ` Huang, Kai
@ 2024-05-29 11:39 ` Nikolay Borisov
1 sibling, 0 replies; 115+ messages in thread
From: Nikolay Borisov @ 2024-05-29 11:39 UTC (permalink / raw)
To: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
K. Y. Srinivasan, Haiyang Zhang, kexec, linux-hyperv, linux-acpi,
linux-coco, linux-kernel
On 28.05.24 г. 12:55 ч., Kirill A. Shutemov wrote:
> TDX guests run with MCA enabled (CR4.MCE=1b) from the very start. If
> that bit is cleared during CR4 register reprogramming during boot or
> kexec flows, a #VE exception will be raised which the guest kernel
> cannot handle it.
>
> Therefore, make sure the CR4.MCE setting is preserved over kexec too and
> avoid raising any #VEs.
>
> The change doesn't affect non-TDX-guest environments.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion
2024-05-29 11:28 ` Borislav Petkov
@ 2024-05-29 12:33 ` Andrew Cooper
2024-05-29 15:15 ` Borislav Petkov
0 siblings, 1 reply; 115+ messages in thread
From: Andrew Cooper @ 2024-05-29 12:33 UTC (permalink / raw)
To: Borislav Petkov, Kirill A. Shutemov
Cc: Nikolay Borisov, Thomas Gleixner, Ingo Molnar, Dave Hansen, x86,
Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
K. Y. Srinivasan, Haiyang Zhang, kexec, linux-hyperv, linux-acpi,
linux-coco, linux-kernel
On 29/05/2024 12:28 pm, Borislav Petkov wrote:
> On Wed, May 29, 2024 at 02:17:29PM +0300, Kirill A. Shutemov wrote:
>>> That jmp 1f becomes redundant now as it simply jumps 1 line below.
>>>
>> Nothing changed wrt this jump. It dates back to initial kexec
>> implementation.
>>
>> See 5234f5eb04ab ("[PATCH] kexec: x86_64 kexec implementation").
>>
>> But I don't see functional need in it.
>>
>> Anyway, it is outside of the scope of the patch.
> Yap, Kirill did what Nikolay should've done - git archeology. Please
> don't forget to do that next time.
>
> And back in the day they didn't comment non-obvious things because
> commenting is for losers. :-\
>
> So that unconditional forward jump either flushes branch prediction on
> some old uarch or something else weird, uarch-special.
>
> I doubt we can remove it just like that.
>
> Lemme add Andy - he should know.
Seems I've gained a reputation...
jmp 1f dates back to ye olde 8086, which started the whole trend of the
instruction pointer just being a figment of the ISA's imagination[1].
Hardware maintains the pointer to the next byte to fetch (the prefetch
queue was up to 6 bytes), and there was a micro-op to subtract the
current length of the prefetch queue from the accumulator.
In those days, the prefetch queue was not coherent with main memory, and
jumps (being a discontinuity in the instruction stream) simply flushed
the prefetch queue.
This was necessary after modifying executable code, because otherwise
you could end up executing stale bytes from the prefetch queue and then
non-stale bytes thereafter. (Otherwise known as the way to distinguish
the 8086 from the 8088 because the latter only had a 4 byte prefetch queue.)
Anyway. It's how you used to spell "serialising operation" before that
term ever entered the architecture. Linux still supports CPUs prior to
the Pentium, so still needs to care about prefetch queues in the 486.
However, this example appears to be in 64bit code and following a write
to CR4 which will be fully serialising, so it's probably copy&paste from
32bit code where it would be necessary in principle.
~Andrew
[1]
https://www.righto.com/2023/01/inside-8086-processors-instruction.html#fn:pc
In fact, anyone who hasn't should read the entire series on the 8086,
https://www.righto.com/p/index.html
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion
2024-05-29 12:33 ` Andrew Cooper
@ 2024-05-29 15:15 ` Borislav Petkov
0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2024-05-29 15:15 UTC (permalink / raw)
To: Andrew Cooper, Nikolay Borisov
Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar, Dave Hansen,
x86, Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
K. Y. Srinivasan, Haiyang Zhang, kexec, linux-hyperv, linux-acpi,
linux-coco, linux-kernel
On Wed, May 29, 2024 at 01:33:35PM +0100, Andrew Cooper wrote:
> Seems I've gained a reputation...
Yes you have. You have this weird interest in very deep uarch details
that I can't share. Not at that detail. :-P
> jmp 1f dates back to ye olde 8086, which started the whole trend of the
> instruction pointer just being a figment of the ISA's imagination[1].
>
> Hardware maintains the pointer to the next byte to fetch (the prefetch
> queue was up to 6 bytes), and there was a micro-op to subtract the
> current length of the prefetch queue from the accumulator.
>
> In those days, the prefetch queue was not coherent with main memory, and
> jumps (being a discontinuity in the instruction stream) simply flushed
> the prefetch queue.
>
> This was necessary after modifying executable code, because otherwise
> you could end up executing stale bytes from the prefetch queue and then
> non-stale bytes thereafter. (Otherwise known as the way to distinguish
> the 8086 from the 8088 because the latter only had a 4 byte prefetch queue.)
Thanks - that certainly wakes up a long-asleep neuron in the back of my
mind...
> Anyway. It's how you used to spell "serialising operation" before that
> term ever entered the architecture. Linux still supports CPUs prior to
> the Pentium, so still needs to care about prefetch queues in the 486.
>
> However, this example appears to be in 64bit code and following a write
> to CR4 which will be fully serialising, so it's probably copy&paste from
> 32bit code where it would be necessary in principle.
Yap, fully agreed. We could try to remove it and see what complains.
Nikolay, wanna do a patch which properly explains the situation?
> https://www.righto.com/2023/01/inside-8086-processors-instruction.html#fn:pc
>
> In fact, anyone who hasn't should read the entire series on the 8086,
> https://www.righto.com/p/index.html
Oh yeah, already bookmarked.
Thanks Andy!
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* [PATCH v7 0/3] x86/snp: Add kexec support
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
` (19 preceding siblings ...)
2024-05-28 10:01 ` [PATCHv11 00/19] x86/tdx: Add kexec support Rafael J. Wysocki
@ 2024-05-30 23:36 ` Ashish Kalra
2024-05-30 23:36 ` [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec Ashish Kalra
` (2 more replies)
20 siblings, 3 replies; 115+ messages in thread
From: Ashish Kalra @ 2024-05-30 23:36 UTC (permalink / raw)
To: tglx, mingo, bp, dave.hansen, x86
Cc: rafael, hpa, peterz, adrian.hunter, sathyanarayanan.kuppuswamy,
jun.nakajima, rick.p.edgecombe, thomas.lendacky, michael.roth,
seanjc, kai.huang, bhe, kirill.shutemov, bdas, vkuznets,
dionnaglaze, anisinha, jroedel, ardb, kexec, linux-coco,
linux-kernel
From: Ashish Kalra <ashish.kalra@amd.com>
The patchset adds bits and pieces to get kexec (and crashkernel) work on
SNP guest.
The series is based off of and tested against Kirill Shutemov's tree:
https://github.com/intel/tdx.git guest-kexec
----
v7:
- Rebased onto current tip/master;
- Moved back to checking the md attribute instead of checking the
efi_setup for detecting if running under kexec kernel as
suggested in upstream review feedback.
v6:
- Updated and restructured the commit message for patch 1/3 to
explain the issue in detail.
- Updated inline comments in patch 1/3 to explain the issue in
detail.
- Moved back to checking efi_setup for detecting if running
under kexec kernel.
v5:
- Removed sev_es_enabled() function and using sev_status directly to
check for SEV-ES/SEV-SNP guest.
- used --base option to generate patches to specify Kirill's TDX guest
kexec patches as prerequisite patches to fix kernel test robot
build errors.
v4:
- Rebased to current tip/master.
- Reviewed-bys from Sathya.
- Remove snp_kexec_unprep_rom_memory() as it is not needed any more as
SEV-SNP code is not validating the ROM range in probe_roms() anymore.
- Fix kernel test robot build error/warnings.
v3:
- Rebased;
- moved Keep page tables that maps E820_TYPE_ACPI patch to Kirill's tdx
guest kexec patch series.
- checking the md attribute instead of checking the efi_setup for
detecting if running under kexec kernel.
- added new sev_es_enabled() function.
- skip video memory access in decompressor for SEV-ES/SNP systems to
prevent guest termination as boot stage2 #VC handler does not handle
MMIO.
v2:
- address zeroing of unaccepted memory table mappings at all page table levels
adding phys_pte_init(), phys_pud_init() and phys_p4d_init().
- include skip efi_arch_mem_reserve() in case of kexec as part of this
patch set.
- rename last_address_shd_kexec to a more appropriate
kexec_last_address_to_make_private.
- remove duplicate code shared with TDX and use common interfaces
defined for SNP and TDX for kexec/kdump.
- remove set_pte_enc() dependency on pg_level_to_pfn() and make the
function simpler.
- rename unshare_pte() to make_pte_private().
- clarify and make the comment for using kexec_last_address_to_make_private
more understandable.
- general cleanup.
Ashish Kalra (3):
efi/x86: Fix EFI memory map corruption with kexec
x86/boot/compressed: Skip Video Memory access in Decompressor for
SEV-ES/SNP.
x86/snp: Convert shared memory back to private on kexec
arch/x86/boot/compressed/misc.c | 6 +-
arch/x86/include/asm/sev.h | 4 +
arch/x86/kernel/sev.c | 162 ++++++++++++++++++++++++++++++++
arch/x86/mm/mem_encrypt_amd.c | 3 +
arch/x86/platform/efi/quirks.c | 30 +++++-
5 files changed, 200 insertions(+), 5 deletions(-)
base-commit: f8441cd55885e43eb0d4e8eedc6c5ab15d2dabf1
prerequisite-patch-id: a911f230c2524bd791c47f62f17f0a93cbf726b6
prerequisite-patch-id: bfe2fa046349978ac1825275eb205acecfbc22f3
prerequisite-patch-id: 5e60d292457c7cd98fd3e45c23127e9463b56a69
prerequisite-patch-id: 1f97d0a2edb7509dd58276f628d1a4bda62c154c
prerequisite-patch-id: 6e07f4d4ac95ad1d2c7750ebd3e87483fb9fd48f
prerequisite-patch-id: 24ec385d6a89cf2c8553c6d29515cc513643a68a
prerequisite-patch-id: 6a8bda2b3cf9bfab8177acdcfc8dd0408ed129fa
prerequisite-patch-id: 99382c42348b9a076ba930eca0dfc9d000ec951d
prerequisite-patch-id: 469a0a3c78b0eca82527cd85e2205fb8fb89d645
prerequisite-patch-id: 2be870cdf58bdc6a10ca3c18bf874e5c6cfb7e42
prerequisite-patch-id: 7fc62697fb6bdade0bab66ba2b45a19759008f9e
prerequisite-patch-id: 95356474298029468750a9c1bc2224fb09a86eed
prerequisite-patch-id: d4966ae63e86d24b0bf578da4dae871cd9002b12
prerequisite-patch-id: fccde6f1fa385b5af0195f81fcb95acd71822428
prerequisite-patch-id: 16048ee15e392b0b9217b8923939b0059311abd2
prerequisite-patch-id: 5c9ae9aa294f72f63ae2c3551507dfbd92525803
prerequisite-patch-id: 758bdb686290c018cbd5b7d005354019f9d15248
prerequisite-patch-id: c85fd0bb6d183a40da73720eaa607481b1d51daf
prerequisite-patch-id: 60760e0c98ab7ccd2ca22ae3e9f20ff5a94c6e91
--
2.34.1
^ permalink raw reply [flat|nested] 115+ messages in thread
* [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-05-30 23:36 ` [PATCH v7 0/3] x86/snp: " Ashish Kalra
@ 2024-05-30 23:36 ` Ashish Kalra
2024-05-31 9:12 ` Alexander Kuleshov
2024-06-03 8:56 ` Borislav Petkov
2024-05-30 23:37 ` [PATCH v7 2/3] x86/boot/compressed: Skip Video Memory access in Decompressor for SEV-ES/SNP Ashish Kalra
2024-05-30 23:37 ` [PATCH v7 3/3] x86/snp: Convert shared memory back to private on kexec Ashish Kalra
2 siblings, 2 replies; 115+ messages in thread
From: Ashish Kalra @ 2024-05-30 23:36 UTC (permalink / raw)
To: tglx, mingo, bp, dave.hansen, x86
Cc: rafael, hpa, peterz, adrian.hunter, sathyanarayanan.kuppuswamy,
jun.nakajima, rick.p.edgecombe, thomas.lendacky, michael.roth,
seanjc, kai.huang, bhe, kirill.shutemov, bdas, vkuznets,
dionnaglaze, anisinha, jroedel, ardb, kexec, linux-coco,
linux-kernel
From: Ashish Kalra <ashish.kalra@amd.com>
With SNP guest kexec observe the following efi memmap corruption :
[ 0.000000] efi: EFI v2.7 by EDK II
[ 0.000000] efi: SMBIOS=0x7e33f000 SMBIOS 3.0=0x7e33d000 ACPI=0x7e57e000 ACPI 2.0=0x7e57e014 MEMATTR=0x7cc3c018 Unaccepted=0x7c09e018
[ 0.000000] efi: [Firmware Bug]: Invalid EFI memory map entries:
[ 0.000000] efi: mem03: [type=269370880|attr=0x0e42100e42180e41] range=[0x0486200e41038c18-0x200e898a0eee713ac17] (invalid)
[ 0.000000] efi: mem04: [type=12336|attr=0x0e410686300e4105] range=[0x100e420000000176-0x8c290f26248d200e175] (invalid)
[ 0.000000] efi: mem06: [type=1124304408|attr=0x000030b400000028] range=[0x0e51300e45280e77-0xb44ed2142f460c1e76] (invalid)
[ 0.000000] efi: mem08: [type=68|attr=0x300e540583280e41] range=[0x0000011affff3cd8-0x486200e54b38c0bcd7] (invalid)
[ 0.000000] efi: mem10: [type=1107529240|attr=0x0e42280e41300e41] range=[0x300e41058c280e42-0x38010ae54c5c328ee41] (invalid)
[ 0.000000] efi: mem11: [type=189335566|attr=0x048d200e42038e18] range=[0x0000318c00000048-0xe42029228ce4200047] (invalid)
[ 0.000000] efi: mem12: [type=239142534|attr=0x0000002400000b4b] range=[0x0e41380e0a7d700e-0x80f26238f22bfe500d] (invalid)
[ 0.000000] efi: mem14: [type=239207055|attr=0x0e41300e43380e0a] range=[0x8c280e42048d200e-0xc70b028f2f27cc0a00d] (invalid)
[ 0.000000] efi: mem15: [type=239210510|attr=0x00080e660b47080e] range=[0x0000324c0000001c-0xa78028634ce490001b] (invalid)
[ 0.000000] efi: mem16: [type=4294848528|attr=0x0000329400000014] range=[0x0e410286100e4100-0x80f252036a218f20ff] (invalid)
[ 0.000000] efi: mem19: [type=2250772033|attr=0x42180e42200e4328] range=[0x41280e0ab9020683-0xe0e538c28b39e62682] (invalid)
[ 0.000000] efi: mem20: [type=16| | | | | | | | | | |WB| |WC| ] range=[0x00000008ffff4438-0xffff44340090333c437] (invalid)
[ 0.000000] efi: mem22: [Reserved |attr=0x000000c1ffff4420] range=[0xffff442400003398-0x1033a04240003f397] (invalid)
[ 0.000000] efi: mem23: [type=1141080856|attr=0x080e41100e43180e] range=[0x280e66300e4b280e-0x440dc5ee7141f4c080d] (invalid)
[ 0.000000] efi: mem25: [Reserved |attr=0x0000000affff44a0] range=[0xffff44a400003428-0x1034304a400013427] (invalid)
[ 0.000000] efi: mem28: [type=16| | | | | | | | | | |WB| |WC| ] range=[0x0000000affff4488-0xffff448400b034bc487] (invalid)
[ 0.000000] efi: mem30: [Reserved |attr=0x0000000affff4470] range=[0xffff447400003518-0x10352047400013517] (invalid)
[ 0.000000] efi: mem33: [type=16| | | | | | | | | | |WB| |WC| ] range=[0x0000000affff4458-0xffff445400b035ac457] (invalid)
[ 0.000000] efi: mem35: [type=269372416|attr=0x0e42100e42180e41] range=[0x0486200e44038c18-0x200e8b8a0eee823ac17] (invalid)
[ 0.000000] efi: mem37: [type=2351435330|attr=0x0e42100e42180e42] range=[0x470783380e410686-0x2002b2a041c2141e685] (invalid)
[ 0.000000] efi: mem38: [type=1093668417|attr=0x100e420000000270] range=[0x42100e42180e4220-0xfff366a4e421b78c21f] (invalid)
[ 0.000000] efi: mem39: [type=76357646|attr=0x180e42200e42280e] range=[0x0e410686300e4105-0x4130f251a0710ae5104] (invalid)
[ 0.000000] efi: mem40: [type=940444268|attr=0x0e42200e42280e41] range=[0x180e42200e42280e-0x300fc71c300b4f2480d] (invalid)
[ 0.000000] efi: mem41: [MMIO |attr=0x8c280e42048d200e] range=[0xffff479400003728-0x42138e0c87820292727] (invalid)
[ 0.000000] efi: mem42: [type=1191674680|attr=0x0000004c0000000b] range=[0x300e41380e0a0246-0x470b0f26238f22b8245] (invalid)
[ 0.000000] efi: mem43: [type=2010|attr=0x0301f00e4d078338] range=[0x45038e180e42028f-0xe4556bf118f282528e] (invalid)
[ 0.000000] efi: mem44: [type=1109921345|attr=0x300e44000000006c] range=[0x44080e42100e4218-0xfff39254e42138ac217] (invalid)
...
This EFI memap corruption is happening with efi_arch_mem_reserve() invocation in case of kexec boot.
( efi_arch_mem_reserve() is invoked with the following call-stack: )
[ 0.310010] efi_arch_mem_reserve+0xb1/0x220
[ 0.311382] efi_mem_reserve+0x36/0x60
[ 0.311973] efi_bgrt_init+0x17d/0x1a0
[ 0.313265] acpi_parse_bgrt+0x12/0x20
[ 0.313858] acpi_table_parse+0x77/0xd0
[ 0.314463] acpi_boot_init+0x362/0x630
[ 0.315069] setup_arch+0xa88/0xf80
[ 0.315629] start_kernel+0x68/0xa90
[ 0.316194] x86_64_start_reservations+0x1c/0x30
[ 0.316921] x86_64_start_kernel+0xbf/0x110
[ 0.317582] common_startup_64+0x13e/0x141
efi_arch_mem_reserve() calls efi_memmap_alloc() to allocate memory for
EFI memory map and due to early allocation it uses memblock allocation.
Later during boot, efi_enter_virtual_mode() calls kexec_enter_virtual_mode()
in case of a kexec-ed kernel boot.
This function kexec_enter_virtual_mode() installs the new EFI memory map by
calling efi_memmap_init_late() which remaps the efi_memmap physically allocated
in efi_arch_mem_reserve(), but this remapping is still using memblock allocation.
Subsequently, when memblock is freed later in boot flow, this remapped
efi_memmap will have random corruption (similar to a use-after-free scenario).
The corrupted EFI memory map is then passed to the next kexec-ed kernel
which causes a panic when trying to use the corrupted EFI memory map.
Fix this EFI memory map corruption by skipping efi_arch_mem_reserve() for kexec.
Additionally, efi_mem_reserve() is used to reserve boot service memory
eg. bgrt, but it is not necessary for kexec boot, as there are no
boot services in kexec reboot at all after the first kernel ExitBootServices().
The UEFI memmap passed to kexec kernel includes not only the runtime
service memory map but also the boot service memory ranges which were
reserved by the first kernel with efi_mem_reserve, and those boot service
memory ranges have already been marked "EFI_MEMORY_RUNTIME" attribute.
This is the additional reason why efi_mem_reserve can be skipped
for kexec booting and by checking the set EFI_MEMORY_RUNTIME attribute.
Suggested-by: Dave Young <dyoung@redhat.com>
[Dave Young: checking the md attribute instead of checking the efi_setup]
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
arch/x86/platform/efi/quirks.c | 30 +++++++++++++++++++++++++++---
1 file changed, 27 insertions(+), 3 deletions(-)
diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index f0cc00032751..6f398c59278a 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -255,15 +255,39 @@ void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
struct efi_memory_map_data data = { 0 };
struct efi_mem_range mr;
efi_memory_desc_t md;
- int num_entries;
+ int num_entries, ret;
void *new;
- if (efi_mem_desc_lookup(addr, &md) ||
- md.type != EFI_BOOT_SERVICES_DATA) {
+ /*
+ * efi_mem_reserve() is used to reserve boot service memory, eg. bgrt,
+ * but it is not neccasery for kexec, as there are no boot services in
+ * kexec reboot at all after the first kernel's ExitBootServices().
+ *
+ * Additionally kexec_enter_virtual_mode() during late init will remap
+ * the efi_memmap physical pages allocated here via memblock & then
+ * subsequently cause random EFI memmap corruption once memblock is freed.
+ *
+ * Therefore, skip efi_mem_reserve for kexec booting by checking the
+ * EFI_MEMORY_RUNTIME attribute which indicates boot service memory
+ * ranges reserved by the first kernel using efi_mem_reserve and marked
+ * with EFI_MEMORY_RUNTIME attribute.
+ */
+
+ ret = efi_mem_desc_lookup(addr, &md);
+ if (ret) {
pr_err("Failed to lookup EFI memory descriptor for %pa\n", &addr);
return;
}
+ if (md.type != EFI_BOOT_SERVICES_DATA) {
+ pr_err("Skip reserving non EFI Boot Service Data memory for %pa\n", &addr);
+ return;
+ }
+
+ /* Kexec copied the efi memmap from the first kernel, thus skip the case */
+ if (md.attribute & EFI_MEMORY_RUNTIME)
+ return;
+
if (addr + size > md.phys_addr + (md.num_pages << EFI_PAGE_SHIFT)) {
pr_err("Region spans EFI memory descriptors, %pa\n", &addr);
return;
--
2.34.1
^ permalink raw reply related [flat|nested] 115+ messages in thread
* [PATCH v7 2/3] x86/boot/compressed: Skip Video Memory access in Decompressor for SEV-ES/SNP.
2024-05-30 23:36 ` [PATCH v7 0/3] x86/snp: " Ashish Kalra
2024-05-30 23:36 ` [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec Ashish Kalra
@ 2024-05-30 23:37 ` Ashish Kalra
2024-06-05 20:14 ` Borislav Petkov
2024-05-30 23:37 ` [PATCH v7 3/3] x86/snp: Convert shared memory back to private on kexec Ashish Kalra
2 siblings, 1 reply; 115+ messages in thread
From: Ashish Kalra @ 2024-05-30 23:37 UTC (permalink / raw)
To: tglx, mingo, bp, dave.hansen, x86
Cc: rafael, hpa, peterz, adrian.hunter, sathyanarayanan.kuppuswamy,
jun.nakajima, rick.p.edgecombe, thomas.lendacky, michael.roth,
seanjc, kai.huang, bhe, kirill.shutemov, bdas, vkuznets,
dionnaglaze, anisinha, jroedel, ardb, kexec, linux-coco,
linux-kernel
From: Ashish Kalra <ashish.kalra@amd.com>
Accessing guest video memory/RAM during kernel decompressor
causes guest termination as boot stage2 #VC handler for
SEV-ES/SNP systems does not support MMIO handling.
This issue is observed with SEV-ES/SNP guest kexec as
kexec -c adds screen_info to the boot parameters
passed to the kexec kernel, which causes console output to
be dumped to both video and serial.
As the decompressor output gets cleared really fast, it is
preferable to get the console output only on serial, hence,
skip accessing video RAM during decompressor stage to
prevent guest termination.
Serial console output during decompressor stage works as
boot stage2 #VC handler already supports handling port I/O.
Suggested-by: Thomas Lendacy <thomas.lendacky@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---
arch/x86/boot/compressed/misc.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index b70e4a21c15f..3b9f96b3dbcc 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -427,8 +427,10 @@ asmlinkage __visible void *extract_kernel(void *rmode, unsigned char *output)
vidport = 0x3d4;
}
- lines = boot_params_ptr->screen_info.orig_video_lines;
- cols = boot_params_ptr->screen_info.orig_video_cols;
+ if (!(sev_status & MSR_AMD64_SEV_ES_ENABLED)) {
+ lines = boot_params_ptr->screen_info.orig_video_lines;
+ cols = boot_params_ptr->screen_info.orig_video_cols;
+ }
init_default_io_ops();
--
2.34.1
^ permalink raw reply related [flat|nested] 115+ messages in thread
* [PATCH v7 3/3] x86/snp: Convert shared memory back to private on kexec
2024-05-30 23:36 ` [PATCH v7 0/3] x86/snp: " Ashish Kalra
2024-05-30 23:36 ` [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec Ashish Kalra
2024-05-30 23:37 ` [PATCH v7 2/3] x86/boot/compressed: Skip Video Memory access in Decompressor for SEV-ES/SNP Ashish Kalra
@ 2024-05-30 23:37 ` Ashish Kalra
2 siblings, 0 replies; 115+ messages in thread
From: Ashish Kalra @ 2024-05-30 23:37 UTC (permalink / raw)
To: tglx, mingo, bp, dave.hansen, x86
Cc: rafael, hpa, peterz, adrian.hunter, sathyanarayanan.kuppuswamy,
jun.nakajima, rick.p.edgecombe, thomas.lendacky, michael.roth,
seanjc, kai.huang, bhe, kirill.shutemov, bdas, vkuznets,
dionnaglaze, anisinha, jroedel, ardb, kexec, linux-coco,
linux-kernel
From: Ashish Kalra <ashish.kalra@amd.com>
SNP guests allocate shared buffers to perform I/O. It is done by
allocating pages normally from the buddy allocator and converting them
to shared with set_memory_decrypted().
The second kernel has no idea what memory is converted this way. It only
sees E820_TYPE_RAM.
Accessing shared memory via private mapping will cause unrecoverable RMP
page-faults.
On kexec walk direct mapping and convert all shared memory back to
private. It makes all RAM private again and second kernel may use it
normally. Additionally for SNP guests convert all bss decrypted section
pages back to private.
The conversion occurs in two steps: stopping new conversions and
unsharing all memory. In the case of normal kexec, the stopping of
conversions takes place while scheduling is still functioning. This
allows for waiting until any ongoing conversions are finished. The
second step is carried out when all CPUs except one are inactive and
interrupts are disabled. This prevents any conflicts with code that may
access shared memory.
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
arch/x86/include/asm/sev.h | 4 +
arch/x86/kernel/sev.c | 162 ++++++++++++++++++++++++++++++++++
arch/x86/mm/mem_encrypt_amd.c | 3 +
3 files changed, 169 insertions(+)
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index ca20cc4e5826..f9b0a4eb1980 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -229,6 +229,8 @@ void snp_accept_memory(phys_addr_t start, phys_addr_t end);
u64 snp_get_unsupported_features(u64 status);
u64 sev_get_status(void);
void sev_show_status(void);
+void snp_kexec_finish(void);
+void snp_kexec_begin(bool crash);
#else
static inline void sev_es_ist_enter(struct pt_regs *regs) { }
static inline void sev_es_ist_exit(void) { }
@@ -258,6 +260,8 @@ static inline void snp_accept_memory(phys_addr_t start, phys_addr_t end) { }
static inline u64 snp_get_unsupported_features(u64 status) { return 0; }
static inline u64 sev_get_status(void) { return 0; }
static inline void sev_show_status(void) { }
+static inline void snp_kexec_finish(void) { }
+static inline void snp_kexec_begin(bool crash) { }
#endif
#ifdef CONFIG_KVM_AMD_SEV
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 3342ed58e168..941f3996a9b6 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -42,6 +42,8 @@
#include <asm/apic.h>
#include <asm/cpuid.h>
#include <asm/cmdline.h>
+#include <asm/pgtable.h>
+#include <asm/set_memory.h>
#define DR7_RESET_VALUE 0x400
@@ -92,6 +94,9 @@ static struct ghcb *boot_ghcb __section(".data");
/* Bitmap of SEV features supported by the hypervisor */
static u64 sev_hv_features __ro_after_init;
+/* Last address to be switched to private during kexec */
+static unsigned long kexec_last_addr_to_make_private;
+
/* #VC handler runtime per-CPU data */
struct sev_es_runtime_data {
struct ghcb ghcb_page;
@@ -913,6 +918,163 @@ void snp_accept_memory(phys_addr_t start, phys_addr_t end)
set_pages_state(vaddr, npages, SNP_PAGE_STATE_PRIVATE);
}
+static bool set_pte_enc(pte_t *kpte, int level, void *va)
+{
+ pte_t new_pte;
+
+ if (pte_none(*kpte))
+ return false;
+
+ /*
+ * Change the physical page attribute from C=0 to C=1. Flush the
+ * caches to ensure that data gets accessed with the correct C-bit.
+ */
+ if (pte_present(*kpte))
+ clflush_cache_range(va, page_level_size(level));
+
+ new_pte = __pte(cc_mkenc(pte_val(*kpte)));
+ set_pte_atomic(kpte, new_pte);
+
+ return true;
+}
+
+static bool make_pte_private(pte_t *pte, unsigned long addr, int pages, int level)
+{
+ struct sev_es_runtime_data *data;
+ struct ghcb *ghcb;
+
+ data = this_cpu_read(runtime_data);
+ ghcb = &data->ghcb_page;
+
+ /* Check for GHCB for being part of a PMD range. */
+ if ((unsigned long)ghcb >= addr &&
+ (unsigned long)ghcb <= (addr + (pages * PAGE_SIZE))) {
+ /*
+ * Ensure that the current cpu's GHCB is made private
+ * at the end of unshared loop so that we continue to use the
+ * optimized GHCB protocol and not force the switch to
+ * MSR protocol till the very end.
+ */
+ pr_debug("setting boot_ghcb to NULL for this cpu ghcb\n");
+ kexec_last_addr_to_make_private = addr;
+ return true;
+ }
+
+ if (!set_pte_enc(pte, level, (void *)addr))
+ return false;
+
+ snp_set_memory_private(addr, pages);
+
+ return true;
+}
+
+static void unshare_all_memory(void)
+{
+ unsigned long addr, end;
+
+ /*
+ * Walk direct mapping and convert all shared memory back to private,
+ */
+
+ addr = PAGE_OFFSET;
+ end = PAGE_OFFSET + get_max_mapped();
+
+ while (addr < end) {
+ unsigned long size;
+ unsigned int level;
+ pte_t *pte;
+
+ pte = lookup_address(addr, &level);
+ size = page_level_size(level);
+
+ /*
+ * pte_none() check is required to skip physical memory holes in direct mapped.
+ */
+ if (pte && pte_decrypted(*pte) && !pte_none(*pte)) {
+ int pages = size / PAGE_SIZE;
+
+ if (!make_pte_private(pte, addr, pages, level)) {
+ pr_err("Failed to unshare range %#lx-%#lx\n",
+ addr, addr + size);
+ }
+
+ }
+
+ addr += size;
+ }
+ __flush_tlb_all();
+
+}
+
+static void unshare_all_bss_decrypted_memory(void)
+{
+ unsigned long vaddr, vaddr_end;
+ unsigned int level;
+ unsigned int npages;
+ pte_t *pte;
+
+ vaddr = (unsigned long)__start_bss_decrypted;
+ vaddr_end = (unsigned long)__start_bss_decrypted_unused;
+ npages = (vaddr_end - vaddr) >> PAGE_SHIFT;
+ for (; vaddr < vaddr_end; vaddr += PAGE_SIZE) {
+ pte = lookup_address(vaddr, &level);
+ if (!pte || !pte_decrypted(*pte) || pte_none(*pte))
+ continue;
+
+ set_pte_enc(pte, level, (void *)vaddr);
+ }
+ vaddr = (unsigned long)__start_bss_decrypted;
+ snp_set_memory_private(vaddr, npages);
+}
+
+/* Stop new private<->shared conversions */
+void snp_kexec_begin(bool crash)
+{
+ /*
+ * Crash kernel reaches here with interrupts disabled: can't wait for
+ * conversions to finish.
+ *
+ * If race happened, just report and proceed.
+ */
+ bool wait_for_lock = !crash;
+
+ if (!set_memory_enc_stop_conversion(wait_for_lock))
+ pr_warn("Failed to stop shared<->private conversions\n");
+}
+
+/* Walk direct mapping and convert all shared memory back to private */
+void snp_kexec_finish(void)
+{
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return;
+
+ unshare_all_memory();
+
+ unshare_all_bss_decrypted_memory();
+
+ if (kexec_last_addr_to_make_private) {
+ unsigned long size;
+ unsigned int level;
+ pte_t *pte;
+
+ /*
+ * Switch to using the MSR protocol to change this cpu's
+ * GHCB to private.
+ * All the per-cpu GHCBs have been switched back to private,
+ * so can't do any more GHCB calls to the hypervisor beyond
+ * this point till the kexec kernel starts running.
+ */
+ boot_ghcb = NULL;
+ sev_cfg.ghcbs_initialized = false;
+
+ pr_debug("boot ghcb 0x%lx\n", kexec_last_addr_to_make_private);
+ pte = lookup_address(kexec_last_addr_to_make_private, &level);
+ size = page_level_size(level);
+ set_pte_enc(pte, level, (void *)kexec_last_addr_to_make_private);
+ snp_set_memory_private(kexec_last_addr_to_make_private, (size / PAGE_SIZE));
+ }
+}
+
static int snp_set_vmsa(void *va, bool vmsa)
{
u64 attrs;
diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
index e7b67519ddb5..3ba792cd28ef 100644
--- a/arch/x86/mm/mem_encrypt_amd.c
+++ b/arch/x86/mm/mem_encrypt_amd.c
@@ -468,6 +468,9 @@ void __init sme_early_init(void)
x86_platform.guest.enc_tlb_flush_required = amd_enc_tlb_flush_required;
x86_platform.guest.enc_cache_flush_required = amd_enc_cache_flush_required;
+ x86_platform.guest.enc_kexec_begin = snp_kexec_begin;
+ x86_platform.guest.enc_kexec_finish = snp_kexec_finish;
+
/*
* AMD-SEV-ES intercepts the RDMSR to read the X2APIC ID in the
* parallel bringup low level code. That raises #VC which cannot be
--
2.34.1
^ permalink raw reply related [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-05-30 23:36 ` [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec Ashish Kalra
@ 2024-05-31 9:12 ` Alexander Kuleshov
2024-06-03 8:56 ` Borislav Petkov
1 sibling, 0 replies; 115+ messages in thread
From: Alexander Kuleshov @ 2024-05-31 9:12 UTC (permalink / raw)
To: Ashish Kalra
Cc: tglx, mingo, bp, dave.hansen, x86, rafael, hpa, peterz,
adrian.hunter, sathyanarayanan.kuppuswamy, jun.nakajima,
rick.p.edgecombe, thomas.lendacky, michael.roth, seanjc,
kai.huang, bhe, kirill.shutemov, bdas, vkuznets, dionnaglaze,
anisinha, jroedel, ardb, kexec, linux-coco, linux-kernel
On 30.05.2024 23:36, Ashish Kalra wrote:
>From: Ashish Kalra <ashish.kalra@amd.com>
>+ * but it is not neccasery for kexec, as there are no boot services in
A typo in necessary
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 11/19] x86/tdx: Convert shared memory back to private on kexec
2024-05-28 9:55 ` [PATCHv11 11/19] x86/tdx: Convert shared memory back to private on kexec Kirill A. Shutemov
@ 2024-05-31 15:14 ` Borislav Petkov
2024-05-31 17:34 ` Kalra, Ashish
` (2 more replies)
2024-06-04 16:27 ` [PATCHv11 " Dave Hansen
1 sibling, 3 replies; 115+ messages in thread
From: Borislav Petkov @ 2024-05-31 15:14 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, Rafael J. Wysocki,
Peter Zijlstra, Adrian Hunter, Kuppuswamy Sathyanarayanan,
Elena Reshetova, Jun Nakajima, Rick Edgecombe, Tom Lendacky,
Kalra, Ashish, Sean Christopherson, Huang, Kai, Ard Biesheuvel,
Baoquan He, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
kexec, linux-hyperv, linux-acpi, linux-coco, linux-kernel,
Tao Liu
On Tue, May 28, 2024 at 12:55:14PM +0300, Kirill A. Shutemov wrote:
> +static void tdx_kexec_finish(void)
> +{
> + unsigned long addr, end;
> + long found = 0, shared;
> +
> + lockdep_assert_irqs_disabled();
> +
> + addr = PAGE_OFFSET;
> + end = PAGE_OFFSET + get_max_mapped();
> +
> + while (addr < end) {
> + unsigned long size;
> + unsigned int level;
> + pte_t *pte;
> +
> + pte = lookup_address(addr, &level);
> + size = page_level_size(level);
> +
> + if (pte && pte_decrypted(*pte)) {
> + int pages = size / PAGE_SIZE;
> +
> + /*
> + * Touching memory with shared bit set triggers implicit
> + * conversion to shared.
> + *
> + * Make sure nobody touches the shared range from
> + * now on.
> + */
> + set_pte(pte, __pte(0));
> +
Format the below into a comment here:
/*
The only thing one can do at this point on failure is panic. It is
reasonable to proceed, especially for the crash case because the
kexec-ed kernel is using a different page table so there won't be
a mismatch between shared/private marking of the page so it doesn't
matter.
Also, even if the failure is real and the page cannot be touched as
private, the kdump kernel will boot fine as it uses pre-reserved memory.
What happens next depends on what the dumping process does and there's
a reasonable chance to produce useful dump on crash.
Regardless, the print leaves a trace in the log to give a clue for
debug.
One possible reason for the failure is if kdump raced with memory
conversion. In this case shared bit in page table got set (or not
cleared form shared->private conversion), but the page is actually
private. So this failure is not going to affect the kexec'ed kernel.
*/
<---
> + if (!tdx_enc_status_changed(addr, pages, true)) {
> + pr_err("Failed to unshare range %#lx-%#lx\n",
> + addr, addr + size);
> + }
> +
> + found += pages;
> + }
> +
> + addr += size;
> + }
> +
> + __flush_tlb_all();
> +
> + shared = atomic_long_read(&nr_shared);
> + if (shared != found) {
> + pr_err("shared page accounting is off\n");
> + pr_err("nr_shared = %ld, nr_found = %ld\n", shared, found);
> + }
> +}
...
> static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
> {
> - if (cc_platform_has(CC_ATTR_MEM_ENCRYPT))
> - return __set_memory_enc_pgtable(addr, numpages, enc);
> + int ret = 0;
>
> - return 0;
> + if (cc_platform_has(CC_ATTR_MEM_ENCRYPT)) {
> + if (!down_read_trylock(&mem_enc_lock))
> + return -EBUSY;
> +
> + ret = __set_memory_enc_pgtable(addr, numpages, enc);
> +
> + up_read(&mem_enc_lock);
> + }
So CC_ATTR_MEM_ENCRYPT is set for SEV* guests too. You need to change
that code here to take the lock only on TDX, where you want it, not on
the others.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 11/19] x86/tdx: Convert shared memory back to private on kexec
2024-05-31 15:14 ` Borislav Petkov
@ 2024-05-31 17:34 ` Kalra, Ashish
2024-05-31 18:06 ` Borislav Petkov
2024-06-02 14:20 ` Kirill A. Shutemov
2024-06-02 14:23 ` [PATCHv11.1 " Kirill A. Shutemov
2 siblings, 1 reply; 115+ messages in thread
From: Kalra, Ashish @ 2024-05-31 17:34 UTC (permalink / raw)
To: Borislav Petkov, Kirill A. Shutemov
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, Rafael J. Wysocki,
Peter Zijlstra, Adrian Hunter, Kuppuswamy Sathyanarayanan,
Elena Reshetova, Jun Nakajima, Rick Edgecombe, Tom Lendacky,
Sean Christopherson, Huang, Kai, Ard Biesheuvel, Baoquan He,
H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel, Tao Liu
Hello Boris,
On 5/31/2024 10:14 AM, Borislav Petkov wrote:
>> static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
>> {
>> - if (cc_platform_has(CC_ATTR_MEM_ENCRYPT))
>> - return __set_memory_enc_pgtable(addr, numpages, enc);
>> + int ret = 0;
>>
>> - return 0;
>> + if (cc_platform_has(CC_ATTR_MEM_ENCRYPT)) {
>> + if (!down_read_trylock(&mem_enc_lock))
>> + return -EBUSY;
>> +
>> + ret = __set_memory_enc_pgtable(addr, numpages, enc);
>> +
>> + up_read(&mem_enc_lock);
>> + }
> So CC_ATTR_MEM_ENCRYPT is set for SEV* guests too. You need to change
> that code here to take the lock only on TDX, where you want it, not on
> the others.
SNP guest kexec patches are based on top of this patch-series and SNP
guests also need this exclusive mem_enc_lock protection, so
CC_ATTR_MEM_ENCRYPT makes sense to be used here.
Thanks, Ashish
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 11/19] x86/tdx: Convert shared memory back to private on kexec
2024-05-31 17:34 ` Kalra, Ashish
@ 2024-05-31 18:06 ` Borislav Petkov
0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2024-05-31 18:06 UTC (permalink / raw)
To: Kalra, Ashish
Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar, Dave Hansen,
x86, Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Sean Christopherson, Huang, Kai,
Ard Biesheuvel, Baoquan He, H. Peter Anvin, K. Y. Srinivasan,
Haiyang Zhang, kexec, linux-hyperv, linux-acpi, linux-coco,
linux-kernel, Tao Liu
On Fri, May 31, 2024 at 12:34:49PM -0500, Kalra, Ashish wrote:
> SNP guest kexec patches are based on top of this patch-series and SNP guests
> also need this exclusive mem_enc_lock protection, so CC_ATTR_MEM_ENCRYPT
> makes sense to be used here.
Well, for the future, I'd encourage you to always send an Acked-by: you
or Reviewed-by: you as a reply to such patches so that it is clear that
such a change is desired.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* [PATCHv11.1 10/19] x86/mm: Add callbacks to prepare encrypted memory for kexec
2024-05-29 10:42 ` Borislav Petkov
@ 2024-06-02 12:39 ` Kirill A. Shutemov
2024-06-02 12:42 ` Kirill A. Shutemov
2024-06-02 12:44 ` [PATCHv11.2 " Kirill A. Shutemov
1 sibling, 1 reply; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-06-02 12:39 UTC (permalink / raw)
To: bp
Cc: adrian.hunter, ardb, ashish.kalra, bhe, dave.hansen,
elena.reshetova, haiyangz, hpa, jun.nakajima, kai.huang, kexec,
kirill.shutemov, kys, linux-acpi, linux-coco, linux-hyperv,
linux-kernel, ltao, mingo, nik.borisov, peterz, rafael,
rick.p.edgecombe, sathyanarayanan.kuppuswamy, seanjc, tglx,
thomas.lendacky, x86
AMD SEV and Intel TDX guests allocate shared buffers for performing I/O.
This is done by allocating pages normally from the buddy allocator and
then converting them to shared using set_memory_decrypted().
On kexec, the second kernel is unaware of which memory has been
converted in this manner. It only sees E820_TYPE_RAM. Accessing shared
memory as private is fatal.
Therefore, the memory state must be reset to its original state before
starting the new kernel with kexec.
The process of converting shared memory back to private occurs in two
steps:
- enc_kexec_begin() stops new conversions.
- enc_kexec_finish() unshares all existing shared memory, reverting it
back to private.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Tested-by: Tao Liu <ltao@redhat.com>
---
arch/x86/include/asm/x86_init.h | 9 +++++++++
arch/x86/kernel/crash.c | 12 ++++++++++++
arch/x86/kernel/reboot.c | 12 ++++++++++++
arch/x86/kernel/x86_init.c | 4 ++++
4 files changed, 37 insertions(+)
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index 28ac3cb9b987..6cade48811cc 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -149,12 +149,21 @@ struct x86_init_acpi {
* @enc_status_change_finish Notify HV after the encryption status of a range is changed
* @enc_tlb_flush_required Returns true if a TLB flush is needed before changing page encryption status
* @enc_cache_flush_required Returns true if a cache flush is needed before changing page encryption status
+ * @enc_kexec_begin Begin the two-step process of conversion shared memory back
+ * to private. It stops the new conversions from being started
+ * and waits in-flight conversions to finish, if possible.
+ * @enc_kexec_finish Finish the two-step process of conversion shared memory to
+ * private. All memory is private after the call.
+ * It called with all CPUs but one shutdown and interrupts
+ * disabled.
*/
struct x86_guest {
int (*enc_status_change_prepare)(unsigned long vaddr, int npages, bool enc);
int (*enc_status_change_finish)(unsigned long vaddr, int npages, bool enc);
bool (*enc_tlb_flush_required)(bool enc);
bool (*enc_cache_flush_required)(void);
+ void (*enc_kexec_begin)(bool crash);
+ void (*enc_kexec_finish)(void);
};
/**
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index f06501445cd9..74f6305eb9ec 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -128,6 +128,18 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
#ifdef CONFIG_HPET_TIMER
hpet_disable();
#endif
+
+ /*
+ * Non-crash kexec calls enc_kexec_begin() while scheduling is still
+ * active. This allows the callback to wait until all in-flight
+ * shared<->private conversions are complete. In a crash scenario,
+ * enc_kexec_begin() get call after all but one CPU has been shut down
+ * and interrupts have been disabled. This only allows the callback to
+ * detect a race with the conversion and report it.
+ */
+ x86_platform.guest.enc_kexec_begin(true);
+ x86_platform.guest.enc_kexec_finish();
+
crash_save_cpu(regs, safe_smp_processor_id());
}
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index f3130f762784..097313147ad3 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -12,6 +12,7 @@
#include <linux/delay.h>
#include <linux/objtool.h>
#include <linux/pgtable.h>
+#include <linux/kexec.h>
#include <acpi/reboot.h>
#include <asm/io.h>
#include <asm/apic.h>
@@ -716,6 +717,14 @@ static void native_machine_emergency_restart(void)
void native_machine_shutdown(void)
{
+ /*
+ * Call enc_kexec_begin() while all CPUs are still active and
+ * interrupts are enabled. This will allow all in-flight memory
+ * conversions to finish cleanly.
+ */
+ if (kexec_in_progress)
+ x86_platform.guest.enc_kexec_begin(false);
+
/* Stop the cpus and apics */
#ifdef CONFIG_X86_IO_APIC
/*
@@ -752,6 +761,9 @@ void native_machine_shutdown(void)
#ifdef CONFIG_X86_64
x86_platform.iommu_shutdown();
#endif
+
+ if (kexec_in_progress)
+ x86_platform.guest.enc_kexec_finish();
}
static void __machine_emergency_restart(int emergency)
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index a7143bb7dd93..8a79fb505303 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -138,6 +138,8 @@ static int enc_status_change_prepare_noop(unsigned long vaddr, int npages, bool
static int enc_status_change_finish_noop(unsigned long vaddr, int npages, bool enc) { return 0; }
static bool enc_tlb_flush_required_noop(bool enc) { return false; }
static bool enc_cache_flush_required_noop(void) { return false; }
+static void enc_kexec_begin_noop(bool crash) {}
+static void enc_kexec_finish_noop(void) {}
static bool is_private_mmio_noop(u64 addr) {return false; }
struct x86_platform_ops x86_platform __ro_after_init = {
@@ -161,6 +163,8 @@ struct x86_platform_ops x86_platform __ro_after_init = {
.enc_status_change_finish = enc_status_change_finish_noop,
.enc_tlb_flush_required = enc_tlb_flush_required_noop,
.enc_cache_flush_required = enc_cache_flush_required_noop,
+ .enc_kexec_begin = enc_kexec_begin_noop,
+ .enc_kexec_finish = enc_kexec_finish_noop,
},
};
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* Re: [PATCHv11.1 10/19] x86/mm: Add callbacks to prepare encrypted memory for kexec
2024-06-02 12:39 ` [PATCHv11.1 " Kirill A. Shutemov
@ 2024-06-02 12:42 ` Kirill A. Shutemov
0 siblings, 0 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-06-02 12:42 UTC (permalink / raw)
To: bp
Cc: adrian.hunter, ardb, ashish.kalra, bhe, dave.hansen,
elena.reshetova, haiyangz, hpa, jun.nakajima, kai.huang, kexec,
kys, linux-acpi, linux-coco, linux-hyperv, linux-kernel, ltao,
mingo, nik.borisov, peterz, rafael, rick.p.edgecombe,
sathyanarayanan.kuppuswamy, seanjc, tglx, thomas.lendacky, x86
Please disregard this. I failed to fold changes :/
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 115+ messages in thread
* [PATCHv11.2 10/19] x86/mm: Add callbacks to prepare encrypted memory for kexec
2024-05-29 10:42 ` Borislav Petkov
2024-06-02 12:39 ` [PATCHv11.1 " Kirill A. Shutemov
@ 2024-06-02 12:44 ` Kirill A. Shutemov
1 sibling, 0 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-06-02 12:44 UTC (permalink / raw)
To: bp
Cc: adrian.hunter, ardb, ashish.kalra, bhe, dave.hansen,
elena.reshetova, haiyangz, hpa, jun.nakajima, kai.huang, kexec,
kirill.shutemov, kys, linux-acpi, linux-coco, linux-hyperv,
linux-kernel, ltao, mingo, nik.borisov, peterz, rafael,
rick.p.edgecombe, sathyanarayanan.kuppuswamy, seanjc, tglx,
thomas.lendacky, x86
AMD SEV and Intel TDX guests allocate shared buffers for performing I/O.
This is done by allocating pages normally from the buddy allocator and
then converting them to shared using set_memory_decrypted().
On kexec, the second kernel is unaware of which memory has been
converted in this manner. It only sees E820_TYPE_RAM. Accessing shared
memory as private is fatal.
Therefore, the memory state must be reset to its original state before
starting the new kernel with kexec.
The process of converting shared memory back to private occurs in two
steps:
- enc_kexec_begin() stops new conversions.
- enc_kexec_finish() unshares all existing shared memory, reverting it
back to private.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Tested-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/include/asm/x86_init.h | 12 ++++++++++++
arch/x86/kernel/crash.c | 12 ++++++++++++
arch/x86/kernel/reboot.c | 12 ++++++++++++
arch/x86/kernel/x86_init.c | 4 ++++
4 files changed, 40 insertions(+)
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index 28ac3cb9b987..b0f313278967 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -149,12 +149,24 @@ struct x86_init_acpi {
* @enc_status_change_finish Notify HV after the encryption status of a range is changed
* @enc_tlb_flush_required Returns true if a TLB flush is needed before changing page encryption status
* @enc_cache_flush_required Returns true if a cache flush is needed before changing page encryption status
+ * @enc_kexec_begin Begin the two-step process of converting shared memory back
+ * to private. It stops the new conversions from being started
+ * and waits in-flight conversions to finish, if possible.
+ * The @crash parameter denotes whether the function is being
+ * called in the crash shutdown path.
+ * @enc_kexec_finish Finish the two-step process of converting shared memory to
+ * private. All memory is private after the call when
+ * the function returns.
+ * It is called on only one CPU while the others are shut down
+ * and with interrupts disabled.
*/
struct x86_guest {
int (*enc_status_change_prepare)(unsigned long vaddr, int npages, bool enc);
int (*enc_status_change_finish)(unsigned long vaddr, int npages, bool enc);
bool (*enc_tlb_flush_required)(bool enc);
bool (*enc_cache_flush_required)(void);
+ void (*enc_kexec_begin)(bool crash);
+ void (*enc_kexec_finish)(void);
};
/**
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index f06501445cd9..fc52ea80cdc8 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -128,6 +128,18 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
#ifdef CONFIG_HPET_TIMER
hpet_disable();
#endif
+
+ /*
+ * Non-crash kexec calls enc_kexec_begin() while scheduling is still
+ * active. This allows the callback to wait until all in-flight
+ * shared<->private conversions are complete. In a crash scenario,
+ * enc_kexec_begin() gets called after all but one CPU have been shut
+ * down and interrupts have been disabled. This allows the callback to
+ * detect a race with the conversion and report it.
+ */
+ x86_platform.guest.enc_kexec_begin(true);
+ x86_platform.guest.enc_kexec_finish();
+
crash_save_cpu(regs, safe_smp_processor_id());
}
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index f3130f762784..097313147ad3 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -12,6 +12,7 @@
#include <linux/delay.h>
#include <linux/objtool.h>
#include <linux/pgtable.h>
+#include <linux/kexec.h>
#include <acpi/reboot.h>
#include <asm/io.h>
#include <asm/apic.h>
@@ -716,6 +717,14 @@ static void native_machine_emergency_restart(void)
void native_machine_shutdown(void)
{
+ /*
+ * Call enc_kexec_begin() while all CPUs are still active and
+ * interrupts are enabled. This will allow all in-flight memory
+ * conversions to finish cleanly.
+ */
+ if (kexec_in_progress)
+ x86_platform.guest.enc_kexec_begin(false);
+
/* Stop the cpus and apics */
#ifdef CONFIG_X86_IO_APIC
/*
@@ -752,6 +761,9 @@ void native_machine_shutdown(void)
#ifdef CONFIG_X86_64
x86_platform.iommu_shutdown();
#endif
+
+ if (kexec_in_progress)
+ x86_platform.guest.enc_kexec_finish();
}
static void __machine_emergency_restart(int emergency)
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index a7143bb7dd93..8a79fb505303 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -138,6 +138,8 @@ static int enc_status_change_prepare_noop(unsigned long vaddr, int npages, bool
static int enc_status_change_finish_noop(unsigned long vaddr, int npages, bool enc) { return 0; }
static bool enc_tlb_flush_required_noop(bool enc) { return false; }
static bool enc_cache_flush_required_noop(void) { return false; }
+static void enc_kexec_begin_noop(bool crash) {}
+static void enc_kexec_finish_noop(void) {}
static bool is_private_mmio_noop(u64 addr) {return false; }
struct x86_platform_ops x86_platform __ro_after_init = {
@@ -161,6 +163,8 @@ struct x86_platform_ops x86_platform __ro_after_init = {
.enc_status_change_finish = enc_status_change_finish_noop,
.enc_tlb_flush_required = enc_tlb_flush_required_noop,
.enc_cache_flush_required = enc_cache_flush_required_noop,
+ .enc_kexec_begin = enc_kexec_begin_noop,
+ .enc_kexec_finish = enc_kexec_finish_noop,
},
};
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* Re: [PATCHv11 11/19] x86/tdx: Convert shared memory back to private on kexec
2024-05-31 15:14 ` Borislav Petkov
2024-05-31 17:34 ` Kalra, Ashish
@ 2024-06-02 14:20 ` Kirill A. Shutemov
2024-06-02 14:23 ` [PATCHv11.1 " Kirill A. Shutemov
2 siblings, 0 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-06-02 14:20 UTC (permalink / raw)
To: Borislav Petkov
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, Rafael J. Wysocki,
Peter Zijlstra, Adrian Hunter, Kuppuswamy Sathyanarayanan,
Elena Reshetova, Jun Nakajima, Rick Edgecombe, Tom Lendacky,
Kalra, Ashish, Sean Christopherson, Huang, Kai, Ard Biesheuvel,
Baoquan He, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
kexec, linux-hyperv, linux-acpi, linux-coco, linux-kernel,
Tao Liu
On Fri, May 31, 2024 at 05:14:42PM +0200, Borislav Petkov wrote:
> On Tue, May 28, 2024 at 12:55:14PM +0300, Kirill A. Shutemov wrote:
> > +static void tdx_kexec_finish(void)
> > +{
> > + unsigned long addr, end;
> > + long found = 0, shared;
> > +
> > + lockdep_assert_irqs_disabled();
> > +
> > + addr = PAGE_OFFSET;
> > + end = PAGE_OFFSET + get_max_mapped();
> > +
> > + while (addr < end) {
> > + unsigned long size;
> > + unsigned int level;
> > + pte_t *pte;
> > +
> > + pte = lookup_address(addr, &level);
> > + size = page_level_size(level);
> > +
> > + if (pte && pte_decrypted(*pte)) {
> > + int pages = size / PAGE_SIZE;
> > +
> > + /*
> > + * Touching memory with shared bit set triggers implicit
> > + * conversion to shared.
> > + *
> > + * Make sure nobody touches the shared range from
> > + * now on.
> > + */
> > + set_pte(pte, __pte(0));
> > +
>
> Format the below into a comment here:
>
> /*
>
> The only thing one can do at this point on failure is panic. It is
> reasonable to proceed, especially for the crash case because the
> kexec-ed kernel is using a different page table so there won't be
> a mismatch between shared/private marking of the page so it doesn't
> matter.
Page tables would not make a difference here. We will switch to identity
mappings soon. And kexec-ed kernel will build new page tables from
scratch.
I will drop the part after "It is reasonable to proceed".
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 115+ messages in thread
* [PATCHv11.1 11/19] x86/tdx: Convert shared memory back to private on kexec
2024-05-31 15:14 ` Borislav Petkov
2024-05-31 17:34 ` Kalra, Ashish
2024-06-02 14:20 ` Kirill A. Shutemov
@ 2024-06-02 14:23 ` Kirill A. Shutemov
2024-06-03 8:37 ` Borislav Petkov
2 siblings, 1 reply; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-06-02 14:23 UTC (permalink / raw)
To: bp
Cc: adrian.hunter, ardb, ashish.kalra, bhe, dave.hansen,
elena.reshetova, haiyangz, hpa, jun.nakajima, kai.huang, kexec,
kirill.shutemov, kys, linux-acpi, linux-coco, linux-hyperv,
linux-kernel, ltao, mingo, peterz, rafael, rick.p.edgecombe,
sathyanarayanan.kuppuswamy, seanjc, tglx, thomas.lendacky, x86
TDX guests allocate shared buffers to perform I/O. It is done by
allocating pages normally from the buddy allocator and converting them
to shared with set_memory_decrypted().
The second, kexec-ed kernel has no idea what memory is converted this
way. It only sees E820_TYPE_RAM.
Accessing shared memory via private mapping is fatal. It leads to
unrecoverable TD exit.
On kexec walk direct mapping and convert all shared memory back to
private. It makes all RAM private again and second kernel may use it
normally.
The conversion occurs in two steps: stopping new conversions and
unsharing all memory. In the case of normal kexec, the stopping of
conversions takes place while scheduling is still functioning. This
allows for waiting until any ongoing conversions are finished. The
second step is carried out when all CPUs except one are inactive and
interrupts are disabled. This prevents any conflicts with code that may
access shared memory.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Tested-by: Tao Liu <ltao@redhat.com>
---
arch/x86/coco/tdx/tdx.c | 90 +++++++++++++++++++++++++++++++
arch/x86/include/asm/pgtable.h | 5 ++
arch/x86/include/asm/set_memory.h | 3 ++
arch/x86/mm/pat/set_memory.c | 41 ++++++++++++--
4 files changed, 136 insertions(+), 3 deletions(-)
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 979891e97d83..afd71bc6eb02 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -7,6 +7,7 @@
#include <linux/cpufeature.h>
#include <linux/export.h>
#include <linux/io.h>
+#include <linux/kexec.h>
#include <asm/coco.h>
#include <asm/tdx.h>
#include <asm/vmx.h>
@@ -14,6 +15,7 @@
#include <asm/insn.h>
#include <asm/insn-eval.h>
#include <asm/pgtable.h>
+#include <asm/set_memory.h>
/* MMIO direction */
#define EPT_READ 0
@@ -831,6 +833,91 @@ static int tdx_enc_status_change_finish(unsigned long vaddr, int numpages,
return 0;
}
+/* Stop new private<->shared conversions */
+static void tdx_kexec_begin(bool crash)
+{
+ /*
+ * Crash kernel reaches here with interrupts disabled: can't wait for
+ * conversions to finish.
+ *
+ * If race happened, just report and proceed.
+ */
+ if (!set_memory_enc_stop_conversion(!crash))
+ pr_warn("Failed to stop shared<->private conversions\n");
+}
+
+/* Walk direct mapping and convert all shared memory back to private */
+static void tdx_kexec_finish(void)
+{
+ unsigned long addr, end;
+ long found = 0, shared;
+
+ lockdep_assert_irqs_disabled();
+
+ addr = PAGE_OFFSET;
+ end = PAGE_OFFSET + get_max_mapped();
+
+ while (addr < end) {
+ unsigned long size;
+ unsigned int level;
+ pte_t *pte;
+
+ pte = lookup_address(addr, &level);
+ size = page_level_size(level);
+
+ if (pte && pte_decrypted(*pte)) {
+ int pages = size / PAGE_SIZE;
+
+ /*
+ * Touching memory with shared bit set triggers implicit
+ * conversion to shared.
+ *
+ * Make sure nobody touches the shared range from
+ * now on.
+ */
+ set_pte(pte, __pte(0));
+
+ /*
+ * The only thing one can do at this point on failure
+ * is panic. It is reasonable to proceed.
+ *
+ * Also, even if the failure is real and the page cannot
+ * be touched as private, the kdump kernel will boot
+ * fine as it uses pre-reserved memory. What happens
+ * next depends on what the dumping process does and
+ * there's a reasonable chance to produce useful dump
+ * on crash.
+ *
+ * Regardless, the print leaves a trace in the log to
+ * give a clue for debug.
+ *
+ * One possible reason for the failure is if kdump raced
+ * with memory conversion. In this case shared bit in
+ * page table got set (or not cleared) during
+ * shared<->private conversion, but the page is actually
+ * private. So this failure is not going to affect the
+ * kexec'ed kernel.
+ */
+ if (!tdx_enc_status_changed(addr, pages, true)) {
+ pr_err("Failed to unshare range %#lx-%#lx\n",
+ addr, addr + size);
+ }
+
+ found += pages;
+ }
+
+ addr += size;
+ }
+
+ __flush_tlb_all();
+
+ shared = atomic_long_read(&nr_shared);
+ if (shared != found) {
+ pr_err("shared page accounting is off\n");
+ pr_err("nr_shared = %ld, nr_found = %ld\n", shared, found);
+ }
+}
+
void __init tdx_early_init(void)
{
struct tdx_module_args args = {
@@ -890,6 +977,9 @@ void __init tdx_early_init(void)
x86_platform.guest.enc_cache_flush_required = tdx_cache_flush_required;
x86_platform.guest.enc_tlb_flush_required = tdx_tlb_flush_required;
+ x86_platform.guest.enc_kexec_begin = tdx_kexec_begin;
+ x86_platform.guest.enc_kexec_finish = tdx_kexec_finish;
+
/*
* TDX intercepts the RDMSR to read the X2APIC ID in the parallel
* bringup low level code. That raises #VE which cannot be handled
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 65b8e5bb902c..e39311a89bf4 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -140,6 +140,11 @@ static inline int pte_young(pte_t pte)
return pte_flags(pte) & _PAGE_ACCESSED;
}
+static inline bool pte_decrypted(pte_t pte)
+{
+ return cc_mkdec(pte_val(pte)) == pte_val(pte);
+}
+
#define pmd_dirty pmd_dirty
static inline bool pmd_dirty(pmd_t pmd)
{
diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h
index 9aee31862b4a..d490db38db9e 100644
--- a/arch/x86/include/asm/set_memory.h
+++ b/arch/x86/include/asm/set_memory.h
@@ -49,8 +49,11 @@ int set_memory_wb(unsigned long addr, int numpages);
int set_memory_np(unsigned long addr, int numpages);
int set_memory_p(unsigned long addr, int numpages);
int set_memory_4k(unsigned long addr, int numpages);
+
+bool set_memory_enc_stop_conversion(bool wait);
int set_memory_encrypted(unsigned long addr, int numpages);
int set_memory_decrypted(unsigned long addr, int numpages);
+
int set_memory_np_noalias(unsigned long addr, int numpages);
int set_memory_nonglobal(unsigned long addr, int numpages);
int set_memory_global(unsigned long addr, int numpages);
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index a7a7a6c6a3fb..2a548b65ef5f 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2227,12 +2227,47 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
return ret;
}
+/*
+ * The lock serializes conversions between private and shared memory.
+ *
+ * It is taken for read on conversion. A write lock guarantees that no
+ * concurrent conversions are in progress.
+ */
+static DECLARE_RWSEM(mem_enc_lock);
+
+/*
+ * Stop new private<->shared conversions.
+ *
+ * Taking the exclusive mem_enc_lock waits for in-flight conversions to complete.
+ * The lock is not released to prevent new conversions from being started.
+ *
+ * If sleep is not allowed, as in a crash scenario, try to take the lock.
+ * Failure indicates that there is a race with the conversion.
+ */
+bool set_memory_enc_stop_conversion(bool wait)
+{
+ if (!wait)
+ return down_write_trylock(&mem_enc_lock);
+
+ down_write(&mem_enc_lock);
+
+ return true;
+}
+
static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
{
- if (cc_platform_has(CC_ATTR_MEM_ENCRYPT))
- return __set_memory_enc_pgtable(addr, numpages, enc);
+ int ret = 0;
- return 0;
+ if (cc_platform_has(CC_ATTR_MEM_ENCRYPT)) {
+ if (!down_read_trylock(&mem_enc_lock))
+ return -EBUSY;
+
+ ret = __set_memory_enc_pgtable(addr, numpages, enc);
+
+ up_read(&mem_enc_lock);
+ }
+
+ return ret;
}
int set_memory_encrypted(unsigned long addr, int numpages)
--
2.43.0
^ permalink raw reply related [flat|nested] 115+ messages in thread
* Re: [PATCHv11.1 11/19] x86/tdx: Convert shared memory back to private on kexec
2024-06-02 14:23 ` [PATCHv11.1 " Kirill A. Shutemov
@ 2024-06-03 8:37 ` Borislav Petkov
2024-06-04 15:32 ` Kirill A. Shutemov
0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2024-06-03 8:37 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: adrian.hunter, ardb, ashish.kalra, bhe, dave.hansen,
elena.reshetova, haiyangz, hpa, jun.nakajima, kai.huang, kexec,
kys, linux-acpi, linux-coco, linux-hyperv, linux-kernel, ltao,
mingo, peterz, rafael, rick.p.edgecombe,
sathyanarayanan.kuppuswamy, seanjc, tglx, thomas.lendacky, x86
On Sun, Jun 02, 2024 at 05:23:03PM +0300, Kirill A. Shutemov wrote:
> + /*
> + * The only thing one can do at this point on failure
> + * is panic. It is reasonable to proceed.
It makes even less sense now: panic() means "all stops and we die" and
you say it is reasonable to proceed.
I'm confused.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 18/19] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method
2024-05-28 9:55 ` [PATCHv11 18/19] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method Kirill A. Shutemov
@ 2024-06-03 8:39 ` Borislav Petkov
2024-06-07 15:14 ` Kirill A. Shutemov
0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2024-06-03 8:39 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, Rafael J. Wysocki,
Peter Zijlstra, Adrian Hunter, Kuppuswamy Sathyanarayanan,
Elena Reshetova, Jun Nakajima, Rick Edgecombe, Tom Lendacky,
Kalra, Ashish, Sean Christopherson, Huang, Kai, Ard Biesheuvel,
Baoquan He, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
kexec, linux-hyperv, linux-acpi, linux-coco, linux-kernel,
Tao Liu
On Tue, May 28, 2024 at 12:55:21PM +0300, Kirill A. Shutemov wrote:
> MADT Multiprocessor Wakeup structure version 1 brings support of CPU
s/of /for /
> offlining: BIOS provides a reset vector where the CPU has to jump to
> for offlining itself. The new TEST mailbox command can be used to test
> whether the CPU offlined itself which means the BIOS has control over
> the CPU and can online it again via the ACPI MADT wakeup method.
>
> Add CPU offling support for the ACPI MADT wakeup method by implementing
Unknown word [offling] in commit message.
Please introduce a spellchecker into your patch creation workflow.
> custom cpu_die(), play_dead() and stop_this_cpu() SMP operations.
>
> CPU offlining makes is possible to hand over secondary CPUs over kexec,
s/is /it /
> not limiting the second kernel to a single CPU.
...
> +/*
> + * Make sure asm_acpi_mp_play_dead() is present in the identity mapping at
> + * the same place as in the kernel page tables. asm_acpi_mp_play_dead() switches
> + * to the identity mapping and the function has be present at the same spot in
> + * the virtual address space before and after switching page tables.
> + */
> +static int __init init_transition_pgtable(pgd_t *pgd)
This looks like a generic helper which should be in set_memory.c. And
looking at that file, there's populate_pgd() which does pretty much the
same thing, if I squint real hard.
Let's tone down the duplication.
> +{
> + pgprot_t prot = PAGE_KERNEL_EXEC_NOENC;
> + unsigned long vaddr, paddr;
> + p4d_t *p4d;
> + pud_t *pud;
> + pmd_t *pmd;
> + pte_t *pte;
> +
> + vaddr = (unsigned long)asm_acpi_mp_play_dead;
> + pgd += pgd_index(vaddr);
> + if (!pgd_present(*pgd)) {
> + p4d = (p4d_t *)alloc_pgt_page(NULL);
> + if (!p4d)
> + return -ENOMEM;
> + set_pgd(pgd, __pgd(__pa(p4d) | _KERNPG_TABLE));
> + }
> + p4d = p4d_offset(pgd, vaddr);
> + if (!p4d_present(*p4d)) {
> + pud = (pud_t *)alloc_pgt_page(NULL);
> + if (!pud)
> + return -ENOMEM;
> + set_p4d(p4d, __p4d(__pa(pud) | _KERNPG_TABLE));
> + }
> + pud = pud_offset(p4d, vaddr);
> + if (!pud_present(*pud)) {
> + pmd = (pmd_t *)alloc_pgt_page(NULL);
> + if (!pmd)
> + return -ENOMEM;
> + set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE));
> + }
> + pmd = pmd_offset(pud, vaddr);
> + if (!pmd_present(*pmd)) {
> + pte = (pte_t *)alloc_pgt_page(NULL);
> + if (!pte)
> + return -ENOMEM;
> + set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE));
> + }
> + pte = pte_offset_kernel(pmd, vaddr);
> +
> + paddr = __pa(vaddr);
> + set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, prot));
> +
> + return 0;
> +}
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-05-30 23:36 ` [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec Ashish Kalra
2024-05-31 9:12 ` Alexander Kuleshov
@ 2024-06-03 8:56 ` Borislav Petkov
2024-06-03 13:06 ` Kalra, Ashish
1 sibling, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2024-06-03 8:56 UTC (permalink / raw)
To: Ashish Kalra, Mike Rapoport
Cc: tglx, mingo, dave.hansen, x86, rafael, hpa, peterz, adrian.hunter,
sathyanarayanan.kuppuswamy, jun.nakajima, rick.p.edgecombe,
thomas.lendacky, michael.roth, seanjc, kai.huang, bhe,
kirill.shutemov, bdas, vkuznets, dionnaglaze, anisinha, jroedel,
ardb, kexec, linux-coco, linux-kernel
On Thu, May 30, 2024 at 11:36:55PM +0000, Ashish Kalra wrote:
> From: Ashish Kalra <ashish.kalra@amd.com>
>
> With SNP guest kexec observe the following efi memmap corruption :
>
> [ 0.000000] efi: EFI v2.7 by EDK II
> [ 0.000000] efi: SMBIOS=0x7e33f000 SMBIOS 3.0=0x7e33d000 ACPI=0x7e57e000 ACPI 2.0=0x7e57e014 MEMATTR=0x7cc3c018 Unaccepted=0x7c09e018
> [ 0.000000] efi: [Firmware Bug]: Invalid EFI memory map entries:
> [ 0.000000] efi: mem03: [type=269370880|attr=0x0e42100e42180e41] range=[0x0486200e41038c18-0x200e898a0eee713ac17] (invalid)
> [ 0.000000] efi: mem04: [type=12336|attr=0x0e410686300e4105] range=[0x100e420000000176-0x8c290f26248d200e175] (invalid)
> [ 0.000000] efi: mem06: [type=1124304408|attr=0x000030b400000028] range=[0x0e51300e45280e77-0xb44ed2142f460c1e76] (invalid)
> [ 0.000000] efi: mem08: [type=68|attr=0x300e540583280e41] range=[0x0000011affff3cd8-0x486200e54b38c0bcd7] (invalid)
> [ 0.000000] efi: mem10: [type=1107529240|attr=0x0e42280e41300e41] range=[0x300e41058c280e42-0x38010ae54c5c328ee41] (invalid)
> [ 0.000000] efi: mem11: [type=189335566|attr=0x048d200e42038e18] range=[0x0000318c00000048-0xe42029228ce4200047] (invalid)
> [ 0.000000] efi: mem12: [type=239142534|attr=0x0000002400000b4b] range=[0x0e41380e0a7d700e-0x80f26238f22bfe500d] (invalid)
> [ 0.000000] efi: mem14: [type=239207055|attr=0x0e41300e43380e0a] range=[0x8c280e42048d200e-0xc70b028f2f27cc0a00d] (invalid)
> [ 0.000000] efi: mem15: [type=239210510|attr=0x00080e660b47080e] range=[0x0000324c0000001c-0xa78028634ce490001b] (invalid)
> [ 0.000000] efi: mem16: [type=4294848528|attr=0x0000329400000014] range=[0x0e410286100e4100-0x80f252036a218f20ff] (invalid)
> [ 0.000000] efi: mem19: [type=2250772033|attr=0x42180e42200e4328] range=[0x41280e0ab9020683-0xe0e538c28b39e62682] (invalid)
> [ 0.000000] efi: mem20: [type=16| | | | | | | | | | |WB| |WC| ] range=[0x00000008ffff4438-0xffff44340090333c437] (invalid)
> [ 0.000000] efi: mem22: [Reserved |attr=0x000000c1ffff4420] range=[0xffff442400003398-0x1033a04240003f397] (invalid)
> [ 0.000000] efi: mem23: [type=1141080856|attr=0x080e41100e43180e] range=[0x280e66300e4b280e-0x440dc5ee7141f4c080d] (invalid)
> [ 0.000000] efi: mem25: [Reserved |attr=0x0000000affff44a0] range=[0xffff44a400003428-0x1034304a400013427] (invalid)
> [ 0.000000] efi: mem28: [type=16| | | | | | | | | | |WB| |WC| ] range=[0x0000000affff4488-0xffff448400b034bc487] (invalid)
> [ 0.000000] efi: mem30: [Reserved |attr=0x0000000affff4470] range=[0xffff447400003518-0x10352047400013517] (invalid)
> [ 0.000000] efi: mem33: [type=16| | | | | | | | | | |WB| |WC| ] range=[0x0000000affff4458-0xffff445400b035ac457] (invalid)
> [ 0.000000] efi: mem35: [type=269372416|attr=0x0e42100e42180e41] range=[0x0486200e44038c18-0x200e8b8a0eee823ac17] (invalid)
> [ 0.000000] efi: mem37: [type=2351435330|attr=0x0e42100e42180e42] range=[0x470783380e410686-0x2002b2a041c2141e685] (invalid)
> [ 0.000000] efi: mem38: [type=1093668417|attr=0x100e420000000270] range=[0x42100e42180e4220-0xfff366a4e421b78c21f] (invalid)
> [ 0.000000] efi: mem39: [type=76357646|attr=0x180e42200e42280e] range=[0x0e410686300e4105-0x4130f251a0710ae5104] (invalid)
> [ 0.000000] efi: mem40: [type=940444268|attr=0x0e42200e42280e41] range=[0x180e42200e42280e-0x300fc71c300b4f2480d] (invalid)
> [ 0.000000] efi: mem41: [MMIO |attr=0x8c280e42048d200e] range=[0xffff479400003728-0x42138e0c87820292727] (invalid)
> [ 0.000000] efi: mem42: [type=1191674680|attr=0x0000004c0000000b] range=[0x300e41380e0a0246-0x470b0f26238f22b8245] (invalid)
> [ 0.000000] efi: mem43: [type=2010|attr=0x0301f00e4d078338] range=[0x45038e180e42028f-0xe4556bf118f282528e] (invalid)
> [ 0.000000] efi: mem44: [type=1109921345|attr=0x300e44000000006c] range=[0x44080e42100e4218-0xfff39254e42138ac217] (invalid)
> ...
>
> This EFI memap corruption is happening with efi_arch_mem_reserve() invocation in case of kexec boot.
>
> ( efi_arch_mem_reserve() is invoked with the following call-stack: )
>
> [ 0.310010] efi_arch_mem_reserve+0xb1/0x220
> [ 0.311382] efi_mem_reserve+0x36/0x60
> [ 0.311973] efi_bgrt_init+0x17d/0x1a0
> [ 0.313265] acpi_parse_bgrt+0x12/0x20
> [ 0.313858] acpi_table_parse+0x77/0xd0
> [ 0.314463] acpi_boot_init+0x362/0x630
> [ 0.315069] setup_arch+0xa88/0xf80
> [ 0.315629] start_kernel+0x68/0xa90
> [ 0.316194] x86_64_start_reservations+0x1c/0x30
> [ 0.316921] x86_64_start_kernel+0xbf/0x110
> [ 0.317582] common_startup_64+0x13e/0x141
>
> efi_arch_mem_reserve() calls efi_memmap_alloc() to allocate memory for
> EFI memory map and due to early allocation it uses memblock allocation.
>
> Later during boot, efi_enter_virtual_mode() calls kexec_enter_virtual_mode()
> in case of a kexec-ed kernel boot.
>
> This function kexec_enter_virtual_mode() installs the new EFI memory map by
> calling efi_memmap_init_late() which remaps the efi_memmap physically allocated
> in efi_arch_mem_reserve(), but this remapping is still using memblock allocation.
>
> Subsequently, when memblock is freed later in boot flow, this remapped
> efi_memmap will have random corruption (similar to a use-after-free scenario).
>
> The corrupted EFI memory map is then passed to the next kexec-ed kernel
> which causes a panic when trying to use the corrupted EFI memory map.
This sounds fishy: memblock allocated memory is not freed later in the
boot - it remains reserved. Only free memory is freed from memblock to
the buddy allocator.
Or is the problem that memblock-allocated memory cannot be memremapped
because *raisins*?
Mike?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-03 8:56 ` Borislav Petkov
@ 2024-06-03 13:06 ` Kalra, Ashish
2024-06-03 13:39 ` Mike Rapoport
0 siblings, 1 reply; 115+ messages in thread
From: Kalra, Ashish @ 2024-06-03 13:06 UTC (permalink / raw)
To: Borislav Petkov, Mike Rapoport
Cc: tglx, mingo, dave.hansen, x86, rafael, hpa, peterz, adrian.hunter,
sathyanarayanan.kuppuswamy, jun.nakajima, rick.p.edgecombe,
thomas.lendacky, michael.roth, seanjc, kai.huang, bhe,
kirill.shutemov, bdas, vkuznets, dionnaglaze, anisinha, jroedel,
ardb, kexec, linux-coco, linux-kernel
On 6/3/2024 3:56 AM, Borislav Petkov wrote
>> EFI memory map and due to early allocation it uses memblock allocation.
>>
>> Later during boot, efi_enter_virtual_mode() calls kexec_enter_virtual_mode()
>> in case of a kexec-ed kernel boot.
>>
>> This function kexec_enter_virtual_mode() installs the new EFI memory map by
>> calling efi_memmap_init_late() which remaps the efi_memmap physically allocated
>> in efi_arch_mem_reserve(), but this remapping is still using memblock allocation.
>>
>> Subsequently, when memblock is freed later in boot flow, this remapped
>> efi_memmap will have random corruption (similar to a use-after-free scenario).
>>
>> The corrupted EFI memory map is then passed to the next kexec-ed kernel
>> which causes a panic when trying to use the corrupted EFI memory map.
> This sounds fishy: memblock allocated memory is not freed later in the
> boot - it remains reserved. Only free memory is freed from memblock to
> the buddy allocator.
>
> Or is the problem that memblock-allocated memory cannot be memremapped
> because *raisins*?
This is what seems to be happening:
efi_arch_mem_reserve() calls efi_memmap_alloc() to allocate memory for
EFI memory map and due to early allocation it uses memblock allocation.
And later efi_enter_virtual_mode() calls kexec_enter_virtual_mode()
in case of a kexec-ed kernel boot.
This function kexec_enter_virtual_mode() installs the new EFI memory map by
calling efi_memmap_init_late() which does memremap() on memblock-allocated memory.
Thanks, Ashish
>
> Mike?
>
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-03 13:06 ` Kalra, Ashish
@ 2024-06-03 13:39 ` Mike Rapoport
2024-06-03 14:01 ` Kalra, Ashish
0 siblings, 1 reply; 115+ messages in thread
From: Mike Rapoport @ 2024-06-03 13:39 UTC (permalink / raw)
To: Kalra, Ashish
Cc: Borislav Petkov, tglx, mingo, dave.hansen, x86, rafael, hpa,
peterz, adrian.hunter, sathyanarayanan.kuppuswamy, jun.nakajima,
rick.p.edgecombe, thomas.lendacky, michael.roth, seanjc,
kai.huang, bhe, kirill.shutemov, bdas, vkuznets, dionnaglaze,
anisinha, jroedel, ardb, kexec, linux-coco, linux-kernel
On Mon, Jun 03, 2024 at 08:06:56AM -0500, Kalra, Ashish wrote:
> On 6/3/2024 3:56 AM, Borislav Petkov wrote
>
> > > EFI memory map and due to early allocation it uses memblock allocation.
> > >
> > > Later during boot, efi_enter_virtual_mode() calls kexec_enter_virtual_mode()
> > > in case of a kexec-ed kernel boot.
> > >
> > > This function kexec_enter_virtual_mode() installs the new EFI memory map by
> > > calling efi_memmap_init_late() which remaps the efi_memmap physically allocated
> > > in efi_arch_mem_reserve(), but this remapping is still using memblock allocation.
> > >
> > > Subsequently, when memblock is freed later in boot flow, this remapped
> > > efi_memmap will have random corruption (similar to a use-after-free scenario).
> > >
> > > The corrupted EFI memory map is then passed to the next kexec-ed kernel
> > > which causes a panic when trying to use the corrupted EFI memory map.
> > This sounds fishy: memblock allocated memory is not freed later in the
> > boot - it remains reserved. Only free memory is freed from memblock to
> > the buddy allocator.
> >
> > Or is the problem that memblock-allocated memory cannot be memremapped
> > because *raisins*?
>
> This is what seems to be happening:
>
> efi_arch_mem_reserve() calls efi_memmap_alloc() to allocate memory for
> EFI memory map and due to early allocation it uses memblock allocation.
>
> And later efi_enter_virtual_mode() calls kexec_enter_virtual_mode()
> in case of a kexec-ed kernel boot.
>
> This function kexec_enter_virtual_mode() installs the new EFI memory map by
> calling efi_memmap_init_late() which does memremap() on memblock-allocated memory.
Does the issue happen only with SNP?
I didn't really dig, but my theory would be that it has something to do
with arch_memremap_can_ram_remap() in arch/x86/mm/ioremap.c
> Thanks, Ashish
--
Sincerely yours,
Mike.
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-03 13:39 ` Mike Rapoport
@ 2024-06-03 14:01 ` Kalra, Ashish
2024-06-03 14:46 ` Borislav Petkov
2024-06-03 15:29 ` Mike Rapoport
0 siblings, 2 replies; 115+ messages in thread
From: Kalra, Ashish @ 2024-06-03 14:01 UTC (permalink / raw)
To: Mike Rapoport
Cc: Borislav Petkov, tglx, mingo, dave.hansen, x86, rafael, hpa,
peterz, adrian.hunter, sathyanarayanan.kuppuswamy, jun.nakajima,
rick.p.edgecombe, thomas.lendacky, michael.roth, seanjc,
kai.huang, bhe, kirill.shutemov, bdas, vkuznets, dionnaglaze,
anisinha, jroedel, ardb, kexec, linux-coco, linux-kernel
On 6/3/2024 8:39 AM, Mike Rapoport wrote:
> On Mon, Jun 03, 2024 at 08:06:56AM -0500, Kalra, Ashish wrote:
>> On 6/3/2024 3:56 AM, Borislav Petkov wrote
>>
>>>> EFI memory map and due to early allocation it uses memblock allocation.
>>>>
>>>> Later during boot, efi_enter_virtual_mode() calls kexec_enter_virtual_mode()
>>>> in case of a kexec-ed kernel boot.
>>>>
>>>> This function kexec_enter_virtual_mode() installs the new EFI memory map by
>>>> calling efi_memmap_init_late() which remaps the efi_memmap physically allocated
>>>> in efi_arch_mem_reserve(), but this remapping is still using memblock allocation.
>>>>
>>>> Subsequently, when memblock is freed later in boot flow, this remapped
>>>> efi_memmap will have random corruption (similar to a use-after-free scenario).
>>>>
>>>> The corrupted EFI memory map is then passed to the next kexec-ed kernel
>>>> which causes a panic when trying to use the corrupted EFI memory map.
>>> This sounds fishy: memblock allocated memory is not freed later in the
>>> boot - it remains reserved. Only free memory is freed from memblock to
>>> the buddy allocator.
>>>
>>> Or is the problem that memblock-allocated memory cannot be memremapped
>>> because *raisins*?
>> This is what seems to be happening:
>>
>> efi_arch_mem_reserve() calls efi_memmap_alloc() to allocate memory for
>> EFI memory map and due to early allocation it uses memblock allocation.
>>
>> And later efi_enter_virtual_mode() calls kexec_enter_virtual_mode()
>> in case of a kexec-ed kernel boot.
>>
>> This function kexec_enter_virtual_mode() installs the new EFI memory map by
>> calling efi_memmap_init_late() which does memremap() on memblock-allocated memory.
> Does the issue happen only with SNP?
This is observed under SNP as efi_arch_mem_reserve() is only being
called with SNP enabled and then efi_arch_mem_reserve() allocates EFI
memory map using memblock.
If we skip efi_arch_mem_reserve() (which should probably be anyway
skipped for kexec case), then for kexec boot, EFI memmap is memremapped
in the same virtual address as the first kernel and not the allocated
memblock address.
Thanks, Ashish
>
> I didn't really dig, but my theory would be that it has something to do
> with arch_memremap_can_ram_remap() in arch/x86/mm/ioremap.c
>
>> Thanks, Ashish
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion
2024-05-29 10:47 ` Nikolay Borisov
2024-05-29 11:17 ` Kirill A. Shutemov
@ 2024-06-03 14:43 ` H. Peter Anvin
2024-06-12 12:10 ` Nikolay Borisov
2024-06-03 22:43 ` H. Peter Anvin
2 siblings, 1 reply; 115+ messages in thread
From: H. Peter Anvin @ 2024-06-03 14:43 UTC (permalink / raw)
To: Nikolay Borisov, Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, K. Y. Srinivasan,
Haiyang Zhang, kexec, linux-hyperv, linux-acpi, linux-coco,
linux-kernel
On 5/29/24 03:47, Nikolay Borisov wrote:
>>
>> diff --git a/arch/x86/kernel/relocate_kernel_64.S
>> b/arch/x86/kernel/relocate_kernel_64.S
>> index 56cab1bb25f5..085eef5c3904 100644
>> --- a/arch/x86/kernel/relocate_kernel_64.S
>> +++ b/arch/x86/kernel/relocate_kernel_64.S
>> @@ -148,9 +148,10 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
>> */
>> movl $X86_CR4_PAE, %eax
>> testq $X86_CR4_LA57, %r13
>> - jz 1f
>> + jz .Lno_la57
>> orl $X86_CR4_LA57, %eax
>> -1:
>> +.Lno_la57:
>> +
>> movq %rax, %cr4
>> jmp 1f
>
> That jmp 1f becomes redundant now as it simply jumps 1 line below.
>
Uh... am I the only person to notice that ALL that is needed here is:
andl $(X86_CR4_PAE|X86_CR4_LA57), %r13d
movq %r13, %rax
... since %r13 is dead afterwards, and PAE *will* have been set in %r13
already?
I don't believe that this specific jmp is actually needed -- there are
several more synchronizing jumps later -- but it doesn't hurt.
However, if the effort is for improving the readability, it might be
worthwhile to encapsulate the "jmp 1f; 1:" as a macro, e.g. "SYNC_CODE".
-hpa
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-03 14:01 ` Kalra, Ashish
@ 2024-06-03 14:46 ` Borislav Petkov
2024-06-03 15:31 ` Mike Rapoport
2024-06-03 15:29 ` Mike Rapoport
1 sibling, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2024-06-03 14:46 UTC (permalink / raw)
To: Kalra, Ashish
Cc: Mike Rapoport, tglx, mingo, dave.hansen, x86, rafael, hpa, peterz,
adrian.hunter, sathyanarayanan.kuppuswamy, jun.nakajima,
rick.p.edgecombe, thomas.lendacky, michael.roth, seanjc,
kai.huang, bhe, kirill.shutemov, bdas, vkuznets, dionnaglaze,
anisinha, jroedel, ardb, kexec, linux-coco, linux-kernel
On Mon, Jun 03, 2024 at 09:01:49AM -0500, Kalra, Ashish wrote:
> If we skip efi_arch_mem_reserve() (which should probably be anyway skipped
> for kexec case), then for kexec boot, EFI memmap is memremapped in the same
> virtual address as the first kernel and not the allocated memblock address.
Are you saying that we should simply do
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index fdf07dd6f459..410cb0743289 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -577,6 +577,9 @@ void __init efi_mem_reserve(phys_addr_t addr, u64 size)
if (WARN_ON_ONCE(efi_enabled(EFI_PARAVIRT)))
return;
+ if (kexec_in_progress)
+ return;
+
if (!memblock_is_region_reserved(addr, size))
memblock_reserve(addr, size);
and skip that whole call?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply related [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-03 14:01 ` Kalra, Ashish
2024-06-03 14:46 ` Borislav Petkov
@ 2024-06-03 15:29 ` Mike Rapoport
2024-06-03 16:56 ` Kalra, Ashish
1 sibling, 1 reply; 115+ messages in thread
From: Mike Rapoport @ 2024-06-03 15:29 UTC (permalink / raw)
To: Kalra, Ashish
Cc: Borislav Petkov, tglx, mingo, dave.hansen, x86, rafael, hpa,
peterz, adrian.hunter, sathyanarayanan.kuppuswamy, jun.nakajima,
rick.p.edgecombe, thomas.lendacky, michael.roth, seanjc,
kai.huang, bhe, kirill.shutemov, bdas, vkuznets, dionnaglaze,
anisinha, jroedel, ardb, kexec, linux-coco, linux-kernel
On Mon, Jun 03, 2024 at 09:01:49AM -0500, Kalra, Ashish wrote:
> On 6/3/2024 8:39 AM, Mike Rapoport wrote:
>
> > On Mon, Jun 03, 2024 at 08:06:56AM -0500, Kalra, Ashish wrote:
> > > On 6/3/2024 3:56 AM, Borislav Petkov wrote
> > >
> > > > > EFI memory map and due to early allocation it uses memblock allocation.
> > > > >
> > > > > Later during boot, efi_enter_virtual_mode() calls kexec_enter_virtual_mode()
> > > > > in case of a kexec-ed kernel boot.
> > > > >
> > > > > This function kexec_enter_virtual_mode() installs the new EFI memory map by
> > > > > calling efi_memmap_init_late() which remaps the efi_memmap physically allocated
> > > > > in efi_arch_mem_reserve(), but this remapping is still using memblock allocation.
> > > > >
> > > > > Subsequently, when memblock is freed later in boot flow, this remapped
> > > > > efi_memmap will have random corruption (similar to a use-after-free scenario).
> > > > >
> > > > > The corrupted EFI memory map is then passed to the next kexec-ed kernel
> > > > > which causes a panic when trying to use the corrupted EFI memory map.
> > > > This sounds fishy: memblock allocated memory is not freed later in the
> > > > boot - it remains reserved. Only free memory is freed from memblock to
> > > > the buddy allocator.
> > > >
> > > > Or is the problem that memblock-allocated memory cannot be memremapped
> > > > because *raisins*?
> > > This is what seems to be happening:
> > >
> > > efi_arch_mem_reserve() calls efi_memmap_alloc() to allocate memory for
> > > EFI memory map and due to early allocation it uses memblock allocation.
> > >
> > > And later efi_enter_virtual_mode() calls kexec_enter_virtual_mode()
> > > in case of a kexec-ed kernel boot.
> > >
> > > This function kexec_enter_virtual_mode() installs the new EFI memory map by
> > > calling efi_memmap_init_late() which does memremap() on memblock-allocated memory.
> > Does the issue happen only with SNP?
>
> This is observed under SNP as efi_arch_mem_reserve() is only being called
> with SNP enabled and then efi_arch_mem_reserve() allocates EFI memory map
> using memblock.
I don't see how efi_arch_mem_reserve() is only called with SNP. What did I
miss?
> If we skip efi_arch_mem_reserve() (which should probably be anyway skipped
> for kexec case), then for kexec boot, EFI memmap is memremapped in the same
> virtual address as the first kernel and not the allocated memblock address.
Maybe we should skip efi_arch_mem_reserve() for kexec case, but I think we
still need to understand what's causing memory corruption.
> Thanks, Ashish
>
> >
> > I didn't really dig, but my theory would be that it has something to do
> > with arch_memremap_can_ram_remap() in arch/x86/mm/ioremap.c
> > > Thanks, Ashish
--
Sincerely yours,
Mike.
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-03 14:46 ` Borislav Petkov
@ 2024-06-03 15:31 ` Mike Rapoport
2024-06-03 16:48 ` Kalra, Ashish
2024-06-04 1:23 ` Dave Young
0 siblings, 2 replies; 115+ messages in thread
From: Mike Rapoport @ 2024-06-03 15:31 UTC (permalink / raw)
To: Borislav Petkov
Cc: Kalra, Ashish, tglx, mingo, dave.hansen, x86, rafael, hpa, peterz,
adrian.hunter, sathyanarayanan.kuppuswamy, jun.nakajima,
rick.p.edgecombe, thomas.lendacky, michael.roth, seanjc,
kai.huang, bhe, kirill.shutemov, bdas, vkuznets, dionnaglaze,
anisinha, jroedel, ardb, kexec, linux-coco, linux-kernel
On Mon, Jun 03, 2024 at 04:46:39PM +0200, Borislav Petkov wrote:
> On Mon, Jun 03, 2024 at 09:01:49AM -0500, Kalra, Ashish wrote:
> > If we skip efi_arch_mem_reserve() (which should probably be anyway skipped
> > for kexec case), then for kexec boot, EFI memmap is memremapped in the same
> > virtual address as the first kernel and not the allocated memblock address.
>
> Are you saying that we should simply do
>
> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> index fdf07dd6f459..410cb0743289 100644
> --- a/drivers/firmware/efi/efi.c
> +++ b/drivers/firmware/efi/efi.c
> @@ -577,6 +577,9 @@ void __init efi_mem_reserve(phys_addr_t addr, u64 size)
> if (WARN_ON_ONCE(efi_enabled(EFI_PARAVIRT)))
> return;
>
> + if (kexec_in_progress)
> + return;
> +
> if (!memblock_is_region_reserved(addr, size))
> memblock_reserve(addr, size);
>
> and skip that whole call?
I think Ashish suggested rather
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index fdf07dd6f459..eccc10ab15a4 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -580,6 +580,9 @@ void __init efi_mem_reserve(phys_addr_t addr, u64 size)
if (!memblock_is_region_reserved(addr, size))
memblock_reserve(addr, size);
+ if (kexec_in_progress)
+ return;
+
/*
* Some architectures (x86) reserve all boot services ranges
* until efi_free_boot_services() because of buggy firmware
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
--
Sincerely yours,
Mike.
^ permalink raw reply related [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-03 15:31 ` Mike Rapoport
@ 2024-06-03 16:48 ` Kalra, Ashish
2024-06-03 16:57 ` Borislav Petkov
2024-06-03 17:05 ` Kalra, Ashish
2024-06-04 1:23 ` Dave Young
1 sibling, 2 replies; 115+ messages in thread
From: Kalra, Ashish @ 2024-06-03 16:48 UTC (permalink / raw)
To: Mike Rapoport, Borislav Petkov
Cc: tglx, mingo, dave.hansen, x86, rafael, hpa, peterz, adrian.hunter,
sathyanarayanan.kuppuswamy, jun.nakajima, rick.p.edgecombe,
thomas.lendacky, michael.roth, seanjc, kai.huang, bhe,
kirill.shutemov, bdas, vkuznets, dionnaglaze, anisinha, jroedel,
ardb, kexec, linux-coco, linux-kernel, Dave Young
On 6/3/2024 10:31 AM, Mike Rapoport wrote:
> On Mon, Jun 03, 2024 at 04:46:39PM +0200, Borislav Petkov wrote:
>> On Mon, Jun 03, 2024 at 09:01:49AM -0500, Kalra, Ashish wrote:
>>> If we skip efi_arch_mem_reserve() (which should probably be anyway skipped
>>> for kexec case), then for kexec boot, EFI memmap is memremapped in the same
>>> virtual address as the first kernel and not the allocated memblock address.
>> Are you saying that we should simply do
>>
>> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
>> index fdf07dd6f459..410cb0743289 100644
>> --- a/drivers/firmware/efi/efi.c
>> +++ b/drivers/firmware/efi/efi.c
>> @@ -577,6 +577,9 @@ void __init efi_mem_reserve(phys_addr_t addr, u64 size)
>> if (WARN_ON_ONCE(efi_enabled(EFI_PARAVIRT)))
>> return;
>>
>> + if (kexec_in_progress)
>> + return;
>> +
>> if (!memblock_is_region_reserved(addr, size))
>> memblock_reserve(addr, size);
>>
>> and skip that whole call?
> I think Ashish suggested rather
>
> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> index fdf07dd6f459..eccc10ab15a4 100644
> --- a/drivers/firmware/efi/efi.c
> +++ b/drivers/firmware/efi/efi.c
> @@ -580,6 +580,9 @@ void __init efi_mem_reserve(phys_addr_t addr, u64 size)
> if (!memblock_is_region_reserved(addr, size))
> memblock_reserve(addr, size);
>
> + if (kexec_in_progress)
> + return;
> +
> /*
> * Some architectures (x86) reserve all boot services ranges
> * until efi_free_boot_services() because of buggy firmware
>
Yes, something similar as above, as efi_mem_reserve() is used to reserve
boot service memory and is not necessary for kexec boot.
So, Dave Young (dyoung@redhat.com) had suggested that we skip
efi_arch_mem_reserve() for kexec by checking the set EFI_MEMORY_RUNTIME
attribute as below:
diff
<https://lore.kernel.org/lkml/Zl3HfiQ6oHdTdOdA@kernel.org/T/#iZ2e.:..:f4be03b8488665f56a1e5c6e6459f447352dfcf5.1717111180.git.ashish.kalra::40amd.com:1arch:x86:platform:efi:quirks.c>
--git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index f0cc00032751..6f398c59278a 100644 ---
a/arch/x86/platform/efi/quirks.c +++ b/arch/x86/platform/efi/quirks.c @@
-255,15 +255,39 @@ void __init efi_arch_mem_reserve(phys_addr_t addr,
u64 size) struct efi_memory_map_data data = { 0 };
struct efi_mem_range mr;
efi_memory_desc_t md;
- int num_entries; + int num_entries, ret; void *new;
- if (efi_mem_desc_lookup(addr, &md) || - md.type !=
EFI_BOOT_SERVICES_DATA) { + /* + * efi_mem_reserve() is used to reserve
boot service memory, eg. bgrt, + * but it is not neccasery for kexec, as
there are no boot services in + * kexec reboot at all after the first
kernel's ExitBootServices(). + * + * Therefore, skip efi_mem_reserve for
kexec booting by checking the + * EFI_MEMORY_RUNTIME attribute which
indicates boot service memory + * ranges reserved by the first kernel
using efi_mem_reserve and marked + * with EFI_MEMORY_RUNTIME attribute.
+ */ + + ret = efi_mem_desc_lookup(addr, &md); + if (ret) { pr_err("Failed to lookup EFI memory descriptor for %pa\n", &addr);
return;
}
+ if (md.type != EFI_BOOT_SERVICES_DATA) { + pr_err("Skip reserving non
EFI Boot Service Data memory for %pa\n", &addr); + return; + } + + /*
Kexec copied the efi memmap from the first kernel, thus skip the case */
+ if (md.attribute & EFI_MEMORY_RUNTIME) + return; + if (addr + size > md.phys_addr + (md.num_pages << EFI_PAGE_SHIFT)) {
pr_err("Region spans EFI memory descriptors, %pa\n", &addr);
return;
--
2.34.1
>> --
>> Regards/Gruss,
>> Boris.
>>
>> https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-03 15:29 ` Mike Rapoport
@ 2024-06-03 16:56 ` Kalra, Ashish
2024-06-03 17:41 ` Mike Rapoport
0 siblings, 1 reply; 115+ messages in thread
From: Kalra, Ashish @ 2024-06-03 16:56 UTC (permalink / raw)
To: Mike Rapoport
Cc: Borislav Petkov, tglx, mingo, dave.hansen, x86, rafael, hpa,
peterz, adrian.hunter, sathyanarayanan.kuppuswamy, jun.nakajima,
rick.p.edgecombe, thomas.lendacky, michael.roth, seanjc,
kai.huang, bhe, kirill.shutemov, bdas, vkuznets, dionnaglaze,
anisinha, jroedel, ardb, kexec, linux-coco, linux-kernel
On 6/3/2024 10:29 AM, Mike Rapoport wrote:
> On Mon, Jun 03, 2024 at 09:01:49AM -0500, Kalra, Ashish wrote:
>> On 6/3/2024 8:39 AM, Mike Rapoport wrote:
>>
>>> On Mon, Jun 03, 2024 at 08:06:56AM -0500, Kalra, Ashish wrote:
>>>> On 6/3/2024 3:56 AM, Borislav Petkov wrote
>>>>
>>>>>> EFI memory map and due to early allocation it uses memblock allocation.
>>>>>>
>>>>>> Later during boot, efi_enter_virtual_mode() calls kexec_enter_virtual_mode()
>>>>>> in case of a kexec-ed kernel boot.
>>>>>>
>>>>>> This function kexec_enter_virtual_mode() installs the new EFI memory map by
>>>>>> calling efi_memmap_init_late() which remaps the efi_memmap physically allocated
>>>>>> in efi_arch_mem_reserve(), but this remapping is still using memblock allocation.
>>>>>>
>>>>>> Subsequently, when memblock is freed later in boot flow, this remapped
>>>>>> efi_memmap will have random corruption (similar to a use-after-free scenario).
>>>>>>
>>>>>> The corrupted EFI memory map is then passed to the next kexec-ed kernel
>>>>>> which causes a panic when trying to use the corrupted EFI memory map.
>>>>> This sounds fishy: memblock allocated memory is not freed later in the
>>>>> boot - it remains reserved. Only free memory is freed from memblock to
>>>>> the buddy allocator.
>>>>>
>>>>> Or is the problem that memblock-allocated memory cannot be memremapped
>>>>> because *raisins*?
>>>> This is what seems to be happening:
>>>>
>>>> efi_arch_mem_reserve() calls efi_memmap_alloc() to allocate memory for
>>>> EFI memory map and due to early allocation it uses memblock allocation.
>>>>
>>>> And later efi_enter_virtual_mode() calls kexec_enter_virtual_mode()
>>>> in case of a kexec-ed kernel boot.
>>>>
>>>> This function kexec_enter_virtual_mode() installs the new EFI memory map by
>>>> calling efi_memmap_init_late() which does memremap() on memblock-allocated memory.
>>> Does the issue happen only with SNP?
>> This is observed under SNP as efi_arch_mem_reserve() is only being called
>> with SNP enabled and then efi_arch_mem_reserve() allocates EFI memory map
>> using memblock.
> I don't see how efi_arch_mem_reserve() is only called with SNP. What did I
> miss?
>
This is the call stack for efi_arch_mem_reserve():
[ 0.310010] efi_arch_mem_reserve+0xb1/0x220
[ 0.311382] efi_mem_reserve+0x36/0x60
[ 0.311973] efi_bgrt_init+0x17d/0x1a0
[ 0.313265] acpi_parse_bgrt+0x12/0x20
[ 0.313858] acpi_table_parse+0x77/0xd0
[ 0.314463] acpi_boot_init+0x362/0x630
[ 0.315069] setup_arch+0xa88/0xf80
[ 0.315629] start_kernel+0x68/0xa90
[ 0.316194] x86_64_start_reservations+0x1c/0x30
[ 0.316921] x86_64_start_kernel+0xbf/0x110
[ 0.317582] common_startup_64+0x13e/0x141
So, probably it is being invoked specifically for AMD platform ?
>> If we skip efi_arch_mem_reserve() (which should probably be anyway skipped
>> for kexec case), then for kexec boot, EFI memmap is memremapped in the same
>> virtual address as the first kernel and not the allocated memblock address.
> Maybe we should skip efi_arch_mem_reserve() for kexec case, but I think we
> still need to understand what's causing memory corruption.
When, efi_arch_mem_reserve() allocates memory for EFI memory map using
memblock and then later in boot, kexec_enter_virtual_mode() does
memremap on this memblock allocated memory, subsequently after this i
see EFI memory map corruption, so are there are any issues doing
memremap on memblock-allocated memory ?
Thanks, Ashish
>>> I didn't really dig, but my theory would be that it has something to do
>>> with arch_memremap_can_ram_remap() in arch/x86/mm/ioremap.c
>>>> Thanks, Ashish
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-03 16:48 ` Kalra, Ashish
@ 2024-06-03 16:57 ` Borislav Petkov
2024-06-03 17:08 ` Kalra, Ashish
2024-06-03 17:05 ` Kalra, Ashish
1 sibling, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2024-06-03 16:57 UTC (permalink / raw)
To: Kalra, Ashish
Cc: Mike Rapoport, tglx, mingo, dave.hansen, x86, rafael, hpa, peterz,
adrian.hunter, sathyanarayanan.kuppuswamy, jun.nakajima,
rick.p.edgecombe, thomas.lendacky, michael.roth, seanjc,
kai.huang, bhe, kirill.shutemov, bdas, vkuznets, dionnaglaze,
anisinha, jroedel, ardb, kexec, linux-coco, linux-kernel,
Dave Young
On Mon, Jun 03, 2024 at 11:48:03AM -0500, Kalra, Ashish wrote:
> Yes, something similar as above, as efi_mem_reserve() is used to reserve
> boot service memory and is not necessary for kexec boot.
>
> So, Dave Young (dyoung@redhat.com) had suggested that we skip
> efi_arch_mem_reserve() for kexec by checking the set EFI_MEMORY_RUNTIME
> attribute as below:a
efi_arch_mem_reserve() or efi_mem_reserve() altogether?
Btw, that below got really gibberished by your mail client. Snipped.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-03 16:48 ` Kalra, Ashish
2024-06-03 16:57 ` Borislav Petkov
@ 2024-06-03 17:05 ` Kalra, Ashish
2024-06-03 17:10 ` Borislav Petkov
1 sibling, 1 reply; 115+ messages in thread
From: Kalra, Ashish @ 2024-06-03 17:05 UTC (permalink / raw)
To: Mike Rapoport, Borislav Petkov
Cc: tglx, mingo, dave.hansen, x86, rafael, hpa, peterz, adrian.hunter,
sathyanarayanan.kuppuswamy, jun.nakajima, rick.p.edgecombe,
thomas.lendacky, michael.roth, seanjc, kai.huang, bhe,
kirill.shutemov, bdas, vkuznets, dionnaglaze, anisinha, jroedel,
ardb, kexec, linux-coco, linux-kernel, Dave Young
Re-sending this, last response got garbled.
On 6/3/2024 11:48 AM, Kalra, Ashish wrote:
> On 6/3/2024 10:31 AM, Mike Rapoport wrote:
>
>> On Mon, Jun 03, 2024 at 04:46:39PM +0200, Borislav Petkov wrote:
>>> On Mon, Jun 03, 2024 at 09:01:49AM -0500, Kalra, Ashish wrote:
>>>> If we skip efi_arch_mem_reserve() (which should probably be anyway
>>>> skipped
>>>> for kexec case), then for kexec boot, EFI memmap is memremapped in
>>>> the same
>>>> virtual address as the first kernel and not the allocated memblock
>>>> address.
>>> Are you saying that we should simply do
>>>
>>> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
>>> index fdf07dd6f459..410cb0743289 100644
>>> --- a/drivers/firmware/efi/efi.c
>>> +++ b/drivers/firmware/efi/efi.c
>>> @@ -577,6 +577,9 @@ void __init efi_mem_reserve(phys_addr_t addr,
>>> u64 size)
>>> if (WARN_ON_ONCE(efi_enabled(EFI_PARAVIRT)))
>>> return;
>>> + if (kexec_in_progress)
>>> + return;
>>> +
>>> if (!memblock_is_region_reserved(addr, size))
>>> memblock_reserve(addr, size);
>>> and skip that whole call?
>> I think Ashish suggested rather
>>
>> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
>> index fdf07dd6f459..eccc10ab15a4 100644
>> --- a/drivers/firmware/efi/efi.c
>> +++ b/drivers/firmware/efi/efi.c
>> @@ -580,6 +580,9 @@ void __init efi_mem_reserve(phys_addr_t addr, u64
>> size)
>> if (!memblock_is_region_reserved(addr, size))
>> memblock_reserve(addr, size);
>> + if (kexec_in_progress)
>> + return;
>> +
>> /*
>> * Some architectures (x86) reserve all boot services ranges
>> * until efi_free_boot_services() because of buggy firmware
> Yes, something similar as above, as efi_mem_reserve() is used to
> reserve boot service memory and is not necessary for kexec boot.
>
> So, Dave Young (dyoung@redhat.com) had suggested that we skip
> efi_arch_mem_reserve() for kexec by checking the set
> EFI_MEMORY_RUNTIME attribute as below:
>
diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index f0cc00032751..6f398c59278a 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -255,15 +255,39 @@ void __init efi_arch_mem_reserve(phys_addr_t addr,
u64 size)
struct efi_memory_map_data data = { 0 };
struct efi_mem_range mr;
efi_memory_desc_t md;
- int num_entries;
+ int num_entries, ret;
void *new;
- if (efi_mem_desc_lookup(addr, &md) ||
- md.type != EFI_BOOT_SERVICES_DATA) {
+ /*
+ * efi_mem_reserve() is used to reserve boot service memory, eg.
bgrt,
+ * but it is not neccasery for kexec, as there are no boot
services in
+ * kexec reboot at all after the first kernel's ExitBootServices().
+ *
+ * Therefore, skip efi_mem_reserve for kexec booting by checking the
+ * EFI_MEMORY_RUNTIME attribute which indicates boot service memory
+ * ranges reserved by the first kernel using efi_mem_reserve and
marked
+ * with EFI_MEMORY_RUNTIME attribute.
+ */
+
+ ret = efi_mem_desc_lookup(addr, &md);
+ if (ret) {
pr_err("Failed to lookup EFI memory descriptor for
%pa\n", &addr);
return;
}
+ if (md.type != EFI_BOOT_SERVICES_DATA) {
+ pr_err("Skip reserving non EFI Boot Service Data memory
for %pa\n", &addr);
+ return;
+ }
+
+ /* Kexec copied the efi memmap from the first kernel, thus skip
the case */
+ if (md.attribute & EFI_MEMORY_RUNTIME)
+ return;
+
if (addr + size > md.phys_addr + (md.num_pages <<
EFI_PAGE_SHIFT)) {
pr_err("Region spans EFI memory descriptors, %pa\n",
&addr);
return;
Thanks, Ashish
^ permalink raw reply related [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-03 16:57 ` Borislav Petkov
@ 2024-06-03 17:08 ` Kalra, Ashish
2024-06-03 17:12 ` Borislav Petkov
0 siblings, 1 reply; 115+ messages in thread
From: Kalra, Ashish @ 2024-06-03 17:08 UTC (permalink / raw)
To: Borislav Petkov
Cc: Mike Rapoport, tglx, mingo, dave.hansen, x86, rafael, hpa, peterz,
adrian.hunter, sathyanarayanan.kuppuswamy, jun.nakajima,
rick.p.edgecombe, thomas.lendacky, michael.roth, seanjc,
kai.huang, bhe, kirill.shutemov, bdas, vkuznets, dionnaglaze,
anisinha, jroedel, ardb, kexec, linux-coco, linux-kernel,
Dave Young
On 6/3/2024 11:57 AM, Borislav Petkov wrote:
> On Mon, Jun 03, 2024 at 11:48:03AM -0500, Kalra, Ashish wrote:
>> Yes, something similar as above, as efi_mem_reserve() is used to reserve
>> boot service memory and is not necessary for kexec boot.
>>
>> So, Dave Young (dyoung@redhat.com) had suggested that we skip
>> efi_arch_mem_reserve() for kexec by checking the set EFI_MEMORY_RUNTIME
>> attribute as below:a
> efi_arch_mem_reserve() or efi_mem_reserve() altogether?
efi_arch_mem_reserve().
Thanks, Ashish
>
> Btw, that below got really gibberished by your mail client. Snipped.
>
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-03 17:05 ` Kalra, Ashish
@ 2024-06-03 17:10 ` Borislav Petkov
0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2024-06-03 17:10 UTC (permalink / raw)
To: Kalra, Ashish
Cc: Mike Rapoport, tglx, mingo, dave.hansen, x86, rafael, hpa, peterz,
adrian.hunter, sathyanarayanan.kuppuswamy, jun.nakajima,
rick.p.edgecombe, thomas.lendacky, michael.roth, seanjc,
kai.huang, bhe, kirill.shutemov, bdas, vkuznets, dionnaglaze,
anisinha, jroedel, ardb, kexec, linux-coco, linux-kernel,
Dave Young
On Mon, Jun 03, 2024 at 12:05:45PM -0500, Kalra, Ashish wrote:
> Re-sending this, last response got garbled.
And this got linewrapped.
Thunderbird section in Documentation/process/email-clients.rst.
> index f0cc00032751..6f398c59278a 100644
> --- a/arch/x86/platform/efi/quirks.c
> +++ b/arch/x86/platform/efi/quirks.c
> @@ -255,15 +255,39 @@ void __init efi_arch_mem_reserve(phys_addr_t addr, u64
> size)
^^^
> struct efi_memory_map_data data = { 0 };
> struct efi_mem_range mr;
> efi_memory_desc_t md;
> - int num_entries;
> + int num_entries, ret;
> void *new;
>
> - if (efi_mem_desc_lookup(addr, &md) ||
> - md.type != EFI_BOOT_SERVICES_DATA) {
> + /*
> + * efi_mem_reserve() is used to reserve boot service memory, eg.
> bgrt,
^^^
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-03 17:08 ` Kalra, Ashish
@ 2024-06-03 17:12 ` Borislav Petkov
2024-06-04 22:12 ` Kalra, Ashish
2024-06-04 22:35 ` Kalra, Ashish
0 siblings, 2 replies; 115+ messages in thread
From: Borislav Petkov @ 2024-06-03 17:12 UTC (permalink / raw)
To: Kalra, Ashish
Cc: Mike Rapoport, tglx, mingo, dave.hansen, x86, rafael, hpa, peterz,
adrian.hunter, sathyanarayanan.kuppuswamy, jun.nakajima,
rick.p.edgecombe, thomas.lendacky, michael.roth, seanjc,
kai.huang, bhe, kirill.shutemov, bdas, vkuznets, dionnaglaze,
anisinha, jroedel, ardb, kexec, linux-coco, linux-kernel,
Dave Young
On Mon, Jun 03, 2024 at 12:08:48PM -0500, Kalra, Ashish wrote:
> efi_arch_mem_reserve().
Now it only remains for you to explain why...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-03 16:56 ` Kalra, Ashish
@ 2024-06-03 17:41 ` Mike Rapoport
0 siblings, 0 replies; 115+ messages in thread
From: Mike Rapoport @ 2024-06-03 17:41 UTC (permalink / raw)
To: Kalra, Ashish
Cc: Borislav Petkov, tglx, mingo, dave.hansen, x86, rafael, hpa,
peterz, adrian.hunter, sathyanarayanan.kuppuswamy, jun.nakajima,
rick.p.edgecombe, thomas.lendacky, michael.roth, seanjc,
kai.huang, bhe, kirill.shutemov, bdas, vkuznets, dionnaglaze,
anisinha, jroedel, ardb, kexec, linux-coco, linux-kernel
On Mon, Jun 03, 2024 at 11:56:01AM -0500, Kalra, Ashish wrote:
> On 6/3/2024 10:29 AM, Mike Rapoport wrote:
>
> > On Mon, Jun 03, 2024 at 09:01:49AM -0500, Kalra, Ashish wrote:
> > > On 6/3/2024 8:39 AM, Mike Rapoport wrote:
> > >
> > > > On Mon, Jun 03, 2024 at 08:06:56AM -0500, Kalra, Ashish wrote:
> > > > > On 6/3/2024 3:56 AM, Borislav Petkov wrote
> > > > >
> > > > > > > EFI memory map and due to early allocation it uses memblock allocation.
> > > > > > >
> > > > > > > Later during boot, efi_enter_virtual_mode() calls kexec_enter_virtual_mode()
> > > > > > > in case of a kexec-ed kernel boot.
> > > > > > >
> > > > > > > This function kexec_enter_virtual_mode() installs the new EFI memory map by
> > > > > > > calling efi_memmap_init_late() which remaps the efi_memmap physically allocated
> > > > > > > in efi_arch_mem_reserve(), but this remapping is still using memblock allocation.
> > > > > > >
> > > > > > > Subsequently, when memblock is freed later in boot flow, this remapped
> > > > > > > efi_memmap will have random corruption (similar to a use-after-free scenario).
> > > > > > >
> > > > > > > The corrupted EFI memory map is then passed to the next kexec-ed kernel
> > > > > > > which causes a panic when trying to use the corrupted EFI memory map.
> > > > > > This sounds fishy: memblock allocated memory is not freed later in the
> > > > > > boot - it remains reserved. Only free memory is freed from memblock to
> > > > > > the buddy allocator.
> > > > > >
> > > > > > Or is the problem that memblock-allocated memory cannot be memremapped
> > > > > > because *raisins*?
> > > > > This is what seems to be happening:
> > > > >
> > > > > efi_arch_mem_reserve() calls efi_memmap_alloc() to allocate memory for
> > > > > EFI memory map and due to early allocation it uses memblock allocation.
> > > > >
> > > > > And later efi_enter_virtual_mode() calls kexec_enter_virtual_mode()
> > > > > in case of a kexec-ed kernel boot.
> > > > >
> > > > > This function kexec_enter_virtual_mode() installs the new EFI memory map by
> > > > > calling efi_memmap_init_late() which does memremap() on memblock-allocated memory.
> > > > Does the issue happen only with SNP?
> > > This is observed under SNP as efi_arch_mem_reserve() is only being called
> > > with SNP enabled and then efi_arch_mem_reserve() allocates EFI memory map
> > > using memblock.
> > I don't see how efi_arch_mem_reserve() is only called with SNP. What did I
> > miss?
>
> This is the call stack for efi_arch_mem_reserve():
>
> [ 0.310010] efi_arch_mem_reserve+0xb1/0x220
> [ 0.311382] efi_mem_reserve+0x36/0x60
> [ 0.311973] efi_bgrt_init+0x17d/0x1a0
> [ 0.313265] acpi_parse_bgrt+0x12/0x20
> [ 0.313858] acpi_table_parse+0x77/0xd0
> [ 0.314463] acpi_boot_init+0x362/0x630
> [ 0.315069] setup_arch+0xa88/0xf80
> [ 0.315629] start_kernel+0x68/0xa90
> [ 0.316194] x86_64_start_reservations+0x1c/0x30
> [ 0.316921] x86_64_start_kernel+0xbf/0x110
> [ 0.317582] common_startup_64+0x13e/0x141
>
> So, probably it is being invoked specifically for AMD platform ?
AFAIU, efi_bgrt_init() can be called for any x86 platform, with or without
encryption.
So if my understating is correct, efi_arch_mem_reserve() will be called with SNP
disabled as well. And if kexec works ok without SNP but fails with SNP this
may give as a clue to the root cause of the failure.
> > > If we skip efi_arch_mem_reserve() (which should probably be anyway skipped
> > > for kexec case), then for kexec boot, EFI memmap is memremapped in the same
> > > virtual address as the first kernel and not the allocated memblock address.
> > Maybe we should skip efi_arch_mem_reserve() for kexec case, but I think we
> > still need to understand what's causing memory corruption.
>
> When, efi_arch_mem_reserve() allocates memory for EFI memory map using
> memblock and then later in boot, kexec_enter_virtual_mode() does memremap on
> this memblock allocated memory, subsequently after this i see EFI memory map
> corruption, so are there are any issues doing memremap on memblock-allocated
> memory ?
memblock-allocated memory is just RAM, so my take is that memremap() cannot
figure out the encryption bits properly.
You can check if there are issues with memrmapp()ing memblock-allocated
memory by sticking memblock_phys_alloc() somewhere, filling that memory with a
pattern and then calling memremap(addr, size, MEMREMAP_WB) and checking if
the pattern is still there.
> Thanks, Ashish
>
> > > > I didn't really dig, but my theory would be that it has something to do
> > > > with arch_memremap_can_ram_remap() in arch/x86/mm/ioremap.c
> > > > > Thanks, Ashish
--
Sincerely yours,
Mike.
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion
2024-05-29 10:47 ` Nikolay Borisov
2024-05-29 11:17 ` Kirill A. Shutemov
2024-06-03 14:43 ` H. Peter Anvin
@ 2024-06-03 22:43 ` H. Peter Anvin
2 siblings, 0 replies; 115+ messages in thread
From: H. Peter Anvin @ 2024-06-03 22:43 UTC (permalink / raw)
To: Nikolay Borisov, Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, K. Y. Srinivasan,
Haiyang Zhang, kexec, linux-hyperv, linux-acpi, linux-coco,
linux-kernel
On 5/29/24 03:47, Nikolay Borisov wrote:
>>
>> diff --git a/arch/x86/kernel/relocate_kernel_64.S
>> b/arch/x86/kernel/relocate_kernel_64.S
>> index 56cab1bb25f5..085eef5c3904 100644
>> --- a/arch/x86/kernel/relocate_kernel_64.S
>> +++ b/arch/x86/kernel/relocate_kernel_64.S
>> @@ -148,9 +148,10 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
>> */
>> movl $X86_CR4_PAE, %eax
>> testq $X86_CR4_LA57, %r13
>> - jz 1f
>> + jz .Lno_la57
>> orl $X86_CR4_LA57, %eax
>> -1:
>> +.Lno_la57:
>> +
>> movq %rax, %cr4
>> jmp 1f
>
Sorry if this is a duplicate; something strange happened with my email.
If you are cleaning up this code anyway...
this whole piece of code can be simplified to:
and $(X86_CR4_PAE | X86_CR4_LA57), %r13d
mov %r13, %cr4
The PAE bit in %r13 is guaranteed to be set, and %r13 is dead after this.
-hpa
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion
2024-05-29 11:17 ` Kirill A. Shutemov
2024-05-29 11:28 ` Borislav Petkov
@ 2024-06-04 0:24 ` H. Peter Anvin
2024-06-04 9:15 ` Borislav Petkov
1 sibling, 1 reply; 115+ messages in thread
From: H. Peter Anvin @ 2024-06-04 0:24 UTC (permalink / raw)
To: Kirill A. Shutemov, Nikolay Borisov
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, K. Y. Srinivasan,
Haiyang Zhang, kexec, linux-hyperv, linux-acpi, linux-coco,
linux-kernel
Trying one more time; sorry (again) if someone receives this in duplicate.
>>>
>>> diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
>>> index 56cab1bb25f5..085eef5c3904 100644
>>> --- a/arch/x86/kernel/relocate_kernel_64.S
>>> +++ b/arch/x86/kernel/relocate_kernel_64.S
>>> @@ -148,9 +148,10 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
>>> */
>>> movl $X86_CR4_PAE, %eax
>>> testq $X86_CR4_LA57, %r13
>>> - jz 1f
>>> + jz .Lno_la57
>>> orl $X86_CR4_LA57, %eax
>>> -1:
>>> +.Lno_la57:
>>> +
>>> movq %rax, %cr4
If we are cleaning up this code... the above can simply be:
andl $(X86_CR4_PAE | X86_CR4_LA54), %r13
movq %r13, %cr4
%r13 is dead afterwards, and the PAE bit *will* be set in %r13 anyway.
-hpa
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-03 15:31 ` Mike Rapoport
2024-06-03 16:48 ` Kalra, Ashish
@ 2024-06-04 1:23 ` Dave Young
2024-06-04 9:43 ` Borislav Petkov
1 sibling, 1 reply; 115+ messages in thread
From: Dave Young @ 2024-06-04 1:23 UTC (permalink / raw)
To: Mike Rapoport
Cc: Borislav Petkov, Kalra, Ashish, tglx, mingo, dave.hansen, x86,
rafael, hpa, peterz, adrian.hunter, sathyanarayanan.kuppuswamy,
jun.nakajima, rick.p.edgecombe, thomas.lendacky, michael.roth,
seanjc, kai.huang, bhe, kirill.shutemov, bdas, vkuznets,
dionnaglaze, anisinha, jroedel, ardb, kexec, linux-coco,
linux-kernel
On Mon, 3 Jun 2024 at 23:33, Mike Rapoport <rppt@kernel.org> wrote:
>
> On Mon, Jun 03, 2024 at 04:46:39PM +0200, Borislav Petkov wrote:
> > On Mon, Jun 03, 2024 at 09:01:49AM -0500, Kalra, Ashish wrote:
> > > If we skip efi_arch_mem_reserve() (which should probably be anyway skipped
> > > for kexec case), then for kexec boot, EFI memmap is memremapped in the same
> > > virtual address as the first kernel and not the allocated memblock address.
> >
> > Are you saying that we should simply do
> >
> > diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> > index fdf07dd6f459..410cb0743289 100644
> > --- a/drivers/firmware/efi/efi.c
> > +++ b/drivers/firmware/efi/efi.c
> > @@ -577,6 +577,9 @@ void __init efi_mem_reserve(phys_addr_t addr, u64 size)
> > if (WARN_ON_ONCE(efi_enabled(EFI_PARAVIRT)))
> > return;
> >
> > + if (kexec_in_progress)
> > + return;
> > +
kexec_in_progress is only for checking if this is in a reboot (kexec) code path.
But eif_mem_reserve is only called during the boot time so checking
kexec_in_progress is meaningless here.
current_kernel_is_booted_via_kexec != is_rebooting_with_kexec
The code change below in the patch looks good to me, but I'm not sure
what caused the memory corruption, it indeed worth some more digging,
maybe SEV/SNP related.
+ if (md.attribute & EFI_MEMORY_RUNTIME)
+ return;
Thanks
Dave
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion
2024-06-04 0:24 ` H. Peter Anvin
@ 2024-06-04 9:15 ` Borislav Petkov
2024-06-04 15:21 ` Kirill A. Shutemov
0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2024-06-04 9:15 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Kirill A. Shutemov, Nikolay Borisov, Thomas Gleixner, Ingo Molnar,
Dave Hansen, x86, Rafael J. Wysocki, Peter Zijlstra,
Adrian Hunter, Kuppuswamy Sathyanarayanan, Elena Reshetova,
Jun Nakajima, Rick Edgecombe, Tom Lendacky, Kalra, Ashish,
Sean Christopherson, Huang, Kai, Ard Biesheuvel, Baoquan He,
K. Y. Srinivasan, Haiyang Zhang, kexec, linux-hyperv, linux-acpi,
linux-coco, linux-kernel
On Mon, Jun 03, 2024 at 05:24:00PM -0700, H. Peter Anvin wrote:
> Trying one more time; sorry (again) if someone receives this in duplicate.
>
> > > >
> > > > diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
> > > > index 56cab1bb25f5..085eef5c3904 100644
> > > > --- a/arch/x86/kernel/relocate_kernel_64.S
> > > > +++ b/arch/x86/kernel/relocate_kernel_64.S
> > > > @@ -148,9 +148,10 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
> > > > */
> > > > movl $X86_CR4_PAE, %eax
> > > > testq $X86_CR4_LA57, %r13
> > > > - jz 1f
> > > > + jz .Lno_la57
> > > > orl $X86_CR4_LA57, %eax
> > > > -1:
> > > > +.Lno_la57:
> > > > +
> > > > movq %rax, %cr4
>
> If we are cleaning up this code... the above can simply be:
>
> andl $(X86_CR4_PAE | X86_CR4_LA54), %r13
> movq %r13, %cr4
>
> %r13 is dead afterwards, and the PAE bit *will* be set in %r13 anyway.
Yeah, with a proper comment. The testing of bits is not really needed.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-04 1:23 ` Dave Young
@ 2024-06-04 9:43 ` Borislav Petkov
2024-06-04 11:09 ` Dave Young
0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2024-06-04 9:43 UTC (permalink / raw)
To: Dave Young
Cc: Mike Rapoport, Kalra, Ashish, tglx, mingo, dave.hansen, x86,
rafael, hpa, peterz, adrian.hunter, sathyanarayanan.kuppuswamy,
jun.nakajima, rick.p.edgecombe, thomas.lendacky, michael.roth,
seanjc, kai.huang, bhe, kirill.shutemov, bdas, vkuznets,
dionnaglaze, anisinha, jroedel, ardb, kexec, linux-coco,
linux-kernel
On Tue, Jun 04, 2024 at 09:23:58AM +0800, Dave Young wrote:
> kexec_in_progress is only for checking if this is in a reboot (kexec) code path.
> But eif_mem_reserve is only called during the boot time so checking
> kexec_in_progress is meaningless here.
> current_kernel_is_booted_via_kexec != is_rebooting_with_kexec
That's exactly what I wanna check: whether this is a kexec-ed kernel. Or
is there a better helper for that?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-04 9:43 ` Borislav Petkov
@ 2024-06-04 11:09 ` Dave Young
2024-06-04 18:02 ` Borislav Petkov
0 siblings, 1 reply; 115+ messages in thread
From: Dave Young @ 2024-06-04 11:09 UTC (permalink / raw)
To: Borislav Petkov
Cc: Mike Rapoport, Kalra, Ashish, tglx, mingo, dave.hansen, x86,
rafael, hpa, peterz, adrian.hunter, sathyanarayanan.kuppuswamy,
jun.nakajima, rick.p.edgecombe, thomas.lendacky, michael.roth,
seanjc, kai.huang, bhe, kirill.shutemov, bdas, vkuznets,
dionnaglaze, anisinha, jroedel, ardb, kexec, linux-coco,
linux-kernel
On Tue, 4 Jun 2024 at 17:44, Borislav Petkov <bp@alien8.de> wrote:
>
> On Tue, Jun 04, 2024 at 09:23:58AM +0800, Dave Young wrote:
> > kexec_in_progress is only for checking if this is in a reboot (kexec) code path.
> > But eif_mem_reserve is only called during the boot time so checking
> > kexec_in_progress is meaningless here.
> > current_kernel_is_booted_via_kexec != is_rebooting_with_kexec
>
> That's exactly what I wanna check: whether this is a kexec-ed kernel. Or
> is there a better helper for that?
No general way to check if it is a kexec-ed kernel or not, for x86
one can check the efi_setup as Ashish's original patch did, as the
kexec booted kernel (efi boot) will have efi setup_data passed in.
Otherwise there is a type_of_loader field for x86 boot protocol,
kexec-tools is 0x0D, the kexec_file_load also uses this. But adding
the type_of_loader was only added in kexec-tools code when Yinghai
worked on the kexec-tools bzImage64 load, so older kexec-tools will
not set this field. Anyway the in-kernel kexec_file_load code for x86
added 0x0D as loader type from the beginning.
Anyway there is not such a helper for all cases.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
>
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion
2024-06-04 9:15 ` Borislav Petkov
@ 2024-06-04 15:21 ` Kirill A. Shutemov
2024-06-04 17:57 ` Borislav Petkov
2024-06-11 18:26 ` H. Peter Anvin
0 siblings, 2 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-06-04 15:21 UTC (permalink / raw)
To: Borislav Petkov
Cc: H. Peter Anvin, Nikolay Borisov, Thomas Gleixner, Ingo Molnar,
Dave Hansen, x86, Rafael J. Wysocki, Peter Zijlstra,
Adrian Hunter, Kuppuswamy Sathyanarayanan, Elena Reshetova,
Jun Nakajima, Rick Edgecombe, Tom Lendacky, Kalra, Ashish,
Sean Christopherson, Huang, Kai, Ard Biesheuvel, Baoquan He,
K. Y. Srinivasan, Haiyang Zhang, kexec, linux-hyperv, linux-acpi,
linux-coco, linux-kernel
On Tue, Jun 04, 2024 at 11:15:03AM +0200, Borislav Petkov wrote:
> On Mon, Jun 03, 2024 at 05:24:00PM -0700, H. Peter Anvin wrote:
> > Trying one more time; sorry (again) if someone receives this in duplicate.
> >
> > > > >
> > > > > diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
> > > > > index 56cab1bb25f5..085eef5c3904 100644
> > > > > --- a/arch/x86/kernel/relocate_kernel_64.S
> > > > > +++ b/arch/x86/kernel/relocate_kernel_64.S
> > > > > @@ -148,9 +148,10 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
> > > > > */
> > > > > movl $X86_CR4_PAE, %eax
> > > > > testq $X86_CR4_LA57, %r13
> > > > > - jz 1f
> > > > > + jz .Lno_la57
> > > > > orl $X86_CR4_LA57, %eax
> > > > > -1:
> > > > > +.Lno_la57:
> > > > > +
> > > > > movq %rax, %cr4
> >
> > If we are cleaning up this code... the above can simply be:
> >
> > andl $(X86_CR4_PAE | X86_CR4_LA54), %r13
> > movq %r13, %cr4
> >
> > %r13 is dead afterwards, and the PAE bit *will* be set in %r13 anyway.
>
> Yeah, with a proper comment. The testing of bits is not really needed.
I think it is better fit the next patch.
What about this?
From b45fe48092abad2612c2bafbb199e4de80c99545 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Date: Fri, 10 Feb 2023 12:53:11 +0300
Subject: [PATCHv11.1 06/19] x86/kexec: Keep CR4.MCE set during kexec for TDX guest
TDX guests run with MCA enabled (CR4.MCE=1b) from the very start. If
that bit is cleared during CR4 register reprogramming during boot or
kexec flows, a #VE exception will be raised which the guest kernel
cannot handle it.
Therefore, make sure the CR4.MCE setting is preserved over kexec too and
avoid raising any #VEs.
The change doesn't affect non-TDX-guest environments.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/kernel/relocate_kernel_64.S | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 085eef5c3904..9c2cf70c5f54 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -5,6 +5,8 @@
*/
#include <linux/linkage.h>
+#include <linux/stringify.h>
+#include <asm/alternative.h>
#include <asm/page_types.h>
#include <asm/kexec.h>
#include <asm/processor-flags.h>
@@ -145,14 +147,15 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
* Set cr4 to a known state:
* - physical address extension enabled
* - 5-level paging, if it was enabled before
+ * - Machine check exception on TDX guest, if it was enabled before.
+ * Clearing MCE might not be allowed in TDX guests, depending on setup.
+ *
+ * Use R13 that contains the original CR4 value, read in relocate_kernel().
+ * PAE is always set in the original CR4.
*/
- movl $X86_CR4_PAE, %eax
- testq $X86_CR4_LA57, %r13
- jz .Lno_la57
- orl $X86_CR4_LA57, %eax
-.Lno_la57:
-
- movq %rax, %cr4
+ andl $(X86_CR4_PAE | X86_CR4_LA57), %r13d
+ ALTERNATIVE "", __stringify(orl $X86_CR4_MCE, %r13d), X86_FEATURE_TDX_GUEST
+ movq %r13, %cr4
jmp 1f
1:
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply related [flat|nested] 115+ messages in thread
* Re: [PATCHv11.1 11/19] x86/tdx: Convert shared memory back to private on kexec
2024-06-03 8:37 ` Borislav Petkov
@ 2024-06-04 15:32 ` Kirill A. Shutemov
2024-06-04 15:47 ` Dave Hansen
0 siblings, 1 reply; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-06-04 15:32 UTC (permalink / raw)
To: Borislav Petkov
Cc: adrian.hunter, ardb, ashish.kalra, bhe, dave.hansen,
elena.reshetova, haiyangz, hpa, jun.nakajima, kai.huang, kexec,
kys, linux-acpi, linux-coco, linux-hyperv, linux-kernel, ltao,
mingo, peterz, rafael, rick.p.edgecombe,
sathyanarayanan.kuppuswamy, seanjc, tglx, thomas.lendacky, x86
On Mon, Jun 03, 2024 at 10:37:54AM +0200, Borislav Petkov wrote:
> On Sun, Jun 02, 2024 at 05:23:03PM +0300, Kirill A. Shutemov wrote:
> > + /*
> > + * The only thing one can do at this point on failure
> > + * is panic. It is reasonable to proceed.
>
> It makes even less sense now: panic() means "all stops and we die" and
> you say it is reasonable to proceed.
>
> I'm confused.
Right.
What about the comment below?
/*
* One possible reason for the failure is if kexec raced
* with memory conversion. In this case shared bit in
* page table got set (or not cleared) during
* shared<->private conversion, but the page is actually
* private. So this failure is not going to affect the
* kexec'ed kernel.
*
* The only thing one can do at this point on failure
* at this point is panic. In absence of better options,
* it is reasonable to proceed, hoping the failure is a
* benign shared bit mismatch due to the race.
*
* Also, even if the failure is real and the page cannot
* be touched as private, the kdump kernel will boot
* fine as it uses pre-reserved memory. What happens
* next depends on what the dumping process does and
* there's a reasonable chance to produce useful dump
* on crash.
*
* Regardless, the print leaves a trace in the log to
* give a clue for debug.
*/
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11.1 11/19] x86/tdx: Convert shared memory back to private on kexec
2024-06-04 15:32 ` Kirill A. Shutemov
@ 2024-06-04 15:47 ` Dave Hansen
2024-06-04 16:14 ` Kirill A. Shutemov
0 siblings, 1 reply; 115+ messages in thread
From: Dave Hansen @ 2024-06-04 15:47 UTC (permalink / raw)
To: Kirill A. Shutemov, Borislav Petkov
Cc: adrian.hunter, ardb, ashish.kalra, bhe, dave.hansen,
elena.reshetova, haiyangz, hpa, jun.nakajima, kai.huang, kexec,
kys, linux-acpi, linux-coco, linux-hyperv, linux-kernel, ltao,
mingo, peterz, rafael, rick.p.edgecombe,
sathyanarayanan.kuppuswamy, seanjc, tglx, thomas.lendacky, x86
On 6/4/24 08:32, Kirill A. Shutemov wrote:
> What about the comment below?
>
> /*
> * One possible reason for the failure is if kexec raced
> * with memory conversion. In this case shared bit in
> * page table got set (or not cleared) during
> * shared<->private conversion, but the page is actually
> * private. So this failure is not going to affect the
> * kexec'ed kernel.
> *
> * The only thing one can do at this point on failure
> * at this point is panic. In absence of better options,
> * it is reasonable to proceed, hoping the failure is a
> * benign shared bit mismatch due to the race.
> *
> * Also, even if the failure is real and the page cannot
> * be touched as private, the kdump kernel will boot
> * fine as it uses pre-reserved memory. What happens
> * next depends on what the dumping process does and
> * there's a reasonable chance to produce useful dump
> * on crash.
> *
> * Regardless, the print leaves a trace in the log to
> * give a clue for debug.
> */
It's rambling too much for my taste.
Let's boil this down to what matters:
1. Failures to change encryption status here can lead a future kernel
to touch shared memory with a private mapping
2. That causes an immediate unrecoverable guest shutdown (right?)
3. kdump kernels should not be affected since they have their own
memory ranges and its encryption status is not being tweawked here
4. The pr_err() may help make some sense out of #2 when it happens
I'm not sure the reason behind the failed conversion is important here.
I wouldn't mention panic().
We don't need to opine about what the next kernel might or might not do.
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 09/19] x86/tdx: Account shared memory
2024-05-28 9:55 ` [PATCHv11 09/19] x86/tdx: Account shared memory Kirill A. Shutemov
@ 2024-06-04 16:08 ` Dave Hansen
2024-06-04 16:24 ` Kirill A. Shutemov
0 siblings, 1 reply; 115+ messages in thread
From: Dave Hansen @ 2024-06-04 16:08 UTC (permalink / raw)
To: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
K. Y. Srinivasan, Haiyang Zhang, kexec, linux-hyperv, linux-acpi,
linux-coco, linux-kernel, Tao Liu
On 5/28/24 02:55, Kirill A. Shutemov wrote:
> Keep track of the number of shared pages. This will allow for
> cross-checking against the shared information in the direct mapping
> and reporting if the shared bit is lost.
It's probably also worth mentioning that conversions are slow and
relatively rare and even though a global atomic isn't really scalable,
it also isn't worth doing anything fancier.
> diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
> index 26fa47db5782..979891e97d83 100644
> --- a/arch/x86/coco/tdx/tdx.c
> +++ b/arch/x86/coco/tdx/tdx.c
> @@ -38,6 +38,8 @@
>
> #define TDREPORT_SUBTYPE_0 0
>
> +static atomic_long_t nr_shared;
Doesn't this technically need to be:
static atomic_long_t nr_shared = ATOMIC_LONG_INIT(0);
? I thought we had some architectures where the 0 logical value wasn't
actually all 0's.
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11.1 11/19] x86/tdx: Convert shared memory back to private on kexec
2024-06-04 15:47 ` Dave Hansen
@ 2024-06-04 16:14 ` Kirill A. Shutemov
2024-06-04 18:05 ` Borislav Petkov
0 siblings, 1 reply; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-06-04 16:14 UTC (permalink / raw)
To: Dave Hansen
Cc: Borislav Petkov, adrian.hunter, ardb, ashish.kalra, bhe,
dave.hansen, elena.reshetova, haiyangz, hpa, jun.nakajima,
kai.huang, kexec, kys, linux-acpi, linux-coco, linux-hyperv,
linux-kernel, ltao, mingo, peterz, rafael, rick.p.edgecombe,
sathyanarayanan.kuppuswamy, seanjc, tglx, thomas.lendacky, x86
On Tue, Jun 04, 2024 at 08:47:22AM -0700, Dave Hansen wrote:
> On 6/4/24 08:32, Kirill A. Shutemov wrote:
> > What about the comment below?
> >
> > /*
> > * One possible reason for the failure is if kexec raced
> > * with memory conversion. In this case shared bit in
> > * page table got set (or not cleared) during
> > * shared<->private conversion, but the page is actually
> > * private. So this failure is not going to affect the
> > * kexec'ed kernel.
> > *
> > * The only thing one can do at this point on failure
> > * at this point is panic. In absence of better options,
> > * it is reasonable to proceed, hoping the failure is a
> > * benign shared bit mismatch due to the race.
> > *
> > * Also, even if the failure is real and the page cannot
> > * be touched as private, the kdump kernel will boot
> > * fine as it uses pre-reserved memory. What happens
> > * next depends on what the dumping process does and
> > * there's a reasonable chance to produce useful dump
> > * on crash.
> > *
> > * Regardless, the print leaves a trace in the log to
> > * give a clue for debug.
> > */
>
> It's rambling too much for my taste.
>
> Let's boil this down to what matters:
>
> 1. Failures to change encryption status here can lead a future kernel
> to touch shared memory with a private mapping
> 2. That causes an immediate unrecoverable guest shutdown (right?)
Right.
> 3. kdump kernels should not be affected since they have their own
> memory ranges and its encryption status is not being tweawked here
> 4. The pr_err() may help make some sense out of #2 when it happens
>
> I'm not sure the reason behind the failed conversion is important here.
The important part is that failure can be benign. It explains "can" in #1.
But okay.
> I wouldn't mention panic().
>
> We don't need to opine about what the next kernel might or might not do.
Is this any better?
/*
* If tdx_enc_status_changed() fails, it leaves memory
* in an unknown state. If the memory remains shared,
* it can result in an unrecoverable guest shutdown on
* the first accessed through a private mapping.
*
* The kdump kernel boot is not impacted as it uses
* a pre-reserved memory range that is always private.
* However, gathering crash information could lead to
* a crash if it accesses unconverted memory through
* a private mapping.
*
* pr_err() may assist in understanding such crashes.
*/
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 10/19] x86/mm: Add callbacks to prepare encrypted memory for kexec
2024-05-28 9:55 ` [PATCHv11 10/19] x86/mm: Add callbacks to prepare encrypted memory for kexec Kirill A. Shutemov
2024-05-29 10:42 ` Borislav Petkov
@ 2024-06-04 16:16 ` Dave Hansen
1 sibling, 0 replies; 115+ messages in thread
From: Dave Hansen @ 2024-06-04 16:16 UTC (permalink / raw)
To: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
K. Y. Srinivasan, Haiyang Zhang, kexec, linux-hyperv, linux-acpi,
linux-coco, linux-kernel, Nikolay Borisov, Tao Liu
On 5/28/24 02:55, Kirill A. Shutemov wrote:
> + x86_platform.guest.enc_kexec_begin(true);
> + x86_platform.guest.enc_kexec_finish();
I really despise the random, unlabeled true/false/0/1 arguments to
functions like this.
I'll bring it up in the non-noop patch though.
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 09/19] x86/tdx: Account shared memory
2024-06-04 16:08 ` Dave Hansen
@ 2024-06-04 16:24 ` Kirill A. Shutemov
0 siblings, 0 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-06-04 16:24 UTC (permalink / raw)
To: Dave Hansen
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
K. Y. Srinivasan, Haiyang Zhang, kexec, linux-hyperv, linux-acpi,
linux-coco, linux-kernel, Tao Liu
On Tue, Jun 04, 2024 at 09:08:25AM -0700, Dave Hansen wrote:
> On 5/28/24 02:55, Kirill A. Shutemov wrote:
> > Keep track of the number of shared pages. This will allow for
> > cross-checking against the shared information in the direct mapping
> > and reporting if the shared bit is lost.
>
> It's probably also worth mentioning that conversions are slow and
> relatively rare and even though a global atomic isn't really scalable,
> it also isn't worth doing anything fancier.
Okay, will do.
> > diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
> > index 26fa47db5782..979891e97d83 100644
> > --- a/arch/x86/coco/tdx/tdx.c
> > +++ b/arch/x86/coco/tdx/tdx.c
> > @@ -38,6 +38,8 @@
> >
> > #define TDREPORT_SUBTYPE_0 0
> >
> > +static atomic_long_t nr_shared;
>
> Doesn't this technically need to be:
>
> static atomic_long_t nr_shared = ATOMIC_LONG_INIT(0);
>
> ? I thought we had some architectures where the 0 logical value wasn't
> actually all 0's.
Hm. I am not aware of such requirement. I see plenty uninitilized
atomic_long_t in generic code. For instance, invalid_kread_bytes.
And I doubt TDX will ever be built for non-x86 :P
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 11/19] x86/tdx: Convert shared memory back to private on kexec
2024-05-28 9:55 ` [PATCHv11 11/19] x86/tdx: Convert shared memory back to private on kexec Kirill A. Shutemov
2024-05-31 15:14 ` Borislav Petkov
@ 2024-06-04 16:27 ` Dave Hansen
2024-06-05 12:43 ` Kirill A. Shutemov
1 sibling, 1 reply; 115+ messages in thread
From: Dave Hansen @ 2024-06-04 16:27 UTC (permalink / raw)
To: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
K. Y. Srinivasan, Haiyang Zhang, kexec, linux-hyperv, linux-acpi,
linux-coco, linux-kernel, Tao Liu
On 5/28/24 02:55, Kirill A. Shutemov wrote:
> +/* Stop new private<->shared conversions */
> +static void tdx_kexec_begin(bool crash)
> +{
> + /*
> + * Crash kernel reaches here with interrupts disabled: can't wait for
> + * conversions to finish.
> + *
> + * If race happened, just report and proceed.
> + */
> + if (!set_memory_enc_stop_conversion(!crash))
> + pr_warn("Failed to stop shared<->private conversions\n");
> +}
I don't like having to pass 'crash' in here.
If interrupts are the problem we have ways of testing for those directly.
If it's being in an oops that's a problem, we have 'oops_in_progress'
for that.
In other words, I'd much rather this function (or better yet
set_memory_enc_stop_conversion() itself) use some existing API to change
its behavior in a crash rather than have the context be passed down and
twiddled through several levels of function calls.
There are a ton of these in the console code:
if (oops_in_progress)
foo_trylock();
else
foo_lock();
To me, that's a billion times more clear than a 'wait' argument that
gets derives from who-knows-what that I have to trace through ten levels
of function calls.
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion
2024-06-04 15:21 ` Kirill A. Shutemov
@ 2024-06-04 17:57 ` Borislav Petkov
2024-06-11 18:26 ` H. Peter Anvin
1 sibling, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2024-06-04 17:57 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: H. Peter Anvin, Nikolay Borisov, Thomas Gleixner, Ingo Molnar,
Dave Hansen, x86, Rafael J. Wysocki, Peter Zijlstra,
Adrian Hunter, Kuppuswamy Sathyanarayanan, Elena Reshetova,
Jun Nakajima, Rick Edgecombe, Tom Lendacky, Kalra, Ashish,
Sean Christopherson, Huang, Kai, Ard Biesheuvel, Baoquan He,
K. Y. Srinivasan, Haiyang Zhang, kexec, linux-hyperv, linux-acpi,
linux-coco, linux-kernel
On Tue, Jun 04, 2024 at 06:21:27PM +0300, Kirill A. Shutemov wrote:
> What about this?
Yeah, LGTM.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-04 11:09 ` Dave Young
@ 2024-06-04 18:02 ` Borislav Petkov
2024-06-05 2:53 ` Dave Young
0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2024-06-04 18:02 UTC (permalink / raw)
To: Dave Young
Cc: Mike Rapoport, Kalra, Ashish, tglx, mingo, dave.hansen, x86,
rafael, hpa, peterz, adrian.hunter, sathyanarayanan.kuppuswamy,
jun.nakajima, rick.p.edgecombe, thomas.lendacky, michael.roth,
seanjc, kai.huang, bhe, kirill.shutemov, bdas, vkuznets,
dionnaglaze, anisinha, jroedel, ardb, kexec, linux-coco,
linux-kernel
On Tue, Jun 04, 2024 at 07:09:56PM +0800, Dave Young wrote:
> Anyway there is not such a helper for all cases.
But maybe there should be...
This is not the first case where the need arises to be able to say:
if (am I a kexeced kernel)
in code.
Perhaps we should have a global var kexeced or so which gets incremented
on each kexec-ed kernel, somewhere in very early boot of the kexec-ed
kernel we do
kexeced++;
and then other code can query it and know whether this is a kexec-ed
kernel and how many times it got kexec-ed...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11.1 11/19] x86/tdx: Convert shared memory back to private on kexec
2024-06-04 16:14 ` Kirill A. Shutemov
@ 2024-06-04 18:05 ` Borislav Petkov
2024-06-05 12:21 ` Kirill A. Shutemov
0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2024-06-04 18:05 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Dave Hansen, adrian.hunter, ardb, ashish.kalra, bhe, dave.hansen,
elena.reshetova, haiyangz, hpa, jun.nakajima, kai.huang, kexec,
kys, linux-acpi, linux-coco, linux-hyperv, linux-kernel, ltao,
mingo, peterz, rafael, rick.p.edgecombe,
sathyanarayanan.kuppuswamy, seanjc, tglx, thomas.lendacky, x86
On Tue, Jun 04, 2024 at 07:14:00PM +0300, Kirill A. Shutemov wrote:
> /*
> * If tdx_enc_status_changed() fails, it leaves memory
> * in an unknown state. If the memory remains shared,
> * it can result in an unrecoverable guest shutdown on
> * the first accessed through a private mapping.
"access"
So this sentence above can go too, right?
Because that comment is in tdx_kexec_finish() and we're basically going
off to kexec. So can a guest even access it through a private mapping?
We're shutting down so nothing is running anymore...
> * The kdump kernel boot is not impacted as it uses
> * a pre-reserved memory range that is always private.
> * However, gathering crash information could lead to
> * a crash if it accesses unconverted memory through
> * a private mapping.
When does the kexec kernel even get such a private mapping? It is not
even up yet...
> * pr_err() may assist in understanding such crashes.
"Print error info in order to leave bread crumbs for debugging." is what
I'd say.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-03 17:12 ` Borislav Petkov
@ 2024-06-04 22:12 ` Kalra, Ashish
2024-06-04 22:35 ` Kalra, Ashish
1 sibling, 0 replies; 115+ messages in thread
From: Kalra, Ashish @ 2024-06-04 22:12 UTC (permalink / raw)
To: Borislav Petkov
Cc: Mike Rapoport, tglx, mingo, dave.hansen, x86, rafael, hpa, peterz,
adrian.hunter, sathyanarayanan.kuppuswamy, jun.nakajima,
rick.p.edgecombe, thomas.lendacky, michael.roth, seanjc,
kai.huang, bhe, kirill.shutemov, bdas, vkuznets, dionnaglaze,
anisinha, jroedel, ardb, kexec, linux-coco, linux-kernel,
Dave Young
On 6/3/2024 12:12 PM, Borislav Petkov wrote:
> On Mon, Jun 03, 2024 at 12:08:48PM -0500, Kalra, Ashish wrote:
>> efi_arch_mem_reserve().
> Now it only remains for you to explain why...
Here is a detailed explanation of what is causing the EFI memory map corruption, with added debug logs and memblock debugging enabled:
Initially at boot, efi_memblock_x86_reserve_range() does early_memremap() of the EFI memory map passed as part of setup_data, as the following logs show:
...
[ 0.000000] efi: in efi_memblock_x86_reserve_range, phys map 0x27fff9110 [ 0.000000] memblock_reserve: [0x000000027fff9110-0x000000027fffa12f] efi_memblock_x86_reserve_range+0x168/0x2a0
...
Later, efi_arch_mem_reserve() is invoked, which calls efi_memmap_alloc() which does memblock_phys_alloc() to insert new EFI memory descriptor into efi.memap:
...
[ 0.733263] memblock_reserve: [0x000000027ffcaf80-0x000000027ffcbfff] memblock_alloc_range_nid+0xf1/0x1b0 [ 0.734787] efi: efi_arch_mem_reserve, efi phys map 0x27ffcaf80
...
Finally, at the end of boot, kexec_enter_virtual_mode() is called.
It does mapping of efi regions which were passed via setup_data.
So it unregisters the early mem-remapped EFI memmap and installs the new EFI memory map as below:
( Because of efi_arch_mem_reserve() getting invoked, the new EFI memmap phys base being remapped now is the memblock allocation done in efi_arch_mem_reserve()).
[ 4.042160] efi: efi memmap phys map 0x27ffcaf80
So, kexec_enter_virtual_mode() does the following :
if (efi_memmap_init_late(efi.memmap.phys_map, <---- refers to the new EFI memmap phys base allocated via memblock in efi_arch_mem_reserve(). efi.memmap.desc_size * efi.memmap.nr_map)) { ...
This late init, does a memremap() on this memblock-allocated memory, but then immediately frees it :
drivers/firmware/efi/memmap.c:
*/ int __init __efi_memmap_init(struct efi_memory_map_data *data) {
..
phys_map = data->phys_map; <----------------------- refers to the new EFI memmap phys base allocated via memblock in efi_arch_mem_reserve().
if (data->flags & EFI_MEMMAP_LATE) map.map = memremap(phys_map, data->size, MEMREMAP_WB); ... ... if (efi.memmap.flags & (EFI_MEMMAP_MEMBLOCK | EFI_MEMMAP_SLAB)) { __efi_memmap_free(efi.memmap.phys_map, efi.memmap.desc_size * efi.memmap.nr_map, efi.memmap.flags); }
map.phys_map = data->phys_map;
...
efi.memmap = map;
...
This happens as kexec_enter_virtual_mode() can only handle the early mapped EFI memmap and not the one which is memblock allocated by efi_arch_mem_reserve(). As seen above this memblock allocated (EFI_MEMMAP_MEMBLOCK tagged) memory gets freed.
This is confirmed by memblock debugging:
[ 4.044057] memblock_free_late: [0x000000027ffcaf80-0x000000027ffcbfff] __efi_memmap_free+0x66/0x80
So while this memory is memremapped, it has also been freed and then it gets into a use-after-free condition and subsequently gets corrupted.
This corruption is seen just before kexec-ing into the new kernel:
...
[ 11.045522] PEFILE: Unsigned PE binary^M [ 11.060801] kexec-bzImage64: efi memmap phys map 0x27ffcaf80 ... [ 11.061220] kexec-bzImage64: mmap entry, type = 11, va = 0xfffffffeffc00000, pa = 0xffc00000, np = 0x400, attr = 0x8000000000000001^M [ 11.061225] kexec-bzImage64: mmap entry, type = 6, va = 0xfffffffeffb04000, pa = 0x7f704000, np = 0x84, attr = 0x800000000000000f^M [ 11.061228] kexec-bzImage64: mmap entry, type = 4, va = 0xfffffffeff700000, pa = 0x7f100000, np = 0x300, attr = 0x0^M [ 11.061231] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M <---------------- CORRUPTED!!! [ 11.061234] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061236] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061239] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061241] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0,
attr = 0x0^M [ 11.061243] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061245] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061248] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061250] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061252] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061255] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061257] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061259] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061262] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061264] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061266]
kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061268] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061271] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061273] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061275] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061278] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061280] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061282] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061284] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061287] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061289] kexec-bzImage64: mmap entry, type = 0,
va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061291] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061294] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061296] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061298] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061301] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061303] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061305] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061307] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061310] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M [ 11.061312] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr =
0x0^M [ 11.061314] kexec-bzImage64: mmap entry, type = 14080, va = 0x14f29, pa = 0x36c0, np = 0x0, attr = 0x0^M [ 11.061317] kexec-bzImage64: mmap entry, type = 85808, va = 0x0, pa = 0x0, np = 0x72, attr = 0x14f40
...
This EFI memmapphys map address 0x27ffcaf80 being mem-remapped and also getting freed and then in use after free condition (while setting up the EFI memory map for the next kernel with kexec -s) in the above logs confirm the use-after-free case.
Looking at the above code flow, it makes sense to skip efi_arch_mem_reserve() to fix this issue, as it anyway needs to be skipped for kexec case.
Thanks, Ashish
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-03 17:12 ` Borislav Petkov
2024-06-04 22:12 ` Kalra, Ashish
@ 2024-06-04 22:35 ` Kalra, Ashish
2024-06-05 1:48 ` Dave Young
1 sibling, 1 reply; 115+ messages in thread
From: Kalra, Ashish @ 2024-06-04 22:35 UTC (permalink / raw)
To: Borislav Petkov
Cc: Mike Rapoport, tglx, mingo, dave.hansen, x86, rafael, hpa, peterz,
adrian.hunter, sathyanarayanan.kuppuswamy, jun.nakajima,
rick.p.edgecombe, thomas.lendacky, michael.roth, seanjc,
kai.huang, bhe, kirill.shutemov, bdas, vkuznets, dionnaglaze,
anisinha, jroedel, ardb, kexec, linux-coco, linux-kernel,
Dave Young
Re-sending as the earlier response got line-wrapped.
On 6/3/2024 12:12 PM, Borislav Petkov wrote:
> On Mon, Jun 03, 2024 at 12:08:48PM -0500, Kalra, Ashish wrote:
>> efi_arch_mem_reserve().
> Now it only remains for you to explain why...
Here is a detailed explanation of what is causing the EFI memory map corruption, with added debug logs and memblock debugging enabled:
Initially at boot, efi_memblock_x86_reserve_range() does early_memremap() of the EFI memory map passed as part of setup_data, as the following logs show:
...
[ 0.000000] efi: in efi_memblock_x86_reserve_range, phys map 0x27fff9110
[ 0.000000] memblock_reserve: [0x000000027fff9110-0x000000027fffa12f] efi_memblock_x86_reserve_range+0x168/0x2a0
...
Later, efi_arch_mem_reserve() is invoked, which calls efi_memmap_alloc() which does memblock_phys_alloc() to insert new EFI memory descriptor into efi.memap:
...
[ 0.733263] memblock_reserve: [0x000000027ffcaf80-0x000000027ffcbfff] memblock_alloc_range_nid+0xf1/0x1b0
[ 0.734787] efi: efi_arch_mem_reserve, efi phys map 0x27ffcaf80
...
Finally, at the end of boot, kexec_enter_virtual_mode() is called.
It does mapping of efi regions which were passed via setup_data.
So it unregisters the early mem-remapped EFI memmap and installs the new EFI memory map as below:
( Because of efi_arch_mem_reserve() getting invoked, the new EFI memmap phys base being remapped now is the memblock allocation done in efi_arch_mem_reserve()).
[ 4.042160] efi: efi memmap phys map 0x27ffcaf80
So, kexec_enter_virtual_mode() does the following :
if (efi_memmap_init_late(efi.memmap.phys_map, <- refers to the new EFI memmap phys base allocated via memblock in efi_arch_mem_reserve().
efi.memmap.desc_size * efi.memmap.nr_map)) { ...
This late init, does a memremap() on this memblock-allocated memory, but then immediately frees it :
drivers/firmware/efi/memmap.c:
int __init __efi_memmap_init(struct efi_memory_map_data *data)
{
..
phys_map = data->phys_map; <- refers to the new EFI memmap phys base allocated via memblock in efi_arch_mem_reserve().
if (data->flags & EFI_MEMMAP_LATE)
map.map = memremap(phys_map, data->size, MEMREMAP_WB);
...
...
if (efi.memmap.flags & (EFI_MEMMAP_MEMBLOCK | EFI_MEMMAP_SLAB)) {
__efi_memmap_free(efi.memmap.phys_map,
efi.memmap.desc_size * efi.memmap.nr_map, efi.memmap.flags);
}
...
map.phys_map = data->phys_map;
...
efi.memmap = map;
...
This happens as kexec_enter_virtual_mode() can only handle the early mapped EFI memmap and not the one which is memblock allocated by efi_arch_mem_reserve(). As seen above this memblock allocated (EFI_MEMMAP_MEMBLOCK tagged) memory gets freed.
This is confirmed by memblock debugging:
[ 4.044057] memblock_free_late: [0x000000027ffcaf80-0x000000027ffcbfff] __efi_memmap_free+0x66/0x80
So while this memory is memremapped, it has also been freed and then it gets into a use-after-free condition and subsequently gets corrupted.
This corruption is seen just before kexec-ing into the new kernel:
...
[ 11.045522] PEFILE: Unsigned PE binary^M
[ 11.060801] kexec-bzImage64: efi memmap phys map 0x27ffcaf80^M
...
[ 11.061220] kexec-bzImage64: mmap entry, type = 11, va = 0xfffffffeffc00000, pa = 0xffc00000, np = 0x400, attr = 0x8000000000000001^M
[ 11.061225] kexec-bzImage64: mmap entry, type = 6, va = 0xfffffffeffb04000, pa = 0x7f704000, np = 0x84, attr = 0x800000000000000f^M
[ 11.061228] kexec-bzImage64: mmap entry, type = 4, va = 0xfffffffeff700000, pa = 0x7f100000, np = 0x300, attr = 0x0^M
[ 11.061231] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M <- CORRUPTION!!!
[ 11.061234] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061236] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061239] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061241] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061243] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061245] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061248] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061250] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061252] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061255] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061257] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061259] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061262] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061264] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061266] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061268] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061271] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061273] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061275] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061278] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061280] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061282] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061284] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061287] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061289] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061291] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061294] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061296] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061298] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061301] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061303] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061305] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061307] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061310] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061312] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
[ 11.061314] kexec-bzImage64: mmap entry, type = 14080, va = 0x14f29, pa = 0x36c0, np = 0x0, attr = 0x0^M
[ 11.061317] kexec-bzImage64: mmap entry, type = 85808, va = 0x0, pa = 0x0, np = 0x72, attr = 0x14f40^M
[ 11.061320] kexec-bzImage64: mmap entry, type = 0, va = 0x14f4b, pa = 0x65, np = 0x1, attr = 0x0^M
[ 11.061323] kexec-bzImage64: mmap entry, type = 85840, va = 0x0, pa = 0x2, np = 0x69, attr = 0x14f59^M
[ 11.061325] kexec-bzImage64: mmap entry, type = 0, va = 0x14f65, pa = 0x6c, np = 0x0, attr = 0x0^M
[ 11.061328] kexec-bzImage64: mmap entry, type = 85871, va = 0x0, pa = 0x0, np = 0x7a, attr = 0x14f7f^M
...
This EFI phys map address 0x27ffcaf80 is being mem-remapped and also getting freed and then in use after free condition (while setting up the EFI memory map for the next kernel with kexec -s) in the above logs confirm the use-after-free case.
Looking at the above code flow, it makes sense to skip efi_arch_mem_reserve() to fix this issue, as it anyway needs to be skipped for kexec case.
Thanks, Ashish
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-04 22:35 ` Kalra, Ashish
@ 2024-06-05 1:48 ` Dave Young
2024-06-05 1:52 ` Dave Young
2024-06-05 2:14 ` Kalra, Ashish
0 siblings, 2 replies; 115+ messages in thread
From: Dave Young @ 2024-06-05 1:48 UTC (permalink / raw)
To: Kalra, Ashish
Cc: Borislav Petkov, Mike Rapoport, tglx, mingo, dave.hansen, x86,
rafael, hpa, peterz, adrian.hunter, sathyanarayanan.kuppuswamy,
jun.nakajima, rick.p.edgecombe, thomas.lendacky, michael.roth,
seanjc, kai.huang, bhe, kirill.shutemov, bdas, vkuznets,
dionnaglaze, anisinha, jroedel, ardb, kexec, linux-coco,
linux-kernel, dan.j.williams
On Wed, 5 Jun 2024 at 06:36, Kalra, Ashish <ashish.kalra@amd.com> wrote:
>
> Re-sending as the earlier response got line-wrapped.
>
> On 6/3/2024 12:12 PM, Borislav Petkov wrote:
> > On Mon, Jun 03, 2024 at 12:08:48PM -0500, Kalra, Ashish wrote:
> >> efi_arch_mem_reserve().
> > Now it only remains for you to explain why...
>
> Here is a detailed explanation of what is causing the EFI memory map corruption, with added debug logs and memblock debugging enabled:
>
> Initially at boot, efi_memblock_x86_reserve_range() does early_memremap() of the EFI memory map passed as part of setup_data, as the following logs show:
>
> ...
>
> [ 0.000000] efi: in efi_memblock_x86_reserve_range, phys map 0x27fff9110
> [ 0.000000] memblock_reserve: [0x000000027fff9110-0x000000027fffa12f] efi_memblock_x86_reserve_range+0x168/0x2a0
>
> ...
>
> Later, efi_arch_mem_reserve() is invoked, which calls efi_memmap_alloc() which does memblock_phys_alloc() to insert new EFI memory descriptor into efi.memap:
>
> ...
>
> [ 0.733263] memblock_reserve: [0x000000027ffcaf80-0x000000027ffcbfff] memblock_alloc_range_nid+0xf1/0x1b0
> [ 0.734787] efi: efi_arch_mem_reserve, efi phys map 0x27ffcaf80
>
> ...
>
> Finally, at the end of boot, kexec_enter_virtual_mode() is called.
>
> It does mapping of efi regions which were passed via setup_data.
>
> So it unregisters the early mem-remapped EFI memmap and installs the new EFI memory map as below:
>
> ( Because of efi_arch_mem_reserve() getting invoked, the new EFI memmap phys base being remapped now is the memblock allocation done in efi_arch_mem_reserve()).
>
> [ 4.042160] efi: efi memmap phys map 0x27ffcaf80
>
> So, kexec_enter_virtual_mode() does the following :
>
> if (efi_memmap_init_late(efi.memmap.phys_map, <- refers to the new EFI memmap phys base allocated via memblock in efi_arch_mem_reserve().
> efi.memmap.desc_size * efi.memmap.nr_map)) { ...
>
> This late init, does a memremap() on this memblock-allocated memory, but then immediately frees it :
>
> drivers/firmware/efi/memmap.c:
>
> int __init __efi_memmap_init(struct efi_memory_map_data *data)
> {
>
> ..
>
> phys_map = data->phys_map; <- refers to the new EFI memmap phys base allocated via memblock in efi_arch_mem_reserve().
>
> if (data->flags & EFI_MEMMAP_LATE)
> map.map = memremap(phys_map, data->size, MEMREMAP_WB);
> ...
> ...
> if (efi.memmap.flags & (EFI_MEMMAP_MEMBLOCK | EFI_MEMMAP_SLAB)) {
> __efi_memmap_free(efi.memmap.phys_map,
> efi.memmap.desc_size * efi.memmap.nr_map, efi.memmap.flags);
> }
From your debugging the memmap should not be freed. This piece of
code was added in below commit, added Dan Williams in cc list:
commit f0ef6523475f18ccd213e22ee593dfd131a2c5ea
Author: Dan Williams <dan.j.williams@intel.com>
Date: Mon Jan 13 18:22:44 2020 +0100
efi: Fix efi_memmap_alloc() leaks
With efi_fake_memmap() and efi_arch_mem_reserve() the efi table may be
updated and replaced multiple times. When that happens a previous
dynamically allocated efi memory map can be garbage collected. Use the
new EFI_MEMMAP_{SLAB,MEMBLOCK} flags to detect when a dynamically
allocated memory map is being replaced.
>
> ...
> map.phys_map = data->phys_map;
>
> ...
>
> efi.memmap = map;
>
> ...
>
> This happens as kexec_enter_virtual_mode() can only handle the early mapped EFI memmap and not the one which is memblock allocated by efi_arch_mem_reserve(). As seen above this memblock allocated (EFI_MEMMAP_MEMBLOCK tagged) memory gets freed.
>
> This is confirmed by memblock debugging:
>
> [ 4.044057] memblock_free_late: [0x000000027ffcaf80-0x000000027ffcbfff] __efi_memmap_free+0x66/0x80
>
> So while this memory is memremapped, it has also been freed and then it gets into a use-after-free condition and subsequently gets corrupted.
>
> This corruption is seen just before kexec-ing into the new kernel:
>
> ...
> [ 11.045522] PEFILE: Unsigned PE binary^M
> [ 11.060801] kexec-bzImage64: efi memmap phys map 0x27ffcaf80^M
> ...
> [ 11.061220] kexec-bzImage64: mmap entry, type = 11, va = 0xfffffffeffc00000, pa = 0xffc00000, np = 0x400, attr = 0x8000000000000001^M
> [ 11.061225] kexec-bzImage64: mmap entry, type = 6, va = 0xfffffffeffb04000, pa = 0x7f704000, np = 0x84, attr = 0x800000000000000f^M
> [ 11.061228] kexec-bzImage64: mmap entry, type = 4, va = 0xfffffffeff700000, pa = 0x7f100000, np = 0x300, attr = 0x0^M
> [ 11.061231] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M <- CORRUPTION!!!
> [ 11.061234] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061236] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061239] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061241] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061243] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061245] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061248] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061250] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061252] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061255] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061257] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061259] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061262] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061264] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061266] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061268] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061271] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061273] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061275] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061278] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061280] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061282] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061284] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061287] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061289] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061291] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061294] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061296] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061298] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061301] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061303] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061305] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061307] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061310] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061312] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
> [ 11.061314] kexec-bzImage64: mmap entry, type = 14080, va = 0x14f29, pa = 0x36c0, np = 0x0, attr = 0x0^M
> [ 11.061317] kexec-bzImage64: mmap entry, type = 85808, va = 0x0, pa = 0x0, np = 0x72, attr = 0x14f40^M
> [ 11.061320] kexec-bzImage64: mmap entry, type = 0, va = 0x14f4b, pa = 0x65, np = 0x1, attr = 0x0^M
> [ 11.061323] kexec-bzImage64: mmap entry, type = 85840, va = 0x0, pa = 0x2, np = 0x69, attr = 0x14f59^M
> [ 11.061325] kexec-bzImage64: mmap entry, type = 0, va = 0x14f65, pa = 0x6c, np = 0x0, attr = 0x0^M
> [ 11.061328] kexec-bzImage64: mmap entry, type = 85871, va = 0x0, pa = 0x0, np = 0x7a, attr = 0x14f7f^M
>
>
> ...
>
> This EFI phys map address 0x27ffcaf80 is being mem-remapped and also getting freed and then in use after free condition (while setting up the EFI memory map for the next kernel with kexec -s) in the above logs confirm the use-after-free case.
>
> Looking at the above code flow, it makes sense to skip efi_arch_mem_reserve() to fix this issue, as it anyway needs to be skipped for kexec case.
>
> Thanks, Ashish
>
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-05 1:48 ` Dave Young
@ 2024-06-05 1:52 ` Dave Young
2024-06-05 1:58 ` Dave Young
2024-06-05 2:14 ` Kalra, Ashish
1 sibling, 1 reply; 115+ messages in thread
From: Dave Young @ 2024-06-05 1:52 UTC (permalink / raw)
To: Kalra, Ashish
Cc: Borislav Petkov, Mike Rapoport, tglx, mingo, dave.hansen, x86,
rafael, hpa, peterz, adrian.hunter, sathyanarayanan.kuppuswamy,
jun.nakajima, rick.p.edgecombe, thomas.lendacky, michael.roth,
seanjc, kai.huang, bhe, kirill.shutemov, bdas, vkuznets,
dionnaglaze, anisinha, jroedel, ardb, kexec, linux-coco,
linux-kernel, dan.j.williams
> > ...
> > if (efi.memmap.flags & (EFI_MEMMAP_MEMBLOCK | EFI_MEMMAP_SLAB)) {
> > __efi_memmap_free(efi.memmap.phys_map,
> > efi.memmap.desc_size * efi.memmap.nr_map, efi.memmap.flags);
> > }
>
> From your debugging the memmap should not be freed. This piece of
> code was added in below commit, added Dan Williams in cc list:
> commit f0ef6523475f18ccd213e22ee593dfd131a2c5ea
> Author: Dan Williams <dan.j.williams@intel.com>
> Date: Mon Jan 13 18:22:44 2020 +0100
>
> efi: Fix efi_memmap_alloc() leaks
>
> With efi_fake_memmap() and efi_arch_mem_reserve() the efi table may be
> updated and replaced multiple times. When that happens a previous
> dynamically allocated efi memory map can be garbage collected. Use the
> new EFI_MEMMAP_{SLAB,MEMBLOCK} flags to detect when a dynamically
> allocated memory map is being replaced.
>
Dan, probably those regions should be freed only for "fake" memmap?
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-05 1:52 ` Dave Young
@ 2024-06-05 1:58 ` Dave Young
2024-06-05 2:08 ` Kalra, Ashish
0 siblings, 1 reply; 115+ messages in thread
From: Dave Young @ 2024-06-05 1:58 UTC (permalink / raw)
To: Kalra, Ashish
Cc: Borislav Petkov, Mike Rapoport, tglx, mingo, dave.hansen, x86,
rafael, hpa, peterz, adrian.hunter, sathyanarayanan.kuppuswamy,
jun.nakajima, rick.p.edgecombe, thomas.lendacky, michael.roth,
seanjc, kai.huang, bhe, kirill.shutemov, bdas, vkuznets,
dionnaglaze, anisinha, jroedel, ardb, kexec, linux-coco,
linux-kernel, dan.j.williams
On Wed, 5 Jun 2024 at 09:52, Dave Young <dyoung@redhat.com> wrote:
>
> > > ...
> > > if (efi.memmap.flags & (EFI_MEMMAP_MEMBLOCK | EFI_MEMMAP_SLAB)) {
> > > __efi_memmap_free(efi.memmap.phys_map,
> > > efi.memmap.desc_size * efi.memmap.nr_map, efi.memmap.flags);
> > > }
> >
> > From your debugging the memmap should not be freed. This piece of
> > code was added in below commit, added Dan Williams in cc list:
> > commit f0ef6523475f18ccd213e22ee593dfd131a2c5ea
> > Author: Dan Williams <dan.j.williams@intel.com>
> > Date: Mon Jan 13 18:22:44 2020 +0100
> >
> > efi: Fix efi_memmap_alloc() leaks
> >
> > With efi_fake_memmap() and efi_arch_mem_reserve() the efi table may be
> > updated and replaced multiple times. When that happens a previous
> > dynamically allocated efi memory map can be garbage collected. Use the
> > new EFI_MEMMAP_{SLAB,MEMBLOCK} flags to detect when a dynamically
> > allocated memory map is being replaced.
> >
>
> Dan, probably those regions should be freed only for "fake" memmap?
Ashish, can you comment out the __efi_memmap_free see if it works for
you just confirm about the behavior.
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-05 1:58 ` Dave Young
@ 2024-06-05 2:08 ` Kalra, Ashish
2024-06-05 2:28 ` Dave Young
0 siblings, 1 reply; 115+ messages in thread
From: Kalra, Ashish @ 2024-06-05 2:08 UTC (permalink / raw)
To: Dave Young
Cc: Borislav Petkov, Mike Rapoport, tglx, mingo, dave.hansen, x86,
rafael, hpa, peterz, adrian.hunter, sathyanarayanan.kuppuswamy,
jun.nakajima, rick.p.edgecombe, thomas.lendacky, michael.roth,
seanjc, kai.huang, bhe, kirill.shutemov, bdas, vkuznets,
dionnaglaze, anisinha, jroedel, ardb, kexec, linux-coco,
linux-kernel, dan.j.williams
Hello Dave,
On 6/4/2024 8:58 PM, Dave Young wrote:
> On Wed, 5 Jun 2024 at 09:52, Dave Young <dyoung@redhat.com> wrote:
>>>> ...
>>>> if (efi.memmap.flags & (EFI_MEMMAP_MEMBLOCK | EFI_MEMMAP_SLAB)) {
>>>> __efi_memmap_free(efi.memmap.phys_map,
>>>> efi.memmap.desc_size * efi.memmap.nr_map, efi.memmap.flags);
>>>> }
>>> From your debugging the memmap should not be freed. This piece of
>>> code was added in below commit, added Dan Williams in cc list:
>>> commit f0ef6523475f18ccd213e22ee593dfd131a2c5ea
>>> Author: Dan Williams <dan.j.williams@intel.com>
>>> Date: Mon Jan 13 18:22:44 2020 +0100
>>>
>>> efi: Fix efi_memmap_alloc() leaks
>>>
>>> With efi_fake_memmap() and efi_arch_mem_reserve() the efi table may be
>>> updated and replaced multiple times. When that happens a previous
>>> dynamically allocated efi memory map can be garbage collected. Use the
>>> new EFI_MEMMAP_{SLAB,MEMBLOCK} flags to detect when a dynamically
>>> allocated memory map is being replaced.
>>>
>> Dan, probably those regions should be freed only for "fake" memmap?
> Ashish, can you comment out the __efi_memmap_free see if it works for
> you just confirm about the behavior.
Yes, i have already tried and tested that, if i avoid __efi_memmap_free(), then i don't see this memory map corruption.
Thanks, Ashish
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-05 1:48 ` Dave Young
2024-06-05 1:52 ` Dave Young
@ 2024-06-05 2:14 ` Kalra, Ashish
1 sibling, 0 replies; 115+ messages in thread
From: Kalra, Ashish @ 2024-06-05 2:14 UTC (permalink / raw)
To: Dave Young
Cc: Borislav Petkov, Mike Rapoport, tglx, mingo, dave.hansen, x86,
rafael, hpa, peterz, adrian.hunter, sathyanarayanan.kuppuswamy,
jun.nakajima, rick.p.edgecombe, thomas.lendacky, michael.roth,
seanjc, kai.huang, bhe, kirill.shutemov, bdas, vkuznets,
dionnaglaze, anisinha, jroedel, ardb, kexec, linux-coco,
linux-kernel, dan.j.williams
On 6/4/2024 8:48 PM, Dave Young wrote:
> On Wed, 5 Jun 2024 at 06:36, Kalra, Ashish <ashish.kalra@amd.com> wrote:
>> Re-sending as the earlier response got line-wrapped.
>>
>> On 6/3/2024 12:12 PM, Borislav Petkov wrote:
>>> On Mon, Jun 03, 2024 at 12:08:48PM -0500, Kalra, Ashish wrote:
>>>> efi_arch_mem_reserve().
>>> Now it only remains for you to explain why...
>> Here is a detailed explanation of what is causing the EFI memory map corruption, with added debug logs and memblock debugging enabled:
>>
>> Initially at boot, efi_memblock_x86_reserve_range() does early_memremap() of the EFI memory map passed as part of setup_data, as the following logs show:
>>
>> ...
>>
>> [ 0.000000] efi: in efi_memblock_x86_reserve_range, phys map 0x27fff9110
>> [ 0.000000] memblock_reserve: [0x000000027fff9110-0x000000027fffa12f] efi_memblock_x86_reserve_range+0x168/0x2a0
>>
>> ...
>>
>> Later, efi_arch_mem_reserve() is invoked, which calls efi_memmap_alloc() which does memblock_phys_alloc() to insert new EFI memory descriptor into efi.memap:
>>
>> ...
>>
>> [ 0.733263] memblock_reserve: [0x000000027ffcaf80-0x000000027ffcbfff] memblock_alloc_range_nid+0xf1/0x1b0
>> [ 0.734787] efi: efi_arch_mem_reserve, efi phys map 0x27ffcaf80
>>
>> ...
>>
>> Finally, at the end of boot, kexec_enter_virtual_mode() is called.
>>
>> It does mapping of efi regions which were passed via setup_data.
>>
>> So it unregisters the early mem-remapped EFI memmap and installs the new EFI memory map as below:
>>
>> ( Because of efi_arch_mem_reserve() getting invoked, the new EFI memmap phys base being remapped now is the memblock allocation done in efi_arch_mem_reserve()).
>>
>> [ 4.042160] efi: efi memmap phys map 0x27ffcaf80
>>
>> So, kexec_enter_virtual_mode() does the following :
>>
>> if (efi_memmap_init_late(efi.memmap.phys_map, <- refers to the new EFI memmap phys base allocated via memblock in efi_arch_mem_reserve().
>> efi.memmap.desc_size * efi.memmap.nr_map)) { ...
>>
>> This late init, does a memremap() on this memblock-allocated memory, but then immediately frees it :
>>
>> drivers/firmware/efi/memmap.c:
>>
>> int __init __efi_memmap_init(struct efi_memory_map_data *data)
>> {
>>
>> ..
>>
>> phys_map = data->phys_map; <- refers to the new EFI memmap phys base allocated via memblock in efi_arch_mem_reserve().
>>
>> if (data->flags & EFI_MEMMAP_LATE)
>> map.map = memremap(phys_map, data->size, MEMREMAP_WB);
>> ...
>> ...
>> if (efi.memmap.flags & (EFI_MEMMAP_MEMBLOCK | EFI_MEMMAP_SLAB)) {
>> __efi_memmap_free(efi.memmap.phys_map,
>> efi.memmap.desc_size * efi.memmap.nr_map, efi.memmap.flags);
>> }
> From your debugging the memmap should not be freed.
Yes, it looks like that it should not be freed, as the new and previous efi memory map can be same.
Thanks, Ashish
> This piece of
> code was added in below commit, added Dan Williams in cc list:
> commit f0ef6523475f18ccd213e22ee593dfd131a2c5ea
> Author: Dan Williams <dan.j.williams@intel.com>
> Date: Mon Jan 13 18:22:44 2020 +0100
>
> efi: Fix efi_memmap_alloc() leaks
>
> With efi_fake_memmap() and efi_arch_mem_reserve() the efi table may be
> updated and replaced multiple times. When that happens a previous
> dynamically allocated efi memory map can be garbage collected. Use the
> new EFI_MEMMAP_{SLAB,MEMBLOCK} flags to detect when a dynamically
> allocated memory map is being replaced.
>
>
>> ...
>> map.phys_map = data->phys_map;
>>
>> ...
>>
>> efi.memmap = map;
>>
>> ...
>>
>> This happens as kexec_enter_virtual_mode() can only handle the early mapped EFI memmap and not the one which is memblock allocated by efi_arch_mem_reserve(). As seen above this memblock allocated (EFI_MEMMAP_MEMBLOCK tagged) memory gets freed.
>>
>> This is confirmed by memblock debugging:
>>
>> [ 4.044057] memblock_free_late: [0x000000027ffcaf80-0x000000027ffcbfff] __efi_memmap_free+0x66/0x80
>>
>> So while this memory is memremapped, it has also been freed and then it gets into a use-after-free condition and subsequently gets corrupted.
>>
>> This corruption is seen just before kexec-ing into the new kernel:
>>
>> ...
>> [ 11.045522] PEFILE: Unsigned PE binary^M
>> [ 11.060801] kexec-bzImage64: efi memmap phys map 0x27ffcaf80^M
>> ...
>> [ 11.061220] kexec-bzImage64: mmap entry, type = 11, va = 0xfffffffeffc00000, pa = 0xffc00000, np = 0x400, attr = 0x8000000000000001^M
>> [ 11.061225] kexec-bzImage64: mmap entry, type = 6, va = 0xfffffffeffb04000, pa = 0x7f704000, np = 0x84, attr = 0x800000000000000f^M
>> [ 11.061228] kexec-bzImage64: mmap entry, type = 4, va = 0xfffffffeff700000, pa = 0x7f100000, np = 0x300, attr = 0x0^M
>> [ 11.061231] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M <- CORRUPTION!!!
>> [ 11.061234] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061236] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061239] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061241] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061243] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061245] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061248] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061250] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061252] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061255] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061257] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061259] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061262] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061264] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061266] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061268] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061271] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061273] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061275] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061278] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061280] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061282] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061284] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061287] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061289] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061291] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061294] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061296] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061298] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061301] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061303] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061305] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061307] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061310] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061312] kexec-bzImage64: mmap entry, type = 0, va = 0x0, pa = 0x0, np = 0x0, attr = 0x0^M
>> [ 11.061314] kexec-bzImage64: mmap entry, type = 14080, va = 0x14f29, pa = 0x36c0, np = 0x0, attr = 0x0^M
>> [ 11.061317] kexec-bzImage64: mmap entry, type = 85808, va = 0x0, pa = 0x0, np = 0x72, attr = 0x14f40^M
>> [ 11.061320] kexec-bzImage64: mmap entry, type = 0, va = 0x14f4b, pa = 0x65, np = 0x1, attr = 0x0^M
>> [ 11.061323] kexec-bzImage64: mmap entry, type = 85840, va = 0x0, pa = 0x2, np = 0x69, attr = 0x14f59^M
>> [ 11.061325] kexec-bzImage64: mmap entry, type = 0, va = 0x14f65, pa = 0x6c, np = 0x0, attr = 0x0^M
>> [ 11.061328] kexec-bzImage64: mmap entry, type = 85871, va = 0x0, pa = 0x0, np = 0x7a, attr = 0x14f7f^M
>>
>>
>> ...
>>
>> This EFI phys map address 0x27ffcaf80 is being mem-remapped and also getting freed and then in use after free condition (while setting up the EFI memory map for the next kernel with kexec -s) in the above logs confirm the use-after-free case.
>>
>> Looking at the above code flow, it makes sense to skip efi_arch_mem_reserve() to fix this issue, as it anyway needs to be skipped for kexec case.
>>
>> Thanks, Ashish
>>
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-05 2:08 ` Kalra, Ashish
@ 2024-06-05 2:28 ` Dave Young
2024-06-05 11:09 ` Borislav Petkov
0 siblings, 1 reply; 115+ messages in thread
From: Dave Young @ 2024-06-05 2:28 UTC (permalink / raw)
To: Kalra, Ashish
Cc: Borislav Petkov, Mike Rapoport, tglx, mingo, dave.hansen, x86,
rafael, hpa, peterz, adrian.hunter, sathyanarayanan.kuppuswamy,
jun.nakajima, rick.p.edgecombe, thomas.lendacky, michael.roth,
seanjc, kai.huang, bhe, kirill.shutemov, bdas, vkuznets,
dionnaglaze, anisinha, jroedel, ardb, kexec, linux-coco,
linux-kernel, dan.j.williams
On Wed, 5 Jun 2024 at 10:09, Kalra, Ashish <ashish.kalra@amd.com> wrote:
>
> Hello Dave,
>
> On 6/4/2024 8:58 PM, Dave Young wrote:
> > On Wed, 5 Jun 2024 at 09:52, Dave Young <dyoung@redhat.com> wrote:
> >>>> ...
> >>>> if (efi.memmap.flags & (EFI_MEMMAP_MEMBLOCK | EFI_MEMMAP_SLAB)) {
> >>>> __efi_memmap_free(efi.memmap.phys_map,
> >>>> efi.memmap.desc_size * efi.memmap.nr_map, efi.memmap.flags);
> >>>> }
> >>> From your debugging the memmap should not be freed. This piece of
> >>> code was added in below commit, added Dan Williams in cc list:
> >>> commit f0ef6523475f18ccd213e22ee593dfd131a2c5ea
> >>> Author: Dan Williams <dan.j.williams@intel.com>
> >>> Date: Mon Jan 13 18:22:44 2020 +0100
> >>>
> >>> efi: Fix efi_memmap_alloc() leaks
> >>>
> >>> With efi_fake_memmap() and efi_arch_mem_reserve() the efi table may be
> >>> updated and replaced multiple times. When that happens a previous
> >>> dynamically allocated efi memory map can be garbage collected. Use the
> >>> new EFI_MEMMAP_{SLAB,MEMBLOCK} flags to detect when a dynamically
> >>> allocated memory map is being replaced.
> >>>
> >> Dan, probably those regions should be freed only for "fake" memmap?
> > Ashish, can you comment out the __efi_memmap_free see if it works for
> > you just confirm about the behavior.
>
> Yes, i have already tried and tested that, if i avoid __efi_memmap_free(), then i don't see this memory map corruption.
Ok, thanks! I think the right way is creating two patches, one to
remove the __efi_memmap_free, another is skip efi_arch_mem_reserve
when the EFI_MEMORY_RUNTIME bit was set already. But the first one
should be the fix for the root cause.
efi fake mem is only for debugging purposes, the "memleak" mentioned
in commit 0f96a99dab36 should be solved in another way if needed (are
they really leaked? or just not useful anymore)
Anyway this is my opinion, please wait for x86 and efi reviewer's inputs.
>
> Thanks, Ashish
>
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-04 18:02 ` Borislav Petkov
@ 2024-06-05 2:53 ` Dave Young
2024-06-05 7:42 ` Borislav Petkov
0 siblings, 1 reply; 115+ messages in thread
From: Dave Young @ 2024-06-05 2:53 UTC (permalink / raw)
To: Borislav Petkov
Cc: Mike Rapoport, Kalra, Ashish, tglx, mingo, dave.hansen, x86,
rafael, hpa, peterz, adrian.hunter, sathyanarayanan.kuppuswamy,
jun.nakajima, rick.p.edgecombe, thomas.lendacky, michael.roth,
seanjc, kai.huang, bhe, kirill.shutemov, bdas, vkuznets,
dionnaglaze, anisinha, jroedel, ardb, kexec, linux-coco,
linux-kernel
On Wed, 5 Jun 2024 at 02:03, Borislav Petkov <bp@alien8.de> wrote:
>
> On Tue, Jun 04, 2024 at 07:09:56PM +0800, Dave Young wrote:
> > Anyway there is not such a helper for all cases.
>
> But maybe there should be...
>
> This is not the first case where the need arises to be able to say:
>
> if (am I a kexeced kernel)
>
> in code.
>
> Perhaps we should have a global var kexeced or so which gets incremented
> on each kexec-ed kernel, somewhere in very early boot of the kexec-ed
> kernel we do
>
> kexeced++;
>
> and then other code can query it and know whether this is a kexec-ed
> kernel and how many times it got kexec-ed...
It's something good to have but not must for the time being, also no
idea how to save the status across boot, for EFI boot case probably a
EFI var can be used, but how can it be cleared in case of physical
boot. Otherwise probably injecting some kernel parameters, anyway
this needs more thinking.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
>
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-05 2:53 ` Dave Young
@ 2024-06-05 7:42 ` Borislav Petkov
2024-06-05 8:17 ` Ard Biesheuvel
0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2024-06-05 7:42 UTC (permalink / raw)
To: Dave Young
Cc: Mike Rapoport, Kalra, Ashish, tglx, mingo, dave.hansen, x86,
rafael, hpa, peterz, adrian.hunter, sathyanarayanan.kuppuswamy,
jun.nakajima, rick.p.edgecombe, thomas.lendacky, michael.roth,
seanjc, kai.huang, bhe, kirill.shutemov, bdas, vkuznets,
dionnaglaze, anisinha, jroedel, ardb, kexec, linux-coco,
linux-kernel
On Wed, Jun 05, 2024 at 10:53:44AM +0800, Dave Young wrote:
> It's something good to have but not must for the time being, also no
> idea how to save the status across boot, for EFI boot case probably a
> EFI var can be used;
Yes.
> but how can it be cleared in case of physical boot. Otherwise
> probably injecting some kernel parameters, anyway this needs more
> thinking.
Yeah, this'll need proper analysis whether we can even do that reliably.
We need to increment it only on the kexec reboot paths and clear it on
the normal reboot paths.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-05 7:42 ` Borislav Petkov
@ 2024-06-05 8:17 ` Ard Biesheuvel
2024-06-05 11:15 ` Borislav Petkov
0 siblings, 1 reply; 115+ messages in thread
From: Ard Biesheuvel @ 2024-06-05 8:17 UTC (permalink / raw)
To: Borislav Petkov
Cc: Dave Young, Mike Rapoport, Kalra, Ashish, tglx, mingo,
dave.hansen, x86, rafael, hpa, peterz, adrian.hunter,
sathyanarayanan.kuppuswamy, jun.nakajima, rick.p.edgecombe,
thomas.lendacky, michael.roth, seanjc, kai.huang, bhe,
kirill.shutemov, bdas, vkuznets, dionnaglaze, anisinha, jroedel,
kexec, linux-coco, linux-kernel
On Wed, 5 Jun 2024 at 09:43, Borislav Petkov <bp@alien8.de> wrote:
>
> On Wed, Jun 05, 2024 at 10:53:44AM +0800, Dave Young wrote:
> > It's something good to have but not must for the time being, also no
> > idea how to save the status across boot, for EFI boot case probably a
> > EFI var can be used;
>
> Yes.
>
> > but how can it be cleared in case of physical boot. Otherwise
> > probably injecting some kernel parameters, anyway this needs more
> > thinking.
>
> Yeah, this'll need proper analysis whether we can even do that reliably.
>
> We need to increment it only on the kexec reboot paths and clear it on
> the normal reboot paths.
>
I'd argue for the opposite: ideally, the difference between the first
boot and not-the-first-boot should be abstracted away by the
'bootloader' side of kexec as much as possible, so that the tricky
early startup code doesn't have to be riddled with different code
paths depending on !kexec vs kexec.
TDX is a good case in point here: rather than add more conditionals,
I'd urge to remove them so the TDX startup code doesn't have to care
about the difference at all. If there is anything special that needs
to be done, it belongs in the kexec implementation of the previous
kernel.
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-05 2:28 ` Dave Young
@ 2024-06-05 11:09 ` Borislav Petkov
2024-06-06 1:52 ` Dave Young
0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2024-06-05 11:09 UTC (permalink / raw)
To: Dave Young, ardb, dan.j.williams
Cc: Kalra, Ashish, Mike Rapoport, tglx, mingo, dave.hansen, x86,
rafael, hpa, peterz, adrian.hunter, sathyanarayanan.kuppuswamy,
jun.nakajima, rick.p.edgecombe, thomas.lendacky, michael.roth,
seanjc, kai.huang, bhe, kirill.shutemov, bdas, vkuznets,
dionnaglaze, anisinha, jroedel, kexec, linux-coco, linux-kernel
Moving Ard and Dan to To:
On Wed, Jun 05, 2024 at 10:28:18AM +0800, Dave Young wrote:
> Ok, thanks! I think the right way is creating two patches, one to
> remove the __efi_memmap_free,
Yap, that
f0ef6523475f ("efi: Fix efi_memmap_alloc() leaks")
needs revisiting.
So AFAIU, the flow is this:
In a kexec-ed kernel:
1. efi_arch_mem_reserve() gets called by bgrt, erst, mokvar... whatever
to hold on to boot services regions for longer otherwise EFI
"implementations" explode.
2. On same kexec-ed kernel, we call into kexec_enter_virtual_mode()
because it needs to get the runtime services regions from the first
kernel
3. As part of that call, it'll do
efi_memmap_init_late->__efi_memmap_init():
if (efi.memmap.flags & (EFI_MEMMAP_MEMBLOCK | EFI_MEMMAP_SLAB))
__efi_memmap_free(efi.memmap.phys_map,
and the memory which got allocated in step 1 is gone, thus reverting
what efi_arch_mem_reserve() is trying to fix.
IOW, we need a
EFI_MEMMAP_DO_NOT_TOUCH_MY_MEMORY
flag which'll stop this from happening. But I'd prefer it if Ard decides
what the right thing to do here is.
> another is skip efi_arch_mem_reserve when the EFI_MEMORY_RUNTIME bit
> was set already.
Can that even happen?
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-05 8:17 ` Ard Biesheuvel
@ 2024-06-05 11:15 ` Borislav Petkov
0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2024-06-05 11:15 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Dave Young, Mike Rapoport, Kalra, Ashish, tglx, mingo,
dave.hansen, x86, rafael, hpa, peterz, adrian.hunter,
sathyanarayanan.kuppuswamy, jun.nakajima, rick.p.edgecombe,
thomas.lendacky, michael.roth, seanjc, kai.huang, bhe,
kirill.shutemov, bdas, vkuznets, dionnaglaze, anisinha, jroedel,
kexec, linux-coco, linux-kernel
On Wed, Jun 05, 2024 at 10:17:22AM +0200, Ard Biesheuvel wrote:
> I'd argue for the opposite: ideally, the difference between the first
> boot and not-the-first-boot should be abstracted away by the
> 'bootloader' side of kexec as much as possible, so that the tricky
> early startup code doesn't have to be riddled with different code
> paths depending on !kexec vs kexec.
Well, off and on we end up needing to be able to ask whether the current
kernel is kexec-ed. So you need to be able to access that aspect in
kernel code - not in the bootloader. Perhaps read it from the
bootloader, sure.
But see my other mail from just now - it might end up not needing it
after all and I'd prefer if we never ever have to ask that question but
just from staring at EFI code it reminded me that we do need to ask that
question already:
if (efi_setup)
kexec_enter_virtual_mode();
else
__efi_enter_virtual_mode();
*exactly* because of EFI and that virtual_map call nonsense of allowing
it only once.
And we check efi_setup here because that works. But you can't use that
globally. And so on...
> TDX is a good case in point here: rather than add more conditionals,
> I'd urge to remove them so the TDX startup code doesn't have to care
> about the difference at all. If there is anything special that needs
> to be done, it belongs in the kexec implementation of the previous
> kernel.
Sure, but reality is not as easy sometimes.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11.1 11/19] x86/tdx: Convert shared memory back to private on kexec
2024-06-04 18:05 ` Borislav Petkov
@ 2024-06-05 12:21 ` Kirill A. Shutemov
2024-06-05 16:24 ` Borislav Petkov
0 siblings, 1 reply; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-06-05 12:21 UTC (permalink / raw)
To: Borislav Petkov
Cc: Dave Hansen, adrian.hunter, ardb, ashish.kalra, bhe, dave.hansen,
elena.reshetova, haiyangz, hpa, jun.nakajima, kai.huang, kexec,
kys, linux-acpi, linux-coco, linux-hyperv, linux-kernel, ltao,
mingo, peterz, rafael, rick.p.edgecombe,
sathyanarayanan.kuppuswamy, seanjc, tglx, thomas.lendacky, x86
On Tue, Jun 04, 2024 at 08:05:54PM +0200, Borislav Petkov wrote:
> On Tue, Jun 04, 2024 at 07:14:00PM +0300, Kirill A. Shutemov wrote:
> > /*
> > * If tdx_enc_status_changed() fails, it leaves memory
> > * in an unknown state. If the memory remains shared,
> > * it can result in an unrecoverable guest shutdown on
> > * the first accessed through a private mapping.
>
> "access"
Okay.
> So this sentence above can go too, right?
I don't think so.
> Because that comment is in tdx_kexec_finish() and we're basically going
> off to kexec. So can a guest even access it through a private mapping?
> We're shutting down so nothing is running anymore...
This kernel can't. But the next kernel can.
If a page can be accessed via private mapping is determined by the
presence in Secure EPT. This state persist across kexec.
> > * The kdump kernel boot is not impacted as it uses
> > * a pre-reserved memory range that is always private.
> > * However, gathering crash information could lead to
> > * a crash if it accesses unconverted memory through
> > * a private mapping.
>
> When does the kexec kernel even get such a private mapping? It is not
> even up yet...
Crash kernel provides access to this memory via /proc/vmcore. Crash kernel
will assume all memory there is private.
> > * pr_err() may assist in understanding such crashes.
>
> "Print error info in order to leave bread crumbs for debugging." is what
> I'd say.
Okay.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 11/19] x86/tdx: Convert shared memory back to private on kexec
2024-06-04 16:27 ` [PATCHv11 " Dave Hansen
@ 2024-06-05 12:43 ` Kirill A. Shutemov
2024-06-05 16:05 ` Dave Hansen
0 siblings, 1 reply; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-06-05 12:43 UTC (permalink / raw)
To: Dave Hansen
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
K. Y. Srinivasan, Haiyang Zhang, kexec, linux-hyperv, linux-acpi,
linux-coco, linux-kernel, Tao Liu
On Tue, Jun 04, 2024 at 09:27:59AM -0700, Dave Hansen wrote:
> On 5/28/24 02:55, Kirill A. Shutemov wrote:
> > +/* Stop new private<->shared conversions */
> > +static void tdx_kexec_begin(bool crash)
> > +{
> > + /*
> > + * Crash kernel reaches here with interrupts disabled: can't wait for
> > + * conversions to finish.
> > + *
> > + * If race happened, just report and proceed.
> > + */
> > + if (!set_memory_enc_stop_conversion(!crash))
> > + pr_warn("Failed to stop shared<->private conversions\n");
> > +}
>
> I don't like having to pass 'crash' in here.
>
> If interrupts are the problem we have ways of testing for those directly.
>
> If it's being in an oops that's a problem, we have 'oops_in_progress'
> for that.
>
> In other words, I'd much rather this function (or better yet
> set_memory_enc_stop_conversion() itself) use some existing API to change
> its behavior in a crash rather than have the context be passed down and
> twiddled through several levels of function calls.
>
> There are a ton of these in the console code:
>
> if (oops_in_progress)
> foo_trylock();
> else
> foo_lock();
>
> To me, that's a billion times more clear than a 'wait' argument that
> gets derives from who-knows-what that I have to trace through ten levels
> of function calls.
Okay fair enough. Check out the fixup below. Is it what you mean?
One other thing I realized is that these callback are dead code if kernel
compiled without kexec support. Do we want them to be wrapped with
#ifdef COFNIG_KEXEC_CORE everywhere? It is going to be ugly.
Any better ideas?
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 3d23ea0f5d45..1c5aa036b76b 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -834,7 +834,7 @@ static int tdx_enc_status_change_finish(unsigned long vaddr, int numpages,
}
/* Stop new private<->shared conversions */
-static void tdx_kexec_begin(bool crash)
+static void tdx_kexec_begin(void)
{
/*
* Crash kernel reaches here with interrupts disabled: can't wait for
@@ -842,7 +842,7 @@ static void tdx_kexec_begin(bool crash)
*
* If race happened, just report and proceed.
*/
- if (!set_memory_enc_stop_conversion(!crash))
+ if (!set_memory_enc_stop_conversion())
pr_warn("Failed to stop shared<->private conversions\n");
}
diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h
index d490db38db9e..4b2abce2e3e7 100644
--- a/arch/x86/include/asm/set_memory.h
+++ b/arch/x86/include/asm/set_memory.h
@@ -50,7 +50,7 @@ int set_memory_np(unsigned long addr, int numpages);
int set_memory_p(unsigned long addr, int numpages);
int set_memory_4k(unsigned long addr, int numpages);
-bool set_memory_enc_stop_conversion(bool wait);
+bool set_memory_enc_stop_conversion(void);
int set_memory_encrypted(unsigned long addr, int numpages);
int set_memory_decrypted(unsigned long addr, int numpages);
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index b0f313278967..213cf5379a5a 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -152,8 +152,6 @@ struct x86_init_acpi {
* @enc_kexec_begin Begin the two-step process of converting shared memory back
* to private. It stops the new conversions from being started
* and waits in-flight conversions to finish, if possible.
- * The @crash parameter denotes whether the function is being
- * called in the crash shutdown path.
* @enc_kexec_finish Finish the two-step process of converting shared memory to
* private. All memory is private after the call when
* the function returns.
@@ -165,7 +163,7 @@ struct x86_guest {
int (*enc_status_change_finish)(unsigned long vaddr, int npages, bool enc);
bool (*enc_tlb_flush_required)(bool enc);
bool (*enc_cache_flush_required)(void);
- void (*enc_kexec_begin)(bool crash);
+ void (*enc_kexec_begin)(void);
void (*enc_kexec_finish)(void);
};
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index fc52ea80cdc8..340af8155658 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -137,7 +137,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
* down and interrupts have been disabled. This allows the callback to
* detect a race with the conversion and report it.
*/
- x86_platform.guest.enc_kexec_begin(true);
+ x86_platform.guest.enc_kexec_begin();
x86_platform.guest.enc_kexec_finish();
crash_save_cpu(regs, safe_smp_processor_id());
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 513809b5b27c..0e0a4cf6b5eb 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -723,7 +723,7 @@ void native_machine_shutdown(void)
* conversions to finish cleanly.
*/
if (kexec_in_progress)
- x86_platform.guest.enc_kexec_begin(false);
+ x86_platform.guest.enc_kexec_begin();
/* Stop the cpus and apics */
#ifdef CONFIG_X86_IO_APIC
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index 8a79fb505303..82b128d3f309 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -138,7 +138,7 @@ static int enc_status_change_prepare_noop(unsigned long vaddr, int npages, bool
static int enc_status_change_finish_noop(unsigned long vaddr, int npages, bool enc) { return 0; }
static bool enc_tlb_flush_required_noop(bool enc) { return false; }
static bool enc_cache_flush_required_noop(void) { return false; }
-static void enc_kexec_begin_noop(bool crash) {}
+static void enc_kexec_begin_noop(void) {}
static void enc_kexec_finish_noop(void) {}
static bool is_private_mmio_noop(u64 addr) {return false; }
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 2a548b65ef5f..443a97e515c0 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2240,13 +2240,14 @@ static DECLARE_RWSEM(mem_enc_lock);
*
* Taking the exclusive mem_enc_lock waits for in-flight conversions to complete.
* The lock is not released to prevent new conversions from being started.
- *
- * If sleep is not allowed, as in a crash scenario, try to take the lock.
- * Failure indicates that there is a race with the conversion.
*/
-bool set_memory_enc_stop_conversion(bool wait)
+bool set_memory_enc_stop_conversion(void)
{
- if (!wait)
+ /*
+ * In a crash scenario, sleep is not allowed. Try to take the lock.
+ * Failure indicates that there is a race with the conversion.
+ */
+ if (oops_in_progress)
return down_write_trylock(&mem_enc_lock);
down_write(&mem_enc_lock);
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply related [flat|nested] 115+ messages in thread
* Re: [PATCHv11 11/19] x86/tdx: Convert shared memory back to private on kexec
2024-06-05 12:43 ` Kirill A. Shutemov
@ 2024-06-05 16:05 ` Dave Hansen
0 siblings, 0 replies; 115+ messages in thread
From: Dave Hansen @ 2024-06-05 16:05 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, H. Peter Anvin,
K. Y. Srinivasan, Haiyang Zhang, kexec, linux-hyperv, linux-acpi,
linux-coco, linux-kernel, Tao Liu
On 6/5/24 05:43, Kirill A. Shutemov wrote:
> Okay fair enough. Check out the fixup below. Is it what you mean?
Yes. Much better.
> One other thing I realized is that these callback are dead code if kernel
> compiled without kexec support. Do we want them to be wrapped with
> #ifdef COFNIG_KEXEC_CORE everywhere? It is going to be ugly.
>
> Any better ideas?
The other callbacks don't have #ifdefs either and they're dependent on
memory encryption as far as I can tell.
I think a simple:
if (IS_ENABLED(COFNIG_KEXEC_CORE))
return;
in the top of the callbacks will result in a tiny little stub function
when kexec is disabled. So the bloat will be limited to kernels that
have TDX compiled in but kexec compiled out (probably never). The bloat
will be two callback pointer, one tiny stub function, and a quick
call/return in a slow path.
I think that probably ends up being a few dozen bytes of bloat in kernel
text for a "probably never" config.
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11.1 11/19] x86/tdx: Convert shared memory back to private on kexec
2024-06-05 12:21 ` Kirill A. Shutemov
@ 2024-06-05 16:24 ` Borislav Petkov
2024-06-06 12:39 ` Kirill A. Shutemov
0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2024-06-05 16:24 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Dave Hansen, adrian.hunter, ardb, ashish.kalra, bhe, dave.hansen,
elena.reshetova, haiyangz, hpa, jun.nakajima, kai.huang, kexec,
kys, linux-acpi, linux-coco, linux-hyperv, linux-kernel, ltao,
mingo, peterz, rafael, rick.p.edgecombe,
sathyanarayanan.kuppuswamy, seanjc, tglx, thomas.lendacky, x86
On Wed, Jun 05, 2024 at 03:21:42PM +0300, Kirill A. Shutemov wrote:
> If a page can be accessed via private mapping is determined by the
> presence in Secure EPT. This state persist across kexec.
I just love it how I tickle out details each time I touch this comment
because we three can't write a single concise and self-contained
explanation. :-(
Ok, next version:
"Private mappings persist across kexec. If tdx_enc_status_changed() fails
in the first kernel, it leaves memory in an unknown state.
If that memory remains shared, accessing it in the *next* kernel through
a private mapping will result in an unrecoverable guest shutdown.
The kdump kernel boot is not impacted as it uses a pre-reserved memory
range that is always private. However, gathering crash information
could lead to a crash if it accesses unconverted memory through
a private mapping which is possible when accessing that memory through
/proc/vmcore, for example.
In all cases, print error info in order to leave enough bread crumbs for
debugging."
I think this is getting in the right direction as it actually makes
sense now.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 2/3] x86/boot/compressed: Skip Video Memory access in Decompressor for SEV-ES/SNP.
2024-05-30 23:37 ` [PATCH v7 2/3] x86/boot/compressed: Skip Video Memory access in Decompressor for SEV-ES/SNP Ashish Kalra
@ 2024-06-05 20:14 ` Borislav Petkov
0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2024-06-05 20:14 UTC (permalink / raw)
To: Ashish Kalra
Cc: tglx, mingo, dave.hansen, x86, rafael, hpa, peterz, adrian.hunter,
sathyanarayanan.kuppuswamy, jun.nakajima, rick.p.edgecombe,
thomas.lendacky, michael.roth, seanjc, kai.huang, bhe,
kirill.shutemov, bdas, vkuznets, dionnaglaze, anisinha, jroedel,
ardb, kexec, linux-coco, linux-kernel
On Thu, May 30, 2024 at 11:37:14PM +0000, Ashish Kalra wrote:
> - lines = boot_params_ptr->screen_info.orig_video_lines;
> - cols = boot_params_ptr->screen_info.orig_video_cols;
> + if (!(sev_status & MSR_AMD64_SEV_ES_ENABLED)) {
> + lines = boot_params_ptr->screen_info.orig_video_lines;
> + cols = boot_params_ptr->screen_info.orig_video_cols;
> + }
By now I get an allergic reaction from this sprinkling of "if sev..."
everywhere in the code.
> init_default_io_ops();
<--- right here there's a call to
early_tdx_detect();
You can add a early_sev_detect() counterpart here and clear lines and
cols in it along with an explanation why it is being done.
This is at least a bit cleaner than this.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec
2024-06-05 11:09 ` Borislav Petkov
@ 2024-06-06 1:52 ` Dave Young
0 siblings, 0 replies; 115+ messages in thread
From: Dave Young @ 2024-06-06 1:52 UTC (permalink / raw)
To: Borislav Petkov
Cc: ardb, dan.j.williams, Kalra, Ashish, Mike Rapoport, tglx, mingo,
dave.hansen, x86, rafael, hpa, peterz, adrian.hunter,
sathyanarayanan.kuppuswamy, jun.nakajima, rick.p.edgecombe,
thomas.lendacky, michael.roth, seanjc, kai.huang, bhe,
kirill.shutemov, bdas, vkuznets, dionnaglaze, anisinha, jroedel,
kexec, linux-coco, linux-kernel
On Wed, 5 Jun 2024 at 19:09, Borislav Petkov <bp@alien8.de> wrote:
>
> Moving Ard and Dan to To:
>
> On Wed, Jun 05, 2024 at 10:28:18AM +0800, Dave Young wrote:
> > Ok, thanks! I think the right way is creating two patches, one to
> > remove the __efi_memmap_free,
>
> Yap, that
>
> f0ef6523475f ("efi: Fix efi_memmap_alloc() leaks")
>
> needs revisiting.
>
> So AFAIU, the flow is this:
>
> In a kexec-ed kernel:
>
> 1. efi_arch_mem_reserve() gets called by bgrt, erst, mokvar... whatever
> to hold on to boot services regions for longer otherwise EFI
> "implementations" explode.
>
> 2. On same kexec-ed kernel, we call into kexec_enter_virtual_mode()
> because it needs to get the runtime services regions from the first
> kernel
>
> 3. As part of that call, it'll do
> efi_memmap_init_late->__efi_memmap_init():
>
> if (efi.memmap.flags & (EFI_MEMMAP_MEMBLOCK | EFI_MEMMAP_SLAB))
> __efi_memmap_free(efi.memmap.phys_map,
>
> and the memory which got allocated in step 1 is gone, thus reverting
> what efi_arch_mem_reserve() is trying to fix.
>
> IOW, we need a
>
> EFI_MEMMAP_DO_NOT_TOUCH_MY_MEMORY
>
> flag which'll stop this from happening. But I'd prefer it if Ard decides
> what the right thing to do here is.
>
> > another is skip efi_arch_mem_reserve when the EFI_MEMORY_RUNTIME bit
> > was set already.
>
> Can that even happen?
Yes, let's say we have two different cases both go through
drivers/firmware/efi/efi-bgrt.c -> efi_mem_reserve ->
efi_arch_mem_reserve
1. normal boot (non kexec-ed)
The bgrt region is reserved and mark as EFI_MEMORY_RUNTIME with a
new efi mem range which is inserted in the memmap, later kexec will
carry over to 2nd kernel (drop those boot service areas without
EFI_MEMORY_RUNTIME)
2. kexec-ed boot
In the same call path, the previous kernel saved bgrt region has
already set EFI_MEMORY_RUNTIME, but it is re-reserved with a new mem
entry in memmap, this is not necessary and duplicate. I did not
check the efi boot code if it will de-duplicate the memmap later, but
anyway this is useless and it should be skipped.
Thanks
Dave
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11.1 11/19] x86/tdx: Convert shared memory back to private on kexec
2024-06-05 16:24 ` Borislav Petkov
@ 2024-06-06 12:39 ` Kirill A. Shutemov
0 siblings, 0 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-06-06 12:39 UTC (permalink / raw)
To: Borislav Petkov
Cc: Dave Hansen, adrian.hunter, ardb, ashish.kalra, bhe, dave.hansen,
elena.reshetova, haiyangz, hpa, jun.nakajima, kai.huang, kexec,
kys, linux-acpi, linux-coco, linux-hyperv, linux-kernel, ltao,
mingo, peterz, rafael, rick.p.edgecombe,
sathyanarayanan.kuppuswamy, seanjc, tglx, thomas.lendacky, x86
On Wed, Jun 05, 2024 at 06:24:19PM +0200, Borislav Petkov wrote:
> On Wed, Jun 05, 2024 at 03:21:42PM +0300, Kirill A. Shutemov wrote:
> > If a page can be accessed via private mapping is determined by the
> > presence in Secure EPT. This state persist across kexec.
>
> I just love it how I tickle out details each time I touch this comment
> because we three can't write a single concise and self-contained
> explanation. :-(
>
> Ok, next version:
>
> "Private mappings persist across kexec. If tdx_enc_status_changed() fails
s/Private mappings persist /Memory encryption state persists /
> in the first kernel, it leaves memory in an unknown state.
>
> If that memory remains shared, accessing it in the *next* kernel through
> a private mapping will result in an unrecoverable guest shutdown.
>
> The kdump kernel boot is not impacted as it uses a pre-reserved memory
> range that is always private. However, gathering crash information
> could lead to a crash if it accesses unconverted memory through
> a private mapping which is possible when accessing that memory through
> /proc/vmcore, for example.
>
> In all cases, print error info in order to leave enough bread crumbs for
> debugging."
>
> I think this is getting in the right direction as it actually makes
> sense now.
Otherwise looks good to me.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 18/19] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method
2024-06-03 8:39 ` Borislav Petkov
@ 2024-06-07 15:14 ` Kirill A. Shutemov
2024-06-10 13:40 ` Borislav Petkov
0 siblings, 1 reply; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-06-07 15:14 UTC (permalink / raw)
To: Borislav Petkov
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, Rafael J. Wysocki,
Peter Zijlstra, Adrian Hunter, Kuppuswamy Sathyanarayanan,
Elena Reshetova, Jun Nakajima, Rick Edgecombe, Tom Lendacky,
Kalra, Ashish, Sean Christopherson, Huang, Kai, Ard Biesheuvel,
Baoquan He, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
kexec, linux-hyperv, linux-acpi, linux-coco, linux-kernel,
Tao Liu
On Mon, Jun 03, 2024 at 10:39:30AM +0200, Borislav Petkov wrote:
> > +/*
> > + * Make sure asm_acpi_mp_play_dead() is present in the identity mapping at
> > + * the same place as in the kernel page tables. asm_acpi_mp_play_dead() switches
> > + * to the identity mapping and the function has be present at the same spot in
> > + * the virtual address space before and after switching page tables.
> > + */
> > +static int __init init_transition_pgtable(pgd_t *pgd)
>
> This looks like a generic helper which should be in set_memory.c. And
> looking at that file, there's populate_pgd() which does pretty much the
> same thing, if I squint real hard.
>
> Let's tone down the duplication.
Okay, there is a function called kernel_map_pages_in_pgd() in set_memory.c
that does what we need here.
I tried to use it, but encountered a few issues:
- The code in set_memory.c allocates memory using the buddy allocator,
which is not yet ready. We can work around this limitation by delaying
the initialization of offlining until later, using a separate
early_initcall();
- I noticed a complaint that the allocation is being done from an atomic
context: a spinlock called cpa_lock is taken when populate_pgd()
allocates memory.
I am not sure why this was not noticed before. kernel_map_pages_in_pgd()
has only been used in EFI mapping initialization so far, so maybe it is
somehow special, I don't know.
I was able to address this issue by switching cpa_lock to a mutex.
However, this solution will only work if the callers for set_memory
interfaces are not called from an atomic context. I need to verify if
this is the case.
- The function __flush_tlb_all() in kernel_(un)map_pages_in_pgd() must be
called with preemption disabled. Once again, I am unsure why this has
not caused issues in the EFI case.
- I discovered a bug in kernel_ident_mapping_free() when it is used on a
machine with 5-level paging. I will submit a proper patch to fix this
issue.
The fixup is below.
Any comments?
diff --git a/arch/x86/kernel/acpi/madt_wakeup.c b/arch/x86/kernel/acpi/madt_wakeup.c
index 6cfe762be28b..fbbfe78f7f27 100644
--- a/arch/x86/kernel/acpi/madt_wakeup.c
+++ b/arch/x86/kernel/acpi/madt_wakeup.c
@@ -59,82 +59,55 @@ static void acpi_mp_cpu_die(unsigned int cpu)
pr_err("Failed to hand over CPU %d to BIOS\n", cpu);
}
+static void acpi_mp_disable_offlining(struct acpi_madt_multiproc_wakeup *mp_wake)
+{
+ cpu_hotplug_disable_offlining();
+
+ /*
+ * ACPI MADT doesn't allow to offline a CPU after it was onlined. This
+ * limits kexec: the second kernel won't be able to use more than one CPU.
+ *
+ * To prevent a kexec kernel from onlining secondary CPUs invalidate the
+ * mailbox address in the ACPI MADT wakeup structure which prevents a
+ * kexec kernel to use it.
+ *
+ * This is safe as the booting kernel has the mailbox address cached
+ * already and acpi_wakeup_cpu() uses the cached value to bring up the
+ * secondary CPUs.
+ *
+ * Note: This is a Linux specific convention and not covered by the
+ * ACPI specification.
+ */
+ mp_wake->mailbox_address = 0;
+}
+
/* The argument is required to match type of x86_mapping_info::alloc_pgt_page */
static void __init *alloc_pgt_page(void *dummy)
{
- return memblock_alloc(PAGE_SIZE, PAGE_SIZE);
+ return (void *)get_zeroed_page(GFP_KERNEL);
}
static void __init free_pgt_page(void *pgt, void *dummy)
{
- return memblock_free(pgt, PAGE_SIZE);
+ return free_page((unsigned long)pgt);
}
-/*
- * Make sure asm_acpi_mp_play_dead() is present in the identity mapping at
- * the same place as in the kernel page tables. asm_acpi_mp_play_dead() switches
- * to the identity mapping and the function has be present at the same spot in
- * the virtual address space before and after switching page tables.
- */
-static int __init init_transition_pgtable(pgd_t *pgd)
-{
- pgprot_t prot = PAGE_KERNEL_EXEC_NOENC;
- unsigned long vaddr, paddr;
- p4d_t *p4d;
- pud_t *pud;
- pmd_t *pmd;
- pte_t *pte;
-
- vaddr = (unsigned long)asm_acpi_mp_play_dead;
- pgd += pgd_index(vaddr);
- if (!pgd_present(*pgd)) {
- p4d = (p4d_t *)alloc_pgt_page(NULL);
- if (!p4d)
- return -ENOMEM;
- set_pgd(pgd, __pgd(__pa(p4d) | _KERNPG_TABLE));
- }
- p4d = p4d_offset(pgd, vaddr);
- if (!p4d_present(*p4d)) {
- pud = (pud_t *)alloc_pgt_page(NULL);
- if (!pud)
- return -ENOMEM;
- set_p4d(p4d, __p4d(__pa(pud) | _KERNPG_TABLE));
- }
- pud = pud_offset(p4d, vaddr);
- if (!pud_present(*pud)) {
- pmd = (pmd_t *)alloc_pgt_page(NULL);
- if (!pmd)
- return -ENOMEM;
- set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE));
- }
- pmd = pmd_offset(pud, vaddr);
- if (!pmd_present(*pmd)) {
- pte = (pte_t *)alloc_pgt_page(NULL);
- if (!pte)
- return -ENOMEM;
- set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE));
- }
- pte = pte_offset_kernel(pmd, vaddr);
-
- paddr = __pa(vaddr);
- set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, prot));
-
- return 0;
-}
-
-static int __init acpi_mp_setup_reset(u64 reset_vector)
+static int __init acpi_mp_setup_reset(union acpi_subtable_headers *header,
+ const unsigned long end)
{
+ struct acpi_madt_multiproc_wakeup *mp_wake;
struct x86_mapping_info info = {
.alloc_pgt_page = alloc_pgt_page,
.free_pgt_page = free_pgt_page,
.page_flag = __PAGE_KERNEL_LARGE_EXEC,
- .kernpg_flag = _KERNPG_TABLE_NOENC,
+ .kernpg_flag = _KERNPG_TABLE,
};
+ unsigned long vaddr, pfn;
pgd_t *pgd;
pgd = alloc_pgt_page(NULL);
if (!pgd)
- return -ENOMEM;
+ goto err;
for (int i = 0; i < nr_pfn_mapped; i++) {
unsigned long mstart, mend;
@@ -143,30 +116,45 @@ static int __init acpi_mp_setup_reset(u64 reset_vector)
mend = pfn_mapped[i].end << PAGE_SHIFT;
if (kernel_ident_mapping_init(&info, pgd, mstart, mend)) {
kernel_ident_mapping_free(&info, pgd);
- return -ENOMEM;
+ goto err;
}
}
if (kernel_ident_mapping_init(&info, pgd,
- PAGE_ALIGN_DOWN(reset_vector),
- PAGE_ALIGN(reset_vector + 1))) {
+ PAGE_ALIGN_DOWN(acpi_mp_reset_vector_paddr),
+ PAGE_ALIGN(acpi_mp_reset_vector_paddr + 1))) {
kernel_ident_mapping_free(&info, pgd);
- return -ENOMEM;
+ goto err;
}
- if (init_transition_pgtable(pgd)) {
+ /*
+ * Make sure asm_acpi_mp_play_dead() is present in the identity mapping
+ * at the same place as in the kernel page tables.
+ *
+ * asm_acpi_mp_play_dead() switches to the identity mapping and the
+ * function has be present at the same spot in the virtual address space
+ * before and after switching page tables.
+ */
+ vaddr = (unsigned long)asm_acpi_mp_play_dead;
+ pfn = __pa(vaddr) >> PAGE_SHIFT;
+ if (kernel_map_pages_in_pgd(pgd, pfn, vaddr, 1, _KERNPG_TABLE)) {
kernel_ident_mapping_free(&info, pgd);
- return -ENOMEM;
+ goto err;
}
smp_ops.play_dead = acpi_mp_play_dead;
smp_ops.stop_this_cpu = acpi_mp_stop_this_cpu;
smp_ops.cpu_die = acpi_mp_cpu_die;
- acpi_mp_reset_vector_paddr = reset_vector;
acpi_mp_pgd = __pa(pgd);
return 0;
+err:
+ pr_warn("Failed to setup MADT reset vector\n");
+ mp_wake = (struct acpi_madt_multiproc_wakeup *)header;
+ acpi_mp_disable_offlining(mp_wake);
+ return -ENOMEM;
+
}
static int acpi_wakeup_cpu(u32 apicid, unsigned long start_ip)
@@ -226,28 +214,6 @@ static int acpi_wakeup_cpu(u32 apicid, unsigned long start_ip)
return 0;
}
-static void acpi_mp_disable_offlining(struct acpi_madt_multiproc_wakeup *mp_wake)
-{
- cpu_hotplug_disable_offlining();
-
- /*
- * ACPI MADT doesn't allow to offline a CPU after it was onlined. This
- * limits kexec: the second kernel won't be able to use more than one CPU.
- *
- * To prevent a kexec kernel from onlining secondary CPUs invalidate the
- * mailbox address in the ACPI MADT wakeup structure which prevents a
- * kexec kernel to use it.
- *
- * This is safe as the booting kernel has the mailbox address cached
- * already and acpi_wakeup_cpu() uses the cached value to bring up the
- * secondary CPUs.
- *
- * Note: This is a Linux specific convention and not covered by the
- * ACPI specification.
- */
- mp_wake->mailbox_address = 0;
-}
-
int __init acpi_parse_mp_wake(union acpi_subtable_headers *header,
const unsigned long end)
{
@@ -274,10 +240,7 @@ int __init acpi_parse_mp_wake(union acpi_subtable_headers *header,
if (mp_wake->version >= ACPI_MADT_MP_WAKEUP_VERSION_V1 &&
mp_wake->header.length >= ACPI_MADT_MP_WAKEUP_SIZE_V1) {
- if (acpi_mp_setup_reset(mp_wake->reset_vector)) {
- pr_warn("Failed to setup MADT reset vector\n");
- acpi_mp_disable_offlining(mp_wake);
- }
+ acpi_mp_reset_vector_paddr = mp_wake->reset_vector;
} else {
/*
* CPU offlining requires version 1 of the ACPI MADT wakeup
@@ -290,3 +253,13 @@ int __init acpi_parse_mp_wake(union acpi_subtable_headers *header,
return 0;
}
+
+static int __init acpi_mp_offline_init(void)
+{
+ if (!acpi_mp_reset_vector_paddr)
+ return 0;
+
+ return acpi_table_parse_madt(ACPI_MADT_TYPE_MULTIPROC_WAKEUP,
+ acpi_mp_setup_reset, 1);
+}
+early_initcall(acpi_mp_offline_init);
diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c
index 3996af7b4abf..c45127265f2f 100644
--- a/arch/x86/mm/ident_map.c
+++ b/arch/x86/mm/ident_map.c
@@ -60,7 +60,7 @@ static void free_p4d(struct x86_mapping_info *info, pgd_t *pgd)
}
if (pgtable_l5_enabled())
- info->free_pgt_page(pgd, info->context);
+ info->free_pgt_page(p4d, info->context);
}
void kernel_ident_mapping_free(struct x86_mapping_info *info, pgd_t *pgd)
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 443a97e515c0..72715674f492 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -69,7 +69,7 @@ static const int cpa_warn_level = CPA_PROTECT;
* entries change the page attribute in parallel to some other cpu
* splitting a large page entry along with changing the attribute.
*/
-static DEFINE_SPINLOCK(cpa_lock);
+static DEFINE_MUTEX(cpa_lock);
#define CPA_FLUSHTLB 1
#define CPA_ARRAY 2
@@ -1186,10 +1186,10 @@ static int split_large_page(struct cpa_data *cpa, pte_t *kpte,
struct page *base;
if (!debug_pagealloc_enabled())
- spin_unlock(&cpa_lock);
+ mutex_unlock(&cpa_lock);
base = alloc_pages(GFP_KERNEL, 0);
if (!debug_pagealloc_enabled())
- spin_lock(&cpa_lock);
+ mutex_lock(&cpa_lock);
if (!base)
return -ENOMEM;
@@ -1804,10 +1804,10 @@ static int __change_page_attr_set_clr(struct cpa_data *cpa, int primary)
cpa->numpages = 1;
if (!debug_pagealloc_enabled())
- spin_lock(&cpa_lock);
+ mutex_lock(&cpa_lock);
ret = __change_page_attr(cpa, primary);
if (!debug_pagealloc_enabled())
- spin_unlock(&cpa_lock);
+ mutex_unlock(&cpa_lock);
if (ret)
goto out;
@@ -2516,7 +2516,9 @@ int __init kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address,
cpa.mask_set = __pgprot(_PAGE_PRESENT | page_flags);
retval = __change_page_attr_set_clr(&cpa, 1);
+ preempt_disable();
__flush_tlb_all();
+ preempt_enable();
out:
return retval;
@@ -2551,7 +2553,9 @@ int __init kernel_unmap_pages_in_pgd(pgd_t *pgd, unsigned long address,
WARN_ONCE(num_online_cpus() > 1, "Don't call after initializing SMP");
retval = __change_page_attr_set_clr(&cpa, 1);
+ preempt_disable();
__flush_tlb_all();
+ preempt_enable();
return retval;
}
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply related [flat|nested] 115+ messages in thread
* Re: [PATCHv11 18/19] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method
2024-06-07 15:14 ` Kirill A. Shutemov
@ 2024-06-10 13:40 ` Borislav Petkov
2024-06-10 14:01 ` Kirill A. Shutemov
0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2024-06-10 13:40 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, Rafael J. Wysocki,
Peter Zijlstra, Adrian Hunter, Kuppuswamy Sathyanarayanan,
Elena Reshetova, Jun Nakajima, Rick Edgecombe, Tom Lendacky,
Kalra, Ashish, Sean Christopherson, Huang, Kai, Ard Biesheuvel,
Baoquan He, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
kexec, linux-hyperv, linux-acpi, linux-coco, linux-kernel,
Tao Liu
On Fri, Jun 07, 2024 at 06:14:28PM +0300, Kirill A. Shutemov wrote:
> I was able to address this issue by switching cpa_lock to a mutex.
> However, this solution will only work if the callers for set_memory
> interfaces are not called from an atomic context. I need to verify if
> this is the case.
Dunno, I'd be nervous about this. Althouth from looking at
ad5ca55f6bdb ("x86, cpa: srlz cpa(), global flush tlb after splitting big page and before doing cpa")
I don't see how "So that we don't allow any other cpu" can't be done
with a mutex. Perhaps the set_memory* interfaces should be usable in as
many contexts as possible.
Have you run this with lockdep enabled?
> - The function __flush_tlb_all() in kernel_(un)map_pages_in_pgd() must be
> called with preemption disabled. Once again, I am unsure why this has
> not caused issues in the EFI case.
It could be because EFI does all that setup on the BSP only before the
others have arrived but I don't remember anymore... It is more than
a decade ago when I did this...
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 18/19] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method
2024-06-10 13:40 ` Borislav Petkov
@ 2024-06-10 14:01 ` Kirill A. Shutemov
2024-06-11 15:47 ` Kirill A. Shutemov
0 siblings, 1 reply; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-06-10 14:01 UTC (permalink / raw)
To: Borislav Petkov
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, Rafael J. Wysocki,
Peter Zijlstra, Adrian Hunter, Kuppuswamy Sathyanarayanan,
Elena Reshetova, Jun Nakajima, Rick Edgecombe, Tom Lendacky,
Kalra, Ashish, Sean Christopherson, Huang, Kai, Ard Biesheuvel,
Baoquan He, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
kexec, linux-hyperv, linux-acpi, linux-coco, linux-kernel,
Tao Liu
On Mon, Jun 10, 2024 at 03:40:20PM +0200, Borislav Petkov wrote:
> On Fri, Jun 07, 2024 at 06:14:28PM +0300, Kirill A. Shutemov wrote:
> > I was able to address this issue by switching cpa_lock to a mutex.
> > However, this solution will only work if the callers for set_memory
> > interfaces are not called from an atomic context. I need to verify if
> > this is the case.
>
> Dunno, I'd be nervous about this. Althouth from looking at
>
> ad5ca55f6bdb ("x86, cpa: srlz cpa(), global flush tlb after splitting big page and before doing cpa")
>
> I don't see how "So that we don't allow any other cpu" can't be done
> with a mutex. Perhaps the set_memory* interfaces should be usable in as
> many contexts as possible.
>
> Have you run this with lockdep enabled?
Yes, it booted to the shell just fine. However, that doesn't prove
anything. The set_memory_* function has many obscured cases.
> > - The function __flush_tlb_all() in kernel_(un)map_pages_in_pgd() must be
> > called with preemption disabled. Once again, I am unsure why this has
> > not caused issues in the EFI case.
>
> It could be because EFI does all that setup on the BSP only before the
> others have arrived but I don't remember anymore... It is more than
> a decade ago when I did this...
Are you okay with this? Disabling preemption looks strange, but I don't
see a better option.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 18/19] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method
2024-06-10 14:01 ` Kirill A. Shutemov
@ 2024-06-11 15:47 ` Kirill A. Shutemov
2024-06-11 19:46 ` Borislav Petkov
0 siblings, 1 reply; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-06-11 15:47 UTC (permalink / raw)
To: Borislav Petkov
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, Rafael J. Wysocki,
Peter Zijlstra, Adrian Hunter, Kuppuswamy Sathyanarayanan,
Elena Reshetova, Jun Nakajima, Rick Edgecombe, Tom Lendacky,
Kalra, Ashish, Sean Christopherson, Huang, Kai, Ard Biesheuvel,
Baoquan He, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
kexec, linux-hyperv, linux-acpi, linux-coco, linux-kernel,
Tao Liu
On Mon, Jun 10, 2024 at 05:01:55PM +0300, Kirill A. Shutemov wrote:
> On Mon, Jun 10, 2024 at 03:40:20PM +0200, Borislav Petkov wrote:
> > On Fri, Jun 07, 2024 at 06:14:28PM +0300, Kirill A. Shutemov wrote:
> > > I was able to address this issue by switching cpa_lock to a mutex.
> > > However, this solution will only work if the callers for set_memory
> > > interfaces are not called from an atomic context. I need to verify if
> > > this is the case.
> >
> > Dunno, I'd be nervous about this. Althouth from looking at
> >
> > ad5ca55f6bdb ("x86, cpa: srlz cpa(), global flush tlb after splitting big page and before doing cpa")
> >
> > I don't see how "So that we don't allow any other cpu" can't be done
> > with a mutex. Perhaps the set_memory* interfaces should be usable in as
> > many contexts as possible.
> >
> > Have you run this with lockdep enabled?
>
> Yes, it booted to the shell just fine. However, that doesn't prove
> anything. The set_memory_* function has many obscured cases.
>
> > > - The function __flush_tlb_all() in kernel_(un)map_pages_in_pgd() must be
> > > called with preemption disabled. Once again, I am unsure why this has
> > > not caused issues in the EFI case.
> >
> > It could be because EFI does all that setup on the BSP only before the
> > others have arrived but I don't remember anymore... It is more than
> > a decade ago when I did this...
>
> Are you okay with this? Disabling preemption looks strange, but I don't
> see a better option.
Borislav, given this code deduplication effort is not trivial, maybe we
can do it as a separate patchset on top of this one?
I also wounder if it makes sense to combine ident_map.c and set_memory.c.
There's some overlap between the two.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion
2024-06-04 15:21 ` Kirill A. Shutemov
2024-06-04 17:57 ` Borislav Petkov
@ 2024-06-11 18:26 ` H. Peter Anvin
2024-06-12 9:22 ` Kirill A. Shutemov
1 sibling, 1 reply; 115+ messages in thread
From: H. Peter Anvin @ 2024-06-11 18:26 UTC (permalink / raw)
To: Kirill A. Shutemov, Borislav Petkov
Cc: Nikolay Borisov, Thomas Gleixner, Ingo Molnar, Dave Hansen, x86,
Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, K. Y. Srinivasan,
Haiyang Zhang, kexec, linux-hyperv, linux-acpi, linux-coco,
linux-kernel
On 6/4/24 08:21, Kirill A. Shutemov wrote:
>
> From b45fe48092abad2612c2bafbb199e4de80c99545 Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Date: Fri, 10 Feb 2023 12:53:11 +0300
> Subject: [PATCHv11.1 06/19] x86/kexec: Keep CR4.MCE set during kexec for TDX guest
>
> TDX guests run with MCA enabled (CR4.MCE=1b) from the very start. If
> that bit is cleared during CR4 register reprogramming during boot or
> kexec flows, a #VE exception will be raised which the guest kernel
> cannot handle it.
>
> Therefore, make sure the CR4.MCE setting is preserved over kexec too and
> avoid raising any #VEs.
>
> The change doesn't affect non-TDX-guest environments.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
> arch/x86/kernel/relocate_kernel_64.S | 17 ++++++++++-------
> 1 file changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
> index 085eef5c3904..9c2cf70c5f54 100644
> --- a/arch/x86/kernel/relocate_kernel_64.S
> +++ b/arch/x86/kernel/relocate_kernel_64.S
> @@ -5,6 +5,8 @@
> */
>
> #include <linux/linkage.h>
> +#include <linux/stringify.h>
> +#include <asm/alternative.h>
> #include <asm/page_types.h>
> #include <asm/kexec.h>
> #include <asm/processor-flags.h>
> @@ -145,14 +147,15 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
> * Set cr4 to a known state:
> * - physical address extension enabled
> * - 5-level paging, if it was enabled before
> + * - Machine check exception on TDX guest, if it was enabled before.
> + * Clearing MCE might not be allowed in TDX guests, depending on setup.
> + *
> + * Use R13 that contains the original CR4 value, read in relocate_kernel().
> + * PAE is always set in the original CR4.
> */
> - movl $X86_CR4_PAE, %eax
> - testq $X86_CR4_LA57, %r13
> - jz .Lno_la57
> - orl $X86_CR4_LA57, %eax
> -.Lno_la57:
> -
> - movq %rax, %cr4
> + andl $(X86_CR4_PAE | X86_CR4_LA57), %r13d
> + ALTERNATIVE "", __stringify(orl $X86_CR4_MCE, %r13d), X86_FEATURE_TDX_GUEST
> + movq %r13, %cr4
>
If this is the case, I don't really see a reason to clear MCE per se as
I'm guessing a machine check here will be fatal anyway? It just changes
the method of death.
Also, is there a reason to save %cr4, run code, and *then* clear the
relevant bits? Wouldn't it be better to sanitize %cr4 as soon as possible?
-hpa
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 18/19] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method
2024-06-11 15:47 ` Kirill A. Shutemov
@ 2024-06-11 19:46 ` Borislav Petkov
2024-06-12 9:24 ` Kirill A. Shutemov
0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2024-06-11 19:46 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, Rafael J. Wysocki,
Peter Zijlstra, Adrian Hunter, Kuppuswamy Sathyanarayanan,
Elena Reshetova, Jun Nakajima, Rick Edgecombe, Tom Lendacky,
Kalra, Ashish, Sean Christopherson, Huang, Kai, Ard Biesheuvel,
Baoquan He, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
kexec, linux-hyperv, linux-acpi, linux-coco, linux-kernel,
Tao Liu
On Tue, Jun 11, 2024 at 06:47:05PM +0300, Kirill A. Shutemov wrote:
> Borislav, given this code deduplication effort is not trivial, maybe we
> can do it as a separate patchset on top of this one?
Sure, as long as it gets done and doesn't get delayed indefinitely by
new and more important features enablement.
Usually, we do unifications and cleanups first - then new features but
this kexec pile has been long in the making already...
> I also wounder if it makes sense to combine ident_map.c and
> set_memory.c. There's some overlap between the two.
Yeah, we have a bunch of different pagetable manipulating things, all
with their peculiarities and unifying them and having a good set of APIs
which everything else uses, is always a good thing.
And since we're talking cleanups, there's another thing I've been
looking at critically: CONFIG_X86_5LEVEL. Maybe it is time to get rid of
it and make the 5level stuff unconditional. And get rid of a bunch of
code since both vendors support 5level now...
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion
2024-06-11 18:26 ` H. Peter Anvin
@ 2024-06-12 9:22 ` Kirill A. Shutemov
2024-06-12 23:06 ` Andrew Cooper
0 siblings, 1 reply; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-06-12 9:22 UTC (permalink / raw)
To: H. Peter Anvin, Andrew Cooper
Cc: Borislav Petkov, Nikolay Borisov, Thomas Gleixner, Ingo Molnar,
Dave Hansen, x86, Rafael J. Wysocki, Peter Zijlstra,
Adrian Hunter, Kuppuswamy Sathyanarayanan, Elena Reshetova,
Jun Nakajima, Rick Edgecombe, Tom Lendacky, Kalra, Ashish,
Sean Christopherson, Huang, Kai, Ard Biesheuvel, Baoquan He,
K. Y. Srinivasan, Haiyang Zhang, kexec, linux-hyperv, linux-acpi,
linux-coco, linux-kernel
On Tue, Jun 11, 2024 at 11:26:17AM -0700, H. Peter Anvin wrote:
> On 6/4/24 08:21, Kirill A. Shutemov wrote:
> >
> > From b45fe48092abad2612c2bafbb199e4de80c99545 Mon Sep 17 00:00:00 2001
> > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > Date: Fri, 10 Feb 2023 12:53:11 +0300
> > Subject: [PATCHv11.1 06/19] x86/kexec: Keep CR4.MCE set during kexec for TDX guest
> >
> > TDX guests run with MCA enabled (CR4.MCE=1b) from the very start. If
> > that bit is cleared during CR4 register reprogramming during boot or
> > kexec flows, a #VE exception will be raised which the guest kernel
> > cannot handle it.
> >
> > Therefore, make sure the CR4.MCE setting is preserved over kexec too and
> > avoid raising any #VEs.
> >
> > The change doesn't affect non-TDX-guest environments.
> >
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > ---
> > arch/x86/kernel/relocate_kernel_64.S | 17 ++++++++++-------
> > 1 file changed, 10 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
> > index 085eef5c3904..9c2cf70c5f54 100644
> > --- a/arch/x86/kernel/relocate_kernel_64.S
> > +++ b/arch/x86/kernel/relocate_kernel_64.S
> > @@ -5,6 +5,8 @@
> > */
> > #include <linux/linkage.h>
> > +#include <linux/stringify.h>
> > +#include <asm/alternative.h>
> > #include <asm/page_types.h>
> > #include <asm/kexec.h>
> > #include <asm/processor-flags.h>
> > @@ -145,14 +147,15 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
> > * Set cr4 to a known state:
> > * - physical address extension enabled
> > * - 5-level paging, if it was enabled before
> > + * - Machine check exception on TDX guest, if it was enabled before.
> > + * Clearing MCE might not be allowed in TDX guests, depending on setup.
> > + *
> > + * Use R13 that contains the original CR4 value, read in relocate_kernel().
> > + * PAE is always set in the original CR4.
> > */
> > - movl $X86_CR4_PAE, %eax
> > - testq $X86_CR4_LA57, %r13
> > - jz .Lno_la57
> > - orl $X86_CR4_LA57, %eax
> > -.Lno_la57:
> > -
> > - movq %rax, %cr4
> > + andl $(X86_CR4_PAE | X86_CR4_LA57), %r13d
> > + ALTERNATIVE "", __stringify(orl $X86_CR4_MCE, %r13d), X86_FEATURE_TDX_GUEST
> > + movq %r13, %cr4
>
> If this is the case, I don't really see a reason to clear MCE per se as I'm
> guessing a machine check here will be fatal anyway? It just changes the
> method of death.
Andrew had a strong opinion on method of death here.
https://lore.kernel.org/all/1144340e-dd95-ee3b-dabb-579f9a65b3c7@citrix.com
> Also, is there a reason to save %cr4, run code, and *then* clear the
> relevant bits? Wouldn't it be better to sanitize %cr4 as soon as possible?
You mean set new CR4 directly in relocate_kernel() before switching CR3?
I guess it is possible.
But I can say I see huge benefit of changing it. Such change would have
own risks.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 18/19] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method
2024-06-11 19:46 ` Borislav Petkov
@ 2024-06-12 9:24 ` Kirill A. Shutemov
2024-06-12 9:29 ` Borislav Petkov
0 siblings, 1 reply; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-06-12 9:24 UTC (permalink / raw)
To: Borislav Petkov
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, Rafael J. Wysocki,
Peter Zijlstra, Adrian Hunter, Kuppuswamy Sathyanarayanan,
Elena Reshetova, Jun Nakajima, Rick Edgecombe, Tom Lendacky,
Kalra, Ashish, Sean Christopherson, Huang, Kai, Ard Biesheuvel,
Baoquan He, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
kexec, linux-hyperv, linux-acpi, linux-coco, linux-kernel,
Tao Liu
On Tue, Jun 11, 2024 at 09:46:53PM +0200, Borislav Petkov wrote:
> On Tue, Jun 11, 2024 at 06:47:05PM +0300, Kirill A. Shutemov wrote:
> > Borislav, given this code deduplication effort is not trivial, maybe we
> > can do it as a separate patchset on top of this one?
>
> Sure, as long as it gets done and doesn't get delayed indefinitely by
> new and more important features enablement.
I will try to deliver it in timely manner.
> Usually, we do unifications and cleanups first - then new features but
> this kexec pile has been long in the making already...
>
> > I also wounder if it makes sense to combine ident_map.c and
> > set_memory.c. There's some overlap between the two.
>
> Yeah, we have a bunch of different pagetable manipulating things, all
> with their peculiarities and unifying them and having a good set of APIs
> which everything else uses, is always a good thing.
Will give it a try.
> And since we're talking cleanups, there's another thing I've been
> looking at critically: CONFIG_X86_5LEVEL. Maybe it is time to get rid of
> it and make the 5level stuff unconditional. And get rid of a bunch of
> code since both vendors support 5level now...
Can do.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 18/19] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method
2024-06-12 9:24 ` Kirill A. Shutemov
@ 2024-06-12 9:29 ` Borislav Petkov
2024-06-13 13:41 ` Kirill A. Shutemov
0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2024-06-12 9:29 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, Rafael J. Wysocki,
Peter Zijlstra, Adrian Hunter, Kuppuswamy Sathyanarayanan,
Elena Reshetova, Jun Nakajima, Rick Edgecombe, Tom Lendacky,
Kalra, Ashish, Sean Christopherson, Huang, Kai, Ard Biesheuvel,
Baoquan He, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
kexec, linux-hyperv, linux-acpi, linux-coco, linux-kernel,
Tao Liu
On Wed, Jun 12, 2024 at 12:24:30PM +0300, Kirill A. Shutemov wrote:
> I will try to deliver it in timely manner.
:-P
> > Yeah, we have a bunch of different pagetable manipulating things, all
> > with their peculiarities and unifying them and having a good set of APIs
> > which everything else uses, is always a good thing.
>
> Will give it a try.
>
> > And since we're talking cleanups, there's another thing I've been
> > looking at critically: CONFIG_X86_5LEVEL. Maybe it is time to get rid of
> > it and make the 5level stuff unconditional. And get rid of a bunch of
> > code since both vendors support 5level now...
>
> Can do.
Much appreciated, thanks!
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion
2024-06-03 14:43 ` H. Peter Anvin
@ 2024-06-12 12:10 ` Nikolay Borisov
0 siblings, 0 replies; 115+ messages in thread
From: Nikolay Borisov @ 2024-06-12 12:10 UTC (permalink / raw)
To: H. Peter Anvin, Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86
Cc: Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
Huang, Kai, Ard Biesheuvel, Baoquan He, K. Y. Srinivasan,
Haiyang Zhang, kexec, linux-hyperv, linux-acpi, linux-coco,
linux-kernel
On 3.06.24 г. 17:43 ч., H. Peter Anvin wrote:
> On 5/29/24 03:47, Nikolay Borisov wrote:
>>>
>>> diff --git a/arch/x86/kernel/relocate_kernel_64.S
>>> b/arch/x86/kernel/relocate_kernel_64.S
>>> index 56cab1bb25f5..085eef5c3904 100644
>>> --- a/arch/x86/kernel/relocate_kernel_64.S
>>> +++ b/arch/x86/kernel/relocate_kernel_64.S
>>> @@ -148,9 +148,10 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
>>> */
>>> movl $X86_CR4_PAE, %eax
>>> testq $X86_CR4_LA57, %r13
>>> - jz 1f
>>> + jz .Lno_la57
>>> orl $X86_CR4_LA57, %eax
>>> -1:
>>> +.Lno_la57:
>>> +
>>> movq %rax, %cr4
>>> jmp 1f
>>
>> That jmp 1f becomes redundant now as it simply jumps 1 line below.
>>
>
> Uh... am I the only person to notice that ALL that is needed here is:
>
> andl $(X86_CR4_PAE|X86_CR4_LA57), %r13d
> movq %r13, %rax
>
> ... since %r13 is dead afterwards, and PAE *will* have been set in %r13
> already?
>
> I don't believe that this specific jmp is actually needed -- there are
> several more synchronizing jumps later -- but it doesn't hurt.
>
> However, if the effort is for improving the readability, it might be
> worthwhile to encapsulate the "jmp 1f; 1:" as a macro, e.g. "SYNC_CODE".
The preceding move to CR4 is itself a serializing instruction, no?
>
> -hpa
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion
2024-06-12 9:22 ` Kirill A. Shutemov
@ 2024-06-12 23:06 ` Andrew Cooper
2024-06-12 23:25 ` H. Peter Anvin
0 siblings, 1 reply; 115+ messages in thread
From: Andrew Cooper @ 2024-06-12 23:06 UTC (permalink / raw)
To: Kirill A. Shutemov, H. Peter Anvin
Cc: Borislav Petkov, Nikolay Borisov, Thomas Gleixner, Ingo Molnar,
Dave Hansen, x86, Rafael J. Wysocki, Peter Zijlstra,
Adrian Hunter, Kuppuswamy Sathyanarayanan, Elena Reshetova,
Jun Nakajima, Rick Edgecombe, Tom Lendacky, Kalra, Ashish,
Sean Christopherson, Huang, Kai, Ard Biesheuvel, Baoquan He,
K. Y. Srinivasan, Haiyang Zhang, kexec, linux-hyperv, linux-acpi,
linux-coco, linux-kernel
On 12/06/2024 10:22 am, Kirill A. Shutemov wrote:
> On Tue, Jun 11, 2024 at 11:26:17AM -0700, H. Peter Anvin wrote:
>> On 6/4/24 08:21, Kirill A. Shutemov wrote:
>>> From b45fe48092abad2612c2bafbb199e4de80c99545 Mon Sep 17 00:00:00 2001
>>> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
>>> Date: Fri, 10 Feb 2023 12:53:11 +0300
>>> Subject: [PATCHv11.1 06/19] x86/kexec: Keep CR4.MCE set during kexec for TDX guest
>>>
>>> TDX guests run with MCA enabled (CR4.MCE=1b) from the very start. If
>>> that bit is cleared during CR4 register reprogramming during boot or
>>> kexec flows, a #VE exception will be raised which the guest kernel
>>> cannot handle it.
>>>
>>> Therefore, make sure the CR4.MCE setting is preserved over kexec too and
>>> avoid raising any #VEs.
>>>
>>> The change doesn't affect non-TDX-guest environments.
>>>
>>> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>> ---
>>> arch/x86/kernel/relocate_kernel_64.S | 17 ++++++++++-------
>>> 1 file changed, 10 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
>>> index 085eef5c3904..9c2cf70c5f54 100644
>>> --- a/arch/x86/kernel/relocate_kernel_64.S
>>> +++ b/arch/x86/kernel/relocate_kernel_64.S
>>> @@ -5,6 +5,8 @@
>>> */
>>> #include <linux/linkage.h>
>>> +#include <linux/stringify.h>
>>> +#include <asm/alternative.h>
>>> #include <asm/page_types.h>
>>> #include <asm/kexec.h>
>>> #include <asm/processor-flags.h>
>>> @@ -145,14 +147,15 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
>>> * Set cr4 to a known state:
>>> * - physical address extension enabled
>>> * - 5-level paging, if it was enabled before
>>> + * - Machine check exception on TDX guest, if it was enabled before.
>>> + * Clearing MCE might not be allowed in TDX guests, depending on setup.
>>> + *
>>> + * Use R13 that contains the original CR4 value, read in relocate_kernel().
>>> + * PAE is always set in the original CR4.
>>> */
>>> - movl $X86_CR4_PAE, %eax
>>> - testq $X86_CR4_LA57, %r13
>>> - jz .Lno_la57
>>> - orl $X86_CR4_LA57, %eax
>>> -.Lno_la57:
>>> -
>>> - movq %rax, %cr4
>>> + andl $(X86_CR4_PAE | X86_CR4_LA57), %r13d
>>> + ALTERNATIVE "", __stringify(orl $X86_CR4_MCE, %r13d), X86_FEATURE_TDX_GUEST
>>> + movq %r13, %cr4
>> If this is the case, I don't really see a reason to clear MCE per se as I'm
>> guessing a machine check here will be fatal anyway? It just changes the
>> method of death.
> Andrew had a strong opinion on method of death here.
>
> https://lore.kernel.org/all/1144340e-dd95-ee3b-dabb-579f9a65b3c7@citrix.com
Not sure if I intended it to come across that strongly, but given a
choice, the !CR4.MCE death is cleaner because at least you're not
interpreting garbage and trying to use it as a valid IDT.
~Andrew
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion
2024-06-12 23:06 ` Andrew Cooper
@ 2024-06-12 23:25 ` H. Peter Anvin
0 siblings, 0 replies; 115+ messages in thread
From: H. Peter Anvin @ 2024-06-12 23:25 UTC (permalink / raw)
To: Andrew Cooper, Kirill A. Shutemov
Cc: Borislav Petkov, Nikolay Borisov, Thomas Gleixner, Ingo Molnar,
Dave Hansen, x86, Rafael J. Wysocki, Peter Zijlstra,
Adrian Hunter, Kuppuswamy Sathyanarayanan, Elena Reshetova,
Jun Nakajima, Rick Edgecombe, Tom Lendacky, Kalra, Ashish,
Sean Christopherson, Huang, Kai, Ard Biesheuvel, Baoquan He,
K. Y. Srinivasan, Haiyang Zhang, kexec, linux-hyperv, linux-acpi,
linux-coco, linux-kernel
On June 12, 2024 4:06:07 PM PDT, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>On 12/06/2024 10:22 am, Kirill A. Shutemov wrote:
>> On Tue, Jun 11, 2024 at 11:26:17AM -0700, H. Peter Anvin wrote:
>>> On 6/4/24 08:21, Kirill A. Shutemov wrote:
>>>> From b45fe48092abad2612c2bafbb199e4de80c99545 Mon Sep 17 00:00:00 2001
>>>> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
>>>> Date: Fri, 10 Feb 2023 12:53:11 +0300
>>>> Subject: [PATCHv11.1 06/19] x86/kexec: Keep CR4.MCE set during kexec for TDX guest
>>>>
>>>> TDX guests run with MCA enabled (CR4.MCE=1b) from the very start. If
>>>> that bit is cleared during CR4 register reprogramming during boot or
>>>> kexec flows, a #VE exception will be raised which the guest kernel
>>>> cannot handle it.
>>>>
>>>> Therefore, make sure the CR4.MCE setting is preserved over kexec too and
>>>> avoid raising any #VEs.
>>>>
>>>> The change doesn't affect non-TDX-guest environments.
>>>>
>>>> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>>> ---
>>>> arch/x86/kernel/relocate_kernel_64.S | 17 ++++++++++-------
>>>> 1 file changed, 10 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
>>>> index 085eef5c3904..9c2cf70c5f54 100644
>>>> --- a/arch/x86/kernel/relocate_kernel_64.S
>>>> +++ b/arch/x86/kernel/relocate_kernel_64.S
>>>> @@ -5,6 +5,8 @@
>>>> */
>>>> #include <linux/linkage.h>
>>>> +#include <linux/stringify.h>
>>>> +#include <asm/alternative.h>
>>>> #include <asm/page_types.h>
>>>> #include <asm/kexec.h>
>>>> #include <asm/processor-flags.h>
>>>> @@ -145,14 +147,15 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
>>>> * Set cr4 to a known state:
>>>> * - physical address extension enabled
>>>> * - 5-level paging, if it was enabled before
>>>> + * - Machine check exception on TDX guest, if it was enabled before.
>>>> + * Clearing MCE might not be allowed in TDX guests, depending on setup.
>>>> + *
>>>> + * Use R13 that contains the original CR4 value, read in relocate_kernel().
>>>> + * PAE is always set in the original CR4.
>>>> */
>>>> - movl $X86_CR4_PAE, %eax
>>>> - testq $X86_CR4_LA57, %r13
>>>> - jz .Lno_la57
>>>> - orl $X86_CR4_LA57, %eax
>>>> -.Lno_la57:
>>>> -
>>>> - movq %rax, %cr4
>>>> + andl $(X86_CR4_PAE | X86_CR4_LA57), %r13d
>>>> + ALTERNATIVE "", __stringify(orl $X86_CR4_MCE, %r13d), X86_FEATURE_TDX_GUEST
>>>> + movq %r13, %cr4
>>> If this is the case, I don't really see a reason to clear MCE per se as I'm
>>> guessing a machine check here will be fatal anyway? It just changes the
>>> method of death.
>> Andrew had a strong opinion on method of death here.
>>
>> https://lore.kernel.org/all/1144340e-dd95-ee3b-dabb-579f9a65b3c7@citrix.com
>
>Not sure if I intended it to come across that strongly, but given a
>choice, the !CR4.MCE death is cleaner because at least you're not
>interpreting garbage and trying to use it as a valid IDT.
>
>~Andrew
Zorch the IDT if it isn't valid?
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 18/19] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method
2024-06-12 9:29 ` Borislav Petkov
@ 2024-06-13 13:41 ` Kirill A. Shutemov
2024-06-13 14:56 ` Borislav Petkov
2024-06-21 13:38 ` Borislav Petkov
0 siblings, 2 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-06-13 13:41 UTC (permalink / raw)
To: Borislav Petkov
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, Rafael J. Wysocki,
Peter Zijlstra, Adrian Hunter, Kuppuswamy Sathyanarayanan,
Elena Reshetova, Jun Nakajima, Rick Edgecombe, Tom Lendacky,
Kalra, Ashish, Sean Christopherson, Huang, Kai, Ard Biesheuvel,
Baoquan He, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
kexec, linux-hyperv, linux-acpi, linux-coco, linux-kernel,
Tao Liu
On Wed, Jun 12, 2024 at 11:29:43AM +0200, Borislav Petkov wrote:
> > > And since we're talking cleanups, there's another thing I've been
> > > looking at critically: CONFIG_X86_5LEVEL. Maybe it is time to get rid of
> > > it and make the 5level stuff unconditional. And get rid of a bunch of
> > > code since both vendors support 5level now...
> >
> > Can do.
>
> Much appreciated, thanks!
It is easy enough to do. See the patch below.
But I am not sure if I can justify it properly. If someone doesn't really
need 5-level paging, disabling it at compile-time would save ~34K of
kernel code with the configuration.
Is it worth saving ~100 lines of code?
Documentation/arch/x86/cpuinfo.rst | 8 +++-----
Documentation/arch/x86/x86_64/5level-paging.rst | 9 ---------
arch/x86/Kconfig | 24 +-----------------------
arch/x86/boot/compressed/pgtable_64.c | 10 +++-------
| 4 ----
arch/x86/include/asm/disabled-features.h | 9 +--------
arch/x86/include/asm/page_64.h | 2 --
arch/x86/include/asm/page_64_types.h | 7 -------
arch/x86/include/asm/pgtable_64_types.h | 18 ------------------
arch/x86/kernel/alternative.c | 2 +-
arch/x86/kernel/head64.c | 5 -----
arch/x86/kernel/head_64.S | 2 --
arch/x86/mm/init.c | 4 ----
arch/x86/mm/pgtable.c | 2 --
drivers/firmware/efi/libstub/x86-5lvl.c | 2 +-
tools/arch/x86/include/asm/disabled-features.h | 9 +--------
16 files changed, 11 insertions(+), 106 deletions(-)
diff --git a/Documentation/arch/x86/cpuinfo.rst b/Documentation/arch/x86/cpuinfo.rst
index 8895784d4784..0ea70924c89e 100644
--- a/Documentation/arch/x86/cpuinfo.rst
+++ b/Documentation/arch/x86/cpuinfo.rst
@@ -171,10 +171,10 @@ For example, when an old kernel is running on new hardware.
c: The kernel disabled support for it at compile-time.
------------------------------------------------------
-For example, if 5-level-paging is not enabled when building (i.e.,
-CONFIG_X86_5LEVEL is not selected) the flag "la57" will not show up [#f1]_.
+For example, if Linear Address Masking (LAM) is not enabled when building (i.e.,
+CONFIG_ADDRESS_MASKING is not selected) the flag "lam" will not show up.
Even though the feature will still be detected via CPUID, the kernel disables
-it by clearing via setup_clear_cpu_cap(X86_FEATURE_LA57).
+it by clearing via setup_clear_cpu_cap(X86_FEATURE_LAM).
d: The feature is disabled at boot-time.
----------------------------------------
@@ -197,5 +197,3 @@ missing at runtime. For example, AVX flags will not show up if XSAVE feature
is disabled since they depend on XSAVE feature. Another example would be broken
CPUs and them missing microcode patches. Due to that, the kernel decides not to
enable a feature.
-
-.. [#f1] 5-level paging uses linear address of 57 bits.
diff --git a/Documentation/arch/x86/x86_64/5level-paging.rst b/Documentation/arch/x86/x86_64/5level-paging.rst
index 71f882f4a173..ad7ddc13f79d 100644
--- a/Documentation/arch/x86/x86_64/5level-paging.rst
+++ b/Documentation/arch/x86/x86_64/5level-paging.rst
@@ -22,15 +22,6 @@ QEMU 2.9 and later support 5-level paging.
Virtual memory layout for 5-level paging is described in
Documentation/arch/x86/x86_64/mm.rst
-
-Enabling 5-level paging
-=======================
-CONFIG_X86_5LEVEL=y enables the feature.
-
-Kernel with CONFIG_X86_5LEVEL=y still able to boot on 4-level hardware.
-In this case additional page table level -- p4d -- will be folded at
-runtime.
-
User-space and large virtual address space
==========================================
On x86, 5-level paging enables 56-bit userspace virtual address space.
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e8837116704c..c62827c2ecea 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -408,8 +408,7 @@ config DYNAMIC_PHYSICAL_MASK
config PGTABLE_LEVELS
int
- default 5 if X86_5LEVEL
- default 4 if X86_64
+ default 5 if X86_64
default 3 if X86_PAE
default 2
@@ -1491,27 +1490,6 @@ config X86_PAE
has the cost of more pagetable lookup overhead, and also
consumes more pagetable space per process.
-config X86_5LEVEL
- bool "Enable 5-level page tables support"
- default y
- select DYNAMIC_MEMORY_LAYOUT
- select SPARSEMEM_VMEMMAP
- depends on X86_64
- help
- 5-level paging enables access to larger address space:
- up to 128 PiB of virtual address space and 4 PiB of
- physical address space.
-
- It will be supported by future Intel CPUs.
-
- A kernel with the option enabled can be booted on machines that
- support 4- or 5-level paging.
-
- See Documentation/arch/x86/x86_64/5level-paging.rst for more
- information.
-
- Say N if unsure.
-
config X86_DIRECT_GBPAGES
def_bool y
depends on X86_64
diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
index c882e1f67af0..f9b77b66c792 100644
--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -10,12 +10,10 @@
#define BIOS_START_MIN 0x20000U /* 128K, less than this is insane */
#define BIOS_START_MAX 0x9f000U /* 640K, absolute maximum */
-#ifdef CONFIG_X86_5LEVEL
/* __pgtable_l5_enabled needs to be in .data to avoid being cleared along with .bss */
unsigned int __section(".data") __pgtable_l5_enabled;
unsigned int __section(".data") pgdir_shift = 39;
unsigned int __section(".data") ptrs_per_p4d = 1;
-#endif
/* Buffer to preserve trampoline memory */
static char trampoline_save[TRAMPOLINE_32BIT_SIZE];
@@ -113,7 +111,6 @@ asmlinkage void configure_5level_paging(struct boot_params *bp, void *pgtable)
* Check if LA57 is desired and supported.
*
* There are several parts to the check:
- * - if the kernel supports 5-level paging: CONFIG_X86_5LEVEL=y
* - if user asked to disable 5-level paging: no5lvl in cmdline
* - if the machine supports 5-level paging:
* + CPUID leaf 7 is supported
@@ -121,10 +118,9 @@ asmlinkage void configure_5level_paging(struct boot_params *bp, void *pgtable)
*
* That's substitute for boot_cpu_has() in early boot code.
*/
- if (IS_ENABLED(CONFIG_X86_5LEVEL) &&
- !cmdline_find_option_bool("no5lvl") &&
- native_cpuid_eax(0) >= 7 &&
- (native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31)))) {
+ if (!cmdline_find_option_bool("no5lvl") &&
+ native_cpuid_eax(0) >= 7 &&
+ (native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31)))) {
l5_required = true;
/* Initialize variables for 5-level paging */
--git a/arch/x86/boot/header.S b/arch/x86/boot/header.S
index b5c79f43359b..32361cef909e 100644
--- a/arch/x86/boot/header.S
+++ b/arch/x86/boot/header.S
@@ -361,12 +361,8 @@ xloadflags:
#endif
#ifdef CONFIG_X86_64
-#ifdef CONFIG_X86_5LEVEL
#define XLF56 (XLF_5LEVEL|XLF_5LEVEL_ENABLED)
#else
-#define XLF56 XLF_5LEVEL
-#endif
-#else
#define XLF56 0
#endif
diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index c492bdc97b05..19cf1678fcaa 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -38,12 +38,6 @@
# define DISABLE_OSPKE (1<<(X86_FEATURE_OSPKE & 31))
#endif /* CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS */
-#ifdef CONFIG_X86_5LEVEL
-# define DISABLE_LA57 0
-#else
-# define DISABLE_LA57 (1<<(X86_FEATURE_LA57 & 31))
-#endif
-
#ifdef CONFIG_MITIGATION_PAGE_TABLE_ISOLATION
# define DISABLE_PTI 0
#else
@@ -149,8 +143,7 @@
#define DISABLED_MASK13 0
#define DISABLED_MASK14 0
#define DISABLED_MASK15 0
-#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP| \
- DISABLE_ENQCMD)
+#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_UMIP|DISABLE_ENQCMD)
#define DISABLED_MASK17 0
#define DISABLED_MASK18 (DISABLE_IBT)
#define DISABLED_MASK19 (DISABLE_SEV_SNP)
diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index cc6b8e087192..3b8cb6a8b122 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -60,7 +60,6 @@ static inline void clear_page(void *page)
void copy_page(void *to, void *from);
-#ifdef CONFIG_X86_5LEVEL
/*
* User space process size. This is the first address outside the user range.
* There are a few constraints that determine this:
@@ -91,7 +90,6 @@ static __always_inline unsigned long task_size_max(void)
return ret;
}
-#endif /* CONFIG_X86_5LEVEL */
#endif /* !__ASSEMBLY__ */
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 06ef25411d62..714e88a72c9f 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -52,14 +52,7 @@
/* See Documentation/arch/x86/x86_64/mm.rst for a description of the memory map. */
#define __PHYSICAL_MASK_SHIFT 52
-
-#ifdef CONFIG_X86_5LEVEL
#define __VIRTUAL_MASK_SHIFT (pgtable_l5_enabled() ? 56 : 47)
-/* See task_size_max() in <asm/page_64.h> */
-#else
-#define __VIRTUAL_MASK_SHIFT 47
-#define task_size_max() ((_AC(1,UL) << __VIRTUAL_MASK_SHIFT) - PAGE_SIZE)
-#endif
#define TASK_SIZE_MAX task_size_max()
#define DEFAULT_MAP_WINDOW ((1UL << 47) - PAGE_SIZE)
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 9053dfe9fa03..576aea58b0c0 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -23,7 +23,6 @@ typedef struct { pmdval_t pmd; } pmd_t;
extern unsigned int __pgtable_l5_enabled;
-#ifdef CONFIG_X86_5LEVEL
#ifdef USE_EARLY_PGTABLE_L5
/*
* cpu_feature_enabled() is not available in early boot code.
@@ -37,10 +36,6 @@ static inline bool pgtable_l5_enabled(void)
#define pgtable_l5_enabled() cpu_feature_enabled(X86_FEATURE_LA57)
#endif /* USE_EARLY_PGTABLE_L5 */
-#else
-#define pgtable_l5_enabled() 0
-#endif /* CONFIG_X86_5LEVEL */
-
extern unsigned int pgdir_shift;
extern unsigned int ptrs_per_p4d;
@@ -48,8 +43,6 @@ extern unsigned int ptrs_per_p4d;
#define SHARED_KERNEL_PMD 0
-#ifdef CONFIG_X86_5LEVEL
-
/*
* PGDIR_SHIFT determines what a top-level page table entry can map
*/
@@ -67,17 +60,6 @@ extern unsigned int ptrs_per_p4d;
#define MAX_POSSIBLE_PHYSMEM_BITS 52
-#else /* CONFIG_X86_5LEVEL */
-
-/*
- * PGDIR_SHIFT determines what a top-level page table entry can map
- */
-#define PGDIR_SHIFT 39
-#define PTRS_PER_PGD 512
-#define MAX_PTRS_PER_P4D 1
-
-#endif /* CONFIG_X86_5LEVEL */
-
/*
* 3rd level page
*/
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 37596a417094..f1c519abb925 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -457,7 +457,7 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
DPRINTK(ALT, "alt table %px, -> %px", start, end);
/*
- * In the case CONFIG_X86_5LEVEL=y, KASAN_SHADOW_START is defined using
+ * KASAN_SHADOW_START is defined using
* cpu_feature_enabled(X86_FEATURE_LA57) and is therefore patched here.
* During the process, KASAN becomes confused seeing partial LA57
* conversion and triggers a false-positive out-of-bound report.
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index a817ed0724d1..df19bdea1c86 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -52,13 +52,11 @@ extern pmd_t early_dynamic_pgts[EARLY_DYNAMIC_PAGE_TABLES][PTRS_PER_PMD];
static unsigned int __initdata next_early_pgt;
pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);
-#ifdef CONFIG_X86_5LEVEL
unsigned int __pgtable_l5_enabled __ro_after_init;
unsigned int pgdir_shift __ro_after_init = 39;
EXPORT_SYMBOL(pgdir_shift);
unsigned int ptrs_per_p4d __ro_after_init = 1;
EXPORT_SYMBOL(ptrs_per_p4d);
-#endif
#ifdef CONFIG_DYNAMIC_MEMORY_LAYOUT
unsigned long page_offset_base __ro_after_init = __PAGE_OFFSET_BASE_L4;
@@ -71,9 +69,6 @@ EXPORT_SYMBOL(vmemmap_base);
static inline bool check_la57_support(void)
{
- if (!IS_ENABLED(CONFIG_X86_5LEVEL))
- return false;
-
/*
* 5-level paging is detected and enabled at kernel decompression
* stage. Only check if it has been enabled there.
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 330922b328bf..4b2b2138c163 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -659,12 +659,10 @@ SYM_DATA_START_PTI_ALIGNED(init_top_pgt)
SYM_DATA_END(init_top_pgt)
#endif
-#ifdef CONFIG_X86_5LEVEL
SYM_DATA_START_PAGE_ALIGNED(level4_kernel_pgt)
.fill 511,8,0
.quad level3_kernel_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC
SYM_DATA_END(level4_kernel_pgt)
-#endif
SYM_DATA_START_PAGE_ALIGNED(level3_kernel_pgt)
.fill L3_START_KERNEL,8,0
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index eb503f53c319..5a980a452f4c 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -173,11 +173,7 @@ __ref void *alloc_low_pages(unsigned int num)
* randomization is enabled.
*/
-#ifndef CONFIG_X86_5LEVEL
-#define INIT_PGD_PAGE_TABLES 3
-#else
#define INIT_PGD_PAGE_TABLES 4
-#endif
#ifndef CONFIG_RANDOMIZE_MEMORY
#define INIT_PGD_PAGE_COUNT (2 * INIT_PGD_PAGE_TABLES)
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 93e54ba91fbf..982775ef8b34 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -691,7 +691,6 @@ void native_set_fixmap(unsigned /* enum fixed_addresses */ idx,
}
#ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
-#ifdef CONFIG_X86_5LEVEL
/**
* p4d_set_huge - setup kernel P4D mapping
*
@@ -710,7 +709,6 @@ int p4d_set_huge(p4d_t *p4d, phys_addr_t addr, pgprot_t prot)
void p4d_clear_huge(p4d_t *p4d)
{
}
-#endif
/**
* pud_set_huge - setup kernel PUD mapping
diff --git a/drivers/firmware/efi/libstub/x86-5lvl.c b/drivers/firmware/efi/libstub/x86-5lvl.c
index 77359e802181..f1c5fb45d5f7 100644
--- a/drivers/firmware/efi/libstub/x86-5lvl.c
+++ b/drivers/firmware/efi/libstub/x86-5lvl.c
@@ -62,7 +62,7 @@ efi_status_t efi_setup_5level_paging(void)
void efi_5level_switch(void)
{
- bool want_la57 = IS_ENABLED(CONFIG_X86_5LEVEL) && !efi_no5lvl;
+ bool want_la57 = !efi_no5lvl;
bool have_la57 = native_read_cr4() & X86_CR4_LA57;
bool need_toggle = want_la57 ^ have_la57;
u64 *pgt = (void *)la57_toggle + PAGE_SIZE;
diff --git a/tools/arch/x86/include/asm/disabled-features.h b/tools/arch/x86/include/asm/disabled-features.h
index c492bdc97b05..19cf1678fcaa 100644
--- a/tools/arch/x86/include/asm/disabled-features.h
+++ b/tools/arch/x86/include/asm/disabled-features.h
@@ -38,12 +38,6 @@
# define DISABLE_OSPKE (1<<(X86_FEATURE_OSPKE & 31))
#endif /* CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS */
-#ifdef CONFIG_X86_5LEVEL
-# define DISABLE_LA57 0
-#else
-# define DISABLE_LA57 (1<<(X86_FEATURE_LA57 & 31))
-#endif
-
#ifdef CONFIG_MITIGATION_PAGE_TABLE_ISOLATION
# define DISABLE_PTI 0
#else
@@ -149,8 +143,7 @@
#define DISABLED_MASK13 0
#define DISABLED_MASK14 0
#define DISABLED_MASK15 0
-#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP| \
- DISABLE_ENQCMD)
+#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_UMIP|DISABLE_ENQCMD)
#define DISABLED_MASK17 0
#define DISABLED_MASK18 (DISABLE_IBT)
#define DISABLED_MASK19 (DISABLE_SEV_SNP)
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply related [flat|nested] 115+ messages in thread
* Re: [PATCHv11 18/19] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method
2024-06-13 13:41 ` Kirill A. Shutemov
@ 2024-06-13 14:56 ` Borislav Petkov
2024-06-14 14:06 ` Tom Lendacky
2024-06-21 13:38 ` Borislav Petkov
1 sibling, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2024-06-13 14:56 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, Rafael J. Wysocki,
Peter Zijlstra, Adrian Hunter, Kuppuswamy Sathyanarayanan,
Elena Reshetova, Jun Nakajima, Rick Edgecombe, Tom Lendacky,
Kalra, Ashish, Sean Christopherson, Huang, Kai, Ard Biesheuvel,
Baoquan He, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
kexec, linux-hyperv, linux-acpi, linux-coco, linux-kernel,
Tao Liu
On Thu, Jun 13, 2024 at 04:41:00PM +0300, Kirill A. Shutemov wrote:
> It is easy enough to do. See the patch below.
Thanks, will have a look.
> But I am not sure if I can justify it properly. If someone doesn't really
> need 5-level paging, disabling it at compile-time would save ~34K of
> kernel code with the configuration.
>
> Is it worth saving ~100 lines of code?
Well, it goes both ways: is it worth saving ~34K kernel text and for that make
the code a lot less conditional, more readable, contain less ugly ifdeffery,
...?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 18/19] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method
2024-06-13 14:56 ` Borislav Petkov
@ 2024-06-14 14:06 ` Tom Lendacky
2024-06-18 12:20 ` Kirill A. Shutemov
0 siblings, 1 reply; 115+ messages in thread
From: Tom Lendacky @ 2024-06-14 14:06 UTC (permalink / raw)
To: Borislav Petkov, Kirill A. Shutemov
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, Rafael J. Wysocki,
Peter Zijlstra, Adrian Hunter, Kuppuswamy Sathyanarayanan,
Elena Reshetova, Jun Nakajima, Rick Edgecombe, Kalra, Ashish,
Sean Christopherson, Huang, Kai, Ard Biesheuvel, Baoquan He,
H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang, kexec,
linux-hyperv, linux-acpi, linux-coco, linux-kernel, Tao Liu
On 6/13/24 09:56, Borislav Petkov wrote:
> On Thu, Jun 13, 2024 at 04:41:00PM +0300, Kirill A. Shutemov wrote:
>> It is easy enough to do. See the patch below.
>
> Thanks, will have a look.
>
>> But I am not sure if I can justify it properly. If someone doesn't really
>> need 5-level paging, disabling it at compile-time would save ~34K of
>> kernel code with the configuration.
>>
>> Is it worth saving ~100 lines of code?
>
> Well, it goes both ways: is it worth saving ~34K kernel text and for that make
> the code a lot less conditional, more readable, contain less ugly ifdeffery,
Won't getting rid of the config option cause 5-level to be used by default
on all platforms that support it? The no5lvl command line option would
have to be used to get 4-level paging at that point.
Thanks,
Tom
> ...?
>
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 18/19] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method
2024-06-14 14:06 ` Tom Lendacky
@ 2024-06-18 12:20 ` Kirill A. Shutemov
0 siblings, 0 replies; 115+ messages in thread
From: Kirill A. Shutemov @ 2024-06-18 12:20 UTC (permalink / raw)
To: Tom Lendacky
Cc: Borislav Petkov, Thomas Gleixner, Ingo Molnar, Dave Hansen, x86,
Rafael J. Wysocki, Peter Zijlstra, Adrian Hunter,
Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
Rick Edgecombe, Kalra, Ashish, Sean Christopherson, Huang, Kai,
Ard Biesheuvel, Baoquan He, H. Peter Anvin, K. Y. Srinivasan,
Haiyang Zhang, kexec, linux-hyperv, linux-acpi, linux-coco,
linux-kernel, Tao Liu
On Fri, Jun 14, 2024 at 09:06:30AM -0500, Tom Lendacky wrote:
> On 6/13/24 09:56, Borislav Petkov wrote:
> > On Thu, Jun 13, 2024 at 04:41:00PM +0300, Kirill A. Shutemov wrote:
> > > It is easy enough to do. See the patch below.
> >
> > Thanks, will have a look.
> >
> > > But I am not sure if I can justify it properly. If someone doesn't really
> > > need 5-level paging, disabling it at compile-time would save ~34K of
> > > kernel code with the configuration.
> > >
> > > Is it worth saving ~100 lines of code?
> >
> > Well, it goes both ways: is it worth saving ~34K kernel text and for that make
> > the code a lot less conditional, more readable, contain less ugly ifdeffery,
>
> Won't getting rid of the config option cause 5-level to be used by default
> on all platforms that support it? The no5lvl command line option would have
> to be used to get 4-level paging at that point.
Yes, there won't be compile-time option to disable 5-level paging.
Is it a problem?
We benchmarked it back when 5-level paging got introduced and were not able
to see a measurable difference between 4- and 5-level paging on the same
machine. There's some memory overhead on more page table, but it shouldn't
be a show stopper.
I would prefer to get 5-level paging enabled if the machine supports it.
"no5lvl" cmdline option can be useful for debug or if your workload is
somehow special.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 115+ messages in thread
* Re: [PATCHv11 18/19] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method
2024-06-13 13:41 ` Kirill A. Shutemov
2024-06-13 14:56 ` Borislav Petkov
@ 2024-06-21 13:38 ` Borislav Petkov
1 sibling, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2024-06-21 13:38 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, Rafael J. Wysocki,
Peter Zijlstra, Adrian Hunter, Kuppuswamy Sathyanarayanan,
Elena Reshetova, Jun Nakajima, Rick Edgecombe, Tom Lendacky,
Kalra, Ashish, Sean Christopherson, Huang, Kai, Ard Biesheuvel,
Baoquan He, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
kexec, linux-hyperv, linux-acpi, linux-coco, linux-kernel,
Tao Liu
[-- Attachment #1: Type: text/plain, Size: 2873 bytes --]
On Thu, Jun 13, 2024 at 04:41:00PM +0300, Kirill A. Shutemov wrote:
> Documentation/arch/x86/cpuinfo.rst | 8 +++-----
> Documentation/arch/x86/x86_64/5level-paging.rst | 9 ---------
> arch/x86/Kconfig | 24 +-----------------------
> arch/x86/boot/compressed/pgtable_64.c | 10 +++-------
> arch/x86/boot/header.S | 4 ----
> arch/x86/include/asm/disabled-features.h | 9 +--------
> arch/x86/include/asm/page_64.h | 2 --
> arch/x86/include/asm/page_64_types.h | 7 -------
> arch/x86/include/asm/pgtable_64_types.h | 18 ------------------
> arch/x86/kernel/alternative.c | 2 +-
> arch/x86/kernel/head64.c | 5 -----
> arch/x86/kernel/head_64.S | 2 --
> arch/x86/mm/init.c | 4 ----
> arch/x86/mm/pgtable.c | 2 --
> drivers/firmware/efi/libstub/x86-5lvl.c | 2 +-
> tools/arch/x86/include/asm/disabled-features.h | 9 +--------
> 16 files changed, 11 insertions(+), 106 deletions(-)
This causes
ld: vmlinux.o: in function `rip_rel_ptr':
/home/boris/kernel/5th/linux/./arch/x86/include/asm/asm.h:120:(.head.text+0xb96): undefined reference to `page_offset_base'
ld: /home/boris/kernel/5th/linux/./arch/x86/include/asm/asm.h:120:(.head.text+0xbaa): undefined reference to `vmalloc_base'
ld: /home/boris/kernel/5th/linux/./arch/x86/include/asm/asm.h:120:(.head.text+0xbb4): undefined reference to `vmemmap_base'
make[2]: *** [scripts/Makefile.vmlinux:34: vmlinux] Error 1
make[1]: *** [/mnt/kernel/kernel/5th/linux/Makefile:1171: vmlinux] Error 2
make[1]: *** Waiting for unfinished jobs....
make: *** [Makefile:240: __sub-make] Error 2
with my .config. Attached.
Also:
diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
index f9b77b66c792..25559a788aad 100644
--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -115,12 +115,10 @@ asmlinkage void configure_5level_paging(struct boot_params *bp, void *pgtable)
* - if the machine supports 5-level paging:
* + CPUID leaf 7 is supported
* + the leaf has the feature bit set
- *
- * That's substitute for boot_cpu_has() in early boot code.
*/
if (!cmdline_find_option_bool("no5lvl") &&
native_cpuid_eax(0) >= 7 &&
- (native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31)))) {
+ (native_cpuid_ecx(7) & BIT_UL(16))) {
l5_required = true;
/* Initialize variables for 5-level paging */
We can simply check CPUID and be done with it, that early.
Other than that, I like it. Let's do it.
Less ifdeffery, less conditionals. A win-win thing.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
[-- Attachment #2: .config --]
[-- Type: text/plain, Size: 148042 bytes --]
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86 6.10.0-rc4 Kernel Configuration
#
CONFIG_CC_VERSION_TEXT="gcc (Debian 13.2.0-25) 13.2.0"
CONFIG_CC_IS_GCC=y
CONFIG_GCC_VERSION=130200
CONFIG_CLANG_VERSION=0
CONFIG_AS_IS_GNU=y
CONFIG_AS_VERSION=24200
CONFIG_LD_IS_BFD=y
CONFIG_LD_VERSION=24200
CONFIG_LLD_VERSION=0
CONFIG_CC_CAN_LINK=y
CONFIG_CC_CAN_LINK_STATIC=y
CONFIG_CC_HAS_ASM_GOTO_OUTPUT=y
CONFIG_CC_HAS_ASM_GOTO_TIED_OUTPUT=y
CONFIG_GCC_ASM_GOTO_OUTPUT_WORKAROUND=y
CONFIG_TOOLS_SUPPORT_RELR=y
CONFIG_CC_HAS_ASM_INLINE=y
CONFIG_CC_HAS_NO_PROFILE_FN_ATTR=y
CONFIG_PAHOLE_VERSION=124
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_TABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y
#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
# CONFIG_COMPILE_TEST is not set
# CONFIG_WERROR is not set
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_BUILD_SALT=""
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_HAVE_KERNEL_ZSTD=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
# CONFIG_KERNEL_ZSTD is not set
CONFIG_DEFAULT_INIT=""
CONFIG_DEFAULT_HOSTNAME="zn"
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_SYSVIPC_COMPAT=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
# CONFIG_WATCH_QUEUE is not set
CONFIG_CROSS_MEMORY_ATTACH=y
# CONFIG_USELIB is not set
# CONFIG_AUDIT is not set
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_GENERIC_IRQ_MIGRATION=y
CONFIG_HARDIRQS_SW_RESEND=y
CONFIG_GENERIC_IRQ_CHIP=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_MSI_IRQ=y
CONFIG_IRQ_MSI_IOMMU=y
CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR=y
CONFIG_GENERIC_IRQ_RESERVATION_MODE=y
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
# CONFIG_GENERIC_IRQ_DEBUGFS is not set
# end of IRQ subsystem
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_INIT=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST_IDLE=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK=y
CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y
CONFIG_CONTEXT_TRACKING=y
CONFIG_CONTEXT_TRACKING_IDLE=y
#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
# CONFIG_NO_HZ is not set
CONFIG_HIGH_RES_TIMERS=y
CONFIG_CLOCKSOURCE_WATCHDOG_MAX_SKEW_US=100
# end of Timers subsystem
CONFIG_BPF=y
CONFIG_HAVE_EBPF_JIT=y
CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y
#
# BPF subsystem
#
# CONFIG_BPF_SYSCALL is not set
# CONFIG_BPF_JIT is not set
# end of BPF subsystem
CONFIG_PREEMPT_BUILD=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_COUNT=y
CONFIG_PREEMPTION=y
CONFIG_PREEMPT_DYNAMIC=y
CONFIG_SCHED_CORE=y
#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
# CONFIG_IRQ_TIME_ACCOUNTING is not set
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y
# CONFIG_PSI is not set
# end of CPU/Task time and stats accounting
# CONFIG_CPU_ISOLATION is not set
#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
CONFIG_PREEMPT_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_TREE_SRCU=y
CONFIG_TASKS_RCU_GENERIC=y
CONFIG_NEED_TASKS_RCU=y
CONFIG_TASKS_RCU=y
CONFIG_TASKS_RUDE_RCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_NEED_SEGCBLIST=y
# end of RCU Subsystem
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_IKHEADERS is not set
CONFIG_LOG_BUF_SHIFT=25
CONFIG_LOG_CPU_MAX_BUF_SHIFT=12
# CONFIG_PRINTK_INDEX is not set
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
#
# Scheduler features
#
# CONFIG_UCLAMP_TASK is not set
# end of Scheduler features
CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y
CONFIG_CC_HAS_INT128=y
CONFIG_CC_IMPLICIT_FALLTHROUGH="-Wimplicit-fallthrough=5"
CONFIG_GCC10_NO_ARRAY_BOUNDS=y
CONFIG_CC_NO_ARRAY_BOUNDS=y
CONFIG_GCC_NO_STRINGOP_OVERFLOW=y
CONFIG_CC_NO_STRINGOP_OVERFLOW=y
CONFIG_ARCH_SUPPORTS_INT128=y
CONFIG_NUMA_BALANCING=y
CONFIG_NUMA_BALANCING_DEFAULT_ENABLED=y
CONFIG_CGROUPS=y
# CONFIG_CGROUP_FAVOR_DYNMODS is not set
# CONFIG_MEMCG is not set
# CONFIG_BLK_CGROUP is not set
CONFIG_CGROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
# CONFIG_CFS_BANDWIDTH is not set
# CONFIG_RT_GROUP_SCHED is not set
CONFIG_SCHED_MM_CID=y
# CONFIG_CGROUP_PIDS is not set
# CONFIG_CGROUP_RDMA is not set
# CONFIG_CGROUP_FREEZER is not set
# CONFIG_CGROUP_HUGETLB is not set
# CONFIG_CPUSETS is not set
# CONFIG_CGROUP_DEVICE is not set
# CONFIG_CGROUP_CPUACCT is not set
# CONFIG_CGROUP_PERF is not set
# CONFIG_CGROUP_MISC is not set
# CONFIG_CGROUP_DEBUG is not set
CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
# CONFIG_TIME_NS is not set
# CONFIG_IPC_NS is not set
# CONFIG_USER_NS is not set
# CONFIG_PID_NS is not set
# CONFIG_NET_NS is not set
# CONFIG_CHECKPOINT_RESTORE is not set
CONFIG_SCHED_AUTOGROUP=y
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
# CONFIG_RD_LZMA is not set
# CONFIG_RD_XZ is not set
# CONFIG_RD_LZO is not set
# CONFIG_RD_LZ4 is not set
# CONFIG_RD_ZSTD is not set
# CONFIG_BOOT_CONFIG is not set
CONFIG_INITRAMFS_PRESERVE_MTIME=y
CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_LD_ORPHAN_WARN=y
CONFIG_LD_ORPHAN_WARN_LEVEL="warn"
CONFIG_SYSCTL=y
CONFIG_HAVE_UID16=y
CONFIG_SYSCTL_EXCEPTION_TRACE=y
CONFIG_HAVE_PCSPKR_PLATFORM=y
# CONFIG_EXPERT is not set
CONFIG_UID16=y
CONFIG_MULTIUSER=y
CONFIG_SGETMASK_SYSCALL=y
CONFIG_SYSFS_SYSCALL=y
CONFIG_FHANDLE=y
CONFIG_POSIX_TIMERS=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_PCSPKR_PLATFORM=y
CONFIG_FUTEX=y
CONFIG_FUTEX_PI=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_IO_URING=y
CONFIG_ADVISE_SYSCALLS=y
CONFIG_MEMBARRIER=y
CONFIG_KCMP=y
CONFIG_RSEQ=y
CONFIG_CACHESTAT_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_SELFTEST is not set
CONFIG_KALLSYMS_ALL=y
CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
CONFIG_KALLSYMS_BASE_RELATIVE=y
CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y
CONFIG_HAVE_PERF_EVENTS=y
CONFIG_GUEST_PERF_EVENTS=y
#
# Kernel Performance Events And Counters
#
CONFIG_PERF_EVENTS=y
# CONFIG_DEBUG_PERF_USE_VMALLOC is not set
# end of Kernel Performance Events And Counters
CONFIG_SYSTEM_DATA_VERIFICATION=y
# CONFIG_PROFILING is not set
CONFIG_TRACEPOINTS=y
#
# Kexec and crash features
#
CONFIG_CRASH_RESERVE=y
CONFIG_VMCORE_INFO=y
CONFIG_KEXEC_CORE=y
CONFIG_KEXEC=y
CONFIG_KEXEC_FILE=y
# CONFIG_KEXEC_SIG is not set
# CONFIG_KEXEC_JUMP is not set
CONFIG_CRASH_DUMP=y
CONFIG_CRASH_HOTPLUG=y
CONFIG_CRASH_MAX_MEMORY_RANGES=8192
# end of Kexec and crash features
# end of General setup
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_AUDIT_ARCH=y
CONFIG_X86_64_SMP=y
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_DYNAMIC_PHYSICAL_MASK=y
CONFIG_PGTABLE_LEVELS=5
CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
#
# Processor type and features
#
CONFIG_SMP=y
CONFIG_X86_X2APIC=y
# CONFIG_X86_POSTED_MSI is not set
CONFIG_X86_MPPARSE=y
CONFIG_X86_CPU_RESCTRL=y
# CONFIG_X86_FRED is not set
# CONFIG_X86_EXTENDED_PLATFORM is not set
# CONFIG_X86_INTEL_LPSS is not set
CONFIG_X86_AMD_PLATFORM_DEVICE=y
# CONFIG_IOSF_MBI is not set
CONFIG_X86_SUPPORTS_MEMORY_FAILURE=y
CONFIG_SCHED_OMIT_FRAME_POINTER=y
# CONFIG_HYPERVISOR_GUEST is not set
CONFIG_MK8=y
# CONFIG_MPSC is not set
# CONFIG_MCORE2 is not set
# CONFIG_MATOM is not set
# CONFIG_GENERIC_CPU is not set
CONFIG_X86_INTERNODE_CACHE_SHIFT=6
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
CONFIG_X86_HAVE_PAE=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=64
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_IA32_FEAT_CTL=y
CONFIG_X86_VMX_FEATURE_NAMES=y
CONFIG_CPU_SUP_INTEL=y
CONFIG_CPU_SUP_AMD=y
CONFIG_CPU_SUP_HYGON=y
CONFIG_CPU_SUP_CENTAUR=y
CONFIG_CPU_SUP_ZHAOXIN=y
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_DMI=y
# CONFIG_GART_IOMMU is not set
CONFIG_BOOT_VESA_SUPPORT=y
# CONFIG_MAXSMP is not set
CONFIG_NR_CPUS_RANGE_BEGIN=2
CONFIG_NR_CPUS_RANGE_END=512
CONFIG_NR_CPUS_DEFAULT=64
CONFIG_NR_CPUS=16
CONFIG_SCHED_CLUSTER=y
CONFIG_SCHED_SMT=y
CONFIG_SCHED_MC=y
CONFIG_SCHED_MC_PRIO=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_ACPI_MADT_WAKEUP=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS=y
CONFIG_X86_MCE=y
# CONFIG_X86_MCELOG_LEGACY is not set
# CONFIG_X86_MCE_INTEL is not set
CONFIG_X86_MCE_AMD=y
CONFIG_X86_MCE_THRESHOLD=y
CONFIG_X86_MCE_INJECT=m
#
# Performance monitoring
#
# CONFIG_PERF_EVENTS_INTEL_UNCORE is not set
# CONFIG_PERF_EVENTS_INTEL_RAPL is not set
# CONFIG_PERF_EVENTS_INTEL_CSTATE is not set
CONFIG_PERF_EVENTS_AMD_POWER=m
CONFIG_PERF_EVENTS_AMD_UNCORE=y
# CONFIG_PERF_EVENTS_AMD_BRS is not set
# end of Performance monitoring
CONFIG_X86_16BIT=y
CONFIG_X86_ESPFIX64=y
CONFIG_X86_VSYSCALL_EMULATION=y
CONFIG_X86_IOPL_IOPERM=y
CONFIG_MICROCODE=y
# CONFIG_MICROCODE_LATE_LOADING is not set
CONFIG_X86_MSR=m
CONFIG_X86_CPUID=m
CONFIG_X86_DIRECT_GBPAGES=y
# CONFIG_X86_CPA_STATISTICS is not set
CONFIG_X86_MEM_ENCRYPT=y
CONFIG_AMD_MEM_ENCRYPT=y
CONFIG_NUMA=y
# CONFIG_AMD_NUMA is not set
CONFIG_X86_64_ACPI_NUMA=y
CONFIG_NUMA_EMU=y
CONFIG_NODES_SHIFT=1
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_DEFAULT=y
CONFIG_ARCH_PROC_KCORE_TEXT=y
CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000
# CONFIG_X86_PMEM_LEGACY is not set
# CONFIG_X86_CHECK_BIOS_CORRUPTION is not set
CONFIG_MTRR=y
CONFIG_MTRR_SANITIZER=y
CONFIG_MTRR_SANITIZER_ENABLE_DEFAULT=0
CONFIG_MTRR_SANITIZER_SPARE_REG_NR_DEFAULT=1
CONFIG_X86_PAT=y
CONFIG_ARCH_USES_PG_UNCACHED=y
CONFIG_X86_UMIP=y
CONFIG_CC_HAS_IBT=y
CONFIG_X86_CET=y
CONFIG_X86_KERNEL_IBT=y
CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS=y
# CONFIG_X86_INTEL_TSX_MODE_OFF is not set
# CONFIG_X86_INTEL_TSX_MODE_ON is not set
CONFIG_X86_INTEL_TSX_MODE_AUTO=y
# CONFIG_X86_SGX is not set
# CONFIG_X86_USER_SHADOW_STACK is not set
CONFIG_EFI=y
CONFIG_EFI_STUB=y
CONFIG_EFI_HANDOVER_PROTOCOL=y
CONFIG_EFI_MIXED=y
# CONFIG_EFI_FAKE_MEMMAP is not set
CONFIG_EFI_RUNTIME_MAP=y
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
CONFIG_SCHED_HRTICK=y
CONFIG_ARCH_SUPPORTS_KEXEC=y
CONFIG_ARCH_SUPPORTS_KEXEC_FILE=y
CONFIG_ARCH_SELECTS_KEXEC_FILE=y
CONFIG_ARCH_SUPPORTS_KEXEC_PURGATORY=y
CONFIG_ARCH_SUPPORTS_KEXEC_SIG=y
CONFIG_ARCH_SUPPORTS_KEXEC_SIG_FORCE=y
CONFIG_ARCH_SUPPORTS_KEXEC_BZIMAGE_VERIFY_SIG=y
CONFIG_ARCH_SUPPORTS_KEXEC_JUMP=y
CONFIG_ARCH_SUPPORTS_CRASH_DUMP=y
CONFIG_ARCH_SUPPORTS_CRASH_HOTPLUG=y
CONFIG_ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION=y
CONFIG_PHYSICAL_START=0x1000000
CONFIG_RELOCATABLE=y
# CONFIG_RANDOMIZE_BASE is not set
CONFIG_PHYSICAL_ALIGN=0x200000
# CONFIG_ADDRESS_MASKING is not set
CONFIG_HOTPLUG_CPU=y
CONFIG_COMPAT_VDSO=y
# CONFIG_LEGACY_VSYSCALL_XONLY is not set
CONFIG_LEGACY_VSYSCALL_NONE=y
# CONFIG_CMDLINE_BOOL is not set
CONFIG_MODIFY_LDT_SYSCALL=y
# CONFIG_STRICT_SIGALTSTACK_SIZE is not set
CONFIG_HAVE_LIVEPATCH=y
# CONFIG_LIVEPATCH is not set
# end of Processor type and features
CONFIG_CC_HAS_NAMED_AS=y
CONFIG_USE_X86_SEG_SUPPORT=y
CONFIG_CC_HAS_SLS=y
CONFIG_CC_HAS_RETURN_THUNK=y
CONFIG_CC_HAS_ENTRY_PADDING=y
CONFIG_FUNCTION_PADDING_CFI=11
CONFIG_FUNCTION_PADDING_BYTES=16
CONFIG_CALL_PADDING=y
CONFIG_HAVE_CALL_THUNKS=y
CONFIG_CALL_THUNKS=y
CONFIG_PREFIX_SYMBOLS=y
CONFIG_CPU_MITIGATIONS=y
CONFIG_MITIGATION_PAGE_TABLE_ISOLATION=y
CONFIG_MITIGATION_RETPOLINE=y
CONFIG_MITIGATION_RETHUNK=y
CONFIG_MITIGATION_UNRET_ENTRY=y
CONFIG_MITIGATION_CALL_DEPTH_TRACKING=y
# CONFIG_CALL_THUNKS_DEBUG is not set
CONFIG_MITIGATION_IBPB_ENTRY=y
CONFIG_MITIGATION_IBRS_ENTRY=y
CONFIG_MITIGATION_SRSO=y
# CONFIG_MITIGATION_SLS is not set
# CONFIG_MITIGATION_GDS_FORCE is not set
CONFIG_MITIGATION_RFDS=y
CONFIG_MITIGATION_SPECTRE_BHI=y
CONFIG_ARCH_HAS_ADD_PAGES=y
#
# Power management and ACPI options
#
CONFIG_ARCH_HIBERNATION_HEADER=y
CONFIG_SUSPEND=y
CONFIG_SUSPEND_FREEZER=y
CONFIG_HIBERNATE_CALLBACKS=y
CONFIG_HIBERNATION=y
# CONFIG_HIBERNATION_SNAPSHOT_DEV is not set
CONFIG_HIBERNATION_COMP_LZO=y
# CONFIG_HIBERNATION_COMP_LZ4 is not set
CONFIG_HIBERNATION_DEF_COMP="lzo"
CONFIG_PM_STD_PARTITION=""
CONFIG_PM_SLEEP=y
CONFIG_PM_SLEEP_SMP=y
# CONFIG_PM_AUTOSLEEP is not set
# CONFIG_PM_USERSPACE_AUTOSLEEP is not set
# CONFIG_PM_WAKELOCKS is not set
CONFIG_PM=y
# CONFIG_PM_DEBUG is not set
CONFIG_PM_CLK=y
CONFIG_WQ_POWER_EFFICIENT_DEFAULT=y
# CONFIG_ENERGY_MODEL is not set
CONFIG_ARCH_SUPPORTS_ACPI=y
CONFIG_ACPI=y
CONFIG_ACPI_LEGACY_TABLES_LOOKUP=y
CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC=y
CONFIG_ACPI_SYSTEM_POWER_STATES_SUPPORT=y
CONFIG_ACPI_THERMAL_LIB=y
# CONFIG_ACPI_DEBUGGER is not set
CONFIG_ACPI_SPCR_TABLE=y
# CONFIG_ACPI_FPDT is not set
CONFIG_ACPI_LPIT=y
CONFIG_ACPI_SLEEP=y
# CONFIG_ACPI_REV_OVERRIDE_POSSIBLE is not set
# CONFIG_ACPI_EC_DEBUGFS is not set
# CONFIG_ACPI_AC is not set
# CONFIG_ACPI_BATTERY is not set
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_VIDEO=m
# CONFIG_ACPI_FAN is not set
# CONFIG_ACPI_TAD is not set
# CONFIG_ACPI_DOCK is not set
CONFIG_ACPI_CPU_FREQ_PSS=y
CONFIG_ACPI_PROCESSOR_CSTATE=y
CONFIG_ACPI_PROCESSOR_IDLE=y
CONFIG_ACPI_CPPC_LIB=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_HOTPLUG_CPU=y
CONFIG_ACPI_PROCESSOR_AGGREGATOR=y
CONFIG_ACPI_THERMAL=y
CONFIG_ARCH_HAS_ACPI_TABLE_UPGRADE=y
# CONFIG_ACPI_TABLE_UPGRADE is not set
# CONFIG_ACPI_DEBUG is not set
CONFIG_ACPI_PCI_SLOT=y
CONFIG_ACPI_CONTAINER=y
CONFIG_ACPI_HOTPLUG_IOAPIC=y
# CONFIG_ACPI_SBS is not set
CONFIG_ACPI_HED=y
# CONFIG_ACPI_BGRT is not set
CONFIG_ACPI_NHLT=y
CONFIG_ACPI_NFIT=y
# CONFIG_NFIT_SECURITY_DEBUG is not set
CONFIG_ACPI_NUMA=y
# CONFIG_ACPI_HMAT is not set
CONFIG_HAVE_ACPI_APEI=y
CONFIG_HAVE_ACPI_APEI_NMI=y
CONFIG_ACPI_APEI=y
CONFIG_ACPI_APEI_GHES=y
CONFIG_ACPI_APEI_PCIEAER=y
CONFIG_ACPI_APEI_MEMORY_FAILURE=y
# CONFIG_ACPI_APEI_EINJ is not set
# CONFIG_ACPI_APEI_ERST_DEBUG is not set
# CONFIG_ACPI_DPTF is not set
# CONFIG_ACPI_EXTLOG is not set
# CONFIG_ACPI_CONFIGFS is not set
# CONFIG_ACPI_PFRUT is not set
CONFIG_ACPI_PCC=y
# CONFIG_ACPI_FFH is not set
# CONFIG_PMIC_OPREGION is not set
CONFIG_ACPI_VIOT=y
# CONFIG_ACPI_PRMT is not set
CONFIG_X86_PM_TIMER=y
#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_GOV_ATTR_SET=y
CONFIG_CPU_FREQ_GOV_COMMON=y
# CONFIG_CPU_FREQ_STAT is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL=y
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=m
CONFIG_CPU_FREQ_GOV_USERSPACE=m
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=m
CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y
#
# CPU frequency scaling drivers
#
CONFIG_X86_INTEL_PSTATE=y
# CONFIG_X86_PCC_CPUFREQ is not set
CONFIG_X86_AMD_PSTATE=y
CONFIG_X86_AMD_PSTATE_DEFAULT_MODE=3
# CONFIG_X86_AMD_PSTATE_UT is not set
CONFIG_X86_ACPI_CPUFREQ=m
CONFIG_X86_ACPI_CPUFREQ_CPB=y
CONFIG_X86_POWERNOW_K8=m
CONFIG_X86_AMD_FREQ_SENSITIVITY=m
# CONFIG_X86_SPEEDSTEP_CENTRINO is not set
# CONFIG_X86_P4_CLOCKMOD is not set
#
# shared options
#
# end of CPU Frequency scaling
#
# CPU Idle
#
CONFIG_CPU_IDLE=y
CONFIG_CPU_IDLE_GOV_LADDER=y
CONFIG_CPU_IDLE_GOV_MENU=y
# CONFIG_CPU_IDLE_GOV_TEO is not set
# end of CPU Idle
# CONFIG_INTEL_IDLE is not set
# end of Power management and ACPI options
#
# Bus options (PCI etc.)
#
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_MMCONF_FAM10H=y
CONFIG_ISA_DMA_API=y
CONFIG_AMD_NB=y
# end of Bus options (PCI etc.)
#
# Binary Emulations
#
CONFIG_IA32_EMULATION=y
# CONFIG_IA32_EMULATION_DEFAULT_DISABLED is not set
# CONFIG_X86_X32_ABI is not set
CONFIG_COMPAT_32=y
CONFIG_COMPAT=y
CONFIG_COMPAT_FOR_U64_ALIGNMENT=y
# end of Binary Emulations
CONFIG_KVM_COMMON=y
CONFIG_HAVE_KVM_PFNCACHE=y
CONFIG_HAVE_KVM_IRQCHIP=y
CONFIG_HAVE_KVM_IRQ_ROUTING=y
CONFIG_HAVE_KVM_DIRTY_RING=y
CONFIG_HAVE_KVM_DIRTY_RING_TSO=y
CONFIG_HAVE_KVM_DIRTY_RING_ACQ_REL=y
CONFIG_KVM_MMIO=y
CONFIG_KVM_ASYNC_PF=y
CONFIG_HAVE_KVM_MSI=y
CONFIG_HAVE_KVM_READONLY_MEM=y
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y
CONFIG_KVM_VFIO=y
CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT=y
CONFIG_KVM_COMPAT=y
CONFIG_HAVE_KVM_IRQ_BYPASS=y
CONFIG_HAVE_KVM_NO_POLL=y
CONFIG_KVM_XFER_TO_GUEST_WORK=y
CONFIG_HAVE_KVM_PM_NOTIFIER=y
CONFIG_KVM_GENERIC_HARDWARE_ENABLING=y
CONFIG_KVM_GENERIC_MMU_NOTIFIER=y
CONFIG_VIRTUALIZATION=y
CONFIG_KVM=m
# CONFIG_KVM_INTEL is not set
CONFIG_KVM_AMD=m
CONFIG_KVM_AMD_SEV=y
CONFIG_KVM_SMM=y
CONFIG_KVM_HYPERV=y
# CONFIG_KVM_XEN is not set
CONFIG_KVM_MAX_NR_VCPUS=1024
CONFIG_AS_AVX512=y
CONFIG_AS_SHA1_NI=y
CONFIG_AS_SHA256_NI=y
CONFIG_AS_TPAUSE=y
CONFIG_AS_GFNI=y
CONFIG_AS_VAES=y
CONFIG_AS_VPCLMULQDQ=y
CONFIG_AS_WRUSS=y
CONFIG_ARCH_CONFIGURES_CPU_MITIGATIONS=y
#
# General architecture-dependent options
#
CONFIG_HOTPLUG_SMT=y
CONFIG_HOTPLUG_CORE_SYNC=y
CONFIG_HOTPLUG_CORE_SYNC_DEAD=y
CONFIG_HOTPLUG_CORE_SYNC_FULL=y
CONFIG_HOTPLUG_SPLIT_STARTUP=y
CONFIG_HOTPLUG_PARALLEL=y
CONFIG_GENERIC_ENTRY=y
CONFIG_KPROBES=y
CONFIG_JUMP_LABEL=y
# CONFIG_STATIC_KEYS_SELFTEST is not set
# CONFIG_STATIC_CALL_SELFTEST is not set
CONFIG_OPTPROBES=y
CONFIG_KPROBES_ON_FTRACE=y
CONFIG_UPROBES=y
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_ARCH_USE_BUILTIN_BSWAP=y
CONFIG_KRETPROBES=y
CONFIG_KRETPROBE_ON_RETHOOK=y
CONFIG_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_OPTPROBES=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
CONFIG_ARCH_CORRECT_STACKTRACE_ON_KRETPROBE=y
CONFIG_HAVE_FUNCTION_ERROR_INJECTION=y
CONFIG_HAVE_NMI=y
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
CONFIG_TRACE_IRQFLAGS_NMI_SUPPORT=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_HAVE_DMA_CONTIGUOUS=y
CONFIG_GENERIC_SMP_IDLE_THREAD=y
CONFIG_ARCH_HAS_FORTIFY_SOURCE=y
CONFIG_ARCH_HAS_SET_MEMORY=y
CONFIG_ARCH_HAS_SET_DIRECT_MAP=y
CONFIG_ARCH_HAS_CPU_FINALIZE_INIT=y
CONFIG_ARCH_HAS_CPU_PASID=y
CONFIG_HAVE_ARCH_THREAD_STRUCT_WHITELIST=y
CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT=y
CONFIG_ARCH_WANTS_NO_INSTR=y
CONFIG_HAVE_ASM_MODVERSIONS=y
CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
CONFIG_HAVE_RSEQ=y
CONFIG_HAVE_RUST=y
CONFIG_HAVE_FUNCTION_ARG_ACCESS_API=y
CONFIG_HAVE_HW_BREAKPOINT=y
CONFIG_HAVE_MIXED_BREAKPOINTS_REGS=y
CONFIG_HAVE_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_PERF_EVENTS_NMI=y
CONFIG_HAVE_HARDLOCKUP_DETECTOR_PERF=y
CONFIG_HAVE_PERF_REGS=y
CONFIG_HAVE_PERF_USER_STACK_DUMP=y
CONFIG_HAVE_ARCH_JUMP_LABEL=y
CONFIG_HAVE_ARCH_JUMP_LABEL_RELATIVE=y
CONFIG_MMU_GATHER_MERGE_VMAS=y
CONFIG_MMU_LAZY_TLB_REFCOUNT=y
CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS=y
CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y
CONFIG_HAVE_CMPXCHG_LOCAL=y
CONFIG_HAVE_CMPXCHG_DOUBLE=y
CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSION=y
CONFIG_ARCH_WANT_OLD_COMPAT_IPC=y
CONFIG_HAVE_ARCH_SECCOMP=y
CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
CONFIG_SECCOMP=y
CONFIG_SECCOMP_FILTER=y
# CONFIG_SECCOMP_CACHE_DEBUG is not set
CONFIG_HAVE_ARCH_STACKLEAK=y
CONFIG_HAVE_STACKPROTECTOR=y
CONFIG_STACKPROTECTOR=y
# CONFIG_STACKPROTECTOR_STRONG is not set
CONFIG_ARCH_SUPPORTS_LTO_CLANG=y
CONFIG_ARCH_SUPPORTS_LTO_CLANG_THIN=y
CONFIG_LTO_NONE=y
CONFIG_ARCH_SUPPORTS_CFI_CLANG=y
CONFIG_HAVE_ARCH_WITHIN_STACK_FRAMES=y
CONFIG_HAVE_CONTEXT_TRACKING_USER=y
CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK=y
CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
CONFIG_HAVE_MOVE_PUD=y
CONFIG_HAVE_MOVE_PMD=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
CONFIG_HAVE_ARCH_HUGE_VMAP=y
CONFIG_HAVE_ARCH_HUGE_VMALLOC=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_PMD_MKWRITE=y
CONFIG_HAVE_ARCH_SOFT_DIRTY=y
CONFIG_HAVE_MOD_ARCH_SPECIFIC=y
CONFIG_MODULES_USE_ELF_RELA=y
CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK=y
CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK=y
CONFIG_SOFTIRQ_ON_OWN_STACK=y
CONFIG_ARCH_HAS_ELF_RANDOMIZE=y
CONFIG_HAVE_ARCH_MMAP_RND_BITS=y
CONFIG_HAVE_EXIT_THREAD=y
CONFIG_ARCH_MMAP_RND_BITS=28
CONFIG_HAVE_ARCH_MMAP_RND_COMPAT_BITS=y
CONFIG_ARCH_MMAP_RND_COMPAT_BITS=8
CONFIG_HAVE_ARCH_COMPAT_MMAP_BASES=y
CONFIG_HAVE_PAGE_SIZE_4KB=y
CONFIG_PAGE_SIZE_4KB=y
CONFIG_PAGE_SIZE_LESS_THAN_64KB=y
CONFIG_PAGE_SIZE_LESS_THAN_256KB=y
CONFIG_PAGE_SHIFT=12
CONFIG_HAVE_OBJTOOL=y
CONFIG_HAVE_JUMP_LABEL_HACK=y
CONFIG_HAVE_NOINSTR_HACK=y
CONFIG_HAVE_NOINSTR_VALIDATION=y
CONFIG_HAVE_UACCESS_VALIDATION=y
CONFIG_HAVE_STACK_VALIDATION=y
CONFIG_HAVE_RELIABLE_STACKTRACE=y
CONFIG_OLD_SIGSUSPEND3=y
CONFIG_COMPAT_OLD_SIGACTION=y
CONFIG_COMPAT_32BIT_TIME=y
CONFIG_HAVE_ARCH_VMAP_STACK=y
CONFIG_VMAP_STACK=y
CONFIG_HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET=y
CONFIG_RANDOMIZE_KSTACK_OFFSET=y
# CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT is not set
CONFIG_ARCH_HAS_STRICT_KERNEL_RWX=y
CONFIG_STRICT_KERNEL_RWX=y
CONFIG_ARCH_HAS_STRICT_MODULE_RWX=y
CONFIG_STRICT_MODULE_RWX=y
CONFIG_HAVE_ARCH_PREL32_RELOCATIONS=y
CONFIG_ARCH_USE_MEMREMAP_PROT=y
# CONFIG_LOCK_EVENT_COUNTS is not set
CONFIG_ARCH_HAS_MEM_ENCRYPT=y
CONFIG_ARCH_HAS_CC_PLATFORM=y
CONFIG_HAVE_STATIC_CALL=y
CONFIG_HAVE_STATIC_CALL_INLINE=y
CONFIG_HAVE_PREEMPT_DYNAMIC=y
CONFIG_HAVE_PREEMPT_DYNAMIC_CALL=y
CONFIG_ARCH_WANT_LD_ORPHAN_WARN=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_ARCH_SUPPORTS_PAGE_TABLE_CHECK=y
CONFIG_ARCH_HAS_ELFCORE_COMPAT=y
CONFIG_ARCH_HAS_PARANOID_L1D_FLUSH=y
CONFIG_DYNAMIC_SIGFRAME=y
CONFIG_ARCH_HAS_HW_PTE_YOUNG=y
CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG=y
CONFIG_ARCH_HAS_KERNEL_FPU_SUPPORT=y
#
# GCOV-based kernel profiling
#
# CONFIG_GCOV_KERNEL is not set
CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
# end of GCOV-based kernel profiling
CONFIG_HAVE_GCC_PLUGINS=y
CONFIG_FUNCTION_ALIGNMENT_4B=y
CONFIG_FUNCTION_ALIGNMENT_16B=y
CONFIG_FUNCTION_ALIGNMENT=16
# end of General architecture-dependent options
CONFIG_RT_MUTEXES=y
CONFIG_MODULE_SIG_FORMAT=y
CONFIG_MODULES=y
# CONFIG_MODULE_DEBUG is not set
# CONFIG_MODULE_FORCE_LOAD is not set
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
# CONFIG_MODULE_UNLOAD_TAINT_TRACKING is not set
# CONFIG_MODVERSIONS is not set
CONFIG_MODULE_SRCVERSION_ALL=y
CONFIG_MODULE_SIG=y
CONFIG_MODULE_SIG_FORCE=y
CONFIG_MODULE_SIG_ALL=y
# CONFIG_MODULE_SIG_SHA1 is not set
CONFIG_MODULE_SIG_SHA256=y
# CONFIG_MODULE_SIG_SHA384 is not set
# CONFIG_MODULE_SIG_SHA512 is not set
# CONFIG_MODULE_SIG_SHA3_256 is not set
# CONFIG_MODULE_SIG_SHA3_384 is not set
# CONFIG_MODULE_SIG_SHA3_512 is not set
CONFIG_MODULE_SIG_HASH="sha256"
CONFIG_MODULE_COMPRESS_NONE=y
# CONFIG_MODULE_COMPRESS_GZIP is not set
# CONFIG_MODULE_COMPRESS_XZ is not set
# CONFIG_MODULE_COMPRESS_ZSTD is not set
# CONFIG_MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS is not set
CONFIG_MODPROBE_PATH="/sbin/modprobe"
# CONFIG_TRIM_UNUSED_KSYMS is not set
CONFIG_MODULES_TREE_LOOKUP=y
CONFIG_BLOCK=y
CONFIG_BLOCK_LEGACY_AUTOLOAD=y
CONFIG_BLK_CGROUP_PUNT_BIO=y
CONFIG_BLK_DEV_BSG_COMMON=y
# CONFIG_BLK_DEV_BSGLIB is not set
CONFIG_BLK_DEV_INTEGRITY=y
CONFIG_BLK_DEV_INTEGRITY_T10=y
CONFIG_BLK_DEV_WRITE_MOUNTED=y
# CONFIG_BLK_DEV_ZONED is not set
CONFIG_BLK_WBT=y
CONFIG_BLK_WBT_MQ=y
# CONFIG_BLK_DEBUG_FS is not set
# CONFIG_BLK_SED_OPAL is not set
# CONFIG_BLK_INLINE_ENCRYPTION is not set
#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
# CONFIG_ACORN_PARTITION is not set
# CONFIG_AIX_PARTITION is not set
# CONFIG_OSF_PARTITION is not set
# CONFIG_AMIGA_PARTITION is not set
# CONFIG_ATARI_PARTITION is not set
# CONFIG_MAC_PARTITION is not set
CONFIG_MSDOS_PARTITION=y
CONFIG_BSD_DISKLABEL=y
CONFIG_MINIX_SUBPARTITION=y
CONFIG_SOLARIS_X86_PARTITION=y
CONFIG_UNIXWARE_DISKLABEL=y
# CONFIG_LDM_PARTITION is not set
# CONFIG_SGI_PARTITION is not set
# CONFIG_ULTRIX_PARTITION is not set
# CONFIG_SUN_PARTITION is not set
# CONFIG_KARMA_PARTITION is not set
CONFIG_EFI_PARTITION=y
# CONFIG_SYSV68_PARTITION is not set
# CONFIG_CMDLINE_PARTITION is not set
# end of Partition Types
CONFIG_BLK_MQ_PCI=y
CONFIG_BLK_MQ_VIRTIO=y
CONFIG_BLK_PM=y
CONFIG_BLOCK_HOLDER_DEPRECATED=y
CONFIG_BLK_MQ_STACKING=y
#
# IO Schedulers
#
CONFIG_MQ_IOSCHED_DEADLINE=y
# CONFIG_MQ_IOSCHED_KYBER is not set
# CONFIG_IOSCHED_BFQ is not set
# end of IO Schedulers
CONFIG_PREEMPT_NOTIFIERS=y
CONFIG_PADATA=y
CONFIG_ASN1=y
CONFIG_UNINLINE_SPIN_UNLOCK=y
CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y
CONFIG_MUTEX_SPIN_ON_OWNER=y
CONFIG_RWSEM_SPIN_ON_OWNER=y
CONFIG_LOCK_SPIN_ON_OWNER=y
CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
CONFIG_QUEUED_SPINLOCKS=y
CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
CONFIG_QUEUED_RWLOCKS=y
CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE=y
CONFIG_ARCH_HAS_SYNC_CORE_BEFORE_USERMODE=y
CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y
CONFIG_FREEZER=y
#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
CONFIG_COMPAT_BINFMT_ELF=y
CONFIG_ELFCORE=y
CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y
CONFIG_BINFMT_SCRIPT=y
CONFIG_BINFMT_MISC=y
CONFIG_COREDUMP=y
# end of Executable file formats
#
# Memory Management options
#
CONFIG_SWAP=y
# CONFIG_ZSWAP is not set
#
# Slab allocator options
#
CONFIG_SLUB=y
CONFIG_SLAB_MERGE_DEFAULT=y
CONFIG_SLAB_FREELIST_RANDOM=y
CONFIG_SLAB_FREELIST_HARDENED=y
# CONFIG_SLUB_STATS is not set
CONFIG_SLUB_CPU_PARTIAL=y
# CONFIG_RANDOM_KMALLOC_CACHES is not set
# end of Slab allocator options
CONFIG_SHUFFLE_PAGE_ALLOCATOR=y
# CONFIG_COMPAT_BRK is not set
CONFIG_SPARSEMEM=y
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
CONFIG_SPARSEMEM_VMEMMAP=y
CONFIG_ARCH_WANT_OPTIMIZE_DAX_VMEMMAP=y
CONFIG_ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP=y
CONFIG_HAVE_GUP_FAST=y
CONFIG_MEMORY_ISOLATION=y
CONFIG_EXCLUSIVE_SYSTEM_RAM=y
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
# CONFIG_MEMORY_HOTPLUG is not set
CONFIG_ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y
CONFIG_COMPACTION=y
CONFIG_COMPACT_UNEVICTABLE_DEFAULT=1
# CONFIG_PAGE_REPORTING is not set
CONFIG_MIGRATION=y
CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y
CONFIG_ARCH_ENABLE_THP_MIGRATION=y
CONFIG_CONTIG_ALLOC=y
CONFIG_PCP_BATCH_SCALE_MAX=5
CONFIG_PHYS_ADDR_T_64BIT=y
CONFIG_MMU_NOTIFIER=y
CONFIG_KSM=y
CONFIG_DEFAULT_MMAP_MIN_ADDR=65536
CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
CONFIG_MEMORY_FAILURE=y
# CONFIG_HWPOISON_INJECT is not set
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ARCH_WANTS_THP_SWAP=y
CONFIG_TRANSPARENT_HUGEPAGE=y
# CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS is not set
CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
# CONFIG_TRANSPARENT_HUGEPAGE_NEVER is not set
CONFIG_THP_SWAP=y
# CONFIG_READ_ONLY_THP_FOR_FS is not set
CONFIG_PGTABLE_HAS_HUGE_LEAVES=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_USE_PERCPU_NUMA_NODE_ID=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
# CONFIG_CMA is not set
CONFIG_GENERIC_EARLY_IOREMAP=y
# CONFIG_DEFERRED_STRUCT_PAGE_INIT is not set
# CONFIG_IDLE_PAGE_TRACKING is not set
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_ARCH_HAS_CURRENT_STACK_POINTER=y
CONFIG_ARCH_HAS_PTE_DEVMAP=y
CONFIG_ZONE_DMA=y
CONFIG_ZONE_DMA32=y
CONFIG_ARCH_USES_HIGH_VMA_FLAGS=y
CONFIG_ARCH_HAS_PKEYS=y
CONFIG_VM_EVENT_COUNTERS=y
# CONFIG_PERCPU_STATS is not set
# CONFIG_GUP_TEST is not set
# CONFIG_DMAPOOL_TEST is not set
CONFIG_ARCH_HAS_PTE_SPECIAL=y
CONFIG_MEMFD_CREATE=y
CONFIG_SECRETMEM=y
# CONFIG_ANON_VMA_NAME is not set
# CONFIG_USERFAULTFD is not set
# CONFIG_LRU_GEN is not set
CONFIG_ARCH_SUPPORTS_PER_VMA_LOCK=y
CONFIG_PER_VMA_LOCK=y
CONFIG_LOCK_MM_AND_FIND_VMA=y
CONFIG_IOMMU_MM_DATA=y
CONFIG_EXECMEM=y
#
# Data Access Monitoring
#
# CONFIG_DAMON is not set
# end of Data Access Monitoring
# end of Memory Management options
CONFIG_NET=y
CONFIG_NET_INGRESS=y
CONFIG_NET_EGRESS=y
CONFIG_SKB_EXTENSIONS=y
#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_DIAG=m
CONFIG_UNIX=y
CONFIG_AF_UNIX_OOB=y
CONFIG_UNIX_DIAG=m
# CONFIG_TLS is not set
CONFIG_XFRM=y
CONFIG_XFRM_OFFLOAD=y
CONFIG_XFRM_ALGO=m
CONFIG_XFRM_USER=m
# CONFIG_XFRM_USER_COMPAT is not set
# CONFIG_XFRM_INTERFACE is not set
CONFIG_XFRM_SUB_POLICY=y
CONFIG_XFRM_MIGRATE=y
# CONFIG_XFRM_STATISTICS is not set
CONFIG_XFRM_AH=m
CONFIG_XFRM_ESP=m
CONFIG_XFRM_IPCOMP=m
# CONFIG_NET_KEY is not set
CONFIG_NET_HANDSHAKE=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_IP_FIB_TRIE_STATS=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_MULTIPATH=y
CONFIG_IP_ROUTE_VERBOSE=y
CONFIG_IP_ROUTE_CLASSID=y
# CONFIG_IP_PNP is not set
CONFIG_NET_IPIP=m
CONFIG_NET_IPGRE_DEMUX=m
CONFIG_NET_IP_TUNNEL=m
CONFIG_NET_IPGRE=m
CONFIG_NET_IPGRE_BROADCAST=y
CONFIG_IP_MROUTE_COMMON=y
CONFIG_IP_MROUTE=y
CONFIG_IP_MROUTE_MULTIPLE_TABLES=y
CONFIG_IP_PIMSM_V1=y
CONFIG_IP_PIMSM_V2=y
CONFIG_SYN_COOKIES=y
CONFIG_NET_IPVTI=m
CONFIG_NET_UDP_TUNNEL=m
CONFIG_NET_FOU=m
CONFIG_NET_FOU_IP_TUNNELS=y
CONFIG_INET_AH=m
CONFIG_INET_ESP=m
CONFIG_INET_ESP_OFFLOAD=m
# CONFIG_INET_ESPINTCP is not set
CONFIG_INET_IPCOMP=m
CONFIG_INET_TABLE_PERTURB_ORDER=16
CONFIG_INET_XFRM_TUNNEL=m
CONFIG_INET_TUNNEL=m
CONFIG_INET_DIAG=m
CONFIG_INET_TCP_DIAG=m
CONFIG_INET_UDP_DIAG=m
# CONFIG_INET_RAW_DIAG is not set
CONFIG_INET_DIAG_DESTROY=y
# CONFIG_TCP_CONG_ADVANCED is not set
CONFIG_TCP_CONG_CUBIC=y
CONFIG_DEFAULT_TCP_CONG="cubic"
CONFIG_TCP_SIGPOOL=y
# CONFIG_TCP_AO is not set
CONFIG_TCP_MD5SIG=y
CONFIG_IPV6=y
CONFIG_IPV6_ROUTER_PREF=y
CONFIG_IPV6_ROUTE_INFO=y
CONFIG_IPV6_OPTIMISTIC_DAD=y
CONFIG_INET6_AH=m
CONFIG_INET6_ESP=m
CONFIG_INET6_ESP_OFFLOAD=m
# CONFIG_INET6_ESPINTCP is not set
CONFIG_INET6_IPCOMP=m
CONFIG_IPV6_MIP6=y
# CONFIG_IPV6_ILA is not set
CONFIG_INET6_XFRM_TUNNEL=m
CONFIG_INET6_TUNNEL=m
CONFIG_IPV6_VTI=m
CONFIG_IPV6_SIT=m
CONFIG_IPV6_SIT_6RD=y
CONFIG_IPV6_NDISC_NODETYPE=y
CONFIG_IPV6_TUNNEL=m
CONFIG_IPV6_GRE=m
CONFIG_IPV6_FOU=m
CONFIG_IPV6_FOU_TUNNEL=m
CONFIG_IPV6_MULTIPLE_TABLES=y
CONFIG_IPV6_SUBTREES=y
CONFIG_IPV6_MROUTE=y
CONFIG_IPV6_MROUTE_MULTIPLE_TABLES=y
CONFIG_IPV6_PIMSM_V2=y
# CONFIG_IPV6_SEG6_LWTUNNEL is not set
# CONFIG_IPV6_SEG6_HMAC is not set
# CONFIG_IPV6_RPL_LWTUNNEL is not set
# CONFIG_IPV6_IOAM6_LWTUNNEL is not set
# CONFIG_MPTCP is not set
# CONFIG_NETWORK_SECMARK is not set
# CONFIG_NETWORK_PHY_TIMESTAMPING is not set
CONFIG_NETFILTER=y
CONFIG_NETFILTER_ADVANCED=y
#
# Core Netfilter Configuration
#
CONFIG_NETFILTER_INGRESS=y
CONFIG_NETFILTER_EGRESS=y
CONFIG_NETFILTER_NETLINK=m
CONFIG_NETFILTER_FAMILY_ARP=y
# CONFIG_NETFILTER_NETLINK_HOOK is not set
CONFIG_NETFILTER_NETLINK_ACCT=m
CONFIG_NETFILTER_NETLINK_QUEUE=m
CONFIG_NETFILTER_NETLINK_LOG=m
CONFIG_NETFILTER_NETLINK_OSF=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_LOG_SYSLOG=m
CONFIG_NETFILTER_CONNCOUNT=m
CONFIG_NF_CONNTRACK_MARK=y
CONFIG_NF_CONNTRACK_ZONES=y
CONFIG_NF_CONNTRACK_PROCFS=y
CONFIG_NF_CONNTRACK_EVENTS=y
CONFIG_NF_CONNTRACK_TIMEOUT=y
CONFIG_NF_CONNTRACK_TIMESTAMP=y
CONFIG_NF_CONNTRACK_LABELS=y
CONFIG_NF_CT_PROTO_DCCP=y
CONFIG_NF_CT_PROTO_GRE=y
CONFIG_NF_CT_PROTO_SCTP=y
CONFIG_NF_CT_PROTO_UDPLITE=y
CONFIG_NF_CONNTRACK_AMANDA=m
CONFIG_NF_CONNTRACK_FTP=m
CONFIG_NF_CONNTRACK_H323=m
CONFIG_NF_CONNTRACK_IRC=m
CONFIG_NF_CONNTRACK_BROADCAST=m
CONFIG_NF_CONNTRACK_NETBIOS_NS=m
CONFIG_NF_CONNTRACK_SNMP=m
CONFIG_NF_CONNTRACK_PPTP=m
CONFIG_NF_CONNTRACK_SANE=m
CONFIG_NF_CONNTRACK_SIP=m
CONFIG_NF_CONNTRACK_TFTP=m
CONFIG_NF_CT_NETLINK=m
CONFIG_NF_CT_NETLINK_TIMEOUT=m
CONFIG_NF_CT_NETLINK_HELPER=m
CONFIG_NETFILTER_NETLINK_GLUE_CT=y
CONFIG_NF_NAT=m
CONFIG_NF_NAT_AMANDA=m
CONFIG_NF_NAT_FTP=m
CONFIG_NF_NAT_IRC=m
CONFIG_NF_NAT_SIP=m
CONFIG_NF_NAT_TFTP=m
CONFIG_NF_NAT_REDIRECT=y
CONFIG_NF_NAT_MASQUERADE=y
CONFIG_NETFILTER_SYNPROXY=m
CONFIG_NF_TABLES=m
CONFIG_NF_TABLES_INET=y
CONFIG_NF_TABLES_NETDEV=y
CONFIG_NFT_NUMGEN=m
CONFIG_NFT_CT=m
CONFIG_NFT_CONNLIMIT=m
CONFIG_NFT_LOG=m
CONFIG_NFT_LIMIT=m
CONFIG_NFT_MASQ=m
CONFIG_NFT_REDIR=m
CONFIG_NFT_NAT=m
CONFIG_NFT_TUNNEL=m
CONFIG_NFT_QUEUE=m
CONFIG_NFT_QUOTA=m
CONFIG_NFT_REJECT=m
CONFIG_NFT_REJECT_INET=m
CONFIG_NFT_COMPAT=m
CONFIG_NFT_HASH=m
CONFIG_NFT_XFRM=m
CONFIG_NFT_SOCKET=m
CONFIG_NFT_OSF=m
CONFIG_NFT_TPROXY=m
CONFIG_NFT_SYNPROXY=m
CONFIG_NF_DUP_NETDEV=m
CONFIG_NFT_DUP_NETDEV=m
CONFIG_NFT_FWD_NETDEV=m
CONFIG_NFT_REJECT_NETDEV=m
# CONFIG_NF_FLOW_TABLE is not set
CONFIG_NETFILTER_XTABLES=m
# CONFIG_NETFILTER_XTABLES_COMPAT is not set
#
# Xtables combined modules
#
CONFIG_NETFILTER_XT_MARK=m
CONFIG_NETFILTER_XT_CONNMARK=m
CONFIG_NETFILTER_XT_SET=m
#
# Xtables targets
#
CONFIG_NETFILTER_XT_TARGET_CHECKSUM=m
CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m
CONFIG_NETFILTER_XT_TARGET_CONNMARK=m
CONFIG_NETFILTER_XT_TARGET_CT=m
CONFIG_NETFILTER_XT_TARGET_DSCP=m
CONFIG_NETFILTER_XT_TARGET_HL=m
CONFIG_NETFILTER_XT_TARGET_HMARK=m
CONFIG_NETFILTER_XT_TARGET_IDLETIMER=m
# CONFIG_NETFILTER_XT_TARGET_LED is not set
CONFIG_NETFILTER_XT_TARGET_LOG=m
CONFIG_NETFILTER_XT_TARGET_MARK=m
CONFIG_NETFILTER_XT_NAT=m
CONFIG_NETFILTER_XT_TARGET_NETMAP=m
CONFIG_NETFILTER_XT_TARGET_NFLOG=m
CONFIG_NETFILTER_XT_TARGET_NFQUEUE=m
# CONFIG_NETFILTER_XT_TARGET_NOTRACK is not set
CONFIG_NETFILTER_XT_TARGET_RATEEST=m
CONFIG_NETFILTER_XT_TARGET_REDIRECT=m
CONFIG_NETFILTER_XT_TARGET_MASQUERADE=m
CONFIG_NETFILTER_XT_TARGET_TEE=m
CONFIG_NETFILTER_XT_TARGET_TPROXY=m
CONFIG_NETFILTER_XT_TARGET_TRACE=m
CONFIG_NETFILTER_XT_TARGET_TCPMSS=m
CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=m
#
# Xtables matches
#
CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=m
# CONFIG_NETFILTER_XT_MATCH_BPF is not set
# CONFIG_NETFILTER_XT_MATCH_CGROUP is not set
CONFIG_NETFILTER_XT_MATCH_CLUSTER=m
CONFIG_NETFILTER_XT_MATCH_COMMENT=m
CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m
CONFIG_NETFILTER_XT_MATCH_CONNLABEL=m
CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=m
CONFIG_NETFILTER_XT_MATCH_CONNMARK=m
CONFIG_NETFILTER_XT_MATCH_CONNTRACK=m
CONFIG_NETFILTER_XT_MATCH_CPU=m
CONFIG_NETFILTER_XT_MATCH_DCCP=m
CONFIG_NETFILTER_XT_MATCH_DEVGROUP=m
CONFIG_NETFILTER_XT_MATCH_DSCP=m
CONFIG_NETFILTER_XT_MATCH_ECN=m
CONFIG_NETFILTER_XT_MATCH_ESP=m
CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m
CONFIG_NETFILTER_XT_MATCH_HELPER=m
CONFIG_NETFILTER_XT_MATCH_HL=m
CONFIG_NETFILTER_XT_MATCH_IPCOMP=m
CONFIG_NETFILTER_XT_MATCH_IPRANGE=m
CONFIG_NETFILTER_XT_MATCH_IPVS=m
CONFIG_NETFILTER_XT_MATCH_L2TP=m
CONFIG_NETFILTER_XT_MATCH_LENGTH=m
CONFIG_NETFILTER_XT_MATCH_LIMIT=m
CONFIG_NETFILTER_XT_MATCH_MAC=m
CONFIG_NETFILTER_XT_MATCH_MARK=m
CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m
CONFIG_NETFILTER_XT_MATCH_NFACCT=m
CONFIG_NETFILTER_XT_MATCH_OSF=m
CONFIG_NETFILTER_XT_MATCH_OWNER=m
CONFIG_NETFILTER_XT_MATCH_POLICY=m
CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m
CONFIG_NETFILTER_XT_MATCH_QUOTA=m
CONFIG_NETFILTER_XT_MATCH_RATEEST=m
CONFIG_NETFILTER_XT_MATCH_REALM=m
CONFIG_NETFILTER_XT_MATCH_RECENT=m
CONFIG_NETFILTER_XT_MATCH_SCTP=m
CONFIG_NETFILTER_XT_MATCH_SOCKET=m
CONFIG_NETFILTER_XT_MATCH_STATE=m
CONFIG_NETFILTER_XT_MATCH_STATISTIC=m
CONFIG_NETFILTER_XT_MATCH_STRING=m
CONFIG_NETFILTER_XT_MATCH_TCPMSS=m
CONFIG_NETFILTER_XT_MATCH_TIME=m
CONFIG_NETFILTER_XT_MATCH_U32=m
# end of Core Netfilter Configuration
CONFIG_IP_SET=m
CONFIG_IP_SET_MAX=256
CONFIG_IP_SET_BITMAP_IP=m
CONFIG_IP_SET_BITMAP_IPMAC=m
CONFIG_IP_SET_BITMAP_PORT=m
CONFIG_IP_SET_HASH_IP=m
CONFIG_IP_SET_HASH_IPMARK=m
CONFIG_IP_SET_HASH_IPPORT=m
CONFIG_IP_SET_HASH_IPPORTIP=m
CONFIG_IP_SET_HASH_IPPORTNET=m
CONFIG_IP_SET_HASH_IPMAC=m
CONFIG_IP_SET_HASH_MAC=m
CONFIG_IP_SET_HASH_NETPORTNET=m
CONFIG_IP_SET_HASH_NET=m
CONFIG_IP_SET_HASH_NETNET=m
CONFIG_IP_SET_HASH_NETPORT=m
CONFIG_IP_SET_HASH_NETIFACE=m
CONFIG_IP_SET_LIST_SET=m
CONFIG_IP_VS=m
CONFIG_IP_VS_IPV6=y
# CONFIG_IP_VS_DEBUG is not set
CONFIG_IP_VS_TAB_BITS=12
#
# IPVS transport protocol load balancing support
#
CONFIG_IP_VS_PROTO_TCP=y
CONFIG_IP_VS_PROTO_UDP=y
CONFIG_IP_VS_PROTO_AH_ESP=y
CONFIG_IP_VS_PROTO_ESP=y
CONFIG_IP_VS_PROTO_AH=y
CONFIG_IP_VS_PROTO_SCTP=y
#
# IPVS scheduler
#
CONFIG_IP_VS_RR=m
CONFIG_IP_VS_WRR=m
CONFIG_IP_VS_LC=m
CONFIG_IP_VS_WLC=m
CONFIG_IP_VS_FO=m
CONFIG_IP_VS_OVF=m
CONFIG_IP_VS_LBLC=m
CONFIG_IP_VS_LBLCR=m
CONFIG_IP_VS_DH=m
CONFIG_IP_VS_SH=m
# CONFIG_IP_VS_MH is not set
CONFIG_IP_VS_SED=m
CONFIG_IP_VS_NQ=m
# CONFIG_IP_VS_TWOS is not set
#
# IPVS SH scheduler
#
CONFIG_IP_VS_SH_TAB_BITS=8
#
# IPVS MH scheduler
#
CONFIG_IP_VS_MH_TAB_INDEX=12
#
# IPVS application helper
#
CONFIG_IP_VS_FTP=m
CONFIG_IP_VS_NFCT=y
CONFIG_IP_VS_PE_SIP=m
#
# IP: Netfilter Configuration
#
CONFIG_NF_DEFRAG_IPV4=m
CONFIG_IP_NF_IPTABLES_LEGACY=m
CONFIG_NF_SOCKET_IPV4=m
CONFIG_NF_TPROXY_IPV4=m
CONFIG_NF_TABLES_IPV4=y
CONFIG_NFT_REJECT_IPV4=m
# CONFIG_NFT_DUP_IPV4 is not set
# CONFIG_NFT_FIB_IPV4 is not set
# CONFIG_NF_TABLES_ARP is not set
CONFIG_NF_DUP_IPV4=m
CONFIG_NF_LOG_ARP=m
CONFIG_NF_LOG_IPV4=m
CONFIG_NF_REJECT_IPV4=m
CONFIG_NF_NAT_SNMP_BASIC=m
CONFIG_NF_NAT_PPTP=m
CONFIG_NF_NAT_H323=m
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_AH=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_RPFILTER=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_SYNPROXY=m
CONFIG_IP_NF_NAT=m
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_NETMAP=m
CONFIG_IP_NF_TARGET_REDIRECT=m
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_TTL=m
CONFIG_IP_NF_RAW=m
CONFIG_IP_NF_ARPTABLES=m
CONFIG_IP_NF_ARPFILTER=m
CONFIG_IP_NF_ARP_MANGLE=m
# end of IP: Netfilter Configuration
#
# IPv6: Netfilter Configuration
#
CONFIG_IP6_NF_IPTABLES_LEGACY=m
CONFIG_NF_SOCKET_IPV6=m
CONFIG_NF_TPROXY_IPV6=m
CONFIG_NF_TABLES_IPV6=y
CONFIG_NFT_REJECT_IPV6=m
# CONFIG_NFT_DUP_IPV6 is not set
# CONFIG_NFT_FIB_IPV6 is not set
CONFIG_NF_DUP_IPV6=m
CONFIG_NF_REJECT_IPV6=m
CONFIG_NF_LOG_IPV6=m
CONFIG_IP6_NF_IPTABLES=m
CONFIG_IP6_NF_MATCH_AH=m
CONFIG_IP6_NF_MATCH_EUI64=m
CONFIG_IP6_NF_MATCH_FRAG=m
CONFIG_IP6_NF_MATCH_OPTS=m
CONFIG_IP6_NF_MATCH_HL=m
CONFIG_IP6_NF_MATCH_IPV6HEADER=m
CONFIG_IP6_NF_MATCH_MH=m
CONFIG_IP6_NF_MATCH_RPFILTER=m
CONFIG_IP6_NF_MATCH_RT=m
# CONFIG_IP6_NF_MATCH_SRH is not set
CONFIG_IP6_NF_TARGET_HL=m
CONFIG_IP6_NF_FILTER=m
CONFIG_IP6_NF_TARGET_REJECT=m
CONFIG_IP6_NF_TARGET_SYNPROXY=m
CONFIG_IP6_NF_MANGLE=m
CONFIG_IP6_NF_RAW=m
CONFIG_IP6_NF_NAT=m
CONFIG_IP6_NF_TARGET_MASQUERADE=m
CONFIG_IP6_NF_TARGET_NPT=m
# end of IPv6: Netfilter Configuration
CONFIG_NF_DEFRAG_IPV6=m
# CONFIG_NF_CONNTRACK_BRIDGE is not set
# CONFIG_IP_DCCP is not set
CONFIG_IP_SCTP=m
# CONFIG_SCTP_DBG_OBJCNT is not set
CONFIG_SCTP_DEFAULT_COOKIE_HMAC_MD5=y
# CONFIG_SCTP_DEFAULT_COOKIE_HMAC_SHA1 is not set
# CONFIG_SCTP_DEFAULT_COOKIE_HMAC_NONE is not set
CONFIG_SCTP_COOKIE_HMAC_MD5=y
CONFIG_SCTP_COOKIE_HMAC_SHA1=y
CONFIG_INET_SCTP_DIAG=m
# CONFIG_RDS is not set
# CONFIG_TIPC is not set
# CONFIG_ATM is not set
# CONFIG_L2TP is not set
# CONFIG_BRIDGE is not set
# CONFIG_NET_DSA is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_LLC2 is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_PHONET is not set
# CONFIG_6LOWPAN is not set
# CONFIG_IEEE802154 is not set
# CONFIG_NET_SCHED is not set
# CONFIG_DCB is not set
CONFIG_DNS_RESOLVER=m
# CONFIG_BATMAN_ADV is not set
# CONFIG_OPENVSWITCH is not set
# CONFIG_VSOCKETS is not set
CONFIG_NETLINK_DIAG=m
# CONFIG_MPLS is not set
# CONFIG_NET_NSH is not set
# CONFIG_HSR is not set
# CONFIG_NET_SWITCHDEV is not set
# CONFIG_NET_L3_MASTER_DEV is not set
# CONFIG_QRTR is not set
# CONFIG_NET_NCSI is not set
CONFIG_PCPU_DEV_REFCNT=y
CONFIG_MAX_SKB_FRAGS=17
CONFIG_RPS=y
CONFIG_RFS_ACCEL=y
CONFIG_SOCK_RX_QUEUE_MAPPING=y
CONFIG_XPS=y
# CONFIG_CGROUP_NET_PRIO is not set
# CONFIG_CGROUP_NET_CLASSID is not set
CONFIG_NET_RX_BUSY_POLL=y
CONFIG_BQL=y
CONFIG_NET_FLOW_LIMIT=y
#
# Network testing
#
CONFIG_NET_PKTGEN=m
CONFIG_NET_DROP_MONITOR=m
# end of Network testing
# end of Networking options
# CONFIG_HAMRADIO is not set
# CONFIG_CAN is not set
# CONFIG_BT is not set
# CONFIG_AF_RXRPC is not set
# CONFIG_AF_KCM is not set
# CONFIG_MCTP is not set
CONFIG_FIB_RULES=y
# CONFIG_WIRELESS is not set
# CONFIG_RFKILL is not set
CONFIG_NET_9P=y
CONFIG_NET_9P_FD=y
CONFIG_NET_9P_VIRTIO=y
# CONFIG_NET_9P_DEBUG is not set
# CONFIG_CAIF is not set
# CONFIG_CEPH_LIB is not set
# CONFIG_NFC is not set
# CONFIG_PSAMPLE is not set
# CONFIG_NET_IFE is not set
# CONFIG_LWTUNNEL is not set
CONFIG_DST_CACHE=y
CONFIG_GRO_CELLS=y
CONFIG_NET_SELFTESTS=y
CONFIG_PAGE_POOL=y
CONFIG_PAGE_POOL_STATS=y
CONFIG_FAILOVER=m
CONFIG_ETHTOOL_NETLINK=y
#
# Device Drivers
#
CONFIG_HAVE_EISA=y
# CONFIG_EISA is not set
CONFIG_HAVE_PCI=y
CONFIG_GENERIC_PCI_IOMAP=y
CONFIG_PCI=y
CONFIG_PCI_DOMAINS=y
CONFIG_PCIEPORTBUS=y
CONFIG_HOTPLUG_PCI_PCIE=y
CONFIG_PCIEAER=y
# CONFIG_PCIEAER_INJECT is not set
# CONFIG_PCIE_ECRC is not set
CONFIG_PCIEASPM=y
CONFIG_PCIEASPM_DEFAULT=y
# CONFIG_PCIEASPM_POWERSAVE is not set
# CONFIG_PCIEASPM_POWER_SUPERSAVE is not set
# CONFIG_PCIEASPM_PERFORMANCE is not set
CONFIG_PCIE_PME=y
CONFIG_PCIE_DPC=y
CONFIG_PCIE_PTM=y
# CONFIG_PCIE_EDR is not set
CONFIG_PCI_MSI=y
CONFIG_PCI_QUIRKS=y
# CONFIG_PCI_DEBUG is not set
CONFIG_PCI_REALLOC_ENABLE_AUTO=y
CONFIG_PCI_STUB=y
# CONFIG_PCI_PF_STUB is not set
CONFIG_PCI_ATS=y
CONFIG_PCI_LOCKLESS_CONFIG=y
CONFIG_PCI_IOV=y
CONFIG_PCI_PRI=y
CONFIG_PCI_PASID=y
CONFIG_PCI_LABEL=y
CONFIG_VGA_ARB=y
CONFIG_VGA_ARB_MAX_GPUS=1
CONFIG_HOTPLUG_PCI=y
CONFIG_HOTPLUG_PCI_ACPI=y
CONFIG_HOTPLUG_PCI_ACPI_IBM=y
# CONFIG_HOTPLUG_PCI_CPCI is not set
# CONFIG_HOTPLUG_PCI_SHPC is not set
#
# PCI controller drivers
#
# CONFIG_VMD is not set
#
# Cadence-based PCIe controllers
#
# end of Cadence-based PCIe controllers
#
# DesignWare-based PCIe controllers
#
# CONFIG_PCI_MESON is not set
# CONFIG_PCIE_DW_PLAT_HOST is not set
# end of DesignWare-based PCIe controllers
#
# Mobiveil-based PCIe controllers
#
# end of Mobiveil-based PCIe controllers
# end of PCI controller drivers
#
# PCI Endpoint
#
# CONFIG_PCI_ENDPOINT is not set
# end of PCI Endpoint
#
# PCI switch controller drivers
#
# CONFIG_PCI_SW_SWITCHTEC is not set
# end of PCI switch controller drivers
# CONFIG_CXL_BUS is not set
# CONFIG_PCCARD is not set
# CONFIG_RAPIDIO is not set
#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER=y
CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
CONFIG_DEVTMPFS=y
# CONFIG_DEVTMPFS_MOUNT is not set
# CONFIG_DEVTMPFS_SAFE is not set
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
#
# Firmware loader
#
CONFIG_FW_LOADER=y
CONFIG_FW_LOADER_DEBUG=y
CONFIG_EXTRA_FIRMWARE=""
# CONFIG_FW_LOADER_USER_HELPER is not set
# CONFIG_FW_LOADER_COMPRESS is not set
CONFIG_FW_CACHE=y
# CONFIG_FW_UPLOAD is not set
# end of Firmware loader
CONFIG_ALLOW_DEV_COREDUMP=y
# CONFIG_DEBUG_DRIVER is not set
# CONFIG_DEBUG_DEVRES is not set
# CONFIG_DEBUG_TEST_DRIVER_REMOVE is not set
# CONFIG_TEST_ASYNC_DRIVER_PROBE is not set
CONFIG_GENERIC_CPU_DEVICES=y
CONFIG_GENERIC_CPU_AUTOPROBE=y
CONFIG_GENERIC_CPU_VULNERABILITIES=y
CONFIG_REGMAP=y
CONFIG_REGMAP_I2C=m
CONFIG_DMA_SHARED_BUFFER=y
# CONFIG_DMA_FENCE_TRACE is not set
# CONFIG_FW_DEVLINK_SYNC_STATE_TIMEOUT is not set
# end of Generic Driver Options
#
# Bus devices
#
# CONFIG_MHI_BUS is not set
# CONFIG_MHI_BUS_EP is not set
# end of Bus devices
#
# Cache Drivers
#
# end of Cache Drivers
# CONFIG_CONNECTOR is not set
#
# Firmware Drivers
#
#
# ARM System Control and Management Interface Protocol
#
# end of ARM System Control and Management Interface Protocol
# CONFIG_EDD is not set
CONFIG_FIRMWARE_MEMMAP=y
CONFIG_DMIID=y
CONFIG_DMI_SYSFS=y
CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK=y
# CONFIG_FW_CFG_SYSFS is not set
CONFIG_SYSFB=y
CONFIG_SYSFB_SIMPLEFB=y
# CONFIG_GOOGLE_FIRMWARE is not set
#
# EFI (Extensible Firmware Interface) Support
#
CONFIG_EFI_ESRT=y
CONFIG_EFI_VARS_PSTORE=m
# CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE is not set
CONFIG_EFI_DXE_MEM_ATTRIBUTES=y
CONFIG_EFI_RUNTIME_WRAPPERS=y
# CONFIG_EFI_BOOTLOADER_CONTROL is not set
# CONFIG_EFI_CAPSULE_LOADER is not set
# CONFIG_EFI_TEST is not set
# CONFIG_APPLE_PROPERTIES is not set
# CONFIG_RESET_ATTACK_MITIGATION is not set
# CONFIG_EFI_RCI2_TABLE is not set
# CONFIG_EFI_DISABLE_PCI_DMA is not set
CONFIG_EFI_EARLYCON=y
# CONFIG_EFI_CUSTOM_SSDT_OVERLAYS is not set
# CONFIG_EFI_DISABLE_RUNTIME is not set
# CONFIG_EFI_COCO_SECRET is not set
CONFIG_UNACCEPTED_MEMORY=y
# end of EFI (Extensible Firmware Interface) Support
CONFIG_UEFI_CPER=y
CONFIG_UEFI_CPER_X86=y
#
# Qualcomm firmware drivers
#
# end of Qualcomm firmware drivers
#
# Tegra firmware driver
#
# end of Tegra firmware driver
# end of Firmware Drivers
# CONFIG_GNSS is not set
# CONFIG_MTD is not set
# CONFIG_OF is not set
CONFIG_ARCH_MIGHT_HAVE_PC_PARPORT=y
# CONFIG_PARPORT is not set
CONFIG_PNP=y
# CONFIG_PNP_DEBUG_MESSAGES is not set
#
# Protocols
#
CONFIG_PNPACPI=y
CONFIG_BLK_DEV=y
# CONFIG_BLK_DEV_NULL_BLK is not set
# CONFIG_BLK_DEV_FD is not set
CONFIG_CDROM=y
# CONFIG_BLK_DEV_PCIESSD_MTIP32XX is not set
# CONFIG_ZRAM is not set
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_LOOP_MIN_COUNT=8
# CONFIG_BLK_DEV_DRBD is not set
CONFIG_BLK_DEV_NBD=m
# CONFIG_BLK_DEV_RAM is not set
# CONFIG_CDROM_PKTCDVD is not set
# CONFIG_ATA_OVER_ETH is not set
CONFIG_VIRTIO_BLK=m
# CONFIG_BLK_DEV_RBD is not set
# CONFIG_BLK_DEV_UBLK is not set
#
# NVME Support
#
CONFIG_NVME_CORE=y
CONFIG_BLK_DEV_NVME=y
CONFIG_NVME_MULTIPATH=y
# CONFIG_NVME_VERBOSE_ERRORS is not set
CONFIG_NVME_HWMON=y
# CONFIG_NVME_FC is not set
# CONFIG_NVME_TCP is not set
# CONFIG_NVME_HOST_AUTH is not set
# CONFIG_NVME_TARGET is not set
# end of NVME Support
#
# Misc devices
#
# CONFIG_AD525X_DPOT is not set
# CONFIG_DUMMY_IRQ is not set
# CONFIG_IBM_ASM is not set
# CONFIG_PHANTOM is not set
# CONFIG_TIFM_CORE is not set
# CONFIG_ICS932S401 is not set
# CONFIG_ENCLOSURE_SERVICES is not set
# CONFIG_HP_ILO is not set
# CONFIG_APDS9802ALS is not set
# CONFIG_ISL29003 is not set
# CONFIG_ISL29020 is not set
# CONFIG_SENSORS_TSL2550 is not set
# CONFIG_SENSORS_BH1770 is not set
# CONFIG_SENSORS_APDS990X is not set
# CONFIG_HMC6352 is not set
# CONFIG_DS1682 is not set
# CONFIG_SRAM is not set
# CONFIG_DW_XDATA_PCIE is not set
# CONFIG_PCI_ENDPOINT_TEST is not set
# CONFIG_XILINX_SDFEC is not set
# CONFIG_NSM is not set
# CONFIG_C2PORT is not set
#
# EEPROM support
#
CONFIG_EEPROM_AT24=m
CONFIG_EEPROM_MAX6875=m
CONFIG_EEPROM_93CX6=m
# CONFIG_EEPROM_IDT_89HPESX is not set
CONFIG_EEPROM_EE1004=m
# end of EEPROM support
# CONFIG_CB710_CORE is not set
#
# Texas Instruments shared transport line discipline
#
# CONFIG_TI_ST is not set
# end of Texas Instruments shared transport line discipline
# CONFIG_SENSORS_LIS3_I2C is not set
# CONFIG_ALTERA_STAPL is not set
# CONFIG_INTEL_MEI is not set
# CONFIG_VMWARE_VMCI is not set
# CONFIG_GENWQE is not set
# CONFIG_ECHO is not set
# CONFIG_BCM_VK is not set
# CONFIG_MISC_ALCOR_PCI is not set
# CONFIG_MISC_RTSX_PCI is not set
# CONFIG_MISC_RTSX_USB is not set
# CONFIG_UACCE is not set
# CONFIG_PVPANIC is not set
# CONFIG_GP_PCI1XXXX is not set
# end of Misc devices
#
# SCSI device support
#
CONFIG_SCSI_MOD=y
# CONFIG_RAID_ATTRS is not set
CONFIG_SCSI_COMMON=y
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
# CONFIG_SCSI_PROC_FS is not set
#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
CONFIG_CHR_DEV_ST=y
CONFIG_BLK_DEV_SR=y
CONFIG_CHR_DEV_SG=y
CONFIG_BLK_DEV_BSG=y
# CONFIG_CHR_DEV_SCH is not set
# CONFIG_SCSI_CONSTANTS is not set
# CONFIG_SCSI_LOGGING is not set
# CONFIG_SCSI_SCAN_ASYNC is not set
#
# SCSI Transports
#
# CONFIG_SCSI_SPI_ATTRS is not set
# CONFIG_SCSI_FC_ATTRS is not set
# CONFIG_SCSI_ISCSI_ATTRS is not set
# CONFIG_SCSI_SAS_ATTRS is not set
# CONFIG_SCSI_SAS_LIBSAS is not set
# CONFIG_SCSI_SRP_ATTRS is not set
# end of SCSI Transports
# CONFIG_SCSI_LOWLEVEL is not set
# CONFIG_SCSI_DH is not set
# end of SCSI device support
CONFIG_ATA=y
CONFIG_SATA_HOST=y
CONFIG_PATA_TIMINGS=y
CONFIG_ATA_VERBOSE_ERROR=y
CONFIG_ATA_FORCE=y
CONFIG_ATA_ACPI=y
CONFIG_SATA_ZPODD=y
CONFIG_SATA_PMP=y
#
# Controllers with non-SFF native interface
#
CONFIG_SATA_AHCI=y
CONFIG_SATA_MOBILE_LPM_POLICY=0
# CONFIG_SATA_AHCI_PLATFORM is not set
# CONFIG_AHCI_DWC is not set
# CONFIG_SATA_INIC162X is not set
# CONFIG_SATA_ACARD_AHCI is not set
# CONFIG_SATA_SIL24 is not set
CONFIG_ATA_SFF=y
#
# SFF controllers with custom DMA interface
#
# CONFIG_PDC_ADMA is not set
# CONFIG_SATA_QSTOR is not set
# CONFIG_SATA_SX4 is not set
CONFIG_ATA_BMDMA=y
#
# SATA SFF controllers with BMDMA
#
CONFIG_ATA_PIIX=y
# CONFIG_SATA_MV is not set
# CONFIG_SATA_NV is not set
# CONFIG_SATA_PROMISE is not set
# CONFIG_SATA_SIL is not set
# CONFIG_SATA_SIS is not set
# CONFIG_SATA_SVW is not set
# CONFIG_SATA_ULI is not set
# CONFIG_SATA_VIA is not set
# CONFIG_SATA_VITESSE is not set
#
# PATA SFF controllers with BMDMA
#
# CONFIG_PATA_ALI is not set
# CONFIG_PATA_AMD is not set
# CONFIG_PATA_ARTOP is not set
# CONFIG_PATA_ATIIXP is not set
# CONFIG_PATA_ATP867X is not set
# CONFIG_PATA_CMD64X is not set
# CONFIG_PATA_CYPRESS is not set
# CONFIG_PATA_EFAR is not set
# CONFIG_PATA_HPT366 is not set
# CONFIG_PATA_HPT37X is not set
# CONFIG_PATA_HPT3X2N is not set
# CONFIG_PATA_HPT3X3 is not set
# CONFIG_PATA_IT8213 is not set
# CONFIG_PATA_IT821X is not set
# CONFIG_PATA_JMICRON is not set
# CONFIG_PATA_MARVELL is not set
# CONFIG_PATA_NETCELL is not set
# CONFIG_PATA_NINJA32 is not set
# CONFIG_PATA_NS87415 is not set
# CONFIG_PATA_OLDPIIX is not set
# CONFIG_PATA_OPTIDMA is not set
# CONFIG_PATA_PDC2027X is not set
# CONFIG_PATA_PDC_OLD is not set
# CONFIG_PATA_RADISYS is not set
# CONFIG_PATA_RDC is not set
# CONFIG_PATA_SCH is not set
# CONFIG_PATA_SERVERWORKS is not set
# CONFIG_PATA_SIL680 is not set
# CONFIG_PATA_SIS is not set
# CONFIG_PATA_TOSHIBA is not set
# CONFIG_PATA_TRIFLEX is not set
# CONFIG_PATA_VIA is not set
# CONFIG_PATA_WINBOND is not set
#
# PIO-only SFF controllers
#
# CONFIG_PATA_CMD640_PCI is not set
# CONFIG_PATA_MPIIX is not set
# CONFIG_PATA_NS87410 is not set
# CONFIG_PATA_OPTI is not set
# CONFIG_PATA_RZ1000 is not set
#
# Generic fallback / legacy drivers
#
# CONFIG_PATA_ACPI is not set
# CONFIG_ATA_GENERIC is not set
# CONFIG_PATA_LEGACY is not set
CONFIG_MD=y
# CONFIG_BLK_DEV_MD is not set
CONFIG_MD_BITMAP_FILE=y
# CONFIG_BCACHE is not set
CONFIG_BLK_DEV_DM_BUILTIN=y
CONFIG_BLK_DEV_DM=m
# CONFIG_DM_DEBUG is not set
CONFIG_DM_BUFIO=m
# CONFIG_DM_DEBUG_BLOCK_MANAGER_LOCKING is not set
# CONFIG_DM_UNSTRIPED is not set
CONFIG_DM_CRYPT=m
CONFIG_DM_SNAPSHOT=m
# CONFIG_DM_THIN_PROVISIONING is not set
# CONFIG_DM_CACHE is not set
# CONFIG_DM_WRITECACHE is not set
# CONFIG_DM_EBS is not set
# CONFIG_DM_ERA is not set
# CONFIG_DM_CLONE is not set
# CONFIG_DM_MIRROR is not set
# CONFIG_DM_RAID is not set
# CONFIG_DM_ZERO is not set
# CONFIG_DM_MULTIPATH is not set
# CONFIG_DM_DELAY is not set
# CONFIG_DM_DUST is not set
# CONFIG_DM_UEVENT is not set
# CONFIG_DM_FLAKEY is not set
# CONFIG_DM_VERITY is not set
# CONFIG_DM_SWITCH is not set
# CONFIG_DM_LOG_WRITES is not set
# CONFIG_DM_INTEGRITY is not set
# CONFIG_DM_VDO is not set
# CONFIG_TARGET_CORE is not set
# CONFIG_FUSION is not set
#
# IEEE 1394 (FireWire) support
#
CONFIG_FIREWIRE=m
CONFIG_FIREWIRE_OHCI=m
CONFIG_FIREWIRE_SBP2=m
CONFIG_FIREWIRE_NET=m
CONFIG_FIREWIRE_NOSY=m
# end of IEEE 1394 (FireWire) support
# CONFIG_MACINTOSH_DRIVERS is not set
CONFIG_NETDEVICES=y
CONFIG_MII=y
CONFIG_NET_CORE=y
# CONFIG_BONDING is not set
# CONFIG_DUMMY is not set
# CONFIG_WIREGUARD is not set
# CONFIG_EQUALIZER is not set
# CONFIG_NET_FC is not set
# CONFIG_IFB is not set
# CONFIG_NET_TEAM is not set
# CONFIG_MACVLAN is not set
# CONFIG_IPVLAN is not set
# CONFIG_VXLAN is not set
# CONFIG_GENEVE is not set
# CONFIG_BAREUDP is not set
# CONFIG_GTP is not set
# CONFIG_PFCP is not set
# CONFIG_AMT is not set
# CONFIG_MACSEC is not set
CONFIG_NETCONSOLE=y
# CONFIG_NETCONSOLE_EXTENDED_LOG is not set
CONFIG_NETPOLL=y
CONFIG_NET_POLL_CONTROLLER=y
CONFIG_TUN=y
# CONFIG_TUN_VNET_CROSS_LE is not set
CONFIG_VETH=m
CONFIG_VIRTIO_NET=m
CONFIG_NLMON=m
# CONFIG_ARCNET is not set
CONFIG_ETHERNET=y
# CONFIG_NET_VENDOR_3COM is not set
# CONFIG_NET_VENDOR_ADAPTEC is not set
# CONFIG_NET_VENDOR_AGERE is not set
# CONFIG_NET_VENDOR_ALACRITECH is not set
# CONFIG_NET_VENDOR_ALTEON is not set
# CONFIG_ALTERA_TSE is not set
# CONFIG_NET_VENDOR_AMAZON is not set
# CONFIG_NET_VENDOR_AMD is not set
# CONFIG_NET_VENDOR_AQUANTIA is not set
# CONFIG_NET_VENDOR_ARC is not set
# CONFIG_NET_VENDOR_ASIX is not set
# CONFIG_NET_VENDOR_ATHEROS is not set
# CONFIG_CX_ECAT is not set
# CONFIG_NET_VENDOR_BROADCOM is not set
# CONFIG_NET_VENDOR_CADENCE is not set
# CONFIG_NET_VENDOR_CAVIUM is not set
# CONFIG_NET_VENDOR_CHELSIO is not set
# CONFIG_NET_VENDOR_CISCO is not set
# CONFIG_NET_VENDOR_CORTINA is not set
# CONFIG_NET_VENDOR_DAVICOM is not set
# CONFIG_DNET is not set
# CONFIG_NET_VENDOR_DEC is not set
# CONFIG_NET_VENDOR_DLINK is not set
# CONFIG_NET_VENDOR_EMULEX is not set
CONFIG_NET_VENDOR_ENGLEDER=y
# CONFIG_TSNEP is not set
# CONFIG_NET_VENDOR_EZCHIP is not set
# CONFIG_NET_VENDOR_FUNGIBLE is not set
# CONFIG_NET_VENDOR_GOOGLE is not set
# CONFIG_NET_VENDOR_HUAWEI is not set
# CONFIG_NET_VENDOR_INTEL is not set
# CONFIG_JME is not set
# CONFIG_NET_VENDOR_LITEX is not set
# CONFIG_NET_VENDOR_MARVELL is not set
# CONFIG_NET_VENDOR_MELLANOX is not set
# CONFIG_NET_VENDOR_MICREL is not set
# CONFIG_NET_VENDOR_MICROCHIP is not set
# CONFIG_NET_VENDOR_MICROSEMI is not set
# CONFIG_NET_VENDOR_MICROSOFT is not set
# CONFIG_NET_VENDOR_MYRI is not set
# CONFIG_FEALNX is not set
# CONFIG_NET_VENDOR_NI is not set
# CONFIG_NET_VENDOR_NATSEMI is not set
# CONFIG_NET_VENDOR_NETERION is not set
# CONFIG_NET_VENDOR_NETRONOME is not set
# CONFIG_NET_VENDOR_NVIDIA is not set
# CONFIG_NET_VENDOR_OKI is not set
# CONFIG_ETHOC is not set
# CONFIG_NET_VENDOR_PACKET_ENGINES is not set
# CONFIG_NET_VENDOR_PENSANDO is not set
# CONFIG_NET_VENDOR_QLOGIC is not set
# CONFIG_NET_VENDOR_BROCADE is not set
# CONFIG_NET_VENDOR_QUALCOMM is not set
# CONFIG_NET_VENDOR_RDC is not set
CONFIG_NET_VENDOR_REALTEK=y
CONFIG_8139CP=y
CONFIG_8139TOO=y
# CONFIG_8139TOO_PIO is not set
CONFIG_8139TOO_TUNE_TWISTER=y
CONFIG_8139TOO_8129=y
# CONFIG_8139_OLD_RX_RESET is not set
CONFIG_R8169=y
# CONFIG_NET_VENDOR_RENESAS is not set
# CONFIG_NET_VENDOR_ROCKER is not set
# CONFIG_NET_VENDOR_SAMSUNG is not set
# CONFIG_NET_VENDOR_SEEQ is not set
# CONFIG_NET_VENDOR_SILAN is not set
# CONFIG_NET_VENDOR_SIS is not set
# CONFIG_NET_VENDOR_SOLARFLARE is not set
# CONFIG_NET_VENDOR_SMSC is not set
# CONFIG_NET_VENDOR_SOCIONEXT is not set
# CONFIG_NET_VENDOR_STMICRO is not set
# CONFIG_NET_VENDOR_SUN is not set
# CONFIG_NET_VENDOR_SYNOPSYS is not set
# CONFIG_NET_VENDOR_TEHUTI is not set
# CONFIG_NET_VENDOR_TI is not set
CONFIG_NET_VENDOR_VERTEXCOM=y
# CONFIG_NET_VENDOR_VIA is not set
CONFIG_NET_VENDOR_WANGXUN=y
# CONFIG_NGBE is not set
# CONFIG_TXGBE is not set
# CONFIG_NET_VENDOR_WIZNET is not set
# CONFIG_NET_VENDOR_XILINX is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
CONFIG_PHYLIB=y
CONFIG_SWPHY=y
# CONFIG_LED_TRIGGER_PHY is not set
CONFIG_FIXED_PHY=y
#
# MII PHY device drivers
#
# CONFIG_AIR_EN8811H_PHY is not set
# CONFIG_AMD_PHY is not set
# CONFIG_ADIN_PHY is not set
# CONFIG_ADIN1100_PHY is not set
# CONFIG_AQUANTIA_PHY is not set
# CONFIG_AX88796B_PHY is not set
# CONFIG_BROADCOM_PHY is not set
# CONFIG_BCM54140_PHY is not set
# CONFIG_BCM7XXX_PHY is not set
# CONFIG_BCM84881_PHY is not set
# CONFIG_BCM87XX_PHY is not set
# CONFIG_CICADA_PHY is not set
# CONFIG_CORTINA_PHY is not set
# CONFIG_DAVICOM_PHY is not set
# CONFIG_ICPLUS_PHY is not set
# CONFIG_LXT_PHY is not set
# CONFIG_INTEL_XWAY_PHY is not set
# CONFIG_LSI_ET1011C_PHY is not set
# CONFIG_MARVELL_PHY is not set
# CONFIG_MARVELL_10G_PHY is not set
# CONFIG_MARVELL_88Q2XXX_PHY is not set
# CONFIG_MARVELL_88X2222_PHY is not set
# CONFIG_MAXLINEAR_GPHY is not set
# CONFIG_MEDIATEK_GE_PHY is not set
# CONFIG_MICREL_PHY is not set
# CONFIG_MICROCHIP_T1S_PHY is not set
# CONFIG_MICROCHIP_PHY is not set
# CONFIG_MICROCHIP_T1_PHY is not set
# CONFIG_MICROSEMI_PHY is not set
# CONFIG_MOTORCOMM_PHY is not set
# CONFIG_NATIONAL_PHY is not set
# CONFIG_NXP_CBTX_PHY is not set
# CONFIG_NXP_C45_TJA11XX_PHY is not set
# CONFIG_NXP_TJA11XX_PHY is not set
# CONFIG_NCN26000_PHY is not set
# CONFIG_QCA83XX_PHY is not set
# CONFIG_QCA808X_PHY is not set
# CONFIG_QSEMI_PHY is not set
CONFIG_REALTEK_PHY=y
# CONFIG_RENESAS_PHY is not set
# CONFIG_ROCKCHIP_PHY is not set
# CONFIG_SMSC_PHY is not set
# CONFIG_STE10XP is not set
# CONFIG_TERANETICS_PHY is not set
# CONFIG_DP83822_PHY is not set
# CONFIG_DP83TC811_PHY is not set
# CONFIG_DP83848_PHY is not set
# CONFIG_DP83867_PHY is not set
# CONFIG_DP83869_PHY is not set
# CONFIG_DP83TD510_PHY is not set
# CONFIG_DP83TG720_PHY is not set
# CONFIG_VITESSE_PHY is not set
# CONFIG_XILINX_GMII2RGMII is not set
CONFIG_MDIO_DEVICE=y
CONFIG_MDIO_BUS=y
CONFIG_FWNODE_MDIO=y
CONFIG_ACPI_MDIO=y
CONFIG_MDIO_DEVRES=y
# CONFIG_MDIO_BITBANG is not set
# CONFIG_MDIO_BCM_UNIMAC is not set
# CONFIG_MDIO_MVUSB is not set
# CONFIG_MDIO_THUNDER is not set
#
# MDIO Multiplexers
#
#
# PCS device drivers
#
# end of PCS device drivers
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
# CONFIG_USB_NET_DRIVERS is not set
# CONFIG_WLAN is not set
# CONFIG_WAN is not set
#
# Wireless WAN
#
# CONFIG_WWAN is not set
# end of Wireless WAN
# CONFIG_VMXNET3 is not set
# CONFIG_FUJITSU_ES is not set
# CONFIG_NETDEVSIM is not set
CONFIG_NET_FAILOVER=m
# CONFIG_ISDN is not set
#
# Input device support
#
CONFIG_INPUT=y
# CONFIG_INPUT_LEDS is not set
# CONFIG_INPUT_FF_MEMLESS is not set
# CONFIG_INPUT_SPARSEKMAP is not set
# CONFIG_INPUT_MATRIXKMAP is not set
CONFIG_INPUT_VIVALDIFMAP=y
#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
# CONFIG_INPUT_JOYDEV is not set
CONFIG_INPUT_EVDEV=y
# CONFIG_INPUT_EVBUG is not set
#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
# CONFIG_KEYBOARD_ADP5588 is not set
# CONFIG_KEYBOARD_ADP5589 is not set
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_QT1050 is not set
# CONFIG_KEYBOARD_QT1070 is not set
# CONFIG_KEYBOARD_QT2160 is not set
# CONFIG_KEYBOARD_DLINK_DIR685 is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_GPIO is not set
# CONFIG_KEYBOARD_GPIO_POLLED is not set
# CONFIG_KEYBOARD_TCA6416 is not set
# CONFIG_KEYBOARD_TCA8418 is not set
# CONFIG_KEYBOARD_MATRIX is not set
# CONFIG_KEYBOARD_LM8323 is not set
# CONFIG_KEYBOARD_LM8333 is not set
# CONFIG_KEYBOARD_MAX7359 is not set
# CONFIG_KEYBOARD_MCS is not set
# CONFIG_KEYBOARD_MPR121 is not set
# CONFIG_KEYBOARD_NEWTON is not set
# CONFIG_KEYBOARD_OPENCORES is not set
# CONFIG_KEYBOARD_SAMSUNG is not set
# CONFIG_KEYBOARD_STOWAWAY is not set
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_TM2_TOUCHKEY is not set
CONFIG_KEYBOARD_XTKBD=y
# CONFIG_KEYBOARD_CYPRESS_SF is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=m
CONFIG_MOUSE_PS2_ALPS=y
CONFIG_MOUSE_PS2_BYD=y
CONFIG_MOUSE_PS2_LOGIPS2PP=y
CONFIG_MOUSE_PS2_SYNAPTICS=y
CONFIG_MOUSE_PS2_SYNAPTICS_SMBUS=y
CONFIG_MOUSE_PS2_CYPRESS=y
CONFIG_MOUSE_PS2_LIFEBOOK=y
CONFIG_MOUSE_PS2_TRACKPOINT=y
CONFIG_MOUSE_PS2_ELANTECH=y
CONFIG_MOUSE_PS2_ELANTECH_SMBUS=y
CONFIG_MOUSE_PS2_SENTELIC=y
# CONFIG_MOUSE_PS2_TOUCHKIT is not set
CONFIG_MOUSE_PS2_FOCALTECH=y
CONFIG_MOUSE_PS2_SMBUS=y
CONFIG_MOUSE_SERIAL=m
# CONFIG_MOUSE_APPLETOUCH is not set
# CONFIG_MOUSE_BCM5974 is not set
# CONFIG_MOUSE_CYAPA is not set
# CONFIG_MOUSE_ELAN_I2C is not set
# CONFIG_MOUSE_VSXXXAA is not set
# CONFIG_MOUSE_GPIO is not set
# CONFIG_MOUSE_SYNAPTICS_I2C is not set
# CONFIG_MOUSE_SYNAPTICS_USB is not set
# CONFIG_INPUT_JOYSTICK is not set
CONFIG_INPUT_TABLET=y
# CONFIG_TABLET_USB_ACECAD is not set
# CONFIG_TABLET_USB_AIPTEK is not set
# CONFIG_TABLET_USB_HANWANG is not set
# CONFIG_TABLET_USB_KBTAB is not set
# CONFIG_TABLET_USB_PEGASUS is not set
# CONFIG_TABLET_SERIAL_WACOM4 is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
CONFIG_INPUT_MISC=y
# CONFIG_INPUT_AD714X is not set
# CONFIG_INPUT_BMA150 is not set
# CONFIG_INPUT_E3X0_BUTTON is not set
CONFIG_INPUT_PCSPKR=m
# CONFIG_INPUT_MMA8450 is not set
# CONFIG_INPUT_APANEL is not set
# CONFIG_INPUT_GPIO_BEEPER is not set
# CONFIG_INPUT_GPIO_DECODER is not set
# CONFIG_INPUT_GPIO_VIBRA is not set
# CONFIG_INPUT_ATLAS_BTNS is not set
# CONFIG_INPUT_ATI_REMOTE2 is not set
# CONFIG_INPUT_KEYSPAN_REMOTE is not set
# CONFIG_INPUT_KXTJ9 is not set
# CONFIG_INPUT_POWERMATE is not set
# CONFIG_INPUT_YEALINK is not set
# CONFIG_INPUT_CM109 is not set
# CONFIG_INPUT_UINPUT is not set
# CONFIG_INPUT_PCF8574 is not set
# CONFIG_INPUT_GPIO_ROTARY_ENCODER is not set
# CONFIG_INPUT_DA7280_HAPTICS is not set
# CONFIG_INPUT_ADXL34X is not set
# CONFIG_INPUT_IMS_PCU is not set
# CONFIG_INPUT_IQS269A is not set
# CONFIG_INPUT_IQS626A is not set
# CONFIG_INPUT_IQS7222 is not set
# CONFIG_INPUT_CMA3000 is not set
# CONFIG_INPUT_IDEAPAD_SLIDEBAR is not set
# CONFIG_INPUT_DRV260X_HAPTICS is not set
# CONFIG_INPUT_DRV2665_HAPTICS is not set
# CONFIG_INPUT_DRV2667_HAPTICS is not set
# CONFIG_RMI4_CORE is not set
#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_ARCH_MIGHT_HAVE_PC_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=m
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
CONFIG_SERIO_RAW=m
# CONFIG_SERIO_ALTERA_PS2 is not set
# CONFIG_SERIO_PS2MULT is not set
# CONFIG_SERIO_ARC_PS2 is not set
# CONFIG_SERIO_GPIO_PS2 is not set
# CONFIG_USERIO is not set
# CONFIG_GAMEPORT is not set
# end of Hardware I/O ports
# end of Input device support
#
# Character devices
#
CONFIG_TTY=y
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_VT_CONSOLE_SLEEP=y
CONFIG_VT_HW_CONSOLE_BINDING=y
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
CONFIG_LEGACY_TIOCSTI=y
CONFIG_LDISC_AUTOLOAD=y
#
# Serial drivers
#
CONFIG_SERIAL_EARLYCON=y
CONFIG_SERIAL_8250=y
# CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set
CONFIG_SERIAL_8250_PNP=y
CONFIG_SERIAL_8250_16550A_VARIANTS=y
# CONFIG_SERIAL_8250_FINTEK is not set
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_8250_PCILIB=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_EXAR=y
CONFIG_SERIAL_8250_NR_UARTS=32
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
CONFIG_SERIAL_8250_EXTENDED=y
CONFIG_SERIAL_8250_MANY_PORTS=y
# CONFIG_SERIAL_8250_PCI1XXXX is not set
CONFIG_SERIAL_8250_SHARE_IRQ=y
# CONFIG_SERIAL_8250_DETECT_IRQ is not set
# CONFIG_SERIAL_8250_RSA is not set
# CONFIG_SERIAL_8250_DW is not set
# CONFIG_SERIAL_8250_RT288X is not set
# CONFIG_SERIAL_8250_LPSS is not set
# CONFIG_SERIAL_8250_MID is not set
CONFIG_SERIAL_8250_PERICOM=y
#
# Non-8250 serial port support
#
# CONFIG_SERIAL_UARTLITE is not set
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
# CONFIG_SERIAL_JSM is not set
# CONFIG_SERIAL_LANTIQ is not set
# CONFIG_SERIAL_SCCNXP is not set
# CONFIG_SERIAL_SC16IS7XX is not set
# CONFIG_SERIAL_ALTERA_JTAGUART is not set
# CONFIG_SERIAL_ALTERA_UART is not set
# CONFIG_SERIAL_ARC is not set
# CONFIG_SERIAL_RP2 is not set
# CONFIG_SERIAL_FSL_LPUART is not set
# CONFIG_SERIAL_FSL_LINFLEXUART is not set
# CONFIG_SERIAL_SPRD is not set
# end of Serial drivers
CONFIG_SERIAL_MCTRL_GPIO=y
# CONFIG_SERIAL_NONSTANDARD is not set
# CONFIG_N_GSM is not set
# CONFIG_NOZOMI is not set
# CONFIG_NULL_TTY is not set
CONFIG_HVC_DRIVER=y
# CONFIG_SERIAL_DEV_BUS is not set
CONFIG_VIRTIO_CONSOLE=m
# CONFIG_IPMI_HANDLER is not set
CONFIG_HW_RANDOM=m
# CONFIG_HW_RANDOM_TIMERIOMEM is not set
# CONFIG_HW_RANDOM_INTEL is not set
CONFIG_HW_RANDOM_AMD=m
# CONFIG_HW_RANDOM_BA431 is not set
# CONFIG_HW_RANDOM_VIA is not set
# CONFIG_HW_RANDOM_VIRTIO is not set
# CONFIG_HW_RANDOM_XIPHERA is not set
# CONFIG_APPLICOM is not set
# CONFIG_MWAVE is not set
CONFIG_DEVMEM=y
CONFIG_NVRAM=m
CONFIG_DEVPORT=y
CONFIG_HPET=y
CONFIG_HPET_MMAP=y
CONFIG_HPET_MMAP_DEFAULT=y
CONFIG_HANGCHECK_TIMER=m
# CONFIG_TCG_TPM is not set
# CONFIG_TELCLOCK is not set
# CONFIG_XILLYBUS is not set
# CONFIG_XILLYUSB is not set
# end of Character devices
#
# I2C support
#
CONFIG_I2C=y
# CONFIG_ACPI_I2C_OPREGION is not set
CONFIG_I2C_BOARDINFO=y
# CONFIG_I2C_COMPAT is not set
# CONFIG_I2C_CHARDEV is not set
# CONFIG_I2C_MUX is not set
CONFIG_I2C_HELPER_AUTO=y
CONFIG_I2C_ALGOBIT=m
#
# I2C Hardware Bus support
#
#
# PC SMBus host controller drivers
#
# CONFIG_I2C_ALI1535 is not set
# CONFIG_I2C_ALI1563 is not set
# CONFIG_I2C_ALI15X3 is not set
CONFIG_I2C_AMD756=m
CONFIG_I2C_AMD756_S4882=m
CONFIG_I2C_AMD8111=m
# CONFIG_I2C_AMD_MP2 is not set
# CONFIG_I2C_I801 is not set
# CONFIG_I2C_ISCH is not set
# CONFIG_I2C_ISMT is not set
# CONFIG_I2C_PIIX4 is not set
# CONFIG_I2C_NFORCE2 is not set
# CONFIG_I2C_NVIDIA_GPU is not set
# CONFIG_I2C_SIS5595 is not set
# CONFIG_I2C_SIS630 is not set
# CONFIG_I2C_SIS96X is not set
# CONFIG_I2C_VIA is not set
# CONFIG_I2C_VIAPRO is not set
# CONFIG_I2C_ZHAOXIN is not set
#
# ACPI drivers
#
# CONFIG_I2C_SCMI is not set
#
# I2C system bus drivers (mostly embedded / system-on-chip)
#
# CONFIG_I2C_CBUS_GPIO is not set
# CONFIG_I2C_DESIGNWARE_PLATFORM is not set
# CONFIG_I2C_DESIGNWARE_PCI is not set
# CONFIG_I2C_EMEV2 is not set
# CONFIG_I2C_GPIO is not set
# CONFIG_I2C_OCORES is not set
# CONFIG_I2C_PCA_PLATFORM is not set
# CONFIG_I2C_SIMTEC is not set
# CONFIG_I2C_XILINX is not set
#
# External I2C/SMBus adapter drivers
#
# CONFIG_I2C_DIOLAN_U2C is not set
# CONFIG_I2C_CP2615 is not set
# CONFIG_I2C_PCI1XXXX is not set
# CONFIG_I2C_ROBOTFUZZ_OSIF is not set
# CONFIG_I2C_TAOS_EVM is not set
# CONFIG_I2C_TINY_USB is not set
# CONFIG_I2C_VIPERBOARD is not set
#
# Other I2C/SMBus bus drivers
#
# CONFIG_I2C_MLXCPLD is not set
# CONFIG_I2C_VIRTIO is not set
# end of I2C Hardware Bus support
# CONFIG_I2C_STUB is not set
# CONFIG_I2C_SLAVE is not set
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
# end of I2C support
# CONFIG_I3C is not set
# CONFIG_SPI is not set
# CONFIG_SPMI is not set
# CONFIG_HSI is not set
# CONFIG_PPS is not set
#
# PTP clock support
#
# CONFIG_PTP_1588_CLOCK is not set
CONFIG_PTP_1588_CLOCK_OPTIONAL=y
#
# Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks.
#
# end of PTP clock support
CONFIG_PINCTRL=y
# CONFIG_DEBUG_PINCTRL is not set
# CONFIG_PINCTRL_AMD is not set
# CONFIG_PINCTRL_CY8C95X0 is not set
# CONFIG_PINCTRL_MCP23S08 is not set
# CONFIG_PINCTRL_SX150X is not set
#
# Intel pinctrl drivers
#
# CONFIG_PINCTRL_BAYTRAIL is not set
# CONFIG_PINCTRL_CHERRYVIEW is not set
# CONFIG_PINCTRL_LYNXPOINT is not set
# CONFIG_PINCTRL_INTEL_PLATFORM is not set
# CONFIG_PINCTRL_ALDERLAKE is not set
# CONFIG_PINCTRL_BROXTON is not set
# CONFIG_PINCTRL_CANNONLAKE is not set
# CONFIG_PINCTRL_CEDARFORK is not set
# CONFIG_PINCTRL_DENVERTON is not set
# CONFIG_PINCTRL_ELKHARTLAKE is not set
# CONFIG_PINCTRL_EMMITSBURG is not set
# CONFIG_PINCTRL_GEMINILAKE is not set
# CONFIG_PINCTRL_ICELAKE is not set
# CONFIG_PINCTRL_JASPERLAKE is not set
# CONFIG_PINCTRL_LAKEFIELD is not set
# CONFIG_PINCTRL_LEWISBURG is not set
# CONFIG_PINCTRL_METEORLAKE is not set
# CONFIG_PINCTRL_METEORPOINT is not set
# CONFIG_PINCTRL_SUNRISEPOINT is not set
# CONFIG_PINCTRL_TIGERLAKE is not set
# end of Intel pinctrl drivers
#
# Renesas pinctrl drivers
#
# end of Renesas pinctrl drivers
CONFIG_GPIOLIB=y
CONFIG_GPIOLIB_FASTPATH_LIMIT=512
CONFIG_GPIO_ACPI=y
# CONFIG_DEBUG_GPIO is not set
CONFIG_GPIO_CDEV=y
CONFIG_GPIO_CDEV_V1=y
CONFIG_GPIO_GENERIC=m
#
# Memory mapped GPIO drivers
#
CONFIG_GPIO_AMDPT=m
# CONFIG_GPIO_DWAPB is not set
# CONFIG_GPIO_EXAR is not set
# CONFIG_GPIO_GENERIC_PLATFORM is not set
# CONFIG_GPIO_GRANITERAPIDS is not set
# CONFIG_GPIO_MB86S7X is not set
# CONFIG_GPIO_AMD_FCH is not set
# end of Memory mapped GPIO drivers
#
# Port-mapped I/O GPIO drivers
#
# CONFIG_GPIO_VX855 is not set
# CONFIG_GPIO_F7188X is not set
# CONFIG_GPIO_IT87 is not set
# CONFIG_GPIO_SCH311X is not set
# CONFIG_GPIO_WINBOND is not set
# CONFIG_GPIO_WS16C48 is not set
# end of Port-mapped I/O GPIO drivers
#
# I2C GPIO expanders
#
# CONFIG_GPIO_FXL6408 is not set
# CONFIG_GPIO_DS4520 is not set
# CONFIG_GPIO_MAX7300 is not set
# CONFIG_GPIO_MAX732X is not set
# CONFIG_GPIO_PCA953X is not set
# CONFIG_GPIO_PCA9570 is not set
# CONFIG_GPIO_PCF857X is not set
# CONFIG_GPIO_TPIC2810 is not set
# end of I2C GPIO expanders
#
# MFD GPIO expanders
#
# CONFIG_GPIO_ELKHARTLAKE is not set
# end of MFD GPIO expanders
#
# PCI GPIO expanders
#
# CONFIG_GPIO_AMD8111 is not set
# CONFIG_GPIO_BT8XX is not set
CONFIG_GPIO_ML_IOH=m
# CONFIG_GPIO_PCI_IDIO_16 is not set
# CONFIG_GPIO_PCIE_IDIO_24 is not set
# CONFIG_GPIO_RDC321X is not set
# end of PCI GPIO expanders
#
# USB GPIO expanders
#
CONFIG_GPIO_VIPERBOARD=m
# end of USB GPIO expanders
#
# Virtual GPIO drivers
#
# CONFIG_GPIO_AGGREGATOR is not set
# CONFIG_GPIO_LATCH is not set
# CONFIG_GPIO_MOCKUP is not set
# CONFIG_GPIO_VIRTIO is not set
# CONFIG_GPIO_SIM is not set
# end of Virtual GPIO drivers
# CONFIG_W1 is not set
# CONFIG_POWER_RESET is not set
CONFIG_POWER_SUPPLY=y
# CONFIG_POWER_SUPPLY_DEBUG is not set
CONFIG_POWER_SUPPLY_HWMON=y
# CONFIG_IP5XXX_POWER is not set
# CONFIG_TEST_POWER is not set
# CONFIG_CHARGER_ADP5061 is not set
# CONFIG_BATTERY_CW2015 is not set
# CONFIG_BATTERY_DS2780 is not set
# CONFIG_BATTERY_DS2781 is not set
# CONFIG_BATTERY_DS2782 is not set
# CONFIG_BATTERY_SAMSUNG_SDI is not set
# CONFIG_BATTERY_SBS is not set
# CONFIG_CHARGER_SBS is not set
# CONFIG_BATTERY_BQ27XXX is not set
# CONFIG_BATTERY_MAX17042 is not set
# CONFIG_CHARGER_MAX8903 is not set
# CONFIG_CHARGER_LP8727 is not set
# CONFIG_CHARGER_GPIO is not set
# CONFIG_CHARGER_LT3651 is not set
# CONFIG_CHARGER_LTC4162L is not set
# CONFIG_CHARGER_MAX77976 is not set
# CONFIG_CHARGER_BQ2415X is not set
# CONFIG_CHARGER_BQ24257 is not set
# CONFIG_CHARGER_BQ24735 is not set
# CONFIG_CHARGER_BQ2515X is not set
# CONFIG_CHARGER_BQ25890 is not set
# CONFIG_CHARGER_BQ25980 is not set
# CONFIG_CHARGER_BQ256XX is not set
# CONFIG_BATTERY_GAUGE_LTC2941 is not set
# CONFIG_BATTERY_GOLDFISH is not set
# CONFIG_BATTERY_RT5033 is not set
# CONFIG_CHARGER_RT9455 is not set
# CONFIG_CHARGER_BD99954 is not set
# CONFIG_BATTERY_UG3105 is not set
# CONFIG_FUEL_GAUGE_MM8013 is not set
CONFIG_HWMON=y
# CONFIG_HWMON_DEBUG_CHIP is not set
#
# Native drivers
#
# CONFIG_SENSORS_ABITUGURU is not set
# CONFIG_SENSORS_ABITUGURU3 is not set
# CONFIG_SENSORS_AD7414 is not set
# CONFIG_SENSORS_AD7418 is not set
# CONFIG_SENSORS_ADM1021 is not set
# CONFIG_SENSORS_ADM1025 is not set
# CONFIG_SENSORS_ADM1026 is not set
# CONFIG_SENSORS_ADM1029 is not set
# CONFIG_SENSORS_ADM1031 is not set
# CONFIG_SENSORS_ADM1177 is not set
# CONFIG_SENSORS_ADM9240 is not set
# CONFIG_SENSORS_ADT7410 is not set
# CONFIG_SENSORS_ADT7411 is not set
# CONFIG_SENSORS_ADT7462 is not set
# CONFIG_SENSORS_ADT7470 is not set
# CONFIG_SENSORS_ADT7475 is not set
# CONFIG_SENSORS_AHT10 is not set
# CONFIG_SENSORS_AQUACOMPUTER_D5NEXT is not set
# CONFIG_SENSORS_AS370 is not set
# CONFIG_SENSORS_ASC7621 is not set
# CONFIG_SENSORS_ASUS_ROG_RYUJIN is not set
# CONFIG_SENSORS_AXI_FAN_CONTROL is not set
CONFIG_SENSORS_K8TEMP=m
CONFIG_SENSORS_K10TEMP=m
CONFIG_SENSORS_FAM15H_POWER=m
# CONFIG_SENSORS_APPLESMC is not set
# CONFIG_SENSORS_ASB100 is not set
# CONFIG_SENSORS_ATXP1 is not set
# CONFIG_SENSORS_CHIPCAP2 is not set
# CONFIG_SENSORS_CORSAIR_CPRO is not set
# CONFIG_SENSORS_CORSAIR_PSU is not set
# CONFIG_SENSORS_DRIVETEMP is not set
# CONFIG_SENSORS_DS620 is not set
# CONFIG_SENSORS_DS1621 is not set
# CONFIG_SENSORS_DELL_SMM is not set
# CONFIG_SENSORS_I5K_AMB is not set
# CONFIG_SENSORS_F71805F is not set
# CONFIG_SENSORS_F71882FG is not set
# CONFIG_SENSORS_F75375S is not set
# CONFIG_SENSORS_FSCHMD is not set
# CONFIG_SENSORS_GIGABYTE_WATERFORCE is not set
# CONFIG_SENSORS_GL518SM is not set
# CONFIG_SENSORS_GL520SM is not set
# CONFIG_SENSORS_G760A is not set
# CONFIG_SENSORS_G762 is not set
# CONFIG_SENSORS_HIH6130 is not set
# CONFIG_SENSORS_HS3001 is not set
# CONFIG_SENSORS_I5500 is not set
# CONFIG_SENSORS_CORETEMP is not set
# CONFIG_SENSORS_IT87 is not set
# CONFIG_SENSORS_JC42 is not set
# CONFIG_SENSORS_POWERZ is not set
# CONFIG_SENSORS_POWR1220 is not set
# CONFIG_SENSORS_LENOVO_EC is not set
# CONFIG_SENSORS_LINEAGE is not set
# CONFIG_SENSORS_LTC2945 is not set
# CONFIG_SENSORS_LTC2947_I2C is not set
# CONFIG_SENSORS_LTC2990 is not set
# CONFIG_SENSORS_LTC2991 is not set
# CONFIG_SENSORS_LTC2992 is not set
# CONFIG_SENSORS_LTC4151 is not set
# CONFIG_SENSORS_LTC4215 is not set
# CONFIG_SENSORS_LTC4222 is not set
# CONFIG_SENSORS_LTC4245 is not set
# CONFIG_SENSORS_LTC4260 is not set
# CONFIG_SENSORS_LTC4261 is not set
# CONFIG_SENSORS_LTC4282 is not set
# CONFIG_SENSORS_MAX127 is not set
# CONFIG_SENSORS_MAX16065 is not set
# CONFIG_SENSORS_MAX1619 is not set
# CONFIG_SENSORS_MAX1668 is not set
# CONFIG_SENSORS_MAX197 is not set
# CONFIG_SENSORS_MAX31730 is not set
# CONFIG_SENSORS_MAX31760 is not set
# CONFIG_MAX31827 is not set
# CONFIG_SENSORS_MAX6620 is not set
# CONFIG_SENSORS_MAX6621 is not set
# CONFIG_SENSORS_MAX6639 is not set
# CONFIG_SENSORS_MAX6642 is not set
# CONFIG_SENSORS_MAX6650 is not set
# CONFIG_SENSORS_MAX6697 is not set
# CONFIG_SENSORS_MAX31790 is not set
# CONFIG_SENSORS_MC34VR500 is not set
# CONFIG_SENSORS_MCP3021 is not set
# CONFIG_SENSORS_TC654 is not set
# CONFIG_SENSORS_TPS23861 is not set
# CONFIG_SENSORS_MR75203 is not set
# CONFIG_SENSORS_LM63 is not set
# CONFIG_SENSORS_LM73 is not set
# CONFIG_SENSORS_LM75 is not set
# CONFIG_SENSORS_LM77 is not set
# CONFIG_SENSORS_LM78 is not set
# CONFIG_SENSORS_LM80 is not set
# CONFIG_SENSORS_LM83 is not set
# CONFIG_SENSORS_LM85 is not set
# CONFIG_SENSORS_LM87 is not set
# CONFIG_SENSORS_LM90 is not set
# CONFIG_SENSORS_LM92 is not set
# CONFIG_SENSORS_LM93 is not set
# CONFIG_SENSORS_LM95234 is not set
# CONFIG_SENSORS_LM95241 is not set
# CONFIG_SENSORS_LM95245 is not set
# CONFIG_SENSORS_PC87360 is not set
# CONFIG_SENSORS_PC87427 is not set
# CONFIG_SENSORS_NCT6683 is not set
# CONFIG_SENSORS_NCT6775 is not set
# CONFIG_SENSORS_NCT6775_I2C is not set
# CONFIG_SENSORS_NCT7802 is not set
# CONFIG_SENSORS_NPCM7XX is not set
# CONFIG_SENSORS_NZXT_KRAKEN2 is not set
# CONFIG_SENSORS_NZXT_KRAKEN3 is not set
# CONFIG_SENSORS_NZXT_SMART2 is not set
# CONFIG_SENSORS_OCC_P8_I2C is not set
# CONFIG_SENSORS_OXP is not set
# CONFIG_SENSORS_PCF8591 is not set
# CONFIG_PMBUS is not set
# CONFIG_SENSORS_PT5161L is not set
# CONFIG_SENSORS_SBTSI is not set
# CONFIG_SENSORS_SBRMI is not set
# CONFIG_SENSORS_SHT15 is not set
# CONFIG_SENSORS_SHT21 is not set
# CONFIG_SENSORS_SHT3x is not set
# CONFIG_SENSORS_SHT4x is not set
# CONFIG_SENSORS_SHTC1 is not set
# CONFIG_SENSORS_SIS5595 is not set
# CONFIG_SENSORS_DME1737 is not set
# CONFIG_SENSORS_EMC1403 is not set
# CONFIG_SENSORS_EMC2103 is not set
# CONFIG_SENSORS_EMC2305 is not set
# CONFIG_SENSORS_EMC6W201 is not set
# CONFIG_SENSORS_SMSC47M1 is not set
# CONFIG_SENSORS_SMSC47M192 is not set
# CONFIG_SENSORS_SMSC47B397 is not set
# CONFIG_SENSORS_STTS751 is not set
# CONFIG_SENSORS_ADC128D818 is not set
# CONFIG_SENSORS_ADS7828 is not set
# CONFIG_SENSORS_AMC6821 is not set
# CONFIG_SENSORS_INA209 is not set
# CONFIG_SENSORS_INA2XX is not set
# CONFIG_SENSORS_INA238 is not set
# CONFIG_SENSORS_INA3221 is not set
# CONFIG_SENSORS_TC74 is not set
# CONFIG_SENSORS_THMC50 is not set
# CONFIG_SENSORS_TMP102 is not set
# CONFIG_SENSORS_TMP103 is not set
# CONFIG_SENSORS_TMP108 is not set
# CONFIG_SENSORS_TMP401 is not set
# CONFIG_SENSORS_TMP421 is not set
# CONFIG_SENSORS_TMP464 is not set
# CONFIG_SENSORS_TMP513 is not set
# CONFIG_SENSORS_VIA_CPUTEMP is not set
# CONFIG_SENSORS_VIA686A is not set
# CONFIG_SENSORS_VT1211 is not set
# CONFIG_SENSORS_VT8231 is not set
# CONFIG_SENSORS_W83773G is not set
# CONFIG_SENSORS_W83781D is not set
# CONFIG_SENSORS_W83791D is not set
# CONFIG_SENSORS_W83792D is not set
# CONFIG_SENSORS_W83793 is not set
# CONFIG_SENSORS_W83795 is not set
# CONFIG_SENSORS_W83L785TS is not set
# CONFIG_SENSORS_W83L786NG is not set
# CONFIG_SENSORS_W83627HF is not set
# CONFIG_SENSORS_W83627EHF is not set
# CONFIG_SENSORS_XGENE is not set
#
# ACPI drivers
#
# CONFIG_SENSORS_ACPI_POWER is not set
# CONFIG_SENSORS_ATK0110 is not set
# CONFIG_SENSORS_ASUS_WMI is not set
# CONFIG_SENSORS_ASUS_EC is not set
# CONFIG_SENSORS_HP_WMI is not set
CONFIG_THERMAL=y
# CONFIG_THERMAL_NETLINK is not set
# CONFIG_THERMAL_STATISTICS is not set
# CONFIG_THERMAL_DEBUGFS is not set
CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS=0
CONFIG_THERMAL_HWMON=y
CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE=y
# CONFIG_THERMAL_DEFAULT_GOV_FAIR_SHARE is not set
# CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE is not set
# CONFIG_THERMAL_DEFAULT_GOV_BANG_BANG is not set
CONFIG_THERMAL_GOV_FAIR_SHARE=y
CONFIG_THERMAL_GOV_STEP_WISE=y
CONFIG_THERMAL_GOV_BANG_BANG=y
# CONFIG_THERMAL_GOV_USER_SPACE is not set
# CONFIG_THERMAL_EMULATION is not set
#
# Intel thermal drivers
#
# CONFIG_INTEL_POWERCLAMP is not set
CONFIG_X86_THERMAL_VECTOR=y
# CONFIG_X86_PKG_TEMP_THERMAL is not set
# CONFIG_INTEL_SOC_DTS_THERMAL is not set
#
# ACPI INT340X thermal drivers
#
# CONFIG_INT340X_THERMAL is not set
# end of ACPI INT340X thermal drivers
# CONFIG_INTEL_PCH_THERMAL is not set
# CONFIG_INTEL_TCC_COOLING is not set
# CONFIG_INTEL_HFI_THERMAL is not set
# end of Intel thermal drivers
# CONFIG_WATCHDOG is not set
CONFIG_SSB_POSSIBLE=y
# CONFIG_SSB is not set
CONFIG_BCMA_POSSIBLE=y
# CONFIG_BCMA is not set
#
# Multifunction device drivers
#
CONFIG_MFD_CORE=m
# CONFIG_MFD_AS3711 is not set
# CONFIG_MFD_SMPRO is not set
# CONFIG_PMIC_ADP5520 is not set
# CONFIG_MFD_AAT2870_CORE is not set
# CONFIG_MFD_BCM590XX is not set
# CONFIG_MFD_BD9571MWV is not set
# CONFIG_MFD_AXP20X_I2C is not set
# CONFIG_MFD_CS42L43_I2C is not set
# CONFIG_MFD_MADERA is not set
# CONFIG_PMIC_DA903X is not set
# CONFIG_MFD_DA9052_I2C is not set
# CONFIG_MFD_DA9055 is not set
# CONFIG_MFD_DA9062 is not set
# CONFIG_MFD_DA9063 is not set
# CONFIG_MFD_DA9150 is not set
# CONFIG_MFD_DLN2 is not set
# CONFIG_MFD_MC13XXX_I2C is not set
# CONFIG_MFD_MP2629 is not set
# CONFIG_MFD_INTEL_QUARK_I2C_GPIO is not set
# CONFIG_LPC_ICH is not set
# CONFIG_LPC_SCH is not set
# CONFIG_MFD_INTEL_LPSS_ACPI is not set
# CONFIG_MFD_INTEL_LPSS_PCI is not set
# CONFIG_MFD_INTEL_PMC_BXT is not set
# CONFIG_MFD_IQS62X is not set
# CONFIG_MFD_JANZ_CMODIO is not set
# CONFIG_MFD_KEMPLD is not set
# CONFIG_MFD_88PM800 is not set
# CONFIG_MFD_88PM805 is not set
# CONFIG_MFD_88PM860X is not set
# CONFIG_MFD_MAX14577 is not set
# CONFIG_MFD_MAX77541 is not set
# CONFIG_MFD_MAX77693 is not set
# CONFIG_MFD_MAX77843 is not set
# CONFIG_MFD_MAX8907 is not set
# CONFIG_MFD_MAX8925 is not set
# CONFIG_MFD_MAX8997 is not set
# CONFIG_MFD_MAX8998 is not set
# CONFIG_MFD_MT6360 is not set
# CONFIG_MFD_MT6370 is not set
# CONFIG_MFD_MT6397 is not set
# CONFIG_MFD_MENF21BMC is not set
CONFIG_MFD_VIPERBOARD=m
# CONFIG_MFD_RETU is not set
# CONFIG_MFD_PCF50633 is not set
# CONFIG_MFD_SY7636A is not set
# CONFIG_MFD_RDC321X is not set
# CONFIG_MFD_RT4831 is not set
# CONFIG_MFD_RT5033 is not set
# CONFIG_MFD_RT5120 is not set
# CONFIG_MFD_RC5T583 is not set
# CONFIG_MFD_SI476X_CORE is not set
# CONFIG_MFD_SM501 is not set
# CONFIG_MFD_SKY81452 is not set
# CONFIG_MFD_SYSCON is not set
# CONFIG_MFD_LP3943 is not set
# CONFIG_MFD_LP8788 is not set
# CONFIG_MFD_TI_LMU is not set
# CONFIG_MFD_PALMAS is not set
# CONFIG_TPS6105X is not set
# CONFIG_TPS65010 is not set
# CONFIG_TPS6507X is not set
# CONFIG_MFD_TPS65086 is not set
# CONFIG_MFD_TPS65090 is not set
# CONFIG_MFD_TI_LP873X is not set
# CONFIG_MFD_TPS6586X is not set
# CONFIG_MFD_TPS65910 is not set
# CONFIG_MFD_TPS65912_I2C is not set
# CONFIG_MFD_TPS6594_I2C is not set
# CONFIG_TWL4030_CORE is not set
# CONFIG_TWL6040_CORE is not set
# CONFIG_MFD_WL1273_CORE is not set
# CONFIG_MFD_LM3533 is not set
# CONFIG_MFD_TQMX86 is not set
# CONFIG_MFD_VX855 is not set
# CONFIG_MFD_ARIZONA_I2C is not set
# CONFIG_MFD_WM8400 is not set
# CONFIG_MFD_WM831X_I2C is not set
# CONFIG_MFD_WM8350_I2C is not set
# CONFIG_MFD_WM8994 is not set
# CONFIG_MFD_ATC260X_I2C is not set
# end of Multifunction device drivers
# CONFIG_REGULATOR is not set
# CONFIG_RC_CORE is not set
#
# CEC support
#
# CONFIG_MEDIA_CEC_SUPPORT is not set
# end of CEC support
# CONFIG_MEDIA_SUPPORT is not set
#
# Graphics support
#
CONFIG_APERTURE_HELPERS=y
CONFIG_SCREEN_INFO=y
CONFIG_VIDEO=y
# CONFIG_AUXDISPLAY is not set
CONFIG_AGP=y
CONFIG_AGP_AMD64=y
# CONFIG_AGP_INTEL is not set
# CONFIG_AGP_SIS is not set
# CONFIG_AGP_VIA is not set
# CONFIG_VGA_SWITCHEROO is not set
CONFIG_DRM=y
# CONFIG_DRM_DEBUG_MM is not set
CONFIG_DRM_KMS_HELPER=y
CONFIG_DRM_FBDEV_EMULATION=y
CONFIG_DRM_FBDEV_OVERALLOC=100
CONFIG_DRM_LOAD_EDID_FIRMWARE=y
CONFIG_DRM_DISPLAY_HELPER=m
# CONFIG_DRM_DISPLAY_DP_AUX_CEC is not set
# CONFIG_DRM_DISPLAY_DP_AUX_CHARDEV is not set
CONFIG_DRM_DISPLAY_DP_HELPER=y
CONFIG_DRM_TTM=y
CONFIG_DRM_VRAM_HELPER=y
CONFIG_DRM_TTM_HELPER=y
CONFIG_DRM_GEM_SHMEM_HELPER=y
CONFIG_DRM_SUBALLOC_HELPER=m
#
# I2C encoder or helper chips
#
# CONFIG_DRM_I2C_CH7006 is not set
# CONFIG_DRM_I2C_SIL164 is not set
# CONFIG_DRM_I2C_NXP_TDA998X is not set
# CONFIG_DRM_I2C_NXP_TDA9950 is not set
# end of I2C encoder or helper chips
#
# ARM devices
#
# end of ARM devices
CONFIG_DRM_RADEON=m
CONFIG_DRM_RADEON_USERPTR=y
# CONFIG_DRM_AMDGPU is not set
# CONFIG_DRM_NOUVEAU is not set
# CONFIG_DRM_I915 is not set
# CONFIG_DRM_XE is not set
# CONFIG_DRM_VGEM is not set
# CONFIG_DRM_VKMS is not set
# CONFIG_DRM_GMA500 is not set
# CONFIG_DRM_UDL is not set
# CONFIG_DRM_AST is not set
# CONFIG_DRM_MGAG200 is not set
# CONFIG_DRM_QXL is not set
CONFIG_DRM_VIRTIO_GPU=m
CONFIG_DRM_VIRTIO_GPU_KMS=y
CONFIG_DRM_PANEL=y
#
# Display Panels
#
# end of Display Panels
CONFIG_DRM_BRIDGE=y
CONFIG_DRM_PANEL_BRIDGE=y
#
# Display Interface Bridges
#
# CONFIG_DRM_ANALOGIX_ANX78XX is not set
# end of Display Interface Bridges
# CONFIG_DRM_ETNAVIV is not set
CONFIG_DRM_BOCHS=y
CONFIG_DRM_CIRRUS_QEMU=y
# CONFIG_DRM_GM12U320 is not set
# CONFIG_DRM_SIMPLEDRM is not set
# CONFIG_DRM_VBOXVIDEO is not set
# CONFIG_DRM_GUD is not set
# CONFIG_DRM_SSD130X is not set
CONFIG_DRM_PANEL_ORIENTATION_QUIRKS=y
#
# Frame buffer Devices
#
CONFIG_FB=y
# CONFIG_FB_CIRRUS is not set
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_ARC is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
CONFIG_FB_VGA16=m
CONFIG_FB_VESA=y
CONFIG_FB_EFI=y
# CONFIG_FB_N411 is not set
# CONFIG_FB_HGA is not set
# CONFIG_FB_OPENCORES is not set
# CONFIG_FB_S1D13XXX is not set
# CONFIG_FB_NVIDIA is not set
# CONFIG_FB_RIVA is not set
# CONFIG_FB_I740 is not set
# CONFIG_FB_MATROX is not set
CONFIG_FB_RADEON=m
CONFIG_FB_RADEON_I2C=y
CONFIG_FB_RADEON_BACKLIGHT=y
# CONFIG_FB_RADEON_DEBUG is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
# CONFIG_FB_S3 is not set
# CONFIG_FB_SAVAGE is not set
# CONFIG_FB_SIS is not set
# CONFIG_FB_VIA is not set
# CONFIG_FB_NEOMAGIC is not set
# CONFIG_FB_KYRO is not set
# CONFIG_FB_3DFX is not set
# CONFIG_FB_VOODOO1 is not set
# CONFIG_FB_VT8623 is not set
# CONFIG_FB_TRIDENT is not set
# CONFIG_FB_ARK is not set
# CONFIG_FB_PM3 is not set
# CONFIG_FB_CARMINE is not set
# CONFIG_FB_SMSCUFX is not set
# CONFIG_FB_UDL is not set
# CONFIG_FB_IBM_GXT4500 is not set
# CONFIG_FB_VIRTUAL is not set
# CONFIG_FB_METRONOME is not set
# CONFIG_FB_MB862XX is not set
# CONFIG_FB_SIMPLE is not set
# CONFIG_FB_SSD1307 is not set
# CONFIG_FB_SM712 is not set
CONFIG_FB_CORE=y
CONFIG_FB_NOTIFY=y
CONFIG_FIRMWARE_EDID=y
CONFIG_FB_DEVICE=y
CONFIG_FB_DDC=m
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
CONFIG_FB_SYS_FILLRECT=y
CONFIG_FB_SYS_COPYAREA=y
CONFIG_FB_SYS_IMAGEBLIT=y
# CONFIG_FB_FOREIGN_ENDIAN is not set
CONFIG_FB_SYSMEM_FOPS=y
CONFIG_FB_DEFERRED_IO=y
CONFIG_FB_IOMEM_FOPS=y
CONFIG_FB_IOMEM_HELPERS=y
CONFIG_FB_SYSMEM_HELPERS=y
CONFIG_FB_SYSMEM_HELPERS_DEFERRED=y
CONFIG_FB_BACKLIGHT=m
CONFIG_FB_MODE_HELPERS=y
CONFIG_FB_TILEBLITTING=y
# end of Frame buffer Devices
#
# Backlight & LCD device support
#
# CONFIG_LCD_CLASS_DEVICE is not set
CONFIG_BACKLIGHT_CLASS_DEVICE=y
# CONFIG_BACKLIGHT_KTD253 is not set
# CONFIG_BACKLIGHT_KTD2801 is not set
# CONFIG_BACKLIGHT_KTZ8866 is not set
# CONFIG_BACKLIGHT_APPLE is not set
# CONFIG_BACKLIGHT_QCOM_WLED is not set
# CONFIG_BACKLIGHT_SAHARA is not set
# CONFIG_BACKLIGHT_ADP8860 is not set
# CONFIG_BACKLIGHT_ADP8870 is not set
# CONFIG_BACKLIGHT_LM3639 is not set
# CONFIG_BACKLIGHT_GPIO is not set
# CONFIG_BACKLIGHT_LV5207LP is not set
# CONFIG_BACKLIGHT_BD6107 is not set
# CONFIG_BACKLIGHT_ARCXCNN is not set
# end of Backlight & LCD device support
CONFIG_VGASTATE=m
CONFIG_HDMI=y
#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_DUMMY_CONSOLE=y
CONFIG_DUMMY_CONSOLE_COLUMNS=80
CONFIG_DUMMY_CONSOLE_ROWS=25
CONFIG_FRAMEBUFFER_CONSOLE=y
# CONFIG_FRAMEBUFFER_CONSOLE_LEGACY_ACCELERATION is not set
CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y
CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y
# CONFIG_FRAMEBUFFER_CONSOLE_DEFERRED_TAKEOVER is not set
# end of Console display driver support
# CONFIG_LOGO is not set
# end of Graphics support
# CONFIG_DRM_ACCEL is not set
CONFIG_SOUND=m
CONFIG_SND=m
CONFIG_SND_TIMER=m
CONFIG_SND_PCM=m
CONFIG_SND_SEQ_DEVICE=m
CONFIG_SND_JACK=y
CONFIG_SND_JACK_INPUT_DEV=y
# CONFIG_SND_OSSEMUL is not set
CONFIG_SND_PCM_TIMER=y
CONFIG_SND_HRTIMER=m
CONFIG_SND_DYNAMIC_MINORS=y
CONFIG_SND_MAX_CARDS=4
# CONFIG_SND_SUPPORT_OLD_API is not set
CONFIG_SND_PROC_FS=y
CONFIG_SND_VERBOSE_PROCFS=y
# CONFIG_SND_VERBOSE_PRINTK is not set
CONFIG_SND_CTL_FAST_LOOKUP=y
# CONFIG_SND_DEBUG is not set
# CONFIG_SND_CTL_INPUT_VALIDATION is not set
CONFIG_SND_VMASTER=y
CONFIG_SND_DMA_SGBUF=y
CONFIG_SND_CTL_LED=m
CONFIG_SND_SEQUENCER=m
CONFIG_SND_SEQ_DUMMY=m
CONFIG_SND_SEQ_HRTIMER_DEFAULT=y
# CONFIG_SND_SEQ_UMP is not set
CONFIG_SND_AC97_CODEC=m
# CONFIG_SND_DRIVERS is not set
CONFIG_SND_PCI=y
# CONFIG_SND_AD1889 is not set
# CONFIG_SND_ALS300 is not set
# CONFIG_SND_ALS4000 is not set
# CONFIG_SND_ALI5451 is not set
# CONFIG_SND_ASIHPI is not set
CONFIG_SND_ATIIXP=m
# CONFIG_SND_ATIIXP_MODEM is not set
# CONFIG_SND_AU8810 is not set
# CONFIG_SND_AU8820 is not set
# CONFIG_SND_AU8830 is not set
# CONFIG_SND_AW2 is not set
# CONFIG_SND_AZT3328 is not set
# CONFIG_SND_BT87X is not set
# CONFIG_SND_CA0106 is not set
# CONFIG_SND_CMIPCI is not set
# CONFIG_SND_OXYGEN is not set
# CONFIG_SND_CS4281 is not set
# CONFIG_SND_CS46XX is not set
# CONFIG_SND_CTXFI is not set
# CONFIG_SND_DARLA20 is not set
# CONFIG_SND_GINA20 is not set
# CONFIG_SND_LAYLA20 is not set
# CONFIG_SND_DARLA24 is not set
# CONFIG_SND_GINA24 is not set
# CONFIG_SND_LAYLA24 is not set
# CONFIG_SND_MONA is not set
# CONFIG_SND_MIA is not set
# CONFIG_SND_ECHO3G is not set
# CONFIG_SND_INDIGO is not set
# CONFIG_SND_INDIGOIO is not set
# CONFIG_SND_INDIGODJ is not set
# CONFIG_SND_INDIGOIOX is not set
# CONFIG_SND_INDIGODJX is not set
# CONFIG_SND_EMU10K1 is not set
# CONFIG_SND_EMU10K1X is not set
# CONFIG_SND_ENS1370 is not set
# CONFIG_SND_ENS1371 is not set
# CONFIG_SND_ES1938 is not set
# CONFIG_SND_ES1968 is not set
# CONFIG_SND_FM801 is not set
# CONFIG_SND_HDSP is not set
# CONFIG_SND_HDSPM is not set
# CONFIG_SND_ICE1712 is not set
# CONFIG_SND_ICE1724 is not set
CONFIG_SND_INTEL8X0=m
CONFIG_SND_INTEL8X0M=m
# CONFIG_SND_KORG1212 is not set
# CONFIG_SND_LOLA is not set
# CONFIG_SND_LX6464ES is not set
# CONFIG_SND_MAESTRO3 is not set
# CONFIG_SND_MIXART is not set
# CONFIG_SND_NM256 is not set
# CONFIG_SND_PCXHR is not set
# CONFIG_SND_RIPTIDE is not set
# CONFIG_SND_RME32 is not set
# CONFIG_SND_RME96 is not set
# CONFIG_SND_RME9652 is not set
# CONFIG_SND_SE6X is not set
# CONFIG_SND_SONICVIBES is not set
# CONFIG_SND_TRIDENT is not set
# CONFIG_SND_VIA82XX is not set
# CONFIG_SND_VIA82XX_MODEM is not set
# CONFIG_SND_VIRTUOSO is not set
# CONFIG_SND_VX222 is not set
# CONFIG_SND_YMFPCI is not set
#
# HD-Audio
#
CONFIG_SND_HDA=m
CONFIG_SND_HDA_GENERIC_LEDS=y
CONFIG_SND_HDA_INTEL=m
# CONFIG_SND_HDA_HWDEP is not set
CONFIG_SND_HDA_RECONFIG=y
CONFIG_SND_HDA_INPUT_BEEP=y
CONFIG_SND_HDA_INPUT_BEEP_MODE=1
# CONFIG_SND_HDA_PATCH_LOADER is not set
CONFIG_SND_HDA_SCODEC_COMPONENT=m
# CONFIG_SND_HDA_SCODEC_CS35L41_I2C is not set
# CONFIG_SND_HDA_SCODEC_CS35L56_I2C is not set
# CONFIG_SND_HDA_SCODEC_TAS2781_I2C is not set
CONFIG_SND_HDA_CODEC_REALTEK=m
# CONFIG_SND_HDA_CODEC_ANALOG is not set
# CONFIG_SND_HDA_CODEC_SIGMATEL is not set
# CONFIG_SND_HDA_CODEC_VIA is not set
CONFIG_SND_HDA_CODEC_HDMI=m
# CONFIG_SND_HDA_CODEC_CIRRUS is not set
# CONFIG_SND_HDA_CODEC_CS8409 is not set
# CONFIG_SND_HDA_CODEC_CONEXANT is not set
# CONFIG_SND_HDA_CODEC_CA0110 is not set
# CONFIG_SND_HDA_CODEC_CA0132 is not set
# CONFIG_SND_HDA_CODEC_CMEDIA is not set
# CONFIG_SND_HDA_CODEC_SI3054 is not set
CONFIG_SND_HDA_GENERIC=m
CONFIG_SND_HDA_POWER_SAVE_DEFAULT=0
# CONFIG_SND_HDA_INTEL_HDMI_SILENT_STREAM is not set
# CONFIG_SND_HDA_CTL_DEV_ID is not set
# end of HD-Audio
CONFIG_SND_HDA_CORE=m
CONFIG_SND_HDA_COMPONENT=y
CONFIG_SND_HDA_PREALLOC_SIZE=0
CONFIG_SND_INTEL_NHLT=y
CONFIG_SND_INTEL_DSP_CONFIG=m
CONFIG_SND_INTEL_SOUNDWIRE_ACPI=m
# CONFIG_SND_USB is not set
# CONFIG_SND_FIREWIRE is not set
CONFIG_SND_SOC=m
CONFIG_SND_SOC_ACPI=m
# CONFIG_SND_SOC_ADI is not set
CONFIG_SND_SOC_AMD_ACP=m
# CONFIG_SND_SOC_AMD_CZ_DA7219MX98357_MACH is not set
# CONFIG_SND_SOC_AMD_CZ_RT5645_MACH is not set
# CONFIG_SND_SOC_AMD_ST_ES8336_MACH is not set
CONFIG_SND_SOC_AMD_ACP3x=m
CONFIG_SND_SOC_AMD_RENOIR=m
# CONFIG_SND_SOC_AMD_RENOIR_MACH is not set
CONFIG_SND_SOC_AMD_ACP5x=m
# CONFIG_SND_SOC_AMD_ACP6x is not set
CONFIG_SND_AMD_ACP_CONFIG=m
# CONFIG_SND_SOC_AMD_ACP_COMMON is not set
# CONFIG_SND_SOC_AMD_RPL_ACP6x is not set
CONFIG_SND_SOC_AMD_ACP63_TOPLEVEL=m
# CONFIG_SND_SOC_AMD_PS is not set
# CONFIG_SND_ATMEL_SOC is not set
# CONFIG_SND_BCM63XX_I2S_WHISTLER is not set
# CONFIG_SND_DESIGNWARE_I2S is not set
#
# SoC Audio for Freescale CPUs
#
#
# Common SoC Audio options for Freescale CPUs:
#
# CONFIG_SND_SOC_FSL_ASRC is not set
# CONFIG_SND_SOC_FSL_SAI is not set
# CONFIG_SND_SOC_FSL_AUDMIX is not set
# CONFIG_SND_SOC_FSL_SSI is not set
# CONFIG_SND_SOC_FSL_SPDIF is not set
# CONFIG_SND_SOC_FSL_ESAI is not set
# CONFIG_SND_SOC_FSL_MICFIL is not set
# CONFIG_SND_SOC_FSL_XCVR is not set
# CONFIG_SND_SOC_IMX_AUDMUX is not set
# end of SoC Audio for Freescale CPUs
# CONFIG_SND_SOC_CHV3_I2S is not set
# CONFIG_SND_I2S_HI6210_I2S is not set
# CONFIG_SND_SOC_IMG is not set
# CONFIG_SND_SOC_INTEL_SST_TOPLEVEL is not set
# CONFIG_SND_SOC_INTEL_AVS is not set
# CONFIG_SND_SOC_MTK_BTCVSD is not set
# CONFIG_SND_SOC_SOF_TOPLEVEL is not set
#
# STMicroelectronics STM32 SOC audio support
#
# end of STMicroelectronics STM32 SOC audio support
# CONFIG_SND_SOC_XILINX_I2S is not set
# CONFIG_SND_SOC_XILINX_AUDIO_FORMATTER is not set
# CONFIG_SND_SOC_XILINX_SPDIF is not set
# CONFIG_SND_SOC_XTFPGA_I2S is not set
CONFIG_SND_SOC_I2C_AND_SPI=m
#
# CODEC drivers
#
# CONFIG_SND_SOC_AC97_CODEC is not set
# CONFIG_SND_SOC_ADAU1372_I2C is not set
# CONFIG_SND_SOC_ADAU1701 is not set
# CONFIG_SND_SOC_ADAU1761_I2C is not set
CONFIG_SND_SOC_ADAU7002=m
# CONFIG_SND_SOC_ADAU7118_HW is not set
# CONFIG_SND_SOC_ADAU7118_I2C is not set
# CONFIG_SND_SOC_AK4118 is not set
# CONFIG_SND_SOC_AK4375 is not set
# CONFIG_SND_SOC_AK4458 is not set
# CONFIG_SND_SOC_AK4554 is not set
# CONFIG_SND_SOC_AK4613 is not set
# CONFIG_SND_SOC_AK4642 is not set
# CONFIG_SND_SOC_AK5386 is not set
# CONFIG_SND_SOC_AK5558 is not set
# CONFIG_SND_SOC_ALC5623 is not set
# CONFIG_SND_SOC_AW8738 is not set
# CONFIG_SND_SOC_AW88395 is not set
# CONFIG_SND_SOC_AW88261 is not set
# CONFIG_SND_SOC_AW87390 is not set
# CONFIG_SND_SOC_AW88399 is not set
# CONFIG_SND_SOC_BD28623 is not set
# CONFIG_SND_SOC_BT_SCO is not set
# CONFIG_SND_SOC_CHV3_CODEC is not set
# CONFIG_SND_SOC_CS35L32 is not set
# CONFIG_SND_SOC_CS35L33 is not set
# CONFIG_SND_SOC_CS35L34 is not set
# CONFIG_SND_SOC_CS35L35 is not set
# CONFIG_SND_SOC_CS35L36 is not set
# CONFIG_SND_SOC_CS35L41_I2C is not set
# CONFIG_SND_SOC_CS35L45_I2C is not set
# CONFIG_SND_SOC_CS35L56_I2C is not set
# CONFIG_SND_SOC_CS42L42 is not set
# CONFIG_SND_SOC_CS42L51_I2C is not set
# CONFIG_SND_SOC_CS42L52 is not set
# CONFIG_SND_SOC_CS42L56 is not set
# CONFIG_SND_SOC_CS42L73 is not set
# CONFIG_SND_SOC_CS42L83 is not set
# CONFIG_SND_SOC_CS4234 is not set
# CONFIG_SND_SOC_CS4265 is not set
# CONFIG_SND_SOC_CS4270 is not set
# CONFIG_SND_SOC_CS4271_I2C is not set
# CONFIG_SND_SOC_CS42XX8_I2C is not set
# CONFIG_SND_SOC_CS43130 is not set
# CONFIG_SND_SOC_CS4341 is not set
# CONFIG_SND_SOC_CS4349 is not set
# CONFIG_SND_SOC_CS53L30 is not set
# CONFIG_SND_SOC_CX2072X is not set
# CONFIG_SND_SOC_DA7213 is not set
# CONFIG_SND_SOC_DMIC is not set
# CONFIG_SND_SOC_ES7134 is not set
# CONFIG_SND_SOC_ES7241 is not set
# CONFIG_SND_SOC_ES8316 is not set
# CONFIG_SND_SOC_ES8326 is not set
# CONFIG_SND_SOC_ES8328_I2C is not set
# CONFIG_SND_SOC_GTM601 is not set
# CONFIG_SND_SOC_HDA is not set
# CONFIG_SND_SOC_ICS43432 is not set
# CONFIG_SND_SOC_MAX98088 is not set
# CONFIG_SND_SOC_MAX98090 is not set
# CONFIG_SND_SOC_MAX98357A is not set
# CONFIG_SND_SOC_MAX98504 is not set
# CONFIG_SND_SOC_MAX9867 is not set
# CONFIG_SND_SOC_MAX98927 is not set
# CONFIG_SND_SOC_MAX98520 is not set
# CONFIG_SND_SOC_MAX98373_I2C is not set
# CONFIG_SND_SOC_MAX98388 is not set
# CONFIG_SND_SOC_MAX98390 is not set
# CONFIG_SND_SOC_MAX98396 is not set
# CONFIG_SND_SOC_MAX9860 is not set
# CONFIG_SND_SOC_MSM8916_WCD_DIGITAL is not set
# CONFIG_SND_SOC_PCM1681 is not set
# CONFIG_SND_SOC_PCM1789_I2C is not set
# CONFIG_SND_SOC_PCM179X_I2C is not set
# CONFIG_SND_SOC_PCM186X_I2C is not set
# CONFIG_SND_SOC_PCM3060_I2C is not set
# CONFIG_SND_SOC_PCM3168A_I2C is not set
# CONFIG_SND_SOC_PCM5102A is not set
# CONFIG_SND_SOC_PCM512x_I2C is not set
# CONFIG_SND_SOC_PCM6240 is not set
# CONFIG_SND_SOC_RT5616 is not set
# CONFIG_SND_SOC_RT5631 is not set
# CONFIG_SND_SOC_RT5640 is not set
# CONFIG_SND_SOC_RT5659 is not set
# CONFIG_SND_SOC_RT9120 is not set
# CONFIG_SND_SOC_RTQ9128 is not set
# CONFIG_SND_SOC_SGTL5000 is not set
# CONFIG_SND_SOC_SIMPLE_AMPLIFIER is not set
# CONFIG_SND_SOC_SIMPLE_MUX is not set
# CONFIG_SND_SOC_SMA1303 is not set
# CONFIG_SND_SOC_SPDIF is not set
# CONFIG_SND_SOC_SRC4XXX_I2C is not set
# CONFIG_SND_SOC_SSM2305 is not set
# CONFIG_SND_SOC_SSM2518 is not set
# CONFIG_SND_SOC_SSM2602_I2C is not set
CONFIG_SND_SOC_SSM4567=m
# CONFIG_SND_SOC_STA32X is not set
# CONFIG_SND_SOC_STA350 is not set
# CONFIG_SND_SOC_STI_SAS is not set
# CONFIG_SND_SOC_TAS2552 is not set
# CONFIG_SND_SOC_TAS2562 is not set
# CONFIG_SND_SOC_TAS2764 is not set
# CONFIG_SND_SOC_TAS2770 is not set
# CONFIG_SND_SOC_TAS2780 is not set
# CONFIG_SND_SOC_TAS2781_I2C is not set
# CONFIG_SND_SOC_TAS5086 is not set
# CONFIG_SND_SOC_TAS571X is not set
# CONFIG_SND_SOC_TAS5720 is not set
# CONFIG_SND_SOC_TAS5805M is not set
# CONFIG_SND_SOC_TAS6424 is not set
# CONFIG_SND_SOC_TDA7419 is not set
# CONFIG_SND_SOC_TFA9879 is not set
# CONFIG_SND_SOC_TFA989X is not set
# CONFIG_SND_SOC_TLV320ADC3XXX is not set
# CONFIG_SND_SOC_TLV320AIC23_I2C is not set
# CONFIG_SND_SOC_TLV320AIC31XX is not set
# CONFIG_SND_SOC_TLV320AIC32X4_I2C is not set
# CONFIG_SND_SOC_TLV320AIC3X_I2C is not set
# CONFIG_SND_SOC_TLV320ADCX140 is not set
CONFIG_SND_SOC_TS3A227E=m
# CONFIG_SND_SOC_TSCS42XX is not set
# CONFIG_SND_SOC_TSCS454 is not set
# CONFIG_SND_SOC_UDA1334 is not set
# CONFIG_SND_SOC_WM8510 is not set
# CONFIG_SND_SOC_WM8523 is not set
# CONFIG_SND_SOC_WM8524 is not set
# CONFIG_SND_SOC_WM8580 is not set
# CONFIG_SND_SOC_WM8711 is not set
# CONFIG_SND_SOC_WM8728 is not set
# CONFIG_SND_SOC_WM8731_I2C is not set
# CONFIG_SND_SOC_WM8737 is not set
# CONFIG_SND_SOC_WM8741 is not set
# CONFIG_SND_SOC_WM8750 is not set
# CONFIG_SND_SOC_WM8753 is not set
# CONFIG_SND_SOC_WM8776 is not set
# CONFIG_SND_SOC_WM8782 is not set
# CONFIG_SND_SOC_WM8804_I2C is not set
# CONFIG_SND_SOC_WM8903 is not set
# CONFIG_SND_SOC_WM8904 is not set
# CONFIG_SND_SOC_WM8940 is not set
# CONFIG_SND_SOC_WM8960 is not set
# CONFIG_SND_SOC_WM8961 is not set
# CONFIG_SND_SOC_WM8962 is not set
# CONFIG_SND_SOC_WM8974 is not set
# CONFIG_SND_SOC_WM8978 is not set
# CONFIG_SND_SOC_WM8985 is not set
# CONFIG_SND_SOC_MAX9759 is not set
# CONFIG_SND_SOC_MT6351 is not set
# CONFIG_SND_SOC_MT6358 is not set
# CONFIG_SND_SOC_MT6660 is not set
# CONFIG_SND_SOC_NAU8315 is not set
# CONFIG_SND_SOC_NAU8540 is not set
# CONFIG_SND_SOC_NAU8810 is not set
# CONFIG_SND_SOC_NAU8821 is not set
# CONFIG_SND_SOC_NAU8822 is not set
# CONFIG_SND_SOC_NAU8824 is not set
# CONFIG_SND_SOC_TPA6130A2 is not set
# CONFIG_SND_SOC_LPASS_WSA_MACRO is not set
# CONFIG_SND_SOC_LPASS_VA_MACRO is not set
# CONFIG_SND_SOC_LPASS_RX_MACRO is not set
# CONFIG_SND_SOC_LPASS_TX_MACRO is not set
# end of CODEC drivers
# CONFIG_SND_SIMPLE_CARD is not set
# CONFIG_SND_X86 is not set
# CONFIG_SND_VIRTIO is not set
CONFIG_AC97_BUS=m
CONFIG_HID_SUPPORT=y
CONFIG_HID=y
CONFIG_HID_BATTERY_STRENGTH=y
CONFIG_HIDRAW=y
CONFIG_UHID=m
CONFIG_HID_GENERIC=y
#
# Special HID drivers
#
# CONFIG_HID_A4TECH is not set
# CONFIG_HID_ACCUTOUCH is not set
# CONFIG_HID_ACRUX is not set
# CONFIG_HID_APPLE is not set
# CONFIG_HID_APPLEIR is not set
# CONFIG_HID_ASUS is not set
# CONFIG_HID_AUREAL is not set
# CONFIG_HID_BELKIN is not set
# CONFIG_HID_BETOP_FF is not set
# CONFIG_HID_BIGBEN_FF is not set
# CONFIG_HID_CHERRY is not set
# CONFIG_HID_CHICONY is not set
# CONFIG_HID_CORSAIR is not set
# CONFIG_HID_COUGAR is not set
# CONFIG_HID_MACALLY is not set
# CONFIG_HID_PRODIKEYS is not set
# CONFIG_HID_CMEDIA is not set
# CONFIG_HID_CP2112 is not set
# CONFIG_HID_CREATIVE_SB0540 is not set
# CONFIG_HID_CYPRESS is not set
# CONFIG_HID_DRAGONRISE is not set
# CONFIG_HID_EMS_FF is not set
# CONFIG_HID_ELAN is not set
# CONFIG_HID_ELECOM is not set
# CONFIG_HID_ELO is not set
# CONFIG_HID_EVISION is not set
# CONFIG_HID_EZKEY is not set
# CONFIG_HID_FT260 is not set
# CONFIG_HID_GEMBIRD is not set
# CONFIG_HID_GFRM is not set
# CONFIG_HID_GLORIOUS is not set
# CONFIG_HID_HOLTEK is not set
# CONFIG_HID_GOOGLE_STADIA_FF is not set
# CONFIG_HID_VIVALDI is not set
# CONFIG_HID_GT683R is not set
# CONFIG_HID_KEYTOUCH is not set
# CONFIG_HID_KYE is not set
# CONFIG_HID_UCLOGIC is not set
# CONFIG_HID_WALTOP is not set
# CONFIG_HID_VIEWSONIC is not set
# CONFIG_HID_VRC2 is not set
# CONFIG_HID_XIAOMI is not set
# CONFIG_HID_GYRATION is not set
# CONFIG_HID_ICADE is not set
# CONFIG_HID_ITE is not set
# CONFIG_HID_JABRA is not set
# CONFIG_HID_TWINHAN is not set
# CONFIG_HID_KENSINGTON is not set
# CONFIG_HID_LCPOWER is not set
# CONFIG_HID_LED is not set
# CONFIG_HID_LENOVO is not set
# CONFIG_HID_LETSKETCH is not set
# CONFIG_HID_LOGITECH is not set
# CONFIG_HID_MAGICMOUSE is not set
# CONFIG_HID_MALTRON is not set
# CONFIG_HID_MAYFLASH is not set
# CONFIG_HID_MEGAWORLD_FF is not set
# CONFIG_HID_REDRAGON is not set
# CONFIG_HID_MICROSOFT is not set
# CONFIG_HID_MONTEREY is not set
# CONFIG_HID_MULTITOUCH is not set
# CONFIG_HID_NINTENDO is not set
# CONFIG_HID_NTI is not set
# CONFIG_HID_NTRIG is not set
# CONFIG_HID_ORTEK is not set
# CONFIG_HID_PANTHERLORD is not set
# CONFIG_HID_PENMOUNT is not set
# CONFIG_HID_PETALYNX is not set
# CONFIG_HID_PICOLCD is not set
# CONFIG_HID_PLANTRONICS is not set
# CONFIG_HID_PXRC is not set
# CONFIG_HID_RAZER is not set
# CONFIG_HID_PRIMAX is not set
# CONFIG_HID_RETRODE is not set
# CONFIG_HID_ROCCAT is not set
# CONFIG_HID_SAITEK is not set
# CONFIG_HID_SAMSUNG is not set
# CONFIG_HID_SEMITEK is not set
# CONFIG_HID_SIGMAMICRO is not set
# CONFIG_HID_SONY is not set
# CONFIG_HID_SPEEDLINK is not set
# CONFIG_HID_STEAM is not set
# CONFIG_HID_STEELSERIES is not set
# CONFIG_HID_SUNPLUS is not set
# CONFIG_HID_RMI is not set
# CONFIG_HID_GREENASIA is not set
# CONFIG_HID_SMARTJOYPLUS is not set
# CONFIG_HID_TIVO is not set
# CONFIG_HID_TOPSEED is not set
# CONFIG_HID_TOPRE is not set
# CONFIG_HID_THINGM is not set
# CONFIG_HID_THRUSTMASTER is not set
# CONFIG_HID_UDRAW_PS3 is not set
# CONFIG_HID_U2FZERO is not set
# CONFIG_HID_WACOM is not set
# CONFIG_HID_WIIMOTE is not set
# CONFIG_HID_WINWING is not set
# CONFIG_HID_XINMO is not set
# CONFIG_HID_ZEROPLUS is not set
# CONFIG_HID_ZYDACRON is not set
# CONFIG_HID_SENSOR_HUB is not set
# CONFIG_HID_ALPS is not set
# CONFIG_HID_MCP2200 is not set
# CONFIG_HID_MCP2221 is not set
# end of Special HID drivers
#
# HID-BPF support
#
# end of HID-BPF support
#
# USB HID support
#
CONFIG_USB_HID=y
CONFIG_HID_PID=y
CONFIG_USB_HIDDEV=y
# end of USB HID support
CONFIG_I2C_HID=y
# CONFIG_I2C_HID_ACPI is not set
# CONFIG_I2C_HID_OF is not set
#
# Intel ISH HID support
#
# CONFIG_INTEL_ISH_HID is not set
# end of Intel ISH HID support
#
# AMD SFH HID Support
#
CONFIG_AMD_SFH_HID=m
# end of AMD SFH HID Support
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
CONFIG_USB_SUPPORT=y
CONFIG_USB_COMMON=y
# CONFIG_USB_LED_TRIG is not set
# CONFIG_USB_ULPI_BUS is not set
# CONFIG_USB_CONN_GPIO is not set
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB=y
CONFIG_USB_PCI=y
CONFIG_USB_PCI_AMD=y
CONFIG_USB_ANNOUNCE_NEW_DEVICES=y
#
# Miscellaneous USB options
#
CONFIG_USB_DEFAULT_PERSIST=y
# CONFIG_USB_FEW_INIT_RETRIES is not set
CONFIG_USB_DYNAMIC_MINORS=y
# CONFIG_USB_OTG is not set
# CONFIG_USB_OTG_PRODUCTLIST is not set
# CONFIG_USB_LEDS_TRIGGER_USBPORT is not set
CONFIG_USB_AUTOSUSPEND_DELAY=2
CONFIG_USB_DEFAULT_AUTHORIZATION_MODE=1
# CONFIG_USB_MON is not set
#
# USB Host Controller Drivers
#
# CONFIG_USB_C67X00_HCD is not set
CONFIG_USB_XHCI_HCD=y
# CONFIG_USB_XHCI_DBGCAP is not set
CONFIG_USB_XHCI_PCI=y
# CONFIG_USB_XHCI_PCI_RENESAS is not set
# CONFIG_USB_XHCI_PLATFORM is not set
CONFIG_USB_EHCI_HCD=y
CONFIG_USB_EHCI_ROOT_HUB_TT=y
CONFIG_USB_EHCI_TT_NEWSCHED=y
CONFIG_USB_EHCI_PCI=y
# CONFIG_USB_EHCI_FSL is not set
# CONFIG_USB_EHCI_HCD_PLATFORM is not set
# CONFIG_USB_OXU210HP_HCD is not set
# CONFIG_USB_ISP116X_HCD is not set
CONFIG_USB_OHCI_HCD=y
CONFIG_USB_OHCI_HCD_PCI=y
# CONFIG_USB_OHCI_HCD_PLATFORM is not set
CONFIG_USB_UHCI_HCD=y
# CONFIG_USB_SL811_HCD is not set
# CONFIG_USB_R8A66597_HCD is not set
# CONFIG_USB_HCD_TEST_MODE is not set
#
# USB Device Class drivers
#
# CONFIG_USB_ACM is not set
CONFIG_USB_PRINTER=m
# CONFIG_USB_WDM is not set
# CONFIG_USB_TMC is not set
#
# NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may
#
#
# also be needed; see USB_STORAGE Help for more info
#
CONFIG_USB_STORAGE=y
# CONFIG_USB_STORAGE_DEBUG is not set
# CONFIG_USB_STORAGE_REALTEK is not set
# CONFIG_USB_STORAGE_DATAFAB is not set
# CONFIG_USB_STORAGE_FREECOM is not set
# CONFIG_USB_STORAGE_ISD200 is not set
# CONFIG_USB_STORAGE_USBAT is not set
# CONFIG_USB_STORAGE_SDDR09 is not set
# CONFIG_USB_STORAGE_SDDR55 is not set
# CONFIG_USB_STORAGE_JUMPSHOT is not set
# CONFIG_USB_STORAGE_ALAUDA is not set
# CONFIG_USB_STORAGE_ONETOUCH is not set
# CONFIG_USB_STORAGE_KARMA is not set
# CONFIG_USB_STORAGE_CYPRESS_ATACB is not set
# CONFIG_USB_STORAGE_ENE_UB6250 is not set
# CONFIG_USB_UAS is not set
#
# USB Imaging devices
#
# CONFIG_USB_MDC800 is not set
# CONFIG_USB_MICROTEK is not set
# CONFIG_USBIP_CORE is not set
#
# USB dual-mode controller drivers
#
# CONFIG_USB_CDNS_SUPPORT is not set
# CONFIG_USB_MUSB_HDRC is not set
# CONFIG_USB_DWC3 is not set
# CONFIG_USB_DWC2 is not set
# CONFIG_USB_CHIPIDEA is not set
# CONFIG_USB_ISP1760 is not set
#
# USB port drivers
#
CONFIG_USB_SERIAL=m
CONFIG_USB_SERIAL_GENERIC=y
# CONFIG_USB_SERIAL_SIMPLE is not set
# CONFIG_USB_SERIAL_AIRCABLE is not set
# CONFIG_USB_SERIAL_ARK3116 is not set
# CONFIG_USB_SERIAL_BELKIN is not set
# CONFIG_USB_SERIAL_CH341 is not set
# CONFIG_USB_SERIAL_WHITEHEAT is not set
# CONFIG_USB_SERIAL_DIGI_ACCELEPORT is not set
# CONFIG_USB_SERIAL_CP210X is not set
# CONFIG_USB_SERIAL_CYPRESS_M8 is not set
# CONFIG_USB_SERIAL_EMPEG is not set
# CONFIG_USB_SERIAL_FTDI_SIO is not set
# CONFIG_USB_SERIAL_VISOR is not set
# CONFIG_USB_SERIAL_IPAQ is not set
# CONFIG_USB_SERIAL_IR is not set
# CONFIG_USB_SERIAL_EDGEPORT is not set
# CONFIG_USB_SERIAL_EDGEPORT_TI is not set
# CONFIG_USB_SERIAL_F81232 is not set
# CONFIG_USB_SERIAL_F8153X is not set
# CONFIG_USB_SERIAL_GARMIN is not set
# CONFIG_USB_SERIAL_IPW is not set
# CONFIG_USB_SERIAL_IUU is not set
# CONFIG_USB_SERIAL_KEYSPAN_PDA is not set
# CONFIG_USB_SERIAL_KEYSPAN is not set
# CONFIG_USB_SERIAL_KLSI is not set
# CONFIG_USB_SERIAL_KOBIL_SCT is not set
# CONFIG_USB_SERIAL_MCT_U232 is not set
# CONFIG_USB_SERIAL_METRO is not set
# CONFIG_USB_SERIAL_MOS7720 is not set
# CONFIG_USB_SERIAL_MOS7840 is not set
# CONFIG_USB_SERIAL_MXUPORT is not set
# CONFIG_USB_SERIAL_NAVMAN is not set
# CONFIG_USB_SERIAL_PL2303 is not set
# CONFIG_USB_SERIAL_OTI6858 is not set
# CONFIG_USB_SERIAL_QCAUX is not set
# CONFIG_USB_SERIAL_QUALCOMM is not set
# CONFIG_USB_SERIAL_SPCP8X5 is not set
# CONFIG_USB_SERIAL_SAFE is not set
# CONFIG_USB_SERIAL_SIERRAWIRELESS is not set
# CONFIG_USB_SERIAL_SYMBOL is not set
# CONFIG_USB_SERIAL_TI is not set
# CONFIG_USB_SERIAL_CYBERJACK is not set
# CONFIG_USB_SERIAL_OPTION is not set
# CONFIG_USB_SERIAL_OMNINET is not set
# CONFIG_USB_SERIAL_OPTICON is not set
# CONFIG_USB_SERIAL_XSENS_MT is not set
# CONFIG_USB_SERIAL_WISHBONE is not set
# CONFIG_USB_SERIAL_SSU100 is not set
# CONFIG_USB_SERIAL_QT2 is not set
# CONFIG_USB_SERIAL_UPD78F0730 is not set
# CONFIG_USB_SERIAL_XR is not set
# CONFIG_USB_SERIAL_DEBUG is not set
#
# USB Miscellaneous drivers
#
# CONFIG_USB_EMI62 is not set
# CONFIG_USB_EMI26 is not set
# CONFIG_USB_ADUTUX is not set
# CONFIG_USB_SEVSEG is not set
# CONFIG_USB_LEGOTOWER is not set
# CONFIG_USB_LCD is not set
# CONFIG_USB_CYPRESS_CY7C63 is not set
# CONFIG_USB_CYTHERM is not set
# CONFIG_USB_IDMOUSE is not set
# CONFIG_USB_APPLEDISPLAY is not set
# CONFIG_APPLE_MFI_FASTCHARGE is not set
# CONFIG_USB_LJCA is not set
# CONFIG_USB_SISUSBVGA is not set
# CONFIG_USB_LD is not set
# CONFIG_USB_TRANCEVIBRATOR is not set
# CONFIG_USB_IOWARRIOR is not set
# CONFIG_USB_TEST is not set
# CONFIG_USB_EHSET_TEST_FIXTURE is not set
# CONFIG_USB_ISIGHTFW is not set
# CONFIG_USB_YUREX is not set
# CONFIG_USB_EZUSB_FX2 is not set
# CONFIG_USB_HUB_USB251XB is not set
# CONFIG_USB_HSIC_USB3503 is not set
# CONFIG_USB_HSIC_USB4604 is not set
# CONFIG_USB_LINK_LAYER_TEST is not set
# CONFIG_USB_CHAOSKEY is not set
#
# USB Physical Layer drivers
#
# CONFIG_NOP_USB_XCEIV is not set
# CONFIG_USB_GPIO_VBUS is not set
# CONFIG_USB_ISP1301 is not set
# end of USB Physical Layer drivers
# CONFIG_USB_GADGET is not set
CONFIG_TYPEC=m
CONFIG_TYPEC_TCPM=m
# CONFIG_TYPEC_TCPCI is not set
CONFIG_TYPEC_FUSB302=m
CONFIG_TYPEC_UCSI=m
# CONFIG_UCSI_CCG is not set
CONFIG_UCSI_ACPI=m
# CONFIG_UCSI_STM32G0 is not set
# CONFIG_TYPEC_TPS6598X is not set
# CONFIG_TYPEC_ANX7411 is not set
# CONFIG_TYPEC_RT1719 is not set
# CONFIG_TYPEC_HD3SS3220 is not set
# CONFIG_TYPEC_STUSB160X is not set
# CONFIG_TYPEC_WUSB3801 is not set
#
# USB Type-C Multiplexer/DeMultiplexer Switch support
#
# CONFIG_TYPEC_MUX_FSA4480 is not set
# CONFIG_TYPEC_MUX_GPIO_SBU is not set
# CONFIG_TYPEC_MUX_PI3USB30532 is not set
# CONFIG_TYPEC_MUX_IT5205 is not set
# CONFIG_TYPEC_MUX_NB7VPQ904M is not set
# CONFIG_TYPEC_MUX_PTN36502 is not set
# CONFIG_TYPEC_MUX_WCD939X_USBSS is not set
# end of USB Type-C Multiplexer/DeMultiplexer Switch support
#
# USB Type-C Alternate Mode drivers
#
# CONFIG_TYPEC_DP_ALTMODE is not set
# end of USB Type-C Alternate Mode drivers
CONFIG_USB_ROLE_SWITCH=m
# CONFIG_USB_ROLES_INTEL_XHCI is not set
# CONFIG_MMC is not set
# CONFIG_SCSI_UFSHCD is not set
# CONFIG_MEMSTICK is not set
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=m
# CONFIG_LEDS_CLASS_FLASH is not set
# CONFIG_LEDS_CLASS_MULTICOLOR is not set
# CONFIG_LEDS_BRIGHTNESS_HW_CHANGED is not set
#
# LED drivers
#
# CONFIG_LEDS_APU is not set
# CONFIG_LEDS_AW200XX is not set
# CONFIG_LEDS_LM3530 is not set
# CONFIG_LEDS_LM3532 is not set
# CONFIG_LEDS_LM3642 is not set
# CONFIG_LEDS_PCA9532 is not set
# CONFIG_LEDS_GPIO is not set
# CONFIG_LEDS_LP3944 is not set
# CONFIG_LEDS_LP3952 is not set
# CONFIG_LEDS_PCA955X is not set
# CONFIG_LEDS_PCA963X is not set
# CONFIG_LEDS_PCA995X is not set
# CONFIG_LEDS_BD2606MVV is not set
# CONFIG_LEDS_BD2802 is not set
# CONFIG_LEDS_INTEL_SS4200 is not set
# CONFIG_LEDS_LT3593 is not set
# CONFIG_LEDS_TCA6507 is not set
# CONFIG_LEDS_TLC591XX is not set
# CONFIG_LEDS_LM355x is not set
# CONFIG_LEDS_IS31FL319X is not set
#
# LED driver for blink(1) USB RGB LED is under Special HID drivers (HID_THINGM)
#
# CONFIG_LEDS_BLINKM is not set
# CONFIG_LEDS_MLXCPLD is not set
# CONFIG_LEDS_MLXREG is not set
# CONFIG_LEDS_USER is not set
# CONFIG_LEDS_NIC78BX is not set
#
# Flash and Torch LED drivers
#
#
# RGB LED drivers
#
#
# LED Triggers
#
CONFIG_LEDS_TRIGGERS=y
# CONFIG_LEDS_TRIGGER_TIMER is not set
# CONFIG_LEDS_TRIGGER_ONESHOT is not set
# CONFIG_LEDS_TRIGGER_DISK is not set
# CONFIG_LEDS_TRIGGER_HEARTBEAT is not set
# CONFIG_LEDS_TRIGGER_BACKLIGHT is not set
# CONFIG_LEDS_TRIGGER_CPU is not set
# CONFIG_LEDS_TRIGGER_ACTIVITY is not set
# CONFIG_LEDS_TRIGGER_GPIO is not set
# CONFIG_LEDS_TRIGGER_DEFAULT_ON is not set
#
# iptables trigger is under Netfilter config (LED target)
#
# CONFIG_LEDS_TRIGGER_TRANSIENT is not set
# CONFIG_LEDS_TRIGGER_CAMERA is not set
# CONFIG_LEDS_TRIGGER_PANIC is not set
# CONFIG_LEDS_TRIGGER_NETDEV is not set
# CONFIG_LEDS_TRIGGER_PATTERN is not set
# CONFIG_LEDS_TRIGGER_TTY is not set
#
# Simple LED drivers
#
# CONFIG_ACCESSIBILITY is not set
# CONFIG_INFINIBAND is not set
CONFIG_EDAC_ATOMIC_SCRUB=y
CONFIG_EDAC_SUPPORT=y
CONFIG_EDAC=y
CONFIG_EDAC_LEGACY_SYSFS=y
CONFIG_EDAC_DEBUG=y
CONFIG_EDAC_DECODE_MCE=m
# CONFIG_EDAC_GHES is not set
CONFIG_EDAC_AMD64=m
CONFIG_EDAC_E752X=m
CONFIG_EDAC_I82975X=m
CONFIG_EDAC_I3000=m
CONFIG_EDAC_I3200=m
CONFIG_EDAC_IE31200=m
CONFIG_EDAC_X38=m
CONFIG_EDAC_I5400=m
CONFIG_EDAC_I5100=m
CONFIG_EDAC_I7300=m
CONFIG_RTC_LIB=y
CONFIG_RTC_MC146818_LIB=y
CONFIG_RTC_CLASS=y
CONFIG_RTC_HCTOSYS=y
CONFIG_RTC_HCTOSYS_DEVICE="rtc0"
CONFIG_RTC_SYSTOHC=y
CONFIG_RTC_SYSTOHC_DEVICE="rtc0"
# CONFIG_RTC_DEBUG is not set
CONFIG_RTC_NVMEM=y
#
# RTC interfaces
#
CONFIG_RTC_INTF_SYSFS=y
CONFIG_RTC_INTF_PROC=y
CONFIG_RTC_INTF_DEV=y
# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set
# CONFIG_RTC_DRV_TEST is not set
#
# I2C RTC drivers
#
# CONFIG_RTC_DRV_ABB5ZES3 is not set
# CONFIG_RTC_DRV_ABEOZ9 is not set
# CONFIG_RTC_DRV_ABX80X is not set
# CONFIG_RTC_DRV_DS1307 is not set
# CONFIG_RTC_DRV_DS1374 is not set
# CONFIG_RTC_DRV_DS1672 is not set
# CONFIG_RTC_DRV_MAX6900 is not set
# CONFIG_RTC_DRV_MAX31335 is not set
# CONFIG_RTC_DRV_RS5C372 is not set
# CONFIG_RTC_DRV_ISL1208 is not set
# CONFIG_RTC_DRV_ISL12022 is not set
# CONFIG_RTC_DRV_X1205 is not set
# CONFIG_RTC_DRV_PCF8523 is not set
# CONFIG_RTC_DRV_PCF85063 is not set
# CONFIG_RTC_DRV_PCF85363 is not set
# CONFIG_RTC_DRV_PCF8563 is not set
# CONFIG_RTC_DRV_PCF8583 is not set
# CONFIG_RTC_DRV_M41T80 is not set
# CONFIG_RTC_DRV_BQ32K is not set
# CONFIG_RTC_DRV_S35390A is not set
# CONFIG_RTC_DRV_FM3130 is not set
# CONFIG_RTC_DRV_RX8010 is not set
# CONFIG_RTC_DRV_RX8111 is not set
# CONFIG_RTC_DRV_RX8581 is not set
# CONFIG_RTC_DRV_RX8025 is not set
# CONFIG_RTC_DRV_EM3027 is not set
# CONFIG_RTC_DRV_RV3028 is not set
# CONFIG_RTC_DRV_RV3032 is not set
# CONFIG_RTC_DRV_RV8803 is not set
# CONFIG_RTC_DRV_SD3078 is not set
#
# SPI RTC drivers
#
CONFIG_RTC_I2C_AND_SPI=y
#
# SPI and I2C RTC drivers
#
# CONFIG_RTC_DRV_DS3232 is not set
# CONFIG_RTC_DRV_PCF2127 is not set
# CONFIG_RTC_DRV_RV3029C2 is not set
# CONFIG_RTC_DRV_RX6110 is not set
#
# Platform RTC drivers
#
CONFIG_RTC_DRV_CMOS=y
# CONFIG_RTC_DRV_DS1286 is not set
# CONFIG_RTC_DRV_DS1511 is not set
# CONFIG_RTC_DRV_DS1553 is not set
# CONFIG_RTC_DRV_DS1685_FAMILY is not set
# CONFIG_RTC_DRV_DS1742 is not set
# CONFIG_RTC_DRV_DS2404 is not set
# CONFIG_RTC_DRV_STK17TA8 is not set
# CONFIG_RTC_DRV_M48T86 is not set
# CONFIG_RTC_DRV_M48T35 is not set
# CONFIG_RTC_DRV_M48T59 is not set
# CONFIG_RTC_DRV_MSM6242 is not set
# CONFIG_RTC_DRV_RP5C01 is not set
#
# on-CPU RTC drivers
#
# CONFIG_RTC_DRV_FTRTC010 is not set
#
# HID Sensor RTC drivers
#
# CONFIG_RTC_DRV_GOLDFISH is not set
# CONFIG_DMADEVICES is not set
#
# DMABUF options
#
CONFIG_SYNC_FILE=y
# CONFIG_SW_SYNC is not set
# CONFIG_UDMABUF is not set
# CONFIG_DMABUF_MOVE_NOTIFY is not set
# CONFIG_DMABUF_DEBUG is not set
# CONFIG_DMABUF_SELFTESTS is not set
# CONFIG_DMABUF_HEAPS is not set
# CONFIG_DMABUF_SYSFS_STATS is not set
# end of DMABUF options
# CONFIG_UIO is not set
# CONFIG_VFIO is not set
CONFIG_IRQ_BYPASS_MANAGER=y
CONFIG_VIRT_DRIVERS=y
CONFIG_VMGENID=y
# CONFIG_VBOXGUEST is not set
# CONFIG_NITRO_ENCLAVES is not set
CONFIG_TSM_REPORTS=m
# CONFIG_EFI_SECRET is not set
CONFIG_SEV_GUEST=m
CONFIG_VIRTIO_ANCHOR=y
CONFIG_VIRTIO=y
CONFIG_VIRTIO_PCI_LIB=y
CONFIG_VIRTIO_PCI_LIB_LEGACY=y
CONFIG_VIRTIO_MENU=y
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_PCI_ADMIN_LEGACY=y
CONFIG_VIRTIO_PCI_LEGACY=y
# CONFIG_VIRTIO_PMEM is not set
# CONFIG_VIRTIO_BALLOON is not set
CONFIG_VIRTIO_INPUT=y
CONFIG_VIRTIO_MMIO=y
# CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES is not set
CONFIG_VIRTIO_DMA_SHARED_BUFFER=m
# CONFIG_VIRTIO_DEBUG is not set
# CONFIG_VDPA is not set
# CONFIG_VHOST_MENU is not set
#
# Microsoft Hyper-V guest support
#
# end of Microsoft Hyper-V guest support
# CONFIG_GREYBUS is not set
# CONFIG_COMEDI is not set
# CONFIG_STAGING is not set
# CONFIG_GOLDFISH is not set
# CONFIG_CHROME_PLATFORMS is not set
# CONFIG_MELLANOX_PLATFORM is not set
# CONFIG_SURFACE_PLATFORMS is not set
CONFIG_X86_PLATFORM_DEVICES=y
CONFIG_ACPI_WMI=m
CONFIG_WMI_BMOF=m
# CONFIG_MXM_WMI is not set
# CONFIG_NVIDIA_WMI_EC_BACKLIGHT is not set
# CONFIG_XIAOMI_WMI is not set
# CONFIG_GIGABYTE_WMI is not set
# CONFIG_YOGABOOK is not set
# CONFIG_ACERHDF is not set
# CONFIG_ACER_WIRELESS is not set
# CONFIG_ACER_WMI is not set
# CONFIG_AMD_PMC is not set
# CONFIG_AMD_HSMP is not set
# CONFIG_AMD_WBRF is not set
# CONFIG_ADV_SWBUTTON is not set
# CONFIG_APPLE_GMUX is not set
# CONFIG_ASUS_LAPTOP is not set
# CONFIG_ASUS_WIRELESS is not set
# CONFIG_ASUS_TF103C_DOCK is not set
# CONFIG_EEEPC_LAPTOP is not set
# CONFIG_X86_PLATFORM_DRIVERS_DELL is not set
# CONFIG_FUJITSU_TABLET is not set
# CONFIG_GPD_POCKET_FAN is not set
# CONFIG_X86_PLATFORM_DRIVERS_HP is not set
# CONFIG_WIRELESS_HOTKEY is not set
# CONFIG_IBM_RTL is not set
# CONFIG_LENOVO_YMC is not set
# CONFIG_SENSORS_HDAPS is not set
# CONFIG_THINKPAD_LMI is not set
# CONFIG_INTEL_IFS is not set
# CONFIG_INTEL_SAR_INT1092 is not set
#
# Intel Speed Select Technology interface support
#
# CONFIG_INTEL_SPEED_SELECT_INTERFACE is not set
# end of Intel Speed Select Technology interface support
# CONFIG_INTEL_WMI_SBL_FW_UPDATE is not set
# CONFIG_INTEL_WMI_THUNDERBOLT is not set
#
# Intel Uncore Frequency Control
#
# CONFIG_INTEL_UNCORE_FREQ_CONTROL is not set
# end of Intel Uncore Frequency Control
# CONFIG_INTEL_HID_EVENT is not set
# CONFIG_INTEL_VBTN is not set
# CONFIG_INTEL_INT0002_VGPIO is not set
# CONFIG_INTEL_PUNIT_IPC is not set
# CONFIG_INTEL_RST is not set
# CONFIG_INTEL_SMARTCONNECT is not set
# CONFIG_INTEL_TURBO_MAX_3 is not set
# CONFIG_INTEL_VSEC is not set
# CONFIG_ACPI_QUICKSTART is not set
# CONFIG_MEEGOPAD_ANX7428 is not set
# CONFIG_MSI_WMI is not set
# CONFIG_MSI_WMI_PLATFORM is not set
# CONFIG_PCENGINES_APU2 is not set
# CONFIG_BARCO_P50_GPIO is not set
# CONFIG_SAMSUNG_LAPTOP is not set
# CONFIG_SAMSUNG_Q10 is not set
# CONFIG_TOSHIBA_BT_RFKILL is not set
# CONFIG_TOSHIBA_HAPS is not set
# CONFIG_TOSHIBA_WMI is not set
# CONFIG_ACPI_CMPC is not set
# CONFIG_PANASONIC_LAPTOP is not set
# CONFIG_TOPSTAR_LAPTOP is not set
# CONFIG_MLX_PLATFORM is not set
# CONFIG_INSPUR_PLATFORM_PROFILE is not set
# CONFIG_LENOVO_WMI_CAMERA is not set
# CONFIG_INTEL_IPS is not set
# CONFIG_INTEL_SCU_PCI is not set
# CONFIG_INTEL_SCU_PLATFORM is not set
# CONFIG_SIEMENS_SIMATIC_IPC is not set
# CONFIG_WINMATE_FM07_KEYS is not set
CONFIG_HAVE_CLK=y
CONFIG_HAVE_CLK_PREPARE=y
CONFIG_COMMON_CLK=y
# CONFIG_COMMON_CLK_MAX9485 is not set
# CONFIG_COMMON_CLK_SI5341 is not set
# CONFIG_COMMON_CLK_SI5351 is not set
# CONFIG_COMMON_CLK_SI544 is not set
# CONFIG_COMMON_CLK_CDCE706 is not set
# CONFIG_COMMON_CLK_CS2000_CP is not set
# CONFIG_XILINX_VCU is not set
# CONFIG_HWSPINLOCK is not set
#
# Clock Source drivers
#
CONFIG_CLKEVT_I8253=y
CONFIG_I8253_LOCK=y
CONFIG_CLKBLD_I8253=y
# end of Clock Source drivers
CONFIG_MAILBOX=y
CONFIG_PCC=y
# CONFIG_ALTERA_MBOX is not set
CONFIG_IOMMU_IOVA=y
CONFIG_IOMMU_API=y
CONFIG_IOMMU_SUPPORT=y
#
# Generic IOMMU Pagetable Support
#
CONFIG_IOMMU_IO_PGTABLE=y
# end of Generic IOMMU Pagetable Support
# CONFIG_IOMMU_DEBUGFS is not set
# CONFIG_IOMMU_DEFAULT_DMA_STRICT is not set
CONFIG_IOMMU_DEFAULT_DMA_LAZY=y
# CONFIG_IOMMU_DEFAULT_PASSTHROUGH is not set
CONFIG_IOMMU_DMA=y
CONFIG_IOMMU_SVA=y
CONFIG_IOMMU_IOPF=y
CONFIG_AMD_IOMMU=y
# CONFIG_INTEL_IOMMU is not set
# CONFIG_IOMMUFD is not set
CONFIG_IRQ_REMAP=y
CONFIG_VIRTIO_IOMMU=y
#
# Remoteproc drivers
#
# CONFIG_REMOTEPROC is not set
# end of Remoteproc drivers
#
# Rpmsg drivers
#
# CONFIG_RPMSG_QCOM_GLINK_RPM is not set
# CONFIG_RPMSG_VIRTIO is not set
# end of Rpmsg drivers
# CONFIG_SOUNDWIRE is not set
#
# SOC (System On Chip) specific Drivers
#
#
# Amlogic SoC drivers
#
# end of Amlogic SoC drivers
#
# Broadcom SoC drivers
#
# end of Broadcom SoC drivers
#
# NXP/Freescale QorIQ SoC drivers
#
# end of NXP/Freescale QorIQ SoC drivers
#
# fujitsu SoC drivers
#
# end of fujitsu SoC drivers
#
# i.MX SoC drivers
#
# end of i.MX SoC drivers
#
# Enable LiteX SoC Builder specific drivers
#
# end of Enable LiteX SoC Builder specific drivers
# CONFIG_WPCM450_SOC is not set
#
# Qualcomm SoC drivers
#
# end of Qualcomm SoC drivers
# CONFIG_SOC_TI is not set
#
# Xilinx SoC drivers
#
# end of Xilinx SoC drivers
# end of SOC (System On Chip) specific Drivers
#
# PM Domains
#
#
# Amlogic PM Domains
#
# end of Amlogic PM Domains
#
# Broadcom PM Domains
#
# end of Broadcom PM Domains
#
# i.MX PM Domains
#
# end of i.MX PM Domains
#
# Qualcomm PM Domains
#
# end of Qualcomm PM Domains
# end of PM Domains
# CONFIG_PM_DEVFREQ is not set
# CONFIG_EXTCON is not set
# CONFIG_MEMORY is not set
# CONFIG_IIO is not set
# CONFIG_NTB is not set
# CONFIG_PWM is not set
#
# IRQ chip support
#
# CONFIG_LAN966X_OIC is not set
# end of IRQ chip support
# CONFIG_IPACK_BUS is not set
# CONFIG_RESET_CONTROLLER is not set
#
# PHY Subsystem
#
CONFIG_GENERIC_PHY=y
# CONFIG_USB_LGM_PHY is not set
# CONFIG_PHY_CAN_TRANSCEIVER is not set
#
# PHY drivers for Broadcom platforms
#
# CONFIG_BCM_KONA_USB2_PHY is not set
# end of PHY drivers for Broadcom platforms
# CONFIG_PHY_PXA_28NM_HSIC is not set
# CONFIG_PHY_PXA_28NM_USB2 is not set
# CONFIG_PHY_INTEL_LGM_EMMC is not set
# end of PHY Subsystem
# CONFIG_POWERCAP is not set
# CONFIG_MCB is not set
#
# Performance monitor support
#
# CONFIG_DWC_PCIE_PMU is not set
# end of Performance monitor support
CONFIG_RAS=y
CONFIG_RAS_CEC=y
CONFIG_RAS_CEC_DEBUG=y
CONFIG_AMD_ATL=m
CONFIG_RAS_FMPM=m
# CONFIG_USB4 is not set
#
# Android
#
# CONFIG_ANDROID_BINDER_IPC is not set
# end of Android
CONFIG_LIBNVDIMM=y
CONFIG_BLK_DEV_PMEM=m
CONFIG_ND_CLAIM=y
CONFIG_ND_BTT=m
CONFIG_BTT=y
CONFIG_DAX=y
CONFIG_DEV_DAX=m
CONFIG_NVMEM=y
CONFIG_NVMEM_SYSFS=y
# CONFIG_NVMEM_LAYOUTS is not set
# CONFIG_NVMEM_RMEM is not set
#
# HW tracing support
#
# CONFIG_STM is not set
# CONFIG_INTEL_TH is not set
# end of HW tracing support
# CONFIG_FPGA is not set
# CONFIG_TEE is not set
# CONFIG_SIOX is not set
# CONFIG_SLIMBUS is not set
# CONFIG_INTERCONNECT is not set
# CONFIG_COUNTER is not set
# CONFIG_MOST is not set
# CONFIG_PECI is not set
# CONFIG_HTE is not set
# end of Device Drivers
#
# File systems
#
CONFIG_DCACHE_WORD_ACCESS=y
CONFIG_VALIDATE_FS_PARSER=y
CONFIG_FS_IOMAP=y
CONFIG_FS_STACK=y
CONFIG_BUFFER_HEAD=y
CONFIG_LEGACY_DIRECT_IO=y
# CONFIG_EXT2_FS is not set
# CONFIG_EXT3_FS is not set
CONFIG_EXT4_FS=y
CONFIG_EXT4_USE_FOR_EXT2=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_EXT4_FS_SECURITY=y
# CONFIG_EXT4_DEBUG is not set
CONFIG_JBD2=y
# CONFIG_JBD2_DEBUG is not set
CONFIG_FS_MBCACHE=y
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
# CONFIG_XFS_FS is not set
# CONFIG_GFS2_FS is not set
# CONFIG_OCFS2_FS is not set
CONFIG_BTRFS_FS=m
CONFIG_BTRFS_FS_POSIX_ACL=y
# CONFIG_BTRFS_FS_RUN_SANITY_TESTS is not set
# CONFIG_BTRFS_DEBUG is not set
# CONFIG_BTRFS_ASSERT is not set
# CONFIG_BTRFS_FS_REF_VERIFY is not set
# CONFIG_NILFS2_FS is not set
# CONFIG_F2FS_FS is not set
# CONFIG_BCACHEFS_FS is not set
CONFIG_FS_POSIX_ACL=y
CONFIG_EXPORTFS=y
# CONFIG_EXPORTFS_BLOCK_OPS is not set
CONFIG_FILE_LOCKING=y
# CONFIG_FS_ENCRYPTION is not set
# CONFIG_FS_VERITY is not set
CONFIG_FSNOTIFY=y
CONFIG_DNOTIFY=y
CONFIG_INOTIFY_USER=y
CONFIG_FANOTIFY=y
# CONFIG_QUOTA is not set
# CONFIG_AUTOFS_FS is not set
CONFIG_FUSE_FS=m
# CONFIG_CUSE is not set
# CONFIG_VIRTIO_FS is not set
CONFIG_FUSE_PASSTHROUGH=y
# CONFIG_OVERLAY_FS is not set
#
# Caches
#
CONFIG_NETFS_SUPPORT=y
# CONFIG_NETFS_STATS is not set
CONFIG_FSCACHE=y
# CONFIG_FSCACHE_STATS is not set
# CONFIG_FSCACHE_DEBUG is not set
# CONFIG_CACHEFILES is not set
# end of Caches
#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=m
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_UDF_FS=m
# end of CD-ROM/DVD Filesystems
#
# DOS/FAT/EXFAT/NT Filesystems
#
CONFIG_FAT_FS=m
CONFIG_MSDOS_FS=m
CONFIG_VFAT_FS=m
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="ascii"
CONFIG_FAT_DEFAULT_UTF8=y
# CONFIG_EXFAT_FS is not set
# CONFIG_NTFS3_FS is not set
# CONFIG_NTFS_FS is not set
# end of DOS/FAT/EXFAT/NT Filesystems
#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_PROC_VMCORE=y
# CONFIG_PROC_VMCORE_DEVICE_DUMP is not set
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
CONFIG_PROC_CHILDREN=y
CONFIG_PROC_PID_ARCH_STATUS=y
CONFIG_PROC_CPU_RESCTRL=y
CONFIG_KERNFS=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_TMPFS_XATTR=y
CONFIG_TMPFS_INODE64=y
# CONFIG_TMPFS_QUOTA is not set
CONFIG_HUGETLBFS=y
# CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON is not set
CONFIG_HUGETLB_PAGE=y
CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP=y
CONFIG_ARCH_HAS_GIGANTIC_PAGE=y
CONFIG_CONFIGFS_FS=m
CONFIG_EFIVAR_FS=m
# end of Pseudo filesystems
CONFIG_MISC_FILESYSTEMS=y
# CONFIG_ORANGEFS_FS is not set
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_ECRYPT_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_CRAMFS is not set
# CONFIG_SQUASHFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_OMFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_QNX6FS_FS is not set
# CONFIG_ROMFS_FS is not set
CONFIG_PSTORE=y
CONFIG_PSTORE_DEFAULT_KMSG_BYTES=10240
CONFIG_PSTORE_COMPRESS=y
# CONFIG_PSTORE_CONSOLE is not set
# CONFIG_PSTORE_PMSG is not set
# CONFIG_PSTORE_FTRACE is not set
CONFIG_PSTORE_RAM=m
# CONFIG_PSTORE_BLK is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
# CONFIG_EROFS_FS is not set
CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_NFS_FS=m
CONFIG_NFS_V2=m
CONFIG_NFS_V3=m
# CONFIG_NFS_V3_ACL is not set
CONFIG_NFS_V4=m
# CONFIG_NFS_SWAP is not set
CONFIG_NFS_V4_1=y
CONFIG_NFS_V4_2=y
CONFIG_PNFS_FILE_LAYOUT=m
CONFIG_PNFS_BLOCK=m
CONFIG_PNFS_FLEXFILE_LAYOUT=m
CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org"
# CONFIG_NFS_V4_1_MIGRATION is not set
CONFIG_NFS_FSCACHE=y
# CONFIG_NFS_USE_LEGACY_DNS is not set
CONFIG_NFS_USE_KERNEL_DNS=y
# CONFIG_NFS_DISABLE_UDP_SUPPORT is not set
CONFIG_NFS_V4_2_READ_PLUS=y
# CONFIG_NFSD is not set
CONFIG_GRACE_PERIOD=m
CONFIG_LOCKD=m
CONFIG_LOCKD_V4=y
CONFIG_NFS_COMMON=y
CONFIG_NFS_V4_2_SSC_HELPER=y
CONFIG_SUNRPC=m
CONFIG_SUNRPC_GSS=m
CONFIG_SUNRPC_BACKCHANNEL=y
CONFIG_RPCSEC_GSS_KRB5=m
CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_AES_SHA1=y
# CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_CAMELLIA is not set
# CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_AES_SHA2 is not set
# CONFIG_SUNRPC_DEBUG is not set
# CONFIG_CEPH_FS is not set
# CONFIG_CIFS is not set
# CONFIG_SMB_SERVER is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set
CONFIG_9P_FS=y
# CONFIG_9P_FSCACHE is not set
CONFIG_9P_FS_POSIX_ACL=y
CONFIG_9P_FS_SECURITY=y
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="utf8"
CONFIG_NLS_CODEPAGE_437=m
CONFIG_NLS_CODEPAGE_737=m
CONFIG_NLS_CODEPAGE_775=m
CONFIG_NLS_CODEPAGE_850=m
CONFIG_NLS_CODEPAGE_852=m
CONFIG_NLS_CODEPAGE_855=m
CONFIG_NLS_CODEPAGE_857=m
CONFIG_NLS_CODEPAGE_860=m
CONFIG_NLS_CODEPAGE_861=m
CONFIG_NLS_CODEPAGE_862=m
CONFIG_NLS_CODEPAGE_863=m
CONFIG_NLS_CODEPAGE_864=m
CONFIG_NLS_CODEPAGE_865=m
CONFIG_NLS_CODEPAGE_866=m
CONFIG_NLS_CODEPAGE_869=m
CONFIG_NLS_CODEPAGE_936=m
CONFIG_NLS_CODEPAGE_950=m
CONFIG_NLS_CODEPAGE_932=m
CONFIG_NLS_CODEPAGE_949=m
CONFIG_NLS_CODEPAGE_874=m
CONFIG_NLS_ISO8859_8=m
CONFIG_NLS_CODEPAGE_1250=m
CONFIG_NLS_CODEPAGE_1251=m
CONFIG_NLS_ASCII=m
CONFIG_NLS_ISO8859_1=m
CONFIG_NLS_ISO8859_2=m
CONFIG_NLS_ISO8859_3=m
CONFIG_NLS_ISO8859_4=m
CONFIG_NLS_ISO8859_5=m
CONFIG_NLS_ISO8859_6=m
CONFIG_NLS_ISO8859_7=m
CONFIG_NLS_ISO8859_9=m
CONFIG_NLS_ISO8859_13=m
CONFIG_NLS_ISO8859_14=m
CONFIG_NLS_ISO8859_15=m
CONFIG_NLS_KOI8_R=m
CONFIG_NLS_KOI8_U=m
CONFIG_NLS_MAC_ROMAN=m
CONFIG_NLS_MAC_CELTIC=m
CONFIG_NLS_MAC_CENTEURO=m
CONFIG_NLS_MAC_CROATIAN=m
CONFIG_NLS_MAC_CYRILLIC=m
CONFIG_NLS_MAC_GAELIC=m
CONFIG_NLS_MAC_GREEK=m
CONFIG_NLS_MAC_ICELAND=m
CONFIG_NLS_MAC_INUIT=m
CONFIG_NLS_MAC_ROMANIAN=m
CONFIG_NLS_MAC_TURKISH=m
CONFIG_NLS_UTF8=m
# CONFIG_DLM is not set
# CONFIG_UNICODE is not set
CONFIG_IO_WQ=y
# end of File systems
#
# Security options
#
CONFIG_KEYS=y
# CONFIG_KEYS_REQUEST_CACHE is not set
# CONFIG_PERSISTENT_KEYRINGS is not set
# CONFIG_TRUSTED_KEYS is not set
# CONFIG_ENCRYPTED_KEYS is not set
# CONFIG_KEY_DH_OPERATIONS is not set
# CONFIG_SECURITY_DMESG_RESTRICT is not set
# CONFIG_SECURITY is not set
# CONFIG_SECURITYFS is not set
CONFIG_HARDENED_USERCOPY=y
# CONFIG_FORTIFY_SOURCE is not set
# CONFIG_STATIC_USERMODEHELPER is not set
# CONFIG_IMA_SECURE_AND_OR_TRUSTED_BOOT is not set
CONFIG_DEFAULT_SECURITY_DAC=y
CONFIG_LSM="yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor"
#
# Kernel hardening options
#
#
# Memory initialization
#
CONFIG_CC_HAS_AUTO_VAR_INIT_PATTERN=y
CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO_BARE=y
CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO=y
CONFIG_INIT_STACK_NONE=y
# CONFIG_INIT_STACK_ALL_PATTERN is not set
# CONFIG_INIT_STACK_ALL_ZERO is not set
# CONFIG_INIT_ON_ALLOC_DEFAULT_ON is not set
# CONFIG_INIT_ON_FREE_DEFAULT_ON is not set
CONFIG_CC_HAS_ZERO_CALL_USED_REGS=y
# CONFIG_ZERO_CALL_USED_REGS is not set
# end of Memory initialization
#
# Hardening of kernel data structures
#
CONFIG_LIST_HARDENED=y
# CONFIG_BUG_ON_DATA_CORRUPTION is not set
# end of Hardening of kernel data structures
CONFIG_RANDSTRUCT_NONE=y
# end of Kernel hardening options
# end of Security options
CONFIG_XOR_BLOCKS=m
CONFIG_CRYPTO=y
#
# Crypto core or helper
#
# CONFIG_CRYPTO_FIPS is not set
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_AEAD=m
CONFIG_CRYPTO_AEAD2=y
CONFIG_CRYPTO_SIG=y
CONFIG_CRYPTO_SIG2=y
CONFIG_CRYPTO_SKCIPHER=m
CONFIG_CRYPTO_SKCIPHER2=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_RNG=m
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_RNG_DEFAULT=m
CONFIG_CRYPTO_AKCIPHER2=y
CONFIG_CRYPTO_AKCIPHER=y
CONFIG_CRYPTO_KPP2=y
CONFIG_CRYPTO_KPP=m
CONFIG_CRYPTO_ACOMP2=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_MANAGER2=y
CONFIG_CRYPTO_USER=m
# CONFIG_CRYPTO_MANAGER_DISABLE_TESTS is not set
# CONFIG_CRYPTO_MANAGER_EXTRA_TESTS is not set
CONFIG_CRYPTO_NULL=m
CONFIG_CRYPTO_NULL2=m
CONFIG_CRYPTO_PCRYPT=m
CONFIG_CRYPTO_CRYPTD=m
CONFIG_CRYPTO_AUTHENC=m
CONFIG_CRYPTO_TEST=m
CONFIG_CRYPTO_SIMD=m
# end of Crypto core or helper
#
# Public-key cryptography
#
CONFIG_CRYPTO_RSA=y
CONFIG_CRYPTO_DH=m
# CONFIG_CRYPTO_DH_RFC7919_GROUPS is not set
CONFIG_CRYPTO_ECC=m
CONFIG_CRYPTO_ECDH=m
# CONFIG_CRYPTO_ECDSA is not set
# CONFIG_CRYPTO_ECRDSA is not set
# CONFIG_CRYPTO_SM2 is not set
CONFIG_CRYPTO_CURVE25519=m
# end of Public-key cryptography
#
# Block ciphers
#
CONFIG_CRYPTO_AES=y
# CONFIG_CRYPTO_AES_TI is not set
CONFIG_CRYPTO_ANUBIS=m
# CONFIG_CRYPTO_ARIA is not set
CONFIG_CRYPTO_BLOWFISH=m
CONFIG_CRYPTO_BLOWFISH_COMMON=m
CONFIG_CRYPTO_CAMELLIA=m
CONFIG_CRYPTO_CAST_COMMON=m
CONFIG_CRYPTO_CAST5=m
CONFIG_CRYPTO_CAST6=m
CONFIG_CRYPTO_DES=m
CONFIG_CRYPTO_FCRYPT=m
CONFIG_CRYPTO_KHAZAD=m
CONFIG_CRYPTO_SEED=m
CONFIG_CRYPTO_SERPENT=m
# CONFIG_CRYPTO_SM4_GENERIC is not set
CONFIG_CRYPTO_TEA=m
CONFIG_CRYPTO_TWOFISH=m
CONFIG_CRYPTO_TWOFISH_COMMON=m
# end of Block ciphers
#
# Length-preserving ciphers and modes
#
# CONFIG_CRYPTO_ADIANTUM is not set
CONFIG_CRYPTO_ARC4=m
CONFIG_CRYPTO_CHACHA20=m
CONFIG_CRYPTO_CBC=m
CONFIG_CRYPTO_CTR=m
CONFIG_CRYPTO_CTS=m
CONFIG_CRYPTO_ECB=m
# CONFIG_CRYPTO_HCTR2 is not set
# CONFIG_CRYPTO_KEYWRAP is not set
CONFIG_CRYPTO_LRW=m
CONFIG_CRYPTO_PCBC=m
CONFIG_CRYPTO_XTS=m
# end of Length-preserving ciphers and modes
#
# AEAD (authenticated encryption with associated data) ciphers
#
CONFIG_CRYPTO_AEGIS128=m
CONFIG_CRYPTO_CHACHA20POLY1305=m
CONFIG_CRYPTO_CCM=m
CONFIG_CRYPTO_GCM=m
CONFIG_CRYPTO_GENIV=m
CONFIG_CRYPTO_SEQIV=m
CONFIG_CRYPTO_ECHAINIV=m
CONFIG_CRYPTO_ESSIV=m
# end of AEAD (authenticated encryption with associated data) ciphers
#
# Hashes, digests, and MACs
#
CONFIG_CRYPTO_BLAKE2B=m
CONFIG_CRYPTO_CMAC=m
CONFIG_CRYPTO_GHASH=m
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_MD4=m
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_MICHAEL_MIC=m
CONFIG_CRYPTO_POLY1305=m
CONFIG_CRYPTO_RMD160=m
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_SHA256=y
CONFIG_CRYPTO_SHA512=m
CONFIG_CRYPTO_SHA3=m
# CONFIG_CRYPTO_SM3_GENERIC is not set
# CONFIG_CRYPTO_STREEBOG is not set
CONFIG_CRYPTO_VMAC=m
CONFIG_CRYPTO_WP512=m
CONFIG_CRYPTO_XCBC=m
CONFIG_CRYPTO_XXHASH=m
# end of Hashes, digests, and MACs
#
# CRCs (cyclic redundancy checks)
#
CONFIG_CRYPTO_CRC32C=y
CONFIG_CRYPTO_CRC32=m
CONFIG_CRYPTO_CRCT10DIF=y
CONFIG_CRYPTO_CRC64_ROCKSOFT=y
# end of CRCs (cyclic redundancy checks)
#
# Compression
#
CONFIG_CRYPTO_DEFLATE=y
CONFIG_CRYPTO_LZO=y
# CONFIG_CRYPTO_842 is not set
CONFIG_CRYPTO_LZ4=m
CONFIG_CRYPTO_LZ4HC=m
# CONFIG_CRYPTO_ZSTD is not set
# end of Compression
#
# Random number generation
#
CONFIG_CRYPTO_ANSI_CPRNG=m
CONFIG_CRYPTO_DRBG_MENU=m
CONFIG_CRYPTO_DRBG_HMAC=y
# CONFIG_CRYPTO_DRBG_HASH is not set
# CONFIG_CRYPTO_DRBG_CTR is not set
CONFIG_CRYPTO_DRBG=m
CONFIG_CRYPTO_JITTERENTROPY=m
CONFIG_CRYPTO_JITTERENTROPY_MEMORY_BLOCKS=64
CONFIG_CRYPTO_JITTERENTROPY_MEMORY_BLOCKSIZE=32
CONFIG_CRYPTO_JITTERENTROPY_OSR=1
# end of Random number generation
#
# Userspace interface
#
CONFIG_CRYPTO_USER_API=m
CONFIG_CRYPTO_USER_API_HASH=m
CONFIG_CRYPTO_USER_API_SKCIPHER=m
CONFIG_CRYPTO_USER_API_RNG=m
# CONFIG_CRYPTO_USER_API_RNG_CAVP is not set
CONFIG_CRYPTO_USER_API_AEAD=m
CONFIG_CRYPTO_USER_API_ENABLE_OBSOLETE=y
# end of Userspace interface
CONFIG_CRYPTO_HASH_INFO=y
#
# Accelerated Cryptographic Algorithms for CPU (x86)
#
CONFIG_CRYPTO_CURVE25519_X86=m
CONFIG_CRYPTO_AES_NI_INTEL=m
CONFIG_CRYPTO_BLOWFISH_X86_64=m
CONFIG_CRYPTO_CAMELLIA_X86_64=m
CONFIG_CRYPTO_CAMELLIA_AESNI_AVX_X86_64=m
CONFIG_CRYPTO_CAMELLIA_AESNI_AVX2_X86_64=m
CONFIG_CRYPTO_CAST5_AVX_X86_64=m
CONFIG_CRYPTO_CAST6_AVX_X86_64=m
CONFIG_CRYPTO_DES3_EDE_X86_64=m
CONFIG_CRYPTO_SERPENT_SSE2_X86_64=m
CONFIG_CRYPTO_SERPENT_AVX_X86_64=m
CONFIG_CRYPTO_SERPENT_AVX2_X86_64=m
# CONFIG_CRYPTO_SM4_AESNI_AVX_X86_64 is not set
# CONFIG_CRYPTO_SM4_AESNI_AVX2_X86_64 is not set
CONFIG_CRYPTO_TWOFISH_X86_64=m
CONFIG_CRYPTO_TWOFISH_X86_64_3WAY=m
CONFIG_CRYPTO_TWOFISH_AVX_X86_64=m
# CONFIG_CRYPTO_ARIA_AESNI_AVX_X86_64 is not set
# CONFIG_CRYPTO_ARIA_AESNI_AVX2_X86_64 is not set
# CONFIG_CRYPTO_ARIA_GFNI_AVX512_X86_64 is not set
CONFIG_CRYPTO_CHACHA20_X86_64=m
CONFIG_CRYPTO_AEGIS128_AESNI_SSE2=m
# CONFIG_CRYPTO_NHPOLY1305_SSE2 is not set
# CONFIG_CRYPTO_NHPOLY1305_AVX2 is not set
# CONFIG_CRYPTO_BLAKE2S_X86 is not set
# CONFIG_CRYPTO_POLYVAL_CLMUL_NI is not set
CONFIG_CRYPTO_POLY1305_X86_64=m
CONFIG_CRYPTO_SHA1_SSSE3=m
CONFIG_CRYPTO_SHA256_SSSE3=m
CONFIG_CRYPTO_SHA512_SSSE3=m
# CONFIG_CRYPTO_SM3_AVX_X86_64 is not set
CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL=m
CONFIG_CRYPTO_CRC32C_INTEL=m
CONFIG_CRYPTO_CRC32_PCLMUL=m
CONFIG_CRYPTO_CRCT10DIF_PCLMUL=m
# end of Accelerated Cryptographic Algorithms for CPU (x86)
CONFIG_CRYPTO_HW=y
# CONFIG_CRYPTO_DEV_PADLOCK is not set
# CONFIG_CRYPTO_DEV_ATMEL_ECC is not set
# CONFIG_CRYPTO_DEV_ATMEL_SHA204A is not set
CONFIG_CRYPTO_DEV_CCP=y
CONFIG_CRYPTO_DEV_CCP_DD=m
CONFIG_CRYPTO_DEV_SP_PSP=y
# CONFIG_CRYPTO_DEV_NITROX_CNN55XX is not set
# CONFIG_CRYPTO_DEV_QAT_DH895xCC is not set
# CONFIG_CRYPTO_DEV_QAT_C3XXX is not set
# CONFIG_CRYPTO_DEV_QAT_C62X is not set
# CONFIG_CRYPTO_DEV_QAT_4XXX is not set
# CONFIG_CRYPTO_DEV_QAT_420XX is not set
# CONFIG_CRYPTO_DEV_QAT_DH895xCCVF is not set
# CONFIG_CRYPTO_DEV_QAT_C3XXXVF is not set
# CONFIG_CRYPTO_DEV_QAT_C62XVF is not set
# CONFIG_CRYPTO_DEV_VIRTIO is not set
# CONFIG_CRYPTO_DEV_SAFEXCEL is not set
# CONFIG_CRYPTO_DEV_AMLOGIC_GXL is not set
CONFIG_ASYMMETRIC_KEY_TYPE=y
CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y
CONFIG_X509_CERTIFICATE_PARSER=y
# CONFIG_PKCS8_PRIVATE_KEY_PARSER is not set
CONFIG_PKCS7_MESSAGE_PARSER=y
# CONFIG_PKCS7_TEST_KEY is not set
CONFIG_SIGNED_PE_FILE_VERIFICATION=y
# CONFIG_FIPS_SIGNATURE_SELFTEST is not set
#
# Certificates for signature checking
#
CONFIG_MODULE_SIG_KEY="certs/signing_key.pem"
CONFIG_MODULE_SIG_KEY_TYPE_RSA=y
CONFIG_SYSTEM_TRUSTED_KEYRING=y
CONFIG_SYSTEM_TRUSTED_KEYS=""
# CONFIG_SYSTEM_EXTRA_CERTIFICATE is not set
# CONFIG_SECONDARY_TRUSTED_KEYRING is not set
# CONFIG_SYSTEM_BLACKLIST_KEYRING is not set
# end of Certificates for signature checking
CONFIG_BINARY_PRINTF=y
#
# Library routines
#
CONFIG_RAID6_PQ=m
CONFIG_RAID6_PQ_BENCHMARK=y
# CONFIG_PACKING is not set
CONFIG_BITREVERSE=y
CONFIG_GENERIC_STRNCPY_FROM_USER=y
CONFIG_GENERIC_STRNLEN_USER=y
CONFIG_GENERIC_NET_UTILS=y
CONFIG_CORDIC=y
# CONFIG_PRIME_NUMBERS is not set
CONFIG_RATIONAL=y
CONFIG_GENERIC_IOMAP=y
CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y
CONFIG_ARCH_HAS_FAST_MULTIPLIER=y
CONFIG_ARCH_USE_SYM_ANNOTATIONS=y
#
# Crypto library routines
#
CONFIG_CRYPTO_LIB_UTILS=y
CONFIG_CRYPTO_LIB_AES=y
CONFIG_CRYPTO_LIB_ARC4=m
CONFIG_CRYPTO_LIB_GF128MUL=m
CONFIG_CRYPTO_LIB_BLAKE2S_GENERIC=y
CONFIG_CRYPTO_ARCH_HAVE_LIB_CHACHA=m
CONFIG_CRYPTO_LIB_CHACHA_GENERIC=m
# CONFIG_CRYPTO_LIB_CHACHA is not set
CONFIG_CRYPTO_ARCH_HAVE_LIB_CURVE25519=m
CONFIG_CRYPTO_LIB_CURVE25519_GENERIC=m
CONFIG_CRYPTO_LIB_CURVE25519=m
CONFIG_CRYPTO_LIB_DES=m
CONFIG_CRYPTO_LIB_POLY1305_RSIZE=11
CONFIG_CRYPTO_ARCH_HAVE_LIB_POLY1305=m
CONFIG_CRYPTO_LIB_POLY1305_GENERIC=m
# CONFIG_CRYPTO_LIB_POLY1305 is not set
# CONFIG_CRYPTO_LIB_CHACHA20POLY1305 is not set
CONFIG_CRYPTO_LIB_SHA1=y
CONFIG_CRYPTO_LIB_SHA256=y
# end of Crypto library routines
CONFIG_CRC_CCITT=m
CONFIG_CRC16=y
CONFIG_CRC_T10DIF=y
CONFIG_CRC64_ROCKSOFT=y
CONFIG_CRC_ITU_T=m
CONFIG_CRC32=y
# CONFIG_CRC32_SELFTEST is not set
CONFIG_CRC32_SLICEBY8=y
# CONFIG_CRC32_SLICEBY4 is not set
# CONFIG_CRC32_SARWATE is not set
# CONFIG_CRC32_BIT is not set
CONFIG_CRC64=y
# CONFIG_CRC4 is not set
CONFIG_CRC7=m
CONFIG_LIBCRC32C=m
CONFIG_CRC8=m
CONFIG_XXHASH=y
# CONFIG_RANDOM32_SELFTEST is not set
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=y
CONFIG_LZO_COMPRESS=y
CONFIG_LZO_DECOMPRESS=y
CONFIG_LZ4_COMPRESS=m
CONFIG_LZ4HC_COMPRESS=m
CONFIG_LZ4_DECOMPRESS=m
CONFIG_ZSTD_COMMON=m
CONFIG_ZSTD_COMPRESS=m
CONFIG_ZSTD_DECOMPRESS=m
CONFIG_XZ_DEC=y
CONFIG_XZ_DEC_X86=y
CONFIG_XZ_DEC_POWERPC=y
CONFIG_XZ_DEC_ARM=y
CONFIG_XZ_DEC_ARMTHUMB=y
CONFIG_XZ_DEC_SPARC=y
# CONFIG_XZ_DEC_MICROLZMA is not set
CONFIG_XZ_DEC_BCJ=y
# CONFIG_XZ_DEC_TEST is not set
CONFIG_DECOMPRESS_GZIP=y
CONFIG_DECOMPRESS_BZIP2=y
CONFIG_GENERIC_ALLOCATOR=y
CONFIG_REED_SOLOMON=m
CONFIG_REED_SOLOMON_ENC8=y
CONFIG_REED_SOLOMON_DEC8=y
CONFIG_TEXTSEARCH=y
CONFIG_TEXTSEARCH_KMP=m
CONFIG_TEXTSEARCH_BM=m
CONFIG_TEXTSEARCH_FSM=m
CONFIG_INTERVAL_TREE=y
CONFIG_XARRAY_MULTI=y
CONFIG_ASSOCIATIVE_ARRAY=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT=y
CONFIG_HAS_IOPORT_MAP=y
CONFIG_HAS_DMA=y
CONFIG_DMA_OPS=y
CONFIG_NEED_SG_DMA_FLAGS=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_ARCH_DMA_ADDR_T_64BIT=y
CONFIG_ARCH_HAS_FORCE_DMA_UNENCRYPTED=y
CONFIG_SWIOTLB=y
# CONFIG_SWIOTLB_DYNAMIC is not set
CONFIG_DMA_NEED_SYNC=y
CONFIG_DMA_COHERENT_POOL=y
# CONFIG_DMA_API_DEBUG is not set
# CONFIG_DMA_MAP_BENCHMARK is not set
CONFIG_SGL_ALLOC=y
CONFIG_CPU_RMAP=y
CONFIG_DQL=y
CONFIG_GLOB=y
# CONFIG_GLOB_SELFTEST is not set
CONFIG_NLATTR=y
CONFIG_CLZ_TAB=y
# CONFIG_IRQ_POLL is not set
CONFIG_MPILIB=y
CONFIG_DIMLIB=m
CONFIG_OID_REGISTRY=y
CONFIG_UCS2_STRING=y
CONFIG_HAVE_GENERIC_VDSO=y
CONFIG_GENERIC_GETTIMEOFDAY=y
CONFIG_GENERIC_VDSO_TIME_NS=y
CONFIG_GENERIC_VDSO_OVERFLOW_PROTECT=y
CONFIG_FONT_SUPPORT=y
# CONFIG_FONTS is not set
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y
CONFIG_SG_POOL=y
CONFIG_ARCH_HAS_PMEM_API=y
CONFIG_MEMREGION=y
CONFIG_ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION=y
CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE=y
CONFIG_ARCH_HAS_COPY_MC=y
CONFIG_ARCH_STACKWALK=y
CONFIG_STACKDEPOT=y
CONFIG_STACKDEPOT_MAX_FRAMES=64
CONFIG_SBITMAP=y
# CONFIG_LWQ_TEST is not set
# end of Library routines
CONFIG_FIRMWARE_TABLE=y
#
# Kernel hacking
#
#
# printk and dmesg options
#
CONFIG_PRINTK_TIME=y
# CONFIG_PRINTK_CALLER is not set
# CONFIG_STACKTRACE_BUILD_ID is not set
CONFIG_CONSOLE_LOGLEVEL_DEFAULT=7
CONFIG_CONSOLE_LOGLEVEL_QUIET=4
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
CONFIG_BOOT_PRINTK_DELAY=y
CONFIG_DYNAMIC_DEBUG=y
CONFIG_DYNAMIC_DEBUG_CORE=y
CONFIG_SYMBOLIC_ERRNAME=y
CONFIG_DEBUG_BUGVERBOSE=y
# end of printk and dmesg options
CONFIG_DEBUG_KERNEL=y
# CONFIG_DEBUG_MISC is not set
#
# Compile-time checks and compiler options
#
CONFIG_DEBUG_INFO=y
CONFIG_AS_HAS_NON_CONST_ULEB128=y
# CONFIG_DEBUG_INFO_NONE is not set
CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
# CONFIG_DEBUG_INFO_DWARF4 is not set
# CONFIG_DEBUG_INFO_DWARF5 is not set
# CONFIG_DEBUG_INFO_REDUCED is not set
CONFIG_DEBUG_INFO_COMPRESSED_NONE=y
# CONFIG_DEBUG_INFO_COMPRESSED_ZLIB is not set
# CONFIG_DEBUG_INFO_COMPRESSED_ZSTD is not set
# CONFIG_DEBUG_INFO_SPLIT is not set
CONFIG_PAHOLE_HAS_SPLIT_BTF=y
CONFIG_PAHOLE_HAS_LANG_EXCLUDE=y
# CONFIG_GDB_SCRIPTS is not set
CONFIG_FRAME_WARN=2048
CONFIG_STRIP_ASM_SYMS=y
# CONFIG_READABLE_ASM is not set
# CONFIG_HEADERS_INSTALL is not set
# CONFIG_DEBUG_SECTION_MISMATCH is not set
CONFIG_SECTION_MISMATCH_WARN_ONLY=y
CONFIG_OBJTOOL=y
CONFIG_NOINSTR_VALIDATION=y
# CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set
# end of Compile-time checks and compiler options
#
# Generic Kernel Debugging Instruments
#
CONFIG_MAGIC_SYSRQ=y
CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x01b6
CONFIG_MAGIC_SYSRQ_SERIAL=y
CONFIG_MAGIC_SYSRQ_SERIAL_SEQUENCE=""
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_FS_ALLOW_ALL=y
# CONFIG_DEBUG_FS_DISALLOW_MOUNT is not set
# CONFIG_DEBUG_FS_ALLOW_NONE is not set
CONFIG_HAVE_ARCH_KGDB=y
# CONFIG_KGDB is not set
CONFIG_ARCH_HAS_UBSAN=y
# CONFIG_UBSAN is not set
CONFIG_HAVE_ARCH_KCSAN=y
CONFIG_HAVE_KCSAN_COMPILER=y
# CONFIG_KCSAN is not set
# end of Generic Kernel Debugging Instruments
#
# Networking Debugging
#
# CONFIG_NET_DEV_REFCNT_TRACKER is not set
# CONFIG_NET_NS_REFCNT_TRACKER is not set
# CONFIG_DEBUG_NET is not set
# end of Networking Debugging
#
# Memory Debugging
#
CONFIG_PAGE_EXTENSION=y
# CONFIG_DEBUG_PAGEALLOC is not set
CONFIG_SLUB_DEBUG=y
# CONFIG_SLUB_DEBUG_ON is not set
# CONFIG_PAGE_OWNER is not set
# CONFIG_PAGE_TABLE_CHECK is not set
CONFIG_PAGE_POISONING=y
# CONFIG_DEBUG_PAGE_REF is not set
# CONFIG_DEBUG_RODATA_TEST is not set
CONFIG_ARCH_HAS_DEBUG_WX=y
CONFIG_DEBUG_WX=y
CONFIG_GENERIC_PTDUMP=y
CONFIG_PTDUMP_CORE=y
# CONFIG_PTDUMP_DEBUGFS is not set
CONFIG_HAVE_DEBUG_KMEMLEAK=y
# CONFIG_DEBUG_KMEMLEAK is not set
# CONFIG_PER_VMA_LOCK_STATS is not set
# CONFIG_DEBUG_OBJECTS is not set
# CONFIG_SHRINKER_DEBUG is not set
# CONFIG_DEBUG_STACK_USAGE is not set
CONFIG_SCHED_STACK_END_CHECK=y
CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE=y
# CONFIG_DEBUG_VM is not set
# CONFIG_DEBUG_VM_PGTABLE is not set
CONFIG_ARCH_HAS_DEBUG_VIRTUAL=y
# CONFIG_DEBUG_VIRTUAL is not set
CONFIG_DEBUG_MEMORY_INIT=y
# CONFIG_DEBUG_PER_CPU_MAPS is not set
CONFIG_ARCH_SUPPORTS_KMAP_LOCAL_FORCE_MAP=y
# CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP is not set
# CONFIG_MEM_ALLOC_PROFILING is not set
CONFIG_HAVE_ARCH_KASAN=y
CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
CONFIG_CC_HAS_KASAN_GENERIC=y
CONFIG_CC_HAS_WORKING_NOSANITIZE_ADDRESS=y
# CONFIG_KASAN is not set
CONFIG_HAVE_ARCH_KFENCE=y
# CONFIG_KFENCE is not set
CONFIG_HAVE_ARCH_KMSAN=y
# end of Memory Debugging
# CONFIG_DEBUG_SHIRQ is not set
#
# Debug Oops, Lockups and Hangs
#
# CONFIG_PANIC_ON_OOPS is not set
CONFIG_PANIC_ON_OOPS_VALUE=0
CONFIG_PANIC_TIMEOUT=0
CONFIG_LOCKUP_DETECTOR=y
CONFIG_SOFTLOCKUP_DETECTOR=y
# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
CONFIG_HAVE_HARDLOCKUP_DETECTOR_BUDDY=y
CONFIG_HARDLOCKUP_DETECTOR=y
# CONFIG_HARDLOCKUP_DETECTOR_PREFER_BUDDY is not set
CONFIG_HARDLOCKUP_DETECTOR_PERF=y
# CONFIG_HARDLOCKUP_DETECTOR_BUDDY is not set
# CONFIG_HARDLOCKUP_DETECTOR_ARCH is not set
CONFIG_HARDLOCKUP_DETECTOR_COUNTS_HRTIMER=y
CONFIG_HARDLOCKUP_CHECK_TIMESTAMP=y
# CONFIG_BOOTPARAM_HARDLOCKUP_PANIC is not set
CONFIG_DETECT_HUNG_TASK=y
CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
# CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
# CONFIG_WQ_WATCHDOG is not set
# CONFIG_WQ_CPU_INTENSIVE_REPORT is not set
# CONFIG_TEST_LOCKUP is not set
# end of Debug Oops, Lockups and Hangs
#
# Scheduler Debugging
#
# CONFIG_SCHED_DEBUG is not set
CONFIG_SCHED_INFO=y
# CONFIG_SCHEDSTATS is not set
# end of Scheduler Debugging
# CONFIG_DEBUG_TIMEKEEPING is not set
# CONFIG_DEBUG_PREEMPT is not set
#
# Lock Debugging (spinlocks, mutexes, etc...)
#
CONFIG_LOCK_DEBUGGING_SUPPORT=y
# CONFIG_PROVE_LOCKING is not set
# CONFIG_LOCK_STAT is not set
# CONFIG_DEBUG_RT_MUTEXES is not set
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_MUTEXES is not set
# CONFIG_DEBUG_WW_MUTEX_SLOWPATH is not set
# CONFIG_DEBUG_RWSEMS is not set
# CONFIG_DEBUG_LOCK_ALLOC is not set
# CONFIG_DEBUG_ATOMIC_SLEEP is not set
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
# CONFIG_LOCK_TORTURE_TEST is not set
# CONFIG_WW_MUTEX_SELFTEST is not set
# CONFIG_SCF_TORTURE_TEST is not set
# CONFIG_CSD_LOCK_WAIT_DEBUG is not set
# end of Lock Debugging (spinlocks, mutexes, etc...)
# CONFIG_NMI_CHECK_CPU is not set
# CONFIG_DEBUG_IRQFLAGS is not set
CONFIG_STACKTRACE=y
# CONFIG_WARN_ALL_UNSEEDED_RANDOM is not set
# CONFIG_DEBUG_KOBJECT is not set
#
# Debug kernel data structures
#
CONFIG_DEBUG_LIST=y
# CONFIG_DEBUG_PLIST is not set
# CONFIG_DEBUG_SG is not set
# CONFIG_DEBUG_NOTIFIERS is not set
# CONFIG_DEBUG_MAPLE_TREE is not set
# end of Debug kernel data structures
#
# RCU Debugging
#
# CONFIG_RCU_SCALE_TEST is not set
# CONFIG_RCU_TORTURE_TEST is not set
# CONFIG_RCU_REF_SCALE_TEST is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=21
CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0
# CONFIG_RCU_CPU_STALL_CPUTIME is not set
# CONFIG_RCU_TRACE is not set
# CONFIG_RCU_EQS_DEBUG is not set
# end of RCU Debugging
# CONFIG_DEBUG_WQ_FORCE_RR_CPU is not set
# CONFIG_CPU_HOTPLUG_STATE_CONTROL is not set
# CONFIG_LATENCYTOP is not set
# CONFIG_DEBUG_CGROUP_REF is not set
CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_NOP_TRACER=y
CONFIG_HAVE_RETHOOK=y
CONFIG_RETHOOK=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_RETVAL=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y
CONFIG_HAVE_DYNAMIC_FTRACE_NO_PATCHABLE=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
CONFIG_HAVE_FENTRY=y
CONFIG_HAVE_OBJTOOL_MCOUNT=y
CONFIG_HAVE_OBJTOOL_NOP_MCOUNT=y
CONFIG_HAVE_C_RECORDMCOUNT=y
CONFIG_HAVE_BUILDTIME_MCOUNT_SORT=y
CONFIG_BUILDTIME_MCOUNT_SORT=y
CONFIG_TRACER_MAX_TRACE=y
CONFIG_TRACE_CLOCK=y
CONFIG_RING_BUFFER=y
CONFIG_EVENT_TRACING=y
CONFIG_CONTEXT_SWITCH_TRACER=y
CONFIG_TRACING=y
CONFIG_GENERIC_TRACER=y
CONFIG_TRACING_SUPPORT=y
CONFIG_FTRACE=y
# CONFIG_BOOTTIME_TRACING is not set
CONFIG_FUNCTION_TRACER=y
CONFIG_FUNCTION_GRAPH_TRACER=y
# CONFIG_FUNCTION_GRAPH_RETVAL is not set
CONFIG_DYNAMIC_FTRACE=y
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
CONFIG_DYNAMIC_FTRACE_WITH_ARGS=y
# CONFIG_FPROBE is not set
# CONFIG_FUNCTION_PROFILER is not set
CONFIG_STACK_TRACER=y
# CONFIG_IRQSOFF_TRACER is not set
# CONFIG_PREEMPT_TRACER is not set
# CONFIG_SCHED_TRACER is not set
# CONFIG_HWLAT_TRACER is not set
# CONFIG_OSNOISE_TRACER is not set
# CONFIG_TIMERLAT_TRACER is not set
# CONFIG_MMIOTRACE is not set
CONFIG_FTRACE_SYSCALLS=y
CONFIG_TRACER_SNAPSHOT=y
# CONFIG_TRACER_SNAPSHOT_PER_CPU_SWAP is not set
CONFIG_BRANCH_PROFILE_NONE=y
# CONFIG_PROFILE_ANNOTATED_BRANCHES is not set
# CONFIG_PROFILE_ALL_BRANCHES is not set
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_KPROBE_EVENTS=y
# CONFIG_KPROBE_EVENTS_ON_NOTRACE is not set
CONFIG_UPROBE_EVENTS=y
CONFIG_DYNAMIC_EVENTS=y
CONFIG_PROBE_EVENTS=y
CONFIG_FTRACE_MCOUNT_RECORD=y
CONFIG_FTRACE_MCOUNT_USE_CC=y
# CONFIG_SYNTH_EVENTS is not set
# CONFIG_USER_EVENTS is not set
# CONFIG_HIST_TRIGGERS is not set
# CONFIG_TRACE_EVENT_INJECT is not set
# CONFIG_TRACEPOINT_BENCHMARK is not set
# CONFIG_RING_BUFFER_BENCHMARK is not set
# CONFIG_TRACE_EVAL_MAP_FILE is not set
# CONFIG_FTRACE_RECORD_RECURSION is not set
# CONFIG_FTRACE_VALIDATE_RCU_IS_WATCHING is not set
# CONFIG_FTRACE_STARTUP_TEST is not set
# CONFIG_FTRACE_SORT_STARTUP_TEST is not set
# CONFIG_RING_BUFFER_STARTUP_TEST is not set
# CONFIG_RING_BUFFER_VALIDATE_TIME_DELTAS is not set
# CONFIG_PREEMPTIRQ_DELAY_TEST is not set
# CONFIG_KPROBE_EVENT_GEN_TEST is not set
# CONFIG_RV is not set
# CONFIG_PROVIDE_OHCI1394_DMA_INIT is not set
# CONFIG_SAMPLES is not set
CONFIG_HAVE_SAMPLE_FTRACE_DIRECT=y
CONFIG_HAVE_SAMPLE_FTRACE_DIRECT_MULTI=y
CONFIG_ARCH_HAS_DEVMEM_IS_ALLOWED=y
CONFIG_STRICT_DEVMEM=y
CONFIG_IO_STRICT_DEVMEM=y
#
# x86 Debugging
#
# CONFIG_X86_VERBOSE_BOOTUP is not set
CONFIG_EARLY_PRINTK=y
# CONFIG_EARLY_PRINTK_DBGP is not set
# CONFIG_EARLY_PRINTK_USB_XDBC is not set
# CONFIG_EFI_PGT_DUMP is not set
# CONFIG_DEBUG_TLBFLUSH is not set
CONFIG_HAVE_MMIOTRACE_SUPPORT=y
# CONFIG_X86_DECODER_SELFTEST is not set
CONFIG_IO_DELAY_0X80=y
# CONFIG_IO_DELAY_0XED is not set
# CONFIG_IO_DELAY_UDELAY is not set
# CONFIG_IO_DELAY_NONE is not set
# CONFIG_DEBUG_BOOT_PARAMS is not set
# CONFIG_CPA_DEBUG is not set
CONFIG_DEBUG_ENTRY=y
# CONFIG_DEBUG_NMI_SELFTEST is not set
CONFIG_X86_DEBUG_FPU=y
# CONFIG_PUNIT_ATOM_DEBUG is not set
CONFIG_UNWINDER_ORC=y
# CONFIG_UNWINDER_FRAME_POINTER is not set
# end of x86 Debugging
#
# Kernel Testing and Coverage
#
# CONFIG_KUNIT is not set
# CONFIG_NOTIFIER_ERROR_INJECTION is not set
CONFIG_FUNCTION_ERROR_INJECTION=y
# CONFIG_FAULT_INJECTION is not set
CONFIG_ARCH_HAS_KCOV=y
CONFIG_CC_HAS_SANCOV_TRACE_PC=y
# CONFIG_KCOV is not set
# CONFIG_RUNTIME_TESTING_MENU is not set
CONFIG_ARCH_USE_MEMTEST=y
# CONFIG_MEMTEST is not set
# end of Kernel Testing and Coverage
#
# Rust hacking
#
# end of Rust hacking
# end of Kernel hacking
^ permalink raw reply related [flat|nested] 115+ messages in thread
end of thread, other threads:[~2024-06-21 13:39 UTC | newest]
Thread overview: 115+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-28 9:55 [PATCHv11 00/19] x86/tdx: Add kexec support Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 01/19] x86/acpi: Extract ACPI MADT wakeup code into a separate file Kirill A. Shutemov
2024-05-28 13:47 ` Borislav Petkov
2024-05-28 9:55 ` [PATCHv11 02/19] x86/apic: Mark acpi_mp_wake_* variables as __ro_after_init Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 03/19] cpu/hotplug: Add support for declaring CPU offlining not supported Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 04/19] cpu/hotplug, x86/acpi: Disable CPU offlining for ACPI MADT wakeup Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion Kirill A. Shutemov
2024-05-29 10:47 ` Nikolay Borisov
2024-05-29 11:17 ` Kirill A. Shutemov
2024-05-29 11:28 ` Borislav Petkov
2024-05-29 12:33 ` Andrew Cooper
2024-05-29 15:15 ` Borislav Petkov
2024-06-04 0:24 ` H. Peter Anvin
2024-06-04 9:15 ` Borislav Petkov
2024-06-04 15:21 ` Kirill A. Shutemov
2024-06-04 17:57 ` Borislav Petkov
2024-06-11 18:26 ` H. Peter Anvin
2024-06-12 9:22 ` Kirill A. Shutemov
2024-06-12 23:06 ` Andrew Cooper
2024-06-12 23:25 ` H. Peter Anvin
2024-06-03 14:43 ` H. Peter Anvin
2024-06-12 12:10 ` Nikolay Borisov
2024-06-03 22:43 ` H. Peter Anvin
2024-05-28 9:55 ` [PATCHv11 06/19] x86/kexec: Keep CR4.MCE set during kexec for TDX guest Kirill A. Shutemov
2024-05-28 11:12 ` Huang, Kai
2024-05-29 11:39 ` Nikolay Borisov
2024-05-28 9:55 ` [PATCHv11 07/19] x86/mm: Make x86_platform.guest.enc_status_change_*() return errno Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 08/19] x86/mm: Return correct level from lookup_address() if pte is none Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 09/19] x86/tdx: Account shared memory Kirill A. Shutemov
2024-06-04 16:08 ` Dave Hansen
2024-06-04 16:24 ` Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 10/19] x86/mm: Add callbacks to prepare encrypted memory for kexec Kirill A. Shutemov
2024-05-29 10:42 ` Borislav Petkov
2024-06-02 12:39 ` [PATCHv11.1 " Kirill A. Shutemov
2024-06-02 12:42 ` Kirill A. Shutemov
2024-06-02 12:44 ` [PATCHv11.2 " Kirill A. Shutemov
2024-06-04 16:16 ` [PATCHv11 " Dave Hansen
2024-05-28 9:55 ` [PATCHv11 11/19] x86/tdx: Convert shared memory back to private on kexec Kirill A. Shutemov
2024-05-31 15:14 ` Borislav Petkov
2024-05-31 17:34 ` Kalra, Ashish
2024-05-31 18:06 ` Borislav Petkov
2024-06-02 14:20 ` Kirill A. Shutemov
2024-06-02 14:23 ` [PATCHv11.1 " Kirill A. Shutemov
2024-06-03 8:37 ` Borislav Petkov
2024-06-04 15:32 ` Kirill A. Shutemov
2024-06-04 15:47 ` Dave Hansen
2024-06-04 16:14 ` Kirill A. Shutemov
2024-06-04 18:05 ` Borislav Petkov
2024-06-05 12:21 ` Kirill A. Shutemov
2024-06-05 16:24 ` Borislav Petkov
2024-06-06 12:39 ` Kirill A. Shutemov
2024-06-04 16:27 ` [PATCHv11 " Dave Hansen
2024-06-05 12:43 ` Kirill A. Shutemov
2024-06-05 16:05 ` Dave Hansen
2024-05-28 9:55 ` [PATCHv11 12/19] x86/mm: Make e820__end_ram_pfn() cover E820_TYPE_ACPI ranges Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 13/19] x86/mm: Do not zap page table entries mapping unaccepted memory table during kdump Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 14/19] x86/acpi: Rename fields in acpi_madt_multiproc_wakeup structure Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 15/19] x86/acpi: Do not attempt to bring up secondary CPUs in kexec case Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 16/19] x86/smp: Add smp_ops.stop_this_cpu() callback Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 17/19] x86/mm: Introduce kernel_ident_mapping_free() Kirill A. Shutemov
2024-05-28 9:55 ` [PATCHv11 18/19] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method Kirill A. Shutemov
2024-06-03 8:39 ` Borislav Petkov
2024-06-07 15:14 ` Kirill A. Shutemov
2024-06-10 13:40 ` Borislav Petkov
2024-06-10 14:01 ` Kirill A. Shutemov
2024-06-11 15:47 ` Kirill A. Shutemov
2024-06-11 19:46 ` Borislav Petkov
2024-06-12 9:24 ` Kirill A. Shutemov
2024-06-12 9:29 ` Borislav Petkov
2024-06-13 13:41 ` Kirill A. Shutemov
2024-06-13 14:56 ` Borislav Petkov
2024-06-14 14:06 ` Tom Lendacky
2024-06-18 12:20 ` Kirill A. Shutemov
2024-06-21 13:38 ` Borislav Petkov
2024-05-28 9:55 ` [PATCHv11 19/19] ACPI: tables: Print MULTIPROC_WAKEUP when MADT is parsed Kirill A. Shutemov
2024-05-28 10:01 ` [PATCHv11 00/19] x86/tdx: Add kexec support Rafael J. Wysocki
2024-05-30 23:36 ` [PATCH v7 0/3] x86/snp: " Ashish Kalra
2024-05-30 23:36 ` [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec Ashish Kalra
2024-05-31 9:12 ` Alexander Kuleshov
2024-06-03 8:56 ` Borislav Petkov
2024-06-03 13:06 ` Kalra, Ashish
2024-06-03 13:39 ` Mike Rapoport
2024-06-03 14:01 ` Kalra, Ashish
2024-06-03 14:46 ` Borislav Petkov
2024-06-03 15:31 ` Mike Rapoport
2024-06-03 16:48 ` Kalra, Ashish
2024-06-03 16:57 ` Borislav Petkov
2024-06-03 17:08 ` Kalra, Ashish
2024-06-03 17:12 ` Borislav Petkov
2024-06-04 22:12 ` Kalra, Ashish
2024-06-04 22:35 ` Kalra, Ashish
2024-06-05 1:48 ` Dave Young
2024-06-05 1:52 ` Dave Young
2024-06-05 1:58 ` Dave Young
2024-06-05 2:08 ` Kalra, Ashish
2024-06-05 2:28 ` Dave Young
2024-06-05 11:09 ` Borislav Petkov
2024-06-06 1:52 ` Dave Young
2024-06-05 2:14 ` Kalra, Ashish
2024-06-03 17:05 ` Kalra, Ashish
2024-06-03 17:10 ` Borislav Petkov
2024-06-04 1:23 ` Dave Young
2024-06-04 9:43 ` Borislav Petkov
2024-06-04 11:09 ` Dave Young
2024-06-04 18:02 ` Borislav Petkov
2024-06-05 2:53 ` Dave Young
2024-06-05 7:42 ` Borislav Petkov
2024-06-05 8:17 ` Ard Biesheuvel
2024-06-05 11:15 ` Borislav Petkov
2024-06-03 15:29 ` Mike Rapoport
2024-06-03 16:56 ` Kalra, Ashish
2024-06-03 17:41 ` Mike Rapoport
2024-05-30 23:37 ` [PATCH v7 2/3] x86/boot/compressed: Skip Video Memory access in Decompressor for SEV-ES/SNP Ashish Kalra
2024-06-05 20:14 ` Borislav Petkov
2024-05-30 23:37 ` [PATCH v7 3/3] x86/snp: Convert shared memory back to private on kexec Ashish Kalra
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).