public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCHv4, REBASED 0/4] x86: Reduce code duplication on page table initialization
@ 2024-10-16 11:14 Kirill A. Shutemov
  2024-10-16 11:14 ` [PATCHv4, REBASED 1/4] x86/mm/ident_map: Fix virtual address wrap to zero Kirill A. Shutemov
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Kirill A. Shutemov @ 2024-10-16 11:14 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Rafael J. Wysocki, Andy Lutomirski,
	Peter Zijlstra, Baoquan He
  Cc: Ard Biesheuvel, Tom Lendacky, Andrew Morton, Thomas Zimmermann,
	Sean Christopherson, linux-kernel, linux-acpi, Kirill A. Shutemov

Use kernel_ident_mapping_init() to initialize kernel page tables where
possible, replacing manual initialization, reducing code duplication.

v4:
 - Reviewed-bys from Kai;
 - Fix comment in acpi_mp_setup_reset() (Rafael);
v3:
 - Reviewed-bys from Tom;
 - Improve commit messages;
v2:
 - A separate patch to change what PA is mapped at relocate_kernel() VA.
 - Improve commit messages;
 - Add Reveiwed-by from Kai;

Kirill A. Shutemov (4):
  x86/mm/ident_map: Fix virtual address wrap to zero
  x86/acpi: Replace manual page table initialization with
    kernel_ident_mapping_init()
  x86/64/kexec: Map original relocate_kernel() in
    init_transition_pgtable()
  x86/64/kexec: Rewrite init_transition_pgtable() with
    kernel_ident_mapping_init()

 arch/x86/include/asm/kexec.h       |  5 +-
 arch/x86/kernel/acpi/madt_wakeup.c | 73 +++++-------------------
 arch/x86/kernel/machine_kexec_64.c | 89 +++++++++++-------------------
 arch/x86/mm/ident_map.c            | 14 +----
 4 files changed, 50 insertions(+), 131 deletions(-)

-- 
2.45.2


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCHv4, REBASED 1/4] x86/mm/ident_map: Fix virtual address wrap to zero
  2024-10-16 11:14 [PATCHv4, REBASED 0/4] x86: Reduce code duplication on page table initialization Kirill A. Shutemov
@ 2024-10-16 11:14 ` Kirill A. Shutemov
  2024-10-30 11:47   ` Borislav Petkov
                     ` (2 more replies)
  2024-10-16 11:14 ` [PATCHv4, REBASED 2/4] x86/acpi: Replace manual page table initialization with kernel_ident_mapping_init() Kirill A. Shutemov
                   ` (4 subsequent siblings)
  5 siblings, 3 replies; 17+ messages in thread
From: Kirill A. Shutemov @ 2024-10-16 11:14 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Rafael J. Wysocki, Andy Lutomirski,
	Peter Zijlstra, Baoquan He
  Cc: Ard Biesheuvel, Tom Lendacky, Andrew Morton, Thomas Zimmermann,
	Sean Christopherson, linux-kernel, linux-acpi, Kirill A. Shutemov,
	Kai Huang

Calculation of 'next' virtual address doesn't protect against wrapping
to zero. It can result in page table corruption and hang. The
problematic case is possible if user sets high x86_mapping_info::offset.

The wrapping to zero only occurs if the top PGD entry is accessed.
There are no such users in the upstream. Only hibernate_64.c uses
x86_mapping_info::offset, and it operates on the direct mapping range,
which is not the top PGD entry.

Replace manual 'next' calculation with p?d_addr_end() which handles
wrapping correctly.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/mm/ident_map.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c
index 437e96fb4977..5872f3ee863c 100644
--- a/arch/x86/mm/ident_map.c
+++ b/arch/x86/mm/ident_map.c
@@ -101,9 +101,7 @@ static int ident_pud_init(struct x86_mapping_info *info, pud_t *pud_page,
 		pmd_t *pmd;
 		bool use_gbpage;
 
-		next = (addr & PUD_MASK) + PUD_SIZE;
-		if (next > end)
-			next = end;
+		next = pud_addr_end(addr, end);
 
 		/* if this is already a gbpage, this portion is already mapped */
 		if (pud_leaf(*pud))
@@ -154,10 +152,7 @@ static int ident_p4d_init(struct x86_mapping_info *info, p4d_t *p4d_page,
 		p4d_t *p4d = p4d_page + p4d_index(addr);
 		pud_t *pud;
 
-		next = (addr & P4D_MASK) + P4D_SIZE;
-		if (next > end)
-			next = end;
-
+		next = p4d_addr_end(addr, end);
 		if (p4d_present(*p4d)) {
 			pud = pud_offset(p4d, 0);
 			result = ident_pud_init(info, pud, addr, next);
@@ -199,10 +194,7 @@ int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page,
 		pgd_t *pgd = pgd_page + pgd_index(addr);
 		p4d_t *p4d;
 
-		next = (addr & PGDIR_MASK) + PGDIR_SIZE;
-		if (next > end)
-			next = end;
-
+		next = pgd_addr_end(addr, end);
 		if (pgd_present(*pgd)) {
 			p4d = p4d_offset(pgd, 0);
 			result = ident_p4d_init(info, p4d, addr, next);
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCHv4, REBASED 2/4] x86/acpi: Replace manual page table initialization with kernel_ident_mapping_init()
  2024-10-16 11:14 [PATCHv4, REBASED 0/4] x86: Reduce code duplication on page table initialization Kirill A. Shutemov
  2024-10-16 11:14 ` [PATCHv4, REBASED 1/4] x86/mm/ident_map: Fix virtual address wrap to zero Kirill A. Shutemov
@ 2024-10-16 11:14 ` Kirill A. Shutemov
  2025-03-10 19:50   ` [tip: x86/mm] " tip-bot2 for Kirill A. Shutemov
  2025-03-19 11:04   ` [tip: x86/core] " tip-bot2 for Kirill A. Shutemov
  2024-10-16 11:14 ` [PATCHv4, REBASED 3/4] x86/64/kexec: Map original relocate_kernel() in init_transition_pgtable() Kirill A. Shutemov
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 17+ messages in thread
From: Kirill A. Shutemov @ 2024-10-16 11:14 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Rafael J. Wysocki, Andy Lutomirski,
	Peter Zijlstra, Baoquan He
  Cc: Ard Biesheuvel, Tom Lendacky, Andrew Morton, Thomas Zimmermann,
	Sean Christopherson, linux-kernel, linux-acpi, Kirill A. Shutemov,
	Kai Huang, Rafael J . Wysocki

The function init_transition_pgtable() maps the page with
asm_acpi_mp_play_dead() into an identity mapping.

Replace manual page table initialization with kernel_ident_mapping_init()
to avoid code duplication. Use x86_mapping_info::offset to get the page
mapped at the correct location.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 arch/x86/kernel/acpi/madt_wakeup.c | 73 ++++++------------------------
 1 file changed, 15 insertions(+), 58 deletions(-)

diff --git a/arch/x86/kernel/acpi/madt_wakeup.c b/arch/x86/kernel/acpi/madt_wakeup.c
index d5ef6215583b..f36f28405dcc 100644
--- a/arch/x86/kernel/acpi/madt_wakeup.c
+++ b/arch/x86/kernel/acpi/madt_wakeup.c
@@ -70,58 +70,6 @@ static void __init free_pgt_page(void *pgt, void *dummy)
 	return memblock_free(pgt, PAGE_SIZE);
 }
 
-/*
- * Make sure asm_acpi_mp_play_dead() is present in the identity mapping at
- * the same place as in the kernel page tables. asm_acpi_mp_play_dead() switches
- * to the identity mapping and the function has be present at the same spot in
- * the virtual address space before and after switching page tables.
- */
-static int __init init_transition_pgtable(pgd_t *pgd)
-{
-	pgprot_t prot = PAGE_KERNEL_EXEC_NOENC;
-	unsigned long vaddr, paddr;
-	p4d_t *p4d;
-	pud_t *pud;
-	pmd_t *pmd;
-	pte_t *pte;
-
-	vaddr = (unsigned long)asm_acpi_mp_play_dead;
-	pgd += pgd_index(vaddr);
-	if (!pgd_present(*pgd)) {
-		p4d = (p4d_t *)alloc_pgt_page(NULL);
-		if (!p4d)
-			return -ENOMEM;
-		set_pgd(pgd, __pgd(__pa(p4d) | _KERNPG_TABLE));
-	}
-	p4d = p4d_offset(pgd, vaddr);
-	if (!p4d_present(*p4d)) {
-		pud = (pud_t *)alloc_pgt_page(NULL);
-		if (!pud)
-			return -ENOMEM;
-		set_p4d(p4d, __p4d(__pa(pud) | _KERNPG_TABLE));
-	}
-	pud = pud_offset(p4d, vaddr);
-	if (!pud_present(*pud)) {
-		pmd = (pmd_t *)alloc_pgt_page(NULL);
-		if (!pmd)
-			return -ENOMEM;
-		set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE));
-	}
-	pmd = pmd_offset(pud, vaddr);
-	if (!pmd_present(*pmd)) {
-		pte = (pte_t *)alloc_pgt_page(NULL);
-		if (!pte)
-			return -ENOMEM;
-		set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE));
-	}
-	pte = pte_offset_kernel(pmd, vaddr);
-
-	paddr = __pa(vaddr);
-	set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, prot));
-
-	return 0;
-}
-
 static int __init acpi_mp_setup_reset(u64 reset_vector)
 {
 	struct x86_mapping_info info = {
@@ -130,6 +78,7 @@ static int __init acpi_mp_setup_reset(u64 reset_vector)
 		.page_flag      = __PAGE_KERNEL_LARGE_EXEC,
 		.kernpg_flag    = _KERNPG_TABLE_NOENC,
 	};
+	unsigned long mstart, mend;
 	pgd_t *pgd;
 
 	pgd = alloc_pgt_page(NULL);
@@ -137,8 +86,6 @@ static int __init acpi_mp_setup_reset(u64 reset_vector)
 		return -ENOMEM;
 
 	for (int i = 0; i < nr_pfn_mapped; i++) {
-		unsigned long mstart, mend;
-
 		mstart = pfn_mapped[i].start << PAGE_SHIFT;
 		mend   = pfn_mapped[i].end << PAGE_SHIFT;
 		if (kernel_ident_mapping_init(&info, pgd, mstart, mend)) {
@@ -147,14 +94,24 @@ static int __init acpi_mp_setup_reset(u64 reset_vector)
 		}
 	}
 
-	if (kernel_ident_mapping_init(&info, pgd,
-				      PAGE_ALIGN_DOWN(reset_vector),
-				      PAGE_ALIGN(reset_vector + 1))) {
+	mstart = PAGE_ALIGN_DOWN(reset_vector);
+	mend = mstart + PAGE_SIZE;
+	if (kernel_ident_mapping_init(&info, pgd, mstart, mend)) {
 		kernel_ident_mapping_free(&info, pgd);
 		return -ENOMEM;
 	}
 
-	if (init_transition_pgtable(pgd)) {
+	/*
+	 * Make sure asm_acpi_mp_play_dead() is present in the identity mapping
+	 * at the same place as in the kernel page tables.
+	 * asm_acpi_mp_play_dead() switches to the identity mapping and the
+	 * function must be present at the same spot in the virtual address space
+	 * before and after switching page tables.
+	 */
+	info.offset = __START_KERNEL_map - phys_base;
+	mstart = PAGE_ALIGN_DOWN(__pa(asm_acpi_mp_play_dead));
+	mend = mstart + PAGE_SIZE;
+	if (kernel_ident_mapping_init(&info, pgd, mstart, mend)) {
 		kernel_ident_mapping_free(&info, pgd);
 		return -ENOMEM;
 	}
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCHv4, REBASED 3/4] x86/64/kexec: Map original relocate_kernel() in init_transition_pgtable()
  2024-10-16 11:14 [PATCHv4, REBASED 0/4] x86: Reduce code duplication on page table initialization Kirill A. Shutemov
  2024-10-16 11:14 ` [PATCHv4, REBASED 1/4] x86/mm/ident_map: Fix virtual address wrap to zero Kirill A. Shutemov
  2024-10-16 11:14 ` [PATCHv4, REBASED 2/4] x86/acpi: Replace manual page table initialization with kernel_ident_mapping_init() Kirill A. Shutemov
@ 2024-10-16 11:14 ` Kirill A. Shutemov
  2024-10-31 14:23   ` Borislav Petkov
  2024-10-16 11:14 ` [PATCHv4, REBASED 4/4] x86/64/kexec: Rewrite init_transition_pgtable() with kernel_ident_mapping_init() Kirill A. Shutemov
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Kirill A. Shutemov @ 2024-10-16 11:14 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Rafael J. Wysocki, Andy Lutomirski,
	Peter Zijlstra, Baoquan He
  Cc: Ard Biesheuvel, Tom Lendacky, Andrew Morton, Thomas Zimmermann,
	Sean Christopherson, linux-kernel, linux-acpi, Kirill A. Shutemov,
	Kai Huang

The init_transition_pgtable() function sets up transitional page tables.
It ensures that the relocate_kernel() function is present in the
identity mapping at the same location as in the kernel page tables.
relocate_kernel() switches to the identity mapping, and the function
must be present at the same location in the virtual address space before
and after switching page tables.

init_transition_pgtable() maps a copy of relocate_kernel() in
image->control_code_page at the relocate_kernel() virtual address, but
the original physical address of relocate_kernel() would also work.

It is safe to use original relocate_kernel() physical address cannot be
overwritten until swap_pages() is called, and the relocate_kernel()
virtual address will not be used by then.

Map the original relocate_kernel() at the relocate_kernel() virtual
address in the identity mapping. It is preparation to replace the
init_transition_pgtable() implementation with a call to
kernel_ident_mapping_init().

Note that while relocate_kernel() switches to the identity mapping, it
does not flush global TLB entries (CR4.PGE is not cleared). This means
that in most cases, the kernel still runs relocate_kernel() from the
original physical address before the change.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/kernel/machine_kexec_64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 9c9ac606893e..645690e81c2d 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -157,7 +157,7 @@ static int init_transition_pgtable(struct kimage *image, pgd_t *pgd)
 	pte_t *pte;
 
 	vaddr = (unsigned long)relocate_kernel;
-	paddr = __pa(page_address(image->control_code_page)+PAGE_SIZE);
+	paddr = __pa(relocate_kernel);
 	pgd += pgd_index(vaddr);
 	if (!pgd_present(*pgd)) {
 		p4d = (p4d_t *)get_zeroed_page(GFP_KERNEL);
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCHv4, REBASED 4/4] x86/64/kexec: Rewrite init_transition_pgtable() with kernel_ident_mapping_init()
  2024-10-16 11:14 [PATCHv4, REBASED 0/4] x86: Reduce code duplication on page table initialization Kirill A. Shutemov
                   ` (2 preceding siblings ...)
  2024-10-16 11:14 ` [PATCHv4, REBASED 3/4] x86/64/kexec: Map original relocate_kernel() in init_transition_pgtable() Kirill A. Shutemov
@ 2024-10-16 11:14 ` Kirill A. Shutemov
  2024-10-29 15:09 ` [PATCHv4, REBASED 0/4] x86: Reduce code duplication on page table initialization Kirill A. Shutemov
  2025-03-10 19:45 ` Ingo Molnar
  5 siblings, 0 replies; 17+ messages in thread
From: Kirill A. Shutemov @ 2024-10-16 11:14 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Rafael J. Wysocki, Andy Lutomirski,
	Peter Zijlstra, Baoquan He
  Cc: Ard Biesheuvel, Tom Lendacky, Andrew Morton, Thomas Zimmermann,
	Sean Christopherson, linux-kernel, linux-acpi, Kirill A. Shutemov,
	Kai Huang

init_transition_pgtable() sets up transitional page tables. Rewrite it
using kernel_ident_mapping_init() to avoid code duplication.

Change struct kimage_arch to track allocated page tables as a list, not
linking them to specific page table levels.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/include/asm/kexec.h       |  5 +-
 arch/x86/kernel/machine_kexec_64.c | 89 +++++++++++-------------------
 2 files changed, 32 insertions(+), 62 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index ae5482a2f0ca..7f9287f371e6 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -145,10 +145,7 @@ struct kimage_arch {
 };
 #else
 struct kimage_arch {
-	p4d_t *p4d;
-	pud_t *pud;
-	pmd_t *pmd;
-	pte_t *pte;
+	struct list_head pages;
 };
 #endif /* CONFIG_X86_32 */
 
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 645690e81c2d..fb350372835c 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -134,71 +134,42 @@ map_efi_systab(struct x86_mapping_info *info, pgd_t *level4p)
 	return 0;
 }
 
+static void *alloc_transition_pgt_page(void *data)
+{
+	struct kimage *image = (struct kimage *)data;
+	unsigned long virt;
+
+	virt = get_zeroed_page(GFP_KERNEL);
+	if (!virt)
+		return NULL;
+
+	list_add(&virt_to_page(virt)->lru, &image->arch.pages);
+	return (void *)virt;
+}
+
 static void free_transition_pgtable(struct kimage *image)
 {
-	free_page((unsigned long)image->arch.p4d);
-	image->arch.p4d = NULL;
-	free_page((unsigned long)image->arch.pud);
-	image->arch.pud = NULL;
-	free_page((unsigned long)image->arch.pmd);
-	image->arch.pmd = NULL;
-	free_page((unsigned long)image->arch.pte);
-	image->arch.pte = NULL;
+	struct page *page, *tmp;
+
+	list_for_each_entry_safe(page, tmp, &image->arch.pages, lru) {
+		list_del(&page->lru);
+		free_page((unsigned long)page_address(page));
+	}
 }
 
 static int init_transition_pgtable(struct kimage *image, pgd_t *pgd)
 {
-	pgprot_t prot = PAGE_KERNEL_EXEC_NOENC;
-	unsigned long vaddr, paddr;
-	int result = -ENOMEM;
-	p4d_t *p4d;
-	pud_t *pud;
-	pmd_t *pmd;
-	pte_t *pte;
+	struct x86_mapping_info info = {
+		.alloc_pgt_page	= alloc_transition_pgt_page,
+		.context	= image,
+		.page_flag	= __PAGE_KERNEL_LARGE_EXEC,
+		.kernpg_flag	= _KERNPG_TABLE_NOENC,
+		.offset = __START_KERNEL_map - phys_base,
+	};
+	unsigned long mstart = PAGE_ALIGN_DOWN(__pa(relocate_kernel));
+	unsigned long mend = mstart + PAGE_SIZE;
 
-	vaddr = (unsigned long)relocate_kernel;
-	paddr = __pa(relocate_kernel);
-	pgd += pgd_index(vaddr);
-	if (!pgd_present(*pgd)) {
-		p4d = (p4d_t *)get_zeroed_page(GFP_KERNEL);
-		if (!p4d)
-			goto err;
-		image->arch.p4d = p4d;
-		set_pgd(pgd, __pgd(__pa(p4d) | _KERNPG_TABLE));
-	}
-	p4d = p4d_offset(pgd, vaddr);
-	if (!p4d_present(*p4d)) {
-		pud = (pud_t *)get_zeroed_page(GFP_KERNEL);
-		if (!pud)
-			goto err;
-		image->arch.pud = pud;
-		set_p4d(p4d, __p4d(__pa(pud) | _KERNPG_TABLE));
-	}
-	pud = pud_offset(p4d, vaddr);
-	if (!pud_present(*pud)) {
-		pmd = (pmd_t *)get_zeroed_page(GFP_KERNEL);
-		if (!pmd)
-			goto err;
-		image->arch.pmd = pmd;
-		set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE));
-	}
-	pmd = pmd_offset(pud, vaddr);
-	if (!pmd_present(*pmd)) {
-		pte = (pte_t *)get_zeroed_page(GFP_KERNEL);
-		if (!pte)
-			goto err;
-		image->arch.pte = pte;
-		set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE));
-	}
-	pte = pte_offset_kernel(pmd, vaddr);
-
-	if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
-		prot = PAGE_KERNEL_EXEC;
-
-	set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, prot));
-	return 0;
-err:
-	return result;
+	return kernel_ident_mapping_init(&info, pgd, mstart, mend);
 }
 
 static void *alloc_pgt_page(void *data)
@@ -299,6 +270,8 @@ int machine_kexec_prepare(struct kimage *image)
 	unsigned long start_pgtable;
 	int result;
 
+	INIT_LIST_HEAD(&image->arch.pages);
+
 	/* Calculate the offsets */
 	start_pgtable = page_to_pfn(image->control_code_page) << PAGE_SHIFT;
 
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCHv4, REBASED 0/4] x86: Reduce code duplication on page table initialization
  2024-10-16 11:14 [PATCHv4, REBASED 0/4] x86: Reduce code duplication on page table initialization Kirill A. Shutemov
                   ` (3 preceding siblings ...)
  2024-10-16 11:14 ` [PATCHv4, REBASED 4/4] x86/64/kexec: Rewrite init_transition_pgtable() with kernel_ident_mapping_init() Kirill A. Shutemov
@ 2024-10-29 15:09 ` Kirill A. Shutemov
  2024-10-31 16:21   ` Dave Hansen
  2025-03-10 19:45 ` Ingo Molnar
  5 siblings, 1 reply; 17+ messages in thread
From: Kirill A. Shutemov @ 2024-10-29 15:09 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Rafael J. Wysocki, Andy Lutomirski,
	Peter Zijlstra, Baoquan He
  Cc: Ard Biesheuvel, Tom Lendacky, Andrew Morton, Thomas Zimmermann,
	Sean Christopherson, linux-kernel, linux-acpi

On Wed, Oct 16, 2024 at 02:14:54PM +0300, Kirill A. Shutemov wrote:
> Use kernel_ident_mapping_init() to initialize kernel page tables where
> possible, replacing manual initialization, reducing code duplication.
> 
> v4:
>  - Reviewed-bys from Kai;
>  - Fix comment in acpi_mp_setup_reset() (Rafael);
> v3:
>  - Reviewed-bys from Tom;
>  - Improve commit messages;
> v2:
>  - A separate patch to change what PA is mapped at relocate_kernel() VA.
>  - Improve commit messages;
>  - Add Reveiwed-by from Kai;
> 
> Kirill A. Shutemov (4):
>   x86/mm/ident_map: Fix virtual address wrap to zero
>   x86/acpi: Replace manual page table initialization with
>     kernel_ident_mapping_init()
>   x86/64/kexec: Map original relocate_kernel() in
>     init_transition_pgtable()
>   x86/64/kexec: Rewrite init_transition_pgtable() with
>     kernel_ident_mapping_init()
> 
>  arch/x86/include/asm/kexec.h       |  5 +-
>  arch/x86/kernel/acpi/madt_wakeup.c | 73 +++++-------------------
>  arch/x86/kernel/machine_kexec_64.c | 89 +++++++++++-------------------
>  arch/x86/mm/ident_map.c            | 14 +----
>  4 files changed, 50 insertions(+), 131 deletions(-)

Any feedback on this series?

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHv4, REBASED 1/4] x86/mm/ident_map: Fix virtual address wrap to zero
  2024-10-16 11:14 ` [PATCHv4, REBASED 1/4] x86/mm/ident_map: Fix virtual address wrap to zero Kirill A. Shutemov
@ 2024-10-30 11:47   ` Borislav Petkov
  2024-10-31 10:11     ` Kirill A. Shutemov
  2025-03-10 19:50   ` [tip: x86/mm] x86/mm/ident_map: Fix theoretical virtual address overflow " tip-bot2 for Kirill A. Shutemov
  2025-03-19 11:04   ` [tip: x86/core] " tip-bot2 for Kirill A. Shutemov
  2 siblings, 1 reply; 17+ messages in thread
From: Borislav Petkov @ 2024-10-30 11:47 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, H. Peter Anvin,
	Rafael J. Wysocki, Andy Lutomirski, Peter Zijlstra, Baoquan He,
	Ard Biesheuvel, Tom Lendacky, Andrew Morton, Thomas Zimmermann,
	Sean Christopherson, linux-kernel, linux-acpi, Kai Huang

On Wed, Oct 16, 2024 at 02:14:55PM +0300, Kirill A. Shutemov wrote:
> Calculation of 'next' virtual address doesn't protect against wrapping
> to zero. It can result in page table corruption and hang. The
> problematic case is possible if user sets high x86_mapping_info::offset.
> 
> The wrapping to zero only occurs if the top PGD entry is accessed.
> There are no such users in the upstream. Only hibernate_64.c uses
> x86_mapping_info::offset, and it operates on the direct mapping range,
> which is not the top PGD entry.
> 
> Replace manual 'next' calculation with p?d_addr_end() which handles
> wrapping correctly.

So this is a fix for a theoretical issue as it cannot happen currently?

Can we call that out in the commit message so that the stable AI doesn't pick
it up?

And which commit is it fixing?

aece27851d44 ("x86, 64bit, mm: Add generic kernel/ident mapping helper")
perhaps?

Always add Fixes: tags when a patch is fixing something - you know that.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHv4, REBASED 1/4] x86/mm/ident_map: Fix virtual address wrap to zero
  2024-10-30 11:47   ` Borislav Petkov
@ 2024-10-31 10:11     ` Kirill A. Shutemov
  2024-10-31 13:59       ` Borislav Petkov
  0 siblings, 1 reply; 17+ messages in thread
From: Kirill A. Shutemov @ 2024-10-31 10:11 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, H. Peter Anvin,
	Rafael J. Wysocki, Andy Lutomirski, Peter Zijlstra, Baoquan He,
	Ard Biesheuvel, Tom Lendacky, Andrew Morton, Thomas Zimmermann,
	Sean Christopherson, linux-kernel, linux-acpi, Kai Huang

On Wed, Oct 30, 2024 at 12:47:12PM +0100, Borislav Petkov wrote:
> On Wed, Oct 16, 2024 at 02:14:55PM +0300, Kirill A. Shutemov wrote:
> > Calculation of 'next' virtual address doesn't protect against wrapping
> > to zero. It can result in page table corruption and hang. The
> > problematic case is possible if user sets high x86_mapping_info::offset.
> > 
> > The wrapping to zero only occurs if the top PGD entry is accessed.
> > There are no such users in the upstream. Only hibernate_64.c uses
> > x86_mapping_info::offset, and it operates on the direct mapping range,
> > which is not the top PGD entry.
> > 
> > Replace manual 'next' calculation with p?d_addr_end() which handles
> > wrapping correctly.
> 
> So this is a fix for a theoretical issue as it cannot happen currently?

Right.

> Can we call that out in the commit message so that the stable AI doesn't pick
> it up?

Do we have magic words for that?

I tried to express that in the second paragraph: "no such users in the
upstream".

> And which commit is it fixing?
> 
> aece27851d44 ("x86, 64bit, mm: Add generic kernel/ident mapping helper")
> perhaps?

This one is closer:

e4630fdd4763 ("x86/power/64: Always create temporary identity mapping correctly")

It adds x86_mapping_info::offset.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHv4, REBASED 1/4] x86/mm/ident_map: Fix virtual address wrap to zero
  2024-10-31 10:11     ` Kirill A. Shutemov
@ 2024-10-31 13:59       ` Borislav Petkov
  2024-11-04  9:26         ` Kirill A. Shutemov
  0 siblings, 1 reply; 17+ messages in thread
From: Borislav Petkov @ 2024-10-31 13:59 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, H. Peter Anvin,
	Rafael J. Wysocki, Andy Lutomirski, Peter Zijlstra, Baoquan He,
	Ard Biesheuvel, Tom Lendacky, Andrew Morton, Thomas Zimmermann,
	Sean Christopherson, linux-kernel, linux-acpi, Kai Huang

On Thu, Oct 31, 2024 at 12:11:52PM +0200, Kirill A. Shutemov wrote:
> Do we have magic words for that?

No clue.

> I tried to express that in the second paragraph: "no such users in the
> upstream".

Right, so perhaps better to spell it out explicitly:

"Backporter's note:

This fixes a theoretical issue only and there's no need to backport it to
stable."

at the end of the commit message.

> > And which commit is it fixing?
> > 
> > aece27851d44 ("x86, 64bit, mm: Add generic kernel/ident mapping helper")
> > perhaps?
> 
> This one is closer:
> 
> e4630fdd4763 ("x86/power/64: Always create temporary identity mapping correctly")
> 
> It adds x86_mapping_info::offset.

But aece27851d44 has the faulty check...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHv4, REBASED 3/4] x86/64/kexec: Map original relocate_kernel() in init_transition_pgtable()
  2024-10-16 11:14 ` [PATCHv4, REBASED 3/4] x86/64/kexec: Map original relocate_kernel() in init_transition_pgtable() Kirill A. Shutemov
@ 2024-10-31 14:23   ` Borislav Petkov
  0 siblings, 0 replies; 17+ messages in thread
From: Borislav Petkov @ 2024-10-31 14:23 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, H. Peter Anvin,
	Rafael J. Wysocki, Andy Lutomirski, Peter Zijlstra, Baoquan He,
	Ard Biesheuvel, Tom Lendacky, Andrew Morton, Thomas Zimmermann,
	Sean Christopherson, linux-kernel, linux-acpi, Kai Huang

On Wed, Oct 16, 2024 at 02:14:57PM +0300, Kirill A. Shutemov wrote:
> The init_transition_pgtable() function sets up transitional page tables.
> It ensures that the relocate_kernel() function is present in the
> identity mapping at the same location as in the kernel page tables.
> relocate_kernel() switches to the identity mapping, and the function
> must be present at the same location in the virtual address space before
> and after switching page tables.
> 
> init_transition_pgtable() maps a copy of relocate_kernel() in
> image->control_code_page at the relocate_kernel() virtual address, but
> the original physical address of relocate_kernel() would also work.
> 
> It is safe to use original relocate_kernel() physical address cannot be
					^^^^^^^^^^^^^^^

something went missing here in that sentence. Reads weird.

> overwritten until swap_pages() is called, and the relocate_kernel()
> virtual address will not be used by then.

...

> diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
> index 9c9ac606893e..645690e81c2d 100644
> --- a/arch/x86/kernel/machine_kexec_64.c
> +++ b/arch/x86/kernel/machine_kexec_64.c
> @@ -157,7 +157,7 @@ static int init_transition_pgtable(struct kimage *image, pgd_t *pgd)
>  	pte_t *pte;
>  
>  	vaddr = (unsigned long)relocate_kernel;
> -	paddr = __pa(page_address(image->control_code_page)+PAGE_SIZE);
> +	paddr = __pa(relocate_kernel);
>  	pgd += pgd_index(vaddr);
>  	if (!pgd_present(*pgd)) {
>  		p4d = (p4d_t *)get_zeroed_page(GFP_KERNEL);

Such changes always make me nervous so I'd queue them only after this merge
window is over so that they can get maximal testing in next. Unless someone
objects...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHv4, REBASED 0/4] x86: Reduce code duplication on page table initialization
  2024-10-29 15:09 ` [PATCHv4, REBASED 0/4] x86: Reduce code duplication on page table initialization Kirill A. Shutemov
@ 2024-10-31 16:21   ` Dave Hansen
  0 siblings, 0 replies; 17+ messages in thread
From: Dave Hansen @ 2024-10-31 16:21 UTC (permalink / raw)
  To: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, Rafael J. Wysocki,
	Andy Lutomirski, Peter Zijlstra, Baoquan He
  Cc: Ard Biesheuvel, Tom Lendacky, Andrew Morton, Thomas Zimmermann,
	Sean Christopherson, linux-kernel, linux-acpi

On 10/29/24 08:09, Kirill A. Shutemov wrote:
> Any feedback on this series?

It's tempting because it removes so much code, but it's also worrying
because these code paths are notoriously hard to test.

I'd be much happier if you'd send this after 6.13-rc1 drops and we give
it as much testing time as possible.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHv4, REBASED 1/4] x86/mm/ident_map: Fix virtual address wrap to zero
  2024-10-31 13:59       ` Borislav Petkov
@ 2024-11-04  9:26         ` Kirill A. Shutemov
  0 siblings, 0 replies; 17+ messages in thread
From: Kirill A. Shutemov @ 2024-11-04  9:26 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, H. Peter Anvin,
	Rafael J. Wysocki, Andy Lutomirski, Peter Zijlstra, Baoquan He,
	Ard Biesheuvel, Tom Lendacky, Andrew Morton, Thomas Zimmermann,
	Sean Christopherson, linux-kernel, linux-acpi, Kai Huang

On Thu, Oct 31, 2024 at 02:59:16PM +0100, Borislav Petkov wrote:
> On Thu, Oct 31, 2024 at 12:11:52PM +0200, Kirill A. Shutemov wrote:
> > Do we have magic words for that?
> 
> No clue.
> 
> > I tried to express that in the second paragraph: "no such users in the
> > upstream".
> 
> Right, so perhaps better to spell it out explicitly:
> 
> "Backporter's note:
> 
> This fixes a theoretical issue only and there's no need to backport it to
> stable."
> 
> at the end of the commit message.

Okay.

> 
> > > And which commit is it fixing?
> > > 
> > > aece27851d44 ("x86, 64bit, mm: Add generic kernel/ident mapping helper")
> > > perhaps?
> > 
> > This one is closer:
> > 
> > e4630fdd4763 ("x86/power/64: Always create temporary identity mapping correctly")
> > 
> > It adds x86_mapping_info::offset.
> 
> But aece27851d44 has the faulty check...

It cannot be triggered without 'offset'.

I'll put both.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHv4, REBASED 0/4] x86: Reduce code duplication on page table initialization
  2024-10-16 11:14 [PATCHv4, REBASED 0/4] x86: Reduce code duplication on page table initialization Kirill A. Shutemov
                   ` (4 preceding siblings ...)
  2024-10-29 15:09 ` [PATCHv4, REBASED 0/4] x86: Reduce code duplication on page table initialization Kirill A. Shutemov
@ 2025-03-10 19:45 ` Ingo Molnar
  5 siblings, 0 replies; 17+ messages in thread
From: Ingo Molnar @ 2025-03-10 19:45 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Rafael J. Wysocki, Andy Lutomirski,
	Peter Zijlstra, Baoquan He, Ard Biesheuvel, Tom Lendacky,
	Andrew Morton, Thomas Zimmermann, Sean Christopherson,
	linux-kernel, linux-acpi, David Woodhouse


* Kirill A. Shutemov <kirill.shutemov@linux.intel.com> wrote:

> Use kernel_ident_mapping_init() to initialize kernel page tables where
> possible, replacing manual initialization, reducing code duplication.
> 
> v4:
>  - Reviewed-bys from Kai;
>  - Fix comment in acpi_mp_setup_reset() (Rafael);
> v3:
>  - Reviewed-bys from Tom;
>  - Improve commit messages;
> v2:
>  - A separate patch to change what PA is mapped at relocate_kernel() VA.
>  - Improve commit messages;
>  - Add Reveiwed-by from Kai;
> 
> Kirill A. Shutemov (4):
>   x86/mm/ident_map: Fix virtual address wrap to zero
>   x86/acpi: Replace manual page table initialization with kernel_ident_mapping_init()
>   x86/64/kexec: Map original relocate_kernel() in init_transition_pgtable()
>   x86/64/kexec: Rewrite init_transition_pgtable() with kernel_ident_mapping_init()
> 
>  arch/x86/include/asm/kexec.h       |  5 +-
>  arch/x86/kernel/acpi/madt_wakeup.c | 73 +++++-------------------
>  arch/x86/kernel/machine_kexec_64.c | 89 +++++++++++-------------------
>  arch/x86/mm/ident_map.c            | 14 +----
>  4 files changed, 50 insertions(+), 131 deletions(-)

So looks like this series feel between the cracks during the holiday 
season.

To help move them along, I've fixed up the first patch with the review 
feedback clarification requests, and applied patch #1 and #2 to 
tip:x86/mm:

  4f10ec03fe1e ("x86/mm/ident_map: Fix theoretical virtual address overflow to zero")
  376daf20eda4 ("x86/acpi: Replace manual page table initialization with kernel_ident_mapping_init()")

Patches #3 and #4 don't apply anymore, due to interference by other 
work with commits like:

  4b5bc2ec9a23 ("x86/kexec: Allocate PGD for x86_64 transition page tables separately")

If the remaining patches are still relevant, mind porting them to 
latest -tip?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [tip: x86/mm] x86/mm/ident_map: Fix theoretical virtual address overflow to zero
  2024-10-16 11:14 ` [PATCHv4, REBASED 1/4] x86/mm/ident_map: Fix virtual address wrap to zero Kirill A. Shutemov
  2024-10-30 11:47   ` Borislav Petkov
@ 2025-03-10 19:50   ` tip-bot2 for Kirill A. Shutemov
  2025-03-19 11:04   ` [tip: x86/core] " tip-bot2 for Kirill A. Shutemov
  2 siblings, 0 replies; 17+ messages in thread
From: tip-bot2 for Kirill A. Shutemov @ 2025-03-10 19:50 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Kirill A. Shutemov, Ingo Molnar, Kai Huang, Tom Lendacky,
	Andy Lutomirski, Linus Torvalds, x86, linux-kernel

The following commit has been merged into the x86/mm branch of tip:

Commit-ID:     4f10ec03fe1ed12479134be33ddf006382744651
Gitweb:        https://git.kernel.org/tip/4f10ec03fe1ed12479134be33ddf006382744651
Author:        Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
AuthorDate:    Wed, 16 Oct 2024 14:14:55 +03:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Mon, 10 Mar 2025 20:31:30 +01:00

x86/mm/ident_map: Fix theoretical virtual address overflow to zero

The current calculation of the 'next' virtual address in the
page table initialization functions in arch/x86/mm/ident_map.c
doesn't protect against wrapping to zero.

This is a theoretical issue that cannot happen currently,
the problematic case is possible only if the user sets a
high enough x86_mapping_info::offset value - which no
current code in the upstream kernel does.

( The wrapping to zero only occurs if the top PGD entry is accessed.
  There are no such users upstream. Only hibernate_64.c uses
  x86_mapping_info::offset, and it operates on the direct mapping
  range, which is not the top PGD entry. )

Should such an overflow happen, it can result in page table
corruption and a hang.

To future-proof this code, replace the manual 'next' calculation
with p?d_addr_end() which handles wrapping correctly.

[ Backporter's note: there's no need to backport this patch. ]

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20241016111458.846228-2-kirill.shutemov@linux.intel.com
---
 arch/x86/mm/ident_map.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c
index 5ab7bd2..bd5d101 100644
--- a/arch/x86/mm/ident_map.c
+++ b/arch/x86/mm/ident_map.c
@@ -101,9 +101,7 @@ static int ident_pud_init(struct x86_mapping_info *info, pud_t *pud_page,
 		pmd_t *pmd;
 		bool use_gbpage;
 
-		next = (addr & PUD_MASK) + PUD_SIZE;
-		if (next > end)
-			next = end;
+		next = pud_addr_end(addr, end);
 
 		/* if this is already a gbpage, this portion is already mapped */
 		if (pud_leaf(*pud))
@@ -154,10 +152,7 @@ static int ident_p4d_init(struct x86_mapping_info *info, p4d_t *p4d_page,
 		p4d_t *p4d = p4d_page + p4d_index(addr);
 		pud_t *pud;
 
-		next = (addr & P4D_MASK) + P4D_SIZE;
-		if (next > end)
-			next = end;
-
+		next = p4d_addr_end(addr, end);
 		if (p4d_present(*p4d)) {
 			pud = pud_offset(p4d, 0);
 			result = ident_pud_init(info, pud, addr, next);
@@ -199,10 +194,7 @@ int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page,
 		pgd_t *pgd = pgd_page + pgd_index(addr);
 		p4d_t *p4d;
 
-		next = (addr & PGDIR_MASK) + PGDIR_SIZE;
-		if (next > end)
-			next = end;
-
+		next = pgd_addr_end(addr, end);
 		if (pgd_present(*pgd)) {
 			p4d = p4d_offset(pgd, 0);
 			result = ident_p4d_init(info, p4d, addr, next);

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [tip: x86/mm] x86/acpi: Replace manual page table initialization with kernel_ident_mapping_init()
  2024-10-16 11:14 ` [PATCHv4, REBASED 2/4] x86/acpi: Replace manual page table initialization with kernel_ident_mapping_init() Kirill A. Shutemov
@ 2025-03-10 19:50   ` tip-bot2 for Kirill A. Shutemov
  2025-03-19 11:04   ` [tip: x86/core] " tip-bot2 for Kirill A. Shutemov
  1 sibling, 0 replies; 17+ messages in thread
From: tip-bot2 for Kirill A. Shutemov @ 2025-03-10 19:50 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Kirill A. Shutemov, Ingo Molnar, Kai Huang, Tom Lendacky,
	Rafael J. Wysocki, Andy Lutomirski, Linus Torvalds, x86,
	linux-kernel

The following commit has been merged into the x86/mm branch of tip:

Commit-ID:     376daf20eda4898a49a4746f225153efb4c094e7
Gitweb:        https://git.kernel.org/tip/376daf20eda4898a49a4746f225153efb4c094e7
Author:        Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
AuthorDate:    Wed, 16 Oct 2024 14:14:56 +03:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Mon, 10 Mar 2025 20:31:28 +01:00

x86/acpi: Replace manual page table initialization with kernel_ident_mapping_init()

The init_transition_pgtable() functions maps the page with
asm_acpi_mp_play_dead() into an identity mapping.

Replace open-coded manual page table initialization with
kernel_ident_mapping_init() to avoid code duplication.

Use x86_mapping_info::offset to get the page mapped at the
correct location.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20241016111458.846228-3-kirill.shutemov@linux.intel.com
---
 arch/x86/kernel/acpi/madt_wakeup.c | 73 +++++------------------------
 1 file changed, 15 insertions(+), 58 deletions(-)

diff --git a/arch/x86/kernel/acpi/madt_wakeup.c b/arch/x86/kernel/acpi/madt_wakeup.c
index d5ef621..f36f284 100644
--- a/arch/x86/kernel/acpi/madt_wakeup.c
+++ b/arch/x86/kernel/acpi/madt_wakeup.c
@@ -70,58 +70,6 @@ static void __init free_pgt_page(void *pgt, void *dummy)
 	return memblock_free(pgt, PAGE_SIZE);
 }
 
-/*
- * Make sure asm_acpi_mp_play_dead() is present in the identity mapping at
- * the same place as in the kernel page tables. asm_acpi_mp_play_dead() switches
- * to the identity mapping and the function has be present at the same spot in
- * the virtual address space before and after switching page tables.
- */
-static int __init init_transition_pgtable(pgd_t *pgd)
-{
-	pgprot_t prot = PAGE_KERNEL_EXEC_NOENC;
-	unsigned long vaddr, paddr;
-	p4d_t *p4d;
-	pud_t *pud;
-	pmd_t *pmd;
-	pte_t *pte;
-
-	vaddr = (unsigned long)asm_acpi_mp_play_dead;
-	pgd += pgd_index(vaddr);
-	if (!pgd_present(*pgd)) {
-		p4d = (p4d_t *)alloc_pgt_page(NULL);
-		if (!p4d)
-			return -ENOMEM;
-		set_pgd(pgd, __pgd(__pa(p4d) | _KERNPG_TABLE));
-	}
-	p4d = p4d_offset(pgd, vaddr);
-	if (!p4d_present(*p4d)) {
-		pud = (pud_t *)alloc_pgt_page(NULL);
-		if (!pud)
-			return -ENOMEM;
-		set_p4d(p4d, __p4d(__pa(pud) | _KERNPG_TABLE));
-	}
-	pud = pud_offset(p4d, vaddr);
-	if (!pud_present(*pud)) {
-		pmd = (pmd_t *)alloc_pgt_page(NULL);
-		if (!pmd)
-			return -ENOMEM;
-		set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE));
-	}
-	pmd = pmd_offset(pud, vaddr);
-	if (!pmd_present(*pmd)) {
-		pte = (pte_t *)alloc_pgt_page(NULL);
-		if (!pte)
-			return -ENOMEM;
-		set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE));
-	}
-	pte = pte_offset_kernel(pmd, vaddr);
-
-	paddr = __pa(vaddr);
-	set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, prot));
-
-	return 0;
-}
-
 static int __init acpi_mp_setup_reset(u64 reset_vector)
 {
 	struct x86_mapping_info info = {
@@ -130,6 +78,7 @@ static int __init acpi_mp_setup_reset(u64 reset_vector)
 		.page_flag      = __PAGE_KERNEL_LARGE_EXEC,
 		.kernpg_flag    = _KERNPG_TABLE_NOENC,
 	};
+	unsigned long mstart, mend;
 	pgd_t *pgd;
 
 	pgd = alloc_pgt_page(NULL);
@@ -137,8 +86,6 @@ static int __init acpi_mp_setup_reset(u64 reset_vector)
 		return -ENOMEM;
 
 	for (int i = 0; i < nr_pfn_mapped; i++) {
-		unsigned long mstart, mend;
-
 		mstart = pfn_mapped[i].start << PAGE_SHIFT;
 		mend   = pfn_mapped[i].end << PAGE_SHIFT;
 		if (kernel_ident_mapping_init(&info, pgd, mstart, mend)) {
@@ -147,14 +94,24 @@ static int __init acpi_mp_setup_reset(u64 reset_vector)
 		}
 	}
 
-	if (kernel_ident_mapping_init(&info, pgd,
-				      PAGE_ALIGN_DOWN(reset_vector),
-				      PAGE_ALIGN(reset_vector + 1))) {
+	mstart = PAGE_ALIGN_DOWN(reset_vector);
+	mend = mstart + PAGE_SIZE;
+	if (kernel_ident_mapping_init(&info, pgd, mstart, mend)) {
 		kernel_ident_mapping_free(&info, pgd);
 		return -ENOMEM;
 	}
 
-	if (init_transition_pgtable(pgd)) {
+	/*
+	 * Make sure asm_acpi_mp_play_dead() is present in the identity mapping
+	 * at the same place as in the kernel page tables.
+	 * asm_acpi_mp_play_dead() switches to the identity mapping and the
+	 * function must be present at the same spot in the virtual address space
+	 * before and after switching page tables.
+	 */
+	info.offset = __START_KERNEL_map - phys_base;
+	mstart = PAGE_ALIGN_DOWN(__pa(asm_acpi_mp_play_dead));
+	mend = mstart + PAGE_SIZE;
+	if (kernel_ident_mapping_init(&info, pgd, mstart, mend)) {
 		kernel_ident_mapping_free(&info, pgd);
 		return -ENOMEM;
 	}

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [tip: x86/core] x86/mm/ident_map: Fix theoretical virtual address overflow to zero
  2024-10-16 11:14 ` [PATCHv4, REBASED 1/4] x86/mm/ident_map: Fix virtual address wrap to zero Kirill A. Shutemov
  2024-10-30 11:47   ` Borislav Petkov
  2025-03-10 19:50   ` [tip: x86/mm] x86/mm/ident_map: Fix theoretical virtual address overflow " tip-bot2 for Kirill A. Shutemov
@ 2025-03-19 11:04   ` tip-bot2 for Kirill A. Shutemov
  2 siblings, 0 replies; 17+ messages in thread
From: tip-bot2 for Kirill A. Shutemov @ 2025-03-19 11:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Kirill A. Shutemov, Ingo Molnar, Kai Huang, Tom Lendacky,
	Andy Lutomirski, Linus Torvalds, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     f666c92090a41ac5524dade63ff96b3adcf8c2ab
Gitweb:        https://git.kernel.org/tip/f666c92090a41ac5524dade63ff96b3adcf8c2ab
Author:        Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
AuthorDate:    Wed, 16 Oct 2024 14:14:55 +03:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Wed, 19 Mar 2025 11:12:29 +01:00

x86/mm/ident_map: Fix theoretical virtual address overflow to zero

The current calculation of the 'next' virtual address in the
page table initialization functions in arch/x86/mm/ident_map.c
doesn't protect against wrapping to zero.

This is a theoretical issue that cannot happen currently,
the problematic case is possible only if the user sets a
high enough x86_mapping_info::offset value - which no
current code in the upstream kernel does.

( The wrapping to zero only occurs if the top PGD entry is accessed.
  There are no such users upstream. Only hibernate_64.c uses
  x86_mapping_info::offset, and it operates on the direct mapping
  range, which is not the top PGD entry. )

Should such an overflow happen, it can result in page table
corruption and a hang.

To future-proof this code, replace the manual 'next' calculation
with p?d_addr_end() which handles wrapping correctly.

[ Backporter's note: there's no need to backport this patch. ]

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20241016111458.846228-2-kirill.shutemov@linux.intel.com
---
 arch/x86/mm/ident_map.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c
index 5ab7bd2..bd5d101 100644
--- a/arch/x86/mm/ident_map.c
+++ b/arch/x86/mm/ident_map.c
@@ -101,9 +101,7 @@ static int ident_pud_init(struct x86_mapping_info *info, pud_t *pud_page,
 		pmd_t *pmd;
 		bool use_gbpage;
 
-		next = (addr & PUD_MASK) + PUD_SIZE;
-		if (next > end)
-			next = end;
+		next = pud_addr_end(addr, end);
 
 		/* if this is already a gbpage, this portion is already mapped */
 		if (pud_leaf(*pud))
@@ -154,10 +152,7 @@ static int ident_p4d_init(struct x86_mapping_info *info, p4d_t *p4d_page,
 		p4d_t *p4d = p4d_page + p4d_index(addr);
 		pud_t *pud;
 
-		next = (addr & P4D_MASK) + P4D_SIZE;
-		if (next > end)
-			next = end;
-
+		next = p4d_addr_end(addr, end);
 		if (p4d_present(*p4d)) {
 			pud = pud_offset(p4d, 0);
 			result = ident_pud_init(info, pud, addr, next);
@@ -199,10 +194,7 @@ int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page,
 		pgd_t *pgd = pgd_page + pgd_index(addr);
 		p4d_t *p4d;
 
-		next = (addr & PGDIR_MASK) + PGDIR_SIZE;
-		if (next > end)
-			next = end;
-
+		next = pgd_addr_end(addr, end);
 		if (pgd_present(*pgd)) {
 			p4d = p4d_offset(pgd, 0);
 			result = ident_p4d_init(info, p4d, addr, next);

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [tip: x86/core] x86/acpi: Replace manual page table initialization with kernel_ident_mapping_init()
  2024-10-16 11:14 ` [PATCHv4, REBASED 2/4] x86/acpi: Replace manual page table initialization with kernel_ident_mapping_init() Kirill A. Shutemov
  2025-03-10 19:50   ` [tip: x86/mm] " tip-bot2 for Kirill A. Shutemov
@ 2025-03-19 11:04   ` tip-bot2 for Kirill A. Shutemov
  1 sibling, 0 replies; 17+ messages in thread
From: tip-bot2 for Kirill A. Shutemov @ 2025-03-19 11:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Kirill A. Shutemov, Ingo Molnar, Kai Huang, Tom Lendacky,
	Rafael J. Wysocki, Andy Lutomirski, Linus Torvalds, x86,
	linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     775d37d8f01ea1349be8bd7cc8651b51814c48df
Gitweb:        https://git.kernel.org/tip/775d37d8f01ea1349be8bd7cc8651b51814c48df
Author:        Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
AuthorDate:    Wed, 16 Oct 2024 14:14:56 +03:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Wed, 19 Mar 2025 11:12:29 +01:00

x86/acpi: Replace manual page table initialization with kernel_ident_mapping_init()

The init_transition_pgtable() functions maps the page with
asm_acpi_mp_play_dead() into an identity mapping.

Replace open-coded manual page table initialization with
kernel_ident_mapping_init() to avoid code duplication.

Use x86_mapping_info::offset to get the page mapped at the
correct location.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20241016111458.846228-3-kirill.shutemov@linux.intel.com
---
 arch/x86/kernel/acpi/madt_wakeup.c | 73 +++++------------------------
 1 file changed, 15 insertions(+), 58 deletions(-)

diff --git a/arch/x86/kernel/acpi/madt_wakeup.c b/arch/x86/kernel/acpi/madt_wakeup.c
index d5ef621..f36f284 100644
--- a/arch/x86/kernel/acpi/madt_wakeup.c
+++ b/arch/x86/kernel/acpi/madt_wakeup.c
@@ -70,58 +70,6 @@ static void __init free_pgt_page(void *pgt, void *dummy)
 	return memblock_free(pgt, PAGE_SIZE);
 }
 
-/*
- * Make sure asm_acpi_mp_play_dead() is present in the identity mapping at
- * the same place as in the kernel page tables. asm_acpi_mp_play_dead() switches
- * to the identity mapping and the function has be present at the same spot in
- * the virtual address space before and after switching page tables.
- */
-static int __init init_transition_pgtable(pgd_t *pgd)
-{
-	pgprot_t prot = PAGE_KERNEL_EXEC_NOENC;
-	unsigned long vaddr, paddr;
-	p4d_t *p4d;
-	pud_t *pud;
-	pmd_t *pmd;
-	pte_t *pte;
-
-	vaddr = (unsigned long)asm_acpi_mp_play_dead;
-	pgd += pgd_index(vaddr);
-	if (!pgd_present(*pgd)) {
-		p4d = (p4d_t *)alloc_pgt_page(NULL);
-		if (!p4d)
-			return -ENOMEM;
-		set_pgd(pgd, __pgd(__pa(p4d) | _KERNPG_TABLE));
-	}
-	p4d = p4d_offset(pgd, vaddr);
-	if (!p4d_present(*p4d)) {
-		pud = (pud_t *)alloc_pgt_page(NULL);
-		if (!pud)
-			return -ENOMEM;
-		set_p4d(p4d, __p4d(__pa(pud) | _KERNPG_TABLE));
-	}
-	pud = pud_offset(p4d, vaddr);
-	if (!pud_present(*pud)) {
-		pmd = (pmd_t *)alloc_pgt_page(NULL);
-		if (!pmd)
-			return -ENOMEM;
-		set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE));
-	}
-	pmd = pmd_offset(pud, vaddr);
-	if (!pmd_present(*pmd)) {
-		pte = (pte_t *)alloc_pgt_page(NULL);
-		if (!pte)
-			return -ENOMEM;
-		set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE));
-	}
-	pte = pte_offset_kernel(pmd, vaddr);
-
-	paddr = __pa(vaddr);
-	set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, prot));
-
-	return 0;
-}
-
 static int __init acpi_mp_setup_reset(u64 reset_vector)
 {
 	struct x86_mapping_info info = {
@@ -130,6 +78,7 @@ static int __init acpi_mp_setup_reset(u64 reset_vector)
 		.page_flag      = __PAGE_KERNEL_LARGE_EXEC,
 		.kernpg_flag    = _KERNPG_TABLE_NOENC,
 	};
+	unsigned long mstart, mend;
 	pgd_t *pgd;
 
 	pgd = alloc_pgt_page(NULL);
@@ -137,8 +86,6 @@ static int __init acpi_mp_setup_reset(u64 reset_vector)
 		return -ENOMEM;
 
 	for (int i = 0; i < nr_pfn_mapped; i++) {
-		unsigned long mstart, mend;
-
 		mstart = pfn_mapped[i].start << PAGE_SHIFT;
 		mend   = pfn_mapped[i].end << PAGE_SHIFT;
 		if (kernel_ident_mapping_init(&info, pgd, mstart, mend)) {
@@ -147,14 +94,24 @@ static int __init acpi_mp_setup_reset(u64 reset_vector)
 		}
 	}
 
-	if (kernel_ident_mapping_init(&info, pgd,
-				      PAGE_ALIGN_DOWN(reset_vector),
-				      PAGE_ALIGN(reset_vector + 1))) {
+	mstart = PAGE_ALIGN_DOWN(reset_vector);
+	mend = mstart + PAGE_SIZE;
+	if (kernel_ident_mapping_init(&info, pgd, mstart, mend)) {
 		kernel_ident_mapping_free(&info, pgd);
 		return -ENOMEM;
 	}
 
-	if (init_transition_pgtable(pgd)) {
+	/*
+	 * Make sure asm_acpi_mp_play_dead() is present in the identity mapping
+	 * at the same place as in the kernel page tables.
+	 * asm_acpi_mp_play_dead() switches to the identity mapping and the
+	 * function must be present at the same spot in the virtual address space
+	 * before and after switching page tables.
+	 */
+	info.offset = __START_KERNEL_map - phys_base;
+	mstart = PAGE_ALIGN_DOWN(__pa(asm_acpi_mp_play_dead));
+	mend = mstart + PAGE_SIZE;
+	if (kernel_ident_mapping_init(&info, pgd, mstart, mend)) {
 		kernel_ident_mapping_free(&info, pgd);
 		return -ENOMEM;
 	}

^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2025-03-19 11:04 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-16 11:14 [PATCHv4, REBASED 0/4] x86: Reduce code duplication on page table initialization Kirill A. Shutemov
2024-10-16 11:14 ` [PATCHv4, REBASED 1/4] x86/mm/ident_map: Fix virtual address wrap to zero Kirill A. Shutemov
2024-10-30 11:47   ` Borislav Petkov
2024-10-31 10:11     ` Kirill A. Shutemov
2024-10-31 13:59       ` Borislav Petkov
2024-11-04  9:26         ` Kirill A. Shutemov
2025-03-10 19:50   ` [tip: x86/mm] x86/mm/ident_map: Fix theoretical virtual address overflow " tip-bot2 for Kirill A. Shutemov
2025-03-19 11:04   ` [tip: x86/core] " tip-bot2 for Kirill A. Shutemov
2024-10-16 11:14 ` [PATCHv4, REBASED 2/4] x86/acpi: Replace manual page table initialization with kernel_ident_mapping_init() Kirill A. Shutemov
2025-03-10 19:50   ` [tip: x86/mm] " tip-bot2 for Kirill A. Shutemov
2025-03-19 11:04   ` [tip: x86/core] " tip-bot2 for Kirill A. Shutemov
2024-10-16 11:14 ` [PATCHv4, REBASED 3/4] x86/64/kexec: Map original relocate_kernel() in init_transition_pgtable() Kirill A. Shutemov
2024-10-31 14:23   ` Borislav Petkov
2024-10-16 11:14 ` [PATCHv4, REBASED 4/4] x86/64/kexec: Rewrite init_transition_pgtable() with kernel_ident_mapping_init() Kirill A. Shutemov
2024-10-29 15:09 ` [PATCHv4, REBASED 0/4] x86: Reduce code duplication on page table initialization Kirill A. Shutemov
2024-10-31 16:21   ` Dave Hansen
2025-03-10 19:45 ` Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox