All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/3] x86/devmem: move range_is_allowed() to drivers/char/mem.c
@ 2025-05-20 15:20 Arnd Bergmann
  2025-05-20 15:20 ` [PATCH 2/3] x86/devmem: remove phys_mem_access_prot_allowed() Arnd Bergmann
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Arnd Bergmann @ 2025-05-20 15:20 UTC (permalink / raw)
  To: Dave Hansen, Dan Williams
  Cc: Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, Greg Kroah-Hartman, Nikolay Borisov,
	linux-kernel, Arnd Bergmann

From: Arnd Bergmann <arnd@arndb.de>

The global inline function in include/linux/io.h causes problems
because of the unfortunate naming that is too generic, and because
it reintroduces a dependency on the PAGE_SIZE definition that should
not exist in this header file.

Something I had not seen during the earlier review is how the
x86 phys_mem_access_prot_allowed() is called directly after the
generic check for range_is_allowed(), so checking it again actually
has no effect at all, and the definition can be made local to
drivers/char/mem.c with no other caller.

Fixes: 1b3f2bd04d90 ("x86/devmem: Remove duplicate range_is_allowed() definition")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/x86/mm/pat/memtype.c |  3 ---
 drivers/char/mem.c        | 18 ++++++++++++++++++
 include/linux/io.h        | 21 ---------------------
 3 files changed, 18 insertions(+), 24 deletions(-)

diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c
index 2e7923844afe..fe24b8d2dc4b 100644
--- a/arch/x86/mm/pat/memtype.c
+++ b/arch/x86/mm/pat/memtype.c
@@ -789,9 +789,6 @@ int phys_mem_access_prot_allowed(struct file *file, unsigned long pfn,
 	if (!pat_enabled())
 		return 1;
 
-	if (!range_is_allowed(pfn, size))
-		return 0;
-
 	if (file->f_flags & O_DSYNC)
 		pcm = _PAGE_CACHE_MODE_UC_MINUS;
 
diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index 2a483369e255..85f963ce3b2d 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -347,6 +347,24 @@ static const struct vm_operations_struct mmap_mem_ops = {
 #endif
 };
 
+static int range_is_allowed(unsigned long pfn, unsigned long size)
+{
+	u64 from = ((u64)pfn) << PAGE_SHIFT;
+	u64 to = from + size;
+	u64 cursor = from;
+
+	if (!IS_ENABLED(CONFIG_STRICT_DEVMEM))
+		return 1;
+
+	while (cursor < to) {
+		if (!devmem_is_allowed(pfn))
+			return 0;
+		cursor += PAGE_SIZE;
+		pfn++;
+	}
+	return 1;
+}
+
 static int mmap_mem(struct file *file, struct vm_area_struct *vma)
 {
 	size_t size = vma->vm_end - vma->vm_start;
diff --git a/include/linux/io.h b/include/linux/io.h
index 28a238217aa6..8c0a8e8b6066 100644
--- a/include/linux/io.h
+++ b/include/linux/io.h
@@ -164,25 +164,4 @@ static inline void arch_io_free_memtype_wc(resource_size_t base,
 int devm_arch_io_reserve_memtype_wc(struct device *dev, resource_size_t start,
 				    resource_size_t size);
 
-#ifdef CONFIG_STRICT_DEVMEM
-static inline int range_is_allowed(unsigned long pfn, unsigned long size)
-{
-	u64 from = ((u64)pfn) << PAGE_SHIFT;
-	u64 to = from + size;
-	u64 cursor = from;
-
-	while (cursor < to) {
-		if (!devmem_is_allowed(pfn))
-			return 0;
-		cursor += PAGE_SIZE;
-		pfn++;
-	}
-	return 1;
-}
-#else
-static inline int range_is_allowed(unsigned long pfn, unsigned long size)
-{
-	return 1;
-}
-#endif
 #endif /* _LINUX_IO_H */
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/3] x86/devmem: remove phys_mem_access_prot_allowed()
  2025-05-20 15:20 [PATCH 1/3] x86/devmem: move range_is_allowed() to drivers/char/mem.c Arnd Bergmann
@ 2025-05-20 15:20 ` Arnd Bergmann
  2025-05-21 22:08   ` Dan Williams
  2025-05-20 15:20 ` [PATCH 3/3] [RFC] x86/devmem: remove low 1MB hack for x86-64 Arnd Bergmann
  2025-05-21 22:05 ` [PATCH 1/3] x86/devmem: move range_is_allowed() to drivers/char/mem.c Dan Williams
  2 siblings, 1 reply; 11+ messages in thread
From: Arnd Bergmann @ 2025-05-20 15:20 UTC (permalink / raw)
  To: Dave Hansen, Dan Williams
  Cc: Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, Greg Kroah-Hartman, Nikolay Borisov,
	linux-kernel, Arnd Bergmann

From: Arnd Bergmann <arnd@arndb.de>

phys_mem_access_prot_allowed() was originally a workaround for
certain 32-bit CPUs, but after commit 1886297ce0c8 ("x86/mm/pat:
Fix BUG_ON() in mmap_mem() on QEMU/i386"), it no longer does anything
special as the only remaining bit now is the same logic that follows in
phys_mem_access_prot(), mapping everything as normal memory except when
either O_DSYNC is set or the address is highmem.

Just remove the thing entirely.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/x86/mm/pat/memtype.c | 15 ---------------
 drivers/char/mem.c        | 10 ----------
 include/linux/pgtable.h   |  2 --
 3 files changed, 27 deletions(-)

diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c
index fe24b8d2dc4b..fec14f1f53a6 100644
--- a/arch/x86/mm/pat/memtype.c
+++ b/arch/x86/mm/pat/memtype.c
@@ -781,21 +781,6 @@ static inline void pgprot_set_cachemode(pgprot_t *prot, enum page_cache_mode pcm
 			 cachemode2protval(pcm));
 }
 
-int phys_mem_access_prot_allowed(struct file *file, unsigned long pfn,
-				unsigned long size, pgprot_t *vma_prot)
-{
-	enum page_cache_mode pcm = _PAGE_CACHE_MODE_WB;
-
-	if (!pat_enabled())
-		return 1;
-
-	if (file->f_flags & O_DSYNC)
-		pcm = _PAGE_CACHE_MODE_UC_MINUS;
-
-	pgprot_set_cachemode(vma_prot, pcm);
-	return 1;
-}
-
 /*
  * Change the memory type for the physical address range in kernel identity
  * mapping space if that range is a part of identity map.
diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index 85f963ce3b2d..30a5c0e0542a 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -256,12 +256,6 @@ static ssize_t write_mem(struct file *file, const char __user *buf,
 	return written;
 }
 
-int __weak phys_mem_access_prot_allowed(struct file *file,
-	unsigned long pfn, unsigned long size, pgprot_t *vma_prot)
-{
-	return 1;
-}
-
 #ifndef __HAVE_PHYS_MEM_ACCESS_PROT
 
 /*
@@ -387,10 +381,6 @@ static int mmap_mem(struct file *file, struct vm_area_struct *vma)
 	if (!range_is_allowed(vma->vm_pgoff, size))
 		return -EPERM;
 
-	if (!phys_mem_access_prot_allowed(file, vma->vm_pgoff, size,
-						&vma->vm_page_prot))
-		return -EINVAL;
-
 	vma->vm_page_prot = phys_mem_access_prot(file, vma->vm_pgoff,
 						 size,
 						 vma->vm_page_prot);
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index adbebe2c6ce4..15b5a6b0943f 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1764,8 +1764,6 @@ static inline int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
 #endif
 
 struct file;
-int phys_mem_access_prot_allowed(struct file *file, unsigned long pfn,
-			unsigned long size, pgprot_t *vma_prot);
 
 #ifndef CONFIG_X86_ESPFIX64
 static inline void init_espfix_bsp(void) { }
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/3] [RFC] x86/devmem: remove low 1MB hack for x86-64
  2025-05-20 15:20 [PATCH 1/3] x86/devmem: move range_is_allowed() to drivers/char/mem.c Arnd Bergmann
  2025-05-20 15:20 ` [PATCH 2/3] x86/devmem: remove phys_mem_access_prot_allowed() Arnd Bergmann
@ 2025-05-20 15:20 ` Arnd Bergmann
  2025-05-21 22:14   ` Dan Williams
  2025-05-21 22:05 ` [PATCH 1/3] x86/devmem: move range_is_allowed() to drivers/char/mem.c Dan Williams
  2 siblings, 1 reply; 11+ messages in thread
From: Arnd Bergmann @ 2025-05-20 15:20 UTC (permalink / raw)
  To: Dave Hansen, Dan Williams
  Cc: Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, Greg Kroah-Hartman, Nikolay Borisov,
	linux-kernel, Arnd Bergmann

From: Arnd Bergmann <arnd@arndb.de>

Traditionally, both reading and mapping anything in the low 1MB area is
allowed on x86, through a series of ugly hacks.  In combination with
features such as memory encryption, this keeps causing trouble and
requires building additional hacks on top.

Chances are that this is only really used for 32-bit machines, as the
usual users of this were dosemu, svgalib or ancient XFree86 versions,
none of which should be used on 64-bit kernels any more.

Remove both the custom devmem_is_allowed() and the custom
xlate_dev_mem_ptr() on 64-bit kernels, and use the normal implementation
based on phys_to_virt() instead for read/write access on the linear
map.

As a result, this makes x86-64 behave more like the other architecture
on /dev/mem, allowing read/write access only on actual system RAM and
only when CONFIG_STRICT_DEVMEM is disabled, while mmap() can be use
on MMIO pages with the normal restrictions.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
Unlike the other two patches in this series, this one is expected to
change the behavior on x86-64 kernels, which has the risk of
regressions, but seems worthwhile to me.

Are there any reasons left for keeping these hacks?
---
 arch/x86/Kconfig          | 3 ++-
 arch/x86/include/asm/io.h | 3 ++-
 arch/x86/mm/init.c        | 4 +++-
 arch/x86/mm/ioremap.c     | 2 ++
 4 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 47a3cd5ffc4f..635328b57e35 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -85,7 +85,7 @@ config X86
 	select ARCH_HAS_CURRENT_STACK_POINTER
 	select ARCH_HAS_DEBUG_VIRTUAL
 	select ARCH_HAS_DEBUG_VM_PGTABLE	if !X86_PAE
-	select ARCH_HAS_DEVMEM_IS_ALLOWED
+	select ARCH_HAS_DEVMEM_IS_ALLOWED	if !X86_64
 	select ARCH_HAS_DMA_OPS			if GART_IOMMU || XEN
 	select ARCH_HAS_EARLY_DEBUG		if KGDB
 	select ARCH_HAS_ELF_RANDOMIZE
@@ -180,6 +180,7 @@ config X86
 	select GENERIC_IRQ_PROBE
 	select GENERIC_IRQ_RESERVATION_MODE
 	select GENERIC_IRQ_SHOW
+	select GENERIC_LIB_DEVMEM_IS_ALLOWED	if X86_64
 	select GENERIC_PENDING_IRQ		if SMP
 	select GENERIC_SMP_IDLE_THREAD
 	select GENERIC_TIME_VSYSCALL
diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index 367d45755f85..c1ecca34f28d 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -285,11 +285,12 @@ BUILDIO(l, u32)
 #define outsw outsw
 #define outsl outsl
 
+#ifdef CONFIG_X86_32
 extern void *xlate_dev_mem_ptr(phys_addr_t phys);
 extern void unxlate_dev_mem_ptr(phys_addr_t phys, void *addr);
-
 #define xlate_dev_mem_ptr xlate_dev_mem_ptr
 #define unxlate_dev_mem_ptr unxlate_dev_mem_ptr
+#endif
 
 extern int ioremap_change_attr(unsigned long vaddr, unsigned long size,
 				enum page_cache_mode pcm);
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index aa56d9ac0b8f..16d0f242f0de 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -856,11 +856,12 @@ void __init poking_init(void)
 	pte_unmap_unlock(ptep, ptl);
 }
 
+#ifdef CONFIG_X86_32
 /*
  * devmem_is_allowed() checks to see if /dev/mem access to a certain address
  * is valid. The argument is a physical page number.
  *
- * On x86, access has to be given to the first megabyte of RAM because that
+ * On x86-32, access has to be given to the first megabyte of RAM because that
  * area traditionally contains BIOS code and data regions used by X, dosemu,
  * and similar apps. Since they map the entire memory range, the whole range
  * must be allowed (for mapping), but any areas that would otherwise be
@@ -897,6 +898,7 @@ int devmem_is_allowed(unsigned long pagenr)
 
 	return 1;
 }
+#endif
 
 void free_init_pages(const char *what, unsigned long begin, unsigned long end)
 {
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 11b4ea7d7fa5..1e51d1c245bf 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -509,6 +509,7 @@ void *arch_memremap_wb(phys_addr_t phys_addr, size_t size, unsigned long flags)
 	return (void __force *)ioremap_encrypted(phys_addr, size);
 }
 
+#ifdef CONFIG_X86_32
 /*
  * Convert a physical pointer to a virtual kernel pointer for /dev/mem
  * access
@@ -533,6 +534,7 @@ void unxlate_dev_mem_ptr(phys_addr_t phys, void *addr)
 {
 	memunmap((void *)((unsigned long)addr & PAGE_MASK));
 }
+#endif
 
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 /*
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] x86/devmem: move range_is_allowed() to drivers/char/mem.c
  2025-05-20 15:20 [PATCH 1/3] x86/devmem: move range_is_allowed() to drivers/char/mem.c Arnd Bergmann
  2025-05-20 15:20 ` [PATCH 2/3] x86/devmem: remove phys_mem_access_prot_allowed() Arnd Bergmann
  2025-05-20 15:20 ` [PATCH 3/3] [RFC] x86/devmem: remove low 1MB hack for x86-64 Arnd Bergmann
@ 2025-05-21 22:05 ` Dan Williams
  2 siblings, 0 replies; 11+ messages in thread
From: Dan Williams @ 2025-05-21 22:05 UTC (permalink / raw)
  To: Arnd Bergmann, Dave Hansen, Dan Williams
  Cc: Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, Greg Kroah-Hartman, Nikolay Borisov,
	linux-kernel, Arnd Bergmann

Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> The global inline function in include/linux/io.h causes problems
> because of the unfortunate naming that is too generic, and because
> it reintroduces a dependency on the PAGE_SIZE definition that should
> not exist in this header file.

Ah, good point.

> Something I had not seen during the earlier review is how the
> x86 phys_mem_access_prot_allowed() is called directly after the
> generic check for range_is_allowed(), so checking it again actually
> has no effect at all, and the definition can be made local to
> drivers/char/mem.c with no other caller.

Oh, true.

> 
> Fixes: 1b3f2bd04d90 ("x86/devmem: Remove duplicate range_is_allowed() definition")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  arch/x86/mm/pat/memtype.c |  3 ---
>  drivers/char/mem.c        | 18 ++++++++++++++++++
>  include/linux/io.h        | 21 ---------------------
>  3 files changed, 18 insertions(+), 24 deletions(-)

Looks good to me.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] x86/devmem: remove phys_mem_access_prot_allowed()
  2025-05-20 15:20 ` [PATCH 2/3] x86/devmem: remove phys_mem_access_prot_allowed() Arnd Bergmann
@ 2025-05-21 22:08   ` Dan Williams
  0 siblings, 0 replies; 11+ messages in thread
From: Dan Williams @ 2025-05-21 22:08 UTC (permalink / raw)
  To: Arnd Bergmann, Dave Hansen, Dan Williams
  Cc: Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, Greg Kroah-Hartman, Nikolay Borisov,
	linux-kernel, Arnd Bergmann

Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> phys_mem_access_prot_allowed() was originally a workaround for
> certain 32-bit CPUs, but after commit 1886297ce0c8 ("x86/mm/pat:
> Fix BUG_ON() in mmap_mem() on QEMU/i386"), it no longer does anything
> special as the only remaining bit now is the same logic that follows in
> phys_mem_access_prot(), mapping everything as normal memory except when
> either O_DSYNC is set or the address is highmem.
> 
> Just remove the thing entirely.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  arch/x86/mm/pat/memtype.c | 15 ---------------
>  drivers/char/mem.c        | 10 ----------
>  include/linux/pgtable.h   |  2 --
>  3 files changed, 27 deletions(-)

Agree, I came to the same conclusion in the commit message of
1b3f2bd04d90, but then failed to circle back and remove the thing. So,

Reviewed-by: Dan Williams <dan.j.williams@intel.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] [RFC] x86/devmem: remove low 1MB hack for x86-64
  2025-05-20 15:20 ` [PATCH 3/3] [RFC] x86/devmem: remove low 1MB hack for x86-64 Arnd Bergmann
@ 2025-05-21 22:14   ` Dan Williams
  2025-05-22 12:24     ` Arnd Bergmann
  0 siblings, 1 reply; 11+ messages in thread
From: Dan Williams @ 2025-05-21 22:14 UTC (permalink / raw)
  To: Arnd Bergmann, Dave Hansen, Dan Williams
  Cc: Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, Greg Kroah-Hartman, Nikolay Borisov,
	linux-kernel, Arnd Bergmann, Kees Cook

[add Kees]

Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> Traditionally, both reading and mapping anything in the low 1MB area is
> allowed on x86, through a series of ugly hacks.  In combination with
> features such as memory encryption, this keeps causing trouble and
> requires building additional hacks on top.
> 
> Chances are that this is only really used for 32-bit machines, as the
> usual users of this were dosemu, svgalib or ancient XFree86 versions,
> none of which should be used on 64-bit kernels any more.
> 
> Remove both the custom devmem_is_allowed() and the custom
> xlate_dev_mem_ptr() on 64-bit kernels, and use the normal implementation
> based on phys_to_virt() instead for read/write access on the linear
> map.
> 
> As a result, this makes x86-64 behave more like the other architecture
> on /dev/mem, allowing read/write access only on actual system RAM and
> only when CONFIG_STRICT_DEVMEM is disabled, while mmap() can be use
> on MMIO pages with the normal restrictions.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> Unlike the other two patches in this series, this one is expected to
> change the behavior on x86-64 kernels, which has the risk of
> regressions, but seems worthwhile to me.
> 
> Are there any reasons left for keeping these hacks?

Kees did this search which seems to suggest that there is still code out
there that may not be prepared for a behavior change here:

http://lore.kernel.org/202504101926.0F8FB73@keescook

Maybe those paths fallback to dmi sysfs or other mechanisms for digging through
BIOS data, but I do not think we can know for sure that this removal is
regression free ahead of time.

I would be interested to see though, so:

Acked-by: Dan Williams <dan.j.williams@intel.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] [RFC] x86/devmem: remove low 1MB hack for x86-64
  2025-05-21 22:14   ` Dan Williams
@ 2025-05-22 12:24     ` Arnd Bergmann
  2025-06-03 18:18       ` Dan Williams
  0 siblings, 1 reply; 11+ messages in thread
From: Arnd Bergmann @ 2025-05-22 12:24 UTC (permalink / raw)
  To: Dan Williams, Arnd Bergmann, Dave Hansen
  Cc: Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, Greg Kroah-Hartman, Nikolay Borisov,
	linux-kernel, Kees Cook

On Thu, May 22, 2025, at 00:14, Dan Williams wrote:
> Arnd Bergmann wrote:
>>
>> Unlike the other two patches in this series, this one is expected to
>> change the behavior on x86-64 kernels, which has the risk of
>> regressions, but seems worthwhile to me.
>> 
>> Are there any reasons left for keeping these hacks?
>
> Kees did this search which seems to suggest that there is still code out
> there that may not be prepared for a behavior change here:
>
> http://lore.kernel.org/202504101926.0F8FB73@keescook
>
> Maybe those paths fallback to dmi sysfs or other mechanisms for digging through
> BIOS data, but I do not think we can know for sure that this removal is
> regression free ahead of time.

I looked at those three and from what I can tell, two attempt
to access PCI BAR areas, which should not change with my patch:
if they are busy or exclusive, they are already disallowed, otherwise
they can still be mapped.

The third one maps the BIOS area at 0xf0000, and as far as I can tell
the hack explicitly allowed mapping that even though it is marked
busy on x86-64 since 5d94e81f69d4 ("x86: Introduce pci_map_biosrom()").

Is there any downside to marking this one non-busy and still allowing
the ROM to be mapped? Would that bring back the issue of conflicting
mapping flags between kernel and userspace?

--- a/arch/x86/kernel/probe_roms.c
+++ b/arch/x86/kernel/probe_roms.c
@@ -27,7 +27,7 @@ static struct resource system_rom_resource = {
        .name   = "System ROM",
        .start  = 0xf0000,
        .end    = 0xfffff,
-       .flags  = IORESOURCE_BUSY | IORESOURCE_READONLY | IORESOURCE_MEM
+       .flags  = IORESOURCE_READONLY | IORESOURCE_MEM
 };
 
 static struct resource extension_rom_resource = {

> I would be interested to see though, so:
>
> Acked-by: Dan Williams <dan.j.williams@intel.com>

Thanks,

      ARnd

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] [RFC] x86/devmem: remove low 1MB hack for x86-64
  2025-05-22 12:24     ` Arnd Bergmann
@ 2025-06-03 18:18       ` Dan Williams
  2025-06-03 19:25         ` Arnd Bergmann
  0 siblings, 1 reply; 11+ messages in thread
From: Dan Williams @ 2025-06-03 18:18 UTC (permalink / raw)
  To: Arnd Bergmann, Dan Williams, Arnd Bergmann, Dave Hansen
  Cc: Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, Greg Kroah-Hartman, Nikolay Borisov,
	linux-kernel, Kees Cook, Naveen N Rao (AMD)

[add Naveen]

Arnd Bergmann wrote:
> On Thu, May 22, 2025, at 00:14, Dan Williams wrote:
> > Arnd Bergmann wrote:
> >>
> >> Unlike the other two patches in this series, this one is expected to
> >> change the behavior on x86-64 kernels, which has the risk of
> >> regressions, but seems worthwhile to me.
> >> 
> >> Are there any reasons left for keeping these hacks?
> >
> > Kees did this search which seems to suggest that there is still code out
> > there that may not be prepared for a behavior change here:
> >
> > http://lore.kernel.org/202504101926.0F8FB73@keescook
> >
> > Maybe those paths fallback to dmi sysfs or other mechanisms for digging through
> > BIOS data, but I do not think we can know for sure that this removal is
> > regression free ahead of time.
> 
> I looked at those three and from what I can tell, two attempt
> to access PCI BAR areas, which should not change with my patch:
> if they are busy or exclusive, they are already disallowed, otherwise
> they can still be mapped.
> 
> The third one maps the BIOS area at 0xf0000, and as far as I can tell
> the hack explicitly allowed mapping that even though it is marked
> busy on x86-64 since 5d94e81f69d4 ("x86: Introduce pci_map_biosrom()").
> 
> Is there any downside to marking this one non-busy and still allowing
> the ROM to be mapped? Would that bring back the issue of conflicting
> mapping flags between kernel and userspace?

For the confidential VM case I expect the answer is "yes" per this patch
attempt:

http://lore.kernel.org/20250403120228.2344377-1-naveen@kernel.org

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] [RFC] x86/devmem: remove low 1MB hack for x86-64
  2025-06-03 18:18       ` Dan Williams
@ 2025-06-03 19:25         ` Arnd Bergmann
  2025-06-03 19:36           ` Dan Williams
  0 siblings, 1 reply; 11+ messages in thread
From: Arnd Bergmann @ 2025-06-03 19:25 UTC (permalink / raw)
  To: Dan Williams, Arnd Bergmann, Dave Hansen
  Cc: Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, Greg Kroah-Hartman, Nikolay Borisov,
	linux-kernel, Kees Cook, Naveen N Rao

On Tue, Jun 3, 2025, at 20:18, Dan Williams wrote:
> [add Naveen]
>
> Arnd Bergmann wrote:
>> On Thu, May 22, 2025, at 00:14, Dan Williams wrote:
>> > Arnd Bergmann wrote:
>>
>> The third one maps the BIOS area at 0xf0000, and as far as I can tell
>> the hack explicitly allowed mapping that even though it is marked
>> busy on x86-64 since 5d94e81f69d4 ("x86: Introduce pci_map_biosrom()").
>> 
>> Is there any downside to marking this one non-busy and still allowing
>> the ROM to be mapped? Would that bring back the issue of conflicting
>> mapping flags between kernel and userspace?
>
> For the confidential VM case I expect the answer is "yes" per this patch
> attempt:
>
> http://lore.kernel.org/20250403120228.2344377-1-naveen@kernel.org

I thought the problem here was the read() on /dev/mem, not
the mmap(), are you sure it's both?

With this patch [3/3], the memremap() hack for mem_read() goes away on
64-bit, so there should be no way it gets mapped again using that,
and the generic devmem_is_allowed() just forbids it as well.

The mmap() access in turn goes through this function

pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
                                unsigned long size, pgprot_t vma_prot)
{
        if (!phys_mem_access_encrypted(pfn << PAGE_SHIFT, size))
                vma_prot = pgprot_decrypted(vma_prot);

        return vma_prot;
}

which I would expect to return the correct vma_prot value already.

      Arnd

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] [RFC] x86/devmem: remove low 1MB hack for x86-64
  2025-06-03 19:25         ` Arnd Bergmann
@ 2025-06-03 19:36           ` Dan Williams
  2025-06-11 14:18             ` Naveen N Rao
  0 siblings, 1 reply; 11+ messages in thread
From: Dan Williams @ 2025-06-03 19:36 UTC (permalink / raw)
  To: Arnd Bergmann, Dan Williams, Arnd Bergmann, Dave Hansen
  Cc: Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, Greg Kroah-Hartman, Nikolay Borisov,
	linux-kernel, Kees Cook, Naveen N Rao

Arnd Bergmann wrote:
> On Tue, Jun 3, 2025, at 20:18, Dan Williams wrote:
> > [add Naveen]
> >
> > Arnd Bergmann wrote:
> >> On Thu, May 22, 2025, at 00:14, Dan Williams wrote:
> >> > Arnd Bergmann wrote:
> >>
> >> The third one maps the BIOS area at 0xf0000, and as far as I can tell
> >> the hack explicitly allowed mapping that even though it is marked
> >> busy on x86-64 since 5d94e81f69d4 ("x86: Introduce pci_map_biosrom()").
> >> 
> >> Is there any downside to marking this one non-busy and still allowing
> >> the ROM to be mapped? Would that bring back the issue of conflicting
> >> mapping flags between kernel and userspace?
> >
> > For the confidential VM case I expect the answer is "yes" per this patch
> > attempt:
> >
> > http://lore.kernel.org/20250403120228.2344377-1-naveen@kernel.org
> 
> I thought the problem here was the read() on /dev/mem, not
> the mmap(), are you sure it's both?
> 
> With this patch [3/3], the memremap() hack for mem_read() goes away on
> 64-bit, so there should be no way it gets mapped again using that,
> and the generic devmem_is_allowed() just forbids it as well.
> 
> The mmap() access in turn goes through this function
> 
> pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
>                                 unsigned long size, pgprot_t vma_prot)
> {
>         if (!phys_mem_access_encrypted(pfn << PAGE_SHIFT, size))
>                 vma_prot = pgprot_decrypted(vma_prot);
> 
>         return vma_prot;
> }
> 
> which I would expect to return the correct vma_prot value already.

My understanding is that while that gets the correct vma_prot and solves
the TDX problem it leaves the SEV-SNP problem that the range may not be
"accepted" ('pvalidate' invoked for the range) at the time the mapping
is established. So rather than try to make sure ROMs are accepted early
the proposal is just block altogether.

Naveen, did I get that right?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] [RFC] x86/devmem: remove low 1MB hack for x86-64
  2025-06-03 19:36           ` Dan Williams
@ 2025-06-11 14:18             ` Naveen N Rao
  0 siblings, 0 replies; 11+ messages in thread
From: Naveen N Rao @ 2025-06-11 14:18 UTC (permalink / raw)
  To: Dan Williams
  Cc: Arnd Bergmann, Arnd Bergmann, Dave Hansen, Andy Lutomirski,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, Greg Kroah-Hartman, Nikolay Borisov, linux-kernel, Kees Cook

On Tue, Jun 03, 2025 at 12:36:24PM -0700, Dan Williams wrote:
> Arnd Bergmann wrote:
> > On Tue, Jun 3, 2025, at 20:18, Dan Williams wrote:
> > > [add Naveen]
> > >
> > > Arnd Bergmann wrote:
> > >> On Thu, May 22, 2025, at 00:14, Dan Williams wrote:
> > >> > Arnd Bergmann wrote:
> > >>
> > >> The third one maps the BIOS area at 0xf0000, and as far as I can tell
> > >> the hack explicitly allowed mapping that even though it is marked
> > >> busy on x86-64 since 5d94e81f69d4 ("x86: Introduce pci_map_biosrom()").
> > >> 
> > >> Is there any downside to marking this one non-busy and still allowing
> > >> the ROM to be mapped? Would that bring back the issue of conflicting
> > >> mapping flags between kernel and userspace?
> > >
> > > For the confidential VM case I expect the answer is "yes" per this patch
> > > attempt:
> > >
> > > http://lore.kernel.org/20250403120228.2344377-1-naveen@kernel.org
> > 
> > I thought the problem here was the read() on /dev/mem, not
> > the mmap(), are you sure it's both?
> > 
> > With this patch [3/3], the memremap() hack for mem_read() goes away on
> > 64-bit, so there should be no way it gets mapped again using that,
> > and the generic devmem_is_allowed() just forbids it as well.
> > 
> > The mmap() access in turn goes through this function
> > 
> > pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
> >                                 unsigned long size, pgprot_t vma_prot)
> > {
> >         if (!phys_mem_access_encrypted(pfn << PAGE_SHIFT, size))
> >                 vma_prot = pgprot_decrypted(vma_prot);
> > 
> >         return vma_prot;
> > }
> > 
> > which I would expect to return the correct vma_prot value already.
> 
> My understanding is that while that gets the correct vma_prot and solves
> the TDX problem it leaves the SEV-SNP problem that the range may not be
> "accepted" ('pvalidate' invoked for the range) at the time the mapping
> is established. So rather than try to make sure ROMs are accepted early
> the proposal is just block altogether.
> 
> Naveen, did I get that right?

That's correct. This patch series doesn't help address that issue, where 
the mmap() is a problem too.

Thanks,
Naveen


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2025-06-11 14:24 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-20 15:20 [PATCH 1/3] x86/devmem: move range_is_allowed() to drivers/char/mem.c Arnd Bergmann
2025-05-20 15:20 ` [PATCH 2/3] x86/devmem: remove phys_mem_access_prot_allowed() Arnd Bergmann
2025-05-21 22:08   ` Dan Williams
2025-05-20 15:20 ` [PATCH 3/3] [RFC] x86/devmem: remove low 1MB hack for x86-64 Arnd Bergmann
2025-05-21 22:14   ` Dan Williams
2025-05-22 12:24     ` Arnd Bergmann
2025-06-03 18:18       ` Dan Williams
2025-06-03 19:25         ` Arnd Bergmann
2025-06-03 19:36           ` Dan Williams
2025-06-11 14:18             ` Naveen N Rao
2025-05-21 22:05 ` [PATCH 1/3] x86/devmem: move range_is_allowed() to drivers/char/mem.c Dan Williams

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.