* [PATCH v4] fs: hide names_cachep behind runtime access machinery
@ 2025-10-30 10:52 Mateusz Guzik
2025-10-30 13:13 ` kernel test robot
` (4 more replies)
0 siblings, 5 replies; 51+ messages in thread
From: Mateusz Guzik @ 2025-10-30 10:52 UTC (permalink / raw)
To: brauner
Cc: viro, jack, linux-kernel, linux-fsdevel, torvalds, pfalcato,
Mateusz Guzik
The var is used twice for every path lookup, while the cache is
initialized early and stays valid for the duration.
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---
ACHTUNG WARNING POZOR UWAGA Блять: the namei cache can be used by
modules while the runtime machinery does not work with them. I did some
testing and ifdef MODULE seems to work around the problem, but perhaps
someone with build-fu could chime in? I verified with a hello world
module that this works fine, but maybe I missed a case.
v4:
- unbotch the diff below, apologies for the spam
v3:
- fix compilation failure on loongarch as reported by kernel test robot,
used their repro script to confirm
v2:
- ifdef on module usage -- the runtime thing does *not* work with modules
- patch up the section warn, thanks to Pedro for spotting what's up with
the problem
Linus cc'ed as he added the runtime thing + dcache usage in the first place.
Per the above the machinery does not support kernel modules and I have
no interest in spending time to extend it.
I tried to add a compilation time warn should someone compile a module
with it, but there is no shared header so I decided to drop the matter.
Should someone(tm) make this work for modules I'm not going to protest.
Vast majority of actual usage is coming from core kernel, which *is*
getting the new treatment and I don't think the ifdef is particularly
nasty.
fs/dcache.c | 3 +--
include/asm-generic/vmlinux.lds.h | 3 ++-
include/linux/fs.h | 15 +++++++++++++--
3 files changed, 16 insertions(+), 5 deletions(-)
diff --git a/fs/dcache.c b/fs/dcache.c
index 035cccbc9276..ef83323276f0 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -35,8 +35,6 @@
#include "internal.h"
#include "mount.h"
-#include <asm/runtime-const.h>
-
/*
* Usage:
* dcache->d_inode->i_lock protects:
@@ -3265,6 +3263,7 @@ void __init vfs_caches_init(void)
{
names_cachep = kmem_cache_create_usercopy("names_cache", PATH_MAX, 0,
SLAB_HWCACHE_ALIGN|SLAB_PANIC, 0, PATH_MAX, NULL);
+ runtime_const_init(ptr, names_cachep);
dcache_init();
inode_init();
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index dcdbd962abd6..c7d85c80111c 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -939,7 +939,8 @@
#define RUNTIME_CONST_VARIABLES \
RUNTIME_CONST(shift, d_hash_shift) \
- RUNTIME_CONST(ptr, dentry_hashtable)
+ RUNTIME_CONST(ptr, dentry_hashtable) \
+ RUNTIME_CONST(ptr, names_cachep)
/* Alignment must be consistent with (kunit_suite *) in include/kunit/test.h */
#define KUNIT_TABLE() \
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 68c4a59ec8fb..cfaabd4824f2 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -50,6 +50,8 @@
#include <linux/unicode.h>
#include <asm/byteorder.h>
+#include <asm/runtime-const.h>
+
#include <uapi/linux/fs.h>
struct backing_dev_info;
@@ -2960,8 +2962,17 @@ extern void __init vfs_caches_init(void);
extern struct kmem_cache *names_cachep;
-#define __getname() kmem_cache_alloc(names_cachep, GFP_KERNEL)
-#define __putname(name) kmem_cache_free(names_cachep, (void *)(name))
+/*
+ * XXX The runtime_const machinery does not support modules at the moment.
+ */
+#ifdef MODULE
+#define __names_cachep names_cachep
+#else
+#define __names_cachep runtime_const_ptr(names_cachep)
+#endif
+
+#define __getname() kmem_cache_alloc(__names_cachep, GFP_KERNEL)
+#define __putname(name) kmem_cache_free(__names_cachep, (void *)(name))
extern struct super_block *blockdev_superblock;
static inline bool sb_is_blkdev_sb(struct super_block *sb)
--
2.34.1
* Re: [PATCH v4] fs: hide names_cachep behind runtime access machinery
2025-10-30 10:52 [PATCH v4] fs: hide names_cachep behind runtime access machinery Mateusz Guzik
@ 2025-10-30 13:13 ` kernel test robot
2025-10-30 13:19 ` Mateusz Guzik
2025-10-30 16:15 ` Linus Torvalds
` (3 subsequent siblings)
4 siblings, 1 reply; 51+ messages in thread
From: kernel test robot @ 2025-10-30 13:13 UTC (permalink / raw)
To: Mateusz Guzik, brauner
Cc: oe-kbuild-all, viro, jack, linux-kernel, linux-fsdevel, torvalds,
pfalcato, Mateusz Guzik
Hi Mateusz,
kernel test robot noticed the following build errors:
[auto build test ERROR on arnd-asm-generic/master]
[also build test ERROR on linus/master brauner-vfs/vfs.all linux/master v6.18-rc3 next-20251030]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Mateusz-Guzik/fs-hide-names_cachep-behind-runtime-access-machinery/20251030-185523
base: https://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git master
patch link: https://lore.kernel.org/r/20251030105242.801528-1-mjguzik%40gmail.com
patch subject: [PATCH v4] fs: hide names_cachep behind runtime access machinery
config: riscv-allnoconfig (https://download.01.org/0day-ci/archive/20251030/202510302004.OdLRz1Wy-lkp@intel.com/config)
compiler: riscv64-linux-gcc (GCC) 15.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510302004.OdLRz1Wy-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510302004.OdLRz1Wy-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from arch/riscv/include/asm/runtime-const.h:7,
from include/linux/fs.h:53,
from include/linux/huge_mm.h:7,
from include/linux/mm.h:1016,
from arch/riscv/kernel/asm-offsets.c:8:
arch/riscv/include/asm/cacheflush.h: In function 'flush_cache_vmap':
>> arch/riscv/include/asm/cacheflush.h:49:13: error: implicit declaration of function 'is_vmalloc_or_module_addr' [-Wimplicit-function-declaration]
49 | if (is_vmalloc_or_module_addr((void *)start)) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~
In file included from include/linux/compat.h:18,
from arch/riscv/include/asm/elf.h:12,
from include/linux/elf.h:6,
from include/linux/module.h:20,
from include/linux/device/driver.h:21,
from include/linux/device.h:32,
from include/linux/node.h:18,
from include/linux/memory.h:19,
from arch/riscv/include/asm/runtime-const.h:9:
include/uapi/linux/aio_abi.h: At top level:
>> include/uapi/linux/aio_abi.h:79:9: error: unknown type name '__kernel_rwf_t'; did you mean '__kernel_off_t'?
79 | __kernel_rwf_t aio_rw_flags; /* RWF_* flags */
| ^~~~~~~~~~~~~~
| __kernel_off_t
make[3]: *** [scripts/Makefile.build:182: arch/riscv/kernel/asm-offsets.s] Error 1
make[3]: Target 'prepare' not remade because of errors.
make[2]: *** [Makefile:1282: prepare0] Error 2
make[2]: Target 'prepare' not remade because of errors.
make[1]: *** [Makefile:248: __sub-make] Error 2
make[1]: Target 'prepare' not remade because of errors.
make: *** [Makefile:248: __sub-make] Error 2
make: Target 'prepare' not remade because of errors.
vim +/is_vmalloc_or_module_addr +49 arch/riscv/include/asm/cacheflush.h
08f051eda33b51 Andrew Waterman 2017-10-25 42
7e3811521dc393 Alexandre Ghiti 2023-07-25 43 #ifdef CONFIG_64BIT
503638e0babf36 Alexandre Ghiti 2024-07-17 44 extern u64 new_vmalloc[NR_CPUS / sizeof(u64) + 1];
503638e0babf36 Alexandre Ghiti 2024-07-17 45 extern char _end[];
503638e0babf36 Alexandre Ghiti 2024-07-17 46 #define flush_cache_vmap flush_cache_vmap
503638e0babf36 Alexandre Ghiti 2024-07-17 47 static inline void flush_cache_vmap(unsigned long start, unsigned long end)
503638e0babf36 Alexandre Ghiti 2024-07-17 48 {
503638e0babf36 Alexandre Ghiti 2024-07-17 @49 if (is_vmalloc_or_module_addr((void *)start)) {
503638e0babf36 Alexandre Ghiti 2024-07-17 50 int i;
503638e0babf36 Alexandre Ghiti 2024-07-17 51
503638e0babf36 Alexandre Ghiti 2024-07-17 52 /*
503638e0babf36 Alexandre Ghiti 2024-07-17 53 * We don't care if concurrently a cpu resets this value since
503638e0babf36 Alexandre Ghiti 2024-07-17 54 * the only place this can happen is in handle_exception() where
503638e0babf36 Alexandre Ghiti 2024-07-17 55 * an sfence.vma is emitted.
503638e0babf36 Alexandre Ghiti 2024-07-17 56 */
503638e0babf36 Alexandre Ghiti 2024-07-17 57 for (i = 0; i < ARRAY_SIZE(new_vmalloc); ++i)
503638e0babf36 Alexandre Ghiti 2024-07-17 58 new_vmalloc[i] = -1ULL;
503638e0babf36 Alexandre Ghiti 2024-07-17 59 }
503638e0babf36 Alexandre Ghiti 2024-07-17 60 }
7a92fc8b4d2068 Alexandre Ghiti 2023-12-12 61 #define flush_cache_vmap_early(start, end) local_flush_tlb_kernel_range(start, end)
7e3811521dc393 Alexandre Ghiti 2023-07-25 62 #endif
7e3811521dc393 Alexandre Ghiti 2023-07-25 63
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH v4] fs: hide names_cachep behind runtime access machinery
2025-10-30 13:13 ` kernel test robot
@ 2025-10-30 13:19 ` Mateusz Guzik
0 siblings, 0 replies; 51+ messages in thread
From: Mateusz Guzik @ 2025-10-30 13:19 UTC (permalink / raw)
To: kernel test robot
Cc: brauner, oe-kbuild-all, viro, jack, linux-kernel, linux-fsdevel,
torvalds, pfalcato
I'm not sending a v5. If you guys are fine with the patch I'm going to
fix up whatever other fallout later.
On Thu, Oct 30, 2025 at 2:14 PM kernel test robot <lkp@intel.com> wrote:
>
> Hi Mateusz,
>
> kernel test robot noticed the following build errors:
>
> [auto build test ERROR on arnd-asm-generic/master]
> [also build test ERROR on linus/master brauner-vfs/vfs.all linux/master v6.18-rc3 next-20251030]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>
> url: https://github.com/intel-lab-lkp/linux/commits/Mateusz-Guzik/fs-hide-names_cachep-behind-runtime-access-machinery/20251030-185523
> base: https://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git master
> patch link: https://lore.kernel.org/r/20251030105242.801528-1-mjguzik%40gmail.com
> patch subject: [PATCH v4] fs: hide names_cachep behind runtime access machinery
> config: riscv-allnoconfig (https://download.01.org/0day-ci/archive/20251030/202510302004.OdLRz1Wy-lkp@intel.com/config)
> compiler: riscv64-linux-gcc (GCC) 15.1.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251030/202510302004.OdLRz1Wy-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202510302004.OdLRz1Wy-lkp@intel.com/
>
> All errors (new ones prefixed by >>):
>
> In file included from arch/riscv/include/asm/runtime-const.h:7,
> from include/linux/fs.h:53,
> from include/linux/huge_mm.h:7,
> from include/linux/mm.h:1016,
> from arch/riscv/kernel/asm-offsets.c:8:
> arch/riscv/include/asm/cacheflush.h: In function 'flush_cache_vmap':
> >> arch/riscv/include/asm/cacheflush.h:49:13: error: implicit declaration of function 'is_vmalloc_or_module_addr' [-Wimplicit-function-declaration]
> 49 | if (is_vmalloc_or_module_addr((void *)start)) {
> | ^~~~~~~~~~~~~~~~~~~~~~~~~
> In file included from include/linux/compat.h:18,
> from arch/riscv/include/asm/elf.h:12,
> from include/linux/elf.h:6,
> from include/linux/module.h:20,
> from include/linux/device/driver.h:21,
> from include/linux/device.h:32,
> from include/linux/node.h:18,
> from include/linux/memory.h:19,
> from arch/riscv/include/asm/runtime-const.h:9:
> include/uapi/linux/aio_abi.h: At top level:
> >> include/uapi/linux/aio_abi.h:79:9: error: unknown type name '__kernel_rwf_t'; did you mean '__kernel_off_t'?
> 79 | __kernel_rwf_t aio_rw_flags; /* RWF_* flags */
> | ^~~~~~~~~~~~~~
> | __kernel_off_t
> make[3]: *** [scripts/Makefile.build:182: arch/riscv/kernel/asm-offsets.s] Error 1
> make[3]: Target 'prepare' not remade because of errors.
> make[2]: *** [Makefile:1282: prepare0] Error 2
> make[2]: Target 'prepare' not remade because of errors.
> make[1]: *** [Makefile:248: __sub-make] Error 2
> make[1]: Target 'prepare' not remade because of errors.
> make: *** [Makefile:248: __sub-make] Error 2
> make: Target 'prepare' not remade because of errors.
>
>
> vim +/is_vmalloc_or_module_addr +49 arch/riscv/include/asm/cacheflush.h
>
> 08f051eda33b51 Andrew Waterman 2017-10-25 42
> 7e3811521dc393 Alexandre Ghiti 2023-07-25 43 #ifdef CONFIG_64BIT
> 503638e0babf36 Alexandre Ghiti 2024-07-17 44 extern u64 new_vmalloc[NR_CPUS / sizeof(u64) + 1];
> 503638e0babf36 Alexandre Ghiti 2024-07-17 45 extern char _end[];
> 503638e0babf36 Alexandre Ghiti 2024-07-17 46 #define flush_cache_vmap flush_cache_vmap
> 503638e0babf36 Alexandre Ghiti 2024-07-17 47 static inline void flush_cache_vmap(unsigned long start, unsigned long end)
> 503638e0babf36 Alexandre Ghiti 2024-07-17 48 {
> 503638e0babf36 Alexandre Ghiti 2024-07-17 @49 if (is_vmalloc_or_module_addr((void *)start)) {
> 503638e0babf36 Alexandre Ghiti 2024-07-17 50 int i;
> 503638e0babf36 Alexandre Ghiti 2024-07-17 51
> 503638e0babf36 Alexandre Ghiti 2024-07-17 52 /*
> 503638e0babf36 Alexandre Ghiti 2024-07-17 53 * We don't care if concurrently a cpu resets this value since
> 503638e0babf36 Alexandre Ghiti 2024-07-17 54 * the only place this can happen is in handle_exception() where
> 503638e0babf36 Alexandre Ghiti 2024-07-17 55 * an sfence.vma is emitted.
> 503638e0babf36 Alexandre Ghiti 2024-07-17 56 */
> 503638e0babf36 Alexandre Ghiti 2024-07-17 57 for (i = 0; i < ARRAY_SIZE(new_vmalloc); ++i)
> 503638e0babf36 Alexandre Ghiti 2024-07-17 58 new_vmalloc[i] = -1ULL;
> 503638e0babf36 Alexandre Ghiti 2024-07-17 59 }
> 503638e0babf36 Alexandre Ghiti 2024-07-17 60 }
> 7a92fc8b4d2068 Alexandre Ghiti 2023-12-12 61 #define flush_cache_vmap_early(start, end) local_flush_tlb_kernel_range(start, end)
> 7e3811521dc393 Alexandre Ghiti 2023-07-25 62 #endif
> 7e3811521dc393 Alexandre Ghiti 2023-07-25 63
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
* Re: [PATCH v4] fs: hide names_cachep behind runtime access machinery
2025-10-30 10:52 [PATCH v4] fs: hide names_cachep behind runtime access machinery Mateusz Guzik
2025-10-30 13:13 ` kernel test robot
@ 2025-10-30 16:15 ` Linus Torvalds
2025-10-30 16:35 ` Mateusz Guzik
2025-10-31 13:30 ` [PATCH v4] " kernel test robot
` (2 subsequent siblings)
4 siblings, 1 reply; 51+ messages in thread
From: Linus Torvalds @ 2025-10-30 16:15 UTC (permalink / raw)
To: Mateusz Guzik; +Cc: brauner, viro, jack, linux-kernel, linux-fsdevel, pfalcato
On Thu, 30 Oct 2025 at 03:52, Mateusz Guzik <mjguzik@gmail.com> wrote:
>
> Should someone(tm) make this work for modules I'm not going to protest.
Btw, that's a good point. When I did this all originally, I explicitly
did *not* want to make it work for modules, but I do note that it can
be used for modules very easily by mistake.
> Vast majority of actual usage is coming from core kernel, which *is*
> getting the new treatment and I don't think the ifdef is particularly
> nasty.
I suspect we should make that #ifdef be an integral part of the
runtime const headers. Because right now it's really much too easy to
get it wrong, and I wonder if we already do.
Linus
* Re: [PATCH v4] fs: hide names_cachep behind runtime access machinery
2025-10-30 16:15 ` Linus Torvalds
@ 2025-10-30 16:35 ` Mateusz Guzik
2025-10-30 18:07 ` Linus Torvalds
0 siblings, 1 reply; 51+ messages in thread
From: Mateusz Guzik @ 2025-10-30 16:35 UTC (permalink / raw)
To: Linus Torvalds; +Cc: brauner, viro, jack, linux-kernel, linux-fsdevel, pfalcato
On Thu, Oct 30, 2025 at 5:16 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Thu, 30 Oct 2025 at 03:52, Mateusz Guzik <mjguzik@gmail.com> wrote:
> >
> > Should someone(tm) make this work for modules I'm not going to protest.
>
> Btw, that's a good point. When I did this all originally, I explicitly
> did *not* want to make it work for modules, but I do note that it can
> be used for modules very easily by mistake.
>
> > Vast majority of actual usage is coming from core kernel, which *is*
> > getting the new treatment and I don't think the ifdef is particularly
> > nasty.
>
> I suspect we should make that #ifdef be an integral part of the
> runtime const headers. Because right now it's really much too easy to
> get it wrong, and I wonder if we already do.
>
I don't know if you are suggesting to make the entire thing fail to
compile if included for a module, or to transparently convert
runtime-optimized access into plain access.
I presume the former.
Even then, there is the cosmetic issue of deciding whether to ifdef
within headers or create include/linux/runtime-constants.h which pulls
in the per-arch stuff and ifdef in there.
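For the "transparently convert" option, a minimal userspace sketch of such a wrapper might look as follows. Every demo_* name and the DEMO_MODULE switch are hypothetical stand-ins: the kernel build defines MODULE itself, and the real accessor patches an immediate into the instruction stream rather than reading a second variable.

```c
#include <assert.h>
#include <stddef.h>

struct demo_cache {
	const char *name;
};

/* Plain variable: valid for everyone once init has run. */
static struct demo_cache *demo_names_cachep;

/* Stand-in for the value the kernel would patch into the code itself. */
static struct demo_cache *demo_patched_names_cachep;

static struct demo_cache demo_backing = { "names_cache" };

/* Stand-in for runtime_const_init(): publish the final value once. */
static void demo_runtime_const_init(void)
{
	assert(demo_names_cachep != NULL);
	demo_patched_names_cachep = demo_names_cachep;
}

/*
 * The wrapper: "module" builds silently get the plain variable,
 * everything else gets the optimized accessor.
 */
#ifdef DEMO_MODULE
#define demo_cachep() (demo_names_cachep)
#else
#define demo_cachep() (demo_patched_names_cachep)
#endif

/* Mirrors the ordering in vfs_caches_init(): create first, then publish. */
static void demo_caches_init(void)
{
	demo_names_cachep = &demo_backing;
	demo_runtime_const_init();
}
```

Callers only ever spell demo_cachep(), so the choice between the two paths is made in one place instead of at every use site.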
Personally I'm leaning towards just forcing compilation failure and
duplicating the code to do it within per-arch headers, for example:
diff --git a/arch/x86/include/asm/runtime-const.h
b/arch/x86/include/asm/runtime-const.h
index 8d983cfd06ea..42e6303b52f7 100644
--- a/arch/x86/include/asm/runtime-const.h
+++ b/arch/x86/include/asm/runtime-const.h
@@ -2,6 +2,10 @@
#ifndef _ASM_RUNTIME_CONST_H
#define _ASM_RUNTIME_CONST_H
+#ifdef MODULE
+#error "this functionality is not available for modules"
+#endif
+
#ifdef __ASSEMBLY__
.macro RUNTIME_CONST_PTR sym reg
Just tell me which way you want this sorted out and if it is less than
a few minutes of screwing around I'll take care of it.
* Re: [PATCH v4] fs: hide names_cachep behind runtime access machinery
2025-10-30 16:35 ` Mateusz Guzik
@ 2025-10-30 18:07 ` Linus Torvalds
2025-10-30 18:25 ` Linus Torvalds
2025-10-30 21:39 ` Mateusz Guzik
0 siblings, 2 replies; 51+ messages in thread
From: Linus Torvalds @ 2025-10-30 18:07 UTC (permalink / raw)
To: Mateusz Guzik, Thomas Gleixner
Cc: brauner, viro, jack, linux-kernel, linux-fsdevel, pfalcato
[-- Attachment #1: Type: text/plain, Size: 1704 bytes --]
[ Adding Thomas, because he's been working on our x86 uaccess code,
and I actually think we get this all wrong for access_ok() etc ]
On Thu, 30 Oct 2025 at 09:35, Mateusz Guzik <mjguzik@gmail.com> wrote:
>
> I don't know if you are suggesting to make the entire thing fail to
> compile if included for a module, or to transparently convert
> runtime-optimized access into plain access.
>
> I presume the former.
I think *including* it should be ok, because we have things like
<asm/uaccess.h> - or your addition to <linux/fs.h> - that use it for
core functionality that is then not supported for module use.
Yeah, in a perfect world we'd have those things only in "internal"
headers and people couldn't include them even by mistake, but that
ends up being a pain.
So I don't think your
+#ifdef MODULE
+#error "this functionality is not available for modules"
+#endif
model works, because I think it might be too painful to fix (but hey,
maybe I'm wrong).
I was thinking more along the lines of forcing linker errors or
something like that.
ENTIRELY UNTESTED PATCH attached - may not compile at all, but
something like this *might* work to show when a module uses the
runtime_const infrastructure.
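The link-error idea can be tried outside the kernel. The sketch below is a userspace illustration, not the attached patch: demo_no_runtime_const, demo_runtime_const_ptr and DEMO_MODULE are made-up names, and __typeof__ is a GNU extension. Built normally it links and runs; built with -DDEMO_MODULE the accessor references a symbol that is declared but never defined, so the final link should fail with an undefined reference.

```c
#include <assert.h>
#include <stddef.h>

#ifdef DEMO_MODULE
/*
 * Declared but deliberately never defined: including this code is
 * harmless, but any actual use drags in an unresolved symbol.
 */
extern unsigned long demo_no_runtime_const;
#define demo_runtime_const_ptr(sym) ((__typeof__(sym))demo_no_runtime_const)
#else
/* Stand-in for the real, code-patching accessor. */
#define demo_runtime_const_ptr(sym) (sym)
#endif

static int demo_value = 42;
static int *demo_ptr = &demo_value;

/* A "user" of the accessor; this is what breaks the link for modules. */
static int demo_read(void)
{
	int *p = demo_runtime_const_ptr(demo_ptr);

	assert(p != NULL);
	return *p;
}
```

The point of failing at link time rather than with #error is that headers can still be included by module code as long as the module never expands the accessor.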
And I think I should have made the default runtime const value
something small. But the original use of this was just the dcache
code, and that used it purely as a pointer, so a non-fixed-up address
would cause a nice clean oops. Then I started using it for the user
access limit, and now it's actually wrong if used by modules.
Thanks for making me think about this. I thought about the module case
*originally*, but then with some of the expanded use I definitely did
not.
Linus
[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 894 bytes --]
arch/x86/include/asm/runtime-const.h | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/runtime-const.h b/arch/x86/include/asm/runtime-const.h
index 8d983cfd06ea..01e35997587d 100644
--- a/arch/x86/include/asm/runtime-const.h
+++ b/arch/x86/include/asm/runtime-const.h
@@ -2,7 +2,18 @@
#ifndef _ASM_RUNTIME_CONST_H
#define _ASM_RUNTIME_CONST_H
-#ifdef __ASSEMBLY__
+#ifdef MODULE
+
+/*
+ * None of this is available to modules, so we force link errors
+ * if people try to use it
+ */
+extern unsigned long no_runtime_const;
+#define runtime_const_ptr(sym) ((typeof(sym))no_runtime_const)
+#define runtime_const_shift_right_32(val, sym) ((u32)no_runtime_const)
+#define runtime_const_init(type,sym) do { no_runtime_const=1; } while (0)
+
+#elif defined(__ASSEMBLY__)
.macro RUNTIME_CONST_PTR sym reg
movq $0x0123456789abcdef, %\reg
* Re: [PATCH v4] fs: hide names_cachep behind runtime access machinery
2025-10-30 18:07 ` Linus Torvalds
@ 2025-10-30 18:25 ` Linus Torvalds
2025-10-30 21:39 ` Mateusz Guzik
1 sibling, 0 replies; 51+ messages in thread
From: Linus Torvalds @ 2025-10-30 18:25 UTC (permalink / raw)
To: Mateusz Guzik, Thomas Gleixner
Cc: brauner, viro, jack, linux-kernel, linux-fsdevel, pfalcato
On Thu, 30 Oct 2025 at 11:07, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> ENTIRELY UNTESTED PATCH attached - may not compile at all, but
> something like this *might* work to show when a module uses the
> runtime_const infrastructure.
Hmm. I tested it, and it seems to work. And by "work", I mean "show
real existing problems":
ERROR: modpost: "no_runtime_const" [arch/x86/kvm/kvm.ko] undefined!
ERROR: modpost: "no_runtime_const" [arch/x86/kvm/kvm-amd.ko] undefined!
ERROR: modpost: "no_runtime_const" [fs/erofs/erofs.ko] undefined!
ERROR: modpost: "no_runtime_const" [lib/tests/usercopy_kunit.ko] undefined!
ERROR: modpost: "no_runtime_const" [lib/test_lockup.ko] undefined!
ERROR: modpost: "no_runtime_const" [drivers/acpi/acpi_dbg.ko] undefined!
ERROR: modpost: "no_runtime_const" [drivers/xen/xen-privcmd.ko] undefined!
ERROR: modpost: "no_runtime_const"
[drivers/iommu/iommufd/iommufd.ko] undefined!
ERROR: modpost: "no_runtime_const" [drivers/gpu/drm/drm.ko] undefined!
ERROR: modpost: "no_runtime_const"
[drivers/gpu/drm/radeon/radeon.ko] undefined!
WARNING: modpost: suppressed 29 unresolved symbol warnings because
there were too many)
and yeah, I think it comes from access_ok() use.
It turns out that all of this "works", but entirely by mistake, and
not really properly.
I picked the default value for the runtime_const pointer of
0x0123456789abcdef because it's easy to see in disassembly, and
because it causes a nice oops if not fixed up because it's a
non-canonical address on normal x86-64.
And *because* it's in that non-canonical range, it's actually "good
enough" for access_ok() in practice. But it sure as hell ain't right.
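The arithmetic behind that can be checked in userspace. This is a sketch under stated assumptions: the demo_* helpers are hypothetical, the 48-bit and 57-bit widths model 4-level and 5-level paging, and the sample addresses are illustrative rather than taken from kernel headers. The comparison mirrors the `x <= limit` form of valid_user_address().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The unpatched placeholder value mentioned above. */
#define DEMO_PLACEHOLDER UINT64_C(0x0123456789abcdef)

/*
 * An address is canonical for a given virtual address width when
 * bits [63 : va_bits-1] are all zeroes or all ones.
 */
static bool demo_is_canonical(uint64_t addr, unsigned int va_bits)
{
	assert(va_bits >= 2 && va_bits <= 64);

	uint64_t upper = addr >> (va_bits - 1);
	uint64_t all_ones = UINT64_MAX >> (va_bits - 1);

	return upper == 0 || upper == all_ones;
}

/* Same shape as the check in the thread: address <= limit. */
static bool demo_valid_user_address(uint64_t addr, uint64_t limit)
{
	return addr <= limit;
}
```

With the placeholder as the limit, every lower-half user address passes and every upper-half kernel address is rejected, which is why things appear to work. Addresses between the real user limit and the placeholder also pass, which is the sense in which the result is not right.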
I think that for x86-64 and for the short term, the right thing to do
is to make access_ok() be out-of-line. Nobody should use it any more
anyway, it's a legacy operation for back when doing access_ok() +
__get_user() was a big and valid optimization.
So I think the other thing that kind of saved us - but probably also
meant that the bug wasn't as obvious as it should have been - was
exactly the fact that it affects that operation that really nobody
should use anyway.
Linus
* Re: [PATCH v4] fs: hide names_cachep behind runtime access machinery
2025-10-30 18:07 ` Linus Torvalds
2025-10-30 18:25 ` Linus Torvalds
@ 2025-10-30 21:39 ` Mateusz Guzik
2025-10-30 22:06 ` Mateusz Guzik
2025-10-31 12:08 ` Christian Brauner
1 sibling, 2 replies; 51+ messages in thread
From: Mateusz Guzik @ 2025-10-30 21:39 UTC (permalink / raw)
To: Linus Torvalds
Cc: Thomas Gleixner, brauner, viro, jack, linux-kernel, linux-fsdevel,
pfalcato
On Thu, Oct 30, 2025 at 7:07 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> [ Adding Thomas, because he's been working on our x86 uaccess code,
> and I actually think we get this all wrong for access_ok() etc ]
>
> On Thu, 30 Oct 2025 at 09:35, Mateusz Guzik <mjguzik@gmail.com> wrote:
> >
> > I don't know if you are suggesting to make the entire thing fail to
> > compile if included for a module, or to transparently convert
> > runtime-optimized access into plain access.
> >
> > I presume the former.
>
> I think *including* it should be ok, because we have things like
> <asm/uaccess.h> - or your addition to <linux/fs.h> - that use it for
> core functionality that is then not supported for module use.
>
> Yeah, in a perfect world we'd have those things only in "internal"
> headers and people couldn't include them even by mistake, but that
> ends up being a pain.
>
> So I don't think your
>
> +#ifdef MODULE
> +#error "this functionality is not available for modules"
> +#endif
>
> model works, because I think it might be too painful to fix (but hey,
> maybe I'm wrong).
>
In my proposal the patch which messes with the namei cache address
would have the following in fs.h:
#ifndef MODULE
#include <asm/runtime-const.h>
#endif
As in, unless the kernel itself is being compiled, it would pretend
the runtime machinery does not even exist, which imo is preferable to
failing later at link time.
Then whatever functionality using runtime-const is straight up not
available and code insisting on providing something for modules anyway
is forced to provide an ifdefed implementation.
Ignoring the safety vs modules thing and back to the names_cachep
patch: the reported riscv build failure has proven problematic to fix.
Turns out mm.h includes huge_mm.h, which then includes fs.h(!). Adding
the runtime-const.h include into fs.h then results in compilation
failure on that platform as it depends on vmalloc-related symbols
which are only getting declared *after* fs.h gets included.
I tried to get rid of the fs.h inclusion in huge_mm.h, but that
uncovered a bunch of other build failures where code works only
because fs.h got sneaked in by someone else.
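The failure mode is ordinary C declaration ordering, which a single-file sketch can show. All demo_* names are hypothetical stand-ins, and the forward declaration is only there to make the sketch compile; it is not a fix proposed anywhere in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Part 1: stand-in for a low-level header that gets pulled in early.
 * Its inline function calls a helper that is only introduced further
 * down. Without this forward declaration the compiler would report
 * an implicit declaration at the call site, as in the riscv report.
 */
bool demo_is_vmalloc_addr(const void *addr);

static inline int demo_flush_cache_vmap(const void *start)
{
	return demo_is_vmalloc_addr(start) ? 1 : 0;
}

/*
 * Part 2: stand-in for the header that provides the helper, which the
 * preprocessor only reaches after part 1 has been parsed.
 */
bool demo_is_vmalloc_addr(const void *addr)
{
	return addr != NULL;
}
```

Code that compiles only because some other header happened to be included first has the same shape: removing the incidental include exposes the missing declaration.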
Given the level of bullshit here it may be it is just straight up
infeasible to include runtime-const.h in fs.h without major
rototilling, which I'm not signing up for.
I wonder if it would make sense to bypass the problem by moving the
pathname handling routines to a different header -- might be useful in
its own right to slim down the kitchen sink that fs.h turned out to
be, but that's another bikeshed-y material.
I may end up just ditching this for the time being.
* Re: [PATCH v4] fs: hide names_cachep behind runtime access machinery
2025-10-30 21:39 ` Mateusz Guzik
@ 2025-10-30 22:06 ` Mateusz Guzik
2025-10-31 12:08 ` Christian Brauner
1 sibling, 0 replies; 51+ messages in thread
From: Mateusz Guzik @ 2025-10-30 22:06 UTC (permalink / raw)
To: Linus Torvalds
Cc: Thomas Gleixner, brauner, viro, jack, linux-kernel, linux-fsdevel,
pfalcato
On Thu, Oct 30, 2025 at 10:39:46PM +0100, Mateusz Guzik wrote:
> On Thu, Oct 30, 2025 at 7:07 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > [ Adding Thomas, because he's been working on our x86 uaccess code,
> > and I actually think we get this all wrong for access_ok() etc ]
> >
> > On Thu, 30 Oct 2025 at 09:35, Mateusz Guzik <mjguzik@gmail.com> wrote:
> > >
> > > I don't know if you are suggesting to make the entire thing fail to
> > > compile if included for a module, or to transparently convert
> > > runtime-optimized access into plain access.
> > >
> > > I presume the former.
> >
> > I think *including* it should be ok, because we have things like
> > <asm/uaccess.h> - or your addition to <linux/fs.h> - that use it for
> > core functionality that is then not supported for module use.
> >
> > Yeah, in a perfect world we'd have those things only in "internal"
> > headers and people couldn't include them even by mistake, but that
> > ends up being a pain.
> >
> > So I don't think your
> >
> > +#ifdef MODULE
> > +#error "this functionality is not available for modules"
> > +#endif
> >
> > model works, because I think it might be too painful to fix (but hey,
> > maybe I'm wrong).
> >
>
> In my proposal the patch which messes with the namei cache address
> would have the following in fs.h:
> #ifndef MODULE
> #include <asm/runtime-const.h>
> #endif
>
> As in, unless the kernel itself is being compiled, it would pretend
> the runtime machinery does not even exist, which imo is preferable to
> failing later at link time.
>
> Then whatever functionality using runtime-const is straight up not
> available and code insisting on providing something for modules anyway
> is forced to provide an ifdefed implementation.
>
Here is a build-tested diff for bzImage itself and M=fs/erofs on the
x86-64 architecture.
It keeps access_ok() inline for demonstrative purposes, I have no opinion
what to do with this specific sucker.
diff --git a/arch/x86/include/asm/runtime-const.h b/arch/x86/include/asm/runtime-const.h
index 8d983cfd06ea..dc3273ac2034 100644
--- a/arch/x86/include/asm/runtime-const.h
+++ b/arch/x86/include/asm/runtime-const.h
@@ -2,6 +2,10 @@
#ifndef _ASM_RUNTIME_CONST_H
#define _ASM_RUNTIME_CONST_H
+#ifdef MODULE
+#error "this functionality is not available for modules"
+#endif
+
#ifdef __ASSEMBLY__
.macro RUNTIME_CONST_PTR sym reg
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index c8a5ae35c871..ce8f6be1964e 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -12,13 +12,14 @@
#include <asm/cpufeatures.h>
#include <asm/page.h>
#include <asm/percpu.h>
-#include <asm/runtime-const.h>
-/*
- * Virtual variable: there's no actual backing store for this,
- * it can purely be used as 'runtime_const_ptr(USER_PTR_MAX)'
- */
extern unsigned long USER_PTR_MAX;
+#ifdef MODULE
+#define __USER_PTR_MAX USER_PTR_MAX
+#else
+#include <asm/runtime-const.h>
+#define __USER_PTR_MAX runtime_const_ptr(USER_PTR_MAX)
+#endif
#ifdef CONFIG_ADDRESS_MASKING
/*
@@ -54,7 +55,7 @@ static inline unsigned long __untagged_addr_remote(struct mm_struct *mm,
#endif
#define valid_user_address(x) \
- likely((__force unsigned long)(x) <= runtime_const_ptr(USER_PTR_MAX))
+ likely((__force unsigned long)(x) <= __USER_PTR_MAX)
/*
* Masking the user address is an alternative to a conditional
@@ -67,7 +68,7 @@ static inline void __user *mask_user_address(const void __user *ptr)
asm("cmp %1,%0\n\t"
"cmova %1,%0"
:"=r" (ret)
- :"r" (runtime_const_ptr(USER_PTR_MAX)),
+ :"r" (__USER_PTR_MAX),
"0" (ptr));
return ret;
}
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 3ff9682d8bc4..5a3d89ed75d1 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -78,6 +78,9 @@
DEFINE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
EXPORT_PER_CPU_SYMBOL(cpu_info);
+unsigned long USER_PTR_MAX __ro_after_init = TASK_SIZE_MAX;
+EXPORT_SYMBOL(USER_PTR_MAX);
+
u32 elf_hwcap2 __read_mostly;
/* Number of siblings per CPU package */
@@ -2575,8 +2578,6 @@ void __init arch_cpu_finalize_init(void)
alternative_instructions();
if (IS_ENABLED(CONFIG_X86_64)) {
- unsigned long USER_PTR_MAX = TASK_SIZE_MAX;
-
/*
* Enable this when LAM is gated on LASS support
if (cpu_feature_enabled(X86_FEATURE_LAM))
* Re: [PATCH v4] fs: hide names_cachep behind runtime access machinery
2025-10-30 21:39 ` Mateusz Guzik
2025-10-30 22:06 ` Mateusz Guzik
@ 2025-10-31 12:08 ` Christian Brauner
2025-10-31 15:13 ` Mateusz Guzik
1 sibling, 1 reply; 51+ messages in thread
From: Christian Brauner @ 2025-10-31 12:08 UTC (permalink / raw)
To: Mateusz Guzik
Cc: Linus Torvalds, Thomas Gleixner, viro, jack, linux-kernel,
linux-fsdevel, pfalcato
> I wonder if it would make sense to bypass the problem by moving the
> pathname handling routines to a different header -- might be useful in
> its own right to slim down the kitchen sink that fs.h turned out to
> be, but that's another bikeshed-y material.
fs.h needs to be split up. It's on my ToDo but let's just say there's a
lot of stuff on it so it's not really high-priority. If you have a good
reason to move something out of there be my guest. It would be
appreciated!
* Re: [PATCH v4] fs: hide names_cachep behind runtime access machinery
2025-10-30 10:52 [PATCH v4] fs: hide names_cachep behind runtime access machinery Mateusz Guzik
2025-10-30 13:13 ` kernel test robot
2025-10-30 16:15 ` Linus Torvalds
@ 2025-10-31 13:30 ` kernel test robot
2025-10-31 22:43 ` kernel test robot
2025-11-01 23:06 ` kernel test robot
4 siblings, 0 replies; 51+ messages in thread
From: kernel test robot @ 2025-10-31 13:30 UTC (permalink / raw)
To: Mateusz Guzik, brauner
Cc: llvm, oe-kbuild-all, viro, jack, linux-kernel, linux-fsdevel,
torvalds, pfalcato, Mateusz Guzik
Hi Mateusz,
kernel test robot noticed the following build errors:
[auto build test ERROR on arnd-asm-generic/master]
[also build test ERROR on linus/master brauner-vfs/vfs.all linux/master v6.18-rc3 next-20251031]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Mateusz-Guzik/fs-hide-names_cachep-behind-runtime-access-machinery/20251030-185523
base: https://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git master
patch link: https://lore.kernel.org/r/20251030105242.801528-1-mjguzik%40gmail.com
patch subject: [PATCH v4] fs: hide names_cachep behind runtime access machinery
config: riscv-randconfig-002-20251031 (https://download.01.org/0day-ci/archive/20251031/202510312143.SvwwhqVp-lkp@intel.com/config)
compiler: clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251031/202510312143.SvwwhqVp-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510312143.SvwwhqVp-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from arch/riscv/kernel/asm-offsets.c:8:
In file included from include/linux/mm.h:1016:
In file included from include/linux/huge_mm.h:7:
In file included from include/linux/fs.h:53:
In file included from arch/riscv/include/asm/runtime-const.h:7:
>> arch/riscv/include/asm/cacheflush.h:49:6: error: call to undeclared function 'is_vmalloc_or_module_addr'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
49 | if (is_vmalloc_or_module_addr((void *)start)) {
| ^
In file included from arch/riscv/kernel/asm-offsets.c:8:
In file included from include/linux/mm.h:1016:
In file included from include/linux/huge_mm.h:7:
In file included from include/linux/fs.h:53:
In file included from arch/riscv/include/asm/runtime-const.h:9:
In file included from include/linux/memory.h:19:
In file included from include/linux/node.h:18:
In file included from include/linux/device.h:32:
In file included from include/linux/device/driver.h:21:
In file included from include/linux/module.h:20:
In file included from include/linux/elf.h:6:
In file included from arch/riscv/include/asm/elf.h:12:
In file included from include/linux/compat.h:18:
include/uapi/linux/aio_abi.h:79:2: error: unknown type name '__kernel_rwf_t'; did you mean '__kernel_off_t'?
79 | __kernel_rwf_t aio_rw_flags; /* RWF_* flags */
| ^~~~~~~~~~~~~~
| __kernel_off_t
include/uapi/asm-generic/posix_types.h:87:25: note: '__kernel_off_t' declared here
87 | typedef __kernel_long_t __kernel_off_t;
| ^
2 errors generated.
make[3]: *** [scripts/Makefile.build:182: arch/riscv/kernel/asm-offsets.s] Error 1 shuffle=1341192968
make[3]: Target 'prepare' not remade because of errors.
make[2]: *** [Makefile:1282: prepare0] Error 2 shuffle=1341192968
make[2]: Target 'prepare' not remade because of errors.
make[1]: *** [Makefile:248: __sub-make] Error 2 shuffle=1341192968
make[1]: Target 'prepare' not remade because of errors.
make: *** [Makefile:248: __sub-make] Error 2 shuffle=1341192968
make: Target 'prepare' not remade because of errors.
vim +/is_vmalloc_or_module_addr +49 arch/riscv/include/asm/cacheflush.h
08f051eda33b51e Andrew Waterman 2017-10-25 42
7e3811521dc3934 Alexandre Ghiti 2023-07-25 43 #ifdef CONFIG_64BIT
503638e0babf364 Alexandre Ghiti 2024-07-17 44 extern u64 new_vmalloc[NR_CPUS / sizeof(u64) + 1];
503638e0babf364 Alexandre Ghiti 2024-07-17 45 extern char _end[];
503638e0babf364 Alexandre Ghiti 2024-07-17 46 #define flush_cache_vmap flush_cache_vmap
503638e0babf364 Alexandre Ghiti 2024-07-17 47 static inline void flush_cache_vmap(unsigned long start, unsigned long end)
503638e0babf364 Alexandre Ghiti 2024-07-17 48 {
503638e0babf364 Alexandre Ghiti 2024-07-17 @49 if (is_vmalloc_or_module_addr((void *)start)) {
503638e0babf364 Alexandre Ghiti 2024-07-17 50 int i;
503638e0babf364 Alexandre Ghiti 2024-07-17 51
503638e0babf364 Alexandre Ghiti 2024-07-17 52 /*
503638e0babf364 Alexandre Ghiti 2024-07-17 53 * We don't care if concurrently a cpu resets this value since
503638e0babf364 Alexandre Ghiti 2024-07-17 54 * the only place this can happen is in handle_exception() where
503638e0babf364 Alexandre Ghiti 2024-07-17 55 * an sfence.vma is emitted.
503638e0babf364 Alexandre Ghiti 2024-07-17 56 */
503638e0babf364 Alexandre Ghiti 2024-07-17 57 for (i = 0; i < ARRAY_SIZE(new_vmalloc); ++i)
503638e0babf364 Alexandre Ghiti 2024-07-17 58 new_vmalloc[i] = -1ULL;
503638e0babf364 Alexandre Ghiti 2024-07-17 59 }
503638e0babf364 Alexandre Ghiti 2024-07-17 60 }
7a92fc8b4d20680 Alexandre Ghiti 2023-12-12 61 #define flush_cache_vmap_early(start, end) local_flush_tlb_kernel_range(start, end)
7e3811521dc3934 Alexandre Ghiti 2023-07-25 62 #endif
7e3811521dc3934 Alexandre Ghiti 2023-07-25 63
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH v4] fs: hide names_cachep behind runtime access machinery
2025-10-31 12:08 ` Christian Brauner
@ 2025-10-31 15:13 ` Mateusz Guzik
2025-10-31 16:04 ` Linus Torvalds
0 siblings, 1 reply; 51+ messages in thread
From: Mateusz Guzik @ 2025-10-31 15:13 UTC (permalink / raw)
To: Christian Brauner
Cc: Linus Torvalds, Thomas Gleixner, viro, jack, linux-kernel,
linux-fsdevel, pfalcato
On Fri, Oct 31, 2025 at 1:08 PM Christian Brauner <brauner@kernel.org> wrote:
>
> > I wonder if it would make sense to bypass the problem by moving the
> > pathname handling routines to a different header -- might be useful in
> > its own right to slim down the kitchen sink that fs.h turned out to
> > be, but that's another bikeshed-y material.
>
> fs.h needs to be split up. It's on my ToDo but let's just say there's a
> lot of stuff on it so it's not really high-priority. If you have a good
> reason to move something out of there be my guest. It would be
> appreciated!
I slept on it and I think the pragmatic way forward is to split up
runtime-const.h instead.
The code to emit patchable access has very few requirements in
terms of header files. In contrast, the code to do the patching can pull
in all kinds of headers with riscv being a great example.
While I ran into problems with fs.h on riscv specifically, one has to
expect the pre-existing mess will be posing an issue in other places
should they try to use the machinery.
So I think runtime-const-accessors.h (pardon the long name) for things
like fs.h would be the way forward, regardless of what happens with
the latter in the long run. I'm going to hack it up later.
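The difference between the two halves can be pictured with a small
userspace model. This is only a sketch under stated assumptions: text[],
sites[], emit_access(), fixup_all() and read_site() are hypothetical names,
not kernel interfaces, and the real accessors are inline asm that record
their location in a linker section rather than an array. The accessor side
only writes a placeholder immediate and notes where it lives; the fixup side
walks the recorded sites and overwrites them. A site that never gets
recorded (the module case) keeps the placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder immediate; same magic value the x86 accessor uses. */
#define PLACEHOLDER 0x0123456789abcdefULL
#define NR_SLOTS 2

/* Hypothetical stand-in for kernel text: each slot is one 8-byte immediate. */
static unsigned char text[NR_SLOTS][8];

/* Hypothetical stand-in for the runtime_ptr_<sym> section: sites to patch. */
static size_t sites[NR_SLOTS];
static size_t nr_sites;

/* Accessor side: emit the placeholder and optionally record the site. */
static void emit_access(size_t idx, int record)
{
	uint64_t v = PLACEHOLDER;

	assert(idx < NR_SLOTS);
	memcpy(text[idx], &v, sizeof(v));
	if (record)
		sites[nr_sites++] = idx;
}

/* Fixup side: overwrite the immediate at every recorded site. */
static void fixup_all(uint64_t val)
{
	for (size_t i = 0; i < nr_sites; i++)
		memcpy(text[sites[i]], &val, sizeof(val));
}

/* Read back what a given site would load at run time. */
static uint64_t read_site(size_t idx)
{
	uint64_t v;

	assert(idx < NR_SLOTS);
	memcpy(&v, text[idx], sizeof(v));
	return v;
}
```

In this model only emit_access() would need to live in a header that is
included widely; fixup_all() is called once from init code and can afford
heavier dependencies.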
* Re: [PATCH v4] fs: hide names_cachep behind runtime access machinery
2025-10-31 15:13 ` Mateusz Guzik
@ 2025-10-31 16:04 ` Linus Torvalds
2025-10-31 16:25 ` Mateusz Guzik
0 siblings, 1 reply; 51+ messages in thread
From: Linus Torvalds @ 2025-10-31 16:04 UTC (permalink / raw)
To: Mateusz Guzik
Cc: Christian Brauner, Thomas Gleixner, viro, jack, linux-kernel,
linux-fsdevel, pfalcato
On Fri, 31 Oct 2025 at 08:13, Mateusz Guzik <mjguzik@gmail.com> wrote:
>
> I slept on it and I think the pragmatic way forward is to split up
> runtime-const.h instead.
I don't think that would be wrong, but I do think the real bug was to
include runtime-const.h in any headers at all.
It should only be included by C code that is always built-in.
And it's all my fault and due to incompetence: this was introduced by
me in commit 86e6b1547b3d ("x86: fix user address masking
non-canonical speculation issue").
The original runtime const design was for core code optimization only,
and I just didn't think about the module case when I did that thing.
Sadly, this goes beyond just the trivial "access_ok()" - which can
trivially be fixed by just making it out-of-line. It ends up impacting
user address masking too.
It so happens that all our can_do_masked_user_access() optimizations
are in core code, so it's not an *actual* bug, just a potential one,
but it's one that Thomas' patches to do the nice scoped user accesses
will likely make much more common, just because his interface is so
much more convenient.
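For reference, the check and the masking both boil down to a comparison
against the same limit. The following is a portable C sketch of those
semantics only, not the kernel code: the real mask_user_address() uses
cmp/cmova inline asm, and the limit value below is an illustrative
assumption rather than something taken from this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative limit; hypothetical stand-in for the kernel's USER_PTR_MAX. */
static uint64_t user_ptr_max = 0x00007ffffffff000ULL;

/* Equivalent of valid_user_address(): unsigned compare against the limit. */
static int valid_user_address(uint64_t addr)
{
	return addr <= user_ptr_max;
}

/*
 * Equivalent of the cmp/cmova pair: anything above the limit is clamped
 * to the limit, everything else passes through unchanged.
 */
static uint64_t mask_user_address(uint64_t addr)
{
	return addr > user_ptr_max ? user_ptr_max : addr;
}
```

Both helpers are only as good as the limit they compare against, which is
why a caller that sees a different value than core code is a problem.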
End result: I think your patch to just use
#ifdef MODULE
in the header was the right one. Except instead of that
+#ifdef MODULE
+#define __USER_PTR_MAX USER_PTR_MAX
+#else
thing, I think the right thing to do is to just do
#ifdef MODULE
#include <asm-generic/runtime-const.h>
#undef runtime_const_init
#else
#include <asm/runtime-const.h>
#endif
in the x86 uaccess_64.h header file.
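The effect of such a switch can be sketched in userspace. This is an
assumption-laden model, not the proposed header: patched_user_ptr_max() and
current_limit() are hypothetical names, the value 0x1000 is made up, and the
real non-module accessor is inline asm with a patched immediate. The point
is only that callers keep writing runtime_const_ptr(sym) and a module build
quietly turns it into an ordinary variable load.

```c
#include <assert.h>

/* Illustrative backing variable; the value is made up for the example. */
static unsigned long user_ptr_max = 0x1000;

#ifdef MODULE
/* Generic fallback: a plain load of the real variable. */
#define runtime_const_ptr(sym) (sym)
#else
/* Hypothetical stand-in for the patched-immediate accessor. */
static unsigned long patched_user_ptr_max(void)
{
	return user_ptr_max;
}
#define runtime_const_ptr(sym) patched_##sym()
#endif

/* A caller looks the same in both builds. */
unsigned long current_limit(void)
{
	return runtime_const_ptr(user_ptr_max);
}
```

Building the same file with and without -DMODULE gives the same result from
current_limit(); only the way the value is fetched differs.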
Let me think about this a bit more, but I feel really bad about having
missed this bug. I'm relieved to say that it looks largely harmless in
practice, but it really is me having royally messed up.
Linus
* Re: [PATCH v4] fs: hide names_cachep behind runtime access machinery
2025-10-31 16:04 ` Linus Torvalds
@ 2025-10-31 16:25 ` Mateusz Guzik
2025-10-31 16:31 ` Linus Torvalds
0 siblings, 1 reply; 51+ messages in thread
From: Mateusz Guzik @ 2025-10-31 16:25 UTC (permalink / raw)
To: Linus Torvalds
Cc: Christian Brauner, Thomas Gleixner, viro, jack, linux-kernel,
linux-fsdevel, pfalcato
On Fri, Oct 31, 2025 at 5:04 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Fri, 31 Oct 2025 at 08:13, Mateusz Guzik <mjguzik@gmail.com> wrote:
> >
> > I slept on it and I think the pragmatic way forward is to split up
> > runtime-const.h instead.
>
> I don't think that would be wrong, but I do think the real bug was to
> include runtime-const.h in any headers at all.
>
I think that was the right call, just that realities of going past
amd64 caught up with the header mixing the dependency-lean (if you
will) access with dependency-heavy patching of it.
Again with names_cachep as an example: there are different spots which
use it. On paper, fs.h can include the right header(tm) and everyone
is transparently covered. Without that every single .c file has to be
adjusted.
But that's only a few spots, so one could argue that's a minor inconvenience.
Suppose one was trying to make systemic use of the machinery for
other stuff. For the sake of argument, say everything marked
ro_after_init?
With a lean header it will be feasible to sneak it in to something de
facto included everywhere.
[snip]
> End result: I think your patch to just use
>
> #ifdef MODULE
>
> in the header was the right one. Except instead of that
>
> +#ifdef MODULE
> +#define __USER_PTR_MAX USER_PTR_MAX
> +#else
>
> thing, I think the right thing to do is to just do
>
> #ifdef MODULE
> #include <asm-generic/runtime-const.h>
> #undef runtime_const_init
> #else
> #include <asm/runtime-const.h>
> #endif
>
> in the x86 uaccess_64.h header file.
>
While I can concede __USER_PTR_MAX naming is not the best here, I
think your approach not only looks weird but also complicates things.
I take it the intent would still be to fail compilation if
runtime-const.h is included. The file is there for the premier
platforms, but most platforms still resort to
asm-generic/runtime-const.h. I think it would be beneficial to have
that sucker also cause compilation failure if included for a module.
That way someone developing on a non-mainstream platform is less
likely to post a patch bogus on this front.
> Let me think about this a bit more, but I feel really bad about having
> missed this bug. I'm relieved to say that it looks largely harmless in
> practice, but it really is me having royally messed up.
>
Sure, there is no rush whatsoever. The original patch was meant to be
a 5 minute detour and it is not holding up any work.
* Re: [PATCH v4] fs: hide names_cachep behind runtime access machinery
2025-10-31 16:25 ` Mateusz Guzik
@ 2025-10-31 16:31 ` Linus Torvalds
2025-10-31 17:42 ` [WIP RFC PATCH 0/3] runtime-const header split and whatnot Mateusz Guzik
0 siblings, 1 reply; 51+ messages in thread
From: Linus Torvalds @ 2025-10-31 16:31 UTC (permalink / raw)
To: Mateusz Guzik
Cc: Christian Brauner, Thomas Gleixner, viro, jack, linux-kernel,
linux-fsdevel, pfalcato
On Fri, 31 Oct 2025 at 09:25, Mateusz Guzik <mjguzik@gmail.com> wrote:
>
> I take it the intent would still be to fail compilation if
> runtime-const.h is included. The file is there for the premier
> platforms, but most platforms still resort to
> asm-generic/runtime-const.h. I think it would be beneficial to have
> that sucker also cause compilation failure if included for a module.
Good point. Yeah, I think you're right.
Linus
* [WIP RFC PATCH 0/3] runtime-const header split and whatnot
2025-10-31 16:31 ` Linus Torvalds
@ 2025-10-31 17:42 ` Mateusz Guzik
2025-10-31 17:42 ` [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules Mateusz Guzik
` (2 more replies)
0 siblings, 3 replies; 51+ messages in thread
From: Mateusz Guzik @ 2025-10-31 17:42 UTC (permalink / raw)
To: torvalds
Cc: brauner, viro, jack, linux-kernel, linux-fsdevel, tglx, pfalcato,
Mateusz Guzik
So I slapped together what I described into a WIP patchset.
The runtime header treatment is so far only done for x86 and riscv.
I verified things still compile with this in fs.h:
#ifndef MODULE
#include <asm/runtime-const-accessors.h>
#endif
The -accessors suffix is not my favourite, but I don't have a better
name.
If this looks OK I'm going to do the needful(tm).
Mateusz Guzik (3):
x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX
in modules
runtime-const: split headers between accessors and fixup; disable for
modules
fs: hide names_cachep behind runtime access machinery
.../include/asm/runtime-const-accessors.h | 151 ++++++++++++++++++
arch/riscv/include/asm/runtime-const.h | 142 +---------------
.../x86/include/asm/runtime-const-accessors.h | 45 ++++++
arch/x86/include/asm/runtime-const.h | 38 +----
arch/x86/include/asm/uaccess_64.h | 17 +-
arch/x86/kernel/cpu/common.c | 8 +-
fs/dcache.c | 1 +
include/asm-generic/vmlinux.lds.h | 3 +-
include/linux/fs.h | 17 +-
9 files changed, 232 insertions(+), 190 deletions(-)
create mode 100644 arch/riscv/include/asm/runtime-const-accessors.h
create mode 100644 arch/x86/include/asm/runtime-const-accessors.h
--
2.34.1
* [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-10-31 17:42 ` [WIP RFC PATCH 0/3] runtime-const header split and whatnot Mateusz Guzik
@ 2025-10-31 17:42 ` Mateusz Guzik
2025-10-31 21:46 ` Linus Torvalds
` (2 more replies)
2025-10-31 17:42 ` [PATCH 2/3] runtime-const: split headers between accessors and fixup; disable for modules Mateusz Guzik
2025-10-31 17:42 ` [PATCH 3/3] fs: hide names_cachep behind runtime access machinery Mateusz Guzik
2 siblings, 3 replies; 51+ messages in thread
From: Mateusz Guzik @ 2025-10-31 17:42 UTC (permalink / raw)
To: torvalds
Cc: brauner, viro, jack, linux-kernel, linux-fsdevel, tglx, pfalcato,
Mateusz Guzik
[real commit message will land here later]
---
arch/x86/include/asm/uaccess_64.h | 17 +++++++++--------
arch/x86/kernel/cpu/common.c | 8 +++++---
2 files changed, 14 insertions(+), 11 deletions(-)
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index c8a5ae35c871..f60c0ed147c3 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -12,13 +12,14 @@
#include <asm/cpufeatures.h>
#include <asm/page.h>
#include <asm/percpu.h>
-#include <asm/runtime-const.h>
-/*
- * Virtual variable: there's no actual backing store for this,
- * it can purely be used as 'runtime_const_ptr(USER_PTR_MAX)'
- */
-extern unsigned long USER_PTR_MAX;
+extern unsigned long user_ptr_max;
+#ifdef MODULE
+#define __user_ptr_max_accessor user_ptr_max
+#else
+#include <asm/runtime-const.h>
+#define __user_ptr_max_accessor runtime_const_ptr(user_ptr_max)
+#endif
#ifdef CONFIG_ADDRESS_MASKING
/*
@@ -54,7 +55,7 @@ static inline unsigned long __untagged_addr_remote(struct mm_struct *mm,
#endif
#define valid_user_address(x) \
- likely((__force unsigned long)(x) <= runtime_const_ptr(USER_PTR_MAX))
+ likely((__force unsigned long)(x) <= __user_ptr_max_accessor)
/*
* Masking the user address is an alternative to a conditional
@@ -67,7 +68,7 @@ static inline void __user *mask_user_address(const void __user *ptr)
asm("cmp %1,%0\n\t"
"cmova %1,%0"
:"=r" (ret)
- :"r" (runtime_const_ptr(USER_PTR_MAX)),
+ :"r" (__user_ptr_max_accessor),
"0" (ptr));
return ret;
}
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 3ff9682d8bc4..f338f5e9adfc 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -78,6 +78,9 @@
DEFINE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
EXPORT_PER_CPU_SYMBOL(cpu_info);
+unsigned long user_ptr_max __ro_after_init;
+EXPORT_SYMBOL(user_ptr_max);
+
u32 elf_hwcap2 __read_mostly;
/* Number of siblings per CPU package */
@@ -2575,14 +2578,13 @@ void __init arch_cpu_finalize_init(void)
alternative_instructions();
if (IS_ENABLED(CONFIG_X86_64)) {
- unsigned long USER_PTR_MAX = TASK_SIZE_MAX;
-
+ user_ptr_max = TASK_SIZE_MAX;
/*
* Enable this when LAM is gated on LASS support
if (cpu_feature_enabled(X86_FEATURE_LAM))
USER_PTR_MAX = (1ul << 63) - PAGE_SIZE;
*/
- runtime_const_init(ptr, USER_PTR_MAX);
+ runtime_const_init(ptr, user_ptr_max);
/*
* Make sure the first 2MB area is not mapped by huge pages
--
2.34.1
* [PATCH 2/3] runtime-const: split headers between accessors and fixup; disable for modules
2025-10-31 17:42 ` [WIP RFC PATCH 0/3] runtime-const header split and whatnot Mateusz Guzik
2025-10-31 17:42 ` [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules Mateusz Guzik
@ 2025-10-31 17:42 ` Mateusz Guzik
2025-10-31 17:42 ` [PATCH 3/3] fs: hide names_cachep behind runtime access machinery Mateusz Guzik
2 siblings, 0 replies; 51+ messages in thread
From: Mateusz Guzik @ 2025-10-31 17:42 UTC (permalink / raw)
To: torvalds
Cc: brauner, viro, jack, linux-kernel, linux-fsdevel, tglx, pfalcato,
Mateusz Guzik
riscv and x86 covered as a POC
---
.../include/asm/runtime-const-accessors.h | 151 ++++++++++++++++++
arch/riscv/include/asm/runtime-const.h | 142 +---------------
.../x86/include/asm/runtime-const-accessors.h | 45 ++++++
arch/x86/include/asm/runtime-const.h | 38 +----
4 files changed, 200 insertions(+), 176 deletions(-)
create mode 100644 arch/riscv/include/asm/runtime-const-accessors.h
create mode 100644 arch/x86/include/asm/runtime-const-accessors.h
diff --git a/arch/riscv/include/asm/runtime-const-accessors.h b/arch/riscv/include/asm/runtime-const-accessors.h
new file mode 100644
index 000000000000..5b8e0349ee0d
--- /dev/null
+++ b/arch/riscv/include/asm/runtime-const-accessors.h
@@ -0,0 +1,151 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RISCV_RUNTIME_CONST_ACCESSORS_H
+#define _ASM_RISCV_RUNTIME_CONST_ACCESSORS_H
+
+#ifdef MODULE
+#error "this functionality is not available for modules"
+#endif
+
+#ifdef CONFIG_32BIT
+#define runtime_const_ptr(sym) \
+({ \
+ typeof(sym) __ret; \
+ asm_inline(".option push\n\t" \
+ ".option norvc\n\t" \
+ "1:\t" \
+ "lui %[__ret],0x89abd\n\t" \
+ "addi %[__ret],%[__ret],-0x211\n\t" \
+ ".option pop\n\t" \
+ ".pushsection runtime_ptr_" #sym ",\"a\"\n\t" \
+ ".long 1b - .\n\t" \
+ ".popsection" \
+ : [__ret] "=r" (__ret)); \
+ __ret; \
+})
+#else
+/*
+ * Loading 64-bit constants into a register from immediates is a non-trivial
+ * task on riscv64. To get it somewhat performant, load 32 bits into two
+ * different registers and then combine the results.
+ *
+ * If the processor supports the Zbkb extension, we can combine the final
+ * "slli,slli,srli,add" into the single "pack" instruction. If the processor
+ * doesn't support Zbkb but does support the Zbb extension, we can
+ * combine the final "slli,srli,add" into one instruction "add.uw".
+ */
+#define RISCV_RUNTIME_CONST_64_PREAMBLE \
+ ".option push\n\t" \
+ ".option norvc\n\t" \
+ "1:\t" \
+ "lui %[__ret],0x89abd\n\t" \
+ "lui %[__tmp],0x1234\n\t" \
+ "addiw %[__ret],%[__ret],-0x211\n\t" \
+ "addiw %[__tmp],%[__tmp],0x567\n\t" \
+
+#define RISCV_RUNTIME_CONST_64_BASE \
+ "slli %[__tmp],%[__tmp],32\n\t" \
+ "slli %[__ret],%[__ret],32\n\t" \
+ "srli %[__ret],%[__ret],32\n\t" \
+ "add %[__ret],%[__ret],%[__tmp]\n\t" \
+
+#define RISCV_RUNTIME_CONST_64_ZBA \
+ ".option push\n\t" \
+ ".option arch,+zba\n\t" \
+ ".option norvc\n\t" \
+ "slli %[__tmp],%[__tmp],32\n\t" \
+ "add.uw %[__ret],%[__ret],%[__tmp]\n\t" \
+ "nop\n\t" \
+ "nop\n\t" \
+ ".option pop\n\t" \
+
+#define RISCV_RUNTIME_CONST_64_ZBKB \
+ ".option push\n\t" \
+ ".option arch,+zbkb\n\t" \
+ ".option norvc\n\t" \
+ "pack %[__ret],%[__ret],%[__tmp]\n\t" \
+ "nop\n\t" \
+ "nop\n\t" \
+ "nop\n\t" \
+ ".option pop\n\t" \
+
+#define RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
+ ".option pop\n\t" \
+ ".pushsection runtime_ptr_" #sym ",\"a\"\n\t" \
+ ".long 1b - .\n\t" \
+ ".popsection" \
+
+#if defined(CONFIG_RISCV_ISA_ZBA) && defined(CONFIG_TOOLCHAIN_HAS_ZBA) \
+ && defined(CONFIG_RISCV_ISA_ZBKB)
+#define runtime_const_ptr(sym) \
+({ \
+ typeof(sym) __ret, __tmp; \
+ asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
+ ALTERNATIVE_2( \
+ RISCV_RUNTIME_CONST_64_BASE, \
+ RISCV_RUNTIME_CONST_64_ZBA, \
+ 0, RISCV_ISA_EXT_ZBA, 1, \
+ RISCV_RUNTIME_CONST_64_ZBKB, \
+ 0, RISCV_ISA_EXT_ZBKB, 1 \
+ ) \
+ RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
+ : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
+ __ret; \
+})
+#elif defined(CONFIG_RISCV_ISA_ZBA) && defined(CONFIG_TOOLCHAIN_HAS_ZBA)
+#define runtime_const_ptr(sym) \
+({ \
+ typeof(sym) __ret, __tmp; \
+ asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
+ ALTERNATIVE( \
+ RISCV_RUNTIME_CONST_64_BASE, \
+ RISCV_RUNTIME_CONST_64_ZBA, \
+ 0, RISCV_ISA_EXT_ZBA, 1 \
+ ) \
+ RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
+ : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
+ __ret; \
+})
+#elif defined(CONFIG_RISCV_ISA_ZBKB)
+#define runtime_const_ptr(sym) \
+({ \
+ typeof(sym) __ret, __tmp; \
+ asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
+ ALTERNATIVE( \
+ RISCV_RUNTIME_CONST_64_BASE, \
+ RISCV_RUNTIME_CONST_64_ZBKB, \
+ 0, RISCV_ISA_EXT_ZBKB, 1 \
+ ) \
+ RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
+ : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
+ __ret; \
+})
+#else
+#define runtime_const_ptr(sym) \
+({ \
+ typeof(sym) __ret, __tmp; \
+ asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
+ RISCV_RUNTIME_CONST_64_BASE \
+ RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
+ : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
+ __ret; \
+})
+#endif
+#endif
+
+#define runtime_const_shift_right_32(val, sym) \
+({ \
+ u32 __ret; \
+ asm_inline(".option push\n\t" \
+ ".option norvc\n\t" \
+ "1:\t" \
+ SRLI " %[__ret],%[__val],12\n\t" \
+ ".option pop\n\t" \
+ ".pushsection runtime_shift_" #sym ",\"a\"\n\t" \
+ ".long 1b - .\n\t" \
+ ".popsection" \
+ : [__ret] "=r" (__ret) \
+ : [__val] "r" (val)); \
+ __ret; \
+})
+
+#endif /* _ASM_RISCV_RUNTIME_CONST_ACCESSORS_H */
diff --git a/arch/riscv/include/asm/runtime-const.h b/arch/riscv/include/asm/runtime-const.h
index d766e2b9e6df..14994be81487 100644
--- a/arch/riscv/include/asm/runtime-const.h
+++ b/arch/riscv/include/asm/runtime-const.h
@@ -11,147 +11,7 @@
#include <linux/uaccess.h>
-#ifdef CONFIG_32BIT
-#define runtime_const_ptr(sym) \
-({ \
- typeof(sym) __ret; \
- asm_inline(".option push\n\t" \
- ".option norvc\n\t" \
- "1:\t" \
- "lui %[__ret],0x89abd\n\t" \
- "addi %[__ret],%[__ret],-0x211\n\t" \
- ".option pop\n\t" \
- ".pushsection runtime_ptr_" #sym ",\"a\"\n\t" \
- ".long 1b - .\n\t" \
- ".popsection" \
- : [__ret] "=r" (__ret)); \
- __ret; \
-})
-#else
-/*
- * Loading 64-bit constants into a register from immediates is a non-trivial
- * task on riscv64. To get it somewhat performant, load 32 bits into two
- * different registers and then combine the results.
- *
- * If the processor supports the Zbkb extension, we can combine the final
- * "slli,slli,srli,add" into the single "pack" instruction. If the processor
- * doesn't support Zbkb but does support the Zbb extension, we can
- * combine the final "slli,srli,add" into one instruction "add.uw".
- */
-#define RISCV_RUNTIME_CONST_64_PREAMBLE \
- ".option push\n\t" \
- ".option norvc\n\t" \
- "1:\t" \
- "lui %[__ret],0x89abd\n\t" \
- "lui %[__tmp],0x1234\n\t" \
- "addiw %[__ret],%[__ret],-0x211\n\t" \
- "addiw %[__tmp],%[__tmp],0x567\n\t" \
-
-#define RISCV_RUNTIME_CONST_64_BASE \
- "slli %[__tmp],%[__tmp],32\n\t" \
- "slli %[__ret],%[__ret],32\n\t" \
- "srli %[__ret],%[__ret],32\n\t" \
- "add %[__ret],%[__ret],%[__tmp]\n\t" \
-
-#define RISCV_RUNTIME_CONST_64_ZBA \
- ".option push\n\t" \
- ".option arch,+zba\n\t" \
- ".option norvc\n\t" \
- "slli %[__tmp],%[__tmp],32\n\t" \
- "add.uw %[__ret],%[__ret],%[__tmp]\n\t" \
- "nop\n\t" \
- "nop\n\t" \
- ".option pop\n\t" \
-
-#define RISCV_RUNTIME_CONST_64_ZBKB \
- ".option push\n\t" \
- ".option arch,+zbkb\n\t" \
- ".option norvc\n\t" \
- "pack %[__ret],%[__ret],%[__tmp]\n\t" \
- "nop\n\t" \
- "nop\n\t" \
- "nop\n\t" \
- ".option pop\n\t" \
-
-#define RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
- ".option pop\n\t" \
- ".pushsection runtime_ptr_" #sym ",\"a\"\n\t" \
- ".long 1b - .\n\t" \
- ".popsection" \
-
-#if defined(CONFIG_RISCV_ISA_ZBA) && defined(CONFIG_TOOLCHAIN_HAS_ZBA) \
- && defined(CONFIG_RISCV_ISA_ZBKB)
-#define runtime_const_ptr(sym) \
-({ \
- typeof(sym) __ret, __tmp; \
- asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
- ALTERNATIVE_2( \
- RISCV_RUNTIME_CONST_64_BASE, \
- RISCV_RUNTIME_CONST_64_ZBA, \
- 0, RISCV_ISA_EXT_ZBA, 1, \
- RISCV_RUNTIME_CONST_64_ZBKB, \
- 0, RISCV_ISA_EXT_ZBKB, 1 \
- ) \
- RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
- : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
- __ret; \
-})
-#elif defined(CONFIG_RISCV_ISA_ZBA) && defined(CONFIG_TOOLCHAIN_HAS_ZBA)
-#define runtime_const_ptr(sym) \
-({ \
- typeof(sym) __ret, __tmp; \
- asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
- ALTERNATIVE( \
- RISCV_RUNTIME_CONST_64_BASE, \
- RISCV_RUNTIME_CONST_64_ZBA, \
- 0, RISCV_ISA_EXT_ZBA, 1 \
- ) \
- RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
- : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
- __ret; \
-})
-#elif defined(CONFIG_RISCV_ISA_ZBKB)
-#define runtime_const_ptr(sym) \
-({ \
- typeof(sym) __ret, __tmp; \
- asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
- ALTERNATIVE( \
- RISCV_RUNTIME_CONST_64_BASE, \
- RISCV_RUNTIME_CONST_64_ZBKB, \
- 0, RISCV_ISA_EXT_ZBKB, 1 \
- ) \
- RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
- : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
- __ret; \
-})
-#else
-#define runtime_const_ptr(sym) \
-({ \
- typeof(sym) __ret, __tmp; \
- asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
- RISCV_RUNTIME_CONST_64_BASE \
- RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
- : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
- __ret; \
-})
-#endif
-#endif
-
-#define runtime_const_shift_right_32(val, sym) \
-({ \
- u32 __ret; \
- asm_inline(".option push\n\t" \
- ".option norvc\n\t" \
- "1:\t" \
- SRLI " %[__ret],%[__val],12\n\t" \
- ".option pop\n\t" \
- ".pushsection runtime_shift_" #sym ",\"a\"\n\t" \
- ".long 1b - .\n\t" \
- ".popsection" \
- : [__ret] "=r" (__ret) \
- : [__val] "r" (val)); \
- __ret; \
-})
+#include <asm/runtime-const-accessors.h>
#define runtime_const_init(type, sym) do { \
extern s32 __start_runtime_##type##_##sym[]; \
diff --git a/arch/x86/include/asm/runtime-const-accessors.h b/arch/x86/include/asm/runtime-const-accessors.h
new file mode 100644
index 000000000000..4c411bc3cb32
--- /dev/null
+++ b/arch/x86/include/asm/runtime-const-accessors.h
@@ -0,0 +1,45 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RUNTIME_CONST_ACCESSORS_H
+#define _ASM_RUNTIME_CONST_ACCESSORS_H
+
+#ifdef MODULE
+#error "this functionality is not available for modules"
+#endif
+
+#ifdef __ASSEMBLY__
+
+.macro RUNTIME_CONST_PTR sym reg
+ movq $0x0123456789abcdef, %\reg
+ 1:
+ .pushsection runtime_ptr_\sym, "a"
+ .long 1b - 8 - .
+ .popsection
+.endm
+
+#else /* __ASSEMBLY__ */
+
+#define runtime_const_ptr(sym) ({ \
+ typeof(sym) __ret; \
+ asm_inline("mov %1,%0\n1:\n" \
+ ".pushsection runtime_ptr_" #sym ",\"a\"\n\t" \
+ ".long 1b - %c2 - .\n" \
+ ".popsection" \
+ :"=r" (__ret) \
+ :"i" ((unsigned long)0x0123456789abcdefull), \
+ "i" (sizeof(long))); \
+ __ret; })
+
+// The 'typeof' will create at _least_ a 32-bit type, but
+// will happily also take a bigger type and the 'shrl' will
+// clear the upper bits
+#define runtime_const_shift_right_32(val, sym) ({ \
+ typeof(0u+(val)) __ret = (val); \
+ asm_inline("shrl $12,%k0\n1:\n" \
+ ".pushsection runtime_shift_" #sym ",\"a\"\n\t" \
+ ".long 1b - 1 - .\n" \
+ ".popsection" \
+ :"+r" (__ret)); \
+ __ret; })
+
+#endif /* __ASSEMBLY__ */
+#endif
diff --git a/arch/x86/include/asm/runtime-const.h b/arch/x86/include/asm/runtime-const.h
index 8d983cfd06ea..15d67e2bfc96 100644
--- a/arch/x86/include/asm/runtime-const.h
+++ b/arch/x86/include/asm/runtime-const.h
@@ -2,41 +2,9 @@
#ifndef _ASM_RUNTIME_CONST_H
#define _ASM_RUNTIME_CONST_H
-#ifdef __ASSEMBLY__
-
-.macro RUNTIME_CONST_PTR sym reg
- movq $0x0123456789abcdef, %\reg
- 1:
- .pushsection runtime_ptr_\sym, "a"
- .long 1b - 8 - .
- .popsection
-.endm
-
-#else /* __ASSEMBLY__ */
-
-#define runtime_const_ptr(sym) ({ \
- typeof(sym) __ret; \
- asm_inline("mov %1,%0\n1:\n" \
- ".pushsection runtime_ptr_" #sym ",\"a\"\n\t" \
- ".long 1b - %c2 - .\n" \
- ".popsection" \
- :"=r" (__ret) \
- :"i" ((unsigned long)0x0123456789abcdefull), \
- "i" (sizeof(long))); \
- __ret; })
-
-// The 'typeof' will create at _least_ a 32-bit type, but
-// will happily also take a bigger type and the 'shrl' will
-// clear the upper bits
-#define runtime_const_shift_right_32(val, sym) ({ \
- typeof(0u+(val)) __ret = (val); \
- asm_inline("shrl $12,%k0\n1:\n" \
- ".pushsection runtime_shift_" #sym ",\"a\"\n\t" \
- ".long 1b - 1 - .\n" \
- ".popsection" \
- :"+r" (__ret)); \
- __ret; })
+#include <asm/runtime-const-accessors.h>
+#ifndef __ASSEMBLY__
#define runtime_const_init(type, sym) do { \
extern s32 __start_runtime_##type##_##sym[]; \
extern s32 __stop_runtime_##type##_##sym[]; \
@@ -70,5 +38,5 @@ static inline void runtime_const_fixup(void (*fn)(void *, unsigned long),
}
}
-#endif /* __ASSEMBLY__ */
+#endif /* !__ASSEMBLY__ */
#endif
--
2.34.1
* [PATCH 3/3] fs: hide names_cachep behind runtime access machinery
2025-10-31 17:42 ` [WIP RFC PATCH 0/3] runtime-const header split and whatnot Mateusz Guzik
2025-10-31 17:42 ` [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules Mateusz Guzik
2025-10-31 17:42 ` [PATCH 2/3] runtime-const: split headers between accessors and fixup; disable for modules Mateusz Guzik
@ 2025-10-31 17:42 ` Mateusz Guzik
2025-10-31 23:30 ` kernel test robot
` (3 more replies)
2 siblings, 4 replies; 51+ messages in thread
From: Mateusz Guzik @ 2025-10-31 17:42 UTC (permalink / raw)
To: torvalds
Cc: brauner, viro, jack, linux-kernel, linux-fsdevel, tglx, pfalcato,
Mateusz Guzik
The var is used twice for every path lookup, while the cache is
initialized early and stays valid for the duration.
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---
fs/dcache.c | 1 +
include/asm-generic/vmlinux.lds.h | 3 ++-
include/linux/fs.h | 17 +++++++++++++++--
3 files changed, 18 insertions(+), 3 deletions(-)
diff --git a/fs/dcache.c b/fs/dcache.c
index de3e4e9777ea..1afef6cf16b7 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -3259,6 +3259,7 @@ void __init vfs_caches_init(void)
{
names_cachep = kmem_cache_create_usercopy("names_cache", PATH_MAX, 0,
SLAB_HWCACHE_ALIGN|SLAB_PANIC, 0, PATH_MAX, NULL);
+ runtime_const_init(ptr, names_cachep);
dcache_init();
inode_init();
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index dcdbd962abd6..c7d85c80111c 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -939,7 +939,8 @@
#define RUNTIME_CONST_VARIABLES \
RUNTIME_CONST(shift, d_hash_shift) \
- RUNTIME_CONST(ptr, dentry_hashtable)
+ RUNTIME_CONST(ptr, dentry_hashtable) \
+ RUNTIME_CONST(ptr, names_cachep)
/* Alignment must be consistent with (kunit_suite *) in include/kunit/test.h */
#define KUNIT_TABLE() \
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 947d7958eb72..bf0606ace221 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -50,6 +50,10 @@
#include <linux/unicode.h>
#include <asm/byteorder.h>
+#ifndef MODULE
+#include <asm/runtime-const-accessors.h>
+#endif
+
#include <uapi/linux/fs.h>
struct backing_dev_info;
@@ -3044,8 +3048,17 @@ extern void __init vfs_caches_init(void);
extern struct kmem_cache *names_cachep;
-#define __getname() kmem_cache_alloc(names_cachep, GFP_KERNEL)
-#define __putname(name) kmem_cache_free(names_cachep, (void *)(name))
+/*
+ * XXX The runtime_const machinery does not support modules at the moment.
+ */
+#ifdef MODULE
+#define __names_cachep_accessor names_cachep
+#else
+#define __names_cachep_accessor runtime_const_ptr(names_cachep)
+#endif
+
+#define __getname() kmem_cache_alloc(__names_cachep_accessor, GFP_KERNEL)
+#define __putname(name) kmem_cache_free(__names_cachep_accessor, (void *)(name))
extern struct super_block *blockdev_superblock;
static inline bool sb_is_blkdev_sb(struct super_block *sb)
--
2.34.1
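For illustration only, the accessor selection in the fs.h hunk above can
be mimicked in plain userspace C. This is a sketch under assumptions, not
kernel code: runtime_const_ptr() is given the trivial "use the symbol
as-is" form, struct kmem_cache is a dummy stand-in, and demo_cache and
demo_cache_in_use() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Dummy stand-in for the kernel's struct kmem_cache (assumption). */
struct kmem_cache { int id; };

static struct kmem_cache demo_cache = { 42 };
struct kmem_cache *names_cachep = &demo_cache;

/* Trivial form: use the symbol as-is, no code patching involved. */
#define runtime_const_ptr(sym) (sym)

/* Same shape as the patch: module builds read the variable directly. */
#ifdef MODULE
#define __names_cachep_accessor names_cachep
#else
#define __names_cachep_accessor runtime_const_ptr(names_cachep)
#endif

/* Hypothetical helper: returns whichever pointer the accessor yields. */
static struct kmem_cache *demo_cache_in_use(void)
{
	/* The cache has to be set up before any lookup uses it. */
	assert(names_cachep != NULL);
	return __names_cachep_accessor;
}
```

Building the same file with -DMODULE selects the plain variable instead,
which is the whole point of the ifdef: both paths yield the same pointer,
only the way it is fetched differs.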
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-10-31 17:42 ` [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules Mateusz Guzik
@ 2025-10-31 21:46 ` Linus Torvalds
2025-10-31 22:01 ` Mateusz Guzik
2025-11-01 11:26 ` David Laight
2025-11-04 6:25 ` Linus Torvalds
2 siblings, 1 reply; 51+ messages in thread
From: Linus Torvalds @ 2025-10-31 21:46 UTC (permalink / raw)
To: Mateusz Guzik
Cc: brauner, viro, jack, linux-kernel, linux-fsdevel, tglx, pfalcato
On Fri, 31 Oct 2025 at 10:42, Mateusz Guzik <mjguzik@gmail.com> wrote:
>
> -extern unsigned long USER_PTR_MAX;
> +extern unsigned long user_ptr_max;
Yeah, this doesn't work at all.
We still use USER_PTR_MAX in other places, including the linker script
and arch/x86/lib/getuser.S
So you changed about half the places to the new name, breaking the others.
Linus
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-10-31 21:46 ` Linus Torvalds
@ 2025-10-31 22:01 ` Mateusz Guzik
0 siblings, 0 replies; 51+ messages in thread
From: Mateusz Guzik @ 2025-10-31 22:01 UTC (permalink / raw)
To: Linus Torvalds
Cc: brauner, viro, jack, linux-kernel, linux-fsdevel, tglx, pfalcato
On Fri, Oct 31, 2025 at 10:46 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Fri, 31 Oct 2025 at 10:42, Mateusz Guzik <mjguzik@gmail.com> wrote:
> >
> > -extern unsigned long USER_PTR_MAX;
> > +extern unsigned long user_ptr_max;
>
> Yeah, this doesn't work at all.
>
> We still use USER_PTR_MAX in other places, including the linker script
> and arch/x86/lib/getuser.S
>
> So you changed about half the places to the new name, breaking the others.
>
True, but note there is no sign-off on the patch, as this is not a real
submission yet.
Changing this to lower case was a last-minute adjustment and it does not
really matter in this context. Interestingly enough, the kernel still
built just fine; it only threw the following warning:
ld: warning: orphan section `runtime_ptr_user_ptr_max' from
`vmlinux.o' being placed in section `runtime_ptr_user_ptr_max'
Anyway, the thing which does matter in this patchset is that a riscv
kernel now builds with fs.h including runtime-const-accessors.h and
this is the bit I'm fishing for comments on.
* Re: [PATCH v4] fs: hide names_cachep behind runtime access machinery
2025-10-30 10:52 [PATCH v4] fs: hide names_cachep behind runtime access machinery Mateusz Guzik
` (2 preceding siblings ...)
2025-10-31 13:30 ` [PATCH v4] " kernel test robot
@ 2025-10-31 22:43 ` kernel test robot
2025-11-01 23:06 ` kernel test robot
4 siblings, 0 replies; 51+ messages in thread
From: kernel test robot @ 2025-10-31 22:43 UTC (permalink / raw)
To: Mateusz Guzik, brauner
Cc: oe-kbuild-all, viro, jack, linux-kernel, linux-fsdevel, torvalds,
pfalcato, Mateusz Guzik
Hi Mateusz,
kernel test robot noticed the following build warnings:
[auto build test WARNING on arnd-asm-generic/master]
[also build test WARNING on linus/master brauner-vfs/vfs.all linux/master v6.18-rc3 next-20251031]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Mateusz-Guzik/fs-hide-names_cachep-behind-runtime-access-machinery/20251030-185523
base: https://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git master
patch link: https://lore.kernel.org/r/20251030105242.801528-1-mjguzik%40gmail.com
patch subject: [PATCH v4] fs: hide names_cachep behind runtime access machinery
config: i386-randconfig-061-20251031 (https://download.01.org/0day-ci/archive/20251101/202511010440.FLitz9Fi-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251101/202511010440.FLitz9Fi-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511010440.FLitz9Fi-lkp@intel.com/
sparse warnings: (new ones prefixed by >>)
fs/d_path.c:195:9: sparse: sparse: context imbalance in 'prepend_path' - wrong count at exit
fs/d_path.c:359:9: sparse: sparse: context imbalance in '__dentry_path' - wrong count at exit
>> fs/d_path.c:416:22: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
>> fs/d_path.c:416:22: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/d_path.c:446:9: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
--
>> fs/namei.c:146:18: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
>> fs/namei.c:146:18: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/namei.c:163:25: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/namei.c:169:25: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/namei.c:191:25: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/namei.c:197:25: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/namei.c:203:25: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/namei.c:208:25: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/namei.c:249:18: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/namei.c:249:18: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/namei.c:261:25: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/namei.c:267:17: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/namei.c:294:17: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/namei.c:297:17: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/namei.c: note: in included file (through include/linux/rbtree.h, include/linux/mm_types.h, include/linux/mmzone.h, ...):
include/linux/rcupdate.h:871:25: sparse: sparse: context imbalance in 'leave_rcu' - unexpected unlock
fs/namei.c:2518:19: sparse: sparse: context imbalance in 'path_init' - different lock contexts for basic block
--
drivers/base/firmware_loader/main.c:229:9: sparse: sparse: context imbalance in 'free_fw_priv' - wrong count at exit
>> drivers/base/firmware_loader/main.c:509:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
>> drivers/base/firmware_loader/main.c:509:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
drivers/base/firmware_loader/main.c:591:9: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
vim +416 fs/d_path.c
7a5cf791a74764 Al Viro 2018-03-05 393
7a5cf791a74764 Al Viro 2018-03-05 394 /*
7a5cf791a74764 Al Viro 2018-03-05 395 * NOTE! The user-level library version returns a
7a5cf791a74764 Al Viro 2018-03-05 396 * character pointer. The kernel system call just
7a5cf791a74764 Al Viro 2018-03-05 397 * returns the length of the buffer filled (which
7a5cf791a74764 Al Viro 2018-03-05 398 * includes the ending '\0' character), or a negative
7a5cf791a74764 Al Viro 2018-03-05 399 * error value. So libc would do something like
7a5cf791a74764 Al Viro 2018-03-05 400 *
7a5cf791a74764 Al Viro 2018-03-05 401 * char *getcwd(char * buf, size_t size)
7a5cf791a74764 Al Viro 2018-03-05 402 * {
7a5cf791a74764 Al Viro 2018-03-05 403 * int retval;
7a5cf791a74764 Al Viro 2018-03-05 404 *
7a5cf791a74764 Al Viro 2018-03-05 405 * retval = sys_getcwd(buf, size);
7a5cf791a74764 Al Viro 2018-03-05 406 * if (retval >= 0)
7a5cf791a74764 Al Viro 2018-03-05 407 * return buf;
7a5cf791a74764 Al Viro 2018-03-05 408 * errno = -retval;
7a5cf791a74764 Al Viro 2018-03-05 409 * return NULL;
7a5cf791a74764 Al Viro 2018-03-05 410 * }
7a5cf791a74764 Al Viro 2018-03-05 411 */
7a5cf791a74764 Al Viro 2018-03-05 412 SYSCALL_DEFINE2(getcwd, char __user *, buf, unsigned long, size)
7a5cf791a74764 Al Viro 2018-03-05 413 {
7a5cf791a74764 Al Viro 2018-03-05 414 int error;
7a5cf791a74764 Al Viro 2018-03-05 415 struct path pwd, root;
7a5cf791a74764 Al Viro 2018-03-05 @416 char *page = __getname();
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
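The sparse warnings above follow from the placeholder being a 64-bit
constant while unsigned long is only 32 bits wide on i386. A small
userspace sketch of the same truncation, assuming uint32_t as a stand-in
for a 32-bit unsigned long (PLACEHOLDER, placeholder_as_u32() and
demo_selfcheck() are hypothetical names):

```c
#include <assert.h>
#include <stdint.h>

/* The 64-bit placeholder value used by the x86 runtime_const_ptr(). */
#define PLACEHOLDER 0x0123456789abcdefull

/* uint32_t models unsigned long on a 32-bit target (assumption). */
static uint32_t placeholder_as_u32(void)
{
	/* The cast keeps only the low 32 bits of the constant. */
	return (uint32_t)PLACEHOLDER;
}

/* Matches what sparse reports: 123456789abcdef becomes 89abcdef. */
static void demo_selfcheck(void)
{
	assert(placeholder_as_u32() == 0x89abcdefu);
}
```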
* Re: [PATCH 3/3] fs: hide names_cachep behind runtime access machinery
2025-10-31 17:42 ` [PATCH 3/3] fs: hide names_cachep behind runtime access machinery Mateusz Guzik
@ 2025-10-31 23:30 ` kernel test robot
2025-10-31 23:30 ` kernel test robot
` (2 subsequent siblings)
3 siblings, 0 replies; 51+ messages in thread
From: kernel test robot @ 2025-10-31 23:30 UTC (permalink / raw)
To: Mateusz Guzik, torvalds
Cc: llvm, oe-kbuild-all, brauner, viro, jack, linux-kernel,
linux-fsdevel, tglx, pfalcato, Mateusz Guzik
Hi Mateusz,
kernel test robot noticed the following build warnings:
[auto build test WARNING on arnd-asm-generic/master]
[also build test WARNING on linus/master brauner-vfs/vfs.all v6.18-rc3 next-20251031]
[cannot apply to linux/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Mateusz-Guzik/x86-fix-access_ok-and-valid_user_address-using-wrong-USER_PTR_MAX-in-modules/20251101-054539
base: https://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git master
patch link: https://lore.kernel.org/r/20251031174220.43458-4-mjguzik%40gmail.com
patch subject: [PATCH 3/3] fs: hide names_cachep behind runtime access machinery
config: um-allnoconfig (https://download.01.org/0day-ci/archive/20251101/202511010731.B5nbGjbm-lkp@intel.com/config)
compiler: clang version 22.0.0git (https://github.com/llvm/llvm-project d1c086e82af239b245fe8d7832f2753436634990)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251101/202511010731.B5nbGjbm-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511010731.B5nbGjbm-lkp@intel.com/
All warnings (new ones prefixed by >>):
In file included from fs/dcache.c:38:
In file included from ./arch/um/include/generated/asm/runtime-const.h:1:
>> include/asm-generic/runtime-const.h:11:9: warning: 'runtime_const_ptr' macro redefined [-Wmacro-redefined]
11 | #define runtime_const_ptr(sym) (sym)
| ^
arch/x86/include/asm/runtime-const-accessors.h:21:9: note: previous definition is here
21 | #define runtime_const_ptr(sym) ({ \
| ^
In file included from fs/dcache.c:38:
In file included from ./arch/um/include/generated/asm/runtime-const.h:1:
>> include/asm-generic/runtime-const.h:12:9: warning: 'runtime_const_shift_right_32' macro redefined [-Wmacro-redefined]
12 | #define runtime_const_shift_right_32(val, sym) ((u32)(val)>>(sym))
| ^
arch/x86/include/asm/runtime-const-accessors.h:35:9: note: previous definition is here
35 | #define runtime_const_shift_right_32(val, sym) ({ \
| ^
2 warnings generated.
vim +/runtime_const_ptr +11 include/asm-generic/runtime-const.h
e78298556ee5d8 Linus Torvalds 2024-06-04 4
e78298556ee5d8 Linus Torvalds 2024-06-04 5 /*
e78298556ee5d8 Linus Torvalds 2024-06-04 6 * This is the fallback for when the architecture doesn't
e78298556ee5d8 Linus Torvalds 2024-06-04 7 * support the runtime const operations.
e78298556ee5d8 Linus Torvalds 2024-06-04 8 *
e78298556ee5d8 Linus Torvalds 2024-06-04 9 * We just use the actual symbols as-is.
e78298556ee5d8 Linus Torvalds 2024-06-04 10 */
e78298556ee5d8 Linus Torvalds 2024-06-04 @11 #define runtime_const_ptr(sym) (sym)
e78298556ee5d8 Linus Torvalds 2024-06-04 @12 #define runtime_const_shift_right_32(val, sym) ((u32)(val)>>(sym))
e78298556ee5d8 Linus Torvalds 2024-06-04 13 #define runtime_const_init(type,sym) do { } while (0)
e78298556ee5d8 Linus Torvalds 2024-06-04 14
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
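The -Wmacro-redefined warnings above arise because both the arch
accessors header and the generic fallback define the same macros with
different bodies. One conventional way to avoid such a clash is to make
the fallback conditional; the sketch below shows that idea in userspace
C and is not necessarily what the series ends up doing. The "arch"
definition, demo_shift and demo_hash() are hypothetical.

```c
#include <assert.h>

/* Pretend this came from an arch-specific accessors header. */
#define runtime_const_ptr(sym) (*&(sym))

/* Generic fallback: only define what has not been provided already. */
#ifndef runtime_const_ptr
#define runtime_const_ptr(sym) (sym)
#endif
#ifndef runtime_const_shift_right_32
#define runtime_const_shift_right_32(val, sym) ((unsigned int)(val) >> (sym))
#endif

static unsigned int demo_shift = 4;

/* Uses the "arch" pointer accessor and the fallback shift together. */
static unsigned int demo_hash(unsigned int v)
{
	assert(demo_shift < 32);
	return runtime_const_shift_right_32(v, runtime_const_ptr(demo_shift));
}
```

With the guards in place the second definition of runtime_const_ptr is
skipped, so the compiler never sees two conflicting bodies.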
* Re: [PATCH 3/3] fs: hide names_cachep behind runtime access machinery
2025-10-31 17:42 ` [PATCH 3/3] fs: hide names_cachep behind runtime access machinery Mateusz Guzik
2025-10-31 23:30 ` kernel test robot
@ 2025-10-31 23:30 ` kernel test robot
2025-10-31 23:41 ` kernel test robot
2025-11-01 17:49 ` kernel test robot
3 siblings, 0 replies; 51+ messages in thread
From: kernel test robot @ 2025-10-31 23:30 UTC (permalink / raw)
To: Mateusz Guzik, torvalds
Cc: oe-kbuild-all, brauner, viro, jack, linux-kernel, linux-fsdevel,
tglx, pfalcato, Mateusz Guzik
Hi Mateusz,
kernel test robot noticed the following build errors:
[auto build test ERROR on arnd-asm-generic/master]
[also build test ERROR on linus/master brauner-vfs/vfs.all v6.18-rc3 next-20251031]
[cannot apply to linux/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Mateusz-Guzik/x86-fix-access_ok-and-valid_user_address-using-wrong-USER_PTR_MAX-in-modules/20251101-054539
base: https://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git master
patch link: https://lore.kernel.org/r/20251031174220.43458-4-mjguzik%40gmail.com
patch subject: [PATCH 3/3] fs: hide names_cachep behind runtime access machinery
config: alpha-allnoconfig (https://download.01.org/0day-ci/archive/20251101/202511010706.nuASkMjZ-lkp@intel.com/config)
compiler: alpha-linux-gcc (GCC) 15.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251101/202511010706.nuASkMjZ-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511010706.nuASkMjZ-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from include/linux/huge_mm.h:7,
from include/linux/mm.h:1016,
from include/linux/pid_namespace.h:7,
from include/linux/ptrace.h:10,
from arch/alpha/kernel/asm-offsets.c:11:
>> include/linux/fs.h:54:10: fatal error: asm/runtime-const-accessors.h: No such file or directory
54 | #include <asm/runtime-const-accessors.h>
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[3]: *** [scripts/Makefile.build:182: arch/alpha/kernel/asm-offsets.s] Error 1
make[3]: Target 'prepare' not remade because of errors.
make[2]: *** [Makefile:1282: prepare0] Error 2
make[2]: Target 'prepare' not remade because of errors.
make[1]: *** [Makefile:248: __sub-make] Error 2
make[1]: Target 'prepare' not remade because of errors.
make: *** [Makefile:248: __sub-make] Error 2
make: Target 'prepare' not remade because of errors.
vim +54 include/linux/fs.h
51
52 #include <asm/byteorder.h>
53 #ifndef MODULE
> 54 #include <asm/runtime-const-accessors.h>
55 #endif
56
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
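The build error above comes from fs.h including an asm/ header that only
some architectures provide. As a generic C illustration of tolerating a
missing header (an assumption for the sake of the example, not how the
kernel normally handles per-arch headers), __has_include can gate the
include and a fallback can fill in the macro. demo_value, demo_read() and
demo_selfcheck() are hypothetical names.

```c
#include <assert.h>

/* Only pull in the arch header when the compiler can find it. */
#if defined(__has_include)
# if __has_include(<asm/runtime-const-accessors.h>)
#  include <asm/runtime-const-accessors.h>
# endif
#endif

/* Fallback when no arch accessor exists: use the symbol as-is. */
#ifndef runtime_const_ptr
#define runtime_const_ptr(sym) (sym)
#endif

static int demo_value = 7;

static int demo_read(void)
{
	return runtime_const_ptr(demo_value);
}

static void demo_selfcheck(void)
{
	assert(demo_read() == 7);
}
```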
* Re: [PATCH 3/3] fs: hide names_cachep behind runtime access machinery
2025-10-31 17:42 ` [PATCH 3/3] fs: hide names_cachep behind runtime access machinery Mateusz Guzik
2025-10-31 23:30 ` kernel test robot
2025-10-31 23:30 ` kernel test robot
@ 2025-10-31 23:41 ` kernel test robot
2025-11-01 17:49 ` kernel test robot
3 siblings, 0 replies; 51+ messages in thread
From: kernel test robot @ 2025-10-31 23:41 UTC (permalink / raw)
To: Mateusz Guzik, torvalds
Cc: llvm, oe-kbuild-all, brauner, viro, jack, linux-kernel,
linux-fsdevel, tglx, pfalcato, Mateusz Guzik
Hi Mateusz,
kernel test robot noticed the following build errors:
[auto build test ERROR on arnd-asm-generic/master]
[also build test ERROR on linus/master brauner-vfs/vfs.all v6.18-rc3 next-20251031]
[cannot apply to linux/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Mateusz-Guzik/x86-fix-access_ok-and-valid_user_address-using-wrong-USER_PTR_MAX-in-modules/20251101-054539
base: https://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git master
patch link: https://lore.kernel.org/r/20251031174220.43458-4-mjguzik%40gmail.com
patch subject: [PATCH 3/3] fs: hide names_cachep behind runtime access machinery
config: arm-allnoconfig (https://download.01.org/0day-ci/archive/20251101/202511010704.D6l8wp63-lkp@intel.com/config)
compiler: clang version 22.0.0git (https://github.com/llvm/llvm-project d1c086e82af239b245fe8d7832f2753436634990)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251101/202511010704.D6l8wp63-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511010704.D6l8wp63-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from arch/arm/kernel/asm-offsets.c:12:
In file included from include/linux/mm.h:1016:
In file included from include/linux/huge_mm.h:7:
>> include/linux/fs.h:54:10: fatal error: 'asm/runtime-const-accessors.h' file not found
54 | #include <asm/runtime-const-accessors.h>
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
make[3]: *** [scripts/Makefile.build:182: arch/arm/kernel/asm-offsets.s] Error 1
make[3]: Target 'prepare' not remade because of errors.
make[2]: *** [Makefile:1282: prepare0] Error 2
make[2]: Target 'prepare' not remade because of errors.
make[1]: *** [Makefile:248: __sub-make] Error 2
make[1]: Target 'prepare' not remade because of errors.
make: *** [Makefile:248: __sub-make] Error 2
make: Target 'prepare' not remade because of errors.
vim +54 include/linux/fs.h
51
52 #include <asm/byteorder.h>
53 #ifndef MODULE
> 54 #include <asm/runtime-const-accessors.h>
55 #endif
56
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-10-31 17:42 ` [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules Mateusz Guzik
2025-10-31 21:46 ` Linus Torvalds
@ 2025-11-01 11:26 ` David Laight
2025-11-04 6:25 ` Linus Torvalds
2 siblings, 0 replies; 51+ messages in thread
From: David Laight @ 2025-11-01 11:26 UTC (permalink / raw)
To: Mateusz Guzik
Cc: torvalds, brauner, viro, jack, linux-kernel, linux-fsdevel, tglx,
pfalcato
On Fri, 31 Oct 2025 18:42:18 +0100
Mateusz Guzik <mjguzik@gmail.com> wrote:
> [real commit message will land here later]
Hmmm... modules use the 0x0123456789abcdef placeholder. This is
non-canonical, so nothing is badly broken; it just allows speculative
accesses to kernel space on some CPUs.
> ---
> arch/x86/include/asm/uaccess_64.h | 17 +++++++++--------
> arch/x86/kernel/cpu/common.c | 8 +++++---
> 2 files changed, 14 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
> index c8a5ae35c871..f60c0ed147c3 100644
> --- a/arch/x86/include/asm/uaccess_64.h
> +++ b/arch/x86/include/asm/uaccess_64.h
> @@ -12,13 +12,14 @@
> #include <asm/cpufeatures.h>
> #include <asm/page.h>
> #include <asm/percpu.h>
> -#include <asm/runtime-const.h>
>
> -/*
> - * Virtual variable: there's no actual backing store for this,
> - * it can purely be used as 'runtime_const_ptr(USER_PTR_MAX)'
> - */
> -extern unsigned long USER_PTR_MAX;
> +extern unsigned long user_ptr_max;
> +#ifdef MODULE
> +#define __user_ptr_max_accessor user_ptr_max
> +#else
> +#include <asm/runtime-const.h>
> +#define __user_ptr_max_accessor runtime_const_ptr(user_ptr_max)
> +#endif
>
> #ifdef CONFIG_ADDRESS_MASKING
> /*
> @@ -54,7 +55,7 @@ static inline unsigned long __untagged_addr_remote(struct mm_struct *mm,
> #endif
>
> #define valid_user_address(x) \
> - likely((__force unsigned long)(x) <= runtime_const_ptr(USER_PTR_MAX))
> + likely((__force unsigned long)(x) <= __user_ptr_max_accessor)
>
> /*
> * Masking the user address is an alternative to a conditional
> @@ -67,7 +68,7 @@ static inline void __user *mask_user_address(const void __user *ptr)
> asm("cmp %1,%0\n\t"
> "cmova %1,%0"
> :"=r" (ret)
> - :"r" (runtime_const_ptr(USER_PTR_MAX)),
> + :"r" (__user_ptr_max_accessor),
> "0" (ptr));
> return ret;
> }
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 3ff9682d8bc4..f338f5e9adfc 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -78,6 +78,9 @@
> DEFINE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
> EXPORT_PER_CPU_SYMBOL(cpu_info);
>
> +unsigned long user_ptr_max __ro_after_init;
> +EXPORT_SYMBOL(user_ptr_max);
That doesn't appear to be inside a CONFIG_X86_64 define.
I think I'd initialise it to one of its two values - probably the LA48 one.
David
> +
> u32 elf_hwcap2 __read_mostly;
>
> /* Number of siblings per CPU package */
> @@ -2575,14 +2578,13 @@ void __init arch_cpu_finalize_init(void)
> alternative_instructions();
>
> if (IS_ENABLED(CONFIG_X86_64)) {
> - unsigned long USER_PTR_MAX = TASK_SIZE_MAX;
> -
> + user_ptr_max = TASK_SIZE_MAX;
> /*
> * Enable this when LAM is gated on LASS support
> if (cpu_feature_enabled(X86_FEATURE_LAM))
> USER_PTR_MAX = (1ul << 63) - PAGE_SIZE;
> */
> - runtime_const_init(ptr, USER_PTR_MAX);
> + runtime_const_init(ptr, user_ptr_max);
>
> /*
> * Make sure the first 2MB area is not mapped by huge pages
* Re: [PATCH 3/3] fs: hide names_cachep behind runtime access machinery
2025-10-31 17:42 ` [PATCH 3/3] fs: hide names_cachep behind runtime access machinery Mateusz Guzik
` (2 preceding siblings ...)
2025-10-31 23:41 ` kernel test robot
@ 2025-11-01 17:49 ` kernel test robot
3 siblings, 0 replies; 51+ messages in thread
From: kernel test robot @ 2025-11-01 17:49 UTC (permalink / raw)
To: Mateusz Guzik, torvalds
Cc: oe-kbuild-all, brauner, viro, jack, linux-kernel, linux-fsdevel,
tglx, pfalcato, Mateusz Guzik
Hi Mateusz,
kernel test robot noticed the following build warnings:
[auto build test WARNING on arnd-asm-generic/master]
[also build test WARNING on linus/master brauner-vfs/vfs.all v6.18-rc3 next-20251031]
[cannot apply to linux/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Mateusz-Guzik/x86-fix-access_ok-and-valid_user_address-using-wrong-USER_PTR_MAX-in-modules/20251101-054539
base: https://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git master
patch link: https://lore.kernel.org/r/20251031174220.43458-4-mjguzik%40gmail.com
patch subject: [PATCH 3/3] fs: hide names_cachep behind runtime access machinery
config: i386-randconfig-061-20251101 (https://download.01.org/0day-ci/archive/20251102/202511020147.47PufBIR-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251102/202511020147.47PufBIR-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511020147.47PufBIR-lkp@intel.com/
sparse warnings: (new ones prefixed by >>)
fs/smb/client/link.c: note: in included file:
>> fs/smb/client/cifsproto.h:71:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
>> fs/smb/client/cifsproto.h:71:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/smb/client/cifsproto.h:77:17: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
--
fs/smb/client/dir.c: note: in included file:
>> fs/smb/client/cifsproto.h:71:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
>> fs/smb/client/cifsproto.h:71:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/smb/client/cifsproto.h:77:17: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
--
fs/smb/client/misc.c: note: in included file:
>> fs/smb/client/cifsproto.h:71:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
>> fs/smb/client/cifsproto.h:71:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/smb/client/cifsproto.h:77:17: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
--
fs/smb/client/cifsfs.c: note: in included file:
>> fs/smb/client/cifsproto.h:71:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
>> fs/smb/client/cifsproto.h:71:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/smb/client/cifsproto.h:77:17: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
--
fs/smb/client/ioctl.c: note: in included file:
>> fs/smb/client/cifsproto.h:71:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
>> fs/smb/client/cifsproto.h:71:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/smb/client/cifsproto.h:77:17: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
--
fs/smb/client/inode.c: note: in included file:
>> fs/smb/client/cifsproto.h:71:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
>> fs/smb/client/cifsproto.h:71:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/smb/client/cifsproto.h:77:17: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
--
fs/smb/client/file.c: note: in included file:
>> fs/smb/client/cifsproto.h:71:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
>> fs/smb/client/cifsproto.h:71:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/smb/client/cifsproto.h:77:17: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
--
fs/smb/client/readdir.c: note: in included file:
>> fs/smb/client/cifsproto.h:71:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
>> fs/smb/client/cifsproto.h:71:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/smb/client/cifsproto.h:77:17: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
--
fs/smb/client/namespace.c: note: in included file:
>> fs/smb/client/cifsproto.h:71:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
>> fs/smb/client/cifsproto.h:71:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/smb/client/cifsproto.h:77:17: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
--
fs/smb/client/smb2ops.c: note: in included file:
>> fs/smb/client/cifsproto.h:71:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
>> fs/smb/client/cifsproto.h:71:16: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
fs/smb/client/cifsproto.h:77:17: sparse: sparse: cast truncates bits from constant value (123456789abcdef becomes 89abcdef)
vim +71 fs/smb/client/cifsproto.h
b6b38f704a8193 fs/cifs/cifsproto.h Joe Perches 2010-04-21 48
6d5786a34d98bf fs/cifs/cifsproto.h Pavel Shilovsky 2012-06-20 49 #define free_xid(curr_xid) \
b6b38f704a8193 fs/cifs/cifsproto.h Joe Perches 2010-04-21 50 do { \
6d5786a34d98bf fs/cifs/cifsproto.h Pavel Shilovsky 2012-06-20 51 _free_xid(curr_xid); \
a0a3036b81f1f6 fs/cifs/cifsproto.h Joe Perches 2020-04-14 52 cifs_dbg(FYI, "VFS: leaving %s (xid = %u) rc = %d\n", \
b6b38f704a8193 fs/cifs/cifsproto.h Joe Perches 2010-04-21 53 __func__, curr_xid, (int)rc); \
d683bcd3e5d157 fs/cifs/cifsproto.h Steve French 2018-05-19 54 if (rc) \
d683bcd3e5d157 fs/cifs/cifsproto.h Steve French 2018-05-19 55 trace_smb3_exit_err(curr_xid, __func__, (int)rc); \
d683bcd3e5d157 fs/cifs/cifsproto.h Steve French 2018-05-19 56 else \
d683bcd3e5d157 fs/cifs/cifsproto.h Steve French 2018-05-19 57 trace_smb3_exit_done(curr_xid, __func__); \
b6b38f704a8193 fs/cifs/cifsproto.h Joe Perches 2010-04-21 58 } while (0)
4d79dba0e00749 fs/cifs/cifsproto.h Shirish Pargaonkar 2011-04-27 59 extern int init_cifs_idmap(void);
4d79dba0e00749 fs/cifs/cifsproto.h Shirish Pargaonkar 2011-04-27 60 extern void exit_cifs_idmap(void);
b74cb9a80268be fs/cifs/cifsproto.h Sachin Prabhu 2016-05-17 61 extern int init_cifs_spnego(void);
b74cb9a80268be fs/cifs/cifsproto.h Sachin Prabhu 2016-05-17 62 extern void exit_cifs_spnego(void);
f6a9bc336b600e fs/cifs/cifsproto.h Al Viro 2021-03-05 63 extern const char *build_path_from_dentry(struct dentry *, void *);
7ad54b98fc1f14 fs/cifs/cifsproto.h Paulo Alcantara 2022-12-18 64 char *__build_path_from_dentry_optional_prefix(struct dentry *direntry, void *page,
7ad54b98fc1f14 fs/cifs/cifsproto.h Paulo Alcantara 2022-12-18 65 const char *tree, int tree_len,
7ad54b98fc1f14 fs/cifs/cifsproto.h Paulo Alcantara 2022-12-18 66 bool prefix);
268a635d414df4 fs/cifs/cifsproto.h Aurelien Aptel 2017-02-13 67 extern char *build_path_from_dentry_optional_prefix(struct dentry *direntry,
f6a9bc336b600e fs/cifs/cifsproto.h Al Viro 2021-03-05 68 void *page, bool prefix);
f6a9bc336b600e fs/cifs/cifsproto.h Al Viro 2021-03-05 69 static inline void *alloc_dentry_path(void)
f6a9bc336b600e fs/cifs/cifsproto.h Al Viro 2021-03-05 70 {
f6a9bc336b600e fs/cifs/cifsproto.h Al Viro 2021-03-05 @71 return __getname();
f6a9bc336b600e fs/cifs/cifsproto.h Al Viro 2021-03-05 72 }
f6a9bc336b600e fs/cifs/cifsproto.h Al Viro 2021-03-05 73
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v4] fs: hide names_cachep behind runtime access machinery
2025-10-30 10:52 [PATCH v4] fs: hide names_cachep behind runtime access machinery Mateusz Guzik
` (3 preceding siblings ...)
2025-10-31 22:43 ` kernel test robot
@ 2025-11-01 23:06 ` kernel test robot
4 siblings, 0 replies; 51+ messages in thread
From: kernel test robot @ 2025-11-01 23:06 UTC (permalink / raw)
To: Mateusz Guzik, brauner
Cc: oe-kbuild-all, viro, jack, linux-kernel, linux-fsdevel, torvalds,
pfalcato, Mateusz Guzik
Hi Mateusz,
kernel test robot noticed the following build errors:
[auto build test ERROR on arnd-asm-generic/master]
[also build test ERROR on linus/master linux/master v6.18-rc3 next-20251031]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Mateusz-Guzik/fs-hide-names_cachep-behind-runtime-access-machinery/20251101-054539
base: https://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git master
patch link: https://lore.kernel.org/r/20251030105242.801528-1-mjguzik%40gmail.com
patch subject: [PATCH v4] fs: hide names_cachep behind runtime access machinery
config: riscv-randconfig-r111-20251102 (https://download.01.org/0day-ci/archive/20251102/202511020603.NRNONOtT-lkp@intel.com/config)
compiler: riscv64-linux-gcc (GCC) 11.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251102/202511020603.NRNONOtT-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511020603.NRNONOtT-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from arch/riscv/include/asm/runtime-const.h:7,
from include/linux/fs.h:53,
from include/linux/huge_mm.h:7,
from include/linux/mm.h:1016,
from arch/riscv/kernel/asm-offsets.c:8:
arch/riscv/include/asm/cacheflush.h: In function 'flush_cache_vmap':
arch/riscv/include/asm/cacheflush.h:49:13: error: implicit declaration of function 'is_vmalloc_or_module_addr' [-Werror=implicit-function-declaration]
49 | if (is_vmalloc_or_module_addr((void *)start)) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~
In file included from include/linux/compat.h:18,
from arch/riscv/include/asm/elf.h:12,
from include/linux/elf.h:6,
from include/linux/module.h:20,
from include/linux/device/driver.h:21,
from include/linux/device.h:32,
from include/linux/node.h:18,
from include/linux/memory.h:19,
from arch/riscv/include/asm/runtime-const.h:9,
from include/linux/fs.h:53,
from include/linux/huge_mm.h:7,
from include/linux/mm.h:1016,
from arch/riscv/kernel/asm-offsets.c:8:
include/uapi/linux/aio_abi.h: At top level:
include/uapi/linux/aio_abi.h:79:9: error: unknown type name '__kernel_rwf_t'
79 | __kernel_rwf_t aio_rw_flags; /* RWF_* flags */
| ^~~~~~~~~~~~~~
In file included from arch/riscv/kernel/asm-offsets.c:8:
>> include/linux/mm.h:1092:19: error: static declaration of 'is_vmalloc_or_module_addr' follows non-static declaration
1092 | static inline int is_vmalloc_or_module_addr(const void *x)
| ^~~~~~~~~~~~~~~~~~~~~~~~~
In file included from arch/riscv/include/asm/runtime-const.h:7,
from include/linux/fs.h:53,
from include/linux/huge_mm.h:7,
from include/linux/mm.h:1016,
from arch/riscv/kernel/asm-offsets.c:8:
arch/riscv/include/asm/cacheflush.h:49:13: note: previous implicit declaration of 'is_vmalloc_or_module_addr' with type 'int()'
49 | if (is_vmalloc_or_module_addr((void *)start)) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
make[3]: *** [scripts/Makefile.build:182: arch/riscv/kernel/asm-offsets.s] Error 1
make[3]: Target 'prepare' not remade because of errors.
make[2]: *** [Makefile:1282: prepare0] Error 2
make[2]: Target 'prepare' not remade because of errors.
make[1]: *** [Makefile:248: __sub-make] Error 2
make[1]: Target 'prepare' not remade because of errors.
make: *** [Makefile:248: __sub-make] Error 2
make: Target 'prepare' not remade because of errors.
vim +/is_vmalloc_or_module_addr +1092 include/linux/mm.h
98b1917cdef92c David Hildenbrand 2025-04-10 1015
71e3aac0724ffe Andrea Arcangeli 2011-01-13 @1016 #include <linux/huge_mm.h>
^1da177e4c3f41 Linus Torvalds 2005-04-16 1017
^1da177e4c3f41 Linus Torvalds 2005-04-16 1018 /*
^1da177e4c3f41 Linus Torvalds 2005-04-16 1019 * Methods to modify the page usage count.
^1da177e4c3f41 Linus Torvalds 2005-04-16 1020 *
^1da177e4c3f41 Linus Torvalds 2005-04-16 1021 * What counts for a page usage:
^1da177e4c3f41 Linus Torvalds 2005-04-16 1022 * - cache mapping (page->mapping)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1023 * - private data (page->private)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1024 * - page mapped in a task's page tables, each mapping
^1da177e4c3f41 Linus Torvalds 2005-04-16 1025 * is counted separately
^1da177e4c3f41 Linus Torvalds 2005-04-16 1026 *
^1da177e4c3f41 Linus Torvalds 2005-04-16 1027 * Also, many kernel routines increase the page count before a critical
^1da177e4c3f41 Linus Torvalds 2005-04-16 1028 * routine so they can be sure the page doesn't go away from under them.
^1da177e4c3f41 Linus Torvalds 2005-04-16 1029 */
^1da177e4c3f41 Linus Torvalds 2005-04-16 1030
^1da177e4c3f41 Linus Torvalds 2005-04-16 1031 /*
da6052f7b33abe Nicholas Piggin 2006-09-25 1032 * Drop a ref, return true if the refcount fell to zero (the page has no users)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1033 */
7c8ee9a86340db Nicholas Piggin 2006-03-22 1034 static inline int put_page_testzero(struct page *page)
7c8ee9a86340db Nicholas Piggin 2006-03-22 1035 {
fe896d1878949e Joonsoo Kim 2016-03-17 1036 VM_BUG_ON_PAGE(page_ref_count(page) == 0, page);
fe896d1878949e Joonsoo Kim 2016-03-17 1037 return page_ref_dec_and_test(page);
7c8ee9a86340db Nicholas Piggin 2006-03-22 1038 }
^1da177e4c3f41 Linus Torvalds 2005-04-16 1039
b620f63358cd35 Matthew Wilcox (Oracle 2020-12-06 1040) static inline int folio_put_testzero(struct folio *folio)
b620f63358cd35 Matthew Wilcox (Oracle 2020-12-06 1041) {
b620f63358cd35 Matthew Wilcox (Oracle 2020-12-06 1042) return put_page_testzero(&folio->page);
b620f63358cd35 Matthew Wilcox (Oracle 2020-12-06 1043) }
b620f63358cd35 Matthew Wilcox (Oracle 2020-12-06 1044)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1045 /*
7c8ee9a86340db Nicholas Piggin 2006-03-22 1046 * Try to grab a ref unless the page has a refcount of zero, return false if
7c8ee9a86340db Nicholas Piggin 2006-03-22 1047 * that is the case.
8e0861fa3c4edf Alexey Kardashevskiy 2013-08-28 1048 * This can be called when MMU is off so it must not access
8e0861fa3c4edf Alexey Kardashevskiy 2013-08-28 1049 * any of the virtual mappings.
^1da177e4c3f41 Linus Torvalds 2005-04-16 1050 */
c25303281d7929 Matthew Wilcox (Oracle 2021-06-05 1051) static inline bool get_page_unless_zero(struct page *page)
7c8ee9a86340db Nicholas Piggin 2006-03-22 1052 {
fe896d1878949e Joonsoo Kim 2016-03-17 1053 return page_ref_add_unless(page, 1, 0);
7c8ee9a86340db Nicholas Piggin 2006-03-22 1054 }
^1da177e4c3f41 Linus Torvalds 2005-04-16 1055
3c1ea2c729ef8e Vishal Moola (Oracle 2023-01-30 1056) static inline struct folio *folio_get_nontail_page(struct page *page)
3c1ea2c729ef8e Vishal Moola (Oracle 2023-01-30 1057) {
3c1ea2c729ef8e Vishal Moola (Oracle 2023-01-30 1058) if (unlikely(!get_page_unless_zero(page)))
3c1ea2c729ef8e Vishal Moola (Oracle 2023-01-30 1059) return NULL;
3c1ea2c729ef8e Vishal Moola (Oracle 2023-01-30 1060) return (struct folio *)page;
3c1ea2c729ef8e Vishal Moola (Oracle 2023-01-30 1061) }
3c1ea2c729ef8e Vishal Moola (Oracle 2023-01-30 1062)
53df8fdc15fb64 Wu Fengguang 2010-01-27 1063 extern int page_is_ram(unsigned long pfn);
124fe20d94630b Dan Williams 2015-08-10 1064
124fe20d94630b Dan Williams 2015-08-10 1065 enum {
124fe20d94630b Dan Williams 2015-08-10 1066 REGION_INTERSECTS,
124fe20d94630b Dan Williams 2015-08-10 1067 REGION_DISJOINT,
124fe20d94630b Dan Williams 2015-08-10 1068 REGION_MIXED,
124fe20d94630b Dan Williams 2015-08-10 1069 };
124fe20d94630b Dan Williams 2015-08-10 1070
1c29f25bf5d6c5 Toshi Kani 2016-01-26 1071 int region_intersects(resource_size_t offset, size_t size, unsigned long flags,
1c29f25bf5d6c5 Toshi Kani 2016-01-26 1072 unsigned long desc);
53df8fdc15fb64 Wu Fengguang 2010-01-27 1073
48667e7a43c1a1 Christoph Lameter 2008-02-04 1074 /* Support for virtually mapped pages */
b3bdda02aa547a Christoph Lameter 2008-02-04 1075 struct page *vmalloc_to_page(const void *addr);
b3bdda02aa547a Christoph Lameter 2008-02-04 1076 unsigned long vmalloc_to_pfn(const void *addr);
48667e7a43c1a1 Christoph Lameter 2008-02-04 1077
0738c4bb8f2a8b Paul Mundt 2008-03-12 1078 /*
0738c4bb8f2a8b Paul Mundt 2008-03-12 1079 * Determine if an address is within the vmalloc range
0738c4bb8f2a8b Paul Mundt 2008-03-12 1080 *
0738c4bb8f2a8b Paul Mundt 2008-03-12 1081 * On nommu, vmalloc/vfree wrap through kmalloc/kfree directly, so there
0738c4bb8f2a8b Paul Mundt 2008-03-12 1082 * is no special casing required.
0738c4bb8f2a8b Paul Mundt 2008-03-12 1083 */
81ac3ad9061dd9 KAMEZAWA Hiroyuki 2009-09-22 1084 #ifdef CONFIG_MMU
186525bd6b83ef Ingo Molnar 2019-11-29 1085 extern bool is_vmalloc_addr(const void *x);
81ac3ad9061dd9 KAMEZAWA Hiroyuki 2009-09-22 1086 extern int is_vmalloc_or_module_addr(const void *x);
81ac3ad9061dd9 KAMEZAWA Hiroyuki 2009-09-22 1087 #else
186525bd6b83ef Ingo Molnar 2019-11-29 1088 static inline bool is_vmalloc_addr(const void *x)
186525bd6b83ef Ingo Molnar 2019-11-29 1089 {
186525bd6b83ef Ingo Molnar 2019-11-29 1090 return false;
186525bd6b83ef Ingo Molnar 2019-11-29 1091 }
934831d060ccd5 David Howells 2009-09-24 @1092 static inline int is_vmalloc_or_module_addr(const void *x)
81ac3ad9061dd9 KAMEZAWA Hiroyuki 2009-09-22 1093 {
81ac3ad9061dd9 KAMEZAWA Hiroyuki 2009-09-22 1094 return 0;
81ac3ad9061dd9 KAMEZAWA Hiroyuki 2009-09-22 1095 }
81ac3ad9061dd9 KAMEZAWA Hiroyuki 2009-09-22 1096 #endif
9e2779fa281cfd Christoph Lameter 2008-02-04 1097
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-10-31 17:42 ` [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules Mateusz Guzik
2025-10-31 21:46 ` Linus Torvalds
2025-11-01 11:26 ` David Laight
@ 2025-11-04 6:25 ` Linus Torvalds
2025-11-04 8:56 ` Mateusz Guzik
` (2 more replies)
2 siblings, 3 replies; 51+ messages in thread
From: Linus Torvalds @ 2025-11-04 6:25 UTC (permalink / raw)
To: Mateusz Guzik, the arch/x86 maintainers
Cc: brauner, viro, jack, linux-kernel, linux-fsdevel, tglx, pfalcato
[-- Attachment #1: Type: text/plain, Size: 2429 bytes --]
[ Adding x86 maintainers - I had added Thomas earlier, but I guess at
least Borislav might actually care and have input too ]
So I think the patch I will commit would look like the attached: it's
similar to your suggestion, but without the renaming of USER_PTR_MAX,
and with just a
#ifdef MODULE
#define runtime_const_ptr(sym) (sym)
#else
#include <asm/runtime-const.h>
#endif
in the x86 asm/uaccess_64.h header file and an added '#error' for the
MODULE case in the actual x86 runtime-const.h file.
As it is, this bug really only affects modular code that uses
access_ok() and __{get,put}_user(), which is a really broken pattern
to begin with these days, and is happily fairly rare.
That is an old optimization that is no longer an optimization at all
(since a plain "get_user()" is actually *faster* than the access_ok()
and __get_user() these days), and I wish we didn't have any such code
any more, but there are a handful of things that have never been
converted to the modern world order.
So it is what it is, and we have to deal with it.
Also, even that kind of rare and broken code actually *works*,
although the whole "non-canonical reads can speculatively leak
possibly kernel data" does end up being an issue (largely theoretical
because it's now limited to just a couple of odd-ball code sequences).
And yes, it works just because I picked a runtime-const value that is
non-canonical. I'd say it's "by luck", but I did pick that value
partly *because* it's non-canonical, so it's not _entirely_ just luck.
But mostly.
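To make the "non-canonical" point concrete, here is a minimal userspace sketch. It assumes the placeholder is the 0x0123456789abcdef value quoted elsewhere in this thread; the helper name is_canonical() is hypothetical and this is not kernel code. An x86-64 address is canonical when all bits above the implemented virtual-address width are copies of the top implemented bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder value as quoted elsewhere in this thread. */
#define RUNTIME_CONST_PLACEHOLDER 0x0123456789abcdefULL

/*
 * Hypothetical helper: true if addr is canonical for a virtual address
 * width of va_bits (48 for 4-level paging, 57 for 5-level paging).
 * Bits 63..(va_bits - 1) must be all zeroes or all ones.
 */
static bool is_canonical(uint64_t addr, unsigned int va_bits)
{
	uint64_t upper = addr >> (va_bits - 1);
	uint64_t all_ones = UINT64_MAX >> (va_bits - 1);

	return upper == 0 || upper == all_ones;
}
```

With this check the placeholder is non-canonical for both 48-bit and 57-bit widths, so a dereference of an address that only passed a comparison against the unpatched value would be expected to fault rather than reach kernel memory.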
That was all a long explanation for why I am planning on committing
this as a real fix, even if the actual impact of it is largely
theoretical.
Borislav - comments? Generating this patch took longer than it should
have, but I had travel and jetlag and a flight that I expected to have
wifi but didn't... And properly it should probably be committed by
x86 maintainers rather than me, but I did mess this code up in the
first place.
The patch *looks* very straightforward, but since I'm on the road I am
doing this on my laptop and haven't actually tested it yet (well, I've
built this, and booted it, but nothing past that).
Mateusz - I'd like to just credit you with this, since your comment
about modules was why I started looking into this all in the first
place (and you then wrote a similar patch). But I'm not going to do
that without your ack.
Linus
[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 2054 bytes --]
arch/x86/include/asm/runtime-const.h | 4 ++++
arch/x86/include/asm/uaccess_64.h | 10 +++++-----
arch/x86/kernel/cpu/common.c | 6 +++++-
3 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/runtime-const.h b/arch/x86/include/asm/runtime-const.h
index 8d983cfd06ea..e5a13dc8816e 100644
--- a/arch/x86/include/asm/runtime-const.h
+++ b/arch/x86/include/asm/runtime-const.h
@@ -2,6 +2,10 @@
#ifndef _ASM_RUNTIME_CONST_H
#define _ASM_RUNTIME_CONST_H
+#ifdef MODULE
+ #error "Cannot use runtime-const infrastructure from modules"
+#endif
+
#ifdef __ASSEMBLY__
.macro RUNTIME_CONST_PTR sym reg
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index c8a5ae35c871..641f45c22f9d 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -12,12 +12,12 @@
#include <asm/cpufeatures.h>
#include <asm/page.h>
#include <asm/percpu.h>
-#include <asm/runtime-const.h>
-/*
- * Virtual variable: there's no actual backing store for this,
- * it can purely be used as 'runtime_const_ptr(USER_PTR_MAX)'
- */
+#ifdef MODULE
+ #define runtime_const_ptr(sym) (sym)
+#else
+ #include <asm/runtime-const.h>
+#endif
extern unsigned long USER_PTR_MAX;
#ifdef CONFIG_ADDRESS_MASKING
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index c7d3512914ca..02d97834a1d4 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -78,6 +78,10 @@
DEFINE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
EXPORT_PER_CPU_SYMBOL(cpu_info);
+/* Used for modules: built-in code uses runtime constants */
+unsigned long USER_PTR_MAX;
+EXPORT_SYMBOL(USER_PTR_MAX);
+
u32 elf_hwcap2 __read_mostly;
/* Number of siblings per CPU package */
@@ -2579,7 +2583,7 @@ void __init arch_cpu_finalize_init(void)
alternative_instructions();
if (IS_ENABLED(CONFIG_X86_64)) {
- unsigned long USER_PTR_MAX = TASK_SIZE_MAX;
+ USER_PTR_MAX = TASK_SIZE_MAX;
/*
* Enable this when LAM is gated on LASS support
^ permalink raw reply related [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-04 6:25 ` Linus Torvalds
@ 2025-11-04 8:56 ` Mateusz Guzik
2025-11-04 9:37 ` Linus Torvalds
2025-11-04 10:25 ` Borislav Petkov
2025-11-04 17:09 ` Sean Christopherson
2 siblings, 1 reply; 51+ messages in thread
From: Mateusz Guzik @ 2025-11-04 8:56 UTC (permalink / raw)
To: Linus Torvalds
Cc: the arch/x86 maintainers, brauner, viro, jack, linux-kernel,
linux-fsdevel, tglx, pfalcato
On Tue, Nov 4, 2025 at 7:25 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> Mateusz - I'd like to just credit you with this, since your comment
> about modules was why I started looking into this all in the first
> place (and you then wrote a similar patch). But I'm not going to do
> that without your ack.
>
I don't think crediting me here is warranted.
I would appreciate some feedback on the header split idea though. :)
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-04 8:56 ` Mateusz Guzik
@ 2025-11-04 9:37 ` Linus Torvalds
0 siblings, 0 replies; 51+ messages in thread
From: Linus Torvalds @ 2025-11-04 9:37 UTC (permalink / raw)
To: Mateusz Guzik
Cc: the arch/x86 maintainers, brauner, viro, jack, linux-kernel,
linux-fsdevel, tglx, pfalcato
On Tue, 4 Nov 2025 at 17:57, Mateusz Guzik <mjguzik@gmail.com> wrote:
>
> I would appreciate some feedback on the header split idea though. :)
I don't think it's wrong, but I don't think it buys us much either.
And it does make it harder to see what the bigger pattern is - the
code that initializes the constants is deeply intertwined with the
code that uses it, and you split it up into different files, so now
you can't see what the interdependence is...
Linus
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-04 6:25 ` Linus Torvalds
2025-11-04 8:56 ` Mateusz Guzik
@ 2025-11-04 10:25 ` Borislav Petkov
2025-11-04 16:13 ` Borislav Petkov
2025-11-04 17:09 ` Sean Christopherson
2 siblings, 1 reply; 51+ messages in thread
From: Borislav Petkov @ 2025-11-04 10:25 UTC (permalink / raw)
To: Linus Torvalds
Cc: Mateusz Guzik, the arch/x86 maintainers, brauner, viro, jack,
linux-kernel, linux-fsdevel, tglx, pfalcato
On Tue, Nov 04, 2025 at 03:25:20PM +0900, Linus Torvalds wrote:
> Borislav - comments?
LGTM at a quick glance but lemme take it for a spin around the hw jungle here
later and give it a more thorough look, once I've put out all the daily
fires...
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-04 10:25 ` Borislav Petkov
@ 2025-11-04 16:13 ` Borislav Petkov
2025-11-05 1:50 ` Linus Torvalds
2025-11-05 20:50 ` Mateusz Guzik
0 siblings, 2 replies; 51+ messages in thread
From: Borislav Petkov @ 2025-11-04 16:13 UTC (permalink / raw)
To: Linus Torvalds
Cc: Mateusz Guzik, the arch/x86 maintainers, brauner, viro, jack,
linux-kernel, linux-fsdevel, tglx, pfalcato
On Tue, Nov 04, 2025 at 11:25:44AM +0100, Borislav Petkov wrote:
> On Tue, Nov 04, 2025 at 03:25:20PM +0900, Linus Torvalds wrote:
> > Borislav - comments?
>
> LGTM at a quick glance but lemme take it for a spin around the hw jungle here
> later and give it a more thorough look, once I've put out all the daily
> fires...
Did a deeper look, did randbuilds, boots fine on a couple of machines, so all
good AFAIC.
I sincerely hope that helps.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-04 6:25 ` Linus Torvalds
2025-11-04 8:56 ` Mateusz Guzik
2025-11-04 10:25 ` Borislav Petkov
@ 2025-11-04 17:09 ` Sean Christopherson
2025-11-04 19:07 ` Linus Torvalds
2 siblings, 1 reply; 51+ messages in thread
From: Sean Christopherson @ 2025-11-04 17:09 UTC (permalink / raw)
To: Linus Torvalds
Cc: Mateusz Guzik, the arch/x86 maintainers, brauner, viro, jack,
linux-kernel, linux-fsdevel, tglx, pfalcato
On Tue, Nov 04, 2025, Linus Torvalds wrote:
> [ Adding x86 maintainers - I had added Thomas earlier, but I guess at
> least Borislav might actually care and have input too ]
>
> So I think the patch I will commit would look like the attached: it's
> similar to your suggestion, but without the renaming of USER_PTR_MAX,
> and with just a
>
> #ifdef MODULE
> #define runtime_const_ptr(sym) (sym)
> #else
> #include <asm/runtime-const.h>
> #endif
>
> in the x86 asm/uaccess_64.h header file and an added '#error' for the
> MODULE case in the actual x86 runtime-const.h file.
>
> As it is, this bug really only affects modular code that uses
What exactly is the bug? Is the problem that module usage of runtime_const_ptr()
doesn't get patched on module load, and so module code ends up using the
0x0123456789abcdef placeholder?
> access_ok() and __{get,put}_user(), which is a really broken pattern
> to begin with these days, and is happily fairly rare.
Just to make sure I understand the impact, doesn't this also affect all flavors
of "nocheck" uaccesses? E.g. access_ok() + __copy_{from,to}_user()?
> That is an old optimization that is no longer an optimization at all
> (since a plain "get_user()" is actually *faster* than the access_ok()
> and __get_user() these days), and I wish we didn't have any such code
> any more, but there are a handful of things that have never been
> converted to the modern world order.
Looking at the assembly, I assume get_user() is faster than __get_user() due to
the LFENCE in ASM_BARRIER_NOSPEC?
> So it is what it is, and we have to deal with it.
Assuming __{get,put}_user() are slower on x86 in all scenarios, would it make
sense to kill them off entirely for x86? E.g. could we reroute them to the
"checked" variants?
For KVM x86, I'm more than happy to switch all two __{get,put}_user() calls to
the checked variants if they're faster.
> Also, even that kind of rare and broken code actually *works*,
> although the whole "non-canonical reads can speculatively leak
> possibly kernel data" does end up being an issue (largely theoretical
> because it's now limited to just a couple of odd-ball code sequences).
>
> And yes, it works just because I picked a runtime-const value that is
> non-canonical. I'd say it's "by luck", but I did pick that value
> partly *because* it's non-canonical, so it's not _entirely_ just luck.
> But mostly.
>
> That was all a long explanation for why I am planning on committing
> this as a real fix, even if the actual impact of it is largely
> theoretical.
>
> Borislav - comments? Generating this patch took longer than it should
> have, but I had travel and jetlag and a flight that I expected to have
> wifi but didn't... And properly it should probably be committed by
> x86 maintainers rather than me, but I did mess this code up in the
> first place.
>
> The patch *looks* very straightforward, but since I'm on the road I am
> doing this on my laptop and haven't actually tested it yet (well, I've
> built this, and booted it, but nothing past that).
FWIW, AFAICT it doesn't cause any regressions for KVM's usage of access_ok().
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-04 17:09 ` Sean Christopherson
@ 2025-11-04 19:07 ` Linus Torvalds
2025-11-04 19:34 ` Linus Torvalds
2025-11-04 20:17 ` Borislav Petkov
0 siblings, 2 replies; 51+ messages in thread
From: Linus Torvalds @ 2025-11-04 19:07 UTC (permalink / raw)
To: Sean Christopherson
Cc: Mateusz Guzik, the arch/x86 maintainers, brauner, viro, jack,
linux-kernel, linux-fsdevel, tglx, pfalcato
On Wed, 5 Nov 2025 at 02:09, Sean Christopherson <seanjc@google.com> wrote:
>
> What exactly is the bug? Is the problem that module usage of runtime_const_ptr()
> doesn't get patched on module load, and so module code ends up using the
> 0x0123456789abcdef placeholder?
Yes. The runtime-const fixup is intentionally simplistic, because the
ordering concerns with the more generic instruction rewriting were
painful (and architecture-specific).
And as part of being simple and stupid, it doesn't deal with modules,
and only runs early at boot.
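The failure mode can be pictured with a toy userspace model. This is only a sketch under stated assumptions: the struct, the arrays and every function name below are hypothetical, and the real mechanism patches instruction immediates rather than struct fields. The point it illustrates is that a one-time fixup which walks only the sites recorded for built-in code leaves later-arriving (module) sites at the placeholder, while a fallback that reads an ordinary exported variable sees the real value.

```c
#include <stddef.h>
#include <stdint.h>

#define PLACEHOLDER 0x0123456789abcdefULL

/* Each "use site" stands in for an instruction with an immediate operand. */
struct use_site {
	uint64_t imm;
};

/* Sites known when the core image is linked (built-in code). */
static struct use_site builtin_sites[] = {
	{ PLACEHOLDER }, { PLACEHOLDER },
};

/* A site that shows up later, like code in a loaded module. */
static struct use_site module_site = { PLACEHOLDER };

/* One-time boot fixup: walks only the table recorded for built-in code. */
static void boot_fixup(uint64_t real_value)
{
	size_t n = sizeof(builtin_sites) / sizeof(builtin_sites[0]);

	for (size_t i = 0; i < n; i++)
		builtin_sites[i].imm = real_value;
}

/* Fallback for modules: an ordinary variable, read at run time. */
static uint64_t exported_value;

static uint64_t module_read_fallback(void)
{
	return exported_value;
}
```

After boot_fixup() runs, the built-in sites hold the real value and module_site still holds PLACEHOLDER. The "#ifdef MODULE" fallback in the patch corresponds to module_read_fallback() in this model.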
> Just to make sure I understand the impact, doesn't this also affect all flavors
> of "nocheck" uaccesses? E.g. access_ok() + __copy_{from,to}_user()?
The access_ok() issue happens with those too, but I don't think there
was any way to then trigger the speculative leak with non-canonical
addresses that way. Iirc, you needed a load-load gadget and only had
a few cycles in which to do it.
But in theory, yes.
> Looking at the assembly, I assume get_user() is faster than __get_user() due to
> the LFENCE in ASM_BARRIER_NOSPEC?
The access_ok() itself is also slower than the address masking, with
the whole "add size and check for overflow" dance that a plain
get_user() simply doesn't need.
Of course, at some point it can be advantageous to only check the
address once, and then do multiple __get_user() calls, and that was
obviously the *original* advantage (along with inlining the
single-instruction __get_user).
But with SMAP, the inlining advantage hasn't existed for years, and
the "avoid 3*N cheap ALU instructions by using a much more expensive
access_ok()" is dubious even for somewhat larger values of N.
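The difference between the two checks can be sketched in plain C. This is a userspace illustration, not the kernel implementation: USER_LIMIT is a hypothetical stand-in for the highest valid user address, and range_ok() and clamp_user_addr() are made-up names. The first function does the "add size and check for overflow" work; the second shows one way masking can be expressed, as a single compare-and-select on the pointer with no size arithmetic at all.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit, standing in for the highest valid user address. */
#define USER_LIMIT 0x00007ffffffff000ULL

/*
 * Range check in the style of access_ok(): add the size, then reject
 * both wraparound and any range that ends past the limit.
 */
static bool range_ok(uint64_t addr, uint64_t size)
{
	uint64_t end = addr + size;	/* unsigned add wraps on overflow */

	return end >= addr && end <= USER_LIMIT;
}

/*
 * Masking in the style of a plain get_user(): force any out-of-range
 * pointer to the limit so that it cannot name a kernel address.
 */
static uint64_t clamp_user_addr(uint64_t addr)
{
	return addr > USER_LIMIT ? USER_LIMIT : addr;
}
```

In this sketch the clamp costs one comparison per access, while the range check needs an addition plus two comparisons before any access can start.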
> Assuming __{get,put}_user() are slower on x86 in all scenarios, would it make
> sense to kill them off entirely for x86? E.g. could we reroute them to the
> "checked" variants?
Sadly, no. We've wanted to do that many times for various other
reasons, and we really should, but because of historical semantics,
some horrendous users still use "__get_user()" for addresses that
might be user space or might be kernel space depending on use-case.
Maybe we should bite the bullet and just break any remaining cases of
that horrendous historical pattern. There might not be any actual
relevant ones left, and they should all be easyish to fix if we just
*find* them. But we had that pattern in at least some tracing code,
and I'd expect some random drivers too, just because it *used* to
historically work to do "the user access path does access_ok(), the
kernel access path doesn't, and then both can use __get_user()".
In fact, Josh Poimboeuf tried to do that __get_user() fix fairly
recently, but he hit at least the "coco" code mis-using this thing.
See vc_read_mem() in arch/x86/coco/sev/vc-handle.c.
Are there others? Probably not very many. But *finding* all those
cases is the painful part.
Anybody want a new pet project?
Linus
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-04 19:07 ` Linus Torvalds
@ 2025-11-04 19:34 ` Linus Torvalds
2025-11-04 21:53 ` Sean Christopherson
2025-11-04 20:17 ` Borislav Petkov
1 sibling, 1 reply; 51+ messages in thread
From: Linus Torvalds @ 2025-11-04 19:34 UTC (permalink / raw)
To: Sean Christopherson
Cc: Mateusz Guzik, the arch/x86 maintainers, brauner, viro, jack,
linux-kernel, linux-fsdevel, tglx, pfalcato
On Wed, 5 Nov 2025 at 04:07, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Sadly, no. We've wanted to do that many times for various other
> reasons, and we really should, but because of historical semantics,
> some horrendous users still use "__get_user()" for addresses that
> might be user space or might be kernel space depending on use-case.
>
> Maybe we should bite the bullet and just break any remaining cases of
> that horrendous historical pattern. [...]
What I think is probably the right approach is to just take the normal
__get_user() calls - the ones that are obviously to user space, and
have an access_ok() - and just replace them with get_user().
That should all be very simple and straightforward for any half-way
normal code, and you won't see any downsides.
And in the unlikely case that you can measure any performance impact
because you had one single access_ok() and many __get_user() calls,
and *if* you really really care, that kind of code should be using
"user_read_access_begin()" and friends anyway, because unlike the
range checking, the *real* performance issue is almost certainly going
to be the cost of the CLAC/STAC instructions.
Put another way: __get_user() is simply always wrong these days.
Either it's wrong because it's a bad historical optimization that
isn't an optimization any more, or it's wrong because it's mis-using
the old semantics to play tricks with kernel-vs-user memory.
So we shouldn't try to "fix" __get_user(). We should aim to get rid of it.
Linus
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-04 19:07 ` Linus Torvalds
2025-11-04 19:34 ` Linus Torvalds
@ 2025-11-04 20:17 ` Borislav Petkov
2025-11-04 22:06 ` Linus Torvalds
1 sibling, 1 reply; 51+ messages in thread
From: Borislav Petkov @ 2025-11-04 20:17 UTC (permalink / raw)
To: Linus Torvalds, Joerg Roedel, Tom Lendacky
Cc: Sean Christopherson, Mateusz Guzik, the arch/x86 maintainers,
brauner, viro, jack, linux-kernel, linux-fsdevel, tglx, pfalcato
+ Joerg and Tom.
On Wed, Nov 05, 2025 at 04:07:44AM +0900, Linus Torvalds wrote:
> In fact, Josh Poimboeuf tried to do that __get_user() fix fairly
> recently, but he hit at least the "coco" code mis-using this thing.
>
> See vc_read_mem() in arch/x86/coco/sev/vc-handle.c.
So Tom and I did pre-fault this whole deal just now: so we need an atomic way
to figure out whether we'll fault on the address and then handle that result
properly. Which we do. So we only need to know whether it'll fault or not,
without sleeping.
So the question is, what would be an alternative to do that? Should we do
something homegrown?
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-04 19:34 ` Linus Torvalds
@ 2025-11-04 21:53 ` Sean Christopherson
0 siblings, 0 replies; 51+ messages in thread
From: Sean Christopherson @ 2025-11-04 21:53 UTC (permalink / raw)
To: Linus Torvalds
Cc: Mateusz Guzik, the arch/x86 maintainers, brauner, viro, jack,
linux-kernel, linux-fsdevel, tglx, pfalcato
On Wed, Nov 05, 2025, Linus Torvalds wrote:
> On Wed, 5 Nov 2025 at 04:07, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > Sadly, no. We've wanted to do that many times for various other
> > reasons, and we really should, but because of historical semantics,
> > some horrendous users still use "__get_user()" for addresses that
> > might be user space or might be kernel space depending on use-case.
Eww.
> > Maybe we should bite the bullet and just break any remaining cases of
> > that horrendous historical pattern. [...]
>
> What I think is probably the right approach is to just take the normal
> __get_user() calls - the ones that are obviously to user space, and
> have an access_ok() - and just replace them with get_user().
>
> That should all be very simple and straightforward for any half-way
> normal code, and you won't see any downsides.
>
> And in the unlikely case that you can measure any performance impact
> because you had one single access_ok() and many __get_user() calls,
> and *if* you really really care, that kind of code should be using
> "user_read_access_begin()" and friends anyway, because unlike the
> range checking, the *real* performance issue is almost certainly going
> to be the cost of the CLAC/STAC instructions.
>
> Put another way: __get_user() is simply always wrong these days.
> Either it's wrong because it's a bad historical optimization that
> isn't an optimization any more, or it's wrong because it's mis-using
> the old semantics to play tricks with kernel-vs-user memory.
>
> So we shouldn't try to "fix" __get_user(). We should aim to get rid of it.
Curiosity got the better of me :-)
TL;DR: I agree, we should kill __get_user().
KVM x86's use case is a bit of a snowflake. KVM does the access_ok() check when
host userspace configures memory regions for the guest, and then does __get_user()
when reading guest PTEs (i.e. when walking the guest's page tables for shadow
paging).
For each access_ok(), there are potentially billions (with a 'b') of __get_user()
calls throughout the lifetime of the guest when KVM is using shadow paging. E.g.
just booting a Linux guest hits the __get_user() in arch/x86/kvm/mmu/paging_tmpl.h
a few million times. So if there's any chance that split access_ok() + __get_user()
provides a performance advantage, then it should show up in KVM's shadow paging
use case.
Unless I botched the measurements, get_user() is straight up faster on both Intel
(EMR) and AMD (Turin). Over tens of millions of calls, get_user() is 12%+ faster
on Intel and 25%+ faster on AMD, relative to __get_user(). The extra overhead is
pretty much entirely due to the LFENCE, as open coding the equivalent via
__uaccess_begin_nospec()+unsafe_get_user()+__uaccess_end(), to avoid the CALL+RET,
yields identical numbers to __get_user(). Dropping the LFENCE, by using
__uaccess_begin(), manages to eke out a victory over get_user() by ~2 cycles, but
that's not remotely worth having to think about whether or not the LFENCE is necessary.
The only setup I can think of that _might_ benefit from __get_user() would be
ancient CPUs without EPT/NPT (i.e. CPUs on which KVM _must_ use shadow paging)
and without SMAP, but those CPUs are so old that IMO they simply aren't relevant
when it comes to performance. Or I suppose the horrors where RET is actually
something else entirely, but that's also a "don't care", at least as far as KVM
is concerned.
Cycles per guest PTE read:

               __get_user()   get_user()   open-coded   open-coded, no LFENCE
Intel (EMR)        75.1          67.6         75.3              65.5
AMD (Turin)        68.1          51.1         67.5              49.3
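For readers who want to redo the arithmetic, here is a small C sketch that derives relative figures from the table above. The struct and function names are made up for illustration, and the only inputs are the four rounded cycle counts per CPU, so the results need not match percentages taken from the full measurement runs.

```c
#include <assert.h>
#include <stdio.h>

/* Cycles per guest PTE read, copied from the table above. */
struct pte_read_cost {
	const char *cpu;
	double raw_get_user;		/* __get_user() */
	double get_user;		/* get_user() */
	double open_coded;		/* open-coded, with LFENCE */
	double open_coded_nofence;	/* open-coded, no LFENCE */
};

static const struct pte_read_cost costs[] = {
	{ "Intel (EMR)", 75.1, 67.6, 75.3, 65.5 },
	{ "AMD (Turin)", 68.1, 51.1, 67.5, 49.3 },
};

/* Share of cycles saved when going from 'slow' to 'fast', in percent. */
static double pct_cycles_saved(double slow, double fast)
{
	assert(slow > 0.0);
	return 100.0 * (slow - fast) / slow;
}

/* Print, per CPU, what get_user() saves and what dropping LFENCE gains. */
static void print_summary(void)
{
	for (size_t i = 0; i < sizeof(costs) / sizeof(costs[0]); i++) {
		const struct pte_read_cost *c = &costs[i];

		printf("%s: get_user() saves %.1f%% vs __get_user(); "
		       "no-LFENCE open-coding wins by %.1f cycles\n",
		       c->cpu,
		       pct_cycles_saved(c->raw_get_user, c->get_user),
		       c->get_user - c->open_coded_nofence);
	}
}
```

Note that the percentage depends on the base: measured as cycles saved relative to __get_user(), the rows give roughly 10% (Intel) and 25% (AMD); measured as a speed ratio relative to get_user(), the same rows give roughly 11% and 33%.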
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-04 20:17 ` Borislav Petkov
@ 2025-11-04 22:06 ` Linus Torvalds
2025-11-05 11:49 ` Borislav Petkov
0 siblings, 1 reply; 51+ messages in thread
From: Linus Torvalds @ 2025-11-04 22:06 UTC (permalink / raw)
To: Borislav Petkov
Cc: Joerg Roedel, Tom Lendacky, Sean Christopherson, Mateusz Guzik,
the arch/x86 maintainers, brauner, viro, jack, linux-kernel,
linux-fsdevel, tglx, pfalcato
On Wed, 5 Nov 2025 at 05:18, Borislav Petkov <bp@alien8.de> wrote:
>
> On Wed, Nov 05, 2025 at 04:07:44AM +0900, Linus Torvalds wrote:
> > In fact, Josh Poimboeuf tried to do that __get_user() fix fairly
> > recently, but he hit at least the "coco" code mis-using this thing.
> >
> > See vc_read_mem() in arch/x86/coco/sev/vc-handle.c.
>
> So Tom and I did pre-fault this whole deal just now: so we need an atomic way
> to figure out whether we'll fault on the address and then handle that result
> properly. Which we do. So we only need to know whether it'll fault or not,
> without sleeping.
>
> So the question is, what would be an alternative to do that? Should we do
> something homegrown?
So I think that since it's x86-specific code, maybe something
homegrown is the way to go. I mean, that code already effectively is.
With a *BIG* comment about what is going on, something like
        pagefault_disable();
        stac();
        unsafe_get_user(val, ptr, fault_label);
        clac();
        pagefault_enable();
        return 0;
  fault_label:
        clac();
        pagefault_enable();
        return 1;
but any other users of __get_user() that aren't in x86-specific code
can't do that, so I do think it's probably better to just migrate the
*good* cases - the ones known to actually be about user space - away
from __get_user() and just leave these turds alone.
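Whatever shape such a homegrown helper takes, both the success exit and the fault exit have to leave page faults re-enabled and the access window closed. A userspace model of that control flow, with everything hypothetical (`model_mem`, the two state variables and `model_probe_read()` are stand-ins, and the range test merely simulates an access that would fault), might look like:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MEM_SIZE 32
static unsigned char model_mem[MODEL_MEM_SIZE];	/* stand-in for mapped memory */

static int pagefault_disabled_depth;	/* models pagefault_disable() nesting */
static bool access_window_open;		/* models the STAC/CLAC state */

/* Returns 0 if the read succeeded, 1 if it "would have faulted". */
static int model_probe_read(uint32_t *val, size_t off)
{
	int ret = 1;

	assert(val != NULL);

	pagefault_disabled_depth++;	/* pagefault_disable() */
	access_window_open = true;	/* stac() */

	/* Stands in for the exception-table jump out of unsafe_get_user(). */
	if (off > MODEL_MEM_SIZE - sizeof(*val))
		goto fault;

	memcpy(val, &model_mem[off], sizeof(*val));
	ret = 0;
fault:
	/* Shared cleanup: runs on both the success and the fault path. */
	access_window_open = false;	/* clac() */
	pagefault_disabled_depth--;	/* pagefault_enable() */
	return ret;
}
```

Funnelling both exits through one cleanup label is just one way to keep the disable/enable pair balanced; duplicating the cleanup on each path works equally well as long as neither call is forgotten.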
Linus
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-04 16:13 ` Borislav Petkov
@ 2025-11-05 1:50 ` Linus Torvalds
2025-11-05 11:37 ` Borislav Petkov
2025-11-05 20:50 ` Mateusz Guzik
1 sibling, 1 reply; 51+ messages in thread
From: Linus Torvalds @ 2025-11-05 1:50 UTC (permalink / raw)
To: Borislav Petkov
Cc: Mateusz Guzik, the arch/x86 maintainers, brauner, viro, jack,
linux-kernel, linux-fsdevel, tglx, pfalcato
On Wed, 5 Nov 2025 at 01:14, Borislav Petkov <bp@alien8.de> wrote:
>
> Did a deeper look, did randbuilds, boots fine on a couple of machines, so all
> good AFAIIC.
>
> I sincerely hope that helps.
I pushed it out with a proper commit message etc. It might not be an
acute bug right now, but I do want it fixed in 6.18, so that when
Thomas' new scoped accessors get merged - and maybe cause the whole
inlining pattern to be much more commonly used - this is all behind
us.
And the patch certainly _looks_ ObviouslyCorrect(tm). Famous last words.
Linus
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-05 1:50 ` Linus Torvalds
@ 2025-11-05 11:37 ` Borislav Petkov
0 siblings, 0 replies; 51+ messages in thread
From: Borislav Petkov @ 2025-11-05 11:37 UTC (permalink / raw)
To: Linus Torvalds
Cc: Mateusz Guzik, the arch/x86 maintainers, brauner, viro, jack,
linux-kernel, linux-fsdevel, tglx, pfalcato
On Wed, Nov 05, 2025 at 10:50:21AM +0900, Linus Torvalds wrote:
> I pushed it out with a proper commit message etc. It might not be an
> acute bug right now, but I do want it fixed in 6.18, so that when
> Thomas' new scoped accessors get merged - and maybe cause the whole
> inlining pattern to be much more commonly used - this is all behind
> us.
Right.
> And the patch certainly _looks_ ObviouslyCorrect(tm). Famous last words.
Yeah, and we are testing the lineup in the coming weeks on a lot of hw so if
anything fires, we will catch it. So we should be good.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-04 22:06 ` Linus Torvalds
@ 2025-11-05 11:49 ` Borislav Petkov
0 siblings, 0 replies; 51+ messages in thread
From: Borislav Petkov @ 2025-11-05 11:49 UTC (permalink / raw)
To: Linus Torvalds
Cc: Joerg Roedel, Tom Lendacky, Sean Christopherson, Mateusz Guzik,
the arch/x86 maintainers, brauner, viro, jack, linux-kernel,
linux-fsdevel, tglx, pfalcato
On Wed, Nov 05, 2025 at 07:06:05AM +0900, Linus Torvalds wrote:
> but any other users of __get_user() that aren't in x86-specific code
> can't do that, so I do think it's probably better to just migrate the
> *good* cases - the ones known to actually be about user space - away
> from __get_user() and just leave these turds alone.
We probably should think of a scheme to stop __get_user() from spreading
around by hiding it in an arch-specific header which doesn't get exposed to
modules/drivers/etc and then once that is in place, take care of the existing
offenders and convert them slowly.
It'll need careful conversion and testing, I'd say but at least we'll have
a finite, non-growing number of occurrences to convert:
$ git grep -w __get_user *.c | grep -v arch | wc -l
43
Not a lot. (The headers are mostly macro definitions AFAICT).
The arches would then be a separate deal...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-04 16:13 ` Borislav Petkov
2025-11-05 1:50 ` Linus Torvalds
@ 2025-11-05 20:50 ` Mateusz Guzik
2025-11-06 11:14 ` Borislav Petkov
1 sibling, 1 reply; 51+ messages in thread
From: Mateusz Guzik @ 2025-11-05 20:50 UTC (permalink / raw)
To: Borislav Petkov
Cc: Linus Torvalds, the arch/x86 maintainers, brauner, viro, jack,
linux-kernel, linux-fsdevel, tglx, pfalcato
On Tue, Nov 4, 2025 at 5:14 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Tue, Nov 04, 2025 at 11:25:44AM +0100, Borislav Petkov wrote:
> > On Tue, Nov 04, 2025 at 03:25:20PM +0900, Linus Torvalds wrote:
> > > Borislav - comments?
> >
> > LGTM at a quick glance but lemme take it for a spin around the hw jungle here
> > later and give it a more thorough look, once I've put out all the daily
> > fires...
>
> Did a deeper look, did randbuilds, boots fine on a couple of machines, so all
> good AFAIIC.
>
> I sincerely hope that helps.
>
Derailing the thread from the previous derailment with the following:
For unrelated reasons I disassembled kmem_cache_free and the following
goodies popped up:
sub 0x18e033f(%rip),%rax # ffffffff82f944d0 <page_offset_base>
[..]
add 0x18e031d(%rip),%rax # ffffffff82f944c0 <vmemmap_base>
[..]
mov 0x2189e19(%rip),%rax # ffffffff8383e010 <__pi_phys_base>
These are definitely worthwhile to get rid of.
I'm just worried that given their low level nature they may happen to
be used before the runtime machinery is done patching and for now
can't be bothered to test that.
Worst case separate helpers could be added which are only legally used
after the patching and select cases like the above can get converted
to do it. Again not looking into it myself.
But perhaps someone would be interested? ;)
I'm responding to this e-mail since this would require some testing on
a bunch of uarchs most likely, especially with LA57.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-05 20:50 ` Mateusz Guzik
@ 2025-11-06 11:14 ` Borislav Petkov
2025-11-06 12:06 ` Mateusz Guzik
0 siblings, 1 reply; 51+ messages in thread
From: Borislav Petkov @ 2025-11-06 11:14 UTC (permalink / raw)
To: Mateusz Guzik
Cc: Linus Torvalds, the arch/x86 maintainers, brauner, viro, jack,
linux-kernel, linux-fsdevel, tglx, pfalcato
On Wed, Nov 05, 2025 at 09:50:51PM +0100, Mateusz Guzik wrote:
> For unrelated reasons I disassembled kmem_cache_free and the following
> goodies popped up:
> sub 0x18e033f(%rip),%rax # ffffffff82f944d0 <page_offset_base>
> [..]
> add 0x18e031d(%rip),%rax # ffffffff82f944c0 <vmemmap_base>
> [..]
> mov 0x2189e19(%rip),%rax # ffffffff8383e010 <__pi_phys_base>
>
> These are definitely worthwhile to get rid of.
Says which semi-respectable benchmark?
If none, why bother?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-06 11:14 ` Borislav Petkov
@ 2025-11-06 12:06 ` Mateusz Guzik
2025-11-06 13:10 ` Borislav Petkov
0 siblings, 1 reply; 51+ messages in thread
From: Mateusz Guzik @ 2025-11-06 12:06 UTC (permalink / raw)
To: Borislav Petkov
Cc: Linus Torvalds, the arch/x86 maintainers, brauner, viro, jack,
linux-kernel, linux-fsdevel, tglx, pfalcato
On Thu, Nov 6, 2025 at 12:14 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Wed, Nov 05, 2025 at 09:50:51PM +0100, Mateusz Guzik wrote:
> > For unrelated reasons I disassembled kmem_cache_free and the following
> > goodies popped up:
> > sub 0x18e033f(%rip),%rax # ffffffff82f944d0 <page_offset_base>
> > [..]
> > add 0x18e031d(%rip),%rax # ffffffff82f944c0 <vmemmap_base>
> > [..]
> > mov 0x2189e19(%rip),%rax # ffffffff8383e010 <__pi_phys_base>
> >
> > These are definitely worthwhile to get rid of.
>
> Says which semi-respectable benchmark?
>
> If none, why bother?
>
I don't know what you are trying to say here.
Are you protesting the notion that reducing cache footprint of the
memory allocator is a good idea, or perhaps are you claiming these
vars are too problematic to warrant the effort, or something else?
I'll note that contrary to popular belief the Linux kernel is very
much *slow* in terms of single-threaded performance and it is not
about mitigations or hardening measures. There are tidbits of heavy
microoptimization here and there, but that's all paired with a massive
perf loss a few instructions later -- inlined rep movsq/stosq for small
sizes (gcc is at fault here), lock-prefixed instructions when they can
be avoided, but also cache-cold memory accesses which don't need to be
there and so on.
One great example of slowness is the SLUB allocator with its
cmpxchg16b-using fast paths, but that got recently damage-controlled
with the introduction of "sheaves". Even then, it still leaves performance
on the table.
I don't know if you consider this semi-respectable or better, but
years back Ingo Molnar created a simple benchmark for i-cache
footprint: https://lkml.org/lkml/2015/5/19/1009
I have been using a modified version of it on and off to optimize
FreeBSD and through systemic removal of tons of avoidable work
(including memory references which did not need to be there) I got to
single-threaded performance beating Linux. It's not that anything
clever is taking place there (in fact there is still plenty of room
for improvement), rather Linux has systemic issues where it loses on
performance when it does not have to.
All that said, will you notice not taking a cache miss in there in the
sea of other cache misses and other slowdowns which are currently
present? I don't think so, but it does not invalidate the notion that
they should be eliminated if feasible.
I feel compelled to note runtime-consting of USER_PTR_MAX came in with
no benchmark results (semi-respectable or otherwise) and still
received no pushback despite a bug being uncovered related to it. Per
the above, I think runtime-consting of the thing makes perfect sense
and does not warrant benchmarking. Like I said, I'm not sure what you
were trying to state. If your position is that a benchmark is required
to remove a memory reference from a frequently used codepath, then you
should be protesting USER_PTR_MAX.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-06 12:06 ` Mateusz Guzik
@ 2025-11-06 13:10 ` Borislav Petkov
2025-11-06 13:19 ` Mateusz Guzik
0 siblings, 1 reply; 51+ messages in thread
From: Borislav Petkov @ 2025-11-06 13:10 UTC (permalink / raw)
To: Mateusz Guzik
Cc: Linus Torvalds, the arch/x86 maintainers, brauner, viro, jack,
linux-kernel, linux-fsdevel, tglx, pfalcato
On Thu, Nov 06, 2025 at 01:06:06PM +0100, Mateusz Guzik wrote:
> I don't know what you are trying to say here.
>
> Are you protesting the notion that reducing cache footprint of the
> memory allocator is a good idea, or perhaps are you claiming these
> vars are too problematic to warrant the effort, or something else?
I'm saying all work which does not change the code in a trivial way should
have numbers to back it up. As in: "this change X shows this perf improvement
Y with the benchmark Z."
Because code uglification better have a fair justification.
Not just random "oh yeah, it would be better to have this." If the changes are
trivial, sure. But the runtime const thing was added for a very narrow case,
AFAIR, and it wasn't supposed to have a widespread use. And it ain't that
trivial, codewise.
IOW, no non-trivial changes which become a burden to maintainers without
a really good reason for them. This has been the guiding principle for
non-trivial perf optimizations in Linux. AFAIR at least.
But hey, what do I know...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-06 13:10 ` Borislav Petkov
@ 2025-11-06 13:19 ` Mateusz Guzik
2025-11-06 13:36 ` Borislav Petkov
2025-11-06 19:26 ` David Laight
0 siblings, 2 replies; 51+ messages in thread
From: Mateusz Guzik @ 2025-11-06 13:19 UTC (permalink / raw)
To: Borislav Petkov
Cc: Linus Torvalds, the arch/x86 maintainers, brauner, viro, jack,
linux-kernel, linux-fsdevel, tglx, pfalcato
On Thu, Nov 6, 2025 at 2:10 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Thu, Nov 06, 2025 at 01:06:06PM +0100, Mateusz Guzik wrote:
> > I don't know what you are trying to say here.
> >
> > Are you protesting the notion that reducing cache footprint of the
> > memory allocator is a good idea, or perhaps are you claiming these
> > vars are too problematic to warrant the effort, or something else?
>
> I'm saying all work which does not change the code in a trivial way should
> have numbers to back it up. As in: "this change X shows this perf improvement
> Y with the benchmark Z."
>
> Because code uglification better have a fair justification.
>
> Not just random "oh yeah, it would be better to have this." If the changes are
> trivial, sure. But the runtime const thing was added for a very narrow case,
> AFAIR, and it wasn't supposed to have a widespread use. And it ain't that
> trivial, codewise.
>
> IOW, no non-trivial changes which become a burden to maintainers without
> a really good reason for them. This has been the guiding principle for
> non-trivial perf optimizations in Linux. AFAIR at least.
>
> But hey, what do I know...
Then, as I pointed out, you should be protesting the patching of
USER_PTR_MAX as it came with no benchmarks and also resulted in a
regression.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-06 13:19 ` Mateusz Guzik
@ 2025-11-06 13:36 ` Borislav Petkov
2025-11-06 14:49 ` Mateusz Guzik
2025-11-06 19:26 ` David Laight
1 sibling, 1 reply; 51+ messages in thread
From: Borislav Petkov @ 2025-11-06 13:36 UTC (permalink / raw)
To: Mateusz Guzik
Cc: Linus Torvalds, the arch/x86 maintainers, brauner, viro, jack,
linux-kernel, linux-fsdevel, tglx, pfalcato
On Thu, Nov 06, 2025 at 02:19:06PM +0100, Mateusz Guzik wrote:
> Then, as I pointed out, you should be protesting the patching of
> USER_PTR_MAX as it came with no benchmarks
That came in as a security fix. I'd say correctness before performance. And if
anyone finds a better and faster fix and can prove it, I'm all ears.
> and also resulted in a regression.
Oh well, shit happens on a daily basis. And then we fix it and move on.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-06 13:36 ` Borislav Petkov
@ 2025-11-06 14:49 ` Mateusz Guzik
0 siblings, 0 replies; 51+ messages in thread
From: Mateusz Guzik @ 2025-11-06 14:49 UTC (permalink / raw)
To: Borislav Petkov
Cc: Linus Torvalds, the arch/x86 maintainers, brauner, viro, jack,
linux-kernel, linux-fsdevel, tglx, pfalcato
On Thu, Nov 6, 2025 at 2:37 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Thu, Nov 06, 2025 at 02:19:06PM +0100, Mateusz Guzik wrote:
> > Then, as I pointed out, you should be protesting the patching of
> > USER_PTR_MAX as it came with no benchmarks
>
> That came in as a security fix. I'd say correctness before performance. And if
> anyone finds a better and faster fix and can prove it, I'm all ears.
>
Perhaps I failed to state my point clearly.
The position you are describing above does not line up with your
behavior concerning the use of runtime-const machinery for
USER_PTR_MAX.
It is purely an optimization and it has nothing to do with fixing the
problem the commit introducing it was aiming to solve. You accepted it
without a benchmark. Later when a bug was identified you did some
testing to make sure it works. I think that made sense. However,
per what you are describing above I would expect you would be
questioning whether this is warranted in the first place.
kmem is probably used about as often as user access (if not more so).
To my reading you rejected the idea of patching up some of its memory
accesses without a benchmark from the get go, which is quite a
different stance and I find myself confused about the discrepancy.
I have not tried to write patches to optimize these. There is a
threshold of complexity/ugliness where I would drop the idea myself.
But in a hypothetical case where they turn out fine, I don't
understand what's up with the insistence on benchmarks for this
particular thing, especially in light of your position on
USER_PTR_MAX. Per what I described previously, this would be hard to
arrange anyway even if someone genuinely tried.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-06 13:19 ` Mateusz Guzik
2025-11-06 13:36 ` Borislav Petkov
@ 2025-11-06 19:26 ` David Laight
2025-11-06 19:49 ` Linus Torvalds
1 sibling, 1 reply; 51+ messages in thread
From: David Laight @ 2025-11-06 19:26 UTC (permalink / raw)
To: Mateusz Guzik
Cc: Borislav Petkov, Linus Torvalds, the arch/x86 maintainers,
brauner, viro, jack, linux-kernel, linux-fsdevel, tglx, pfalcato
On Thu, 6 Nov 2025 14:19:06 +0100
Mateusz Guzik <mjguzik@gmail.com> wrote:
> On Thu, Nov 6, 2025 at 2:10 PM Borislav Petkov <bp@alien8.de> wrote:
> >
> > On Thu, Nov 06, 2025 at 01:06:06PM +0100, Mateusz Guzik wrote:
> > > I don't know what you are trying to say here.
> > >
> > > Are you protesting the notion that reducing cache footprint of the
> > > memory allocator is a good idea, or perhaps are you claiming these
> > > vars are too problematic to warrant the effort, or something else?
> >
> > I'm saying all work which does not change the code in a trivial way should
> > have numbers to back it up. As in: "this change X shows this perf improvement
> > Y with the benchmark Z."
> >
> > Because code uglification better have a fair justification.
> >
> > Not just random "oh yeah, it would be better to have this." If the changes are
> > trivial, sure. But the runtime const thing was added for a very narrow case,
> > AFAIR, and it wasn't supposed to have a widespread use. And it ain't that
> > trivial, codewise.
> >
> > IOW, no non-trivial changes which become a burden to maintainers without
> > a really good reason for them. This has been the guiding principle for
> > non-trivial perf optimizations in Linux. AFAIR at least.
> >
> > But hey, what do I know...
>
> Then, as I pointed out, you should be protesting the patching of
> USER_PTR_MAX as it came with no benchmarks and also resulted in a
> regression.
>
IIRC it was a definite performance improvement for a specific workload
(compiling kernels) on a system where the relatively small d-cache
caused significant overhead reading the value from memory.
Look at the patch author for more info.
David
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules
2025-11-06 19:26 ` David Laight
@ 2025-11-06 19:49 ` Linus Torvalds
0 siblings, 0 replies; 51+ messages in thread
From: Linus Torvalds @ 2025-11-06 19:49 UTC (permalink / raw)
To: David Laight
Cc: Mateusz Guzik, Borislav Petkov, the arch/x86 maintainers, brauner,
viro, jack, linux-kernel, linux-fsdevel, tglx, pfalcato
On Thu, 6 Nov 2025 at 11:26, David Laight <david.laight.linux@gmail.com> wrote:
>
> IIRC it was a definite performance improvement for a specific workload
> (compiling kernels) on a system where the relatively small d-cache
> caused significant overhead reading the value from memory.
Some background:
https://lore.kernel.org/lkml/20240610204821.230388-1-torvalds@linux-foundation.org/
https://lore.kernel.org/lkml/CAHk-=whHvMbfL2ov1MRbT9QfebO2d6-xXi1ynznCCi-k_m6Q0w@mail.gmail.com/
where that "load address from memory" was particularly noticeable on
my 128-core Altra box in profiles.
That machine really has fairly weak cores and caches (it's what I call
a "flock of chickens" design: individual cores are not particularly
interesting, and the only point of that machine is "reasonable
performance on multithreaded loads thanks to many cores").
I did have numbers, but never posted them, because as mentioned in one
of the emails:
For example, making d_hash() avoid indirection just means that now
pretty much _all_ the cost of __d_lookup_rcu() is in the cache misses
on the hash table itself. Which was always the bulk of it. And on my
arm64 machine, it turns out that the best optimization for the load I
tested would be to make that hash table smaller to actually be a bit
denser in the cache. But that's such a load-dependent optimization
that I'm not doing this.
IOW, the actual biggest impact on that machine was when I hacked the
dcache hash tables to be smaller, so that it fit better in the L2.
But that's one of those "tune for the benchmark and the particular
machine" things that I despise, so I never did that except locally for
testing.
The patches that actually got committed are "these improve performance
a bit by just making the code do the same thing, just being less
stupid". Much less noticeable than the "tune for the machine".
Linus
^ permalink raw reply [flat|nested] 51+ messages in thread
end of thread, other threads:[~2025-11-06 19:49 UTC | newest]
Thread overview: 51+ messages
2025-10-30 10:52 [PATCH v4] fs: hide names_cachep behind runtime access machinery Mateusz Guzik
2025-10-30 13:13 ` kernel test robot
2025-10-30 13:19 ` Mateusz Guzik
2025-10-30 16:15 ` Linus Torvalds
2025-10-30 16:35 ` Mateusz Guzik
2025-10-30 18:07 ` Linus Torvalds
2025-10-30 18:25 ` Linus Torvalds
2025-10-30 21:39 ` Mateusz Guzik
2025-10-30 22:06 ` Mateusz Guzik
2025-10-31 12:08 ` Christian Brauner
2025-10-31 15:13 ` Mateusz Guzik
2025-10-31 16:04 ` Linus Torvalds
2025-10-31 16:25 ` Mateusz Guzik
2025-10-31 16:31 ` Linus Torvalds
2025-10-31 17:42 ` [WIP RFC PATCH 0/3] runtime-const header split and whatnot Mateusz Guzik
2025-10-31 17:42 ` [PATCH 1/3] x86: fix access_ok() and valid_user_address() using wrong USER_PTR_MAX in modules Mateusz Guzik
2025-10-31 21:46 ` Linus Torvalds
2025-10-31 22:01 ` Mateusz Guzik
2025-11-01 11:26 ` David Laight
2025-11-04 6:25 ` Linus Torvalds
2025-11-04 8:56 ` Mateusz Guzik
2025-11-04 9:37 ` Linus Torvalds
2025-11-04 10:25 ` Borislav Petkov
2025-11-04 16:13 ` Borislav Petkov
2025-11-05 1:50 ` Linus Torvalds
2025-11-05 11:37 ` Borislav Petkov
2025-11-05 20:50 ` Mateusz Guzik
2025-11-06 11:14 ` Borislav Petkov
2025-11-06 12:06 ` Mateusz Guzik
2025-11-06 13:10 ` Borislav Petkov
2025-11-06 13:19 ` Mateusz Guzik
2025-11-06 13:36 ` Borislav Petkov
2025-11-06 14:49 ` Mateusz Guzik
2025-11-06 19:26 ` David Laight
2025-11-06 19:49 ` Linus Torvalds
2025-11-04 17:09 ` Sean Christopherson
2025-11-04 19:07 ` Linus Torvalds
2025-11-04 19:34 ` Linus Torvalds
2025-11-04 21:53 ` Sean Christopherson
2025-11-04 20:17 ` Borislav Petkov
2025-11-04 22:06 ` Linus Torvalds
2025-11-05 11:49 ` Borislav Petkov
2025-10-31 17:42 ` [PATCH 2/3] runtime-const: split headers between accessors and fixup; disable for modules Mateusz Guzik
2025-10-31 17:42 ` [PATCH 3/3] fs: hide names_cachep behind runtime access machinery Mateusz Guzik
2025-10-31 23:30 ` kernel test robot
2025-10-31 23:30 ` kernel test robot
2025-10-31 23:41 ` kernel test robot
2025-11-01 17:49 ` kernel test robot
2025-10-31 13:30 ` [PATCH v4] " kernel test robot
2025-10-31 22:43 ` kernel test robot
2025-11-01 23:06 ` kernel test robot