linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] string: Disable read_word_at_a_time() optimizations if kernel MTE is enabled
@ 2025-03-08  2:33 Peter Collingbourne
  2025-03-08  3:36 ` Kees Cook
  2025-03-10 17:29 ` Catalin Marinas
  0 siblings, 2 replies; 11+ messages in thread
From: Peter Collingbourne @ 2025-03-08  2:33 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Andrew Morton,
	Kees Cook, Andy Shevchenko, Andrey Konovalov, Catalin Marinas
  Cc: Peter Collingbourne, linux-fsdevel, linux-kernel, linux-hardening,
	linux-arm-kernel, stable

The optimized strscpy() and dentry_string_cmp() routines will read 8
unaligned bytes at a time via the function read_word_at_a_time(), but
this is incompatible with MTE which will fault on a partially invalid
read. The attributes on read_word_at_a_time() that disable KASAN are
invisible to the CPU so they have no effect on MTE. Let's fix the
bug for now by disabling the optimizations if the kernel is built
with HW tag-based KASAN and consider improvements for followup changes.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/If4b22e43b5a4ca49726b4bf98ada827fdf755548
Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
Cc: stable@vger.kernel.org
---
 fs/dcache.c  | 2 +-
 lib/string.c | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index e3634916ffb93..71f0830ac5e69 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -223,7 +223,7 @@ fs_initcall(init_fs_dcache_sysctls);
  * Compare 2 name strings, return 0 if they match, otherwise non-zero.
  * The strings are both count bytes long, and count is non-zero.
  */
-#ifdef CONFIG_DCACHE_WORD_ACCESS
+#if defined(CONFIG_DCACHE_WORD_ACCESS) && !defined(CONFIG_KASAN_HW_TAGS)
 
 #include <asm/word-at-a-time.h>
 /*
diff --git a/lib/string.c b/lib/string.c
index eb4486ed40d25..9a43a3824d0d7 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -119,7 +119,8 @@ ssize_t sized_strscpy(char *dest, const char *src, size_t count)
 	if (count == 0 || WARN_ON_ONCE(count > INT_MAX))
 		return -E2BIG;
 
-#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && \
+	!defined(CONFIG_KASAN_HW_TAGS)
 	/*
 	 * If src is unaligned, don't cross a page boundary,
 	 * since we don't know if the next page is mapped.
-- 
2.49.0.rc0.332.g42c0ae87b1-goog


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] string: Disable read_word_at_a_time() optimizations if kernel MTE is enabled
  2025-03-08  2:33 [PATCH] string: Disable read_word_at_a_time() optimizations if kernel MTE is enabled Peter Collingbourne
@ 2025-03-08  3:36 ` Kees Cook
  2025-03-10 17:37   ` Catalin Marinas
  2025-03-10 17:29 ` Catalin Marinas
  1 sibling, 1 reply; 11+ messages in thread
From: Kees Cook @ 2025-03-08  3:36 UTC (permalink / raw)
  To: Peter Collingbourne
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Andrew Morton,
	Andy Shevchenko, Andrey Konovalov, Catalin Marinas, linux-fsdevel,
	linux-kernel, linux-hardening, linux-arm-kernel, stable

On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> The optimized strscpy() and dentry_string_cmp() routines will read 8
> unaligned bytes at a time via the function read_word_at_a_time(), but
> this is incompatible with MTE which will fault on a partially invalid
> read. The attributes on read_word_at_a_time() that disable KASAN are
> invisible to the CPU so they have no effect on MTE. Let's fix the
> bug for now by disabling the optimizations if the kernel is built
> with HW tag-based KASAN and consider improvements for followup changes.

Why is faulting on a partially invalid read a problem? It's still
invalid, so ... it should fault, yes? What am I missing?

> 
> Signed-off-by: Peter Collingbourne <pcc@google.com>
> Link: https://linux-review.googlesource.com/id/If4b22e43b5a4ca49726b4bf98ada827fdf755548
> Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
> Cc: stable@vger.kernel.org
> ---
>  fs/dcache.c  | 2 +-
>  lib/string.c | 3 ++-
>  2 files changed, 3 insertions(+), 2 deletions(-)

Why are DCACHE_WORD_ACCESS and HAVE_EFFICIENT_UNALIGNED_ACCESS separate
things? I can see at least one place where it's directly tied:

arch/arm/Kconfig:58:    select DCACHE_WORD_ACCESS if HAVE_EFFICIENT_UNALIGNED_ACCESS

Would it make sense to sort this out so that KASAN_HW_TAGS can be taken
into account at the Kconfig level instead?

> diff --git a/fs/dcache.c b/fs/dcache.c
> index e3634916ffb93..71f0830ac5e69 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -223,7 +223,7 @@ fs_initcall(init_fs_dcache_sysctls);
>   * Compare 2 name strings, return 0 if they match, otherwise non-zero.
>   * The strings are both count bytes long, and count is non-zero.
>   */
> -#ifdef CONFIG_DCACHE_WORD_ACCESS
> +#if defined(CONFIG_DCACHE_WORD_ACCESS) && !defined(CONFIG_KASAN_HW_TAGS)

Why not also the word_at_a_time use in fs/namei.c and lib/siphash.c?

For reference, here are the DCACHE_WORD_ACCESS places:

arch/arm/Kconfig:58:    select DCACHE_WORD_ACCESS if HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/arm64/Kconfig:137: select DCACHE_WORD_ACCESS
arch/powerpc/Kconfig:192:       select DCACHE_WORD_ACCESS if PPC64 && CPU_LITTLE_ENDIAN
arch/riscv/Kconfig:934: select DCACHE_WORD_ACCESS if MMU
arch/s390/Kconfig:154:  select DCACHE_WORD_ACCESS if !KMSAN
arch/x86/Kconfig:160:   select DCACHE_WORD_ACCESS if !KMSAN
arch/x86/um/Kconfig:12: select DCACHE_WORD_ACCESS

>  
>  #include <asm/word-at-a-time.h>
>  /*
> diff --git a/lib/string.c b/lib/string.c
> index eb4486ed40d25..9a43a3824d0d7 100644
> --- a/lib/string.c
> +++ b/lib/string.c
> @@ -119,7 +119,8 @@ ssize_t sized_strscpy(char *dest, const char *src, size_t count)
>  	if (count == 0 || WARN_ON_ONCE(count > INT_MAX))
>  		return -E2BIG;
>  
> -#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> +#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && \
> +	!defined(CONFIG_KASAN_HW_TAGS)

There are lots more places checking CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS...
Why only here?

And the Kconfigs since I was comparing these against DCACHE_WORD_ACCESS

arch/arc/Kconfig:352:   select HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/arm/Kconfig:107:   select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7) && MMU
arch/arm64/Kconfig:222: select HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/loongarch/Kconfig:140:     select HAVE_EFFICIENT_UNALIGNED_ACCESS if !ARCH_STRICT_ALIGN
arch/m68k/Kconfig:33:   select HAVE_EFFICIENT_UNALIGNED_ACCESS if !CPU_HAS_NO_UNALIGNED
arch/powerpc/Kconfig:246:       select HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/riscv/Kconfig:935: select HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/s390/Kconfig:197:  select HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/x86/Kconfig:238:   select HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/x86/um/Kconfig:13: select HAVE_EFFICIENT_UNALIGNED_ACCESS

>  	/*
>  	 * If src is unaligned, don't cross a page boundary,
>  	 * since we don't know if the next page is mapped.
> -- 
> 2.49.0.rc0.332.g42c0ae87b1-goog
> 

-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] string: Disable read_word_at_a_time() optimizations if kernel MTE is enabled
  2025-03-08  2:33 [PATCH] string: Disable read_word_at_a_time() optimizations if kernel MTE is enabled Peter Collingbourne
  2025-03-08  3:36 ` Kees Cook
@ 2025-03-10 17:29 ` Catalin Marinas
  1 sibling, 0 replies; 11+ messages in thread
From: Catalin Marinas @ 2025-03-10 17:29 UTC (permalink / raw)
  To: Peter Collingbourne
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Andrew Morton,
	Kees Cook, Andy Shevchenko, Andrey Konovalov, linux-fsdevel,
	linux-kernel, linux-hardening, linux-arm-kernel, stable

On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> The optimized strscpy() and dentry_string_cmp() routines will read 8
> unaligned bytes at a time via the function read_word_at_a_time(), but
> this is incompatible with MTE which will fault on a partially invalid
> read. The attributes on read_word_at_a_time() that disable KASAN are
> invisible to the CPU so they have no effect on MTE. Let's fix the
> bug for now by disabling the optimizations if the kernel is built
> with HW tag-based KASAN and consider improvements for followup changes.
> 
> Signed-off-by: Peter Collingbourne <pcc@google.com>
> Link: https://linux-review.googlesource.com/id/If4b22e43b5a4ca49726b4bf98ada827fdf755548
> Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
> Cc: stable@vger.kernel.org

Some time ago Vincenzo had an attempt at fixing this but neither of us
got around to posting it. It's on top of 6.2 and not sure how cleanly it
would rebase:

git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux devel/mte-strscpy

Feel free to cherry-pick patches from above, rewrite them etc.

> diff --git a/lib/string.c b/lib/string.c
> index eb4486ed40d25..9a43a3824d0d7 100644
> --- a/lib/string.c
> +++ b/lib/string.c
> @@ -119,7 +119,8 @@ ssize_t sized_strscpy(char *dest, const char *src, size_t count)
>  	if (count == 0 || WARN_ON_ONCE(count > INT_MAX))
>  		return -E2BIG;
>  
> -#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> +#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && \
> +	!defined(CONFIG_KASAN_HW_TAGS)

Assuming that no-one wants to ever use KASAN_HW_TAGS=y in production,
this patch would do. Otherwise I'd rather use TCO around the access as
per the last patch from Vincenzo above.

Yet another option - use load_unaligned_zeropad() instead of
read_word_at_a_time(), not sure how it changes the semantics of
strscpy() in any way. This can be done in the arch code

-- 
Catalin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] string: Disable read_word_at_a_time() optimizations if kernel MTE is enabled
  2025-03-08  3:36 ` Kees Cook
@ 2025-03-10 17:37   ` Catalin Marinas
  2025-03-10 18:09     ` Kees Cook
                       ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Catalin Marinas @ 2025-03-10 17:37 UTC (permalink / raw)
  To: Kees Cook
  Cc: Peter Collingbourne, Alexander Viro, Christian Brauner, Jan Kara,
	Andrew Morton, Andy Shevchenko, Andrey Konovalov, linux-fsdevel,
	linux-kernel, linux-hardening, linux-arm-kernel, stable

On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
> On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> > The optimized strscpy() and dentry_string_cmp() routines will read 8
> > unaligned bytes at a time via the function read_word_at_a_time(), but
> > this is incompatible with MTE which will fault on a partially invalid
> > read. The attributes on read_word_at_a_time() that disable KASAN are
> > invisible to the CPU so they have no effect on MTE. Let's fix the
> > bug for now by disabling the optimizations if the kernel is built
> > with HW tag-based KASAN and consider improvements for followup changes.
> 
> Why is faulting on a partially invalid read a problem? It's still
> invalid, so ... it should fault, yes? What am I missing?

read_word_at_a_time() is used to read 8 bytes, potentially unaligned and
beyond the end of string. The has_zero() function is then used to check
where the string ends. For this uses, I think we can go with
load_unaligned_zeropad() which handles a potential fault and pads the
rest with zeroes.

> > Signed-off-by: Peter Collingbourne <pcc@google.com>
> > Link: https://linux-review.googlesource.com/id/If4b22e43b5a4ca49726b4bf98ada827fdf755548
> > Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
> > Cc: stable@vger.kernel.org
> > ---
> >  fs/dcache.c  | 2 +-
> >  lib/string.c | 3 ++-
> >  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> Why are DCACHE_WORD_ACCESS and HAVE_EFFICIENT_UNALIGNED_ACCESS separate
> things? I can see at least one place where it's directly tied:
> 
> arch/arm/Kconfig:58:    select DCACHE_WORD_ACCESS if HAVE_EFFICIENT_UNALIGNED_ACCESS

DCACHE_WORD_ACCESS requires load_unaligned_zeropad() which handles the
faults. For some reason, read_word_at_a_time() doesn't expect to fault
and it is only used with HAVE_EFFICIENT_UNALIGNED_ACCESS. I guess arm32
only enabled load_unaligned_zeropad() on hardware that supports
efficient unaligned accesses (v6 onwards), hence the dependency.

> Would it make sense to sort this out so that KASAN_HW_TAGS can be taken
> into account at the Kconfig level instead?

I don't think we should play with config options but rather sort out the
fault path (load_unaligned_zeropad) or disable MTE temporarily. I'd go
with the former as long as read_word_at_a_time() is only used for
strings in conjunction with has_zero(). I haven't checked.

-- 
Catalin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] string: Disable read_word_at_a_time() optimizations if kernel MTE is enabled
  2025-03-10 17:37   ` Catalin Marinas
@ 2025-03-10 18:09     ` Kees Cook
  2025-03-10 18:13     ` Mark Rutland
  2025-03-18 21:41     ` Peter Collingbourne
  2 siblings, 0 replies; 11+ messages in thread
From: Kees Cook @ 2025-03-10 18:09 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Peter Collingbourne, Alexander Viro, Christian Brauner, Jan Kara,
	Andrew Morton, Andy Shevchenko, Andrey Konovalov, linux-fsdevel,
	linux-kernel, linux-hardening, linux-arm-kernel, stable

On Mon, Mar 10, 2025 at 05:37:50PM +0000, Catalin Marinas wrote:
> On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
> > On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> > > The optimized strscpy() and dentry_string_cmp() routines will read 8
> > > unaligned bytes at a time via the function read_word_at_a_time(), but
> > > this is incompatible with MTE which will fault on a partially invalid
> > > read. The attributes on read_word_at_a_time() that disable KASAN are
> > > invisible to the CPU so they have no effect on MTE. Let's fix the
> > > bug for now by disabling the optimizations if the kernel is built
> > > with HW tag-based KASAN and consider improvements for followup changes.
> > 
> > Why is faulting on a partially invalid read a problem? It's still
> > invalid, so ... it should fault, yes? What am I missing?
> 
> read_word_at_a_time() is used to read 8 bytes, potentially unaligned and
> beyond the end of string. The has_zero() function is then used to check
> where the string ends. For this uses, I think we can go with
> load_unaligned_zeropad() which handles a potential fault and pads the
> rest with zeroes.

Agh, right, I keep forgetting that this can read past the end of the
actual allocation. I'd agree, load_unaligned_zeropad() makes sense
there.

> 
> > > Signed-off-by: Peter Collingbourne <pcc@google.com>
> > > Link: https://linux-review.googlesource.com/id/If4b22e43b5a4ca49726b4bf98ada827fdf755548
> > > Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
> > > Cc: stable@vger.kernel.org
> > > ---
> > >  fs/dcache.c  | 2 +-
> > >  lib/string.c | 3 ++-
> > >  2 files changed, 3 insertions(+), 2 deletions(-)
> > 
> > Why are DCACHE_WORD_ACCESS and HAVE_EFFICIENT_UNALIGNED_ACCESS separate
> > things? I can see at least one place where it's directly tied:
> > 
> > arch/arm/Kconfig:58:    select DCACHE_WORD_ACCESS if HAVE_EFFICIENT_UNALIGNED_ACCESS
> 
> DCACHE_WORD_ACCESS requires load_unaligned_zeropad() which handles the
> faults. For some reason, read_word_at_a_time() doesn't expect to fault
> and it is only used with HAVE_EFFICIENT_UNALIGNED_ACCESS. I guess arm32
> only enabled load_unaligned_zeropad() on hardware that supports
> efficient unaligned accesses (v6 onwards), hence the dependency.
> 
> > Would it make sense to sort this out so that KASAN_HW_TAGS can be taken
> > into account at the Kconfig level instead?
> 
> I don't think we should play with config options but rather sort out the
> fault path (load_unaligned_zeropad) or disable MTE temporarily. I'd go
> with the former as long as read_word_at_a_time() is only used for
> strings in conjunction with has_zero(). I haven't checked.

Okay, sounds good. (And with a mild thread-merge: yes, folks want to use
KASAN_HW_TAGS=y in production.)

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] string: Disable read_word_at_a_time() optimizations if kernel MTE is enabled
  2025-03-10 17:37   ` Catalin Marinas
  2025-03-10 18:09     ` Kees Cook
@ 2025-03-10 18:13     ` Mark Rutland
  2025-03-10 18:40       ` Catalin Marinas
  2025-03-18 21:41     ` Peter Collingbourne
  2 siblings, 1 reply; 11+ messages in thread
From: Mark Rutland @ 2025-03-10 18:13 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Kees Cook, Peter Collingbourne, Alexander Viro, Christian Brauner,
	Jan Kara, Andrew Morton, Andy Shevchenko, Andrey Konovalov,
	linux-fsdevel, linux-kernel, linux-hardening, linux-arm-kernel,
	stable

On Mon, Mar 10, 2025 at 05:37:50PM +0000, Catalin Marinas wrote:
> On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
> > On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> > > The optimized strscpy() and dentry_string_cmp() routines will read 8
> > > unaligned bytes at a time via the function read_word_at_a_time(), but
> > > this is incompatible with MTE which will fault on a partially invalid
> > > read. The attributes on read_word_at_a_time() that disable KASAN are
> > > invisible to the CPU so they have no effect on MTE. Let's fix the
> > > bug for now by disabling the optimizations if the kernel is built
> > > with HW tag-based KASAN and consider improvements for followup changes.
> > 
> > Why is faulting on a partially invalid read a problem? It's still
> > invalid, so ... it should fault, yes? What am I missing?
> 
> read_word_at_a_time() is used to read 8 bytes, potentially unaligned and
> beyond the end of string. The has_zero() function is then used to check
> where the string ends. For this uses, I think we can go with
> load_unaligned_zeropad() which handles a potential fault and pads the
> rest with zeroes.

If we only care about synchronous and asymmetric modes, that should be
possible, but that won't work in asynchronous mode. In asynchronous mode
the fault will accumulate into TFSR and will be detected later
asynchronously where it cannot be related to its source and fixed up.

That means that both read_word_at_a_time() and load_unaligned_zeropad()
are dodgy in async mode.

Can we somehow hang this off ARCH_HAS_SUBPAGE_FAULTS?

... and is there anything else that deliberately makes accesses that
could straddle objects?

Mark.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] string: Disable read_word_at_a_time() optimizations if kernel MTE is enabled
  2025-03-10 18:13     ` Mark Rutland
@ 2025-03-10 18:40       ` Catalin Marinas
  2025-03-10 19:37         ` Mark Rutland
  0 siblings, 1 reply; 11+ messages in thread
From: Catalin Marinas @ 2025-03-10 18:40 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Kees Cook, Peter Collingbourne, Alexander Viro, Christian Brauner,
	Jan Kara, Andrew Morton, Andy Shevchenko, Andrey Konovalov,
	linux-fsdevel, linux-kernel, linux-hardening, linux-arm-kernel,
	stable

On Mon, Mar 10, 2025 at 06:13:58PM +0000, Mark Rutland wrote:
> On Mon, Mar 10, 2025 at 05:37:50PM +0000, Catalin Marinas wrote:
> > On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
> > > On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> > > > The optimized strscpy() and dentry_string_cmp() routines will read 8
> > > > unaligned bytes at a time via the function read_word_at_a_time(), but
> > > > this is incompatible with MTE which will fault on a partially invalid
> > > > read. The attributes on read_word_at_a_time() that disable KASAN are
> > > > invisible to the CPU so they have no effect on MTE. Let's fix the
> > > > bug for now by disabling the optimizations if the kernel is built
> > > > with HW tag-based KASAN and consider improvements for followup changes.
> > > 
> > > Why is faulting on a partially invalid read a problem? It's still
> > > invalid, so ... it should fault, yes? What am I missing?
> > 
> > read_word_at_a_time() is used to read 8 bytes, potentially unaligned and
> > beyond the end of string. The has_zero() function is then used to check
> > where the string ends. For this uses, I think we can go with
> > load_unaligned_zeropad() which handles a potential fault and pads the
> > rest with zeroes.
> 
> If we only care about synchronous and asymmetric modes, that should be
> possible, but that won't work in asynchronous mode. In asynchronous mode
> the fault will accumulate into TFSR and will be detected later
> asynchronously where it cannot be related to its source and fixed up.
> 
> That means that both read_word_at_a_time() and load_unaligned_zeropad()
> are dodgy in async mode.

load_unaligned_zeropad() has a __mte_enable_tco_async() call to set
PSTATE.TCO if in async mode, so that's covered. read_word_at_a_time() is
indeed busted and I've had Vincezo's patches for a couple of years
already, they just never made it to the list.

> Can we somehow hang this off ARCH_HAS_SUBPAGE_FAULTS?

We could, though that was mostly for user-space faults while in-kernel
we'd only need something similar if KASAN_HW_TAGS.

> ... and is there anything else that deliberately makes accesses that
> could straddle objects?

So far we only came across load_unaligned_zeropad() and
read_word_at_a_time(). I'm not aware of anything else.

-- 
Catalin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] string: Disable read_word_at_a_time() optimizations if kernel MTE is enabled
  2025-03-10 18:40       ` Catalin Marinas
@ 2025-03-10 19:37         ` Mark Rutland
  2025-03-11 11:45           ` Catalin Marinas
  0 siblings, 1 reply; 11+ messages in thread
From: Mark Rutland @ 2025-03-10 19:37 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Kees Cook, Peter Collingbourne, Alexander Viro, Christian Brauner,
	Jan Kara, Andrew Morton, Andy Shevchenko, Andrey Konovalov,
	linux-fsdevel, linux-kernel, linux-hardening, linux-arm-kernel,
	stable

On Mon, Mar 10, 2025 at 06:40:11PM +0000, Catalin Marinas wrote:
> On Mon, Mar 10, 2025 at 06:13:58PM +0000, Mark Rutland wrote:
> > On Mon, Mar 10, 2025 at 05:37:50PM +0000, Catalin Marinas wrote:
> > > On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
> > > > On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> > > > > The optimized strscpy() and dentry_string_cmp() routines will read 8
> > > > > unaligned bytes at a time via the function read_word_at_a_time(), but
> > > > > this is incompatible with MTE which will fault on a partially invalid
> > > > > read. The attributes on read_word_at_a_time() that disable KASAN are
> > > > > invisible to the CPU so they have no effect on MTE. Let's fix the
> > > > > bug for now by disabling the optimizations if the kernel is built
> > > > > with HW tag-based KASAN and consider improvements for followup changes.
> > > > 
> > > > Why is faulting on a partially invalid read a problem? It's still
> > > > invalid, so ... it should fault, yes? What am I missing?
> > > 
> > > read_word_at_a_time() is used to read 8 bytes, potentially unaligned and
> > > beyond the end of string. The has_zero() function is then used to check
> > > where the string ends. For this uses, I think we can go with
> > > load_unaligned_zeropad() which handles a potential fault and pads the
> > > rest with zeroes.
> > 
> > If we only care about synchronous and asymmetric modes, that should be
> > possible, but that won't work in asynchronous mode. In asynchronous mode
> > the fault will accumulate into TFSR and will be detected later
> > asynchronously where it cannot be related to its source and fixed up.
> > 
> > That means that both read_word_at_a_time() and load_unaligned_zeropad()
> > are dodgy in async mode.
> 
> load_unaligned_zeropad() has a __mte_enable_tco_async() call to set
> PSTATE.TCO if in async mode, so that's covered. read_word_at_a_time() is
> indeed busted and I've had Vincezo's patches for a couple of years
> already, they just never made it to the list.

Sorry, I missed the __mte_{enable,disable}_tco_async() calls. So long as
we're happy to omit the check in that case, that's fine.

I was worried that ex_handler_load_unaligned_zeropad() might not do the
right thing in response to a tag check fault (e.g. access the wrong 8
bytes), but it looks as though that's ok due to the way it generates the
offset and the aligned pointer.

If load_unaligned_zeropad() is handed a string that starts with an
unexpected tag (and even if that starts off aligned),
ex_handler_load_unaligned_zeropad() will access that and cause another
tag check fault, which will be reported.

Mark.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] string: Disable read_word_at_a_time() optimizations if kernel MTE is enabled
  2025-03-10 19:37         ` Mark Rutland
@ 2025-03-11 11:45           ` Catalin Marinas
  2025-03-11 11:55             ` Mark Rutland
  0 siblings, 1 reply; 11+ messages in thread
From: Catalin Marinas @ 2025-03-11 11:45 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Kees Cook, Peter Collingbourne, Alexander Viro, Christian Brauner,
	Jan Kara, Andrew Morton, Andy Shevchenko, Andrey Konovalov,
	linux-fsdevel, linux-kernel, linux-hardening, linux-arm-kernel,
	stable

On Mon, Mar 10, 2025 at 07:37:32PM +0000, Mark Rutland wrote:
> On Mon, Mar 10, 2025 at 06:40:11PM +0000, Catalin Marinas wrote:
> > On Mon, Mar 10, 2025 at 06:13:58PM +0000, Mark Rutland wrote:
> > > On Mon, Mar 10, 2025 at 05:37:50PM +0000, Catalin Marinas wrote:
> > > > On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
> > > > > On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> > > > > > The optimized strscpy() and dentry_string_cmp() routines will read 8
> > > > > > unaligned bytes at a time via the function read_word_at_a_time(), but
> > > > > > this is incompatible with MTE which will fault on a partially invalid
> > > > > > read. The attributes on read_word_at_a_time() that disable KASAN are
> > > > > > invisible to the CPU so they have no effect on MTE. Let's fix the
> > > > > > bug for now by disabling the optimizations if the kernel is built
> > > > > > with HW tag-based KASAN and consider improvements for followup changes.
> > > > > 
> > > > > Why is faulting on a partially invalid read a problem? It's still
> > > > > invalid, so ... it should fault, yes? What am I missing?
> > > > 
> > > > read_word_at_a_time() is used to read 8 bytes, potentially unaligned and
> > > > beyond the end of string. The has_zero() function is then used to check
> > > > where the string ends. For this uses, I think we can go with
> > > > load_unaligned_zeropad() which handles a potential fault and pads the
> > > > rest with zeroes.
> > > 
> > > If we only care about synchronous and asymmetric modes, that should be
> > > possible, but that won't work in asynchronous mode. In asynchronous mode
> > > the fault will accumulate into TFSR and will be detected later
> > > asynchronously where it cannot be related to its source and fixed up.
> > > 
> > > That means that both read_word_at_a_time() and load_unaligned_zeropad()
> > > are dodgy in async mode.
> > 
> > load_unaligned_zeropad() has a __mte_enable_tco_async() call to set
> > PSTATE.TCO if in async mode, so that's covered. read_word_at_a_time() is
> > indeed busted and I've had Vincezo's patches for a couple of years
> > already, they just never made it to the list.
> 
> Sorry, I missed the __mte_{enable,disable}_tco_async() calls. So long as
> we're happy to omit the check in that case, that's fine.

That was the easiest. Alternatively we can try to sync the TFSR before
and after the load but with the ISBs, that's too expensive. We could
also do a dummy one byte load before setting TCO. read_word_at_a_time()
does have an explicit kasan_check_read() but last time I checked it has
no effect on MTE.

> I was worried that ex_handler_load_unaligned_zeropad() might not do the
> right thing in response to a tag check fault (e.g. access the wrong 8
> bytes), but it looks as though that's ok due to the way it generates the
> offset and the aligned pointer.
> 
> If load_unaligned_zeropad() is handed a string that starts with an
> unexpected tag (and even if that starts off aligned),
> ex_handler_load_unaligned_zeropad() will access that and cause another
> tag check fault, which will be reported.

Yes, it will report an async tag check fault on the
exit_to_kernel_mode() path _if_ load_unaligned_zeropad() triggered the
fault for other reasons (end of page). It's slightly inconsistent, we
could set TCO for the async case in ex_handler_load_unaligned_zeropad()
as well.

For sync checks, we'd get the first fault ending up in
ex_handler_load_unaligned_zeropad() and a second tag check fault while
processing the first. This ends up in do_tag_recovery and we disable tag
checking after the report. Not ideal but not that bad. We could adjust
ex_handler_load_unaligned_zeropad() to return false if the pointer is
already aligned but we need to check the semantics of
load_unaligned_zeropad(), is it allowed to fault on the first byte?

-- 
Catalin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] string: Disable read_word_at_a_time() optimizations if kernel MTE is enabled
  2025-03-11 11:45           ` Catalin Marinas
@ 2025-03-11 11:55             ` Mark Rutland
  0 siblings, 0 replies; 11+ messages in thread
From: Mark Rutland @ 2025-03-11 11:55 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Kees Cook, Peter Collingbourne, Alexander Viro, Christian Brauner,
	Jan Kara, Andrew Morton, Andy Shevchenko, Andrey Konovalov,
	linux-fsdevel, linux-kernel, linux-hardening, linux-arm-kernel,
	stable

On Tue, Mar 11, 2025 at 11:45:21AM +0000, Catalin Marinas wrote:
> On Mon, Mar 10, 2025 at 07:37:32PM +0000, Mark Rutland wrote:
> > I was worried that ex_handler_load_unaligned_zeropad() might not do the
> > right thing in response to a tag check fault (e.g. access the wrong 8
> > bytes), but it looks as though that's ok due to the way it generates the
> > offset and the aligned pointer.
> > 
> > If load_unaligned_zeropad() is handed a string that starts with an
> > unexpected tag (and even if that starts off aligned),
> > ex_handler_load_unaligned_zeropad() will access that and cause another
> > tag check fault, which will be reported.
> 
> Yes, it will report an async tag check fault on the
> exit_to_kernel_mode() path _if_ load_unaligned_zeropad() triggered the
> fault for other reasons (end of page).

Sorry, yes. The aligned case I mentioned shouldn't apply here.

> It's slightly inconsistent, we could set TCO for the async case in
> ex_handler_load_unaligned_zeropad() as well.

Yep, I think that'd be necessary for async mode.

> For sync checks, we'd get the first fault ending up in
> ex_handler_load_unaligned_zeropad() and a second tag check fault while
> processing the first. This ends up in do_tag_recovery and we disable
> tag checking after the report. Not ideal but not that bad.

Yep; that's what I was describing in the second paragraph above, though
I forgot to say that was assuming sync or asymm mode.

> We could adjust ex_handler_load_unaligned_zeropad() to return false if
> the pointer is already aligned but we need to check the semantics of
> load_unaligned_zeropad(), is it allowed to fault on the first byte?

IIUC today it's only expected to fault due to misalignment, and the
gneral expectation is that for a sequence of load_unaligned_zeropad()
calls, we should get at least one byte without faulting (for the NUL
terminator).

I reckon it'd be better to figure this out based on the ESR if possible.
Kristina's patches for MOPS would give us that.

Mark.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] string: Disable read_word_at_a_time() optimizations if kernel MTE is enabled
  2025-03-10 17:37   ` Catalin Marinas
  2025-03-10 18:09     ` Kees Cook
  2025-03-10 18:13     ` Mark Rutland
@ 2025-03-18 21:41     ` Peter Collingbourne
  2 siblings, 0 replies; 11+ messages in thread
From: Peter Collingbourne @ 2025-03-18 21:41 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Kees Cook, Alexander Viro, Christian Brauner, Jan Kara,
	Andrew Morton, Andy Shevchenko, Andrey Konovalov, linux-fsdevel,
	linux-kernel, linux-hardening, linux-arm-kernel, stable,
	Mark Rutland

On Mon, Mar 10, 2025 at 10:37 AM Catalin Marinas
<catalin.marinas@arm.com> wrote:
>
> On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
> > On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> > > The optimized strscpy() and dentry_string_cmp() routines will read 8
> > > unaligned bytes at a time via the function read_word_at_a_time(), but
> > > this is incompatible with MTE which will fault on a partially invalid
> > > read. The attributes on read_word_at_a_time() that disable KASAN are
> > > invisible to the CPU so they have no effect on MTE. Let's fix the
> > > bug for now by disabling the optimizations if the kernel is built
> > > with HW tag-based KASAN and consider improvements for followup changes.
> >
> > Why is faulting on a partially invalid read a problem? It's still
> > invalid, so ... it should fault, yes? What am I missing?
>
> read_word_at_a_time() is used to read 8 bytes, potentially unaligned and
> beyond the end of string. The has_zero() function is then used to check
> where the string ends. For this uses, I think we can go with
> load_unaligned_zeropad() which handles a potential fault and pads the
> rest with zeroes.

Agreed, strscpy() should be using load_unaligned_zeropad() if
available. We can also disable the code that checks for page
boundaries if load_unaligned_zeropad() is available.

The only other use of read_word_at_a_time() is the one in
dentry_string_cmp(). At the time that I sent the patch I hadn't
noticed the comment in dentry_string_cmp() stating that its argument
to read_word_at_a_time() is aligned. Since calling
read_word_at_a_time() is fine if the pointer is aligned but partially
invalid (not possible with MTE but it is possible with SW KASAN which
uses a 1:1 shadow mapping), I left this user as-is. In other words, I
didn't make read_word_at_a_time() arch-specific as in Vincenzo's
series.

I sent a v2 with my patch to switch strscpy() over to using
load_unaligned_zeropad() if available, as well as the patch adding
tests from Vincenzo's series (which had some issues that I fixed).

Peter

> > > Signed-off-by: Peter Collingbourne <pcc@google.com>
> > > Link: https://linux-review.googlesource.com/id/If4b22e43b5a4ca49726b4bf98ada827fdf755548
> > > Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
> > > Cc: stable@vger.kernel.org
> > > ---
> > >  fs/dcache.c  | 2 +-
> > >  lib/string.c | 3 ++-
> > >  2 files changed, 3 insertions(+), 2 deletions(-)
> >
> > Why are DCACHE_WORD_ACCESS and HAVE_EFFICIENT_UNALIGNED_ACCESS separate
> > things? I can see at least one place where it's directly tied:
> >
> > arch/arm/Kconfig:58:    select DCACHE_WORD_ACCESS if HAVE_EFFICIENT_UNALIGNED_ACCESS
>
> DCACHE_WORD_ACCESS requires load_unaligned_zeropad() which handles the
> faults. For some reason, read_word_at_a_time() doesn't expect to fault
> and it is only used with HAVE_EFFICIENT_UNALIGNED_ACCESS. I guess arm32
> only enabled load_unaligned_zeropad() on hardware that supports
> efficient unaligned accesses (v6 onwards), hence the dependency.
>
> > Would it make sense to sort this out so that KASAN_HW_TAGS can be taken
> > into account at the Kconfig level instead?
>
> I don't think we should play with config options but rather sort out the
> fault path (load_unaligned_zeropad) or disable MTE temporarily. I'd go
> with the former as long as read_word_at_a_time() is only used for
> strings in conjunction with has_zero(). I haven't checked.
>
> --
> Catalin

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2025-03-18 21:41 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-08  2:33 [PATCH] string: Disable read_word_at_a_time() optimizations if kernel MTE is enabled Peter Collingbourne
2025-03-08  3:36 ` Kees Cook
2025-03-10 17:37   ` Catalin Marinas
2025-03-10 18:09     ` Kees Cook
2025-03-10 18:13     ` Mark Rutland
2025-03-10 18:40       ` Catalin Marinas
2025-03-10 19:37         ` Mark Rutland
2025-03-11 11:45           ` Catalin Marinas
2025-03-11 11:55             ` Mark Rutland
2025-03-18 21:41     ` Peter Collingbourne
2025-03-10 17:29 ` Catalin Marinas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).