* [PATCH 0/2] arm64: mte: Improve performance by tightening handling of PSTATE.TCO
@ 2025-10-31 3:49 Carl Worth
2025-10-31 3:49 ` [PATCH 1/2] arm64: mte: Unify kernel MTE policy and manipulation of TCO Carl Worth
2025-10-31 3:49 ` [PATCH 2/2] arm64: mte: Defer disabling of TCO until user_access_begin/end Carl Worth
0 siblings, 2 replies; 12+ messages in thread
From: Carl Worth @ 2025-10-31 3:49 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon
Cc: linux-arm-kernel, linux-kernel, Taehyun Noh, Carl Worth
[Thanks to Taehyun Noh from UT Austin for originally reporting this
bug. In this cover letter, "we" refers to a collaborative effort
between individuals at both Ampere Computing and UT Austin.]
We measured severe performance overhead (30-50%) when enabling
userspace MTE and running memcached on an AmpereOne machine (detailed
benchmark results are provided below).
We identified excessive tag checking taking place in the kernel (even
though only userspace tag checking was requested) as the culprit for
the performance slowdown. The existing code enables tag checking (by
_disabling_ PSTATE.TCO, "tag check override") at kernel entry
regardless of whether it's kernel-side MTE (via KASAN_HW_TAGS) or
userspace MTE that is being requested.
This patch series addresses the slowdown (in the case that only
userspace MTE is requested) by deferring the enabling of tag checking
until the kernel is about to access userspace memory, that is,
enabling tag checking in user_access_begin() and disabling it again
in user_access_end().
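The intended change can be sketched as a small userspace simulation
(a sketch only: function names here are illustrative, and the real
series manipulates PSTATE.TCO via SET_PSTATE_TCO rather than a C
variable):

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated PSTATE.TCO: 1 = tag checking overridden (disabled). */
static int pstate_tco;
static bool kasan_hw_tags;	/* kernel-side MTE (KASAN_HW_TAGS) */
static bool user_tcf0;		/* userspace requested tag checking */

/* Hardware sets TCO on every exception entry. */
static void exception_entry(void)
{
	pstate_tco = 1;
}

/* Before the series: clear TCO at entry for kernel OR user MTE. */
static void old_entry_policy(void)
{
	if (kasan_hw_tags || user_tcf0)
		pstate_tco = 0;
}

/* After the series: only KASAN keeps checking on for kernel code. */
static void new_entry_policy(void)
{
	if (kasan_hw_tags)
		pstate_tco = 0;
}

/* User accesses enable checking just around the access itself. */
static void user_access_begin_sketch(void)
{
	if (!kasan_hw_tags && user_tcf0)
		pstate_tco = 0;
}

static void user_access_end_sketch(void)
{
	if (!kasan_hw_tags && user_tcf0)
		pstate_tco = 1;
}
```

With user_tcf0 set and KASAN off, the old policy leaves checking
enabled for the whole in-kernel path, while the new policy confines it
to the user_access_begin()/user_access_end() window.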
The effect of this patch series is most readily seen by using perf to
count tag-checked accesses in both kernel and userspace, for example
while running "perf bench futex hash" with MTE enabled.
Prior to the patch series, we see:
# GLIBC_TUNABLES=glibc.mem.tagging=3 perf stat -e mem_access_checked_rd:u,mem_access_checked_wr:u,mem_access_checked_rd:k,mem_access_checked_wr:k perf bench futex hash
...
Performance counter stats for 'perf bench futex hash':
4,046,872,020 mem_access_checked_rd:u
23,580 mem_access_checked_wr:u
251,248,813,102 mem_access_checked_rd:k
87,256,021,241 mem_access_checked_wr:k
And after the patch series we see (for the same command):
Performance counter stats for 'perf bench futex hash':
3,866,346,822 mem_access_checked_rd:u
23,499 mem_access_checked_wr:u
7,725,072,314 mem_access_checked_rd:k
424 mem_access_checked_wr:k
As can be seen above, with roughly equivalent counts of userspace
tag-checked accesses, over 97% of the kernel-space tag-checked
accesses are eliminated.
As to performance, the patch series has been observed as having no
impact for workloads with MTE disabled.
For workloads with MTE enabled, we measured the series causing a 5-8%
slowdown for "perf bench futex hash". Presumably, this results from
code paths that now include two writes to PSTATE.TCO where previously
there was only one. Since this is a synthetic micro-benchmark, we
argue that the slowdown is acceptable in light of the results with
more realistic workloads described below.
We used the Phoronix Test Suite pts/memcached benchmark with a
get-heavy workload (1:10 Set:Get ratio) which is where the slowdown
appears most clearly. The slowdown worsens with increased core count,
levelling out above 32 cores. The numbers below are based on averages
from 50 runs each, with 96 cores on each run. For "MTE on",
GLIBC_TUNABLES was set to "glibc.mem.tagging=3". For "MTE off",
GLIBC_TUNABLES was unset.
The numbers below are normalized ops./sec. (higher is better),
normalized to the baseline case (unpatched kernel, MTE off).
Before the patch series (unpatched v6.18-rc1):
MTE off: 1.000
MTE on: 0.455
MTE overhead: 54.5% +/- 2.3%
After applying this patch series:
MTE off: 0.997
MTE on: 1.002
MTE overhead: No difference proven at 95.0% confidence
Changes since v1:
* Reordered patches to put cleanup patch before performance fix.
Signed-off-by: Carl Worth <carl@os.amperecomputing.com>
---
Carl Worth (1):
arm64: mte: Defer disabling of TCO until user_access_begin/end
Taehyun Noh (1):
arm64: mte: Unify kernel MTE policy and manipulation of TCO
arch/arm64/include/asm/mte.h | 53 +++++++++++++++++++++++++++++++---------
arch/arm64/include/asm/uaccess.h | 32 +++++++++++++++++++++++-
arch/arm64/kernel/entry-common.c | 4 +--
arch/arm64/kernel/mte.c | 2 +-
4 files changed, 76 insertions(+), 15 deletions(-)
---
base-commit: 3a8660878839faadb4f1a6dd72c3179c1df56787
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/2] arm64: mte: Unify kernel MTE policy and manipulation of TCO
2025-10-31 3:49 [PATCH 0/2] arm64: mte: Improve performance by tightening handling of PSTATE.TCO Carl Worth
@ 2025-10-31 3:49 ` Carl Worth
2026-01-08 15:05 ` Will Deacon
2025-10-31 3:49 ` [PATCH 2/2] arm64: mte: Defer disabling of TCO until user_access_begin/end Carl Worth
1 sibling, 1 reply; 12+ messages in thread
From: Carl Worth @ 2025-10-31 3:49 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon
Cc: linux-arm-kernel, linux-kernel, Taehyun Noh, Carl Worth
From: Taehyun Noh <taehyun@utexas.edu>
The kernel's primary knob for controlling MTE tag checking is the
PSTATE.TCO bit (tag check override). TCO is enabled (which,
confusingly _disables_ tag checking) by the hardware at the time of an
exception. When the kernel needs to enable tag checking, it clears
TCO, which in turn allows TCF0 or TCF to control whether tag checking
occurs.
Some of the TCO manipulation code has redundancy and confusing naming.
Fix the redundancy by introducing a new function user_uses_tagcheck
which captures the existing repeated condition. The new function
includes significant new comments to help explain the logic.
Fix the naming by renaming mte_disable_tco_entry() to
set_kernel_mte_policy(). This function does not unconditionally
disable TCO; it does so only when required (for example, for
KASAN_HW_TAGS). The new name accurately describes its purpose.
This commit should have no behavioral change.
Signed-off-by: Taehyun Noh <taehyun@utexas.edu>
Co-developed-by: Carl Worth <carl@os.amperecomputing.com>
Signed-off-by: Carl Worth <carl@os.amperecomputing.com>
---
arch/arm64/include/asm/mte.h | 40 +++++++++++++++++++++++++++++++++-------
arch/arm64/kernel/entry-common.c | 4 ++--
arch/arm64/kernel/mte.c | 2 +-
3 files changed, 36 insertions(+), 10 deletions(-)
diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index 3b5069f4683d..70dabc884616 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -224,7 +224,35 @@ static inline bool folio_try_hugetlb_mte_tagging(struct folio *folio)
}
#endif
-static inline void mte_disable_tco_entry(struct task_struct *task)
+static inline bool user_uses_tagcheck(void)
+{
+ /*
+ * To decide whether userspace wants tag checking we only look
+ * at TCF0 (SCTLR_EL1.TCF0 bit 0 is set for both synchronous
+ * and asymmetric modes).
+ *
+ * There's an argument that could be made that the kernel
+ * should also consider the state of TCO (tag check override)
+ * since userspace does have the ability to set that as well,
+ * and that could suggest a desire to disable tag checking in
+ * spite of the state of TCF0. However, the Linux kernel has
+ * never historically considered the userspace state of TCO,
+ * (so changing this would be an ABI break), and the hardware
+ * unconditionally sets TCO when an exception occurs
+ * anyway.
+ *
+ * So, again, here we look only at TCF0 and do not consider
+ * TCO.
+ */
+ return (current->thread.sctlr_user & (1UL << SCTLR_EL1_TCF0_SHIFT));
+}
+
+/*
+ * Set the kernel's desired policy for MTE tag checking.
+ *
+ * This function should be used right after the kernel entry.
+ */
+static inline void set_kernel_mte_policy(struct task_struct *task)
{
if (!system_supports_mte())
return;
@@ -232,15 +260,13 @@ static inline void mte_disable_tco_entry(struct task_struct *task)
/*
* Re-enable tag checking (TCO set on exception entry). This is only
* necessary if MTE is enabled in either the kernel or the userspace
- * task in synchronous or asymmetric mode (SCTLR_EL1.TCF0 bit 0 is set
- * for both). With MTE disabled in the kernel and disabled or
- * asynchronous in userspace, tag check faults (including in uaccesses)
- * are not reported, therefore there is no need to re-enable checking.
+ * task. With MTE disabled in the kernel and disabled or asynchronous
+ * in userspace, tag check faults (including in uaccesses) are not
+ * reported, therefore there is no need to re-enable checking.
* This is beneficial on microarchitectures where re-enabling TCO is
* expensive.
*/
- if (kasan_hw_tags_enabled() ||
- (task->thread.sctlr_user & (1UL << SCTLR_EL1_TCF0_SHIFT)))
+ if (kasan_hw_tags_enabled() || user_uses_tagcheck())
asm volatile(SET_PSTATE_TCO(0));
}
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index f546a914f041..466562d1d966 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -49,7 +49,7 @@ static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
state = __enter_from_kernel_mode(regs);
mte_check_tfsr_entry();
- mte_disable_tco_entry(current);
+ set_kernel_mte_policy(current);
return state;
}
@@ -83,7 +83,7 @@ static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
static __always_inline void __enter_from_user_mode(struct pt_regs *regs)
{
enter_from_user_mode(regs);
- mte_disable_tco_entry(current);
+ set_kernel_mte_policy(current);
}
static __always_inline void arm64_enter_from_user_mode(struct pt_regs *regs)
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 43f7a2f39403..0cc698714328 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -289,7 +289,7 @@ void mte_thread_switch(struct task_struct *next)
mte_update_gcr_excl(next);
/* TCO may not have been disabled on exception entry for the current task. */
- mte_disable_tco_entry(next);
+ set_kernel_mte_policy(next);
/*
* Check if an async tag exception occurred at EL1.
--
2.39.5
* [PATCH 2/2] arm64: mte: Defer disabling of TCO until user_access_begin/end
2025-10-31 3:49 [PATCH 0/2] arm64: mte: Improve performance by tightening handling of PSTATE.TCO Carl Worth
2025-10-31 3:49 ` [PATCH 1/2] arm64: mte: Unify kernel MTE policy and manipulation of TCO Carl Worth
@ 2025-10-31 3:49 ` Carl Worth
2026-01-08 15:06 ` Will Deacon
1 sibling, 1 reply; 12+ messages in thread
From: Carl Worth @ 2025-10-31 3:49 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon
Cc: linux-arm-kernel, linux-kernel, Taehyun Noh, Carl Worth
The PSTATE.TCO (Tag Check Override) bit, when set, causes MTE tag
checking to be disabled. The TCO bit is automatically set by the
hardware when an exception is taken.
Prior to this commit, mte_disable_tco_entry() would clear TCO
(enabling tag checking) in either of two cases: 1. when the kernel
wants tag checking (KASAN), or 2. when userspace wants tag checking
(via SCTLR_EL1.TCF0).
In the case of userspace-requested tag checking (that is, when KASAN
is off), clearing TCO on entry to the kernel has negative performance
implications: it results in excess kernel-space tag checking that was
never requested.
For this case, move the clearing of TCO to user_access_begin(), and
set it again in user_access_end(). This restricts the tag checking to
only the duration of the userspace accesses, as desired.
This patch has been measured to eliminate over 97% of kernel-side tag
checking during "perf bench futex hash".
Reported-by: Taehyun Noh <taehyun@utexas.edu>
Signed-off-by: Carl Worth <carl@os.amperecomputing.com>
---
arch/arm64/include/asm/mte.h | 21 +++++++++++++--------
arch/arm64/include/asm/uaccess.h | 32 +++++++++++++++++++++++++++++++-
2 files changed, 44 insertions(+), 9 deletions(-)
diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index 70dabc884616..3608ba452da5 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -258,15 +258,20 @@ static inline void set_kernel_mte_policy(struct task_struct *task)
return;
/*
- * Re-enable tag checking (TCO set on exception entry). This is only
- * necessary if MTE is enabled in either the kernel or the userspace
- * task. With MTE disabled in the kernel and disabled or asynchronous
- * in userspace, tag check faults (including in uaccesses) are not
- * reported, therefore there is no need to re-enable checking.
- * This is beneficial on microarchitectures where re-enabling TCO is
- * expensive.
+ * TCO is set on exception entry, (which overrides either of TCF
+ * or TCF0 and disables tag checking).
+ *
+ * If KASAN is enabled and using MTE (aka "hw_tags"), we clear
+ * TCO so that the kernel gets the tag-checking it needs for
+ * KASAN_HW_TAGS.
+ *
+ * When the kernel needs to enable tag-checking temporarily,
+ * (such as before accessing userspace memory in the case that
+ * userspace has requested tag checking), the kernel can
+ * temporarily change the state of TCO. See
+ * user_access_begin().
*/
- if (kasan_hw_tags_enabled() || user_uses_tagcheck())
+ if (kasan_hw_tags_enabled())
asm volatile(SET_PSTATE_TCO(0));
}
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index 1aa4ecb73429..248741a66c91 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -417,11 +417,41 @@ static __must_check __always_inline bool user_access_begin(const void __user *pt
{
if (unlikely(!access_ok(ptr,len)))
return 0;
+
+ /*
+ * Enable tag checking for the user access if MTE is enabled
+ * in the userspace task.
+ *
+ * Note: We don't need to do anything if KASAN is enabled,
+ * since that means the tag checking override (TCO) will
+ * already be disabled. In turn, the TCF0 bits will control
+ * whether user-space tag checking happens.
+ */
+ if (!kasan_hw_tags_enabled() && user_uses_tagcheck())
+ asm volatile(SET_PSTATE_TCO(0));
+
uaccess_ttbr0_enable();
return 1;
}
+
+static __always_inline void user_access_end(void)
+{
+ /*
+ * Restore TCO to disable tag checking now that user access is done.
+ *
+ * This logic uses the same condition as in user_access_begin()
+ * to avoid writing PSTATE.TCO with a value identical to what it
+ * already has (which would needlessly introduce a pipeline flush
+ * and could impact performance).
+ */
+ if (!kasan_hw_tags_enabled() && user_uses_tagcheck())
+ asm volatile(SET_PSTATE_TCO(1));
+
+ uaccess_ttbr0_disable();
+}
+
#define user_access_begin(a,b) user_access_begin(a,b)
-#define user_access_end() uaccess_ttbr0_disable()
+#define user_access_end() user_access_end()
#define unsafe_put_user(x, ptr, label) \
__raw_put_mem("sttr", x, uaccess_mask_ptr(ptr), label, U)
#define unsafe_get_user(x, ptr, label) \
--
2.39.5
* Re: [PATCH 1/2] arm64: mte: Unify kernel MTE policy and manipulation of TCO
2025-10-31 3:49 ` [PATCH 1/2] arm64: mte: Unify kernel MTE policy and manipulation of TCO Carl Worth
@ 2026-01-08 15:05 ` Will Deacon
2026-01-08 16:28 ` Yeoreum Yun
0 siblings, 1 reply; 12+ messages in thread
From: Will Deacon @ 2026-01-08 15:05 UTC (permalink / raw)
To: Carl Worth
Cc: Catalin Marinas, linux-arm-kernel, linux-kernel, Taehyun Noh,
andreyknvl, pcc, yeoreum.yun
On Thu, Oct 30, 2025 at 08:49:31PM -0700, Carl Worth wrote:
> From: Taehyun Noh <taehyun@utexas.edu>
>
> The kernel's primary knob for controlling MTE tag checking is the
> PSTATE.TCO bit (tag check override). TCO is enabled (which,
> confusingly _disables_ tag checking) by the hardware at the time of an
> exception. Then, at various times, when the kernel needs to enable
> tag-checking it clears TCO, (which in turn allows TCF0 or TCF to
> control whether tag-checking occurs).
>
> Some of the TCO manipulation code has redundancy and confusing naming.
>
> Fix the redundancy by introducing a new function user_uses_tagcheck
> which captures the existing repeated condition. The new function
> includes significant new comments to help explain the logic.
>
> Fix the naming by renaming mte_disable_tco_entry() to
> set_kernel_mte_policy(). This function does not necessarily disable
> TCO, but does so only conditionally in the case of KASAN HW TAGS. The
> new name accurately describes the purpose of the function.
>
> This commit should have no behavioral change.
>
> Signed-off-by: Taehyun Noh <taehyun@utexas.edu>
> Co-developed-by: Carl Worth <carl@os.amperecomputing.com>
> Signed-off-by: Carl Worth <carl@os.amperecomputing.com>
> ---
> arch/arm64/include/asm/mte.h | 40 +++++++++++++++++++++++++++++++++-------
> arch/arm64/kernel/entry-common.c | 4 ++--
> arch/arm64/kernel/mte.c | 2 +-
> 3 files changed, 36 insertions(+), 10 deletions(-)
>
> diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
> index 3b5069f4683d..70dabc884616 100644
> --- a/arch/arm64/include/asm/mte.h
> +++ b/arch/arm64/include/asm/mte.h
> @@ -224,7 +224,35 @@ static inline bool folio_try_hugetlb_mte_tagging(struct folio *folio)
> }
> #endif
>
> -static inline void mte_disable_tco_entry(struct task_struct *task)
> +static inline bool user_uses_tagcheck(void)
> +{
> + /*
> + * To decide whether userspace wants tag checking we only look
> + * at TCF0 (SCTLR_EL1.TCF0 bit 0 is set for both synchronous
> + * or asymmetric mode).
> + *
> + * There's an argument that could be made that the kernel
> + * should also consider the state of TCO (tag check override)
> + * since userspace does have the ability to set that as well,
> + * and that could suggest a desire to disable tag checking in
> + * spite of the state of TCF0. However, the Linux kernel has
> + * never historically considered the userspace state of TCO,
> + * (so changing this would be an ABI break), and the hardware
> + * unconditionally sets TCO when an exception occurs
> + * anyway.
> + *
> + * So, again, here we look only at TCF0 and do not consider
> + * TCO.
> + */
> + return (current->thread.sctlr_user & (1UL << SCTLR_EL1_TCF0_SHIFT));
> +}
> +
> +/*
> + * Set the kernel's desired policy for MTE tag checking.
> + *
> + * This function should be used right after the kernel entry.
> + */
> +static inline void set_kernel_mte_policy(struct task_struct *task)
> {
> if (!system_supports_mte())
> return;
> @@ -232,15 +260,13 @@ static inline void mte_disable_tco_entry(struct task_struct *task)
> /*
> * Re-enable tag checking (TCO set on exception entry). This is only
> * necessary if MTE is enabled in either the kernel or the userspace
> - * task in synchronous or asymmetric mode (SCTLR_EL1.TCF0 bit 0 is set
> - * for both). With MTE disabled in the kernel and disabled or
> - * asynchronous in userspace, tag check faults (including in uaccesses)
> - * are not reported, therefore there is no need to re-enable checking.
> + * task. With MTE disabled in the kernel and disabled or asynchronous
> + * in userspace, tag check faults (including in uaccesses) are not
> + * reported, therefore there is no need to re-enable checking.
> * This is beneficial on microarchitectures where re-enabling TCO is
> * expensive.
The comment implies that toggling TCO can be expensive, so it's not clear
to me that moving it to the uaccess routines in the next patch is
necessarily a good idea in general. I understand that you see improvements
with memcached, but have you tried exercising workloads that are heavy on
user accesses?
> */
> - if (kasan_hw_tags_enabled() ||
> - (task->thread.sctlr_user & (1UL << SCTLR_EL1_TCF0_SHIFT)))
> + if (kasan_hw_tags_enabled() || user_uses_tagcheck())
> asm volatile(SET_PSTATE_TCO(0));
> }
>
> diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
> index f546a914f041..466562d1d966 100644
> --- a/arch/arm64/kernel/entry-common.c
> +++ b/arch/arm64/kernel/entry-common.c
> @@ -49,7 +49,7 @@ static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
>
> state = __enter_from_kernel_mode(regs);
> mte_check_tfsr_entry();
> - mte_disable_tco_entry(current);
> + set_kernel_mte_policy(current);
>
> return state;
> }
> @@ -83,7 +83,7 @@ static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
> static __always_inline void __enter_from_user_mode(struct pt_regs *regs)
> {
> enter_from_user_mode(regs);
> - mte_disable_tco_entry(current);
> + set_kernel_mte_policy(current);
> }
>
> static __always_inline void arm64_enter_from_user_mode(struct pt_regs *regs)
> diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> index 43f7a2f39403..0cc698714328 100644
> --- a/arch/arm64/kernel/mte.c
> +++ b/arch/arm64/kernel/mte.c
> @@ -289,7 +289,7 @@ void mte_thread_switch(struct task_struct *next)
> mte_update_gcr_excl(next);
>
> /* TCO may not have been disabled on exception entry for the current task. */
> - mte_disable_tco_entry(next);
> + set_kernel_mte_policy(next);
So this passes 'next' as the task, but user_uses_tagcheck() looks at
current. Shouldn't you propagate the task through?
Will
* Re: [PATCH 2/2] arm64: mte: Defer disabling of TCO until user_access_begin/end
2025-10-31 3:49 ` [PATCH 2/2] arm64: mte: Defer disabling of TCO until user_access_begin/end Carl Worth
@ 2026-01-08 15:06 ` Will Deacon
2026-01-08 18:45 ` Catalin Marinas
0 siblings, 1 reply; 12+ messages in thread
From: Will Deacon @ 2026-01-08 15:06 UTC (permalink / raw)
To: Carl Worth
Cc: Catalin Marinas, linux-arm-kernel, linux-kernel, Taehyun Noh,
andreyknvl, pcc, yeoreum.yun
On Thu, Oct 30, 2025 at 08:49:32PM -0700, Carl Worth wrote:
> The PSTATE.TCO (Tag Checking Override) register, when set causes MTE
> tag checking to be disabled. The TCO bit is automatically set by the
> hardware when an exception is taken.
>
> Prior to this commit, mte_disable_tco_entry would clear TCO (enable
> tag checking) for either of two cases: 1. When the kernel wants tag
> checking (KASAN) or 2. when userspace wants tag checking (via
> SCTLR.TCF0).
>
> In the case of userspace desired tag checking, (that is, when KASAN is
> off), clearing TCO on entry to the kernel has negative performance
> implications. This results in excess kernel space tag checking that
> has not been requested.
>
> For this case, move the clearing of TCO to user_space_access_begin,
> and set it again in user_access_end. This restricts the tag checking
> to only the duration of the userspace accesses as desired.
>
> This patch has been measured to eliminate over 97% of kernel-side tag
> checking during "perf bench futex hash"
>
> Reported-by: Taehyun Noh <taehyun@utexas.edu>
> Signed-off-by: Carl Worth <carl@os.amperecomputing.com>
> ---
> arch/arm64/include/asm/mte.h | 21 +++++++++++++--------
> arch/arm64/include/asm/uaccess.h | 32 +++++++++++++++++++++++++++++++-
> 2 files changed, 44 insertions(+), 9 deletions(-)
>
> diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
> index 70dabc884616..3608ba452da5 100644
> --- a/arch/arm64/include/asm/mte.h
> +++ b/arch/arm64/include/asm/mte.h
> @@ -258,15 +258,20 @@ static inline void set_kernel_mte_policy(struct task_struct *task)
> return;
>
> /*
> - * Re-enable tag checking (TCO set on exception entry). This is only
> - * necessary if MTE is enabled in either the kernel or the userspace
> - * task. With MTE disabled in the kernel and disabled or asynchronous
> - * in userspace, tag check faults (including in uaccesses) are not
> - * reported, therefore there is no need to re-enable checking.
> - * This is beneficial on microarchitectures where re-enabling TCO is
> - * expensive.
> + * TCO is set on exception entry, (which overrides either of TCF
> + * or TCF0 and disables tag checking).
> + *
> + * If KASAN is enabled and using MTE/(aka "hw_tags") we clear
> + * TCO so that the kernel gets the tag-checking it needs for
> + * KASAN_HW_TAGS.
> + *
> + * When the kernel needs to enable tag-checking temporarily,
> + * (such as before accessing userspace memory in the case that
> + * userspace has requested tag checking), the kernel can
> + * temporarily change the state of TCO. See
> + * user_access_begin().
> */
> - if (kasan_hw_tags_enabled() || user_uses_tagcheck())
> + if (kasan_hw_tags_enabled())
> asm volatile(SET_PSTATE_TCO(0));
> }
>
> diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> index 1aa4ecb73429..248741a66c91 100644
> --- a/arch/arm64/include/asm/uaccess.h
> +++ b/arch/arm64/include/asm/uaccess.h
> @@ -417,11 +417,41 @@ static __must_check __always_inline bool user_access_begin(const void __user *pt
> {
> if (unlikely(!access_ok(ptr,len)))
> return 0;
> +
> + /*
> + * Enable tag checking for the user access if MTE is enabled
> + * in the userspace task.
> + *
> + * Note: We don't need to do anything if KASAN is enabled,
> + * since that means the tag checking override (TCO) will
> + * already be disabled. In turn, the TCF0 bits will control
> + * whether user-space tag checking happens .
> + */
> + if (!kasan_hw_tags_enabled() && user_uses_tagcheck())
> + asm volatile(SET_PSTATE_TCO(0));
> +
> uaccess_ttbr0_enable();
> return 1;
> }
What about all the uaccess routines that don't call user_access_begin? For
example, copy_from_user().
Will
* Re: [PATCH 1/2] arm64: mte: Unify kernel MTE policy and manipulation of TCO
2026-01-08 15:05 ` Will Deacon
@ 2026-01-08 16:28 ` Yeoreum Yun
0 siblings, 0 replies; 12+ messages in thread
From: Yeoreum Yun @ 2026-01-08 16:28 UTC (permalink / raw)
To: Will Deacon
Cc: Carl Worth, Catalin Marinas, linux-arm-kernel, linux-kernel,
Taehyun Noh, andreyknvl, pcc
Hi,
> On Thu, Oct 30, 2025 at 08:49:31PM -0700, Carl Worth wrote:
> > From: Taehyun Noh <taehyun@utexas.edu>
> >
> > The kernel's primary knob for controlling MTE tag checking is the
> > PSTATE.TCO bit (tag check override). TCO is enabled (which,
> > confusingly _disables_ tag checking) by the hardware at the time of an
> > exception. Then, at various times, when the kernel needs to enable
> > tag-checking it clears TCO, (which in turn allows TCF0 or TCF to
> > control whether tag-checking occurs).
> >
> > Some of the TCO manipulation code has redundancy and confusing naming.
> >
> > Fix the redundancy by introducing a new function user_uses_tagcheck
> > which captures the existing repeated condition. The new function
> > includes significant new comments to help explain the logic.
> >
> > Fix the naming by renaming mte_disable_tco_entry() to
> > set_kernel_mte_policy(). This function does not necessarily disable
> > TCO, but does so only conditionally in the case of KASAN HW TAGS. The
> > new name accurately describes the purpose of the function.
> >
> > This commit should have no behavioral change.
> >
> > Signed-off-by: Taehyun Noh <taehyun@utexas.edu>
> > Co-developed-by: Carl Worth <carl@os.amperecomputing.com>
> > Signed-off-by: Carl Worth <carl@os.amperecomputing.com>
> > ---
> > arch/arm64/include/asm/mte.h | 40 +++++++++++++++++++++++++++++++++-------
> > arch/arm64/kernel/entry-common.c | 4 ++--
> > arch/arm64/kernel/mte.c | 2 +-
> > 3 files changed, 36 insertions(+), 10 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
> > index 3b5069f4683d..70dabc884616 100644
> > --- a/arch/arm64/include/asm/mte.h
> > +++ b/arch/arm64/include/asm/mte.h
> > @@ -224,7 +224,35 @@ static inline bool folio_try_hugetlb_mte_tagging(struct folio *folio)
> > }
> > #endif
> >
> > -static inline void mte_disable_tco_entry(struct task_struct *task)
> > +static inline bool user_uses_tagcheck(void)
> > +{
> > + /*
> > + * To decide whether userspace wants tag checking we only look
> > + * at TCF0 (SCTLR_EL1.TCF0 bit 0 is set for both synchronous
> > + * or asymmetric mode).
> > + *
> > + * There's an argument that could be made that the kernel
> > + * should also consider the state of TCO (tag check override)
> > + * since userspace does have the ability to set that as well,
> > + * and that could suggest a desire to disable tag checking in
> > + * spite of the state of TCF0. However, the Linux kernel has
> > + * never historically considered the userspace state of TCO,
> > + * (so changing this would be an ABI break), and the hardware
> > + * unconditionally sets TCO when an exception occurs
> > + * anyway.
> > + *
> > + * So, again, here we look only at TCF0 and do not consider
> > + * TCO.
> > + */
> > + return (current->thread.sctlr_user & (1UL << SCTLR_EL1_TCF0_SHIFT));
> > +}
> > +
> > +/*
> > + * Set the kernel's desired policy for MTE tag checking.
> > + *
> > + * This function should be used right after the kernel entry.
> > + */
> > +static inline void set_kernel_mte_policy(struct task_struct *task)
> > {
> > if (!system_supports_mte())
> > return;
> > @@ -232,15 +260,13 @@ static inline void mte_disable_tco_entry(struct task_struct *task)
> > /*
> > * Re-enable tag checking (TCO set on exception entry). This is only
> > * necessary if MTE is enabled in either the kernel or the userspace
> > - * task in synchronous or asymmetric mode (SCTLR_EL1.TCF0 bit 0 is set
> > - * for both). With MTE disabled in the kernel and disabled or
> > - * asynchronous in userspace, tag check faults (including in uaccesses)
> > - * are not reported, therefore there is no need to re-enable checking.
> > + * task. With MTE disabled in the kernel and disabled or asynchronous
> > + * in userspace, tag check faults (including in uaccesses) are not
> > + * reported, therefore there is no need to re-enable checking.
> > * This is beneficial on microarchitectures where re-enabling TCO is
> > * expensive.
>
> The comment implies that toggling TCO can be expensive, so it's not clear
> to me that moving it to the uaccess routines in the next patch is
> necessarily a good idea in general. I understand that you see improvements
> with memcached, but have you tried exercising workloads that are heavy on
> user accesses?
TBH, I don’t understand why toggling TCO is considered expensive.
PSTATE.TCO is set to 0 by default, and kasan_hw_tags_enabled() only
sets SCTLR_ELx.TCF to a value other than TCF_NONE.
Based on my understanding of the performance results (IIUC),
it appears that mem_access_checked_* operations occur even when
SCTLR_ELx.TCF == TCF_NONE.
It also seems that the observed performance impact is caused
by an incorrect check of user_uses_tagcheck() in mte_thread_switch(),
as pointed out by Will, which ends up enabling the TCO bit.
If that's true, I think we could set PSTATE.TCO to 1 by default,
clear it only where needed, and properly handle this bit in
enter_from_xxx() and exit_to_user_mode()...
Am I missing something?
[...]
--
Sincerely,
Yeoreum Yun
* Re: [PATCH 2/2] arm64: mte: Defer disabling of TCO until user_access_begin/end
2026-01-08 15:06 ` Will Deacon
@ 2026-01-08 18:45 ` Catalin Marinas
2026-01-08 23:19 ` Carl Worth
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Catalin Marinas @ 2026-01-08 18:45 UTC (permalink / raw)
To: Will Deacon
Cc: Carl Worth, linux-arm-kernel, linux-kernel, Taehyun Noh,
andreyknvl, pcc, yeoreum.yun
On Thu, Jan 08, 2026 at 03:06:30PM +0000, Will Deacon wrote:
> On Thu, Oct 30, 2025 at 08:49:32PM -0700, Carl Worth wrote:
> > The PSTATE.TCO (Tag Checking Override) register, when set causes MTE
> > tag checking to be disabled. The TCO bit is automatically set by the
> > hardware when an exception is taken.
> >
> > Prior to this commit, mte_disable_tco_entry would clear TCO (enable
> > tag checking) for either of two cases: 1. When the kernel wants tag
> > checking (KASAN) or 2. when userspace wants tag checking (via
> > SCTLR.TCF0).
> >
> > In the case of userspace desired tag checking, (that is, when KASAN is
> > off), clearing TCO on entry to the kernel has negative performance
> > implications. This results in excess kernel space tag checking that
> > has not been requested.
I would have expected the hardware to avoid any tag checking if
SCTLR_EL1.TCF is 0. I guess the Arm ARM isn't entirely clear (D10.4.1,
"Tag Checked memory accesses"); it seems to only mention TCF and TCMA with
a match-all tag when considering Unchecked accesses.
> > diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> > index 1aa4ecb73429..248741a66c91 100644
> > --- a/arch/arm64/include/asm/uaccess.h
> > +++ b/arch/arm64/include/asm/uaccess.h
> > @@ -417,11 +417,41 @@ static __must_check __always_inline bool user_access_begin(const void __user *pt
> > {
> > if (unlikely(!access_ok(ptr,len)))
> > return 0;
> > +
> > + /*
> > + * Enable tag checking for the user access if MTE is enabled
> > + * in the userspace task.
> > + *
> > + * Note: We don't need to do anything if KASAN is enabled,
> > + * since that means the tag checking override (TCO) will
> > + * already be disabled. In turn, the TCF0 bits will control
> > + * whether user-space tag checking happens .
> > + */
> > + if (!kasan_hw_tags_enabled() && user_uses_tagcheck())
> > + asm volatile(SET_PSTATE_TCO(0));
> > +
> > uaccess_ttbr0_enable();
> > return 1;
> > }
>
> What about all the uaccess routines that don't call user_access_begin? For
> example, copy_from_user().
We might as well ignore tag checking for all uaccess for specific
hardware. It's a relaxation but you get this with futex already and some
combination of read/write() syscalls with O_DIRECT.
Reading the Arm ARM section again, I wonder whether always setting TCMA1
does the trick for the Ampere hardware. With KASAN disabled in the
kernel, all addresses will start with 0xff... so behave as match-all. We
do this with KASAN_HW_TAGS enabled but it won't have any effect with
kasan disabled.
Carl, could you please try the patch below?
----------------8<----------------------------------------
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 01e868116448..8b1f0de00fd3 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -48,14 +48,14 @@
#define TCR_KASAN_SW_FLAGS 0
#endif
-#ifdef CONFIG_KASAN_HW_TAGS
-#define TCR_MTE_FLAGS TCR_EL1_TCMA1 | TCR_EL1_TBI1 | TCR_EL1_TBID1
-#elif defined(CONFIG_ARM64_MTE)
+#ifdef CONFIG_ARM64_MTE
/*
* The mte_zero_clear_page_tags() implementation uses DC GZVA, which relies on
- * TBI being enabled at EL1.
+ * TBI being enabled at EL1. TCMA1 is needed to treat accesses with the
+ * match-all tag (0xF) as Tag Unchecked, irrespective of the SCTLR_EL1.TCF
+ * setting.
*/
-#define TCR_MTE_FLAGS TCR_EL1_TBI1 | TCR_EL1_TBID1
+#define TCR_MTE_FLAGS TCR_EL1_TCMA1 | TCR_EL1_TBI1 | TCR_EL1_TBID1
#else
#define TCR_MTE_FLAGS 0
#endif
* Re: [PATCH 2/2] arm64: mte: Defer disabling of TCO until user_access_begin/end
2026-01-08 18:45 ` Catalin Marinas
@ 2026-01-08 23:19 ` Carl Worth
2026-01-09 11:40 ` Will Deacon
2026-01-10 5:29 ` Taehyun Noh
2 siblings, 0 replies; 12+ messages in thread
From: Carl Worth @ 2026-01-08 23:19 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon
Cc: linux-arm-kernel, linux-kernel, Taehyun Noh, andreyknvl, pcc,
yeoreum.yun
Catalin Marinas <catalin.marinas@arm.com> writes:
> On Thu, Jan 08, 2026 at 03:06:30PM +0000, Will Deacon wrote:
>>
>> What about all the uaccess routines that don't call user_access_begin? For
>> example, copy_from_user().
It's possible I missed some code paths here. Thanks for pointing that
out.
> We might as well ignore tag checking for all uaccess for specific
> hardware. It's a relaxation but you get this with futex already and some
> combination of read/write() syscalls with O_DIRECT.
I'm not sure I agree with that. I mean, you could argue that since the
current implementation doesn't guarantee all uaccess gets tag checking,
we have cover for skipping tag checking in other cases.
But I think the system is strictly better if we prefer to have kernel
uaccess use tag checking wherever possible.
> Reading the Arm ARM section again, I wonder whether always setting TCMA1
> does the trick for the Ampere hardware. With KASAN disabled in the
> kernel, all addresses will start with 0xff... so behave as match-all. We
> do this with KASAN_HW_TAGS enabled but it won't have any effect with
> kasan disabled.
I'm not familiar with any "match-all" semantics associated with a
tag-value of 0xf. Maybe I'm missing something?
But I'm clearly not aware of everything regarding MTE, since TCMA1 was
new to me too.
Having read up on it now, I agree it looks like a good approach to try
for addressing the performance problem here. And this would let us leave
the TCO handling as-is, so we could sidestep Will's two concerns above
(potential performance slowdown for use cases other than what I've
reported on, and potential code paths where I missed the toggling of
TCO).
> Carl, could you please try the patch below?
I'll do that and report back here soon.
Thanks,
-Carl
> ----------------8<----------------------------------------
> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> index 01e868116448..8b1f0de00fd3 100644
> --- a/arch/arm64/mm/proc.S
> +++ b/arch/arm64/mm/proc.S
> @@ -48,14 +48,14 @@
> #define TCR_KASAN_SW_FLAGS 0
> #endif
>
> -#ifdef CONFIG_KASAN_HW_TAGS
> -#define TCR_MTE_FLAGS TCR_EL1_TCMA1 | TCR_EL1_TBI1 | TCR_EL1_TBID1
> -#elif defined(CONFIG_ARM64_MTE)
> +#ifdef CONFIG_ARM64_MTE
> /*
> * The mte_zero_clear_page_tags() implementation uses DC GZVA, which relies on
> - * TBI being enabled at EL1.
> + * TBI being enabled at EL1. TCMA1 is needed to treat accesses with the
> + * match-all tag (0xF) as Tag Unchecked, irrespective of the SCTLR_EL1.TCF
> + * setting.
> */
> -#define TCR_MTE_FLAGS TCR_EL1_TBI1 | TCR_EL1_TBID1
> +#define TCR_MTE_FLAGS TCR_EL1_TCMA1 | TCR_EL1_TBI1 | TCR_EL1_TBID1
> #else
> #define TCR_MTE_FLAGS 0
> #endif
* Re: [PATCH 2/2] arm64: mte: Defer disabling of TCO until user_access_begin/end
2026-01-08 18:45 ` Catalin Marinas
2026-01-08 23:19 ` Carl Worth
@ 2026-01-09 11:40 ` Will Deacon
2026-01-10 5:29 ` Taehyun Noh
2 siblings, 0 replies; 12+ messages in thread
From: Will Deacon @ 2026-01-09 11:40 UTC (permalink / raw)
To: Catalin Marinas
Cc: Carl Worth, linux-arm-kernel, linux-kernel, Taehyun Noh,
andreyknvl, pcc, yeoreum.yun
On Thu, Jan 08, 2026 at 06:45:30PM +0000, Catalin Marinas wrote:
> On Thu, Jan 08, 2026 at 03:06:30PM +0000, Will Deacon wrote:
> > On Thu, Oct 30, 2025 at 08:49:32PM -0700, Carl Worth wrote:
> > > The PSTATE.TCO (Tag Checking Override) register, when set causes MTE
> > > tag checking to be disabled. The TCO bit is automatically set by the
> > > hardware when an exception is taken.
> > >
> > > Prior to this commit, mte_disable_tco_entry would clear TCO (enable
> > > tag checking) for either of two cases: 1. When the kernel wants tag
> > > checking (KASAN) or 2. when userspace wants tag checking (via
> > > SCTLR.TCF0).
> > >
> > > In the case of userspace desired tag checking, (that is, when KASAN is
> > > off), clearing TCO on entry to the kernel has negative performance
> > > implications. This results in excess kernel space tag checking that
> > > has not been requested.
>
> I would have expected the hardware to avoid any tag checking if
> SCTLR_EL1.TCF is 0. I guess the Arm ARM isn't entirely clear (D10.4.1
> Tag Checked memory accesses), it seems to only mention TCF and TCMA with
> a match-all tag for considering Unchecked accesses.
>
> > > diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> > > index 1aa4ecb73429..248741a66c91 100644
> > > --- a/arch/arm64/include/asm/uaccess.h
> > > +++ b/arch/arm64/include/asm/uaccess.h
> > > @@ -417,11 +417,41 @@ static __must_check __always_inline bool user_access_begin(const void __user *pt
> > > {
> > > if (unlikely(!access_ok(ptr,len)))
> > > return 0;
> > > +
> > > + /*
> > > + * Enable tag checking for the user access if MTE is enabled
> > > + * in the userspace task.
> > > + *
> > > + * Note: We don't need to do anything if KASAN is enabled,
> > > + * since that means the tag checking override (TCO) will
> > > + * already be disabled. In turn, the TCF0 bits will control
> > > + * whether user-space tag checking happens .
> > > + */
> > > + if (!kasan_hw_tags_enabled() && user_uses_tagcheck())
> > > + asm volatile(SET_PSTATE_TCO(0));
> > > +
> > > uaccess_ttbr0_enable();
> > > return 1;
> > > }
> >
> > What about all the uaccess routines that don't call user_access_begin? For
> > example, copy_from_user().
>
> We might as well ignore tag checking for all uaccess for specific
> hardware. It's a relaxation but you get this with futex already and some
> combination of read/write() syscalls with O_DIRECT.
Hmm, you could argue it's an ABI break, no? You can write a userspace
program that will behave differently before and after the change.
Conversely, you could argue that a syscall using uaccess is an
unstable implementation detail of the syscall, but it feels a bit fragile
(for example, signal delivery is always going to use the uaccess routines
to access the signal stack).
Will
* Re: [PATCH 2/2] arm64: mte: Defer disabling of TCO until user_access_begin/end
2026-01-08 18:45 ` Catalin Marinas
2026-01-08 23:19 ` Carl Worth
2026-01-09 11:40 ` Will Deacon
@ 2026-01-10 5:29 ` Taehyun Noh
2026-01-10 13:02 ` Catalin Marinas
2 siblings, 1 reply; 12+ messages in thread
From: Taehyun Noh @ 2026-01-10 5:29 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon
Cc: Carl Worth, linux-arm-kernel, linux-kernel, andreyknvl, pcc,
yeoreum.yun
Hi,
On Thu Jan 8, 2026 at 12:45 PM CST, Catalin Marinas wrote:
> Reading the Arm ARM section again, I wonder whether always setting TCMA1
> does the trick for the Ampere hardware. With KASAN disabled in the
> kernel, all addresses will start with 0xff... so behave as match-all. We
> do this with KASAN_HW_TAGS enabled but it won't have any effect with
> kasan disabled.
Our team agrees with Catalin’s TCMA1 solution. It disables all
kernel-side tag checking, while user addresses still get tag checked as
long as TCO is clear. Also, Carl’s initial testing confirms that the
`mem_access_checked_*:k` counters drop with the TCMA1 patch. While we
haven’t run the memcached benchmark yet, we will follow up with those
results shortly.
Additionally, we’ve observed that Pixel 9 behaves differently; the
kernel does not perform any tag checking when the user process enables
MTE. I’ve tested a simple kernel module that accesses kernel memory on
user ioctl, and measured the MTE perf counters on both AmpereOne and
Pixel 9. Pixel 9 shows no increase in the checked-access counters, but
AmpereOne shows increases proportional to the buffer size accessed
inside the kernel module.
We will keep you posted as more data becomes available.
Taehyun
* Re: [PATCH 2/2] arm64: mte: Defer disabling of TCO until user_access_begin/end
2026-01-10 5:29 ` Taehyun Noh
@ 2026-01-10 13:02 ` Catalin Marinas
2026-01-14 20:27 ` Carl Worth
0 siblings, 1 reply; 12+ messages in thread
From: Catalin Marinas @ 2026-01-10 13:02 UTC (permalink / raw)
To: Taehyun Noh
Cc: Will Deacon, Carl Worth, linux-arm-kernel, linux-kernel,
andreyknvl, pcc, yeoreum.yun
On Fri, Jan 09, 2026 at 11:29:29PM -0600, Taehyun Noh wrote:
> On Thu Jan 8, 2026 at 12:45 PM CST, Catalin Marinas wrote:
> > Reading the Arm ARM section again, I wonder whether always setting TCMA1
> > does the trick for the Ampere hardware. With KASAN disabled in the
> > kernel, all addresses will start with 0xff... so behave as match-all. We
> > do this with KASAN_HW_TAGS enabled but it won't have any effect with
> > kasan disabled.
>
> Our team agrees with Catalin’s TCMA1 solution. It disables every kernel
> tag checking but the user address will get tag checked as far as TCO is
> clear. Also, Carl’s initial testing confirms that
> `mem_access_checked*:k` counters drop with the TCMA1 patch. While we
> haven’t run the memcached benchmark yet, we will follow up with those
> results shortly.
That's great. Carl, could you please respin the patch with just setting
the TCMA1 bit? Just add a Suggested-by from me (I could post the patch
as well, but I don't have the data to back it up and include in the
commit log).
> Additionally, we’ve observed that Pixel 9 behaves differently; the
> kernel does not perform any tag checking when the user process enables
> MTE. I’ve tested a simple kernel module that accesses kernel memory on
> user ioctl, and measured the MTE perf counters on both AmpereOne and
> Pixel 9. Pixel 9 shows no increases in checked access counters, but
> AmpereOne shows proportional increases depending on the buffer size that
> is accessed inside the kernel module.
It's an implementation choice. I think the Arm Ltd CPUs ignore tag
checking if SCTLR_EL1.TCF==0, irrespective of TCMA1 or TCO. But always
setting TCMA1 is completely harmless and it's covered by the text in the
Arm ARM.
--
Catalin
* Re: [PATCH 2/2] arm64: mte: Defer disabling of TCO until user_access_begin/end
2026-01-10 13:02 ` Catalin Marinas
@ 2026-01-14 20:27 ` Carl Worth
0 siblings, 0 replies; 12+ messages in thread
From: Carl Worth @ 2026-01-14 20:27 UTC (permalink / raw)
To: Catalin Marinas, Taehyun Noh
Cc: Will Deacon, linux-arm-kernel, linux-kernel, andreyknvl, pcc,
yeoreum.yun
Catalin Marinas <catalin.marinas@arm.com> writes:
> On Fri, Jan 09, 2026 at 11:29:29PM -0600, Taehyun Noh wrote:
>> Our team agrees with Catalin’s TCMA1 solution. It disables every kernel
>> tag checking but the user address will get tag checked as far as TCO is
>> clear. Also, Carl’s initial testing confirms that
>> `mem_access_checked*:k` counters drop with the TCMA1 patch. While we
>> haven’t run the memcached benchmark yet, we will follow up with those
>> results shortly.
>
> That's great. Carl, could you please respin the patch with just setting
> the TCMA1 bit? Just add a suggested-by me (I could post the patch as
> well but I don't have the data to back it up and include in the commit
> log).
Will do. I'm just running the final benchmark numbers and then will send
this out.
-Carl
end of thread, other threads:[~2026-01-14 20:28 UTC | newest]
Thread overview: 12+ messages
2025-10-31 3:49 [PATCH 0/2] arm64: mte: Improve performance by tightening handling of PSTATE.TCO Carl Worth
2025-10-31 3:49 ` [PATCH 1/2] arm64: mte: Unify kernel MTE policy and manipulation of TCO Carl Worth
2026-01-08 15:05 ` Will Deacon
2026-01-08 16:28 ` Yeoreum Yun
2025-10-31 3:49 ` [PATCH 2/2] arm64: mte: Defer disabling of TCO until user_access_begin/end Carl Worth
2026-01-08 15:06 ` Will Deacon
2026-01-08 18:45 ` Catalin Marinas
2026-01-08 23:19 ` Carl Worth
2026-01-09 11:40 ` Will Deacon
2026-01-10 5:29 ` Taehyun Noh
2026-01-10 13:02 ` Catalin Marinas
2026-01-14 20:27 ` Carl Worth