From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Carl Worth <carl@os.amperecomputing.com>,
Taehyun Noh <taehyun@utexas.edu>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>, Sasha Levin <sashal@kernel.org>,
anshuman.khandual@arm.com, ryan.roberts@arm.com,
kevin.brodsky@arm.com, yeoreum.yun@arm.com
Subject: [PATCH AUTOSEL 6.19] arm64: mte: Set TCMA1 whenever MTE is present in the kernel
Date: Wed, 11 Feb 2026 07:30:37 -0500
Message-ID: <20260211123112.1330287-27-sashal@kernel.org>
In-Reply-To: <20260211123112.1330287-1-sashal@kernel.org>
From: Carl Worth <carl@os.amperecomputing.com>
[ Upstream commit a4e5927115f30a301f9939ed43e6a21a343e06ad ]
Set the TCMA1 bit so that access to TTBR1 addresses with 0xf in their
tag bits will be treated as tag unchecked.
This is important to avoid unwanted tag checking on some
systems. Specifically, SCTLR_EL1.TCF can be set to indicate that no
tag check faults are desired. But the architecture doesn't guarantee
that in this case the system won't still perform tag checks.
Use TCMA1 to ensure that undesired tag checks are not performed. This
bit was already set in the KASAN case. Adding it to the non-KASAN case
prevents tag checking since all TTBR1 addresses will have a value of 0xf
in their tag bits.
This patch has been measured on an Ampere system to improve the following:
* Eliminate over 98% of kernel-side tag checks during "perf bench
futex hash", as measured with "perf stat".
* Eliminate all MTE overhead (was previously a 25% performance
penalty) from the Phoronix pts/memcached benchmark (1:10 Set:Get
ratio with 96 cores).
Reported-by: Taehyun Noh <taehyun@utexas.edu>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Carl Worth <carl@os.amperecomputing.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
---
## Comprehensive Analysis: arm64: mte: Set TCMA1 whenever MTE is present in the kernel
### 1. COMMIT MESSAGE ANALYSIS
The commit sets the `TCMA1` (Tag Check Match All 1) bit in `TCR_EL1`
whenever `CONFIG_ARM64_MTE` is enabled, not just when
`CONFIG_KASAN_HW_TAGS` is enabled.
Key indicators:
- **"Reported-by: Taehyun Noh"** - real-world issue discovered by a user
- **"Suggested-by: Catalin Marinas"** - the fix approach comes from the
arm64 MTE subsystem maintainer
- **"Reviewed-by: Catalin Marinas"** - reviewed by the domain expert
- **Signed-off-by: Will Deacon** - merged by the arm64 maintainer
The commit message clearly describes the problem: `SCTLR_EL1.TCF` being
set to NONE (no faults) does **not** guarantee the hardware won't still
perform tag checks. TCMA1 is needed to definitively prevent unwanted tag
checking for kernel addresses.
### 2. CODE CHANGE ANALYSIS
The change is in `arch/arm64/mm/proc.S`:
**Before (current stable trees, prior to this fix):**
```51:61:arch/arm64/mm/proc.S
#ifdef CONFIG_KASAN_HW_TAGS
#define TCR_MTE_FLAGS TCR_EL1_TCMA1 | TCR_EL1_TBI1 | TCR_EL1_TBID1
#elif defined(CONFIG_ARM64_MTE)
/*
 * The mte_zero_clear_page_tags() implementation uses DC GZVA, which relies on
 * TBI being enabled at EL1.
 */
#define TCR_MTE_FLAGS TCR_EL1_TBI1 | TCR_EL1_TBID1
#else
#define TCR_MTE_FLAGS 0
#endif
```
**After (the fix):**
- Collapses the three-way `#ifdef` into two-way: `CONFIG_ARM64_MTE` vs.
else
- Adds `TCR_EL1_TCMA1` to the `CONFIG_ARM64_MTE` case (previously only
in `CONFIG_KASAN_HW_TAGS`)
- This is valid because `CONFIG_KASAN_HW_TAGS` implies
`CONFIG_ARM64_MTE`: `KASAN_HW_TAGS` depends on `HAVE_ARCH_KASAN_HW_TAGS`,
which arm64 selects only if `ARM64_MTE` is enabled
The behavioral change is a single bit addition: `TCR_EL1_TCMA1` (bit 58
of `TCR_EL1`).
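
The effect on the register value can be checked with a small user-space sketch. The `SKETCH_*` macros and `sketch_*` helpers below are hypothetical stand-ins written for this illustration, not the kernel's own definitions; the bit positions (TBI1 = 38, TBID1 = 52, TCMA1 = 58) are taken from the architectural `TCR_EL1` layout, and both helpers assume a kernel built with `CONFIG_ARM64_MTE`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stand-in definitions for illustration only; the kernel has its own
 * TCR_EL1_* constants. Bit positions follow the architectural TCR_EL1 layout.
 */
#define SKETCH_TCR_EL1_TBI1  (UINT64_C(1) << 38) /* top byte ignored for TTBR1 addresses */
#define SKETCH_TCR_EL1_TBID1 (UINT64_C(1) << 52) /* TBI1 applies to data accesses only */
#define SKETCH_TCR_EL1_TCMA1 (UINT64_C(1) << 58) /* match-all tag is Tag Unchecked (TTBR1) */

static_assert(SKETCH_TCR_EL1_TCMA1 == UINT64_C(0x0400000000000000),
	      "TCMA1 is bit 58");

/* TCR_MTE_FLAGS before the patch (three-way #ifdef), with MTE enabled. */
static inline uint64_t sketch_mte_flags_before(int kasan_hw_tags)
{
	uint64_t flags = SKETCH_TCR_EL1_TBI1 | SKETCH_TCR_EL1_TBID1;

	if (kasan_hw_tags)
		flags |= SKETCH_TCR_EL1_TCMA1;
	return flags;
}

/* TCR_MTE_FLAGS after the patch: one value for every MTE configuration. */
static inline uint64_t sketch_mte_flags_after(void)
{
	return SKETCH_TCR_EL1_TCMA1 | SKETCH_TCR_EL1_TBI1 | SKETCH_TCR_EL1_TBID1;
}

/* Bits that differ between the old and new definitions. */
static inline uint64_t sketch_mte_flags_delta(int kasan_hw_tags)
{
	return sketch_mte_flags_before(kasan_hw_tags) ^ sketch_mte_flags_after();
}
```

Under these assumptions, `sketch_mte_flags_delta(0)` is exactly bit 58 and `sketch_mte_flags_delta(1)` is zero, which matches the claim that only the non-KASAN MTE configuration changes.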
### 3. TECHNICAL EXPLANATION
**What TCMA1 does (ARM Architecture Reference Manual):**
- TCMA1 controls "Tag Check Match All" for TTBR1 (kernel) addresses
- When set: accesses with tag 0xF (all bits set) in the top byte are
treated as "Tag Unchecked"
- When clear: tag 0xF is treated like any other tag and is checked
against the allocation tag
**Why this matters:**
- All kernel pointers (TTBR1 addresses) have tag 0xFF (`KASAN_TAG_KERNEL
= 0xFF` from `include/linux/kasan-tags.h`), which corresponds to the
4-bit MTE tag 0xF
- Without TCMA1, the hardware may perform tag checks on every kernel
memory access, even with `SCTLR_EL1.TCF = NONE` (the architecture
doesn't guarantee TCF=NONE prevents checking - it only prevents
faults)
- On the Ampere system measured in the commit, setting TCMA1 eliminated
over **98% of kernel-side tag checks** during "perf bench futex hash"
and removed a **25% performance penalty** on memcached
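
As a rough model of the rule described above, the following sketch decides whether TCMA1 alone makes an access Tag Unchecked. The helper names are hypothetical and the model is deliberately simplified: it looks only at the logical tag in address bits [59:56] and the TTBR select bit 55, and ignores every other condition under which hardware may skip or perform a tag check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Logical MTE tag carried in address bits [59:56]. */
static inline unsigned int sketch_logical_tag(uint64_t va)
{
	return (unsigned int)((va >> 56) & 0xF);
}

/* Bit 55 selects the translation regime: 1 means a TTBR1 (kernel) address. */
static inline bool sketch_is_ttbr1(uint64_t va)
{
	return ((va >> 55) & 1) != 0;
}

/*
 * Simplified model: with TCMA1 set, a TTBR1 access whose logical tag is 0xF
 * is Tag Unchecked. A false result only means TCMA1 does not exempt the
 * access; it does not say whether hardware actually performs a check.
 */
static inline bool sketch_unchecked_by_tcma1(uint64_t va, bool tcma1)
{
	return tcma1 && sketch_is_ttbr1(va) && sketch_logical_tag(va) == 0xF;
}

/* Usage example with made-up addresses. */
void sketch_self_check(void)
{
	/* Kernel pointer with top byte 0xFF: exempt once TCMA1 is set. */
	assert(sketch_unchecked_by_tcma1(UINT64_C(0xffff800010000000), true));
	/* Same pointer without TCMA1: no exemption from this bit. */
	assert(!sketch_unchecked_by_tcma1(UINT64_C(0xffff800010000000), false));
}
```

A kernel pointer carrying a non-0xF tag (as KASAN_HW_TAGS allocations do) or a TTBR0 address is not exempted by this bit in the model, which is why setting it is harmless for the KASAN case that already relied on it.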
**Why it was missing:**
- The original MTE implementation correctly set TCMA1 for
`CONFIG_KASAN_HW_TAGS` (because KASAN uses non-0xF tags for tagged
allocations, and 0xF means "match all")
- But for plain `CONFIG_ARM64_MTE` (without KASAN), TCMA1 was omitted,
likely because it was assumed TCF=NONE was sufficient to prevent tag
checking
### 4. SCOPE AND RISK ASSESSMENT
**Scope:**
- 1 file changed (`arch/arm64/mm/proc.S`)
- ~5 lines of actual diff (macro definition change)
- Purely a register configuration change at boot time
**Risk: VERY LOW**
- `TCMA1` was already set in the `CONFIG_KASAN_HW_TAGS` path - this
extends it to all MTE configurations
- The bit is well-defined in the ARM architecture specification
- It only affects the handling of tag 0xF (match-all tag) on TTBR1
addresses
- Unlikely to cause a functional regression: with `SCTLR_EL1.TCF` set to
report no faults, the skipped checks produced no useful result anyway
### 5. USER IMPACT ASSESSMENT
**Who is affected:**
- `CONFIG_ARM64_MTE` defaults to `y` in `arch/arm64/Kconfig` (line 2124:
`default y`)
- This means virtually **all ARM64 distro kernels** have it enabled
- Any ARM64 system with MTE-capable hardware (ARMv8.5+, e.g. AmpereOne,
Arm Neoverse V2/N2, Cortex-X2 and later)
- These are widely used in cloud/data center (Ampere), mobile (Cortex),
and embedded systems
**Severity:**
- 25% performance penalty on memcached (a very common server workload)
- Over 98% of kernel-side tag checks during "perf bench futex hash" were
unnecessary
- This is a **significant performance issue** for production ARM64
servers
### 6. STABLE TREE APPLICABILITY
The affected three-way ifdef structure exists in:
- **v6.12/v6.6**: Uses `TCR_TCMA1` macro name (trivial rename needed)
- **v6.1**: Same structure, uses `TCR_TCMA1`
- **v5.15**: Uses `SYS_TCR_EL1_TCMA1` macro name, slightly different
ifdef structure but same issue
- **v5.10**: Different code structure (MTE was new), would need more
adaptation
The fix applies cleanly with minor macro name adjustments to all active
LTS kernels from 5.15+.
### 7. CLASSIFICATION
This fix meets the stable kernel criteria:
1. **Obviously correct and tested**: Measured on Ampere hardware,
reviewed by Catalin Marinas (the MTE architect)
2. **Small**: ~5 lines in 1 file
3. **Fixes one thing**: Adds TCMA1 to prevent unwanted tag checking
4. **Fixes a real bug**: The kernel's intent (no tag checking) was not
being realized by the hardware
5. **Notable performance issue**: Stable rules allow fixes for serious
user-reported issues that address a notable performance problem; a 25%
penalty on memcached qualifies
6. **No new features**: Just correcting a register configuration
The commit explicitly addresses a case where the kernel's register
configuration was incomplete, causing the hardware to perform unwanted
operations (tag checking) that produce no useful result but cause
significant performance degradation. This is a bug in the kernel's MTE
initialization, not a new feature or optimization.
**YES** - This commit should be backported to stable kernel trees. It
fixes a significant performance problem (25% on memcached in the reported
measurement) that can affect ARM64 MTE-capable systems running kernels
with `CONFIG_ARM64_MTE=y` (the default). The fix is tiny (adding one bit
to a register), low risk (the same bit was already set in the KASAN
path), and reviewed by the MTE subsystem expert. The affected code
exists in current LTS kernels (5.15+), with only trivial macro name
adjustments needed for backporting.
arch/arm64/mm/proc.S | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 5d907ce3b6d3f..22866b49be372 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -48,14 +48,14 @@
#define TCR_KASAN_SW_FLAGS 0
#endif
-#ifdef CONFIG_KASAN_HW_TAGS
-#define TCR_MTE_FLAGS TCR_EL1_TCMA1 | TCR_EL1_TBI1 | TCR_EL1_TBID1
-#elif defined(CONFIG_ARM64_MTE)
+#ifdef CONFIG_ARM64_MTE
/*
* The mte_zero_clear_page_tags() implementation uses DC GZVA, which relies on
- * TBI being enabled at EL1.
+ * TBI being enabled at EL1. TCMA1 is needed to treat accesses with the
+ * match-all tag (0xF) as Tag Unchecked, irrespective of the SCTLR_EL1.TCF
+ * setting.
*/
-#define TCR_MTE_FLAGS TCR_EL1_TBI1 | TCR_EL1_TBID1
+#define TCR_MTE_FLAGS TCR_EL1_TCMA1 | TCR_EL1_TBI1 | TCR_EL1_TBID1
#else
#define TCR_MTE_FLAGS 0
#endif
--
2.51.0