* [PATCH v1 0/5] KVM: arm64: Enforce MTE disablement at EL2
@ 2025-11-27 12:22 Fuad Tabba
2025-11-27 12:22 ` [PATCH v1 1/5] arm64: Remove dead code resetting HCR_EL2 for pKVM Fuad Tabba
` (5 more replies)
0 siblings, 6 replies; 13+ messages in thread
From: Fuad Tabba @ 2025-11-27 12:22 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel
Cc: maz, oliver.upton, joey.gouly, suzuki.poulose, yuzenghui,
catalin.marinas, will, tabba
pKVM never exposes MTE to protected guests (pVMs), but we must also
ensure a malicious host cannot use MTE to attack the hypervisor or a
pVM.
If MTE is supported by the hardware (and is enabled at EL3), it remains
available to lower exception levels by default. Disabling it in the host
kernel (e.g., via 'arm64.nomte') only stops the kernel from advertising
the feature; it does not physically disable MTE in the hardware.
In this scenario, a malicious host could still access tags in pages
donated to a guest using MTE instructions (e.g., STG and LDG), bypassing
the kernel's configuration.
To prevent this, explicitly disable MTE at EL2 (by clearing HCR_EL2.ATA)
when the host has MTE disabled. This causes any MTE instruction usage to
generate a Data Abort (trap) to the hypervisor.
Additionally, to faithfully mimic hardware that does not support MTE,
trap accesses to MTE system registers (e.g., GCR_EL1) and inject an
Undefined Instruction exception back to the host.
This logic is applied in all non-VHE modes. For non-protected modes,
this remains beneficial as it prevents unpredictable behavior caused by
accessing allocation tags when the system considers them disabled.
Note that this ties into my other outgoing patch series [1], which also
has some MTE-related fixes; this series does not depend on it.
Based on Linux 6.18-rc7
Cheers,
/fuad
[1] https://lore.kernel.org/all/20251118103807.707500-1-tabba@google.com/
Fuad Tabba (4):
arm64: Remove dead code resetting HCR_EL2 for pKVM
arm64: Clear HCR_EL2.ATA when MTE is not supported or disabled
arm64: Inject UNDEF when accessing MTE sysregs with MTE disabled
KVM: arm64: Use kvm_has_mte() in pKVM trap initialization
Quentin Perret (1):
KVM: arm64: Refactor enter_exception64()
arch/arm64/include/asm/kvm_arm.h | 2 +-
arch/arm64/include/asm/kvm_emulate.h | 5 ++
arch/arm64/kernel/head.S | 2 +-
arch/arm64/kvm/arm.c | 4 ++
arch/arm64/kvm/hyp/exception.c | 100 ++++++++++++++++-----------
arch/arm64/kvm/hyp/nvhe/hyp-init.S | 5 --
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 44 ++++++++++++
arch/arm64/kvm/hyp/nvhe/pkvm.c | 2 +-
8 files changed, 114 insertions(+), 50 deletions(-)
base-commit: ac3fd01e4c1efce8f2c054cdeb2ddd2fc0fb150d
--
2.52.0.487.g5c8c507ade-goog
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v1 1/5] arm64: Remove dead code resetting HCR_EL2 for pKVM
2025-11-27 12:22 [PATCH v1 0/5] KVM: arm64: Enforce MTE disablement at EL2 Fuad Tabba
@ 2025-11-27 12:22 ` Fuad Tabba
2025-11-27 12:22 ` [PATCH v1 2/5] arm64: Clear HCR_EL2.ATA when MTE is not supported or disabled Fuad Tabba
` (4 subsequent siblings)
5 siblings, 0 replies; 13+ messages in thread
From: Fuad Tabba @ 2025-11-27 12:22 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel
Cc: maz, oliver.upton, joey.gouly, suzuki.poulose, yuzenghui,
catalin.marinas, will, tabba
The pKVM lifecycle does not support tearing down the hypervisor and
returning to the hyp stub once initialized. The transition to protected
mode is one-way.
Consequently, the code path in hyp-init.S responsible for resetting
EL2 registers (triggered by kexec or hibernation) is unreachable in
protected mode.
Remove the dead code handling HCR_EL2 reset for
ARM64_KVM_PROTECTED_MODE.
No functional change intended.
Signed-off-by: Fuad Tabba <tabba@google.com>
---
arch/arm64/kvm/hyp/nvhe/hyp-init.S | 5 -----
1 file changed, 5 deletions(-)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
index aada42522e7b..0d42eedc7167 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
@@ -260,11 +260,6 @@ reset:
msr sctlr_el2, x5
isb
-alternative_if ARM64_KVM_PROTECTED_MODE
- mov_q x5, HCR_HOST_NVHE_FLAGS
- msr_hcr_el2 x5
-alternative_else_nop_endif
-
/* Install stub vectors */
adr_l x5, __hyp_stub_vectors
msr vbar_el2, x5
--
2.52.0.487.g5c8c507ade-goog
* [PATCH v1 2/5] arm64: Clear HCR_EL2.ATA when MTE is not supported or disabled
2025-11-27 12:22 [PATCH v1 0/5] KVM: arm64: Enforce MTE disablement at EL2 Fuad Tabba
2025-11-27 12:22 ` [PATCH v1 1/5] arm64: Remove dead code resetting HCR_EL2 for pKVM Fuad Tabba
@ 2025-11-27 12:22 ` Fuad Tabba
2025-11-27 12:22 ` [PATCH v1 3/5] KVM: arm64: Refactor enter_exception64() Fuad Tabba
` (3 subsequent siblings)
5 siblings, 0 replies; 13+ messages in thread
From: Fuad Tabba @ 2025-11-27 12:22 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel
Cc: maz, oliver.upton, joey.gouly, suzuki.poulose, yuzenghui,
catalin.marinas, will, tabba
If MTE is not supported by the hardware, or is disabled in the kernel
configuration (CONFIG_ARM64_MTE=n) or command line (arm64.nomte), the
kernel stops advertising MTE to userspace and avoids using MTE
instructions. However, this is a software-level disable only.
When MTE hardware is present and enabled by EL3 firmware, leaving
HCR_EL2.ATA set allows the host to execute MTE instructions (STG, LDG,
etc.) and access allocation tags in physical memory. This creates a
security risk: a malicious or buggy host could cause system crashes,
trigger undefined behavior, or compromise guests.
Prevent this by clearing HCR_EL2.ATA when MTE is disabled. Remove it
from the HCR_HOST_NVHE_FLAGS default, and conditionally set it in
cpu_prepare_hyp_mode() only when system_supports_mte() returns true.
This causes MTE instructions to trap to EL2 when HCR_EL2.ATA is cleared.
Early boot code in head.S temporarily keeps HCR_ATA set to avoid
special-casing initialization paths. This is safe because this code
executes before any untrusted code runs, and HCR_ATA is cleared later if
MTE is disabled.
Signed-off-by: Fuad Tabba <tabba@google.com>
---
arch/arm64/include/asm/kvm_arm.h | 2 +-
arch/arm64/kernel/head.S | 2 +-
arch/arm64/kvm/arm.c | 4 ++++
3 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 1da290aeedce..a41e3087e00a 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -101,7 +101,7 @@
HCR_BSU_IS | HCR_FB | HCR_TACR | \
HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW | HCR_TLOR | \
HCR_FMO | HCR_IMO | HCR_PTW | HCR_TID3 | HCR_TID1)
-#define HCR_HOST_NVHE_FLAGS (HCR_RW | HCR_API | HCR_APK | HCR_ATA)
+#define HCR_HOST_NVHE_FLAGS (HCR_RW | HCR_API | HCR_APK)
#define HCR_HOST_NVHE_PROTECTED_FLAGS (HCR_HOST_NVHE_FLAGS | HCR_TSC)
#define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H | HCR_AMO | HCR_IMO | HCR_FMO)
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index ca04b338cb0d..87a822e5c4ca 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -299,7 +299,7 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
isb
0:
- init_el2_hcr HCR_HOST_NVHE_FLAGS
+ init_el2_hcr HCR_HOST_NVHE_FLAGS | HCR_ATA
init_el2_state
/* Hypervisor stub */
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 052bf0d4d0b0..c03006b1c5bc 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -2030,6 +2030,10 @@ static void __init cpu_prepare_hyp_mode(int cpu, u32 hyp_va_bits)
params->hcr_el2 = HCR_HOST_NVHE_PROTECTED_FLAGS;
else
params->hcr_el2 = HCR_HOST_NVHE_FLAGS;
+
+ if (system_supports_mte())
+ params->hcr_el2 |= HCR_ATA;
+
if (cpus_have_final_cap(ARM64_KVM_HVHE))
params->hcr_el2 |= HCR_E2H;
params->vttbr = params->vtcr = 0;
--
2.52.0.487.g5c8c507ade-goog
* [PATCH v1 3/5] KVM: arm64: Refactor enter_exception64()
2025-11-27 12:22 [PATCH v1 0/5] KVM: arm64: Enforce MTE disablement at EL2 Fuad Tabba
2025-11-27 12:22 ` [PATCH v1 1/5] arm64: Remove dead code resetting HCR_EL2 for pKVM Fuad Tabba
2025-11-27 12:22 ` [PATCH v1 2/5] arm64: Clear HCR_EL2.ATA when MTE is not supported or disabled Fuad Tabba
@ 2025-11-27 12:22 ` Fuad Tabba
2025-11-27 12:22 ` [PATCH v1 4/5] arm64: Inject UNDEF when accessing MTE sysregs with MTE disabled Fuad Tabba
` (2 subsequent siblings)
5 siblings, 0 replies; 13+ messages in thread
From: Fuad Tabba @ 2025-11-27 12:22 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel
Cc: maz, oliver.upton, joey.gouly, suzuki.poulose, yuzenghui,
catalin.marinas, will, tabba
From: Quentin Perret <qperret@google.com>
To simplify the injection of exceptions into the host in pKVM context,
refactor enter_exception64() to split out the logic for calculating the
exception vector offset and the target CPSR.
Extract two new helper functions:
- get_except64_offset(): Calculates exception vector offset based on
current/target exception levels and exception type
- get_except64_cpsr(): Computes the new CPSR/PSTATE when taking an
exception
A subsequent patch will use these helpers to inject UNDEF exceptions
into the host when MTE system registers are accessed with MTE disabled.
Extracting the helpers allows that code to reuse the exception entry
logic without duplicating the CPSR and vector offset calculations.
No functional change intended.
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
arch/arm64/include/asm/kvm_emulate.h | 5 ++
arch/arm64/kvm/hyp/exception.c | 100 ++++++++++++++++-----------
2 files changed, 63 insertions(+), 42 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index c9eab316398e..c3f04bd5b2a5 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -71,6 +71,11 @@ static inline int kvm_inject_serror(struct kvm_vcpu *vcpu)
return kvm_inject_serror_esr(vcpu, ESR_ELx_ISV);
}
+unsigned long get_except64_offset(unsigned long psr, unsigned long target_mode,
+ enum exception_type type);
+unsigned long get_except64_cpsr(unsigned long old, bool has_mte,
+ unsigned long sctlr, unsigned long mode);
+
void kvm_vcpu_wfi(struct kvm_vcpu *vcpu);
void kvm_emulate_nested_eret(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kvm/hyp/exception.c b/arch/arm64/kvm/hyp/exception.c
index bef40ddb16db..d3bcda665612 100644
--- a/arch/arm64/kvm/hyp/exception.c
+++ b/arch/arm64/kvm/hyp/exception.c
@@ -65,12 +65,25 @@ static void __vcpu_write_spsr_und(struct kvm_vcpu *vcpu, u64 val)
vcpu->arch.ctxt.spsr_und = val;
}
+unsigned long get_except64_offset(unsigned long psr, unsigned long target_mode,
+ enum exception_type type)
+{
+ u64 mode = psr & (PSR_MODE_MASK | PSR_MODE32_BIT);
+ u64 exc_offset;
+
+ if (mode == target_mode)
+ exc_offset = CURRENT_EL_SP_ELx_VECTOR;
+ else if ((mode | PSR_MODE_THREAD_BIT) == target_mode)
+ exc_offset = CURRENT_EL_SP_EL0_VECTOR;
+ else if (!(mode & PSR_MODE32_BIT))
+ exc_offset = LOWER_EL_AArch64_VECTOR;
+ else
+ exc_offset = LOWER_EL_AArch32_VECTOR;
+
+ return exc_offset + type;
+}
+
/*
- * This performs the exception entry at a given EL (@target_mode), stashing PC
- * and PSTATE into ELR and SPSR respectively, and compute the new PC/PSTATE.
- * The EL passed to this function *must* be a non-secure, privileged mode with
- * bit 0 being set (PSTATE.SP == 1).
- *
* When an exception is taken, most PSTATE fields are left unchanged in the
* handler. However, some are explicitly overridden (e.g. M[4:0]). Luckily all
* of the inherited bits have the same position in the AArch64/AArch32 SPSR_ELx
@@ -82,50 +95,17 @@ static void __vcpu_write_spsr_und(struct kvm_vcpu *vcpu, u64 val)
* Here we manipulate the fields in order of the AArch64 SPSR_ELx layout, from
* MSB to LSB.
*/
-static void enter_exception64(struct kvm_vcpu *vcpu, unsigned long target_mode,
- enum exception_type type)
+unsigned long get_except64_cpsr(unsigned long old, bool has_mte,
+ unsigned long sctlr, unsigned long target_mode)
{
- unsigned long sctlr, vbar, old, new, mode;
- u64 exc_offset;
-
- mode = *vcpu_cpsr(vcpu) & (PSR_MODE_MASK | PSR_MODE32_BIT);
-
- if (mode == target_mode)
- exc_offset = CURRENT_EL_SP_ELx_VECTOR;
- else if ((mode | PSR_MODE_THREAD_BIT) == target_mode)
- exc_offset = CURRENT_EL_SP_EL0_VECTOR;
- else if (!(mode & PSR_MODE32_BIT))
- exc_offset = LOWER_EL_AArch64_VECTOR;
- else
- exc_offset = LOWER_EL_AArch32_VECTOR;
-
- switch (target_mode) {
- case PSR_MODE_EL1h:
- vbar = __vcpu_read_sys_reg(vcpu, VBAR_EL1);
- sctlr = __vcpu_read_sys_reg(vcpu, SCTLR_EL1);
- __vcpu_write_sys_reg(vcpu, *vcpu_pc(vcpu), ELR_EL1);
- break;
- case PSR_MODE_EL2h:
- vbar = __vcpu_read_sys_reg(vcpu, VBAR_EL2);
- sctlr = __vcpu_read_sys_reg(vcpu, SCTLR_EL2);
- __vcpu_write_sys_reg(vcpu, *vcpu_pc(vcpu), ELR_EL2);
- break;
- default:
- /* Don't do that */
- BUG();
- }
-
- *vcpu_pc(vcpu) = vbar + exc_offset + type;
-
- old = *vcpu_cpsr(vcpu);
- new = 0;
+ u64 new = 0;
new |= (old & PSR_N_BIT);
new |= (old & PSR_Z_BIT);
new |= (old & PSR_C_BIT);
new |= (old & PSR_V_BIT);
- if (kvm_has_mte(kern_hyp_va(vcpu->kvm)))
+ if (has_mte)
new |= PSR_TCO_BIT;
new |= (old & PSR_DIT_BIT);
@@ -161,6 +141,42 @@ static void enter_exception64(struct kvm_vcpu *vcpu, unsigned long target_mode,
new |= target_mode;
+ return new;
+}
+
+/*
+ * This performs the exception entry at a given EL (@target_mode), stashing PC
+ * and PSTATE into ELR and SPSR respectively, and compute the new PC/PSTATE.
+ * The EL passed to this function *must* be a non-secure, privileged mode with
+ * bit 0 being set (PSTATE.SP == 1).
+ */
+static void enter_exception64(struct kvm_vcpu *vcpu, unsigned long target_mode,
+ enum exception_type type)
+{
+ u64 offset = get_except64_offset(*vcpu_cpsr(vcpu), target_mode, type);
+ unsigned long sctlr, vbar, old, new;
+
+ switch (target_mode) {
+ case PSR_MODE_EL1h:
+ vbar = __vcpu_read_sys_reg(vcpu, VBAR_EL1);
+ sctlr = __vcpu_read_sys_reg(vcpu, SCTLR_EL1);
+ __vcpu_write_sys_reg(vcpu, *vcpu_pc(vcpu), ELR_EL1);
+ break;
+ case PSR_MODE_EL2h:
+ vbar = __vcpu_read_sys_reg(vcpu, VBAR_EL2);
+ sctlr = __vcpu_read_sys_reg(vcpu, SCTLR_EL2);
+ __vcpu_write_sys_reg(vcpu, *vcpu_pc(vcpu), ELR_EL2);
+ break;
+ default:
+ /* Don't do that */
+ BUG();
+ }
+
+ *vcpu_pc(vcpu) = vbar + offset;
+
+ old = *vcpu_cpsr(vcpu);
+ new = get_except64_cpsr(old, kvm_has_mte(kern_hyp_va(vcpu->kvm)), sctlr,
+ target_mode);
*vcpu_cpsr(vcpu) = new;
__vcpu_write_spsr(vcpu, target_mode, old);
}
--
2.52.0.487.g5c8c507ade-goog
* [PATCH v1 4/5] arm64: Inject UNDEF when accessing MTE sysregs with MTE disabled
2025-11-27 12:22 [PATCH v1 0/5] KVM: arm64: Enforce MTE disablement at EL2 Fuad Tabba
` (2 preceding siblings ...)
2025-11-27 12:22 ` [PATCH v1 3/5] KVM: arm64: Refactor enter_exception64() Fuad Tabba
@ 2025-11-27 12:22 ` Fuad Tabba
2025-11-27 14:17 ` Marc Zyngier
2025-11-27 12:22 ` [PATCH v1 5/5] KVM: arm64: Use kvm_has_mte() in pKVM trap initialization Fuad Tabba
2025-12-02 22:43 ` [PATCH v1 0/5] KVM: arm64: Enforce MTE disablement at EL2 Oliver Upton
5 siblings, 1 reply; 13+ messages in thread
From: Fuad Tabba @ 2025-11-27 12:22 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel
Cc: maz, oliver.upton, joey.gouly, suzuki.poulose, yuzenghui,
catalin.marinas, will, tabba
When MTE hardware is present but disabled via software (arm64.nomte or
CONFIG_ARM64_MTE=n), HCR_EL2.ATA is cleared to prevent use of MTE
instructions. However, this alone doesn't fully emulate hardware that
lacks MTE support.
With HCR_EL2.ATA cleared, accesses to certain MTE system registers trap
to EL2 with exception class ESR_ELx_EC_SYS64. To faithfully emulate
hardware without MTE (where such accesses would cause an Undefined
Instruction exception), inject UNDEF into the host.
Signed-off-by: Fuad Tabba <tabba@google.com>
---
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 44 ++++++++++++++++++++++++++++++
1 file changed, 44 insertions(+)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 29430c031095..f542e4c17156 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -686,6 +686,46 @@ static void handle_host_smc(struct kvm_cpu_context *host_ctxt)
kvm_skip_host_instr();
}
+static void inject_undef64(void)
+{
+ unsigned long sctlr, vbar, old, new;
+ u64 offset, esr;
+
+ vbar = read_sysreg_el1(SYS_VBAR);
+ sctlr = read_sysreg_el1(SYS_SCTLR);
+ old = read_sysreg_el2(SYS_SPSR);
+ new = get_except64_cpsr(old, system_supports_mte(), sctlr, PSR_MODE_EL1h);
+ offset = get_except64_offset(old, PSR_MODE_EL1h, except_type_sync);
+ esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT) | ESR_ELx_IL;
+
+ write_sysreg_el1(esr, SYS_ESR);
+ write_sysreg_el1(read_sysreg_el2(SYS_ELR), SYS_ELR);
+ write_sysreg_el1(old, SYS_SPSR);
+ write_sysreg_el2(vbar + offset, SYS_ELR);
+ write_sysreg_el2(new, SYS_SPSR);
+}
+
+static bool handle_host_mte(u64 esr)
+{
+ /* If we're here for any reason other than MTE, then it's a bug. */
+
+ if (read_sysreg(HCR_EL2) & HCR_ATA)
+ return false;
+
+ switch (esr_sys64_to_sysreg(esr)) {
+ case SYS_RGSR_EL1:
+ case SYS_GCR_EL1:
+ case SYS_TFSR_EL1:
+ case SYS_TFSRE0_EL1:
+ break;
+ default:
+ return false;
+ }
+
+ inject_undef64();
+ return true;
+}
+
void handle_trap(struct kvm_cpu_context *host_ctxt)
{
u64 esr = read_sysreg_el2(SYS_ESR);
@@ -701,6 +741,10 @@ void handle_trap(struct kvm_cpu_context *host_ctxt)
case ESR_ELx_EC_DABT_LOW:
handle_host_mem_abort(host_ctxt);
break;
+ case ESR_ELx_EC_SYS64:
+ if (handle_host_mte(esr))
+ break;
+ fallthrough;
default:
BUG();
}
--
2.52.0.487.g5c8c507ade-goog
* [PATCH v1 5/5] KVM: arm64: Use kvm_has_mte() in pKVM trap initialization
2025-11-27 12:22 [PATCH v1 0/5] KVM: arm64: Enforce MTE disablement at EL2 Fuad Tabba
` (3 preceding siblings ...)
2025-11-27 12:22 ` [PATCH v1 4/5] arm64: Inject UNDEF when accessing MTE sysregs with MTE disabled Fuad Tabba
@ 2025-11-27 12:22 ` Fuad Tabba
2025-12-02 22:43 ` [PATCH v1 0/5] KVM: arm64: Enforce MTE disablement at EL2 Oliver Upton
5 siblings, 0 replies; 13+ messages in thread
From: Fuad Tabba @ 2025-11-27 12:22 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel
Cc: maz, oliver.upton, joey.gouly, suzuki.poulose, yuzenghui,
catalin.marinas, will, tabba
When initializing HCR traps in protected mode, use kvm_has_mte() to
check for MTE support rather than kvm_has_feat(kvm, ID_AA64PFR1_EL1,
MTE, IMP).
kvm_has_mte() provides a more comprehensive check:
- kvm_has_feat() only checks if MTE is in the guest's ID register view
(i.e., what we advertise to the guest)
- kvm_has_mte() checks both system_supports_mte() AND whether
KVM_ARCH_FLAG_MTE_ENABLED is set for this VM instance
Signed-off-by: Fuad Tabba <tabba@google.com>
---
arch/arm64/kvm/hyp/nvhe/pkvm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
index 43bde061b65d..f2060ca66e5d 100644
--- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
+++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
@@ -82,7 +82,7 @@ static void pvm_init_traps_hcr(struct kvm_vcpu *vcpu)
if (!kvm_has_feat(kvm, ID_AA64PFR0_EL1, AMU, IMP))
val &= ~(HCR_AMVOFFEN);
- if (!kvm_has_feat(kvm, ID_AA64PFR1_EL1, MTE, IMP)) {
+ if (!kvm_has_mte(kvm)) {
val |= HCR_TID5;
val &= ~(HCR_DCT | HCR_ATA);
}
--
2.52.0.487.g5c8c507ade-goog
* Re: [PATCH v1 4/5] arm64: Inject UNDEF when accessing MTE sysregs with MTE disabled
2025-11-27 12:22 ` [PATCH v1 4/5] arm64: Inject UNDEF when accessing MTE sysregs with MTE disabled Fuad Tabba
@ 2025-11-27 14:17 ` Marc Zyngier
2025-11-27 14:41 ` Fuad Tabba
0 siblings, 1 reply; 13+ messages in thread
From: Marc Zyngier @ 2025-11-27 14:17 UTC (permalink / raw)
To: Fuad Tabba
Cc: kvmarm, linux-arm-kernel, oliver.upton, joey.gouly,
suzuki.poulose, yuzenghui, catalin.marinas, will
On Thu, 27 Nov 2025 12:22:09 +0000,
Fuad Tabba <tabba@google.com> wrote:
>
> When MTE hardware is present but disabled via software (arm64.nomte or
> CONFIG_ARM64_MTE=n), HCR_EL2.ATA is cleared to prevent use of MTE
> instructions. However, this alone doesn't fully emulate hardware that
> lacks MTE support.
>
> With HCR_EL2.ATA cleared, accesses to certain MTE system registers trap
> to EL2 with exception class ESR_ELx_EC_SYS64. To faithfully emulate
> hardware without MTE (where such accesses would cause an Undefined
> Instruction exception), inject UNDEF into the host.
>
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
> arch/arm64/kvm/hyp/nvhe/hyp-main.c | 44 ++++++++++++++++++++++++++++++
> 1 file changed, 44 insertions(+)
>
> diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> index 29430c031095..f542e4c17156 100644
> --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> @@ -686,6 +686,46 @@ static void handle_host_smc(struct kvm_cpu_context *host_ctxt)
> kvm_skip_host_instr();
> }
>
> +static void inject_undef64(void)
> +{
> + unsigned long sctlr, vbar, old, new;
> + u64 offset, esr;
> +
> + vbar = read_sysreg_el1(SYS_VBAR);
> + sctlr = read_sysreg_el1(SYS_SCTLR);
> + old = read_sysreg_el2(SYS_SPSR);
> + new = get_except64_cpsr(old, system_supports_mte(), sctlr, PSR_MODE_EL1h);
> + offset = get_except64_offset(old, PSR_MODE_EL1h, except_type_sync);
> + esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT) | ESR_ELx_IL;
> +
> + write_sysreg_el1(esr, SYS_ESR);
> + write_sysreg_el1(read_sysreg_el2(SYS_ELR), SYS_ELR);
> + write_sysreg_el1(old, SYS_SPSR);
> + write_sysreg_el2(vbar + offset, SYS_ELR);
> + write_sysreg_el2(new, SYS_SPSR);
> +}
> +
> +static bool handle_host_mte(u64 esr)
> +{
> + /* If we're here for any reason other than MTE, then it's a bug. */
> +
> + if (read_sysreg(HCR_EL2) & HCR_ATA)
> + return false;
> +
> + switch (esr_sys64_to_sysreg(esr)) {
> + case SYS_RGSR_EL1:
> + case SYS_GCR_EL1:
> + case SYS_TFSR_EL1:
> + case SYS_TFSRE0_EL1:
How about other things, such as DC GVA? Don't you need to trap and
UNDEF it (which has the side effect of also trapping DC ZVA)?
Same question for all the DC {C,I,CI}GVA{C,P} instructions.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
* Re: [PATCH v1 4/5] arm64: Inject UNDEF when accessing MTE sysregs with MTE disabled
2025-11-27 14:17 ` Marc Zyngier
@ 2025-11-27 14:41 ` Fuad Tabba
2025-11-28 8:43 ` Marc Zyngier
0 siblings, 1 reply; 13+ messages in thread
From: Fuad Tabba @ 2025-11-27 14:41 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, oliver.upton, joey.gouly,
suzuki.poulose, yuzenghui, catalin.marinas, will
Hi Marc,
On Thu, 27 Nov 2025 at 14:17, Marc Zyngier <maz@kernel.org> wrote:
>
> On Thu, 27 Nov 2025 12:22:09 +0000,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > When MTE hardware is present but disabled via software (arm64.nomte or
> > CONFIG_ARM64_MTE=n), HCR_EL2.ATA is cleared to prevent use of MTE
> > instructions. However, this alone doesn't fully emulate hardware that
> > lacks MTE support.
> >
> > With HCR_EL2.ATA cleared, accesses to certain MTE system registers trap
> > to EL2 with exception class ESR_ELx_EC_SYS64. To faithfully emulate
> > hardware without MTE (where such accesses would cause an Undefined
> > Instruction exception), inject UNDEF into the host.
> >
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> > arch/arm64/kvm/hyp/nvhe/hyp-main.c | 44 ++++++++++++++++++++++++++++++
> > 1 file changed, 44 insertions(+)
> >
> > diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> > index 29430c031095..f542e4c17156 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> > @@ -686,6 +686,46 @@ static void handle_host_smc(struct kvm_cpu_context *host_ctxt)
> > kvm_skip_host_instr();
> > }
> >
> > +static void inject_undef64(void)
> > +{
> > + unsigned long sctlr, vbar, old, new;
> > + u64 offset, esr;
> > +
> > + vbar = read_sysreg_el1(SYS_VBAR);
> > + sctlr = read_sysreg_el1(SYS_SCTLR);
> > + old = read_sysreg_el2(SYS_SPSR);
> > + new = get_except64_cpsr(old, system_supports_mte(), sctlr, PSR_MODE_EL1h);
> > + offset = get_except64_offset(old, PSR_MODE_EL1h, except_type_sync);
> > + esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT) | ESR_ELx_IL;
> > +
> > + write_sysreg_el1(esr, SYS_ESR);
> > + write_sysreg_el1(read_sysreg_el2(SYS_ELR), SYS_ELR);
> > + write_sysreg_el1(old, SYS_SPSR);
> > + write_sysreg_el2(vbar + offset, SYS_ELR);
> > + write_sysreg_el2(new, SYS_SPSR);
> > +}
> > +
> > +static bool handle_host_mte(u64 esr)
> > +{
> > + /* If we're here for any reason other than MTE, then it's a bug. */
> > +
> > + if (read_sysreg(HCR_EL2) & HCR_ATA)
> > + return false;
> > +
> > + switch (esr_sys64_to_sysreg(esr)) {
> > + case SYS_RGSR_EL1:
> > + case SYS_GCR_EL1:
> > + case SYS_TFSR_EL1:
> > + case SYS_TFSRE0_EL1:
>
> How about other things, such as DC GVA? Don't you need to trap and
> UNDEF it (which has the side effect of also trapping DC ZVA)?
>
> Same question for all the DC {C,I,CI}GVA{C,P} instructions.
As far as I could tell, none of these are trapped by ATA. The spec
says that in the absence of MTE, their behavior is undefined --- which
is the same for the ones I'm actually handling here...
The reason I've handled only these is that, when booting a system
with misadvertised MTE, the kernel accesses these registers, and
injecting an UNDEF resulted in a nicer failure mode.
What do you suggest, drop this patch (and the one before it), since
trying to access MTE is UNDEF, and the kernel that does it is just
shooting itself in the foot (no security implications)? Or edit the
commit message to make it clear that this is best effort, based on
what ATA traps?
Cheers,
/fuad
>
> Thanks,
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.
* Re: [PATCH v1 4/5] arm64: Inject UNDEF when accessing MTE sysregs with MTE disabled
2025-11-27 14:41 ` Fuad Tabba
@ 2025-11-28 8:43 ` Marc Zyngier
2025-11-28 8:53 ` Fuad Tabba
2025-11-28 12:10 ` Will Deacon
0 siblings, 2 replies; 13+ messages in thread
From: Marc Zyngier @ 2025-11-28 8:43 UTC (permalink / raw)
To: Fuad Tabba
Cc: kvmarm, linux-arm-kernel, oliver.upton, joey.gouly,
suzuki.poulose, yuzenghui, catalin.marinas, will
On Thu, 27 Nov 2025 14:41:24 +0000,
Fuad Tabba <tabba@google.com> wrote:
>
> Hi Marc,
>
> On Thu, 27 Nov 2025 at 14:17, Marc Zyngier <maz@kernel.org> wrote:
> >
> > On Thu, 27 Nov 2025 12:22:09 +0000,
> > Fuad Tabba <tabba@google.com> wrote:
> > >
> > > When MTE hardware is present but disabled via software (arm64.nomte or
> > > CONFIG_ARM64_MTE=n), HCR_EL2.ATA is cleared to prevent use of MTE
> > > instructions. However, this alone doesn't fully emulate hardware that
> > > lacks MTE support.
> > >
> > > With HCR_EL2.ATA cleared, accesses to certain MTE system registers trap
> > > to EL2 with exception class ESR_ELx_EC_SYS64. To faithfully emulate
> > > hardware without MTE (where such accesses would cause an Undefined
> > > Instruction exception), inject UNDEF into the host.
> > >
> > > Signed-off-by: Fuad Tabba <tabba@google.com>
> > > ---
> > > arch/arm64/kvm/hyp/nvhe/hyp-main.c | 44 ++++++++++++++++++++++++++++++
> > > 1 file changed, 44 insertions(+)
> > >
> > > diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> > > index 29430c031095..f542e4c17156 100644
> > > --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> > > +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> > > @@ -686,6 +686,46 @@ static void handle_host_smc(struct kvm_cpu_context *host_ctxt)
> > > kvm_skip_host_instr();
> > > }
> > >
> > > +static void inject_undef64(void)
> > > +{
> > > + unsigned long sctlr, vbar, old, new;
> > > + u64 offset, esr;
> > > +
> > > + vbar = read_sysreg_el1(SYS_VBAR);
> > > + sctlr = read_sysreg_el1(SYS_SCTLR);
> > > + old = read_sysreg_el2(SYS_SPSR);
> > > + new = get_except64_cpsr(old, system_supports_mte(), sctlr, PSR_MODE_EL1h);
> > > + offset = get_except64_offset(old, PSR_MODE_EL1h, except_type_sync);
> > > + esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT) | ESR_ELx_IL;
> > > +
> > > + write_sysreg_el1(esr, SYS_ESR);
> > > + write_sysreg_el1(read_sysreg_el2(SYS_ELR), SYS_ELR);
> > > + write_sysreg_el1(old, SYS_SPSR);
> > > + write_sysreg_el2(vbar + offset, SYS_ELR);
> > > + write_sysreg_el2(new, SYS_SPSR);
> > > +}
> > > +
> > > +static bool handle_host_mte(u64 esr)
> > > +{
> > > + /* If we're here for any reason other than MTE, then it's a bug. */
> > > +
> > > + if (read_sysreg(HCR_EL2) & HCR_ATA)
> > > + return false;
> > > +
> > > + switch (esr_sys64_to_sysreg(esr)) {
> > > + case SYS_RGSR_EL1:
> > > + case SYS_GCR_EL1:
> > > + case SYS_TFSR_EL1:
> > > + case SYS_TFSRE0_EL1:
> >
> > How about other things, such as DC GVA? Don't you need to trap and
> > UNDEF it (which has the side effect of also trapping DC ZVA)?
> >
> > Same question for all the DC {C,I,CI}GVA{C,P} instructions.
>
> As far as I could tell, none of these are trapped by ATA. The spec
> says that in the absence of MTE, their behavior is undefined --- which
> is the same for the ones I'm actually handling here...
>
> The reason I've handled only these is that, when booting a system
> with misadvertised MTE, the kernel accesses these registers, and
> injecting an UNDEF resulted in a nicer failure mode.
But it all comes down to *why* MTE is disabled. Is it because the user
cannot be arsed with MTE's abysmal^Wstellar performance? Or because
this is a memory corruption vector on a misconfigured platform?
> What do you suggest, drop this patch (and the one before it), since
> trying to access MTE is UNDEF, and the kernel that does it is just
> shooting itself in the foot (no security implications)? Or edit the
> commit message to make it clear that this is best effort, based on
> what ATA traps?
No, what I am simply pointing out is that there is more to MTE than
what gets trapped by ATA, and my hunch is that when MTE is disabled on a
machine that actually has it, it is because something is deeply broken
with tag management (I have one such machine).
So depending on which side of the problem you're on, this could be either
perfectly valid, or just missing the point.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
* Re: [PATCH v1 4/5] arm64: Inject UNDEF when accessing MTE sysregs with MTE disabled
2025-11-28 8:43 ` Marc Zyngier
@ 2025-11-28 8:53 ` Fuad Tabba
2025-11-28 12:10 ` Will Deacon
1 sibling, 0 replies; 13+ messages in thread
From: Fuad Tabba @ 2025-11-28 8:53 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, oliver.upton, joey.gouly,
suzuki.poulose, yuzenghui, catalin.marinas, will
Hi Marc,
On Fri, 28 Nov 2025 at 08:43, Marc Zyngier <maz@kernel.org> wrote:
>
> On Thu, 27 Nov 2025 14:41:24 +0000,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > Hi Marc,
> >
> > On Thu, 27 Nov 2025 at 14:17, Marc Zyngier <maz@kernel.org> wrote:
> > >
> > > On Thu, 27 Nov 2025 12:22:09 +0000,
> > > Fuad Tabba <tabba@google.com> wrote:
> > > >
> > > > When MTE hardware is present but disabled via software (arm64.nomte or
> > > > CONFIG_ARM64_MTE=n), HCR_EL2.ATA is cleared to prevent use of MTE
> > > > instructions. However, this alone doesn't fully emulate hardware that
> > > > lacks MTE support.
> > > >
> > > > With HCR_EL2.ATA cleared, accesses to certain MTE system registers trap
> > > > to EL2 with exception class ESR_ELx_EC_SYS64. To faithfully emulate
> > > > hardware without MTE (where such accesses would cause an Undefined
> > > > Instruction exception), inject UNDEF into the host.
> > > >
> > > > Signed-off-by: Fuad Tabba <tabba@google.com>
> > > > ---
> > > > arch/arm64/kvm/hyp/nvhe/hyp-main.c | 44 ++++++++++++++++++++++++++++++
> > > > 1 file changed, 44 insertions(+)
> > > >
> > > > diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> > > > index 29430c031095..f542e4c17156 100644
> > > > --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> > > > +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> > > > @@ -686,6 +686,46 @@ static void handle_host_smc(struct kvm_cpu_context *host_ctxt)
> > > > kvm_skip_host_instr();
> > > > }
> > > >
> > > > +static void inject_undef64(void)
> > > > +{
> > > > + unsigned long sctlr, vbar, old, new;
> > > > + u64 offset, esr;
> > > > +
> > > > + vbar = read_sysreg_el1(SYS_VBAR);
> > > > + sctlr = read_sysreg_el1(SYS_SCTLR);
> > > > + old = read_sysreg_el2(SYS_SPSR);
> > > > + new = get_except64_cpsr(old, system_supports_mte(), sctlr, PSR_MODE_EL1h);
> > > > + offset = get_except64_offset(old, PSR_MODE_EL1h, except_type_sync);
> > > > + esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT) | ESR_ELx_IL;
> > > > +
> > > > + write_sysreg_el1(esr, SYS_ESR);
> > > > + write_sysreg_el1(read_sysreg_el2(SYS_ELR), SYS_ELR);
> > > > + write_sysreg_el1(old, SYS_SPSR);
> > > > + write_sysreg_el2(vbar + offset, SYS_ELR);
> > > > + write_sysreg_el2(new, SYS_SPSR);
> > > > +}
> > > > +
> > > > +static bool handle_host_mte(u64 esr)
> > > > +{
> > > > + /* If we're here for any reason other than MTE, then it's a bug. */
> > > > +
> > > > + if (read_sysreg(HCR_EL2) & HCR_ATA)
> > > > + return false;
> > > > +
> > > > + switch (esr_sys64_to_sysreg(esr)) {
> > > > + case SYS_RGSR_EL1:
> > > > + case SYS_GCR_EL1:
> > > > + case SYS_TFSR_EL1:
> > > > + case SYS_TFSRE0_EL1:
> > >
> > > How about other things, such as DC GVA? Don't you need to trap and
> > > UNDEF it (which has the side effect of also trapping DC ZVA)?
> > >
> > > Same question for all the DC {C,I,CI}GVA{C,P} instructions.
> >
> > As far as I could tell, none of these are trapped by ATA. The spec
> > says that in the absence of MTE, their behavior is undefined --- which
> > is the same for the ones I'm actually handling here...
> >
> > The reason I've only handled these is that, when booting a system
> > with a misadvertised MTE, the kernel accesses these registers, and
> > injecting an UNDEF resulted in a nicer failure mode.
>
> But it all comes down to *why* is MTE disabled. Is it because the user
> cannot be arsed with MTE's abysmal^Wstellar performance? Or because
> this is a memory corruption vector on a misconfigured platform?
>
> > What do you suggest, drop this patch (and the one before it), since
> > trying to access MTE is UNDEF, and the kernel that does it is just
> > shooting itself in the foot (no security implications)? Or edit the
> > commit message to make it clear that this is best effort, based on
> > what ATA traps?
>
> No, what I am simply pointing out is that there is more to MTE than
> what gets trapped by ATA, and my hunch is that when MTE is disabled on
> a machine that actually has it, it is because something is deeply broken
> with tag management (I have one such machine).
>
> So depending on which side of the problem you're on, this could be either
> perfectly valid, or just missing the point.
I see your point. Of course, my main concern is security and
protecting the guest and the hypervisor from a malicious host (and of
course the other way around, but that's the case in non-protected mode
as well). When MTE is disabled, I think most (if not all) MTE-related
instructions are UNDEFINED.
As far as I know, for the devices I've run into, the reason to disable
MTE has been either performance or lack of support. So for that case,
I think that this solution is enough. On the other hand, if the
hardware is broken, then we can try to mitigate as much as possible,
but in the end there is only so much we can do.
So for now, I would think that this is enough. If we encounter cases
that require more hardening, we could add that later.
What do you think?
/fuad
> Thanks,
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.
* Re: [PATCH v1 4/5] arm64: Inject UNDEF when accessing MTE sysregs with MTE disabled
2025-11-28 8:43 ` Marc Zyngier
2025-11-28 8:53 ` Fuad Tabba
@ 2025-11-28 12:10 ` Will Deacon
1 sibling, 0 replies; 13+ messages in thread
From: Will Deacon @ 2025-11-28 12:10 UTC (permalink / raw)
To: Marc Zyngier
Cc: Fuad Tabba, kvmarm, linux-arm-kernel, oliver.upton, joey.gouly,
suzuki.poulose, yuzenghui, catalin.marinas
Hey Marc,
I can shed a bit more light on why MTE might be disabled in Android, but
please don't shoot the messenger!
On Fri, Nov 28, 2025 at 08:43:09AM +0000, Marc Zyngier wrote:
> On Thu, 27 Nov 2025 14:41:24 +0000,
> Fuad Tabba <tabba@google.com> wrote:
> > On Thu, 27 Nov 2025 at 14:17, Marc Zyngier <maz@kernel.org> wrote:
> > > On Thu, 27 Nov 2025 12:22:09 +0000,
> > > Fuad Tabba <tabba@google.com> wrote:
> > > > diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> > > > index 29430c031095..f542e4c17156 100644
> > > > --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> > > > +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> > > > @@ -686,6 +686,46 @@ static void handle_host_smc(struct kvm_cpu_context *host_ctxt)
> > > > kvm_skip_host_instr();
> > > > }
> > > >
> > > > +static void inject_undef64(void)
> > > > +{
> > > > + unsigned long sctlr, vbar, old, new;
> > > > + u64 offset, esr;
> > > > +
> > > > + vbar = read_sysreg_el1(SYS_VBAR);
> > > > + sctlr = read_sysreg_el1(SYS_SCTLR);
> > > > + old = read_sysreg_el2(SYS_SPSR);
> > > > + new = get_except64_cpsr(old, system_supports_mte(), sctlr, PSR_MODE_EL1h);
> > > > + offset = get_except64_offset(old, PSR_MODE_EL1h, except_type_sync);
> > > > + esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT) | ESR_ELx_IL;
> > > > +
> > > > + write_sysreg_el1(esr, SYS_ESR);
> > > > + write_sysreg_el1(read_sysreg_el2(SYS_ELR), SYS_ELR);
> > > > + write_sysreg_el1(old, SYS_SPSR);
> > > > + write_sysreg_el2(vbar + offset, SYS_ELR);
> > > > + write_sysreg_el2(new, SYS_SPSR);
> > > > +}
> > > > +
> > > > +static bool handle_host_mte(u64 esr)
> > > > +{
> > > > + /* If we're here for any reason other than MTE, then it's a bug. */
> > > > +
> > > > + if (read_sysreg(HCR_EL2) & HCR_ATA)
> > > > + return false;
> > > > +
> > > > + switch (esr_sys64_to_sysreg(esr)) {
> > > > + case SYS_RGSR_EL1:
> > > > + case SYS_GCR_EL1:
> > > > + case SYS_TFSR_EL1:
> > > > + case SYS_TFSRE0_EL1:
> > >
> > > How about other things, such as DC GVA? Don't you need to trap and
> > > UNDEF it (which has the side effect of also trapping DC ZVA)?
> > >
> > > Same question for all the DC {C,I,CI}GVA{C,P} instructions.
> >
> > As far as I could tell, none of these are trapped by ATA. The spec
> > says that in the absence of MTE, their behavior is undefined --- which
> > is the same for the ones I'm actually handling here...
> >
> > The reason I've only handled these is that, when booting a system
> > with a misadvertised MTE, the kernel accesses these registers, and
> > injecting an UNDEF resulted in a nicer failure mode.
>
> But it all comes down to *why* is MTE disabled. Is it because the user
> cannot be arsed with MTE's abysmal^Wstellar performance? Or because
> this is a memory corruption vector on a misconfigured platform?
FWIW, Android uses "arm64.nomte" as the mechanism to "disable" MTE so
that the physical memory otherwise used as tag storage can be used for
other things (i.e. treated just like the rest of memory):
https://source.android.com/docs/security/test/memory-safety/bootloader-support#bootloader-support
I appreciate that this isn't what the early idreg overrides were
designed for, but it's an interesting case because the hardware isn't
actually broken; it's just that there's a decision about whether to give
up memory for tags or not and, if the memory is used for other things,
we need to clear ATA to prevent the host from accessing that memory via
tag operations.
Will
* Re: [PATCH v1 0/5] KVM: arm64: Enforce MTE disablement at EL2
2025-11-27 12:22 [PATCH v1 0/5] KVM: arm64: Enforce MTE disablement at EL2 Fuad Tabba
` (4 preceding siblings ...)
2025-11-27 12:22 ` [PATCH v1 5/5] KVM: arm64: Use kvm_has_mte() in pKVM trap initialization Fuad Tabba
@ 2025-12-02 22:43 ` Oliver Upton
2025-12-05 17:00 ` Will Deacon
5 siblings, 1 reply; 13+ messages in thread
From: Oliver Upton @ 2025-12-02 22:43 UTC (permalink / raw)
To: Fuad Tabba
Cc: kvmarm, linux-arm-kernel, maz, oliver.upton, joey.gouly,
suzuki.poulose, yuzenghui, catalin.marinas, will
Hi Fuad,
On Thu, Nov 27, 2025 at 12:22:05PM +0000, Fuad Tabba wrote:
> pKVM never exposes MTE to protected guests (pVM), but we must also
> ensure a malicious host cannot use MTE to attack the hypervisor or a
> pVM.
>
> If MTE is supported by the hardware (and is enabled at EL3), it remains
> available to lower exception levels by default. Disabling it in the host
> kernel (e.g., via 'arm64.nomte') only stops the kernel from advertising
> the feature; it does not physically disable MTE in the hardware.
>
> In this scenario, a malicious host could still access tags in pages
> donated to a guest using MTE instructions (e.g., STG and LDG), bypassing
> the kernel's configuration.
>
> To prevent this, explicitly disable MTE at EL2 (by clearing HCR_EL2.ATA)
> when the host has MTE disabled. This causes any MTE instruction usage to
> generate a Data Abort (trap) to the hypervisor.
>
> Additionally, to faithfully mimic hardware that does not support MTE,
> trap accesses to MTE system registers (e.g., GCR_EL1) and inject an
> Undefined Instruction exception back to the host.
>
> This logic is applied in all non-VHE modes. For non-protected modes,
> this remains beneficial as it prevents unpredictable behavior caused by
> accessing allocation tags when the system considers them disabled.
>
> Note that this ties into my other outgoing patch series [1], which also
> has some MTE-related fixes, but is not dependent on it.
To be honest, I've actually been having a bit of a hard time
rationalizing some of these targeted fixes for pKVM. It has been in a
half-working state upstream for O(years), and we haven't made forward
progress on enabling pVMs.
Fully aware that guest_memfd has been one of the long poles here, but
I'm becoming less interested in fixes addressing "pKVM policy is XYZ"
without having the full picture of the feature.
What are the upstream plans on enabling some basic implementation of
protected VMs?
Thanks,
Oliver
* Re: [PATCH v1 0/5] KVM: arm64: Enforce MTE disablement at EL2
2025-12-02 22:43 ` [PATCH v1 0/5] KVM: arm64: Enforce MTE disablement at EL2 Oliver Upton
@ 2025-12-05 17:00 ` Will Deacon
0 siblings, 0 replies; 13+ messages in thread
From: Will Deacon @ 2025-12-05 17:00 UTC (permalink / raw)
To: Oliver Upton
Cc: Fuad Tabba, kvmarm, linux-arm-kernel, maz, oliver.upton,
joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas
On Tue, Dec 02, 2025 at 02:43:36PM -0800, Oliver Upton wrote:
> On Thu, Nov 27, 2025 at 12:22:05PM +0000, Fuad Tabba wrote:
> > pKVM never exposes MTE to protected guests (pVM), but we must also
> > ensure a malicious host cannot use MTE to attack the hypervisor or a
> > pVM.
> >
> > If MTE is supported by the hardware (and is enabled at EL3), it remains
> > available to lower exception levels by default. Disabling it in the host
> > kernel (e.g., via 'arm64.nomte') only stops the kernel from advertising
> > the feature; it does not physically disable MTE in the hardware.
> >
> > In this scenario, a malicious host could still access tags in pages
> > donated to a guest using MTE instructions (e.g., STG and LDG), bypassing
> > the kernel's configuration.
> >
> > To prevent this, explicitly disable MTE at EL2 (by clearing HCR_EL2.ATA)
> > when the host has MTE disabled. This causes any MTE instruction usage to
> > generate a Data Abort (trap) to the hypervisor.
> >
> > Additionally, to faithfully mimic hardware that does not support MTE,
> > trap accesses to MTE system registers (e.g., GCR_EL1) and inject an
> > Undefined Instruction exception back to the host.
> >
> > This logic is applied in all non-VHE modes. For non-protected modes,
> > this remains beneficial as it prevents unpredictable behavior caused by
> > accessing allocation tags when the system considers them disabled.
> >
> > Note that this ties into my other outgoing patch series [1], which also
> > has some MTE-related fixes, but is not dependent on it.
>
> To be honest, I've actually been having a bit of a hard time
> rationalizing some of these targeted fixes for pKVM. It has been in a
> half-working state upstream for O(years), and we haven't made forward
> progress on enabling pVMs.
>
> Fully aware that guest_memfd has been one of the long poles here, but
> I'm becoming less interested in fixes addressing "pKVM policy is XYZ"
> without having the full picture of the feature.
That's completely understandable and we're similarly frustrated.
> What are the upstream plans on enabling some basic implementation of
> protected VMs?
Funnily enough, I've been hacking on this recently and I've ended up
with something that I think serves as a good basis for enabling pVMs
incrementally upstream. I need to clean the patches up but I'll be
flying to Japan and back next week so that gives me a good opportunity
to do exactly that!
In the meantime, I hope you'll still consider fixes for non-protected
guests under pKVM (e.g. [1]), as that is an area where I think we've
made some reasonable progress.
Will
[1] https://lore.kernel.org/all/20251128141710.19472-1-will@kernel.org/
end of thread, other threads: [~2025-12-05 17:00 UTC | newest]
Thread overview: 13+ messages
2025-11-27 12:22 [PATCH v1 0/5] KVM: arm64: Enforce MTE disablement at EL2 Fuad Tabba
2025-11-27 12:22 ` [PATCH v1 1/5] arm64: Remove dead code resetting HCR_EL2 for pKVM Fuad Tabba
2025-11-27 12:22 ` [PATCH v1 2/5] arm64: Clear HCR_EL2.ATA when MTE is not supported or disabled Fuad Tabba
2025-11-27 12:22 ` [PATCH v1 3/5] KVM: arm64: Refactor enter_exception64() Fuad Tabba
2025-11-27 12:22 ` [PATCH v1 4/5] arm64: Inject UNDEF when accessing MTE sysregs with MTE disabled Fuad Tabba
2025-11-27 14:17 ` Marc Zyngier
2025-11-27 14:41 ` Fuad Tabba
2025-11-28 8:43 ` Marc Zyngier
2025-11-28 8:53 ` Fuad Tabba
2025-11-28 12:10 ` Will Deacon
2025-11-27 12:22 ` [PATCH v1 5/5] KVM: arm64: Use kvm_has_mte() in pKVM trap initialization Fuad Tabba
2025-12-02 22:43 ` [PATCH v1 0/5] KVM: arm64: Enforce MTE disablement at EL2 Oliver Upton
2025-12-05 17:00 ` Will Deacon