From mboxrd@z Thu Jan 1 00:00:00 1970
From: Will Deacon
To: kvmarm@lists.linux.dev
Cc: mark.rutland@arm.com, linux-arm-kernel@lists.infradead.org,
 Will Deacon, Marc Zyngier, Oliver Upton, James Clark, Leo Yan,
 Suzuki K Poulose, Fuad Tabba, Alexandru Elisei, Yabin Cui
Subject: [PATCH v3 1/3] KVM: arm64: Disable TRBE Trace Buffer Unit when
 running in guest context
Date: Thu, 26 Mar 2026 14:12:11 +0000
Message-ID: <20260326141214.18990-2-will@kernel.org>
X-Mailer: git-send-email 2.47.3
In-Reply-To: <20260326141214.18990-1-will@kernel.org>
References: <20260326141214.18990-1-will@kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

The nVHE world-switch code relies on zeroing TRFCR_EL1 to disable trace
generation in guest context when self-hosted TRBE is in use by the
host. Per D3.2.1 ("Controls to prohibit trace at Exception levels"),
clearing TRFCR_EL1 means that trace generation is prohibited at EL1 and
EL0, but per R_YCHKJ the Trace Buffer Unit will still be enabled if
TRBLIMITR_EL1.E is set.
R_SJFRQ goes on to state that, when enabled, the Trace Buffer Unit can
perform address translation for the "owning exception level" even when
it is out of context. Consequently, we can end up in a state where TRBE
performs speculative page-table walks for a host VA/IPA in
guest/hypervisor context, depending on the value of MDCR_EL2.E2TB,
which changes over world-switch. The potential result appears to be a
heady mixture of SErrors, data corruption and hardware lockups.

Extend the TRBE world-switch code to clear TRBLIMITR_EL1.E after
draining the buffer, restoring the register on return to the host. This
unfortunately means we need to tackle CPU errata #2064142 and #2038923,
which add additional synchronisation requirements around manipulations
of the limit register. Hopefully this doesn't need to be fast.

Cc: Marc Zyngier
Cc: Oliver Upton
Cc: James Clark
Cc: Leo Yan
Cc: Suzuki K Poulose
Cc: Fuad Tabba
Cc: Alexandru Elisei
Tested-by: Leo Yan
Tested-by: Fuad Tabba
Reviewed-by: Suzuki K Poulose
Reviewed-by: Fuad Tabba
Fixes: a1319260bf62 ("arm64: KVM: Enable access to TRBE support for host")
Signed-off-by: Will Deacon
---
 arch/arm64/include/asm/kvm_host.h  |  1 +
 arch/arm64/kvm/hyp/nvhe/debug-sr.c | 71 ++++++++++++++++++++++++++----
 arch/arm64/kvm/hyp/nvhe/switch.c   |  2 +-
 3 files changed, 64 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 70cb9cfd760a..b1335c55dbef 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -770,6 +770,7 @@ struct kvm_host_data {
 		u64 pmscr_el1;
 		/* Self-hosted trace */
 		u64 trfcr_el1;
+		u64 trblimitr_el1;
 		/* Values of trap registers for the host before guest entry. */
 		u64 mdcr_el2;
 		u64 brbcr_el1;
diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
index 2a1c0f49792b..0955af771ad1 100644
--- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c
+++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
@@ -57,12 +57,54 @@ static void __trace_do_switch(u64 *saved_trfcr, u64 new_trfcr)
 	write_sysreg_el1(new_trfcr, SYS_TRFCR);
 }
 
-static bool __trace_needs_drain(void)
+static void __trace_drain_and_disable(void)
 {
-	if (is_protected_kvm_enabled() && host_data_test_flag(HAS_TRBE))
-		return read_sysreg_s(SYS_TRBLIMITR_EL1) & TRBLIMITR_EL1_E;
+	u64 *trblimitr_el1 = host_data_ptr(host_debug_state.trblimitr_el1);
+	bool needs_drain = is_protected_kvm_enabled() ?
+			   host_data_test_flag(HAS_TRBE) :
+			   host_data_test_flag(TRBE_ENABLED);
 
-	return host_data_test_flag(TRBE_ENABLED);
+	if (!needs_drain) {
+		*trblimitr_el1 = 0;
+		return;
+	}
+
+	*trblimitr_el1 = read_sysreg_s(SYS_TRBLIMITR_EL1);
+	if (*trblimitr_el1 & TRBLIMITR_EL1_E) {
+		/*
+		 * The host has enabled the Trace Buffer Unit so we have
+		 * to beat the CPU with a stick until it stops accessing
+		 * memory.
+		 */
+
+		/* First, ensure that our prior write to TRFCR has stuck. */
+		isb();
+
+		/* Now synchronise with the trace and drain the buffer. */
+		tsb_csync();
+		dsb(nsh);
+
+		/*
+		 * With no more trace being generated, we can disable the
+		 * Trace Buffer Unit.
+		 */
+		write_sysreg_s(0, SYS_TRBLIMITR_EL1);
+		if (cpus_have_final_cap(ARM64_WORKAROUND_2064142)) {
+			/*
+			 * Some CPUs are so good, we have to drain 'em
+			 * twice.
+			 */
+			tsb_csync();
+			dsb(nsh);
+		}
+
+		/*
+		 * Ensure that the Trace Buffer Unit is disabled before
+		 * we start mucking with the stage-2 and trap
+		 * configuration.
+		 */
+		isb();
+	}
 }
 
 static bool __trace_needs_switch(void)
@@ -79,15 +121,26 @@ static void __trace_switch_to_guest(void)
 	__trace_do_switch(host_data_ptr(host_debug_state.trfcr_el1),
 			  *host_data_ptr(trfcr_while_in_guest));
-
-	if (__trace_needs_drain()) {
-		isb();
-		tsb_csync();
-	}
+	__trace_drain_and_disable();
 }
 
 static void __trace_switch_to_host(void)
 {
+	u64 trblimitr_el1 = *host_data_ptr(host_debug_state.trblimitr_el1);
+
+	if (trblimitr_el1 & TRBLIMITR_EL1_E) {
+		/* Re-enable the Trace Buffer Unit for the host. */
+		write_sysreg_s(trblimitr_el1, SYS_TRBLIMITR_EL1);
+		isb();
+		if (cpus_have_final_cap(ARM64_WORKAROUND_2038923)) {
+			/*
+			 * Make sure the unit is re-enabled before we
+			 * poke TRFCR.
+			 */
+			isb();
+		}
+	}
+
 	__trace_do_switch(host_data_ptr(trfcr_while_in_guest),
 			  *host_data_ptr(host_debug_state.trfcr_el1));
 }
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index 779089e42681..f00688e69d88 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -278,7 +278,7 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 	 * We're about to restore some new MMU state. Make sure
 	 * ongoing page-table walks that have started before we
 	 * trapped to EL2 have completed. This also synchronises the
-	 * above disabling of BRBE, SPE and TRBE.
+	 * above disabling of BRBE and SPE.
 	 *
 	 * See DDI0487I.a D8.1.5 "Out-of-context translation regimes",
 	 * rule R_LFHQG and subsequent information statements.
-- 
2.53.0.1018.g2bb0e51243-goog
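
[Editor's commentary, not part of the patch: the ordering that the patch enforces on the switch-to-guest path is "ISB; TSB CSYNC; DSB; disable TRBLIMITR_EL1.E; (drain again on erratum #2064142 parts); ISB". The sketch below is a plain host-side model of that ordering only. Every name in it (record(), dsb_nsh(), model_switch_to_guest(), the bool flags) is an invented stand-in for illustration; the stubs merely log events and do not reproduce the real arm64 barrier semantics.]

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TRBLIMITR_EL1_E (1ULL << 0) /* Trace Buffer Unit enable bit */

static char log_buf[256];

/* Append one event name to the recorded sequence. */
static void record(const char *ev)
{
	if (log_buf[0])
		strcat(log_buf, ";");
	strcat(log_buf, ev);
}

/* Stubbed barriers: they only record that they were issued, in order. */
static void isb(void)       { record("isb"); }
static void tsb_csync(void) { record("tsb_csync"); }
static void dsb_nsh(void)   { record("dsb_nsh"); }

/*
 * Model of the drain-and-disable sequence on entry to a guest,
 * mirroring the order of operations in __trace_drain_and_disable().
 * Returns the recorded event sequence for inspection.
 */
static const char *model_switch_to_guest(bool trbe_enabled,
					 bool errata_2064142)
{
	unsigned long long trblimitr = trbe_enabled ? TRBLIMITR_EL1_E : 0;

	log_buf[0] = '\0';
	if (trblimitr & TRBLIMITR_EL1_E) {
		isb();			/* prior TRFCR write must stick */
		tsb_csync();		/* synchronise with the trace   */
		dsb_nsh();		/* drain the buffer             */
		trblimitr = 0;		/* disable the Trace Buffer Unit */
		record("trblimitr<-0");
		if (errata_2064142) {	/* erratum: drain a second time */
			tsb_csync();
			dsb_nsh();
		}
		isb();	/* disable visible before stage-2/trap changes */
	}
	return log_buf;
}
```

With the erratum flag set, the model shows the second TSB CSYNC/DSB pair landing after the disable of TRBLIMITR_EL1.E, and nothing at all happening when the unit was never enabled, matching the early-out in the patch.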