From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 31 Jan 2026 13:28:43 +0000
In-Reply-To: <20260131132848.254084-1-vdonnefort@google.com>
Mime-Version: 1.0
References: <20260131132848.254084-1-vdonnefort@google.com>
X-Mailer: git-send-email
 2.53.0.rc1.225.gd81095ad13-goog
Message-ID: <20260131132848.254084-26-vdonnefort@google.com>
Subject: [PATCH v11 25/30] KVM: arm64: Sync boot clock with the nVHE/pKVM hyp
From: Vincent Donnefort
To: rostedt@goodmis.org, mhiramat@kernel.org, mathieu.desnoyers@efficios.com,
 linux-trace-kernel@vger.kernel.org, maz@kernel.org, oliver.upton@linux.dev,
 joey.gouly@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
 jstultz@google.com, qperret@google.com, will@kernel.org,
 aneesh.kumar@kernel.org, kernel-team@android.com,
 linux-kernel@vger.kernel.org, Vincent Donnefort, Thomas Gleixner,
 Stephen Boyd, "Christopher S. Hall", Richard Cochran
Content-Type: text/plain; charset="UTF-8"

Configure the hypervisor tracing clock with the kernel boot clock. For
tracing purposes, the boot clock is interesting: it doesn't stop on
suspend. However, it is corrected on a regular basis, which implies the
need to re-evaluate it every once in a while.

Cc: John Stultz
Cc: Thomas Gleixner
Cc: Stephen Boyd
Cc: Christopher S. Hall
Cc: Richard Cochran
Signed-off-by: Vincent Donnefort

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index c3a7fc939f42..352ebbedaab2 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -93,6 +93,7 @@ enum __kvm_host_smccc_func {
 	__KVM_HOST_SMCCC_FUNC___tracing_unload,
 	__KVM_HOST_SMCCC_FUNC___tracing_enable,
 	__KVM_HOST_SMCCC_FUNC___tracing_swap_reader,
+	__KVM_HOST_SMCCC_FUNC___tracing_update_clock,
 };
 
 #define DECLARE_KVM_VHE_SYM(sym)	extern char sym[]
diff --git a/arch/arm64/kvm/hyp/include/nvhe/trace.h b/arch/arm64/kvm/hyp/include/nvhe/trace.h
index 7da8788ce527..fd641e1b1c23 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/trace.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/trace.h
@@ -11,6 +11,7 @@ int __tracing_load(unsigned long desc_va, size_t desc_size);
 void __tracing_unload(void);
 int __tracing_enable(bool enable);
 int __tracing_swap_reader(unsigned int cpu);
+void __tracing_update_clock(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cyc);
 #else
 static inline void *tracing_reserve_entry(unsigned long length) { return NULL; }
 static inline void tracing_commit_entry(void) { }
@@ -19,5 +20,6 @@ static inline int __tracing_load(unsigned long desc_va, size_t desc_size) { retu
 static inline void __tracing_unload(void) { }
 static inline int __tracing_enable(bool enable) { return -ENODEV; }
 static inline int __tracing_swap_reader(unsigned int cpu) { return -ENODEV; }
+static inline void __tracing_update_clock(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cyc) { }
 #endif
 #endif
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index eddbf5df5d13..9df0d37a494b 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -615,6 +615,16 @@ static void handle___tracing_swap_reader(struct kvm_cpu_context *host_ctxt)
 	cpu_reg(host_ctxt, 1) = __tracing_swap_reader(cpu);
 }
 
+static void handle___tracing_update_clock(struct kvm_cpu_context *host_ctxt)
+{
+	DECLARE_REG(u32, mult, host_ctxt, 1);
+	DECLARE_REG(u32, shift, host_ctxt, 2);
+	DECLARE_REG(u64, epoch_ns, host_ctxt, 3);
+	DECLARE_REG(u64, epoch_cyc, host_ctxt, 4);
+
+	__tracing_update_clock(mult, shift, epoch_ns, epoch_cyc);
+}
+
 typedef void (*hcall_t)(struct kvm_cpu_context *);
 
 #define HANDLE_FUNC(x)	[__KVM_HOST_SMCCC_FUNC_##x] = (hcall_t)handle_##x
@@ -660,6 +670,7 @@ static const hcall_t host_hcall[] = {
 	HANDLE_FUNC(__tracing_unload),
 	HANDLE_FUNC(__tracing_enable),
 	HANDLE_FUNC(__tracing_swap_reader),
+	HANDLE_FUNC(__tracing_update_clock),
 };
 
 static void handle_host_hcall(struct kvm_cpu_context *host_ctxt)
diff --git a/arch/arm64/kvm/hyp/nvhe/trace.c b/arch/arm64/kvm/hyp/nvhe/trace.c
index 282cba70ce9b..2c8e6f49d7de 100644
--- a/arch/arm64/kvm/hyp/nvhe/trace.c
+++ b/arch/arm64/kvm/hyp/nvhe/trace.c
@@ -271,3 +271,19 @@ int __tracing_swap_reader(unsigned int cpu)
 
 	return ret;
 }
+
+void __tracing_update_clock(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cyc)
+{
+	int cpu;
+
+	/* After this loop, all CPUs are observing the new bank... */
+	for (cpu = 0; cpu < hyp_nr_cpus; cpu++) {
+		struct simple_rb_per_cpu *simple_rb = per_cpu_ptr(trace_buffer.simple_rbs, cpu);
+
+		while (READ_ONCE(simple_rb->status) == SIMPLE_RB_WRITING)
+			;
+	}
+
+	/* ...we can now override the old one and swap. */
+	trace_clock_update(mult, shift, epoch_ns, epoch_cyc);
+}
diff --git a/arch/arm64/kvm/hyp_trace.c b/arch/arm64/kvm/hyp_trace.c
index 2866effe28ec..1e5fc27f0e9d 100644
--- a/arch/arm64/kvm/hyp_trace.c
+++ b/arch/arm64/kvm/hyp_trace.c
@@ -4,15 +4,134 @@
  * Author: Vincent Donnefort
  */
 
+#include
 #include
+#include
 #include
+#include
 #include
 #include
 
 #include "hyp_trace.h"
 
+/* Same 10min used by clocksource when width is more than 32-bits */
+#define CLOCK_MAX_CONVERSION_S	600
+/*
+ * Time to give for the clock init. Long enough to get a good mult/shift
+ * estimation. Short enough to not delay the tracing start too much.
+ */
+#define CLOCK_INIT_MS		100
+/*
+ * Time between clock checks. Must be small enough to catch clock deviation when
+ * it is still tiny.
+ */
+#define CLOCK_UPDATE_MS		500
+
+static struct hyp_trace_clock {
+	u64 cycles;
+	u64 cyc_overflow64;
+	u64 boot;
+	u32 mult;
+	u32 shift;
+	struct delayed_work work;
+	struct completion ready;
+	struct mutex lock;
+	bool running;
+} hyp_clock;
+
+static void __hyp_clock_work(struct work_struct *work)
+{
+	struct delayed_work *dwork = to_delayed_work(work);
+	struct hyp_trace_clock *hyp_clock;
+	struct system_time_snapshot snap;
+	u64 rate, delta_cycles;
+	u64 boot, delta_boot;
+
+	hyp_clock = container_of(dwork, struct hyp_trace_clock, work);
+
+	ktime_get_snapshot(&snap);
+	boot = ktime_to_ns(snap.boot);
+
+	delta_boot = boot - hyp_clock->boot;
+	delta_cycles = snap.cycles - hyp_clock->cycles;
+
+	/* Compare hyp clock with the kernel boot clock */
+	if (hyp_clock->mult) {
+		u64 err, cur = delta_cycles;
+
+		if (WARN_ON_ONCE(cur >= hyp_clock->cyc_overflow64)) {
+			__uint128_t tmp = (__uint128_t)cur * hyp_clock->mult;
+
+			cur = tmp >> hyp_clock->shift;
+		} else {
+			cur *= hyp_clock->mult;
+			cur >>= hyp_clock->shift;
+		}
+		cur += hyp_clock->boot;
+
+		err = abs_diff(cur, boot);
+		/* No deviation, only update epoch if necessary */
+		if (!err) {
+			if (delta_cycles >= (hyp_clock->cyc_overflow64 >> 1))
+				goto fast_forward;
+
+			goto resched;
+		}
+
+		/* Warn if the error is above tracing precision (1us) */
+		if (err > NSEC_PER_USEC)
+			pr_warn_ratelimited("hyp trace clock off by %lluus\n",
+					    err / NSEC_PER_USEC);
+	}
+
+	rate = div64_u64(delta_cycles * NSEC_PER_SEC, delta_boot);
+
+	clocks_calc_mult_shift(&hyp_clock->mult, &hyp_clock->shift,
+			       rate, NSEC_PER_SEC, CLOCK_MAX_CONVERSION_S);
+
+	/* Add a comfortable 50% margin */
+	hyp_clock->cyc_overflow64 = (U64_MAX / hyp_clock->mult) >> 1;
+
+fast_forward:
+	hyp_clock->cycles = snap.cycles;
+	hyp_clock->boot = boot;
+	kvm_call_hyp_nvhe(__tracing_update_clock, hyp_clock->mult,
+			  hyp_clock->shift, hyp_clock->boot, hyp_clock->cycles);
+	complete(&hyp_clock->ready);
+
+resched:
+	schedule_delayed_work(&hyp_clock->work,
+			      msecs_to_jiffies(CLOCK_UPDATE_MS));
+}
+
+static void hyp_trace_clock_enable(struct hyp_trace_clock *hyp_clock, bool enable)
+{
+	struct system_time_snapshot snap;
+
+	if (hyp_clock->running == enable)
+		return;
+
+	if (!enable) {
+		cancel_delayed_work_sync(&hyp_clock->work);
+		hyp_clock->running = false;
+		return;
+	}
+
+	ktime_get_snapshot(&snap);
+
+	hyp_clock->boot = ktime_to_ns(snap.boot);
+	hyp_clock->cycles = snap.cycles;
+	hyp_clock->mult = 0;
+
+	init_completion(&hyp_clock->ready);
+	INIT_DELAYED_WORK(&hyp_clock->work, __hyp_clock_work);
+	schedule_delayed_work(&hyp_clock->work, msecs_to_jiffies(CLOCK_INIT_MS));
+	wait_for_completion(&hyp_clock->ready);
+	hyp_clock->running = true;
+}
+
 /* Access to this struct within the trace_remote_callbacks are protected by the trace_remote lock */
 static struct hyp_trace_buffer {
 	struct hyp_trace_desc *desc;
@@ -183,6 +302,8 @@ static void hyp_trace_unload(struct trace_buffer_desc *desc, void *priv)
 
 static int hyp_trace_enable_tracing(bool enable, void *priv)
 {
+	hyp_trace_clock_enable(&hyp_clock, enable);
+
 	return kvm_call_hyp_nvhe(__tracing_enable, enable);
 }
 
@@ -201,7 +322,22 @@ static int hyp_trace_enable_event(unsigned short id, bool enable, void *priv)
 	return 0;
 }
 
+static int hyp_trace_clock_show(struct seq_file *m, void *v)
+{
+	seq_puts(m, "[boot]\n");
+
+	return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(hyp_trace_clock);
+
+static int hyp_trace_init_tracefs(struct dentry *d, void *priv)
+{
+	return tracefs_create_file("trace_clock", 0440, d, NULL, &hyp_trace_clock_fops) ?
+		0 : -ENOMEM;
+}
+
 static struct trace_remote_callbacks trace_remote_callbacks = {
+	.init = hyp_trace_init_tracefs,
 	.load_trace_buffer = hyp_trace_load,
 	.unload_trace_buffer = hyp_trace_unload,
 	.enable_tracing = hyp_trace_enable_tracing,
@@ -212,8 +348,22 @@ static struct trace_remote_callbacks trace_remote_callbacks = {
 
 int __init kvm_hyp_trace_init(void)
 {
+	int cpu;
+
 	if (is_kernel_in_hyp_mode())
 		return 0;
 
+#ifdef CONFIG_ARM_ARCH_TIMER_OOL_WORKAROUND
+	for_each_possible_cpu(cpu) {
+		const struct arch_timer_erratum_workaround *wa =
+			per_cpu(timer_unstable_counter_workaround, cpu);
+
+		if (wa && wa->read_cntvct_el0) {
+			pr_warn("hyp trace can't handle CNTVCT workaround '%s'\n", wa->desc);
+			return -EOPNOTSUPP;
+		}
+	}
+#endif
+
 	return trace_remote_register("hypervisor", &trace_remote_callbacks, &trace_buffer, NULL, 0);
 }
-- 
2.53.0.rc1.225.gd81095ad13-goog
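
Note for readers who want to check the clock arithmetic outside the
kernel: the conversion that __hyp_clock_work() compares against the
boot clock can be sketched in userspace C as below. This is only an
illustration under stated assumptions. The names sketch_clock,
sketch_overflow_limit() and sketch_cyc_to_ns() are hypothetical and
not part of the patch, and it relies on the GCC/Clang unsigned __int128
extension. What is taken from the patch is the formula
epoch_ns + (((cyc - epoch_cyc) * mult) >> shift), the 128-bit fallback
for large deltas, and the (U64_MAX / mult) >> 1 margin.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the four values passed to __tracing_update_clock(). */
struct sketch_clock {
	uint32_t mult;
	uint32_t shift;
	uint64_t epoch_ns;
	uint64_t epoch_cyc;
};

/*
 * Largest cycle delta handled on the 64-bit path. Same expression as the
 * patch: the biggest delta whose product with mult fits in 64 bits, halved
 * to leave a 50% margin.
 */
uint64_t sketch_overflow_limit(uint32_t mult)
{
	assert(mult != 0);
	return (UINT64_MAX / mult) >> 1;
}

/* Convert a raw counter value into nanoseconds relative to the epoch. */
uint64_t sketch_cyc_to_ns(const struct sketch_clock *c, uint64_t cyc)
{
	uint64_t delta = cyc - c->epoch_cyc;

	if (delta >= sketch_overflow_limit(c->mult)) {
		/* Slow path: widen to 128 bits so the multiply cannot wrap. */
		unsigned __int128 tmp = (unsigned __int128)delta * c->mult;

		return c->epoch_ns + (uint64_t)(tmp >> c->shift);
	}

	/* Fast path: the product is known to fit in 64 bits. */
	return c->epoch_ns + ((delta * c->mult) >> c->shift);
}
```

As a made-up example, a 50 MHz counter (20 ns per cycle) can be
described by mult = 20 << 20 and shift = 20. With epoch_ns = 1000 and
epoch_cyc = 500, a counter value of 600 converts to 3000 ns. Refreshing
the epoch well before the delta reaches the limit, as the fast_forward
path does, keeps the conversion on the 64-bit path.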