Date: Tue, 28 Mar 2023 11:09:46 +0100
Message-ID: <86r0t9w5jp.wl-maz@kernel.org>
From: Marc Zyngier
To: Colton Lewis
Cc: pbonzini@redhat.com, shuah@kernel.org, seanjc@google.com,
    dmatlack@google.com, vipinsh@google.com, andrew.jones@linux.dev,
    bgardon@google.com, ricarkol@google.com, oliver.upton@linux.dev,
    kvm@vger.kernel.org
Subject: Re: [PATCH v2 1/2] KVM: selftests: Provide generic way to read system counter
References: <87y1nvgv8s.wl-maz@kernel.org>

On Tue, 21 Mar 2023 19:10:04 +0000,
Colton Lewis wrote:
>
> Marc Zyngier writes:
>
> >> +#define MEASURE_CYCLES(x)              \
> >> +       ({                              \
> >> +               uint64_t start;         \
> >> +               start = cycles_read();  \
> >> +               x;                      \
>
> > You insert memory accesses inside a sequence that has no dependency
> > with it. On a weakly ordered memory system, there is absolutely no
> > reason why the memory access shouldn't be moved around. What do you
> > exactly measure in that case?
>
> cycles_read is built on another function timer_get_cntct which includes
> its own barriers.
> Stripped of some abstraction, the sequence is:
>
> timer_get_cntct (isb+read timer)
> whatever is being measured
> timer_get_cntct
>
> I hadn't looked at it too closely before but on review of the manual
> I think you are correct. Borrowing from example D7-2 in the manual, it
> should be:
>
> timer_get_cntct
> isb
> whatever is being measured
> dsb
> timer_get_cntct

That's better, but also very heavy-handed. You'd be better off
constructing an address dependency from the timer value, and feeding
that into a load-acquire/store-release pair wrapping your payload.

>
> >> +               cycles_read() - start;  \
>
> > I also question the usefulness of this exercise. You're comparing the
> > time it takes for a multi-GHz system to put a write in a store buffer
> > (assuming it didn't miss in the TLBs) vs a counter that gets updated
> > at a frequency of a few tens of MHz.
>
> > My gut feeling is that this results in a big fat zero most of the
> > time, but I'm happy to be explained otherwise.
>
>
> In context, I'm trying to measure the time it takes to write to a buffer
> *with dirty memory logging enabled*. What do you mean by zero? I can
> confirm from running this code I am not measuring zero time.

See my earlier point: the counter ticks at a few MHz, and the CPU runs
at multiple GHz. So unless "whatever" is something that takes a
significant time (several thousands of CPU cycles), you'll measure
nothing using the counter. Page faults will probably show, but not a
normal access.

The right tool for this job is to use PMU events, as they count at the
CPU frequency.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
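
To put rough numbers on the resolution argument above, here is a minimal
sketch in C. The 3 GHz CPU clock and 25 MHz counter frequency used in the
note below are assumed values picked for illustration, and both helper
names are hypothetical; none of this comes from the patch under review.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of CPU cycles that elapse during a single
 * tick of the system counter. Anything shorter than this can fall
 * entirely between two counter updates.
 */
uint64_t cpu_cycles_per_tick(uint64_t cpu_hz, uint64_t counter_hz)
{
	return cpu_hz / counter_hz;
}

/*
 * Hypothetical helper: whole counter ticks covered by a payload that
 * runs for payload_cycles CPU cycles. Integer division truncates, and
 * the phase of the counter relative to the payload is ignored.
 */
uint64_t ticks_for_payload(uint64_t payload_cycles,
			   uint64_t cpu_hz, uint64_t counter_hz)
{
	return payload_cycles * counter_hz / cpu_hz;
}
```

With the assumed 3 GHz and 25 MHz figures, cpu_cycles_per_tick() returns
120. A 10-cycle payload then covers 0 whole ticks, while a 6000-cycle
payload covers 50, which is why only events on the scale of a page fault
are likely to register on the counter.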