Date: Mon, 13 Oct 2025 18:19:37 -0700
Message-ID: <87ikgii8k6.wl-ashutosh.dixit@intel.com>
From: "Dixit, Ashutosh"
To: Umesh Nerlige Ramappa
Cc: intel-gfx@lists.freedesktop.org, lucas.demarchi@intel.com, riana.tauro@intel.com, andi.shyti@kernel.org, matthew.brost@intel.com
Subject: Re: [PATCH] drm/i915: Fix conversion between clock ticks and nanoseconds
In-Reply-To: <20251007233543.635130-2-umesh.nerlige.ramappa@intel.com>
References: <20251007233543.635130-2-umesh.nerlige.ramappa@intel.com>

On Tue, 07 Oct 2025 16:35:44 -0700, Umesh Nerlige Ramappa wrote:

Hi Umesh,

>
> When tick values are large, the multiplication by NSEC_PER_SEC is larger
> than 64 bits and results in bad conversions.
>
> The issue is seen in PMU busyness counters that look like they have
> wrapped around due to bad conversion. i915 PMU implementation returns
> monotonically increasing counters. If a count is lesser than previous
> one, it will only return the larger value until the smaller value
> catches up. The user will see this as zero delta between two
> measurements even though the engines are busy.
>
> Fix it by using a scaling factor to do the conversion. Add the same fix
> for reverse conversion as well.
>
> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14955
> Signed-off-by: Umesh Nerlige Ramappa
> ---
> v2:
> - Fix divide by zero for Gen11 (Andi)
> - Update commit message
> ---
>  .../gpu/drm/i915/gt/intel_gt_clock_utils.c    | 19 ++++++++++++++-----
>  drivers/gpu/drm/i915/gt/intel_gt_types.h      |  2 ++
>  2 files changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_clock_utils.c b/drivers/gpu/drm/i915/gt/intel_gt_clock_utils.c
> index 88b147fa5cb1..41a0e8622b33 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_clock_utils.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_clock_utils.c
> @@ -3,6 +3,8 @@
>   * Copyright © 2020 Intel Corporation
>   */
>
> +#include
> +
>  #include "i915_drv.h"
>  #include "i915_reg.h"
>  #include "intel_gt.h"
> @@ -171,7 +173,12 @@ static u32 read_clock_frequency(struct intel_uncore *uncore)
>
>  void intel_gt_init_clock_frequency(struct intel_gt *gt)
>  {
> +	unsigned long clock_period_scale;
> +
>  	gt->clock_frequency = read_clock_frequency(gt->uncore);
> +	clock_period_scale = gcd(NSEC_PER_SEC, gt->clock_frequency);
> +	gt->clock_nsec_scaled = NSEC_PER_SEC / clock_period_scale;
> +	gt->clock_freq_scaled = gt->clock_frequency / clock_period_scale;
>
>  	/* Icelake appears to use another fixed frequency for CTX_TIMESTAMP */
>  	if (GRAPHICS_VER(gt->i915) == 11)
> @@ -180,11 +187,11 @@ void intel_gt_init_clock_frequency(struct intel_gt *gt)
>  	gt->clock_period_ns = intel_gt_clock_interval_to_ns(gt, 1);
>
>  	GT_TRACE(gt,
> -		 "Using clock frequency: %dkHz, period: %dns, wrap: %lldms\n",
> +		 "Using clock frequency: %dkHz, period: %dns, wrap: %lldms, scale %lu\n",
>  		 gt->clock_frequency / 1000,
>  		 gt->clock_period_ns,
> -		 div_u64(mul_u32_u32(gt->clock_period_ns, S32_MAX),
> -			 USEC_PER_SEC));
> +		 div_u64(mul_u32_u32(gt->clock_period_ns, S32_MAX), USEC_PER_SEC),
> +		 clock_period_scale);
>  }
>
>  #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
> @@ -205,7 +212,8 @@ static u64 div_u64_roundup(u64 nom, u32 den)
>
>  u64 intel_gt_clock_interval_to_ns(const struct intel_gt *gt, u64 count)
>  {
> -	return div_u64_roundup(count * NSEC_PER_SEC, gt->clock_frequency);
> +	return div_u64_roundup(count * gt->clock_nsec_scaled,
> +			       gt->clock_freq_scaled);
>  }
>
>  u64 intel_gt_pm_interval_to_ns(const struct intel_gt *gt, u64 count)
> @@ -215,7 +223,8 @@ u64 intel_gt_pm_interval_to_ns(const struct intel_gt *gt, u64 count)
>
>  u64 intel_gt_ns_to_clock_interval(const struct intel_gt *gt, u64 ns)
>  {
> -	return div_u64_roundup(gt->clock_frequency * ns, NSEC_PER_SEC);
> +	return div_u64_roundup(gt->clock_freq_scaled * ns,
> +			       gt->clock_nsec_scaled);

Instead of this approach, how about just using the already available
mul_u64_u32_div() (or even mul_u64_u64_div_u64())? That would be preferable
I think (though not sure if the rounding is needed?).

There is also a roundup_u64() available in math64.h as a replacement for
div_u64_roundup().

Thanks.
--
Ashutosh

>  }
>
>  u64 intel_gt_ns_to_pm_interval(const struct intel_gt *gt, u64 ns)
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> index bcee084b1f27..a19c568fcdc0 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> @@ -166,6 +166,8 @@ struct intel_gt {
>
>  	u32 clock_frequency;
>  	u32 clock_period_ns;
> +	u32 clock_freq_scaled;
> +	u32 clock_nsec_scaled;
>
>  	struct intel_llc llc;
>  	struct intel_rc6 rc6;
> --
> 2.43.0
>
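
For reference, the overflow described in the commit message and the two
approaches discussed in this thread can be reproduced in userspace. The
following is only a minimal sketch under stated assumptions, not the driver
code: struct clk_scale and the ticks_to_ns_*() and clk_scale_init() names are
hypothetical, the 19.2 MHz frequency used in the note below is an illustrative
value, and unsigned __int128 is a GCC/Clang extension standing in for a
widening multiply-then-divide. Whether the wide variant matches the rounding
behaviour of mul_u64_u32_div() is not something this sketch establishes.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical stand-in for the two scaled fields the patch adds. */
struct clk_scale {
	uint32_t nsec_scaled;	/* NSEC_PER_SEC / gcd */
	uint32_t freq_scaled;	/* clock frequency / gcd */
};

static uint64_t gcd_u64(uint64_t a, uint64_t b)
{
	while (b) {
		uint64_t t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/* Round-up division; can itself wrap if nom is within den of UINT64_MAX. */
static uint64_t div_roundup(uint64_t nom, uint32_t den)
{
	assert(den != 0);
	return (nom + den - 1) / den;
}

static struct clk_scale clk_scale_init(uint32_t freq_hz)
{
	uint64_t g = gcd_u64(NSEC_PER_SEC, freq_hz);
	struct clk_scale s = {
		.nsec_scaled = (uint32_t)(NSEC_PER_SEC / g),
		.freq_scaled = (uint32_t)(freq_hz / g),
	};

	return s;
}

/* Pre-patch shape: the product wraps once count exceeds 2^64 / 10^9. */
static uint64_t ticks_to_ns_naive(uint32_t freq_hz, uint64_t count)
{
	return div_roundup(count * NSEC_PER_SEC, freq_hz);
}

/*
 * Patch shape: multiply by the reduced numerator. This extends the usable
 * range but does not remove the limit; if the gcd is 1 nothing is gained.
 */
static uint64_t ticks_to_ns_scaled(const struct clk_scale *s, uint64_t count)
{
	return div_roundup(count * s->nsec_scaled, s->freq_scaled);
}

/* 128-bit intermediate: the product cannot wrap; the result is truncated to 64 bits. */
static uint64_t ticks_to_ns_wide(uint32_t freq_hz, uint64_t count)
{
	unsigned __int128 prod = (unsigned __int128)count * NSEC_PER_SEC;

	assert(freq_hz != 0);
	return (uint64_t)((prod + freq_hz - 1) / freq_hz);
}
```

With a 19.2 MHz clock the gcd is 1600000, so the scaled pair reduces to
625 / 12. For a count of 2^35 ticks, ticks_to_ns_scaled() and
ticks_to_ns_wide() both return 1789569706667 ns, while ticks_to_ns_naive()
returns a different, wrapped value.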