Date: Tue, 14 Oct 2025 14:43:04 -0700
Message-ID: <87h5w1i2hj.wl-ashutosh.dixit@intel.com>
From: "Dixit, Ashutosh"
To: Umesh Nerlige Ramappa
Subject: Re: [PATCH] drm/i915: Fix conversion between clock ticks and nanoseconds
References: <20251007233543.635130-2-umesh.nerlige.ramappa@intel.com> <87ikgii8k6.wl-ashutosh.dixit@intel.com>
List-Id: Intel graphics driver community testing & development

On Tue, 14 Oct 2025 13:31:27 -0700, Umesh Nerlige Ramappa wrote:
>
> On Mon, Oct 13, 2025 at 06:19:37PM -0700, Dixit, Ashutosh wrote:
> > On Tue, 07 Oct 2025 16:35:44 -0700, Umesh Nerlige Ramappa wrote:
> >
> > Hi Umesh,
> >
> >>
> >> When tick values are large, the multiplication by NSEC_PER_SEC is larger
> >> than 64 bits and results in bad conversions.
> >>
> >> The issue is seen in PMU busyness counters that look like they have
> >> wrapped around due to bad conversion. i915 PMU implementation returns
> >> monotonically increasing counters.
> >> If a count is lesser than previous
> >> one, it will only return the larger value until the smaller value
> >> catches up. The user will see this as zero delta between two
> >> measurements even though the engines are busy.
> >>
> >> Fix it by using a scaling factor to do the conversion. Add the same fix
> >> for reverse conversion as well.
> >>
> >> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14955
> >> Signed-off-by: Umesh Nerlige Ramappa
> >> ---
> >> v2:
> >> - Fix divide by zero for Gen11 (Andi)
> >> - Update commit message
> >> ---
> >>  .../gpu/drm/i915/gt/intel_gt_clock_utils.c | 19 ++++++++++++++-----
> >>  drivers/gpu/drm/i915/gt/intel_gt_types.h | 2 ++
> >>  2 files changed, 16 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_clock_utils.c b/drivers/gpu/drm/i915/gt/intel_gt_clock_utils.c
> >> index 88b147fa5cb1..41a0e8622b33 100644
> >> --- a/drivers/gpu/drm/i915/gt/intel_gt_clock_utils.c
> >> +++ b/drivers/gpu/drm/i915/gt/intel_gt_clock_utils.c
> >> @@ -3,6 +3,8 @@
> >>   * Copyright © 2020 Intel Corporation
> >>   */
> >>
> >> +#include
> >> +
> >>  #include "i915_drv.h"
> >>  #include "i915_reg.h"
> >>  #include "intel_gt.h"
> >> @@ -171,7 +173,12 @@ static u32 read_clock_frequency(struct intel_uncore *uncore)
> >>
> >>  void intel_gt_init_clock_frequency(struct intel_gt *gt)
> >>  {
> >> +	unsigned long clock_period_scale;
> >> +
> >>  	gt->clock_frequency = read_clock_frequency(gt->uncore);
> >> +	clock_period_scale = gcd(NSEC_PER_SEC, gt->clock_frequency);
> >> +	gt->clock_nsec_scaled = NSEC_PER_SEC / clock_period_scale;
> >> +	gt->clock_freq_scaled = gt->clock_frequency / clock_period_scale;
> >>
> >>  	/* Icelake appears to use another fixed frequency for CTX_TIMESTAMP */
> >>  	if (GRAPHICS_VER(gt->i915) == 11)
> >> @@ -180,11 +187,11 @@ void intel_gt_init_clock_frequency(struct intel_gt *gt)
> >>  	gt->clock_period_ns = intel_gt_clock_interval_to_ns(gt, 1);
> >>
> >>  	GT_TRACE(gt,
> >> -		 "Using clock frequency: %dkHz, period: %dns, wrap: %lldms\n",
> >> +		 "Using clock frequency: %dkHz, period: %dns, wrap: %lldms, scale %lu\n",
> >>  		 gt->clock_frequency / 1000,
> >>  		 gt->clock_period_ns,
> >> -		 div_u64(mul_u32_u32(gt->clock_period_ns, S32_MAX),
> >> -			 USEC_PER_SEC));
> >> +		 div_u64(mul_u32_u32(gt->clock_period_ns, S32_MAX), USEC_PER_SEC),
> >> +		 clock_period_scale);
> >>  }
> >>
> >>  #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
> >> @@ -205,7 +212,8 @@ static u64 div_u64_roundup(u64 nom, u32 den)
> >>
> >>  u64 intel_gt_clock_interval_to_ns(const struct intel_gt *gt, u64 count)
> >>  {
> >> -	return div_u64_roundup(count * NSEC_PER_SEC, gt->clock_frequency);
> >> +	return div_u64_roundup(count * gt->clock_nsec_scaled,
> >> +			       gt->clock_freq_scaled);
> >>  }
> >>
> >>  u64 intel_gt_pm_interval_to_ns(const struct intel_gt *gt, u64 count)
> >> @@ -215,7 +223,8 @@ u64 intel_gt_pm_interval_to_ns(const struct intel_gt *gt, u64 count)
> >>
> >>  u64 intel_gt_ns_to_clock_interval(const struct intel_gt *gt, u64 ns)
> >>  {
> >> -	return div_u64_roundup(gt->clock_frequency * ns, NSEC_PER_SEC);
> >> +	return div_u64_roundup(gt->clock_freq_scaled * ns,
> >> +			       gt->clock_nsec_scaled);
> >
> > Instead of this approach, how about just using the already available
> > mul_u64_u32_div() (or even mul_u64_u64_div_u64())? That would be preferable
> > I think (though not sure if the rounding is needed?).
>
> I still think that we need to use the GCD for this calculation and I don't
> see any of the available helpers doing it already. I will assume your
> comment is just about replacing the div_u64_roundup with something already
> available. Are you okay if I leave it as is here, but change it in Xe KMD
> as per your suggestion?

Hi Umesh,

Sorry, the comment about roundup_u64() was mostly incidental.
But to me it looks like gcd() itself is not needed, because the functions I
mentioned will prevent 64 bit overflow using things like 128 bit operations
(or equivalent when 128 bit is not available). These look safer (and more
standard) in case we have weird gpu freq's and gcd turns out to be say 1?

Thanks.
--
Ashutosh

>
> >
> > There is also a roundup_u64() available in math64.h as a replacement for
> > div_u64_roundup().
> >
>
> >>  }
> >>
> >>  u64 intel_gt_ns_to_pm_interval(const struct intel_gt *gt, u64 ns)
> >> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> >> index bcee084b1f27..a19c568fcdc0 100644
> >> --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> >> +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> >> @@ -166,6 +166,8 @@ struct intel_gt {
> >>
> >>  	u32 clock_frequency;
> >>  	u32 clock_period_ns;
> >> +	u32 clock_freq_scaled;
> >> +	u32 clock_nsec_scaled;
> >>
> >>  	struct intel_llc llc;
> >>  	struct intel_rc6 rc6;
> >> --
> >> 2.43.0
> >>
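
For illustration only, here is a minimal userspace sketch of the three
conversions discussed in this thread: the pre-patch multiply, the GCD-scaled
multiply from the patch, and a 128-bit intermediate as a stand-in for what the
review attributes to mul_u64_u32_div(). It is not kernel code. The function
names are hypothetical, rounding up is omitted (plain truncating division),
and unsigned __int128 is a GCC/Clang extension assumed to be available on a
64-bit target.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Euclid's algorithm; a userspace stand-in for the kernel's gcd(). */
uint64_t gcd_u64(uint64_t a, uint64_t b)
{
	while (b) {
		uint64_t t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/* Pre-patch form: count * NSEC_PER_SEC wraps once count > 2^64 / 10^9. */
uint64_t ticks_to_ns_naive(uint64_t count, uint32_t freq)
{
	assert(freq != 0);
	return count * NSEC_PER_SEC / freq;
}

/*
 * Idea from the patch: divide both factors by their GCD first, so the
 * 64-bit product overflows later. This only helps when the GCD is large.
 */
uint64_t ticks_to_ns_gcd(uint64_t count, uint32_t freq)
{
	uint64_t scale;

	assert(freq != 0);
	scale = gcd_u64(NSEC_PER_SEC, freq);
	return count * (NSEC_PER_SEC / scale) / (freq / scale);
}

/*
 * Idea from the review: keep the full 128-bit product before dividing.
 * The quotient fits in 64 bits whenever freq >= NSEC_PER_SEC or count is
 * small enough; that is assumed here rather than checked.
 */
uint64_t ticks_to_ns_wide(uint64_t count, uint32_t freq)
{
	assert(freq != 0);
	return (uint64_t)((unsigned __int128)count * NSEC_PER_SEC / freq);
}
```

With an assumed 19.2 MHz clock (19200000 Hz), the GCD is 1600000, so the
scaled factors are 625 and 12. For 20000000000 ticks (about 17 minutes) both
ticks_to_ns_gcd() and ticks_to_ns_wide() return 1041666666666, while
ticks_to_ns_naive() wraps. With a hypothetical 19200001 Hz clock the GCD is 1,
and ticks_to_ns_gcd() wraps exactly like the naive version, which is the case
raised in the reply above.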