From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E7BFDC36008 for ; Wed, 26 Mar 2025 22:38:35 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A893910E0B5; Wed, 26 Mar 2025 22:38:35 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="ccLubAHW"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.13]) by gabe.freedesktop.org (Postfix) with ESMTPS id 46A6C10E0B5 for ; Wed, 26 Mar 2025 22:38:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1743028714; x=1774564714; h=date:message-id:from:to:cc:subject:in-reply-to: references:mime-version:content-transfer-encoding; bh=b28PuSQ3inD0WsRt7UHQhs3RXv1WptB9q+uiUwaIEvU=; b=ccLubAHWe9fTQnfryMjfNLHiJdM41231dpTHPXtCGXsBPSeElj4Wy2Tu oVhzfhbaLbTcR0I7MoGu/rF3DltOPO6Xl9JEEVHFIS6wBE4qcGVLwN5fb e8MiH5EqEmNjQKOOXmdhErmfoq7HvUwnr9mer1C7xEWERbK6ZYvD79tLd DkTM7Iu1FaItZI+f4XP8eqpsyxIFcc3y3v3MFeuFlMTgt26oiVkC1kbcW B+dfRKiqMLfJA+Ue39T5bHq4Z9TvvvKWPOyj/SyHfEEeYXd6JEpOWJuDM zpX4emZked+WHGkdeG7KTm7SYp1HAb9esWAcjetgalJxSOxFxAuhgenZu Q==; X-CSE-ConnectionGUID: wI2kVtEWTdeiu+pVTnH9Pg== X-CSE-MsgGUID: fk6wkRTvRLyanGsGOVFKEg== X-IronPort-AV: E=McAfee;i="6700,10204,11385"; a="55339133" X-IronPort-AV: E=Sophos;i="6.14,279,1736841600"; d="scan'208";a="55339133" Received: from orviesa008.jf.intel.com ([10.64.159.148]) by orvoesa105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Mar 2025 15:38:34 -0700 X-CSE-ConnectionGUID: b6lqfcUxTVyBtagKN0RbZA== X-CSE-MsgGUID: +fPhWB8ZSl2n7LlrhXKl1A== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.14,279,1736841600"; d="scan'208";a="125871087" Received: from rrodri4-mobl.amr.corp.intel.com (HELO adixit-MOBL3.intel.com) ([10.125.216.73]) by orviesa008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Mar 2025 15:38:33 -0700 Date: Wed, 26 Mar 2025 15:38:33 -0700 Message-ID: <87bjtnifiu.wl-ashutosh.dixit@intel.com> From: "Dixit, Ashutosh" To: "Belgaumkar, Vinay" Cc: Lucas De Marchi , , Riana Tauro , Rodrigo Vivi Subject: Re: [PATCH v4] drm/xe/pmu: Add GT frequency events In-Reply-To: <4a2d5282-eec6-412d-8b4f-2b9ee165bd6f@intel.com> References: <20250324232402.46481-1-vinay.belgaumkar@intel.com> <87jz8e9d3h.wl-ashutosh.dixit@intel.com> <87jz8dhw0r.wl-ashutosh.dixit@intel.com> <33ebfe1d-5ac5-4985-8c8d-c1996e5a6d14@intel.com> <87h63gixq0.wl-ashutosh.dixit@intel.com> <269b993d-2a22-4080-8609-66650a60ad62@intel.com> <87frj0ivbh.wl-ashutosh.dixit@intel.com> <4a2d5282-eec6-412d-8b4f-2b9ee165bd6f@intel.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/29.4 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-BeenThere: intel-xe@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel Xe graphics driver List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-xe-bounces@lists.freedesktop.org Sender: "Intel-xe" On Wed, 26 Mar 2025 15:02:49 -0700, Belgaumkar, Vinay wrote: > > > On 3/25/2025 9:14 PM, Lucas De Marchi wrote: > > On Tue, Mar 25, 2025 at 11:09:55PM -0500, Lucas De Marchi wrote: > >> On Tue, Mar 25, 2025 at 03:45:06PM -0700, Ashutosh Dixit wrote: > >>> On Tue, 25 Mar 2025 15:01:47 -0700, Belgaumkar, Vinay wrote: > >>>> > >>>> > >>>> On 3/25/2025 2:53 PM, Dixit, Ashutosh wrote: > >>>>> On Tue, 25 Mar 2025 13:33:43 -0700, Belgaumkar, Vinay wrote: > >>>>>> On 3/25/2025 10:15 AM, Dixit, Ashutosh wrote: > >>>>>>> On Mon, 24 Mar 2025 19:37:32 -0700, Belgaumkar, Vinay wrote: > >>>>>>> Hi Vinay, > >>>>>>> > >>>>>>>> On 3/24/2025 5:18 PM, Dixit, Ashutosh wrote: > >>>>>>>>> On Mon, 24 Mar 2025 16:24:02 -0700, Vinay Belgaumkar wrote: > >>>>>>>>>> @@ -266,11 +274,24 @@ static u64 __xe_pmu_event_read(struct > >>>>>>>>>> perf_event *event) > >>>>>>>>>> =A0=A0=A0=A0case XE_PMU_EVENT_ENGINE_ACTIVE_TICKS: > >>>>>>>>>> =A0=A0=A0=A0case XE_PMU_EVENT_ENGINE_TOTAL_TICKS: > >>>>>>>>>> =A0=A0=A0=A0=A0=A0=A0 return read_engine_events(gt, event); > >>>>>>>>>> +=A0=A0=A0 case XE_PMU_EVENT_GT_ACTUAL_FREQUENCY: > >>>>>>>>>> +=A0=A0=A0=A0=A0=A0=A0 return xe_guc_pc_get_act_freq(>->uc.g= uc.pc); > >>>>>>>>>> +=A0=A0=A0 case XE_PMU_EVENT_GT_REQUESTED_FREQUENCY: > >>>>>>>>>> +=A0=A0=A0=A0=A0=A0=A0 if (!xe_guc_pc_get_cur_freq(>->uc.guc= .pc, > >>>>>>>>>> &cur_gt_freq)) > >>>>>>>>> This is unconditionally taking the forcewake and waking the card > >>>>>>>>> up just to > >>>>>>>>> get the sample. Do we really want to do that? > >>>>>>>>> > >>>>>>>>> So if we don't do that, both the actual and requested freq will > >>>>>>>>> be 0 if gt > >>>>>>>>> is in C6. > >>>>>>>> For actual frequency, the register(0xc60) does not belong to any > >>>>>>>> fw domain - > >>>>>>>> > >>>>>>>> GEN_FW_RANGE(0xc00, 0xfff, 0), > >>>>>>>> > >>>>>>>> HW will report 0 when GT is in C6. > >>>>>>> Yes, no issue about act_freq, see commit 22009b6dad66. I was > >>>>>>> referring only > >>>>>>> to requested freq. > >>>>>>> > >>>>>>>> The requested freq register is a > >>>>>>>> shadowed register (0xa008), so that will not accrue fwake either. > >>>>>>>> > >>>>>>>> static const struct i915_range mtl_shadowed_regs[] =3D { > >>>>>>>> =A0 =A0=A0=A0=A0=A0=A0=A0 { .start =3D=A0=A0 0x2030, .end =3D=A0= =A0 0x2030 }, > >>>>>>>> =A0 =A0=A0=A0=A0=A0=A0=A0 { .start =3D=A0=A0 0x2510, .end =3D=A0= =A0 0x2550 }, > >>>>>>>> =A0 =A0=A0=A0=A0=A0=A0=A0 { .start =3D=A0=A0 0xA008, .end =3D=A0= =A0 0xA00C }, > >>>>>>> So this still doesn't make sense because: > >>>>>>> > >>>>>>> =A0=A0=A0 1. The fact is that xe_guc_pc_get_cur_freq() *is* taking > >>>>>>> forcewake > >>>>>>> =A0=A0=A0 2. And that is in accord with the following comment in > >>>>>>> i915/intel_uncore.c > >>>>>>> > >>>>>>> =A0=A0=A0=A0=A0=A0 * Shadowing only applies to writes; forcewake > >>>>>>> =A0=A0=A0=A0=A0=A0 * must still be acquired when reading from reg= isters in > >>>>>>> these ranges. > >>>>>>> > >>>>>>> Also see intel_rps_read_punit_req() which is called from i915 PMU > >>>>>>> (frequency_sample()) and uses with_intel_runtime_pm_if_in_use(), = so > >>>>>>> we'd > >>>>>>> need to do use the equivalent in xe. > >>>>>> Hi Ashutosh, > >>>>>> > >>>>>> =A0=A0 As part of a previous decision, in the Xe PMU implementatio= n, we > >>>>>> are > >>>>>> doing a runtime_get() during pmu_init for all PMU sessions. So, > >>>>>> device is > >>>>>> going to be awake anyways. In this case, it does not make sense to > >>>>>> just > >>>>>> read the register without a fwake. > >>>>> > >>>>> Even if that is ok... Let us take a completely idle device. What > >>>>> should the > >>>>> actual and requested freq's be for this case? Both should be zero, > >>>>> correct? > >>>>> Now, what are we going to show if we are taking fwake? At least the > >>>>> requested freq will be non-zero? Is that ok? Or the requested freq > >>>>> will > >>>>> show zero even if we take fwake? Thanks. > >>>> > >>>> If we take fwake before reading, requested frequency will not be zero > >>>> even > >>>> if GT is idle. HW always retains the last requested frequency in that > >>>> register, never zeroes it like it does for actual_freq. So, showing > >>>> the > >>>> non-zero value is the correct thing to do instead of artifically > >>>> zeroing > >>>> it. > >>> > >>> So now, this I disagree with. For an idle device, hopefully in C6, > >>> there > >>> should be no reason to request a non-zero freq. So the requested freq > >>> should show 0. As i915 does (using gt parked and unparked states and > >>> using > >>> with_intel_runtime_pm_if_in_use()). > >> > >> wrt locking and interaction with runtime_pm and forcewake, the perf > >> implementation in i915 is completely broken. You can't take a runtime_= pm > >> or forcewake when you are in an atomic context. And when you have a > >> raw_spin_lock taken like is the case with perf, you can't even look at > >> it (if it involves taking a spin_lock, which it does). > >> > >> perf is holding a raw_spin_lock at that time. The only reason why the > >> CI isn't on fire over there is because we are papering it over with th= is > >> commit in topic/core-for-CI: > >> > >> dc17a830ffb5 ("Revert "lockdep: Enable PROVE_RAW_LOCK_NESTING with > >> PROVE_LOCKING."") > >> > >>> > >>> If we are already "doing a runtime_get() during pmu_init" (as mention= ed > >>> above) we need to devise some other way of making this happen (maybe = by > >>> looking at xe_force_wake_domain ref field?). > >> > >> you can't do that safely. > > > > which reminds me... Vinay can you test this patch directly on > > drm-xe-next? By code inspection it seems we are doing: > > > > __xe_pmu_event_read() > > =A0 xe_guc_pc_get_cur_freq() > > =A0=A0=A0 xe_force_wake_get() > > =A0=A0=A0=A0=A0 spin_lock_irqsave(&fw->lock, flags); > > > > and that spin_lock can't be called at this point. > > Hi Lucas, > > Tried it on drm-xe-next, not seeing any deadlocks/warnings. Is that becau= se > CONFIG_PREEMPT is not set? Doesn't look like upstream config has it set > either - > https://gitlab.freedesktop.org/drm/xe/ci/-/blob/main/kernel/kconfig?ref_t= ype=3Dheads Yeah this will not show anything on Xe unless CONFIG_PREEMPT_RT (not CONFIG_PREEMPT) is enabled. Though, are we really supporting CONFIG_PREEMPT_RT with Xe? Not sure what else will show up if we enable CONFIG_PREEMPT_RT? Thanks. -- Ashutosh