From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 40EC3C0015E for ; Fri, 11 Aug 2023 19:18:13 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 06C0110E6E9; Fri, 11 Aug 2023 19:18:13 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.126]) by gabe.freedesktop.org (Postfix) with ESMTPS id 59B0010E6E9 for ; Fri, 11 Aug 2023 19:18:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1691781491; x=1723317491; h=date:message-id:from:to:cc:subject:in-reply-to: references:mime-version; bh=xJBMvPVFmG6lNGdbvc2H5RJAc5EY/NKp1GD9wFIkcP0=; b=BxnXRxY+/O6Ji18Vgvsadmr9tMMaq3BSqwVwctmZGGmFJ+NSR51t1Dzw k4HNMKesADp3EpmEHGtgxRZ1kqGfh8WBm5CuVFCpSg5frCHi2w0W0/B9Q ko79Nel6jgwjUptGY2v1A2CRRHsiuD/J/eT1ZTmk5Pk59Rps9RC23f6VA NysgoTR1bQKiD/L96eGVgqZifmTkYlQDY/n/c4J6/BTp9vcoSwJjtWRIV P7JSwicVnEiU7hT/PQZbaP/poOPYKPMIiRXBUj2mNiLtzLdQmPImZSwwJ rCfyGtOPJ4C91cRivFR3pHQdQzBjV+s976HTxN9nuy3Sb+m6mAGpZxlL7 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10799"; a="356704530" X-IronPort-AV: E=Sophos;i="6.01,166,1684825200"; d="scan'208";a="356704530" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2023 12:18:10 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10799"; a="767765273" X-IronPort-AV: E=Sophos;i="6.01,166,1684825200"; d="scan'208";a="767765273" Received: from adixit-mobl.amr.corp.intel.com (HELO adixit-arch.intel.com) ([10.209.103.218]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2023 12:18:10 -0700 Date: Fri, 11 Aug 2023 12:17:17 -0700 Message-ID: <87wmy1z8w2.wl-ashutosh.dixit@intel.com> From: "Dixit, Ashutosh" To: "Iddamsetty, Aravind" In-Reply-To: <890d81c9-f617-f77d-848c-de92cb9396c3@intel.com> References: <20230808115436.400611-1-aravind.iddamsetty@intel.com> <20230808115436.400611-3-aravind.iddamsetty@intel.com> <877cq4y8s3.wl-ashutosh.dixit@intel.com> <95750d7f-df24-a9c5-ab39-86db69767bac@intel.com> <87msyz39iv.wl-ashutosh.dixit@intel.com> <875y5m1c56.wl-ashutosh.dixit@intel.com> <890d81c9-f617-f77d-848c-de92cb9396c3@intel.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/29.1 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=US-ASCII Subject: Re: [Intel-xe] [PATCH v3 2/2] drm/xe/pmu: Enable PMU interface X-BeenThere: intel-xe@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel Xe graphics driver List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Bommu Krishnaiah , Tvrtko Ursulin , intel-xe@lists.freedesktop.org, Rodrigo Vivi Errors-To: intel-xe-bounces@lists.freedesktop.org Sender: "Intel-xe" On Thu, 10 Aug 2023 23:17:47 -0700, Iddamsetty, Aravind wrote: > Hi Aravind, > On 11-08-2023 09:08, Dixit, Ashutosh wrote: > > > On Thu, 10 Aug 2023 14:55:41 -0700, Rodrigo Vivi wrote: > > > >> On Thu, Aug 10, 2023 at 01:40:16PM +0530, Iddamsetty, Aravind wrote: > >>> On 10-08-2023 08:10, Dixit, Ashutosh wrote: > >>> > >>>> On Wed, 09 Aug 2023 06:11:48 -0700, Iddamsetty, Aravind wrote: > >>>> > >>>>> On 09-08-2023 17:27, Iddamsetty, Aravind wrote: > >>>>>> On 09-08-2023 15:25, Iddamsetty, Aravind wrote: > >>>>>>> On 09-08-2023 12:58, Dixit, Ashutosh wrote: > >>>>>>>> On Tue, 08 Aug 2023 04:54:36 -0700, Aravind Iddamsetty wrote: > >>>>>>>> > >>>>>>>> Spotted a few remaining things. See if it's possible to fix these up and > >>>>>>>> send another version. > >>>>>>>> > >>>>>>>>> diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c > >>>>>>>>> new file mode 100644 > >>>>>>>>> index 000000000000..9637f8283641 > >>>>>>>>> --- /dev/null > >>>>>>>>> +++ b/drivers/gpu/drm/xe/xe_pmu.c > >>>>>>>>> @@ -0,0 +1,673 @@ > >>>>>>> > >>>>>>> > >>>>>>>>> +static u64 __engine_group_busyness_read(struct xe_gt *gt, int sample_type) > >>>>>>>>> +{ > >>>>>>>>> + u64 val = 0; > >>>>>>>>> + > >>>>>>>> > >>>>>>>> What is the forcewake domain for these registers? Don't we need to get > >>>>>>>> forcewake before reading these. Something like: > >>>>>>>> > >>>>>>>> XE_WARN_ON(xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL)); > >>>>>>> > >>>>>>> based on BSPEC:67609 these belong to GT power domain, so acquiring that > >>>>>>> should be sufficient. > >>>>>> > >>>>>> But if i understand correctly taking forcewake is not allowed here as it > >>>>>> is an atomic context and forcewake can sleep and that is what I'm seeing > >>>>>> as well, might also be the reason why i915 didn't do that as well. > >>>>>> > >>>>>> [ 899.114316] BUG: sleeping function called from invalid context at > >>>>>> kernel/locking/mutex.c:580 > >>>>>> [ 899.115768] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: > >>>>>> 290, name: kworker/27:1 > >>>>> > >>>>> that is the reason why in i915 we were doing similar thing of storing > >>>>> the counter as we enter rc6, not sure how do we do that in xe. > >>>> > >>>> Just to check, which code path(s) is/are aotmic context: > >>>> > >>>> a. xe_pm_suspend > >>>> b. xe_pm_runtime_suspend > >>>> c. xe_pmu_event_read > >>> > >>> pmu_event_read and runtime_suspend are atomic contexts. > >> > >> what about doing this at xe_pci_runtime_idle() ? > >> > >> This will run after the autosuspend time elapses, > >> but before calling any suspend. Also, there's no requirement of > >> that function to be in atomic context. So you could forcewake_get/put > >> and stash your registers before we go runtime_suspend. > > > > Thanks for the suggestion. Though Anshuman was saying that rpm_suspend > > callback itself is not called in atomic context, Aravind seems to have made > > a mistake. Aravind could you please confirm? > > Yup i made a mistake runtime_suspend is not an atomic context, in actual > we are taking the forcewake in gt_suspend which calls the xe_pmu_suspend. OK, great, thanks for confirming. > > > > In any case there seems to be a way out here, we should work to save off > > the registers while suspending in either the idle or suspend callbacks. > > > >>>> Now I am wondering if GuC should provide these counters too along with > >>>> other busyness values it provides, since GuC is what control RC6 > >>>> entry/exit. But let's try to understand the issue some more first. > >>> > >>> do you mean GuC reading these registers and presenting us in a way, will > >>> need to think over how does it fit in the PMU. > > > > I think better to leave GuC out since it's a long process to modify the GuC > > API. So let's cancel that. > > > > About xe_pmu_event_read being atomic context, since the registers might be > > getting updated while xe_pmu_event_read calls are happening, the only way > > out I am seeing is to run a kthread (specifically a delayed work item) > > while PMU is active. The work item will run every 10 ms or so and save off > > the registers (since we cannot take forcewake in xe_pmu_event_read and read > > the registers). So this way we should be able to report register values > > which are at most 10 ms old. > > there are two cases here: Correct. > 1. similar to suspend, we shall capture the register state before GT goes > to rc6 and that shall cover suspend case as well > > 2. in xe_pmu_event_read, we shall check if device is not in rc6 rather > than awake and read register but might have to take forcewake > > for first I do not think we have a way to know in Xe if GT is entering > rc6 (like gt_park in i915). But we can take forcewake here to either prevent GT from entering RC6 or bring it out of RC6 to read the registers. > even in kthread/timer we shall take forcewake only when GT is out of Timer is interrupt/atomic context so it cannot be used, it will have to be a kthread/workqueue. > rc6, but there is a problem our devices default autosuspend is in 1sec > so if we have a timer or kthread at msec the user shall also adjust the > autosuspend to ms or else device will never go to suspend as we take the > pm reference in mem_access_get. The kthread will only try to get forcewake (to read the registers) if the device is awake, so it will use xe_device_mem_access_get_if_ongoing. If device is not awake use previous stored values. Thanks. -- Ashutosh > > Any other ideas here? > > > > Thanks. > > -- > > Ashutosh > > > >>>>>>>> > >>>>>>>>> + switch (sample_type) { > >>>>>>>>> + case __XE_SAMPLE_RENDER_GROUP_BUSY: > >>>>>>>>> + val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE); > >>>>>>>>> + break; > >>>>>>>>> + case __XE_SAMPLE_COPY_GROUP_BUSY: > >>>>>>>>> + val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE); > >>>>>>>>> + break; > >>>>>>>>> + case __XE_SAMPLE_MEDIA_GROUP_BUSY: > >>>>>>>>> + val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE); > >>>>>>>>> + break; > >>>>>>>>> + case __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY: > >>>>>>>>> + val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE); > >>>>>>>>> + break; > >>>>>>>>> + default: > >>>>>>>>> + drm_warn(>->tile->xe->drm, "unknown pmu event\n"); > >>>>>>>>> + } > >>>>>>>> > >>>>>>>> And similarly here: > >>>>>>>> > >>>>>>>> XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));