From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 96E0BC001B0 for ; Mon, 7 Aug 2023 21:18:01 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4E6F510E37D; Mon, 7 Aug 2023 21:18:01 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.65]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5FB6E10E38F for ; Mon, 7 Aug 2023 21:17:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1691443078; x=1722979078; h=date:message-id:from:to:cc:subject:in-reply-to: references:mime-version; bh=rDSg70CeGeexbTMAaNSnS0Z5DjxpZHke0P+zEQTHULU=; b=K0swCNxU/eM6ITYj0KME9KkZPvPykK0lfJbR9OshZw3deexM6iFGvazh nC92l4cXthHt7DojehkR4a77TzR2LiGiKPPt3ryTo4QeJAfpvRESWATET kZsh0J6EVdofeMWGDDNjN/z1EKaPCq3IqYxiyDCLNfGdsfhPQvwYNhauC 6djk4LTskvB3xjUewRhD7jwZneLyYpCn0VQveGuP+YV3Lej82oHp18XXD PptdTpBhDFtSoZBiZs9pPfFUhUGDaZYcqBjeDoqdDZJDRICp/rdvONMW2 q/0PHDKAO6uybTqGKGSjlH285mzjPSDgOoeONQps7RrSpxTy4cNpBYUiv Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10795"; a="374327288" X-IronPort-AV: E=Sophos;i="6.01,263,1684825200"; d="scan'208";a="374327288" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Aug 2023 14:17:33 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10795"; a="845189601" X-IronPort-AV: E=Sophos;i="6.01,263,1684825200"; d="scan'208";a="845189601" Received: from adixit-mobl.amr.corp.intel.com (HELO adixit-arch.intel.com) ([10.251.14.200]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Aug 2023 14:17:33 -0700 Date: Mon, 07 Aug 2023 14:16:59 -0700 Message-ID: <87o7jiy2lw.wl-ashutosh.dixit@intel.com> From: "Dixit, Ashutosh" To: "Iddamsetty, Aravind" In-Reply-To: <500331df-d95b-d447-18e6-d0f198570072@intel.com> References: <20230627122113.1472532-1-aravind.iddamsetty@intel.com> <20230627122113.1472532-3-aravind.iddamsetty@intel.com> <87cz0mqdpi.wl-ashutosh.dixit@intel.com> <877cqsrg65.wl-ashutosh.dixit@intel.com> <875y6cqy6p.wl-ashutosh.dixit@intel.com> <8fd9bc04-b737-0acd-a0c9-4f163c1a117c@intel.com> <87351dqpcr.wl-ashutosh.dixit@intel.com> <42bb140d-2faf-43d3-a027-fe430b3682e5@intel.com> <87zg3lp8y5.wl-ashutosh.dixit@intel.com> <500331df-d95b-d447-18e6-d0f198570072@intel.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/28.2 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=US-ASCII Subject: Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface X-BeenThere: intel-xe@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel Xe graphics driver List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Bommu Krishnaiah , intel-xe@lists.freedesktop.org, Tvrtko Ursulin Errors-To: intel-xe-bounces@lists.freedesktop.org Sender: "Intel-xe" On Tue, 25 Jul 2023 04:38:45 -0700, Iddamsetty, Aravind wrote: > Hi Aravind, > On 24-07-2023 22:01, Dixit, Ashutosh wrote: > > On Mon, 24 Jul 2023 09:05:53 -0700, Iddamsetty, Aravind wrote: > >> > >>>> On 22-07-2023 11:34, Dixit, Ashutosh wrote: > >>>>>> On Fri, 21 Jul 2023 16:36:02 -0700, Dixit, Ashutosh wrote: > >>>>>> On Fri, 21 Jul 2023 04:51:09 -0700, Iddamsetty, Aravind wrote: > >>>>>>>>> +void engine_group_busyness_store(struct xe_gt *gt) > >>>>>>>>> +{ > >>>>>>>>> + struct xe_pmu *pmu = >->tile->xe->pmu; > >>>>>>>>> + unsigned int gt_id = gt->info.id; > >>>>>>>>> + unsigned long flags; > >>>>>>>>> + > >>>>>>>>> + spin_lock_irqsave(&pmu->lock, flags); > >>>>>>>>> + > >>>>>>>>> + store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY, > >>>>>>>>> + __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0))); > >>>>>>>>> + store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY, > >>>>>>>>> + __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0))); > >>>>>>>>> + store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY, > >>>>>>>>> + __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0))); > >>>>>>>>> + store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY, > >>>>>>>>> + __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0))); > >>>>> > >>>>> Here why should we store everything, we should store only those events > >>>>> which are enabled? > >>>> > >>>> The events are enabled only when they are opened which can happen after > >>>> the device is suspended hence we need to store all. As in the present > >>>> case device is put to suspend immediately after probe and event is > >>>> opened post driver load is done. > >>> > >>> I don't think we can justify doing expensive PCIe reads and increasing the > >>> time to go into runtime suspend, when PMU might not being used at all. > >>> > >>> If we store only enabled samples and start storing them only after they are > >>> enabled, what would be the consequence of this? The first non-zero sample > >>> seen by the perf tool would be wrong and later samples will be fine? > >> > >> Why do you say it is wrong perf reports relative from the time an event > >> is opened. > > > > I am asking you what is the consequence. Initial values will all be zero > > and then there is some activity and we get a non zero value but this will > > include all the previous activity so the first difference we send to perf > > will be large/wrong I think. > > correct if we just store the enabled events in suspend, any other event > will have 0 initial value and when we read the register later it will > have all the accumulation and since past value we have is 0 we would end > up reporting the entire value which is wrong. Ok, agreed, so we need to do "something". > > > > >> > >>> > >>> If there is a consequence, we might have to go back to what I was saying > >>> earlier about waking the device up and reading the enabled counter when > >>> xe_pmu_event_start happens, to initialize the counter values. I am assuming > >>> this will work? > >> > >> xe_pmu_event_start can be called when device is in suspend so we shall > >> not wake up the device i.e event being enabled when in suspend, so if we > >> do not store while going to suspend we will not have any value to > >> consider when event is enabled after suspend as we need to present > >> relative value. > > > > That is why I am saying wake up the device and initialize the counters in > > xe_pmu_event_start. > > Afaik since PMU doesn't take DRM reference we shall not wake up the > device. Not sure what you mean because PMU does do this: drm_dev_get(&xe->drm); Anyway I don't think it has anything to do with waking up the device since that is done via xe_device_mem_access_get. > if we were allowed to wake up the device why do we even need to > store during suspend. when ever PMU event is opened we could wake up the > device and read the register directly. No. That is why we are saving the counters during suspend so we don't have to wake up the device just to read the counters. So the issue is only how to *initialize* the counters. You are saying we initialize by saving all counters during suspend, whether or not they are enabled, which I don't agree with. I am saying we should only read and store the counters which are enabled during normal operation. And to initialize we wake the device up during xe_pmu_event_start and store the counter value. Alternatively, we can zero out the enabled counters during xe_pmu_event_start (the counters are RW) but in any case that will also need waking up the device. So this way we only wake up the device for initialization but not afterwards. Since this is the "base" patch we should try to set up a good infrastructure in this patch so that other stuff which is exposed via PMU can be easily added later. > >>> > >>> Doing this IMO would be better than always doing these PCIe reads on > >>> runtime suspend even when PMU is not being used > >> > >> we have been doing these in i915 not sure if it affected any timing > >> requirements for runtime suspend. > > > > Hmm i915 indeed seems to be reading the RC6 residency in __gt_park even > > when RC6 event is not enabled or PMU might not be used. > > > > @Tvrtko, any comments here? > > > > Thanks. > > -- > > Ashutosh Thanks. -- Ashutosh