From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 155C1ECAAD3 for ; Wed, 14 Sep 2022 16:04:17 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E613410E963; Wed, 14 Sep 2022 16:04:15 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id CFB8610E963 for ; Wed, 14 Sep 2022 16:04:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1663171451; x=1694707451; h=date:message-id:from:to:cc:subject:in-reply-to: references:mime-version:content-transfer-encoding; bh=+DhREHb6PbUyJ/W0rMO3iZCkRTQXrPf6ibXMF5B8lBQ=; b=PfrMPSnEs6UgrV1a0/nHRTlyCn34MXpq891kv5NJKoi6Cz9jCIPVRwbk j2IdEES0YJzl0XXqloBgLISbFNQ7cnF7N8PeRHX6p7mzyeux5b6tgCFxs w5t0M5HaW7jaz4stS+rsYdaXiGVx6fC6rmexkWmh5JG4IsDktiK/++BOR bXARAJaux5HXwbXpGb+pkQoW5CeqPpzPDN8lsghiYh2t7JXMIxLRrfUfv a1fow6xMIgJwJsUNy1ttua4VXjNX6is1zI2dVO9WJC6i/tHBWHJsbV0RQ Y2oOcj9s5ymIWqmjDQ1V583j3XjIGhAJNSbC/iN9h32C1bnPzWITEpy4A Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10470"; a="298466004" X-IronPort-AV: E=Sophos;i="5.93,315,1654585200"; d="scan'208";a="298466004" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2022 09:04:11 -0700 X-IronPort-AV: E=Sophos;i="5.93,315,1654585200"; d="scan'208";a="945564679" Received: from adixit-mobl.amr.corp.intel.com (HELO adixit-arch.intel.com) ([10.209.38.224]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2022 09:04:10 -0700 Date: Wed, 14 Sep 2022 09:04:10 -0700 Message-ID: <87mtb2djt1.wl-ashutosh.dixit@intel.com> From: "Dixit, Ashutosh" To: Umesh Nerlige Ramappa In-Reply-To: <20220823204155.8178-7-umesh.nerlige.ramappa@intel.com> References: <20220823204155.8178-1-umesh.nerlige.ramappa@intel.com> <20220823204155.8178-7-umesh.nerlige.ramappa@intel.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/28.1 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Subject: Re: [Intel-gfx] [PATCH 06/19] drm/i915/perf: Use helpers to process reports w.r.t. OA buffer size X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: intel-gfx@lists.freedesktop.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" On Tue, 23 Aug 2022 13:41:42 -0700, Umesh Nerlige Ramappa wrote: > Hi Umesh, > diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915= _perf.c > index 6fc4f0d8fc5a..bbf1c574f393 100644 > --- a/drivers/gpu/drm/i915/i915_perf.c > +++ b/drivers/gpu/drm/i915/i915_perf.c > @@ -385,6 +385,21 @@ static struct ctl_table_header *sysctl_header; > > static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtim= er); > > +static u32 _oa_taken(struct i915_perf_stream * stream, u32 tail, u32 hea= d) nit: no space between * and stream. > +{ > + u32 size =3D stream->oa_buffer.vma->size; > + > + return tail >=3D head ? tail - head : size - (head - tail); > +} If we are doing this we should probably eliminate references to OA_TAKEN which serves an identical purpose (I think there is one remaining reference) and also delete OA_TAKEN #define. > + > +static u32 _rewind_tail(struct i915_perf_stream * stream, u32 relative_h= w_tail, > + u32 rewind_delta) > +{ > + return rewind_delta > relative_hw_tail ? > + stream->oa_buffer.vma->size - (rewind_delta - relative_hw_tail) : > + relative_hw_tail - rewind_delta; > +} Also are we really saying here that we are supporting non-power-of-2 OA buffer sizes? Because if we stayed with power-of-2 sizes the expression above are nice and elegant and actually closer to the previous code being changed in this patch. For example: #include static u32 _oa_taken(struct i915_perf_stream *stream, u32 tail, u32 head) { return CIRC_CNT(tail, head, stream->oa_buffer.vma->size); } static u32 _rewind_tail(struct i915_perf_stream *stream, u32 relative_hw_ta= il, u32 rewind_delta) { return CIRC_CNT(relative_hw_tail, rewind_delta, stream->oa_buffer.vma->siz= e); } Note that for power-of-2 sizes the two functions above are identical but we should keep them separate for clarity (as is done in the patch) since they are serving two different functions in the OA code. Also another assumption in the code seems to be: stream->oa_buffer.vma->size =3D=3D OA_BUFFER_SIZE which I am pretty sure will not hold for arbitrary non-power-of-2 OA buffer sizes? So we might as well stick with power-of-2 sizes and change later in a separate patch only if needed? Thanks. -- Ashutosh > + > void i915_oa_config_release(struct kref *ref) > { > struct i915_oa_config *oa_config =3D > @@ -487,12 +502,14 @@ static bool oa_buffer_check_unlocked(struct i915_pe= rf_stream *stream) > * sizes need not be integral multiples or 64 or powers of 2. > * Compute potentially partially landed report in the OA buffer > */ > - partial_report_size =3D OA_TAKEN(hw_tail, stream->oa_buffer.tail); > + partial_report_size =3D > + _oa_taken(stream, hw_tail, stream->oa_buffer.tail); > partial_report_size %=3D report_size; > > /* Subtract partial amount off the tail */ > - hw_tail =3D gtt_offset + ((hw_tail - partial_report_size) & > - (stream->oa_buffer.vma->size - 1)); > + hw_tail =3D gtt_offset + _rewind_tail(stream, > + hw_tail - gtt_offset, > + partial_report_size); > > now =3D ktime_get_mono_fast_ns(); > > @@ -527,16 +544,16 @@ static bool oa_buffer_check_unlocked(struct i915_pe= rf_stream *stream) > * memory in the order they were written to. > * If not : (=E2=95=AF=C2=B0=E2=96=A1=C2=B0=EF=BC=89=E2=95=AF=EF=B8=B5 = =E2=94=BB=E2=94=81=E2=94=BB > */ > - while (OA_TAKEN(tail, aged_tail) >=3D report_size) { > + while (_oa_taken(stream, tail, aged_tail) >=3D report_size) { > u32 *report32 =3D (void *)(stream->oa_buffer.vaddr + tail); > > if (report32[0] !=3D 0 || report32[1] !=3D 0) > break; > > - tail =3D (tail - report_size) & (OA_BUFFER_SIZE - 1); > + tail =3D _rewind_tail(stream, tail, report_size); > } > > - if (OA_TAKEN(hw_tail, tail) > report_size && > + if (_oa_taken(stream, hw_tail, tail) > report_size && > __ratelimit(&stream->perf->tail_pointer_race)) > DRM_NOTE("unlanded report(s) head=3D0x%x " > "tail=3D0x%x hw_tail=3D0x%x\n", > @@ -547,8 +564,9 @@ static bool oa_buffer_check_unlocked(struct i915_perf= _stream *stream) > stream->oa_buffer.aging_timestamp =3D now; > } > > - pollin =3D OA_TAKEN(stream->oa_buffer.tail - gtt_offset, > - stream->oa_buffer.head - gtt_offset) >=3D report_size; > + pollin =3D _oa_taken(stream, > + stream->oa_buffer.tail, > + stream->oa_buffer.head) >=3D report_size; > > spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags); > > @@ -679,11 +697,9 @@ static int gen8_append_oa_reports(struct i915_perf_s= tream *stream, > int report_size =3D stream->oa_buffer.format_size; > u8 *oa_buf_base =3D stream->oa_buffer.vaddr; > u32 gtt_offset =3D i915_ggtt_offset(stream->oa_buffer.vma); > - u32 mask =3D (OA_BUFFER_SIZE - 1); > size_t start_offset =3D *offset; > unsigned long flags; > - u32 head, tail; > - u32 taken; > + u32 head, tail, size; > int ret =3D 0; > > if (drm_WARN_ON(&uncore->i915->drm, !stream->enabled)) > @@ -693,6 +709,7 @@ static int gen8_append_oa_reports(struct i915_perf_st= ream *stream, > > head =3D stream->oa_buffer.head; > tail =3D stream->oa_buffer.tail; > + size =3D stream->oa_buffer.vma->size; > > spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags); > > @@ -711,16 +728,15 @@ static int gen8_append_oa_reports(struct i915_perf_= stream *stream, > * all a power of two). > */ > if (drm_WARN_ONCE(&uncore->i915->drm, > - head > stream->oa_buffer.vma->size || > - tail > stream->oa_buffer.vma->size, > + head > size || tail > size, > "Inconsistent OA buffer pointers: head =3D %u, tail =3D %u\n", > head, tail)) > return -EIO; > > > for (/* none */; > - (taken =3D OA_TAKEN(tail, head)); > - head =3D (head + report_size) & mask) { > + _oa_taken(stream, tail, head); > + head =3D (head + report_size) % size) { > u8 *report =3D oa_buf_base + head; > u32 *report32 =3D (void *)report; > u32 ctx_id; > -- > 2.25.1 >