From: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
To: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH 2/3] drm/i915/perf: Remove gtt_offset from stream->oa_buffer.head/.tail
Date: Tue, 12 Sep 2023 18:36:04 -0700 [thread overview]
Message-ID: <ZQESBLpba0nnK24G@unerlige-ril> (raw)
In-Reply-To: <20230909011626.1643734-3-ashutosh.dixit@intel.com>
On Fri, Sep 08, 2023 at 06:16:25PM -0700, Ashutosh Dixit wrote:
>There is no reason to add gtt_offset to the cached head/tail pointers
>stream->oa_buffer.head and stream->oa_buffer.tail. This causes the code to
>constantly add gtt_offset and subtract gtt_offset and is error
>prone (e.g. see previous patch).
>
>It is much simpler to maintain stream->oa_buffer.head and
>stream->oa_buffer.tail without adding gtt_offset to them and just allow for
>the gtt_offset when reading/writing from/to HW registers.
>
>Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
>---
> drivers/gpu/drm/i915/i915_perf.c | 53 ++++++++------------------------
> 1 file changed, 13 insertions(+), 40 deletions(-)
>
>diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>index ec0fc2934045a..1347e4ec9dd5a 100644
>--- a/drivers/gpu/drm/i915/i915_perf.c
>+++ b/drivers/gpu/drm/i915/i915_perf.c
>@@ -543,10 +543,9 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
> {
> u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma);
> int report_size = stream->oa_buffer.format->size;
>- u32 head, tail, read_tail;
>+ u32 tail, hw_tail;
> unsigned long flags;
> bool pollin;
>- u32 hw_tail;
> u32 partial_report_size;
>
> /* We have to consider the (unlikely) possibility that read() errors
>@@ -556,6 +555,7 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
> spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
>
> hw_tail = stream->perf->ops.oa_hw_tail_read(stream);
>+ hw_tail -= gtt_offset;
Since this patch intends to remove gtt_offset for all head/tail
calculations, we don't need patch 1/3. Patch 1 can be dropped.
>
> /* The tail pointer increases in 64 byte increments, not in report_size
> * steps. Also the report size may not be a power of 2. Compute
>@@ -565,16 +565,8 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
> partial_report_size %= report_size;
>
> /* Subtract partial amount off the tail */
>- hw_tail -= gtt_offset;
> hw_tail = OA_TAKEN(hw_tail, partial_report_size);
>
>- /* NB: The head we observe here might effectively be a little
>- * out of date. If a read() is in progress, the head could be
>- * anywhere between this head and stream->oa_buffer.tail.
>- */
>- head = stream->oa_buffer.head - gtt_offset;
>- read_tail = stream->oa_buffer.tail - gtt_offset;
>-
> tail = hw_tail;
>
> /* Walk the stream backward until we find a report with report
>@@ -588,7 +580,7 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
> * memory in the order they were written to.
> * If not : (╯°□°)╯︵ ┻━┻
> */
>- while (OA_TAKEN(tail, read_tail) >= report_size) {
>+ while (OA_TAKEN(tail, stream->oa_buffer.tail) >= report_size) {
> void *report = stream->oa_buffer.vaddr + tail;
>
> if (oa_report_id(stream, report) ||
>@@ -602,9 +594,9 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
> __ratelimit(&stream->perf->tail_pointer_race))
> drm_notice(&stream->uncore->i915->drm,
> "unlanded report(s) head=0x%x tail=0x%x hw_tail=0x%x\n",
>- head, tail, hw_tail);
>+ stream->oa_buffer.head, tail, hw_tail);
>
>- stream->oa_buffer.tail = gtt_offset + tail;
>+ stream->oa_buffer.tail = tail;
>
> pollin = OA_TAKEN(stream->oa_buffer.tail,
> stream->oa_buffer.head) >= report_size;
>@@ -754,13 +746,6 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
>
> spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags);
>
>- /*
>- * NB: oa_buffer.head/tail include the gtt_offset which we don't want
>- * while indexing relative to oa_buf_base.
>- */
>- head -= gtt_offset;
>- tail -= gtt_offset;
>-
> /*
> * An out of bounds or misaligned head or tail pointer implies a driver
> * bug since we validate + align the tail pointers we read from the
>@@ -896,9 +881,8 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
> * We removed the gtt_offset for the copy loop above, indexing
> * relative to oa_buf_base so put back here...
> */
>- head += gtt_offset;
> intel_uncore_write(uncore, oaheadptr,
>- head & GEN12_OAG_OAHEADPTR_MASK);
>+ (head + gtt_offset) & GEN12_OAG_OAHEADPTR_MASK);
> stream->oa_buffer.head = head;
>
> spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags);
>@@ -1043,12 +1027,6 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream,
>
> spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags);
>
>- /* NB: oa_buffer.head/tail include the gtt_offset which we don't want
>- * while indexing relative to oa_buf_base.
>- */
>- head -= gtt_offset;
>- tail -= gtt_offset;
>-
> /* An out of bounds or misaligned head or tail pointer implies a driver
> * bug since we validate + align the tail pointers we read from the
> * hardware and we are in full control of the head pointer which should
>@@ -1111,13 +1089,8 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream,
> if (start_offset != *offset) {
> spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
>
>- /* We removed the gtt_offset for the copy loop above, indexing
>- * relative to oa_buf_base so put back here...
>- */
>- head += gtt_offset;
>-
> intel_uncore_write(uncore, GEN7_OASTATUS2,
>- (head & GEN7_OASTATUS2_HEAD_MASK) |
>+ ((head + gtt_offset) & GEN7_OASTATUS2_HEAD_MASK) |
> GEN7_OASTATUS2_MEM_SELECT_GGTT);
> stream->oa_buffer.head = head;
>
>@@ -1705,7 +1678,7 @@ static void gen7_init_oa_buffer(struct i915_perf_stream *stream)
> */
> intel_uncore_write(uncore, GEN7_OASTATUS2, /* head */
> gtt_offset | GEN7_OASTATUS2_MEM_SELECT_GGTT);
>- stream->oa_buffer.head = gtt_offset;
>+ stream->oa_buffer.head = 0;
>
> intel_uncore_write(uncore, GEN7_OABUFFER, gtt_offset);
>
>@@ -1713,7 +1686,7 @@ static void gen7_init_oa_buffer(struct i915_perf_stream *stream)
> gtt_offset | OABUFFER_SIZE_16M);
>
> /* Mark that we need updated tail pointers to read from... */
>- stream->oa_buffer.tail = gtt_offset;
>+ stream->oa_buffer.tail = 0;
>
> spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags);
>
>@@ -1747,7 +1720,7 @@ static void gen8_init_oa_buffer(struct i915_perf_stream *stream)
>
> intel_uncore_write(uncore, GEN8_OASTATUS, 0);
> intel_uncore_write(uncore, GEN8_OAHEADPTR, gtt_offset);
>- stream->oa_buffer.head = gtt_offset;
>+ stream->oa_buffer.head = 0;
>
> intel_uncore_write(uncore, GEN8_OABUFFER_UDW, 0);
>
>@@ -1764,7 +1737,7 @@ static void gen8_init_oa_buffer(struct i915_perf_stream *stream)
> intel_uncore_write(uncore, GEN8_OATAILPTR, gtt_offset & GEN8_OATAILPTR_MASK);
>
> /* Mark that we need updated tail pointers to read from... */
>- stream->oa_buffer.tail = gtt_offset;
>+ stream->oa_buffer.tail = 0;
>
> /*
> * Reset state used to recognise context switches, affecting which
>@@ -1801,7 +1774,7 @@ static void gen12_init_oa_buffer(struct i915_perf_stream *stream)
> intel_uncore_write(uncore, __oa_regs(stream)->oa_status, 0);
> intel_uncore_write(uncore, __oa_regs(stream)->oa_head_ptr,
> gtt_offset & GEN12_OAG_OAHEADPTR_MASK);
>- stream->oa_buffer.head = gtt_offset;
>+ stream->oa_buffer.head = 0;
>
> /*
> * PRM says:
>@@ -1817,7 +1790,7 @@ static void gen12_init_oa_buffer(struct i915_perf_stream *stream)
> gtt_offset & GEN12_OAG_OATAILPTR_MASK);
>
> /* Mark that we need updated tail pointers to read from... */
>- stream->oa_buffer.tail = gtt_offset;
>+ stream->oa_buffer.tail = 0;
>
Looks correct.
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Thanks,
Umesh
> /*
> * Reset state used to recognise context switches, affecting which
>--
>2.41.0
>
next prev parent reply other threads:[~2023-09-13 1:36 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-09 1:16 [Intel-gfx] [PATCH 0/3] drm/i915/perf: A few fixes and enhancements Ashutosh Dixit
2023-09-09 1:16 ` [Intel-gfx] [PATCH 1/3] drm/i915/perf: Subtract gtt_offset from hw_tail Ashutosh Dixit
2023-09-13 1:25 ` Umesh Nerlige Ramappa
2023-09-13 2:20 ` Dixit, Ashutosh
2023-09-09 1:16 ` [Intel-gfx] [PATCH 2/3] drm/i915/perf: Remove gtt_offset from stream->oa_buffer.head/.tail Ashutosh Dixit
2023-09-13 1:36 ` Umesh Nerlige Ramappa [this message]
2023-09-09 1:16 ` [Intel-gfx] [PATCH 3/3] drm/i915/perf: Initialize gen12 OA buffer unconditionally Ashutosh Dixit
2023-09-09 1:24 ` Dixit, Ashutosh
2023-09-13 1:46 ` Umesh Nerlige Ramappa
2023-09-13 16:59 ` Dixit, Ashutosh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZQESBLpba0nnK24G@unerlige-ril \
--to=umesh.nerlige.ramappa@intel.com \
--cc=ashutosh.dixit@intel.com \
--cc=intel-gfx@lists.freedesktop.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox