Intel-GFX Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
To: Jason Ekstrand <jason@jlekstrand.net>
Cc: "Intel GFX" <intel-gfx@lists.freedesktop.org>,
	"Thomas Hellström" <thomas.hellstrom@intel.com>
Subject: Re: [Intel-gfx] [PATCH v8 01/69] drm/i915: Do not share hwsp across contexts any more, v7.
Date: Mon, 15 Mar 2021 13:08:01 +0100	[thread overview]
Message-ID: <8a997e84-6677-038c-02f4-9bcac059fcc8@linux.intel.com> (raw)
In-Reply-To: <CAOFGe96XvGm8bi3Hse44zuva=Vqks9Ht3mV9MQ1PurY9RVRfvw@mail.gmail.com>

Op 2021-03-11 om 22:22 schreef Jason Ekstrand:
> First off, I'm just here asking questions right now trying to start
> getting my head around some of this stuff.  Feel free to ignore me or
> tell me to go away if I'm being annoying. :-)
>
> On Thu, Mar 11, 2021 at 7:49 AM Maarten Lankhorst
> <maarten.lankhorst@linux.intel.com> wrote:
>> Instead of sharing pages with breadcrumbs, give each timeline a
>> single page. This allows unrelated timelines not to share locks
>> any more during command submission.
>>
>> As an additional benefit, seqno wraparound no longer requires
>> i915_vma_pin, which means we no longer need to worry about a
>> potential -EDEADLK at a point where we are ready to submit.
>>
>> Changes since v1:
>> - Fix erroneous i915_vma_acquire that should be a i915_vma_release (ickle).
>> - Extra check for completion in intel_read_hwsp().
>> Changes since v2:
>> - Fix inconsistent indent in hwsp_alloc() (kbuild)
>> - memset entire cacheline to 0.
>> Changes since v3:
>> - Do same in intel_timeline_reset_seqno(), and clflush for good measure.
>> Changes since v4:
>> - Use refcounting on timeline, instead of relying on i915_active.
>> - Fix waiting on kernel requests.
>> Changes since v5:
>> - Bump amount of slots to maximum (256), for best wraparounds.
>> - Add hwsp_offset to i915_request to fix potential wraparound hang.
>> - Ensure timeline wrap test works with the changes.
>> - Assign hwsp in intel_timeline_read_hwsp() within the rcu lock to
>>   fix a hang.
>> Changes since v6:
>> - Rename i915_request_active_offset to i915_request_active_seqno(),
>>   and elaborate the function. (tvrtko)
>>
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> #v1
>> Reported-by: kernel test robot <lkp@intel.com>
>> ---
>>  drivers/gpu/drm/i915/gt/gen2_engine_cs.c      |   2 +-
>>  drivers/gpu/drm/i915/gt/gen6_engine_cs.c      |   8 +-
>>  drivers/gpu/drm/i915/gt/gen8_engine_cs.c      |  13 +-
>>  drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   1 +
>>  drivers/gpu/drm/i915/gt/intel_gt_types.h      |   4 -
>>  drivers/gpu/drm/i915/gt/intel_timeline.c      | 422 ++++--------------
>>  .../gpu/drm/i915/gt/intel_timeline_types.h    |  17 +-
>>  drivers/gpu/drm/i915/gt/selftest_engine_cs.c  |   5 +-
>>  drivers/gpu/drm/i915/gt/selftest_timeline.c   |  83 ++--
>>  drivers/gpu/drm/i915/i915_request.c           |   4 -
>>  drivers/gpu/drm/i915/i915_request.h           |  31 +-
>>  11 files changed, 175 insertions(+), 415 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/gen2_engine_cs.c b/drivers/gpu/drm/i915/gt/gen2_engine_cs.c
>> index b491a64919c8..9646200d2792 100644
>> --- a/drivers/gpu/drm/i915/gt/gen2_engine_cs.c
>> +++ b/drivers/gpu/drm/i915/gt/gen2_engine_cs.c
>> @@ -143,7 +143,7 @@ static u32 *__gen2_emit_breadcrumb(struct i915_request *rq, u32 *cs,
>>                                    int flush, int post)
>>  {
>>         GEM_BUG_ON(i915_request_active_timeline(rq)->hwsp_ggtt != rq->engine->status_page.vma);
>> -       GEM_BUG_ON(offset_in_page(i915_request_active_timeline(rq)->hwsp_offset) != I915_GEM_HWS_SEQNO_ADDR);
>> +       GEM_BUG_ON(offset_in_page(rq->hwsp_seqno) != I915_GEM_HWS_SEQNO_ADDR);
>>
>>         *cs++ = MI_FLUSH;
>>
>> diff --git a/drivers/gpu/drm/i915/gt/gen6_engine_cs.c b/drivers/gpu/drm/i915/gt/gen6_engine_cs.c
>> index ce38d1bcaba3..b388ceeeb1c9 100644
>> --- a/drivers/gpu/drm/i915/gt/gen6_engine_cs.c
>> +++ b/drivers/gpu/drm/i915/gt/gen6_engine_cs.c
>> @@ -161,7 +161,7 @@ u32 *gen6_emit_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
>>                  PIPE_CONTROL_DC_FLUSH_ENABLE |
>>                  PIPE_CONTROL_QW_WRITE |
>>                  PIPE_CONTROL_CS_STALL);
>> -       *cs++ = i915_request_active_timeline(rq)->hwsp_offset |
>> +       *cs++ = i915_request_active_seqno(rq) |
>>                 PIPE_CONTROL_GLOBAL_GTT;
>>         *cs++ = rq->fence.seqno;
>>
>> @@ -359,7 +359,7 @@ u32 *gen7_emit_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
>>                  PIPE_CONTROL_QW_WRITE |
>>                  PIPE_CONTROL_GLOBAL_GTT_IVB |
>>                  PIPE_CONTROL_CS_STALL);
>> -       *cs++ = i915_request_active_timeline(rq)->hwsp_offset;
>> +       *cs++ = i915_request_active_seqno(rq);
>>         *cs++ = rq->fence.seqno;
>>
>>         *cs++ = MI_USER_INTERRUPT;
>> @@ -374,7 +374,7 @@ u32 *gen7_emit_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
>>  u32 *gen6_emit_breadcrumb_xcs(struct i915_request *rq, u32 *cs)
>>  {
>>         GEM_BUG_ON(i915_request_active_timeline(rq)->hwsp_ggtt != rq->engine->status_page.vma);
>> -       GEM_BUG_ON(offset_in_page(i915_request_active_timeline(rq)->hwsp_offset) != I915_GEM_HWS_SEQNO_ADDR);
>> +       GEM_BUG_ON(offset_in_page(rq->hwsp_seqno) != I915_GEM_HWS_SEQNO_ADDR);
>>
>>         *cs++ = MI_FLUSH_DW | MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX;
>>         *cs++ = I915_GEM_HWS_SEQNO_ADDR | MI_FLUSH_DW_USE_GTT;
>> @@ -394,7 +394,7 @@ u32 *gen7_emit_breadcrumb_xcs(struct i915_request *rq, u32 *cs)
>>         int i;
>>
>>         GEM_BUG_ON(i915_request_active_timeline(rq)->hwsp_ggtt != rq->engine->status_page.vma);
>> -       GEM_BUG_ON(offset_in_page(i915_request_active_timeline(rq)->hwsp_offset) != I915_GEM_HWS_SEQNO_ADDR);
>> +       GEM_BUG_ON(offset_in_page(rq->hwsp_seqno) != I915_GEM_HWS_SEQNO_ADDR);
>>
>>         *cs++ = MI_FLUSH_DW | MI_INVALIDATE_TLB |
>>                 MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX;
>> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
>> index cac80af7ad1c..6b9c34d3ac8d 100644
>> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
>> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
>> @@ -338,15 +338,14 @@ static u32 preempt_address(struct intel_engine_cs *engine)
>>
>>  static u32 hwsp_offset(const struct i915_request *rq)
>>  {
>> -       const struct intel_timeline_cacheline *cl;
>> +       const struct intel_timeline *tl;
>>
>> -       /* Before the request is executed, the timeline/cachline is fixed */
>> +       /* Before the request is executed, the timeline is fixed */
>> +       tl = rcu_dereference_protected(rq->timeline,
>> +                                      !i915_request_signaled(rq));
> Why is Gen8+ different from Gen2/6 here?  In particular, why not use
> i915_request_active_timeline(rq) or, better yet,
> i915_request_active_seqno()?  The primary difference I see is that the
> guard on the RCU is different but it's not immediately obvious to me
> why this should be different between hardware generations.  Also,
> i915_request_active_seqno() returns a u32, but that could be fixed.

The legacy rings different locking, and it was splatting on PROVE_RCU.

I didn't want to weaken it, so I just put in a different condition for gen8 that's slightly weaker, but still better than the '1'.


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2021-03-15 12:08 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-11 13:41 [Intel-gfx] [PATCH v8 00/69] drm/i915: Remove obj->mm.lock! Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 01/69] drm/i915: Do not share hwsp across contexts any more, v7 Maarten Lankhorst
2021-03-11 21:22   ` Jason Ekstrand
2021-03-15 12:08     ` Maarten Lankhorst [this message]
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 02/69] drm/i915: Pin timeline map after first timeline pin, v3 Maarten Lankhorst
2021-03-11 21:44   ` Jason Ekstrand
2021-03-15 12:34     ` Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 03/69] drm/i915: Move cmd parser pinning to execbuffer Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 04/69] drm/i915: Add missing -EDEADLK handling to execbuf pinning, v2 Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 05/69] drm/i915: Ensure we hold the object mutex in pin correctly Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 06/69] drm/i915: Add gem object locking to madvise Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 07/69] drm/i915: Move HAS_STRUCT_PAGE to obj->flags Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 08/69] drm/i915: Rework struct phys attachment handling Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 09/69] drm/i915: Convert i915_gem_object_attach_phys() to ww locking, v2 Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 10/69] drm/i915: make lockdep slightly happier about execbuf Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 11/69] drm/i915: Disable userptr pread/pwrite support Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 12/69] drm/i915: No longer allow exporting userptr through dma-buf Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 13/69] drm/i915: Reject more ioctls for userptr, v2 Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 14/69] drm/i915: Reject UNSYNCHRONIZED " Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 15/69] drm/i915: Make compilation of userptr code depend on MMU_NOTIFIER Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 16/69] drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v7 Maarten Lankhorst
2021-03-11 17:24   ` Thomas Hellström (Intel)
2021-03-15 12:36     ` Maarten Lankhorst
2021-03-16  8:47       ` Thomas Hellström (Intel)
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 17/69] drm/i915: Flatten obj->mm.lock Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 18/69] drm/i915: Populate logical context during first pin Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 19/69] drm/i915: Make ring submission compatible with obj->mm.lock removal, v2 Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 20/69] drm/i915: Handle ww locking in init_status_page Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 21/69] drm/i915: Rework clflush to work correctly without obj->mm.lock Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 22/69] drm/i915: Pass ww ctx to intel_pin_to_display_plane Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 23/69] drm/i915: Add object locking to vm_fault_cpu Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 24/69] drm/i915: Move pinning to inside engine_wa_list_verify() Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 25/69] drm/i915: Take reservation lock around i915_vma_pin Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 26/69] drm/i915: Make lrc_init_wa_ctx compatible with ww locking, v3 Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 27/69] drm/i915: Make __engine_unpark() compatible with ww locking Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 28/69] drm/i915: Take obj lock around set_domain ioctl Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 29/69] drm/i915: Defer pin calls in buffer pool until first use by caller Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 30/69] drm/i915: Fix pread/pwrite to work with new locking rules Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 31/69] drm/i915: Fix workarounds selftest, part 1 Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 32/69] drm/i915: Prepare for obj->mm.lock removal, v2 Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 33/69] drm/i915: Add igt_spinner_pin() to allow for ww locking around spinner Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 34/69] drm/i915: Add ww locking around vm_access() Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 35/69] drm/i915: Increase ww locking for perf Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 36/69] drm/i915: Lock ww in ucode objects correctly Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 37/69] drm/i915: Add ww locking to dma-buf ops Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 38/69] drm/i915: Add missing ww lock in intel_dsb_prepare Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 39/69] drm/i915: Fix ww locking in shmem_create_from_object Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 40/69] drm/i915: Use a single page table lock for each gtt Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 41/69] drm/i915/selftests: Prepare huge_pages testcases for obj->mm.lock removal Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 42/69] drm/i915/selftests: Prepare client blit " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 43/69] drm/i915/selftests: Prepare coherency tests " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 44/69] drm/i915/selftests: Prepare context " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 45/69] drm/i915/selftests: Prepare dma-buf " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 46/69] drm/i915/selftests: Prepare execbuf " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 47/69] drm/i915/selftests: Prepare mman testcases " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 48/69] drm/i915/selftests: Prepare object tests " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 49/69] drm/i915/selftests: Prepare object blit " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 50/69] drm/i915/selftests: Prepare igt_gem_utils " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 51/69] drm/i915/selftests: Prepare context selftest " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 52/69] drm/i915/selftests: Prepare hangcheck " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 53/69] drm/i915/selftests: Prepare execlists and lrc selftests " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 54/69] drm/i915/selftests: Prepare mocs tests " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 55/69] drm/i915/selftests: Prepare ring submission " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 56/69] drm/i915/selftests: Prepare timeline tests " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 57/69] drm/i915/selftests: Prepare i915_request " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 58/69] drm/i915/selftests: Prepare memory region " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 59/69] drm/i915/selftests: Prepare cs engine " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 60/69] drm/i915/selftests: Prepare gtt " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 61/69] drm/i915: Finally remove obj->mm.lock Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 62/69] drm/i915: Keep userpointer bindings if seqcount is unchanged, v2 Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 63/69] drm/i915: Move gt_revoke() slightly Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 64/69] drm/i915: Add missing -EDEADLK path in execbuffer ggtt pinning Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 65/69] drm/i915: Fix pin_map in scheduler selftests Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 66/69] drm/i915: Add ww parameter to get_pages() callback Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 67/69] drm/i915: Add ww context to prepare_(read/write) Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 68/69] drm/i915: Pass ww ctx to pin_map Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 69/69] drm/i915: Pass ww ctx to i915_gem_object_pin_pages Maarten Lankhorst
2021-03-11 14:27 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Remove obj->mm.lock! (rev16) Patchwork
2021-03-11 14:28 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2021-03-11 14:32 ` [Intel-gfx] ✗ Fi.CI.DOCS: " Patchwork
2021-03-11 14:59 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2021-03-16  9:10 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for drm/i915: Remove obj->mm.lock! (rev17) Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8a997e84-6677-038c-02f4-9bcac059fcc8@linux.intel.com \
    --to=maarten.lankhorst@linux.intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=jason@jlekstrand.net \
    --cc=thomas.hellstrom@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox