From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 20/27] drm/i915: Remove logical HW ID
Date: Wed, 25 Sep 2019 15:38:59 +0100 [thread overview]
Message-ID: <a32e4f08-eb84-7807-5aa7-e2e4e3b5fdf8@linux.intel.com> (raw)
In-Reply-To: <156941589825.4979.9378610445541130440@skylake-alporthouse-com>
On 25/09/2019 13:51, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-09-25 13:41:10)
>> [+ Daniele, I think he might want to have a look at this.]
>>
>> On 25/09/2019 11:01, Chris Wilson wrote:
>>> With the introduction of ctx->engines[] we allow multiple logical
>>> contexts to be used on the same engine (e.g. with virtual engines). Each
>>> logical context requires a unique tag in order for context-switching to
>>> occur correctly between them.
>>>
>>> We only need to keep a unique tag for the active lifetime of the
>>> context, and for as long as we need to identify that context. The HW
>>> uses the tag to determine if it should use a lite-restore (why not the
>>> LRCA?) and passes the tag back for various status identifies. The only
>>> status we need to track is for OA, so when using perf, we assign the
>>> specific context a unique tag.
>>>
>>> Fixes: 976b55f0e1db ("drm/i915: Allow a context to define its set of engines")
>>
>> Fixes? Why?
>
> The above paragraph explains that as we give two distinct contexts the
> same tag then it will only perform (according to bspec) a lite-restore.
Right I see. But not with virtual engines per se, but even if someone
created an engine map with two same engines in it. Virtual engines are
then a super set of this problem.
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
>>> index da4b45aa84b1..7bce176e696c 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
>>> @@ -296,10 +296,12 @@ struct intel_engine_cs {
>>> u8 uabi_class;
>>> u8 uabi_instance;
>>>
>>> + u32 uabi_capabilities;
>>> u32 context_size;
>>> u32 mmio_base;
>>>
>>> - u32 uabi_capabilities;
>>> + unsigned int context_tag;
>>> +#define NUM_CONTEXT_TAG (256)
>>
>> AFAICT this define now defines how deep ELSP we can have before we start
>> getting hard to debug problems. GuC angle needs considering here.
>
> The guc is immaterial here, since they use their own mechanism. It just
> needs to be a value larger than 2 * EXECLIST_MAX_PORTS and less then
> BIT(10).
I don't know how GuC (will) work so hopefully Daniele can confirm or
deny this won't be a problem there.
>
>>> @@ -978,6 +965,15 @@ __execlists_schedule_in(struct i915_request *rq)
>>>
>>> intel_context_get(ce);
>>>
>>> + if (ce->tag) {
>>> + ce->lrc_desc |= (u64)ce->tag << 32;
>>> + } else {
>>> + ce->lrc_desc &= ~GENMASK_ULL(47, 37);
>>> + ce->lrc_desc |=
>>> + (u64)(engine->context_tag++ % NUM_CONTEXT_TAG) <<
>>> + GEN11_SW_CTX_ID_SHIFT;
>>> + }
>>
>> So hw id is valid only while context is in ELSP and it changes with
>> every submission except in the OA case?
>
> Aye. OA being a pita.
>
>> Shifts and masks look dodgy. For >= gen11 the current code has the hw_id
>> in [37, 42], otherwise [32, 52]. This patch does not seem to handle the
>> differences between gens.
>
> Because the values we supply do not matter (they are cookies), just the
> lifetime.
Okay so you are choosing to not use some of the available bits since we
don't need all 20 (pre gen11), or even 10 (gen11+). This definitely
deserves a comment. And another assert with NUM_CONTEXT_TAG relating to
available bits.
>
>>> +
>>> intel_gt_pm_get(engine->gt);
>>> execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
>>> intel_engine_context_in(engine);
>>> @@ -2224,7 +2220,6 @@ static void execlists_context_unpin(struct intel_context *ce)
>>> check_redzone((void *)ce->lrc_reg_state - LRC_STATE_PN * PAGE_SIZE,
>>> ce->engine);
>>>
>>> - i915_gem_context_unpin_hw_id(ce->gem_context);
>>> i915_gem_object_unpin_map(ce->state->obj);
>>> intel_ring_reset(ce->ring, ce->ring->tail);
>>> }
>>> @@ -2274,18 +2269,12 @@ __execlists_context_pin(struct intel_context *ce,
>>> goto unpin_active;
>>> }
>>>
>>> - ret = i915_gem_context_pin_hw_id(ce->gem_context);
>>> - if (ret)
>>> - goto unpin_map;
>>> -
>>> ce->lrc_desc = lrc_descriptor(ce, engine);
>>> ce->lrc_reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE;
>>> __execlists_update_reg_state(ce, engine);
>>>
>>> return 0;
>>>
>>> -unpin_map:
>>> - i915_gem_object_unpin_map(ce->state->obj);
>>> unpin_active:
>>> intel_context_active_release(ce);
>>> err:
>>> diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
>>> index 343d79c1cb7e..04a5a0d90823 100644
>>> --- a/drivers/gpu/drm/i915/gvt/kvmgt.c
>>> +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
>>> @@ -1564,27 +1564,10 @@ vgpu_id_show(struct device *dev, struct device_attribute *attr,
>>> return sprintf(buf, "\n");
>>> }
>>>
>>> -static ssize_t
>>> -hw_id_show(struct device *dev, struct device_attribute *attr,
>>> - char *buf)
>>> -{
>>> - struct mdev_device *mdev = mdev_from_dev(dev);
>>> -
>>> - if (mdev) {
>>> - struct intel_vgpu *vgpu = (struct intel_vgpu *)
>>> - mdev_get_drvdata(mdev);
>>> - return sprintf(buf, "%u\n",
>>> - vgpu->submission.shadow[0]->gem_context->hw_id);
>>> - }
>>> - return sprintf(buf, "\n");
>>> -}
>>> -
>>> static DEVICE_ATTR_RO(vgpu_id);
>>> -static DEVICE_ATTR_RO(hw_id);
>>>
>>> static struct attribute *intel_vgpu_attrs[] = {
>>> &dev_attr_vgpu_id.attr,
>>> - &dev_attr_hw_id.attr,
>>> NULL
>>> };
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>>> index fd41505d52ec..8caaa446490f 100644
>>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>>> @@ -1504,9 +1504,6 @@ static int i915_context_status(struct seq_file *m, void *unused)
>>> struct intel_context *ce;
>>>
>>> seq_puts(m, "HW context ");
>>> - if (!list_empty(&ctx->hw_id_link))
>>> - seq_printf(m, "%x [pin %u]", ctx->hw_id,
>>> - atomic_read(&ctx->hw_id_pin_count));
>>> if (ctx->pid) {
>>> struct task_struct *task;
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
>>> index a86d28bda6dd..47239df653f2 100644
>>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
>>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
>>> @@ -471,9 +471,9 @@ static void error_print_context(struct drm_i915_error_state_buf *m,
>>> const char *header,
>>> const struct drm_i915_error_context *ctx)
>>> {
>>> - err_printf(m, "%s%s[%d] hw_id %d, prio %d, guilty %d active %d\n",
>>> - header, ctx->comm, ctx->pid, ctx->hw_id,
>>> - ctx->sched_attr.priority, ctx->guilty, ctx->active);
>>> + err_printf(m, "%s%s[%d] prio %d, guilty %d active %d\n",
>>> + header, ctx->comm, ctx->pid, ctx->sched_attr.priority,
>>> + ctx->guilty, ctx->active);
>>> }
>>>
>>> static void error_print_engine(struct drm_i915_error_state_buf *m,
>>> @@ -1262,7 +1262,6 @@ static bool record_context(struct drm_i915_error_context *e,
>>> rcu_read_unlock();
>>> }
>>>
>>> - e->hw_id = ctx->hw_id;
>>> e->sched_attr = ctx->sched;
>>> e->guilty = atomic_read(&ctx->guilty_count);
>>> e->active = atomic_read(&ctx->active_count);
>>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h
>>> index caaed5093d95..4dc36d6ee3a2 100644
>>> --- a/drivers/gpu/drm/i915/i915_gpu_error.h
>>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.h
>>> @@ -117,7 +117,6 @@ struct i915_gpu_state {
>>> struct drm_i915_error_context {
>>> char comm[TASK_COMM_LEN];
>>> pid_t pid;
>>> - u32 hw_id;
>>> int active;
>>> int guilty;
>>> struct i915_sched_attr sched_attr;
>>> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>>> index 80055501eccb..d36ba248d438 100644
>>> --- a/drivers/gpu/drm/i915/i915_perf.c
>>> +++ b/drivers/gpu/drm/i915/i915_perf.c
>>> @@ -1283,22 +1283,15 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
>>> } else {
>>> stream->specific_ctx_id_mask =
>>> (1U << GEN8_CTX_ID_WIDTH) - 1;
>>> - stream->specific_ctx_id =
>>> - upper_32_bits(ce->lrc_desc);
>>> - stream->specific_ctx_id &=
>>> - stream->specific_ctx_id_mask;
>>> + stream->specific_ctx_id = stream->specific_ctx_id_mask;
>>> }
>>> break;
>>>
>>> case 11:
>>> case 12: {
>>> stream->specific_ctx_id_mask =
>>> - ((1U << GEN11_SW_CTX_ID_WIDTH) - 1) << (GEN11_SW_CTX_ID_SHIFT - 32) |
>>> - ((1U << GEN11_ENGINE_INSTANCE_WIDTH) - 1) << (GEN11_ENGINE_INSTANCE_SHIFT - 32) |
>>> - ((1 << GEN11_ENGINE_CLASS_WIDTH) - 1) << (GEN11_ENGINE_CLASS_SHIFT - 32);
>>> - stream->specific_ctx_id = upper_32_bits(ce->lrc_desc);
>>> - stream->specific_ctx_id &=
>>> - stream->specific_ctx_id_mask;
>>> + ((1U << GEN11_SW_CTX_ID_WIDTH) - 1) << (GEN11_SW_CTX_ID_SHIFT - 32);
>>> + stream->specific_ctx_id = stream->specific_ctx_id_mask;
>>> break;
>>> }
>>>
>>> @@ -1306,6 +1299,8 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
>>> MISSING_CASE(INTEL_GEN(i915));
>>> }
>>>
>>> + ce->tag = stream->specific_ctx_id_mask;
>>> +
>>> DRM_DEBUG_DRIVER("filtering on ctx_id=0x%x ctx_id_mask=0x%x\n",
>>> stream->specific_ctx_id,
>>> stream->specific_ctx_id_mask);
>>> @@ -1324,12 +1319,14 @@ static void oa_put_render_ctx_id(struct i915_perf_stream *stream)
>>> {
>>> struct intel_context *ce;
>>>
>>> - stream->specific_ctx_id = INVALID_CTX_ID;
>>> - stream->specific_ctx_id_mask = 0;
>>> -
>>> ce = fetch_and_zero(&stream->pinned_ctx);
>>> - if (ce)
>>> + if (ce) {
>>> + ce->tag = 0;
>>> intel_context_unpin(ce);
>>> + }
>>> +
>>> + stream->specific_ctx_id = INVALID_CTX_ID;
>>> + stream->specific_ctx_id_mask = 0;
>>
>> OA hunks looks dodgy as well. ce->tag is set to
>> stream->specific_ctx_id_mask while I think it should be id. Furthermore
>> id is assigned from mask which is fixed and not unique for different
>> contexts.
>
> It is singular in the entire system. The mask is a guaranteed unique
> value.
What is singular? The OA context and it's id? I assumed what needs to be
done is fix hw_id for _all_ contexts while OA is sampling. But on a more
detailed look this assumption does not seem to hold in code. Okay, I
declare to not know how OA works well enough.
>
>>> diff --git a/drivers/gpu/drm/i915/selftests/i915_vma.c b/drivers/gpu/drm/i915/selftests/i915_vma.c
>>> index 1c9db08f7c28..ac1ff558eb90 100644
>>> --- a/drivers/gpu/drm/i915/selftests/i915_vma.c
>>> +++ b/drivers/gpu/drm/i915/selftests/i915_vma.c
>>> @@ -170,7 +170,7 @@ static int igt_vma_create(void *arg)
>>> }
>>>
>>> nc = 0;
>>> - for_each_prime_number(num_ctx, MAX_CONTEXT_HW_ID) {
>>> + for_each_prime_number(num_ctx, 2 * NUM_CONTEXT_TAG) {
>>> for (; nc < num_ctx; nc++) {
>>> ctx = mock_context(i915, "mock");
>>> if (!ctx)
>>>
>>
>> Unless I am missing something I see this patch as simplification and not
>> a fix. If we can get away with it in the GuC world then it's quite a big
>> simplification so it's fine by me.
>
> Bspec says that for lrc_desc with matching tags (at least for gen8-gen11
> I believe), it performs a lite restore (regardless of lrca). Ergo it is
> quite a major fix.
Yes I see that now, my bad. This one will not lend itself well to
backporting.. But I don't think there is a lighter-weight solution.
Moving hw_id to intel_context would also end up large-ish I think.
Because of the locking requirement etc.
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2019-09-25 14:39 UTC|newest]
Thread overview: 70+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-09-25 10:01 struct_mutex is over the hill and far away Chris Wilson
2019-09-25 10:01 ` [PATCH 01/27] dma-fence: Serialise signal enabling (dma_fence_enable_sw_signaling) Chris Wilson
2019-09-25 10:01 ` [PATCH 02/27] drm/mm: Pack allocated/scanned boolean into a bitfield Chris Wilson
2019-09-25 10:01 ` [PATCH 03/27] drm/i915: Only track bound elements of the GTT Chris Wilson
2019-09-25 10:01 ` [PATCH 04/27] drm/i915: Mark up address spaces that may need to allocate Chris Wilson
2019-09-25 10:01 ` [PATCH 05/27] drm/i915: Pull i915_vma_pin under the vm->mutex Chris Wilson
2019-09-27 8:47 ` Tvrtko Ursulin
2019-09-27 11:06 ` Chris Wilson
2019-09-25 10:01 ` [PATCH 06/27] drm/i915: Push the i915_active.retire into a worker Chris Wilson
2019-09-25 10:01 ` [PATCH 07/27] drm/i915: Coordinate i915_active with its own mutex Chris Wilson
2019-09-27 11:10 ` Tvrtko Ursulin
2019-09-27 11:25 ` Chris Wilson
2019-09-27 12:08 ` Tvrtko Ursulin
2019-09-27 12:16 ` Chris Wilson
2019-09-27 12:25 ` Tvrtko Ursulin
2019-09-27 12:32 ` Chris Wilson
2019-09-27 13:58 ` Tvrtko Ursulin
2019-09-27 14:10 ` Chris Wilson
2019-09-25 10:01 ` [PATCH 08/27] drm/i915: Move idle barrier cleanup into engine-pm Chris Wilson
2019-09-25 10:01 ` [PATCH 09/27] drm/i915: Drop struct_mutex from around i915_retire_requests() Chris Wilson
2019-09-25 10:01 ` [PATCH 10/27] drm/i915: Remove the GEM idle worker Chris Wilson
2019-09-25 10:01 ` [PATCH 11/27] drm/i915: Merge wait_for_timelines with retire_request Chris Wilson
2019-09-25 10:47 ` Tvrtko Ursulin
2019-09-25 10:54 ` Chris Wilson
2019-09-25 10:01 ` [PATCH 12/27] drm/i915: Move request runtime management onto gt Chris Wilson
2019-09-25 10:57 ` Tvrtko Ursulin
2019-09-25 11:17 ` Chris Wilson
2019-09-25 11:24 ` Chris Wilson
2019-09-25 11:29 ` Chris Wilson
2019-09-25 11:33 ` Chris Wilson
2019-09-25 15:17 ` Tvrtko Ursulin
2019-09-25 10:01 ` [PATCH 13/27] drm/i915: Move global activity tracking from GEM to GT Chris Wilson
2019-09-25 10:01 ` [PATCH 14/27] drm/i915: Expose engine properties via sysfs Chris Wilson
2019-09-27 20:48 ` Rodrigo Vivi
2019-09-25 10:01 ` [PATCH 15/27] drm/i915/execlists: Force preemption Chris Wilson
2019-09-25 10:01 ` [PATCH 16/27] drm/i915: Mark up "sentinel" requests Chris Wilson
2019-09-25 10:01 ` [PATCH 17/27] drm/i915/execlists: Cancel banned contexts on schedule-out Chris Wilson
2019-09-25 10:01 ` [PATCH 18/27] drm/i915: Cancel non-persistent contexts on close Chris Wilson
2019-09-25 10:01 ` [PATCH 19/27] drm/i915: Replace hangcheck by heartbeats Chris Wilson
2019-09-27 8:26 ` Joonas Lahtinen
2019-09-27 9:18 ` Chris Wilson
2019-09-25 10:01 ` [PATCH 20/27] drm/i915: Remove logical HW ID Chris Wilson
2019-09-25 12:41 ` Tvrtko Ursulin
2019-09-25 12:51 ` Chris Wilson
2019-09-25 14:38 ` Tvrtko Ursulin [this message]
2019-09-25 17:59 ` Daniele Ceraolo Spurio
2019-09-25 18:23 ` Matthew Brost
2019-09-25 10:01 ` [PATCH 21/27] drm/i915: Move context management under GEM Chris Wilson
2019-09-26 13:57 ` Tvrtko Ursulin
2019-10-02 16:09 ` Tvrtko Ursulin
2019-10-03 7:35 ` Chris Wilson
2019-09-25 10:01 ` [PATCH 22/27] drm/i915/overlay: Drop struct_mutex guard Chris Wilson
2019-09-25 12:43 ` Tvrtko Ursulin
2019-09-25 12:53 ` Chris Wilson
2019-09-25 13:01 ` Tvrtko Ursulin
2019-09-25 13:11 ` Chris Wilson
2019-09-25 10:01 ` [PATCH 23/27] drm/i915: Drop struct_mutex guard from debugfs/framebuffer_info Chris Wilson
2019-09-25 12:45 ` Tvrtko Ursulin
2019-09-25 10:01 ` [PATCH 24/27] drm/i915: Remove struct_mutex guard for debugfs/opregion Chris Wilson
2019-09-25 12:51 ` Tvrtko Ursulin
2019-09-25 10:01 ` [PATCH 25/27] drm/i915: Drop struct_mutex from suspend state save/restore Chris Wilson
2019-09-25 12:52 ` Tvrtko Ursulin
2019-09-25 10:01 ` [PATCH 26/27] drm/i915/selftests: Drop vestigal struct_mutex guards Chris Wilson
2019-09-25 12:55 ` Tvrtko Ursulin
2019-09-25 10:01 ` [PATCH 27/27] drm/i915: Drop struct_mutex from around GEM initialisation Chris Wilson
2019-09-25 12:56 ` Tvrtko Ursulin
2019-09-25 10:21 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/27] dma-fence: Serialise signal enabling (dma_fence_enable_sw_signaling) Patchwork
2019-09-25 10:34 ` ✗ Fi.CI.SPARSE: " Patchwork
2019-09-25 10:51 ` ✓ Fi.CI.BAT: success " Patchwork
2019-09-26 0:56 ` ✗ Fi.CI.IGT: failure " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a32e4f08-eb84-7807-5aa7-e2e4e3b5fdf8@linux.intel.com \
--to=tvrtko.ursulin@linux.intel.com \
--cc=chris@chris-wilson.co.uk \
--cc=intel-gfx@lists.freedesktop.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox