From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Lionel Landwerlin <lionel.g.landwerlin@intel.com>,
Tvrtko Ursulin <tursulin@ursulin.net>,
Intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 5/8] drm/i915/perf: lock powergating configuration to default when active
Date: Wed, 15 Aug 2018 12:58:05 +0100 [thread overview]
Message-ID: <7741f355-ea6f-1b60-b252-4cf7e07b3fd2@linux.intel.com> (raw)
In-Reply-To: <0450a167-81f0-d839-b49a-0a99fedaec15@intel.com>
On 14/08/2018 15:59, Lionel Landwerlin wrote:
> Hey Tvrtko,
>
> Thanks for taking over this series.
>
> I've been talking to developers using the i915/perf interface and from
> their point of view, they expect the system to be in a stable
> configuration when doing measurements.
>
> One issue with this patch on Gen11 is that it will lock the system in a
> configuration that isn't workable for media workloads (all subslices
> enabled).
> So I think we should set the value for the locked configuration per
> generation (gen8-10: all slices/subslices, gen11: only subslices that
> contain VME samplers) so that we always get a functional configurations
> for all workloads.
> Could we want to select that configuration when opening perf?
This would be via i915_perf.c/gen8_configure_all_contexts?
Sounds like an unfortunate but workable compromise. As long as there
doesn't appear another asymmetric slice feature in the future, which
doesn't align with VME. Like one workloads wants slices 0 & 2, another
wants 1 & 3, or whatever. If that happens then I don't know what we do
apart from locking out perf/OA.
> Another question is how do we expose the selected value to the user. But
> that can be solved in a different series.
SSEU get_param would cut it? Although it would be perhaps be unexpected
to get different results depending on whether perf/OA is active or not..
Hm... export device and available bitmasks via get param? Device bitmask
would be fixed and available would change depending on whether perf/OA
is active or not.
Regards,
Tvrtko
> Cheers,
>
> -
> Lionel
>
> On 14/08/18 15:40, Tvrtko Ursulin wrote:
>> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>>
>> If some of the contexts submitting workloads to the GPU have been
>> configured to shutdown slices/subslices, we might loose the NOA
>> configurations written in the NOA muxes.
>>
>> One possible solution to this problem is to reprogram the NOA muxes
>> when we switch to a new context. We initially tried this in the
>> workaround batchbuffer but some concerns where raised about the cost
>> of reprogramming at every context switch. This solution is also not
>> without consequences from the userspace point of view. Reprogramming
>> of the muxes can only happen once the powergating configuration has
>> changed (which happens after context switch). This means for a window
>> of time during the recording, counters recorded by the OA unit might
>> be invalid. This requires userspace dealing with OA reports to discard
>> the invalid values.
>>
>> Minimizing the reprogramming could be implemented by tracking of the
>> last programmed configuration somewhere in GGTT and use MI_PREDICATE
>> to discard some of the programming commands, but the command streamer
>> would still have to parse all the MI_LRI instructions in the
>> workaround batchbuffer.
>>
>> Another solution, which this change implements, is to simply disregard
>> the user requested configuration for the period of time when i915/perf
>> is active. There is no known issue with this apart from a performance
>> penality for some media workloads that benefit from running on a
>> partially powergated GPU. We already prevent RC6 from affecting the
>> programming so it doesn't sound completely unreasonable to hold on
>> powergating for the same reason.
>>
>> v2: Leave RPCS programming in intel_lrc.c (Lionel)
>>
>> v3: Update for s/union intel_sseu/struct intel_sseu/ (Lionel)
>> More to_intel_context() (Tvrtko)
>> s/dev_priv/i915/ (Tvrtko)
>>
>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_drv.h | 15 +++++++++++++++
>> drivers/gpu/drm/i915/i915_perf.c | 23 ++++++++++++++++++-----
>> drivers/gpu/drm/i915/intel_lrc.c | 11 +++++++----
>> drivers/gpu/drm/i915/intel_lrc.h | 3 +++
>> 4 files changed, 43 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index d6049c3f911b..5c12d2676435 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -3851,4 +3851,19 @@ static inline int
>> intel_hws_csb_write_index(struct drm_i915_private *i915)
>> return I915_HWS_CSB_WRITE_INDEX;
>> }
>> +static inline struct intel_sseu
>> +intel_engine_prepare_sseu(struct intel_engine_cs *engine,
>> + struct intel_sseu sseu)
>> +{
>> + struct drm_i915_private *i915 = engine->i915;
>> +
>> + /*
>> + * If i915/perf is active, we want a stable powergating
>> configuration
>> + * on the system. The most natural configuration to take in that
>> case
>> + * is the default (i.e maximum the hardware can do).
>> + */
>> + return i915->perf.oa.exclusive_stream ?
>> + intel_device_default_sseu(i915) : sseu;
>> +}
>> +
>> #endif
>> diff --git a/drivers/gpu/drm/i915/i915_perf.c
>> b/drivers/gpu/drm/i915/i915_perf.c
>> index ccb20230df2c..c2fc2399e0ed 100644
>> --- a/drivers/gpu/drm/i915/i915_perf.c
>> +++ b/drivers/gpu/drm/i915/i915_perf.c
>> @@ -1631,7 +1631,8 @@ static void hsw_disable_metric_set(struct
>> drm_i915_private *dev_priv)
>> */
>> static void gen8_update_reg_state_unlocked(struct i915_gem_context
>> *ctx,
>> u32 *reg_state,
>> - const struct i915_oa_config *oa_config)
>> + const struct i915_oa_config *oa_config,
>> + struct intel_sseu sseu)
>> {
>> struct drm_i915_private *dev_priv = ctx->i915;
>> u32 ctx_oactxctrl = dev_priv->perf.oa.ctx_oactxctrl_offset;
>> @@ -1677,6 +1678,9 @@ static void
>> gen8_update_reg_state_unlocked(struct i915_gem_context *ctx,
>> CTX_REG(reg_state, state_offset, flex_regs[i], value);
>> }
>> +
>> + CTX_REG(reg_state, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE,
>> + gen8_make_rpcs(&INTEL_INFO(dev_priv)->sseu, sseu));
>> }
>> /*
>> @@ -1807,6 +1811,7 @@ static int
>> gen8_switch_to_updated_kernel_context(struct drm_i915_private *dev_pr
>> static int gen8_configure_all_contexts(struct drm_i915_private
>> *dev_priv,
>> const struct i915_oa_config *oa_config)
>> {
>> + struct intel_sseu default_sseu =
>> intel_device_default_sseu(dev_priv);
>> struct intel_engine_cs *engine = dev_priv->engine[RCS];
>> struct i915_gem_context *ctx;
>> int ret;
>> @@ -1854,7 +1859,8 @@ static int gen8_configure_all_contexts(struct
>> drm_i915_private *dev_priv,
>> ce->state->obj->mm.dirty = true;
>> regs += LRC_STATE_PN * PAGE_SIZE / sizeof(*regs);
>> - gen8_update_reg_state_unlocked(ctx, regs, oa_config);
>> + gen8_update_reg_state_unlocked(ctx, regs, oa_config,
>> + oa_config ? default_sseu : ce->sseu);
>> i915_gem_object_unpin_map(ce->state->obj);
>> }
>> @@ -2226,14 +2232,21 @@ void i915_oa_init_reg_state(struct
>> intel_engine_cs *engine,
>> struct i915_gem_context *ctx,
>> u32 *reg_state)
>> {
>> + struct drm_i915_private *i915 = engine->i915;
>> struct i915_perf_stream *stream;
>> if (engine->id != RCS)
>> return;
>> - stream = engine->i915->perf.oa.exclusive_stream;
>> - if (stream)
>> - gen8_update_reg_state_unlocked(ctx, reg_state,
>> stream->oa_config);
>> + stream = i915->perf.oa.exclusive_stream;
>> + if (stream) {
>> + struct intel_sseu default_sseu =
>> + intel_device_default_sseu(i915);
>> +
>> + gen8_update_reg_state_unlocked(ctx, reg_state,
>> + stream->oa_config,
>> + default_sseu);
>> + }
>> }
>> /**
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c
>> b/drivers/gpu/drm/i915/intel_lrc.c
>> index 7b2f2d6bb057..8a2997be7ef7 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -2484,8 +2484,8 @@ int logical_xcs_ring_init(struct intel_engine_cs
>> *engine)
>> return logical_ring_init(engine);
>> }
>> -static u32 make_rpcs(const struct sseu_dev_info *sseu,
>> - struct intel_sseu ctx_sseu)
>> +u32 gen8_make_rpcs(const struct sseu_dev_info *sseu,
>> + struct intel_sseu ctx_sseu)
>> {
>> u32 rpcs = 0;
>> @@ -2635,10 +2635,13 @@ static void execlists_init_reg_state(u32 *regs,
>> }
>> if (rcs) {
>> + struct intel_sseu sseu = to_intel_context(ctx, engine)->sseu;
>> +
>> regs[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
>> CTX_REG(regs, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE,
>> - make_rpcs(&INTEL_INFO(dev_priv)->sseu,
>> - to_intel_context(ctx, engine)->sseu));
>> + gen8_make_rpcs(&INTEL_INFO(dev_priv)->sseu,
>> + intel_engine_prepare_sseu(engine,
>> + sseu)));
>> i915_oa_init_reg_state(engine, ctx, regs);
>> }
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.h
>> b/drivers/gpu/drm/i915/intel_lrc.h
>> index f5a5502ecf70..bf3acdc3d0af 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.h
>> +++ b/drivers/gpu/drm/i915/intel_lrc.h
>> @@ -104,4 +104,7 @@ void intel_lr_context_resume(struct
>> drm_i915_private *dev_priv);
>> void intel_execlists_set_default_submission(struct intel_engine_cs
>> *engine);
>> +u32 gen8_make_rpcs(const struct sseu_dev_info *sseu,
>> + struct intel_sseu ctx_sseu);
>> +
>> #endif /* _INTEL_LRC_H_ */
>
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2018-08-15 11:58 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-08-14 14:40 [PATCH v10 0/8] Per context dynamic (sub)slice power-gating Tvrtko Ursulin
2018-08-14 14:40 ` [PATCH 1/8] drm/i915: Program RPCS for Broadwell Tvrtko Ursulin
2018-08-14 14:40 ` [PATCH 2/8] drm/i915: Record the sseu configuration per-context & engine Tvrtko Ursulin
2018-08-14 14:40 ` [PATCH 3/8] drm/i915/perf: simplify configure all context function Tvrtko Ursulin
2018-08-14 14:40 ` [PATCH 4/8] drm/i915/perf: reuse intel_lrc ctx regs macro Tvrtko Ursulin
2018-08-14 14:40 ` [PATCH 5/8] drm/i915/perf: lock powergating configuration to default when active Tvrtko Ursulin
2018-08-14 14:59 ` Lionel Landwerlin
2018-08-15 11:58 ` Tvrtko Ursulin [this message]
2018-08-16 7:31 ` Tvrtko Ursulin
2018-08-16 9:56 ` Lionel Landwerlin
2018-08-14 14:40 ` [PATCH 6/8] drm/i915: Add global barrier support Tvrtko Ursulin
2018-08-14 14:59 ` Tvrtko Ursulin
2018-08-14 14:40 ` [PATCH 7/8] drm/i915: Explicitly mark Global GTT address spaces Tvrtko Ursulin
2018-08-14 14:40 ` [PATCH 8/8] drm/i915: Expose RPCS (SSEU) configuration to userspace Tvrtko Ursulin
2018-08-14 14:59 ` Chris Wilson
2018-08-14 15:11 ` Lionel Landwerlin
2018-08-14 15:18 ` Chris Wilson
2018-08-14 16:05 ` Lionel Landwerlin
2018-08-14 16:09 ` Lionel Landwerlin
2018-08-14 18:44 ` Tvrtko Ursulin
2018-08-14 18:53 ` Chris Wilson
2018-08-15 9:12 ` Tvrtko Ursulin
2018-08-14 15:22 ` Chris Wilson
2018-08-15 11:51 ` Tvrtko Ursulin
2018-08-15 11:56 ` Chris Wilson
2018-08-14 14:45 ` [PATCH v10 0/8] Per context dynamic (sub)slice power-gating Tvrtko Ursulin
2018-08-14 15:15 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
2018-08-14 15:19 ` ✗ Fi.CI.SPARSE: " Patchwork
2018-08-14 15:32 ` ✓ Fi.CI.BAT: success " Patchwork
2018-08-14 20:22 ` ✗ Fi.CI.IGT: failure " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7741f355-ea6f-1b60-b252-4cf7e07b3fd2@linux.intel.com \
--to=tvrtko.ursulin@linux.intel.com \
--cc=Intel-gfx@lists.freedesktop.org \
--cc=lionel.g.landwerlin@intel.com \
--cc=tursulin@ursulin.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).