public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Lionel Landwerlin <lionel.g.landwerlin@intel.com>,
	Chris Wilson <chris@chris-wilson.co.uk>,
	Intel-gfx@lists.freedesktop.org,
	Tvrtko Ursulin <tursulin@ursulin.net>
Subject: Re: [PATCH 4/7] drm/i915/perf: lock powergating configuration to default when active
Date: Fri, 7 Sep 2018 09:26:27 +0100	[thread overview]
Message-ID: <4bfe6648-203f-fbc8-9bcd-7b4e28279d02@linux.intel.com> (raw)
In-Reply-To: <3f14bd57-3a23-7208-6072-6d51892994ca@intel.com>


On 06/09/2018 11:36, Lionel Landwerlin wrote:
> On 06/09/2018 11:22, Chris Wilson wrote:
>> Quoting Lionel Landwerlin (2018-09-06 11:18:01)
>>> On 06/09/2018 11:10, Chris Wilson wrote:
>>>> Quoting Lionel Landwerlin (2018-09-06 10:57:47)
>>>>> On 05/09/2018 15:22, Tvrtko Ursulin wrote:
>>>>>> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>>>>>>
>>>>>> If some of the contexts submitting workloads to the GPU have been
>>>>>> configured to shutdown slices/subslices, we might loose the NOA
>>>>>> configurations written in the NOA muxes.
>>>>>>
>>>>>> One possible solution to this problem is to reprogram the NOA muxes
>>>>>> when we switch to a new context. We initially tried this in the
>>>>>> workaround batchbuffer but some concerns where raised about the cost
>>>>>> of reprogramming at every context switch. This solution is also not
>>>>>> without consequences from the userspace point of view. Reprogramming
>>>>>> of the muxes can only happen once the powergating configuration has
>>>>>> changed (which happens after context switch). This means for a window
>>>>>> of time during the recording, counters recorded by the OA unit might
>>>>>> be invalid. This requires userspace dealing with OA reports to 
>>>>>> discard
>>>>>> the invalid values.
>>>>>>
>>>>>> Minimizing the reprogramming could be implemented by tracking of the
>>>>>> last programmed configuration somewhere in GGTT and use MI_PREDICATE
>>>>>> to discard some of the programming commands, but the command streamer
>>>>>> would still have to parse all the MI_LRI instructions in the
>>>>>> workaround batchbuffer.
>>>>>>
>>>>>> Another solution, which this change implements, is to simply 
>>>>>> disregard
>>>>>> the user requested configuration for the period of time when 
>>>>>> i915/perf
>>>>>> is active. There is no known issue with this apart from a performance
>>>>>> penality for some media workloads that benefit from running on a
>>>>>> partially powergated GPU. We already prevent RC6 from affecting the
>>>>>> programming so it doesn't sound completely unreasonable to hold on
>>>>>> powergating for the same reason.
>>>>>>
>>>>>> v2: Leave RPCS programming in intel_lrc.c (Lionel)
>>>>>>
>>>>>> v3: Update for s/union intel_sseu/struct intel_sseu/ (Lionel)
>>>>>>        More to_intel_context() (Tvrtko)
>>>>>>        s/dev_priv/i915/ (Tvrtko)
>>>>>>
>>>>>> Tvrtko Ursulin:
>>>>>>
>>>>>> v4:
>>>>>>     * Rebase for make_rpcs changes.
>>>>>>
>>>>>> v5:
>>>>>>     * Apply OA restriction from make_rpcs directly.
>>>>>>
>>>>>> v6:
>>>>>>     * Rebase for context image setup changes.
>>>>>>
>>>>>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>>>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>>> ---
>>>>>>     drivers/gpu/drm/i915/i915_perf.c |  5 +++++
>>>>>>     drivers/gpu/drm/i915/intel_lrc.c | 30 
>>>>>> ++++++++++++++++++++----------
>>>>>>     drivers/gpu/drm/i915/intel_lrc.h |  3 +++
>>>>>>     3 files changed, 28 insertions(+), 10 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_perf.c 
>>>>>> b/drivers/gpu/drm/i915/i915_perf.c
>>>>>> index ccb20230df2c..dd65b72bddd4 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_perf.c
>>>>>> +++ b/drivers/gpu/drm/i915/i915_perf.c
>>>>>> @@ -1677,6 +1677,11 @@ static void 
>>>>>> gen8_update_reg_state_unlocked(struct i915_gem_context *ctx,
>>>>>>                 CTX_REG(reg_state, state_offset, flex_regs[i], 
>>>>>> value);
>>>>>>         }
>>>>>> +
>>>>>> +     CTX_REG(reg_state, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE,
>>>>>> +             gen8_make_rpcs(dev_priv,
>>>>>> +                            &to_intel_context(ctx,
>>>>>> +                                              
>>>>>> dev_priv->engine[RCS])->sseu));
>>>>> I think there is one issue I missed on the previous iterations of this
>>>>> patch.
>>>>>
>>>>> This gen8_update_reg_state_unlocked() is called when the GPU is parked
>>>>> on the kernel context.
>>>>>
>>>>> It's supposed to update all contexts, but I think we might not be able
>>>>> to update the kernel context image while the GPU is using it.
>>>> The kernel context is only ever taken in extremis (you are either
>>>> parking or stalling userspace) so I don't care.
>>>
>>> The patch exposing the RPCS configuration to userspace will make use of
>>> the kernel context while OA/perf is enabled. Even if it reprograms the
>>> locked value that will break the power configuration stability on Gen11
>>> (because the locked configuration will be different from the kernel
>>> context configuration).
>> Sure, but as you point out that's only on changing configuration.
>>
>> What's missing in the patch is that we only bail early if the new sseu
>> matches the ce->sseu, but that doesn't necessarily match whats in the
>> context due to OA. (Or maybe I missed the conversion to rpcs value and
>> checking.)
>> -Chris
>>
> 
> Yep, because the gen8_make_rpcs() post processes the values store at the 
> gem context level, we risk rerunning the kernel context to write the 
> exiting value.
> Sorry this is all so messy :(

Lets see if I managed to follow here.

The current code indeed bails out at the set ctx param level if the 
requested state matches the ce->state. My thinking was that ce->state is 
the master state and whatever happens in "post processing" via 
gen8_make_rpcs should be hidden from it since the design is that the 
i915_perf.c will re-configure all contexts when the OA active status 
changes (to either direction).

So I don't see a problem in those two interactions.

Apart from one, get_param_sseu will lie a bit - we can discuss about 
this one more. At one point I suggested we have two sets of masks in the 
uAPI, requested and active in a way. So userspace could query what it 
set and what is actually active.

Now second issue is if i915_perf.c is able to reprogram the kernel config.

Here its true, it will write to the context image and that will get 
overwritten by context save.

If that is a problem for OA, I was initially if a throw-away second 
"kernel" context could be use to re-program the real one, but perhaps 
even simpler - what about a mmio write to program the RPCS while kernel 
context is active?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2018-09-07  8:26 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-05 14:22 [PATCH v11 0/7] Per context dynamic (sub)slice power-gating Tvrtko Ursulin
2018-09-05 14:22 ` [PATCH 1/7] drm/i915/execlists: Move RPCS setup to context pin Tvrtko Ursulin
2018-09-05 15:14   ` Chris Wilson
2018-09-05 14:22 ` [PATCH 2/7] drm/i915: Program RPCS for Broadwell Tvrtko Ursulin
2018-09-05 14:22 ` [PATCH 3/7] drm/i915: Record the sseu configuration per-context & engine Tvrtko Ursulin
2018-09-05 15:18   ` Chris Wilson
2018-09-06  9:36     ` Tvrtko Ursulin
2018-09-05 14:22 ` [PATCH 4/7] drm/i915/perf: lock powergating configuration to default when active Tvrtko Ursulin
2018-09-05 15:21   ` Chris Wilson
2018-09-06  9:41     ` Tvrtko Ursulin
2018-09-06  9:57   ` Lionel Landwerlin
2018-09-06 10:10     ` Chris Wilson
2018-09-06 10:18       ` Lionel Landwerlin
2018-09-06 10:22         ` Chris Wilson
2018-09-06 10:36           ` Lionel Landwerlin
2018-09-07  8:26             ` Tvrtko Ursulin [this message]
2018-09-07  8:59               ` Chris Wilson
2018-09-07  9:23               ` Lionel Landwerlin
2018-09-07  9:39                 ` Tvrtko Ursulin
2018-09-07  9:55                   ` Lionel Landwerlin
2018-09-10 13:44                     ` Tvrtko Ursulin
2018-09-11 20:11                       ` Lionel Landwerlin
2018-09-12  8:03                         ` Tvrtko Ursulin
2018-09-05 14:22 ` [PATCH 5/7] drm/i915: Add timeline barrier support Tvrtko Ursulin
2018-09-05 15:23   ` Chris Wilson
2018-09-05 14:22 ` [PATCH 6/7] drm/i915: Expose RPCS (SSEU) configuration to userspace Tvrtko Ursulin
2018-09-05 15:29   ` Chris Wilson
2018-09-06  9:50     ` Tvrtko Ursulin
2018-09-06  9:54       ` Chris Wilson
2018-09-06  9:58       ` Lionel Landwerlin
2018-09-05 14:22 ` [PATCH 7/7] drm/i915/icl: Support co-existance between per-context SSEU and OA Tvrtko Ursulin
2018-09-05 14:46 ` ✗ Fi.CI.CHECKPATCH: warning for Per context dynamic (sub)slice power-gating (rev2) Patchwork
2018-09-05 14:49 ` ✗ Fi.CI.SPARSE: " Patchwork
2018-09-05 15:05 ` ✓ Fi.CI.BAT: success " Patchwork
2018-09-05 19:55 ` ✗ Fi.CI.IGT: failure " Patchwork
2018-09-06 19:33 ` [PATCH v11 0/7] Per context dynamic (sub)slice power-gating Chris Wilson
2018-09-06 19:52   ` Chris Wilson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4bfe6648-203f-fbc8-9bcd-7b4e28279d02@linux.intel.com \
    --to=tvrtko.ursulin@linux.intel.com \
    --cc=Intel-gfx@lists.freedesktop.org \
    --cc=chris@chris-wilson.co.uk \
    --cc=lionel.g.landwerlin@intel.com \
    --cc=tursulin@ursulin.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox