* [CI 0/2] Reviewed perf cleanups
@ 2018-08-13 8:02 Tvrtko Ursulin
2018-08-13 8:02 ` [CI 1/2] drm/i915/perf: simplify configure all context function Tvrtko Ursulin
` (3 more replies)
0 siblings, 4 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2018-08-13 8:02 UTC (permalink / raw)
To: Intel-gfx
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Two reviewed simple cleanup patches, extracted from dynamic SSEU series.
Lionel Landwerlin (2):
drm/i915/perf: simplify configure all context function
drm/i915/perf: reuse intel_lrc ctx regs macro
drivers/gpu/drm/i915/i915_perf.c | 45 ++++++++++++++------------------
1 file changed, 20 insertions(+), 25 deletions(-)
--
2.17.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 15+ messages in thread* [CI 1/2] drm/i915/perf: simplify configure all context function 2018-08-13 8:02 [CI 0/2] Reviewed perf cleanups Tvrtko Ursulin @ 2018-08-13 8:02 ` Tvrtko Ursulin 2018-08-13 8:02 ` [CI 2/2] drm/i915/perf: reuse intel_lrc ctx regs macro Tvrtko Ursulin ` (2 subsequent siblings) 3 siblings, 0 replies; 15+ messages in thread From: Tvrtko Ursulin @ 2018-08-13 8:02 UTC (permalink / raw) To: Intel-gfx From: Lionel Landwerlin <lionel.g.landwerlin@intel.com> We don't need any special treatment on error so just return as soon as possible. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> --- drivers/gpu/drm/i915/i915_perf.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 0376338d1f8d..49597cf31707 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1819,7 +1819,7 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv, /* Switch away from any user context. */ ret = gen8_switch_to_updated_kernel_context(dev_priv, oa_config); if (ret) - goto out; + return ret; /* * The OA register config is setup through the context image. This image @@ -1838,7 +1838,7 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv, wait_flags, MAX_SCHEDULE_TIMEOUT); if (ret) - goto out; + return ret; /* Update all contexts now that we've stalled the submission. */ list_for_each_entry(ctx, &dev_priv->contexts.list, link) { @@ -1850,10 +1850,8 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv, continue; regs = i915_gem_object_pin_map(ce->state->obj, I915_MAP_WB); - if (IS_ERR(regs)) { - ret = PTR_ERR(regs); - goto out; - } + if (IS_ERR(regs)) + return PTR_ERR(regs); ce->state->obj->mm.dirty = true; regs += LRC_STATE_PN * PAGE_SIZE / sizeof(*regs); @@ -1863,7 +1861,6 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv, i915_gem_object_unpin_map(ce->state->obj); } - out: return ret; } -- 2.17.1 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [CI 2/2] drm/i915/perf: reuse intel_lrc ctx regs macro 2018-08-13 8:02 [CI 0/2] Reviewed perf cleanups Tvrtko Ursulin 2018-08-13 8:02 ` [CI 1/2] drm/i915/perf: simplify configure all context function Tvrtko Ursulin @ 2018-08-13 8:02 ` Tvrtko Ursulin 2018-08-13 8:16 ` Chris Wilson 2018-08-13 8:33 ` ✓ Fi.CI.BAT: success for Reviewed perf cleanups Patchwork 2018-08-13 10:00 ` ✓ Fi.CI.IGT: " Patchwork 3 siblings, 1 reply; 15+ messages in thread From: Tvrtko Ursulin @ 2018-08-13 8:02 UTC (permalink / raw) To: Intel-gfx From: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Abstract the context image access a bit. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> --- drivers/gpu/drm/i915/i915_perf.c | 34 +++++++++++++++----------------- 1 file changed, 16 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 49597cf31707..ccb20230df2c 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -210,6 +210,7 @@ #include "i915_oa_cflgt3.h" #include "i915_oa_cnl.h" #include "i915_oa_icl.h" +#include "intel_lrc_reg.h" /* HW requires this to be a power of two, between 128k and 16M, though driver * is currently generally designed assuming the largest 16M size is used such @@ -1636,27 +1637,25 @@ static void gen8_update_reg_state_unlocked(struct i915_gem_context *ctx, u32 ctx_oactxctrl = dev_priv->perf.oa.ctx_oactxctrl_offset; u32 ctx_flexeu0 = dev_priv->perf.oa.ctx_flexeu0_offset; /* The MMIO offsets for Flex EU registers aren't contiguous */ - u32 flex_mmio[] = { - i915_mmio_reg_offset(EU_PERF_CNTL0), - i915_mmio_reg_offset(EU_PERF_CNTL1), - i915_mmio_reg_offset(EU_PERF_CNTL2), - i915_mmio_reg_offset(EU_PERF_CNTL3), - i915_mmio_reg_offset(EU_PERF_CNTL4), - i915_mmio_reg_offset(EU_PERF_CNTL5), - i915_mmio_reg_offset(EU_PERF_CNTL6), + i915_reg_t flex_regs[] = { + EU_PERF_CNTL0, + EU_PERF_CNTL1, + EU_PERF_CNTL2, + EU_PERF_CNTL3, + EU_PERF_CNTL4, + EU_PERF_CNTL5, + EU_PERF_CNTL6, }; int i; - reg_state[ctx_oactxctrl] = i915_mmio_reg_offset(GEN8_OACTXCONTROL); - reg_state[ctx_oactxctrl+1] = (dev_priv->perf.oa.period_exponent << - GEN8_OA_TIMER_PERIOD_SHIFT) | - (dev_priv->perf.oa.periodic ? - GEN8_OA_TIMER_ENABLE : 0) | - GEN8_OA_COUNTER_RESUME; + CTX_REG(reg_state, ctx_oactxctrl, GEN8_OACTXCONTROL, + (dev_priv->perf.oa.period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) | + (dev_priv->perf.oa.periodic ? GEN8_OA_TIMER_ENABLE : 0) | + GEN8_OA_COUNTER_RESUME); - for (i = 0; i < ARRAY_SIZE(flex_mmio); i++) { + for (i = 0; i < ARRAY_SIZE(flex_regs); i++) { u32 state_offset = ctx_flexeu0 + i * 2; - u32 mmio = flex_mmio[i]; + u32 mmio = i915_mmio_reg_offset(flex_regs[i]); /* * This arbitrary default will select the 'EU FPU0 Pipeline @@ -1676,8 +1675,7 @@ static void gen8_update_reg_state_unlocked(struct i915_gem_context *ctx, } } - reg_state[state_offset] = mmio; - reg_state[state_offset+1] = value; + CTX_REG(reg_state, state_offset, flex_regs[i], value); } } -- 2.17.1 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [CI 2/2] drm/i915/perf: reuse intel_lrc ctx regs macro 2018-08-13 8:02 ` [CI 2/2] drm/i915/perf: reuse intel_lrc ctx regs macro Tvrtko Ursulin @ 2018-08-13 8:16 ` Chris Wilson 2018-08-13 9:11 ` Tvrtko Ursulin 0 siblings, 1 reply; 15+ messages in thread From: Chris Wilson @ 2018-08-13 8:16 UTC (permalink / raw) To: Intel-gfx, Tvrtko Ursulin Quoting Tvrtko Ursulin (2018-08-13 09:02:18) > From: Lionel Landwerlin <lionel.g.landwerlin@intel.com> > > Abstract the context image access a bit. > > Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > --- > drivers/gpu/drm/i915/i915_perf.c | 34 +++++++++++++++----------------- > 1 file changed, 16 insertions(+), 18 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c > index 49597cf31707..ccb20230df2c 100644 > --- a/drivers/gpu/drm/i915/i915_perf.c > +++ b/drivers/gpu/drm/i915/i915_perf.c > @@ -210,6 +210,7 @@ > #include "i915_oa_cflgt3.h" > #include "i915_oa_cnl.h" > #include "i915_oa_icl.h" > +#include "intel_lrc_reg.h" > > /* HW requires this to be a power of two, between 128k and 16M, though driver > * is currently generally designed assuming the largest 16M size is used such > @@ -1636,27 +1637,25 @@ static void gen8_update_reg_state_unlocked(struct i915_gem_context *ctx, > u32 ctx_oactxctrl = dev_priv->perf.oa.ctx_oactxctrl_offset; > u32 ctx_flexeu0 = dev_priv->perf.oa.ctx_flexeu0_offset; > /* The MMIO offsets for Flex EU registers aren't contiguous */ > - u32 flex_mmio[] = { > - i915_mmio_reg_offset(EU_PERF_CNTL0), > - i915_mmio_reg_offset(EU_PERF_CNTL1), > - i915_mmio_reg_offset(EU_PERF_CNTL2), > - i915_mmio_reg_offset(EU_PERF_CNTL3), > - i915_mmio_reg_offset(EU_PERF_CNTL4), > - i915_mmio_reg_offset(EU_PERF_CNTL5), > - i915_mmio_reg_offset(EU_PERF_CNTL6), > + i915_reg_t flex_regs[] = { > + EU_PERF_CNTL0, > + EU_PERF_CNTL1, > + EU_PERF_CNTL2, > + EU_PERF_CNTL3, > + EU_PERF_CNTL4, > + EU_PERF_CNTL5, > + EU_PERF_CNTL6, > }; > int i; > > - reg_state[ctx_oactxctrl] = i915_mmio_reg_offset(GEN8_OACTXCONTROL); > - reg_state[ctx_oactxctrl+1] = (dev_priv->perf.oa.period_exponent << > - GEN8_OA_TIMER_PERIOD_SHIFT) | > - (dev_priv->perf.oa.periodic ? > - GEN8_OA_TIMER_ENABLE : 0) | > - GEN8_OA_COUNTER_RESUME; > + CTX_REG(reg_state, ctx_oactxctrl, GEN8_OACTXCONTROL, > + (dev_priv->perf.oa.period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) | > + (dev_priv->perf.oa.periodic ? GEN8_OA_TIMER_ENABLE : 0) | > + GEN8_OA_COUNTER_RESUME); I'll be honest but, I don't think it's CTX_REG() that helps improve the readability here. The really odd part is that this sticks itself into a bare part of the register state not surrounded by any LRI and after a BB_END. This routine can only work for established contexts, it should not work for execlists_init_reg_state. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [CI 2/2] drm/i915/perf: reuse intel_lrc ctx regs macro 2018-08-13 8:16 ` Chris Wilson @ 2018-08-13 9:11 ` Tvrtko Ursulin 2018-08-13 9:16 ` Chris Wilson 0 siblings, 1 reply; 15+ messages in thread From: Tvrtko Ursulin @ 2018-08-13 9:11 UTC (permalink / raw) To: Chris Wilson, Intel-gfx, Tvrtko Ursulin On 13/08/2018 09:16, Chris Wilson wrote: > Quoting Tvrtko Ursulin (2018-08-13 09:02:18) >> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com> >> >> Abstract the context image access a bit. >> >> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> >> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> >> --- >> drivers/gpu/drm/i915/i915_perf.c | 34 +++++++++++++++----------------- >> 1 file changed, 16 insertions(+), 18 deletions(-) >> >> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c >> index 49597cf31707..ccb20230df2c 100644 >> --- a/drivers/gpu/drm/i915/i915_perf.c >> +++ b/drivers/gpu/drm/i915/i915_perf.c >> @@ -210,6 +210,7 @@ >> #include "i915_oa_cflgt3.h" >> #include "i915_oa_cnl.h" >> #include "i915_oa_icl.h" >> +#include "intel_lrc_reg.h" >> >> /* HW requires this to be a power of two, between 128k and 16M, though driver >> * is currently generally designed assuming the largest 16M size is used such >> @@ -1636,27 +1637,25 @@ static void gen8_update_reg_state_unlocked(struct i915_gem_context *ctx, >> u32 ctx_oactxctrl = dev_priv->perf.oa.ctx_oactxctrl_offset; >> u32 ctx_flexeu0 = dev_priv->perf.oa.ctx_flexeu0_offset; >> /* The MMIO offsets for Flex EU registers aren't contiguous */ >> - u32 flex_mmio[] = { >> - i915_mmio_reg_offset(EU_PERF_CNTL0), >> - i915_mmio_reg_offset(EU_PERF_CNTL1), >> - i915_mmio_reg_offset(EU_PERF_CNTL2), >> - i915_mmio_reg_offset(EU_PERF_CNTL3), >> - i915_mmio_reg_offset(EU_PERF_CNTL4), >> - i915_mmio_reg_offset(EU_PERF_CNTL5), >> - i915_mmio_reg_offset(EU_PERF_CNTL6), >> + i915_reg_t flex_regs[] = { >> + EU_PERF_CNTL0, >> + EU_PERF_CNTL1, >> + EU_PERF_CNTL2, >> + EU_PERF_CNTL3, >> + EU_PERF_CNTL4, >> + EU_PERF_CNTL5, >> + EU_PERF_CNTL6, >> }; >> int i; >> >> - reg_state[ctx_oactxctrl] = i915_mmio_reg_offset(GEN8_OACTXCONTROL); >> - reg_state[ctx_oactxctrl+1] = (dev_priv->perf.oa.period_exponent << >> - GEN8_OA_TIMER_PERIOD_SHIFT) | >> - (dev_priv->perf.oa.periodic ? >> - GEN8_OA_TIMER_ENABLE : 0) | >> - GEN8_OA_COUNTER_RESUME; >> + CTX_REG(reg_state, ctx_oactxctrl, GEN8_OACTXCONTROL, >> + (dev_priv->perf.oa.period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) | >> + (dev_priv->perf.oa.periodic ? GEN8_OA_TIMER_ENABLE : 0) | >> + GEN8_OA_COUNTER_RESUME); > > I'll be honest but, I don't think it's CTX_REG() that helps improve the > readability here. > > The really odd part is that this sticks itself into a bare part of the > register state not surrounded by any LRI and after a BB_END. This > routine can only work for established contexts, it should not work for > execlists_init_reg_state. Unless I am missing something change is completely mechanical, so any question marks you have are already there, right? What do you suggest is the action here? Regards, Tvrtko _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [CI 2/2] drm/i915/perf: reuse intel_lrc ctx regs macro 2018-08-13 9:11 ` Tvrtko Ursulin @ 2018-08-13 9:16 ` Chris Wilson 2018-08-14 18:49 ` Tvrtko Ursulin 0 siblings, 1 reply; 15+ messages in thread From: Chris Wilson @ 2018-08-13 9:16 UTC (permalink / raw) To: Intel-gfx, Tvrtko Ursulin, Tvrtko Ursulin Quoting Tvrtko Ursulin (2018-08-13 10:11:44) > > On 13/08/2018 09:16, Chris Wilson wrote: > > Quoting Tvrtko Ursulin (2018-08-13 09:02:18) > >> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com> > >> > >> Abstract the context image access a bit. > >> > >> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> > >> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > >> --- > >> drivers/gpu/drm/i915/i915_perf.c | 34 +++++++++++++++----------------- > >> 1 file changed, 16 insertions(+), 18 deletions(-) > >> > >> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c > >> index 49597cf31707..ccb20230df2c 100644 > >> --- a/drivers/gpu/drm/i915/i915_perf.c > >> +++ b/drivers/gpu/drm/i915/i915_perf.c > >> @@ -210,6 +210,7 @@ > >> #include "i915_oa_cflgt3.h" > >> #include "i915_oa_cnl.h" > >> #include "i915_oa_icl.h" > >> +#include "intel_lrc_reg.h" > >> > >> /* HW requires this to be a power of two, between 128k and 16M, though driver > >> * is currently generally designed assuming the largest 16M size is used such > >> @@ -1636,27 +1637,25 @@ static void gen8_update_reg_state_unlocked(struct i915_gem_context *ctx, > >> u32 ctx_oactxctrl = dev_priv->perf.oa.ctx_oactxctrl_offset; > >> u32 ctx_flexeu0 = dev_priv->perf.oa.ctx_flexeu0_offset; > >> /* The MMIO offsets for Flex EU registers aren't contiguous */ > >> - u32 flex_mmio[] = { > >> - i915_mmio_reg_offset(EU_PERF_CNTL0), > >> - i915_mmio_reg_offset(EU_PERF_CNTL1), > >> - i915_mmio_reg_offset(EU_PERF_CNTL2), > >> - i915_mmio_reg_offset(EU_PERF_CNTL3), > >> - i915_mmio_reg_offset(EU_PERF_CNTL4), > >> - i915_mmio_reg_offset(EU_PERF_CNTL5), > >> - i915_mmio_reg_offset(EU_PERF_CNTL6), > >> + i915_reg_t flex_regs[] = { > >> + EU_PERF_CNTL0, > >> + EU_PERF_CNTL1, > >> + EU_PERF_CNTL2, > >> + EU_PERF_CNTL3, > >> + EU_PERF_CNTL4, > >> + EU_PERF_CNTL5, > >> + EU_PERF_CNTL6, > >> }; > >> int i; > >> > >> - reg_state[ctx_oactxctrl] = i915_mmio_reg_offset(GEN8_OACTXCONTROL); > >> - reg_state[ctx_oactxctrl+1] = (dev_priv->perf.oa.period_exponent << > >> - GEN8_OA_TIMER_PERIOD_SHIFT) | > >> - (dev_priv->perf.oa.periodic ? > >> - GEN8_OA_TIMER_ENABLE : 0) | > >> - GEN8_OA_COUNTER_RESUME; > >> + CTX_REG(reg_state, ctx_oactxctrl, GEN8_OACTXCONTROL, > >> + (dev_priv->perf.oa.period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) | > >> + (dev_priv->perf.oa.periodic ? GEN8_OA_TIMER_ENABLE : 0) | > >> + GEN8_OA_COUNTER_RESUME); > > > > I'll be honest but, I don't think it's CTX_REG() that helps improve the > > readability here. > > > > The really odd part is that this sticks itself into a bare part of the > > register state not surrounded by any LRI and after a BB_END. This > > routine can only work for established contexts, it should not work for > > execlists_init_reg_state. > > Unless I am missing something change is completely mechanical, so any > question marks you have are already there, right? What do you suggest is > the action here? Sure, the only thing I question of this patch itself is whether CTX_REG() is simply too much horrible obfuscating magic. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [CI 2/2] drm/i915/perf: reuse intel_lrc ctx regs macro 2018-08-13 9:16 ` Chris Wilson @ 2018-08-14 18:49 ` Tvrtko Ursulin 2018-08-14 18:57 ` Chris Wilson 0 siblings, 1 reply; 15+ messages in thread From: Tvrtko Ursulin @ 2018-08-14 18:49 UTC (permalink / raw) To: Chris Wilson, Intel-gfx, Tvrtko Ursulin On 13/08/2018 10:16, Chris Wilson wrote: > Quoting Tvrtko Ursulin (2018-08-13 10:11:44) >> >> On 13/08/2018 09:16, Chris Wilson wrote: >>> Quoting Tvrtko Ursulin (2018-08-13 09:02:18) >>>> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com> >>>> >>>> Abstract the context image access a bit. >>>> >>>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> >>>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> >>>> --- >>>> drivers/gpu/drm/i915/i915_perf.c | 34 +++++++++++++++----------------- >>>> 1 file changed, 16 insertions(+), 18 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c >>>> index 49597cf31707..ccb20230df2c 100644 >>>> --- a/drivers/gpu/drm/i915/i915_perf.c >>>> +++ b/drivers/gpu/drm/i915/i915_perf.c >>>> @@ -210,6 +210,7 @@ >>>> #include "i915_oa_cflgt3.h" >>>> #include "i915_oa_cnl.h" >>>> #include "i915_oa_icl.h" >>>> +#include "intel_lrc_reg.h" >>>> >>>> /* HW requires this to be a power of two, between 128k and 16M, though driver >>>> * is currently generally designed assuming the largest 16M size is used such >>>> @@ -1636,27 +1637,25 @@ static void gen8_update_reg_state_unlocked(struct i915_gem_context *ctx, >>>> u32 ctx_oactxctrl = dev_priv->perf.oa.ctx_oactxctrl_offset; >>>> u32 ctx_flexeu0 = dev_priv->perf.oa.ctx_flexeu0_offset; >>>> /* The MMIO offsets for Flex EU registers aren't contiguous */ >>>> - u32 flex_mmio[] = { >>>> - i915_mmio_reg_offset(EU_PERF_CNTL0), >>>> - i915_mmio_reg_offset(EU_PERF_CNTL1), >>>> - i915_mmio_reg_offset(EU_PERF_CNTL2), >>>> - i915_mmio_reg_offset(EU_PERF_CNTL3), >>>> - i915_mmio_reg_offset(EU_PERF_CNTL4), >>>> - i915_mmio_reg_offset(EU_PERF_CNTL5), >>>> - i915_mmio_reg_offset(EU_PERF_CNTL6), >>>> + i915_reg_t flex_regs[] = { >>>> + EU_PERF_CNTL0, >>>> + EU_PERF_CNTL1, >>>> + EU_PERF_CNTL2, >>>> + EU_PERF_CNTL3, >>>> + EU_PERF_CNTL4, >>>> + EU_PERF_CNTL5, >>>> + EU_PERF_CNTL6, >>>> }; >>>> int i; >>>> >>>> - reg_state[ctx_oactxctrl] = i915_mmio_reg_offset(GEN8_OACTXCONTROL); >>>> - reg_state[ctx_oactxctrl+1] = (dev_priv->perf.oa.period_exponent << >>>> - GEN8_OA_TIMER_PERIOD_SHIFT) | >>>> - (dev_priv->perf.oa.periodic ? >>>> - GEN8_OA_TIMER_ENABLE : 0) | >>>> - GEN8_OA_COUNTER_RESUME; >>>> + CTX_REG(reg_state, ctx_oactxctrl, GEN8_OACTXCONTROL, >>>> + (dev_priv->perf.oa.period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) | >>>> + (dev_priv->perf.oa.periodic ? GEN8_OA_TIMER_ENABLE : 0) | >>>> + GEN8_OA_COUNTER_RESUME); >>> >>> I'll be honest but, I don't think it's CTX_REG() that helps improve the >>> readability here. >>> >>> The really odd part is that this sticks itself into a bare part of the >>> register state not surrounded by any LRI and after a BB_END. This >>> routine can only work for established contexts, it should not work for >>> execlists_init_reg_state. >> >> Unless I am missing something change is completely mechanical, so any >> question marks you have are already there, right? What do you suggest is >> the action here? > > Sure, the only thing I question of this patch itself is whether > CTX_REG() is simply too much horrible obfuscating magic. Turn a blind eye if the perceived badness factor is below some threshold? Following patch depends on this one, so if I have to drop this one, then I have to rework the next one etc.. well, it's not the worst problem, so yeah, whatever. Make a call and let me know. Regards, Tvrtko _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [CI 2/2] drm/i915/perf: reuse intel_lrc ctx regs macro 2018-08-14 18:49 ` Tvrtko Ursulin @ 2018-08-14 18:57 ` Chris Wilson 2018-08-15 8:49 ` Tvrtko Ursulin 0 siblings, 1 reply; 15+ messages in thread From: Chris Wilson @ 2018-08-14 18:57 UTC (permalink / raw) To: Intel-gfx, Tvrtko Ursulin, Tvrtko Ursulin Quoting Tvrtko Ursulin (2018-08-14 19:49:46) > > On 13/08/2018 10:16, Chris Wilson wrote: > > Quoting Tvrtko Ursulin (2018-08-13 10:11:44) > >> > >> On 13/08/2018 09:16, Chris Wilson wrote: > >>> Quoting Tvrtko Ursulin (2018-08-13 09:02:18) > >>>> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com> > >>>> > >>>> Abstract the context image access a bit. > >>>> > >>>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> > >>>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > >>>> --- > >>>> drivers/gpu/drm/i915/i915_perf.c | 34 +++++++++++++++----------------- > >>>> 1 file changed, 16 insertions(+), 18 deletions(-) > >>>> > >>>> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c > >>>> index 49597cf31707..ccb20230df2c 100644 > >>>> --- a/drivers/gpu/drm/i915/i915_perf.c > >>>> +++ b/drivers/gpu/drm/i915/i915_perf.c > >>>> @@ -210,6 +210,7 @@ > >>>> #include "i915_oa_cflgt3.h" > >>>> #include "i915_oa_cnl.h" > >>>> #include "i915_oa_icl.h" > >>>> +#include "intel_lrc_reg.h" > >>>> > >>>> /* HW requires this to be a power of two, between 128k and 16M, though driver > >>>> * is currently generally designed assuming the largest 16M size is used such > >>>> @@ -1636,27 +1637,25 @@ static void gen8_update_reg_state_unlocked(struct i915_gem_context *ctx, > >>>> u32 ctx_oactxctrl = dev_priv->perf.oa.ctx_oactxctrl_offset; > >>>> u32 ctx_flexeu0 = dev_priv->perf.oa.ctx_flexeu0_offset; > >>>> /* The MMIO offsets for Flex EU registers aren't contiguous */ > >>>> - u32 flex_mmio[] = { > >>>> - i915_mmio_reg_offset(EU_PERF_CNTL0), > >>>> - i915_mmio_reg_offset(EU_PERF_CNTL1), > >>>> - i915_mmio_reg_offset(EU_PERF_CNTL2), > >>>> - i915_mmio_reg_offset(EU_PERF_CNTL3), > >>>> - i915_mmio_reg_offset(EU_PERF_CNTL4), > >>>> - i915_mmio_reg_offset(EU_PERF_CNTL5), > >>>> - i915_mmio_reg_offset(EU_PERF_CNTL6), > >>>> + i915_reg_t flex_regs[] = { > >>>> + EU_PERF_CNTL0, > >>>> + EU_PERF_CNTL1, > >>>> + EU_PERF_CNTL2, > >>>> + EU_PERF_CNTL3, > >>>> + EU_PERF_CNTL4, > >>>> + EU_PERF_CNTL5, > >>>> + EU_PERF_CNTL6, > >>>> }; > >>>> int i; > >>>> > >>>> - reg_state[ctx_oactxctrl] = i915_mmio_reg_offset(GEN8_OACTXCONTROL); > >>>> - reg_state[ctx_oactxctrl+1] = (dev_priv->perf.oa.period_exponent << > >>>> - GEN8_OA_TIMER_PERIOD_SHIFT) | > >>>> - (dev_priv->perf.oa.periodic ? > >>>> - GEN8_OA_TIMER_ENABLE : 0) | > >>>> - GEN8_OA_COUNTER_RESUME; > >>>> + CTX_REG(reg_state, ctx_oactxctrl, GEN8_OACTXCONTROL, > >>>> + (dev_priv->perf.oa.period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) | > >>>> + (dev_priv->perf.oa.periodic ? GEN8_OA_TIMER_ENABLE : 0) | > >>>> + GEN8_OA_COUNTER_RESUME); > >>> > >>> I'll be honest but, I don't think it's CTX_REG() that helps improve the > >>> readability here. > >>> > >>> The really odd part is that this sticks itself into a bare part of the > >>> register state not surrounded by any LRI and after a BB_END. This > >>> routine can only work for established contexts, it should not work for > >>> execlists_init_reg_state. > >> > >> Unless I am missing something change is completely mechanical, so any > >> question marks you have are already there, right? What do you suggest is > >> the action here? > > > > Sure, the only thing I question of this patch itself is whether > > CTX_REG() is simply too much horrible obfuscating magic. > > Turn a blind eye if the perceived badness factor is below some > threshold? Following patch depends on this one, so if I have to drop > this one, then I have to rework the next one etc.. well, it's not the > worst problem, so yeah, whatever. Make a call and let me know. The patch was fine, just worrying about the surrounding code. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [CI 2/2] drm/i915/perf: reuse intel_lrc ctx regs macro 2018-08-14 18:57 ` Chris Wilson @ 2018-08-15 8:49 ` Tvrtko Ursulin 2018-08-15 10:50 ` Lionel Landwerlin 0 siblings, 1 reply; 15+ messages in thread From: Tvrtko Ursulin @ 2018-08-15 8:49 UTC (permalink / raw) To: Chris Wilson, Intel-gfx, Tvrtko Ursulin On 14/08/2018 19:57, Chris Wilson wrote: > Quoting Tvrtko Ursulin (2018-08-14 19:49:46) >> >> On 13/08/2018 10:16, Chris Wilson wrote: >>> Quoting Tvrtko Ursulin (2018-08-13 10:11:44) >>>> >>>> On 13/08/2018 09:16, Chris Wilson wrote: >>>>> Quoting Tvrtko Ursulin (2018-08-13 09:02:18) >>>>>> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com> >>>>>> >>>>>> Abstract the context image access a bit. >>>>>> >>>>>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> >>>>>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> >>>>>> --- >>>>>> drivers/gpu/drm/i915/i915_perf.c | 34 +++++++++++++++----------------- >>>>>> 1 file changed, 16 insertions(+), 18 deletions(-) >>>>>> >>>>>> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c >>>>>> index 49597cf31707..ccb20230df2c 100644 >>>>>> --- a/drivers/gpu/drm/i915/i915_perf.c >>>>>> +++ b/drivers/gpu/drm/i915/i915_perf.c >>>>>> @@ -210,6 +210,7 @@ >>>>>> #include "i915_oa_cflgt3.h" >>>>>> #include "i915_oa_cnl.h" >>>>>> #include "i915_oa_icl.h" >>>>>> +#include "intel_lrc_reg.h" >>>>>> >>>>>> /* HW requires this to be a power of two, between 128k and 16M, though driver >>>>>> * is currently generally designed assuming the largest 16M size is used such >>>>>> @@ -1636,27 +1637,25 @@ static void gen8_update_reg_state_unlocked(struct i915_gem_context *ctx, >>>>>> u32 ctx_oactxctrl = dev_priv->perf.oa.ctx_oactxctrl_offset; >>>>>> u32 ctx_flexeu0 = dev_priv->perf.oa.ctx_flexeu0_offset; >>>>>> /* The MMIO offsets for Flex EU registers aren't contiguous */ >>>>>> - u32 flex_mmio[] = { >>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL0), >>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL1), >>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL2), >>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL3), >>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL4), >>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL5), >>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL6), >>>>>> + i915_reg_t flex_regs[] = { >>>>>> + EU_PERF_CNTL0, >>>>>> + EU_PERF_CNTL1, >>>>>> + EU_PERF_CNTL2, >>>>>> + EU_PERF_CNTL3, >>>>>> + EU_PERF_CNTL4, >>>>>> + EU_PERF_CNTL5, >>>>>> + EU_PERF_CNTL6, >>>>>> }; >>>>>> int i; >>>>>> >>>>>> - reg_state[ctx_oactxctrl] = i915_mmio_reg_offset(GEN8_OACTXCONTROL); >>>>>> - reg_state[ctx_oactxctrl+1] = (dev_priv->perf.oa.period_exponent << >>>>>> - GEN8_OA_TIMER_PERIOD_SHIFT) | >>>>>> - (dev_priv->perf.oa.periodic ? >>>>>> - GEN8_OA_TIMER_ENABLE : 0) | >>>>>> - GEN8_OA_COUNTER_RESUME; >>>>>> + CTX_REG(reg_state, ctx_oactxctrl, GEN8_OACTXCONTROL, >>>>>> + (dev_priv->perf.oa.period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) | >>>>>> + (dev_priv->perf.oa.periodic ? GEN8_OA_TIMER_ENABLE : 0) | >>>>>> + GEN8_OA_COUNTER_RESUME); >>>>> >>>>> I'll be honest but, I don't think it's CTX_REG() that helps improve the >>>>> readability here. >>>>> >>>>> The really odd part is that this sticks itself into a bare part of the >>>>> register state not surrounded by any LRI and after a BB_END. This >>>>> routine can only work for established contexts, it should not work for >>>>> execlists_init_reg_state. >>>> >>>> Unless I am missing something change is completely mechanical, so any >>>> question marks you have are already there, right? What do you suggest is >>>> the action here? >>> >>> Sure, the only thing I question of this patch itself is whether >>> CTX_REG() is simply too much horrible obfuscating magic. >> >> Turn a blind eye if the perceived badness factor is below some >> threshold? Following patch depends on this one, so if I have to drop >> this one, then I have to rework the next one etc.. well, it's not the >> worst problem, so yeah, whatever. Make a call and let me know. > > The patch was fine, just worrying about the surrounding code. I misunderstood. So only about ctx_oactxctrl_offset and ctx_flexeu0_offset from i915_perf.c? Maybe that is some OA magic, I have not idea. CC-ing Lionel in case he can shed some light on it. Regards, Tvrtko _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [CI 2/2] drm/i915/perf: reuse intel_lrc ctx regs macro 2018-08-15 8:49 ` Tvrtko Ursulin @ 2018-08-15 10:50 ` Lionel Landwerlin 2018-08-15 10:56 ` Chris Wilson 0 siblings, 1 reply; 15+ messages in thread From: Lionel Landwerlin @ 2018-08-15 10:50 UTC (permalink / raw) To: Tvrtko Ursulin, Chris Wilson, Intel-gfx, Tvrtko Ursulin On 15/08/18 09:49, Tvrtko Ursulin wrote: > > On 14/08/2018 19:57, Chris Wilson wrote: >> Quoting Tvrtko Ursulin (2018-08-14 19:49:46) >>> >>> On 13/08/2018 10:16, Chris Wilson wrote: >>>> Quoting Tvrtko Ursulin (2018-08-13 10:11:44) >>>>> >>>>> On 13/08/2018 09:16, Chris Wilson wrote: >>>>>> Quoting Tvrtko Ursulin (2018-08-13 09:02:18) >>>>>>> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com> >>>>>>> >>>>>>> Abstract the context image access a bit. >>>>>>> >>>>>>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> >>>>>>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> >>>>>>> --- >>>>>>> drivers/gpu/drm/i915/i915_perf.c | 34 >>>>>>> +++++++++++++++----------------- >>>>>>> 1 file changed, 16 insertions(+), 18 deletions(-) >>>>>>> >>>>>>> diff --git a/drivers/gpu/drm/i915/i915_perf.c >>>>>>> b/drivers/gpu/drm/i915/i915_perf.c >>>>>>> index 49597cf31707..ccb20230df2c 100644 >>>>>>> --- a/drivers/gpu/drm/i915/i915_perf.c >>>>>>> +++ b/drivers/gpu/drm/i915/i915_perf.c >>>>>>> @@ -210,6 +210,7 @@ >>>>>>> #include "i915_oa_cflgt3.h" >>>>>>> #include "i915_oa_cnl.h" >>>>>>> #include "i915_oa_icl.h" >>>>>>> +#include "intel_lrc_reg.h" >>>>>>> /* HW requires this to be a power of two, between 128k >>>>>>> and 16M, though driver >>>>>>> * is currently generally designed assuming the largest 16M >>>>>>> size is used such >>>>>>> @@ -1636,27 +1637,25 @@ static void >>>>>>> gen8_update_reg_state_unlocked(struct i915_gem_context *ctx, >>>>>>> u32 ctx_oactxctrl = >>>>>>> dev_priv->perf.oa.ctx_oactxctrl_offset; >>>>>>> u32 ctx_flexeu0 = dev_priv->perf.oa.ctx_flexeu0_offset; >>>>>>> /* The MMIO offsets for Flex EU registers aren't >>>>>>> contiguous */ >>>>>>> - u32 flex_mmio[] = { >>>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL0), >>>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL1), >>>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL2), >>>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL3), >>>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL4), >>>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL5), >>>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL6), >>>>>>> + i915_reg_t flex_regs[] = { >>>>>>> + EU_PERF_CNTL0, >>>>>>> + EU_PERF_CNTL1, >>>>>>> + EU_PERF_CNTL2, >>>>>>> + EU_PERF_CNTL3, >>>>>>> + EU_PERF_CNTL4, >>>>>>> + EU_PERF_CNTL5, >>>>>>> + EU_PERF_CNTL6, >>>>>>> }; >>>>>>> int i; >>>>>>> - reg_state[ctx_oactxctrl] = >>>>>>> i915_mmio_reg_offset(GEN8_OACTXCONTROL); >>>>>>> - reg_state[ctx_oactxctrl+1] = >>>>>>> (dev_priv->perf.oa.period_exponent << >>>>>>> - GEN8_OA_TIMER_PERIOD_SHIFT) | >>>>>>> - (dev_priv->perf.oa.periodic ? >>>>>>> - GEN8_OA_TIMER_ENABLE : 0) | >>>>>>> - GEN8_OA_COUNTER_RESUME; >>>>>>> + CTX_REG(reg_state, ctx_oactxctrl, GEN8_OACTXCONTROL, >>>>>>> + (dev_priv->perf.oa.period_exponent << >>>>>>> GEN8_OA_TIMER_PERIOD_SHIFT) | >>>>>>> + (dev_priv->perf.oa.periodic ? >>>>>>> GEN8_OA_TIMER_ENABLE : 0) | >>>>>>> + GEN8_OA_COUNTER_RESUME); >>>>>> >>>>>> I'll be honest but, I don't think it's CTX_REG() that helps >>>>>> improve the >>>>>> readability here. >>>>>> >>>>>> The really odd part is that this sticks itself into a bare part >>>>>> of the >>>>>> register state not surrounded by any LRI and after a BB_END. This >>>>>> routine can only work for established contexts, it should not >>>>>> work for >>>>>> execlists_init_reg_state. >>>>> >>>>> Unless I am missing something change is completely mechanical, so any >>>>> question marks you have are already there, right? What do you >>>>> suggest is >>>>> the action here? >>>> >>>> Sure, the only thing I question of this patch itself is whether >>>> CTX_REG() is simply too much horrible obfuscating magic. >>> >>> Turn a blind eye if the perceived badness factor is below some >>> threshold? Following patch depends on this one, so if I have to drop >>> this one, then I have to rework the next one etc.. well, it's not the >>> worst problem, so yeah, whatever. Make a call and let me know. >> >> The patch was fine, just worrying about the surrounding code. > > I misunderstood. So only about ctx_oactxctrl_offset and > ctx_flexeu0_offset from i915_perf.c? Maybe that is some OA magic, I > have not idea. CC-ing Lionel in case he can shed some light on it. Those are the offsets at which the hardware will store the OA_CTXCTRL/FLEX_EU registers values as documented. I can see that's it's a bit odd not to have the MI_LRI written before we do the first restore. I'm 99% sure I've verified in practice that application started after i915/perf is opened have the right values programmed. Not completely sure that the IGT tests cover that case though. So maybe there is a problem with the first restore... What's the value set into most register that aren't explicitly programmed in intel_lrc.c ? - Lionel > > Regards, > > Tvrtko > _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [CI 2/2] drm/i915/perf: reuse intel_lrc ctx regs macro 2018-08-15 10:50 ` Lionel Landwerlin @ 2018-08-15 10:56 ` Chris Wilson 2018-08-15 11:15 ` Lionel Landwerlin 0 siblings, 1 reply; 15+ messages in thread From: Chris Wilson @ 2018-08-15 10:56 UTC (permalink / raw) To: Intel-gfx, Lionel Landwerlin, Tvrtko Ursulin, Tvrtko Ursulin Quoting Lionel Landwerlin (2018-08-15 11:50:55) > On 15/08/18 09:49, Tvrtko Ursulin wrote: > > > > On 14/08/2018 19:57, Chris Wilson wrote: > >> Quoting Tvrtko Ursulin (2018-08-14 19:49:46) > >>> > >>> On 13/08/2018 10:16, Chris Wilson wrote: > >>>> Quoting Tvrtko Ursulin (2018-08-13 10:11:44) > >>>>> > >>>>> On 13/08/2018 09:16, Chris Wilson wrote: > >>>>>> Quoting Tvrtko Ursulin (2018-08-13 09:02:18) > >>>>>>> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com> > >>>>>>> > >>>>>>> Abstract the context image access a bit. > >>>>>>> > >>>>>>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> > >>>>>>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > >>>>>>> --- > >>>>>>> drivers/gpu/drm/i915/i915_perf.c | 34 > >>>>>>> +++++++++++++++----------------- > >>>>>>> 1 file changed, 16 insertions(+), 18 deletions(-) > >>>>>>> > >>>>>>> diff --git a/drivers/gpu/drm/i915/i915_perf.c > >>>>>>> b/drivers/gpu/drm/i915/i915_perf.c > >>>>>>> index 49597cf31707..ccb20230df2c 100644 > >>>>>>> --- a/drivers/gpu/drm/i915/i915_perf.c > >>>>>>> +++ b/drivers/gpu/drm/i915/i915_perf.c > >>>>>>> @@ -210,6 +210,7 @@ > >>>>>>> #include "i915_oa_cflgt3.h" > >>>>>>> #include "i915_oa_cnl.h" > >>>>>>> #include "i915_oa_icl.h" > >>>>>>> +#include "intel_lrc_reg.h" > >>>>>>> /* HW requires this to be a power of two, between 128k > >>>>>>> and 16M, though driver > >>>>>>> * is currently generally designed assuming the largest 16M > >>>>>>> size is used such > >>>>>>> @@ -1636,27 +1637,25 @@ static void > >>>>>>> gen8_update_reg_state_unlocked(struct i915_gem_context *ctx, > >>>>>>> u32 ctx_oactxctrl = > >>>>>>> dev_priv->perf.oa.ctx_oactxctrl_offset; > >>>>>>> u32 ctx_flexeu0 = dev_priv->perf.oa.ctx_flexeu0_offset; > >>>>>>> /* The MMIO offsets for Flex EU registers aren't > >>>>>>> contiguous */ > >>>>>>> - u32 flex_mmio[] = { > >>>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL0), > >>>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL1), > >>>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL2), > >>>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL3), > >>>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL4), > >>>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL5), > >>>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL6), > >>>>>>> + i915_reg_t flex_regs[] = { > >>>>>>> + EU_PERF_CNTL0, > >>>>>>> + EU_PERF_CNTL1, > >>>>>>> + EU_PERF_CNTL2, > >>>>>>> + EU_PERF_CNTL3, > >>>>>>> + EU_PERF_CNTL4, > >>>>>>> + EU_PERF_CNTL5, > >>>>>>> + EU_PERF_CNTL6, > >>>>>>> }; > >>>>>>> int i; > >>>>>>> - reg_state[ctx_oactxctrl] = > >>>>>>> i915_mmio_reg_offset(GEN8_OACTXCONTROL); > >>>>>>> - reg_state[ctx_oactxctrl+1] = > >>>>>>> (dev_priv->perf.oa.period_exponent << > >>>>>>> - GEN8_OA_TIMER_PERIOD_SHIFT) | > >>>>>>> - (dev_priv->perf.oa.periodic ? > >>>>>>> - GEN8_OA_TIMER_ENABLE : 0) | > >>>>>>> - GEN8_OA_COUNTER_RESUME; > >>>>>>> + CTX_REG(reg_state, ctx_oactxctrl, GEN8_OACTXCONTROL, > >>>>>>> + (dev_priv->perf.oa.period_exponent << > >>>>>>> GEN8_OA_TIMER_PERIOD_SHIFT) | > >>>>>>> + (dev_priv->perf.oa.periodic ? > >>>>>>> GEN8_OA_TIMER_ENABLE : 0) | > >>>>>>> + GEN8_OA_COUNTER_RESUME); > >>>>>> > >>>>>> I'll be honest but, I don't think it's CTX_REG() that helps > >>>>>> improve the > >>>>>> readability here. > >>>>>> > >>>>>> The really odd part is that this sticks itself into a bare part > >>>>>> of the > >>>>>> register state not surrounded by any LRI and after a BB_END. This > >>>>>> routine can only work for established contexts, it should not > >>>>>> work for > >>>>>> execlists_init_reg_state. > >>>>> > >>>>> Unless I am missing something change is completely mechanical, so any > >>>>> question marks you have are already there, right? What do you > >>>>> suggest is > >>>>> the action here? > >>>> > >>>> Sure, the only thing I question of this patch itself is whether > >>>> CTX_REG() is simply too much horrible obfuscating magic. > >>> > >>> Turn a blind eye if the perceived badness factor is below some > >>> threshold? Following patch depends on this one, so if I have to drop > >>> this one, then I have to rework the next one etc.. well, it's not the > >>> worst problem, so yeah, whatever. Make a call and let me know. > >> > >> The patch was fine, just worrying about the surrounding code. > > > > I misunderstood. So only about ctx_oactxctrl_offset and > > ctx_flexeu0_offset from i915_perf.c? Maybe that is some OA magic, I > > have not idea. CC-ing Lionel in case he can shed some light on it. > > Those are the offsets at which the hardware will store the > OA_CTXCTRL/FLEX_EU registers values as documented. > I can see that's it's a bit odd not to have the MI_LRI written before we > do the first restore. > > I'm 99% sure I've verified in practice that application started after > i915/perf is opened have the right values programmed. > Not completely sure that the IGT tests cover that case though. > So maybe there is a problem with the first restore... > > What's the value set into most register that aren't explicitly > programmed in intel_lrc.c ? That would be whatever was previously there In practice, since we now initialise all real contexts from the default state, the LRI will be there. execlists_init_reg_state() is wonderfully, perhaps dangerously, misleading atm. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [CI 2/2] drm/i915/perf: reuse intel_lrc ctx regs macro 2018-08-15 10:56 ` Chris Wilson @ 2018-08-15 11:15 ` Lionel Landwerlin 0 siblings, 0 replies; 15+ messages in thread From: Lionel Landwerlin @ 2018-08-15 11:15 UTC (permalink / raw) To: Chris Wilson, Intel-gfx, Tvrtko Ursulin, Tvrtko Ursulin On 15/08/18 11:56, Chris Wilson wrote: > Quoting Lionel Landwerlin (2018-08-15 11:50:55) >> On 15/08/18 09:49, Tvrtko Ursulin wrote: >>> On 14/08/2018 19:57, Chris Wilson wrote: >>>> Quoting Tvrtko Ursulin (2018-08-14 19:49:46) >>>>> On 13/08/2018 10:16, Chris Wilson wrote: >>>>>> Quoting Tvrtko Ursulin (2018-08-13 10:11:44) >>>>>>> On 13/08/2018 09:16, Chris Wilson wrote: >>>>>>>> Quoting Tvrtko Ursulin (2018-08-13 09:02:18) >>>>>>>>> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com> >>>>>>>>> >>>>>>>>> Abstract the context image access a bit. >>>>>>>>> >>>>>>>>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> >>>>>>>>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> >>>>>>>>> --- >>>>>>>>> drivers/gpu/drm/i915/i915_perf.c | 34 >>>>>>>>> +++++++++++++++----------------- >>>>>>>>> 1 file changed, 16 insertions(+), 18 deletions(-) >>>>>>>>> >>>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_perf.c >>>>>>>>> b/drivers/gpu/drm/i915/i915_perf.c >>>>>>>>> index 49597cf31707..ccb20230df2c 100644 >>>>>>>>> --- a/drivers/gpu/drm/i915/i915_perf.c >>>>>>>>> +++ b/drivers/gpu/drm/i915/i915_perf.c >>>>>>>>> @@ -210,6 +210,7 @@ >>>>>>>>> #include "i915_oa_cflgt3.h" >>>>>>>>> #include "i915_oa_cnl.h" >>>>>>>>> #include "i915_oa_icl.h" >>>>>>>>> +#include "intel_lrc_reg.h" >>>>>>>>> /* HW requires this to be a power of two, between 128k >>>>>>>>> and 16M, though driver >>>>>>>>> * is currently generally designed assuming the largest 16M >>>>>>>>> size is used such >>>>>>>>> @@ -1636,27 +1637,25 @@ static void >>>>>>>>> gen8_update_reg_state_unlocked(struct i915_gem_context *ctx, >>>>>>>>> u32 ctx_oactxctrl = >>>>>>>>> dev_priv->perf.oa.ctx_oactxctrl_offset; >>>>>>>>> u32 ctx_flexeu0 = dev_priv->perf.oa.ctx_flexeu0_offset; >>>>>>>>> /* The MMIO offsets for Flex EU registers aren't >>>>>>>>> contiguous */ >>>>>>>>> - u32 flex_mmio[] = { >>>>>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL0), >>>>>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL1), >>>>>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL2), >>>>>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL3), >>>>>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL4), >>>>>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL5), >>>>>>>>> - i915_mmio_reg_offset(EU_PERF_CNTL6), >>>>>>>>> + i915_reg_t flex_regs[] = { >>>>>>>>> + EU_PERF_CNTL0, >>>>>>>>> + EU_PERF_CNTL1, >>>>>>>>> + EU_PERF_CNTL2, >>>>>>>>> + EU_PERF_CNTL3, >>>>>>>>> + EU_PERF_CNTL4, >>>>>>>>> + EU_PERF_CNTL5, >>>>>>>>> + EU_PERF_CNTL6, >>>>>>>>> }; >>>>>>>>> int i; >>>>>>>>> - reg_state[ctx_oactxctrl] = >>>>>>>>> i915_mmio_reg_offset(GEN8_OACTXCONTROL); >>>>>>>>> - reg_state[ctx_oactxctrl+1] = >>>>>>>>> (dev_priv->perf.oa.period_exponent << >>>>>>>>> - GEN8_OA_TIMER_PERIOD_SHIFT) | >>>>>>>>> - (dev_priv->perf.oa.periodic ? >>>>>>>>> - GEN8_OA_TIMER_ENABLE : 0) | >>>>>>>>> - GEN8_OA_COUNTER_RESUME; >>>>>>>>> + CTX_REG(reg_state, ctx_oactxctrl, GEN8_OACTXCONTROL, >>>>>>>>> + (dev_priv->perf.oa.period_exponent << >>>>>>>>> GEN8_OA_TIMER_PERIOD_SHIFT) | >>>>>>>>> + (dev_priv->perf.oa.periodic ? >>>>>>>>> GEN8_OA_TIMER_ENABLE : 0) | >>>>>>>>> + GEN8_OA_COUNTER_RESUME); >>>>>>>> I'll be honest but, I don't think it's CTX_REG() that helps >>>>>>>> improve the >>>>>>>> readability here. >>>>>>>> >>>>>>>> The really odd part is that this sticks itself into a bare part >>>>>>>> of the >>>>>>>> register state not surrounded by any LRI and after a BB_END. This >>>>>>>> routine can only work for established contexts, it should not >>>>>>>> work for >>>>>>>> execlists_init_reg_state. >>>>>>> Unless I am missing something change is completely mechanical, so any >>>>>>> question marks you have are already there, right? What do you >>>>>>> suggest is >>>>>>> the action here? >>>>>> Sure, the only thing I question of this patch itself is whether >>>>>> CTX_REG() is simply too much horrible obfuscating magic. >>>>> Turn a blind eye if the perceived badness factor is below some >>>>> threshold? Following patch depends on this one, so if I have to drop >>>>> this one, then I have to rework the next one etc.. well, it's not the >>>>> worst problem, so yeah, whatever. Make a call and let me know. >>>> The patch was fine, just worrying about the surrounding code. >>> I misunderstood. So only about ctx_oactxctrl_offset and >>> ctx_flexeu0_offset from i915_perf.c? Maybe that is some OA magic, I >>> have not idea. CC-ing Lionel in case he can shed some light on it. >> Those are the offsets at which the hardware will store the >> OA_CTXCTRL/FLEX_EU registers values as documented. >> I can see that's it's a bit odd not to have the MI_LRI written before we >> do the first restore. >> >> I'm 99% sure I've verified in practice that application started after >> i915/perf is opened have the right values programmed. >> Not completely sure that the IGT tests cover that case though. >> So maybe there is a problem with the first restore... >> >> What's the value set into most register that aren't explicitly >> programmed in intel_lrc.c ? > That would be whatever was previously there > > In practice, since we now initialise all real contexts from the default > state, the LRI will be there. execlists_init_reg_state() is wonderfully, > perhaps dangerously, misleading atm. > -Chris > Then I guess it's all fine :) Thanks, - Lionel _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* ✓ Fi.CI.BAT: success for Reviewed perf cleanups 2018-08-13 8:02 [CI 0/2] Reviewed perf cleanups Tvrtko Ursulin 2018-08-13 8:02 ` [CI 1/2] drm/i915/perf: simplify configure all context function Tvrtko Ursulin 2018-08-13 8:02 ` [CI 2/2] drm/i915/perf: reuse intel_lrc ctx regs macro Tvrtko Ursulin @ 2018-08-13 8:33 ` Patchwork 2018-08-13 10:00 ` ✓ Fi.CI.IGT: " Patchwork 3 siblings, 0 replies; 15+ messages in thread From: Patchwork @ 2018-08-13 8:33 UTC (permalink / raw) To: Tvrtko Ursulin; +Cc: intel-gfx == Series Details == Series: Reviewed perf cleanups URL : https://patchwork.freedesktop.org/series/48100/ State : success == Summary == = CI Bug Log - changes from CI_DRM_4660 -> Patchwork_9926 = == Summary - SUCCESS == No regressions found. External URL: https://patchwork.freedesktop.org/api/1.0/series/48100/revisions/1/mbox/ == Known issues == Here are the changes found in Patchwork_9926 that come from known issues: === IGT changes === ==== Possible fixes ==== igt@drv_selftest@live_hangcheck: fi-skl-guc: DMESG-FAIL (fdo#107174) -> PASS igt@kms_pipe_crc_basic@nonblocking-crc-pipe-a: {fi-byt-clapper}: FAIL (fdo#107362) -> PASS igt@kms_pipe_crc_basic@suspend-read-crc-pipe-c: fi-skl-guc: FAIL (fdo#103191) -> PASS {name}: This element is suppressed. This means it is ignored when computing the status of the difference (SUCCESS, WARNING, or FAILURE). fdo#103191 https://bugs.freedesktop.org/show_bug.cgi?id=103191 fdo#107174 https://bugs.freedesktop.org/show_bug.cgi?id=107174 fdo#107362 https://bugs.freedesktop.org/show_bug.cgi?id=107362 == Participating hosts (54 -> 49) == Missing (5): fi-ctg-p8600 fi-ilk-m540 fi-byt-squawks fi-bsw-cyan fi-hsw-4200u == Build changes == * Linux: CI_DRM_4660 -> Patchwork_9926 CI_DRM_4660: 9cd882754c1017e68ed9d2c8d57dc326530522fe @ git://anongit.freedesktop.org/gfx-ci/linux IGT_4592: fce9638b2e60afce872b3056c19a729b1b3708be @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools Patchwork_9926: fbf730a0a4a717a5f3a151e398f50ca6b9de1757 @ git://anongit.freedesktop.org/gfx-ci/linux == Linux commits == fbf730a0a4a7 drm/i915/perf: reuse intel_lrc ctx regs macro fecf631c038e drm/i915/perf: simplify configure all context function == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9926/issues.html _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* ✓ Fi.CI.IGT: success for Reviewed perf cleanups 2018-08-13 8:02 [CI 0/2] Reviewed perf cleanups Tvrtko Ursulin ` (2 preceding siblings ...) 2018-08-13 8:33 ` ✓ Fi.CI.BAT: success for Reviewed perf cleanups Patchwork @ 2018-08-13 10:00 ` Patchwork 2018-08-31 15:27 ` Chris Wilson 3 siblings, 1 reply; 15+ messages in thread From: Patchwork @ 2018-08-13 10:00 UTC (permalink / raw) To: Tvrtko Ursulin; +Cc: intel-gfx == Series Details == Series: Reviewed perf cleanups URL : https://patchwork.freedesktop.org/series/48100/ State : success == Summary == = CI Bug Log - changes from CI_DRM_4660_full -> Patchwork_9926_full = == Summary - SUCCESS == No regressions found. == Known issues == Here are the changes found in Patchwork_9926_full that come from known issues: === IGT changes === ==== Issues hit ==== igt@drv_suspend@fence-restore-untiled: shard-glk: PASS -> FAIL (fdo#103375) igt@drv_suspend@shrink: shard-apl: PASS -> FAIL (fdo#106886) igt@kms_busy@extended-modeset-hang-newfb-with-reset-render-a: shard-snb: PASS -> INCOMPLETE (fdo#105411) igt@kms_setmode@basic: shard-apl: PASS -> FAIL (fdo#99912) ==== Possible fixes ==== igt@gem_eio@reset-stress: shard-apl: FAIL -> PASS fdo#103375 https://bugs.freedesktop.org/show_bug.cgi?id=103375 fdo#105411 https://bugs.freedesktop.org/show_bug.cgi?id=105411 fdo#106886 https://bugs.freedesktop.org/show_bug.cgi?id=106886 fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912 == Participating hosts (5 -> 5) == No changes in participating hosts == Build changes == * Linux: CI_DRM_4660 -> Patchwork_9926 CI_DRM_4660: 9cd882754c1017e68ed9d2c8d57dc326530522fe @ git://anongit.freedesktop.org/gfx-ci/linux IGT_4592: fce9638b2e60afce872b3056c19a729b1b3708be @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools Patchwork_9926: fbf730a0a4a717a5f3a151e398f50ca6b9de1757 @ git://anongit.freedesktop.org/gfx-ci/linux piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9926/shards.html _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: ✓ Fi.CI.IGT: success for Reviewed perf cleanups 2018-08-13 10:00 ` ✓ Fi.CI.IGT: " Patchwork @ 2018-08-31 15:27 ` Chris Wilson 0 siblings, 0 replies; 15+ messages in thread From: Chris Wilson @ 2018-08-31 15:27 UTC (permalink / raw) To: Patchwork, Tvrtko Ursulin; +Cc: intel-gfx Quoting Patchwork (2018-08-13 11:00:57) > == Series Details == > > Series: Reviewed perf cleanups > URL : https://patchwork.freedesktop.org/series/48100/ > State : success > > == Summary == > > = CI Bug Log - changes from CI_DRM_4660_full -> Patchwork_9926_full = > > == Summary - SUCCESS == > > No regressions found. And pushed. Ta, -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2018-08-31 15:27 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2018-08-13 8:02 [CI 0/2] Reviewed perf cleanups Tvrtko Ursulin 2018-08-13 8:02 ` [CI 1/2] drm/i915/perf: simplify configure all context function Tvrtko Ursulin 2018-08-13 8:02 ` [CI 2/2] drm/i915/perf: reuse intel_lrc ctx regs macro Tvrtko Ursulin 2018-08-13 8:16 ` Chris Wilson 2018-08-13 9:11 ` Tvrtko Ursulin 2018-08-13 9:16 ` Chris Wilson 2018-08-14 18:49 ` Tvrtko Ursulin 2018-08-14 18:57 ` Chris Wilson 2018-08-15 8:49 ` Tvrtko Ursulin 2018-08-15 10:50 ` Lionel Landwerlin 2018-08-15 10:56 ` Chris Wilson 2018-08-15 11:15 ` Lionel Landwerlin 2018-08-13 8:33 ` ✓ Fi.CI.BAT: success for Reviewed perf cleanups Patchwork 2018-08-13 10:00 ` ✓ Fi.CI.IGT: " Patchwork 2018-08-31 15:27 ` Chris Wilson
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).