From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, Intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/2] drm/i915: Cache some registers in execlists hot paths
Date: Thu, 24 Mar 2016 11:31:34 +0000 [thread overview]
Message-ID: <56F3D016.1060404@linux.intel.com> (raw)
In-Reply-To: <20160324111628.GD27742@nuc-i3427.alporthouse.com>
On 24/03/16 11:16, Chris Wilson wrote:
> On Wed, Mar 23, 2016 at 03:15:02PM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> GCC cannot optimize well calculations hidden in macros and
>> assigned to temporary structures. We can cache the register in
>> ELSP write, and refactor reading of the CSB a bit to enable
>> it to do a better job.
>>
>> Code is still equally readable but the generated body of the
>> CSB read loop is 30% smaller, and since that loop runs at
>> least once per interrupt, which in turn can fire in tens or
>> hundreds thousands times per second, must be of some value.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>> drivers/gpu/drm/i915/intel_lrc.c | 26 +++++++++++++-------------
>> drivers/gpu/drm/i915/intel_lrc.h | 11 ++++++++---
>> 2 files changed, 21 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index 3a23b9549f7b..67592f8354d6 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -361,8 +361,8 @@ static void execlists_elsp_write(struct drm_i915_gem_request *rq0,
>> {
>>
>> struct intel_engine_cs *engine = rq0->engine;
>> - struct drm_device *dev = engine->dev;
>> - struct drm_i915_private *dev_priv = dev->dev_private;
>> + struct drm_i915_private *dev_priv = rq0->i915;
>> + i915_reg_t elsp_reg = RING_ELSP(engine);
>
> If we are going to open-code it, how about going whole hog
> and
>
> /* We opencode I915_WRITE_FW(RING_ELSP(engine)) here to save gcc the
> * trouble of doing so - gcc fails miserably!
> */
> u32 *elsp = rq->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
> then use writel(upper_32_bits(desc[1]), elsp);
>
> Keeping the comment around for grepping.
Yeah I had that but thought it will be considered to tasteless for
posting. :)
>> uint64_t desc[2];
>>
>> if (rq1) {
>> @@ -376,12 +376,12 @@ static void execlists_elsp_write(struct drm_i915_gem_request *rq0,
>> rq0->elsp_submitted++;
>>
>> /* You must always write both descriptors in the order below. */
>> - I915_WRITE_FW(RING_ELSP(engine), upper_32_bits(desc[1]));
>> - I915_WRITE_FW(RING_ELSP(engine), lower_32_bits(desc[1]));
>> + I915_WRITE_FW(elsp_reg, upper_32_bits(desc[1]));
>> + I915_WRITE_FW(elsp_reg, lower_32_bits(desc[1]));
>>
>> - I915_WRITE_FW(RING_ELSP(engine), upper_32_bits(desc[0]));
>> + I915_WRITE_FW(elsp_reg, upper_32_bits(desc[0]));
>> /* The context is automatically loaded after the following */
>> - I915_WRITE_FW(RING_ELSP(engine), lower_32_bits(desc[0]));
>> + I915_WRITE_FW(elsp_reg, lower_32_bits(desc[0]));
>>
>> /* ELSP is a wo register, use another nearby reg for posting */
>> POSTING_READ_FW(RING_EXECLIST_STATUS_LO(engine));
>
> Observing the above, we can also kill the POSTING_READ_FW() which will
> make a much bigger improvement than all of the above.
It is a different register, why it can be removed?
>> @@ -517,21 +517,19 @@ execlists_check_remove_request(struct intel_engine_cs *engine, u32 request_id)
>> }
>>
>> static u32
>> -get_context_status(struct intel_engine_cs *engine, unsigned int read_pointer,
>> - u32 *context_id)
>> +get_context_status(struct drm_i915_private *dev_priv, u32 csb_base,
>> + unsigned int read_pointer, u32 *context_id)
>> {
>> - struct drm_i915_private *dev_priv = engine->dev->dev_private;
>> u32 status;
>>
>> read_pointer %= GEN8_CSB_ENTRIES;
>>
>> - status = I915_READ_FW(RING_CONTEXT_STATUS_BUF_LO(engine, read_pointer));
>> + status = I915_READ_FW(RING_CSB_LO(csb_base, read_pointer));
>
> If we forgo the "convenience" interface here, could we not also improve
> the readability of the code by just having the csb ringbuffer and readl?
The same whole-hog open coding you mean like the above? It is slightly
more efficient, not sure that is is more readable so maybe I am not
getting exactly what you mean.
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2016-03-24 11:31 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-03-23 15:15 [PATCH 1/2] drm/i915: Cache some registers in execlists hot paths Tvrtko Ursulin
2016-03-23 15:15 ` [PATCH 2/2] drm/i915: Remove impossibe checks from i915_gem_request_add_to_client Tvrtko Ursulin
2016-03-23 15:31 ` Chris Wilson
2016-03-24 9:49 ` Tvrtko Ursulin
2016-03-24 10:11 ` Chris Wilson
2016-03-23 16:06 ` ✗ Fi.CI.BAT: failure for series starting with [1/2] drm/i915: Cache some registers in execlists hot paths Patchwork
2016-03-24 11:16 ` [PATCH 1/2] " Chris Wilson
2016-03-24 11:31 ` Tvrtko Ursulin [this message]
2016-03-24 11:53 ` Chris Wilson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=56F3D016.1060404@linux.intel.com \
--to=tvrtko.ursulin@linux.intel.com \
--cc=Intel-gfx@lists.freedesktop.org \
--cc=chris@chris-wilson.co.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.