All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, Intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/2] drm/i915: Cache some registers in execlists hot paths
Date: Thu, 24 Mar 2016 11:31:34 +0000	[thread overview]
Message-ID: <56F3D016.1060404@linux.intel.com> (raw)
In-Reply-To: <20160324111628.GD27742@nuc-i3427.alporthouse.com>


On 24/03/16 11:16, Chris Wilson wrote:
> On Wed, Mar 23, 2016 at 03:15:02PM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> GCC cannot optimize well calculations hidden in macros and
>> assigned to temporary structures. We can cache the register in
>> ELSP write, and refactor reading of the CSB a bit to enable
>> it to do a better job.
>>
>> Code is still equally readable but the generated body of the
>> CSB read loop is 30% smaller, and since that loop runs at
>> least once per interrupt, which in turn can fire in tens or
>> hundreds thousands times per second, must be of some value.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_lrc.c | 26 +++++++++++++-------------
>>   drivers/gpu/drm/i915/intel_lrc.h | 11 ++++++++---
>>   2 files changed, 21 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index 3a23b9549f7b..67592f8354d6 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -361,8 +361,8 @@ static void execlists_elsp_write(struct drm_i915_gem_request *rq0,
>>   {
>>
>>   	struct intel_engine_cs *engine = rq0->engine;
>> -	struct drm_device *dev = engine->dev;
>> -	struct drm_i915_private *dev_priv = dev->dev_private;
>> +	struct drm_i915_private *dev_priv = rq0->i915;
>> +	i915_reg_t elsp_reg = RING_ELSP(engine);
>
> If we are going to open-code it, how about going whole hog
> and
>
> /* We opencode I915_WRITE_FW(RING_ELSP(engine)) here to save gcc the
>   * trouble of doing so - gcc fails miserably!
>   */
> u32 *elsp = rq->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
> then use writel(upper_32_bits(desc[1]), elsp);
>
> Keeping the comment around for grepping.

Yeah I had that but thought it will be considered to tasteless for 
posting. :)

>>   	uint64_t desc[2];
>>
>>   	if (rq1) {
>> @@ -376,12 +376,12 @@ static void execlists_elsp_write(struct drm_i915_gem_request *rq0,
>>   	rq0->elsp_submitted++;
>>
>>   	/* You must always write both descriptors in the order below. */
>> -	I915_WRITE_FW(RING_ELSP(engine), upper_32_bits(desc[1]));
>> -	I915_WRITE_FW(RING_ELSP(engine), lower_32_bits(desc[1]));
>> +	I915_WRITE_FW(elsp_reg, upper_32_bits(desc[1]));
>> +	I915_WRITE_FW(elsp_reg, lower_32_bits(desc[1]));
>>
>> -	I915_WRITE_FW(RING_ELSP(engine), upper_32_bits(desc[0]));
>> +	I915_WRITE_FW(elsp_reg, upper_32_bits(desc[0]));
>>   	/* The context is automatically loaded after the following */
>> -	I915_WRITE_FW(RING_ELSP(engine), lower_32_bits(desc[0]));
>> +	I915_WRITE_FW(elsp_reg, lower_32_bits(desc[0]));
>>
>>   	/* ELSP is a wo register, use another nearby reg for posting */
>>   	POSTING_READ_FW(RING_EXECLIST_STATUS_LO(engine));
>
> Observing the above, we can also kill the POSTING_READ_FW() which will
> make a much bigger improvement than all of the above.

It is a different register, why it can be removed?

>> @@ -517,21 +517,19 @@ execlists_check_remove_request(struct intel_engine_cs *engine, u32 request_id)
>>   }
>>
>>   static u32
>> -get_context_status(struct intel_engine_cs *engine, unsigned int read_pointer,
>> -		   u32 *context_id)
>> +get_context_status(struct drm_i915_private *dev_priv, u32 csb_base,
>> +		   unsigned int read_pointer, u32 *context_id)
>>   {
>> -	struct drm_i915_private *dev_priv = engine->dev->dev_private;
>>   	u32 status;
>>
>>   	read_pointer %= GEN8_CSB_ENTRIES;
>>
>> -	status = I915_READ_FW(RING_CONTEXT_STATUS_BUF_LO(engine, read_pointer));
>> +	status = I915_READ_FW(RING_CSB_LO(csb_base, read_pointer));
>
> If we forgo the "convenience" interface here, could we not also improve
> the readability of the code by just having the csb ringbuffer and readl?

The same whole-hog open coding you mean like the above? It is slightly 
more efficient, not sure that is is more readable so maybe I am not 
getting exactly what you mean.

Regards,

Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2016-03-24 11:31 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-23 15:15 [PATCH 1/2] drm/i915: Cache some registers in execlists hot paths Tvrtko Ursulin
2016-03-23 15:15 ` [PATCH 2/2] drm/i915: Remove impossibe checks from i915_gem_request_add_to_client Tvrtko Ursulin
2016-03-23 15:31   ` Chris Wilson
2016-03-24  9:49     ` Tvrtko Ursulin
2016-03-24 10:11       ` Chris Wilson
2016-03-23 16:06 ` ✗ Fi.CI.BAT: failure for series starting with [1/2] drm/i915: Cache some registers in execlists hot paths Patchwork
2016-03-24 11:16 ` [PATCH 1/2] " Chris Wilson
2016-03-24 11:31   ` Tvrtko Ursulin [this message]
2016-03-24 11:53     ` Chris Wilson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56F3D016.1060404@linux.intel.com \
    --to=tvrtko.ursulin@linux.intel.com \
    --cc=Intel-gfx@lists.freedesktop.org \
    --cc=chris@chris-wilson.co.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.