From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, Intel-gfx@lists.freedesktop.org
Subject: Re: [RFC] drm/i915: Emit to ringbuffer directly
Date: Fri, 9 Sep 2016 09:32:50 +0100
Message-ID: <57D273B2.4010203@linux.intel.com>
In-Reply-To: <20160908164041.GB5479@nuc-i3427.alporthouse.com>


On 08/09/16 17:40, Chris Wilson wrote:
> On Thu, Sep 08, 2016 at 04:12:55PM +0100, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> This removes the usage of intel_ring_emit in favour of
>> directly writing to the ring buffer.
>
> I have the same patch! But I called it out, for historical reasons.

Yes, I know we talked about it in the past, but I did not think you would
find time to actually write it amongst all the other things.

> Oh, except mine uses out[0]...out[N] because gcc prefers that over
> *out++ = ...

It copes just fine with the latter here, for example:

	*rbuf++ = cmd;
	*rbuf++ = I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT;
	*rbuf++ = 0; /* upper addr */
	*rbuf++ = 0; /* value */

Compiles to:

      3e9:       89 10                   mov    %edx,(%rax)
      3eb:       c7 40 04 04 01 00 00    movl   $0x104,0x4(%rax)
      3f2:       c7 40 08 00 00 00 00    movl   $0x0,0x8(%rax)
      3f9:       c7 40 0c 00 00 00 00    movl   $0x0,0xc(%rax)

And for the record, before this patch, with intel_ring_emit:

      53a:       8b 53 3c                mov    0x3c(%rbx),%edx
      53d:       48 8b 4b 08             mov    0x8(%rbx),%rcx
      541:       89 04 11                mov    %eax,(%rcx,%rdx,1)
      544:       8b 43 3c                mov    0x3c(%rbx),%eax
      547:       48 8b 53 08             mov    0x8(%rbx),%rdx
      54b:       83 c0 04                add    $0x4,%eax
      54e:       89 43 3c                mov    %eax,0x3c(%rbx)
      551:       c7 04 02 04 01 00 00    movl   $0x104,(%rdx,%rax,1)
      558:       8b 43 3c                mov    0x3c(%rbx),%eax
      55b:       48 8b 53 08             mov    0x8(%rbx),%rdx
      55f:       83 c0 04                add    $0x4,%eax
      562:       89 43 3c                mov    %eax,0x3c(%rbx)
      565:       c7 04 02 00 00 00 00    movl   $0x0,(%rdx,%rax,1)
      56c:       8b 43 3c                mov    0x3c(%rbx),%eax
      56f:       48 8b 53 08             mov    0x8(%rbx),%rdx
      573:       83 c0 04                add    $0x4,%eax
      576:       89 43 3c                mov    %eax,0x3c(%rbx)
      579:       c7 04 02 00 00 00 00    movl   $0x0,(%rdx,%rax,1)

Yuck :) At least they are not function calls to iowrite any more. :)
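
To make the difference concrete outside the driver, here is a minimal
user-space sketch of the two styles. Everything named toy_* is hypothetical
and is not the i915 code: the struct layout is invented, there is no wrap
handling, and the constant 0x104 is simply the immediate visible in the
disassembly above. The likely reason for the verbose code in the old style
is that each emit reads and writes the tail through the struct, so the
compiler reloads it around every store; the new style bumps the tail once
and then writes through a local pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_RING_DWORDS 16

/* Hypothetical stand-in for the driver's ring; not the real i915 layout. */
struct toy_ring {
	uint32_t buf[TOY_RING_DWORDS];	/* backing store for commands */
	uint32_t tail;			/* byte offset of the next write */
};

/* Old style: every emit reads and updates tail through the struct. */
static void toy_ring_emit(struct toy_ring *ring, uint32_t data)
{
	ring->buf[ring->tail / sizeof(uint32_t)] = data;
	ring->tail += sizeof(uint32_t);
}

/*
 * New style: reserve the space once, bump tail up front and hand back a
 * plain pointer. This sketch asserts instead of handling wrap-around.
 */
static uint32_t *toy_ring_begin(struct toy_ring *ring, unsigned int num_dwords)
{
	uint32_t *out = &ring->buf[ring->tail / sizeof(uint32_t)];

	assert(ring->tail / sizeof(uint32_t) + num_dwords <= TOY_RING_DWORDS);
	ring->tail += num_dwords * sizeof(uint32_t);
	return out;
}

/* Four-dword command written with the per-dword helper. */
static void toy_flush_old(struct toy_ring *ring, uint32_t cmd)
{
	toy_ring_emit(ring, cmd);
	toy_ring_emit(ring, 0x104);
	toy_ring_emit(ring, 0); /* upper addr */
	toy_ring_emit(ring, 0); /* value */
}

/* The same command written directly through a local pointer. */
static void toy_flush_new(struct toy_ring *ring, uint32_t cmd)
{
	uint32_t *rbuf = toy_ring_begin(ring, 4);

	*rbuf++ = cmd;
	*rbuf++ = 0x104;
	*rbuf++ = 0; /* upper addr */
	*rbuf++ = 0; /* value */
}

/* Returns 1 when both styles leave identical ring contents and tail. */
int toy_rings_match(uint32_t cmd)
{
	struct toy_ring a, b;

	memset(&a, 0, sizeof(a));
	memset(&b, 0, sizeof(b));
	toy_flush_old(&a, cmd);
	toy_flush_new(&b, cmd);

	return a.tail == b.tail && memcmp(a.buf, b.buf, sizeof(a.buf)) == 0;
}
```

Both variants leave the same bytes in the ring and the same tail, so the
change is purely about what the compiler has to emit per dword. Whether a
given gcc produces exactly the listings above for this toy is not something
I have checked.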

>> intel_ring_emit was preventing the compiler from optimising
>> fetch and increment of the current ring buffer pointer and
>> therefore generating very verbose code for every write.
>>
>> It had no useful purpose since all ringbuffer operations
>> are started and ended with intel_ring_begin and
>> intel_ring_advance respectively, with no bail out in the
>> middle possible, so it is fine to increment the tail in
>> intel_ring_begin and let the code manage the pointer
>> itself.
>>
>> Useless instruction removal amounts to approximately
>> 2384 bytes of saved text on my build.
>>
>> Not sure if this has any measurable performance
>> implications but executing a ton of useless instructions
>> on fast paths cannot be good.
>
> It does show up in perf.

Cool.

>> Patch is not fully polished, but it compiles and runs
>> on Gen9 at least.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_gem_context.c    |  62 ++--
>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c |  27 +-
>>   drivers/gpu/drm/i915/i915_gem_gtt.c        |  57 ++--
>>   drivers/gpu/drm/i915/intel_display.c       | 113 ++++---
>>   drivers/gpu/drm/i915/intel_lrc.c           | 223 +++++++-------
>>   drivers/gpu/drm/i915/intel_mocs.c          |  43 +--
>>   drivers/gpu/drm/i915/intel_overlay.c       |  69 ++---
>>   drivers/gpu/drm/i915/intel_ringbuffer.c    | 480 +++++++++++++++--------------
>>   drivers/gpu/drm/i915/intel_ringbuffer.h    |  19 +-
>>   9 files changed, 555 insertions(+), 538 deletions(-)
>
> Hmm, mine is bigger.
>
>   drivers/gpu/drm/i915/i915_gem_context.c    |  85 ++--
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c |  37 +-
>   drivers/gpu/drm/i915/i915_gem_gtt.c        |  62 +--
>   drivers/gpu/drm/i915/i915_gem_request.c    | 135 ++++-
>   drivers/gpu/drm/i915/i915_gem_request.h    |   2 +
>   drivers/gpu/drm/i915/intel_display.c       | 133 +++--
>   drivers/gpu/drm/i915/intel_lrc.c           | 188 ++++---
>   drivers/gpu/drm/i915/intel_lrc.h           |   2 -
>   drivers/gpu/drm/i915/intel_mocs.c          |  50 +-
>   drivers/gpu/drm/i915/intel_overlay.c       |  77 ++-
>   drivers/gpu/drm/i915/intel_ringbuffer.c    | 762 ++++++++++++-----------------
>   drivers/gpu/drm/i915/intel_ringbuffer.h    |  36 +-
>   12 files changed, 721 insertions(+), 848 deletions(-)
>
> (this includes moving the intel_ring_begin to i915_gem_request)
>
> plus an earlier
>
>   drivers/gpu/drm/i915/i915_gem_request.c |  26 ++---
>   drivers/gpu/drm/i915/intel_lrc.c        | 121 ++++++++---------------
>   drivers/gpu/drm/i915/intel_ringbuffer.c | 168 +++++++++++---------------------
>   drivers/gpu/drm/i915/intel_ringbuffer.h |  10 +-
>   4 files changed, 112 insertions(+), 213 deletions(-)
>
> since I wanted parts of it for emitting timelines.

OK, what do you want to do?

Regards,

Tvrtko


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
