From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, Intel-gfx@lists.freedesktop.org
Subject: Re: [RFC] drm/i915: Emit to ringbuffer directly
Date: Fri, 9 Sep 2016 09:32:50 +0100
Message-ID: <57D273B2.4010203@linux.intel.com>
In-Reply-To: <20160908164041.GB5479@nuc-i3427.alporthouse.com>


On 08/09/16 17:40, Chris Wilson wrote:
> On Thu, Sep 08, 2016 at 04:12:55PM +0100, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> This removes the usage of intel_ring_emit in favour of
>> directly writing to the ring buffer.
>
> I have the same patch! But I called it out, for historical reasons.

Yes, I know we talked about it in the past, but I did not think you would
find time to actually write it amongst all the other things.

> Oh, except mine uses out[0]...out[N] because gcc prefers that over
> *out++ = ...

It copes just fine with the latter here, for example:

	*rbuf++ = cmd;
	*rbuf++ = I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT;
	*rbuf++ = 0; /* upper addr */
	*rbuf++ = 0; /* value */

compiles to:

      3e9:       89 10                   mov    %edx,(%rax)
      3eb:       c7 40 04 04 01 00 00    movl   $0x104,0x4(%rax)
      3f2:       c7 40 08 00 00 00 00    movl   $0x0,0x8(%rax)
      3f9:       c7 40 0c 00 00 00 00    movl   $0x0,0xc(%rax)

And for the record, before this patch, with intel_ring_emit:

      53a:       8b 53 3c                mov    0x3c(%rbx),%edx
      53d:       48 8b 4b 08             mov    0x8(%rbx),%rcx
      541:       89 04 11                mov    %eax,(%rcx,%rdx,1)
      544:       8b 43 3c                mov    0x3c(%rbx),%eax
      547:       48 8b 53 08             mov    0x8(%rbx),%rdx
      54b:       83 c0 04                add    $0x4,%eax
      54e:       89 43 3c                mov    %eax,0x3c(%rbx)
      551:       c7 04 02 04 01 00 00    movl   $0x104,(%rdx,%rax,1)
      558:       8b 43 3c                mov    0x3c(%rbx),%eax
      55b:       48 8b 53 08             mov    0x8(%rbx),%rdx
      55f:       83 c0 04                add    $0x4,%eax
      562:       89 43 3c                mov    %eax,0x3c(%rbx)
      565:       c7 04 02 00 00 00 00    movl   $0x0,(%rdx,%rax,1)
      56c:       8b 43 3c                mov    0x3c(%rbx),%eax
      56f:       48 8b 53 08             mov    0x8(%rbx),%rdx
      573:       83 c0 04                add    $0x4,%eax
      576:       89 43 3c                mov    %eax,0x3c(%rbx)
      579:       c7 04 02 00 00 00 00    movl   $0x0,(%rdx,%rax,1)

Yuck :) At least they are not function calls to iowrite any more. :)
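
For anyone who wants to reproduce the comparison outside the driver, here is a
minimal user-space sketch of the three styles discussed above. Everything in it
is an assumption made for illustration: struct toy_ring, toy_ring_emit() and
toy_ring_begin() are hypothetical stand-ins, not the i915 definitions, and
there is no wrap handling or real space accounting. A likely reason for the
verbose code is that the compiler cannot prove the store into the ring leaves
the struct fields untouched, so it reloads the base and tail and writes the
tail back around every emit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the driver's ring; field names are illustrative. */
struct toy_ring {
	void *vaddr;          /* start of the ring's CPU mapping */
	uint32_t tail;        /* byte offset of the next write */
	uint32_t storage[16]; /* backing memory for this toy only */
};

static void toy_ring_init(struct toy_ring *ring)
{
	memset(ring, 0, sizeof(*ring));
	ring->vaddr = ring->storage;
}

/* Old style: each write goes through the struct and bumps tail by one dword. */
static inline void toy_ring_emit(struct toy_ring *ring, uint32_t data)
{
	*(uint32_t *)((char *)ring->vaddr + ring->tail) = data;
	ring->tail += 4;
}

/* New style: advance tail once up front and hand back a plain pointer. */
static inline uint32_t *toy_ring_begin(struct toy_ring *ring,
				       unsigned int num_dwords)
{
	uint32_t *out = (uint32_t *)((char *)ring->vaddr + ring->tail);

	/* Toy bounds check only; a real ring would wait for space or wrap. */
	assert(ring->tail + num_dwords * 4 <= sizeof(ring->storage));
	ring->tail += num_dwords * 4;
	return out;
}

/* Four dwords written through the emit helper. */
static void flush_with_emit(struct toy_ring *ring, uint32_t cmd, uint32_t addr)
{
	toy_ring_emit(ring, cmd);
	toy_ring_emit(ring, addr);
	toy_ring_emit(ring, 0); /* upper addr */
	toy_ring_emit(ring, 0); /* value */
}

/* The same four dwords written with the *rbuf++ form. */
static void flush_with_pointer(struct toy_ring *ring, uint32_t cmd, uint32_t addr)
{
	uint32_t *rbuf = toy_ring_begin(ring, 4);

	*rbuf++ = cmd;
	*rbuf++ = addr;
	*rbuf++ = 0; /* upper addr */
	*rbuf++ = 0; /* value */
}

/* The same four dwords written with the out[0]...out[N] form. */
static void flush_with_index(struct toy_ring *ring, uint32_t cmd, uint32_t addr)
{
	uint32_t *out = toy_ring_begin(ring, 4);

	out[0] = cmd;
	out[1] = addr;
	out[2] = 0; /* upper addr */
	out[3] = 0; /* value */
}
```

All three variants leave the same four dwords in the ring and the same tail;
only the generated code differs. Building this with optimisation enabled and
running objdump -d on the object file should allow a comparison along the lines
of the listings above, though the exact instructions depend on the compiler
and flags.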

>> intel_ring_emit was preventing the compiler from optimising
>> fetch and increment of the current ring buffer pointer and
>> therefore generating very verbose code for every write.
>>
>> It had no useful purpose since all ringbuffer operations
>> are started and ended with intel_ring_begin and
>> intel_ring_advance respectively, with no bail out in the
>> middle possible, so it is fine to increment the tail in
>> intel_ring_begin and let the code manage the pointer
>> itself.
>>
>> Useless instruction removal amounts to approximately
>> 2384 bytes of saved text on my build.
>>
>> Not sure if this has any measurable performance
>> implications but executing a ton of useless instructions
>> on fast paths cannot be good.
>
> It does show up in perf.

Cool.

>> Patch is not fully polished, but it compiles and runs
>> on Gen9 at least.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_gem_context.c    |  62 ++--
>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c |  27 +-
>>   drivers/gpu/drm/i915/i915_gem_gtt.c        |  57 ++--
>>   drivers/gpu/drm/i915/intel_display.c       | 113 ++++---
>>   drivers/gpu/drm/i915/intel_lrc.c           | 223 +++++++-------
>>   drivers/gpu/drm/i915/intel_mocs.c          |  43 +--
>>   drivers/gpu/drm/i915/intel_overlay.c       |  69 ++---
>>   drivers/gpu/drm/i915/intel_ringbuffer.c    | 480 +++++++++++++++--------------
>>   drivers/gpu/drm/i915/intel_ringbuffer.h    |  19 +-
>>   9 files changed, 555 insertions(+), 538 deletions(-)
>
> Hmm, mine is bigger.
>
>   drivers/gpu/drm/i915/i915_gem_context.c    |  85 ++--
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c |  37 +-
>   drivers/gpu/drm/i915/i915_gem_gtt.c        |  62 +--
>   drivers/gpu/drm/i915/i915_gem_request.c    | 135 ++++-
>   drivers/gpu/drm/i915/i915_gem_request.h    |   2 +
>   drivers/gpu/drm/i915/intel_display.c       | 133 +++--
>   drivers/gpu/drm/i915/intel_lrc.c           | 188 ++++---
>   drivers/gpu/drm/i915/intel_lrc.h           |   2 -
>   drivers/gpu/drm/i915/intel_mocs.c          |  50 +-
>   drivers/gpu/drm/i915/intel_overlay.c       |  77 ++-
>   drivers/gpu/drm/i915/intel_ringbuffer.c    | 762 ++++++++++++-----------------
>   drivers/gpu/drm/i915/intel_ringbuffer.h    |  36 +-
>   12 files changed, 721 insertions(+), 848 deletions(-)
>
> (this includes moving the intel_ring_begin to i915_gem_request)
>
> plus an earlier
>
>   drivers/gpu/drm/i915/i915_gem_request.c |  26 ++---
>   drivers/gpu/drm/i915/intel_lrc.c        | 121 ++++++++---------------
>   drivers/gpu/drm/i915/intel_ringbuffer.c | 168 +++++++++++---------------------
>   drivers/gpu/drm/i915/intel_ringbuffer.h |  10 +-
>   4 files changed, 112 insertions(+), 213 deletions(-)
>
> since I wanted parts of it for emitting timelines.

OK, what do you want to do?

Regards,

Tvrtko



