All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Christian König" <christian.koenig@amd.com>
To: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] radeon: Shrink radeon_ring_write()
Date: Mon, 18 May 2015 20:11:26 +0200	[thread overview]
Message-ID: <555A2B4E.1090309@amd.com> (raw)
In-Reply-To: <1431971955-31231-2-git-send-email-dvlasenk@redhat.com>

De-duplicating the error message is probably a good idea, but are the 
remaining code changes really necessary for the size reduction?

Regards,
Christian.

On 18.05.2015 19:59, Denys Vlasenko wrote:
> Inlined radeon_ring_write() has 729 callers, which amounts to about 50000
> bytes of code. however, deinlining it is probably too much
> of a performance impact.
>
> This patch shrinks slow path a bit and optimizes fast path.
> Comparison of generated machine code is below:
>
> old___________________________________ new____________________________
> 55             push %rbp               55         push %rbp
> 4889e5         mov  %rsp,%rbp          ff4f38     decl 0x38(%rdi)
> 4154           push %r12               4889e5     mov  %rsp,%rbp
> 4189f4         mov  %esi,%r12d         4154       push %r12
> 53             push %rbx               4189f4     mov  %esi,%r12d
> 837f3800       cmpl $0x0,0x38(%rdi)    53         push %rbx
> 4889fb         mov  %rdi,%rbx          4889fb     mov  %rdi,%rbx
> 7f0e           jg   <.Lbl>             7905       jns  <.Lbl>
> 48c7c78f51a785 mov  $message,%rdi
> 31c0           xor  %eax,%eax
> e89306f9ff     call <drm_err>          e8cbffffff call <radeon_ring_overflow>
>      .Lbl:
> 8b4328         mov  0x28(%rbx),%eax    8b5328     mov  0x28(%rbx),%edx
> 488b5308       mov  0x8(%rbx),%rdx     488b4308   mov  0x8(%rbx),%rax
> 89c1           mov  %eax,%ecx          488d0490   lea  (%rax,%rdx,4),%rax
> ffc0           inc  %eax
> 488d148a       lea  (%rdx,%rcx,4),%rdx 448920     mov  %r12d,(%rax)
> 448922         mov  %r12d,(%rdx)       8b4328     mov  0x28(%rbx),%eax
> 234354         and  0x54(%rbx),%eax    ff4b34     decl 0x34(%rbx)
> ff4b38         decl 0x38(%rbx)         ffc0       inc  %eax
> ff4b34         decl 0x34(%rbx)         234354     and  0x54(%rbx),%eax
> 894328         mov  %eax,0x28(%rbx)    894328     mov  %eax,0x28(%rbx)
> 5b             pop  %rbx               5b         pop  %rbx
> 415c           pop  %r12               415c       pop  %r12
> 5d             pop  %rbp               5d         pop  %rbp
>
> This shaves off more than 10 kbytes of code off the kernel:
>
>      text     data      bss       dec     hex filename
> 85657104 22294872 20627456 128579432 7a9f768 vmlinux.before
> 85646544 22294872 20627456 128568872 7a9ce28 vmlinux
>
> Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: linux-kernel@vger.kernel.org
> ---
>   drivers/gpu/drm/radeon/radeon.h      | 11 +++++------
>   drivers/gpu/drm/radeon/radeon_ring.c |  5 +++++
>   2 files changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
> index bb6b25c..9106873 100644
> --- a/drivers/gpu/drm/radeon/radeon.h
> +++ b/drivers/gpu/drm/radeon/radeon.h
> @@ -2658,14 +2658,13 @@ void radeon_atombios_fini(struct radeon_device *rdev);
>    *
>    * Write a value to the requested ring buffer (all asics).
>    */
> +void radeon_ring_overflow(void);
>   static inline void radeon_ring_write(struct radeon_ring *ring, uint32_t v)
>   {
> -	if (ring->count_dw <= 0)
> -		DRM_ERROR("radeon: writing more dwords to the ring than expected!\n");
> -
> -	ring->ring[ring->wptr++] = v;
> -	ring->wptr &= ring->ptr_mask;
> -	ring->count_dw--;
> +	if (--ring->count_dw < 0)
> +		radeon_ring_overflow();
> +	ring->ring[ring->wptr] = v;
> +	ring->wptr = (ring->wptr + 1) & ring->ptr_mask;
>   	ring->ring_free_dw--;
>   }
>   
> diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
> index 2456f69..8204c23 100644
> --- a/drivers/gpu/drm/radeon/radeon_ring.c
> +++ b/drivers/gpu/drm/radeon/radeon_ring.c
> @@ -126,6 +126,11 @@ int radeon_ring_alloc(struct radeon_device *rdev, struct radeon_ring *ring, unsi
>   	return 0;
>   }
>   
> +void radeon_ring_overflow(void)
> +{
> +	DRM_ERROR("radeon: writing more dwords to the ring than expected!\n");
> +}
> +
>   /**
>    * radeon_ring_lock - lock the ring and allocate space on it
>    *


  reply	other threads:[~2015-05-18 18:11 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-18 17:59 [PATCH] radeon: Deinline indirect register accessor functions Denys Vlasenko
2015-05-18 17:59 ` [PATCH] radeon: Shrink radeon_ring_write() Denys Vlasenko
2015-05-18 18:11   ` Christian König [this message]
2015-05-18 18:25     ` Denys Vlasenko
2015-05-18 19:01       ` Christian König
2015-05-18 18:06 ` [PATCH] radeon: Deinline indirect register accessor functions Christian König
2015-05-18 18:50   ` Denys Vlasenko
2015-05-18 19:04     ` Christian König
2015-05-18 19:22       ` Ilia Mirkin
2015-05-18 19:22         ` Ilia Mirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=555A2B4E.1090309@amd.com \
    --to=christian.koenig@amd.com \
    --cc=alexander.deucher@amd.com \
    --cc=dvlasenk@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.