public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Knut Petersen <Knut_Petersen@t-online.de>
To: Roman Zippel <zippel@linux-m68k.org>
Cc: Andrew Morton <akpm@osdl.org>,
	linux-fbdev-devel@lists.sourceforge.net,
	"Antonino A. Daplas" <adaplas@gmail.com>,
	Linux Kernel Development <linux-kernel@vger.kernel.org>,
	Jochen Hein <jochen@jochen.org>,
	Geert Uytterhoeven <geert@linux-m68k.org>
Subject: Re: [Linux-fbdev-devel] [PATCH 1/1 2.6.13] framebuffer: bit_putcs() optimization for 8x* fonts
Date: Wed, 31 Aug 2005 21:19:40 +0200	[thread overview]
Message-ID: <431602CC.1030008@t-online.de> (raw)
In-Reply-To: <Pine.LNX.4.61.0508311750140.3728@scrub.home>

  Hi Roman!

>>+static inline void __fb_pad_aligned_buffer(u8 *dst, u32 d_pitch, u8 *src, +
>>u32 s_pitch, u32 height)
>>+{
>>+	int i, j;
>>+
>>+	if (likely(s_pitch==1))
>>+		for(i=0; i < height; i++)
>>+			dst[d_pitch*i] = src[i];
>>    
>>

I added the multiply back because gcc (v. 3.3.4) does generate the 
fastest code
if I write it this way. I compiled, inspected the generated assembly and 
benchmarked
about a  dozend variations of the code (benchmark as previously described).

The special case for s_pitch == 1 saves about 10 ms system time (770 ms 
-> 760 ms)

The special case for s_pitch == 2 saves about 270 ms system time (2120 
-> 1850ms)
with a 16x30 font.

The third case is for even bigger fonts ... I believe that it will not 
often be used but
something like that must be present.

>You have now 3 slightly different variants of the same, which isn't really 
>an improvement. In my example I showed you how to generate the first and 
>last version from the same source.
>
The first version will only be generated when gcc can be sure that 
s_pitch is 1.
Therefore you had to explicitly call __fb_pad_aligned_buffer with that 
value:

	if (likely(idx == 1))
		__fb_pad_aligned_buffer(dst, pitch, src, 1, image.height);
	else
		fb_pad_aligned_buffer(dst, pitch, src, idx, image.height);
 
With the version I propose it´s enough to write

		__fb_pad_aligned_buffer(dst, pitch, src, idx, image.height);

instead, and you will get good performance for all cases. If the value of
idx/s_pitch is know at compile time, the compiler can and will ignore the
other cases.

>If you also want to optimize for other sizes, you might want to always 
>inline the function, if the function call overhead is the largest part 
>anyway, the special case for 2 bytes might not be needed anymore.
>  
>
fb_pad_aligned_buffer() is  usefull to save some space in cases like 
softcursor.
It´s also used by some  drivers (nvidia and riva), but the authors of 
those drivers
have to decide if they prefer the inlined version or the version fixed 
in fbmem.

>BTW this version saves another condition:
>
>static inline void __fb_pad_aligned_buffer(u8 *dst, u32 d_pitch, u8 *src, u32 s_pitch, u32 height)
>{
>	int i, j;
>
>	d_pitch -= s_pitch;
>	i = height;
>	do {
>		/* s_pitch is a few bytes at the most, memcpy is suboptimal */
>		j = s_pitch;
>		do
>			*dst++ = *src++;
>		while (--j > 0);
>		dst += d_pitch;
>	} while (--i > 0);
>}
>
I tested that code, together with the followig code in but_putcs():

             if(idx==1)
                    __fb_pad_aligned_buffer(dst, pitch, src, 1, 
image.height);
             else if (idx==2)
                    __fb_pad_aligned_buffer(dst, pitch, src, 2, 
image.height);
             else
                    fb_pad_aligned_buffer(dst, pitch, src, idx, 
image.height);
             dst += width;

It´s as fast/slow as your previous version, the measurements are almost 
identical.

Let´s summarize:

Your version of __fb_pad_aligned_buffer looks much better, but it needs 
not so nice
conditionals when used. My version looks bad, but it is easier to use 
and it is
_faster_.

cu,
 knut




  reply	other threads:[~2005-08-31 19:16 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-08-30 16:15 [PATCH 1/1 2.6.13] framebuffer: bit_putcs() optimization for 8x* fonts Knut Petersen
2005-08-30 16:18 ` [Linux-fbdev-devel] " Geert Uytterhoeven
2005-08-30 17:58   ` Knut Petersen
2005-08-30 19:13     ` Roman Zippel
2005-08-30 22:26       ` Knut Petersen
2005-08-31  0:51         ` Roman Zippel
2005-08-31  6:42           ` Antonino A. Daplas
2005-08-31 15:49             ` Roman Zippel
2005-08-31 12:46           ` Knut Petersen
2005-08-31 17:15             ` Roman Zippel
2005-08-31 19:19               ` Knut Petersen [this message]
2005-08-31 19:34                 ` Roman Zippel
2005-08-31 19:52                   ` Knut Petersen
2005-08-30 19:59     ` Geert Uytterhoeven
2005-08-31  1:14     ` Antonino A. Daplas
2005-08-31  1:14 ` Antonino A. Daplas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=431602CC.1030008@t-online.de \
    --to=knut_petersen@t-online.de \
    --cc=adaplas@gmail.com \
    --cc=akpm@osdl.org \
    --cc=geert@linux-m68k.org \
    --cc=jochen@jochen.org \
    --cc=linux-fbdev-devel@lists.sourceforge.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=zippel@linux-m68k.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox