From: Knut Petersen <Knut_Petersen@t-online.de>
To: Roman Zippel <zippel@linux-m68k.org>
Cc: Andrew Morton <akpm@osdl.org>,
linux-fbdev-devel@lists.sourceforge.net,
"Antonino A. Daplas" <adaplas@gmail.com>,
Linux Kernel Development <linux-kernel@vger.kernel.org>,
Jochen Hein <jochen@jochen.org>,
Geert Uytterhoeven <geert@linux-m68k.org>
Subject: Re: [Linux-fbdev-devel] [PATCH 1/1 2.6.13] framebuffer: bit_putcs() optimization for 8x* fonts
Date: Wed, 31 Aug 2005 21:19:40 +0200 [thread overview]
Message-ID: <431602CC.1030008@t-online.de> (raw)
In-Reply-To: <Pine.LNX.4.61.0508311750140.3728@scrub.home>
Hi Roman!
>>+static inline void __fb_pad_aligned_buffer(u8 *dst, u32 d_pitch, u8 *src, +
>>u32 s_pitch, u32 height)
>>+{
>>+ int i, j;
>>+
>>+ if (likely(s_pitch==1))
>>+ for(i=0; i < height; i++)
>>+ dst[d_pitch*i] = src[i];
>>
>>
I added the multiply back because gcc (v. 3.3.4) does generate the
fastest code
if I write it this way. I compiled, inspected the generated assembly and
benchmarked
about a dozend variations of the code (benchmark as previously described).
The special case for s_pitch == 1 saves about 10 ms system time (770 ms
-> 760 ms)
The special case for s_pitch == 2 saves about 270 ms system time (2120
-> 1850ms)
with a 16x30 font.
The third case is for even bigger fonts ... I believe that it will not
often be used but
something like that must be present.
>You have now 3 slightly different variants of the same, which isn't really
>an improvement. In my example I showed you how to generate the first and
>last version from the same source.
>
The first version will only be generated when gcc can be sure that
s_pitch is 1.
Therefore you had to explicitly call __fb_pad_aligned_buffer with that
value:
if (likely(idx == 1))
__fb_pad_aligned_buffer(dst, pitch, src, 1, image.height);
else
fb_pad_aligned_buffer(dst, pitch, src, idx, image.height);
With the version I propose it´s enough to write
__fb_pad_aligned_buffer(dst, pitch, src, idx, image.height);
instead, and you will get good performance for all cases. If the value of
idx/s_pitch is know at compile time, the compiler can and will ignore the
other cases.
>If you also want to optimize for other sizes, you might want to always
>inline the function, if the function call overhead is the largest part
>anyway, the special case for 2 bytes might not be needed anymore.
>
>
fb_pad_aligned_buffer() is usefull to save some space in cases like
softcursor.
It´s also used by some drivers (nvidia and riva), but the authors of
those drivers
have to decide if they prefer the inlined version or the version fixed
in fbmem.
>BTW this version saves another condition:
>
>static inline void __fb_pad_aligned_buffer(u8 *dst, u32 d_pitch, u8 *src, u32 s_pitch, u32 height)
>{
> int i, j;
>
> d_pitch -= s_pitch;
> i = height;
> do {
> /* s_pitch is a few bytes at the most, memcpy is suboptimal */
> j = s_pitch;
> do
> *dst++ = *src++;
> while (--j > 0);
> dst += d_pitch;
> } while (--i > 0);
>}
>
I tested that code, together with the followig code in but_putcs():
if(idx==1)
__fb_pad_aligned_buffer(dst, pitch, src, 1,
image.height);
else if (idx==2)
__fb_pad_aligned_buffer(dst, pitch, src, 2,
image.height);
else
fb_pad_aligned_buffer(dst, pitch, src, idx,
image.height);
dst += width;
It´s as fast/slow as your previous version, the measurements are almost
identical.
Let´s summarize:
Your version of __fb_pad_aligned_buffer looks much better, but it needs
not so nice
conditionals when used. My version looks bad, but it is easier to use
and it is
_faster_.
cu,
knut
next prev parent reply other threads:[~2005-08-31 19:16 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-08-30 16:15 [PATCH 1/1 2.6.13] framebuffer: bit_putcs() optimization for 8x* fonts Knut Petersen
2005-08-30 16:18 ` [Linux-fbdev-devel] " Geert Uytterhoeven
2005-08-30 17:58 ` Knut Petersen
2005-08-30 19:13 ` Roman Zippel
2005-08-30 22:26 ` Knut Petersen
2005-08-31 0:51 ` Roman Zippel
2005-08-31 6:42 ` Antonino A. Daplas
2005-08-31 15:49 ` Roman Zippel
2005-08-31 12:46 ` Knut Petersen
2005-08-31 17:15 ` Roman Zippel
2005-08-31 19:19 ` Knut Petersen [this message]
2005-08-31 19:34 ` Roman Zippel
2005-08-31 19:52 ` Knut Petersen
2005-08-30 19:59 ` Geert Uytterhoeven
2005-08-31 1:14 ` Antonino A. Daplas
2005-08-31 1:14 ` Antonino A. Daplas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=431602CC.1030008@t-online.de \
--to=knut_petersen@t-online.de \
--cc=adaplas@gmail.com \
--cc=akpm@osdl.org \
--cc=geert@linux-m68k.org \
--cc=jochen@jochen.org \
--cc=linux-fbdev-devel@lists.sourceforge.net \
--cc=linux-kernel@vger.kernel.org \
--cc=zippel@linux-m68k.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox