From: Knut Petersen <Knut_Petersen@t-online.de>
To: Andrew Morton <akpm@osdl.org>
Cc: Roman Zippel <zippel@linux-m68k.org>,
linux-fbdev-devel@lists.sourceforge.net,
"Antonino A. Daplas" <adaplas@gmail.com>,
Linux Kernel Development <linux-kernel@vger.kernel.org>,
Jochen Hein <jochen@jochen.org>,
Geert Uytterhoeven <geert@linux-m68k.org>
Subject: Re: [PATCH 1/1 2.6.13] framebuffer: bit_putcs() optimization for 8x* fonts
Date: Wed, 31 Aug 2005 14:46:35 +0200
Message-ID: <4315A6AB.5090108@t-online.de>
In-Reply-To: <Pine.LNX.4.61.0508310159290.3728@scrub.home>

> Something like below, which has the advantage that there is still only
> one implementation of the function

True, that's a great advantage.

> and if it's still slower, we really need to check the compiler

Please have a look at the following patch. It takes your idea of inlining
but moves the special cases into the inline function, speeding things up
for the very likely case of s_pitch == 1 and the less likely case of
s_pitch == 2. Treating s_pitch == 2 as a special case still gives a
significant performance improvement of more than 10% for 16x30 fonts.

This way bit_putcs() also looks better again ...

Andrew, as this way is better than my first approach and still as fast,
I think framebuffer-bit_putcs-optimization-for-8x.patch should be reverted
and the following patch should be applied instead.

Antonino, Roman, Geert, do you agree?

cu,
knut

diff -uprN -X linux/Documentation/dontdiff -x '*.bak' -x '*.ctx' linuxorig/drivers/video/console/bitblit.c linux/drivers/video/console/bitblit.c
--- linuxorig/drivers/video/console/bitblit.c 2005-08-29 01:41:01.000000000 +0200
+++ linux/drivers/video/console/bitblit.c 2005-08-31 10:06:22.000000000 +0200
@@ -175,7 +175,7 @@ static void bit_putcs(struct vc_data *vc
 					src = buf;
 				}
-				fb_pad_aligned_buffer(dst, pitch, src, idx, image.height);
+				__fb_pad_aligned_buffer(dst, pitch, src, idx, image.height);
 				dst += width;
 			}
 		}
diff -uprN -X linux/Documentation/dontdiff -x '*.bak' -x '*.ctx' linuxorig/drivers/video/fbmem.c linux/drivers/video/fbmem.c
--- linuxorig/drivers/video/fbmem.c 2005-08-29 01:41:01.000000000 +0200
+++ linux/drivers/video/fbmem.c 2005-08-31 13:36:16.000000000 +0200
@@ -80,15 +80,7 @@ EXPORT_SYMBOL(fb_get_color_depth);
  */
 void fb_pad_aligned_buffer(u8 *dst, u32 d_pitch, u8 *src, u32 s_pitch, u32 height)
 {
-	int i, j;
-
-	for (i = height; i--; ) {
-		/* s_pitch is a few bytes at the most, memcpy is suboptimal */
-		for (j = 0; j < s_pitch; j++)
-			dst[j] = src[j];
-		src += s_pitch;
-		dst += d_pitch;
-	}
+	__fb_pad_aligned_buffer(dst, d_pitch, src, s_pitch, height);
 }
 EXPORT_SYMBOL(fb_pad_aligned_buffer);
diff -uprN -X linux/Documentation/dontdiff -x '*.bak' -x '*.ctx' linuxorig/include/linux/fb.h linux/include/linux/fb.h
--- linuxorig/include/linux/fb.h 2005-08-29 01:41:01.000000000 +0200
+++ linux/include/linux/fb.h 2005-08-31 12:45:08.000000000 +0200
@@ -824,6 +824,38 @@ extern int fb_get_color_depth(struct fb_
 extern int fb_get_options(char *name, char **option);
 extern int fb_new_modelist(struct fb_info *info);
+
+/*
+ * Don't change without testing performance of framebuffer
+ * bitblitting. Inlining is necessary for performance reasons.
+ * Although the code might not _look_ fast because of some
+ * multiplications, it really _is_ fast as it is easier for gcc
+ * to optimize well.
+ */
+
+static inline void __fb_pad_aligned_buffer(u8 *dst, u32 d_pitch, u8 *src,
+					   u32 s_pitch, u32 height)
+{
+	int i, j;
+
+	if (likely(s_pitch==1))
+		for(i=0; i < height; i++)
+			dst[d_pitch*i] = src[i];
+	else if (s_pitch==2)
+		for(i=0; i < height; i++) {
+			*(u16 *)dst = ((u16 *)src)[i];
+			dst += d_pitch;
+		}
+	else {
+		d_pitch -= s_pitch;
+		for (i = height; i--; ) {
+			for (j = 0; j < s_pitch; j++)
+				*dst++ = *src++;
+			dst += d_pitch;
+		}
+	}
+}
+
 extern struct fb_info *registered_fb[FB_MAX];
 extern int num_registered_fb;
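
For readers who want to experiment with the three copy paths outside the kernel, here is a minimal user-space sketch. It is not the kernel code: the name pad_aligned_buffer, the use of <stdint.h> types, and the assert() precondition are assumptions made for illustration. It also swaps the `*(u16 *)` cast of the s_pitch == 2 path for a fixed-size memcpy(), because the cast assumes that dst and src are 2-byte aligned, which may not hold on every architecture. Compilers commonly turn such a two-byte memcpy() into a single 16-bit load and store, but that should be measured rather than assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * User-space sketch of the three copy paths in the patch above.
 * pad_aligned_buffer is a hypothetical name, not a kernel symbol.
 * Copies `height` rows of `s_pitch` bytes each from a packed source
 * into a destination whose rows are `d_pitch` bytes apart.
 */
static inline void pad_aligned_buffer(uint8_t *dst, uint32_t d_pitch,
				      const uint8_t *src, uint32_t s_pitch,
				      uint32_t height)
{
	uint32_t i, j;

	/* Assumed precondition: a destination row can hold a source row. */
	assert(d_pitch >= s_pitch);

	if (s_pitch == 1) {
		/* Glyphs up to 8 pixels wide: one byte per row. */
		for (i = 0; i < height; i++)
			dst[(size_t)d_pitch * i] = src[i];
	} else if (s_pitch == 2) {
		/* Glyphs 9 to 16 pixels wide: two bytes per row,
		 * copied without assuming any alignment. */
		for (i = 0; i < height; i++) {
			memcpy(dst, src + 2 * (size_t)i, 2);
			dst += d_pitch;
		}
	} else {
		/* General case: byte-wise copy of each row. */
		for (i = 0; i < height; i++) {
			for (j = 0; j < s_pitch; j++)
				dst[j] = src[j];
			src += s_pitch;
			dst += d_pitch;
		}
	}
}
```

Calling pad_aligned_buffer(dst, 4, src, 2, 3) with a packed six-byte source writes bytes 0-1, 4-5 and 8-9 of dst and leaves the padding bytes untouched.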