linux-fbdev.vger.kernel.org archive mirror
* Text console scrolling
@ 2007-07-18 15:10 Geert Uytterhoeven
  2007-07-18 16:08 ` Antonino A. Daplas
  0 siblings, 1 reply; 4+ messages in thread
From: Geert Uytterhoeven @ 2007-07-18 15:10 UTC (permalink / raw)
  To: Linux Frame Buffer Device Development

	Hi all,

I was trying to find the optimal scrolling strategy for ps3fb just when many
fbcon changes went in, and I want to share my findings with you.

Below are the results for my text console scrolling benchmark (execution time
in seconds of cat'ing a big text file) on the PS3, for various video modes,
screen rotations (you can fit lots of debug output on a rotated 1920x1200
screen ;-), and fonts, using the default FBINFO_DEFAULT vs.
FBINFO_DEFAULT | FBINFO_READS_FAST flags.
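
For readers who want to reproduce this kind of measurement, here is a minimal
sketch of timing a cat-style copy of a file to the console. It is not the
benchmark used for the numbers in this thread; the helper names time_cat()
and elapsed_seconds() are hypothetical, and the result is only meaningful
when the output stream is the framebuffer console itself.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>

/* Difference between two timestamps, in seconds. */
double elapsed_seconds(struct timespec start, struct timespec end)
{
    return (double)(end.tv_sec - start.tv_sec) +
           (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/*
 * Copy the file at `path` to `out` and return the wall-clock time the
 * copy took, or -1.0 if the file cannot be opened.
 */
double time_cat(const char *path, FILE *out)
{
    char buf[4096];
    size_t n;
    struct timespec t0, t1;
    FILE *in = fopen(path, "r");

    if (!in)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    while ((n = fread(buf, 1, sizeof buf, in)) > 0)
        fwrite(buf, 1, n, out);
    fflush(out);                /* include the time to drain the stream */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    fclose(in);
    return elapsed_seconds(t0, t1);
}
```

Calling time_cat("big.txt", stdout) from a program started on the text
console would give a number comparable in spirit to the table below;
"big.txt" is a placeholder for any large text file.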

2.6.22-ga5fcaa21 is mainline from yesterday, 2.6.22-gcb32da04 is mainline from
today:

				2.6.22-ga5fcaa21	2.6.22-gcb32da04
mode	rot	font		DEFAULT	READS_FAST	DEFAULT	READS_FAST
480p	0	default8x16	 7.32	 6.63		 7.49	 5.18
480p	1	default8x16	11.39	 5.79		11.60	 7.60		
480p	2	default8x16	 7.35	 4.34		 7.50	 3.90		
480p	3	default8x16	11.39	 8.85		11.61	 9.31		
720p	0	default8x16	13.04	15.91		13.33	 9.28		
720p	1	default8x16	22.76	12.73		23.17	18.35		
720p	2	default8x16	13.10	10.22		13.34	 7.30		
720p	3	default8x16	22.60	19.65		23.03	19.56		
1080p	0	default8x16	20.76	34.25		21.09	14.98		
1080p	1	default8x16	35.13	23.82		35.78	26.27		
1080p	2	default8x16	20.69	21.51		21.12	11.57		
1080p	3	default8x16	34.98	37.03		35.68	29.23		
WUXGA	0	default8x16	24.92	46.83		25.52	18.05		
WUXGA	1	default8x16	41.19	31.08		41.98	28.61		
WUXGA	2	default8x16	25.00	29.11		25.56	13.74		
WUXGA	3	default8x16	41.20	49.49		42.02	34.79		
480p	0	lat4-19		 7.20	 6.57		 7.35	 5.12		
480p	1	lat4-19		11.19	 6.02		11.36	 7.86		
480p	2	lat4-19		 7.21	 4.32		 7.36	 3.83		
480p	3	lat4-19		11.12	 8.90		11.33	 9.08		
720p	0	lat4-19		12.69	15.57		12.94	 9.09		
720p	1	lat4-19		22.43	12.92		22.86	17.46		
720p	2	lat4-19		12.71	10.04		13.08	 7.19		
720p	3	lat4-19		22.50	20.02		22.89	20.21		
1080p	0	lat4-19		20.31	33.99		20.75	14.87		
1080p	1	lat4-19		34.60	24.06		35.26	25.19		
1080p	2	lat4-19		20.39	21.37		20.79	11.51		
1080p	3	lat4-19		34.50	37.45		35.24	29.13		
WUXGA	0	lat4-19		24.72	46.92		25.26	18.06		
WUXGA	1	lat4-19		40.98	32.32		41.77	30.52		
WUXGA	2	lat4-19		24.80	29.13		25.28	13.73		
WUXGA	3	lat4-19		41.25	50.26		42.25	35.50		

Yesterday, FBINFO_READS_FAST was faster in some cases, but much slower in
others, so I didn't really know whether I should enable it or not.

Today, FBINFO_READS_FAST is faster in all test cases, so the choice is clear.

Note that there are still a few cases where today's fastest option is a bit
slower than yesterday's fastest option.

With kind regards,
 
Geert Uytterhoeven
Software Architect

Sony Network and Software Technology Center Europe
The Corporate Village · Da Vincilaan 7-D1 · B-1935 Zaventem · Belgium
 
Phone:    +32 (0)2 700 8453	
Fax:      +32 (0)2 700 8622	
E-mail:   Geert.Uytterhoeven@sonycom.com	
Internet: http://www.sony-europe.com/
 	
Sony Network and Software Technology Center Europe	
A division of Sony Service Centre (Europe) N.V.	
Registered office: Technologielaan 7 · B-1840 Londerzeel · Belgium	
VAT BE 0413.825.160 · RPR Brussels	
Fortis Bank Zaventem · Swift GEBABEBB08A · IBAN BE39001382358619

_______________________________________________
Linux-fbdev-devel mailing list
Linux-fbdev-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-fbdev-devel

* Re: Text console scrolling
  2007-07-18 15:10 Text console scrolling Geert Uytterhoeven
@ 2007-07-18 16:08 ` Antonino A. Daplas
  2007-07-19 18:20   ` Krzysztof Helt
  0 siblings, 1 reply; 4+ messages in thread
From: Antonino A. Daplas @ 2007-07-18 16:08 UTC (permalink / raw)
  To: linux-fbdev-devel; +Cc: Geert.Uytterhoeven

On Wed, 2007-07-18 at 17:10 +0200, Geert Uytterhoeven wrote:
> 	Hi all,
> 
> Yesterday, FBINFO_READS_FAST was faster in some cases, but much slower in
> others, so I didn't really know whether I should enable it or not.
> 
> Today, FBINFO_READS_FAST is faster in all test cases, so the choice is clear.
> 

The improvement of scroll_move is courtesy of Ondrej's smart blitter
patch. Basically, it only does an actual move if the content of the
source is different from the destination.  If the source == destination,
the actual move is skipped.
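
As a rough userspace illustration of that idea (a sketch only, not the code
from Ondrej's patch), the function below compares each source cell with the
cell it would overwrite and copies only the runs that differ. The name
smart_scroll_row() is hypothetical, and a plain memcpy() stands in for the
driver's real move operation.

```c
#include <stddef.h>
#include <string.h>

/*
 * Scroll one row of character cells from `src` into `dst`.
 * Cells that already hold the right value are skipped; only runs of
 * differing cells are copied. Returns the number of cells moved.
 */
size_t smart_scroll_row(const unsigned short *src, unsigned short *dst,
                        size_t cols)
{
    size_t moved = 0;
    size_t x = 0;

    while (x < cols) {
        /* Skip cells where source == destination. */
        while (x < cols && src[x] == dst[x])
            x++;

        /* Collect the following run of differing cells. */
        size_t start = x;
        while (x < cols && src[x] != dst[x])
            x++;

        if (x > start) {
            memcpy(dst + start, src + start, (x - start) * sizeof *dst);
            moved += x - start;
        }
    }
    return moved;
}
```

A row that is mostly blank on both sides costs a full compare but very few
moves, which is where the speedup comes from.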
 
> Note that there are still a few cases where today's fastest option is a bit
> slower than yesterday's fastest option.
> 

All things equal, the speed of the smart blitter will depend on:

1. the amount of text to be scrolled;
2. how much the content of the region being scrolled differs from the
content of the region it is scrolled into; and 
3. the amount of time the CPU spends on the comparison.

The less different the contents, the more the smart blitter will speed
up (because fewer characters are actually moved). If the contents are
totally different, then the blitter spends as much time as the old
(dumb) blitter plus the time spent on the compare. 

For the old (dumb) scroll_move, the scrolling speed is only dependent on
the amount of text to be scrolled. No cpu time is spent on the compare.
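
The trade-off described above can be written down as a toy cost model. This
is only a sketch under simplifying assumptions: every cell is taken to cost
the same to compare and the same to move, and the per-cell costs are
made-up parameters, not measured values.

```c
#include <stddef.h>

/* Old (dumb) scroll: every cell is moved, nothing is compared. */
double dumb_cost(size_t cells, double move_cost)
{
    return (double)cells * move_cost;
}

/*
 * Smart scroll: every cell is compared, but only the cells that
 * differ between source and destination are moved.
 */
double smart_cost(size_t cells, size_t differing,
                  double move_cost, double compare_cost)
{
    return (double)cells * compare_cost + (double)differing * move_cost;
}
```

With `differing == cells` the smart cost equals the dumb cost plus the full
compare cost, which is the worst case mentioned above; the smaller
`differing` gets, the more the smart variant wins.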

For the most common usage (screen has lots of whitespaces and blank
lines), I bet the smart blitter will perform very well.

Tony




* Re: Text console scrolling
  2007-07-18 16:08 ` Antonino A. Daplas
@ 2007-07-19 18:20   ` Krzysztof Helt
  2007-07-20 12:21     ` Geert Uytterhoeven
  0 siblings, 1 reply; 4+ messages in thread
From: Krzysztof Helt @ 2007-07-19 18:20 UTC (permalink / raw)
  To: linux-fbdev-devel; +Cc: Geert.Uytterhoeven, Antonino A. Daplas

On Thu, 19 Jul 2007 00:08:56 +0800
"Antonino A. Daplas" <adaplas@gmail.com> wrote:

> On Wed, 2007-07-18 at 17:10 +0200, Geert Uytterhoeven wrote:
> > 	Hi all,
> > 
> > Yesterday, FBINFO_READS_FAST was faster in some cases, but much slower in
> > others, so I didn't really know whether I should enable it or not.
> > 
> > Today, FBINFO_READS_FAST is faster in all test cases, so the choice is clear.
> > 
> 
> The improvement of scroll_move is courtesy of Ondrej's smart blitter
> patch.
[...]
>  
> > Note that there are still a few cases where today's fastest option is a bit
> > slower than yesterday's fastest option.
> > 
> 
> All things equal, the speed of the smart blitter will depend on:
> 
> The less different the contents, the more the smart blitter will speed
> up (because fewer characters are actually moved). If the contents are
> totally different, then the blitter spends as much time as the old
> (dumb) blitter plus the time spent on the compare. 
> 

My guess is that comparing is relatively fast compared to preparing a blit operation.
One of the very first versions of the patch had a variant that merged small blits into bigger ones.
It did so by treating the first character after a run of differing characters as different too.
The advantage was fewer blitter operations, but each operation grew by at least one character.
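
To make the effect of merging concrete, here is a small sketch that counts
blit operations and moved cells for one row, with and without merging. It
follows one reading of the description above (each differing run absorbs the
single matching cell that follows it, and continues if the cell after that
differs again); it is not the exact logic of the MERGE_BLITS patch, and
count_blits() and struct blit_stats are hypothetical names.

```c
#include <stddef.h>

struct blit_stats {
    size_t ops;    /* number of blit operations issued */
    size_t cells;  /* total number of cells moved */
};

struct blit_stats count_blits(const unsigned short *src,
                              const unsigned short *dst,
                              size_t cols, int merge)
{
    struct blit_stats st = { 0, 0 };
    size_t x = 0;

    while (x < cols) {
        if (src[x] == dst[x]) {         /* nothing to move here */
            x++;
            continue;
        }

        size_t start = x;
        for (;;) {
            /* Extend over the run of differing cells. */
            while (x < cols && src[x] != dst[x])
                x++;
            if (!merge || x >= cols)
                break;
            x++;                        /* absorb one matching cell */
            if (x >= cols || src[x] == dst[x])
                break;                  /* run really ends here */
        }
        st.ops++;
        st.cells += x - start;
    }
    return st;
}
```

For a row that differs in every other cell, merging collapses several
one-cell blits into a single larger one, at the price of moving cells that
did not need to move.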

You may look here (see MERGE_BLITS define):

http://marc.info/?l=linux-fbdev-devel&m=117869435713671&w=2

Some test results on cards with a real blitter:

http://marc.info/?l=linux-fbdev-devel&m=117881573823606&w=2

As you can see, merging blits gave a speedup on some cards (compared to the current method) even though it had more to compare.
But the gain was usually lost at higher bpp, as the amount of data to move grew faster.

I am interested in the scrolling speeds with merged blits on your hardware.

Regards,
Krzysztof


* Re: Text console scrolling
  2007-07-19 18:20   ` Krzysztof Helt
@ 2007-07-20 12:21     ` Geert Uytterhoeven
  0 siblings, 0 replies; 4+ messages in thread
From: Geert Uytterhoeven @ 2007-07-20 12:21 UTC (permalink / raw)
  To: Krzysztof Helt; +Cc: linux-fbdev-devel, Antonino A. Daplas

On Thu, 19 Jul 2007, Krzysztof Helt wrote:
> On Thu, 19 Jul 2007 00:08:56 +0800
> "Antonino A. Daplas" <adaplas@gmail.com> wrote:
> > On Wed, 2007-07-18 at 17:10 +0200, Geert Uytterhoeven wrote:
> > > Yesterday, FBINFO_READS_FAST was faster in some cases, but much slower in
> > > others, so I didn't really know whether I should enable it or not.
> > > 
> > > Today, FBINFO_READS_FAST is faster in all test cases, so the choice is clear.
> > The improvement of scroll_move is courtesy of Ondrej's smart blitter
> > patch.
> [...]
> >  
> > > Note that there are still a few cases where today's fastest option is a bit
> > > slower than yesterday's fastest option.
> > > 
> > 
> > All things equal, the speed of the smart blitter will depend on:
> > 
> > The less different the contents, the more the smart blitter will speed
> > up (because fewer characters are actually moved). If the contents are
> > totally different, then the blitter spends as much time as the old
> > (dumb) blitter plus the time spent on the compare. 
> > 
> 
> My guess is that comparing is relatively fast compared to preparing a blit operation.
> One of the very first versions of the patch had a variant that merged small blits into bigger ones.
> It did so by treating the first character after a run of differing characters as different too.
> The advantage was fewer blitter operations, but each operation grew by at least one character.
> 
> You may look here (see MERGE_BLITS define):
> 
> http://marc.info/?l=linux-fbdev-devel&m=117869435713671&w=2
> 
> Some test results on cards with a real blitter:
> 
> http://marc.info/?l=linux-fbdev-devel&m=117881573823606&w=2
> 
> As you can see, merging blits gave a speedup on some cards (compared to the current method) even though it had more to compare.
> But the gain was usually lost at higher bpp, as the amount of data to move grew faster.
> 
> I am interested in the scrolling speeds with merged blits on your hardware.

I manually applied your MERGE_BLITS changes (patch below) and reran my
benchmark:

                          -ga5fcaa21    -gcb32da04    -g9a79b227
mode   rot     font      DFLT   READS  DFLT   READS  READS  +MERGE
                                FAST          FAST   FAST    BLITS
480p    0   default8x16   7.32   6.63   7.49   5.18   3.69   5.24
480p    1   default8x16  11.39   5.79  11.60   7.60  10.18   8.09
480p    2   default8x16   7.35   4.34   7.50   3.90   5.40   3.67
480p    3   default8x16  11.39   8.85  11.61   9.31   6.80   9.92
720p    0   default8x16  13.04  15.91  13.33   9.28   6.62   9.35
720p    1   default8x16  22.76  12.73  23.17  18.35  23.51  19.83
720p    2   default8x16  13.10  10.22  13.34   7.30  10.04   6.84
720p    3   default8x16  22.60  19.65  23.03  19.56  14.48  21.11
1080p   0   default8x16  20.76  34.25  21.09  14.98  10.78  15.01
1080p   1   default8x16  35.13  23.82  35.78  26.27  34.15  28.11
1080p   2   default8x16  20.69  21.51  21.12  11.57  15.79  10.73
1080p   3   default8x16  34.98  37.03  35.68  29.23  21.47  31.43
WUXGA   0   default8x16  24.92  46.83  25.52  18.05  12.98  18.05
WUXGA   1   default8x16  41.19  31.08  41.98  28.61  37.90  30.99
WUXGA   2   default8x16  25.00  29.11  25.56  13.74  18.81  12.64
WUXGA   3   default8x16  41.20  49.49  42.02  34.79  25.66  37.84
480p    0   lat4-19       7.20   6.57   7.35   5.12   3.63   5.18
480p    1   lat4-19      11.19   6.02  11.36   7.86   9.97   8.31
480p    2   lat4-19       7.21   4.32   7.36   3.83   5.31   3.62
480p    3   lat4-19      11.12   8.90  11.33   9.08   7.01   9.63
720p    0   lat4-19      12.69  15.57  12.94   9.09   6.52   9.13
720p    1   lat4-19      22.43  12.92  22.86  17.46  21.68  18.89
720p    2   lat4-19      12.71  10.04  13.08   7.19   9.83   6.70
720p    3   lat4-19      22.50  20.02  22.89  20.21  15.92  21.84
1080p   0   lat4-19      20.31  33.99  20.75  14.87  10.76  14.88
1080p   1   lat4-19      34.60  24.06  35.26  25.19  31.67  27.02
1080p   2   lat4-19      20.39  21.37  20.79  11.51  15.67  10.62
1080p   3   lat4-19      34.50  37.45  35.24  29.13  22.61  31.30
WUXGA   0   lat4-19      24.72  46.92  25.26  18.06  13.03  18.01
WUXGA   1   lat4-19      40.98  32.32  41.77  30.52  38.22  33.30
WUXGA   2   lat4-19      24.80  29.13  25.28  13.73  18.78  12.59
WUXGA   3   lat4-19      41.25  50.26  42.25  35.50  27.82  38.63
                                                            
           RMS           24.50  26.47  24.99  18.60  18.87  19.74

(RMS = Root Mean Square)
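
For reference, the usual definition of the root mean square over the n
timings t_1 ... t_n of a column is given below, with a two-value worked
example. The example values 3 and 4 are made up for illustration and do not
come from the table.

```latex
\mathrm{RMS} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} t_i^{2}}
\qquad\text{e.g.}\qquad
\sqrt{\frac{3^{2} + 4^{2}}{2}} = \sqrt{12.5} \approx 3.54
```

Compared to a plain average (3.5 in the example), the RMS gives more weight
to the slow cases, so a column with a few very slow runs scores worse.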

Two findings:
  - MERGE_BLITS is faster for rotations 1 and 2, but slower for rotations
    0 and 3.
  - Today's kernel behaves differently from the one from two days ago, even
    without changes to the console code.

Note that all scrolling is done by the CPU.

--- a/drivers/video/console/fbcon.c
+++ b/drivers/video/console/fbcon.c
@@ -1,3 +1,4 @@
+#define MERGE_BLITS
 /*
  *  linux/drivers/video/fbcon.c -- Low level frame buffer based console driver
  *
@@ -1724,19 +1725,35 @@ static void fbcon_redraw_blit(struct vc_
 		unsigned short *le = advance_row(s, 1);
 		unsigned short c;
 		int x = 0;
+#ifdef MERGE_BLITS
+		int was_blit = 1;
+#endif
 
 		do {
 			c = scr_readw(s);
 
 			if (c == scr_readw(d)) {
 				if (s > start) {
+#ifdef MERGE_BLITS
+				    if (!was_blit) {
+#endif
 					ops->bmove(vc, info, line + ycount, x,
 						   line, x, 1, s-start);
 					x += s - start + 1;
 					start = s + 1;
+#ifdef MERGE_BLITS
+				    }
+				    was_blit = !was_blit;
+#endif
 				} else {
+#ifdef MERGE_BLITS
+				    if (was_blit) {
+#endif
 					x++;
 					start++;
+#ifdef MERGE_BLITS
+				    }
+#endif
 				}
 			}
 

With kind regards,
 
Geert Uytterhoeven
