* [PATCH 0/3] fbuffer: performance improvement + code cleanup
@ 2015-05-28 13:13 Greg Kurz
  2015-05-28 13:13 ` [PATCH 1/3] fbuffer: simplify address computations in fb8-toggle-cursor Greg Kurz
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Greg Kurz @ 2015-05-28 13:13 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Alexey Kardashevskiy, Thomas Huth, Nikunj A Dadhania,
	David Gibson

If booted in frame buffer mode, board-qemu currently calls hv-logical-load
and hv-logical-store for every pixel when enabling or disabling the cursor.
This is suboptimal when writing one char at a time to the console since
terminal-write always toggles the cursor. And this is precisely what grub
is doing when the user wants to edit a menu entry... the result is an
incredibly slow and barely usable interface.

This series introduces per-board helpers to be used by the frame buffer
code, so that board-qemu may have its own accelerated implementation:

- the first patch is preliminary cleanup, before moving code out to helpers.

- the second patch introduces a helper to invert a memory region byte by byte:
  this fixes the unbearable slowness of grub editing mode.

- the third patch introduces a similar helper with a quad-word pace: it
  doesn't bring any speed improvement since board-qemu already uses
  hv-logical-memop, but it allows us to "unify hcall-invert-screen and
  fb8-invert-screen again".
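
To illustrate the difference between the two helpers described above, here
is a minimal Python model of byte-wise versus quad-word inversion. The
function names are hypothetical and the buffers are plain byte arrays, not
frame buffer memory; it is only a sketch of the idea, not the SLOF code.

```python
def invert_bytes(buf: bytearray) -> None:
    """Invert every byte in place, one byte per step."""
    for i in range(len(buf)):
        buf[i] ^= 0xFF


def invert_quads(buf: bytearray) -> None:
    """Invert in place, eight bytes per step.

    Like a loop that runs len / 8 times, a trailing partial
    quad-word is left untouched.
    """
    for i in range(len(buf) // 8):
        chunk = int.from_bytes(buf[8 * i:8 * i + 8], "big")
        inverted = chunk ^ 0xFFFFFFFFFFFFFFFF
        buf[8 * i:8 * i + 8] = inverted.to_bytes(8, "big")
```

Both give the same result when the length is a multiple of eight, which is
presumably the case for a whole screen; the quad-word variant simply needs
eight times fewer steps.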

Please comment.

---

Greg Kurz (3):
      fbuffer: simplify address computations in fb8-toggle-cursor
      fbuffer: introduce the invert-region helper
      fbuffer: introduce the invert-region-x helper


 board-js2x/slof/helper.fs               |    9 +++++++++
 board-qemu/slof/helper.fs               |    7 +++++++
 board-qemu/slof/pci-device_1234_1111.fs |   10 +---------
 slof/fs/fbuffer.fs                      |    8 +++-----
 4 files changed, 20 insertions(+), 14 deletions(-)

--
Greg


* [PATCH 1/3] fbuffer: simplify address computations in fb8-toggle-cursor
  2015-05-28 13:13 [PATCH 0/3] fbuffer: performance improvement + code cleanup Greg Kurz
@ 2015-05-28 13:13 ` Greg Kurz
  2015-05-28 13:30   ` Thomas Huth
  2015-05-29  4:17   ` Nikunj A Dadhania
  2015-05-28 13:13 ` [PATCH 2/3] fbuffer: introduce the invert-region helper Greg Kurz
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 11+ messages in thread
From: Greg Kurz @ 2015-05-28 13:13 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Alexey Kardashevskiy, Thomas Huth, Nikunj A Dadhania,
	David Gibson

The inner loop deals with a contiguous region. It could easily be replaced
by faster board specific functions like hv-logical-memop in board-qemu.
Since hv-logical-memop does not return an address, let's have the enclosing
loop compute the next line address by itself and drop the confusing
"char-width screen-depth * -" address adjustment.
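
As a rough check that dropping the adjustment does not change which
addresses get visited, here is a small Python model of the two loop shapes.
The function names and the sample numbers are hypothetical; `char_bytes`
stands for "char-width screen-depth *" and `stride` for
"screen-width screen-depth *".

```python
def row_addresses_old(start, char_height, char_bytes, stride):
    """Old shape: the inner loop advances the address, then it is adjusted."""
    addrs = []
    addr = start
    for _ in range(char_height):
        addrs.append(addr)
        addr += char_bytes           # inner loop leaves addr past the cell row
        addr += stride - char_bytes  # jump to next line, undo the inner advance
    return addrs


def row_addresses_new(start, char_height, stride):
    """New shape: the inner loop works on a copy, the outer loop adds the stride."""
    addrs = []
    addr = start
    for _ in range(char_height):
        addrs.append(addr)  # the copy is consumed and dropped by the inner loop
        addr += stride
    return addrs
```

With, say, a start of 0x1000, 16 rows, 32 bytes per cell row and a stride of
3200 bytes, both functions return the same list of row start addresses.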

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 slof/fs/fbuffer.fs |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/slof/fs/fbuffer.fs b/slof/fs/fbuffer.fs
index 756f05a..faae6a9 100644
--- a/slof/fs/fbuffer.fs
+++ b/slof/fs/fbuffer.fs
@@ -99,8 +99,8 @@ CREATE bitmap-buffer 400 4 * allot
 : fb8-toggle-cursor ( -- )
 	line# fb8-line2addr column# fb8-columns2bytes +
 	char-height 0 ?DO
-		char-width screen-depth * 0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP
-		screen-width screen-depth * + char-width screen-depth * -
+		dup char-width screen-depth * 0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP drop
+		screen-width screen-depth * +
 	LOOP drop
 ;
 


* [PATCH 2/3] fbuffer: introduce the invert-region helper
  2015-05-28 13:13 [PATCH 0/3] fbuffer: performance improvement + code cleanup Greg Kurz
  2015-05-28 13:13 ` [PATCH 1/3] fbuffer: simplify address computations in fb8-toggle-cursor Greg Kurz
@ 2015-05-28 13:13 ` Greg Kurz
  2015-05-28 17:19   ` Thomas Huth
  2015-05-29  4:17   ` Nikunj A Dadhania
  2015-05-28 13:13 ` [PATCH 3/3] fbuffer: introduce the invert-region-x helper Greg Kurz
  2015-05-29  4:54 ` [PATCH 0/3] fbuffer: performance improvement + code cleanup Alexey Kardashevskiy
  3 siblings, 2 replies; 11+ messages in thread
From: Greg Kurz @ 2015-05-28 13:13 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Alexey Kardashevskiy, Thomas Huth, Nikunj A Dadhania,
	David Gibson

The inner loop in fb8-toggle-cursor can be implemented with hv-logical-memop
in board-qemu and get an incredible performance boost.

Let's introduce a per-board helper:
- board-js2x: slow RB based, taken from current fb8-toggle-cursor
- board-qemu: faster hv-logical-memop based

With standard graphical settings on board-qemu, we go from 512 hcall
invocations per character down to 16.
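
The arithmetic behind these figures can be sketched as follows. The thread
does not say which font size, depth, or per-byte hcall cost is behind the
512 figure, so the sample values in the usage note are assumptions chosen
only because they reproduce it; the function names are hypothetical too.

```python
def hcalls_before(char_height, row_bytes, hcalls_per_byte):
    """Per-byte loop: every byte of every cell row costs some hcalls."""
    return char_height * row_bytes * hcalls_per_byte


def hcalls_after(char_height):
    """Per-row helper: one memop hcall per cell row."""
    return char_height
```

For a 16-row character cell, `hcalls_after(16)` gives 16, while both
`hcalls_before(16, 32, 1)` and `hcalls_before(16, 16, 2)` give 512. Either
way, the new count depends only on the character height.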

Suggested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 board-js2x/slof/helper.fs |    4 ++++
 board-qemu/slof/helper.fs |    3 +++
 slof/fs/fbuffer.fs        |    2 +-
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/board-js2x/slof/helper.fs b/board-js2x/slof/helper.fs
index 34d60da..918fdc4 100644
--- a/board-js2x/slof/helper.fs
+++ b/board-js2x/slof/helper.fs
@@ -26,3 +26,7 @@
    s" , " $cat
    bdate2human $cat encode-string THEN
 ;
+
+: invert-region ( addr len -- )
+   0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP drop
+;
diff --git a/board-qemu/slof/helper.fs b/board-qemu/slof/helper.fs
index 96da498..da676c7 100644
--- a/board-qemu/slof/helper.fs
+++ b/board-qemu/slof/helper.fs
@@ -33,3 +33,6 @@
   swap -
 ;
 
+: invert-region ( addr len -- )
+   over swap 0 swap 1 hv-logical-memop drop
+;
diff --git a/slof/fs/fbuffer.fs b/slof/fs/fbuffer.fs
index faae6a9..deeba6b 100644
--- a/slof/fs/fbuffer.fs
+++ b/slof/fs/fbuffer.fs
@@ -99,7 +99,7 @@ CREATE bitmap-buffer 400 4 * allot
 : fb8-toggle-cursor ( -- )
 	line# fb8-line2addr column# fb8-columns2bytes +
 	char-height 0 ?DO
-		dup char-width screen-depth * 0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP drop
+		dup char-width screen-depth * invert-region
 		screen-width screen-depth * +
 	LOOP drop
 ;


* [PATCH 3/3] fbuffer: introduce the invert-region-x helper
  2015-05-28 13:13 [PATCH 0/3] fbuffer: performance improvement + code cleanup Greg Kurz
  2015-05-28 13:13 ` [PATCH 1/3] fbuffer: simplify address computations in fb8-toggle-cursor Greg Kurz
  2015-05-28 13:13 ` [PATCH 2/3] fbuffer: introduce the invert-region helper Greg Kurz
@ 2015-05-28 13:13 ` Greg Kurz
  2015-05-28 17:33   ` Thomas Huth
  2015-05-29  4:25   ` Nikunj A Dadhania
  2015-05-29  4:54 ` [PATCH 0/3] fbuffer: performance improvement + code cleanup Alexey Kardashevskiy
  3 siblings, 2 replies; 11+ messages in thread
From: Greg Kurz @ 2015-05-28 13:13 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Alexey Kardashevskiy, Thomas Huth, Nikunj A Dadhania,
	David Gibson

This patch simply moves the slow RX based logic from fb8-invert-screen
to board-js2x helpers and implements a fast hv-logical-memop based helper
for board-qemu. And we can drop hcall-invert-screen!
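
To make the stack shuffling in the board-qemu helpers easier to follow,
here is a tiny Python model of the few Forth words involved. It is only a
sketch: `run_words` is a hypothetical name, and reading the five resulting
values as (destination, source, size exponent, count, operation) is an
assumption based on the order used by the removed hcall-invert-screen.

```python
def run_words(words, stack):
    """Apply a tiny subset of Forth words to a list used as the data stack."""
    stack = list(stack)
    for word in words.split():
        if word == "over":
            stack.append(stack[-2])          # copy second item to the top
        elif word == "swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif word == "/x":
            stack.append(8)                  # bytes in a quad-word
        elif word == "/":
            divisor = stack.pop()
            stack.append(stack.pop() // divisor)
        else:
            stack.append(int(word))          # numeric literal
    return stack
```

Starting from ( addr len ), `run_words("over swap 0 swap 1", [0x1000, 32])`
yields `[0x1000, 0x1000, 0, 32, 1]`, and
`run_words("over swap /x / 3 swap 1", [0x1000, 64])` yields
`[0x1000, 0x1000, 3, 8, 1]`: the same address twice, then the element size
selector, the element count, and the value 1 passed as the last argument.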

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 board-js2x/slof/helper.fs               |    5 +++++
 board-qemu/slof/helper.fs               |    4 ++++
 board-qemu/slof/pci-device_1234_1111.fs |   10 +---------
 slof/fs/fbuffer.fs                      |    4 +---
 4 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/board-js2x/slof/helper.fs b/board-js2x/slof/helper.fs
index 918fdc4..ea2d584 100644
--- a/board-js2x/slof/helper.fs
+++ b/board-js2x/slof/helper.fs
@@ -30,3 +30,8 @@
 : invert-region ( addr len -- )
    0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP drop
 ;
+
+
+: invert-region-x ( addr len -- )
+   /x / 0 ?DO dup dup rx@ -1 xor swap rx! xa1+ LOOP drop
+;
diff --git a/board-qemu/slof/helper.fs b/board-qemu/slof/helper.fs
index da676c7..c807bc6 100644
--- a/board-qemu/slof/helper.fs
+++ b/board-qemu/slof/helper.fs
@@ -36,3 +36,7 @@
 : invert-region ( addr len -- )
    over swap 0 swap 1 hv-logical-memop drop
 ;
+
+: invert-region-x ( addr len -- )
+   over swap /x / 3 swap 1 hv-logical-memop drop
+;
diff --git a/board-qemu/slof/pci-device_1234_1111.fs b/board-qemu/slof/pci-device_1234_1111.fs
index a5c3584..26b0623 100644
--- a/board-qemu/slof/pci-device_1234_1111.fs
+++ b/board-qemu/slof/pci-device_1234_1111.fs
@@ -188,16 +188,9 @@ a CONSTANT VBE_DISPI_INDEX_NB
 : display-remove ( -- ) 
 ;
 
-: hcall-invert-screen ( -- )
-    frame-buffer-adr frame-buffer-adr 3
-    screen-height screen-width * screen-depth * /x /
-    1 hv-logical-memop
-    drop
-;
-
 : hcall-blink-screen ( -- )
     \ 32 msec delay for visually noticing the blink
-    hcall-invert-screen 20 ms hcall-invert-screen
+    invert-screen 20 ms invert-screen
 ;
 
 : display-install ( -- )
@@ -211,7 +204,6 @@ a CONSTANT VBE_DISPI_INDEX_NB
         disp-width char-width / disp-height char-height /
         disp-depth 7 + 8 /                      ( width height #lines #cols depth )
         fb-install
-	['] hcall-invert-screen to invert-screen
 	['] hcall-blink-screen to blink-screen
          true to is-installed?
     THEN
diff --git a/slof/fs/fbuffer.fs b/slof/fs/fbuffer.fs
index deeba6b..fcdd2fa 100644
--- a/slof/fs/fbuffer.fs
+++ b/slof/fs/fbuffer.fs
@@ -170,9 +170,7 @@ CREATE bitmap-buffer 400 4 * allot
 ;
 
 : fb8-invert-screen ( -- )
-	frame-buffer-adr screen-height screen-width * screen-depth * 2dup /x / 0 ?DO
-		dup rx@ -1 xor over rx! xa1+
-	LOOP 3drop
+	frame-buffer-adr screen-height screen-width * screen-depth * invert-region-x
 ;
 
 : fb8-blink-screen ( -- ) fb8-invert-screen fb8-invert-screen ;


* Re: [PATCH 1/3] fbuffer: simplify address computations in fb8-toggle-cursor
  2015-05-28 13:13 ` [PATCH 1/3] fbuffer: simplify address computations in fb8-toggle-cursor Greg Kurz
@ 2015-05-28 13:30   ` Thomas Huth
  2015-05-29  4:17   ` Nikunj A Dadhania
  1 sibling, 0 replies; 11+ messages in thread
From: Thomas Huth @ 2015-05-28 13:30 UTC (permalink / raw)
  To: Greg Kurz
  Cc: linuxppc-dev, Alexey Kardashevskiy, Nikunj A Dadhania,
	David Gibson

On Thu, 28 May 2015 15:13:14 +0200
Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:

> The inner loop deals with a contiguous region. It could easily be replaced
> by faster board specific functions like hv-logical-memop in board-qemu.
> Since hv-logical-memop does not return an address, let's have the enclosing
> loop compute the next line address by itself and drop the confusing
> "char-width screen-depth * -" address adjustment.
> 
> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> ---
>  slof/fs/fbuffer.fs |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/slof/fs/fbuffer.fs b/slof/fs/fbuffer.fs
> index 756f05a..faae6a9 100644
> --- a/slof/fs/fbuffer.fs
> +++ b/slof/fs/fbuffer.fs
> @@ -99,8 +99,8 @@ CREATE bitmap-buffer 400 4 * allot
>  : fb8-toggle-cursor ( -- )
>  	line# fb8-line2addr column# fb8-columns2bytes +
>  	char-height 0 ?DO
> -		char-width screen-depth * 0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP
> -		screen-width screen-depth * + char-width screen-depth * -
> +		dup char-width screen-depth * 0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP drop
> +		screen-width screen-depth * +
>  	LOOP drop
>  ;

Reviewed-by: Thomas Huth <thuth@redhat.com>


* Re: [PATCH 2/3] fbuffer: introduce the invert-region helper
  2015-05-28 13:13 ` [PATCH 2/3] fbuffer: introduce the invert-region helper Greg Kurz
@ 2015-05-28 17:19   ` Thomas Huth
  2015-05-29  4:17   ` Nikunj A Dadhania
  1 sibling, 0 replies; 11+ messages in thread
From: Thomas Huth @ 2015-05-28 17:19 UTC (permalink / raw)
  To: Greg Kurz
  Cc: linuxppc-dev, Alexey Kardashevskiy, Nikunj A Dadhania,
	David Gibson

On Thu, 28 May 2015 15:13:19 +0200
Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:

> The inner loop in fb8-toggle-cursor can be implemented with hv-logical-memop
> in board-qemu and get an incredible performance boost.
> 
> Let's introduce a per-board helper:
> - board-js2x: slow RB based, taken from current fb8-toggle-cursor
> - board-qemu: faster hv-logical-memop based
> 
> With standard graphical settings on board-qemu, we go from 512 hcall
> invocations per character down to 16.
> 
> Suggested-by: Thomas Huth <thuth@redhat.com>
> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> ---
>  board-js2x/slof/helper.fs |    4 ++++
>  board-qemu/slof/helper.fs |    3 +++
>  slof/fs/fbuffer.fs        |    2 +-
>  3 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/board-js2x/slof/helper.fs b/board-js2x/slof/helper.fs
> index 34d60da..918fdc4 100644
> --- a/board-js2x/slof/helper.fs
> +++ b/board-js2x/slof/helper.fs
> @@ -26,3 +26,7 @@
>     s" , " $cat
>     bdate2human $cat encode-string THEN
>  ;
> +
> +: invert-region ( addr len -- )
> +   0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP drop
> +;
> diff --git a/board-qemu/slof/helper.fs b/board-qemu/slof/helper.fs
> index 96da498..da676c7 100644
> --- a/board-qemu/slof/helper.fs
> +++ b/board-qemu/slof/helper.fs
> @@ -33,3 +33,6 @@
>    swap -
>  ;
>  
> +: invert-region ( addr len -- )
> +   over swap 0 swap 1 hv-logical-memop drop
> +;
> diff --git a/slof/fs/fbuffer.fs b/slof/fs/fbuffer.fs
> index faae6a9..deeba6b 100644
> --- a/slof/fs/fbuffer.fs
> +++ b/slof/fs/fbuffer.fs
> @@ -99,7 +99,7 @@ CREATE bitmap-buffer 400 4 * allot
>  : fb8-toggle-cursor ( -- )
>  	line# fb8-line2addr column# fb8-columns2bytes +
>  	char-height 0 ?DO
> -		dup char-width screen-depth * 0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP drop
> +		dup char-width screen-depth * invert-region
>  		screen-width screen-depth * +
>  	LOOP drop
>  ;

Reviewed-by: Thomas Huth <thuth@redhat.com>


* Re: [PATCH 3/3] fbuffer: introduce the invert-region-x helper
  2015-05-28 13:13 ` [PATCH 3/3] fbuffer: introduce the invert-region-x helper Greg Kurz
@ 2015-05-28 17:33   ` Thomas Huth
  2015-05-29  4:25   ` Nikunj A Dadhania
  1 sibling, 0 replies; 11+ messages in thread
From: Thomas Huth @ 2015-05-28 17:33 UTC (permalink / raw)
  To: Greg Kurz
  Cc: linuxppc-dev, Alexey Kardashevskiy, Nikunj A Dadhania,
	David Gibson

On Thu, 28 May 2015 15:13:24 +0200
Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:

> This patch simply moves the slow RX based logic from fb8-invert-screen
> to board-js2x helpers and implements a fast hv-logical-memop based helper
> for board-qemu. And we can drop hcall-invert-screen!
> 
> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> ---
>  board-js2x/slof/helper.fs               |    5 +++++
>  board-qemu/slof/helper.fs               |    4 ++++
>  board-qemu/slof/pci-device_1234_1111.fs |   10 +---------
>  slof/fs/fbuffer.fs                      |    4 +---
>  4 files changed, 11 insertions(+), 12 deletions(-)
> 
> diff --git a/board-js2x/slof/helper.fs b/board-js2x/slof/helper.fs
> index 918fdc4..ea2d584 100644
> --- a/board-js2x/slof/helper.fs
> +++ b/board-js2x/slof/helper.fs
> @@ -30,3 +30,8 @@
>  : invert-region ( addr len -- )
>     0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP drop
>  ;
> +
> +

Maybe remove one of the two empty lines?

> +: invert-region-x ( addr len -- )
> +   /x / 0 ?DO dup dup rx@ -1 xor swap rx! xa1+ LOOP drop
> +;
> diff --git a/board-qemu/slof/helper.fs b/board-qemu/slof/helper.fs
> index da676c7..c807bc6 100644
> --- a/board-qemu/slof/helper.fs
> +++ b/board-qemu/slof/helper.fs
> @@ -36,3 +36,7 @@
>  : invert-region ( addr len -- )
>     over swap 0 swap 1 hv-logical-memop drop
>  ;
> +
> +: invert-region-x ( addr len -- )
> +   over swap /x / 3 swap 1 hv-logical-memop drop
> +;
> diff --git a/board-qemu/slof/pci-device_1234_1111.fs b/board-qemu/slof/pci-device_1234_1111.fs
> index a5c3584..26b0623 100644
> --- a/board-qemu/slof/pci-device_1234_1111.fs
> +++ b/board-qemu/slof/pci-device_1234_1111.fs
> @@ -188,16 +188,9 @@ a CONSTANT VBE_DISPI_INDEX_NB
>  : display-remove ( -- ) 
>  ;
>  
> -: hcall-invert-screen ( -- )
> -    frame-buffer-adr frame-buffer-adr 3
> -    screen-height screen-width * screen-depth * /x /
> -    1 hv-logical-memop
> -    drop
> -;
> -
>  : hcall-blink-screen ( -- )
>      \ 32 msec delay for visually noticing the blink
> -    hcall-invert-screen 20 ms hcall-invert-screen
> +    invert-screen 20 ms invert-screen
>  ;
>  
>  : display-install ( -- )
> @@ -211,7 +204,6 @@ a CONSTANT VBE_DISPI_INDEX_NB
>          disp-width char-width / disp-height char-height /
>          disp-depth 7 + 8 /                      ( width height #lines #cols depth )
>          fb-install
> -	['] hcall-invert-screen to invert-screen
>  	['] hcall-blink-screen to blink-screen
>           true to is-installed?
>      THEN
> diff --git a/slof/fs/fbuffer.fs b/slof/fs/fbuffer.fs
> index deeba6b..fcdd2fa 100644
> --- a/slof/fs/fbuffer.fs
> +++ b/slof/fs/fbuffer.fs
> @@ -170,9 +170,7 @@ CREATE bitmap-buffer 400 4 * allot
>  ;
>  
>  : fb8-invert-screen ( -- )
> -	frame-buffer-adr screen-height screen-width * screen-depth * 2dup /x / 0 ?DO
> -		dup rx@ -1 xor over rx! xa1+
> -	LOOP 3drop
> +	frame-buffer-adr screen-height screen-width * screen-depth * invert-region-x
>  ;
>  
>  : fb8-blink-screen ( -- ) fb8-invert-screen fb8-invert-screen ;

Reviewed-by: Thomas Huth <thuth@redhat.com>


* Re: [PATCH 1/3] fbuffer: simplify address computations in fb8-toggle-cursor
  2015-05-28 13:13 ` [PATCH 1/3] fbuffer: simplify address computations in fb8-toggle-cursor Greg Kurz
  2015-05-28 13:30   ` Thomas Huth
@ 2015-05-29  4:17   ` Nikunj A Dadhania
  1 sibling, 0 replies; 11+ messages in thread
From: Nikunj A Dadhania @ 2015-05-29  4:17 UTC (permalink / raw)
  To: Greg Kurz, linuxppc-dev; +Cc: Alexey Kardashevskiy, Thomas Huth, David Gibson

Greg Kurz <gkurz@linux.vnet.ibm.com> writes:

> The inner loop deals with a contiguous region. It could easily be replaced
> by faster board specific functions like hv-logical-memop in board-qemu.
> Since hv-logical-memop does not return an address, let's have the enclosing
> loop compute the next line address by itself and drop the confusing
> "char-width screen-depth * -" address adjustment.

Much better :-)

Reviewed-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>

>
> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> ---
>  slof/fs/fbuffer.fs |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/slof/fs/fbuffer.fs b/slof/fs/fbuffer.fs
> index 756f05a..faae6a9 100644
> --- a/slof/fs/fbuffer.fs
> +++ b/slof/fs/fbuffer.fs
> @@ -99,8 +99,8 @@ CREATE bitmap-buffer 400 4 * allot
>  : fb8-toggle-cursor ( -- )
>  	line# fb8-line2addr column# fb8-columns2bytes +
>  	char-height 0 ?DO
> -		char-width screen-depth * 0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP
> -		screen-width screen-depth * + char-width screen-depth * -
> +		dup char-width screen-depth * 0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP drop
> +		screen-width screen-depth * +
>  	LOOP drop
>  ;


* Re: [PATCH 2/3] fbuffer: introduce the invert-region helper
  2015-05-28 13:13 ` [PATCH 2/3] fbuffer: introduce the invert-region helper Greg Kurz
  2015-05-28 17:19   ` Thomas Huth
@ 2015-05-29  4:17   ` Nikunj A Dadhania
  1 sibling, 0 replies; 11+ messages in thread
From: Nikunj A Dadhania @ 2015-05-29  4:17 UTC (permalink / raw)
  To: Greg Kurz, linuxppc-dev; +Cc: Alexey Kardashevskiy, Thomas Huth, David Gibson

Greg Kurz <gkurz@linux.vnet.ibm.com> writes:

> The inner loop in fb8-toggle-cursor can be implemented with hv-logical-memop
> in board-qemu and get an incredible performance boost.
>
> Let's introduce a per-board helper:
> - board-js2x: slow RB based, taken from current fb8-toggle-cursor
> - board-qemu: faster hv-logical-memop based
>
> With standard graphical settings on board-qemu, we go from 512 hcall
> invocations per character down to 16.
>
> Suggested-by: Thomas Huth <thuth@redhat.com>
> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>

Reviewed-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>

> ---
>  board-js2x/slof/helper.fs |    4 ++++
>  board-qemu/slof/helper.fs |    3 +++
>  slof/fs/fbuffer.fs        |    2 +-
>  3 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/board-js2x/slof/helper.fs b/board-js2x/slof/helper.fs
> index 34d60da..918fdc4 100644
> --- a/board-js2x/slof/helper.fs
> +++ b/board-js2x/slof/helper.fs
> @@ -26,3 +26,7 @@
>     s" , " $cat
>     bdate2human $cat encode-string THEN
>  ;
> +
> +: invert-region ( addr len -- )
> +   0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP drop
> +;
> diff --git a/board-qemu/slof/helper.fs b/board-qemu/slof/helper.fs
> index 96da498..da676c7 100644
> --- a/board-qemu/slof/helper.fs
> +++ b/board-qemu/slof/helper.fs
> @@ -33,3 +33,6 @@
>    swap -
>  ;
>
> +: invert-region ( addr len -- )
> +   over swap 0 swap 1 hv-logical-memop drop
> +;
> diff --git a/slof/fs/fbuffer.fs b/slof/fs/fbuffer.fs
> index faae6a9..deeba6b 100644
> --- a/slof/fs/fbuffer.fs
> +++ b/slof/fs/fbuffer.fs
> @@ -99,7 +99,7 @@ CREATE bitmap-buffer 400 4 * allot
>  : fb8-toggle-cursor ( -- )
>  	line# fb8-line2addr column# fb8-columns2bytes +
>  	char-height 0 ?DO
> -		dup char-width screen-depth * 0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP drop
> +		dup char-width screen-depth * invert-region
>  		screen-width screen-depth * +
>  	LOOP drop
>  ;


* Re: [PATCH 3/3] fbuffer: introduce the invert-region-x helper
  2015-05-28 13:13 ` [PATCH 3/3] fbuffer: introduce the invert-region-x helper Greg Kurz
  2015-05-28 17:33   ` Thomas Huth
@ 2015-05-29  4:25   ` Nikunj A Dadhania
  1 sibling, 0 replies; 11+ messages in thread
From: Nikunj A Dadhania @ 2015-05-29  4:25 UTC (permalink / raw)
  To: Greg Kurz, linuxppc-dev; +Cc: Alexey Kardashevskiy, Thomas Huth, David Gibson

Greg Kurz <gkurz@linux.vnet.ibm.com> writes:

> This patch simply moves the slow RX based logic from fb8-invert-screen
> to board-js2x helpers and implements a fast hv-logical-memop based helper
> for board-qemu. And we can drop hcall-invert-screen!
>
> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>

Apart from the extra lines that Thomas pointed out:

Reviewed-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>

> ---
>  board-js2x/slof/helper.fs               |    5 +++++
>  board-qemu/slof/helper.fs               |    4 ++++
>  board-qemu/slof/pci-device_1234_1111.fs |   10 +---------
>  slof/fs/fbuffer.fs                      |    4 +---
>  4 files changed, 11 insertions(+), 12 deletions(-)
>
> diff --git a/board-js2x/slof/helper.fs b/board-js2x/slof/helper.fs
> index 918fdc4..ea2d584 100644
> --- a/board-js2x/slof/helper.fs
> +++ b/board-js2x/slof/helper.fs
> @@ -30,3 +30,8 @@
>  : invert-region ( addr len -- )
>     0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP drop
>  ;
> +
> +
> +: invert-region-x ( addr len -- )
> +   /x / 0 ?DO dup dup rx@ -1 xor swap rx! xa1+ LOOP drop
> +;
> diff --git a/board-qemu/slof/helper.fs b/board-qemu/slof/helper.fs
> index da676c7..c807bc6 100644
> --- a/board-qemu/slof/helper.fs
> +++ b/board-qemu/slof/helper.fs
> @@ -36,3 +36,7 @@
>  : invert-region ( addr len -- )
>     over swap 0 swap 1 hv-logical-memop drop
>  ;
> +
> +: invert-region-x ( addr len -- )
> +   over swap /x / 3 swap 1 hv-logical-memop drop
> +;
> diff --git a/board-qemu/slof/pci-device_1234_1111.fs b/board-qemu/slof/pci-device_1234_1111.fs
> index a5c3584..26b0623 100644
> --- a/board-qemu/slof/pci-device_1234_1111.fs
> +++ b/board-qemu/slof/pci-device_1234_1111.fs
> @@ -188,16 +188,9 @@ a CONSTANT VBE_DISPI_INDEX_NB
>  : display-remove ( -- ) 
>  ;
>
> -: hcall-invert-screen ( -- )
> -    frame-buffer-adr frame-buffer-adr 3
> -    screen-height screen-width * screen-depth * /x /
> -    1 hv-logical-memop
> -    drop
> -;
> -
>  : hcall-blink-screen ( -- )
>      \ 32 msec delay for visually noticing the blink
> -    hcall-invert-screen 20 ms hcall-invert-screen
> +    invert-screen 20 ms invert-screen
>  ;
>
>  : display-install ( -- )
> @@ -211,7 +204,6 @@ a CONSTANT VBE_DISPI_INDEX_NB
>          disp-width char-width / disp-height char-height /
>          disp-depth 7 + 8 /                      ( width height #lines #cols depth )
>          fb-install
> -	['] hcall-invert-screen to invert-screen
>  	['] hcall-blink-screen to blink-screen
>           true to is-installed?
>      THEN
> diff --git a/slof/fs/fbuffer.fs b/slof/fs/fbuffer.fs
> index deeba6b..fcdd2fa 100644
> --- a/slof/fs/fbuffer.fs
> +++ b/slof/fs/fbuffer.fs
> @@ -170,9 +170,7 @@ CREATE bitmap-buffer 400 4 * allot
>  ;
>
>  : fb8-invert-screen ( -- )
> -	frame-buffer-adr screen-height screen-width * screen-depth * 2dup /x / 0 ?DO
> -		dup rx@ -1 xor over rx! xa1+
> -	LOOP 3drop
> +	frame-buffer-adr screen-height screen-width * screen-depth * invert-region-x
>  ;
>
>  : fb8-blink-screen ( -- ) fb8-invert-screen fb8-invert-screen ;


* Re: [PATCH 0/3] fbuffer: performance improvement + code cleanup
  2015-05-28 13:13 [PATCH 0/3] fbuffer: performance improvement + code cleanup Greg Kurz
                   ` (2 preceding siblings ...)
  2015-05-28 13:13 ` [PATCH 3/3] fbuffer: introduce the invert-region-x helper Greg Kurz
@ 2015-05-29  4:54 ` Alexey Kardashevskiy
  3 siblings, 0 replies; 11+ messages in thread
From: Alexey Kardashevskiy @ 2015-05-29  4:54 UTC (permalink / raw)
  To: Greg Kurz, linuxppc-dev; +Cc: Thomas Huth, Nikunj A Dadhania, David Gibson

On 05/28/2015 11:13 PM, Greg Kurz wrote:
> If booted in frame buffer mode, board-qemu currently calls hv-logical-load
> and hv-logical-store for every pixel when enabling or disabling the cursor.
> This is suboptimal when writing one char at a time to the console since
> terminal-write always toggles the cursor. And this is precisely what grub
> is doing when the user wants to edit a menu entry... the result is an
> incredibly slow and barely usable interface.
>
> This series introduces per-board helpers to be used by the frame buffer
> code, so that board-qemu may have its own accelerated implementation:
>
> - the first patch is preliminary cleanup, before moving code out to helpers.
>
> - the second patch introduces a helper to invert a memory region byte by byte:
>    this fixes the unbearable slowness of grub editing mode.
>
> - the third patch introduces a similar helper with a quad-word pace: it
>    doesn't bring any speed improvement since board-qemu already uses
>    hv-logical-memop, but it allows us to "unify hcall-invert-screen and
>    fb8-invert-screen again".
>
> Please comment.

Thanks, I'll remove that extra line in 3/3 and push these today.


>
> ---
>
> Greg Kurz (3):
>        fbuffer: simplify address computations in fb8-toggle-cursor
>        fbuffer: introduce the invert-region helper
>        fbuffer: introduce the invert-region-x helper
>
>
>   board-js2x/slof/helper.fs               |    9 +++++++++
>   board-qemu/slof/helper.fs               |    7 +++++++
>   board-qemu/slof/pci-device_1234_1111.fs |   10 +---------
>   slof/fs/fbuffer.fs                      |    8 +++-----
>   4 files changed, 20 insertions(+), 14 deletions(-)
>
> --
> Greg
>


-- 
Alexey

