linux-fbdev.vger.kernel.org archive mirror
* [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops
@ 2025-02-07  4:18 Zsolt Kajtar
  2025-02-07  4:18 ` [PATCH RESEND 01/13] fbdev: core: Copy cfbcopyarea to fb_copyarea Zsolt Kajtar
                   ` (14 more replies)
  0 siblings, 15 replies; 19+ messages in thread
From: Zsolt Kajtar @ 2025-02-07  4:18 UTC
  To: linux-fbdev, dri-devel; +Cc: Zsolt Kajtar

Commit 68648ed1f58d98b8e8d994022e5e25331fbfe42a duplicated the drawing
routines into separate I/O memory and system memory versions.

Later, commit 779121e9f17525769c04a00475fd85600c8c04eb added pixel
reversing to the I/O version only, not to the system version.

That's unfortunate, as pixel reversing is not specific to I/O memory,
and I now need both the I/O and the system versions.
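
For readers new to the term, here is a rough sketch of what "pixel
reversing" is assumed to mean here: for 1, 2 and 4 bpp formats the order
of the pixels inside each byte is flipped. The function name
rev_pixels_in_bytes() and the fixed 32-bit word are illustrative
assumptions, not the kernel's actual helper.

```c
#include <stdint.h>

/*
 * Illustrative only: reverse the order of the pixels inside each byte of
 * a 32-bit word. bpp is the pixel size in bits; values other than 1, 2
 * or 4 leave the word unchanged. Hypothetical helper, not kernel code.
 */
static uint32_t rev_pixels_in_bytes(uint32_t v, unsigned int bpp)
{
	if (bpp <= 1)	/* swap neighbouring bits */
		v = ((v & 0x55555555u) << 1) | ((v >> 1) & 0x55555555u);
	if (bpp <= 2)	/* swap neighbouring 2-bit groups */
		v = ((v & 0x33333333u) << 2) | ((v >> 2) & 0x33333333u);
	if (bpp <= 4)	/* swap the two nibbles of each byte */
		v = ((v & 0x0f0f0f0fu) << 4) | ((v >> 4) & 0x0f0f0f0fu);
	return v;
}
```

Nothing in this operation depends on whether the framebuffer lives in
I/O or system memory, which is the point made above.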

One option is to bring the system version up to date, but from a
maintenance perspective it's better not to have two versions in the
first place.

This series moves the drawing routines (based on the cfb version) into
header files, which both the cfb and sys modules now include. Differences
in memory access and a few other minor details are handled with macros.
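
A minimal sketch of that pattern, under stated assumptions: the macro
names FB_MEM, FB_READL and FB_WRITEL appear in the series, but the
definitions below and the helper copy_words() are hypothetical stand-ins
for the system-memory variant, not the code from the patches.

```c
#include <stddef.h>

/*
 * Part 1: what an including .c file might define for the system-memory
 * variant. The macro bodies are assumptions made for illustration.
 */
#define FB_MEM			/* assumed to be __iomem in the I/O variant */
#define FB_READL(a)		(*(a))		/* I/O variant: fb_readl()/fb_readq() */
#define FB_WRITEL(v, a)		(*(a) = (v))	/* I/O variant: fb_writel()/fb_writeq() */

/*
 * Part 2: what would live in the shared header. copy_words() is a
 * hypothetical helper standing in for the real drawing routines.
 */
static void copy_words(unsigned long FB_MEM *dst,
		       const unsigned long FB_MEM *src, size_t n)
{
	while (n--)
		FB_WRITEL(FB_READL(src++), dst++);
}
```

The same function body then compiles against either kind of memory,
depending only on what the including file defined beforehand.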

The last patch adds a separate config option for the system version.

Zsolt Kajtar (13):
  fbdev: core: Copy cfbcopyarea to fb_copyarea
  fbdev: core: Make fb_copyarea generic
  fbdev: core: Use generic copyarea for as cfb_copyarea
  fbdev: core: Use generic copyarea for as sys_copyarea
  fbdev: core: Copy cfbfillrect to fb_fillrect
  fbdev: core: Make fb_fillrect generic
  fbdev: core: Use generic fillrect for as cfb_fillrect
  fbdev: core: Use generic fillrect for as sys_fillrect
  fbdev: core: Copy cfbimgblt to fb_imageblit
  fbdev: core: Make fb_imageblit generic
  fbdev: core: Use generic imageblit for as cfb_imageblit
  fbdev: core: Use generic imageblit for as sys_imageblit
  fbdev: core: Split CFB and SYS pixel reversing configuration

 drivers/video/fbdev/core/Kconfig        |  10 +-
 drivers/video/fbdev/core/cfbcopyarea.c  | 427 +-----------------------
 drivers/video/fbdev/core/cfbfillrect.c  | 363 +-------------------
 drivers/video/fbdev/core/cfbimgblt.c    | 358 +-------------------
 drivers/video/fbdev/core/fb_copyarea.h  | 421 +++++++++++++++++++++++
 drivers/video/fbdev/core/fb_draw.h      |   6 +-
 drivers/video/fbdev/core/fb_fillrect.h  | 359 ++++++++++++++++++++
 drivers/video/fbdev/core/fb_imageblit.h | 356 ++++++++++++++++++++
 drivers/video/fbdev/core/syscopyarea.c  | 358 +-------------------
 drivers/video/fbdev/core/sysfillrect.c  | 315 +----------------
 drivers/video/fbdev/core/sysimgblt.c    | 326 +-----------------
 11 files changed, 1208 insertions(+), 2091 deletions(-)
 create mode 100644 drivers/video/fbdev/core/fb_copyarea.h
 create mode 100644 drivers/video/fbdev/core/fb_fillrect.h
 create mode 100644 drivers/video/fbdev/core/fb_imageblit.h

-- 
2.30.2



* [PATCH RESEND 01/13] fbdev: core: Copy cfbcopyarea to fb_copyarea
  2025-02-07  4:18 [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops Zsolt Kajtar
@ 2025-02-07  4:18 ` Zsolt Kajtar
  2025-02-07  4:18 ` [PATCH RESEND 02/13] fbdev: core: Make fb_copyarea generic Zsolt Kajtar
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Zsolt Kajtar @ 2025-02-07  4:18 UTC
  To: linux-fbdev, dri-devel; +Cc: Zsolt Kajtar

Signed-off-by: Zsolt Kajtar <soci@c64.rulez.org>
---
 drivers/video/fbdev/core/fb_copyarea.h | 439 +++++++++++++++++++++++++
 1 file changed, 439 insertions(+)
 create mode 100644 drivers/video/fbdev/core/fb_copyarea.h

diff --git a/drivers/video/fbdev/core/fb_copyarea.h b/drivers/video/fbdev/core/fb_copyarea.h
new file mode 100644
index 000000000..f266de119
--- /dev/null
+++ b/drivers/video/fbdev/core/fb_copyarea.h
@@ -0,0 +1,439 @@
+/*
+ *  Generic function for frame buffer with packed pixels of any depth.
+ *
+ *      Copyright (C)  1999-2005 James Simmons <jsimmons@www.infradead.org>
+ *
+ *  This file is subject to the terms and conditions of the GNU General Public
+ *  License.  See the file COPYING in the main directory of this archive for
+ *  more details.
+ *
+ * NOTES:
+ *
+ *  This is for cfb packed pixels. Iplan and such are incorporated in the
+ *  drivers that need them.
+ *
+ *  FIXME
+ *
+ *  Also need to add code to deal with cards endians that are different than
+ *  the native cpu endians. I also need to deal with MSB position in the word.
+ *
+ *  The two functions for copying forward and backward could be split up like
+ *  the ones for filling, i.e. in aligned and unaligned versions. This would
+ *  help moving some redundant computations and branches out of the loop, too.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/fb.h>
+#include <asm/types.h>
+#include <asm/io.h>
+#include "fb_draw.h"
+
+#if BITS_PER_LONG == 32
+#  define FB_WRITEL fb_writel
+#  define FB_READL  fb_readl
+#else
+#  define FB_WRITEL fb_writeq
+#  define FB_READL  fb_readq
+#endif
+
+    /*
+     *  Generic bitwise copy algorithm
+     */
+
+static void
+bitcpy(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
+                const unsigned long __iomem *src, unsigned src_idx, int bits,
+                unsigned n, u32 bswapmask)
+{
+        unsigned long first, last;
+        int const shift = dst_idx-src_idx;
+
+#if 0
+        /*
+         * If you suspect bug in this function, compare it with this simple
+         * memmove implementation.
+         */
+        memmove((char *)dst + ((dst_idx & (bits - 1))) / 8,
+                (char *)src + ((src_idx & (bits - 1))) / 8, n / 8);
+        return;
+#endif
+
+        first = fb_shifted_pixels_mask_long(p, dst_idx, bswapmask);
+        last = ~fb_shifted_pixels_mask_long(p, (dst_idx+n) % bits, bswapmask);
+
+        if (!shift) {
+                // Same alignment for source and dest
+
+                if (dst_idx+n <= bits) {
+                        // Single word
+                        if (last)
+                                first &= last;
+                        FB_WRITEL( comp( FB_READL(src), FB_READL(dst), first), dst);
+                } else {
+                        // Multiple destination words
+
+                        // Leading bits
+                        if (first != ~0UL) {
+                                FB_WRITEL( comp( FB_READL(src), FB_READL(dst), first), dst);
+                                dst++;
+                                src++;
+                                n -= bits - dst_idx;
+                        }
+
+                        // Main chunk
+                        n /= bits;
+                        while (n >= 8) {
+                                FB_WRITEL(FB_READL(src++), dst++);
+                                FB_WRITEL(FB_READL(src++), dst++);
+                                FB_WRITEL(FB_READL(src++), dst++);
+                                FB_WRITEL(FB_READL(src++), dst++);
+                                FB_WRITEL(FB_READL(src++), dst++);
+                                FB_WRITEL(FB_READL(src++), dst++);
+                                FB_WRITEL(FB_READL(src++), dst++);
+                                FB_WRITEL(FB_READL(src++), dst++);
+                                n -= 8;
+                        }
+                        while (n--)
+                                FB_WRITEL(FB_READL(src++), dst++);
+
+                        // Trailing bits
+                        if (last)
+                                FB_WRITEL( comp( FB_READL(src), FB_READL(dst), last), dst);
+                }
+        } else {
+                /* Different alignment for source and dest */
+                unsigned long d0, d1;
+                int m;
+
+                int const left = shift & (bits - 1);
+                int const right = -shift & (bits - 1);
+
+                if (dst_idx+n <= bits) {
+                        // Single destination word
+                        if (last)
+                                first &= last;
+                        d0 = FB_READL(src);
+                        d0 = fb_rev_pixels_in_long(d0, bswapmask);
+                        if (shift > 0) {
+                                // Single source word
+                                d0 <<= left;
+                        } else if (src_idx+n <= bits) {
+                                // Single source word
+                                d0 >>= right;
+                        } else {
+                                // 2 source words
+                                d1 = FB_READL(src + 1);
+                                d1 = fb_rev_pixels_in_long(d1, bswapmask);
+                                d0 = d0 >> right | d1 << left;
+                        }
+                        d0 = fb_rev_pixels_in_long(d0, bswapmask);
+                        FB_WRITEL(comp(d0, FB_READL(dst), first), dst);
+                } else {
+                        // Multiple destination words
+                        /** We must always remember the last value read, because in case
+                        SRC and DST overlap bitwise (e.g. when moving just one pixel in
+                        1bpp), we always collect one full long for DST and that might
+                        overlap with the current long from SRC. We store this value in
+                        'd0'. */
+                        d0 = FB_READL(src++);
+                        d0 = fb_rev_pixels_in_long(d0, bswapmask);
+                        // Leading bits
+                        if (shift > 0) {
+                                // Single source word
+                                d1 = d0;
+                                d0 <<= left;
+                                n -= bits - dst_idx;
+                        } else {
+                                // 2 source words
+                                d1 = FB_READL(src++);
+                                d1 = fb_rev_pixels_in_long(d1, bswapmask);
+
+                                d0 = d0 >> right | d1 << left;
+                                n -= bits - dst_idx;
+                        }
+                        d0 = fb_rev_pixels_in_long(d0, bswapmask);
+                        FB_WRITEL(comp(d0, FB_READL(dst), first), dst);
+                        d0 = d1;
+                        dst++;
+
+                        // Main chunk
+                        m = n % bits;
+                        n /= bits;
+                        while ((n >= 4) && !bswapmask) {
+                                d1 = FB_READL(src++);
+                                FB_WRITEL(d0 >> right | d1 << left, dst++);
+                                d0 = d1;
+                                d1 = FB_READL(src++);
+                                FB_WRITEL(d0 >> right | d1 << left, dst++);
+                                d0 = d1;
+                                d1 = FB_READL(src++);
+                                FB_WRITEL(d0 >> right | d1 << left, dst++);
+                                d0 = d1;
+                                d1 = FB_READL(src++);
+                                FB_WRITEL(d0 >> right | d1 << left, dst++);
+                                d0 = d1;
+                                n -= 4;
+                        }
+                        while (n--) {
+                                d1 = FB_READL(src++);
+                                d1 = fb_rev_pixels_in_long(d1, bswapmask);
+                                d0 = d0 >> right | d1 << left;
+                                d0 = fb_rev_pixels_in_long(d0, bswapmask);
+                                FB_WRITEL(d0, dst++);
+                                d0 = d1;
+                        }
+
+                        // Trailing bits
+                        if (m) {
+                                if (m <= bits - right) {
+                                        // Single source word
+                                        d0 >>= right;
+                                } else {
+                                        // 2 source words
+                                        d1 = FB_READL(src);
+                                        d1 = fb_rev_pixels_in_long(d1,
+                                                                bswapmask);
+                                        d0 = d0 >> right | d1 << left;
+                                }
+                                d0 = fb_rev_pixels_in_long(d0, bswapmask);
+                                FB_WRITEL(comp(d0, FB_READL(dst), last), dst);
+                        }
+                }
+        }
+}
+
+    /*
+     *  Generic bitwise copy algorithm, operating backward
+     */
+
+static void
+bitcpy_rev(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
+                const unsigned long __iomem *src, unsigned src_idx, int bits,
+                unsigned n, u32 bswapmask)
+{
+        unsigned long first, last;
+        int shift;
+
+#if 0
+        /*
+         * If you suspect bug in this function, compare it with this simple
+         * memmove implementation.
+         */
+        memmove((char *)dst + ((dst_idx & (bits - 1))) / 8,
+                (char *)src + ((src_idx & (bits - 1))) / 8, n / 8);
+        return;
+#endif
+
+        dst += (dst_idx + n - 1) / bits;
+        src += (src_idx + n - 1) / bits;
+        dst_idx = (dst_idx + n - 1) % bits;
+        src_idx = (src_idx + n - 1) % bits;
+
+        shift = dst_idx-src_idx;
+
+        first = ~fb_shifted_pixels_mask_long(p, (dst_idx + 1) % bits, bswapmask);
+        last = fb_shifted_pixels_mask_long(p, (bits + dst_idx + 1 - n) % bits, bswapmask);
+
+        if (!shift) {
+                // Same alignment for source and dest
+
+                if ((unsigned long)dst_idx+1 >= n) {
+                        // Single word
+                        if (first)
+                                last &= first;
+                        FB_WRITEL( comp( FB_READL(src), FB_READL(dst), last), dst);
+                } else {
+                        // Multiple destination words
+
+                        // Leading bits
+                        if (first) {
+                                FB_WRITEL( comp( FB_READL(src), FB_READL(dst), first), dst);
+                                dst--;
+                                src--;
+                                n -= dst_idx+1;
+                        }
+
+                        // Main chunk
+                        n /= bits;
+                        while (n >= 8) {
+                                FB_WRITEL(FB_READL(src--), dst--);
+                                FB_WRITEL(FB_READL(src--), dst--);
+                                FB_WRITEL(FB_READL(src--), dst--);
+                                FB_WRITEL(FB_READL(src--), dst--);
+                                FB_WRITEL(FB_READL(src--), dst--);
+                                FB_WRITEL(FB_READL(src--), dst--);
+                                FB_WRITEL(FB_READL(src--), dst--);
+                                FB_WRITEL(FB_READL(src--), dst--);
+                                n -= 8;
+                        }
+                        while (n--)
+                                FB_WRITEL(FB_READL(src--), dst--);
+
+                        // Trailing bits
+                        if (last != -1UL)
+                                FB_WRITEL( comp( FB_READL(src), FB_READL(dst), last), dst);
+                }
+        } else {
+                // Different alignment for source and dest
+                unsigned long d0, d1;
+                int m;
+
+                int const left = shift & (bits-1);
+                int const right = -shift & (bits-1);
+
+                if ((unsigned long)dst_idx+1 >= n) {
+                        // Single destination word
+                        if (first)
+                                last &= first;
+                        d0 = FB_READL(src);
+                        if (shift < 0) {
+                                // Single source word
+                                d0 >>= right;
+                        } else if (1+(unsigned long)src_idx >= n) {
+                                // Single source word
+                                d0 <<= left;
+                        } else {
+                                // 2 source words
+                                d1 = FB_READL(src - 1);
+                                d1 = fb_rev_pixels_in_long(d1, bswapmask);
+                                d0 = d0 << left | d1 >> right;
+                        }
+                        d0 = fb_rev_pixels_in_long(d0, bswapmask);
+                        FB_WRITEL(comp(d0, FB_READL(dst), last), dst);
+                } else {
+                        // Multiple destination words
+                        /** We must always remember the last value read, because in case
+                        SRC and DST overlap bitwise (e.g. when moving just one pixel in
+                        1bpp), we always collect one full long for DST and that might
+                        overlap with the current long from SRC. We store this value in
+                        'd0'. */
+
+                        d0 = FB_READL(src--);
+                        d0 = fb_rev_pixels_in_long(d0, bswapmask);
+                        // Leading bits
+                        if (shift < 0) {
+                                // Single source word
+                                d1 = d0;
+                                d0 >>= right;
+                        } else {
+                                // 2 source words
+                                d1 = FB_READL(src--);
+                                d1 = fb_rev_pixels_in_long(d1, bswapmask);
+                                d0 = d0 << left | d1 >> right;
+                        }
+                        d0 = fb_rev_pixels_in_long(d0, bswapmask);
+                        if (!first)
+                                FB_WRITEL(d0, dst);
+                        else
+                                FB_WRITEL(comp(d0, FB_READL(dst), first), dst);
+                        d0 = d1;
+                        dst--;
+                        n -= dst_idx+1;
+
+                        // Main chunk
+                        m = n % bits;
+                        n /= bits;
+                        while ((n >= 4) && !bswapmask) {
+                                d1 = FB_READL(src--);
+                                FB_WRITEL(d0 << left | d1 >> right, dst--);
+                                d0 = d1;
+                                d1 = FB_READL(src--);
+                                FB_WRITEL(d0 << left | d1 >> right, dst--);
+                                d0 = d1;
+                                d1 = FB_READL(src--);
+                                FB_WRITEL(d0 << left | d1 >> right, dst--);
+                                d0 = d1;
+                                d1 = FB_READL(src--);
+                                FB_WRITEL(d0 << left | d1 >> right, dst--);
+                                d0 = d1;
+                                n -= 4;
+                        }
+                        while (n--) {
+                                d1 = FB_READL(src--);
+                                d1 = fb_rev_pixels_in_long(d1, bswapmask);
+                                d0 = d0 << left | d1 >> right;
+                                d0 = fb_rev_pixels_in_long(d0, bswapmask);
+                                FB_WRITEL(d0, dst--);
+                                d0 = d1;
+                        }
+
+                        // Trailing bits
+                        if (m) {
+                                if (m <= bits - left) {
+                                        // Single source word
+                                        d0 <<= left;
+                                } else {
+                                        // 2 source words
+                                        d1 = FB_READL(src);
+                                        d1 = fb_rev_pixels_in_long(d1,
+                                                                bswapmask);
+                                        d0 = d0 << left | d1 >> right;
+                                }
+                                d0 = fb_rev_pixels_in_long(d0, bswapmask);
+                                FB_WRITEL(comp(d0, FB_READL(dst), last), dst);
+                        }
+                }
+        }
+}
+
+void cfb_copyarea(struct fb_info *p, const struct fb_copyarea *area)
+{
+        u32 dx = area->dx, dy = area->dy, sx = area->sx, sy = area->sy;
+        u32 height = area->height, width = area->width;
+        unsigned int const bits_per_line = p->fix.line_length * 8u;
+        unsigned long __iomem *base = NULL;
+        int bits = BITS_PER_LONG, bytes = bits >> 3;
+        unsigned dst_idx = 0, src_idx = 0, rev_copy = 0;
+        u32 bswapmask = fb_compute_bswapmask(p);
+
+        if (p->state != FBINFO_STATE_RUNNING)
+                return;
+
+        if (p->flags & FBINFO_VIRTFB)
+                fb_warn_once(p, "Framebuffer is not in I/O address space.");
+
+        /* if the beginning of the target area might overlap with the end of
+        the source area, we have to copy the area in reverse. */
+        if ((dy == sy && dx > sx) || (dy > sy)) {
+                dy += height;
+                sy += height;
+                rev_copy = 1;
+        }
+
+        // split the base of the framebuffer into a long-aligned address and the
+        // index of the first bit
+        base = (unsigned long __iomem *)((unsigned long)p->screen_base & ~(bytes-1));
+        dst_idx = src_idx = 8*((unsigned long)p->screen_base & (bytes-1));
+        // add offset of source and target area
+        dst_idx += dy*bits_per_line + dx*p->var.bits_per_pixel;
+        src_idx += sy*bits_per_line + sx*p->var.bits_per_pixel;
+
+        if (p->fbops->fb_sync)
+                p->fbops->fb_sync(p);
+
+        if (rev_copy) {
+                while (height--) {
+                        dst_idx -= bits_per_line;
+                        src_idx -= bits_per_line;
+                        bitcpy_rev(p, base + (dst_idx / bits), dst_idx % bits,
+                                base + (src_idx / bits), src_idx % bits, bits,
+                                width*p->var.bits_per_pixel, bswapmask);
+                }
+        } else {
+                while (height--) {
+                        bitcpy(p, base + (dst_idx / bits), dst_idx % bits,
+                                base + (src_idx / bits), src_idx % bits, bits,
+                                width*p->var.bits_per_pixel, bswapmask);
+                        dst_idx += bits_per_line;
+                        src_idx += bits_per_line;
+                }
+        }
+}
+
+EXPORT_SYMBOL(cfb_copyarea);
+
+MODULE_AUTHOR("James Simmons <jsimmons@users.sf.net>");
+MODULE_DESCRIPTION("Generic software accelerated copyarea");
+MODULE_LICENSE("GPL");
-- 
2.30.2
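
As an aside on the direction test in cfb_copyarea() above: the sketch
below restates the condition as a predicate and shows, at byte rather
than bit granularity, why an overlapping copy to a higher offset has to
run backward. Both needs_reverse_copy() and copy_row() are hypothetical
names used only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Same condition as in cfb_copyarea(): copy backward when the
 * destination starts after the source in scan order.
 */
static bool needs_reverse_copy(unsigned int dx, unsigned int dy,
			       unsigned int sx, unsigned int sy)
{
	return (dy == sy && dx > sx) || dy > sy;
}

/*
 * Hypothetical byte-granular stand-in for bitcpy()/bitcpy_rev() on a
 * single row: pick the direction that never overwrites unread source.
 */
static void copy_row(unsigned char *row, size_t dx, size_t sx, size_t width)
{
	size_t i;

	if (dx > sx) {
		/* destination is ahead of source: walk from the end */
		for (i = width; i-- > 0; )
			row[dx + i] = row[sx + i];
	} else {
		/* destination is behind source: walk from the start */
		for (i = 0; i < width; i++)
			row[dx + i] = row[sx + i];
	}
}
```

A forward loop in the first case would read bytes it had already
overwritten, which is the problem the reverse variant avoids.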



* [PATCH RESEND 02/13] fbdev: core: Make fb_copyarea generic
  2025-02-07  4:18 [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops Zsolt Kajtar
  2025-02-07  4:18 ` [PATCH RESEND 01/13] fbdev: core: Copy cfbcopyarea to fb_copyarea Zsolt Kajtar
@ 2025-02-07  4:18 ` Zsolt Kajtar
  2025-02-07  4:18 ` [PATCH RESEND 03/13] fbdev: core: Use generic copyarea for as cfb_copyarea Zsolt Kajtar
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Zsolt Kajtar @ 2025-02-07  4:18 UTC
  To: linux-fbdev, dri-devel; +Cc: Zsolt Kajtar

Signed-off-by: Zsolt Kajtar <soci@c64.rulez.org>
---
 drivers/video/fbdev/core/fb_copyarea.h | 144 +++++++++++--------------
 1 file changed, 63 insertions(+), 81 deletions(-)

diff --git a/drivers/video/fbdev/core/fb_copyarea.h b/drivers/video/fbdev/core/fb_copyarea.h
index f266de119..4d7b1acd5 100644
--- a/drivers/video/fbdev/core/fb_copyarea.h
+++ b/drivers/video/fbdev/core/fb_copyarea.h
@@ -21,30 +21,15 @@
  *  the ones for filling, i.e. in aligned and unaligned versions. This would
  *  help moving some redundant computations and branches out of the loop, too.
  */
-
-#include <linux/module.h>
-#include <linux/kernel.h>
-#include <linux/string.h>
-#include <linux/fb.h>
-#include <asm/types.h>
-#include <asm/io.h>
 #include "fb_draw.h"
 
-#if BITS_PER_LONG == 32
-#  define FB_WRITEL fb_writel
-#  define FB_READL  fb_readl
-#else
-#  define FB_WRITEL fb_writeq
-#  define FB_READL  fb_readq
-#endif
-
     /*
      *  Generic bitwise copy algorithm
      */
 
 static void
-bitcpy(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
-                const unsigned long __iomem *src, unsigned src_idx, int bits,
+bitcpy(struct fb_info *p, unsigned long FB_MEM *dst, unsigned dst_idx,
+                const unsigned long FB_MEM *src, unsigned src_idx, int bits,
                 unsigned n, u32 bswapmask)
 {
         unsigned long first, last;
@@ -64,17 +49,17 @@ bitcpy(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
         last = ~fb_shifted_pixels_mask_long(p, (dst_idx+n) % bits, bswapmask);
 
         if (!shift) {
-                // Same alignment for source and dest
+                /* Same alignment for source and dest */
 
                 if (dst_idx+n <= bits) {
-                        // Single word
+                        /* Single word */
                         if (last)
                                 first &= last;
                         FB_WRITEL( comp( FB_READL(src), FB_READL(dst), first), dst);
                 } else {
-                        // Multiple destination words
+                        /* Multiple destination words */
 
-                        // Leading bits
+                        /* Leading bits */
                         if (first != ~0UL) {
                                 FB_WRITEL( comp( FB_READL(src), FB_READL(dst), first), dst);
                                 dst++;
@@ -82,7 +67,7 @@ bitcpy(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
                                 n -= bits - dst_idx;
                         }
 
-                        // Main chunk
+                        /* Main chunk */
                         n /= bits;
                         while (n >= 8) {
                                 FB_WRITEL(FB_READL(src++), dst++);
@@ -98,7 +83,7 @@ bitcpy(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
                         while (n--)
                                 FB_WRITEL(FB_READL(src++), dst++);
 
-                        // Trailing bits
+                        /* Trailing bits */
                         if (last)
                                 FB_WRITEL( comp( FB_READL(src), FB_READL(dst), last), dst);
                 }
@@ -111,19 +96,19 @@ bitcpy(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
                 int const right = -shift & (bits - 1);
 
                 if (dst_idx+n <= bits) {
-                        // Single destination word
+                        /* Single destination word */
                         if (last)
                                 first &= last;
                         d0 = FB_READL(src);
                         d0 = fb_rev_pixels_in_long(d0, bswapmask);
                         if (shift > 0) {
-                                // Single source word
+                                /* Single source word */
                                 d0 <<= left;
                         } else if (src_idx+n <= bits) {
-                                // Single source word
+                                /* Single source word */
                                 d0 >>= right;
                         } else {
-                                // 2 source words
+                                /* 2 source words */
                                 d1 = FB_READL(src + 1);
                                 d1 = fb_rev_pixels_in_long(d1, bswapmask);
                                 d0 = d0 >> right | d1 << left;
@@ -131,22 +116,23 @@ bitcpy(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
                         d0 = fb_rev_pixels_in_long(d0, bswapmask);
                         FB_WRITEL(comp(d0, FB_READL(dst), first), dst);
                 } else {
-                        // Multiple destination words
-                        /** We must always remember the last value read, because in case
-                        SRC and DST overlap bitwise (e.g. when moving just one pixel in
-                        1bpp), we always collect one full long for DST and that might
-                        overlap with the current long from SRC. We store this value in
-                        'd0'. */
+                        /* Multiple destination words */
+                        /** We must always remember the last value read,
+                            because in case SRC and DST overlap bitwise (e.g.
+                            when moving just one pixel in 1bpp), we always
+                            collect one full long for DST and that might
+                            overlap with the current long from SRC. We store
+                            this value in 'd0'. */
                         d0 = FB_READL(src++);
                         d0 = fb_rev_pixels_in_long(d0, bswapmask);
-                        // Leading bits
+                        /* Leading bits */
                         if (shift > 0) {
-                                // Single source word
+                                /* Single source word */
                                 d1 = d0;
                                 d0 <<= left;
                                 n -= bits - dst_idx;
                         } else {
-                                // 2 source words
+                                /* 2 source words */
                                 d1 = FB_READL(src++);
                                 d1 = fb_rev_pixels_in_long(d1, bswapmask);
 
@@ -158,7 +144,7 @@ bitcpy(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
                         d0 = d1;
                         dst++;
 
-                        // Main chunk
+                        /* Main chunk */
                         m = n % bits;
                         n /= bits;
                         while ((n >= 4) && !bswapmask) {
@@ -185,13 +171,13 @@ bitcpy(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
                                 d0 = d1;
                         }
 
-                        // Trailing bits
+                        /* Trailing bits */
                         if (m) {
                                 if (m <= bits - right) {
-                                        // Single source word
+                                        /* Single source word */
                                         d0 >>= right;
                                 } else {
-                                        // 2 source words
+                                        /* 2 source words */
                                         d1 = FB_READL(src);
                                         d1 = fb_rev_pixels_in_long(d1,
                                                                 bswapmask);
@@ -209,8 +195,8 @@ bitcpy(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
      */
 
 static void
-bitcpy_rev(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
-                const unsigned long __iomem *src, unsigned src_idx, int bits,
+bitcpy_rev(struct fb_info *p, unsigned long FB_MEM *dst, unsigned dst_idx,
+                const unsigned long FB_MEM *src, unsigned src_idx, int bits,
                 unsigned n, u32 bswapmask)
 {
         unsigned long first, last;
@@ -237,17 +223,17 @@ bitcpy_rev(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
         last = fb_shifted_pixels_mask_long(p, (bits + dst_idx + 1 - n) % bits, bswapmask);
 
         if (!shift) {
-                // Same alignment for source and dest
+                /* Same alignment for source and dest */
 
                 if ((unsigned long)dst_idx+1 >= n) {
-                        // Single word
+                        /* Single word */
                         if (first)
                                 last &= first;
                         FB_WRITEL( comp( FB_READL(src), FB_READL(dst), last), dst);
                 } else {
-                        // Multiple destination words
+                        /* Multiple destination words */
 
-                        // Leading bits
+                        /* Leading bits */
                         if (first) {
                                 FB_WRITEL( comp( FB_READL(src), FB_READL(dst), first), dst);
                                 dst--;
@@ -255,7 +241,7 @@ bitcpy_rev(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
                                 n -= dst_idx+1;
                         }
 
-                        // Main chunk
+                        /* Main chunk */
                         n /= bits;
                         while (n >= 8) {
                                 FB_WRITEL(FB_READL(src--), dst--);
@@ -271,12 +257,12 @@ bitcpy_rev(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
                         while (n--)
                                 FB_WRITEL(FB_READL(src--), dst--);
 
-                        // Trailing bits
+                        /* Trailing bits */
                         if (last != -1UL)
                                 FB_WRITEL( comp( FB_READL(src), FB_READL(dst), last), dst);
                 }
         } else {
-                // Different alignment for source and dest
+                /* Different alignment for source and dest */
                 unsigned long d0, d1;
                 int m;
 
@@ -284,18 +270,18 @@ bitcpy_rev(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
                 int const right = -shift & (bits-1);
 
                 if ((unsigned long)dst_idx+1 >= n) {
-                        // Single destination word
+                        /* Single destination word */
                         if (first)
                                 last &= first;
                         d0 = FB_READL(src);
                         if (shift < 0) {
-                                // Single source word
+                                /* Single source word */
                                 d0 >>= right;
                         } else if (1+(unsigned long)src_idx >= n) {
-                                // Single source word
+                                /* Single source word */
                                 d0 <<= left;
                         } else {
-                                // 2 source words
+                                /* 2 source words */
                                 d1 = FB_READL(src - 1);
                                 d1 = fb_rev_pixels_in_long(d1, bswapmask);
                                 d0 = d0 << left | d1 >> right;
@@ -303,22 +289,23 @@ bitcpy_rev(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
                         d0 = fb_rev_pixels_in_long(d0, bswapmask);
                         FB_WRITEL(comp(d0, FB_READL(dst), last), dst);
                 } else {
-                        // Multiple destination words
-                        /** We must always remember the last value read, because in case
-                        SRC and DST overlap bitwise (e.g. when moving just one pixel in
-                        1bpp), we always collect one full long for DST and that might
-                        overlap with the current long from SRC. We store this value in
-                        'd0'. */
+                        /* Multiple destination words */
+                        /** We must always remember the last value read,
+                            because in case SRC and DST overlap bitwise (e.g.
+                            when moving just one pixel in 1bpp), we always
+                            collect one full long for DST and that might
+                            overlap with the current long from SRC. We store
+                            this value in 'd0'. */
 
                         d0 = FB_READL(src--);
                         d0 = fb_rev_pixels_in_long(d0, bswapmask);
-                        // Leading bits
+                        /* Leading bits */
                         if (shift < 0) {
-                                // Single source word
+                                /* Single source word */
                                 d1 = d0;
                                 d0 >>= right;
                         } else {
-                                // 2 source words
+                                /* 2 source words */
                                 d1 = FB_READL(src--);
                                 d1 = fb_rev_pixels_in_long(d1, bswapmask);
                                 d0 = d0 << left | d1 >> right;
@@ -332,7 +319,7 @@ bitcpy_rev(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
                         dst--;
                         n -= dst_idx+1;
 
-                        // Main chunk
+                        /* Main chunk */
                         m = n % bits;
                         n /= bits;
                         while ((n >= 4) && !bswapmask) {
@@ -359,13 +346,13 @@ bitcpy_rev(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
                                 d0 = d1;
                         }
 
-                        // Trailing bits
+                        /* Trailing bits */
                         if (m) {
                                 if (m <= bits - left) {
-                                        // Single source word
+                                        /* Single source word */
                                         d0 <<= left;
                                 } else {
-                                        // 2 source words
+                                        /* 2 source words */
                                         d1 = FB_READL(src);
                                         d1 = fb_rev_pixels_in_long(d1,
                                                                 bswapmask);
@@ -378,12 +365,12 @@ bitcpy_rev(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
         }
 }
 
-void cfb_copyarea(struct fb_info *p, const struct fb_copyarea *area)
+void FB_COPYAREA(struct fb_info *p, const struct fb_copyarea *area)
 {
         u32 dx = area->dx, dy = area->dy, sx = area->sx, sy = area->sy;
         u32 height = area->height, width = area->width;
         unsigned int const bits_per_line = p->fix.line_length * 8u;
-        unsigned long __iomem *base = NULL;
+        unsigned long FB_MEM *base = NULL;
         int bits = BITS_PER_LONG, bytes = bits >> 3;
         unsigned dst_idx = 0, src_idx = 0, rev_copy = 0;
         u32 bswapmask = fb_compute_bswapmask(p);
@@ -391,8 +378,9 @@ void cfb_copyarea(struct fb_info *p, const struct fb_copyarea *area)
         if (p->state != FBINFO_STATE_RUNNING)
                 return;
 
-        if (p->flags & FBINFO_VIRTFB)
-                fb_warn_once(p, "Framebuffer is not in I/O address space.");
+        if ((p->flags & FBINFO_VIRTFB) != FB_SPACE)
+                fb_warn_once(p, "Framebuffer is not in " FB_SPACE_NAME
+                             " address space.");
 
         /* if the beginning of the target area might overlap with the end of
         the source area, be have to copy the area reverse. */
@@ -402,11 +390,11 @@ void cfb_copyarea(struct fb_info *p, const struct fb_copyarea *area)
                 rev_copy = 1;
         }
 
-        // split the base of the framebuffer into a long-aligned address and the
-        // index of the first bit
-        base = (unsigned long __iomem *)((unsigned long)p->screen_base & ~(bytes-1));
-        dst_idx = src_idx = 8*((unsigned long)p->screen_base & (bytes-1));
-        // add offset of source and target area
+        /* split the base of the framebuffer into a long-aligned address and
+           the index of the first bit */
+        base = (unsigned long FB_MEM *)((unsigned long)FB_SCREEN_BASE(p) & ~(bytes-1));
+        dst_idx = src_idx = 8*((unsigned long)FB_SCREEN_BASE(p) & (bytes-1));
+        /* add offset of source and target area */
         dst_idx += dy*bits_per_line + dx*p->var.bits_per_pixel;
         src_idx += sy*bits_per_line + sx*p->var.bits_per_pixel;
 
@@ -431,9 +419,3 @@ void cfb_copyarea(struct fb_info *p, const struct fb_copyarea *area)
                 }
         }
 }
-
-EXPORT_SYMBOL(cfb_copyarea);
-
-MODULE_AUTHOR("James Simmons <jsimmons@users.sf.net>");
-MODULE_DESCRIPTION("Generic software accelerated copyarea");
-MODULE_LICENSE("GPL");
-- 
2.30.2


* [PATCH RESEND 03/13] fbdev: core: Use generic copyarea for as cfb_copyarea
  2025-02-07  4:18 [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops Zsolt Kajtar
  2025-02-07  4:18 ` [PATCH RESEND 01/13] fbdev: core: Copy cfbcopyarea to fb_copyarea Zsolt Kajtar
  2025-02-07  4:18 ` [PATCH RESEND 02/13] fbdev: core: Make fb_copyarea generic Zsolt Kajtar
@ 2025-02-07  4:18 ` Zsolt Kajtar
  2025-02-07  4:18 ` [PATCH RESEND 04/13] fbdev: core: Use generic copyarea for as sys_copyarea Zsolt Kajtar
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Zsolt Kajtar @ 2025-02-07  4:18 UTC (permalink / raw)
  To: linux-fbdev, dri-devel; +Cc: Zsolt Kajtar

Signed-off-by: Zsolt Kajtar <soci@c64.rulez.org>
---
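Note for readers (not part of the commit message): a minimal user-space
sketch of the pattern this patch relies on, assuming nothing beyond
standard C. The drawing body is written once against
FB_READL/FB_WRITEL-style hooks and instantiated once per memory type. In
the patch the shared body lives in fb_copyarea.h and is pulled in with
#include; the sketch uses a function-like macro instead, only so that it
fits in one file. All demo_* and DEMO_* names are hypothetical and do not
exist in the kernel.

```c
#include <stddef.h>
#include <string.h>

/*
 * Stand-in for the shared header: one body, written against READL/WRITEL
 * hooks that the including file chooses.
 */
#define DEMO_DEFINE_COPY(name, READL, WRITEL)                             \
static void name(unsigned long *dst, const unsigned long *src, size_t n) \
{                                                                         \
        while (n--)                                                       \
                WRITEL(READL(src++), dst++);                              \
}

/* "I/O" flavour: hypothetical accessors standing in for fb_readl/fb_writel. */
static unsigned long demo_io_read(const volatile unsigned long *addr)
{
        return *addr;
}

static void demo_io_write(unsigned long val, volatile unsigned long *addr)
{
        *addr = val;
}

/* "System memory" flavour: plain dereferences. */
#define DEMO_SYS_READ(a)      (*(a))
#define DEMO_SYS_WRITE(v, a)  do { *(a) = (v); } while (0)

/* Two instantiations of the same body, one per memory type. */
DEMO_DEFINE_COPY(demo_cfb_copy, demo_io_read, demo_io_write)
DEMO_DEFINE_COPY(demo_sys_copy, DEMO_SYS_READ, DEMO_SYS_WRITE)
```

Calling demo_cfb_copy(dst, src, n) and demo_sys_copy(dst, src, n) on the
same input should leave identical destination buffers; only the access
primitive differs between the two.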
 drivers/video/fbdev/core/cfbcopyarea.c | 426 +------------------------
 1 file changed, 10 insertions(+), 416 deletions(-)

diff --git a/drivers/video/fbdev/core/cfbcopyarea.c b/drivers/video/fbdev/core/cfbcopyarea.c
index a271f57d9..ba0ebd115 100644
--- a/drivers/video/fbdev/core/cfbcopyarea.c
+++ b/drivers/video/fbdev/core/cfbcopyarea.c
@@ -7,434 +7,28 @@
  *  License.  See the file COPYING in the main directory of this archive for
  *  more details.
  *
- * NOTES:
- *
- *  This is for cfb packed pixels. Iplan and such are incorporated in the
- *  drivers that need them.
- *
- *  FIXME
- *
- *  Also need to add code to deal with cards endians that are different than
- *  the native cpu endians. I also need to deal with MSB position in the word.
- *
- *  The two functions or copying forward and backward could be split up like
- *  the ones for filling, i.e. in aligned and unaligned versions. This would
- *  help moving some redundant computations and branches out of the loop, too.
  */
 
 #include <linux/module.h>
-#include <linux/kernel.h>
-#include <linux/string.h>
 #include <linux/fb.h>
 #include <asm/types.h>
-#include <asm/io.h>
-#include "fb_draw.h"
 
 #if BITS_PER_LONG == 32
-#  define FB_WRITEL fb_writel
-#  define FB_READL  fb_readl
+#  define FB_WRITEL       fb_writel
+#  define FB_READL        fb_readl
 #else
-#  define FB_WRITEL fb_writeq
-#  define FB_READL  fb_readq
-#endif
-
-    /*
-     *  Generic bitwise copy algorithm
-     */
-
-static void
-bitcpy(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
-		const unsigned long __iomem *src, unsigned src_idx, int bits,
-		unsigned n, u32 bswapmask)
-{
-	unsigned long first, last;
-	int const shift = dst_idx-src_idx;
-
-#if 0
-	/*
-	 * If you suspect bug in this function, compare it with this simple
-	 * memmove implementation.
-	 */
-	memmove((char *)dst + ((dst_idx & (bits - 1))) / 8,
-		(char *)src + ((src_idx & (bits - 1))) / 8, n / 8);
-	return;
-#endif
-
-	first = fb_shifted_pixels_mask_long(p, dst_idx, bswapmask);
-	last = ~fb_shifted_pixels_mask_long(p, (dst_idx+n) % bits, bswapmask);
-
-	if (!shift) {
-		// Same alignment for source and dest
-
-		if (dst_idx+n <= bits) {
-			// Single word
-			if (last)
-				first &= last;
-			FB_WRITEL( comp( FB_READL(src), FB_READL(dst), first), dst);
-		} else {
-			// Multiple destination words
-
-			// Leading bits
-			if (first != ~0UL) {
-				FB_WRITEL( comp( FB_READL(src), FB_READL(dst), first), dst);
-				dst++;
-				src++;
-				n -= bits - dst_idx;
-			}
-
-			// Main chunk
-			n /= bits;
-			while (n >= 8) {
-				FB_WRITEL(FB_READL(src++), dst++);
-				FB_WRITEL(FB_READL(src++), dst++);
-				FB_WRITEL(FB_READL(src++), dst++);
-				FB_WRITEL(FB_READL(src++), dst++);
-				FB_WRITEL(FB_READL(src++), dst++);
-				FB_WRITEL(FB_READL(src++), dst++);
-				FB_WRITEL(FB_READL(src++), dst++);
-				FB_WRITEL(FB_READL(src++), dst++);
-				n -= 8;
-			}
-			while (n--)
-				FB_WRITEL(FB_READL(src++), dst++);
-
-			// Trailing bits
-			if (last)
-				FB_WRITEL( comp( FB_READL(src), FB_READL(dst), last), dst);
-		}
-	} else {
-		/* Different alignment for source and dest */
-		unsigned long d0, d1;
-		int m;
-
-		int const left = shift & (bits - 1);
-		int const right = -shift & (bits - 1);
-
-		if (dst_idx+n <= bits) {
-			// Single destination word
-			if (last)
-				first &= last;
-			d0 = FB_READL(src);
-			d0 = fb_rev_pixels_in_long(d0, bswapmask);
-			if (shift > 0) {
-				// Single source word
-				d0 <<= left;
-			} else if (src_idx+n <= bits) {
-				// Single source word
-				d0 >>= right;
-			} else {
-				// 2 source words
-				d1 = FB_READL(src + 1);
-				d1 = fb_rev_pixels_in_long(d1, bswapmask);
-				d0 = d0 >> right | d1 << left;
-			}
-			d0 = fb_rev_pixels_in_long(d0, bswapmask);
-			FB_WRITEL(comp(d0, FB_READL(dst), first), dst);
-		} else {
-			// Multiple destination words
-			/** We must always remember the last value read, because in case
-			SRC and DST overlap bitwise (e.g. when moving just one pixel in
-			1bpp), we always collect one full long for DST and that might
-			overlap with the current long from SRC. We store this value in
-			'd0'. */
-			d0 = FB_READL(src++);
-			d0 = fb_rev_pixels_in_long(d0, bswapmask);
-			// Leading bits
-			if (shift > 0) {
-				// Single source word
-				d1 = d0;
-				d0 <<= left;
-				n -= bits - dst_idx;
-			} else {
-				// 2 source words
-				d1 = FB_READL(src++);
-				d1 = fb_rev_pixels_in_long(d1, bswapmask);
-
-				d0 = d0 >> right | d1 << left;
-				n -= bits - dst_idx;
-			}
-			d0 = fb_rev_pixels_in_long(d0, bswapmask);
-			FB_WRITEL(comp(d0, FB_READL(dst), first), dst);
-			d0 = d1;
-			dst++;
-
-			// Main chunk
-			m = n % bits;
-			n /= bits;
-			while ((n >= 4) && !bswapmask) {
-				d1 = FB_READL(src++);
-				FB_WRITEL(d0 >> right | d1 << left, dst++);
-				d0 = d1;
-				d1 = FB_READL(src++);
-				FB_WRITEL(d0 >> right | d1 << left, dst++);
-				d0 = d1;
-				d1 = FB_READL(src++);
-				FB_WRITEL(d0 >> right | d1 << left, dst++);
-				d0 = d1;
-				d1 = FB_READL(src++);
-				FB_WRITEL(d0 >> right | d1 << left, dst++);
-				d0 = d1;
-				n -= 4;
-			}
-			while (n--) {
-				d1 = FB_READL(src++);
-				d1 = fb_rev_pixels_in_long(d1, bswapmask);
-				d0 = d0 >> right | d1 << left;
-				d0 = fb_rev_pixels_in_long(d0, bswapmask);
-				FB_WRITEL(d0, dst++);
-				d0 = d1;
-			}
-
-			// Trailing bits
-			if (m) {
-				if (m <= bits - right) {
-					// Single source word
-					d0 >>= right;
-				} else {
-					// 2 source words
-					d1 = FB_READL(src);
-					d1 = fb_rev_pixels_in_long(d1,
-								bswapmask);
-					d0 = d0 >> right | d1 << left;
-				}
-				d0 = fb_rev_pixels_in_long(d0, bswapmask);
-				FB_WRITEL(comp(d0, FB_READL(dst), last), dst);
-			}
-		}
-	}
-}
-
-    /*
-     *  Generic bitwise copy algorithm, operating backward
-     */
-
-static void
-bitcpy_rev(struct fb_info *p, unsigned long __iomem *dst, unsigned dst_idx,
-		const unsigned long __iomem *src, unsigned src_idx, int bits,
-		unsigned n, u32 bswapmask)
-{
-	unsigned long first, last;
-	int shift;
-
-#if 0
-	/*
-	 * If you suspect bug in this function, compare it with this simple
-	 * memmove implementation.
-	 */
-	memmove((char *)dst + ((dst_idx & (bits - 1))) / 8,
-		(char *)src + ((src_idx & (bits - 1))) / 8, n / 8);
-	return;
+#  define FB_WRITEL       fb_writeq
+#  define FB_READL        fb_readq
 #endif
-
-	dst += (dst_idx + n - 1) / bits;
-	src += (src_idx + n - 1) / bits;
-	dst_idx = (dst_idx + n - 1) % bits;
-	src_idx = (src_idx + n - 1) % bits;
-
-	shift = dst_idx-src_idx;
-
-	first = ~fb_shifted_pixels_mask_long(p, (dst_idx + 1) % bits, bswapmask);
-	last = fb_shifted_pixels_mask_long(p, (bits + dst_idx + 1 - n) % bits, bswapmask);
-
-	if (!shift) {
-		// Same alignment for source and dest
-
-		if ((unsigned long)dst_idx+1 >= n) {
-			// Single word
-			if (first)
-				last &= first;
-			FB_WRITEL( comp( FB_READL(src), FB_READL(dst), last), dst);
-		} else {
-			// Multiple destination words
-
-			// Leading bits
-			if (first) {
-				FB_WRITEL( comp( FB_READL(src), FB_READL(dst), first), dst);
-				dst--;
-				src--;
-				n -= dst_idx+1;
-			}
-
-			// Main chunk
-			n /= bits;
-			while (n >= 8) {
-				FB_WRITEL(FB_READL(src--), dst--);
-				FB_WRITEL(FB_READL(src--), dst--);
-				FB_WRITEL(FB_READL(src--), dst--);
-				FB_WRITEL(FB_READL(src--), dst--);
-				FB_WRITEL(FB_READL(src--), dst--);
-				FB_WRITEL(FB_READL(src--), dst--);
-				FB_WRITEL(FB_READL(src--), dst--);
-				FB_WRITEL(FB_READL(src--), dst--);
-				n -= 8;
-			}
-			while (n--)
-				FB_WRITEL(FB_READL(src--), dst--);
-
-			// Trailing bits
-			if (last != -1UL)
-				FB_WRITEL( comp( FB_READL(src), FB_READL(dst), last), dst);
-		}
-	} else {
-		// Different alignment for source and dest
-		unsigned long d0, d1;
-		int m;
-
-		int const left = shift & (bits-1);
-		int const right = -shift & (bits-1);
-
-		if ((unsigned long)dst_idx+1 >= n) {
-			// Single destination word
-			if (first)
-				last &= first;
-			d0 = FB_READL(src);
-			if (shift < 0) {
-				// Single source word
-				d0 >>= right;
-			} else if (1+(unsigned long)src_idx >= n) {
-				// Single source word
-				d0 <<= left;
-			} else {
-				// 2 source words
-				d1 = FB_READL(src - 1);
-				d1 = fb_rev_pixels_in_long(d1, bswapmask);
-				d0 = d0 << left | d1 >> right;
-			}
-			d0 = fb_rev_pixels_in_long(d0, bswapmask);
-			FB_WRITEL(comp(d0, FB_READL(dst), last), dst);
-		} else {
-			// Multiple destination words
-			/** We must always remember the last value read, because in case
-			SRC and DST overlap bitwise (e.g. when moving just one pixel in
-			1bpp), we always collect one full long for DST and that might
-			overlap with the current long from SRC. We store this value in
-			'd0'. */
-
-			d0 = FB_READL(src--);
-			d0 = fb_rev_pixels_in_long(d0, bswapmask);
-			// Leading bits
-			if (shift < 0) {
-				// Single source word
-				d1 = d0;
-				d0 >>= right;
-			} else {
-				// 2 source words
-				d1 = FB_READL(src--);
-				d1 = fb_rev_pixels_in_long(d1, bswapmask);
-				d0 = d0 << left | d1 >> right;
-			}
-			d0 = fb_rev_pixels_in_long(d0, bswapmask);
-			if (!first)
-				FB_WRITEL(d0, dst);
-			else
-				FB_WRITEL(comp(d0, FB_READL(dst), first), dst);
-			d0 = d1;
-			dst--;
-			n -= dst_idx+1;
-
-			// Main chunk
-			m = n % bits;
-			n /= bits;
-			while ((n >= 4) && !bswapmask) {
-				d1 = FB_READL(src--);
-				FB_WRITEL(d0 << left | d1 >> right, dst--);
-				d0 = d1;
-				d1 = FB_READL(src--);
-				FB_WRITEL(d0 << left | d1 >> right, dst--);
-				d0 = d1;
-				d1 = FB_READL(src--);
-				FB_WRITEL(d0 << left | d1 >> right, dst--);
-				d0 = d1;
-				d1 = FB_READL(src--);
-				FB_WRITEL(d0 << left | d1 >> right, dst--);
-				d0 = d1;
-				n -= 4;
-			}
-			while (n--) {
-				d1 = FB_READL(src--);
-				d1 = fb_rev_pixels_in_long(d1, bswapmask);
-				d0 = d0 << left | d1 >> right;
-				d0 = fb_rev_pixels_in_long(d0, bswapmask);
-				FB_WRITEL(d0, dst--);
-				d0 = d1;
-			}
-
-			// Trailing bits
-			if (m) {
-				if (m <= bits - left) {
-					// Single source word
-					d0 <<= left;
-				} else {
-					// 2 source words
-					d1 = FB_READL(src);
-					d1 = fb_rev_pixels_in_long(d1,
-								bswapmask);
-					d0 = d0 << left | d1 >> right;
-				}
-				d0 = fb_rev_pixels_in_long(d0, bswapmask);
-				FB_WRITEL(comp(d0, FB_READL(dst), last), dst);
-			}
-		}
-	}
-}
-
-void cfb_copyarea(struct fb_info *p, const struct fb_copyarea *area)
-{
-	u32 dx = area->dx, dy = area->dy, sx = area->sx, sy = area->sy;
-	u32 height = area->height, width = area->width;
-	unsigned int const bits_per_line = p->fix.line_length * 8u;
-	unsigned long __iomem *base = NULL;
-	int bits = BITS_PER_LONG, bytes = bits >> 3;
-	unsigned dst_idx = 0, src_idx = 0, rev_copy = 0;
-	u32 bswapmask = fb_compute_bswapmask(p);
-
-	if (p->state != FBINFO_STATE_RUNNING)
-		return;
-
-	if (p->flags & FBINFO_VIRTFB)
-		fb_warn_once(p, "Framebuffer is not in I/O address space.");
-
-	/* if the beginning of the target area might overlap with the end of
-	the source area, be have to copy the area reverse. */
-	if ((dy == sy && dx > sx) || (dy > sy)) {
-		dy += height;
-		sy += height;
-		rev_copy = 1;
-	}
-
-	// split the base of the framebuffer into a long-aligned address and the
-	// index of the first bit
-	base = (unsigned long __iomem *)((unsigned long)p->screen_base & ~(bytes-1));
-	dst_idx = src_idx = 8*((unsigned long)p->screen_base & (bytes-1));
-	// add offset of source and target area
-	dst_idx += dy*bits_per_line + dx*p->var.bits_per_pixel;
-	src_idx += sy*bits_per_line + sx*p->var.bits_per_pixel;
-
-	if (p->fbops->fb_sync)
-		p->fbops->fb_sync(p);
-
-	if (rev_copy) {
-		while (height--) {
-			dst_idx -= bits_per_line;
-			src_idx -= bits_per_line;
-			bitcpy_rev(p, base + (dst_idx / bits), dst_idx % bits,
-				base + (src_idx / bits), src_idx % bits, bits,
-				width*p->var.bits_per_pixel, bswapmask);
-		}
-	} else {
-		while (height--) {
-			bitcpy(p, base + (dst_idx / bits), dst_idx % bits,
-				base + (src_idx / bits), src_idx % bits, bits,
-				width*p->var.bits_per_pixel, bswapmask);
-			dst_idx += bits_per_line;
-			src_idx += bits_per_line;
-		}
-	}
-}
+#define FB_MEM            __iomem
+#define FB_COPYAREA       cfb_copyarea
+#define FB_SPACE          0
+#define FB_SPACE_NAME     "I/O"
+#define FB_SCREEN_BASE(a) ((a)->screen_base)
+#include "fb_copyarea.h"
 
 EXPORT_SYMBOL(cfb_copyarea);
 
 MODULE_AUTHOR("James Simmons <jsimmons@users.sf.net>");
 MODULE_DESCRIPTION("Generic software accelerated copyarea");
 MODULE_LICENSE("GPL");
-
-- 
2.30.2


* [PATCH RESEND 04/13] fbdev: core: Use generic copyarea for as sys_copyarea
  2025-02-07  4:18 [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops Zsolt Kajtar
                   ` (2 preceding siblings ...)
  2025-02-07  4:18 ` [PATCH RESEND 03/13] fbdev: core: Use generic copyarea for as cfb_copyarea Zsolt Kajtar
@ 2025-02-07  4:18 ` Zsolt Kajtar
  2025-02-07  4:18 ` [PATCH RESEND 05/13] fbdev: core: Copy cfbfillrect to fb_fillrect Zsolt Kajtar
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Zsolt Kajtar @ 2025-02-07  4:18 UTC (permalink / raw)
  To: linux-fbdev, dri-devel; +Cc: Zsolt Kajtar

Signed-off-by: Zsolt Kajtar <soci@c64.rulez.org>
---
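Note for readers (not part of the commit message): a small user-space
sketch of two details the system-memory build depends on. First, the
address-space check in fb_copyarea.h compares the VIRTFB flag against
FB_SPACE, so the same expression warns on opposite conditions in the two
builds. Second, the plain-memory accessors receive expressions such as
"src - 1" from the shared code, so the sketch parenthesizes the macro
argument. DEMO_VIRTFB and the demo_* helpers are hypothetical stand-ins,
not kernel symbols.

```c
#include <stdbool.h>

/* Hypothetical stand-in for FBINFO_VIRTFB; only "a single flag bit" matters. */
#define DEMO_VIRTFB 0x0004u

/*
 * Same shape as the check in fb_copyarea.h:
 *   (p->flags & FBINFO_VIRTFB) != FB_SPACE
 * fb_space is 0 for the I/O build and DEMO_VIRTFB for the system build.
 */
static bool demo_wrong_space(unsigned int flags, unsigned int fb_space)
{
        return (flags & DEMO_VIRTFB) != fb_space;
}

/*
 * Plain-memory accessors. Without the inner parentheses, an argument of
 * "src - 1" would expand to (*src - 1), which subtracts one from the loaded
 * value instead of loading the previous word.
 */
#define DEMO_READL(a)      (*(a))
#define DEMO_WRITEL(v, a)  do { *(a) = (v); } while (false)

/* Read the word just before src, as bitcpy_rev does for "2 source words". */
static unsigned long demo_prev_word(const unsigned long *src)
{
        return DEMO_READL(src - 1);
}
```

With fb_space set to 0 the check fires when the flag is set; with fb_space
set to DEMO_VIRTFB it fires when the flag is clear, which matches the two
warnings the separate cfb and sys files used to carry.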
 drivers/video/fbdev/core/syscopyarea.c | 357 +------------------------
 1 file changed, 8 insertions(+), 349 deletions(-)

diff --git a/drivers/video/fbdev/core/syscopyarea.c b/drivers/video/fbdev/core/syscopyarea.c
index 75e7001e8..124831eed 100644
--- a/drivers/video/fbdev/core/syscopyarea.c
+++ b/drivers/video/fbdev/core/syscopyarea.c
@@ -13,361 +13,20 @@
  *
  */
 #include <linux/module.h>
-#include <linux/kernel.h>
-#include <linux/string.h>
 #include <linux/fb.h>
 #include <asm/types.h>
-#include <asm/io.h>
-#include "fb_draw.h"
 
-    /*
-     *  Generic bitwise copy algorithm
-     */
-
-static void
-bitcpy(struct fb_info *p, unsigned long *dst, unsigned dst_idx,
-	const unsigned long *src, unsigned src_idx, int bits, unsigned n)
-{
-	unsigned long first, last;
-	int const shift = dst_idx-src_idx;
-	int left, right;
-
-	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
-	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
-
-	if (!shift) {
-		/* Same alignment for source and dest */
-		if (dst_idx+n <= bits) {
-			/* Single word */
-			if (last)
-				first &= last;
-			*dst = comp(*src, *dst, first);
-		} else {
-			/* Multiple destination words */
-			/* Leading bits */
- 			if (first != ~0UL) {
-				*dst = comp(*src, *dst, first);
-				dst++;
-				src++;
-				n -= bits - dst_idx;
-			}
-
-			/* Main chunk */
-			n /= bits;
-			while (n >= 8) {
-				*dst++ = *src++;
-				*dst++ = *src++;
-				*dst++ = *src++;
-				*dst++ = *src++;
-				*dst++ = *src++;
-				*dst++ = *src++;
-				*dst++ = *src++;
-				*dst++ = *src++;
-				n -= 8;
-			}
-			while (n--)
-				*dst++ = *src++;
-
-			/* Trailing bits */
-			if (last)
-				*dst = comp(*src, *dst, last);
-		}
-	} else {
-		unsigned long d0, d1;
-		int m;
-
-		/* Different alignment for source and dest */
-		right = shift & (bits - 1);
-		left = -shift & (bits - 1);
-
-		if (dst_idx+n <= bits) {
-			/* Single destination word */
-			if (last)
-				first &= last;
-			if (shift > 0) {
-				/* Single source word */
-				*dst = comp(*src << left, *dst, first);
-			} else if (src_idx+n <= bits) {
-				/* Single source word */
-				*dst = comp(*src >> right, *dst, first);
-			} else {
-				/* 2 source words */
-				d0 = *src++;
-				d1 = *src;
-				*dst = comp(d0 >> right | d1 << left, *dst,
-					    first);
-			}
-		} else {
-			/* Multiple destination words */
-			/** We must always remember the last value read,
-			    because in case SRC and DST overlap bitwise (e.g.
-			    when moving just one pixel in 1bpp), we always
-			    collect one full long for DST and that might
-			    overlap with the current long from SRC. We store
-			    this value in 'd0'. */
-			d0 = *src++;
-			/* Leading bits */
-			if (shift > 0) {
-				/* Single source word */
-				*dst = comp(d0 << left, *dst, first);
-				dst++;
-				n -= bits - dst_idx;
-			} else {
-				/* 2 source words */
-				d1 = *src++;
-				*dst = comp(d0 >> right | d1 << left, *dst,
-					    first);
-				d0 = d1;
-				dst++;
-				n -= bits - dst_idx;
-			}
-
-			/* Main chunk */
-			m = n % bits;
-			n /= bits;
-			while (n >= 4) {
-				d1 = *src++;
-				*dst++ = d0 >> right | d1 << left;
-				d0 = d1;
-				d1 = *src++;
-				*dst++ = d0 >> right | d1 << left;
-				d0 = d1;
-				d1 = *src++;
-				*dst++ = d0 >> right | d1 << left;
-				d0 = d1;
-				d1 = *src++;
-				*dst++ = d0 >> right | d1 << left;
-				d0 = d1;
-				n -= 4;
-			}
-			while (n--) {
-				d1 = *src++;
-				*dst++ = d0 >> right | d1 << left;
-				d0 = d1;
-			}
-
-			/* Trailing bits */
-			if (m) {
-				if (m <= bits - right) {
-					/* Single source word */
-					d0 >>= right;
-				} else {
-					/* 2 source words */
- 					d1 = *src;
-					d0 = d0 >> right | d1 << left;
-				}
-				*dst = comp(d0, *dst, last);
-			}
-		}
-	}
-}
-
-    /*
-     *  Generic bitwise copy algorithm, operating backward
-     */
-
-static void
-bitcpy_rev(struct fb_info *p, unsigned long *dst, unsigned dst_idx,
-	   const unsigned long *src, unsigned src_idx, unsigned bits,
-	   unsigned n)
-{
-	unsigned long first, last;
-	int shift;
-
-	dst += (dst_idx + n - 1) / bits;
-	src += (src_idx + n - 1) / bits;
-	dst_idx = (dst_idx + n - 1) % bits;
-	src_idx = (src_idx + n - 1) % bits;
-
-	shift = dst_idx-src_idx;
-
-	first = ~FB_SHIFT_HIGH(p, ~0UL, (dst_idx + 1) % bits);
-	last = FB_SHIFT_HIGH(p, ~0UL, (bits + dst_idx + 1 - n) % bits);
-
-	if (!shift) {
-		/* Same alignment for source and dest */
-		if ((unsigned long)dst_idx+1 >= n) {
-			/* Single word */
-			if (first)
-				last &= first;
-			*dst = comp(*src, *dst, last);
-		} else {
-			/* Multiple destination words */
-
-			/* Leading bits */
-			if (first) {
-				*dst = comp(*src, *dst, first);
-				dst--;
-				src--;
-				n -= dst_idx+1;
-			}
-
-			/* Main chunk */
-			n /= bits;
-			while (n >= 8) {
-				*dst-- = *src--;
-				*dst-- = *src--;
-				*dst-- = *src--;
-				*dst-- = *src--;
-				*dst-- = *src--;
-				*dst-- = *src--;
-				*dst-- = *src--;
-				*dst-- = *src--;
-				n -= 8;
-			}
-			while (n--)
-				*dst-- = *src--;
-			/* Trailing bits */
-			if (last != -1UL)
-				*dst = comp(*src, *dst, last);
-		}
-	} else {
-		/* Different alignment for source and dest */
-
-		int const left = shift & (bits-1);
-		int const right = -shift & (bits-1);
-
-		if ((unsigned long)dst_idx+1 >= n) {
-			/* Single destination word */
-			if (first)
-				last &= first;
-			if (shift < 0) {
-				/* Single source word */
-				*dst = comp(*src >> right, *dst, last);
-			} else if (1+(unsigned long)src_idx >= n) {
-				/* Single source word */
-				*dst = comp(*src << left, *dst, last);
-			} else {
-				/* 2 source words */
-				*dst = comp(*src << left | *(src-1) >> right,
-					    *dst, last);
-			}
-		} else {
-			/* Multiple destination words */
-			/** We must always remember the last value read,
-			    because in case SRC and DST overlap bitwise (e.g.
-			    when moving just one pixel in 1bpp), we always
-			    collect one full long for DST and that might
-			    overlap with the current long from SRC. We store
-			    this value in 'd0'. */
-			unsigned long d0, d1;
-			int m;
-
-			d0 = *src--;
-			/* Leading bits */
-			if (shift < 0) {
-				/* Single source word */
-				d1 = d0;
-				d0 >>= right;
-			} else {
-				/* 2 source words */
-				d1 = *src--;
-				d0 = d0 << left | d1 >> right;
-			}
-			if (!first)
-				*dst = d0;
-			else
-				*dst = comp(d0, *dst, first);
-			d0 = d1;
-			dst--;
-			n -= dst_idx+1;
-
-			/* Main chunk */
-			m = n % bits;
-			n /= bits;
-			while (n >= 4) {
-				d1 = *src--;
-				*dst-- = d0 << left | d1 >> right;
-				d0 = d1;
-				d1 = *src--;
-				*dst-- = d0 << left | d1 >> right;
-				d0 = d1;
-				d1 = *src--;
-				*dst-- = d0 << left | d1 >> right;
-				d0 = d1;
-				d1 = *src--;
-				*dst-- = d0 << left | d1 >> right;
-				d0 = d1;
-				n -= 4;
-			}
-			while (n--) {
-				d1 = *src--;
-				*dst-- = d0 << left | d1 >> right;
-				d0 = d1;
-			}
-
-			/* Trailing bits */
-			if (m) {
-				if (m <= bits - left) {
-					/* Single source word */
-					d0 <<= left;
-				} else {
-					/* 2 source words */
-					d1 = *src;
-					d0 = d0 << left | d1 >> right;
-				}
-				*dst = comp(d0, *dst, last);
-			}
-		}
-	}
-}
-
-void sys_copyarea(struct fb_info *p, const struct fb_copyarea *area)
-{
-	u32 dx = area->dx, dy = area->dy, sx = area->sx, sy = area->sy;
-	u32 height = area->height, width = area->width;
-	unsigned int const bits_per_line = p->fix.line_length * 8u;
-	unsigned long *base = NULL;
-	int bits = BITS_PER_LONG, bytes = bits >> 3;
-	unsigned dst_idx = 0, src_idx = 0, rev_copy = 0;
-
-	if (p->state != FBINFO_STATE_RUNNING)
-		return;
-
-	if (!(p->flags & FBINFO_VIRTFB))
-		fb_warn_once(p, "Framebuffer is not in virtual address space.");
-
-	/* if the beginning of the target area might overlap with the end of
-	the source area, be have to copy the area reverse. */
-	if ((dy == sy && dx > sx) || (dy > sy)) {
-		dy += height;
-		sy += height;
-		rev_copy = 1;
-	}
-
-	/* split the base of the framebuffer into a long-aligned address and
-	   the index of the first bit */
-	base = (unsigned long *)((unsigned long)p->screen_base & ~(bytes-1));
-	dst_idx = src_idx = 8*((unsigned long)p->screen_base & (bytes-1));
-	/* add offset of source and target area */
-	dst_idx += dy*bits_per_line + dx*p->var.bits_per_pixel;
-	src_idx += sy*bits_per_line + sx*p->var.bits_per_pixel;
-
-	if (p->fbops->fb_sync)
-		p->fbops->fb_sync(p);
-
-	if (rev_copy) {
-		while (height--) {
-			dst_idx -= bits_per_line;
-			src_idx -= bits_per_line;
-			bitcpy_rev(p, base + (dst_idx / bits), dst_idx % bits,
-				base + (src_idx / bits), src_idx % bits, bits,
-				width*p->var.bits_per_pixel);
-		}
-	} else {
-		while (height--) {
-			bitcpy(p, base + (dst_idx / bits), dst_idx % bits,
-				base + (src_idx / bits), src_idx % bits, bits,
-				width*p->var.bits_per_pixel);
-			dst_idx += bits_per_line;
-			src_idx += bits_per_line;
-		}
-	}
-}
+#define FB_READL(a)       (*(a))
+#define FB_WRITEL(a,b)    do { *(b) = (a); } while (false)
+#define FB_MEM            /* nothing */
+#define FB_COPYAREA       sys_copyarea
+#define FB_SPACE          FBINFO_VIRTFB
+#define FB_SPACE_NAME     "virtual"
+#define FB_SCREEN_BASE(a) ((a)->screen_buffer)
+#include "fb_copyarea.h"
 
 EXPORT_SYMBOL(sys_copyarea);
 
 MODULE_AUTHOR("Antonino Daplas <adaplas@pol.net>");
 MODULE_DESCRIPTION("Generic copyarea (sys-to-sys)");
 MODULE_LICENSE("GPL");
-
-- 
2.30.2
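
A note on the pattern used in the hunk above: the .c file only sets a few
macros and then includes the shared header, which supplies the function
body. Below is a minimal userspace sketch of that idea, not the kernel code.
The names demo_info, DEMO_VIRTFB, demo_space_ok and FB_COPYWORDS are
hypothetical stand-ins invented for the example, and the "header" part is
written inline so the block stays self-contained. It only copies whole words
forward; the real routine handles bit offsets and overlap.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the kernel definitions (not the real ones). */
#define DEMO_VIRTFB 0x0004
struct demo_info {
	int flags;
	unsigned long *screen_buffer;
};

/* Per-variant settings, modeled on the sys_copyarea hunk above. */
#define FB_READL(a)       (*(a))	/* parenthesized so "base + src" works */
#define FB_WRITEL(a, b)   do { *(b) = (a); } while (0)
#define FB_MEM            /* nothing: plain pointer, no address-space tag */
#define FB_COPYWORDS      demo_sys_copywords
#define FB_SPACE          DEMO_VIRTFB
#define FB_SPACE_NAME     "virtual"
#define FB_SCREEN_BASE(a) ((a)->screen_buffer)

/* From here on: what the shared header would contain. */

/* Returns 1 if the buffer lives in the space this variant expects. */
static int demo_space_ok(const struct demo_info *p)
{
	return (p->flags & DEMO_VIRTFB) == FB_SPACE;
}

/* Copy n words from word offset src to word offset dst (forward only). */
static void FB_COPYWORDS(struct demo_info *p, unsigned dst, unsigned src,
			 unsigned n)
{
	unsigned long FB_MEM *base = FB_SCREEN_BASE(p);

	if (!demo_space_ok(p))
		fprintf(stderr, "Framebuffer is not in " FB_SPACE_NAME
			" address space.\n");

	while (n--) {
		FB_WRITEL(FB_READL(base + src), base + dst);
		src++;
		dst++;
	}
}
```

An I/O variant would presumably define the accessors in terms of the
fb_readl/fb_writel family and pick a different FB_SPACE value; that part is
not shown here, so treat it as an assumption.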



* [PATCH RESEND 05/13] fbdev: core: Copy cfbfillrect to fb_fillrect
  2025-02-07  4:18 [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops Zsolt Kajtar
                   ` (3 preceding siblings ...)
  2025-02-07  4:18 ` [PATCH RESEND 04/13] fbdev: core: Use generic copyarea for as sys_copyarea Zsolt Kajtar
@ 2025-02-07  4:18 ` Zsolt Kajtar
  2025-02-07  4:18 ` [PATCH RESEND 06/13] fbdev: core: Make fb_fillrect generic Zsolt Kajtar
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Zsolt Kajtar @ 2025-02-07  4:18 UTC (permalink / raw)
  To: linux-fbdev, dri-devel; +Cc: Zsolt Kajtar

Signed-off-by: Zsolt Kajtar <soci@c64.rulez.org>
---
 drivers/video/fbdev/core/fb_fillrect.h | 374 +++++++++++++++++++++++++
 1 file changed, 374 insertions(+)
 create mode 100644 drivers/video/fbdev/core/fb_fillrect.h

diff --git a/drivers/video/fbdev/core/fb_fillrect.h b/drivers/video/fbdev/core/fb_fillrect.h
new file mode 100644
index 000000000..a3bef06ce
--- /dev/null
+++ b/drivers/video/fbdev/core/fb_fillrect.h
@@ -0,0 +1,374 @@
+/*
+ *  Generic fillrect for frame buffers with packed pixels of any depth.
+ *
+ *      Copyright (C)  2000 James Simmons (jsimmons@linux-fbdev.org)
+ *
+ *  This file is subject to the terms and conditions of the GNU General Public
+ *  License.  See the file COPYING in the main directory of this archive for
+ *  more details.
+ *
+ * NOTES:
+ *
+ *  Also need to add code to deal with cards endians that are different than
+ *  the native cpu endians. I also need to deal with MSB position in the word.
+ *
+ */
+#include <linux/module.h>
+#include <linux/string.h>
+#include <linux/fb.h>
+#include <asm/types.h>
+#include "fb_draw.h"
+
+#if BITS_PER_LONG == 32
+#  define FB_WRITEL fb_writel
+#  define FB_READL  fb_readl
+#else
+#  define FB_WRITEL fb_writeq
+#  define FB_READL  fb_readq
+#endif
+
+    /*
+     *  Aligned pattern fill using 32/64-bit memory accesses
+     */
+
+static void
+bitfill_aligned(struct fb_info *p, unsigned long __iomem *dst, int dst_idx,
+                unsigned long pat, unsigned n, int bits, u32 bswapmask)
+{
+        unsigned long first, last;
+
+        if (!n)
+                return;
+
+        first = fb_shifted_pixels_mask_long(p, dst_idx, bswapmask);
+        last = ~fb_shifted_pixels_mask_long(p, (dst_idx+n) % bits, bswapmask);
+
+        if (dst_idx+n <= bits) {
+                // Single word
+                if (last)
+                        first &= last;
+                FB_WRITEL(comp(pat, FB_READL(dst), first), dst);
+        } else {
+                // Multiple destination words
+
+                // Leading bits
+                if (first!= ~0UL) {
+                        FB_WRITEL(comp(pat, FB_READL(dst), first), dst);
+                        dst++;
+                        n -= bits - dst_idx;
+                }
+
+                // Main chunk
+                n /= bits;
+                while (n >= 8) {
+                        FB_WRITEL(pat, dst++);
+                        FB_WRITEL(pat, dst++);
+                        FB_WRITEL(pat, dst++);
+                        FB_WRITEL(pat, dst++);
+                        FB_WRITEL(pat, dst++);
+                        FB_WRITEL(pat, dst++);
+                        FB_WRITEL(pat, dst++);
+                        FB_WRITEL(pat, dst++);
+                        n -= 8;
+                }
+                while (n--)
+                        FB_WRITEL(pat, dst++);
+
+                // Trailing bits
+                if (last)
+                        FB_WRITEL(comp(pat, FB_READL(dst), last), dst);
+        }
+}
+
+
+    /*
+     *  Unaligned generic pattern fill using 32/64-bit memory accesses
+     *  The pattern must have been expanded to a full 32/64-bit value
+     *  Left/right are the appropriate shifts to convert to the pattern to be
+     *  used for the next 32/64-bit word
+     */
+
+static void
+bitfill_unaligned(struct fb_info *p, unsigned long __iomem *dst, int dst_idx,
+                  unsigned long pat, int left, int right, unsigned n, int bits)
+{
+        unsigned long first, last;
+
+        if (!n)
+                return;
+
+        first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
+        last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
+
+        if (dst_idx+n <= bits) {
+                // Single word
+                if (last)
+                        first &= last;
+                FB_WRITEL(comp(pat, FB_READL(dst), first), dst);
+        } else {
+                // Multiple destination words
+                // Leading bits
+                if (first) {
+                        FB_WRITEL(comp(pat, FB_READL(dst), first), dst);
+                        dst++;
+                        pat = pat << left | pat >> right;
+                        n -= bits - dst_idx;
+                }
+
+                // Main chunk
+                n /= bits;
+                while (n >= 4) {
+                        FB_WRITEL(pat, dst++);
+                        pat = pat << left | pat >> right;
+                        FB_WRITEL(pat, dst++);
+                        pat = pat << left | pat >> right;
+                        FB_WRITEL(pat, dst++);
+                        pat = pat << left | pat >> right;
+                        FB_WRITEL(pat, dst++);
+                        pat = pat << left | pat >> right;
+                        n -= 4;
+                }
+                while (n--) {
+                        FB_WRITEL(pat, dst++);
+                        pat = pat << left | pat >> right;
+                }
+
+                // Trailing bits
+                if (last)
+                        FB_WRITEL(comp(pat, FB_READL(dst), last), dst);
+        }
+}
+
+    /*
+     *  Aligned pattern invert using 32/64-bit memory accesses
+     */
+static void
+bitfill_aligned_rev(struct fb_info *p, unsigned long __iomem *dst,
+                    int dst_idx, unsigned long pat, unsigned n, int bits,
+                    u32 bswapmask)
+{
+        unsigned long val = pat, dat;
+        unsigned long first, last;
+
+        if (!n)
+                return;
+
+        first = fb_shifted_pixels_mask_long(p, dst_idx, bswapmask);
+        last = ~fb_shifted_pixels_mask_long(p, (dst_idx+n) % bits, bswapmask);
+
+        if (dst_idx+n <= bits) {
+                // Single word
+                if (last)
+                        first &= last;
+                dat = FB_READL(dst);
+                FB_WRITEL(comp(dat ^ val, dat, first), dst);
+        } else {
+                // Multiple destination words
+                // Leading bits
+                if (first!=0UL) {
+                        dat = FB_READL(dst);
+                        FB_WRITEL(comp(dat ^ val, dat, first), dst);
+                        dst++;
+                        n -= bits - dst_idx;
+                }
+
+                // Main chunk
+                n /= bits;
+                while (n >= 8) {
+                        FB_WRITEL(FB_READL(dst) ^ val, dst);
+                        dst++;
+                        FB_WRITEL(FB_READL(dst) ^ val, dst);
+                        dst++;
+                        FB_WRITEL(FB_READL(dst) ^ val, dst);
+                        dst++;
+                        FB_WRITEL(FB_READL(dst) ^ val, dst);
+                        dst++;
+                        FB_WRITEL(FB_READL(dst) ^ val, dst);
+                        dst++;
+                        FB_WRITEL(FB_READL(dst) ^ val, dst);
+                        dst++;
+                        FB_WRITEL(FB_READL(dst) ^ val, dst);
+                        dst++;
+                        FB_WRITEL(FB_READL(dst) ^ val, dst);
+                        dst++;
+                        n -= 8;
+                }
+                while (n--) {
+                        FB_WRITEL(FB_READL(dst) ^ val, dst);
+                        dst++;
+                }
+                // Trailing bits
+                if (last) {
+                        dat = FB_READL(dst);
+                        FB_WRITEL(comp(dat ^ val, dat, last), dst);
+                }
+        }
+}
+
+
+    /*
+     *  Unaligned generic pattern invert using 32/64-bit memory accesses
+     *  The pattern must have been expanded to a full 32/64-bit value
+     *  Left/right are the appropriate shifts to convert to the pattern to be
+     *  used for the next 32/64-bit word
+     */
+
+static void
+bitfill_unaligned_rev(struct fb_info *p, unsigned long __iomem *dst,
+                      int dst_idx, unsigned long pat, int left, int right,
+                      unsigned n, int bits)
+{
+        unsigned long first, last, dat;
+
+        if (!n)
+                return;
+
+        first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
+        last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
+
+        if (dst_idx+n <= bits) {
+                // Single word
+                if (last)
+                        first &= last;
+                dat = FB_READL(dst);
+                FB_WRITEL(comp(dat ^ pat, dat, first), dst);
+        } else {
+                // Multiple destination words
+
+                // Leading bits
+                if (first != 0UL) {
+                        dat = FB_READL(dst);
+                        FB_WRITEL(comp(dat ^ pat, dat, first), dst);
+                        dst++;
+                        pat = pat << left | pat >> right;
+                        n -= bits - dst_idx;
+                }
+
+                // Main chunk
+                n /= bits;
+                while (n >= 4) {
+                        FB_WRITEL(FB_READL(dst) ^ pat, dst);
+                        dst++;
+                        pat = pat << left | pat >> right;
+                        FB_WRITEL(FB_READL(dst) ^ pat, dst);
+                        dst++;
+                        pat = pat << left | pat >> right;
+                        FB_WRITEL(FB_READL(dst) ^ pat, dst);
+                        dst++;
+                        pat = pat << left | pat >> right;
+                        FB_WRITEL(FB_READL(dst) ^ pat, dst);
+                        dst++;
+                        pat = pat << left | pat >> right;
+                        n -= 4;
+                }
+                while (n--) {
+                        FB_WRITEL(FB_READL(dst) ^ pat, dst);
+                        dst++;
+                        pat = pat << left | pat >> right;
+                }
+
+                // Trailing bits
+                if (last) {
+                        dat = FB_READL(dst);
+                        FB_WRITEL(comp(dat ^ pat, dat, last), dst);
+                }
+        }
+}
+
+void cfb_fillrect(struct fb_info *p, const struct fb_fillrect *rect)
+{
+        unsigned long pat, pat2, fg;
+        unsigned long width = rect->width, height = rect->height;
+        int bits = BITS_PER_LONG, bytes = bits >> 3;
+        u32 bpp = p->var.bits_per_pixel;
+        unsigned long __iomem *dst;
+        int dst_idx, left;
+
+        if (p->state != FBINFO_STATE_RUNNING)
+                return;
+
+        if (p->flags & FBINFO_VIRTFB)
+                fb_warn_once(p, "Framebuffer is not in I/O address space.");
+
+        if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
+            p->fix.visual == FB_VISUAL_DIRECTCOLOR )
+                fg = ((u32 *) (p->pseudo_palette))[rect->color];
+        else
+                fg = rect->color;
+
+        pat = pixel_to_pat(bpp, fg);
+
+        dst = (unsigned long __iomem *)((unsigned long)p->screen_base & ~(bytes-1));
+        dst_idx = ((unsigned long)p->screen_base & (bytes - 1))*8;
+        dst_idx += rect->dy*p->fix.line_length*8+rect->dx*bpp;
+        /* FIXME For now we support 1-32 bpp only */
+        left = bits % bpp;
+        if (p->fbops->fb_sync)
+                p->fbops->fb_sync(p);
+        if (!left) {
+                u32 bswapmask = fb_compute_bswapmask(p);
+                void (*fill_op32)(struct fb_info *p,
+                                  unsigned long __iomem *dst, int dst_idx,
+                                  unsigned long pat, unsigned n, int bits,
+                                  u32 bswapmask) = NULL;
+
+                switch (rect->rop) {
+                case ROP_XOR:
+                        fill_op32 = bitfill_aligned_rev;
+                        break;
+                case ROP_COPY:
+                        fill_op32 = bitfill_aligned;
+                        break;
+                default:
+                        printk( KERN_ERR "cfb_fillrect(): unknown rop, defaulting to ROP_COPY\n");
+                        fill_op32 = bitfill_aligned;
+                        break;
+                }
+                while (height--) {
+                        dst += dst_idx >> (ffs(bits) - 1);
+                        dst_idx &= (bits - 1);
+                        fill_op32(p, dst, dst_idx, pat, width*bpp, bits,
+                                  bswapmask);
+                        dst_idx += p->fix.line_length*8;
+                }
+        } else {
+                int right, r;
+                void (*fill_op)(struct fb_info *p, unsigned long __iomem *dst,
+                                int dst_idx, unsigned long pat, int left,
+                                int right, unsigned n, int bits) = NULL;
+#ifdef __LITTLE_ENDIAN
+                right = left;
+                left = bpp - right;
+#else
+                right = bpp - left;
+#endif
+                switch (rect->rop) {
+                case ROP_XOR:
+                        fill_op = bitfill_unaligned_rev;
+                        break;
+                case ROP_COPY:
+                        fill_op = bitfill_unaligned;
+                        break;
+                default:
+                        printk(KERN_ERR "cfb_fillrect(): unknown rop, defaulting to ROP_COPY\n");
+                        fill_op = bitfill_unaligned;
+                        break;
+                }
+                while (height--) {
+                        dst += dst_idx / bits;
+                        dst_idx &= (bits - 1);
+                        r = dst_idx % bpp;
+                        /* rotate pattern to the correct start position */
+                        pat2 = le_long_to_cpu(rolx(cpu_to_le_long(pat), r, bpp));
+                        fill_op(p, dst, dst_idx, pat2, left, right,
+                                width*bpp, bits);
+                        dst_idx += p->fix.line_length*8;
+                }
+        }
+}
+
+EXPORT_SYMBOL(cfb_fillrect);
+
+MODULE_AUTHOR("James Simmons <jsimmons@users.sf.net>");
+MODULE_DESCRIPTION("Generic software accelerated fill rectangle");
+MODULE_LICENSE("GPL");
-- 
2.30.2
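
For readers following the aligned fill path in the new header, here is a
small userspace model of the first/last mask handling. It is a sketch under
assumptions, not the kernel routine: demo_fill is a hypothetical name, bits
are counted from the least significant end (the little-endian framebuffer
case), there is no byte swapping, and comp() is written the way I understand
the fb_draw.h helper to behave (take bits from a where the mask is set, from
b elsewhere).

```c
#include <assert.h>
#include <limits.h>

#define BITS ((unsigned)(sizeof(unsigned long) * CHAR_BIT))

/* Merge: bits of a where mask is 1, bits of b where mask is 0. */
static unsigned long comp(unsigned long a, unsigned long b,
			  unsigned long mask)
{
	return ((a ^ b) & mask) ^ b;
}

/*
 * Hypothetical model of the aligned fill: write n bits of pat starting at
 * bit dst_idx of dst[0], leaving all other bits untouched.
 */
static void demo_fill(unsigned long *dst, unsigned dst_idx,
		      unsigned long pat, unsigned n)
{
	unsigned long first, last;

	assert(dst_idx < BITS);
	if (!n)
		return;

	/* first: bits at or above the start; last: bits below the end */
	first = ~0UL << dst_idx;
	last = ~(~0UL << ((dst_idx + n) % BITS));

	if (dst_idx + n <= BITS) {
		/* Single word; last == 0 means the run ends on a word edge */
		if (last)
			first &= last;
		*dst = comp(pat, *dst, first);
	} else {
		/* Leading partial word */
		if (first != ~0UL) {
			*dst = comp(pat, *dst, first);
			dst++;
			n -= BITS - dst_idx;
		}
		/* Whole words */
		n /= BITS;
		while (n--)
			*dst++ = pat;
		/* Trailing partial word */
		if (last)
			*dst = comp(pat, *dst, last);
	}
}
```

Filling 8 bits at offset 4 of a zeroed word with an all-ones pattern sets
bits 4..11 only; a fill that crosses a word edge touches the tail of the
first word and the head of the next.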



* [PATCH RESEND 06/13] fbdev: core: Make fb_fillrect generic
  2025-02-07  4:18 [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops Zsolt Kajtar
                   ` (4 preceding siblings ...)
  2025-02-07  4:18 ` [PATCH RESEND 05/13] fbdev: core: Copy cfbfillrect to fb_fillrect Zsolt Kajtar
@ 2025-02-07  4:18 ` Zsolt Kajtar
  2025-02-07  4:18 ` [PATCH RESEND 07/13] fbdev: core: Use generic fillrect for as cfb_fillrect Zsolt Kajtar
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Zsolt Kajtar @ 2025-02-07  4:18 UTC (permalink / raw)
  To: linux-fbdev, dri-devel; +Cc: Zsolt Kajtar

Signed-off-by: Zsolt Kajtar <soci@c64.rulez.org>
---
 drivers/video/fbdev/core/fb_fillrect.h | 89 +++++++++++---------------
 1 file changed, 37 insertions(+), 52 deletions(-)

diff --git a/drivers/video/fbdev/core/fb_fillrect.h b/drivers/video/fbdev/core/fb_fillrect.h
index a3bef06ce..5f1123533 100644
--- a/drivers/video/fbdev/core/fb_fillrect.h
+++ b/drivers/video/fbdev/core/fb_fillrect.h
@@ -13,26 +13,14 @@
  *  the native cpu endians. I also need to deal with MSB position in the word.
  *
  */
-#include <linux/module.h>
-#include <linux/string.h>
-#include <linux/fb.h>
-#include <asm/types.h>
 #include "fb_draw.h"
 
-#if BITS_PER_LONG == 32
-#  define FB_WRITEL fb_writel
-#  define FB_READL  fb_readl
-#else
-#  define FB_WRITEL fb_writeq
-#  define FB_READL  fb_readq
-#endif
-
     /*
      *  Aligned pattern fill using 32/64-bit memory accesses
      */
 
 static void
-bitfill_aligned(struct fb_info *p, unsigned long __iomem *dst, int dst_idx,
+bitfill_aligned(struct fb_info *p, unsigned long FB_MEM *dst, int dst_idx,
                 unsigned long pat, unsigned n, int bits, u32 bswapmask)
 {
         unsigned long first, last;
@@ -44,21 +32,21 @@ bitfill_aligned(struct fb_info *p, unsigned long __iomem *dst, int dst_idx,
         last = ~fb_shifted_pixels_mask_long(p, (dst_idx+n) % bits, bswapmask);
 
         if (dst_idx+n <= bits) {
-                // Single word
+                /* Single word */
                 if (last)
                         first &= last;
                 FB_WRITEL(comp(pat, FB_READL(dst), first), dst);
         } else {
-                // Multiple destination words
+                /* Multiple destination words */
 
-                // Leading bits
+                /* Leading bits */
                 if (first!= ~0UL) {
                         FB_WRITEL(comp(pat, FB_READL(dst), first), dst);
                         dst++;
                         n -= bits - dst_idx;
                 }
 
-                // Main chunk
+                /* Main chunk */
                 n /= bits;
                 while (n >= 8) {
                         FB_WRITEL(pat, dst++);
@@ -74,7 +62,7 @@ bitfill_aligned(struct fb_info *p, unsigned long __iomem *dst, int dst_idx,
                 while (n--)
                         FB_WRITEL(pat, dst++);
 
-                // Trailing bits
+                /* Trailing bits */
                 if (last)
                         FB_WRITEL(comp(pat, FB_READL(dst), last), dst);
         }
@@ -89,7 +77,7 @@ bitfill_aligned(struct fb_info *p, unsigned long __iomem *dst, int dst_idx,
      */
 
 static void
-bitfill_unaligned(struct fb_info *p, unsigned long __iomem *dst, int dst_idx,
+bitfill_unaligned(struct fb_info *p, unsigned long FB_MEM *dst, int dst_idx,
                   unsigned long pat, int left, int right, unsigned n, int bits)
 {
         unsigned long first, last;
@@ -101,13 +89,13 @@ bitfill_unaligned(struct fb_info *p, unsigned long __iomem *dst, int dst_idx,
         last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
 
         if (dst_idx+n <= bits) {
-                // Single word
+                /* Single word */
                 if (last)
                         first &= last;
                 FB_WRITEL(comp(pat, FB_READL(dst), first), dst);
         } else {
-                // Multiple destination words
-                // Leading bits
+                /* Multiple destination words */
+                /* Leading bits */
                 if (first) {
                         FB_WRITEL(comp(pat, FB_READL(dst), first), dst);
                         dst++;
@@ -115,7 +103,7 @@ bitfill_unaligned(struct fb_info *p, unsigned long __iomem *dst, int dst_idx,
                         n -= bits - dst_idx;
                 }
 
-                // Main chunk
+                /* Main chunk */
                 n /= bits;
                 while (n >= 4) {
                         FB_WRITEL(pat, dst++);
@@ -133,7 +121,7 @@ bitfill_unaligned(struct fb_info *p, unsigned long __iomem *dst, int dst_idx,
                         pat = pat << left | pat >> right;
                 }
 
-                // Trailing bits
+                /* Trailing bits */
                 if (last)
                         FB_WRITEL(comp(pat, FB_READL(dst), last), dst);
         }
@@ -143,7 +131,7 @@ bitfill_unaligned(struct fb_info *p, unsigned long __iomem *dst, int dst_idx,
      *  Aligned pattern invert using 32/64-bit memory accesses
      */
 static void
-bitfill_aligned_rev(struct fb_info *p, unsigned long __iomem *dst,
+bitfill_aligned_rev(struct fb_info *p, unsigned long FB_MEM *dst,
                     int dst_idx, unsigned long pat, unsigned n, int bits,
                     u32 bswapmask)
 {
@@ -157,14 +145,14 @@ bitfill_aligned_rev(struct fb_info *p, unsigned long __iomem *dst,
         last = ~fb_shifted_pixels_mask_long(p, (dst_idx+n) % bits, bswapmask);
 
         if (dst_idx+n <= bits) {
-                // Single word
+                /* Single word */
                 if (last)
                         first &= last;
                 dat = FB_READL(dst);
                 FB_WRITEL(comp(dat ^ val, dat, first), dst);
         } else {
-                // Multiple destination words
-                // Leading bits
+                /* Multiple destination words */
+                /* Leading bits */
                 if (first!=0UL) {
                         dat = FB_READL(dst);
                         FB_WRITEL(comp(dat ^ val, dat, first), dst);
@@ -172,7 +160,7 @@ bitfill_aligned_rev(struct fb_info *p, unsigned long __iomem *dst,
                         n -= bits - dst_idx;
                 }
 
-                // Main chunk
+                /* Main chunk */
                 n /= bits;
                 while (n >= 8) {
                         FB_WRITEL(FB_READL(dst) ^ val, dst);
@@ -197,7 +185,7 @@ bitfill_aligned_rev(struct fb_info *p, unsigned long __iomem *dst,
                         FB_WRITEL(FB_READL(dst) ^ val, dst);
                         dst++;
                 }
-                // Trailing bits
+                /* Trailing bits */
                 if (last) {
                         dat = FB_READL(dst);
                         FB_WRITEL(comp(dat ^ val, dat, last), dst);
@@ -214,7 +202,7 @@ bitfill_aligned_rev(struct fb_info *p, unsigned long __iomem *dst,
      */
 
 static void
-bitfill_unaligned_rev(struct fb_info *p, unsigned long __iomem *dst,
+bitfill_unaligned_rev(struct fb_info *p, unsigned long FB_MEM *dst,
                       int dst_idx, unsigned long pat, int left, int right,
                       unsigned n, int bits)
 {
@@ -227,15 +215,15 @@ bitfill_unaligned_rev(struct fb_info *p, unsigned long __iomem *dst,
         last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
 
         if (dst_idx+n <= bits) {
-                // Single word
+                /* Single word */
                 if (last)
                         first &= last;
                 dat = FB_READL(dst);
                 FB_WRITEL(comp(dat ^ pat, dat, first), dst);
         } else {
-                // Multiple destination words
+                /* Multiple destination words */
 
-                // Leading bits
+                /* Leading bits */
                 if (first != 0UL) {
                         dat = FB_READL(dst);
                         FB_WRITEL(comp(dat ^ pat, dat, first), dst);
@@ -244,7 +232,7 @@ bitfill_unaligned_rev(struct fb_info *p, unsigned long __iomem *dst,
                         n -= bits - dst_idx;
                 }
 
-                // Main chunk
+                /* Main chunk */
                 n /= bits;
                 while (n >= 4) {
                         FB_WRITEL(FB_READL(dst) ^ pat, dst);
@@ -267,7 +255,7 @@ bitfill_unaligned_rev(struct fb_info *p, unsigned long __iomem *dst,
                         pat = pat << left | pat >> right;
                 }
 
-                // Trailing bits
+                /* Trailing bits */
                 if (last) {
                         dat = FB_READL(dst);
                         FB_WRITEL(comp(dat ^ pat, dat, last), dst);
@@ -275,20 +263,21 @@ bitfill_unaligned_rev(struct fb_info *p, unsigned long __iomem *dst,
         }
 }
 
-void cfb_fillrect(struct fb_info *p, const struct fb_fillrect *rect)
+void FB_FILLRECT(struct fb_info *p, const struct fb_fillrect *rect)
 {
         unsigned long pat, pat2, fg;
         unsigned long width = rect->width, height = rect->height;
         int bits = BITS_PER_LONG, bytes = bits >> 3;
         u32 bpp = p->var.bits_per_pixel;
-        unsigned long __iomem *dst;
+        unsigned long FB_MEM *dst;
         int dst_idx, left;
 
         if (p->state != FBINFO_STATE_RUNNING)
                 return;
 
-        if (p->flags & FBINFO_VIRTFB)
-                fb_warn_once(p, "Framebuffer is not in I/O address space.");
+        if ((p->flags & FBINFO_VIRTFB) != FB_SPACE)
+                fb_warn_once(p, "Framebuffer is not in " FB_SPACE_NAME
+                             " address space.");
 
         if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
             p->fix.visual == FB_VISUAL_DIRECTCOLOR )
@@ -298,8 +287,8 @@ void cfb_fillrect(struct fb_info *p, const struct fb_fillrect *rect)
 
         pat = pixel_to_pat(bpp, fg);
 
-        dst = (unsigned long __iomem *)((unsigned long)p->screen_base & ~(bytes-1));
-        dst_idx = ((unsigned long)p->screen_base & (bytes - 1))*8;
+        dst = (unsigned long FB_MEM *)((unsigned long)FB_SCREEN_BASE(p) & ~(bytes-1));
+        dst_idx = ((unsigned long)FB_SCREEN_BASE(p) & (bytes - 1))*8;
         dst_idx += rect->dy*p->fix.line_length*8+rect->dx*bpp;
         /* FIXME For now we support 1-32 bpp only */
         left = bits % bpp;
@@ -308,7 +297,7 @@ void cfb_fillrect(struct fb_info *p, const struct fb_fillrect *rect)
         if (!left) {
                 u32 bswapmask = fb_compute_bswapmask(p);
                 void (*fill_op32)(struct fb_info *p,
-                                  unsigned long __iomem *dst, int dst_idx,
+                                  unsigned long FB_MEM *dst, int dst_idx,
                                   unsigned long pat, unsigned n, int bits,
                                   u32 bswapmask) = NULL;
 
@@ -320,7 +309,8 @@ void cfb_fillrect(struct fb_info *p, const struct fb_fillrect *rect)
                         fill_op32 = bitfill_aligned;
                         break;
                 default:
-                        printk( KERN_ERR "cfb_fillrect(): unknown rop, defaulting to ROP_COPY\n");
+                        printk( KERN_ERR FB_FILLRECT_NAME "(): unknown rop, "
+                                "defaulting to ROP_COPY\n");
                         fill_op32 = bitfill_aligned;
                         break;
                 }
@@ -333,7 +323,7 @@ void cfb_fillrect(struct fb_info *p, const struct fb_fillrect *rect)
                 }
         } else {
                 int right, r;
-                void (*fill_op)(struct fb_info *p, unsigned long __iomem *dst,
+                void (*fill_op)(struct fb_info *p, unsigned long FB_MEM *dst,
                                 int dst_idx, unsigned long pat, int left,
                                 int right, unsigned n, int bits) = NULL;
 #ifdef __LITTLE_ENDIAN
@@ -350,7 +340,8 @@ void cfb_fillrect(struct fb_info *p, const struct fb_fillrect *rect)
                         fill_op = bitfill_unaligned;
                         break;
                 default:
-                        printk(KERN_ERR "cfb_fillrect(): unknown rop, defaulting to ROP_COPY\n");
+                        printk(KERN_ERR FB_FILLRECT_NAME "(): unknown rop, "
+                                "defaulting to ROP_COPY\n");
                         fill_op = bitfill_unaligned;
                         break;
                 }
@@ -366,9 +357,3 @@ void cfb_fillrect(struct fb_info *p, const struct fb_fillrect *rect)
                 }
         }
 }
-
-EXPORT_SYMBOL(cfb_fillrect);
-
-MODULE_AUTHOR("James Simmons <jsimmons@users.sf.net>");
-MODULE_DESCRIPTION("Generic software accelerated fill rectangle");
-MODULE_LICENSE("GPL");
-- 
2.30.2
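
The unaligned fill functions carried through this patch rotate the pattern
after every word ("pat = pat << left | pat >> right"). A worked case may
help: 24 bpp with 32-bit words on a little-endian layout, where the code in
the fillrect function yields right = 32 % 24 = 8 and left = 24 - 8 = 16.
The sketch below is a userspace illustration only. demo_fill24 is a
hypothetical name, it uses fixed 32-bit words, assumes the fill starts on a
pixel boundary at the beginning of a word, and builds the start pattern by
hand instead of calling the kernel helper.

```c
#include <stdint.h>

/*
 * Hypothetical model: fill nwords 32-bit words with a repeating 24-bit
 * pixel, starting exactly on a pixel boundary.
 */
static void demo_fill24(uint32_t *dst, uint32_t pixel, unsigned nwords)
{
	const int right = 32 % 24;	/* 8 */
	const int left = 24 - right;	/* 16 */
	/* One full pixel plus the low byte of the next one */
	uint32_t pat = (pixel & 0xFFFFFFu) | (pixel << 24);

	while (nwords--) {
		*dst++ = pat;
		/* Advance the pattern by one word's worth of bits (mod 24) */
		pat = pat << left | pat >> right;
	}
}
```

With pixel 0xAABBCC the three words come out as 0xCCAABBCC, 0xBBCCAABB and
0xAABBCCAA. Stored on a little-endian host that is the byte sequence
CC BB AA repeated four times, so the pattern realigns every three words.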



* [PATCH RESEND 07/13] fbdev: core: Use generic fillrect for as cfb_fillrect
  2025-02-07  4:18 [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops Zsolt Kajtar
                   ` (5 preceding siblings ...)
  2025-02-07  4:18 ` [PATCH RESEND 06/13] fbdev: core: Make fb_fillrect generic Zsolt Kajtar
@ 2025-02-07  4:18 ` Zsolt Kajtar
  2025-02-07  4:18 ` [PATCH RESEND 08/13] fbdev: core: Use generic fillrect for as sys_fillrect Zsolt Kajtar
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Zsolt Kajtar @ 2025-02-07  4:18 UTC (permalink / raw)
  To: linux-fbdev, dri-devel; +Cc: Zsolt Kajtar

Signed-off-by: Zsolt Kajtar <soci@c64.rulez.org>
---
 drivers/video/fbdev/core/cfbfillrect.c | 362 +------------------------
 1 file changed, 11 insertions(+), 351 deletions(-)

diff --git a/drivers/video/fbdev/core/cfbfillrect.c b/drivers/video/fbdev/core/cfbfillrect.c
index cbaa4c9e2..116d56de2 100644
--- a/drivers/video/fbdev/core/cfbfillrect.c
+++ b/drivers/video/fbdev/core/cfbfillrect.c
@@ -7,365 +7,25 @@
  *  License.  See the file COPYING in the main directory of this archive for
  *  more details.
  *
- * NOTES:
- *
- *  Also need to add code to deal with cards endians that are different than
- *  the native cpu endians. I also need to deal with MSB position in the word.
- *
  */
 #include <linux/module.h>
-#include <linux/string.h>
 #include <linux/fb.h>
 #include <asm/types.h>
-#include "fb_draw.h"
 
 #if BITS_PER_LONG == 32
-#  define FB_WRITEL fb_writel
-#  define FB_READL  fb_readl
-#else
-#  define FB_WRITEL fb_writeq
-#  define FB_READL  fb_readq
-#endif
-
-    /*
-     *  Aligned pattern fill using 32/64-bit memory accesses
-     */
-
-static void
-bitfill_aligned(struct fb_info *p, unsigned long __iomem *dst, int dst_idx,
-		unsigned long pat, unsigned n, int bits, u32 bswapmask)
-{
-	unsigned long first, last;
-
-	if (!n)
-		return;
-
-	first = fb_shifted_pixels_mask_long(p, dst_idx, bswapmask);
-	last = ~fb_shifted_pixels_mask_long(p, (dst_idx+n) % bits, bswapmask);
-
-	if (dst_idx+n <= bits) {
-		// Single word
-		if (last)
-			first &= last;
-		FB_WRITEL(comp(pat, FB_READL(dst), first), dst);
-	} else {
-		// Multiple destination words
-
-		// Leading bits
-		if (first!= ~0UL) {
-			FB_WRITEL(comp(pat, FB_READL(dst), first), dst);
-			dst++;
-			n -= bits - dst_idx;
-		}
-
-		// Main chunk
-		n /= bits;
-		while (n >= 8) {
-			FB_WRITEL(pat, dst++);
-			FB_WRITEL(pat, dst++);
-			FB_WRITEL(pat, dst++);
-			FB_WRITEL(pat, dst++);
-			FB_WRITEL(pat, dst++);
-			FB_WRITEL(pat, dst++);
-			FB_WRITEL(pat, dst++);
-			FB_WRITEL(pat, dst++);
-			n -= 8;
-		}
-		while (n--)
-			FB_WRITEL(pat, dst++);
-
-		// Trailing bits
-		if (last)
-			FB_WRITEL(comp(pat, FB_READL(dst), last), dst);
-	}
-}
-
-
-    /*
-     *  Unaligned generic pattern fill using 32/64-bit memory accesses
-     *  The pattern must have been expanded to a full 32/64-bit value
-     *  Left/right are the appropriate shifts to convert to the pattern to be
-     *  used for the next 32/64-bit word
-     */
-
-static void
-bitfill_unaligned(struct fb_info *p, unsigned long __iomem *dst, int dst_idx,
-		  unsigned long pat, int left, int right, unsigned n, int bits)
-{
-	unsigned long first, last;
-
-	if (!n)
-		return;
-
-	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
-	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
-
-	if (dst_idx+n <= bits) {
-		// Single word
-		if (last)
-			first &= last;
-		FB_WRITEL(comp(pat, FB_READL(dst), first), dst);
-	} else {
-		// Multiple destination words
-		// Leading bits
-		if (first) {
-			FB_WRITEL(comp(pat, FB_READL(dst), first), dst);
-			dst++;
-			pat = pat << left | pat >> right;
-			n -= bits - dst_idx;
-		}
-
-		// Main chunk
-		n /= bits;
-		while (n >= 4) {
-			FB_WRITEL(pat, dst++);
-			pat = pat << left | pat >> right;
-			FB_WRITEL(pat, dst++);
-			pat = pat << left | pat >> right;
-			FB_WRITEL(pat, dst++);
-			pat = pat << left | pat >> right;
-			FB_WRITEL(pat, dst++);
-			pat = pat << left | pat >> right;
-			n -= 4;
-		}
-		while (n--) {
-			FB_WRITEL(pat, dst++);
-			pat = pat << left | pat >> right;
-		}
-
-		// Trailing bits
-		if (last)
-			FB_WRITEL(comp(pat, FB_READL(dst), last), dst);
-	}
-}
-
-    /*
-     *  Aligned pattern invert using 32/64-bit memory accesses
-     */
-static void
-bitfill_aligned_rev(struct fb_info *p, unsigned long __iomem *dst,
-		    int dst_idx, unsigned long pat, unsigned n, int bits,
-		    u32 bswapmask)
-{
-	unsigned long val = pat, dat;
-	unsigned long first, last;
-
-	if (!n)
-		return;
-
-	first = fb_shifted_pixels_mask_long(p, dst_idx, bswapmask);
-	last = ~fb_shifted_pixels_mask_long(p, (dst_idx+n) % bits, bswapmask);
-
-	if (dst_idx+n <= bits) {
-		// Single word
-		if (last)
-			first &= last;
-		dat = FB_READL(dst);
-		FB_WRITEL(comp(dat ^ val, dat, first), dst);
-	} else {
-		// Multiple destination words
-		// Leading bits
-		if (first!=0UL) {
-			dat = FB_READL(dst);
-			FB_WRITEL(comp(dat ^ val, dat, first), dst);
-			dst++;
-			n -= bits - dst_idx;
-		}
-
-		// Main chunk
-		n /= bits;
-		while (n >= 8) {
-			FB_WRITEL(FB_READL(dst) ^ val, dst);
-			dst++;
-			FB_WRITEL(FB_READL(dst) ^ val, dst);
-			dst++;
-			FB_WRITEL(FB_READL(dst) ^ val, dst);
-			dst++;
-			FB_WRITEL(FB_READL(dst) ^ val, dst);
-			dst++;
-			FB_WRITEL(FB_READL(dst) ^ val, dst);
-			dst++;
-			FB_WRITEL(FB_READL(dst) ^ val, dst);
-			dst++;
-			FB_WRITEL(FB_READL(dst) ^ val, dst);
-			dst++;
-			FB_WRITEL(FB_READL(dst) ^ val, dst);
-			dst++;
-			n -= 8;
-		}
-		while (n--) {
-			FB_WRITEL(FB_READL(dst) ^ val, dst);
-			dst++;
-		}
-		// Trailing bits
-		if (last) {
-			dat = FB_READL(dst);
-			FB_WRITEL(comp(dat ^ val, dat, last), dst);
-		}
-	}
-}
-
-
-    /*
-     *  Unaligned generic pattern invert using 32/64-bit memory accesses
-     *  The pattern must have been expanded to a full 32/64-bit value
-     *  Left/right are the appropriate shifts to convert to the pattern to be
-     *  used for the next 32/64-bit word
-     */
-
-static void
-bitfill_unaligned_rev(struct fb_info *p, unsigned long __iomem *dst,
-		      int dst_idx, unsigned long pat, int left, int right,
-		      unsigned n, int bits)
-{
-	unsigned long first, last, dat;
-
-	if (!n)
-		return;
-
-	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
-	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
-
-	if (dst_idx+n <= bits) {
-		// Single word
-		if (last)
-			first &= last;
-		dat = FB_READL(dst);
-		FB_WRITEL(comp(dat ^ pat, dat, first), dst);
-	} else {
-		// Multiple destination words
-
-		// Leading bits
-		if (first != 0UL) {
-			dat = FB_READL(dst);
-			FB_WRITEL(comp(dat ^ pat, dat, first), dst);
-			dst++;
-			pat = pat << left | pat >> right;
-			n -= bits - dst_idx;
-		}
-
-		// Main chunk
-		n /= bits;
-		while (n >= 4) {
-			FB_WRITEL(FB_READL(dst) ^ pat, dst);
-			dst++;
-			pat = pat << left | pat >> right;
-			FB_WRITEL(FB_READL(dst) ^ pat, dst);
-			dst++;
-			pat = pat << left | pat >> right;
-			FB_WRITEL(FB_READL(dst) ^ pat, dst);
-			dst++;
-			pat = pat << left | pat >> right;
-			FB_WRITEL(FB_READL(dst) ^ pat, dst);
-			dst++;
-			pat = pat << left | pat >> right;
-			n -= 4;
-		}
-		while (n--) {
-			FB_WRITEL(FB_READL(dst) ^ pat, dst);
-			dst++;
-			pat = pat << left | pat >> right;
-		}
-
-		// Trailing bits
-		if (last) {
-			dat = FB_READL(dst);
-			FB_WRITEL(comp(dat ^ pat, dat, last), dst);
-		}
-	}
-}
-
-void cfb_fillrect(struct fb_info *p, const struct fb_fillrect *rect)
-{
-	unsigned long pat, pat2, fg;
-	unsigned long width = rect->width, height = rect->height;
-	int bits = BITS_PER_LONG, bytes = bits >> 3;
-	u32 bpp = p->var.bits_per_pixel;
-	unsigned long __iomem *dst;
-	int dst_idx, left;
-
-	if (p->state != FBINFO_STATE_RUNNING)
-		return;
-
-	if (p->flags & FBINFO_VIRTFB)
-		fb_warn_once(p, "Framebuffer is not in I/O address space.");
-
-	if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
-	    p->fix.visual == FB_VISUAL_DIRECTCOLOR )
-		fg = ((u32 *) (p->pseudo_palette))[rect->color];
-	else
-		fg = rect->color;
-
-	pat = pixel_to_pat(bpp, fg);
-
-	dst = (unsigned long __iomem *)((unsigned long)p->screen_base & ~(bytes-1));
-	dst_idx = ((unsigned long)p->screen_base & (bytes - 1))*8;
-	dst_idx += rect->dy*p->fix.line_length*8+rect->dx*bpp;
-	/* FIXME For now we support 1-32 bpp only */
-	left = bits % bpp;
-	if (p->fbops->fb_sync)
-		p->fbops->fb_sync(p);
-	if (!left) {
-		u32 bswapmask = fb_compute_bswapmask(p);
-		void (*fill_op32)(struct fb_info *p,
-				  unsigned long __iomem *dst, int dst_idx,
-		                  unsigned long pat, unsigned n, int bits,
-				  u32 bswapmask) = NULL;
-
-		switch (rect->rop) {
-		case ROP_XOR:
-			fill_op32 = bitfill_aligned_rev;
-			break;
-		case ROP_COPY:
-			fill_op32 = bitfill_aligned;
-			break;
-		default:
-			printk( KERN_ERR "cfb_fillrect(): unknown rop, defaulting to ROP_COPY\n");
-			fill_op32 = bitfill_aligned;
-			break;
-		}
-		while (height--) {
-			dst += dst_idx >> (ffs(bits) - 1);
-			dst_idx &= (bits - 1);
-			fill_op32(p, dst, dst_idx, pat, width*bpp, bits,
-				  bswapmask);
-			dst_idx += p->fix.line_length*8;
-		}
-	} else {
-		int right, r;
-		void (*fill_op)(struct fb_info *p, unsigned long __iomem *dst,
-				int dst_idx, unsigned long pat, int left,
-				int right, unsigned n, int bits) = NULL;
-#ifdef __LITTLE_ENDIAN
-		right = left;
-		left = bpp - right;
+#  define FB_WRITEL       fb_writel
+#  define FB_READL        fb_readl
 #else
-		right = bpp - left;
+#  define FB_WRITEL       fb_writeq
+#  define FB_READL        fb_readq
 #endif
-		switch (rect->rop) {
-		case ROP_XOR:
-			fill_op = bitfill_unaligned_rev;
-			break;
-		case ROP_COPY:
-			fill_op = bitfill_unaligned;
-			break;
-		default:
-			printk(KERN_ERR "cfb_fillrect(): unknown rop, defaulting to ROP_COPY\n");
-			fill_op = bitfill_unaligned;
-			break;
-		}
-		while (height--) {
-			dst += dst_idx / bits;
-			dst_idx &= (bits - 1);
-			r = dst_idx % bpp;
-			/* rotate pattern to the correct start position */
-			pat2 = le_long_to_cpu(rolx(cpu_to_le_long(pat), r, bpp));
-			fill_op(p, dst, dst_idx, pat2, left, right,
-				width*bpp, bits);
-			dst_idx += p->fix.line_length*8;
-		}
-	}
-}
+#define FB_MEM            __iomem
+#define FB_FILLRECT       cfb_fillrect
+#define FB_FILLRECT_NAME  "cfb_fillrect"
+#define FB_SPACE          0
+#define FB_SPACE_NAME     "I/O"
+#define FB_SCREEN_BASE(a) ((a)->screen_base)
+#include "fb_fillrect.h"
 
 EXPORT_SYMBOL(cfb_fillrect);
 
-- 
2.30.2


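
Note on the patch above (commentary, not part of the series): after this change cfbfillrect.c only defines a handful of macros (FB_MEM, FB_READL, FB_WRITEL, FB_FILLRECT, ...) and then includes fb_fillrect.h, so one body is compiled once per memory space. The following userspace sketch shows the same idea in a single file. Since a self-contained example cannot include a header twice, it stamps the body out with a macro instead; that substitution is an assumption made for illustration, and every name here (DEFINE_FILL_WORDS, demo_io_fill, demo_sys_fill, io_read, io_write) is hypothetical rather than taken from the series.

```c
#include <stddef.h>

/* Hypothetical stand-ins for I/O accessors such as fb_readl()/fb_writel(). */
static inline unsigned long io_read(const volatile unsigned long *p)
{
        return *p;
}

static inline void io_write(unsigned long v, volatile unsigned long *p)
{
        *p = v;
}

/* Plain-memory accessors, same shape as the sys variant's macros. */
#define SYS_READ(p)      (*(p))
#define SYS_WRITE(v, p)  do { *(p) = (v); } while (0)

/*
 * One body, instantiated once per access style.  The series gets the same
 * effect with "#define ... / #include "fb_fillrect.h"" instead of a macro.
 */
#define DEFINE_FILL_WORDS(name, qual, READ, WRITE)                      \
static void name(qual unsigned long *dst, unsigned long pat, size_t n,  \
                 int xor_mode)                                           \
{                                                                        \
        while (n--) {                                                    \
                unsigned long v = xor_mode ? (READ(dst) ^ pat) : pat;    \
                WRITE(v, dst);                                           \
                dst++;                                                   \
        }                                                                \
}

DEFINE_FILL_WORDS(demo_io_fill, volatile, io_read, io_write)
DEFINE_FILL_WORDS(demo_sys_fill, , SYS_READ, SYS_WRITE)
```

Both functions share the same logic and differ only in how a word is read and written, which is what allows a fix such as the pixel reversing support to be made in one place.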

* [PATCH RESEND 08/13] fbdev: core: Use generic fillrect for as sys_fillrect
  2025-02-07  4:18 [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops Zsolt Kajtar
                   ` (6 preceding siblings ...)
  2025-02-07  4:18 ` [PATCH RESEND 07/13] fbdev: core: Use generic fillrect for as cfb_fillrect Zsolt Kajtar
@ 2025-02-07  4:18 ` Zsolt Kajtar
  2025-02-07  4:18 ` [PATCH RESEND 09/13] fbdev: core: Copy cfbimgblt to fb_imageblit Zsolt Kajtar
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Zsolt Kajtar @ 2025-02-07  4:18 UTC (permalink / raw)
  To: linux-fbdev, dri-devel; +Cc: Zsolt Kajtar

Signed-off-by: Zsolt Kajtar <soci@c64.rulez.org>
---
 drivers/video/fbdev/core/sysfillrect.c | 314 +------------------------
 1 file changed, 9 insertions(+), 305 deletions(-)

diff --git a/drivers/video/fbdev/core/sysfillrect.c b/drivers/video/fbdev/core/sysfillrect.c
index e49221a88..48d0f0efb 100644
--- a/drivers/video/fbdev/core/sysfillrect.c
+++ b/drivers/video/fbdev/core/sysfillrect.c
@@ -12,314 +12,18 @@
  *  more details.
  */
 #include <linux/module.h>
-#include <linux/string.h>
 #include <linux/fb.h>
 #include <asm/types.h>
-#include "fb_draw.h"
 
-    /*
-     *  Aligned pattern fill using 32/64-bit memory accesses
-     */
-
-static void
-bitfill_aligned(struct fb_info *p, unsigned long *dst, int dst_idx,
-		unsigned long pat, unsigned n, int bits)
-{
-	unsigned long first, last;
-
-	if (!n)
-		return;
-
-	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
-	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
-
-	if (dst_idx+n <= bits) {
-		/* Single word */
-		if (last)
-			first &= last;
-		*dst = comp(pat, *dst, first);
-	} else {
-		/* Multiple destination words */
-
-		/* Leading bits */
- 		if (first!= ~0UL) {
-			*dst = comp(pat, *dst, first);
-			dst++;
-			n -= bits - dst_idx;
-		}
-
-		/* Main chunk */
-		n /= bits;
-		memset_l(dst, pat, n);
-		dst += n;
-
-		/* Trailing bits */
-		if (last)
-			*dst = comp(pat, *dst, last);
-	}
-}
-
-
-    /*
-     *  Unaligned generic pattern fill using 32/64-bit memory accesses
-     *  The pattern must have been expanded to a full 32/64-bit value
-     *  Left/right are the appropriate shifts to convert to the pattern to be
-     *  used for the next 32/64-bit word
-     */
-
-static void
-bitfill_unaligned(struct fb_info *p, unsigned long *dst, int dst_idx,
-		  unsigned long pat, int left, int right, unsigned n, int bits)
-{
-	unsigned long first, last;
-
-	if (!n)
-		return;
-
-	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
-	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
-
-	if (dst_idx+n <= bits) {
-		/* Single word */
-		if (last)
-			first &= last;
-		*dst = comp(pat, *dst, first);
-	} else {
-		/* Multiple destination words */
-		/* Leading bits */
-		if (first) {
-			*dst = comp(pat, *dst, first);
-			dst++;
-			pat = pat << left | pat >> right;
-			n -= bits - dst_idx;
-		}
-
-		/* Main chunk */
-		n /= bits;
-		while (n >= 4) {
-			*dst++ = pat;
-			pat = pat << left | pat >> right;
-			*dst++ = pat;
-			pat = pat << left | pat >> right;
-			*dst++ = pat;
-			pat = pat << left | pat >> right;
-			*dst++ = pat;
-			pat = pat << left | pat >> right;
-			n -= 4;
-		}
-		while (n--) {
-			*dst++ = pat;
-			pat = pat << left | pat >> right;
-		}
-
-		/* Trailing bits */
-		if (last)
-			*dst = comp(pat, *dst, last);
-	}
-}
-
-    /*
-     *  Aligned pattern invert using 32/64-bit memory accesses
-     */
-static void
-bitfill_aligned_rev(struct fb_info *p, unsigned long *dst, int dst_idx,
-		    unsigned long pat, unsigned n, int bits)
-{
-	unsigned long val = pat;
-	unsigned long first, last;
-
-	if (!n)
-		return;
-
-	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
-	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
-
-	if (dst_idx+n <= bits) {
-		/* Single word */
-		if (last)
-			first &= last;
-		*dst = comp(*dst ^ val, *dst, first);
-	} else {
-		/* Multiple destination words */
-		/* Leading bits */
-		if (first!=0UL) {
-			*dst = comp(*dst ^ val, *dst, first);
-			dst++;
-			n -= bits - dst_idx;
-		}
-
-		/* Main chunk */
-		n /= bits;
-		while (n >= 8) {
-			*dst++ ^= val;
-			*dst++ ^= val;
-			*dst++ ^= val;
-			*dst++ ^= val;
-			*dst++ ^= val;
-			*dst++ ^= val;
-			*dst++ ^= val;
-			*dst++ ^= val;
-			n -= 8;
-		}
-		while (n--)
-			*dst++ ^= val;
-		/* Trailing bits */
-		if (last)
-			*dst = comp(*dst ^ val, *dst, last);
-	}
-}
-
-
-    /*
-     *  Unaligned generic pattern invert using 32/64-bit memory accesses
-     *  The pattern must have been expanded to a full 32/64-bit value
-     *  Left/right are the appropriate shifts to convert to the pattern to be
-     *  used for the next 32/64-bit word
-     */
-
-static void
-bitfill_unaligned_rev(struct fb_info *p, unsigned long *dst, int dst_idx,
-		      unsigned long pat, int left, int right, unsigned n,
-		      int bits)
-{
-	unsigned long first, last;
-
-	if (!n)
-		return;
-
-	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
-	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
-
-	if (dst_idx+n <= bits) {
-		/* Single word */
-		if (last)
-			first &= last;
-		*dst = comp(*dst ^ pat, *dst, first);
-	} else {
-		/* Multiple destination words */
-
-		/* Leading bits */
-		if (first != 0UL) {
-			*dst = comp(*dst ^ pat, *dst, first);
-			dst++;
-			pat = pat << left | pat >> right;
-			n -= bits - dst_idx;
-		}
-
-		/* Main chunk */
-		n /= bits;
-		while (n >= 4) {
-			*dst++ ^= pat;
-			pat = pat << left | pat >> right;
-			*dst++ ^= pat;
-			pat = pat << left | pat >> right;
-			*dst++ ^= pat;
-			pat = pat << left | pat >> right;
-			*dst++ ^= pat;
-			pat = pat << left | pat >> right;
-			n -= 4;
-		}
-		while (n--) {
-			*dst ^= pat;
-			pat = pat << left | pat >> right;
-		}
-
-		/* Trailing bits */
-		if (last)
-			*dst = comp(*dst ^ pat, *dst, last);
-	}
-}
-
-void sys_fillrect(struct fb_info *p, const struct fb_fillrect *rect)
-{
-	unsigned long pat, pat2, fg;
-	unsigned long width = rect->width, height = rect->height;
-	int bits = BITS_PER_LONG, bytes = bits >> 3;
-	u32 bpp = p->var.bits_per_pixel;
-	unsigned long *dst;
-	int dst_idx, left;
-
-	if (p->state != FBINFO_STATE_RUNNING)
-		return;
-
-	if (!(p->flags & FBINFO_VIRTFB))
-		fb_warn_once(p, "Framebuffer is not in virtual address space.");
-
-	if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
-	    p->fix.visual == FB_VISUAL_DIRECTCOLOR )
-		fg = ((u32 *) (p->pseudo_palette))[rect->color];
-	else
-		fg = rect->color;
-
-	pat = pixel_to_pat( bpp, fg);
-
-	dst = (unsigned long *)((unsigned long)p->screen_base & ~(bytes-1));
-	dst_idx = ((unsigned long)p->screen_base & (bytes - 1))*8;
-	dst_idx += rect->dy*p->fix.line_length*8+rect->dx*bpp;
-	/* FIXME For now we support 1-32 bpp only */
-	left = bits % bpp;
-	if (p->fbops->fb_sync)
-		p->fbops->fb_sync(p);
-	if (!left) {
-		void (*fill_op32)(struct fb_info *p, unsigned long *dst,
-				  int dst_idx, unsigned long pat, unsigned n,
-				  int bits) = NULL;
-
-		switch (rect->rop) {
-		case ROP_XOR:
-			fill_op32 = bitfill_aligned_rev;
-			break;
-		case ROP_COPY:
-			fill_op32 = bitfill_aligned;
-			break;
-		default:
-			printk( KERN_ERR "cfb_fillrect(): unknown rop, "
-				"defaulting to ROP_COPY\n");
-			fill_op32 = bitfill_aligned;
-			break;
-		}
-		while (height--) {
-			dst += dst_idx >> (ffs(bits) - 1);
-			dst_idx &= (bits - 1);
-			fill_op32(p, dst, dst_idx, pat, width*bpp, bits);
-			dst_idx += p->fix.line_length*8;
-		}
-	} else {
-		int right, r;
-		void (*fill_op)(struct fb_info *p, unsigned long *dst,
-				int dst_idx, unsigned long pat, int left,
-				int right, unsigned n, int bits) = NULL;
-#ifdef __LITTLE_ENDIAN
-		right = left;
-		left = bpp - right;
-#else
-		right = bpp - left;
-#endif
-		switch (rect->rop) {
-		case ROP_XOR:
-			fill_op = bitfill_unaligned_rev;
-			break;
-		case ROP_COPY:
-			fill_op = bitfill_unaligned;
-			break;
-		default:
-			printk(KERN_ERR "sys_fillrect(): unknown rop, "
-				"defaulting to ROP_COPY\n");
-			fill_op = bitfill_unaligned;
-			break;
-		}
-		while (height--) {
-			dst += dst_idx / bits;
-			dst_idx &= (bits - 1);
-			r = dst_idx % bpp;
-			/* rotate pattern to the correct start position */
-			pat2 = le_long_to_cpu(rolx(cpu_to_le_long(pat), r, bpp));
-			fill_op(p, dst, dst_idx, pat2, left, right,
-				width*bpp, bits);
-			dst_idx += p->fix.line_length*8;
-		}
-	}
-}
+#define FB_READL(a)       (*(a))
+#define FB_WRITEL(a, b)   do { *(b) = (a); } while (false)
+#define FB_MEM            /* nothing */
+#define FB_FILLRECT       sys_fillrect
+#define FB_FILLRECT_NAME  "sys_fillrect"
+#define FB_SPACE          FBINFO_VIRTFB
+#define FB_SPACE_NAME     "virtual"
+#define FB_SCREEN_BASE(a) ((a)->screen_buffer)
+#include "fb_fillrect.h"
 
 EXPORT_SYMBOL(sys_fillrect);
 
-- 
2.30.2


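
Note on the patch above (commentary, not part of the series): the removed bitfill_aligned() variants all follow one pattern: build a mask for the leading partial word, write whole words in the middle, and build a mask for the trailing partial word, merging with the existing contents via comp(). A minimal userspace sketch of that pattern follows. It assumes bit 0 of a word is the first pixel bit and ignores byte swapping, so a plain left shift stands in for FB_SHIFT_HIGH and fb_shifted_pixels_mask_long; the demo_ names are hypothetical.

```c
#include <limits.h>

#define DEMO_BITS ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Take bits from a where mask is set and from b elsewhere. */
static unsigned long demo_comp(unsigned long a, unsigned long b,
                               unsigned long mask)
{
        return ((a ^ b) & mask) ^ b;
}

/*
 * Fill n bits with pat, starting at bit dst_idx (< DEMO_BITS) of *dst.
 * Assumes little-endian pixel order and no byte swapping.
 */
static void demo_bitfill_aligned(unsigned long *dst, unsigned int dst_idx,
                                 unsigned long pat, unsigned int n)
{
        unsigned long first, last;

        if (!n)
                return;

        first = ~0UL << dst_idx;                       /* bits from the start on */
        last = ~(~0UL << ((dst_idx + n) % DEMO_BITS)); /* bits before the end */

        if (dst_idx + n <= DEMO_BITS) {
                /* Single word; last == 0 means the fill ends on a word boundary */
                if (last)
                        first &= last;
                *dst = demo_comp(pat, *dst, first);
                return;
        }

        /* Leading partial word */
        if (first != ~0UL) {
                *dst = demo_comp(pat, *dst, first);
                dst++;
                n -= DEMO_BITS - dst_idx;
        }

        /* Whole words */
        for (n /= DEMO_BITS; n; n--)
                *dst++ = pat;

        /* Trailing partial word */
        if (last)
                *dst = demo_comp(pat, *dst, last);
}
```

The I/O and system versions removed by patches 07 and 08 differ from this only in how *dst is accessed and in how the two masks are computed.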

* [PATCH RESEND 09/13] fbdev: core: Copy cfbimgblt to fb_imageblit
  2025-02-07  4:18 [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops Zsolt Kajtar
                   ` (7 preceding siblings ...)
  2025-02-07  4:18 ` [PATCH RESEND 08/13] fbdev: core: Use generic fillrect for as sys_fillrect Zsolt Kajtar
@ 2025-02-07  4:18 ` Zsolt Kajtar
  2025-02-07  4:18 ` [PATCH RESEND 10/13] fbdev: core: Make fb_imageblit generic Zsolt Kajtar
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Zsolt Kajtar @ 2025-02-07  4:18 UTC (permalink / raw)
  To: linux-fbdev, dri-devel; +Cc: Zsolt Kajtar

Signed-off-by: Zsolt Kajtar <soci@c64.rulez.org>
---
 drivers/video/fbdev/core/fb_imageblit.h | 368 ++++++++++++++++++++++++
 1 file changed, 368 insertions(+)
 create mode 100644 drivers/video/fbdev/core/fb_imageblit.h

diff --git a/drivers/video/fbdev/core/fb_imageblit.h b/drivers/video/fbdev/core/fb_imageblit.h
new file mode 100644
index 000000000..129822b6f
--- /dev/null
+++ b/drivers/video/fbdev/core/fb_imageblit.h
@@ -0,0 +1,368 @@
+/*
+ *  Generic BitBLT function for frame buffer with packed pixels of any depth.
+ *
+ *      Copyright (C)  June 1999 James Simmons
+ *
+ *  This file is subject to the terms and conditions of the GNU General Public
+ *  License.  See the file COPYING in the main directory of this archive for
+ *  more details.
+ *
+ * NOTES:
+ *
+ *    This function copies an image from system memory to video memory. The
+ *  image can be a bitmap where each 0 represents the background color and
+ *  each 1 represents the foreground color. Great for font handling. It can
+ *  also be a color image. This is determined by image_depth. The color image
+ *  must be laid out exactly in the same format as the framebuffer. Yes I know
+ *  there are cards with hardware that converts images of various depths to the
+ *  framebuffer depth. But not every card has this. All images must be rounded
+ *  up to the nearest byte. For example a bitmap 12 bits wide must be two
+ *  bytes wide.
+ *
+ *  Tony:
+ *  Incorporate mask tables similar to fbcon-cfb*.c in 2.4 API.  This speeds
+ *  up the code significantly.
+ *
+ *  Code for depths not multiples of BITS_PER_LONG is still kludgy, which is
+ *  still processed a bit at a time.
+ *
+ *  Also need to add code to deal with cards endians that are different than
+ *  the native cpu endians. I also need to deal with MSB position in the word.
+ */
+#include <linux/module.h>
+#include <linux/string.h>
+#include <linux/fb.h>
+#include <asm/types.h>
+#include "fb_draw.h"
+
+#define DEBUG
+
+#ifdef DEBUG
+#define DPRINTK(fmt, args...) printk(KERN_DEBUG "%s: " fmt,__func__,## args)
+#else
+#define DPRINTK(fmt, args...)
+#endif
+
+static const u32 cfb_tab8_be[] = {
+    0x00000000,0x000000ff,0x0000ff00,0x0000ffff,
+    0x00ff0000,0x00ff00ff,0x00ffff00,0x00ffffff,
+    0xff000000,0xff0000ff,0xff00ff00,0xff00ffff,
+    0xffff0000,0xffff00ff,0xffffff00,0xffffffff
+};
+
+static const u32 cfb_tab8_le[] = {
+    0x00000000,0xff000000,0x00ff0000,0xffff0000,
+    0x0000ff00,0xff00ff00,0x00ffff00,0xffffff00,
+    0x000000ff,0xff0000ff,0x00ff00ff,0xffff00ff,
+    0x0000ffff,0xff00ffff,0x00ffffff,0xffffffff
+};
+
+static const u32 cfb_tab16_be[] = {
+    0x00000000, 0x0000ffff, 0xffff0000, 0xffffffff
+};
+
+static const u32 cfb_tab16_le[] = {
+    0x00000000, 0xffff0000, 0x0000ffff, 0xffffffff
+};
+
+static const u32 cfb_tab32[] = {
+        0x00000000, 0xffffffff
+};
+
+#define FB_WRITEL fb_writel
+#define FB_READL  fb_readl
+
+static inline void color_imageblit(const struct fb_image *image,
+                                   struct fb_info *p, u8 __iomem *dst1,
+                                   u32 start_index,
+                                   u32 pitch_index)
+{
+        /* Draw the penguin */
+        u32 __iomem *dst, *dst2;
+        u32 color = 0, val, shift;
+        int i, n, bpp = p->var.bits_per_pixel;
+        u32 null_bits = 32 - bpp;
+        u32 *palette = (u32 *) p->pseudo_palette;
+        const u8 *src = image->data;
+        u32 bswapmask = fb_compute_bswapmask(p);
+
+        dst2 = (u32 __iomem *) dst1;
+        for (i = image->height; i--; ) {
+                n = image->width;
+                dst = (u32 __iomem *) dst1;
+                shift = 0;
+                val = 0;
+
+                if (start_index) {
+                        u32 start_mask = ~fb_shifted_pixels_mask_u32(p,
+                                                start_index, bswapmask);
+                        val = FB_READL(dst) & start_mask;
+                        shift = start_index;
+                }
+                while (n--) {
+                        if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
+                            p->fix.visual == FB_VISUAL_DIRECTCOLOR )
+                                color = palette[*src];
+                        else
+                                color = *src;
+                        color <<= FB_LEFT_POS(p, bpp);
+                        val |= FB_SHIFT_HIGH(p, color, shift ^ bswapmask);
+                        if (shift >= null_bits) {
+                                FB_WRITEL(val, dst++);
+
+                                val = (shift == null_bits) ? 0 :
+                                        FB_SHIFT_LOW(p, color, 32 - shift);
+                        }
+                        shift += bpp;
+                        shift &= (32 - 1);
+                        src++;
+                }
+                if (shift) {
+                        u32 end_mask = fb_shifted_pixels_mask_u32(p, shift,
+                                                bswapmask);
+
+                        FB_WRITEL((FB_READL(dst) & end_mask) | val, dst);
+                }
+                dst1 += p->fix.line_length;
+                if (pitch_index) {
+                        dst2 += p->fix.line_length;
+                        dst1 = (u8 __iomem *)((long __force)dst2 & ~(sizeof(u32) - 1));
+
+                        start_index += pitch_index;
+                        start_index &= 32 - 1;
+                }
+        }
+}
+
+static inline void slow_imageblit(const struct fb_image *image, struct fb_info *p,
+                                  u8 __iomem *dst1, u32 fgcolor,
+                                  u32 bgcolor,
+                                  u32 start_index,
+                                  u32 pitch_index)
+{
+        u32 shift, color = 0, bpp = p->var.bits_per_pixel;
+        u32 __iomem *dst, *dst2;
+        u32 val, pitch = p->fix.line_length;
+        u32 null_bits = 32 - bpp;
+        u32 spitch = (image->width+7)/8;
+        const u8 *src = image->data, *s;
+        u32 i, j, l;
+        u32 bswapmask = fb_compute_bswapmask(p);
+
+        dst2 = (u32 __iomem *) dst1;
+        fgcolor <<= FB_LEFT_POS(p, bpp);
+        bgcolor <<= FB_LEFT_POS(p, bpp);
+
+        for (i = image->height; i--; ) {
+                shift = val = 0;
+                l = 8;
+                j = image->width;
+                dst = (u32 __iomem *) dst1;
+                s = src;
+
+                /* write leading bits */
+                if (start_index) {
+                        u32 start_mask = ~fb_shifted_pixels_mask_u32(p,
+                                                start_index, bswapmask);
+                        val = FB_READL(dst) & start_mask;
+                        shift = start_index;
+                }
+
+                while (j--) {
+                        l--;
+                        color = (*s & (1 << l)) ? fgcolor : bgcolor;
+                        val |= FB_SHIFT_HIGH(p, color, shift ^ bswapmask);
+
+                        /* Did the bitshift spill bits to the next long? */
+                        if (shift >= null_bits) {
+                                FB_WRITEL(val, dst++);
+                                val = (shift == null_bits) ? 0 :
+                                        FB_SHIFT_LOW(p, color, 32 - shift);
+                        }
+                        shift += bpp;
+                        shift &= (32 - 1);
+                        if (!l) { l = 8; s++; }
+                }
+
+                /* write trailing bits */
+                if (shift) {
+                        u32 end_mask = fb_shifted_pixels_mask_u32(p, shift,
+                                                bswapmask);
+
+                        FB_WRITEL((FB_READL(dst) & end_mask) | val, dst);
+                }
+
+                dst1 += pitch;
+                src += spitch;
+                if (pitch_index) {
+                        dst2 += pitch;
+                        dst1 = (u8 __iomem *)((long __force)dst2 & ~(sizeof(u32) - 1));
+                        start_index += pitch_index;
+                        start_index &= 32 - 1;
+                }
+
+        }
+}
+
+/*
+ * fast_imageblit - optimized monochrome color expansion
+ *
+ * Only if:  bits_per_pixel == 8, 16, or 32
+ *           image->width is divisible by pixels per dword (ppw);
+ *           fix->line_length is divisible by 4;
+ *           beginning and end of a scanline is dword aligned
+ */
+static inline void fast_imageblit(const struct fb_image *image, struct fb_info *p,
+                                  u8 __iomem *dst1, u32 fgcolor,
+                                  u32 bgcolor)
+{
+        u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
+        u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
+        u32 bit_mask, eorx, shift;
+        const char *s = image->data, *src;
+        u32 __iomem *dst;
+        const u32 *tab = NULL;
+        size_t tablen;
+        u32 colortab[16];
+        int i, j, k;
+
+        switch (bpp) {
+        case 8:
+                tab = fb_be_math(p) ? cfb_tab8_be : cfb_tab8_le;
+                tablen = 16;
+                break;
+        case 16:
+                tab = fb_be_math(p) ? cfb_tab16_be : cfb_tab16_le;
+                tablen = 4;
+                break;
+        case 32:
+                tab = cfb_tab32;
+                tablen = 2;
+                break;
+        default:
+                return;
+        }
+
+        for (i = ppw-1; i--; ) {
+                fgx <<= bpp;
+                bgx <<= bpp;
+                fgx |= fgcolor;
+                bgx |= bgcolor;
+        }
+
+        bit_mask = (1 << ppw) - 1;
+        eorx = fgx ^ bgx;
+        k = image->width/ppw;
+
+        for (i = 0; i < tablen; ++i)
+                colortab[i] = (tab[i] & eorx) ^ bgx;
+
+        for (i = image->height; i--; ) {
+                dst = (u32 __iomem *)dst1;
+                shift = 8;
+                src = s;
+
+                /*
+                 * Manually unroll the per-line copying loop for better
+                 * performance. This works until we processed the last
+                 * completely filled source byte (inclusive).
+                 */
+                switch (ppw) {
+                case 4: /* 8 bpp */
+                        for (j = k; j >= 2; j -= 2, ++src) {
+                                FB_WRITEL(colortab[(*src >> 4) & bit_mask], dst++);
+                                FB_WRITEL(colortab[(*src >> 0) & bit_mask], dst++);
+                        }
+                        break;
+                case 2: /* 16 bpp */
+                        for (j = k; j >= 4; j -= 4, ++src) {
+                                FB_WRITEL(colortab[(*src >> 6) & bit_mask], dst++);
+                                FB_WRITEL(colortab[(*src >> 4) & bit_mask], dst++);
+                                FB_WRITEL(colortab[(*src >> 2) & bit_mask], dst++);
+                                FB_WRITEL(colortab[(*src >> 0) & bit_mask], dst++);
+                        }
+                        break;
+                case 1: /* 32 bpp */
+                        for (j = k; j >= 8; j -= 8, ++src) {
+                                FB_WRITEL(colortab[(*src >> 7) & bit_mask], dst++);
+                                FB_WRITEL(colortab[(*src >> 6) & bit_mask], dst++);
+                                FB_WRITEL(colortab[(*src >> 5) & bit_mask], dst++);
+                                FB_WRITEL(colortab[(*src >> 4) & bit_mask], dst++);
+                                FB_WRITEL(colortab[(*src >> 3) & bit_mask], dst++);
+                                FB_WRITEL(colortab[(*src >> 2) & bit_mask], dst++);
+                                FB_WRITEL(colortab[(*src >> 1) & bit_mask], dst++);
+                                FB_WRITEL(colortab[(*src >> 0) & bit_mask], dst++);
+                        }
+                        break;
+                }
+
+                /*
+                 * For image widths that are not a multiple of 8, there
+                 * are trailing pixels left on the current line. Print
+                 * them as well.
+                 */
+                for (; j--; ) {
+                        shift -= ppw;
+                        FB_WRITEL(colortab[(*src >> shift) & bit_mask], dst++);
+                        if (!shift) {
+                                shift = 8;
+                                ++src;
+                        }
+                }
+
+                dst1 += p->fix.line_length;
+                s += spitch;
+        }
+}
+
+void cfb_imageblit(struct fb_info *p, const struct fb_image *image)
+{
+        u32 fgcolor, bgcolor, start_index, bitstart, pitch_index = 0;
+        u32 bpl = sizeof(u32), bpp = p->var.bits_per_pixel;
+        u32 width = image->width;
+        u32 dx = image->dx, dy = image->dy;
+        u8 __iomem *dst1;
+
+        if (p->state != FBINFO_STATE_RUNNING)
+                return;
+
+        if (p->flags & FBINFO_VIRTFB)
+                fb_warn_once(p, "Framebuffer is not in I/O address space.");
+
+        bitstart = (dy * p->fix.line_length * 8) + (dx * bpp);
+        start_index = bitstart & (32 - 1);
+        pitch_index = (p->fix.line_length & (bpl - 1)) * 8;
+
+        bitstart /= 8;
+        bitstart &= ~(bpl - 1);
+        dst1 = p->screen_base + bitstart;
+
+        if (p->fbops->fb_sync)
+                p->fbops->fb_sync(p);
+
+        if (image->depth == 1) {
+                if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
+                    p->fix.visual == FB_VISUAL_DIRECTCOLOR) {
+                        fgcolor = ((u32*)(p->pseudo_palette))[image->fg_color];
+                        bgcolor = ((u32*)(p->pseudo_palette))[image->bg_color];
+                } else {
+                        fgcolor = image->fg_color;
+                        bgcolor = image->bg_color;
+                }
+
+                if (32 % bpp == 0 && !start_index && !pitch_index &&
+                    ((width & (32/bpp-1)) == 0) &&
+                    bpp >= 8 && bpp <= 32)
+                        fast_imageblit(image, p, dst1, fgcolor, bgcolor);
+                else
+                        slow_imageblit(image, p, dst1, fgcolor, bgcolor,
+                                        start_index, pitch_index);
+        } else
+                color_imageblit(image, p, dst1, start_index, pitch_index);
+}
+
+EXPORT_SYMBOL(cfb_imageblit);
+
+MODULE_AUTHOR("James Simmons <jsimmons@users.sf.net>");
+MODULE_DESCRIPTION("Generic software accelerated imaging drawing");
+MODULE_LICENSE("GPL");
-- 
2.30.2


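
Note on the patch above (commentary, not part of the series): fast_imageblit() avoids per-pixel work by turning a few monochrome source bits into a full 32-bit word with one table lookup, using colortab[i] = (tab[i] & eorx) ^ bgx. The sketch below isolates that step for 8 bpp with the little-endian table from the patch. The demo_ names are hypothetical, and the multiplication used to replicate the colour is a shortcut assumed to be equivalent to the shift-and-or loop in the original.

```c
#include <stdint.h>

/* 4 source bits -> four byte-wide lane masks (values as in cfb_tab8_le). */
static const uint32_t demo_tab8_le[16] = {
        0x00000000, 0xff000000, 0x00ff0000, 0xffff0000,
        0x0000ff00, 0xff00ff00, 0x00ffff00, 0xffffff00,
        0x000000ff, 0xff0000ff, 0x00ff00ff, 0xffff00ff,
        0x0000ffff, 0xff00ffff, 0x00ffffff, 0xffffffff
};

/* Replicate an 8-bit colour into all four byte lanes of a 32-bit word. */
static uint32_t demo_replicate8(uint8_t c)
{
        return (uint32_t)c * 0x01010101u;
}

/* Expand one nibble of monochrome data into four 8 bpp pixels. */
static uint32_t demo_expand_nibble(uint8_t nibble, uint8_t fg, uint8_t bg)
{
        uint32_t fgx = demo_replicate8(fg);
        uint32_t bgx = demo_replicate8(bg);
        uint32_t eorx = fgx ^ bgx;

        /* Lanes whose mask is all-ones become fg, all other lanes stay bg. */
        return (demo_tab8_le[nibble & 0xf] & eorx) ^ bgx;
}
```

For example, nibble 0x8 (only the leftmost of four pixels set) with fg 0xAA and bg 0x55 yields 0x555555AA, where the lowest byte is the first pixel in little-endian memory order. The real function precomputes this for every table index once per call, so the inner loop is only a shift, a mask and a store.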

* [PATCH RESEND 10/13] fbdev: core: Make fb_imageblit generic
  2025-02-07  4:18 [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops Zsolt Kajtar
                   ` (8 preceding siblings ...)
  2025-02-07  4:18 ` [PATCH RESEND 09/13] fbdev: core: Copy cfbimgblt to fb_imageblit Zsolt Kajtar
@ 2025-02-07  4:18 ` Zsolt Kajtar
  2025-02-07  4:18 ` [PATCH RESEND 11/13] fbdev: core: Use generic imageblit for as cfb_imageblit Zsolt Kajtar
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Zsolt Kajtar @ 2025-02-07  4:18 UTC (permalink / raw)
  To: linux-fbdev, dri-devel; +Cc: Zsolt Kajtar

Signed-off-by: Zsolt Kajtar <soci@c64.rulez.org>
---
 drivers/video/fbdev/core/fb_imageblit.h | 52 ++++++++++---------------
 1 file changed, 20 insertions(+), 32 deletions(-)

diff --git a/drivers/video/fbdev/core/fb_imageblit.h b/drivers/video/fbdev/core/fb_imageblit.h
index 129822b6f..b8cd5eb83 100644
--- a/drivers/video/fbdev/core/fb_imageblit.h
+++ b/drivers/video/fbdev/core/fb_imageblit.h
@@ -29,10 +29,6 @@
  *  Also need to add code to deal with cards endians that are different than
  *  the native cpu endians. I also need to deal with MSB position in the word.
  */
-#include <linux/module.h>
-#include <linux/string.h>
-#include <linux/fb.h>
-#include <asm/types.h>
 #include "fb_draw.h"
 
 #define DEBUG
@@ -69,16 +65,13 @@ static const u32 cfb_tab32[] = {
         0x00000000, 0xffffffff
 };
 
-#define FB_WRITEL fb_writel
-#define FB_READL  fb_readl
-
 static inline void color_imageblit(const struct fb_image *image,
-                                   struct fb_info *p, u8 __iomem *dst1,
+                                   struct fb_info *p, u8 FB_MEM *dst1,
                                    u32 start_index,
                                    u32 pitch_index)
 {
         /* Draw the penguin */
-        u32 __iomem *dst, *dst2;
+        u32 FB_MEM *dst, *dst2;
         u32 color = 0, val, shift;
         int i, n, bpp = p->var.bits_per_pixel;
         u32 null_bits = 32 - bpp;
@@ -86,10 +79,10 @@ static inline void color_imageblit(const struct fb_image *image,
         const u8 *src = image->data;
         u32 bswapmask = fb_compute_bswapmask(p);
 
-        dst2 = (u32 __iomem *) dst1;
+        dst2 = (u32 FB_MEM *) dst1;
         for (i = image->height; i--; ) {
                 n = image->width;
-                dst = (u32 __iomem *) dst1;
+                dst = (u32 FB_MEM *) dst1;
                 shift = 0;
                 val = 0;
 
@@ -126,7 +119,7 @@ static inline void color_imageblit(const struct fb_image *image,
                 dst1 += p->fix.line_length;
                 if (pitch_index) {
                         dst2 += p->fix.line_length;
-                        dst1 = (u8 __iomem *)((long __force)dst2 & ~(sizeof(u32) - 1));
+                        dst1 = (u8 FB_MEM *)((long __force)dst2 & ~(sizeof(u32) - 1));
 
                         start_index += pitch_index;
                         start_index &= 32 - 1;
@@ -135,13 +128,13 @@ static inline void color_imageblit(const struct fb_image *image,
 }
 
 static inline void slow_imageblit(const struct fb_image *image, struct fb_info *p,
-                                  u8 __iomem *dst1, u32 fgcolor,
+                                  u8 FB_MEM *dst1, u32 fgcolor,
                                   u32 bgcolor,
                                   u32 start_index,
                                   u32 pitch_index)
 {
         u32 shift, color = 0, bpp = p->var.bits_per_pixel;
-        u32 __iomem *dst, *dst2;
+        u32 FB_MEM *dst, *dst2;
         u32 val, pitch = p->fix.line_length;
         u32 null_bits = 32 - bpp;
         u32 spitch = (image->width+7)/8;
@@ -149,7 +142,7 @@ static inline void slow_imageblit(const struct fb_image *image, struct fb_info *
         u32 i, j, l;
         u32 bswapmask = fb_compute_bswapmask(p);
 
-        dst2 = (u32 __iomem *) dst1;
+        dst2 = (u32 FB_MEM *) dst1;
         fgcolor <<= FB_LEFT_POS(p, bpp);
         bgcolor <<= FB_LEFT_POS(p, bpp);
 
@@ -157,7 +150,7 @@ static inline void slow_imageblit(const struct fb_image *image, struct fb_info *
                 shift = val = 0;
                 l = 8;
                 j = image->width;
-                dst = (u32 __iomem *) dst1;
+                dst = (u32 FB_MEM *) dst1;
                 s = src;
 
                 /* write leading bits */
@@ -196,7 +189,7 @@ static inline void slow_imageblit(const struct fb_image *image, struct fb_info *
                 src += spitch;
                 if (pitch_index) {
                         dst2 += pitch;
-                        dst1 = (u8 __iomem *)((long __force)dst2 & ~(sizeof(u32) - 1));
+                        dst1 = (u8 FB_MEM *)((long __force)dst2 & ~(sizeof(u32) - 1));
                         start_index += pitch_index;
                         start_index &= 32 - 1;
                 }
@@ -213,14 +206,14 @@ static inline void slow_imageblit(const struct fb_image *image, struct fb_info *
  *           beginning and end of a scanline is dword aligned
  */
 static inline void fast_imageblit(const struct fb_image *image, struct fb_info *p,
-                                  u8 __iomem *dst1, u32 fgcolor,
+                                  u8 FB_MEM *dst1, u32 fgcolor,
                                   u32 bgcolor)
 {
         u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
         u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
         u32 bit_mask, eorx, shift;
-        const char *s = image->data, *src;
-        u32 __iomem *dst;
+        const u8 *s = image->data, *src;
+        u32 FB_MEM *dst;
         const u32 *tab = NULL;
         size_t tablen;
         u32 colortab[16];
@@ -258,7 +251,7 @@ static inline void fast_imageblit(const struct fb_image *image, struct fb_info *
                 colortab[i] = (tab[i] & eorx) ^ bgx;
 
         for (i = image->height; i--; ) {
-                dst = (u32 __iomem *)dst1;
+                dst = (u32 FB_MEM *)dst1;
                 shift = 8;
                 src = s;
 
@@ -315,19 +308,20 @@ static inline void fast_imageblit(const struct fb_image *image, struct fb_info *
         }
 }
 
-void cfb_imageblit(struct fb_info *p, const struct fb_image *image)
+void FB_IMAGEBLIT (struct fb_info *p, const struct fb_image *image)
 {
         u32 fgcolor, bgcolor, start_index, bitstart, pitch_index = 0;
         u32 bpl = sizeof(u32), bpp = p->var.bits_per_pixel;
         u32 width = image->width;
         u32 dx = image->dx, dy = image->dy;
-        u8 __iomem *dst1;
+        u8 FB_MEM *dst1;
 
         if (p->state != FBINFO_STATE_RUNNING)
                 return;
 
-        if (p->flags & FBINFO_VIRTFB)
-                fb_warn_once(p, "Framebuffer is not in I/O address space.");
+        if ((p->flags & FBINFO_VIRTFB) != FB_SPACE)
+                fb_warn_once(p, "Framebuffer is not in " FB_SPACE_NAME
+                             " address space.");
 
         bitstart = (dy * p->fix.line_length * 8) + (dx * bpp);
         start_index = bitstart & (32 - 1);
@@ -335,7 +329,7 @@ void cfb_imageblit(struct fb_info *p, const struct fb_image *image)
 
         bitstart /= 8;
         bitstart &= ~(bpl - 1);
-        dst1 = p->screen_base + bitstart;
+        dst1 = (void __force *)FB_SCREEN_BASE(p) + bitstart;
 
         if (p->fbops->fb_sync)
                 p->fbops->fb_sync(p);
@@ -360,9 +354,3 @@ void cfb_imageblit(struct fb_info *p, const struct fb_image *image)
         } else
                 color_imageblit(image, p, dst1, start_index, pitch_index);
 }
-
-EXPORT_SYMBOL(cfb_imageblit);
-
-MODULE_AUTHOR("James Simmons <jsimmons@users.sf.net>");
-MODULE_DESCRIPTION("Generic software accelerated imaging drawing");
-MODULE_LICENSE("GPL");
-- 
2.30.2
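
The reworked address-space check in FB_IMAGEBLIT can be read as a single comparison that serves both builds. The sketch below is only an illustration under stated assumptions: `DEMO_FBINFO_VIRTFB` is a stand-in value rather than the definition from <linux/fb.h>, and `demo_space_mismatch()` is a hypothetical helper that does not exist in the kernel. It assumes the I/O build passes 0 as the expected space and the system-memory build passes the flag itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel flag; the value is assumed for illustration only. */
#define DEMO_FBINFO_VIRTFB 0x0004u

/*
 * Hypothetical helper: returns true when the framebuffer's address space
 * differs from the one the drawing routine was built for.
 * expected_space is 0 (I/O memory) or DEMO_FBINFO_VIRTFB (system memory).
 */
static bool demo_space_mismatch(uint32_t flags, uint32_t expected_space)
{
	/* Only the two values named above make sense here. */
	assert((expected_space & ~DEMO_FBINFO_VIRTFB) == 0);

	return (flags & DEMO_FBINFO_VIRTFB) != expected_space;
}
```

With an expected space of 0 the comparison warns when the flag is set, and with the flag as the expected space it warns when the flag is clear, which matches the two separate checks the old cfb and sys files carried.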



* [PATCH RESEND 11/13] fbdev: core: Use generic imageblit for as cfb_imageblit
  2025-02-07  4:18 [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops Zsolt Kajtar
                   ` (9 preceding siblings ...)
  2025-02-07  4:18 ` [PATCH RESEND 10/13] fbdev: core: Make fb_imageblit generic Zsolt Kajtar
@ 2025-02-07  4:18 ` Zsolt Kajtar
  2025-02-07  4:18 ` [PATCH RESEND 12/13] fbdev: core: Use generic imageblit for as sys_imageblit Zsolt Kajtar
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Zsolt Kajtar @ 2025-02-07  4:18 UTC (permalink / raw)
  To: linux-fbdev, dri-devel; +Cc: Zsolt Kajtar

Signed-off-by: Zsolt Kajtar <soci@c64.rulez.org>
---
 drivers/video/fbdev/core/cfbimgblt.c | 357 +--------------------------
 1 file changed, 8 insertions(+), 349 deletions(-)

diff --git a/drivers/video/fbdev/core/cfbimgblt.c b/drivers/video/fbdev/core/cfbimgblt.c
index 7d1d2f1a6..a5bb63913 100644
--- a/drivers/video/fbdev/core/cfbimgblt.c
+++ b/drivers/video/fbdev/core/cfbimgblt.c
@@ -7,363 +7,22 @@
  *  License.  See the file COPYING in the main directory of this archive for
  *  more details.
  *
- * NOTES:
- *
- *    This function copys a image from system memory to video memory. The
- *  image can be a bitmap where each 0 represents the background color and
- *  each 1 represents the foreground color. Great for font handling. It can
- *  also be a color image. This is determined by image_depth. The color image
- *  must be laid out exactly in the same format as the framebuffer. Yes I know
- *  their are cards with hardware that coverts images of various depths to the
- *  framebuffer depth. But not every card has this. All images must be rounded
- *  up to the nearest byte. For example a bitmap 12 bits wide must be two
- *  bytes width.
- *
- *  Tony:
- *  Incorporate mask tables similar to fbcon-cfb*.c in 2.4 API.  This speeds
- *  up the code significantly.
- *
- *  Code for depths not multiples of BITS_PER_LONG is still kludgy, which is
- *  still processed a bit at a time.
- *
- *  Also need to add code to deal with cards endians that are different than
- *  the native cpu endians. I also need to deal with MSB position in the word.
  */
 #include <linux/module.h>
-#include <linux/string.h>
 #include <linux/fb.h>
 #include <asm/types.h>
-#include "fb_draw.h"
-
-#define DEBUG
-
-#ifdef DEBUG
-#define DPRINTK(fmt, args...) printk(KERN_DEBUG "%s: " fmt,__func__,## args)
-#else
-#define DPRINTK(fmt, args...)
-#endif
-
-static const u32 cfb_tab8_be[] = {
-    0x00000000,0x000000ff,0x0000ff00,0x0000ffff,
-    0x00ff0000,0x00ff00ff,0x00ffff00,0x00ffffff,
-    0xff000000,0xff0000ff,0xff00ff00,0xff00ffff,
-    0xffff0000,0xffff00ff,0xffffff00,0xffffffff
-};
-
-static const u32 cfb_tab8_le[] = {
-    0x00000000,0xff000000,0x00ff0000,0xffff0000,
-    0x0000ff00,0xff00ff00,0x00ffff00,0xffffff00,
-    0x000000ff,0xff0000ff,0x00ff00ff,0xffff00ff,
-    0x0000ffff,0xff00ffff,0x00ffffff,0xffffffff
-};
-
-static const u32 cfb_tab16_be[] = {
-    0x00000000, 0x0000ffff, 0xffff0000, 0xffffffff
-};
-
-static const u32 cfb_tab16_le[] = {
-    0x00000000, 0xffff0000, 0x0000ffff, 0xffffffff
-};
-
-static const u32 cfb_tab32[] = {
-	0x00000000, 0xffffffff
-};
-
-#define FB_WRITEL fb_writel
-#define FB_READL  fb_readl
-
-static inline void color_imageblit(const struct fb_image *image,
-				   struct fb_info *p, u8 __iomem *dst1,
-				   u32 start_index,
-				   u32 pitch_index)
-{
-	/* Draw the penguin */
-	u32 __iomem *dst, *dst2;
-	u32 color = 0, val, shift;
-	int i, n, bpp = p->var.bits_per_pixel;
-	u32 null_bits = 32 - bpp;
-	u32 *palette = (u32 *) p->pseudo_palette;
-	const u8 *src = image->data;
-	u32 bswapmask = fb_compute_bswapmask(p);
-
-	dst2 = (u32 __iomem *) dst1;
-	for (i = image->height; i--; ) {
-		n = image->width;
-		dst = (u32 __iomem *) dst1;
-		shift = 0;
-		val = 0;
-
-		if (start_index) {
-			u32 start_mask = ~fb_shifted_pixels_mask_u32(p,
-						start_index, bswapmask);
-			val = FB_READL(dst) & start_mask;
-			shift = start_index;
-		}
-		while (n--) {
-			if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
-			    p->fix.visual == FB_VISUAL_DIRECTCOLOR )
-				color = palette[*src];
-			else
-				color = *src;
-			color <<= FB_LEFT_POS(p, bpp);
-			val |= FB_SHIFT_HIGH(p, color, shift ^ bswapmask);
-			if (shift >= null_bits) {
-				FB_WRITEL(val, dst++);
-
-				val = (shift == null_bits) ? 0 :
-					FB_SHIFT_LOW(p, color, 32 - shift);
-			}
-			shift += bpp;
-			shift &= (32 - 1);
-			src++;
-		}
-		if (shift) {
-			u32 end_mask = fb_shifted_pixels_mask_u32(p, shift,
-						bswapmask);
-
-			FB_WRITEL((FB_READL(dst) & end_mask) | val, dst);
-		}
-		dst1 += p->fix.line_length;
-		if (pitch_index) {
-			dst2 += p->fix.line_length;
-			dst1 = (u8 __iomem *)((long __force)dst2 & ~(sizeof(u32) - 1));
-
-			start_index += pitch_index;
-			start_index &= 32 - 1;
-		}
-	}
-}
-
-static inline void slow_imageblit(const struct fb_image *image, struct fb_info *p,
-				  u8 __iomem *dst1, u32 fgcolor,
-				  u32 bgcolor,
-				  u32 start_index,
-				  u32 pitch_index)
-{
-	u32 shift, color = 0, bpp = p->var.bits_per_pixel;
-	u32 __iomem *dst, *dst2;
-	u32 val, pitch = p->fix.line_length;
-	u32 null_bits = 32 - bpp;
-	u32 spitch = (image->width+7)/8;
-	const u8 *src = image->data, *s;
-	u32 i, j, l;
-	u32 bswapmask = fb_compute_bswapmask(p);
-
-	dst2 = (u32 __iomem *) dst1;
-	fgcolor <<= FB_LEFT_POS(p, bpp);
-	bgcolor <<= FB_LEFT_POS(p, bpp);
-
-	for (i = image->height; i--; ) {
-		shift = val = 0;
-		l = 8;
-		j = image->width;
-		dst = (u32 __iomem *) dst1;
-		s = src;
-
-		/* write leading bits */
-		if (start_index) {
-			u32 start_mask = ~fb_shifted_pixels_mask_u32(p,
-						start_index, bswapmask);
-			val = FB_READL(dst) & start_mask;
-			shift = start_index;
-		}
-
-		while (j--) {
-			l--;
-			color = (*s & (1 << l)) ? fgcolor : bgcolor;
-			val |= FB_SHIFT_HIGH(p, color, shift ^ bswapmask);
-
-			/* Did the bitshift spill bits to the next long? */
-			if (shift >= null_bits) {
-				FB_WRITEL(val, dst++);
-				val = (shift == null_bits) ? 0 :
-					FB_SHIFT_LOW(p, color, 32 - shift);
-			}
-			shift += bpp;
-			shift &= (32 - 1);
-			if (!l) { l = 8; s++; }
-		}
 
-		/* write trailing bits */
- 		if (shift) {
-			u32 end_mask = fb_shifted_pixels_mask_u32(p, shift,
-						bswapmask);
-
-			FB_WRITEL((FB_READL(dst) & end_mask) | val, dst);
-		}
-
-		dst1 += pitch;
-		src += spitch;
-		if (pitch_index) {
-			dst2 += pitch;
-			dst1 = (u8 __iomem *)((long __force)dst2 & ~(sizeof(u32) - 1));
-			start_index += pitch_index;
-			start_index &= 32 - 1;
-		}
-
-	}
-}
-
-/*
- * fast_imageblit - optimized monochrome color expansion
- *
- * Only if:  bits_per_pixel == 8, 16, or 32
- *           image->width is divisible by pixel/dword (ppw);
- *           fix->line_legth is divisible by 4;
- *           beginning and end of a scanline is dword aligned
- */
-static inline void fast_imageblit(const struct fb_image *image, struct fb_info *p,
-				  u8 __iomem *dst1, u32 fgcolor,
-				  u32 bgcolor)
-{
-	u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
-	u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
-	u32 bit_mask, eorx, shift;
-	const char *s = image->data, *src;
-	u32 __iomem *dst;
-	const u32 *tab = NULL;
-	size_t tablen;
-	u32 colortab[16];
-	int i, j, k;
-
-	switch (bpp) {
-	case 8:
-		tab = fb_be_math(p) ? cfb_tab8_be : cfb_tab8_le;
-		tablen = 16;
-		break;
-	case 16:
-		tab = fb_be_math(p) ? cfb_tab16_be : cfb_tab16_le;
-		tablen = 4;
-		break;
-	case 32:
-		tab = cfb_tab32;
-		tablen = 2;
-		break;
-	default:
-		return;
-	}
-
-	for (i = ppw-1; i--; ) {
-		fgx <<= bpp;
-		bgx <<= bpp;
-		fgx |= fgcolor;
-		bgx |= bgcolor;
-	}
-
-	bit_mask = (1 << ppw) - 1;
-	eorx = fgx ^ bgx;
-	k = image->width/ppw;
-
-	for (i = 0; i < tablen; ++i)
-		colortab[i] = (tab[i] & eorx) ^ bgx;
-
-	for (i = image->height; i--; ) {
-		dst = (u32 __iomem *)dst1;
-		shift = 8;
-		src = s;
-
-		/*
-		 * Manually unroll the per-line copying loop for better
-		 * performance. This works until we processed the last
-		 * completely filled source byte (inclusive).
-		 */
-		switch (ppw) {
-		case 4: /* 8 bpp */
-			for (j = k; j >= 2; j -= 2, ++src) {
-				FB_WRITEL(colortab[(*src >> 4) & bit_mask], dst++);
-				FB_WRITEL(colortab[(*src >> 0) & bit_mask], dst++);
-			}
-			break;
-		case 2: /* 16 bpp */
-			for (j = k; j >= 4; j -= 4, ++src) {
-				FB_WRITEL(colortab[(*src >> 6) & bit_mask], dst++);
-				FB_WRITEL(colortab[(*src >> 4) & bit_mask], dst++);
-				FB_WRITEL(colortab[(*src >> 2) & bit_mask], dst++);
-				FB_WRITEL(colortab[(*src >> 0) & bit_mask], dst++);
-			}
-			break;
-		case 1: /* 32 bpp */
-			for (j = k; j >= 8; j -= 8, ++src) {
-				FB_WRITEL(colortab[(*src >> 7) & bit_mask], dst++);
-				FB_WRITEL(colortab[(*src >> 6) & bit_mask], dst++);
-				FB_WRITEL(colortab[(*src >> 5) & bit_mask], dst++);
-				FB_WRITEL(colortab[(*src >> 4) & bit_mask], dst++);
-				FB_WRITEL(colortab[(*src >> 3) & bit_mask], dst++);
-				FB_WRITEL(colortab[(*src >> 2) & bit_mask], dst++);
-				FB_WRITEL(colortab[(*src >> 1) & bit_mask], dst++);
-				FB_WRITEL(colortab[(*src >> 0) & bit_mask], dst++);
-			}
-			break;
-		}
-
-		/*
-		 * For image widths that are not a multiple of 8, there
-		 * are trailing pixels left on the current line. Print
-		 * them as well.
-		 */
-		for (; j--; ) {
-			shift -= ppw;
-			FB_WRITEL(colortab[(*src >> shift) & bit_mask], dst++);
-			if (!shift) {
-				shift = 8;
-				++src;
-			}
-		}
-
-		dst1 += p->fix.line_length;
-		s += spitch;
-	}
-}
-
-void cfb_imageblit(struct fb_info *p, const struct fb_image *image)
-{
-	u32 fgcolor, bgcolor, start_index, bitstart, pitch_index = 0;
-	u32 bpl = sizeof(u32), bpp = p->var.bits_per_pixel;
-	u32 width = image->width;
-	u32 dx = image->dx, dy = image->dy;
-	u8 __iomem *dst1;
-
-	if (p->state != FBINFO_STATE_RUNNING)
-		return;
-
-	if (p->flags & FBINFO_VIRTFB)
-		fb_warn_once(p, "Framebuffer is not in I/O address space.");
-
-	bitstart = (dy * p->fix.line_length * 8) + (dx * bpp);
-	start_index = bitstart & (32 - 1);
-	pitch_index = (p->fix.line_length & (bpl - 1)) * 8;
-
-	bitstart /= 8;
-	bitstart &= ~(bpl - 1);
-	dst1 = p->screen_base + bitstart;
-
-	if (p->fbops->fb_sync)
-		p->fbops->fb_sync(p);
-
-	if (image->depth == 1) {
-		if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
-		    p->fix.visual == FB_VISUAL_DIRECTCOLOR) {
-			fgcolor = ((u32*)(p->pseudo_palette))[image->fg_color];
-			bgcolor = ((u32*)(p->pseudo_palette))[image->bg_color];
-		} else {
-			fgcolor = image->fg_color;
-			bgcolor = image->bg_color;
-		}
-
-		if (32 % bpp == 0 && !start_index && !pitch_index &&
-		    ((width & (32/bpp-1)) == 0) &&
-		    bpp >= 8 && bpp <= 32)
-			fast_imageblit(image, p, dst1, fgcolor, bgcolor);
-		else
-			slow_imageblit(image, p, dst1, fgcolor, bgcolor,
-					start_index, pitch_index);
-	} else
-		color_imageblit(image, p, dst1, start_index, pitch_index);
-}
+#define FB_WRITEL         fb_writel
+#define FB_READL          fb_readl
+#define FB_MEM            __iomem
+#define FB_IMAGEBLIT      cfb_imageblit
+#define FB_SPACE          0
+#define FB_SPACE_NAME     "I/O"
+#define FB_SCREEN_BASE(a) ((a)->screen_base)
+#include "fb_imageblit.h"
 
 EXPORT_SYMBOL(cfb_imageblit);
 
 MODULE_AUTHOR("James Simmons <jsimmons@users.sf.net>");
 MODULE_DESCRIPTION("Generic software accelerated imaging drawing");
 MODULE_LICENSE("GPL");
-
-- 
2.30.2



* [PATCH RESEND 12/13] fbdev: core: Use generic imageblit for as sys_imageblit
  2025-02-07  4:18 [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops Zsolt Kajtar
                   ` (10 preceding siblings ...)
  2025-02-07  4:18 ` [PATCH RESEND 11/13] fbdev: core: Use generic imageblit for as cfb_imageblit Zsolt Kajtar
@ 2025-02-07  4:18 ` Zsolt Kajtar
  2025-02-07  4:18 ` [PATCH RESEND 13/13] fbdev: core: Split CFB and SYS pixel reversing configuration Zsolt Kajtar
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Zsolt Kajtar @ 2025-02-07  4:18 UTC (permalink / raw)
  To: linux-fbdev, dri-devel; +Cc: Zsolt Kajtar

Signed-off-by: Zsolt Kajtar <soci@c64.rulez.org>
---
 drivers/video/fbdev/core/sysimgblt.c | 325 +--------------------------
 1 file changed, 8 insertions(+), 317 deletions(-)

diff --git a/drivers/video/fbdev/core/sysimgblt.c b/drivers/video/fbdev/core/sysimgblt.c
index 6949bbd51..6e60e3486 100644
--- a/drivers/video/fbdev/core/sysimgblt.c
+++ b/drivers/video/fbdev/core/sysimgblt.c
@@ -11,329 +11,20 @@
  *  more details.
  */
 #include <linux/module.h>
-#include <linux/string.h>
 #include <linux/fb.h>
 #include <asm/types.h>
 
-#define DEBUG
-
-#ifdef DEBUG
-#define DPRINTK(fmt, args...) printk(KERN_DEBUG "%s: " fmt,__func__,## args)
-#else
-#define DPRINTK(fmt, args...)
-#endif
-
-static const u32 cfb_tab8_be[] = {
-    0x00000000,0x000000ff,0x0000ff00,0x0000ffff,
-    0x00ff0000,0x00ff00ff,0x00ffff00,0x00ffffff,
-    0xff000000,0xff0000ff,0xff00ff00,0xff00ffff,
-    0xffff0000,0xffff00ff,0xffffff00,0xffffffff
-};
-
-static const u32 cfb_tab8_le[] = {
-    0x00000000,0xff000000,0x00ff0000,0xffff0000,
-    0x0000ff00,0xff00ff00,0x00ffff00,0xffffff00,
-    0x000000ff,0xff0000ff,0x00ff00ff,0xffff00ff,
-    0x0000ffff,0xff00ffff,0x00ffffff,0xffffffff
-};
-
-static const u32 cfb_tab16_be[] = {
-    0x00000000, 0x0000ffff, 0xffff0000, 0xffffffff
-};
-
-static const u32 cfb_tab16_le[] = {
-    0x00000000, 0xffff0000, 0x0000ffff, 0xffffffff
-};
-
-static const u32 cfb_tab32[] = {
-	0x00000000, 0xffffffff
-};
-
-static void color_imageblit(const struct fb_image *image, struct fb_info *p,
-			    void *dst1, u32 start_index, u32 pitch_index)
-{
-	/* Draw the penguin */
-	u32 *dst, *dst2;
-	u32 color = 0, val, shift;
-	int i, n, bpp = p->var.bits_per_pixel;
-	u32 null_bits = 32 - bpp;
-	u32 *palette = (u32 *) p->pseudo_palette;
-	const u8 *src = image->data;
-
-	dst2 = dst1;
-	for (i = image->height; i--; ) {
-		n = image->width;
-		dst = dst1;
-		shift = 0;
-		val = 0;
-
-		if (start_index) {
-			u32 start_mask = ~(FB_SHIFT_HIGH(p, ~(u32)0,
-							 start_index));
-			val = *dst & start_mask;
-			shift = start_index;
-		}
-		while (n--) {
-			if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
-			    p->fix.visual == FB_VISUAL_DIRECTCOLOR )
-				color = palette[*src];
-			else
-				color = *src;
-			color <<= FB_LEFT_POS(p, bpp);
-			val |= FB_SHIFT_HIGH(p, color, shift);
-			if (shift >= null_bits) {
-				*dst++ = val;
-
-				val = (shift == null_bits) ? 0 :
-					FB_SHIFT_LOW(p, color, 32 - shift);
-			}
-			shift += bpp;
-			shift &= (32 - 1);
-			src++;
-		}
-		if (shift) {
-			u32 end_mask = FB_SHIFT_HIGH(p, ~(u32)0, shift);
-
-			*dst &= end_mask;
-			*dst |= val;
-		}
-		dst1 += p->fix.line_length;
-		if (pitch_index) {
-			dst2 += p->fix.line_length;
-			dst1 = (u8 *)((long)dst2 & ~(sizeof(u32) - 1));
-
-			start_index += pitch_index;
-			start_index &= 32 - 1;
-		}
-	}
-}
-
-static void slow_imageblit(const struct fb_image *image, struct fb_info *p,
-				  void *dst1, u32 fgcolor, u32 bgcolor,
-				  u32 start_index, u32 pitch_index)
-{
-	u32 shift, color = 0, bpp = p->var.bits_per_pixel;
-	u32 *dst, *dst2;
-	u32 val, pitch = p->fix.line_length;
-	u32 null_bits = 32 - bpp;
-	u32 spitch = (image->width+7)/8;
-	const u8 *src = image->data, *s;
-	u32 i, j, l;
-
-	dst2 = dst1;
-	fgcolor <<= FB_LEFT_POS(p, bpp);
-	bgcolor <<= FB_LEFT_POS(p, bpp);
-
-	for (i = image->height; i--; ) {
-		shift = val = 0;
-		l = 8;
-		j = image->width;
-		dst = dst1;
-		s = src;
-
-		/* write leading bits */
-		if (start_index) {
-			u32 start_mask = ~(FB_SHIFT_HIGH(p, ~(u32)0,
-							 start_index));
-			val = *dst & start_mask;
-			shift = start_index;
-		}
-
-		while (j--) {
-			l--;
-			color = (*s & (1 << l)) ? fgcolor : bgcolor;
-			val |= FB_SHIFT_HIGH(p, color, shift);
-
-			/* Did the bitshift spill bits to the next long? */
-			if (shift >= null_bits) {
-				*dst++ = val;
-				val = (shift == null_bits) ? 0 :
-					FB_SHIFT_LOW(p, color, 32 - shift);
-			}
-			shift += bpp;
-			shift &= (32 - 1);
-			if (!l) { l = 8; s++; }
-		}
-
-		/* write trailing bits */
- 		if (shift) {
-			u32 end_mask = FB_SHIFT_HIGH(p, ~(u32)0, shift);
-
-			*dst &= end_mask;
-			*dst |= val;
-		}
-
-		dst1 += pitch;
-		src += spitch;
-		if (pitch_index) {
-			dst2 += pitch;
-			dst1 = (u8 *)((long)dst2 & ~(sizeof(u32) - 1));
-			start_index += pitch_index;
-			start_index &= 32 - 1;
-		}
-
-	}
-}
-
-/*
- * fast_imageblit - optimized monochrome color expansion
- *
- * Only if:  bits_per_pixel == 8, 16, or 32
- *           image->width is divisible by pixel/dword (ppw);
- *           fix->line_legth is divisible by 4;
- *           beginning and end of a scanline is dword aligned
- */
-static void fast_imageblit(const struct fb_image *image, struct fb_info *p,
-				  void *dst1, u32 fgcolor, u32 bgcolor)
-{
-	u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
-	u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
-	u32 bit_mask, eorx, shift;
-	const u8 *s = image->data, *src;
-	u32 *dst;
-	const u32 *tab;
-	size_t tablen;
-	u32 colortab[16];
-	int i, j, k;
-
-	switch (bpp) {
-	case 8:
-		tab = fb_be_math(p) ? cfb_tab8_be : cfb_tab8_le;
-		tablen = 16;
-		break;
-	case 16:
-		tab = fb_be_math(p) ? cfb_tab16_be : cfb_tab16_le;
-		tablen = 4;
-		break;
-	case 32:
-		tab = cfb_tab32;
-		tablen = 2;
-		break;
-	default:
-		return;
-	}
-
-	for (i = ppw-1; i--; ) {
-		fgx <<= bpp;
-		bgx <<= bpp;
-		fgx |= fgcolor;
-		bgx |= bgcolor;
-	}
-
-	bit_mask = (1 << ppw) - 1;
-	eorx = fgx ^ bgx;
-	k = image->width/ppw;
-
-	for (i = 0; i < tablen; ++i)
-		colortab[i] = (tab[i] & eorx) ^ bgx;
-
-	for (i = image->height; i--; ) {
-		dst = dst1;
-		shift = 8;
-		src = s;
-
-		/*
-		 * Manually unroll the per-line copying loop for better
-		 * performance. This works until we processed the last
-		 * completely filled source byte (inclusive).
-		 */
-		switch (ppw) {
-		case 4: /* 8 bpp */
-			for (j = k; j >= 2; j -= 2, ++src) {
-				*dst++ = colortab[(*src >> 4) & bit_mask];
-				*dst++ = colortab[(*src >> 0) & bit_mask];
-			}
-			break;
-		case 2: /* 16 bpp */
-			for (j = k; j >= 4; j -= 4, ++src) {
-				*dst++ = colortab[(*src >> 6) & bit_mask];
-				*dst++ = colortab[(*src >> 4) & bit_mask];
-				*dst++ = colortab[(*src >> 2) & bit_mask];
-				*dst++ = colortab[(*src >> 0) & bit_mask];
-			}
-			break;
-		case 1: /* 32 bpp */
-			for (j = k; j >= 8; j -= 8, ++src) {
-				*dst++ = colortab[(*src >> 7) & bit_mask];
-				*dst++ = colortab[(*src >> 6) & bit_mask];
-				*dst++ = colortab[(*src >> 5) & bit_mask];
-				*dst++ = colortab[(*src >> 4) & bit_mask];
-				*dst++ = colortab[(*src >> 3) & bit_mask];
-				*dst++ = colortab[(*src >> 2) & bit_mask];
-				*dst++ = colortab[(*src >> 1) & bit_mask];
-				*dst++ = colortab[(*src >> 0) & bit_mask];
-			}
-			break;
-		}
-
-		/*
-		 * For image widths that are not a multiple of 8, there
-		 * are trailing pixels left on the current line. Print
-		 * them as well.
-		 */
-		for (; j--; ) {
-			shift -= ppw;
-			*dst++ = colortab[(*src >> shift) & bit_mask];
-			if (!shift) {
-				shift = 8;
-				++src;
-			}
-		}
-
-		dst1 += p->fix.line_length;
-		s += spitch;
-	}
-}
-
-void sys_imageblit(struct fb_info *p, const struct fb_image *image)
-{
-	u32 fgcolor, bgcolor, start_index, bitstart, pitch_index = 0;
-	u32 bpl = sizeof(u32), bpp = p->var.bits_per_pixel;
-	u32 width = image->width;
-	u32 dx = image->dx, dy = image->dy;
-	void *dst1;
-
-	if (p->state != FBINFO_STATE_RUNNING)
-		return;
-
-	if (!(p->flags & FBINFO_VIRTFB))
-		fb_warn_once(p, "Framebuffer is not in virtual address space.");
-
-	bitstart = (dy * p->fix.line_length * 8) + (dx * bpp);
-	start_index = bitstart & (32 - 1);
-	pitch_index = (p->fix.line_length & (bpl - 1)) * 8;
-
-	bitstart /= 8;
-	bitstart &= ~(bpl - 1);
-	dst1 = (void __force *)p->screen_base + bitstart;
-
-	if (p->fbops->fb_sync)
-		p->fbops->fb_sync(p);
-
-	if (image->depth == 1) {
-		if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
-		    p->fix.visual == FB_VISUAL_DIRECTCOLOR) {
-			fgcolor = ((u32*)(p->pseudo_palette))[image->fg_color];
-			bgcolor = ((u32*)(p->pseudo_palette))[image->bg_color];
-		} else {
-			fgcolor = image->fg_color;
-			bgcolor = image->bg_color;
-		}
-
-		if (32 % bpp == 0 && !start_index && !pitch_index &&
-		    ((width & (32/bpp-1)) == 0) &&
-		    bpp >= 8 && bpp <= 32)
-			fast_imageblit(image, p, dst1, fgcolor, bgcolor);
-		else
-			slow_imageblit(image, p, dst1, fgcolor, bgcolor,
-					start_index, pitch_index);
-	} else
-		color_imageblit(image, p, dst1, start_index, pitch_index);
-}
+#define FB_READL(a)       (*(a))
+#define FB_WRITEL(a,b)    do { *(b) = (a); } while (false)
+#define FB_MEM            /* nothing */
+#define FB_IMAGEBLIT      sys_imageblit
+#define FB_SPACE          FBINFO_VIRTFB
+#define FB_SPACE_NAME     "virtual"
+#define FB_SCREEN_BASE(a) ((a)->screen_buffer)
+#include "fb_imageblit.h"
 
 EXPORT_SYMBOL(sys_imageblit);
 
 MODULE_AUTHOR("Antonino Daplas <adaplas@pol.net>");
 MODULE_DESCRIPTION("1-bit/8-bit to 1-32 bit color expansion (sys-to-sys)");
 MODULE_LICENSE("GPL");
-
-- 
2.30.2
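
The accessor macros are what let one body of drawing code serve both memory types. As a rough user-space sketch of the system-memory flavour (not the kernel code itself): the three macros below follow the sysimgblt.c definitions, with the macro argument parenthesised here, while `demo_merge_word()` is a hypothetical helper that reproduces the read-modify-write shape the blit routines use for trailing bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* System-memory accessors: plain loads and stores through a pointer. */
#define FB_READL(a)     (*(a))
#define FB_WRITEL(a, b) do { *(b) = (a); } while (false)
#define FB_MEM          /* nothing: no address-space annotation needed */

/*
 * Hypothetical helper: keep the bits of *dst selected by keep_mask and
 * OR in val, written purely in terms of the accessor macros so the same
 * text could be compiled against a different set of definitions.
 */
static void demo_merge_word(uint32_t FB_MEM *dst, uint32_t keep_mask,
			    uint32_t val)
{
	assert(dst != NULL);

	FB_WRITEL((FB_READL(dst) & keep_mask) | val, dst);
}
```

In the I/O build the same function text would see FB_READL and FB_WRITEL mapped to fb_readl and fb_writel and FB_MEM mapped to __iomem, so only the macro definitions ahead of the include differ between the two files.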



* [PATCH RESEND 13/13] fbdev: core: Split CFB and SYS pixel reversing configuration
  2025-02-07  4:18 [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops Zsolt Kajtar
                   ` (11 preceding siblings ...)
  2025-02-07  4:18 ` [PATCH RESEND 12/13] fbdev: core: Use generic imageblit for as sys_imageblit Zsolt Kajtar
@ 2025-02-07  4:18 ` Zsolt Kajtar
  2025-02-07  7:18 ` [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops Helge Deller
  2025-02-07  8:12 ` Thomas Zimmermann
  14 siblings, 0 replies; 19+ messages in thread
From: Zsolt Kajtar @ 2025-02-07  4:18 UTC (permalink / raw)
  To: linux-fbdev, dri-devel; +Cc: Zsolt Kajtar

Signed-off-by: Zsolt Kajtar <soci@c64.rulez.org>
---
 drivers/video/fbdev/core/Kconfig       | 10 +++++++++-
 drivers/video/fbdev/core/cfbcopyarea.c |  1 +
 drivers/video/fbdev/core/cfbfillrect.c |  1 +
 drivers/video/fbdev/core/cfbimgblt.c   |  1 +
 drivers/video/fbdev/core/fb_draw.h     |  6 +++---
 drivers/video/fbdev/core/syscopyarea.c |  1 +
 drivers/video/fbdev/core/sysfillrect.c |  1 +
 drivers/video/fbdev/core/sysimgblt.c   |  1 +
 8 files changed, 18 insertions(+), 4 deletions(-)
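
One detail of the FB_REV_PIXELS_IN_BYTE indirection may be worth checking: in C, `#ifdef` only asks whether the name itself is a defined macro, not whether its expansion is defined. The stand-alone sketch below uses hypothetical `DEMO_*` names (none of them are kernel symbols) to show how `#ifdef` and `#if` differ on such a wrapper. Whether this matters for the series depends on how the headers are meant to be used, so treat it as an illustration rather than a verdict.

```c
#include <assert.h>

/*
 * DEMO_CONFIG_OPTION is deliberately left undefined, standing in for a
 * configuration symbol that is switched off.
 */
#define DEMO_FEATURE DEMO_CONFIG_OPTION

#ifdef DEMO_FEATURE
#define DEMO_IFDEF_TAKEN 1	/* reached: DEMO_FEATURE itself is defined */
#else
#define DEMO_IFDEF_TAKEN 0
#endif

#if DEMO_FEATURE
#define DEMO_IF_TAKEN 1
#else
#define DEMO_IF_TAKEN 0		/* reached: an undefined identifier counts as 0 */
#endif

/* Records what the preprocessor decided for the two conditionals above. */
static void demo_check(void)
{
	assert(DEMO_IFDEF_TAKEN == 1);
	assert(DEMO_IF_TAKEN == 0);
}
```

In this sketch the `#ifdef` branch is taken even though the underlying option is absent, while the `#if` form follows the value of the option. A guard such as `#ifdef DEMO_CONFIG_OPTION` around the wrapper definition would be another way to make the two agree.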

diff --git a/drivers/video/fbdev/core/Kconfig b/drivers/video/fbdev/core/Kconfig
index d554d8c54..05aa9b42a 100644
--- a/drivers/video/fbdev/core/Kconfig
+++ b/drivers/video/fbdev/core/Kconfig
@@ -69,7 +69,7 @@ config FB_CFB_REV_PIXELS_IN_BYTE
 	bool
 	depends on FB_CORE
 	help
-	  Allow generic frame-buffer functions to work on displays with 1, 2
+	  Allow I/O memory frame-buffer functions to work on displays with 1, 2
 	  and 4 bits per pixel depths which has opposite order of pixels in
 	  byte order to bytes in long order.
 
@@ -97,6 +97,14 @@ config FB_SYS_IMAGEBLIT
 	  blitting. This is used by drivers that don't provide their own
 	  (accelerated) version and the framebuffer is in system RAM.
 
+config FB_SYS_REV_PIXELS_IN_BYTE
+	bool
+	depends on FB_CORE
+	help
+	  Allow SYS memory frame-buffer functions to work on displays with 1, 2
+	  and 4 bits per pixel depths which have opposite order of pixels in
+	  byte order to bytes in long order.
+
 config FB_PROVIDE_GET_FB_UNMAPPED_AREA
 	bool
 	depends on FB
diff --git a/drivers/video/fbdev/core/cfbcopyarea.c b/drivers/video/fbdev/core/cfbcopyarea.c
index ba0ebd115..85c406125 100644
--- a/drivers/video/fbdev/core/cfbcopyarea.c
+++ b/drivers/video/fbdev/core/cfbcopyarea.c
@@ -25,6 +25,7 @@
 #define FB_SPACE          0
 #define FB_SPACE_NAME     "I/O"
 #define FB_SCREEN_BASE(a) ((a)->screen_base)
+#define FB_REV_PIXELS_IN_BYTE CONFIG_FB_CFB_REV_PIXELS_IN_BYTE
 #include "fb_copyarea.h"
 
 EXPORT_SYMBOL(cfb_copyarea);
diff --git a/drivers/video/fbdev/core/cfbfillrect.c b/drivers/video/fbdev/core/cfbfillrect.c
index 116d56de2..9fff21680 100644
--- a/drivers/video/fbdev/core/cfbfillrect.c
+++ b/drivers/video/fbdev/core/cfbfillrect.c
@@ -25,6 +25,7 @@
 #define FB_SPACE          0
 #define FB_SPACE_NAME     "I/O"
 #define FB_SCREEN_BASE(a) ((a)->screen_base)
+#define FB_REV_PIXELS_IN_BYTE CONFIG_FB_CFB_REV_PIXELS_IN_BYTE
 #include "fb_fillrect.h"
 
 EXPORT_SYMBOL(cfb_fillrect);
diff --git a/drivers/video/fbdev/core/cfbimgblt.c b/drivers/video/fbdev/core/cfbimgblt.c
index a5bb63913..729bf1ace 100644
--- a/drivers/video/fbdev/core/cfbimgblt.c
+++ b/drivers/video/fbdev/core/cfbimgblt.c
@@ -19,6 +19,7 @@
 #define FB_SPACE          0
 #define FB_SPACE_NAME     "I/O"
 #define FB_SCREEN_BASE(a) ((a)->screen_base)
+#define FB_REV_PIXELS_IN_BYTE CONFIG_FB_CFB_REV_PIXELS_IN_BYTE
 #include "fb_imageblit.h"
 
 EXPORT_SYMBOL(cfb_imageblit);
diff --git a/drivers/video/fbdev/core/fb_draw.h b/drivers/video/fbdev/core/fb_draw.h
index e0d829873..1ed7e58f1 100644
--- a/drivers/video/fbdev/core/fb_draw.h
+++ b/drivers/video/fbdev/core/fb_draw.h
@@ -75,7 +75,7 @@ pixel_to_pat( u32 bpp, u32 pixel)
 }
 #endif
 
-#ifdef CONFIG_FB_CFB_REV_PIXELS_IN_BYTE
+#ifdef FB_REV_PIXELS_IN_BYTE
 #if BITS_PER_LONG == 64
 #define REV_PIXELS_MASK1 0x5555555555555555ul
 #define REV_PIXELS_MASK2 0x3333333333333333ul
@@ -157,7 +157,7 @@ static inline u32 fb_compute_bswapmask(struct fb_info *info)
 	return bswapmask;
 }
 
-#else /* CONFIG_FB_CFB_REV_PIXELS_IN_BYTE */
+#else /* FB_REV_PIXELS_IN_BYTE */
 
 static inline unsigned long fb_rev_pixels_in_long(unsigned long val,
 						  u32 bswapmask)
@@ -169,7 +169,7 @@ static inline unsigned long fb_rev_pixels_in_long(unsigned long val,
 #define fb_shifted_pixels_mask_long(p, i, b) FB_SHIFT_HIGH((p), ~0UL, (i))
 #define fb_compute_bswapmask(...) 0
 
-#endif  /* CONFIG_FB_CFB_REV_PIXELS_IN_BYTE */
+#endif  /* FB_REV_PIXELS_IN_BYTE */
 
 #define cpu_to_le_long _cpu_to_le_long(BITS_PER_LONG)
 #define _cpu_to_le_long(x) __cpu_to_le_long(x)
diff --git a/drivers/video/fbdev/core/syscopyarea.c b/drivers/video/fbdev/core/syscopyarea.c
index 124831eed..a14328f98 100644
--- a/drivers/video/fbdev/core/syscopyarea.c
+++ b/drivers/video/fbdev/core/syscopyarea.c
@@ -23,6 +23,7 @@
 #define FB_SPACE          FBINFO_VIRTFB
 #define FB_SPACE_NAME     "virtual"
 #define FB_SCREEN_BASE(a) ((a)->screen_buffer)
+#define FB_REV_PIXELS_IN_BYTE CONFIG_FB_SYS_REV_PIXELS_IN_BYTE
 #include "fb_copyarea.h"
 
 EXPORT_SYMBOL(sys_copyarea);
diff --git a/drivers/video/fbdev/core/sysfillrect.c b/drivers/video/fbdev/core/sysfillrect.c
index 48d0f0efb..1b039573b 100644
--- a/drivers/video/fbdev/core/sysfillrect.c
+++ b/drivers/video/fbdev/core/sysfillrect.c
@@ -23,6 +23,7 @@
 #define FB_SPACE          FBINFO_VIRTFB
 #define FB_SPACE_NAME     "virtual"
 #define FB_SCREEN_BASE(a) ((a)->screen_buffer)
+#define FB_REV_PIXELS_IN_BYTE CONFIG_FB_SYS_REV_PIXELS_IN_BYTE
 #include "fb_fillrect.h"
 
 EXPORT_SYMBOL(sys_fillrect);
diff --git a/drivers/video/fbdev/core/sysimgblt.c b/drivers/video/fbdev/core/sysimgblt.c
index 6e60e3486..e8b849b82 100644
--- a/drivers/video/fbdev/core/sysimgblt.c
+++ b/drivers/video/fbdev/core/sysimgblt.c
@@ -21,6 +21,7 @@
 #define FB_SPACE          FBINFO_VIRTFB
 #define FB_SPACE_NAME     "virtual"
 #define FB_SCREEN_BASE(a) ((a)->screen_buffer)
+#define FB_REV_PIXELS_IN_BYTE CONFIG_FB_SYS_REV_PIXELS_IN_BYTE
 #include "fb_imageblit.h"
 
 EXPORT_SYMBOL(sys_imageblit);
-- 
2.30.2



* Re: [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops
  2025-02-07  4:18 [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops Zsolt Kajtar
                   ` (12 preceding siblings ...)
  2025-02-07  4:18 ` [PATCH RESEND 13/13] fbdev: core: Split CFB and SYS pixel reversing configuration Zsolt Kajtar
@ 2025-02-07  7:18 ` Helge Deller
  2025-02-07  8:12 ` Thomas Zimmermann
  14 siblings, 0 replies; 19+ messages in thread
From: Helge Deller @ 2025-02-07  7:18 UTC (permalink / raw)
  To: Zsolt Kajtar, linux-fbdev, dri-devel

On 2/7/25 05:18, Zsolt Kajtar wrote:
> In 68648ed1f58d98b8e8d994022e5e25331fbfe42a the drawing routines were
> duplicated to have separate I/O and system memory versions.
>
> Later the pixel reversing in 779121e9f17525769c04a00475fd85600c8c04eb
> was only added to the I/O version and not to system.
>
> That's unfortunate as reversing is not something only applicable for
> I/O memory and I happen to need both I/O and system version now.
>
> One option is to bring the system version up to date, but from the
> maintenance perspective it's better to not have two versions in the
> first place.
>
> The drawing routines (based on the cfb version) were moved to header
> files. These are now included in both cfb and sys modules. The memory
> access and other minor differences were handled with a few macros.
>
> The last patch adds a separate config option for the system version.
>
> Zsolt Kajtar (13):
>    fbdev: core: Copy cfbcopyarea to fb_copyarea
>    fbdev: core: Make fb_copyarea generic
>    fbdev: core: Use generic copyarea for as cfb_copyarea
>    fbdev: core: Use generic copyarea for as sys_copyarea
>    fbdev: core: Copy cfbfillrect to fb_fillrect
>    fbdev: core: Make fb_fillrect generic
>    fbdev: core: Use generic fillrect for as cfb_fillrect
>    fbdev: core: Use generic fillrect for as sys_fillrect
>    fbdev: core: Copy cfbimgblt to fb_imageblit
>    fbdev: core: Make fb_imageblit generic
>    fbdev: core: Use generic imageblit for as cfb_imageblit
>    fbdev: core: Use generic imageblit for as sys_imageblit
>    fbdev: core: Split CFB and SYS pixel reversing configuration
>
>   drivers/video/fbdev/core/Kconfig        |  10 +-
>   drivers/video/fbdev/core/cfbcopyarea.c  | 427 +-----------------------
>   drivers/video/fbdev/core/cfbfillrect.c  | 363 +-------------------
>   drivers/video/fbdev/core/cfbimgblt.c    | 358 +-------------------
>   drivers/video/fbdev/core/fb_copyarea.h  | 421 +++++++++++++++++++++++
>   drivers/video/fbdev/core/fb_draw.h      |   6 +-
>   drivers/video/fbdev/core/fb_fillrect.h  | 359 ++++++++++++++++++++
>   drivers/video/fbdev/core/fb_imageblit.h | 356 ++++++++++++++++++++
>   drivers/video/fbdev/core/syscopyarea.c  | 358 +-------------------
>   drivers/video/fbdev/core/sysfillrect.c  | 315 +----------------
>   drivers/video/fbdev/core/sysimgblt.c    | 326 +-----------------
>   11 files changed, 1208 insertions(+), 2091 deletions(-)
>   create mode 100644 drivers/video/fbdev/core/fb_copyarea.h
>   create mode 100644 drivers/video/fbdev/core/fb_fillrect.h
>   create mode 100644 drivers/video/fbdev/core/fb_imageblit.h

It's a bigger change.
I've applied the series to the fbdev for-next git tree to give
it some compile- and runtime testing.

Helge


* Re: [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops
  2025-02-07  4:18 [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops Zsolt Kajtar
                   ` (13 preceding siblings ...)
  2025-02-07  7:18 ` [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops Helge Deller
@ 2025-02-07  8:12 ` Thomas Zimmermann
  2025-02-08  0:51   ` Kajtár Zsolt
  14 siblings, 1 reply; 19+ messages in thread
From: Thomas Zimmermann @ 2025-02-07  8:12 UTC (permalink / raw)
  To: Zsolt Kajtar, linux-fbdev, dri-devel

Hi


Am 07.02.25 um 05:18 schrieb Zsolt Kajtar:
> In 68648ed1f58d98b8e8d994022e5e25331fbfe42a the drawing routines were
> duplicated to have separate I/O and system memory versions.
>
> Later the pixel reversing in 779121e9f17525769c04a00475fd85600c8c04eb
> was only added to the I/O version and not to system.
>
> That's unfortunate as reversing is not something only applicable for
> I/O memory and I happen to need both I/O and system version now.
>
> One option is to bring the system version up to date, but from the
> maintenance perspective it's better to not have two versions in the
> first place.

No it's not. Major code abstractions behind preprocessor tokens are 
terrible to maintain. It's also technically not possible to switch 
between system and I/O memory at will. These are very different things.

If you want that pixel-reversing feature in sys_ helpers, please 
implement it there.

Sorry, but NAK on this series.

Best regards
Thomas

>
> The drawing routines (based on the cfb version) were moved to header
> files. These are now included in both cfb and sys modules. The memory
> access and other minor differences were handled with a few macros.
>
> The last patch adds a separate config option for the system version.
>
> Zsolt Kajtar (13):
>    fbdev: core: Copy cfbcopyarea to fb_copyarea
>    fbdev: core: Make fb_copyarea generic
>    fbdev: core: Use generic copyarea for as cfb_copyarea
>    fbdev: core: Use generic copyarea for as sys_copyarea
>    fbdev: core: Copy cfbfillrect to fb_fillrect
>    fbdev: core: Make fb_fillrect generic
>    fbdev: core: Use generic fillrect for as cfb_fillrect
>    fbdev: core: Use generic fillrect for as sys_fillrect
>    fbdev: core: Copy cfbimgblt to fb_imageblit
>    fbdev: core: Make fb_imageblit generic
>    fbdev: core: Use generic imageblit for as cfb_imageblit
>    fbdev: core: Use generic imageblit for as sys_imageblit
>    fbdev: core: Split CFB and SYS pixel reversing configuration
>
>   drivers/video/fbdev/core/Kconfig        |  10 +-
>   drivers/video/fbdev/core/cfbcopyarea.c  | 427 +-----------------------
>   drivers/video/fbdev/core/cfbfillrect.c  | 363 +-------------------
>   drivers/video/fbdev/core/cfbimgblt.c    | 358 +-------------------
>   drivers/video/fbdev/core/fb_copyarea.h  | 421 +++++++++++++++++++++++
>   drivers/video/fbdev/core/fb_draw.h      |   6 +-
>   drivers/video/fbdev/core/fb_fillrect.h  | 359 ++++++++++++++++++++
>   drivers/video/fbdev/core/fb_imageblit.h | 356 ++++++++++++++++++++
>   drivers/video/fbdev/core/syscopyarea.c  | 358 +-------------------
>   drivers/video/fbdev/core/sysfillrect.c  | 315 +----------------
>   drivers/video/fbdev/core/sysimgblt.c    | 326 +-----------------
>   11 files changed, 1208 insertions(+), 2091 deletions(-)
>   create mode 100644 drivers/video/fbdev/core/fb_copyarea.h
>   create mode 100644 drivers/video/fbdev/core/fb_fillrect.h
>   create mode 100644 drivers/video/fbdev/core/fb_imageblit.h
>

-- 
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)



* Re: [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops
  2025-02-07  8:12 ` Thomas Zimmermann
@ 2025-02-08  0:51   ` Kajtár Zsolt
  2025-02-10  9:24     ` Thomas Zimmermann
  0 siblings, 1 reply; 19+ messages in thread
From: Kajtár Zsolt @ 2025-02-08  0:51 UTC (permalink / raw)
  To: Thomas Zimmermann, linux-fbdev, dri-devel



Hello Thomas!

> No it's not. Major code abstractions behind preprocessor tokens are 
> terrible to maintain.

Hmm, don't get me wrong but I'm not sure if the changes were really
checked in detail. At first sight it might look like I'm adding tons of
new macro ridden code in those header files replacing cleaner code.

While actually that's just how the I/O version currently is, copied and
white space cleaned (as it was requested) plus comment style matched
with sys.

The only new thing which hides the mentioned abstraction a little more
is FB_MEM, which replaced __iomem. But that's a tradeoff to be able to
use the same source for system as well.

Or is the concern that system memory specific code might now get mixed
in there by mistake?

It was not planned as the final version, the current maintainer asked
for addressing some pre-existing quality issues with further patches but
otherwise accepted the taken approach.

> It's also technically not possible to switch between system and I/O 
> memory at will. These are very different things.

Yes, there are architectures where these two don't mix at all, I'm aware
of that. I need that on x86 only (for old hw), and there it seems
doable. Depending on the resolution either the aperture or the defio
memory is mapped. If the framebuffer is not remapped after a mode change
that's an application bug. Otherwise it's harmless as both are always
there and don't change.

I'd rather find out about problems sooner than later, so if you or
anyone else could share any concerns that'd be really helpful!

> If you want that pixel-reversing feature in sys_ helpers, please 
> implement it there.

Actually that's what I did first. Then did it once more by adapting the
I/O version as that gave me more confidence that it'll work exactly the
same and there's less room for error.



* Re: [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops
  2025-02-08  0:51   ` Kajtár Zsolt
@ 2025-02-10  9:24     ` Thomas Zimmermann
  2025-02-24 21:52       ` Kajtár Zsolt
  0 siblings, 1 reply; 19+ messages in thread
From: Thomas Zimmermann @ 2025-02-10  9:24 UTC (permalink / raw)
  To: Kajtár Zsolt, linux-fbdev, dri-devel

Hi

Am 08.02.25 um 01:51 schrieb Kajtár Zsolt:
> Hello Thomas!
>
>> No it's not. Major code abstractions behind preprocessor tokens are
>> terrible to maintain.
> Hmm, don't get me wrong but I'm not sure if the changes were really
> checked in detail. At first sight it might look like I'm adding tons of
> new macro ridden code in those header files replacing cleaner code.
>
> While actually that's just how the I/O version currently is, copied and
> white space cleaned (as it was requested) plus comment style matched
> with sys.
>
> The only new thing which hides the mentioned abstraction a little more
> is FB_MEM, which replaced __iomem. But that's a tradeoff to be able to
> use the same source for system as well.
>
> Or is the concern that system memory specific code might now get mixed
> in there by mistake?
>
> It was not planned as the final version, the current maintainer asked
> for addressing some pre-existing quality issues with further patches but
> otherwise accepted the taken approach.
>
>> It's also technically not possible to switch between system and I/O
>> memory at will. These are very different things.
> Yes, there are architectures where these two don't mix at all, I'm aware
> of that. I need that on x86 only (for old hw), and there it seems
> doable. Depending on the resolution either the aperture or the defio
> memory is mapped. If the framebuffer is not remapped after a mode change
> that's an application bug. Otherwise it's harmless as both are always
> there and don't change.
>
> I'd rather find out about problems sooner than later, so if you or
> anyone else could share any concerns that'd be really helpful!

First of all, commit 779121e9f175 ("fbdev: Support for byte-reversed 
framebuffer formats") isn't super complicated AFAICT. It can be 
implemented in the sys_ helpers as well. It seems like you initially did 
that.
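
For readers unfamiliar with the feature, here is a minimal userspace sketch of what reversing the pixel order within each byte means at 1, 2 and 4 bits per pixel. It is not the kernel's fb_rev_pixels_in_long(); the name rev_pixels_in_u32() and the fixed 32-bit word size are assumptions made for illustration. The masks follow the same 0x55.../0x33... pattern visible in the fb_draw.h hunk of the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: reverse the order of the pixels inside every
 * byte of a 32-bit word. bpp is the pixel depth (1, 2 or 4); any
 * other depth leaves the value unchanged.
 */
static uint32_t rev_pixels_in_u32(uint32_t val, unsigned int bpp)
{
	/* 1 bpp only: swap neighbouring bits. */
	if (bpp == 1)
		val = ((val & 0x55555555u) << 1) | ((val >> 1) & 0x55555555u);
	/* 1 and 2 bpp: swap neighbouring 2-bit groups. */
	if (bpp == 1 || bpp == 2)
		val = ((val & 0x33333333u) << 2) | ((val >> 2) & 0x33333333u);
	/* 1, 2 and 4 bpp: swap the two nibbles of each byte. */
	if (bpp == 1 || bpp == 2 || bpp == 4)
		val = ((val & 0x0f0f0f0fu) << 4) | ((val >> 4) & 0x0f0f0f0fu);
	return val;
}
```

Each step swaps progressively larger groups, so a 1 bpp word goes through all three steps while a 4 bpp word only swaps nibbles. Byte order within the word is left alone, which is why this is independent of whether the memory is I/O or system memory.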

About the series at hand: generating code by macro expansion is good for 
simple cases. I've done that in several places within fbdev myself, such 
as [1]. But if the generated code requires Turing-completeness, it 
becomes much harder to see through the macros and understand what is 
going on. This makes code undiscoverable; and discoverability is a 
requirement for maintenance.

[1] https://elixir.bootlin.com/linux/v6.13.1/source/include/linux/fb.h#L700

Then there's type-safety and type-casting. The current series defeats it 
by casting various pointers to whatever the macros define. For example, 
looking at the copyarea patches, they use screen_base [2] from struct 
fb_info. The thing is, using screen_base is wrong for sys_copyarea(). 
The function should use 'screen_buffer' instead. It works because both 
fields share the same bits of a union. Using screen_base is a bug in the 
current implementation that should be fixed, while this patch series 
would set it in stone.

[2] 
https://elixir.bootlin.com/linux/v6.13.1/source/drivers/video/fbdev/core/syscopyarea.c#L340
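
To illustrate why the wrong field happens to work, here is a simplified userspace sketch of the union involved. The struct name fb_info_sketch, the flat line_length member and the helper sys_scanline() are assumptions for illustration only; the real struct fb_info has many more fields, and __iomem is a sparse annotation that is defined away here.

```c
#include <stddef.h>

/* Userspace stand-in: in the kernel, __iomem is a sparse annotation. */
#ifndef __iomem
#define __iomem
#endif

/* Simplified, hypothetical cut-down of struct fb_info. */
struct fb_info_sketch {
	union {
		char __iomem *screen_base;	/* framebuffer in I/O memory */
		char *screen_buffer;		/* framebuffer in system memory */
	};
	size_t line_length;			/* bytes per scanline */
};

/* A sys_ helper should name the system-memory field explicitly. */
static char *sys_scanline(const struct fb_info_sketch *info, size_t y)
{
	return info->screen_buffer + y * info->line_length;
}
```

Both members occupy the same storage, so reading screen_base in a sys_ helper yields the same address. The difference is only visible to type checking: with the annotation in place, sparse can flag a plain dereference of an __iomem pointer, which is the safety net that casting through macros would bypass.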

Next, if you look through the commit history, you'll find that there are 
several commits with performance improvements. Memory access in the sys 
variants is not guaranteed to be 32-bit aligned by default. The compiler 
has to assume unaligned access, which results in slower code. Hence, 
some manual intervention has to be done. It's too easy to accidentally 
mess this up by using nontransparent macros for access.


If you want to do meaningful work here, please do actual refactoring 
instead of throwing unrelated code together. First of all, never use 
macros, but functions. You can supply callback functions to access the 
framebuffer. Each callback should know whether it operates on 
screen_base or screen_buffer.

But using callbacks for individual reads and writes can have runtime 
overhead. It's better to operate on complete scanlines. The current 
helpers are already organized that way. Again, from the copyarea helper:

sys_copyarea()
{
     // first prepare

     // then go through the scanlines
     while (height--) {
         do_something_for_the_current_scanline();
     }
}

The inner helper do_something_...() has to be written for various cfb 
and sys cases and can be given as function pointer to a generic helper.
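
The structure described above could be sketched roughly as follows, in userspace C. All names here (scanline_copy_fn, sys_copy_scanline(), generic_copyarea()) are hypothetical, the geometry is simplified to whole bytes per pixel, and only the system-memory scanline helper is shown; an I/O-memory variant would have to go through the kernel's I/O accessors instead.

```c
#include <stddef.h>
#include <string.h>

/* Per-scanline worker; one implementation per memory type. */
typedef void (*scanline_copy_fn)(void *dst, const void *src, size_t len);

/* System-memory variant: memmove() tolerates overlap within a line. */
static void sys_copy_scanline(void *dst, const void *src, size_t len)
{
	memmove(dst, src, len);
}

/*
 * Generic helper: prepares the geometry, then walks the scanlines and
 * hands each one to the supplied callback. Coordinates are in pixels,
 * pitch is the scanline length in bytes.
 */
static void generic_copyarea(unsigned char *base, size_t pitch,
			     size_t bytes_per_pixel,
			     size_t dx, size_t dy, size_t sx, size_t sy,
			     size_t width, size_t height,
			     scanline_copy_fn copy_line)
{
	size_t len = width * bytes_per_pixel;

	for (size_t i = 0; i < height; i++) {
		/* Moving down: start at the bottom so that source lines
		 * are read before they are overwritten. */
		size_t row = (dy > sy) ? height - 1 - i : i;
		unsigned char *dst = base + (dy + row) * pitch +
				     dx * bytes_per_pixel;
		const unsigned char *src = base + (sy + row) * pitch +
					   sx * bytes_per_pixel;

		copy_line(dst, src, len);
	}
}
```

The callback is invoked once per scanline rather than once per memory access, so the indirect call cost is spread over a whole line.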

Best regards
Thomas


>
>> If you want that pixel-reversing feature in sys_ helpers, please
>> implement it there.
> Actually that's what I did first. Then did it once more by adapting the
> I/O version as that gave me more confidence that it'll work exactly the
> same and there's less room for error.

-- 
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)



* Re: [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops
  2025-02-10  9:24     ` Thomas Zimmermann
@ 2025-02-24 21:52       ` Kajtár Zsolt
  0 siblings, 0 replies; 19+ messages in thread
From: Kajtár Zsolt @ 2025-02-24 21:52 UTC (permalink / raw)
  To: Thomas Zimmermann, linux-fbdev, dri-devel



Hello Thomas!

Wanted to answer earlier but things took time, and a lot more than expected.

> First of all, commit 779121e9f175 ("fbdev: Support for byte-reversed
> framebuffer formats") isn't super complicated AFAICT. It can be
> implemented in the sys_ helpers as well. It seems like you initially did
> that.

Meanwhile I found out that this implementation had corner cases. I had also
expected the original implementations to be a bit more complete.

> About the series at hand: generating code by macro expansion is good for
> simple cases. I've done that in several places within fbdev myself, such
> as [1]. But if the generated code requires Turing-completeness, it
> becomes much harder to see through the macros and understand what is
> going on. This makes code undiscoverable; and discoverability is a
> requirement for maintenance.

In the new version I resorted to only generating tables with them, in
close proximity. The mentioned part made me think when I first ran into
it, btw.

> Then there's type-safety and type-casting. The current series defeats it
> by casting various pointers to whatever the macros define. For example,
> looking at the copyarea patches, they use screen_base [2] from struct
> fb_info. The thing is, using screen_base is wrong for sys_copyarea().
> The function should use 'screen_buffer' instead. It works because both
> fields share the same bits of a union. Using screen_base is a bug in the
> current implementation that should be fixed, while this patch series
> would set it in stone.

I noticed the screen_base vs. screen_buffer issue back then and it was
already corrected. But it's handled more cleanly now.

> Next, if you look through the commit history, you'll find that there are
> several commits with performance improvements. Memory access in the sys
> variants is not guaranteed to be 32-bit aligned by default. The compiler
> has to assume unaligned access, which results in slower code. Hence,
> some manual intervention has to be done. It's too easy to accidentally
> mess this up by using nontransparent macros for access.

In the new version I made it very hard to get the alignment wrong.

> If you want to do meaningful work here, please do actual refactoring
> instead of throwing unrelated code together. First of all, never use
> macros, but functions. You can supply callback functions to access the
> framebuffer. Each callback should know whether it operates on
> screen_base or screen_buffer.

I've used such callbacks but not for the read/writes as that would have
made the parameter list huge, in terms of lines. Not to mention passing
them down to lowest level.

> But using callbacks for individual reads and writes can have runtime
> overhead. It's better to operate on complete scanlines. The current
> helpers are already organized that way. Again, from the copyarea helper:

If done slightly differently the compiler inlines these and there's no
overhead.

> The inner helper do_something_...() has to be written for various cfb
> and sys cases and can be given as function pointer to a generic helper.
The vertical loops are small, but I kept them separate from the scanline
rendering part.

Thanks for the tips, they were really helpful and I used them where applicable.

While the updated version is not quite as described, I hope it isn't
too bad either.



Thread overview: 19+ messages
2025-02-07  4:18 [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops Zsolt Kajtar
2025-02-07  4:18 ` [PATCH RESEND 01/13] fbdev: core: Copy cfbcopyarea to fb_copyarea Zsolt Kajtar
2025-02-07  4:18 ` [PATCH RESEND 02/13] fbdev: core: Make fb_copyarea generic Zsolt Kajtar
2025-02-07  4:18 ` [PATCH RESEND 03/13] fbdev: core: Use generic copyarea for as cfb_copyarea Zsolt Kajtar
2025-02-07  4:18 ` [PATCH RESEND 04/13] fbdev: core: Use generic copyarea for as sys_copyarea Zsolt Kajtar
2025-02-07  4:18 ` [PATCH RESEND 05/13] fbdev: core: Copy cfbfillrect to fb_fillrect Zsolt Kajtar
2025-02-07  4:18 ` [PATCH RESEND 06/13] fbdev: core: Make fb_fillrect generic Zsolt Kajtar
2025-02-07  4:18 ` [PATCH RESEND 07/13] fbdev: core: Use generic fillrect for as cfb_fillrect Zsolt Kajtar
2025-02-07  4:18 ` [PATCH RESEND 08/13] fbdev: core: Use generic fillrect for as sys_fillrect Zsolt Kajtar
2025-02-07  4:18 ` [PATCH RESEND 09/13] fbdev: core: Copy cfbimgblt to fb_imageblit Zsolt Kajtar
2025-02-07  4:18 ` [PATCH RESEND 10/13] fbdev: core: Make fb_imageblit generic Zsolt Kajtar
2025-02-07  4:18 ` [PATCH RESEND 11/13] fbdev: core: Use generic imageblit for as cfb_imageblit Zsolt Kajtar
2025-02-07  4:18 ` [PATCH RESEND 12/13] fbdev: core: Use generic imageblit for as sys_imageblit Zsolt Kajtar
2025-02-07  4:18 ` [PATCH RESEND 13/13] fbdev: core: Split CFB and SYS pixel reversing configuration Zsolt Kajtar
2025-02-07  7:18 ` [PATCH RESEND 00/13] fbdev: core: Deduplicate cfb/sys drawing fbops Helge Deller
2025-02-07  8:12 ` Thomas Zimmermann
2025-02-08  0:51   ` Kajtár Zsolt
2025-02-10  9:24     ` Thomas Zimmermann
2025-02-24 21:52       ` Kajtár Zsolt
