qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 00/11] ati-vga: Implement HOST_DATA transfers to
@ 2025-11-18 15:48 Chad Jablonski
  2025-11-18 15:48 ` [PATCH v3 01/11] ati-vga: Fix DST_PITCH and SRC_PITCH reads Chad Jablonski
                   ` (11 more replies)
  0 siblings, 12 replies; 14+ messages in thread
From: Chad Jablonski @ 2025-11-18 15:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: balaton, Chad Jablonski

This series implements HOST_DATA as a blit source enabling text rendering in
xterm under X.org with 2D acceleration.

The series builds up functionality incrementally:

* Patches 1-5:   Bug fixes and register implementations
* Patches 6-7:   Scissor clipping
* Patches 8-9:   Destination setup refactor
* Patches 10-11: HOST_DATA register writes, color expansion, and
                 accumulator flush

Tested with xterm on X.org using the r128 driver built with
--disable-dri (MMIO mode).

Hardware tests can be found at: https://codeberg.org/cjab/ati-tests.

Changes from v2:

- Verified with hardware that clipping default bit latches
- Verified with hardware that pitch/offset default bits latch
  (supersedes BALATON's "ati-vga: Separate default control bit for
  source")
- A new approach to HOST_DATA using a 128-bit accumulator instead of a
  large buffer
- Fixed a longstanding bug in (DST/SRC)_PITCH reads that zeroed pitch value.
- Removed mask from the DP_GUI_MASTER_CNTL write, all fields are
  important
- Created helpers for code shared between ati_2d_blt and
  ati_host_data_flush

Changes from v1:

- Split monolithic patch into 7 logical patches as requested
- Integrate HOST_DATA blit into existing ati_2d_blt()
- Add fallback implementation for non-pixman builds
- Remove unnecessary initialization in realize (kept in reset only)
- Fixed DP_GUI_MASTER_CNTL mask to preserve GMC_SRC_SOURCE field
- Implement scissor clipping

Chad Jablonski (11):
  ati-vga: Fix DST_PITCH and SRC_PITCH reads
  ati-vga: Add scissor clipping register support
  ati-vga: Implement foreground and background color register writes
  ati-vga: Latch src and dst pitch and offset on master_cntl default
  ati-vga: Fix DP_GUI_MASTER_CNTL register mask
  ati-vga: Create dst and sc rectangle helpers
  ati-vga: Implement scissor rectangle clipping for 2D operations
  ati-vga: Create 2d_blt destination setup helper
  ati-vga: Refactor ati_2d_blt to use dst setup helper
  ati-vga: Implement HOST_DATA register writes
  ati-vga: Implement HOST_DATA flush to VRAM

 hw/display/ati.c      | 124 +++++++++++++-
 hw/display/ati_2d.c   | 370 +++++++++++++++++++++++++++++++-----------
 hw/display/ati_dbg.c  |   9 +
 hw/display/ati_int.h  |  21 ++-
 hw/display/ati_regs.h |  44 ++++-
 5 files changed, 467 insertions(+), 101 deletions(-)

-- 
2.51.0



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v3 01/11] ati-vga: Fix DST_PITCH and SRC_PITCH reads
  2025-11-18 15:48 [PATCH v3 00/11] ati-vga: Implement HOST_DATA transfers to Chad Jablonski
@ 2025-11-18 15:48 ` Chad Jablonski
  2025-11-18 15:48 ` [PATCH v3 02/11] ati-vga: Add scissor clipping register support Chad Jablonski
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Chad Jablonski @ 2025-11-18 15:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: balaton, Chad Jablonski

Reading DST_PITCH and SRC_PITCH on the Rage 128 is broken. The read
handlers attempt to construct the value from pitch and tile bits in
the register state but mistakenly AND them instead of ORing them. This
means the pitch is always zero on read.

Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
---
 hw/display/ati.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/display/ati.c b/hw/display/ati.c
index 0b4298d078..66fad6459a 100644
--- a/hw/display/ati.c
+++ b/hw/display/ati.c
@@ -438,7 +438,7 @@ static uint64_t ati_mm_read(void *opaque, hwaddr addr, unsigned int size)
     case DST_PITCH:
         val = s->regs.dst_pitch;
         if (s->dev_id == PCI_DEVICE_ID_ATI_RAGE128_PF) {
-            val &= s->regs.dst_tile << 16;
+            val |= s->regs.dst_tile << 16;
         }
         break;
     case DST_WIDTH:
@@ -468,7 +468,7 @@ static uint64_t ati_mm_read(void *opaque, hwaddr addr, unsigned int size)
     case SRC_PITCH:
         val = s->regs.src_pitch;
         if (s->dev_id == PCI_DEVICE_ID_ATI_RAGE128_PF) {
-            val &= s->regs.src_tile << 16;
+            val |= s->regs.src_tile << 16;
         }
         break;
     case DP_BRUSH_BKGD_CLR:
-- 
2.51.0



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 02/11] ati-vga: Add scissor clipping register support
  2025-11-18 15:48 [PATCH v3 00/11] ati-vga: Implement HOST_DATA transfers to Chad Jablonski
  2025-11-18 15:48 ` [PATCH v3 01/11] ati-vga: Fix DST_PITCH and SRC_PITCH reads Chad Jablonski
@ 2025-11-18 15:48 ` Chad Jablonski
  2025-11-18 15:48 ` [PATCH v3 03/11] ati-vga: Implement foreground and background color register writes Chad Jablonski
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Chad Jablonski @ 2025-11-18 15:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: balaton, Chad Jablonski

Implement read and write operations on SC_TOP_LEFT, SC_BOTTOM_RIGHT,
and SRC_SC_BOTTOM_RIGHT registers. These registers are also updated
when the src and/or dst clipping fields on DP_GUI_MASTER_CNTL are set
to default clipping.

Scissor clipping is used when rendering text in X.org. The r128 driver
sends host data much wider than is necessary to draw a glyph and cuts it
down to size using clipping before rendering. The actual clipping
implementation follows in a future patch.

This also includes a very minor refactor of the combined
default_sc_bottom_right field in the registers struct to
default_sc_bottom and default_sc_right. This was done to
stay consistent with the other scissor registers and prevent repeated
masking and extraction.

Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
---
 hw/display/ati.c      | 71 +++++++++++++++++++++++++++++++++++++++++--
 hw/display/ati_int.h  |  9 +++++-
 hw/display/ati_regs.h | 12 ++++++--
 3 files changed, 87 insertions(+), 5 deletions(-)

diff --git a/hw/display/ati.c b/hw/display/ati.c
index 66fad6459a..cb979ae0fa 100644
--- a/hw/display/ati.c
+++ b/hw/display/ati.c
@@ -508,7 +508,32 @@ static uint64_t ati_mm_read(void *opaque, hwaddr addr, unsigned int size)
         val |= s->regs.default_tile << 16;
         break;
     case DEFAULT_SC_BOTTOM_RIGHT:
-        val = s->regs.default_sc_bottom_right;
+        val = (s->regs.default_sc_bottom << 16) |
+              s->regs.default_sc_right;
+        break;
+    case SC_TOP:
+        val = s->regs.sc_top;
+        break;
+    case SC_LEFT:
+        val = s->regs.sc_left;
+        break;
+    case SC_BOTTOM:
+        val = s->regs.sc_bottom;
+        break;
+    case SC_RIGHT:
+        val = s->regs.sc_right;
+        break;
+    case SRC_SC_BOTTOM:
+        val = s->regs.src_sc_bottom;
+        break;
+    case SRC_SC_RIGHT:
+        val = s->regs.src_sc_right;
+        break;
+    case SC_TOP_LEFT:
+    case SC_BOTTOM_RIGHT:
+    case SRC_SC_BOTTOM_RIGHT:
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "Read from write-only register 0x%x\n", (unsigned)addr);
         break;
     default:
         break;
@@ -862,6 +887,17 @@ static void ati_mm_write(void *opaque, hwaddr addr,
         s->regs.dp_datatype = (data & 0x0f00) >> 8 | (data & 0x30f0) << 4 |
                               (data & 0x4000) << 16;
         s->regs.dp_mix = (data & GMC_ROP3_MASK) | (data & 0x7000000) >> 16;
+
+        if ((data & GMC_SRC_CLIPPING_MASK) == GMC_SRC_CLIP_DEFAULT) {
+            s->regs.src_sc_right = s->regs.default_sc_right;
+            s->regs.src_sc_bottom = s->regs.default_sc_bottom;
+        }
+        if ((data & GMC_DST_CLIPPING_MASK) == GMC_DST_CLIP_DEFAULT) {
+            s->regs.sc_top = 0;
+            s->regs.sc_left = 0;
+            s->regs.sc_right = s->regs.default_sc_right;
+            s->regs.sc_bottom = s->regs.default_sc_bottom;
+        }
         break;
     case DST_WIDTH_X:
         s->regs.dst_x = data & 0x3fff;
@@ -935,7 +971,38 @@ static void ati_mm_write(void *opaque, hwaddr addr,
         }
         break;
     case DEFAULT_SC_BOTTOM_RIGHT:
-        s->regs.default_sc_bottom_right = data & 0x3fff3fff;
+        s->regs.default_sc_right = data & 0x3fff;
+        s->regs.default_sc_bottom = (data >> 16) & 0x3fff;
+        break;
+    case SC_TOP_LEFT:
+        s->regs.sc_left = data & 0x3fff;
+        s->regs.sc_top = (data >> 16) & 0x3fff;
+        break;
+    case SC_LEFT:
+        s->regs.sc_left = data & 0x3fff;
+        break;
+    case SC_TOP:
+        s->regs.sc_top = data & 0x3fff;
+        break;
+    case SC_BOTTOM_RIGHT:
+        s->regs.sc_right = data & 0x3fff;
+        s->regs.sc_bottom = (data >> 16) & 0x3fff;
+        break;
+    case SC_RIGHT:
+        s->regs.sc_right = data & 0x3fff;
+        break;
+    case SC_BOTTOM:
+        s->regs.sc_bottom = data & 0x3fff;
+        break;
+    case SRC_SC_BOTTOM_RIGHT:
+        s->regs.src_sc_right = data & 0x3fff;
+        s->regs.src_sc_bottom = (data >> 16) & 0x3fff;
+        break;
+    case SRC_SC_RIGHT:
+        s->regs.src_sc_right = data & 0x3fff;
+        break;
+    case SRC_SC_BOTTOM:
+        s->regs.src_sc_bottom = data & 0x3fff;
         break;
     default:
         break;
diff --git a/hw/display/ati_int.h b/hw/display/ati_int.h
index 708cc1dd3a..1e999b11c2 100644
--- a/hw/display/ati_int.h
+++ b/hw/display/ati_int.h
@@ -85,7 +85,14 @@ typedef struct ATIVGARegs {
     uint32_t default_offset;
     uint32_t default_pitch;
     uint32_t default_tile;
-    uint32_t default_sc_bottom_right;
+    uint16_t default_sc_bottom;
+    uint16_t default_sc_right;
+    uint16_t sc_top;
+    uint16_t sc_left;
+    uint16_t sc_bottom;
+    uint16_t sc_right;
+    uint16_t src_sc_bottom;
+    uint16_t src_sc_right;
 } ATIVGARegs;
 
 struct ATIVGAState {
diff --git a/hw/display/ati_regs.h b/hw/display/ati_regs.h
index d7127748ff..2b56b9fb66 100644
--- a/hw/display/ati_regs.h
+++ b/hw/display/ati_regs.h
@@ -392,8 +392,6 @@
 /* DP_GUI_MASTER_CNTL bit constants */
 #define GMC_SRC_PITCH_OFFSET_CNTL               0x00000001
 #define GMC_DST_PITCH_OFFSET_CNTL               0x00000002
-#define GMC_SRC_CLIP_DEFAULT                    0x00000000
-#define GMC_DST_CLIP_DEFAULT                    0x00000000
 #define GMC_BRUSH_SOLIDCOLOR                    0x000000d0
 #define GMC_SRC_DSTCOLOR                        0x00003000
 #define GMC_BYTE_ORDER_MSB_TO_LSB               0x00000000
@@ -404,6 +402,16 @@
 #define GMC_WRITE_MASK_SET                      0x40000000
 #define GMC_DP_CONVERSION_TEMP_6500             0x00000000
 
+/* DP_GUI_MASTER_CNTL DP_SRC_CLIPPING named constants */
+#define GMC_SRC_CLIPPING_MASK                   0x00000004
+#define GMC_SRC_CLIP_DEFAULT                    0x00000000
+#define GMC_SRC_CLIP_LEAVE_ALONE                0x00000004
+
+/* DP_GUI_MASTER_CNTL DP_DST_CLIPPING named constants */
+#define GMC_DST_CLIPPING_MASK                   0x00000008
+#define GMC_DST_CLIP_DEFAULT                    0x00000000
+#define GMC_DST_CLIP_LEAVE_ALONE                0x00000008
+
 /* DP_GUI_MASTER_CNTL ROP3 named constants */
 #define GMC_ROP3_MASK                           0x00ff0000
 #define ROP3_BLACKNESS                          0x00000000
-- 
2.51.0



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 03/11] ati-vga: Implement foreground and background color register writes
  2025-11-18 15:48 [PATCH v3 00/11] ati-vga: Implement HOST_DATA transfers to Chad Jablonski
  2025-11-18 15:48 ` [PATCH v3 01/11] ati-vga: Fix DST_PITCH and SRC_PITCH reads Chad Jablonski
  2025-11-18 15:48 ` [PATCH v3 02/11] ati-vga: Add scissor clipping register support Chad Jablonski
@ 2025-11-18 15:48 ` Chad Jablonski
  2025-11-18 15:48 ` [PATCH v3 04/11] ati-vga: Latch src and dst pitch and offset on master_cntl default Chad Jablonski
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Chad Jablonski @ 2025-11-18 15:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: balaton, Chad Jablonski

These are straightforward 32-bit register write handlers. They're
necessary for a future patch which will use them for color expansion
from monochrome host data transfers.

Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
---
 hw/display/ati.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/hw/display/ati.c b/hw/display/ati.c
index cb979ae0fa..a2c0302e42 100644
--- a/hw/display/ati.c
+++ b/hw/display/ati.c
@@ -945,6 +945,12 @@ static void ati_mm_write(void *opaque, hwaddr addr,
     case DP_CNTL:
         s->regs.dp_cntl = data;
         break;
+    case DP_SRC_FRGD_CLR:
+        s->regs.dp_src_frgd_clr = data;
+        break;
+    case DP_SRC_BKGD_CLR:
+        s->regs.dp_src_bkgd_clr = data;
+        break;
     case DP_DATATYPE:
         s->regs.dp_datatype = data & 0xe0070f0f;
         break;
-- 
2.51.0



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 04/11] ati-vga: Latch src and dst pitch and offset on master_cntl default
  2025-11-18 15:48 [PATCH v3 00/11] ati-vga: Implement HOST_DATA transfers to Chad Jablonski
                   ` (2 preceding siblings ...)
  2025-11-18 15:48 ` [PATCH v3 03/11] ati-vga: Implement foreground and background color register writes Chad Jablonski
@ 2025-11-18 15:48 ` Chad Jablonski
  2025-11-18 15:48 ` [PATCH v3 05/11] ati-vga: Fix DP_GUI_MASTER_CNTL register mask Chad Jablonski
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Chad Jablonski @ 2025-11-18 15:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: balaton, Chad Jablonski

Hardware testing on the Rage 128 confirms that (SRC/DST)_OFFSET,
and (SRC/DST)_PITCH are latched when (SRC/DST)_PITCH_OFFSET_CNTL bits
in DP_GUI_MASTER_CNTL are set to "default".

The earlier approach looked at the state of the (SRC/DST)_PITCH_OFFSET_CNTL
bits when offset and pitch registers were used. This meant that when
(SRC/DST)_PITCH_OFFSET_CNTL was reset to "leave alone" the old values
stored in the registers would return. This is not how the real hardware
works.

Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
---
 hw/display/ati.c      |  8 ++++++++
 hw/display/ati_2d.c   | 13 ++++---------
 hw/display/ati_regs.h | 10 ++++++++++
 3 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/hw/display/ati.c b/hw/display/ati.c
index a2c0302e42..d0fa51f773 100644
--- a/hw/display/ati.c
+++ b/hw/display/ati.c
@@ -888,6 +888,14 @@ static void ati_mm_write(void *opaque, hwaddr addr,
                               (data & 0x4000) << 16;
         s->regs.dp_mix = (data & GMC_ROP3_MASK) | (data & 0x7000000) >> 16;
 
+        if ((data & GMC_SRC_PITCH_OFFSET_CNTL_MASK) == GMC_SRC_PITCH_OFFSET_DEFAULT) {
+            s->regs.src_offset = s->regs.default_offset;
+            s->regs.src_pitch = s->regs.default_pitch;
+        }
+        if ((data & GMC_DST_PITCH_OFFSET_CNTL_MASK) == GMC_DST_PITCH_OFFSET_DEFAULT) {
+            s->regs.dst_offset = s->regs.default_offset;
+            s->regs.dst_pitch = s->regs.default_pitch;
+        }
         if ((data & GMC_SRC_CLIPPING_MASK) == GMC_SRC_CLIP_DEFAULT) {
             s->regs.src_sc_right = s->regs.default_sc_right;
             s->regs.src_sc_bottom = s->regs.default_sc_bottom;
diff --git a/hw/display/ati_2d.c b/hw/display/ati_2d.c
index 309bb5ccb6..a8c4c534b9 100644
--- a/hw/display/ati_2d.c
+++ b/hw/display/ati_2d.c
@@ -43,8 +43,6 @@ static int ati_bpp_from_datatype(ATIVGAState *s)
     }
 }
 
-#define DEFAULT_CNTL (s->regs.dp_gui_master_cntl & GMC_DST_PITCH_OFFSET_CNTL)
-
 void ati_2d_blt(ATIVGAState *s)
 {
     /* FIXME it is probably more complex than this and may need to be */
@@ -63,13 +61,12 @@ void ati_2d_blt(ATIVGAState *s)
         qemu_log_mask(LOG_GUEST_ERROR, "Invalid bpp\n");
         return;
     }
-    int dst_stride = DEFAULT_CNTL ? s->regs.dst_pitch : s->regs.default_pitch;
+    int dst_stride = s->regs.dst_pitch;
     if (!dst_stride) {
         qemu_log_mask(LOG_GUEST_ERROR, "Zero dest pitch\n");
         return;
     }
-    uint8_t *dst_bits = s->vga.vram_ptr + (DEFAULT_CNTL ?
-                        s->regs.dst_offset : s->regs.default_offset);
+    uint8_t *dst_bits = s->vga.vram_ptr + s->regs.dst_offset;
 
     if (s->dev_id == PCI_DEVICE_ID_ATI_RAGE128_PF) {
         dst_bits += s->regs.crtc_offset & 0x07ffffff;
@@ -97,14 +94,12 @@ void ati_2d_blt(ATIVGAState *s)
                        s->regs.src_x : s->regs.src_x + 1 - s->regs.dst_width);
         unsigned src_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
                        s->regs.src_y : s->regs.src_y + 1 - s->regs.dst_height);
-        int src_stride = DEFAULT_CNTL ?
-                         s->regs.src_pitch : s->regs.default_pitch;
+        int src_stride = s->regs.src_pitch;
         if (!src_stride) {
             qemu_log_mask(LOG_GUEST_ERROR, "Zero source pitch\n");
             return;
         }
-        uint8_t *src_bits = s->vga.vram_ptr + (DEFAULT_CNTL ?
-                            s->regs.src_offset : s->regs.default_offset);
+        uint8_t *src_bits = s->vga.vram_ptr + s->regs.src_offset;
 
         if (s->dev_id == PCI_DEVICE_ID_ATI_RAGE128_PF) {
             src_bits += s->regs.crtc_offset & 0x07ffffff;
diff --git a/hw/display/ati_regs.h b/hw/display/ati_regs.h
index 2b56b9fb66..02025eef36 100644
--- a/hw/display/ati_regs.h
+++ b/hw/display/ati_regs.h
@@ -402,6 +402,16 @@
 #define GMC_WRITE_MASK_SET                      0x40000000
 #define GMC_DP_CONVERSION_TEMP_6500             0x00000000
 
+/* DP_GUI_MASTER_CNTL DP_SRC_PITCH_OFFSET named constants */
+#define GMC_SRC_PITCH_OFFSET_CNTL_MASK          0x00000001
+#define GMC_SRC_PITCH_OFFSET_DEFAULT            0x00000000
+#define GMC_SRC_PITCH_OFFSET_LEAVE_ALONE        0x00000001
+
+/* DP_GUI_MASTER_CNTL DP_DST_PITCH_OFFSET named constants */
+#define GMC_DST_PITCH_OFFSET_CNTL_MASK          0x00000002
+#define GMC_DST_PITCH_OFFSET_DEFAULT            0x00000000
+#define GMC_DST_PITCH_OFFSET_LEAVE_ALONE        0x00000002
+
 /* DP_GUI_MASTER_CNTL DP_SRC_CLIPPING named constants */
 #define GMC_SRC_CLIPPING_MASK                   0x00000004
 #define GMC_SRC_CLIP_DEFAULT                    0x00000000
-- 
2.51.0



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 05/11] ati-vga: Fix DP_GUI_MASTER_CNTL register mask
  2025-11-18 15:48 [PATCH v3 00/11] ati-vga: Implement HOST_DATA transfers to Chad Jablonski
                   ` (3 preceding siblings ...)
  2025-11-18 15:48 ` [PATCH v3 04/11] ati-vga: Latch src and dst pitch and offset on master_cntl default Chad Jablonski
@ 2025-11-18 15:48 ` Chad Jablonski
  2025-11-18 15:48 ` [PATCH v3 06/11] ati-vga: Create dst and sc rectangle helpers Chad Jablonski
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Chad Jablonski @ 2025-11-18 15:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: balaton, Chad Jablonski

Remove the register mask on DP_GUI_MASTER_CNTL writes. Masking on writes
is usually used to exclude reserved fields. This register contains no
reserved fields. All of the fields are important, even if not yet fully
implemented.

Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
---
 hw/display/ati.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/display/ati.c b/hw/display/ati.c
index d0fa51f773..cfb5dc2fb8 100644
--- a/hw/display/ati.c
+++ b/hw/display/ati.c
@@ -883,7 +883,7 @@ static void ati_mm_write(void *opaque, hwaddr addr,
         ati_2d_blt(s);
         break;
     case DP_GUI_MASTER_CNTL:
-        s->regs.dp_gui_master_cntl = data & 0xf800000f;
+        s->regs.dp_gui_master_cntl = data;
         s->regs.dp_datatype = (data & 0x0f00) >> 8 | (data & 0x30f0) << 4 |
                               (data & 0x4000) << 16;
         s->regs.dp_mix = (data & GMC_ROP3_MASK) | (data & 0x7000000) >> 16;
-- 
2.51.0



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 06/11] ati-vga: Create dst and sc rectangle helpers
  2025-11-18 15:48 [PATCH v3 00/11] ati-vga: Implement HOST_DATA transfers to Chad Jablonski
                   ` (4 preceding siblings ...)
  2025-11-18 15:48 ` [PATCH v3 05/11] ati-vga: Fix DP_GUI_MASTER_CNTL register mask Chad Jablonski
@ 2025-11-18 15:48 ` Chad Jablonski
  2025-11-18 15:48 ` [PATCH v3 07/11] ati-vga: Implement scissor rectangle clipping for 2D operations Chad Jablonski
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Chad Jablonski @ 2025-11-18 15:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: balaton, Chad Jablonski

This creates helper functions for building QemuRects using the dst and
sc registers. These are useful during blit setup and the next patch in
this series will make use of them.

Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
---
 hw/display/ati_2d.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/hw/display/ati_2d.c b/hw/display/ati_2d.c
index a8c4c534b9..0695c26b3b 100644
--- a/hw/display/ati_2d.c
+++ b/hw/display/ati_2d.c
@@ -13,6 +13,7 @@
 #include "qemu/log.h"
 #include "ui/pixel_ops.h"
 #include "ui/console.h"
+#include "ui/rect.h"
 
 /*
  * NOTE:
@@ -43,6 +44,29 @@ static int ati_bpp_from_datatype(ATIVGAState *s)
     }
 }
 
+static QemuRect dst_rect(ATIVGAState *s)
+{
+    QemuRect dst;
+    unsigned dst_x = (s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT ?
+                     s->regs.dst_x :
+                     s->regs.dst_x + 1 - s->regs.dst_width);
+    unsigned dst_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
+                     s->regs.dst_y :
+                     s->regs.dst_y + 1 - s->regs.dst_height);
+    qemu_rect_init(&dst, dst_x, dst_y, s->regs.dst_width, s->regs.dst_height);
+    return dst;
+}
+
+static QemuRect sc_rect(ATIVGAState *s)
+{
+    QemuRect sc;
+    qemu_rect_init(&sc,
+                   s->regs.sc_left, s->regs.sc_top,
+                   s->regs.sc_right - s->regs.sc_left + 1,
+                   s->regs.sc_bottom - s->regs.sc_top + 1);
+    return sc;
+}
+
 void ati_2d_blt(ATIVGAState *s)
 {
     /* FIXME it is probably more complex than this and may need to be */
-- 
2.51.0



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 07/11] ati-vga: Implement scissor rectangle clipping for 2D operations
  2025-11-18 15:48 [PATCH v3 00/11] ati-vga: Implement HOST_DATA transfers to Chad Jablonski
                   ` (5 preceding siblings ...)
  2025-11-18 15:48 ` [PATCH v3 06/11] ati-vga: Create dst and sc rectangle helpers Chad Jablonski
@ 2025-11-18 15:48 ` Chad Jablonski
  2025-11-18 15:48 ` [PATCH v3 08/11] ati-vga: Create 2d_blt destination setup helper Chad Jablonski
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Chad Jablonski @ 2025-11-18 15:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: balaton, Chad Jablonski

Use scissor registers to clip blit operations. This is required
for text rendering in X using the r128 driver. Without it overly-wide
glyphs are drawn and create all sorts of chaos.

Use QemuRect helpers for calculating the intersection of the
destination and scissor rectangles. Source coordinates are
also updated to reflect clipping. The original destination dimensions
are stored in 'dst' while the clipped rectangle is in 'visible' for
clear distinction between the two.

Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
---
 hw/display/ati_2d.c | 90 ++++++++++++++++++++++++---------------------
 1 file changed, 49 insertions(+), 41 deletions(-)

diff --git a/hw/display/ati_2d.c b/hw/display/ati_2d.c
index 0695c26b3b..9c96ee155f 100644
--- a/hw/display/ati_2d.c
+++ b/hw/display/ati_2d.c
@@ -76,10 +76,16 @@ void ati_2d_blt(ATIVGAState *s)
             s->vga.vbe_start_addr, surface_data(ds), surface_stride(ds),
             surface_bits_per_pixel(ds),
             (s->regs.dp_mix & GMC_ROP3_MASK) >> 16);
-    unsigned dst_x = (s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT ?
-                      s->regs.dst_x : s->regs.dst_x + 1 - s->regs.dst_width);
-    unsigned dst_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
-                      s->regs.dst_y : s->regs.dst_y + 1 - s->regs.dst_height);
+
+    QemuRect dst = dst_rect(s);
+    QemuRect scissor = sc_rect(s);
+    QemuRect visible;
+    if (!qemu_rect_intersect(&dst, &scissor, &visible)) {
+        return;
+    }
+    uint32_t src_left_offset = visible.x - dst.x;
+    uint32_t src_top_offset = visible.y - dst.y;
+
     int bpp = ati_bpp_from_datatype(s);
     if (!bpp) {
         qemu_log_mask(LOG_GUEST_ERROR, "Invalid bpp\n");
@@ -97,17 +103,16 @@ void ati_2d_blt(ATIVGAState *s)
         dst_stride *= bpp;
     }
     uint8_t *end = s->vga.vram_ptr + s->vga.vram_size;
-    if (dst_x > 0x3fff || dst_y > 0x3fff || dst_bits >= end
-        || dst_bits + dst_x
-         + (dst_y + s->regs.dst_height) * dst_stride >= end) {
+    if (visible.x > 0x3fff || visible.y > 0x3fff || dst_bits >= end
+        || dst_bits + visible.x
+         + (visible.y + visible.height) * dst_stride >= end) {
         qemu_log_mask(LOG_UNIMP, "blt outside vram not implemented\n");
         return;
     }
     DPRINTF("%d %d %d, %d %d %d, (%d,%d) -> (%d,%d) %dx%d %c %c\n",
             s->regs.src_offset, s->regs.dst_offset, s->regs.default_offset,
             s->regs.src_pitch, s->regs.dst_pitch, s->regs.default_pitch,
-            s->regs.src_x, s->regs.src_y, dst_x, dst_y,
-            s->regs.dst_width, s->regs.dst_height,
+            s->regs.src_x, s->regs.src_y, dst.x, dst.y, dst.width, dst.height,
             (s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT ? '>' : '<'),
             (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ? 'v' : '^'));
     switch (s->regs.dp_mix & GMC_ROP3_MASK) {
@@ -115,9 +120,11 @@ void ati_2d_blt(ATIVGAState *s)
     {
         bool fallback = false;
         unsigned src_x = (s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT ?
-                       s->regs.src_x : s->regs.src_x + 1 - s->regs.dst_width);
+                         s->regs.src_x + src_left_offset :
+                         s->regs.src_x + 1 - dst.width + src_left_offset);
         unsigned src_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
-                       s->regs.src_y : s->regs.src_y + 1 - s->regs.dst_height);
+                         s->regs.src_y + src_top_offset :
+                         s->regs.src_y + 1 - dst.height + src_top_offset);
         int src_stride = s->regs.src_pitch;
         if (!src_stride) {
             qemu_log_mask(LOG_GUEST_ERROR, "Zero source pitch\n");
@@ -131,7 +138,7 @@ void ati_2d_blt(ATIVGAState *s)
         }
         if (src_x > 0x3fff || src_y > 0x3fff || src_bits >= end
             || src_bits + src_x
-             + (src_y + s->regs.dst_height) * src_stride >= end) {
+             + (src_y + visible.height) * src_stride >= end) {
             qemu_log_mask(LOG_UNIMP, "blt outside vram not implemented\n");
             return;
         }
@@ -140,31 +147,31 @@ void ati_2d_blt(ATIVGAState *s)
         dst_stride /= sizeof(uint32_t);
         DPRINTF("pixman_blt(%p, %p, %d, %d, %d, %d, %d, %d, %d, %d, %d, %d)\n",
                 src_bits, dst_bits, src_stride, dst_stride, bpp, bpp,
-                src_x, src_y, dst_x, dst_y,
-                s->regs.dst_width, s->regs.dst_height);
+                src_x, src_y, visible.x, visible.y,
+                visible.width, visible.height);
 #ifdef CONFIG_PIXMAN
         if ((s->use_pixman & BIT(1)) &&
             s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT &&
             s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM) {
             fallback = !pixman_blt((uint32_t *)src_bits, (uint32_t *)dst_bits,
                                    src_stride, dst_stride, bpp, bpp,
-                                   src_x, src_y, dst_x, dst_y,
-                                   s->regs.dst_width, s->regs.dst_height);
+                                   src_x, src_y, visible.x, visible.y,
+                                   visible.width, visible.height);
         } else if (s->use_pixman & BIT(1)) {
             /* FIXME: We only really need a temporary if src and dst overlap */
-            int llb = s->regs.dst_width * (bpp / 8);
+            int llb = visible.width * (bpp / 8);
             int tmp_stride = DIV_ROUND_UP(llb, sizeof(uint32_t));
             uint32_t *tmp = g_malloc(tmp_stride * sizeof(uint32_t) *
-                                     s->regs.dst_height);
+                                     visible.height);
             fallback = !pixman_blt((uint32_t *)src_bits, tmp,
                                    src_stride, tmp_stride, bpp, bpp,
                                    src_x, src_y, 0, 0,
-                                   s->regs.dst_width, s->regs.dst_height);
+                                   visible.width, visible.height);
             if (!fallback) {
                 fallback = !pixman_blt(tmp, (uint32_t *)dst_bits,
                                        tmp_stride, dst_stride, bpp, bpp,
-                                       0, 0, dst_x, dst_y,
-                                       s->regs.dst_width, s->regs.dst_height);
+                                       0, 0, visible.x, visible.y,
+                                       visible.width, visible.height);
             }
             g_free(tmp);
         } else
@@ -177,17 +184,17 @@ void ati_2d_blt(ATIVGAState *s)
             unsigned int src_pitch = src_stride * sizeof(uint32_t);
             unsigned int dst_pitch = dst_stride * sizeof(uint32_t);
 
-            for (y = 0; y < s->regs.dst_height; y++) {
-                i = dst_x * bypp;
+            for (y = 0; y < visible.height; y++) {
+                i = visible.x * bypp;
                 j = src_x * bypp;
                 if (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM) {
-                    i += (dst_y + y) * dst_pitch;
+                    i += (visible.y + y) * dst_pitch;
                     j += (src_y + y) * src_pitch;
                 } else {
-                    i += (dst_y + s->regs.dst_height - 1 - y) * dst_pitch;
-                    j += (src_y + s->regs.dst_height - 1 - y) * src_pitch;
+                    i += (visible.y + visible.height - 1 - y) * dst_pitch;
+                    j += (src_y + visible.height - 1 - y) * src_pitch;
                 }
-                memmove(&dst_bits[i], &src_bits[j], s->regs.dst_width * bypp);
+                memmove(&dst_bits[i], &src_bits[j], visible.width * bypp);
             }
         }
         if (dst_bits >= s->vga.vram_ptr + s->vga.vbe_start_addr &&
@@ -195,13 +202,13 @@ void ati_2d_blt(ATIVGAState *s)
             s->vga.vbe_regs[VBE_DISPI_INDEX_YRES] * s->vga.vbe_line_offset) {
             memory_region_set_dirty(&s->vga.vram, s->vga.vbe_start_addr +
                                     s->regs.dst_offset +
-                                    dst_y * surface_stride(ds),
-                                    s->regs.dst_height * surface_stride(ds));
+                                    visible.y * surface_stride(ds),
+                                    visible.height * surface_stride(ds));
         }
         s->regs.dst_x = (s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT ?
-                         dst_x + s->regs.dst_width : dst_x);
+                         visible.x + visible.width : visible.x);
         s->regs.dst_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
-                         dst_y + s->regs.dst_height : dst_y);
+                         visible.y + visible.height : visible.y);
         break;
     }
     case ROP3_PATCOPY:
@@ -226,20 +233,21 @@ void ati_2d_blt(ATIVGAState *s)
 
         dst_stride /= sizeof(uint32_t);
         DPRINTF("pixman_fill(%p, %d, %d, %d, %d, %d, %d, %x)\n",
-                dst_bits, dst_stride, bpp, dst_x, dst_y,
-                s->regs.dst_width, s->regs.dst_height, filler);
+                dst_bits, dst_stride, bpp, visible.x, visible.y,
+                visible.width, visible.height, filler);
 #ifdef CONFIG_PIXMAN
         if (!(s->use_pixman & BIT(0)) ||
-            !pixman_fill((uint32_t *)dst_bits, dst_stride, bpp, dst_x, dst_y,
-                    s->regs.dst_width, s->regs.dst_height, filler))
+            !pixman_fill((uint32_t *)dst_bits, dst_stride, bpp,
+                         visible.x, visible.y, visible.width, visible.height,
+                         filler))
 #endif
         {
             /* fallback when pixman failed or we don't want to call it */
             unsigned int x, y, i, bypp = bpp / 8;
             unsigned int dst_pitch = dst_stride * sizeof(uint32_t);
-            for (y = 0; y < s->regs.dst_height; y++) {
-                i = dst_x * bypp + (dst_y + y) * dst_pitch;
-                for (x = 0; x < s->regs.dst_width; x++, i += bypp) {
+            for (y = 0; y < visible.height; y++) {
+                i = visible.x * bypp + (visible.y + y) * dst_pitch;
+                for (x = 0; x < visible.width; x++, i += bypp) {
                     stn_he_p(&dst_bits[i], bypp, filler);
                 }
             }
@@ -249,11 +257,11 @@ void ati_2d_blt(ATIVGAState *s)
             s->vga.vbe_regs[VBE_DISPI_INDEX_YRES] * s->vga.vbe_line_offset) {
             memory_region_set_dirty(&s->vga.vram, s->vga.vbe_start_addr +
                                     s->regs.dst_offset +
-                                    dst_y * surface_stride(ds),
-                                    s->regs.dst_height * surface_stride(ds));
+                                    visible.y * surface_stride(ds),
+                                    visible.height * surface_stride(ds));
         }
         s->regs.dst_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
-                         dst_y + s->regs.dst_height : dst_y);
+                         visible.y + visible.height : visible.y);
         break;
     }
     default:
-- 
2.51.0



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 08/11] ati-vga: Create 2d_blt destination setup helper
  2025-11-18 15:48 [PATCH v3 00/11] ati-vga: Implement HOST_DATA transfers to Chad Jablonski
                   ` (6 preceding siblings ...)
  2025-11-18 15:48 ` [PATCH v3 07/11] ati-vga: Implement scissor rectangle clipping for 2D operations Chad Jablonski
@ 2025-11-18 15:48 ` Chad Jablonski
  2025-11-18 15:48 ` [PATCH v3 09/11] ati-vga: Refactor ati_2d_blt to use dst " Chad Jablonski
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Chad Jablonski @ 2025-11-18 15:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: balaton, Chad Jablonski

A large amount of the common setup involved in a blit deals with the
destination. This moves that setup to a helper function returning a
struct (ATIBlitDest) holding all of that state. The idea here is that
this setup will be shared between blits from memory as well as from
HOST_DATA and maybe others in the future.

The next patch refactors ati_2d_blt to use this new helper.

Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
---
 hw/display/ati_2d.c | 54 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/hw/display/ati_2d.c b/hw/display/ati_2d.c
index 9c96ee155f..f2e01e28e9 100644
--- a/hw/display/ati_2d.c
+++ b/hw/display/ati_2d.c
@@ -67,6 +67,60 @@ static QemuRect sc_rect(ATIVGAState *s)
     return sc;
 }
 
+typedef struct {
+    QemuRect rect;
+    QemuRect visible;
+    uint32_t src_left_offset;
+    uint32_t src_top_offset;
+    int bpp;
+    int stride;
+    bool top_to_bottom;
+    bool left_to_right;
+    bool valid;
+    uint8_t *bits;
+} ATIBlitDest;
+
+static ATIBlitDest setup_2d_blt_dst(ATIVGAState *s)
+{
+    ATIBlitDest dst = { .valid = false };
+    uint8_t *end = s->vga.vram_ptr + s->vga.vram_size;
+    QemuRect scissor = sc_rect(s);
+
+    dst.rect = dst_rect(s);
+    if (!qemu_rect_intersect(&dst.rect, &scissor, &dst.visible)) {
+        /* Destination is completely clipped, nothing to draw */
+        return dst;
+    }
+    dst.bpp = ati_bpp_from_datatype(s);
+    if (!dst.bpp) {
+        qemu_log_mask(LOG_GUEST_ERROR, "Invalid bpp\n");
+        return dst;
+    }
+    dst.stride = s->regs.dst_pitch;
+    if (!dst.stride) {
+        qemu_log_mask(LOG_GUEST_ERROR, "Zero dest pitch\n");
+        return dst;
+    }
+    dst.bits = s->vga.vram_ptr + s->regs.dst_offset;
+    if (s->dev_id == PCI_DEVICE_ID_ATI_RAGE128_PF) {
+        dst.bits += s->regs.crtc_offset & 0x07ffffff;
+        dst.stride *= dst.bpp;
+    }
+    if (dst.visible.x > 0x3fff || dst.visible.y > 0x3fff || dst.bits >= end
+        || dst.bits + dst.visible.x
+         + (dst.visible.y + dst.visible.height) * dst.stride >= end) {
+        qemu_log_mask(LOG_UNIMP, "blt outside vram not implemented\n");
+        return dst;
+    }
+    dst.src_left_offset = dst.visible.x - dst.rect.x;
+    dst.src_top_offset = dst.visible.y - dst.rect.y;
+    dst.left_to_right = s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT;
+    dst.top_to_bottom = s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM;
+    dst.valid = true;
+
+    return dst;
+}
+
 void ati_2d_blt(ATIVGAState *s)
 {
     /* FIXME it is probably more complex than this and may need to be */
-- 
2.51.0



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 09/11] ati-vga: Refactor ati_2d_blt to use dst setup helper
  2025-11-18 15:48 [PATCH v3 00/11] ati-vga: Implement HOST_DATA transfers to Chad Jablonski
                   ` (7 preceding siblings ...)
  2025-11-18 15:48 ` [PATCH v3 08/11] ati-vga: Create 2d_blt destination setup helper Chad Jablonski
@ 2025-11-18 15:48 ` Chad Jablonski
  2025-11-18 15:48 ` [PATCH v3 10/11] ati-vga: Implement HOST_DATA register writes Chad Jablonski
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Chad Jablonski @ 2025-11-18 15:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: balaton, Chad Jablonski

This is a pure refactor of the ati_2d_blt function to make use of the
setup_2d_blt_dst helper and the struct it returns. There should be no
changes in behavior in this patch.

Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
---
 hw/display/ati_2d.c | 166 +++++++++++++++++++-------------------------
 1 file changed, 71 insertions(+), 95 deletions(-)

diff --git a/hw/display/ati_2d.c b/hw/display/ati_2d.c
index f2e01e28e9..e30a4cd941 100644
--- a/hw/display/ati_2d.c
+++ b/hw/display/ati_2d.c
@@ -130,55 +130,29 @@ void ati_2d_blt(ATIVGAState *s)
             s->vga.vbe_start_addr, surface_data(ds), surface_stride(ds),
             surface_bits_per_pixel(ds),
             (s->regs.dp_mix & GMC_ROP3_MASK) >> 16);
-
-    QemuRect dst = dst_rect(s);
-    QemuRect scissor = sc_rect(s);
-    QemuRect visible;
-    if (!qemu_rect_intersect(&dst, &scissor, &visible)) {
-        return;
-    }
-    uint32_t src_left_offset = visible.x - dst.x;
-    uint32_t src_top_offset = visible.y - dst.y;
-
-    int bpp = ati_bpp_from_datatype(s);
-    if (!bpp) {
-        qemu_log_mask(LOG_GUEST_ERROR, "Invalid bpp\n");
-        return;
-    }
-    int dst_stride = s->regs.dst_pitch;
-    if (!dst_stride) {
-        qemu_log_mask(LOG_GUEST_ERROR, "Zero dest pitch\n");
-        return;
-    }
-    uint8_t *dst_bits = s->vga.vram_ptr + s->regs.dst_offset;
-
-    if (s->dev_id == PCI_DEVICE_ID_ATI_RAGE128_PF) {
-        dst_bits += s->regs.crtc_offset & 0x07ffffff;
-        dst_stride *= bpp;
-    }
+    ATIBlitDest dst = setup_2d_blt_dst(s);
     uint8_t *end = s->vga.vram_ptr + s->vga.vram_size;
-    if (visible.x > 0x3fff || visible.y > 0x3fff || dst_bits >= end
-        || dst_bits + visible.x
-         + (visible.y + visible.height) * dst_stride >= end) {
-        qemu_log_mask(LOG_UNIMP, "blt outside vram not implemented\n");
-        return;
-    }
+
     DPRINTF("%d %d %d, %d %d %d, (%d,%d) -> (%d,%d) %dx%d %c %c\n",
             s->regs.src_offset, s->regs.dst_offset, s->regs.default_offset,
             s->regs.src_pitch, s->regs.dst_pitch, s->regs.default_pitch,
-            s->regs.src_x, s->regs.src_y, dst.x, dst.y, dst.width, dst.height,
-            (s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT ? '>' : '<'),
-            (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ? 'v' : '^'));
+            s->regs.src_x, s->regs.src_y,
+            dst.rect.x, dst.rect.y, dst.rect.width, dst.rect.height,
+            (dst.left_to_right ? '>' : '<'),
+            (dst.top_to_bottom ? 'v' : '^'));
+
     switch (s->regs.dp_mix & GMC_ROP3_MASK) {
     case ROP3_SRCCOPY:
     {
         bool fallback = false;
-        unsigned src_x = (s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT ?
-                         s->regs.src_x + src_left_offset :
-                         s->regs.src_x + 1 - dst.width + src_left_offset);
-        unsigned src_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
-                         s->regs.src_y + src_top_offset :
-                         s->regs.src_y + 1 - dst.height + src_top_offset);
+        unsigned src_x = (dst.left_to_right ?
+                         s->regs.src_x + dst.src_left_offset :
+                         s->regs.src_x + 1 -
+                         dst.rect.width + dst.src_left_offset);
+        unsigned src_y = (dst.top_to_bottom ?
+                         s->regs.src_y + dst.src_top_offset :
+                         s->regs.src_y + 1 -
+                         dst.rect.height + dst.src_top_offset);
         int src_stride = s->regs.src_pitch;
         if (!src_stride) {
             qemu_log_mask(LOG_GUEST_ERROR, "Zero source pitch\n");
@@ -188,44 +162,44 @@ void ati_2d_blt(ATIVGAState *s)
 
         if (s->dev_id == PCI_DEVICE_ID_ATI_RAGE128_PF) {
             src_bits += s->regs.crtc_offset & 0x07ffffff;
-            src_stride *= bpp;
+            src_stride *= dst.bpp;
         }
         if (src_x > 0x3fff || src_y > 0x3fff || src_bits >= end
             || src_bits + src_x
-             + (src_y + visible.height) * src_stride >= end) {
+             + (src_y + dst.visible.height) * src_stride >= end) {
             qemu_log_mask(LOG_UNIMP, "blt outside vram not implemented\n");
             return;
         }
 
         src_stride /= sizeof(uint32_t);
-        dst_stride /= sizeof(uint32_t);
+        dst.stride /= sizeof(uint32_t);
         DPRINTF("pixman_blt(%p, %p, %d, %d, %d, %d, %d, %d, %d, %d, %d, %d)\n",
-                src_bits, dst_bits, src_stride, dst_stride, bpp, bpp,
-                src_x, src_y, visible.x, visible.y,
-                visible.width, visible.height);
+                src_bits, dst.bits, src_stride, dst.stride, dst.bpp, dst.bpp,
+                src_x, src_y, dst.visible.x, dst.visible.y,
+                dst.visible.width, dst.visible.height);
 #ifdef CONFIG_PIXMAN
         if ((s->use_pixman & BIT(1)) &&
-            s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT &&
-            s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM) {
-            fallback = !pixman_blt((uint32_t *)src_bits, (uint32_t *)dst_bits,
-                                   src_stride, dst_stride, bpp, bpp,
-                                   src_x, src_y, visible.x, visible.y,
-                                   visible.width, visible.height);
+            dst.left_to_right &&
+            dst.top_to_bottom) {
+            fallback = !pixman_blt((uint32_t *)src_bits, (uint32_t *)dst.bits,
+                                   src_stride, dst.stride, dst.bpp, dst.bpp,
+                                   src_x, src_y, dst.visible.x, dst.visible.y,
+                                   dst.visible.width, dst.visible.height);
         } else if (s->use_pixman & BIT(1)) {
             /* FIXME: We only really need a temporary if src and dst overlap */
-            int llb = visible.width * (bpp / 8);
+            int llb = dst.visible.width * (dst.bpp / 8);
             int tmp_stride = DIV_ROUND_UP(llb, sizeof(uint32_t));
             uint32_t *tmp = g_malloc(tmp_stride * sizeof(uint32_t) *
-                                     visible.height);
+                                     dst.visible.height);
             fallback = !pixman_blt((uint32_t *)src_bits, tmp,
-                                   src_stride, tmp_stride, bpp, bpp,
+                                   src_stride, tmp_stride, dst.bpp, dst.bpp,
                                    src_x, src_y, 0, 0,
-                                   visible.width, visible.height);
+                                   dst.visible.width, dst.visible.height);
             if (!fallback) {
-                fallback = !pixman_blt(tmp, (uint32_t *)dst_bits,
-                                       tmp_stride, dst_stride, bpp, bpp,
-                                       0, 0, visible.x, visible.y,
-                                       visible.width, visible.height);
+                fallback = !pixman_blt(tmp, (uint32_t *)dst.bits,
+                                       tmp_stride, dst.stride, dst.bpp, dst.bpp,
+                                       0, 0, dst.visible.x, dst.visible.y,
+                                       dst.visible.width, dst.visible.height);
             }
             g_free(tmp);
         } else
@@ -234,35 +208,36 @@ void ati_2d_blt(ATIVGAState *s)
             fallback = true;
         }
         if (fallback) {
-            unsigned int y, i, j, bypp = bpp / 8;
+            unsigned int y, i, j, bypp = dst.bpp / 8;
             unsigned int src_pitch = src_stride * sizeof(uint32_t);
-            unsigned int dst_pitch = dst_stride * sizeof(uint32_t);
+            unsigned int dst_pitch = dst.stride * sizeof(uint32_t);
 
-            for (y = 0; y < visible.height; y++) {
-                i = visible.x * bypp;
+            for (y = 0; y < dst.visible.height; y++) {
+                i = dst.visible.x * bypp;
                 j = src_x * bypp;
-                if (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM) {
-                    i += (visible.y + y) * dst_pitch;
+                if (dst.top_to_bottom) {
+                    i += (dst.visible.y + y) * dst_pitch;
                     j += (src_y + y) * src_pitch;
                 } else {
-                    i += (visible.y + visible.height - 1 - y) * dst_pitch;
-                    j += (src_y + visible.height - 1 - y) * src_pitch;
+                    i += (dst.visible.y + dst.visible.height - 1 - y) *
+                         dst_pitch;
+                    j += (src_y + dst.visible.height - 1 - y) * src_pitch;
                 }
-                memmove(&dst_bits[i], &src_bits[j], visible.width * bypp);
+                memmove(&dst.bits[i], &src_bits[j], dst.visible.width * bypp);
             }
         }
-        if (dst_bits >= s->vga.vram_ptr + s->vga.vbe_start_addr &&
-            dst_bits < s->vga.vram_ptr + s->vga.vbe_start_addr +
+        if (dst.bits >= s->vga.vram_ptr + s->vga.vbe_start_addr &&
+            dst.bits < s->vga.vram_ptr + s->vga.vbe_start_addr +
             s->vga.vbe_regs[VBE_DISPI_INDEX_YRES] * s->vga.vbe_line_offset) {
             memory_region_set_dirty(&s->vga.vram, s->vga.vbe_start_addr +
                                     s->regs.dst_offset +
-                                    visible.y * surface_stride(ds),
-                                    visible.height * surface_stride(ds));
+                                    dst.visible.y * surface_stride(ds),
+                                    dst.visible.height * surface_stride(ds));
         }
-        s->regs.dst_x = (s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT ?
-                         visible.x + visible.width : visible.x);
-        s->regs.dst_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
-                         visible.y + visible.height : visible.y);
+        s->regs.dst_x = (dst.left_to_right ?
+                         dst.visible.x + dst.visible.width : dst.visible.x);
+        s->regs.dst_y = (dst.top_to_bottom ?
+                         dst.visible.y + dst.visible.height : dst.visible.y);
         break;
     }
     case ROP3_PATCOPY:
@@ -285,37 +260,38 @@ void ati_2d_blt(ATIVGAState *s)
             break;
         }
 
-        dst_stride /= sizeof(uint32_t);
+        dst.stride /= sizeof(uint32_t);
         DPRINTF("pixman_fill(%p, %d, %d, %d, %d, %d, %d, %x)\n",
-                dst_bits, dst_stride, bpp, visible.x, visible.y,
-                visible.width, visible.height, filler);
+                dst.bits, dst.stride, dst.bpp, dst.visible.x, dst.visible.y,
+                dst.visible.width, dst.visible.height, filler);
 #ifdef CONFIG_PIXMAN
         if (!(s->use_pixman & BIT(0)) ||
-            !pixman_fill((uint32_t *)dst_bits, dst_stride, bpp,
-                         visible.x, visible.y, visible.width, visible.height,
+            !pixman_fill((uint32_t *)dst.bits, dst.stride, dst.bpp,
+                         dst.visible.x, dst.visible.y,
+                         dst.visible.width, dst.visible.height,
                          filler))
 #endif
         {
             /* fallback when pixman failed or we don't want to call it */
-            unsigned int x, y, i, bypp = bpp / 8;
-            unsigned int dst_pitch = dst_stride * sizeof(uint32_t);
-            for (y = 0; y < visible.height; y++) {
-                i = visible.x * bypp + (visible.y + y) * dst_pitch;
-                for (x = 0; x < visible.width; x++, i += bypp) {
-                    stn_he_p(&dst_bits[i], bypp, filler);
+            unsigned int x, y, i, bypp = dst.bpp / 8;
+            unsigned int dst_pitch = dst.stride * sizeof(uint32_t);
+            for (y = 0; y < dst.visible.height; y++) {
+                i = dst.visible.x * bypp + (dst.visible.y + y) * dst_pitch;
+                for (x = 0; x < dst.visible.width; x++, i += bypp) {
+                    stn_he_p(&dst.bits[i], bypp, filler);
                 }
             }
         }
-        if (dst_bits >= s->vga.vram_ptr + s->vga.vbe_start_addr &&
-            dst_bits < s->vga.vram_ptr + s->vga.vbe_start_addr +
+        if (dst.bits >= s->vga.vram_ptr + s->vga.vbe_start_addr &&
+            dst.bits < s->vga.vram_ptr + s->vga.vbe_start_addr +
             s->vga.vbe_regs[VBE_DISPI_INDEX_YRES] * s->vga.vbe_line_offset) {
             memory_region_set_dirty(&s->vga.vram, s->vga.vbe_start_addr +
                                     s->regs.dst_offset +
-                                    visible.y * surface_stride(ds),
-                                    visible.height * surface_stride(ds));
+                                    dst.visible.y * surface_stride(ds),
+                                    dst.visible.height * surface_stride(ds));
         }
-        s->regs.dst_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
-                         visible.y + visible.height : visible.y);
+        s->regs.dst_y = (dst.top_to_bottom ?
+                         dst.visible.y + dst.visible.height : dst.visible.y);
         break;
     }
     default:
-- 
2.51.0



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 10/11] ati-vga: Implement HOST_DATA register writes
  2025-11-18 15:48 [PATCH v3 00/11] ati-vga: Implement HOST_DATA transfers to Chad Jablonski
                   ` (8 preceding siblings ...)
  2025-11-18 15:48 ` [PATCH v3 09/11] ati-vga: Refactor ati_2d_blt to use dst " Chad Jablonski
@ 2025-11-18 15:48 ` Chad Jablonski
  2025-11-18 15:48 ` [PATCH v3 11/11] ati-vga: Implement HOST_DATA flush to VRAM Chad Jablonski
  2025-12-11  2:33 ` [PATCH v3 00/11] ati-vga: Implement HOST_DATA transfers to Chad Jablonski
  11 siblings, 0 replies; 14+ messages in thread
From: Chad Jablonski @ 2025-11-18 15:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: balaton, Chad Jablonski

Writing to any of the HOST_DATA0-7 registers pushes the written data
into a 128-bit accumulator. When the accumulator is full a flush is
triggered to copy it to the framebuffer. A final write to HOST_DATA_LAST
will also initiate a flush. The flush itself is left for the next patch.

Unaligned HOST_DATA* writes result in, from what I can tell, undefined
behavior on real hardware. A well-behaved driver shouldn't be doing this
anyway. For that reason they are not handled here at all.

Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
---
 hw/display/ati.c      | 33 +++++++++++++++++++++++++++++++++
 hw/display/ati_dbg.c  |  9 +++++++++
 hw/display/ati_int.h  |  9 +++++++++
 hw/display/ati_regs.h |  9 +++++++++
 4 files changed, 60 insertions(+)

diff --git a/hw/display/ati.c b/hw/display/ati.c
index cfb5dc2fb8..dc302eb6f2 100644
--- a/hw/display/ati.c
+++ b/hw/display/ati.c
@@ -555,6 +555,13 @@ static inline void ati_reg_write_offs(uint32_t *reg, int offs,
     }
 }
 
+static void ati_host_data_reset(ATIHostDataState *hd)
+{
+    hd->next = 0;
+    hd->row = 0;
+    hd->col = 0;
+}
+
 static void ati_mm_write(void *opaque, hwaddr addr,
                            uint64_t data, unsigned int size)
 {
@@ -830,6 +837,7 @@ static void ati_mm_write(void *opaque, hwaddr addr,
         break;
     case DST_WIDTH:
         s->regs.dst_width = data & 0x3fff;
+        ati_host_data_reset(&s->host_data);
         ati_2d_blt(s);
         break;
     case DST_HEIGHT:
@@ -880,6 +888,7 @@ static void ati_mm_write(void *opaque, hwaddr addr,
     case DST_HEIGHT_WIDTH:
         s->regs.dst_width = data & 0x3fff;
         s->regs.dst_height = (data >> 16) & 0x3fff;
+        ati_host_data_reset(&s->host_data);
         ati_2d_blt(s);
         break;
     case DP_GUI_MASTER_CNTL:
@@ -910,6 +919,7 @@ static void ati_mm_write(void *opaque, hwaddr addr,
     case DST_WIDTH_X:
         s->regs.dst_x = data & 0x3fff;
         s->regs.dst_width = (data >> 16) & 0x3fff;
+        ati_host_data_reset(&s->host_data);
         ati_2d_blt(s);
         break;
     case SRC_X_Y:
@@ -923,6 +933,7 @@ static void ati_mm_write(void *opaque, hwaddr addr,
     case DST_WIDTH_HEIGHT:
         s->regs.dst_height = data & 0x3fff;
         s->regs.dst_width = (data >> 16) & 0x3fff;
+        ati_host_data_reset(&s->host_data);
         ati_2d_blt(s);
         break;
     case DST_HEIGHT_Y:
@@ -1017,6 +1028,26 @@ static void ati_mm_write(void *opaque, hwaddr addr,
         break;
     case SRC_SC_BOTTOM:
         s->regs.src_sc_bottom = data & 0x3fff;
+    case HOST_DATA0:
+    case HOST_DATA1:
+    case HOST_DATA2:
+    case HOST_DATA3:
+    case HOST_DATA4:
+    case HOST_DATA5:
+    case HOST_DATA6:
+    case HOST_DATA7:
+        s->host_data.acc[s->host_data.next] = data;
+        if (s->host_data.next >= 3) {
+            qemu_log_mask(LOG_UNIMP, "HOST_DATA flush not yet implemented\n");
+            s->host_data.next = 0;
+            break;
+        }
+        s->host_data.next += 1;
+        break;
+    case HOST_DATA_LAST:
+        s->host_data.acc[s->host_data.next] = data;
+        qemu_log_mask(LOG_UNIMP, "HOST_DATA flush not yet implemented\n");
+        ati_host_data_reset(&s->host_data);
         break;
     default:
         break;
@@ -1123,6 +1154,8 @@ static void ati_vga_reset(DeviceState *dev)
     /* reset vga */
     vga_common_reset(&s->vga);
     s->mode = VGA_MODE;
+
+    ati_host_data_reset(&s->host_data);
 }
 
 static void ati_vga_exit(PCIDevice *dev)
diff --git a/hw/display/ati_dbg.c b/hw/display/ati_dbg.c
index 3ffa7f35df..5c799d540a 100644
--- a/hw/display/ati_dbg.c
+++ b/hw/display/ati_dbg.c
@@ -252,6 +252,15 @@ static struct ati_regdesc ati_reg_names[] = {
     {"MC_SRC1_CNTL", 0x19D8},
     {"TEX_CNTL", 0x1800},
     {"RAGE128_MPP_TB_CONFIG", 0x01c0},
+    {"HOST_DATA0", 0x17c0},
+    {"HOST_DATA1", 0x17c4},
+    {"HOST_DATA2", 0x17c8},
+    {"HOST_DATA3", 0x17cc},
+    {"HOST_DATA4", 0x17d0},
+    {"HOST_DATA5", 0x17d4},
+    {"HOST_DATA6", 0x17d8},
+    {"HOST_DATA7", 0x17dc},
+    {"HOST_DATA_LAST", 0x17e0},
     {NULL, -1}
 };
 
diff --git a/hw/display/ati_int.h b/hw/display/ati_int.h
index 1e999b11c2..b9142ce6d8 100644
--- a/hw/display/ati_int.h
+++ b/hw/display/ati_int.h
@@ -15,6 +15,7 @@
 #include "hw/i2c/bitbang_i2c.h"
 #include "vga_int.h"
 #include "qom/object.h"
+#include "qemu/units.h"
 
 /*#define DEBUG_ATI*/
 
@@ -95,6 +96,13 @@ typedef struct ATIVGARegs {
     uint16_t src_sc_right;
 } ATIVGARegs;
 
+typedef struct ATIHostDataState {
+    uint32_t row;
+    uint32_t col;
+    uint32_t next;
+    uint32_t acc[4];
+} ATIHostDataState;
+
 struct ATIVGAState {
     PCIDevice dev;
     VGACommonState vga;
@@ -112,6 +120,7 @@ struct ATIVGAState {
     MemoryRegion io;
     MemoryRegion mm;
     ATIVGARegs regs;
+    ATIHostDataState host_data;
 };
 
 const char *ati_reg_name(int num);
diff --git a/hw/display/ati_regs.h b/hw/display/ati_regs.h
index 02025eef36..9eb68fbec6 100644
--- a/hw/display/ati_regs.h
+++ b/hw/display/ati_regs.h
@@ -252,6 +252,15 @@
 #define DP_T12_CNTL                             0x178c
 #define DST_BRES_T1_LNTH                        0x1790
 #define DST_BRES_T2_LNTH                        0x1794
+#define HOST_DATA0                              0x17c0
+#define HOST_DATA1                              0x17c4
+#define HOST_DATA2                              0x17c8
+#define HOST_DATA3                              0x17cc
+#define HOST_DATA4                              0x17d0
+#define HOST_DATA5                              0x17d4
+#define HOST_DATA6                              0x17d8
+#define HOST_DATA7                              0x17dc
+#define HOST_DATA_LAST                          0x17e0
 #define SCALE_SRC_HEIGHT_WIDTH                  0x1994
 #define SCALE_OFFSET_0                          0x1998
 #define SCALE_PITCH                             0x199c
-- 
2.51.0



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 11/11] ati-vga: Implement HOST_DATA flush to VRAM
  2025-11-18 15:48 [PATCH v3 00/11] ati-vga: Implement HOST_DATA transfers to Chad Jablonski
                   ` (9 preceding siblings ...)
  2025-11-18 15:48 ` [PATCH v3 10/11] ati-vga: Implement HOST_DATA register writes Chad Jablonski
@ 2025-11-18 15:48 ` Chad Jablonski
  2025-12-11  2:33 ` [PATCH v3 00/11] ati-vga: Implement HOST_DATA transfers to Chad Jablonski
  11 siblings, 0 replies; 14+ messages in thread
From: Chad Jablonski @ 2025-11-18 15:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: balaton, Chad Jablonski

Implement flushing the 128-bit HOST_DATA accumulator to VRAM to enable
text rendering in X. Currently supports only the monochrome
foreground/background datatype with the SRCCOPY ROP.

The flush is broken up into two steps for clarity. First, expansion of the
monochrome bits to the destination color depth. Then the expanded pixels
are clipped and copied into VRAM.

Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
---
 hw/display/ati.c      |   4 +-
 hw/display/ati_2d.c   | 127 ++++++++++++++++++++++++++++++++++++++++++
 hw/display/ati_int.h  |   3 +
 hw/display/ati_regs.h |  13 +++++
 4 files changed, 145 insertions(+), 2 deletions(-)

diff --git a/hw/display/ati.c b/hw/display/ati.c
index dc302eb6f2..c5892a6305 100644
--- a/hw/display/ati.c
+++ b/hw/display/ati.c
@@ -1038,7 +1038,7 @@ static void ati_mm_write(void *opaque, hwaddr addr,
     case HOST_DATA7:
         s->host_data.acc[s->host_data.next] = data;
         if (s->host_data.next >= 3) {
-            qemu_log_mask(LOG_UNIMP, "HOST_DATA flush not yet implemented\n");
+            ati_flush_host_data(s);
             s->host_data.next = 0;
             break;
         }
@@ -1046,7 +1046,7 @@ static void ati_mm_write(void *opaque, hwaddr addr,
         break;
     case HOST_DATA_LAST:
         s->host_data.acc[s->host_data.next] = data;
-        qemu_log_mask(LOG_UNIMP, "HOST_DATA flush not yet implemented\n");
+        ati_flush_host_data(s);
         ati_host_data_reset(&s->host_data);
         break;
     default:
diff --git a/hw/display/ati_2d.c b/hw/display/ati_2d.c
index e30a4cd941..2cd5348965 100644
--- a/hw/display/ati_2d.c
+++ b/hw/display/ati_2d.c
@@ -123,6 +123,12 @@ static ATIBlitDest setup_2d_blt_dst(ATIVGAState *s)
 
 void ati_2d_blt(ATIVGAState *s)
 {
+    uint32_t src = s->regs.dp_gui_master_cntl & GMC_SRC_SOURCE_MASK;
+    if (src == GMC_SRC_SOURCE_HOST_DATA) {
+        /* HOST_DATA blits are handled separately by ati_flush_host_data() */
+        return;
+    }
+
     /* FIXME it is probably more complex than this and may need to be */
     /* rewritten but for now as a start just to get some output: */
     DisplaySurface *ds = qemu_console_surface(s->vga.con);
@@ -299,3 +305,124 @@ void ati_2d_blt(ATIVGAState *s)
                       (s->regs.dp_mix & GMC_ROP3_MASK) >> 16);
     }
 }
+
+void ati_flush_host_data(ATIVGAState *s)
+{
+    DisplaySurface *ds;
+    unsigned src = s->regs.dp_gui_master_cntl & GMC_SRC_SOURCE_MASK;
+    unsigned src_datatype = s->regs.dp_gui_master_cntl & GMC_SRC_DATATYPE_MASK;
+    unsigned rop = s->regs.dp_mix & GMC_ROP3_MASK;
+    ATIBlitDest dst;
+    bool lsb_to_msb;
+    uint32_t fg, bg;
+    unsigned bypp, row, col, idx;
+    uint8_t pix_buf[ATI_HOST_DATA_ACC_BITS * sizeof(uint32_t)];
+
+    if (src != GMC_SRC_SOURCE_HOST_DATA) {
+        qemu_log_mask(LOG_UNIMP,
+                      "host_data_blt: only GMC_SRC_SOURCE_HOST_DATA "
+                      "supported\n");
+        return;
+    }
+
+    if (src_datatype != GMC_SRC_DATATYPE_MONO_FRGD_BKGD) {
+        qemu_log_mask(LOG_UNIMP,
+                      "host_data_blt: only GMC_SRC_DATATYPE_MONO_FRGD_BKGD "
+                      "supported\n");
+        return;
+    }
+
+    if (rop != ROP3_SRCCOPY) {
+        qemu_log_mask(LOG_UNIMP,
+                      "host_data_blt: only ROP3_SRCCOPY supported. rop: %x\n",
+                      rop);
+        return;
+    }
+
+    dst = setup_2d_blt_dst(s);
+    if (!dst.valid) {
+        return;
+    }
+
+    if (!dst.left_to_right || !dst.top_to_bottom) {
+        qemu_log_mask(LOG_UNIMP, "host_data_blt: only L->R, T->B supported\n");
+        return;
+    }
+
+    lsb_to_msb = s->regs.dp_gui_master_cntl & GMC_BYTE_ORDER_LSB_TO_MSB;
+    fg = s->regs.dp_src_frgd_clr;
+    bg = s->regs.dp_src_bkgd_clr;
+    bypp = dst.bpp / 8;
+
+    /* Expand monochrome bits to color pixels */
+    idx = 0;
+    for (int word = 0; word < 4; word++) {
+        for (int i = 0; i < 32; i++) {
+            int bit = lsb_to_msb ? i : (31 - i);
+            bool is_fg = extract32(s->host_data.acc[word], bit, 1);
+            uint32_t color = is_fg ? fg : bg;
+            stn_he_p(&pix_buf[idx * bypp], bypp, color);
+            idx += 1;
+        }
+    }
+
+    /* Copy to VRAM one scanline at a time */
+    row = s->host_data.row;
+    col = s->host_data.col;
+    idx = 0;
+    while (idx < ATI_HOST_DATA_ACC_BITS && row < dst.rect.height) {
+        uint8_t *vram_dst;
+        unsigned start_col, end_col, vis_row, num_pix, pix_idx;
+        unsigned pix_in_scanline = MIN(ATI_HOST_DATA_ACC_BITS -
+                                       idx, dst.rect.width - col);
+
+        /* Row-based clipping */
+        if (row < dst.src_top_offset ||
+            row >= dst.src_top_offset + dst.visible.height) {
+            goto skip_pix;
+        }
+
+        /* Column-based clipping */
+        start_col = MAX(col, dst.src_left_offset);
+        end_col = MIN(col + pix_in_scanline,
+                      dst.src_left_offset + dst.visible.width);
+        if (end_col <= start_col) {
+            goto skip_pix;
+        }
+
+        /* Copy expanded bits/pixels to VRAM */
+        vis_row = row - dst.src_top_offset;
+        num_pix = end_col - start_col;
+        vram_dst = dst.bits +
+                   (dst.visible.y + vis_row) * dst.stride +
+                   (dst.visible.x + (start_col - dst.src_left_offset)) * bypp;
+
+        pix_idx = (idx + (start_col - col)) * bypp;
+        memcpy(vram_dst, &pix_buf[pix_idx], num_pix * bypp);
+
+    skip_pix:
+        idx += pix_in_scanline;
+        col += pix_in_scanline;
+        if (col >= dst.rect.width) {
+            col = 0;
+            row += 1;
+        }
+    }
+    /* Track state of the overall blit for use by the next flush */
+    s->host_data.row = row;
+    s->host_data.col = col;
+
+    /*
+     * TODO: This is setting the entire blit region to dirty.
+     *       We maybe just need this tiny section?
+     */
+    ds = qemu_console_surface(s->vga.con);
+    if (dst.bits >= s->vga.vram_ptr + s->vga.vbe_start_addr &&
+        dst.bits < s->vga.vram_ptr + s->vga.vbe_start_addr +
+        s->vga.vbe_regs[VBE_DISPI_INDEX_YRES] * s->vga.vbe_line_offset) {
+        memory_region_set_dirty(&s->vga.vram, s->vga.vbe_start_addr +
+                                s->regs.dst_offset +
+                                dst.visible.y * surface_stride(ds),
+                                dst.visible.height * surface_stride(ds));
+    }
+}
diff --git a/hw/display/ati_int.h b/hw/display/ati_int.h
index b9142ce6d8..f38f6a43d7 100644
--- a/hw/display/ati_int.h
+++ b/hw/display/ati_int.h
@@ -34,6 +34,8 @@
 #define ATI_RAGE128_LINEAR_APER_SIZE (64 * MiB)
 #define ATI_R100_LINEAR_APER_SIZE (128 * MiB)
 
+#define ATI_HOST_DATA_ACC_BITS 128
+
 #define TYPE_ATI_VGA "ati-vga"
 OBJECT_DECLARE_SIMPLE_TYPE(ATIVGAState, ATI_VGA)
 
@@ -126,5 +128,6 @@ struct ATIVGAState {
 const char *ati_reg_name(int num);
 
 void ati_2d_blt(ATIVGAState *s);
+void ati_flush_host_data(ATIVGAState *s);
 
 #endif /* ATI_INT_H */
diff --git a/hw/display/ati_regs.h b/hw/display/ati_regs.h
index 9eb68fbec6..8f3335817f 100644
--- a/hw/display/ati_regs.h
+++ b/hw/display/ati_regs.h
@@ -404,6 +404,7 @@
 #define GMC_BRUSH_SOLIDCOLOR                    0x000000d0
 #define GMC_SRC_DSTCOLOR                        0x00003000
 #define GMC_BYTE_ORDER_MSB_TO_LSB               0x00000000
+#define GMC_BYTE_ORDER_LSB_TO_MSB               0x00004000
 #define GMC_DP_SRC_RECT                         0x02000000
 #define GMC_3D_FCN_EN_CLR                       0x00000000
 #define GMC_AUX_CLIP_CLEAR                      0x20000000
@@ -431,6 +432,18 @@
 #define GMC_DST_CLIP_DEFAULT                    0x00000000
 #define GMC_DST_CLIP_LEAVE_ALONE                0x00000008
 
+/* DP_GUI_MASTER_CNTL DP_SRC_DATATYPE named constants */
+#define GMC_SRC_DATATYPE_MASK                   0x00003000
+#define GMC_SRC_DATATYPE_MONO_FRGD_BKGD         0x00000000
+#define GMC_SRC_DATATYPE_MONO_FRGD              0x00001000
+#define GMC_SRC_DATATYPE_COLOR                  0x00003000
+
+/* DP_GUI_MASTER_CNTL DP_SRC_SOURCE named constants */
+#define GMC_SRC_SOURCE_MASK                     0x07000000
+#define GMC_SRC_SOURCE_MEMORY                   0x02000000
+#define GMC_SRC_SOURCE_HOST_DATA                0x03000000
+#define GMC_SRC_SOURCE_HOST_DATA_ALIGNED        0x04000000
+
 /* DP_GUI_MASTER_CNTL ROP3 named constants */
 #define GMC_ROP3_MASK                           0x00ff0000
 #define ROP3_BLACKNESS                          0x00000000
-- 
2.51.0



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 00/11] ati-vga: Implement HOST_DATA transfers to
  2025-11-18 15:48 [PATCH v3 00/11] ati-vga: Implement HOST_DATA transfers to Chad Jablonski
                   ` (10 preceding siblings ...)
  2025-11-18 15:48 ` [PATCH v3 11/11] ati-vga: Implement HOST_DATA flush to VRAM Chad Jablonski
@ 2025-12-11  2:33 ` Chad Jablonski
  2025-12-11 11:54   ` BALATON Zoltan
  11 siblings, 1 reply; 14+ messages in thread
From: Chad Jablonski @ 2025-12-11  2:33 UTC (permalink / raw)
  To: Chad Jablonski, qemu-devel; +Cc: balaton

Hey BALATON, just a friendly ping on the v3 HOST_DATA series. If there's
something I can take a look at as far as hardware validation or anything
else let me know! In the meantime I've been getting familiar with CCE and
poking at the microcode a bit.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 00/11] ati-vga: Implement HOST_DATA transfers to
  2025-12-11  2:33 ` [PATCH v3 00/11] ati-vga: Implement HOST_DATA transfers to Chad Jablonski
@ 2025-12-11 11:54   ` BALATON Zoltan
  0 siblings, 0 replies; 14+ messages in thread
From: BALATON Zoltan @ 2025-12-11 11:54 UTC (permalink / raw)
  To: Chad Jablonski; +Cc: qemu-devel

Hello,

On Wed, 10 Dec 2025, Chad Jablonski wrote:
> Hey BALATON, just a friendly ping on the v3 HOST_DATA series. If there's
> something I can take a look at as far as hardware validation or anything
> else let me know! In the meantime I've been getting familiar with CCE and
> poking at the microcode a bit.

I haven't missed it but I did not have time to test and review it yet. I 
hoped somebody else could also help out but I still intend to review it 
when I have more time. For the CCE I've restored the ticket I got about it 
before here: https://codeberg.org/qmiga/pages/issues/3 which has the info 
I've collected so far that may help but I could not find info on the 
microcode of these older ATI cards, Only newer versions seem to be known 
that are linked in the ticket. So unless you can find out how it works the 
way to go may be to parse the packets directly the same way as Xenia 
emulator does and not trying to run the microcode but it would be nice to 
emulate the micro engine if the microcode can be reversed or understood 
enough. Newer versions of the microcode appear to have short segments for 
each packet type that do simple operations, mainly stuffing data in the 
command FIFO so that FIFO may also need to be implemented first then 
another possible improvement is to run the drawing engine in a separate 
thread asynchronously fed through the command FIFO as it works on the real 
chip.

Regards,
BALATON Zoltan


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-12-11 11:55 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-18 15:48 [PATCH v3 00/11] ati-vga: Implement HOST_DATA transfers to Chad Jablonski
2025-11-18 15:48 ` [PATCH v3 01/11] ati-vga: Fix DST_PITCH and SRC_PITCH reads Chad Jablonski
2025-11-18 15:48 ` [PATCH v3 02/11] ati-vga: Add scissor clipping register support Chad Jablonski
2025-11-18 15:48 ` [PATCH v3 03/11] ati-vga: Implement foreground and background color register writes Chad Jablonski
2025-11-18 15:48 ` [PATCH v3 04/11] ati-vga: Latch src and dst pitch and offset on master_cntl default Chad Jablonski
2025-11-18 15:48 ` [PATCH v3 05/11] ati-vga: Fix DP_GUI_MASTER_CNTL register mask Chad Jablonski
2025-11-18 15:48 ` [PATCH v3 06/11] ati-vga: Create dst and sc rectangle helpers Chad Jablonski
2025-11-18 15:48 ` [PATCH v3 07/11] ati-vga: Implement scissor rectangle clipping for 2D operations Chad Jablonski
2025-11-18 15:48 ` [PATCH v3 08/11] ati-vga: Create 2d_blt destination setup helper Chad Jablonski
2025-11-18 15:48 ` [PATCH v3 09/11] ati-vga: Refactor ati_2d_blt to use dst " Chad Jablonski
2025-11-18 15:48 ` [PATCH v3 10/11] ati-vga: Implement HOST_DATA register writes Chad Jablonski
2025-11-18 15:48 ` [PATCH v3 11/11] ati-vga: Implement HOST_DATA flush to VRAM Chad Jablonski
2025-12-11  2:33 ` [PATCH v3 00/11] ati-vga: Implement HOST_DATA transfers to Chad Jablonski
2025-12-11 11:54   ` BALATON Zoltan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).