* [PATCH v2 0/7] ati-vga: Implement HOST_DATA transfers to enable X.org text rendering
@ 2025-11-03 3:36 Chad Jablonski
2025-11-03 3:36 ` [PATCH v2 1/7] ati-vga: Add scissor clipping register support Chad Jablonski
` (6 more replies)
0 siblings, 7 replies; 19+ messages in thread
From: Chad Jablonski @ 2025-11-03 3:36 UTC (permalink / raw)
To: qemu-devel; +Cc: balaton, Chad Jablonski
This series implements HOST_DATA as a blit source enabling text rendering in
xterm under X.org with 2D acceleration.
The series builds up functionality incrementally:
* Patches 1-2: Scissor rectangle clipping registers and implementation
* Patches 3-4: Additional register support needed for HOST_DATA
* Patches 5-7: HOST_DATA register writes, color expansion, and final blit
implementation
Tested with xterm on X.org using the r128 driver built with
--disable-dri (MMIO mode).
Changes from v1:
- Split monolithic patch into 7 logical patches as requested
- Integrate HOST_DATA blit into existing ati_2d_blt()
- Add fallback implementation for non-pixman builds
- Remove unnecessary initialization in realize (kept in reset only)
- Fixed DP_GUI_MASTER_CNTL mask to preserve GMC_SRC_SOURCE field
- Implement scissor clipping
Chad Jablonski (7):
ati-vga: Add scissor clipping register support
ati-vga: Implement scissor rectangle clipping for 2D operations
ati-vga: Implement foreground and background color register writes
ati-vga: Fix DP_GUI_MASTER_CNTL register mask
ati-vga: Implement HOST_DATA register writes
ati-vga: Add expand_colors() helper for monochrome color expansion
ati-vga: Implement HOST_DATA blit source with color expansion
hw/display/ati.c | 77 ++++++++++++-
hw/display/ati_2d.c | 257 +++++++++++++++++++++++++++++++++---------
hw/display/ati_dbg.c | 9 ++
hw/display/ati_int.h | 6 +
hw/display/ati_regs.h | 34 +++++-
5 files changed, 320 insertions(+), 63 deletions(-)
--
2.51.0
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2 1/7] ati-vga: Add scissor clipping register support
2025-11-03 3:36 [PATCH v2 0/7] ati-vga: Implement HOST_DATA transfers to enable X.org text rendering Chad Jablonski
@ 2025-11-03 3:36 ` Chad Jablonski
2025-11-03 13:16 ` BALATON Zoltan
2025-11-03 3:36 ` [PATCH v2 2/7] ati-vga: Implement scissor rectangle clipping for 2D operations Chad Jablonski
` (5 subsequent siblings)
6 siblings, 1 reply; 19+ messages in thread
From: Chad Jablonski @ 2025-11-03 3:36 UTC (permalink / raw)
To: qemu-devel; +Cc: balaton, Chad Jablonski
Implement read and write operations on SC_TOP_LEFT, SC_BOTTOM_RIGHT,
and SRC_SC_BOTTOM_RIGHT registers. These registers are also updated
when the src and/or dst clipping fields on DP_GUI_MASTER_CNTL are set
to default clipping.
Scissor clipping is used when rendering text in X.org. The r128 driver
sends host data much wider than is necessary to draw a glyph and cuts it
down to size using clipping before rendering. The actual clipping
implementation follows in a future patch.
Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
---
hw/display/ati.c | 26 ++++++++++++++++++++++++++
hw/display/ati_int.h | 3 +++
hw/display/ati_regs.h | 12 ++++++++++--
3 files changed, 39 insertions(+), 2 deletions(-)
diff --git a/hw/display/ati.c b/hw/display/ati.c
index 0b4298d078..eb9b30672f 100644
--- a/hw/display/ati.c
+++ b/hw/display/ati.c
@@ -510,6 +510,15 @@ static uint64_t ati_mm_read(void *opaque, hwaddr addr, unsigned int size)
case DEFAULT_SC_BOTTOM_RIGHT:
val = s->regs.default_sc_bottom_right;
break;
+ case SC_TOP_LEFT:
+ val = s->regs.sc_top_left;
+ break;
+ case SC_BOTTOM_RIGHT:
+ val = s->regs.sc_bottom_right;
+ break;
+ case SRC_SC_BOTTOM_RIGHT:
+ val = s->regs.src_sc_bottom_right;
+ break;
default:
break;
}
@@ -862,6 +871,14 @@ static void ati_mm_write(void *opaque, hwaddr addr,
s->regs.dp_datatype = (data & 0x0f00) >> 8 | (data & 0x30f0) << 4 |
(data & 0x4000) << 16;
s->regs.dp_mix = (data & GMC_ROP3_MASK) | (data & 0x7000000) >> 16;
+
+ if ((data & GMC_SRC_CLIPPING_MASK) == GMC_SRC_CLIP_DEFAULT) {
+ s->regs.src_sc_bottom_right = s->regs.default_sc_bottom_right;
+ }
+ if ((data & GMC_DST_CLIPPING_MASK) == GMC_DST_CLIP_DEFAULT) {
+ s->regs.sc_top_left = 0;
+ s->regs.sc_bottom_right = s->regs.default_sc_bottom_right;
+ }
break;
case DST_WIDTH_X:
s->regs.dst_x = data & 0x3fff;
@@ -937,6 +954,15 @@ static void ati_mm_write(void *opaque, hwaddr addr,
case DEFAULT_SC_BOTTOM_RIGHT:
s->regs.default_sc_bottom_right = data & 0x3fff3fff;
break;
+ case SC_TOP_LEFT:
+ s->regs.sc_top_left = data;
+ break;
+ case SC_BOTTOM_RIGHT:
+ s->regs.sc_bottom_right = data;
+ break;
+ case SRC_SC_BOTTOM_RIGHT:
+ s->regs.src_sc_bottom_right = data;
+ break;
default:
break;
}
diff --git a/hw/display/ati_int.h b/hw/display/ati_int.h
index 708cc1dd3a..aab3cbf81a 100644
--- a/hw/display/ati_int.h
+++ b/hw/display/ati_int.h
@@ -86,6 +86,9 @@ typedef struct ATIVGARegs {
uint32_t default_pitch;
uint32_t default_tile;
uint32_t default_sc_bottom_right;
+ uint32_t sc_top_left;
+ uint32_t sc_bottom_right;
+ uint32_t src_sc_bottom_right;
} ATIVGARegs;
struct ATIVGAState {
diff --git a/hw/display/ati_regs.h b/hw/display/ati_regs.h
index d7127748ff..2b56b9fb66 100644
--- a/hw/display/ati_regs.h
+++ b/hw/display/ati_regs.h
@@ -392,8 +392,6 @@
/* DP_GUI_MASTER_CNTL bit constants */
#define GMC_SRC_PITCH_OFFSET_CNTL 0x00000001
#define GMC_DST_PITCH_OFFSET_CNTL 0x00000002
-#define GMC_SRC_CLIP_DEFAULT 0x00000000
-#define GMC_DST_CLIP_DEFAULT 0x00000000
#define GMC_BRUSH_SOLIDCOLOR 0x000000d0
#define GMC_SRC_DSTCOLOR 0x00003000
#define GMC_BYTE_ORDER_MSB_TO_LSB 0x00000000
@@ -404,6 +402,16 @@
#define GMC_WRITE_MASK_SET 0x40000000
#define GMC_DP_CONVERSION_TEMP_6500 0x00000000
+/* DP_GUI_MASTER_CNTL DP_SRC_CLIPPING named constants */
+#define GMC_SRC_CLIPPING_MASK 0x00000004
+#define GMC_SRC_CLIP_DEFAULT 0x00000000
+#define GMC_SRC_CLIP_LEAVE_ALONE 0x00000004
+
+/* DP_GUI_MASTER_CNTL DP_DST_CLIPPING named constants */
+#define GMC_DST_CLIPPING_MASK 0x00000008
+#define GMC_DST_CLIP_DEFAULT 0x00000000
+#define GMC_DST_CLIP_LEAVE_ALONE 0x00000008
+
/* DP_GUI_MASTER_CNTL ROP3 named constants */
#define GMC_ROP3_MASK 0x00ff0000
#define ROP3_BLACKNESS 0x00000000
--
2.51.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 2/7] ati-vga: Implement scissor rectangle clipping for 2D operations
2025-11-03 3:36 [PATCH v2 0/7] ati-vga: Implement HOST_DATA transfers to enable X.org text rendering Chad Jablonski
2025-11-03 3:36 ` [PATCH v2 1/7] ati-vga: Add scissor clipping register support Chad Jablonski
@ 2025-11-03 3:36 ` Chad Jablonski
2025-11-03 13:29 ` BALATON Zoltan
2025-11-03 3:36 ` [PATCH v2 3/7] ati-vga: Implement foreground and background color register writes Chad Jablonski
` (4 subsequent siblings)
6 siblings, 1 reply; 19+ messages in thread
From: Chad Jablonski @ 2025-11-03 3:36 UTC (permalink / raw)
To: qemu-devel; +Cc: balaton, Chad Jablonski
Use scissor registers to clip blit operations. This is required
for text rendering in X using the r128 driver. Without it overly-wide
glyphs are drawn and create all sorts of chaos.
Use QemuRect helpers for calculating the intersection of the
destination and scissor rectangles. Source coordinates are
also updated to reflect clipping. The original destination dimensions
are stored in 'dst' while the clipped rectangle is in 'clipped' for
clear distinction between the two.
Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
---
hw/display/ati_2d.c | 110 +++++++++++++++++++++++++++-----------------
1 file changed, 69 insertions(+), 41 deletions(-)
diff --git a/hw/display/ati_2d.c b/hw/display/ati_2d.c
index 309bb5ccb6..15cf29a061 100644
--- a/hw/display/ati_2d.c
+++ b/hw/display/ati_2d.c
@@ -13,6 +13,7 @@
#include "qemu/log.h"
#include "ui/pixel_ops.h"
#include "ui/console.h"
+#include "ui/rect.h"
/*
* NOTE:
@@ -54,10 +55,35 @@ void ati_2d_blt(ATIVGAState *s)
s->vga.vbe_start_addr, surface_data(ds), surface_stride(ds),
surface_bits_per_pixel(ds),
(s->regs.dp_mix & GMC_ROP3_MASK) >> 16);
- unsigned dst_x = (s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT ?
- s->regs.dst_x : s->regs.dst_x + 1 - s->regs.dst_width);
- unsigned dst_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
- s->regs.dst_y : s->regs.dst_y + 1 - s->regs.dst_height);
+
+ QemuRect dst;
+ {
+ unsigned dst_width = s->regs.dst_width;
+ unsigned dst_height = s->regs.dst_height;
+ unsigned dst_x = (s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT ?
+ s->regs.dst_x : s->regs.dst_x + 1 - dst_width);
+ unsigned dst_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
+ s->regs.dst_y : s->regs.dst_y + 1 - dst_height);
+ qemu_rect_init(&dst, dst_x, dst_y, dst_width, dst_height);
+ }
+
+ QemuRect scissor;
+ {
+ uint16_t sc_left = s->regs.sc_top_left & 0x3fff;
+ uint16_t sc_top = (s->regs.sc_top_left >> 16) & 0x3fff;
+ uint16_t sc_right = s->regs.sc_bottom_right & 0x3fff;
+ uint16_t sc_bottom = (s->regs.sc_bottom_right >> 16) & 0x3fff;
+ qemu_rect_init(&scissor, sc_left, sc_top,
+ sc_right - sc_left + 1, sc_bottom - sc_top + 1);
+ }
+
+ QemuRect clipped;
+ if (!qemu_rect_intersect(&dst, &scissor, &clipped)) {
+ return;
+ }
+ uint32_t clip_left = clipped.x - dst.x;
+ uint32_t clip_top = clipped.y - dst.y;
+
int bpp = ati_bpp_from_datatype(s);
if (!bpp) {
qemu_log_mask(LOG_GUEST_ERROR, "Invalid bpp\n");
@@ -76,17 +102,16 @@ void ati_2d_blt(ATIVGAState *s)
dst_stride *= bpp;
}
uint8_t *end = s->vga.vram_ptr + s->vga.vram_size;
- if (dst_x > 0x3fff || dst_y > 0x3fff || dst_bits >= end
- || dst_bits + dst_x
- + (dst_y + s->regs.dst_height) * dst_stride >= end) {
+ if (clipped.x > 0x3fff || clipped.y > 0x3fff || dst_bits >= end
+ || dst_bits + clipped.x
+ + (clipped.y + clipped.height) * dst_stride >= end) {
qemu_log_mask(LOG_UNIMP, "blt outside vram not implemented\n");
return;
}
DPRINTF("%d %d %d, %d %d %d, (%d,%d) -> (%d,%d) %dx%d %c %c\n",
s->regs.src_offset, s->regs.dst_offset, s->regs.default_offset,
s->regs.src_pitch, s->regs.dst_pitch, s->regs.default_pitch,
- s->regs.src_x, s->regs.src_y, dst_x, dst_y,
- s->regs.dst_width, s->regs.dst_height,
+ s->regs.src_x, s->regs.src_y, dst.x, dst.y, dst.width, dst.height,
(s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT ? '>' : '<'),
(s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ? 'v' : '^'));
switch (s->regs.dp_mix & GMC_ROP3_MASK) {
@@ -94,9 +119,11 @@ void ati_2d_blt(ATIVGAState *s)
{
bool fallback = false;
unsigned src_x = (s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT ?
- s->regs.src_x : s->regs.src_x + 1 - s->regs.dst_width);
+ s->regs.src_x + clip_left :
+ s->regs.src_x + 1 - dst.width + clip_left);
unsigned src_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
- s->regs.src_y : s->regs.src_y + 1 - s->regs.dst_height);
+ s->regs.src_y + clip_top :
+ s->regs.src_y + 1 - dst.height + clip_top);
int src_stride = DEFAULT_CNTL ?
s->regs.src_pitch : s->regs.default_pitch;
if (!src_stride) {
@@ -112,7 +139,7 @@ void ati_2d_blt(ATIVGAState *s)
}
if (src_x > 0x3fff || src_y > 0x3fff || src_bits >= end
|| src_bits + src_x
- + (src_y + s->regs.dst_height) * src_stride >= end) {
+ + (src_y + clipped.height) * src_stride >= end) {
qemu_log_mask(LOG_UNIMP, "blt outside vram not implemented\n");
return;
}
@@ -121,31 +148,31 @@ void ati_2d_blt(ATIVGAState *s)
dst_stride /= sizeof(uint32_t);
DPRINTF("pixman_blt(%p, %p, %d, %d, %d, %d, %d, %d, %d, %d, %d, %d)\n",
src_bits, dst_bits, src_stride, dst_stride, bpp, bpp,
- src_x, src_y, dst_x, dst_y,
- s->regs.dst_width, s->regs.dst_height);
+ src_x, src_y, clipped.x, clipped.y,
+ clipped.width, clipped.height);
#ifdef CONFIG_PIXMAN
if ((s->use_pixman & BIT(1)) &&
s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT &&
s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM) {
fallback = !pixman_blt((uint32_t *)src_bits, (uint32_t *)dst_bits,
src_stride, dst_stride, bpp, bpp,
- src_x, src_y, dst_x, dst_y,
- s->regs.dst_width, s->regs.dst_height);
+ src_x, src_y, clipped.x, clipped.y,
+ clipped.width, clipped.height);
} else if (s->use_pixman & BIT(1)) {
/* FIXME: We only really need a temporary if src and dst overlap */
- int llb = s->regs.dst_width * (bpp / 8);
+ int llb = clipped.width * (bpp / 8);
int tmp_stride = DIV_ROUND_UP(llb, sizeof(uint32_t));
uint32_t *tmp = g_malloc(tmp_stride * sizeof(uint32_t) *
- s->regs.dst_height);
+ clipped.height);
fallback = !pixman_blt((uint32_t *)src_bits, tmp,
src_stride, tmp_stride, bpp, bpp,
src_x, src_y, 0, 0,
- s->regs.dst_width, s->regs.dst_height);
+ clipped.width, clipped.height);
if (!fallback) {
fallback = !pixman_blt(tmp, (uint32_t *)dst_bits,
tmp_stride, dst_stride, bpp, bpp,
- 0, 0, dst_x, dst_y,
- s->regs.dst_width, s->regs.dst_height);
+ 0, 0, clipped.x, clipped.y,
+ clipped.width, clipped.height);
}
g_free(tmp);
} else
@@ -158,17 +185,17 @@ void ati_2d_blt(ATIVGAState *s)
unsigned int src_pitch = src_stride * sizeof(uint32_t);
unsigned int dst_pitch = dst_stride * sizeof(uint32_t);
- for (y = 0; y < s->regs.dst_height; y++) {
- i = dst_x * bypp;
+ for (y = 0; y < clipped.height; y++) {
+ i = clipped.x * bypp;
j = src_x * bypp;
if (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM) {
- i += (dst_y + y) * dst_pitch;
+ i += (clipped.y + y) * dst_pitch;
j += (src_y + y) * src_pitch;
} else {
- i += (dst_y + s->regs.dst_height - 1 - y) * dst_pitch;
- j += (src_y + s->regs.dst_height - 1 - y) * src_pitch;
+ i += (clipped.y + clipped.height - 1 - y) * dst_pitch;
+ j += (src_y + clipped.height - 1 - y) * src_pitch;
}
- memmove(&dst_bits[i], &src_bits[j], s->regs.dst_width * bypp);
+ memmove(&dst_bits[i], &src_bits[j], clipped.width * bypp);
}
}
if (dst_bits >= s->vga.vram_ptr + s->vga.vbe_start_addr &&
@@ -176,13 +203,13 @@ void ati_2d_blt(ATIVGAState *s)
s->vga.vbe_regs[VBE_DISPI_INDEX_YRES] * s->vga.vbe_line_offset) {
memory_region_set_dirty(&s->vga.vram, s->vga.vbe_start_addr +
s->regs.dst_offset +
- dst_y * surface_stride(ds),
- s->regs.dst_height * surface_stride(ds));
+ clipped.y * surface_stride(ds),
+ clipped.height * surface_stride(ds));
}
s->regs.dst_x = (s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT ?
- dst_x + s->regs.dst_width : dst_x);
+ clipped.x + clipped.width : clipped.x);
s->regs.dst_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
- dst_y + s->regs.dst_height : dst_y);
+ clipped.y + clipped.height : clipped.y);
break;
}
case ROP3_PATCOPY:
@@ -207,20 +234,21 @@ void ati_2d_blt(ATIVGAState *s)
dst_stride /= sizeof(uint32_t);
DPRINTF("pixman_fill(%p, %d, %d, %d, %d, %d, %d, %x)\n",
- dst_bits, dst_stride, bpp, dst_x, dst_y,
- s->regs.dst_width, s->regs.dst_height, filler);
+ dst_bits, dst_stride, bpp, clipped.x, clipped.y,
+ clipped.width, clipped.height, filler);
#ifdef CONFIG_PIXMAN
if (!(s->use_pixman & BIT(0)) ||
- !pixman_fill((uint32_t *)dst_bits, dst_stride, bpp, dst_x, dst_y,
- s->regs.dst_width, s->regs.dst_height, filler))
+ !pixman_fill((uint32_t *)dst_bits, dst_stride, bpp,
+ clipped.x, clipped.y, clipped.width, clipped.height,
+ filler))
#endif
{
/* fallback when pixman failed or we don't want to call it */
unsigned int x, y, i, bypp = bpp / 8;
unsigned int dst_pitch = dst_stride * sizeof(uint32_t);
- for (y = 0; y < s->regs.dst_height; y++) {
- i = dst_x * bypp + (dst_y + y) * dst_pitch;
- for (x = 0; x < s->regs.dst_width; x++, i += bypp) {
+ for (y = 0; y < clipped.height; y++) {
+ i = clipped.x * bypp + (clipped.y + y) * dst_pitch;
+ for (x = 0; x < clipped.width; x++, i += bypp) {
stn_he_p(&dst_bits[i], bypp, filler);
}
}
@@ -230,11 +258,11 @@ void ati_2d_blt(ATIVGAState *s)
s->vga.vbe_regs[VBE_DISPI_INDEX_YRES] * s->vga.vbe_line_offset) {
memory_region_set_dirty(&s->vga.vram, s->vga.vbe_start_addr +
s->regs.dst_offset +
- dst_y * surface_stride(ds),
- s->regs.dst_height * surface_stride(ds));
+ clipped.y * surface_stride(ds),
+ clipped.height * surface_stride(ds));
}
s->regs.dst_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
- dst_y + s->regs.dst_height : dst_y);
+ clipped.y + clipped.height : clipped.y);
break;
}
default:
--
2.51.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 3/7] ati-vga: Implement foreground and background color register writes
2025-11-03 3:36 [PATCH v2 0/7] ati-vga: Implement HOST_DATA transfers to enable X.org text rendering Chad Jablonski
2025-11-03 3:36 ` [PATCH v2 1/7] ati-vga: Add scissor clipping register support Chad Jablonski
2025-11-03 3:36 ` [PATCH v2 2/7] ati-vga: Implement scissor rectangle clipping for 2D operations Chad Jablonski
@ 2025-11-03 3:36 ` Chad Jablonski
2025-11-03 13:33 ` BALATON Zoltan
2025-11-03 3:36 ` [PATCH v2 4/7] ati-vga: Fix DP_GUI_MASTER_CNTL register mask Chad Jablonski
` (3 subsequent siblings)
6 siblings, 1 reply; 19+ messages in thread
From: Chad Jablonski @ 2025-11-03 3:36 UTC (permalink / raw)
To: qemu-devel; +Cc: balaton, Chad Jablonski
These are straightforward 32-bit register write handlers. They're
necessary for a future patch which will use them for color expansion
from monochrome host data transfers.
Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
---
hw/display/ati.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/hw/display/ati.c b/hw/display/ati.c
index eb9b30672f..bf7a037e64 100644
--- a/hw/display/ati.c
+++ b/hw/display/ati.c
@@ -926,6 +926,12 @@ static void ati_mm_write(void *opaque, hwaddr addr,
case DP_CNTL:
s->regs.dp_cntl = data;
break;
+ case DP_SRC_FRGD_CLR:
+ s->regs.dp_src_frgd_clr = data;
+ break;
+ case DP_SRC_BKGD_CLR:
+ s->regs.dp_src_bkgd_clr = data;
+ break;
case DP_DATATYPE:
s->regs.dp_datatype = data & 0xe0070f0f;
break;
--
2.51.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 4/7] ati-vga: Fix DP_GUI_MASTER_CNTL register mask
2025-11-03 3:36 [PATCH v2 0/7] ati-vga: Implement HOST_DATA transfers to enable X.org text rendering Chad Jablonski
` (2 preceding siblings ...)
2025-11-03 3:36 ` [PATCH v2 3/7] ati-vga: Implement foreground and background color register writes Chad Jablonski
@ 2025-11-03 3:36 ` Chad Jablonski
2025-11-03 13:46 ` BALATON Zoltan
2025-11-03 3:36 ` [PATCH v2 5/7] ati-vga: Implement HOST_DATA register writes Chad Jablonski
` (2 subsequent siblings)
6 siblings, 1 reply; 19+ messages in thread
From: Chad Jablonski @ 2025-11-03 3:36 UTC (permalink / raw)
To: qemu-devel; +Cc: balaton, Chad Jablonski
Change the register mask from 0xf800000f to 0xff00000f to preserve bits
24-26. This is the GMC_SRC_SOURCE field which is needed to determine
the type of source for the blit operation.
Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
---
hw/display/ati.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/display/ati.c b/hw/display/ati.c
index bf7a037e64..4ff17209c4 100644
--- a/hw/display/ati.c
+++ b/hw/display/ati.c
@@ -867,7 +867,7 @@ static void ati_mm_write(void *opaque, hwaddr addr,
ati_2d_blt(s);
break;
case DP_GUI_MASTER_CNTL:
- s->regs.dp_gui_master_cntl = data & 0xf800000f;
+ s->regs.dp_gui_master_cntl = data & 0xff00000f;
s->regs.dp_datatype = (data & 0x0f00) >> 8 | (data & 0x30f0) << 4 |
(data & 0x4000) << 16;
s->regs.dp_mix = (data & GMC_ROP3_MASK) | (data & 0x7000000) >> 16;
--
2.51.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 5/7] ati-vga: Implement HOST_DATA register writes
2025-11-03 3:36 [PATCH v2 0/7] ati-vga: Implement HOST_DATA transfers to enable X.org text rendering Chad Jablonski
` (3 preceding siblings ...)
2025-11-03 3:36 ` [PATCH v2 4/7] ati-vga: Fix DP_GUI_MASTER_CNTL register mask Chad Jablonski
@ 2025-11-03 3:36 ` Chad Jablonski
2025-11-03 13:58 ` BALATON Zoltan
2025-11-03 3:36 ` [PATCH v2 6/7] ati-vga: Add expand_colors() helper for monochrome color expansion Chad Jablonski
2025-11-03 3:36 ` [PATCH v2 7/7] ati-vga: Implement HOST_DATA blit source with " Chad Jablonski
6 siblings, 1 reply; 19+ messages in thread
From: Chad Jablonski @ 2025-11-03 3:36 UTC (permalink / raw)
To: qemu-devel; +Cc: balaton, Chad Jablonski
Writing to any of the HOST_DATA0-7 registers pushes the written data
into a buffer. A final write to HOST_DATA_LAST writes data to the
buffer and triggers the pending blit operation.
The buffer for now is a static 4MiB and overflows are checked. This
seems like a large enough value given what I've seen in testing. Future
work could dynamically size the buffer based on the destination dimensions if
needed.
This sets things up for implementation of HOST_DATA as a blit operation
source in a future patch.
Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
---
hw/display/ati.c | 15 +++++++++++++++
hw/display/ati_dbg.c | 9 +++++++++
hw/display/ati_int.h | 3 +++
hw/display/ati_regs.h | 9 +++++++++
4 files changed, 36 insertions(+)
diff --git a/hw/display/ati.c b/hw/display/ati.c
index 4ff17209c4..0a686750ae 100644
--- a/hw/display/ati.c
+++ b/hw/display/ati.c
@@ -969,6 +969,20 @@ static void ati_mm_write(void *opaque, hwaddr addr,
case SRC_SC_BOTTOM_RIGHT:
s->regs.src_sc_bottom_right = data;
break;
+ case HOST_DATA0 ... HOST_DATA7:
+ case HOST_DATA_LAST:
+ if (s->host_data_pos + 4 > sizeof(s->host_data_buffer)) {
+ qemu_log_mask(LOG_UNIMP, "HOST_DATA buffer overflow "
+ "(buffer size: %zu bytes)\n",
+ sizeof(s->host_data_buffer));
+ return;
+ }
+ stn_he_p(&s->host_data_buffer[s->host_data_pos], 4, data);
+ s->host_data_pos += 4;
+ if (addr == HOST_DATA_LAST) {
+ ati_2d_blt(s);
+ }
+ break;
default:
break;
}
@@ -1074,6 +1088,7 @@ static void ati_vga_reset(DeviceState *dev)
/* reset vga */
vga_common_reset(&s->vga);
s->mode = VGA_MODE;
+ s->host_data_pos = 0;
}
static void ati_vga_exit(PCIDevice *dev)
diff --git a/hw/display/ati_dbg.c b/hw/display/ati_dbg.c
index 3ffa7f35df..5c799d540a 100644
--- a/hw/display/ati_dbg.c
+++ b/hw/display/ati_dbg.c
@@ -252,6 +252,15 @@ static struct ati_regdesc ati_reg_names[] = {
{"MC_SRC1_CNTL", 0x19D8},
{"TEX_CNTL", 0x1800},
{"RAGE128_MPP_TB_CONFIG", 0x01c0},
+ {"HOST_DATA0", 0x17c0},
+ {"HOST_DATA1", 0x17c4},
+ {"HOST_DATA2", 0x17c8},
+ {"HOST_DATA3", 0x17cc},
+ {"HOST_DATA4", 0x17d0},
+ {"HOST_DATA5", 0x17d4},
+ {"HOST_DATA6", 0x17d8},
+ {"HOST_DATA7", 0x17dc},
+ {"HOST_DATA_LAST", 0x17e0},
{NULL, -1}
};
diff --git a/hw/display/ati_int.h b/hw/display/ati_int.h
index aab3cbf81a..16e5d29a5a 100644
--- a/hw/display/ati_int.h
+++ b/hw/display/ati_int.h
@@ -15,6 +15,7 @@
#include "hw/i2c/bitbang_i2c.h"
#include "vga_int.h"
#include "qom/object.h"
+#include "qemu/units.h"
/*#define DEBUG_ATI*/
@@ -108,6 +109,8 @@ struct ATIVGAState {
MemoryRegion io;
MemoryRegion mm;
ATIVGARegs regs;
+ uint32_t host_data_pos;
+ uint8_t host_data_buffer[4 * MiB];
};
const char *ati_reg_name(int num);
diff --git a/hw/display/ati_regs.h b/hw/display/ati_regs.h
index 2b56b9fb66..9b52b61dcb 100644
--- a/hw/display/ati_regs.h
+++ b/hw/display/ati_regs.h
@@ -252,6 +252,15 @@
#define DP_T12_CNTL 0x178c
#define DST_BRES_T1_LNTH 0x1790
#define DST_BRES_T2_LNTH 0x1794
+#define HOST_DATA0 0x17c0
+#define HOST_DATA1 0x17c4
+#define HOST_DATA2 0x17c8
+#define HOST_DATA3 0x17cc
+#define HOST_DATA4 0x17d0
+#define HOST_DATA5 0x17d4
+#define HOST_DATA6 0x17d8
+#define HOST_DATA7 0x17dc
+#define HOST_DATA_LAST 0x17e0
#define SCALE_SRC_HEIGHT_WIDTH 0x1994
#define SCALE_OFFSET_0 0x1998
#define SCALE_PITCH 0x199c
--
2.51.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 6/7] ati-vga: Add expand_colors() helper for monochrome color expansion
2025-11-03 3:36 [PATCH v2 0/7] ati-vga: Implement HOST_DATA transfers to enable X.org text rendering Chad Jablonski
` (4 preceding siblings ...)
2025-11-03 3:36 ` [PATCH v2 5/7] ati-vga: Implement HOST_DATA register writes Chad Jablonski
@ 2025-11-03 3:36 ` Chad Jablonski
2025-11-03 14:03 ` BALATON Zoltan
2025-11-03 3:36 ` [PATCH v2 7/7] ati-vga: Implement HOST_DATA blit source with " Chad Jablonski
6 siblings, 1 reply; 19+ messages in thread
From: Chad Jablonski @ 2025-11-03 3:36 UTC (permalink / raw)
To: qemu-devel; +Cc: balaton, Chad Jablonski
Convert 1bpp monochrome images to 32bpp ARGB given a foreground and
background color. This also supports most significant and least
significant bit ordering.
This is useful for host data transfers of glyphs when drawing text in X.
Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
---
hw/display/ati_2d.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/hw/display/ati_2d.c b/hw/display/ati_2d.c
index 15cf29a061..181bf634f0 100644
--- a/hw/display/ati_2d.c
+++ b/hw/display/ati_2d.c
@@ -45,6 +45,28 @@ static int ati_bpp_from_datatype(ATIVGAState *s)
}
#define DEFAULT_CNTL (s->regs.dp_gui_master_cntl & GMC_DST_PITCH_OFFSET_CNTL)
+/* Convert 1bpp monochrome data to 32bpp ARGB using color expansion */
+static void expand_colors(uint8_t *color_dst, const uint8_t *mono_src,
+ uint32_t width, uint32_t height,
+ uint32_t fg_color, uint32_t bg_color,
+ bool lsb_to_msb)
+{
+ uint32_t byte, color;
+ uint8_t *pixel;
+ int i, j, bit;
+ /* Rows are 32-bit aligned */
+ int bytes_per_row = ((width + 31) / 32) * 4;
+
+ for (i = 0; i < height; i++) {
+ for (j = 0; j < width; j++) {
+ byte = mono_src[i * bytes_per_row + (j / 8)];
+ bit = lsb_to_msb ? 7 - (j % 8) : j % 8;
+ color = (byte >> bit) & 0x1 ? fg_color : bg_color;
+ pixel = &color_dst[(i * width + j) * 4];
+ memcpy(pixel, &color, sizeof(color));
+ }
+ }
+}
void ati_2d_blt(ATIVGAState *s)
{
--
2.51.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 7/7] ati-vga: Implement HOST_DATA blit source with color expansion
2025-11-03 3:36 [PATCH v2 0/7] ati-vga: Implement HOST_DATA transfers to enable X.org text rendering Chad Jablonski
` (5 preceding siblings ...)
2025-11-03 3:36 ` [PATCH v2 6/7] ati-vga: Add expand_colors() helper for monochrome color expansion Chad Jablonski
@ 2025-11-03 3:36 ` Chad Jablonski
6 siblings, 0 replies; 19+ messages in thread
From: Chad Jablonski @ 2025-11-03 3:36 UTC (permalink / raw)
To: qemu-devel; +Cc: balaton, Chad Jablonski
SRCCOPY blits using 1bpp HOST_DATA as a source are expanded
to 32bpp ARGB. If pixman is enabled any additional color depth conversions
are handled. The fallback path does not yet support color conversion
and logs an error.
Unlike VRAM sourced blits, host data blits are not triggered by writing
to the dst width registers. They're only triggered on HOST_DATA_LAST
writes.
Supports MONO_FRGD_BKGD and COLOR datatypes. MONO_FRGD (transparent
background) is left for future work and logged.
GMC_SRC_SOURCE_HOST_DATA_ALIGNED is unimplemented and also logged.
Neither of these are used for xterm text rendering.
This combines clipping, host data transfers and color expansion to
enable text rendering in xterm under X.org.
Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
---
hw/display/ati.c | 28 +++++++--
hw/display/ati_2d.c | 135 +++++++++++++++++++++++++++++++++++-------
hw/display/ati_regs.h | 13 ++++
3 files changed, 152 insertions(+), 24 deletions(-)
diff --git a/hw/display/ati.c b/hw/display/ati.c
index 0a686750ae..6ec50279ed 100644
--- a/hw/display/ati.c
+++ b/hw/display/ati.c
@@ -813,9 +813,14 @@ static void ati_mm_write(void *opaque, hwaddr addr,
}
break;
case DST_WIDTH:
+ {
+ uint32_t src = s->regs.dp_gui_master_cntl & GMC_SRC_SOURCE_MASK;
s->regs.dst_width = data & 0x3fff;
- ati_2d_blt(s);
+ if (src != GMC_SRC_SOURCE_HOST_DATA) {
+ ati_2d_blt(s);
+ }
break;
+ }
case DST_HEIGHT:
s->regs.dst_height = data & 0x3fff;
break;
@@ -862,10 +867,15 @@ static void ati_mm_write(void *opaque, hwaddr addr,
s->regs.dst_y = (data >> 16) & 0x3fff;
break;
case DST_HEIGHT_WIDTH:
+ {
+ uint32_t src = s->regs.dp_gui_master_cntl & GMC_SRC_SOURCE_MASK;
s->regs.dst_width = data & 0x3fff;
s->regs.dst_height = (data >> 16) & 0x3fff;
- ati_2d_blt(s);
+ if (src != GMC_SRC_SOURCE_HOST_DATA) {
+ ati_2d_blt(s);
+ }
break;
+ }
case DP_GUI_MASTER_CNTL:
s->regs.dp_gui_master_cntl = data & 0xff00000f;
s->regs.dp_datatype = (data & 0x0f00) >> 8 | (data & 0x30f0) << 4 |
@@ -881,10 +891,15 @@ static void ati_mm_write(void *opaque, hwaddr addr,
}
break;
case DST_WIDTH_X:
+ {
+ uint32_t src = s->regs.dp_gui_master_cntl & GMC_SRC_SOURCE_MASK;
s->regs.dst_x = data & 0x3fff;
s->regs.dst_width = (data >> 16) & 0x3fff;
- ati_2d_blt(s);
+ if (src != GMC_SRC_SOURCE_HOST_DATA) {
+ ati_2d_blt(s);
+ }
break;
+ }
case SRC_X_Y:
s->regs.src_y = data & 0x3fff;
s->regs.src_x = (data >> 16) & 0x3fff;
@@ -894,10 +909,15 @@ static void ati_mm_write(void *opaque, hwaddr addr,
s->regs.dst_x = (data >> 16) & 0x3fff;
break;
case DST_WIDTH_HEIGHT:
+ {
+ uint32_t src = s->regs.dp_gui_master_cntl & GMC_SRC_SOURCE_MASK;
s->regs.dst_height = data & 0x3fff;
s->regs.dst_width = (data >> 16) & 0x3fff;
- ati_2d_blt(s);
+ if (src != GMC_SRC_SOURCE_HOST_DATA) {
+ ati_2d_blt(s);
+ }
break;
+ }
case DST_HEIGHT_Y:
s->regs.dst_y = data & 0x3fff;
s->regs.dst_height = (data >> 16) & 0x3fff;
diff --git a/hw/display/ati_2d.c b/hw/display/ati_2d.c
index 181bf634f0..c177686338 100644
--- a/hw/display/ati_2d.c
+++ b/hw/display/ati_2d.c
@@ -25,6 +25,9 @@
* possible.
*/
+#define DEFAULT_CNTL (s->regs.dp_gui_master_cntl & GMC_DST_PITCH_OFFSET_CNTL)
+#define EXPANDED_SRC_BPP 32
+
static int ati_bpp_from_datatype(ATIVGAState *s)
{
switch (s->regs.dp_datatype & 0xf) {
@@ -44,7 +47,6 @@ static int ati_bpp_from_datatype(ATIVGAState *s)
}
}
-#define DEFAULT_CNTL (s->regs.dp_gui_master_cntl & GMC_DST_PITCH_OFFSET_CNTL)
/* Convert 1bpp monochrome data to 32bpp ARGB using color expansion */
static void expand_colors(uint8_t *color_dst, const uint8_t *mono_src,
uint32_t width, uint32_t height,
@@ -139,30 +141,112 @@ void ati_2d_blt(ATIVGAState *s)
switch (s->regs.dp_mix & GMC_ROP3_MASK) {
case ROP3_SRCCOPY:
{
+ uint32_t src = s->regs.dp_gui_master_cntl & GMC_SRC_SOURCE_MASK;
+ uint8_t *color_data = NULL;
bool fallback = false;
- unsigned src_x = (s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT ?
- s->regs.src_x + clip_left :
- s->regs.src_x + 1 - dst.width + clip_left);
- unsigned src_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
- s->regs.src_y + clip_top :
- s->regs.src_y + 1 - dst.height + clip_top);
- int src_stride = DEFAULT_CNTL ?
+ unsigned src_x, src_y, src_stride, src_bpp;
+ uint8_t *src_bits;
+
+ switch (src) {
+ case GMC_SRC_SOURCE_HOST_DATA:
+ {
+ unsigned src_datatype = s->regs.dp_gui_master_cntl &
+ GMC_SRC_DATATYPE_MASK;
+ switch (src_datatype) {
+ case GMC_SRC_DATATYPE_MONO_FRGD_BKGD:
+ {
+ bool lsb_to_msb = s->regs.dp_gui_master_cntl &
+ GMC_BYTE_ORDER_LSB_TO_MSB;
+ /* Monochrome source is 1 bpp aligned to 32-bit rows */
+ uint32_t mono_size = ((dst.width + 31) / 32) * 4 * dst.height;
+ if (s->host_data_pos < mono_size) {
+ qemu_log_mask(LOG_UNIMP,
+ "HOST_DATA blit requires %u bytes, buffer holds %u "
+ "(increase buffer size)\n",
+ mono_size, s->host_data_pos);
+ return;
+ }
+
+ /* Expand all of the source, clipping will be applied later */
+ color_data = g_malloc(dst.width * dst.height *
+ sizeof(uint32_t));
+ src_bpp = EXPANDED_SRC_BPP;
+ expand_colors(color_data, s->host_data_buffer,
+ dst.width, dst.height, s->regs.dp_src_frgd_clr,
+ s->regs.dp_src_bkgd_clr, lsb_to_msb);
+ break;
+ }
+ case GMC_SRC_DATATYPE_COLOR:
+ {
+ uint32_t color_size = dst.width * dst.height * (bpp / 8);
+ if (s->host_data_pos < color_size) {
+ qemu_log_mask(LOG_UNIMP,
+ "HOST_DATA blit requires %u bytes, buffer holds %u "
+ "(increase buffer size)\n",
+ color_size, s->host_data_pos);
+ return;
+ }
+ /*
+ * The rage128 register guide states that the bit depth in this
+ * case matches the bit depth of the dst. There is no
+ * independent bit depth register for the src.
+ */
+ src_bpp = bpp;
+ color_data = s->host_data_buffer;
+ break;
+ }
+ case GMC_SRC_DATATYPE_MONO_FRGD:
+ qemu_log_mask(LOG_UNIMP, "ati_2d blt source datatype "
+ "MONO_FRGD (leave-alone) not yet supported\n");
+ return;
+ default:
+ qemu_log_mask(LOG_UNIMP, "ati_2d blt source datatype %x is "
+ "not yet supported\n", src_datatype);
+ return;
+ }
+
+ src_x = clip_left;
+ src_y = clip_top;
+ src_stride = dst.width * (src_bpp / 8);
+ src_bits = color_data;
+ s->host_data_pos = 0;
+ break;
+ }
+ case GMC_SRC_SOURCE_MEMORY:
+ {
+ src_x = (s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT ?
+ s->regs.src_x + clip_left :
+ s->regs.src_x + 1 - dst.width + clip_left);
+ src_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
+ s->regs.src_y + clip_top :
+ s->regs.src_y + 1 - dst.height + clip_top);
+ src_stride = DEFAULT_CNTL ?
s->regs.src_pitch : s->regs.default_pitch;
- if (!src_stride) {
- qemu_log_mask(LOG_GUEST_ERROR, "Zero source pitch\n");
+ src_bpp = bpp;
+ src_bits = s->vga.vram_ptr + (DEFAULT_CNTL ?
+ s->regs.src_offset : s->regs.default_offset);
+ if (s->dev_id == PCI_DEVICE_ID_ATI_RAGE128_PF) {
+ src_bits += s->regs.crtc_offset & 0x07ffffff;
+ src_stride *= bpp;
+ }
+ if (src_x > 0x3fff || src_y > 0x3fff || src_bits >= end
+ || src_bits + src_x
+ + (src_y + clipped.height) * src_stride >= end) {
+ qemu_log_mask(LOG_UNIMP, "blt outside vram not implemented\n");
+ return;
+ }
+ break;
+ }
+ case GMC_SRC_SOURCE_HOST_DATA_ALIGNED:
+ default:
+ qemu_log_mask(LOG_UNIMP, "ati_2d blt source %x is not "
+ "yet supported\n", src);
return;
}
- uint8_t *src_bits = s->vga.vram_ptr + (DEFAULT_CNTL ?
- s->regs.src_offset : s->regs.default_offset);
- if (s->dev_id == PCI_DEVICE_ID_ATI_RAGE128_PF) {
- src_bits += s->regs.crtc_offset & 0x07ffffff;
- src_stride *= bpp;
- }
- if (src_x > 0x3fff || src_y > 0x3fff || src_bits >= end
- || src_bits + src_x
- + (src_y + clipped.height) * src_stride >= end) {
- qemu_log_mask(LOG_UNIMP, "blt outside vram not implemented\n");
+ if (!src_stride) {
+ qemu_log_mask(LOG_GUEST_ERROR, "Zero source pitch\n");
+ g_free(color_data);
return;
}
@@ -207,6 +291,15 @@ void ati_2d_blt(ATIVGAState *s)
unsigned int src_pitch = src_stride * sizeof(uint32_t);
unsigned int dst_pitch = dst_stride * sizeof(uint32_t);
+ if (src_bpp != bpp) {
+ qemu_log_mask(LOG_UNIMP,
+ "Mismatched bit depths not yet supported "
+ "in the fallback (non-pixman) implementation. "
+ "src: %d != dst: %d\n", src_bpp, bpp);
+ g_free(color_data);
+ return;
+ }
+
for (y = 0; y < clipped.height; y++) {
i = clipped.x * bypp;
j = src_x * bypp;
@@ -232,6 +325,8 @@ void ati_2d_blt(ATIVGAState *s)
clipped.x + clipped.width : clipped.x);
s->regs.dst_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
clipped.y + clipped.height : clipped.y);
+
+ g_free(color_data);
break;
}
case ROP3_PATCOPY:
diff --git a/hw/display/ati_regs.h b/hw/display/ati_regs.h
index 9b52b61dcb..b741c3c6b8 100644
--- a/hw/display/ati_regs.h
+++ b/hw/display/ati_regs.h
@@ -404,6 +404,7 @@
#define GMC_BRUSH_SOLIDCOLOR 0x000000d0
#define GMC_SRC_DSTCOLOR 0x00003000
#define GMC_BYTE_ORDER_MSB_TO_LSB 0x00000000
+#define GMC_BYTE_ORDER_LSB_TO_MSB 0x00004000
#define GMC_DP_SRC_RECT 0x02000000
#define GMC_3D_FCN_EN_CLR 0x00000000
#define GMC_AUX_CLIP_CLEAR 0x20000000
@@ -421,6 +422,18 @@
#define GMC_DST_CLIP_DEFAULT 0x00000000
#define GMC_DST_CLIP_LEAVE_ALONE 0x00000008
+/* DP_GUI_MASTER_CNTL DP_SRC_DATATYPE named constants */
+#define GMC_SRC_DATATYPE_MASK 0x00003000
+#define GMC_SRC_DATATYPE_MONO_FRGD_BKGD 0x00000000
+#define GMC_SRC_DATATYPE_MONO_FRGD 0x00001000
+#define GMC_SRC_DATATYPE_COLOR 0x00003000
+
+/* DP_GUI_MASTER_CNTL DP_SRC_SOURCE named constants */
+#define GMC_SRC_SOURCE_MASK 0x07000000
+#define GMC_SRC_SOURCE_MEMORY 0x02000000
+#define GMC_SRC_SOURCE_HOST_DATA 0x03000000
+#define GMC_SRC_SOURCE_HOST_DATA_ALIGNED 0x04000000
+
/* DP_GUI_MASTER_CNTL ROP3 named constants */
#define GMC_ROP3_MASK 0x00ff0000
#define ROP3_BLACKNESS 0x00000000
--
2.51.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v2 1/7] ati-vga: Add scissor clipping register support
2025-11-03 3:36 ` [PATCH v2 1/7] ati-vga: Add scissor clipping register support Chad Jablonski
@ 2025-11-03 13:16 ` BALATON Zoltan
2025-11-04 18:22 ` Chad Jablonski
0 siblings, 1 reply; 19+ messages in thread
From: BALATON Zoltan @ 2025-11-03 13:16 UTC (permalink / raw)
To: Chad Jablonski; +Cc: qemu-devel
On Sun, 2 Nov 2025, Chad Jablonski wrote:
> Implement read and write operations on SC_TOP_LEFT, SC_BOTTOM_RIGHT,
> and SRC_SC_BOTTOM_RIGHT registers. These registers are also updated
> when the src and/or dst clipping fields on DP_GUI_MASTER_CNTL are set
> to default clipping.
>
> Scissor clipping is used when rendering text in X.org. The r128 driver
> sends host data much wider than is necessary to draw a glyph and cuts it
> down to size using clipping before rendering. The actual clipping
> implementation follows in a future patch.
>
> Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
> ---
> hw/display/ati.c | 26 ++++++++++++++++++++++++++
> hw/display/ati_int.h | 3 +++
> hw/display/ati_regs.h | 12 ++++++++++--
> 3 files changed, 39 insertions(+), 2 deletions(-)
>
> diff --git a/hw/display/ati.c b/hw/display/ati.c
> index 0b4298d078..eb9b30672f 100644
> --- a/hw/display/ati.c
> +++ b/hw/display/ati.c
> @@ -510,6 +510,15 @@ static uint64_t ati_mm_read(void *opaque, hwaddr addr, unsigned int size)
> case DEFAULT_SC_BOTTOM_RIGHT:
> val = s->regs.default_sc_bottom_right;
> break;
> + case SC_TOP_LEFT:
> + val = s->regs.sc_top_left;
> + break;
> + case SC_BOTTOM_RIGHT:
> + val = s->regs.sc_bottom_right;
> + break;
> + case SRC_SC_BOTTOM_RIGHT:
> + val = s->regs.src_sc_bottom_right;
> + break;
> default:
> break;
> }
> @@ -862,6 +871,14 @@ static void ati_mm_write(void *opaque, hwaddr addr,
> s->regs.dp_datatype = (data & 0x0f00) >> 8 | (data & 0x30f0) << 4 |
> (data & 0x4000) << 16;
> s->regs.dp_mix = (data & GMC_ROP3_MASK) | (data & 0x7000000) >> 16;
> +
> + if ((data & GMC_SRC_CLIPPING_MASK) == GMC_SRC_CLIP_DEFAULT) {
> + s->regs.src_sc_bottom_right = s->regs.default_sc_bottom_right;
> + }
> + if ((data & GMC_DST_CLIPPING_MASK) == GMC_DST_CLIP_DEFAULT) {
> + s->regs.sc_top_left = 0;
> + s->regs.sc_bottom_right = s->regs.default_sc_bottom_right;
> + }
Or is this what you meant by style? Now I get that. I think the bits
should not reset the regs just cause the operation to use the default
values instead but if you can verify what actual hardware does that would
be best.
Regards,
BALATON Zoltan
> break;
> case DST_WIDTH_X:
> s->regs.dst_x = data & 0x3fff;
> @@ -937,6 +954,15 @@ static void ati_mm_write(void *opaque, hwaddr addr,
> case DEFAULT_SC_BOTTOM_RIGHT:
> s->regs.default_sc_bottom_right = data & 0x3fff3fff;
> break;
> + case SC_TOP_LEFT:
> + s->regs.sc_top_left = data;
> + break;
> + case SC_BOTTOM_RIGHT:
> + s->regs.sc_bottom_right = data;
> + break;
> + case SRC_SC_BOTTOM_RIGHT:
> + s->regs.src_sc_bottom_right = data;
> + break;
> default:
> break;
> }
> diff --git a/hw/display/ati_int.h b/hw/display/ati_int.h
> index 708cc1dd3a..aab3cbf81a 100644
> --- a/hw/display/ati_int.h
> +++ b/hw/display/ati_int.h
> @@ -86,6 +86,9 @@ typedef struct ATIVGARegs {
> uint32_t default_pitch;
> uint32_t default_tile;
> uint32_t default_sc_bottom_right;
> + uint32_t sc_top_left;
> + uint32_t sc_bottom_right;
> + uint32_t src_sc_bottom_right;
> } ATIVGARegs;
>
> struct ATIVGAState {
> diff --git a/hw/display/ati_regs.h b/hw/display/ati_regs.h
> index d7127748ff..2b56b9fb66 100644
> --- a/hw/display/ati_regs.h
> +++ b/hw/display/ati_regs.h
> @@ -392,8 +392,6 @@
> /* DP_GUI_MASTER_CNTL bit constants */
> #define GMC_SRC_PITCH_OFFSET_CNTL 0x00000001
> #define GMC_DST_PITCH_OFFSET_CNTL 0x00000002
> -#define GMC_SRC_CLIP_DEFAULT 0x00000000
> -#define GMC_DST_CLIP_DEFAULT 0x00000000
> #define GMC_BRUSH_SOLIDCOLOR 0x000000d0
> #define GMC_SRC_DSTCOLOR 0x00003000
> #define GMC_BYTE_ORDER_MSB_TO_LSB 0x00000000
> @@ -404,6 +402,16 @@
> #define GMC_WRITE_MASK_SET 0x40000000
> #define GMC_DP_CONVERSION_TEMP_6500 0x00000000
>
> +/* DP_GUI_MASTER_CNTL DP_SRC_CLIPPING named constants */
> +#define GMC_SRC_CLIPPING_MASK 0x00000004
> +#define GMC_SRC_CLIP_DEFAULT 0x00000000
> +#define GMC_SRC_CLIP_LEAVE_ALONE 0x00000004
> +
> +/* DP_GUI_MASTER_CNTL DP_DST_CLIPPING named constants */
> +#define GMC_DST_CLIPPING_MASK 0x00000008
> +#define GMC_DST_CLIP_DEFAULT 0x00000000
> +#define GMC_DST_CLIP_LEAVE_ALONE 0x00000008
> +
> /* DP_GUI_MASTER_CNTL ROP3 named constants */
> #define GMC_ROP3_MASK 0x00ff0000
> #define ROP3_BLACKNESS 0x00000000
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 2/7] ati-vga: Implement scissor rectangle clipping for 2D operations
2025-11-03 3:36 ` [PATCH v2 2/7] ati-vga: Implement scissor rectangle clipping for 2D operations Chad Jablonski
@ 2025-11-03 13:29 ` BALATON Zoltan
2025-11-05 19:29 ` Chad Jablonski
0 siblings, 1 reply; 19+ messages in thread
From: BALATON Zoltan @ 2025-11-03 13:29 UTC (permalink / raw)
To: Chad Jablonski; +Cc: qemu-devel
On Sun, 2 Nov 2025, Chad Jablonski wrote:
> Use scissor registers to clip blit operations. This is required
> for text rendering in X using the r128 driver. Without it overly-wide
> glyphs are drawn and create all sorts of chaos.
>
> Use QemuRect helpers for calculating the intersection of the
> destination and scissor rectangles. Source coordinates are
> also updated to reflect clipping. The original destination dimensions
> are stored in 'dst' while the clipped rectangle is in 'clipped' for
> clear distinction between the two.
>
> Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
> ---
> hw/display/ati_2d.c | 110 +++++++++++++++++++++++++++-----------------
> 1 file changed, 69 insertions(+), 41 deletions(-)
>
> diff --git a/hw/display/ati_2d.c b/hw/display/ati_2d.c
> index 309bb5ccb6..15cf29a061 100644
> --- a/hw/display/ati_2d.c
> +++ b/hw/display/ati_2d.c
> @@ -13,6 +13,7 @@
> #include "qemu/log.h"
> #include "ui/pixel_ops.h"
> #include "ui/console.h"
> +#include "ui/rect.h"
>
> /*
> * NOTE:
> @@ -54,10 +55,35 @@ void ati_2d_blt(ATIVGAState *s)
> s->vga.vbe_start_addr, surface_data(ds), surface_stride(ds),
> surface_bits_per_pixel(ds),
> (s->regs.dp_mix & GMC_ROP3_MASK) >> 16);
> - unsigned dst_x = (s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT ?
> - s->regs.dst_x : s->regs.dst_x + 1 - s->regs.dst_width);
> - unsigned dst_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
> - s->regs.dst_y : s->regs.dst_y + 1 - s->regs.dst_height);
> +
> + QemuRect dst;
> + {
> + unsigned dst_width = s->regs.dst_width;
> + unsigned dst_height = s->regs.dst_height;
> + unsigned dst_x = (s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT ?
> + s->regs.dst_x : s->regs.dst_x + 1 - dst_width);
> + unsigned dst_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
> + s->regs.dst_y : s->regs.dst_y + 1 - dst_height);
> + qemu_rect_init(&dst, dst_x, dst_y, dst_width, dst_height);
> + }
This is a bit unusual style putting variable init in a block. I'm not sure
it's acceptable for QEMU, maybe you could put it in a static helper
function which is more usual style.
> +
> + QemuRect scissor;
> + {
> + uint16_t sc_left = s->regs.sc_top_left & 0x3fff;
> + uint16_t sc_top = (s->regs.sc_top_left >> 16) & 0x3fff;
> + uint16_t sc_right = s->regs.sc_bottom_right & 0x3fff;
> + uint16_t sc_bottom = (s->regs.sc_bottom_right >> 16) & 0x3fff;
> + qemu_rect_init(&scissor, sc_left, sc_top,
> + sc_right - sc_left + 1, sc_bottom - sc_top + 1);
> + }
This could be checked on real hardware too what happens if you store
something in reserved bits (the docs may suggest that e.g. SC_BOTTOM,
SC_RIGHT and SC_BOTTOM_RIGHT might be the same register so writing
reserved bits may overwrite others or masked out by hardware but it's not
clear from docs; the rage128pro docs aren't even clear on what the limits
are as the summary text in the section 3.28 gives different limits than
individual register descriptions right after that). To simplify using
these values I generally tried to apply reserved bits mask on write so no
need to do that at read and using the values. Maybe these should do the
same?
Regards,
BALATON Zoltan
> +
> + QemuRect clipped;
> + if (!qemu_rect_intersect(&dst, &scissor, &clipped)) {
> + return;
> + }
> + uint32_t clip_left = clipped.x - dst.x;
> + uint32_t clip_top = clipped.y - dst.y;
> +
> int bpp = ati_bpp_from_datatype(s);
> if (!bpp) {
> qemu_log_mask(LOG_GUEST_ERROR, "Invalid bpp\n");
> @@ -76,17 +102,16 @@ void ati_2d_blt(ATIVGAState *s)
> dst_stride *= bpp;
> }
> uint8_t *end = s->vga.vram_ptr + s->vga.vram_size;
> - if (dst_x > 0x3fff || dst_y > 0x3fff || dst_bits >= end
> - || dst_bits + dst_x
> - + (dst_y + s->regs.dst_height) * dst_stride >= end) {
> + if (clipped.x > 0x3fff || clipped.y > 0x3fff || dst_bits >= end
> + || dst_bits + clipped.x
> + + (clipped.y + clipped.height) * dst_stride >= end) {
> qemu_log_mask(LOG_UNIMP, "blt outside vram not implemented\n");
> return;
> }
> DPRINTF("%d %d %d, %d %d %d, (%d,%d) -> (%d,%d) %dx%d %c %c\n",
> s->regs.src_offset, s->regs.dst_offset, s->regs.default_offset,
> s->regs.src_pitch, s->regs.dst_pitch, s->regs.default_pitch,
> - s->regs.src_x, s->regs.src_y, dst_x, dst_y,
> - s->regs.dst_width, s->regs.dst_height,
> + s->regs.src_x, s->regs.src_y, dst.x, dst.y, dst.width, dst.height,
> (s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT ? '>' : '<'),
> (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ? 'v' : '^'));
> switch (s->regs.dp_mix & GMC_ROP3_MASK) {
> @@ -94,9 +119,11 @@ void ati_2d_blt(ATIVGAState *s)
> {
> bool fallback = false;
> unsigned src_x = (s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT ?
> - s->regs.src_x : s->regs.src_x + 1 - s->regs.dst_width);
> + s->regs.src_x + clip_left :
> + s->regs.src_x + 1 - dst.width + clip_left);
> unsigned src_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
> - s->regs.src_y : s->regs.src_y + 1 - s->regs.dst_height);
> + s->regs.src_y + clip_top :
> + s->regs.src_y + 1 - dst.height + clip_top);
> int src_stride = DEFAULT_CNTL ?
> s->regs.src_pitch : s->regs.default_pitch;
> if (!src_stride) {
> @@ -112,7 +139,7 @@ void ati_2d_blt(ATIVGAState *s)
> }
> if (src_x > 0x3fff || src_y > 0x3fff || src_bits >= end
> || src_bits + src_x
> - + (src_y + s->regs.dst_height) * src_stride >= end) {
> + + (src_y + clipped.height) * src_stride >= end) {
> qemu_log_mask(LOG_UNIMP, "blt outside vram not implemented\n");
> return;
> }
> @@ -121,31 +148,31 @@ void ati_2d_blt(ATIVGAState *s)
> dst_stride /= sizeof(uint32_t);
> DPRINTF("pixman_blt(%p, %p, %d, %d, %d, %d, %d, %d, %d, %d, %d, %d)\n",
> src_bits, dst_bits, src_stride, dst_stride, bpp, bpp,
> - src_x, src_y, dst_x, dst_y,
> - s->regs.dst_width, s->regs.dst_height);
> + src_x, src_y, clipped.x, clipped.y,
> + clipped.width, clipped.height);
> #ifdef CONFIG_PIXMAN
> if ((s->use_pixman & BIT(1)) &&
> s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT &&
> s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM) {
> fallback = !pixman_blt((uint32_t *)src_bits, (uint32_t *)dst_bits,
> src_stride, dst_stride, bpp, bpp,
> - src_x, src_y, dst_x, dst_y,
> - s->regs.dst_width, s->regs.dst_height);
> + src_x, src_y, clipped.x, clipped.y,
> + clipped.width, clipped.height);
> } else if (s->use_pixman & BIT(1)) {
> /* FIXME: We only really need a temporary if src and dst overlap */
> - int llb = s->regs.dst_width * (bpp / 8);
> + int llb = clipped.width * (bpp / 8);
> int tmp_stride = DIV_ROUND_UP(llb, sizeof(uint32_t));
> uint32_t *tmp = g_malloc(tmp_stride * sizeof(uint32_t) *
> - s->regs.dst_height);
> + clipped.height);
> fallback = !pixman_blt((uint32_t *)src_bits, tmp,
> src_stride, tmp_stride, bpp, bpp,
> src_x, src_y, 0, 0,
> - s->regs.dst_width, s->regs.dst_height);
> + clipped.width, clipped.height);
> if (!fallback) {
> fallback = !pixman_blt(tmp, (uint32_t *)dst_bits,
> tmp_stride, dst_stride, bpp, bpp,
> - 0, 0, dst_x, dst_y,
> - s->regs.dst_width, s->regs.dst_height);
> + 0, 0, clipped.x, clipped.y,
> + clipped.width, clipped.height);
> }
> g_free(tmp);
> } else
> @@ -158,17 +185,17 @@ void ati_2d_blt(ATIVGAState *s)
> unsigned int src_pitch = src_stride * sizeof(uint32_t);
> unsigned int dst_pitch = dst_stride * sizeof(uint32_t);
>
> - for (y = 0; y < s->regs.dst_height; y++) {
> - i = dst_x * bypp;
> + for (y = 0; y < clipped.height; y++) {
> + i = clipped.x * bypp;
> j = src_x * bypp;
> if (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM) {
> - i += (dst_y + y) * dst_pitch;
> + i += (clipped.y + y) * dst_pitch;
> j += (src_y + y) * src_pitch;
> } else {
> - i += (dst_y + s->regs.dst_height - 1 - y) * dst_pitch;
> - j += (src_y + s->regs.dst_height - 1 - y) * src_pitch;
> + i += (clipped.y + clipped.height - 1 - y) * dst_pitch;
> + j += (src_y + clipped.height - 1 - y) * src_pitch;
> }
> - memmove(&dst_bits[i], &src_bits[j], s->regs.dst_width * bypp);
> + memmove(&dst_bits[i], &src_bits[j], clipped.width * bypp);
> }
> }
> if (dst_bits >= s->vga.vram_ptr + s->vga.vbe_start_addr &&
> @@ -176,13 +203,13 @@ void ati_2d_blt(ATIVGAState *s)
> s->vga.vbe_regs[VBE_DISPI_INDEX_YRES] * s->vga.vbe_line_offset) {
> memory_region_set_dirty(&s->vga.vram, s->vga.vbe_start_addr +
> s->regs.dst_offset +
> - dst_y * surface_stride(ds),
> - s->regs.dst_height * surface_stride(ds));
> + clipped.y * surface_stride(ds),
> + clipped.height * surface_stride(ds));
> }
> s->regs.dst_x = (s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT ?
> - dst_x + s->regs.dst_width : dst_x);
> + clipped.x + clipped.width : clipped.x);
> s->regs.dst_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
> - dst_y + s->regs.dst_height : dst_y);
> + clipped.y + clipped.height : clipped.y);
> break;
> }
> case ROP3_PATCOPY:
> @@ -207,20 +234,21 @@ void ati_2d_blt(ATIVGAState *s)
>
> dst_stride /= sizeof(uint32_t);
> DPRINTF("pixman_fill(%p, %d, %d, %d, %d, %d, %d, %x)\n",
> - dst_bits, dst_stride, bpp, dst_x, dst_y,
> - s->regs.dst_width, s->regs.dst_height, filler);
> + dst_bits, dst_stride, bpp, clipped.x, clipped.y,
> + clipped.width, clipped.height, filler);
> #ifdef CONFIG_PIXMAN
> if (!(s->use_pixman & BIT(0)) ||
> - !pixman_fill((uint32_t *)dst_bits, dst_stride, bpp, dst_x, dst_y,
> - s->regs.dst_width, s->regs.dst_height, filler))
> + !pixman_fill((uint32_t *)dst_bits, dst_stride, bpp,
> + clipped.x, clipped.y, clipped.width, clipped.height,
> + filler))
> #endif
> {
> /* fallback when pixman failed or we don't want to call it */
> unsigned int x, y, i, bypp = bpp / 8;
> unsigned int dst_pitch = dst_stride * sizeof(uint32_t);
> - for (y = 0; y < s->regs.dst_height; y++) {
> - i = dst_x * bypp + (dst_y + y) * dst_pitch;
> - for (x = 0; x < s->regs.dst_width; x++, i += bypp) {
> + for (y = 0; y < clipped.height; y++) {
> + i = clipped.x * bypp + (clipped.y + y) * dst_pitch;
> + for (x = 0; x < clipped.width; x++, i += bypp) {
> stn_he_p(&dst_bits[i], bypp, filler);
> }
> }
> @@ -230,11 +258,11 @@ void ati_2d_blt(ATIVGAState *s)
> s->vga.vbe_regs[VBE_DISPI_INDEX_YRES] * s->vga.vbe_line_offset) {
> memory_region_set_dirty(&s->vga.vram, s->vga.vbe_start_addr +
> s->regs.dst_offset +
> - dst_y * surface_stride(ds),
> - s->regs.dst_height * surface_stride(ds));
> + clipped.y * surface_stride(ds),
> + clipped.height * surface_stride(ds));
> }
> s->regs.dst_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
> - dst_y + s->regs.dst_height : dst_y);
> + clipped.y + clipped.height : clipped.y);
> break;
> }
> default:
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 3/7] ati-vga: Implement foreground and background color register writes
2025-11-03 3:36 ` [PATCH v2 3/7] ati-vga: Implement foreground and background color register writes Chad Jablonski
@ 2025-11-03 13:33 ` BALATON Zoltan
0 siblings, 0 replies; 19+ messages in thread
From: BALATON Zoltan @ 2025-11-03 13:33 UTC (permalink / raw)
To: Chad Jablonski; +Cc: qemu-devel
On Sun, 2 Nov 2025, Chad Jablonski wrote:
> These are straightforward 32-bit register write handlers. They're
> necessary for a future patch which will use them for color expansion
> from monochrome host data transfers.
>
> Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
> ---
> hw/display/ati.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/hw/display/ati.c b/hw/display/ati.c
> index eb9b30672f..bf7a037e64 100644
> --- a/hw/display/ati.c
> +++ b/hw/display/ati.c
> @@ -926,6 +926,12 @@ static void ati_mm_write(void *opaque, hwaddr addr,
> case DP_CNTL:
> s->regs.dp_cntl = data;
> break;
> + case DP_SRC_FRGD_CLR:
> + s->regs.dp_src_frgd_clr = data;
> + break;
> + case DP_SRC_BKGD_CLR:
> + s->regs.dp_src_bkgd_clr = data;
> + break;
> case DP_DATATYPE:
> s->regs.dp_datatype = data & 0xe0070f0f;
> break;
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 4/7] ati-vga: Fix DP_GUI_MASTER_CNTL register mask
2025-11-03 3:36 ` [PATCH v2 4/7] ati-vga: Fix DP_GUI_MASTER_CNTL register mask Chad Jablonski
@ 2025-11-03 13:46 ` BALATON Zoltan
0 siblings, 0 replies; 19+ messages in thread
From: BALATON Zoltan @ 2025-11-03 13:46 UTC (permalink / raw)
To: Chad Jablonski; +Cc: qemu-devel
On Sun, 2 Nov 2025, Chad Jablonski wrote:
> Change the register mask from 0xf800000f to 0xff00000f to preserve bits
> 24-26. This is the GMC_SRC_SOURCE field which is needed to determine
> the type of source for the blit operation.
>
> Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
> ---
> hw/display/ati.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/display/ati.c b/hw/display/ati.c
> index bf7a037e64..4ff17209c4 100644
> --- a/hw/display/ati.c
> +++ b/hw/display/ati.c
> @@ -867,7 +867,7 @@ static void ati_mm_write(void *opaque, hwaddr addr,
> ati_2d_blt(s);
> break;
> case DP_GUI_MASTER_CNTL:
> - s->regs.dp_gui_master_cntl = data & 0xf800000f;
> + s->regs.dp_gui_master_cntl = data & 0xff00000f;
> s->regs.dp_datatype = (data & 0x0f00) >> 8 | (data & 0x30f0) << 4 |
> (data & 0x4000) << 16;
> s->regs.dp_mix = (data & GMC_ROP3_MASK) | (data & 0x7000000) >> 16;
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 5/7] ati-vga: Implement HOST_DATA register writes
2025-11-03 3:36 ` [PATCH v2 5/7] ati-vga: Implement HOST_DATA register writes Chad Jablonski
@ 2025-11-03 13:58 ` BALATON Zoltan
2025-11-04 11:26 ` BALATON Zoltan
0 siblings, 1 reply; 19+ messages in thread
From: BALATON Zoltan @ 2025-11-03 13:58 UTC (permalink / raw)
To: Chad Jablonski; +Cc: qemu-devel
On Sun, 2 Nov 2025, Chad Jablonski wrote:
> Writing to any of the HOST_DATA0-7 registers pushes the written data
> into a buffer. A final write to HOST_DATA_LAST writes data to the
> buffer and triggers the pending blit operation.
>
> The buffer for now is a static 4MiB and overflows are checked. This
> seems like a large enough value given what I've seen in testing. Future
> work could dynamically size the buffer based on the destination dimensions if
> needed.
I wonder where the real chip stores this information?
Regards,
BALATON Zoltan
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 6/7] ati-vga: Add expand_colors() helper for monochrome color expansion
2025-11-03 3:36 ` [PATCH v2 6/7] ati-vga: Add expand_colors() helper for monochrome color expansion Chad Jablonski
@ 2025-11-03 14:03 ` BALATON Zoltan
2025-11-10 20:20 ` Chad Jablonski
0 siblings, 1 reply; 19+ messages in thread
From: BALATON Zoltan @ 2025-11-03 14:03 UTC (permalink / raw)
To: Chad Jablonski; +Cc: qemu-devel
On Sun, 2 Nov 2025, Chad Jablonski wrote:
> Convert 1bpp monochrome images to 32bpp ARGB given a foreground and
> background color. This also supports most significant and least
> significant bit ordering.
>
> This is useful for host data transfers of glyphs when drawing text in X.
>
> Signed-off-by: Chad Jablonski <chad@jablonski.xyz>
> ---
> hw/display/ati_2d.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/hw/display/ati_2d.c b/hw/display/ati_2d.c
> index 15cf29a061..181bf634f0 100644
> --- a/hw/display/ati_2d.c
> +++ b/hw/display/ati_2d.c
> @@ -45,6 +45,28 @@ static int ati_bpp_from_datatype(ATIVGAState *s)
> }
>
> #define DEFAULT_CNTL (s->regs.dp_gui_master_cntl & GMC_DST_PITCH_OFFSET_CNTL)
> +/* Convert 1bpp monochrome data to 32bpp ARGB using color expansion */
> +static void expand_colors(uint8_t *color_dst, const uint8_t *mono_src,
> + uint32_t width, uint32_t height,
> + uint32_t fg_color, uint32_t bg_color,
> + bool lsb_to_msb)
> +{
> + uint32_t byte, color;
> + uint8_t *pixel;
> + int i, j, bit;
> + /* Rows are 32-bit aligned */
> + int bytes_per_row = ((width + 31) / 32) * 4;
I think there's some QEMU_ALIGN macro for that maybe in qemu/osdep.h?
> +
> + for (i = 0; i < height; i++) {
> + for (j = 0; j < width; j++) {
> + byte = mono_src[i * bytes_per_row + (j / 8)];
> + bit = lsb_to_msb ? 7 - (j % 8) : j % 8;
> + color = (byte >> bit) & 0x1 ? fg_color : bg_color;
> + pixel = &color_dst[(i * width + j) * 4];
> + memcpy(pixel, &color, sizeof(color));
Since it's just writing a 32 bit value maybe cast and = would be faster
than calling memcpy for this.
Regards,
BALATON Zoltan
> + }
> + }
> +}
>
> void ati_2d_blt(ATIVGAState *s)
> {
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 5/7] ati-vga: Implement HOST_DATA register writes
2025-11-03 13:58 ` BALATON Zoltan
@ 2025-11-04 11:26 ` BALATON Zoltan
2025-11-10 19:53 ` Chad Jablonski
0 siblings, 1 reply; 19+ messages in thread
From: BALATON Zoltan @ 2025-11-04 11:26 UTC (permalink / raw)
To: Chad Jablonski; +Cc: qemu-devel
On Mon, 3 Nov 2025, BALATON Zoltan wrote:
> On Sun, 2 Nov 2025, Chad Jablonski wrote:
>> Writing to any of the HOST_DATA0-7 registers pushes the written data
>> into a buffer. A final write to HOST_DATA_LAST writes data to the
>> buffer and triggers the pending blit operation.
>>
>> The buffer for now is a static 4MiB and overflows are checked. This
>> seems like a large enough value given what I've seen in testing. Future
>> work could dynamically size the buffer based on the destination dimensions
>> if
>> needed.
>
> I wonder where the real chip stores this information?
I don't think there's a separate buffer for this on real card and the
command FIFO is not long enough to store it so it should probably use
vram. But how does it know which part of that can be used? Maybe you could
write some pattern into HOST_DATAx registers (like 0xaaaaaaaa, 0x55555555
but longer than the FIFO to make sure it's not staying there) and then
before writing HOST_DATA_LAST look for that pattern in vram to see if it
appears anywhere. Maybe some register points there or the card has some
memory management I don't know about? (I don't know much about GPUs so
it's quite possible I have no idea how it should work.) If the pattern is
not found I don't have any better idea to find out how this should work.
(We could keep the separate buffer in emulation for now but I'm curious
how the real chip does it and if we can emulate that.)
Regards,
BALATON Zoltan
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 1/7] ati-vga: Add scissor clipping register support
2025-11-03 13:16 ` BALATON Zoltan
@ 2025-11-04 18:22 ` Chad Jablonski
0 siblings, 0 replies; 19+ messages in thread
From: Chad Jablonski @ 2025-11-04 18:22 UTC (permalink / raw)
To: BALATON Zoltan, Chad Jablonski; +Cc: qemu-devel
>> + if ((data & GMC_SRC_CLIPPING_MASK) == GMC_SRC_CLIP_DEFAULT) {
>> + s->regs.src_sc_bottom_right = s->regs.default_sc_bottom_right;
>> + }
>> + if ((data & GMC_DST_CLIPPING_MASK) == GMC_DST_CLIP_DEFAULT) {
>> + s->regs.sc_top_left = 0;
>> + s->regs.sc_bottom_right = s->regs.default_sc_bottom_right;
>> + }
>
> Or is this what you meant by style? Now I get that. I think the bits
> should not reset the regs just cause the operation to use the default
> values instead but if you can verify what actual hardware does that would
> be best.
Hi BALATON,
I've tested this on the real 'Rage 128 Pro Ultra TF' and it shows that this is
the correct behavior (copying to the registers). I was definitely a bit
surprised!
I've run this test both with X running (idle over ssh) and prior to
starting X with the same results. I would like to write a bare-metal test
to make sure that the environment is a bit more controlled. But I struggled
with modesetting and it was looking like it was going to become a bit of a
rabbit hole. So I settled on testing under Linux for now and left that for
another day.
Here is the output:
Test SRC clipping
====================================
** Initializing DEFAULT_SC_BOTTOM_RIGHT to 0x0 **
** Initializing SRC_SC_BOTTOM to 0x0 **
** Initializing SRC_SC_RIGHT to 0x0 **
Initial State
------------------------------------
DEFAULT_GUI_MASTER_CNTL: 0x4a1de00f
DEFAULT_SC_BOTTOM_RIGHT: 0x00000000
SRC_SC_BOTTOM: 0x00000000
SRC_SC_RIGHT: 0x00000000
** Setting DEFAULT_SC_BOTTOM_RIGHT to 0x0aaa0bbb **
** Setting SRC_SC_BOTTOM to 0x111 **
** Setting SRC_SC_RIGHT to 0x222 **
State After Setting
------------------------------------
DEFAULT_GUI_MASTER_CNTL: 0x4a1de00f
DEFAULT_SC_BOTTOM_RIGHT: 0x0aaa0bbb
SRC_SC_BOTTOM: 0x00000111
SRC_SC_RIGHT: 0x00000222
** Setting GMC_SRC_CLIPPING to default **
State After Setting Default
------------------------------------
DEFAULT_GUI_MASTER_CNTL: 0x4a1de00b
DEFAULT_SC_BOTTOM_RIGHT: 0x0aaa0bbb
SRC_SC_BOTTOM: 0x00000aaa <======= Set to default
SRC_SC_RIGHT: 0x00000bbb <======= Set to default
** Setting GMC_SRC_CLIPPING to leave alone **
State After Setting Leave Alone
------------------------------------
DEFAULT_GUI_MASTER_CNTL: 0x4a1de00f
DEFAULT_SC_BOTTOM_RIGHT: 0x0aaa0bbb
SRC_SC_BOTTOM: 0x00000aaa <======= STILL default
SRC_SC_RIGHT: 0x00000bbb <======= STILL default
Test DST clipping
====================================
** Initializing DEFAULT_SC_BOTTOM_RIGHT to 0x0 **
** Initializing SC_BOTTOM to 0x0 **
** Initializing SC_RIGHT to 0x0 **
** Initializing SC_TOP to 0x0 **
** Initializing SC_LEFT to 0x0 **
Initial State
------------------------------------
DEFAULT_GUI_MASTER_CNTL: 0x4a1de00f
DEFAULT_SC_BOTTOM_RIGHT: 0x00000000
SC_BOTTOM: 0x00000000
SC_RIGHT: 0x00000000
SC_TOP: 0x00000000
SC_LEFT: 0x00000000
** Setting DEFAULT_SC_BOTTOM_RIGHT to 0x0aaa0bbb **
** Setting SC_BOTTOM to 0x111 **
** Setting SC_RIGHT to 0x222 **
** SETTING SC_TOP to 0x333 **
** SETTING SC_LEFT to 0x444 **
State After Setting
------------------------------------
DEFAULT_GUI_MASTER_CNTL: 0x4a1de00f
DEFAULT_SC_BOTTOM_RIGHT: 0x0aaa0bbb
SC_BOTTOM: 0x00000111
SC_RIGHT: 0x00000222
SC_TOP: 0x00000333
SC_LEFT: 0x00000444
** Setting GMC_DST_CLIPPING to default **
State After Setting Default
------------------------------------
DEFAULT_GUI_MASTER_CNTL: 0x4a1de007
DEFAULT_SC_BOTTOM_RIGHT: 0x0aaa0bbb
SC_BOTTOM: 0x00000aaa <======= Set to default
SC_RIGHT: 0x00000bbb <======= Set to default
SC_TOP: 0x00000000 <======= Set to default
SC_LEFT: 0x00000000 <======= Set to default
** Setting GMC_DST_CLIPPING to leave alone **
State After Setting Leave Alone
------------------------------------
DEFAULT_GUI_MASTER_CNTL: 0x4a1de00f
DEFAULT_SC_BOTTOM_RIGHT: 0x0aaa0bbb
SC_BOTTOM: 0x00000aaa <======= STILL default
SC_RIGHT: 0x00000bbb <======= STILL default
SC_TOP: 0x00000000 <======= STILL default
SC_LEFT: 0x00000000 <======= STILL default
And the source:
===============================================================================
/*
* ATI Rage 128 Pro Clipping Mode Hardware Test
*
* Tests whether clipping mode bits exhibit latching or dynamic behavior
*
* Build: gcc -std=c99 -o test test.c -lpci
* Requirements: libpci-dev, root privileges, ATI Rage 128 Pro hardware
* Note: Run with X.org idle (SSH session recommended)
*/
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <stdlib.h>
#include <dirent.h>
#include <string.h>
#include <pci/pci.h>
#define ATI_VENDOR_ID 0x1002
#define MAX_ATI_DEVICES 10
#define DP_GUI_MASTER_CNTL 0x146c
#define SRC_SC_BOTTOM_RIGHT 0x16f4
#define SRC_SC_BOTTOM 0x165C
#define SRC_SC_RIGHT 0x1654
#define DEFAULT_SC_BOTTOM_RIGHT 0x16e8
#define SC_TOP_LEFT 0x1640
#define SC_LEFT 0x1640
#define SC_TOP 0x1648
#define SC_BOTTOM_RIGHT 0x1644
#define SC_RIGHT 0x1644
#define SC_BOTTOM 0x164C
#define DST_OFFSET 0x1404
#define DST_PITCH 0x1408
#define DP_BRUSH_FRGD_CLR 0x147c
#define DP_WRITE_MASK 0x16cc
#define DST_HEIGHT 0x1410
#define DST_X_Y 0x1594
#define DST_WIDTH_X 0x1588
#define FATAL do { fprintf(stderr, "Error at line %d, file %s (%d) [%s]\n", \
__LINE__, __FILE__, errno, strerror(errno)); exit(1); } while(0)
void run_tests(void *bar2);
struct pci_dev *find_device(struct pci_access *pacc,
char *name_out, int name_len);
void print_devices(struct pci_access *pacc);
void *map_bar(struct pci_dev *dev, int bar_idx);
static inline uint32_t reg_read(void *base, uint32_t offset);
static inline uint32_t reg_write(void *base, uint32_t offset, uint32_t value);
int main(int argc, char **argv) {
struct pci_access *pacc = pci_alloc();
char name[256];
struct pci_dev *dev = find_device(pacc, name, sizeof(name));
void *bar2 = map_bar(dev, 2);
run_tests(bar2);
return 0;
}
void test_dst_clipping(void *bar2) {
uint32_t dp_gui_master_cntl = reg_read(bar2, DP_GUI_MASTER_CNTL);
uint32_t default_sc_bottom_right = reg_read(bar2, DEFAULT_SC_BOTTOM_RIGHT);
uint32_t sc_bottom = reg_read(bar2, SC_BOTTOM);
uint32_t sc_right = reg_read(bar2, SC_RIGHT);
uint32_t sc_top = reg_read(bar2, SC_TOP);
uint32_t sc_left = reg_read(bar2, SC_LEFT);
printf("Test DST clipping\n");
printf("====================================\n\n");
printf("** Initializing DEFAULT_SC_BOTTOM_RIGHT to 0x0 **\n");
printf("** Initializing SC_BOTTOM to 0x0 **\n");
printf("** Initializing SC_RIGHT to 0x0 **\n");
printf("** Initializing SC_TOP to 0x0 **\n");
printf("** Initializing SC_LEFT to 0x0 **\n");
reg_write(bar2, DEFAULT_SC_BOTTOM_RIGHT, 0x0);
reg_write(bar2, SC_BOTTOM, 0x0);
reg_write(bar2, SC_RIGHT, 0x0);
reg_write(bar2, SC_TOP, 0x0);
reg_write(bar2, SC_LEFT, 0x0);
dp_gui_master_cntl = reg_read(bar2, DP_GUI_MASTER_CNTL);
default_sc_bottom_right = reg_read(bar2, DEFAULT_SC_BOTTOM_RIGHT);
sc_bottom = reg_read(bar2, SC_BOTTOM);
sc_right = reg_read(bar2, SC_RIGHT);
sc_top = reg_read(bar2, SC_TOP);
sc_left = reg_read(bar2, SC_LEFT);
printf("\n");
printf("Initial State\n");
printf("------------------------------------\n");
printf("DEFAULT_GUI_MASTER_CNTL: 0x%08x\n", dp_gui_master_cntl);
printf("DEFAULT_SC_BOTTOM_RIGHT: 0x%08x\n", default_sc_bottom_right);
printf("SC_BOTTOM: 0x%08x\n", sc_bottom);
printf("SC_RIGHT: 0x%08x\n", sc_right);
printf("SC_TOP: 0x%08x\n", sc_top);
printf("SC_LEFT: 0x%08x\n", sc_left);
printf("\n");
printf("** Setting DEFAULT_SC_BOTTOM_RIGHT to 0x0aaa0bbb **\n");
printf("** Setting SC_BOTTOM to 0x111 **\n");
printf("** Setting SC_RIGHT to 0x222 **\n");
printf("** SETTING SC_TOP to 0x333 **\n");
printf("** SETTING SC_LEFT to 0x444 **\n");
reg_write(bar2, DEFAULT_SC_BOTTOM_RIGHT, 0x0aaa0bbb);
reg_write(bar2, SC_BOTTOM, 0x00000111);
reg_write(bar2, SC_RIGHT, 0x00000222);
reg_write(bar2, SC_TOP, 0x00000333);
reg_write(bar2, SC_LEFT, 0x00000444);
dp_gui_master_cntl = reg_read(bar2, DP_GUI_MASTER_CNTL);
default_sc_bottom_right = reg_read(bar2, DEFAULT_SC_BOTTOM_RIGHT);
sc_bottom = reg_read(bar2, SC_BOTTOM);
sc_right = reg_read(bar2, SC_RIGHT);
sc_top = reg_read(bar2, SC_TOP);
sc_left = reg_read(bar2, SC_LEFT);
printf("\n");
printf("State After Setting\n");
printf("------------------------------------\n");
printf("DEFAULT_GUI_MASTER_CNTL: 0x%08x\n", dp_gui_master_cntl);
printf("DEFAULT_SC_BOTTOM_RIGHT: 0x%08x\n", default_sc_bottom_right);
printf("SC_BOTTOM: 0x%08x\n", sc_bottom);
printf("SC_RIGHT: 0x%08x\n", sc_right);
printf("SC_TOP: 0x%08x\n", sc_top);
printf("SC_LEFT: 0x%08x\n", sc_left);
printf("\n");
printf("** Setting GMC_DST_CLIPPING to default **\n");
reg_write(bar2, DP_GUI_MASTER_CNTL, dp_gui_master_cntl & ~0x8);
dp_gui_master_cntl = reg_read(bar2, DP_GUI_MASTER_CNTL);
dp_gui_master_cntl = reg_read(bar2, DP_GUI_MASTER_CNTL);
default_sc_bottom_right = reg_read(bar2, DEFAULT_SC_BOTTOM_RIGHT);
sc_bottom = reg_read(bar2, SC_BOTTOM);
sc_right = reg_read(bar2, SC_RIGHT);
sc_top = reg_read(bar2, SC_TOP);
sc_left = reg_read(bar2, SC_LEFT);
printf("\n");
printf("State After Setting Default\n");
printf("------------------------------------\n");
printf("DEFAULT_GUI_MASTER_CNTL: 0x%08x\n", dp_gui_master_cntl);
printf("DEFAULT_SC_BOTTOM_RIGHT: 0x%08x\n", default_sc_bottom_right);
printf("SC_BOTTOM: 0x%08x\n", sc_bottom);
printf("SC_RIGHT: 0x%08x\n", sc_right);
printf("SC_TOP: 0x%08x\n", sc_top);
printf("SC_LEFT: 0x%08x\n", sc_left);
printf("\n");
printf("** Setting GMC_DST_CLIPPING to leave alone **\n");
reg_write(bar2, DP_GUI_MASTER_CNTL, dp_gui_master_cntl | 0x8);
dp_gui_master_cntl = reg_read(bar2, DP_GUI_MASTER_CNTL);
dp_gui_master_cntl = reg_read(bar2, DP_GUI_MASTER_CNTL);
default_sc_bottom_right = reg_read(bar2, DEFAULT_SC_BOTTOM_RIGHT);
sc_bottom = reg_read(bar2, SC_BOTTOM);
sc_right = reg_read(bar2, SC_RIGHT);
sc_top = reg_read(bar2, SC_TOP);
sc_left = reg_read(bar2, SC_LEFT);
printf("\n");
printf("State After Setting Leave Alone\n");
printf("------------------------------------\n");
printf("DEFAULT_GUI_MASTER_CNTL: 0x%08x\n", dp_gui_master_cntl);
printf("DEFAULT_SC_BOTTOM_RIGHT: 0x%08x\n", default_sc_bottom_right);
printf("SC_BOTTOM: 0x%08x\n", sc_bottom);
printf("SC_RIGHT: 0x%08x\n", sc_right);
printf("SC_TOP: 0x%08x\n", sc_top);
printf("SC_LEFT: 0x%08x\n", sc_left);
printf("\n");
}
void test_src_clipping(void *bar2) {
uint32_t dp_gui_master_cntl = reg_read(bar2, DP_GUI_MASTER_CNTL);
uint32_t default_sc_bottom_right = reg_read(bar2, DEFAULT_SC_BOTTOM_RIGHT);
uint32_t src_sc_bottom = reg_read(bar2, SRC_SC_BOTTOM);
uint32_t src_sc_right = reg_read(bar2, SRC_SC_RIGHT);
printf("Test SRC clipping\n");
printf("====================================\n\n");
printf("** Initializing DEFAULT_SC_BOTTOM_RIGHT to 0x0 **\n");
printf("** Initializing SRC_SC_BOTTOM to 0x0 **\n");
printf("** Initializing SRC_SC_RIGHT to 0x0 **\n");
reg_write(bar2, DEFAULT_SC_BOTTOM_RIGHT, 0x0);
reg_write(bar2, SRC_SC_BOTTOM, 0x0);
reg_write(bar2, SRC_SC_RIGHT, 0x0);
dp_gui_master_cntl = reg_read(bar2, DP_GUI_MASTER_CNTL);
default_sc_bottom_right = reg_read(bar2, DEFAULT_SC_BOTTOM_RIGHT);
src_sc_bottom = reg_read(bar2, SRC_SC_BOTTOM);
src_sc_right = reg_read(bar2, SRC_SC_RIGHT);
printf("\n");
printf("Initial State\n");
printf("------------------------------------\n");
printf("DEFAULT_GUI_MASTER_CNTL: 0x%08x\n", dp_gui_master_cntl);
printf("DEFAULT_SC_BOTTOM_RIGHT: 0x%08x\n", default_sc_bottom_right);
printf("SRC_SC_BOTTOM: 0x%08x\n", src_sc_bottom);
printf("SRC_SC_RIGHT: 0x%08x\n", src_sc_right);
printf("\n");
printf("** Setting DEFAULT_SC_BOTTOM_RIGHT to 0x0aaa0bbb **\n");
printf("** Setting SRC_SC_BOTTOM to 0x111 **\n");
printf("** Setting SRC_SC_RIGHT to 0x222 **\n");
reg_write(bar2, DEFAULT_SC_BOTTOM_RIGHT, 0x0aaa0bbb);
reg_write(bar2, SRC_SC_BOTTOM, 0x00000111);
reg_write(bar2, SRC_SC_RIGHT, 0x00000222);
dp_gui_master_cntl = reg_read(bar2, DP_GUI_MASTER_CNTL);
default_sc_bottom_right = reg_read(bar2, DEFAULT_SC_BOTTOM_RIGHT);
src_sc_bottom = reg_read(bar2, SRC_SC_BOTTOM);
src_sc_right = reg_read(bar2, SRC_SC_RIGHT);
printf("\n");
printf("State After Setting\n");
printf("------------------------------------\n");
printf("DEFAULT_GUI_MASTER_CNTL: 0x%08x\n", dp_gui_master_cntl);
printf("DEFAULT_SC_BOTTOM_RIGHT: 0x%08x\n", default_sc_bottom_right);
printf("SRC_SC_BOTTOM: 0x%08x\n", src_sc_bottom);
printf("SRC_SC_RIGHT: 0x%08x\n", src_sc_right);
printf("\n");
printf("** Setting GMC_SRC_CLIPPING to default **\n");
reg_write(bar2, DP_GUI_MASTER_CNTL, dp_gui_master_cntl & ~0x4);
dp_gui_master_cntl = reg_read(bar2, DP_GUI_MASTER_CNTL);
dp_gui_master_cntl = reg_read(bar2, DP_GUI_MASTER_CNTL);
default_sc_bottom_right = reg_read(bar2, DEFAULT_SC_BOTTOM_RIGHT);
src_sc_bottom = reg_read(bar2, SRC_SC_BOTTOM);
src_sc_right = reg_read(bar2, SRC_SC_RIGHT);
printf("\n");
printf("State After Setting Default\n");
printf("------------------------------------\n");
printf("DEFAULT_GUI_MASTER_CNTL: 0x%08x\n", dp_gui_master_cntl);
printf("DEFAULT_SC_BOTTOM_RIGHT: 0x%08x\n", default_sc_bottom_right);
printf("SRC_SC_BOTTOM: 0x%08x\n", src_sc_bottom);
printf("SRC_SC_RIGHT: 0x%08x\n", src_sc_right);
printf("\n");
printf("** Setting GMC_SRC_CLIPPING to leave alone **\n");
reg_write(bar2, DP_GUI_MASTER_CNTL, dp_gui_master_cntl | 0x4);
dp_gui_master_cntl = reg_read(bar2, DP_GUI_MASTER_CNTL);
dp_gui_master_cntl = reg_read(bar2, DP_GUI_MASTER_CNTL);
default_sc_bottom_right = reg_read(bar2, DEFAULT_SC_BOTTOM_RIGHT);
src_sc_bottom = reg_read(bar2, SRC_SC_BOTTOM);
src_sc_right = reg_read(bar2, SRC_SC_RIGHT);
printf("\n");
printf("State After Setting Leave Alone\n");
printf("------------------------------------\n");
printf("DEFAULT_GUI_MASTER_CNTL: 0x%08x\n", dp_gui_master_cntl);
printf("DEFAULT_SC_BOTTOM_RIGHT: 0x%08x\n", default_sc_bottom_right);
printf("SRC_SC_BOTTOM: 0x%08x\n", src_sc_bottom);
printf("SRC_SC_RIGHT: 0x%08x\n", src_sc_right);
printf("\n");
}
void run_tests(void *bar2) {
test_src_clipping(bar2);
test_dst_clipping(bar2);
}
struct pci_dev *find_device(struct pci_access *pacc,
char *name_out, int name_len) {
struct pci_dev *dev, *it;
int device_count = 0;
pci_init(pacc);
pci_scan_bus(pacc);
for (it = pacc->devices; it; it = it->next) {
if (it->vendor_id == ATI_VENDOR_ID) {
if (device_count == 0) {
dev = it;
}
device_count += 1;
}
}
if (device_count == 0) {
printf("No ATI devices found\n");
exit(1);
}
if (device_count > 1) {
printf("Found multiple ATI devices:\n");
print_devices(pacc);
}
pci_lookup_name(pacc, name_out, name_len,
PCI_LOOKUP_VENDOR | PCI_LOOKUP_DEVICE,
dev->vendor_id, dev->device_id);
printf("# %s\n\n", name_out);
return dev;
}
void print_devices(struct pci_access *pacc) {
struct pci_dev *dev;
char name[256];
for (dev = pacc->devices; dev; dev = dev->next) {
if (dev->vendor_id != ATI_VENDOR_ID) continue;
pci_lookup_name(pacc, name, sizeof(name),
PCI_LOOKUP_VENDOR | PCI_LOOKUP_DEVICE,
dev->vendor_id, dev->device_id);
printf("\t- %s\n", name);
}
}
void *map_bar(struct pci_dev *dev, int bar_idx) {
char pci_loc[32];
sprintf(pci_loc, "%04x:%02x:%02x.%d", dev->domain, dev->bus,
dev->dev, dev->func);
char base_path[256];
sprintf(base_path, "/sys/bus/pci/devices/%s", pci_loc);
char bar_path[256];
sprintf(bar_path, "%s/resource%d", base_path, bar_idx);
int bar_fd = open(bar_path, O_RDWR | O_SYNC);
if (bar_fd == -1) FATAL;
void *bar = mmap(NULL, dev->size[bar_idx], PROT_READ | PROT_WRITE,
MAP_SHARED, bar_fd, 0);
if (bar == (void *) -1) FATAL;
return bar;
}
static inline uint32_t reg_read(void *base, uint32_t offset) {
volatile uint32_t *reg = (volatile uint32_t *)(base + offset);
return *reg;
}
static inline uint32_t reg_write(void *base, uint32_t offset, uint32_t value) {
volatile uint32_t *reg = (volatile uint32_t *)(base + offset);
*reg = value;
}
===============================================================================
I haven't tested them yet, but I'll also run this test for
GMC_SRC_PITCH_OFFSET_CNTL and GMC_DST_PITCH_OFFSET_CNTL to confirm that
they have the same behavior.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 2/7] ati-vga: Implement scissor rectangle clipping for 2D operations
2025-11-03 13:29 ` BALATON Zoltan
@ 2025-11-05 19:29 ` Chad Jablonski
0 siblings, 0 replies; 19+ messages in thread
From: Chad Jablonski @ 2025-11-05 19:29 UTC (permalink / raw)
To: BALATON Zoltan, Chad Jablonski; +Cc: qemu-devel
>> +
>> + QemuRect dst;
>> + {
>> + unsigned dst_width = s->regs.dst_width;
>> + unsigned dst_height = s->regs.dst_height;
>> + unsigned dst_x = (s->regs.dp_cntl & DST_X_LEFT_TO_RIGHT ?
>> + s->regs.dst_x : s->regs.dst_x + 1 - dst_width);
>> + unsigned dst_y = (s->regs.dp_cntl & DST_Y_TOP_TO_BOTTOM ?
>> + s->regs.dst_y : s->regs.dst_y + 1 - dst_height);
>> + qemu_rect_init(&dst, dst_x, dst_y, dst_width, dst_height);
>> + }
>
> This is a bit unusual style putting variable init in a block. I'm not sure
> it's acceptable for QEMU, maybe you could put it in a static helper
> function which is more usual style.
>
Not a problem, I'll break these out into a helper in v3.
>> +
>> + QemuRect scissor;
>> + {
>> + uint16_t sc_left = s->regs.sc_top_left & 0x3fff;
>> + uint16_t sc_top = (s->regs.sc_top_left >> 16) & 0x3fff;
>> + uint16_t sc_right = s->regs.sc_bottom_right & 0x3fff;
>> + uint16_t sc_bottom = (s->regs.sc_bottom_right >> 16) & 0x3fff;
>> + qemu_rect_init(&scissor, sc_left, sc_top,
>> + sc_right - sc_left + 1, sc_bottom - sc_top + 1);
>> + }
>
> This could be checked on real hardware too what happens if you store
> something in reserved bits (the docs may suggest that e.g. SC_BOTTOM,
> SC_RIGHT and SC_BOTTOM_RIGHT might be the same register so writing
> reserved bits may overwrite others or masked out by hardware but it's not
> clear from docs; the rage128pro docs aren't even clear on what the limits
> are as the summary text in the section 3.28 gives different limits than
> individual register descriptions right after that). To simplify using
> these values I generally tried to apply reserved bits mask on write so no
> need to do that at read and using the values. Maybe these should do the
> same?
>
It looks like the scissor registers are masked at write! Bits 13:0 are set
on write and writing to reserved bits are just ignored. So you're absolutely
right we shouldn't be bothering with this on read. I'll update this in v3.
# Tested On: ATI Technologies Inc Rage 128 Pro Ultra TF
reserved scissor bits
======================================
** Initializing SC_BOTTOM to 0x0 **
** Initializing SC_RIGHT to 0x0 **
** Initializing SC_TOP to 0x0 **
** Initializing SC_LEFT to 0x0 **
Initial State
------------------------------------
SC_BOTTOM: 0x00000000
SC_RIGHT: 0x00000000
SC_TOP: 0x00000000
SC_LEFT: 0x00000000
** Setting SC_BOTTOM to 0xffffffff **
** Setting SC_TOP to 0xffffffff **
After State
------------------------------------
SC_BOTTOM: 0x00003fff <====== Masked at write (14-bit)
SC_RIGHT: 0x00000000
SC_TOP: 0x00003fff <====== Masked at write (14-bit)
SC_LEFT: 0x00000000
** Setting SC_RIGHT to 0xffffffff **
** Setting SC_LEFT to 0xffffffff **
After State
------------------------------------
SC_BOTTOM: 0x00003fff
SC_RIGHT: 0x00003fff <====== Masked at write (14-bit)
SC_TOP: 0x00003fff
SC_LEFT: 0x00003fff <====== Masked at write (14-bit)
** Setting SC_BOTTOM_RIGHT to 0xfeeefeee **
** Setting SC_TOP_LEFT to 0xfeeefeee **
After State
------------------------------------
SC_BOTTOM: 0x00003eee <====== Masked at write (14-bit)
SC_RIGHT: 0x00003eee <====== Masked at write (14-bit)
SC_TOP: 0x00003eee <====== Masked at write (14-bit)
SC_LEFT: 0x00003eee <====== Masked at write (14-bit)
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 5/7] ati-vga: Implement HOST_DATA register writes
2025-11-04 11:26 ` BALATON Zoltan
@ 2025-11-10 19:53 ` Chad Jablonski
0 siblings, 0 replies; 19+ messages in thread
From: Chad Jablonski @ 2025-11-10 19:53 UTC (permalink / raw)
To: BALATON Zoltan, Chad Jablonski; +Cc: qemu-devel
>>
>> I wonder where the real chip stores this information?
>
> I don't think there's a separate buffer for this on real card and the
> command FIFO is not long enough to store it so it should probably use
> vram. But how does it know which part of that can be used? Maybe you could
> write some pattern into HOST_DATAx registers (like 0xaaaaaaaa, 0x55555555
> but longer than the FIFO to make sure it's not staying there) and then
> before writing HOST_DATA_LAST look for that pattern in vram to see if it
> appears anywhere. Maybe some register points there or the card has some
> memory management I don't know about? (I don't know much about GPUs so
> it's quite possible I have no idea how it should work.) If the pattern is
> not found I don't have any better idea to find out how this should work.
> (We could keep the separate buffer in emulation for now but I'm curious
> how the real chip does it and if we can emulate that.)
>
> Regards,
> BALATON Zoltan
Hi BALATON,
You're absolutely right. After spending some time setting up a nicer test
environment I'm able to confirm this behavior on the Rage 128 Pro Ultra TF:
write HOST_DATA0
write HOST_DATA1
write HOST_DATA2
write HOST_DATA3
-> Data appears in the framebuffer at the destination
write HOST_DATA4
write HOST_DATA5
write HOST_DATA6
write HOST_DATA7
-> Data appears in the framebuffer at the destination
The card does not wait for HOST_DATA_LAST to flush to the destination.
So it would appear that there is no buffer at all or even a special area in
VRAM. It's looking like there is a 128-bit accumulator which makes total
sense given the architecture of the card. I'd like to do some additional
testing but you were right to question this. I'll address it in patch v3.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 6/7] ati-vga: Add expand_colors() helper for monochrome color expansion
2025-11-03 14:03 ` BALATON Zoltan
@ 2025-11-10 20:20 ` Chad Jablonski
0 siblings, 0 replies; 19+ messages in thread
From: Chad Jablonski @ 2025-11-10 20:20 UTC (permalink / raw)
To: BALATON Zoltan, Chad Jablonski; +Cc: qemu-devel
>> + int bytes_per_row = ((width + 31) / 32) * 4;
>
> I think there's some QEMU_ALIGN macro for that maybe in qemu/osdep.h?
>
>> +
>> + for (i = 0; i < height; i++) {
>> + for (j = 0; j < width; j++) {
>> + byte = mono_src[i * bytes_per_row + (j / 8)];
>> + bit = lsb_to_msb ? 7 - (j % 8) : j % 8;
>> + color = (byte >> bit) & 0x1 ? fg_color : bg_color;
>> + pixel = &color_dst[(i * width + j) * 4];
>> + memcpy(pixel, &color, sizeof(color));
>
> Since it's just writing a 32 bit value maybe cast and = would be faster
> than calling memcpy for this.
>
Yep, good call! I'll make these changes in v3 also.
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2025-11-10 20:48 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-03 3:36 [PATCH v2 0/7] ati-vga: Implement HOST_DATA transfers to enable X.org text rendering Chad Jablonski
2025-11-03 3:36 ` [PATCH v2 1/7] ati-vga: Add scissor clipping register support Chad Jablonski
2025-11-03 13:16 ` BALATON Zoltan
2025-11-04 18:22 ` Chad Jablonski
2025-11-03 3:36 ` [PATCH v2 2/7] ati-vga: Implement scissor rectangle clipping for 2D operations Chad Jablonski
2025-11-03 13:29 ` BALATON Zoltan
2025-11-05 19:29 ` Chad Jablonski
2025-11-03 3:36 ` [PATCH v2 3/7] ati-vga: Implement foreground and background color register writes Chad Jablonski
2025-11-03 13:33 ` BALATON Zoltan
2025-11-03 3:36 ` [PATCH v2 4/7] ati-vga: Fix DP_GUI_MASTER_CNTL register mask Chad Jablonski
2025-11-03 13:46 ` BALATON Zoltan
2025-11-03 3:36 ` [PATCH v2 5/7] ati-vga: Implement HOST_DATA register writes Chad Jablonski
2025-11-03 13:58 ` BALATON Zoltan
2025-11-04 11:26 ` BALATON Zoltan
2025-11-10 19:53 ` Chad Jablonski
2025-11-03 3:36 ` [PATCH v2 6/7] ati-vga: Add expand_colors() helper for monochrome color expansion Chad Jablonski
2025-11-03 14:03 ` BALATON Zoltan
2025-11-10 20:20 ` Chad Jablonski
2025-11-03 3:36 ` [PATCH v2 7/7] ati-vga: Implement HOST_DATA blit source with " Chad Jablonski
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).